By Jessie Angelica, Solution Architect Intern
DataWorks is a platform for building big data warehouses. Its features include Data Integration, DataStudio, Data Map, Data Quality, and DataService Studio. DataWorks supports the following compute engines: MaxCompute, E-MapReduce, Hologres, ADB for PostgreSQL, and ADB for MySQL. DataWorks can synchronize batch or real-time data between different data sources and can schedule millions of tasks to streamline data processing.
Machine Learning Designer lets you create a pipeline from a template or build one manually. You only need to drag components onto the canvas, set their parameters based on your needs, and connect the components to form a pipeline. Then, run the pipeline to fine-tune the trained model. After the pipeline has run, you can view the log and schedule the pipeline as a periodic task so that the model it generates is updated automatically. You can also view the analysis reports on the visualized dashboard and deploy the model to EAS.
In the healthcare field, Alibaba Cloud can help predict the risk of heart disease using DataWorks and PAI Designer. We collect and clean the data with DataWorks and use PAI Designer to build a predictive model. We fine-tune the model for accuracy and schedule it to run automatically with DataWorks. The trained model is then connected to healthcare systems for real-time risk assessments. We also monitor the model and integrate its predictions into patient care plans. The whole system aims to catch potential heart issues early, offer personalized healthcare, and keep improving as more data and feedback arrive.
1). In the Alibaba Cloud console, go to DataWorks and click Create Workspace.
2). Set the Workspace Name and Display Name, and select No for Isolate Development and Production Environment. Then, click Commit.
3). Click Associate Now with MaxCompute.
4). Click Data Source in the navigation pane, click Add Data Source, and then choose MaxCompute.
5). Configure the resource as shown in the following picture, and select Alibaba Cloud RAM Sub-Account. Click Test Connectivity, and then click Complete Create.
6). Click Workspace, and then configure the workspace.
7). Select Data Source in DataStudio Modules. Click Data Source in the navigation pane and click Associate.
8). Go back to the Workspaces page. After a while, the status changes to Normal, which means the workspace was created successfully.
Click Download to save the data to a local file. The file will be used later.
(https://github.com/jessieangelica/heartdisease_data.git)
1). Go to DataStudio and click Create Workflow.
2). Set a name and click Create.
3). Create a table and set the table name to "heart_data".
4). Select DDL mode. Copy the following content into it and click Generate Table Schema.
5). Set the display name and click Commit to Production Environment.
6). Click Import Data.
7). Select the "heart_data" table and click Next. Then, upload the "clevecp.txt" file you just downloaded and choose By Location. Click Import Data.
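The By Location option in the last step appears to match file columns to table columns by position rather than by header name. The following is a minimal Python sketch of that idea, not DataWorks code. The function name `map_by_location`, the comma delimiter, and the column names `age` and `sex` are assumptions made for illustration; only `ifhealth` appears in this tutorial, and the real column list comes from the DDL of the "heart_data" table.

```python
import csv
import io

# Hypothetical column list for illustration only. The real columns are the
# ones defined in the DDL of the heart_data table.
TABLE_COLUMNS = ["age", "sex", "ifhealth"]


def map_by_location(text, columns, delimiter=","):
    """Pair the i-th value on each line with the i-th table column."""
    rows = []
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    for lineno, values in enumerate(reader, start=1):
        if not values:
            continue  # skip blank lines
        if len(values) != len(columns):
            raise ValueError(
                f"line {lineno}: expected {len(columns)} values, got {len(values)}"
            )
        rows.append(dict(zip(columns, values)))
    return rows


# Made-up sample lines, not taken from the downloaded file.
sample = "63,1,0\n41,0,1\n"
for row in map_by_location(sample, TABLE_COLUMNS):
    print(row)
```

A check like this can be run locally before uploading to confirm that every line has as many values as the table has columns, since positional matching silently depends on the column order.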
1). Go to Machine Learning Platform for AI and click Designer.
2). Click Create Pipeline. Set a name and click OK.
3). After Designer opens, drag the Read Table node from the left pane onto the canvas on the right. Set Table Name on the Table Selection tab.
4). Drag an SQL Script node onto the canvas and connect it to specify the data transmission direction. Click the node and copy the following content into it. Click Run. After the execution succeeds, you can view the data for each node.
5). Create a Type Conversion node. Select all fields and click Confirm to convert all field types to the double type. Click Run the current node.
6). Create a Normalization node. Select all fields and click Confirm. Click Run the current node.
7). Create a Split node to divide the data into a training set and a testing set, and set Split Ratio to 0.7. Click Run the current node.
8). Create a Logistic Regression node for binary classification. Then, create a Prediction node to generate predictions with the trained model. For both, click Feature Columns and select the 13 fields other than ifhealth, then click Reserved Output Column and select only ifhealth. Click Run the current node.
9). Create a Confusion Matrix evaluation node. Select only ifhealth as the original label column. Click Run the current node.
10). Create a Binary Classification Evaluation node. Select only ifhealth as the original label column. Click Run the current node.
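To make the pipeline steps above more concrete, here is a plain-Python sketch of the same sequence: normalization, a 0.7 split, logistic regression, prediction, a confusion matrix, and binary classification metrics. It is not how PAI Designer implements these nodes, which this tutorial does not show. All function names are hypothetical, min-max scaling and batch gradient descent are assumed choices, and the toy data is synthetic rather than taken from the heart disease dataset.

```python
import math
import random


def min_max_normalize(rows):
    """Scale every column to [0, 1] (constant columns become 0.0)."""
    lows = [min(col) for col in zip(*rows)]
    highs = [max(col) for col in zip(*rows)]
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, lo, hi in zip(row, lows, highs)] for row in rows]


def split(rows, labels, ratio=0.7, seed=0):
    """Shuffle the row indices, then put `ratio` of them in the training set."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    cut = round(len(idx) * ratio)
    train, test = idx[:cut], idx[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit weights and bias with batch gradient descent on the log loss."""
    w, b, n = [0.0] * len(rows[0]), 0.0, len(rows)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, rows, threshold=0.5):
    """Return 1 where the predicted probability reaches the threshold."""
    return [int(sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= threshold)
            for x in rows]


def confusion_matrix(actual, predicted):
    """Count (tp, fp, fn, tn) for labels encoded as 0 and 1."""
    pairs = list(zip(actual, predicted))
    return (pairs.count((1, 1)), pairs.count((0, 1)),
            pairs.count((1, 0)), pairs.count((0, 0)))


def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, and recall; a zero denominator yields 0.0."""
    def ratio(num, den):
        return num / den if den else 0.0
    return {"accuracy": ratio(tp + tn, tp + fp + fn + tn),
            "precision": ratio(tp, tp + fp),
            "recall": ratio(tp, tp + fn)}


# Synthetic one-feature toy data, NOT taken from the heart disease dataset.
features = [[v] for v in (20, 25, 30, 35, 40, 60, 65, 70, 75, 80)]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
x_train, y_train, x_test, y_test = split(min_max_normalize(features), labels)
w, b = train_logistic(x_train, y_train)
print(binary_metrics(*confusion_matrix(y_test, predict(w, b, x_test))))
```

The sketch evaluates only on the held-out 30 percent, which mirrors the role of the Split node: metrics computed on the training rows would overstate how well the model generalizes.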