By Garvin Li
This article walks through an ad click-through rate (CTR) prediction scenario, a typical application in the advertising industry. A prediction model is trained on historical data and then used to score each day's incremental data, so that samples meeting the ad CTR standard can be found and advertised to.
The whole experiment uses Alibaba Cloud Machine Learning to perform data mining and uses DataWorks to perform scheduling and pushing. Here is the specific business scenario:
The detailed fields are as follows:
Because the data shown in the following screenshot is randomly generated, this experiment does not evaluate the results. It focuses on how the experiment is built and how DataWorks is used to schedule it. Historical data from 20160919 and 20160920 is used to predict the data for 20160921. The data is stored in a MaxCompute partitioned table.
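The partition selection described above can be sketched in a few lines of Python. This is only an illustration of the date arithmetic, assuming partitions are named with a yyyymmdd string as in this experiment; the function name `training_partitions` and the `history_days` parameter are hypothetical and are not part of MaxCompute or DataWorks.

```python
from datetime import datetime, timedelta
from typing import List


def training_partitions(predict_day: str, history_days: int = 2) -> List[str]:
    """Return the yyyymmdd partitions immediately preceding predict_day.

    Hypothetical helper: it only computes partition names and does not
    talk to MaxCompute.
    """
    target = datetime.strptime(predict_day, "%Y%m%d").date()
    # Walk backwards from the oldest history day to the day before the target.
    return [
        (target - timedelta(days=offset)).strftime("%Y%m%d")
        for offset in range(history_days, 0, -1)
    ]


# Partitions used to train the model that predicts 20160921.
print(training_partitions("20160921"))  # ['20160919', '20160920']
```

With a daily schedule, the same calculation would be repeated each day with the new prediction date, so the training window slides forward by one partition per run.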
The following diagram shows the experiment process.
The experiment can be roughly divided into four modules: data source importing (ads), data pre-processing (normalization), model training (binary logistic regression), and predicting (prediction).
The intermediate process includes two steps: data normalization and model training. Model training uses the historical data to generate the prediction model. (For more details on the principles, see the Heart disease prediction case.)
The prediction results are listed in "ad_result-1", as shown below.
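To make the normalization, training, and prediction steps more concrete, here is a minimal pure-Python sketch. It is not the implementation behind the platform's components: it assumes min-max scaling for normalization and plain batch gradient descent for binary logistic regression, and the two numeric feature columns and their values are made up for illustration, since the real field list is not reproduced here.

```python
import math
from typing import List, Tuple


def fit_min_max(rows: List[List[float]]) -> List[Tuple[float, float]]:
    """Compute (min, max) per feature column from the history data."""
    return [(min(col), max(col)) for col in zip(*rows)]


def apply_min_max(rows, bounds):
    """Scale each feature using the bounds learned from history data."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, (lo, hi) in zip(row, bounds)]
        for row in rows
    ]


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit weights and bias with batch gradient descent on the log loss."""
    n, n_features = len(rows), len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            for j, v in enumerate(x):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_proba(rows, weights, bias):
    """Return the predicted click probability for each row."""
    return [sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) for x in rows]


# Made-up history data: two hypothetical numeric features, label 1 = clicked.
history = [[1.0, 20.0], [2.0, 25.0], [8.0, 40.0], [9.0, 45.0]]
clicked = [0, 0, 1, 1]

bounds = fit_min_max(history)
weights, bias = train_logistic(apply_min_max(history, bounds), clicked)

# New daily data is scaled with the bounds learned from history, then scored.
new_day = [[1.5, 22.0], [8.5, 43.0]]
scores = predict_proba(apply_min_max(new_day, bounds), weights, bias)
print(scores)
```

The point of the sketch is the order of operations: the scaling bounds come from the history partitions only and are then reused on the new day's data, so training and prediction see features on the same scale.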
Go to the console homepage and click DataWorks to access the Data IDE workspace.
DataWorks and the machine learning platform share the same set of projects. Select the project that contains the experiment to be scheduled, and click Start Data Modeling.
Click New and select New Task.
In the configuration section of the created task, select Node Task for Task Type and Machine Learning for Type.
After the node task has been created, select the machine learning task to be scheduled and set the scheduling time in the configuration bar on the right side. In this experiment, we choose to perform training and push information at 00:00 each day.
Click Submit. Submitted jobs take effect the next day.
After the scheduling task has been submitted, click Maintain to view the logs.
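As a rough illustration of what a daily 00:00 schedule means in practice, the following sketch computes the next run time from a given moment. It is not how DataWorks schedules tasks internally, and it does not model the rule that a submitted job only takes effect the next day; the function name `next_daily_run` and its parameters are hypothetical.

```python
from datetime import datetime, time, timedelta


def next_daily_run(now: datetime, run_at: time = time(0, 0)) -> datetime:
    """Return the first daily run time strictly after `now`.

    Hypothetical helper for illustration only.
    """
    candidate = datetime.combine(now.date(), run_at)
    # If today's slot has already passed (or is exactly now), use tomorrow's.
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


# A task checked on the afternoon of 2016-09-20 next runs at midnight.
print(next_daily_run(datetime(2016, 9, 20, 15, 30)))  # 2016-09-21 00:00:00
```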
To learn more about Alibaba Cloud Machine Learning Platform for Artificial Intelligence (PAI), visit www.alibabacloud.com/product/machine-learning