This topic describes how to use a pipeline template provided by Machine Learning Designer of Platform for AI (PAI) to build a model for identifying users who steal electricity or are involved in electricity leakage. This significantly reduces the inspection workload of electrical inspection staff and ensures normal and safe electricity usage.
Background information
Traditional methods of identifying electricity theft, electricity leakage, and metering device failures include regular inspections, regular checks of electricity meters, and user reports of theft and leakage. These methods rely on manual work and are inefficient at identifying users who steal electricity or are involved in electricity leakage. Power supply bureaus also use an existing automated metering system to monitor electricity usage online: the system triggers alerts for abnormal electricity usage and lets staff query electricity usage data. The system collects data about abnormal electricity usage, abnormal load, alerts reported by terminals and primary sites, and abnormal line loss, which helps staff identify electricity theft, electricity leakage, and metering device failures. After an alert is triggered, the system also builds models that analyze abnormal electricity usage based on the currents, voltages, and loads before and after the alert time.
The existing automated system for metering electricity usage can monitor abnormal electricity usage. However, it cannot precisely identify users who steal electricity or are involved in electricity leakage because it produces frequent false positives and false negatives. In addition, experts must set the weight of each metric in the model based on their own knowledge and experience. Because this weighting is subjective, the results are often unsatisfactory.
The existing automated system for metering electricity usage can collect electricity load data, such as the current, voltage, and power data, and alert data that terminals report. The load data can reflect the electricity usage of users. Electrical inspection staff can collect electricity theft and leakage data from the online inspection system or by conducting on-site inspection, and then record the data in the system. Based on the preceding data, PAI can abstract key features of users who steal electricity or are involved in electricity leakage and build a model for identifying these users to automatically identify electricity theft or leakage. This significantly reduces the inspection workload of electrical inspection staff and ensures normal and safe electricity usage.
Prerequisites
A workspace is created. For more information, see Create a workspace.
MaxCompute resources are associated with the workspace. For more information, see Manage workspaces.
Datasets
The following table describes the fields in the datasets that are used in this topic.
Field | Type | Description |
power_usage_decline_level | BIGINT | The electricity usage trend. |
line_loss_rate | BIGINT | The line loss rate. |
warning_num | BIGINT | The number of alerts. |
is_theft | BIGINT | Indicates whether users steal electricity or are involved in electricity leakage. |
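The following Python sketch shows one way to represent a row of this dataset outside PAI and to separate the three feature columns from the label column. The UsageRecord class, the to_features_and_label helper, the sample values, and the encoding of the label (1 for theft or leakage, 0 for normal usage) are assumptions made for illustration and are not part of PAI.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class UsageRecord:
    """One dataset row; all four fields are BIGINT in the table above."""
    power_usage_decline_level: int  # electricity usage trend
    line_loss_rate: int             # line loss rate
    warning_num: int                # number of alerts
    is_theft: int                   # label; ASSUMED encoding: 1 = theft or leakage, 0 = normal


def to_features_and_label(record: UsageRecord) -> Tuple[List[int], int]:
    """Return the three feature values and the label of a record."""
    features = [
        record.power_usage_decline_level,
        record.line_loss_rate,
        record.warning_num,
    ]
    return features, record.is_theft


# Hypothetical sample row, invented for illustration only.
sample = UsageRecord(
    power_usage_decline_level=4,
    line_loss_rate=1,
    warning_num=2,
    is_theft=1,
)
features, label = to_features_and_label(sample)
```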
Procedure
Go to the Machine Learning Designer page.
Log on to the PAI console.
In the left-side navigation pane, click Workspaces. On the Workspaces page, click the name of the workspace that you want to manage.
In the left-side navigation pane of the workspace, go to the Visualized Modeling (Designer) page.
Create a pipeline.
On the Visualized Modeling (Designer) page, click the Preset Templates tab.
On the Preset Templates tab, find the Power Theft Identification template and click Create.
In the Create Pipeline dialog box, configure the parameters. You can use their default values.
The value specified for the Pipeline Data Path parameter is the Object Storage Service (OSS) bucket path of the temporary data and models generated during the runtime of the pipeline.
Click OK.
Creating the pipeline takes about 10 seconds.
On the Pipelines tab, double-click the Power Theft Identification pipeline to enter the pipeline.
View the components of the pipeline on the canvas as shown in the following figure. The system automatically creates the pipeline based on the preset template.
Section | Description |
① | The components in this section perform statistical analysis. The Correcoef component analyzes the impact of each feature on determining whether users steal electricity or are involved in electricity leakage. The Feature Meta Runner component visualizes the distribution of data in the feature columns and the label column. In this pipeline, the feature columns are power_usage_decline_level, line_loss_rate, and warning_num. The label column is is_theft. |
② | The Split component divides the dataset into a training dataset and a prediction dataset at a ratio of 8:2. |
③ | The Logistic Regression component trains a logistic regression model on the training dataset. In the training dataset of this pipeline, the feature columns are power_usage_decline_level, line_loss_rate, and warning_num. The label column is is_theft. |
④ | The Prediction component applies the model to the prediction dataset and generates prediction results. The Evaluate component evaluates the prediction accuracy. |
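To make sections ② to ④ more concrete, the following Python sketch reproduces the same flow outside PAI: an 8:2 split, logistic regression training, and prediction. It is a minimal illustration that uses plain batch gradient descent, not the implementation of the PAI components. The function names, the hyperparameters, the random seed, and the synthetic rows with their labeling rule are all assumptions invented for the example.

```python
import math
import random
from itertools import product
from typing import List, Sequence, Tuple

Row = Tuple[List[float], int]  # (feature values, label)


def split_rows(rows: Sequence[Row], train_fraction: float = 0.8,
               seed: int = 0) -> Tuple[List[Row], List[Row]]:
    """Shuffle the rows and split them into training and prediction sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights: Sequence[float], bias: float,
                  x: Sequence[float]) -> float:
    """Predicted probability that the label is 1."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_logistic_regression(rows: Sequence[Row], learning_rate: float = 0.1,
                              epochs: int = 500) -> Tuple[List[float], float]:
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(rows[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in rows:
            error = predict_proba(weights, bias, x) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / len(rows)
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(rows)
    return weights, bias


# Synthetic rows, invented for illustration: three integer features and a
# label that is 1 when the feature sum is at least 5. This is NOT the
# template's dataset and NOT a real rule for detecting electricity theft.
rows = [([float(a), float(b), float(c)], 1 if a + b + c >= 5 else 0)
        for a, b, c in product(range(5), range(3), range(4))]

train_rows, prediction_rows = split_rows(rows)               # section 2: 8:2 split
weights, bias = train_logistic_regression(train_rows)        # section 3: training
scores = [predict_proba(weights, bias, x) for x, _ in prediction_rows]  # section 4
```

Each value in scores is the predicted probability that the corresponding row of the prediction set has the label 1. An evaluation step compares these scores with the true labels.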
Run the pipeline and view the results.
Click the run icon in the upper part of the canvas to run the pipeline.
After you run the pipeline, right-click the Correcoef component on the canvas and select Visual Analysis.
In the Corrcoef section, view the impact of each feature on determining whether users steal electricity or are involved in electricity leakage.
None of the power_usage_decline_level, line_loss_rate, and warning_num features has an obvious correlation with is_theft on its own. This suggests that electricity theft or leakage cannot be identified from any single feature and must be determined from a combination of features.
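The following Python sketch illustrates what a correlation coefficient between a feature column and the label column measures. It assumes the Pearson correlation coefficient. The pearson function and the four toy rows are invented for the example and do not come from the pipeline's dataset. In the toy data, the label is 1 only when both features are 1, so neither feature alone correlates strongly with the label even though the two together determine it exactly.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length columns."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("columns must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


# Toy columns, invented for illustration: the label is 1 only when both
# features are 1, so the label depends on the combination of features.
feature_a = [0, 0, 1, 1]
feature_b = [0, 1, 0, 1]
label = [0, 0, 0, 1]

corr_a = pearson(feature_a, label)  # about 0.58
corr_b = pearson(feature_b, label)  # about 0.58
```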
Right-click the Evaluate component on the canvas and select Visual Analysis.
In the Evaluation Report dialog box, click the Charts tab to view the model evaluation metrics.
An AUC value that is closer to 1 indicates higher prediction accuracy of the model. In the preceding figure, the AUC value is greater than 0.8. This indicates that the prediction accuracy of the model is high.
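The following Python sketch shows one common way to compute AUC, which may help in interpreting the value in the report. It uses the pairwise definition: AUC is the fraction of (positive, negative) pairs in which the positive sample receives the higher score, with ties counted as one half. The auc function and the sample labels and scores are invented for the example. This is not necessarily how the Evaluate component computes the metric.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of positive/negative pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and predicted scores, invented for illustration.
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]
example_auc = auc(example_labels, example_scores)  # 3 of 4 pairs ranked correctly: 0.75
```

A value of 0.5 corresponds to random ranking, and a value of 1 means that every positive sample is scored above every negative sample.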
References
For more information about algorithm components, see Correlation Coefficient Matrix and Data Pivoting.
You can use Machine Learning Designer to perform other AI development tasks. For more information, see Overview of Machine Learning Designer.