Prophet is an open source time series forecasting algorithm developed by Facebook for data that follows specific patterns. The Prophet component of Platform for AI (PAI) forecasts the time series in each row of MTable data and returns the predicted values for the subsequent time period. This topic describes how to configure the Prophet component.
Limits
The Prophet component can run only on MaxCompute computing resources.
Configure the component in the PAI console
Input ports
Input port (left-to-right)
Data type
Recommended upstream component
Required
Input Data
N/A
Yes
Component parameters
Tab
Parameter
Description
Field Setting
valueCol
The data type is STRING, and the data format is MTable. You can use the MTable aggregation component to construct the data, with a column of the Datetime type as the time column. Example of MTable data:
{"data":{"ds":["2019-05-07 00:00:00.0","2019-05-08 00:00:00.0"],"val":[8588.0,8521.0]},"schema":"ds TIMESTAMP,val DOUBLE"}
reservedCols
The columns that are reserved for the algorithm.
Parameter Setting
predictionCol
The name of the prediction result column.
cap
The upper limit of the predicted value.
changepoint_prior_scale
The prior scale of the trend change points. A larger value makes the trend more flexible. Default value: 0.05.
change_point_range
The proportion of the historical data in which trend change points are estimated. Default value: 0.8.
changepoints
The list of change points. Separate multiple change points with commas (,). Example:
2021-05-02,2021-05-07
daily_seasonality
Specifies whether to fit seasonality by day. Default value: auto.
floor
The lower limit of the predicted value.
growth
The type of the trend. Valid values:
LINEAR (default value)
Logistic
Flat
holidays
The holidays, each in the name:date1,date2 format. Separate multiple holidays with spaces. Example:
playoff:2021-05-03,2021-01-03 superbowl:2021-02-07,2021-11-02
holidays_prior_scale
The parameter of the holiday model. Default value: 10.0.
include_history
Specifies whether to also return predicted values for the dates that appear in the original data.
interval_width
The width of the uncertainty interval. Default value: 0.8.
mcmc_samples
The number of samples used for Bayesian inference. The value must be an integer. If this parameter is set to 0, maximum a posteriori (MAP) estimation is performed instead. Default value: 100.
n_change_point
The maximum number of change points. Default value: 25.
predictNum
The number of future time points to predict. Valid values: (0, inf). Default value: 12.
predictionDetailCol
The name of the prediction details column.
seasonality_mode
The seasonality mode. Valid values:
ADDITIVE (default value)
MULTIPLICATIVE
seasonality_prior_scale
The parameter of the seasonality model. Default value: 10.0.
stanInit
The initial value. This parameter is empty by default.
uncertaintySamples
The number of samples used to calculate statistical metrics. Default value: 1000. If you do not need the statistical metrics and want to accelerate the prediction, set this parameter to 0.
weekly_seasonality
Specifies whether to fit seasonality by week. Default value: auto.
yearly_seasonality
Specifies whether to fit seasonality by year. Default value: auto.
numThreads
The number of threads of the component.
Execution Tuning
Number of Workers
The number of worker nodes. This parameter must be used together with the Memory per worker, unit MB parameter. The value must be a positive integer. Valid values: [1,9999].
Memory per worker, unit MB
The memory size of each worker node. Valid values: 1024 to 64 × 1024. Unit: MB.
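The string formats described in the table above (the MTable value expected by valueCol, and the changepoints and holidays parameters) can be assembled with plain Python. The following is a minimal sketch, not part of the PAI or Alink API: the helper names build_mtable_json, build_changepoints, and build_holidays are hypothetical, and the sketch assumes that the JSON layout shown in the valueCol example is the only layout you need to produce.

```python
import json


def build_mtable_json(timestamps, values):
    """Serialize one time series into the MTable JSON layout shown for valueCol."""
    if len(timestamps) != len(values):
        raise ValueError("timestamps and values must have the same length")
    payload = {
        "data": {"ds": list(timestamps), "val": [float(v) for v in values]},
        "schema": "ds TIMESTAMP,val DOUBLE",
    }
    # Compact separators reproduce the spacing used in the documented example.
    return json.dumps(payload, separators=(",", ":"))


def build_changepoints(dates):
    """Join change point dates with commas, as the changepoints parameter expects."""
    return ",".join(dates)


def build_holidays(holidays):
    """Format {name: [dates]} as 'name:date1,date2', with holidays separated by spaces."""
    return " ".join(f"{name}:{','.join(dates)}" for name, dates in holidays.items())


mtable = build_mtable_json(
    ["2019-05-07 00:00:00.0", "2019-05-08 00:00:00.0"], [8588.0, 8521.0]
)
changepoints = build_changepoints(["2021-05-02", "2021-05-07"])
holidays = build_holidays({
    "playoff": ["2021-05-03", "2021-01-03"],
    "superbowl": ["2021-02-07", "2021-11-02"],
})

print(mtable)
print(changepoints)
print(holidays)
```

The three printed strings match the examples given for valueCol, changepoints, and holidays, so they can be pasted into the corresponding fields.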
Configure the component by using code
You can copy the following code to the PyAlink Script component to make it function in the same manner as the Prophet component.
# The PyAlink Script component typically provides this import already;
# it is repeated here so that the snippet is self-contained.
from pyalink.alink import *
import datetime
import pandas as pd

# Download the Python environment plug-in used by the operator.
downloader = AlinkGlobalConfiguration.getPluginDownloader()
downloader.downloadPlugin('tf115_python_env_linux')

# Sample data: one series (id = 1) with ten consecutive observations.
data = pd.DataFrame([
    [1, datetime.datetime.fromtimestamp(1), 10.0],
    [1, datetime.datetime.fromtimestamp(2), 11.0],
    [1, datetime.datetime.fromtimestamp(3), 12.0],
    [1, datetime.datetime.fromtimestamp(4), 13.0],
    [1, datetime.datetime.fromtimestamp(5), 14.0],
    [1, datetime.datetime.fromtimestamp(6), 15.0],
    [1, datetime.datetime.fromtimestamp(7), 16.0],
    [1, datetime.datetime.fromtimestamp(8), 17.0],
    [1, datetime.datetime.fromtimestamp(9), 18.0],
    [1, datetime.datetime.fromtimestamp(10), 19.0]
])
source = dataframeToOperator(data, schemaStr='id int, ts timestamp, val double', op_type='batch')

# Step 1: aggregate each series into an MTable. Step 2: forecast four points.
# Step 3: expand the prediction details into ordinary rows.
source.link(
    GroupByBatchOp()
        .setGroupByPredicate("id")
        .setSelectClause("id, mtable_agg(ts, val) as data")
).link(
    ProphetBatchOp()
        .setValueCol("data")
        .setPredictNum(4)
        .setPredictionCol("pred")
        # Name the detail column explicitly so that the next step can select it.
        .setPredictionDetailCol("pred_detail")
).link(
    FlattenMTableBatchOp()
        .setSelectedCol("pred_detail")
        .setSchemaStr("ds timestamp, yhat double")
).print()
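The last step of the code above expands an MTable column into ordinary rows. As a rough illustration of that idea only, the following sketch parses an MTable JSON string into one dictionary per row by using the Python standard library. The function name expand_mtable is hypothetical, the sketch does not reproduce how FlattenMTableBatchOp is implemented, and it assumes the JSON layout shown in the valueCol example.

```python
import json


def expand_mtable(mtable_json):
    """Turn an MTable JSON string into a list of row dictionaries."""
    parsed = json.loads(mtable_json)
    # The schema is a comma-separated list of "name TYPE" pairs; keep the names.
    columns = [field.split()[0] for field in parsed["schema"].split(",")]
    data = parsed["data"]
    n_rows = len(data[columns[0]]) if columns else 0
    return [{col: data[col][i] for col in columns} for i in range(n_rows)]


example = (
    '{"data":{"ds":["2019-05-07 00:00:00.0","2019-05-08 00:00:00.0"],'
    '"val":[8588.0,8521.0]},"schema":"ds TIMESTAMP,val DOUBLE"}'
)
rows = expand_mtable(example)
for row in rows:
    print(row)
```

Each resulting dictionary corresponds to one timestamp of the series, which is the shape that downstream table-based components expect.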
References
You can use the MTable Expander component to expand an MTable into a table. For more information, see MTable Expander.
For information about Machine Learning Designer components, see Overview of Machine Learning Designer.
Machine Learning Designer provides various preset algorithm components. You can select a component for data processing based on your business scenarios. For more information, see Component reference: overview of all components.