This topic describes how to use the demo project to develop a Spark on MaxCompute application in Java or Scala.
Download a demo project
Spark on MaxCompute provides a demo project template. We recommend that you download the template and use a copy of it as the starting point for your application.
Run the commands for the Spark version that you use to download and compile the demo project template:
# Download and compile the Spark 1.x template.
git clone https://github.com/aliyun/MaxCompute-Spark.git
cd MaxCompute-Spark/spark-1.x
mvn clean package
# Download and compile the Spark 2.x template.
git clone https://github.com/aliyun/MaxCompute-Spark.git
cd MaxCompute-Spark/spark-2.x
mvn clean package
Notice: In the demo project, the scope of the Spark dependency is set to provided. Do not change this setting. Otherwise, the submitted job fails to run properly.
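For reference, a Maven dependency declared with the provided scope typically looks like the following fragment. This is a minimal sketch, not a copy of the template's pom.xml: the spark.version and scala.binary.version properties are assumed placeholders, so check the pom.xml in the downloaded template for the actual artifact names and versions.

```xml
<!-- Sketch only: the property names below are assumptions, not taken from the template. -->
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_${scala.binary.version}</artifactId>
    <version>${spark.version}</version>
    <!-- provided: available at compile time, but not bundled into the packaged JAR -->
    <scope>provided</scope>
</dependency>
```

With the provided scope, Maven compiles against the Spark libraries but leaves them out of the packaged JAR, on the expectation that the runtime environment supplies them.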
Spark 1.x examples
Spark 2.x examples
The Spark 2.x demo project includes the following examples:
- WordCount example (Scala)
- Example of reading data from or writing data to a MaxCompute table (Scala)
- GraphX PageRank example (Scala)
- MLlib KMeans-ON-OSS example (Scala)
- OSS UnstructuredData example (Scala)
- SparkPi example (Scala)
- Spark Streaming LogHub example (Scala)
- Example of using Spark Streaming LogHub to write data to MaxCompute (Scala)
- Spark Streaming DataHub example (Scala)
- Example of using Spark Streaming DataHub to write data to MaxCompute (Scala)
- Spark Streaming Kafka example (Scala)
- Spark StructuredStreaming DataHub example (Scala)
- Spark StructuredStreaming Kafka example (Scala)
- Spark StructuredStreaming LogHub example (Scala)
- Example of using PySpark to read data from or write data to a MaxCompute table (Python)
- Example of using PySpark to write data to OSS (Python)
- Spark SQL example (Java)
- Example of reading data from MaxCompute and writing the data to HBase
- Examples of reading data from and writing data to OSS objects
- Example of reading data from MaxCompute and writing the data to OSS