The Kafka service is supported in E-MapReduce (EMR) V3.4.0 and later.
Create a Kafka cluster
In the EMR console, create a Kafka cluster. For more information, see Create a Dataflow-Kafka cluster.
Kafka clusters with local disks
When you deploy the Kafka service on instances that use local disks, you must configure the parameters described in the following table on the Configure tab of the Kafka service in the EMR console.
Parameter | Description |
---|---|
default.replication.factor | The value is fixed at 3, which means that each topic is created with three replicas per partition by default. |
min.insync.replicas | The value is fixed at 2, which means that at least two in-sync replicas must acknowledge a write. This minimum is enforced only if the producer sets the request.required.acks parameter (named acks in newer producer clients) to all or -1. In that case, a write succeeds only if at least two replicas acknowledge it. |
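
The interaction between the two settings above can be illustrated with a small model. The following is a simplified sketch, not broker code: the function name `write_accepted` and its parameters are hypothetical, and it assumes that the minimum is checked only when the producer requests acknowledgment from all in-sync replicas.

```python
def write_accepted(acks: str, in_sync_replicas: int,
                   min_insync_replicas: int = 2) -> bool:
    """Simplified model of whether a write to one partition succeeds.

    acks                -- producer acknowledgment setting ("0", "1", "all", or "-1")
    in_sync_replicas    -- replicas currently in sync, including the leader
    min_insync_replicas -- value of min.insync.replicas (fixed at 2 here)
    """
    if in_sync_replicas < 1:
        # No leader is available, so nothing can be written.
        return False
    if acks in ("all", "-1"):
        # The minimum applies only when all in-sync replicas must acknowledge.
        return in_sync_replicas >= min_insync_replicas
    # acks=0 or acks=1: only the leader is involved, the minimum is not checked.
    return True


# With a replication factor of 3, compare 3, 2, and 1 replicas in sync.
for isr in (3, 2, 1):
    print(isr, write_accepted("all", isr), write_accepted("1", isr))
```

In this model, a topic with three replicas tolerates one out-of-sync replica for writes that use all or -1. If two replicas fall out of sync, those writes are rejected, while writes that use a weaker acknowledgment setting are still accepted.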
Parameters
You can view the configurations of the Kafka service on the Configure tab of the Kafka service in the EMR console.
Parameter | Description |
---|---|
zookeeper.connect | The ZooKeeper connection string, which specifies the hostname and port of each ZooKeeper server that the Kafka cluster connects to. |
kafka.heap.opts | The heap memory size of the Kafka broker. |
num.io.threads | The number of I/O threads of the Kafka broker. The default value is twice the number of CPU cores of the master node. |
num.network.threads | The number of network threads of the Kafka broker. The default value is the number of CPU cores of the master node. |
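
The default thread counts and the structure of the zookeeper.connect value described in the preceding table can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names `default_thread_counts` and `parse_zookeeper_connect` are hypothetical, the hostnames are placeholders, and the parser assumes the common `host:port,host:port/chroot` form with an explicit port for every entry.

```python
def default_thread_counts(cpu_cores: int) -> dict:
    """Return the default thread settings for a node with cpu_cores CPU cores."""
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    return {
        "num.io.threads": 2 * cpu_cores,   # twice the number of CPU cores
        "num.network.threads": cpu_cores,  # equal to the number of CPU cores
    }


def parse_zookeeper_connect(value: str):
    """Split a connection string into (host, port) pairs and an optional chroot path."""
    hosts_part, sep, chroot = value.partition("/")
    servers = []
    for entry in hosts_part.split(","):
        host, _, port = entry.strip().rpartition(":")
        servers.append((host, int(port)))
    return servers, ("/" + chroot if sep else "")


# Placeholder values for illustration only.
print(default_thread_counts(8))
print(parse_zookeeper_connect("zk-1:2181,zk-2:2181,zk-3:2181/kafka"))
```

Compare the computed values with those shown on the Configure tab before you change either thread setting.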