
Getting Started with Kafka

MapReduce Service (MRS) provides enterprise-level big data clusters on the cloud. Tenants can fully control clusters and easily run big data components such as Hadoop, Spark, HBase, Kafka, and Storm.
This document describes how to use Kafka from scratch: buy a streaming cluster, install the Kafka client, create a Kafka topic, and then produce and consume messages in that topic.

Step 1: Buy a Cluster

① Go to the Buy Cluster page.
② Click the Custom Config tab on the cluster purchase page.

Step 2: Configure Software

① In Region, select a desired region.

② In Billing Mode, select Pay-per-use.
③ In Cluster Name, enter mrs_demo or specify a name according to naming rules.

④ In Cluster Type, select Streaming cluster.

⑤ In Version Type, select Normal.
⑥ In Cluster Version, select MRS 3.1.0.
⑦ Select all components of a streaming cluster. Use the default values for other parameters.

⑧ Click Next.

Step 3: Configure Hardware

① In AZ, select AZ2.

② In Enterprise Project, select default.

③ Use the default values for VPC and Subnet, or click View VPC to create a VPC.
④ In Security Group, use the default value Auto create.
⑤ In EIP, use the default value Bind later.
⑥ In Cluster Node, use the default values of instance specifications for Master and Core nodes. Use the default values for the node count as well as data disk type and size. Do not add Task nodes.

⑦ Click Next.

Step 4: Set Advanced Options

① Kerberos Authentication: Disable Kerberos authentication.

② Username: name of the Manager administrator. admin is used by default.

③ Set Password and Confirm Password to the password of the Manager administrator.

④ Set Login Mode to Password, and enter the password and confirm password for user root.

⑤ Retain the default value of Hostname Prefix.

⑥ Retain the default value of Set Advanced Options.

⑦ Click Next.

Step 5: Confirm Configuration

① Configure: Confirm the configuration information of the cluster to be purchased.

② Secure Communications: Select Enable.

③ Click Buy Now. A page is displayed showing that the task has been submitted.
④ Click Back to Cluster List. You can view the status of the cluster on the Active Clusters page. It takes some time to create a cluster. The initial status of the cluster is Starting. After the cluster has been created successfully, the cluster status becomes Running.

Step 6: Install the Kafka Client

① By default, the Kafka client is not automatically installed in an MRS 3.1.0 cluster, so you need to perform this step. If the Kafka client has already been installed, skip this step and go to Step 7.
② Choose Clusters > Active Clusters. On the Active Clusters page, click the cluster named mrs_demo to go to its details page.

③ Click Access Manager next to MRS Manager. On the page that is displayed, configure the EIP information and click OK. Enter the username and password to access FusionInsight Manager.

④ Choose Cluster > Services > Kafka. On the page displayed, choose More > Download Client. In the Download Cluster Client dialog box, select Complete Client for Select Client Type, select a platform type, select Save to Path, and click OK. The Kafka client software package, for example, FusionInsight_Cluster_1_Kafka_Client.tar, is downloaded.

⑤ Use WinSCP as user root to upload the obtained software package to a directory on the server where the client is to be installed, for example, /tmp/FusionInsight-Client.
The client is later installed to a separate installation directory, for example, /opt/hadoopclient. If the installation directory does not exist, it is created automatically. If it exists, it must be empty. The directory path cannot contain spaces.

⑥ Log in to the active node as user root.

⑦ Go to the directory where the software package is stored and run the following commands to decompress the software package, verify the extracted installation file, and then decompress the installation file:

cd /tmp/FusionInsight-Client

tar -xvf FusionInsight_Cluster_1_Kafka_Client.tar

sha256sum -c FusionInsight_Cluster_1_Kafka_ClientConfig.tar.sha256

tar -xvf FusionInsight_Cluster_1_Kafka_ClientConfig.tar

⑧ Go to the directory where the installation package is stored, and run the following command to install the client to a specified directory (absolute path), for example, /opt/hadoopclient:

cd /tmp/FusionInsight-Client/FusionInsight_Cluster_1_Kafka_ClientConfig

Run the ./install.sh /opt/hadoopclient command and wait until the client installation is complete.

⑨ Check whether the client is installed.

cd /opt/hadoopclient

source bigdata_env

Run the klist command to query and confirm authentication details. If the command is executed, the Kafka client is installed.
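
The installation directory rules stated in this step (the path is absolute, contains no spaces, and the directory is either missing or empty) can be checked before install.sh is run. The following is a minimal sketch and not part of the MRS client package; the function name check_install_dir is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: checks that a directory is a valid client
# installation target according to the rules stated in this guide.
check_install_dir() {
    local dir="$1"
    # The installation path must be absolute.
    case "$dir" in
        /*) ;;
        *) echo "not an absolute path: $dir" >&2; return 1 ;;
    esac
    # The installation path cannot contain spaces.
    case "$dir" in
        *" "*) echo "path contains spaces: $dir" >&2; return 1 ;;
    esac
    # A missing directory is acceptable: the installer creates it.
    if [ ! -e "$dir" ]; then
        return 0
    fi
    # An existing path must be an empty directory.
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    if [ -n "$(ls -A "$dir")" ]; then
        echo "directory is not empty: $dir" >&2
        return 1
    fi
    return 0
}
```

One possible use is to run check_install_dir /opt/hadoopclient && ./install.sh /opt/hadoopclient from the directory that contains install.sh, so that the installer starts only when the target directory passes the checks.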

 

Step 7: Log In to the Master Node Using VNC

① In the MRS management console, choose Clusters > Active Clusters. In the cluster list, click mrs_demo to switch to the cluster details page. On the Nodes tab page, locate the ECS whose Type is Master1, and click the node name to switch to the ECS details page.
② Click Remote Login to remotely log in to the Master node by using user root and the password specified during cluster creation.

Step 8: Use the Kafka Client to Create a Topic

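① Configure environment variables. The following command assumes that the client is located in /opt/client. If you installed the client to another directory in Step 6, for example, /opt/hadoopclient, use that directory here and in the later commands that reference /opt/client.
source /opt/client/bigdata_env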
② If Kerberos authentication is enabled for the cluster, run the kinit admin command to authenticate the user and enter the password of user admin set during cluster creation as prompted. If Kerberos authentication is disabled for the cluster, skip this step.

③ In the MRS management console, choose Clusters > Active Clusters. In the cluster list, click mrs_demo to switch to the cluster details page.
④ Choose Components > ZooKeeper > Instances to view the IP addresses of the ZooKeeper instances.
Record the IP address of any ZooKeeper instance, for example, 192.168.0.237.
⑤ Run the following command to create a Kafka topic: 
kafka-topics.sh --create --zookeeper <IP address of the node where the ZooKeeper instance is located:2181/kafka> --partitions 2 --replication-factor 2 --topic <Topic name>
The figure on the right shows how to create a topic named test.
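
As a sketch of the topic creation command, the command line can be assembled from the recorded ZooKeeper IP address and a topic name. The function name topic_create_cmd is hypothetical; the port 2181, the /kafka path, and the partition and replication values are taken from the command above. The function only prints the command so that it can be reviewed before it is run.

```shell
#!/bin/bash
# Hypothetical helper: prints the topic creation command from this step
# for a given ZooKeeper IP address and topic name. It does not run it.
topic_create_cmd() {
    local zk_ip="$1"
    local topic="$2"
    if [ -z "$zk_ip" ] || [ -z "$topic" ]; then
        echo "usage: topic_create_cmd <zookeeper_ip> <topic>" >&2
        return 1
    fi
    echo "kafka-topics.sh --create --zookeeper ${zk_ip}:2181/kafka" \
         "--partitions 2 --replication-factor 2 --topic ${topic}"
}

# Example with the values used in this guide.
topic_create_cmd 192.168.0.237 test
```

After the client environment variables have been loaded, the printed line can be copied into the terminal and run as is.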

Step 9: Manage Messages in the Topic

① Choose Components > Kafka > Instances to view the IP addresses of the Kafka instances.
Record the IP address of any Kafka instance, for example, 192.168.0.237.
② Produce messages to the topic test.
Run the following command:
kafka-console-producer.sh --broker-list <IP address of the node where the Kafka instance is located:9092> --topic <Topic name> --producer.config /opt/client/Kafka/kafka/config/producer.properties
Then type the message content and press Enter to send each message. To stop producing messages, press Ctrl+C.
③ Consume messages from the topic test.
Run the following command:
kafka-console-consumer.sh --topic <Topic name> --bootstrap-server <IP address of the node where the Kafka instance is located:9092> --new-consumer --consumer.config /opt/client/Kafka/kafka/config/consumer.properties

Note: If Kerberos authentication is enabled in the cluster, change the port number 9092 to 21007 when running the preceding two commands. For details, see List of Open Source Component Ports.
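
The port rule in the note can be sketched as a small helper that chooses 9092 or 21007 and fills it into the producer command. The function names kafka_port and producer_cmd are hypothetical, and the configuration file path is the one used in the command above. The helper only prints the command and does not run it.

```shell
#!/bin/bash
# Hypothetical helper: prints the Kafka port described in the note.
# Pass "yes" if Kerberos authentication is enabled, anything else otherwise.
kafka_port() {
    if [ "$1" = "yes" ]; then
        echo 21007
    else
        echo 9092
    fi
}

# Hypothetical helper: prints the console producer command for a broker IP
# address, a topic name, and the Kerberos setting ("yes" or "no").
producer_cmd() {
    local broker_ip="$1"
    local topic="$2"
    local kerberos="$3"
    echo "kafka-console-producer.sh --broker-list ${broker_ip}:$(kafka_port "$kerberos")" \
         "--topic ${topic}" \
         "--producer.config /opt/client/Kafka/kafka/config/producer.properties"
}

# Example for the cluster in this guide, where Kerberos is disabled.
producer_cmd 192.168.0.237 test no
```

The same kafka_port helper could be used for the consumer command, which takes the address through --bootstrap-server instead of --broker-list.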

 
