By Hitesh Jethva, Alibaba Cloud Tech Share Author. Tech Share is Alibaba Cloud’s incentive program to encourage the sharing of technical knowledge and best practices within the cloud community.
Apache Cassandra is a free and open source NoSQL database management system designed to store large amounts of data across many servers in a decentralized cluster, providing high availability with no single point of failure. Its data model is inspired by Google Bigtable, and it was originally developed at Facebook to power the Facebook inbox search feature. It differs sharply from relational database management systems.
Features
In this tutorial, we will install and configure a single-node Apache Cassandra instance on Ubuntu 16.04, running on an Alibaba Cloud Elastic Compute Service (ECS) instance.
First, log in to your Alibaba Cloud ECS Console (https://ecs.console.aliyun.com). Create a new ECS instance, choosing Ubuntu 16.04 as the operating system with at least 2 GB of RAM. Connect to your ECS instance and log in as the root user.
Once you are logged in to your Ubuntu 16.04 instance, run the following commands to refresh the package index and upgrade the installed packages to their latest available versions.
apt-get update -y
apt-get upgrade -y
Apache Cassandra is a cross-platform application written in Java, so you will need a Java runtime on your server; Cassandra 3.x requires Java 8. Oracle Java 8 is not available in the default Ubuntu 16.04 repository, so this tutorial adds the WebUpd8 PPA for it. Note that this PPA has since been discontinued, so on current systems the openjdk-8-jdk package from the default repository is the practical alternative.
You can add the repository by running the following commands:
apt-get install python-software-properties -y
add-apt-repository ppa:webupd8team/java -y
Next, update the package index and install Java with the following commands:
apt-get update -y
apt-get install oracle-java8-installer -y
Once Java is installed, check the Java version with the following command:
java -version
Output:
java version "1.8.0_171"
Java(TM) SE Runtime Environment (build 1.8.0_171-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.171-b11, mixed mode)
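The version check above can also be scripted, which is handy when provisioning several servers. The following is a minimal sketch, assuming the banner format shown in the sample output; the function name is_java8 is hypothetical and used only for illustration.

```shell
#!/bin/sh
# Return success if the given `java -version` banner reports Java 8 (1.8.x).
# The banner format is taken from the sample output above.
is_java8() {
    case "$1" in
        *'version "1.8.'*) return 0 ;;
        *) return 1 ;;
    esac
}

# Use the real banner when java is installed; otherwise fall back to the
# sample line so the sketch still runs on a machine without Java.
# Note: `java -version` writes its banner to stderr, hence 2>&1.
if command -v java >/dev/null 2>&1; then
    banner=$(java -version 2>&1 | head -n 1)
else
    banner='java version "1.8.0_171"'
fi

if is_java8 "$banner"; then
    echo "Java 8 detected: $banner"
else
    echo "Java 8 not detected: $banner" >&2
fi
```

The pattern only looks at the quoted version string, so it should accept both Oracle and OpenJDK banners that start with 1.8.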
By default, Apache Cassandra is not available in the Ubuntu 16.04 repository, so you will need to add the Apache Software Foundation repository to your server.
First, add the repository with the following command:
echo "deb http://www.apache.org/dist/cassandra/debian 36x main" | tee -a /etc/apt/sources.list.d/cassandra.list
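Because the command above uses tee -a, running it a second time appends a duplicate entry to the list file. A minimal sketch of an idempotent alternative follows; the helper name add_line_once is hypothetical, and the demonstration writes to a temporary file rather than the real sources list.

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
# add_line_once is a hypothetical helper name, not part of apt.
add_line_once() {
    line=$1
    file=$2
    # -q quiet, -x match the whole line, -F treat the pattern as a fixed string
    if [ -f "$file" ] && grep -qxF -- "$line" "$file"; then
        return 0
    fi
    printf '%s\n' "$line" >> "$file"
}

# Demonstrate on a temporary file rather than the real sources list.
demo_file=$(mktemp)
repo_line='deb http://www.apache.org/dist/cassandra/debian 36x main'
add_line_once "$repo_line" "$demo_file"
add_line_once "$repo_line" "$demo_file"   # second call is a no-op
cat "$demo_file"
```

On the server, the same helper could be pointed at /etc/apt/sources.list.d/cassandra.list instead of the temporary file.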
Next, add the public key for the Cassandra repository with the following command:
curl https://www.apache.org/dist/cassandra/KEYS | apt-key add -
Next, update the package index and install Cassandra using the following commands:
apt-get update -y
apt-get install cassandra -y
Once Cassandra is installed, start the Cassandra service and enable it to start at boot with the following commands:
systemctl start cassandra
systemctl enable cassandra
You can check the status of Cassandra with the following command:
systemctl status cassandra
You should see the following output:
cassandra.service - LSB: distributed storage system for structured data
Loaded: loaded (/etc/init.d/cassandra; bad; vendor preset: enabled)
Active: active (running) since Sun 2018-07-08 17:02:50 IST; 15s ago
Docs: man:systemd-sysv-generator(8)
CGroup: /system.slice/cassandra.service
├─6617 /bin/sh /usr/sbin/cassandra -p /var/run/cassandra/cassandra.pid -H /var/lib/cassandra/java_1531049570.hprof -E /var/lib/cassa
├─6817 java -cp /etc/cassandra:/usr/share/cassandra/lib/ST4-4.0.8.jar:/usr/share/cassandra/lib/airline-0.6.jar:/usr/share/cassandra/
└─6818 grep -q Error: Exception thrown by the agent : java.lang.NullPointerException
Jul 08 17:02:50 Node1 systemd[1]: Starting LSB: distributed storage system for structured data...
Jul 08 17:02:50 Node1 systemd[1]: Started LSB: distributed storage system for structured data.
Jul 08 17:03:00 Node1 systemd[1]: Started LSB: distributed storage system for structured data.
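The service can report active (running) while the Cassandra JVM is still starting up, so commands issued right away may fail for a short while. The sketch below shows one way to retry a check until it succeeds; wait_until is a hypothetical helper name, and the timeout values are assumptions rather than recommended settings.

```shell
#!/bin/sh
# Retry a command until it succeeds or a timeout (in seconds) expires.
# wait_until is a hypothetical helper name.
# Usage: wait_until TIMEOUT INTERVAL CMD [ARGS...]
wait_until() {
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    while ! "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}

# On the server you might call: wait_until 120 5 nodetool status
# Here a trivially successful command stands in so the sketch runs anywhere.
if wait_until 5 1 true; then
    echo "ready"
else
    echo "timed out" >&2
fi
```

Passing the check as a command keeps the helper generic: the same loop can wrap nodetool, cqlsh, or any other probe that exits with a non-zero status on failure.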
Apache Cassandra is now installed, so it's time to verify the Cassandra cluster. You can check it using nodetool:
nodetool status
You should see the following output:
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 127.0.0.1 112.09 KiB 256 100.0% a4539bba-2394-4cff-b85d-d9c0dd564b0a rack1
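In the output above, the two-letter code at the start of each node line combines the status (U or D) with the state (N, L, J or M), so UN means Up and Normal. A minimal sketch of checking this automatically is shown below; the helper name all_nodes_up is hypothetical, and the sample line is copied from the output above.

```shell
#!/bin/sh
# Read `nodetool status` output on stdin and succeed only if at least one
# node line exists and every node line starts with UN (Up/Normal).
# all_nodes_up is a hypothetical helper name.
all_nodes_up() {
    awk '
        # Node lines start with a two-letter code: U or D, then N, L, J or M.
        /^[UD][NLJM][ \t]/ {
            total++
            if ($1 != "UN") bad++
        }
        END {
            if (total > 0 && bad == 0) exit 0
            exit 1
        }
    '
}

# Sample line copied from the output above; on the server, pipe the real
# command instead: nodetool status | all_nodes_up
sample='UN 127.0.0.1 112.09 KiB 256 100.0% a4539bba-2394-4cff-b85d-d9c0dd564b0a rack1'
if printf '%s\n' "$sample" | all_nodes_up; then
    echo "all nodes are Up/Normal"
else
    echo "some nodes are not Up/Normal" >&2
fi
```

Header lines such as "Datacenter:" or "Status=Up/Down" do not match the two-letter pattern, so only node rows are counted.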
Cassandra comes with a built-in command line interface tool, cqlsh. Before using cqlsh, you will need to install the Cassandra Python driver on your system and tell cqlsh to use it instead of its bundled copy. You can do this with the following commands:
apt-get install python-pip -y
pip install cassandra-driver
export CQLSH_NO_BUNDLED=true
Now you can connect to the Cassandra cluster using the following command:
cqlsh
After connecting to the Cassandra cluster, you should see the following output:
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.6 | CQL spec 3.4.2 | Native protocol v4]
Use HELP for help.
cqlsh>
Cassandra is now installed, so it's time to use it.
Let's create a test keyspace, which is Cassandra's equivalent of a database. First, connect to the Cassandra cluster using the following command:
cqlsh
Next, create a test keyspace named testdb:
cqlsh> CREATE KEYSPACE testdb WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
Next, use the keyspace testdb:
cqlsh> use testdb;
Next, create a table named mybooks:
cqlsh:testdb> CREATE TABLE mybooks (id int PRIMARY KEY, title text, year text);
Next, describe the table using the following command:
cqlsh:testdb> DESC mybooks;
Output:
CREATE TABLE testdb.mybooks (
id int PRIMARY KEY,
title text,
year text
) WITH bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
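The interactive steps above can also be collected into a file and replayed with cqlsh -f. The following is a minimal sketch under a few assumptions: the IF NOT EXISTS clauses are added so the script can be re-run, and the inserted row is hypothetical sample data that does not appear in the original steps.

```shell
#!/bin/sh
# Write the CQL statements from this section to a file so they can be
# replayed with `cqlsh -f`. IF NOT EXISTS makes the script safe to re-run.
cql_file=$(mktemp)
cat > "$cql_file" <<'EOF'
CREATE KEYSPACE IF NOT EXISTS testdb
  WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
USE testdb;
CREATE TABLE IF NOT EXISTS mybooks (id int PRIMARY KEY, title text, year text);
-- Hypothetical sample row, not from the original tutorial.
INSERT INTO mybooks (id, title, year) VALUES (1, 'Sample Title', '2018');
SELECT id, title, year FROM mybooks;
EOF

# Run the file only where cqlsh is installed; elsewhere just print it.
if command -v cqlsh >/dev/null 2>&1; then
    cqlsh -f "$cql_file"
else
    cat "$cql_file"
fi
```

Keeping the statements in a file makes the schema easy to review and to apply again on a fresh node.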
You can combine your newly deployed Cassandra database with Alibaba Cloud products for big data development.
ECS Bare Metal Instance is based on next-generation virtualization technology independently developed by Alibaba Cloud, featuring both the elasticity of a virtual server and the high performance and comprehensive features of a physical server.
Super Computing Cluster, based on Elastic Bare Metal (EBM) instances and high-speed RDMA (Remote Direct Memory Access) interconnects, provides high-performance parallel computing cluster services.