By Alex Mungai Muchiri, Alibaba Cloud Tech Share Author. Tech Share is Alibaba Cloud's incentive program to encourage the sharing of technical knowledge and best practices within the cloud community.
Database management systems play a much wider role in today's technology applications. The demands from internet of things (IoT), artificial intelligence (AI), and machine learning means that they have to go beyond the traditional role of managing data. Data is everywhere and it is shaping innovation in ways not previously thought possible. The use of sensors to achieve automation requires real-time database response. CrateDB serves a very useful role in IoT with its real-time SQL data management. In this first part of a two-series article, we explore the basics on how to use CrateDB with Python on an Alibaba Cloud Elastic Compute Service instance.
The old-fashioned database management system (DBMS) is not as old-fashioned as some might think. It has over the years been customized to meet data security, transformation, consistency and other useful features. It is also easy to maintain such systems since the advent of cloud computing has greatly removed challenges associated with them. Presently, reduced memory costs, increased storage and processing power helps in creating real-time ecosystems around the good old DBMS. Real-time data exchange is what makes CrateDB such an efficient IoT backend since 2013.
The Alibaba Cloud Elastic Compute Service is a fast and efficient platform to run your cloud applications. It scales better than traditional physical servers and costs a fraction of traditional deployments. Your IoT project can benefit immensely from the ECS plans. With the most advanced CPUs, your applications respond fast and are protected from DDoS and Trojans.
We are going to set up our database in an Ubuntu environment. Follow these instructions to setup your Ubuntu 16.04 server on Alibaba Cloud ECS.
To install your CrateDB Python Client, you should preferably have Python 3.4. Nonetheless, some older versions of Python are still supported. You should also have Pip installed in your system.
CrateDB's Python Client is a PyPI package installed by running the command line below on your Ubuntu 16.04 system:
sh$ pip install crate
Import the library after your package installs successfully by running the line below:
>>> from crate import client
Python is an interactive language, providing an interactive language shell or REPL. It come in handy when experimenting with new libraries or trying new code. In this tutorial, our recommendation is iPython, which is installed using the following command:
sh$ pip install iPython
Start it up is as shown below:
sh$ ipython
After you are done, import the Python Client libabry. You can attempt a variety of operations and familiarize yourself with iPython's functions by calling the help ()
function. Notably, you should be able to access it via any object.
The official PyPI package should be handled well within the various methods of handling Python dependencies. The official Python website lists many ways of doing so.
You can connect to a single node using the function connect ()
provided by CrateDB. Use the command below:
>>> connection = client.connect("<NODE_URL>", username="<USERNAME>")
Please note that is a link to a HTTP endpoint hosting CrateDB. If you are using authentication, replace the with the actual authentication username.
Also, the method above only works with later versions of CrateDB. If you use a previous version, you may want to check the documentation for the authentication methods in use. Some sample URLs as shown:
http://localhost:4200/
http://crate-1.vm.url.com:4200/
http://198.151.100.1:4200/
If you have hosted CrateDB on crate-1.vm.url.com:4200/
and listen to requests via port 4200, your node would look like http://crate-1.vm.url.com:4200/
The library automatically connects to a local host if a URL was not provided, with the default port being 4200 i.e. http://localhost:4200/
.
You will need to include all URLs of your nodes to connect to multiple ports. Use the format indicated
>>> connection = client.connect(["<NODE_1_URL>", "<NODE_2_URL>"], ...)
In this case, <NODE_1_URL>
and <NODE_1_URL>
represent actual URLs hosting your CrateDB as has been described previously. The number of nodes you can have are limitless. You should also be aware that the client attempts to connect to all nodes sequentially until a connection is established. After an unsuccessful attempt, a node is moved to the back of the list. When multiple queries are submitted at once, the method acts like a load balancer by distributing the requests to the nodes evenly.
The https
specification is one of the methods that is available for connecting to CrateDB. You will need to specify it in the provided URL:
>>> connection = client.connect('https://localhost:4200/', ...)
However, you must enforce SSL certificate verification since the default setting is unverified. If you need more information about setting up SSL on your database, follow this reference available on crate.io.
To enable server SSL validation, you will need to enable verification by setting the verify_ssl_cert
to True
. Before accessing that function, specify the certificate authority signing the SSL using the ca_cert
argument. Below is one way of attaining this objective:
>>> connection = client.connect(..., ca_cert="<CA_CERT_FILE>", verify_ssl_cert=True)
Remember that the <CA_CERT_FILE>
should be replaced with the path to the CA certificate.
Clients certificates validate SSL certificates at the client level. Use the command below to enable verification:
>>> connection = client.connect(..., cert_file="<CERT_FILE>", key_file="<KEY_FILE>")
You will need to include a path to the client certificate in the <CERT_FILE>
section. The <KEY_FILE>
points to the client private key file path. The most secure validation combines both client and server validation. Use both methods at once is that is your objective.
You can set the connection timeout
using in seconds using the timeout argument as shown:
>>> connection = client.connect(..., timeout=5)
Your other arguments fit into the ...
space. The client uses the Python socket timeout as default timeout if time is not specified.
You can have your server trace errors if connection attempts fail using the error_trace
argument; set it to True
.
>>> connection = client.connect(..., error_trace=True)
CrateDB introduced secure authentication in version 2.1.x. it is mandatory to supply the username in all later versions. To authenticate, use the command below:
>>> connection = client.connect(..., username="<USERNAME>", password="<PASSWORD>")
Remember to replace <USERNAME>
and <PASSWORD>
with the appropriate credentials to access the database. The default username for the CrateDB superuser is crate
and the password password
.
Use the schema
argument to select the desired schema using the argument below:
>>> connection = client.connect(..., schema="<SCHEMA>")
Don't forget to replace ...
the with your other arguments, and <SCHEMA>
with the chosen schema. Crate uses doc
as the default schema is no other schema is specified.
We are done with the initial setup, now let us move to make simple queries.
The CrateDB database API client uses a relatively familiar method to query the database. Furthermore, SQLAlchemy is very useful for this project, this link contains more information about the SQLAlchemy dialect.
You can use a database cursor to query CrateDB after establishing a connection. A sample cursor operation is as indicated:
>>> cursor = connection.cursor()
CrateDB allows both regular and bulk inserts for data.
The execute ()
method is used for regular data inserts. Execute using the command below:
>>> cursor.execute(
... """INSERT INTO locations (name, date, kind, position)
... VALUES (?, ?, ?, ?)""",
... ("Robert Cross", "2008-03-11", "Quasar", 7))
Keep in mind that you can have values of subsequent arguments interpolated into the string. To accomplish that, organize them as the ?
characters have been arranged. For instance, you could have the following values:
("Robert Cross", "2008-03-11", "Quasar", 7))
Building query strings using concatenation is never a good idea, it is always best to avoid that. Using the string interpolation of the client library guards against malicious input.
You can execute a bulk insert using the executemany()
method of the client library. It requires a list of second arguments as shown in the example below:
>>> cursor.executemany(
... """INSERT INTO locations (name, date, kind, position)
... VALUES (?, ?, ?, ?)""",
... [('Meiji', '2008-03-11', 'Quasar', 7),
... ('Napoleon', '2008-03-11', 'Quasar', 7)])
[{'rowcount': 1}, {'rowcount': 1}]
A result dictionary is returned by the method for each tuple indicating the number of rows that were inserted. The rowcount
value returned is -2
in the event of an error.
In the second part of this tutorial, we shall delve further into more sophisticated operations in the Python implementation for CrateDB.
CrateDB uses a variety of SQL, NoSQL and container technologies in its implementation and architecture. In this article, we have evaluated the basic methods of connecting a Python client and querying the database. IoT applications can benefit from the broad support of the Python Language.
We will be following through with further information on using CrateDB with the Alibaba Cloud Elastic Compute Service service.
Don't have an Alibaba Cloud account yet? Sign up for an account and try over 40 products for free worth up to $1200. Get Started with Alibaba Cloud to learn more.
Alex - September 18, 2018
Alex - July 26, 2018
Alibaba Clouder - May 28, 2018
Alibaba Clouder - August 2, 2018
Alibaba Container Service - August 30, 2024
ApsaraDB - October 14, 2021
Provides secure and reliable communication between devices and the IoT Platform which allows you to manage a large number of devices on a single IoT Platform.
Learn MoreA cloud solution for smart technology providers to quickly build stable, cost-efficient, and reliable ubiquitous platforms
Learn MoreMigrate your Internet Data Center’s (IDC) Internet gateway to the cloud securely through Alibaba Cloud’s high-quality Internet bandwidth and premium Mainland China route.
Learn MoreAlibaba Cloud (in partnership with Whale Cloud) helps telcos build an all-in-one telecommunication and digital lifestyle platform based on DingTalk.
Learn MoreMore Posts by Alex