By Alex Mungai Muchiri, Alibaba Cloud Tech Share Author. Tech Share is Alibaba Cloud's incentive program to encourage the sharing of technical knowledge and best practices within the cloud community.
The Internet of Things is on the rise, ushering in a new possibility in device interconnectivity. Prior to the advent of the concept, using data to control vehicles, machines, home appliances and even entire cities seemed like a dream. In the present, however, IoT applications store and analyze data collected by millions of embedded sensors to enable autonomous operations of physical systems. Real-time response in IoT is critical for smooth systems operations and this requires a database to do the following:
The sheer scale of data volumes and transaction intensity, as well as the complexity of analysis, integration, and machine learning have placed IoT developers up against the wall. Thankfully, CrateDB was designed to handle IoT complexity with ease. In this tutorial, we are going to look at how to deploy CrateDB for your application on an Alibaba Cloud Elastic Compute Service (ECS) server running on Ubuntu 16.04.
Before we begin with the tutorial, here are some important facts about CrateDB and its ability to act as a backbone to support demanding IoT workloads:
CrateDB is based on a shared-nothing architecture. In the setup, a cluster of synchronized nodes coordinate to execute read and write operations in a distributed manner. Notably, since operations are distributing across the cluster, database capacity depends on the number of cluster nodes. The database also incorporates automated sharding to improve fault tolerance and workload balancing. The final attribute about its architecture is its native containerization using Docker or Kubernetes. The figure below demonstrates CrateDB architectural configuration:
(Source: crate.io)
There are CrateDB versions for Debian GNU/Linux, Ubuntu, and RedHat Linux. However, you don't need to have the specific package if you choose One-Step Setup. The assumption is that your Alibaba ECS runs on either of the three, in which case, run the following command to install CrateDB:
sh$ bash -c "$(curl -L install.crate.io)"
The command also takes care of Java 8 installation and other housecleaning tasks.
In this step, we are going to import some test data to experiment with our CrateDB. You should be able to access the web administration user interface (UI) on your server by now. We shall be using it to import test data. Use the http://localhost:4200/ URL to access the interface on your server browser. Replace "localhost" with "CrateDB" if the first use case doesn't work.
We are now going to import some tweets using the Help tab on the admin UI, accessible on your left.
Follow the instructions after selecting Import Tweets for Testing. Try to settle for about 1000 tweets after which you should select the Tables option on the navigation menu. It should lead to the interface below:
The next phase is learning how to query CrateDB using the query console on the admin interface. Access the console interface from the Console tab on the navigation menu
We are going to use this screen to perform query operations of previously imported tweets. Use the Tables tab if you want to see the schema layout. In this example we are going to execute a simple filter query from the tweets collected in our database using the query below:
SELECT *
FROM tweets
WHERE account_user['followers_count'] > 100
LIMIT 100;
The output is tweets from users with more than 100 followers. Once typed into the query console, the operation queries the attributes of the object account_user
and selects those whose followers_count
is more than 100.
The screenshot below shows the output of the query:
CrateDB supports a wide range of query capabilities such as full-text search, distributed aggregation, as well as more complex scalar functions and data analysis. There's a whole range of capabilities included in the CrateDB query reference. Below are three other ways to query CrateDB
This command-line comes with CrateDB. Get it started using the following command to connect CrateDB running on your host.
sh$ ./bin/crash
Executing queries on CrateDB is as simple as typing the queries and hitting enter as shown in the example below:
SELECT account_user['followers_count']
FROM tweets
ORDER BY account_user['followers_count'] DESC
LIMIT 10;
CrateDB Shell has an auto-completion feature as you would notice when using this feature. The response would be something like this:
+---------------------------------+
| account_user['followers_count'] |
+---------------------------------+
| 1416583 |
| 1076527 |
| 1025614 |
| 886577 |
| 854054 |
| 818439 |
| 795778 |
| 761014 |
| 740071 |
| 673962 |
+---------------------------------+
SELECT 10 rows in set (0.003 sec)
The CrateDB HTTP API can also run queries if you are familiar with the method.
Example
SELECT COUNT(*) FROM tweets
We shall use a query to count the number of tweets in our database using HTTPie on localhost:4200
. Execute the function below:
sh$ http localhost:4200/_sql stmt="SELECT COUNT(*) FROM tweets"
The JSON response should be as follows:
HTTP/1.1 200 OK
content-length: 71
content-type: application/json; charset=UTF-8
{
"cols": [
"count(*)"
],
"duration": 11.847271,
"rowcount": 1,
"rows": [
[
3879
]
]
}
This response indicates that we have 3879 counts of one row and one column.
While CrateDB only ships with admin UI and Crash, there are many other tools that can be used on CrateDB as listed on this blog: clients category.
Alright, let's get started with the PDO driver. In this phase, we assume that you already have Alibaba Cloud ECS up and running on Ubuntu 16. Ensure that you have a Composer to get started on the project.
Include the PDO driver package in the composer.json file as shown below:
{
"require": {
"crate/crate-pdo":"~1.0.0"
}
}
Install the package like so:
sh$ composer install
In your PHP application, use the line below to require the autoload.php file in the Composer.
require DIR . '/vendor/autoload.php';
Now we can connect to CrateDB after this task
Noteworthy, PHP PDO uses Data Source Name (DSN) strings as shown by this basic version
crate::
Just remember to replace <HOST_ADDR>
with the host address of your Alibaba Cloud ECS where it is installed and <PORT>
with a valid HTTP endpoint port number. Below are some sample DSN strings
crate:localhost:4200
crate:crate-1.vm.example.com:4200
crate:198.51.100.1:4200
If you have a cluster of nodes, just remember to use a number to specify the respective node as shown:
crate::,:
There's no limit to the number of nodes you can have on your cluster, but ensure all nodes are separated using the comma (,) character.
Your client will attempt to connect to each of your cluster's nodes until one of them connects, with each node being loved to the last position after an unsuccessful connection attempt. CrateDB is able to achieve some sort of round-robin load balancing using this behavior.
Now, if you want to connect to a specific schema, specify it as below:
crate::/
The <SCHEMA>
should be the name of a schema in your database, if you don't, the default 'doc' will be used. Based on this requirement, we can have the following DNS strings
crate:localhost:4200/iot_schema
crate:crate-1.vm.url.com:4200,crate-2.vm.url.com:4200
crate:198.161.100.1:4200,198.161.100.2:4200/my_schema
Now we are ready to get connected. The setup below will get your PDO connection up and running
use Crate\PDO\PDO as PDO;
$dsn = '<DATA_SOURCE_NAME>';
$user = 'crate';
$password = null;
$options = null;
$connection = new PDO($dsn, $user, $password, $options);
CrateDB 2.1.x supports authentication, but earlier models do not require that you supply a username and password. Furthermore, CrateDB superuser does not require a password, which means that you don't need the argument above. However, that should only apply if you have not configured a custom database user.
Use the commands below to configure the settings once the connection is established
$connection->setAttribute(, );
The <ATTRIBUTE>
section requires a PDO attribute class constant, as indicated in this setAttribute guide.
The <VALUE>
is a value for the corresponding class constant, something like this:
PDO::ATTR_TIMEOUT
The attribute above represents timeout in seconds, whereby the HTTP connection drops after that time elapses. The default is 0.
There are several attributes associated with the PDO driver such as,
PDO::CRATE_ATTR_DEFAULT_SCHEMA (string)
It is the default PDO connection schema. Nonetheless, it is possible to query any schema in the database.
Under the SSL mode, you can specify the following:
PDO::CRATE_ATTR_SSL_MODE (int) (named attribute)
Disabling the SSL mode:
CRATE_ATTR_SSL_MODE_DISABLED (default)
Disable requirement for verification in SSL mode:
CRATE_ATTR_SSL_MODE_ENABLED_BUT_WITHOUT_HOST_VERIFICATION
Perform verification mode in SSL mode:
CRATE_ATTR_SSL_MODE_REQUIRED
Link up to client key file
PDO::CRATE_ATTR_SSL_KEY_PATH (string)
SSL client password
PDO::CRATE_ATTR_SSL_KEY_PASSWORD (string)
SSL certificate path
PDO::CRATE_ATTR_SSL_CERT_PATH (string)
File with SSL certificate password
PDO::CRATE_ATTR_SSL_CERT_PASSWORD (string)
SSL Certificate Authority (CA) file
PDO::CRATE_ATTR_SSL_CA_PATH (string)
There is a whole range of operations to use for setting up SSL certificates in this reference.
There are a number of supported fetch modes available for CrateDB PDO:
And that's how you set up CrateDB using PHP PDO on Alibaba cloud running on Ubuntu or any Linux variant. You can find further instructions here in case you are stuck (http://www.php.net/manual/en/intro.pdo.php ).
There are a few other things you may want to know about why to use CrateDB.
Almost everyone knows how to work with SQL, and that makes CrateDB a powerful and easy to integrate within your IoT application. The database is also very compatible with most tools used in other SQL implementations such as PostgreSQL as well as REST interfaces.
The combination of SQL and NoSQL makes CrateDB a familiar yet highly scalable database. The developers achieved this by using distributed SQL and opensource NoSQL technologies. Specifically, the following technologies are used in CrateDB:
Flexible schemas are a big advantage of working with CrateDB. Unlike rigid traditional schemas, CrateDB stores relationships in JSON documents, which can easily change the structure. For instance, if we were to use 200 sensors in our IoT application, CrateDB stores all that data in one table, which is 40 times faster to work with compared to a traditional SQL model.
An IoT system handles millions of data inputs from machines. CrateDB can write tens of thousands of inputs to each node every second and still handle data queries concurrently. Inserts are consistent and non-blocking, and amazing still, the issue of data durability and consistency has been addressed as well. The write-ahead logging handles durability while optimistic concurrency control ensures consistency of data. Furthermore, a refresh frequency setting at table-level enforces data consistency periodically.
CrateDB uses in-memory columnar indexing to achieve real-time responses. On each node, there is a memory resident column alleviates the data limitations associated with real-time databases. It is thus easy to use cached data to determine if there are any records meeting the query criteria on a specific node. The distribution of querying ensures fast performance as well.
CrateDB is written in Java to take care of distributed processing needs in a typical IoT project. You can run it on data centers, or at the edge before channeling data to a central cloud platform.
In summary, CrateDB is one of the most potent IoT databases available right now. In this tutorial, we have looked into how to use it with PHP PDO and evaluated some of its characteristics. However, at the moment, CrateDB does not support direct device-to-database integration. Nonetheless, the database should handle nearly all your IoT project requirements with ease.
Do you have an Alibaba Cloud account yet? Sign up for an account and try over 40 products for free worth up to $1200. Get Started with Alibaba Cloud to learn more.
Alex - September 18, 2018
Alex - September 18, 2018
francisndungu - August 22, 2021
Alibaba Clouder - February 27, 2019
Alibaba Clouder - April 28, 2018
H Ohara - May 8, 2024
Provides secure and reliable communication between devices and the IoT Platform which allows you to manage a large number of devices on a single IoT Platform.
Learn MoreA cloud solution for smart technology providers to quickly build stable, cost-efficient, and reliable ubiquitous platforms
Learn MoreMigrate your Internet Data Center’s (IDC) Internet gateway to the cloud securely through Alibaba Cloud’s high-quality Internet bandwidth and premium Mainland China route.
Learn MoreAlibaba Cloud (in partnership with Whale Cloud) helps telcos build an all-in-one telecommunication and digital lifestyle platform based on DingTalk.
Learn MoreMore Posts by Alex