By Afzaal Ahmad Zeeshan, Alibaba Cloud Community Blog author and Alibaba Cloud MVP.
Thankfully, the days are long gone when you had to spend heavily to keep data on a data server even when you rarely used it. With massive technical improvement in almost every field, the methods of storing, accessing and transferring information have changed, and so have the ways of collecting, storing and protecting data. In this article, I will first talk about the basic concepts of multi-temperature (is that what we call it from now on?) data storage, and then steer the discussion towards Alibaba Cloud Object Storage Service (OSS) and how it provides different storage classes to cater to the requirements of hot data and cold data without incurring excessive cost, because it is imperative to an organization's bottom line that it should not pay for more than it consumes.
Also, one thing I sometimes wonder about is that the terms "hot" and "cold" are misunderstood in several scenarios. To better understand the term "hot", think of it as a viral or trending subject: everybody wants to visit it and revisit it, again and again.
And yes, that brings us to the term "cold". Well, that is basically just something that is not hot!
To begin with, let's talk about how data and its storage mechanisms have been categorized, metaphorically, based on access priority. Data that is accessed most frequently, and is therefore stored closest to the point of access (for example on solid-state or flash drives, near the CPU), is called hot data. Data that is accessed comparatively less often is termed warm data, and data that is very rarely or never accessed, and is placed on the slowest storage medium, is termed cold data.
The real-world examples are
These are examples of hot, warm and cold data respectively. However, probably for the sake of generalization, warm data is often lumped in with either hot or cold data, depending on which way its usage leans. I will show how this is not the case on Alibaba Cloud, which provides 3 tiers for data storage and can help us further expand the concept of hot vs warm vs cold data.
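The hot/warm/cold split described above boils down to how recently (or how often) data is accessed. Here is a minimal sketch of that idea in Python. The function name and both thresholds (7 and 90 days) are hypothetical and chosen only for illustration; they do not come from Alibaba Cloud.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds, picked only to illustrate the idea.
HOT_MAX_IDLE = timedelta(days=7)
WARM_MAX_IDLE = timedelta(days=90)


def temperature(last_accessed: datetime, now: datetime) -> str:
    """Label data as hot, warm or cold by how long it has been idle."""
    idle = now - last_accessed
    if idle <= HOT_MAX_IDLE:
        return "hot"
    if idle <= WARM_MAX_IDLE:
        return "warm"
    return "cold"
```

In a real system you would likely look at access counts over a window rather than a single timestamp, but the tiering decision has the same shape.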
Therefore, depending on the hot or cold nature of data, the major cloud storage vendors have tailored their storage plans; Alibaba Cloud does this through Object Storage Service (OSS). Other cloud vendors such as Microsoft Azure, AWS and Google Cloud Platform have their own services, under their own names, to support these features.
Hot storage requires premium storage and cloud resources, because it is resource-intensive and in high demand. Moreover, business-oriented organizations cannot tolerate delays in their users' queries, so that data needs to sit in the hottest tier, on solid-state drives, for a performant transaction rate.
Similarly, cloud storage also provides the option of cheap (old, magnetic) hard disk drives, which offer huge cost savings but with a performance penalty. Customers can purchase this service to have their data backed up and stored. In my previous article, I discussed backups and storage media for database backups. If you remember, we looked at Alibaba Cloud OSS as a platform for data storage. If we would like to keep our data backups for only a couple of weeks, then the coldest storage class might not be a good option; see the "Gotcha!" section at the bottom of this post.
Almost every new software application needs to write and read data. Every business application, small or large, requires data, whether to render user information or to make a critical financial transaction. Even if we set aside the hot vs. cold categories, data still plays some roles:
As the application becomes stable, the data requirements grow rapidly: complex views and queries, data size, backup strategies, scalability, consistency and latency. Over time, these expanding requirements have challenged cloud storage providers to take over, as teams want to focus on core business growth rather than on managing data and its issues, such as replication, high availability, caching and pricing.
Likewise, among the several data services provided by Alibaba Cloud, OSS (Object Storage Service) is one of the most highly recommended data storage services. It offers a wide range of features to deal with your data: free data upload, reliable storage, region-based backups and cost-effective migration. The best part? You can always configure how your data is stored and how you are charged for that.
This is my home page for Alibaba Cloud OSS; you can see there are 3 buckets in this service. I am using one of them to store data that doesn't have to be accessed too frequently. Another bucket contains most of the content that I have hosted on the cloud. Finally, I have a static website that was developed using React and is hosted on Alibaba Cloud OSS.
All three of these buckets have their own configurations; their pricing models, as well as their replication and high-availability settings, differ and were chosen for their own use cases.
This is an example of how you can create a bucket in the hot storage class; this bucket would be created with resources that provide the highest performance when accessed. Similarly, during bucket creation, you can mark the bucket as an Archive storage bucket, and it would be configured accordingly.
Alibaba Cloud OSS also provides a basic explanation of the storage class that you are choosing. This can help you decide between the storage classes.
OSS introduces three main storage classes: Standard, Infrequent Access (IA) and Archive. These classes are specifically designed to meet the varying requirements of hot and cold data.
Standard storage promises almost zero delay and high reliability, with 99.9999999999% (twelve 9s) data durability. It is a highly performant storage plan, recommended for frequently accessed data: scenarios that involve hot data, such as online banking applications or online picture and video editing and sharing platforms. More importantly, you should review the key features of OSS Standard storage and check whether your hot data requirements call for them.
As the name implies, Infrequent Access (IA) storage is designed to meet the accessibility requirements of comparatively less frequently accessed data, such as enduring data records and other types of long-lived data, so it can be considered to partially meet the specification of warm data storage. The Infrequent Access plan is cheaper than the Standard one. It provides real-time data access, but there are data retrieval charges based on the size of the data retrieved. Also, in the IA storage plan, the minimum billable object size is 64 KB, meaning that you pay for 64 KB even if the object's actual size is smaller, which is not the case in the Standard plan.
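To make the 64 KB minimum concrete, here is a small sketch in Python of how the billable size differs between the two classes. The function and constant names are my own for illustration; actual billing is done by Alibaba Cloud and may involve details not covered in this article.

```python
# 64 KB minimum billable object size for IA, as described above.
MIN_BILLABLE_BYTES_IA = 64 * 1024


def billable_bytes(object_size: int, storage_class: str) -> int:
    """Return the size (in bytes) an object is billed for.

    Only the two classes discussed so far are modelled here.
    """
    if storage_class == "Standard":
        return object_size
    if storage_class == "IA":
        # Objects smaller than 64 KB are rounded up to the minimum.
        return max(object_size, MIN_BILLABLE_BYTES_IA)
    raise ValueError(f"storage class not modelled in this sketch: {storage_class}")
```

So a 10 KB thumbnail is billed as 10 KB in Standard but as 64 KB in IA, which is why a bucket full of tiny objects may not get cheaper when moved to IA.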
To fulfil the requirements of cold data, which is highly unlikely to be accessed for days or weeks at a time, OSS offers Archive storage. Some delay can be expected, because archived data has to be restored before it is available to be read. There are a few conditions for storing data in Archive storage:
You cannot change the storage class of a bucket from one type to another once the bucket has been created, and that remains true. However, converting objects among the three storage classes as the data's requirements change (hot to cold or the contrary) is a potent feature of OSS. This transition is part of OSS lifecycle management. Currently, it supports the following conversions automatically:
From the Standard plan to the Infrequent Access or Archive plan. This is the most common conversion; it typically applies to data that must be frequently accessible over a period and then needs to move to cold storage until, say, the same event next year.
From Infrequent Access to Archive. This covers a slightly different situation, where data accessibility changes from less frequent access to no access for quite a long period. Note that after the conversion, pricing is calculated according to the converted storage class.
Check out the normal flow of this conversion here.
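The automatic conversions listed above can be modelled as a small table of allowed transitions plus age-based rules. This is only a sketch in plain Python, not the OSS API: the `Transition` class, the function names and the 30/180-day values in the usage note are hypothetical.

```python
from dataclasses import dataclass

# Automatic conversions described in this article: each class can only
# move to a colder one.
ALLOWED_TRANSITIONS = {
    "Standard": {"IA", "Archive"},
    "IA": {"Archive"},
    "Archive": set(),
}


@dataclass(frozen=True)
class Transition:
    after_days: int  # object age at which the rule fires
    target: str      # storage class to convert to


def can_transition(source: str, target: str) -> bool:
    """Check whether a lifecycle rule may convert source to target."""
    return target in ALLOWED_TRANSITIONS.get(source, set())


def storage_class_at(age_days: int, start: str, rules: list) -> str:
    """Return the class an object is in after age_days, given the rules."""
    current = start
    for rule in sorted(rules, key=lambda r: r.after_days):
        if age_days >= rule.after_days:
            if not can_transition(current, rule.target):
                raise ValueError(f"{current} -> {rule.target} is not supported")
            current = rule.target
    return current
```

For example, with the rules `[Transition(30, "IA"), Transition(180, "Archive")]`, an object that starts in Standard is still Standard at day 10, becomes IA by day 45 and ends up in Archive by day 200.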
Cold storage classes often come with a minimum storage duration. On the documentation page, you can find the minimum number of days an object must be stored before you can delete it without a fee. For the Infrequent Access (IA) class the minimum is 30 days, and for the Archive class it is 60 days. If you delete objects before then, you will be charged some fees.
One more thing to note here is that there will be no caching for your data in the cold storage classes. So each request has to query the object and return it as if it were being accessed for the first time, which is a huge performance loss.
Finally, you cannot change the storage class once the bucket has been created. Fear not, you can always copy the data from one bucket to another, but remember that there might be an early-delete penalty.
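Here is a quick sketch in Python of checking whether a delete would fall inside the minimum storage period. It only computes how many days short the object is; how the fee itself is calculated is not covered in this article, so that part is deliberately left out. The names are hypothetical.

```python
# Minimum storage days per class, as quoted above.
MIN_STORAGE_DAYS = {"IA": 30, "Archive": 60}


def early_delete_shortfall_days(storage_class: str, stored_days: int) -> int:
    """Return how many days short of the minimum a delete would be.

    0 means the object can be deleted without an early-delete fee.
    Classes without a minimum (such as Standard) always return 0.
    """
    minimum = MIN_STORAGE_DAYS.get(storage_class, 0)
    return max(0, minimum - stored_days)
```

Deleting an IA object after 10 days leaves a shortfall of 20 days, while an Archive object deleted on day 60 or later has none.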