By Alexandru Andrei, Alibaba Cloud Tech Share Author. Tech Share is Alibaba Cloud's incentive program to encourage the sharing of technical knowledge and best practices within the cloud community.
In the previous tutorial, Use Your Storage Space More Effectively with ZFS: Exploring vdevs, we explored the various types of vdevs that make up a ZFS storage pool. Of course, a pool is just a building block that wouldn't be of much use without datasets: ZFS filesystems, volumes, snapshots, and clones. While we used the zpool command to interact with pools, the command for datasets is zfs.
When we created the previous pool, named "fourth", ZFS already created a dataset for us, as we can see with:
zfs list
Command output should be:
root@ubuntu:~# zfs list
NAME USED AVAIL REFER MOUNTPOINT
fourth 67K 38.5G 19K /fourth
By default, the filesystem is mounted at /name_of_pool, although this can be changed later on (example: zfs set mountpoint=/fourth fourth). A different mount point can also be chosen when the pool is created, by passing option arguments as in this example: zpool create -m /other_mountpoint -f fourth vdb vdc.
Let's create another ZFS filesystem:
zfs create fourth/images
As you can see, the syntax is simple: zfs create name_of_pool/name_of_dataset. Datasets look and act much like regular directories. Child filesystems can be added to a parent filesystem with commands like zfs create fourth/images/store1. Multiple non-existent parents can be created automatically by adding the -p switch, e.g. zfs create -p fourth/a/b/c/d/e.
But how are multiple ZFS filesystems useful? Besides logically separating data, they offer a way to set different properties on different datasets. For example, you could create one dataset for images and disable compression on it, and another for text files with compression enabled. They also allow the administrator to apply different snapshot policies, cap the space each filesystem can use (quotas), optimize properties for different workloads (e.g. adjust recordsize to optimize for storing databases), and many other things.
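As a rough sketch, per-dataset tuning might look like the following. This assumes the pool "fourth" exists; the fourth/db dataset and the property values here are purely illustrative, not recommendations:

```shell
# Hypothetical dataset for a database workload (illustrative values)
zfs create fourth/db
zfs set recordsize=16K fourth/db   # e.g. match the database's page size
zfs set atime=off fourth/db        # skip access-time updates to reduce writes
zfs get recordsize,atime,compression fourth/db
```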
Let's see properties for our two datasets:
zfs list
Example output of command:
root@ubuntu:~# zfs list
NAME USED AVAIL REFER MOUNTPOINT
fourth 129K 38.5G 19K /fourth
fourth/images 19K 38.5G 19K /fourth/images
As you can see, the filesystems show the same amount of free space on them. This is one of the many advantages of ZFS. Since you don't need to preallocate space for each filesystem, they can each grow freely at their own rate. This means that you don't have to estimate future requirements and you won't need to manually grow filesystems (and maybe partitions) in the future, as is the case with other solutions. Allocating more disk space to your datasets is as simple as growing your pool by adding more devices or replacing them with larger disks. And you don't have to interrupt services, reboot the machine or suspend disk writing.
In some scenarios, though, you will want to stop certain filesystems from growing too much. For example, you might dedicate one ZFS filesystem to each customer to store their files, where each customer may use at most 5GB of storage, according to their payment plan. Let's go through an example. Create the following filesystem:
zfs create -p fourth/customers/1
Now let's set a quota for customer "1":
zfs set quota=5G fourth/customers/1
zfs list
will reflect the newly set maximum space this dataset can use:
root@ubuntu:~# zfs list
NAME USED AVAIL REFER MOUNTPOINT
fourth 180K 38.5G 19K /fourth
fourth/customers 38K 38.5G 19K /fourth/customers
fourth/customers/1 19K 5.00G 19K /fourth/customers/1
fourth/images 19K 38.5G 19K /fourth/images
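If there were many customers, the create-and-quota steps above could be scripted. A sketch, assuming the same pool and a 5GB plan for everyone (zfs create fails for datasets that already exist, so customer 1 is skipped here):

```shell
for id in 2 3 4; do
  zfs create -p "fourth/customers/${id}"     # -p creates missing parents too
  zfs set quota=5G "fourth/customers/${id}"  # cap according to the payment plan
done
zfs list -r fourth/customers                 # -r lists children recursively
```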
And that's all there is to it. Let's move on to compression. Create another filesystem:
zfs create fourth/text
Text is highly compressible, so let's enable compression on this dataset to get all the benefits (saved disk space and faster reads and writes):
zfs set compression=on fourth/text
Set compression to "on" before storing data on a dataset, since ZFS only compresses information that is added after the option is enabled. There is almost never a reason not to enable compression, so do it for every dataset except those that will only store already-compressed content (images, video files, etc.). Compression uses the CPU, but the algorithm is so efficient, and today's CPUs are so powerful, that the performance cost is very small. And in most cases the CPU is partly idle anyway; even on a busy server averaging 70% usage, the remaining 30% of the time it is doing nothing useful when it could be compressing, saving you disk space and speeding up reads and writes.
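ZFS reports the resulting savings through the compressratio property (logical data size divided by physical space used). A quick back-of-envelope calculation with hypothetical numbers:

```shell
logical_mb=10240   # hypothetical: 10GB of data as applications wrote it
physical_mb=3072   # hypothetical: 3GB actually stored after compression
ratio=$(awk -v l="$logical_mb" -v p="$physical_mb" 'BEGIN { printf "%.2f", l / p }')
echo "compressratio would show about ${ratio}x"
# On a real dataset: zfs get compressratio fourth/text
```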
You can take a look at the other adjustable dataset properties with:
zfs get all
Let's explore how useful snapshots can be. Create some files:
touch /fourth/customers/1/importantdata{1..100}
Let's see the files we created:
ls /fourth/customers/1/
These will represent important data that a customer may have. Now let's take a snapshot:
zfs snapshot fourth/customers/1@firstsnapshot
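The part after the @ is a free-form label, so date-stamped names make automated (e.g. daily) snapshots easy to sort and prune later. A minimal sketch; the zfs call itself is commented out since it needs a live pool:

```shell
dataset="fourth/customers/1"
snapname="${dataset}@daily-$(date +%Y-%m-%d)"  # e.g. ...@daily-2018-11-14
echo "$snapname"
# zfs snapshot "$snapname"
```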
Let's assume that, somehow, 50 of the customer's files have been deleted:
rm /fourth/customers/1/importantdata{1..50}
The customer complains that they can't find their data. But you have a recent snapshot. You quickly check to see what has changed between that snapshot and the current contents:
zfs diff fourth/customers/1@firstsnapshot
This shows what has been removed, created, modified, or renamed, indicated by the leading characters in the output: -, +, M, and R, respectively.
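For the deletion above, the output would resemble the following (paths are illustrative; each line starts with the change type, followed by the affected path):

```
M       /fourth/customers/1/
-       /fourth/customers/1/importantdata1
-       /fourth/customers/1/importantdata2
```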
There is also a special hidden directory in each dataset that allows you to manually explore snapshot contents:
ls /fourth/customers/1/.zfs/snapshot/
By browsing into a specific snapshot's directory, you can view and selectively copy files when you don't need to restore everything.
We can now see that the customer only has 50 out of the 100 files available:
ls /fourth/customers/1/
And we can roll the dataset back to the snapshot instantly:
zfs rollback fourth/customers/1@firstsnapshot
All his files are reinstated:
ls /fourth/customers/1/
To view all snapshots available:
zfs list -t snapshot
To view all types of datasets:
zfs list -t all
To free up disk space, very old snapshots that are no longer needed can be deleted with:
zfs destroy fourth/customers/1@firstsnapshot
To clone a dataset, we first have to snapshot it:
zfs snapshot fourth/customers/1@firstsnapshot
Now we can create a clone:
zfs clone fourth/customers/1@firstsnapshot fourth/customers/cloneof1
This allows you to change content in the clone without changing the original (a clone is, in effect, a writable snapshot, whereas a regular snapshot is read-only). Also, content that is identical in both datasets is only stored once, potentially saving large amounts of disk space. Imagine a scenario where the original customer data already occupies 500GB; that amounts to 500GB of space saved. If you then change just a few megabytes of data in the clone, only those differences are stored, so you only need a few additional megabytes to keep both datasets.
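The copy-on-write arithmetic from the paragraph above, with illustrative numbers:

```shell
original_mb=512000            # ~500GB of original customer data
changed_mb=5                  # megabytes modified in the clone afterwards
naive_mb=$((2 * original_mb)) # what two independent copies would cost
actual_mb=$((original_mb + changed_mb))
echo "full copy: ${naive_mb} MB; clone: ${actual_mb} MB"
```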
Let's assume that we're testing some type of optimization on customer data and we found a way to reduce the number of files needed:
rm /fourth/customers/cloneof1/importantdata{1..99}
When we're happy with the changes, we can "promote" the clone:
zfs promote fourth/customers/cloneof1
This removes the dependency on its origin, which allows us to delete the original dataset (not possible before promotion):
zfs destroy fourth/customers/1
Rename the clone:
zfs rename fourth/customers/cloneof1 fourth/customers/1
We've now improved and replaced the older customer data.
It's worth noting that clones can also be snapshotted, just like a regular ZFS filesystem.
As mentioned in previous tutorials, you can think of volumes as virtual raw storage devices. They are used when you need the advantages and features ZFS offers, but don't need a ZFS filesystem on top of your storage pool(s). Or, as we'll do here, you can mix the two and use filesystems and volumes side by side.
Create a 10GB volume called "virtualdisk":
zfs create -V 10G fourth/virtualdisk
You will find these volumes under /dev/zvol. We can use them just like normal disks; for example, we can create an ext4 filesystem on top of one:
mkfs.ext4 /dev/zvol/fourth/virtualdisk
Note: although you can create a filesystem directly on a raw disk, it's usually recommended to partition the disk first; we've skipped that here for the sake of simplicity.
Let's also take advantage of some ZFS features, like compression, which ext4 doesn't have natively:
zfs set compression=on fourth/virtualdisk
This will not let you write more than 10GB to your filesystem, because ext4 itself enforces the size it was created with. But reads and writes will be faster, because there are fewer bytes for the disks to store and retrieve. Also, since the volume's size is preallocated, it takes 10GB out of the space available in your storage pool, even if the data it actually holds is much smaller. Preallocation is recommended for volumes, to prevent certain types of errors that could lead to data corruption if the pool were to run out of space. If you can avoid or afford that risk, you can get the disk-space benefits of compression and of dynamic allocation as the volume grows by making it "sparse". Simply put: with compression on, if you store 10GB of highly compressible data, a sparse volume may only take 3GB from your storage pool. To create a sparse volume, add the -s parameter, so the previous command would look like this: zfs create -s -V 10G fourth/virtualdisk.
Mount the ext4 filesystem:
mount /dev/zvol/fourth/virtualdisk /mnt
Check available space on the filesystem:
df -h /mnt
Now let's look at another advantage of ZFS: how easy it is to resize a volume:
zfs set volsize=20G fourth/virtualdisk
The ext4 filesystem has to be grown to take advantage of the new size:
resize2fs /dev/zvol/fourth/virtualdisk
Now let's see if it worked:
df -h /mnt
Check how much space this volume is using in our pool:
zfs list
The output might look like this:
root@ubuntu:~# zfs list
NAME USED AVAIL REFER MOUNTPOINT
fourth 20.6G 17.9G 19K /fourth
fourth/customers 54K 17.9G 19K /fourth/customers
fourth/customers/1 35K 17.9G 20K /fourth/customers/1
fourth/images 19K 17.9G 19K /fourth/images
fourth/text 19K 17.9G 19K /fourth/text
fourth/virtualdisk 20.6G 38.4G 126M -
As mentioned, even though the volume is almost empty at the moment, its space is preallocated, so it takes 20GB out of our pool. But even though it wasn't initially created as a sparse volume, we can change that now:
zfs set refreservation=none fourth/virtualdisk
Now let's see what changed:
zfs list
Example output:
root@ubuntu:~# zfs list
NAME USED AVAIL REFER MOUNTPOINT
fourth 127M 38.4G 19K /fourth
fourth/customers 54K 38.4G 19K /fourth/customers
fourth/customers/1 35K 38.4G 20K /fourth/customers/1
fourth/images 19K 38.4G 19K /fourth/images
fourth/text 19K 38.4G 19K /fourth/text
fourth/virtualdisk 126M 38.4G 126M -
We've regained almost 20GB of usable space on our pool.
Let's see the benefits of compression in action. Create a 1GB file in /mnt, where our ext4 filesystem, backed by the ZFS volume, is mounted:
dd if=/dev/zero of=/mnt/test status=progress bs=1M count=1024
If we check the contents of /mnt, we'll see that our file "test" is 1GB in size:
ls -lh /mnt/
But when we verify how much space the volume is using in the pool:
zfs list
We'll see that it needs almost no additional space. This is an extreme case, since we've created a file made only of binary zeros, which is extremely compressible. But in real-world scenarios you will still see high compression ratios, especially with text-based content (e.g. website code, databases that use text records, git repositories, etc.).
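To put a number on the savings, you can query the volume's compression statistics directly (this requires the pool built in this tutorial):

```shell
# compressratio: logical/physical ratio; logicalused: size before compression
zfs get compressratio,used,logicalused fourth/virtualdisk
```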
Tip: when using ext4 on a ZFS volume, you may notice that after deleting data in /mnt, the volume doesn't reflect any gains in usable space. This is because, for efficiency, many filesystems, ext4 included, don't actually remove data on disk; they just dereference it. Otherwise, deleting 100GB of information would take a very long time and slow your system down. This means that deleted files continue to exist in random blocks on disk, and consequently in the ZFS volume too. To free up space, you can use a command such as fstrim /mnt to actually discard unused data in the ext4 filesystem. Only use the tool when needed, so as not to "tire" the physical devices unnecessarily (although the numbers are quite high these days, devices have a limited number of write cycles).
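Instead of running fstrim by hand, systemd-based distributions (including recent Ubuntu releases) ship a weekly timer for it, which is usually a sensible default:

```shell
systemctl enable --now fstrim.timer  # periodically trims supported mounted filesystems
systemctl status fstrim.timer        # confirm the timer is active
```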
Don't forget that many of the other ZFS-specific features are also available on volumes (e.g. snapshots and clones). I encourage you to try these features out today on your Alibaba Cloud Elastic Compute Service (ECS) Linux instance.
Related tutorials:
Use Your Storage Space More Effectively with ZFS: Exploring vdevs
Use Your Storage Space More Effectively with ZFS: Introduction