By Zhang Youdong.
To understand how the compact command and other related commands work in MongoDB, let's work through a series of questions that tend to come up when you first encounter the command.
Stay tuned for more helpful articles and tutorials about MongoDB from the team of engineers at Alibaba Cloud ApsaraDB for MongoDB.
The compact command, as its name suggests, compacts the fragmented space in a collection and reclaims it, which can be extremely useful in MongoDB. This is something the remove command simply doesn't do.
Now let's discuss two related commands and how they are different from each other, but before we do so, let's see how exactly each one of them deletes files:
As you can infer from the above descriptions, the remove command generates logical free space, which can be reused for new writes immediately, but the physical space occupied by the file is not reclaimed immediately. As long as data keeps being written, this fragmented physical space is typically not a major concern, and the collection does not need to be compacted.
In some scenarios, however, a large amount of data is removed and few writes follow, so the freed space sits unused. If you want to reclaim that space, you need to call the compact command explicitly. This is one more reason to run the compact command rather than rely on either of these commands.
The compact command isn't perfect either. It has drawbacks of its own, so you'll have to be careful when you use it. Let me explain.
When you use the compact command to compact a collection, a mutex (mutual exclusion object) write lock is added to the database where the collection is located, which may block all read/write requests to that database. The compact command may also take a long time to execute. The time it takes generally corresponds to the amount of data in the collection.
This, of course, doesn't cancel out the benefits of the compact command. But it's generally recommended to run the compact command during off-peak hours to avoid interrupting your business services. This is something we recommend all our customers do.
The compact command is ultimately carried out by the WiredTiger storage engine. That is, when you run a compact command, WiredTiger repeatedly moves data from the end of the collection file into the free space toward the front, and then gradually truncates the file to reclaim physical space.
Before performing each round of the compact command, WiredTiger also checks whether one of the following conditions is met.
If neither of the above conditions is met, it means that performing the compact command wouldn't be able to reclaim at least 10% of the physical space, as is the case with the second condition. In this case, the compact command exits.
In other words, sometimes when a large collection is compacted, the compact command immediately returns "OK," but in reality the physical space of the collection remains unchanged. This is because WiredTiger deems that the collection does not need to be compacted.
To estimate how much space the compact command can reclaim, you need to know how much empty space WiredTiger has available for reuse. This amount is reported in the output of db.collection.stats() under wiredTiger.block-manager.file bytes available for reuse. Consider the following example for reference:
mymongo:PRIMARY> db.coll.stats().wiredTiger["block-manager"]["file bytes available for reuse"]
5033984
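As a rough way to put that number in context, the reusable bytes can be compared with the collection's on-disk size. The sketch below is a minimal illustration, not an official formula: `reusableRatio` is a hypothetical helper name, and the sample stats object uses an assumed `storageSize` together with the 5033984 value from the example above. It assumes the stats document exposes `storageSize` alongside the `wiredTiger` section.

```javascript
// Hypothetical helper: fraction of the collection file that WiredTiger
// reports as reusable. `stats` is the document returned by db.coll.stats().
function reusableRatio(stats) {
  const reusable = stats.wiredTiger["block-manager"]["file bytes available for reuse"];
  const onDisk = stats.storageSize;
  if (!(onDisk > 0)) {
    return 0; // empty or missing size: nothing to compare against
  }
  return reusable / onDisk;
}

// Sample input with an assumed storageSize (made up for illustration).
const sampleStats = {
  storageSize: 50339840,
  wiredTiger: { "block-manager": { "file bytes available for reuse": 5033984 } }
};
const sampleRatio = reusableRatio(sampleStats);
```

In the shell, calling `reusableRatio(db.coll.stats())` would give the same ratio for a real collection. A small ratio suggests that compacting is unlikely to return much space.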
For more information about other related questions, check out this FAQ page, which covers more related issues.
Before running the compact command, you'll want to make sure that you have read the content covered in this blog, especially the stuff covered below, and understand the principle and impact of the compact command.
// Compact somedb.somecollection
use somedb
db.runCommand({compact: "somecollection"})
// Compact the oplog; use the force option when running on the replica set primary
use local
db.runCommand({compact: "oplog.rs", force: true})
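To see whether a compact run actually returned space, one option is to compare the collection's storage size before and after. The following is a minimal sketch under assumptions: `compactAndReport` is a hypothetical helper, not part of MongoDB, and it relies only on `db.getCollection(...).stats()` and `db.runCommand(...)`. It takes the database handle as a parameter, so in the shell you would pass the current `db`.

```javascript
// Hypothetical helper: run compact on one collection and report the
// change in storageSize. `db` is the shell's database handle.
function compactAndReport(db, collName, force) {
  const before = db.getCollection(collName).stats().storageSize;

  // "compact" must be the first key of the command document.
  const cmd = { compact: collName };
  if (force) {
    cmd.force = true;
  }
  const result = db.runCommand(cmd);

  const after = db.getCollection(collName).stats().storageSize;
  return { ok: result.ok, before: before, after: after, reclaimed: before - after };
}
```

For example, `compactAndReport(db, "somecollection", false)` returns both sizes. A `reclaimed` value of 0 together with `ok: 1` matches the case described earlier, where WiredTiger decides that the collection does not need to be compacted.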
You can read more about this here.