Resharding for Adding and Removing Shards

About this Task

You can use resharding to distribute your sharded collections across new shards. You can also use resharding to move data off shards that you plan to remove, which is faster than relying on chunk migrations.

The resharding operation performs these phases in order:

The clone phase duplicates the current collection data.
The building indexes phase builds indexes on the resharded collection.
The catch-up phase applies any pending write operations to the resharded collection.
The commit phase renames the temporary collection and drops the old collection to perform a cut-over.

Before you Begin

Before you reshard, verify that your cluster meets the Storage Requirements, Latency Requirements, and Additional Resource Requirements described in the following sections.

Storage Requirements

Calculate the storage space that the resharding operation requires from your collection size and index size by using the following formula. The formula assumes a minimum oplog window of 24 hours:

Available storage required on each shard = [ (collection size + index size) * 2 ] / number of shards the collection will be distributed across

For example, a 2TB collection and 400GB of indexes distributed across 4 shards will need a minimum of 1.2TB of available storage per shard:

[ (2 TB + 400GB) * 2 ] / 4 shards = 1.2 TB / shard
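
The calculation above can be sketched as a small JavaScript helper. This is only an illustration, not a MongoDB API: the function name requiredStoragePerShard is hypothetical, and the sizes can be in any unit as long as both use the same one.

```javascript
// Hypothetical helper (not part of MongoDB): estimates the free storage
// needed on each shard for a resharding operation, using the formula above.
// collectionSize and indexSize must be expressed in the same unit (e.g. TB).
function requiredStoragePerShard(collectionSize, indexSize, shardCount) {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new RangeError("shardCount must be a positive integer");
  }
  return ((collectionSize + indexSize) * 2) / shardCount;
}

// 2 TB collection plus 0.4 TB (400 GB) of indexes, distributed across 4 shards.
const perShardTB = requiredStoragePerShard(2, 0.4, 4);
console.log(perShardTB.toFixed(1) + " TB per shard"); // prints "1.2 TB per shard"
```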

You must confirm that you have the available storage space in your cluster.

If there is insufficient space or I/O headroom available, you must increase the storage size. If there is insufficient CPU headroom, you must scale up the cluster by selecting a higher instance size.

Tip

If your MongoDB cluster is hosted on Atlas, you can use the Atlas UI to review storage, CPU, and I/O headroom metrics.

Latency Requirements

Your application must be able to tolerate a two-second period during which writes to the collection being resharded are blocked. While writes are blocked, your application experiences increased latency. If your workload cannot tolerate this requirement, use chunk migrations to balance your cluster instead.

Additional Resource Requirements

Your cluster must meet these additional requirements:

A minimum oplog window of 24 hours.
I/O utilization below 50% of capacity.
CPU load below 80%.
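
As a rough sketch, the three thresholds above could be checked with a helper like the following. The function name and its input fields are hypothetical, and the metric values are numbers that you collect yourself (for example, from your monitoring system). The sketch also assumes that the I/O requirement refers to I/O utilization as a percentage of capacity.

```javascript
// Hypothetical pre-flight check (not part of MongoDB). It only compares
// metrics that you supply against the thresholds listed above.
// Assumption: ioUtilizationPercent is I/O utilization as a percent of capacity.
function checkReshardingReadiness({ oplogWindowHours, ioUtilizationPercent, cpuLoadPercent }) {
  const problems = [];
  if (oplogWindowHours < 24) problems.push("oplog window is below 24 hours");
  if (ioUtilizationPercent >= 50) problems.push("I/O utilization is not below 50%");
  if (cpuLoadPercent >= 80) problems.push("CPU load is not below 80%");
  return { ready: problems.length === 0, problems };
}

// Example with made-up metric values: only the CPU check fails.
const readiness = checkReshardingReadiness({
  oplogWindowHours: 30,
  ioUtilizationPercent: 35,
  cpuLoadPercent: 85
});
console.log(readiness);
```

In mongosh, when connected directly to a member of a shard's replica set, db.getReplicationInfo().timeDiffHours reports the current oplog window in hours and could supply the oplogWindowHours value.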

Steps

Add or remove shards to your cluster.

To add shards to your cluster, see Add Shards to a Cluster. To remove shards from your cluster, see Remove Shards from a Sharded Cluster.

Reshard sharded collections one at a time to the same shard key.

Use the reshardCollection command with the forceRedistribution option to redistribute data across the cluster.

db.adminCommand(
   {
      reshardCollection: "<database>.<collection>",
      // The collection's current shard key document, for example { "<field>": 1 }
      key: <shardkey>,
      forceRedistribution: true
   }
)

Resharding with forceRedistribution: true rewrites the data across all shards in the cluster that are not in a draining state. By default, resharding uses numInitialChunks: 90 and creates at least numInitialChunks - 1 chunks in the cluster. If you have more than 90 shards, specify a higher numInitialChunks value in the reshardCollection command.
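
For a cluster with more than 90 shards, the command document might be assembled as in the sketch below. The helper name buildRedistributionCommand, the namespace sales.orders, the key { customerId: 1 }, and the value 120 are all hypothetical and chosen only for illustration; the right numInitialChunks value depends on your cluster.

```javascript
// Hypothetical helper (not part of MongoDB): builds a reshardCollection
// command document that redistributes data on the existing shard key.
// numInitialChunks is only included when the caller provides a value.
function buildRedistributionCommand(namespace, shardKey, numInitialChunks) {
  const command = {
    reshardCollection: namespace,
    key: shardKey,
    forceRedistribution: true
  };
  if (numInitialChunks !== undefined) {
    command.numInitialChunks = numInitialChunks;
  }
  return command;
}

// Placeholder namespace, shard key, and chunk count.
const reshardCommand = buildRedistributionCommand("sales.orders", { customerId: 1 }, 120);
console.log(JSON.stringify(reshardCommand, null, 2));
```

In mongosh, you would then pass the resulting document to db.adminCommand().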

Monitor the resharding operation.

To monitor the resharding operation, you can use the $currentOp pipeline stage:

db.getSiblingDB("admin").aggregate(
   [
     { $currentOp: { allUsers: true, localOps: false } },
     {
       $match: {
         type: "op",
         "originatingCommand.reshardCollection": "<database>.<collection>"
       }
     }
   ]
)

Note

The output is a point-in-time snapshot. To see updated values, rerun the pipeline.

The $currentOp pipeline outputs:

totalOperationTimeElapsedSecs: elapsed operation time in seconds
remainingOperationTimeEstimatedSecs: estimated time remaining in seconds for the current resharding operation. It is returned as -1 when a new resharding operation starts.
Starting in MongoDB 7.0, remainingOperationTimeEstimatedSecs is also available on the coordinator during a resharding operation.
remainingOperationTimeEstimatedSecs is set to a pessimistic time estimate:
- The catch-up phase time estimate is set to the clone phase time, which is a relatively long time.
- In practice, if there are only a few pending write operations, the actual catch-up phase time is relatively short.

[
  {
    shard: '<shard>',
    type: 'op',
    desc: 'ReshardingRecipientService | ReshardingDonorService | ReshardingCoordinatorService <reshardingUUID>',
    op: 'command',
    ns: '<database>.<collection>',
    originatingCommand: {
      reshardCollection: '<database>.<collection>',
      key: <shardkey>,
      unique: <boolean>,
      collation: { locale: 'simple' }
    },
    totalOperationTimeElapsedSecs: <number>,
    remainingOperationTimeEstimatedSecs: <number>,
    ...
  },
  ...
]
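
To make the timing fields in this output easier to read, the pipeline could be wrapped in a helper like the one below. This is a sketch, not an official utility: the function name getReshardingProgress and the shape of the summary rows are hypothetical. It expects a mongosh db object (or anything that exposes the same getSiblingDB, aggregate, and toArray methods).

```javascript
// Hypothetical helper (not part of MongoDB): runs the $currentOp pipeline
// shown above and returns one summary row per resharding operation document.
function getReshardingProgress(database, namespace) {
  const ops = database.getSiblingDB("admin").aggregate([
    { $currentOp: { allUsers: true, localOps: false } },
    { $match: { type: "op", "originatingCommand.reshardCollection": namespace } }
  ]).toArray();

  return ops.map((op) => ({
    shard: op.shard,
    desc: op.desc,
    elapsedSecs: op.totalOperationTimeElapsedSecs,
    // -1 is returned when a new resharding operation starts; report it as
    // null so that it is not mistaken for a real estimate.
    remainingSecs: op.remainingOperationTimeEstimatedSecs === -1
      ? null
      : op.remainingOperationTimeEstimatedSecs
  }));
}
```

In mongosh, you could call printjson(getReshardingProgress(db, "<database>.<collection>")) and rerun it whenever you want refreshed values.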

Resharding with forceRedistribution: true rewrites the collection data to all the relevant shards and drops the old collection. It is the fastest method to move data in a cluster.
