Cluster maintenance
The ydbops utility uses the Cluster Management System (CMS) to perform cluster maintenance without losing availability. You can also use CMS directly through the gRPC API.
Rolling restart
To perform a rolling restart of the entire cluster, you can use the command:
$ ydbops restart
By default, the strong availability mode is used, minimizing the risk of losing availability. You can override it using the --availability-mode parameter.
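As an illustration of overriding the mode, the sketch below wraps the restart command in a small shell function. The function name `rolling_restart` is hypothetical, and the value `weak` in the usage note is an assumption about the accepted values; check `ydbops restart --help` for the modes your version supports.

```shell
# Hypothetical helper: run a rolling restart with an explicit availability mode.
# Falls back to "strong", the default described above, when no mode is given.
rolling_restart() {
  mode="${1:-strong}"
  ydbops restart --availability-mode="$mode"
}
```

Calling `rolling_restart weak` would run `ydbops restart --availability-mode=weak`, while `rolling_restart` with no argument keeps the default mode.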
The ydbops utility automatically creates a maintenance task to restart the entire cluster using the specified availability mode. As it progresses, ydbops updates the maintenance task and obtains exclusive locks on nodes in CMS until all nodes are restarted.
Take a host out for maintenance
To take a host out for maintenance, follow these steps:
- Create a maintenance task using the command:
  $ ydbops maintenance create --hosts=<fqdn> --duration=<seconds>
  This command will create a maintenance task that will take an exclusive lock on the host with the fully qualified domain name <fqdn> for <seconds> seconds.
- After creating the task, you need to update its state until the lock is taken, using the command:
  $ ydbops maintenance refresh --task-id=<id>
  This command will update the task with identifier <id> and attempt to take the required lock. When you receive a PERFORMED response, you can proceed to the next step.
- Perform host maintenance while the lock is held.
- After completing the work, you need to release the lock on the host using the command:
  $ ydbops maintenance complete --task-id=<id> --hosts=<fqdn>
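The refresh, maintain, and release steps above could be scripted roughly as follows. This is a minimal sketch, not an official workflow. It assumes that `ydbops maintenance refresh` prints the literal word PERFORMED once the lock is held, and that the task identifier reported when the task was created is passed in by hand. The function names, retry count, and pause are hypothetical.

```shell
# wait_for_lock TASK_ID [MAX_ATTEMPTS] [PAUSE_SECONDS]
# Refreshes the task until the output mentions PERFORMED (assumed output format).
# Returns 0 once the lock is held, 1 if all attempts are used up.
wait_for_lock() {
  task_id="$1"
  max_attempts="${2:-30}"
  pause="${3:-10}"
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    out=$(ydbops maintenance refresh --task-id="$task_id")
    case "$out" in
      *PERFORMED*) return 0 ;;
    esac
    attempt=$((attempt + 1))
    sleep "$pause"
  done
  return 1
}

# maintain_host TASK_ID FQDN COMMAND...
# Waits for the lock, runs COMMAND while it is held, then releases the lock.
maintain_host() {
  task_id="$1"
  fqdn="$2"
  shift 2
  if ! wait_for_lock "$task_id"; then
    echo "lock not acquired for task $task_id" >&2
    return 1
  fi
  status=0
  "$@" || status=$?   # the maintenance work itself
  ydbops maintenance complete --task-id="$task_id" --hosts="$fqdn"
  return "$status"
}
```

For example, `maintain_host <id> <fqdn> ./upgrade-kernel.sh` would run a (hypothetical) script under the lock. In this sketch the lock is released even when the work fails, and the failure is reported through the exit status; an operator who prefers to keep the lock for investigation would skip the `complete` call in that case.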