Backup and restore guide
Overview
Walrus storage nodes provide backup and restore functionality for the primary database containing blob data. This guide covers configuration requirements, operational procedures, and best practices for both automated and manual backup processes, as well as restore operations.
The current backup implementation creates full copies of the database files. This means backups require substantial disk space (approximately the same size as your active database). A checkpoint-based solution is planned for a future release.
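Because each backup is a full copy, it can help to compare the database size with the free space on the backup volume before enabling backups. The following is a minimal sketch under that assumption, not a Walrus feature; the function name `has_room_for_backups` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Walrus): succeed only if BACKUP_DIR has room
# for COPIES full copies of DB_DIR. Returns 2 if a size cannot be determined.
has_room_for_backups() {
  local db_dir="$1" backup_dir="$2" copies="${3:-1}"
  local db_kb free_kb
  # du -sk: total size of the database directory in KiB
  db_kb=$(du -sk "$db_dir" | awk '{print $1}')
  # df -Pk: POSIX output format; column 4 of line 2 is the available KiB
  free_kb=$(df -Pk "$backup_dir" | awk 'NR == 2 {print $4}')
  # An empty value means du or df failed (for example, a missing directory)
  [ -n "$db_kb" ] && [ -n "$free_kb" ] || return 2
  echo "database: ${db_kb} KiB, needed: $((db_kb * copies)) KiB, free: ${free_kb} KiB"
  [ "$free_kb" -ge $((db_kb * copies)) ]
}
```

For example, `has_room_for_backups /opt/walrus/db /opt/walrus/checkpoints 2` would check for room for two full copies, using the example paths from this guide.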
Prerequisites
- Storage node must be running with appropriate permissions to create backups
- Sufficient disk space for backup storage (recommended: separate physical volume)
- Unix/Linux operating system with support for Unix domain sockets
- A walrus user account with appropriate permissions
Local administration socket configuration
The backup system communicates with running storage nodes through a Unix domain socket. To enable this functionality:
- Configure the administration socket path in your node configuration file:
  admin_socket_path: /opt/walrus/admin.socket
- Restart the storage node to initialize the socket:
  sudo systemctl restart walrus-node.service
- Verify socket creation:
  ls -la /opt/walrus/admin.socket
The storage node creates the socket with permissions srw------- 1 walrus walrus, ensuring that only the walrus user can send operations to it. This is critical for security, because operations sent to this socket are executed directly on the running storage node.
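The ownership and mode described above can also be checked from a script rather than by reading `ls` output. This is a sketch only: `check_admin_socket` is a hypothetical name, and it assumes GNU coreutils `stat` (as found on most Linux systems). The mode string srw------- corresponds to octal mode 600 on a socket.

```shell
#!/usr/bin/env bash
# Hypothetical check (not a Walrus command): verify that the given path is a
# Unix domain socket owned by OWNER and accessible only by that owner.
# Assumes GNU coreutils `stat`.
check_admin_socket() {
  local path="$1" owner="${2:-walrus}"
  if [ ! -S "$path" ]; then
    echo "not a socket: $path" >&2
    return 1
  fi
  local mode user
  mode=$(stat -c '%a' "$path")   # octal permission bits, e.g. 600
  user=$(stat -c '%U' "$path")   # owning user name
  if [ "$mode" != "600" ] || [ "$user" != "$owner" ]; then
    echo "unexpected owner or mode: $user $mode" >&2
    return 1
  fi
  echo "ok: $path is a socket owned by $user with mode $mode"
}
```

With the example path from this guide, the call would be `check_admin_socket /opt/walrus/admin.socket walrus`.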
Currently supported operations include:
- local-admin checkpoint
- local-admin log-level
Automated periodic backups
Storage nodes support scheduled automatic backups through checkpoint configuration. Add the following settings to your node configuration file:
checkpoint_config:
  # Directory where backups will be stored
  db_checkpoint_dir: /opt/walrus/checkpoints
  # Number of backups to retain (oldest will be deleted)
  max_db_checkpoints: 2
  # Backup frequency (example: 4-hour interval)
  db_checkpoint_interval:
    secs: 14400 # 4 hours in seconds
    nanos: 0
  # Sync in-memory data to disk before creating a backup
  sync: true
  # Maximum concurrent backup operations
  max_background_operations: 1
  # Enable/disable automated backups
  periodic_db_checkpoints: true
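The two numeric settings are easy to sanity-check with a little arithmetic. The sketch below converts an interval in hours to the `secs` value and estimates the disk space that the retained backups occupy, assuming each backup is roughly the size of the database as described in the overview. Both function names are hypothetical, and the 86 GB figure is only an illustrative database size.

```shell
#!/usr/bin/env bash
# Convert a backup interval in hours to the `secs` value used by
# db_checkpoint_interval.
hours_to_secs() {
  echo $(( $1 * 3600 ))
}

# Rough disk estimate in GB for retained backups, assuming every backup is
# about as large as the database itself (full copies).
backup_disk_gb() {
  local db_gb="$1" max_db_checkpoints="$2"
  echo $(( db_gb * max_db_checkpoints ))
}

hours_to_secs 4        # prints 14400
backup_disk_gb 86 2    # prints 172
```

The estimate ignores any extra space that may be needed while a new backup is being written.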
Manual backup creation
Create on-demand backups using the local-admin command:
The following commands assume walrus-node is in your system's PATH. If it's not, replace walrus-node with the full path to the binary, for example: /opt/walrus/bin/walrus-node
# Basic backup command
sudo -u walrus walrus-node local-admin \
  --socket-path /opt/walrus/admin.socket \
  checkpoint create \
  --path /opt/walrus/backups/manual-backup-name
The backup operation runs in the background within the storage node. Once backup creation has been initiated, it continues independently even if the command-line interface is terminated.
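To keep manual backups from overwriting one another, the target directory name can include a timestamp. The wrapper below is a sketch under stated assumptions: `manual_backup`, `SOCKET_PATH`, `BACKUP_ROOT`, and `RUN` are hypothetical names introduced here, and only the flags shown in the command above are used. By default it prints the command instead of running it.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: print (or, with RUN=1, execute) a manual backup command
# whose target directory name includes a UTC timestamp.
SOCKET_PATH="${SOCKET_PATH:-/opt/walrus/admin.socket}"
BACKUP_ROOT="${BACKUP_ROOT:-/opt/walrus/backups}"

manual_backup() {
  local label="${1:-manual}"
  local target
  target="$BACKUP_ROOT/$label-$(date -u +%Y%m%d-%H%M%S)"
  local cmd=(sudo -u walrus walrus-node local-admin
             --socket-path "$SOCKET_PATH"
             checkpoint create
             --path "$target")
  if [ "${RUN:-0}" = "1" ]; then
    "${cmd[@]}"
  else
    # Dry run: show the command without contacting the node
    echo "${cmd[*]}"
  fi
}

manual_backup pre-upgrade
```

Running the script with `RUN=1` set in the environment would execute the command rather than print it.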
List available backups
sudo -u walrus walrus-node local-admin \
  --socket-path /opt/walrus/admin.socket \
  checkpoint list \
  --path /opt/walrus/checkpoints
Sample output:
Backups:
Backup ID: 1, Size: 85.9 GB, Files: 1055, Created: 2025-07-02T00:25:48Z
Backup ID: 2, Size: 86.2 GB, Files: 1058, Created: 2025-07-02T04:25:52Z
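When scripting around the list command, the backup ID can be extracted from its output. The sketch below assumes the output keeps the line format of the sample above, and that the highest ID is the most recent backup (true for the sample, but an assumption in general). `latest_backup_id` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print the highest backup ID found in `checkpoint list` output read from stdin.
# Assumes lines of the form "Backup ID: <n>, Size: ..." as in the sample output.
# Exits non-zero if no backup line is found.
latest_backup_id() {
  awk -F'[:,]' '
    /^[[:space:]]*Backup ID:/ { id = $2 + 0; if (id > max) max = id; found = 1 }
    END { if (found) print max; else exit 1 }
  '
}

sample='Backups:
Backup ID: 1, Size: 85.9 GB, Files: 1055, Created: 2025-07-02T00:25:48Z
Backup ID: 2, Size: 86.2 GB, Files: 1058, Created: 2025-07-02T04:25:52Z'

printf '%s\n' "$sample" | latest_backup_id   # prints 2
```

In practice the input would come from piping the `checkpoint list` command into the function instead of the embedded sample.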
Restore from a backup
Do not copy backup directories directly to the storage node's data path. The restore tool must be used to properly reconstruct the database from checkpoint files. Directly copied content cannot be recognized by the storage engine.
Follow these steps to restore from a backup:
- Stop the storage node service:
  sudo systemctl stop walrus-node.service
  # Verify the service is stopped
  sudo systemctl status walrus-node.service
- Optional: Back up the current database:
  # Assuming the Walrus storage path is: `storage_path: /opt/walrus/db`
  sudo -u walrus cp -r /opt/walrus/db /opt/walrus/db.backup.$(date +%Y%m%d-%H%M%S)
  This command saves:
  - Main database files
  - Events database (/opt/walrus/db/events/)
  - Event blob data (/opt/walrus/db/event_blob_writer/)
- Clear existing data (if performing a clean restore):
  Remove all existing database files to ensure a clean restore:
  # Assuming the Walrus storage path is: `storage_path: /opt/walrus/db`
  sudo -u walrus rm -rf /opt/walrus/db/*
  This command removes:
  - Main database files
  - Events database (/opt/walrus/db/events/)
  - Event blob data (/opt/walrus/db/event_blob_writer/)
- Restore the main database:
  The restore process can take a significant amount of time depending on database size. Run the restore command in a persistent session using tmux or screen to prevent interruption if your connection drops.
  # Restore from specific checkpoint
  sudo -u walrus walrus-node \
    restore \
    --db-checkpoint-path /opt/walrus/checkpoints \
    --db-path /opt/walrus/db \
    --checkpoint-id 2
  # Or restore from latest checkpoint (omit --checkpoint-id)
  sudo -u walrus walrus-node \
    restore \
    --db-checkpoint-path /opt/walrus/checkpoints \
    --db-path /opt/walrus/db
- Start the storage node:
  sudo systemctl start walrus-node.service
  # Monitor startup logs
  sudo journalctl -u walrus-node.service -f
The storage node will begin downloading and replaying events. This process may take some time before the node transitions to the Active state.
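The two forms of the restore command shown above differ only in whether --checkpoint-id is passed. A small wrapper can make that choice explicit. This is a sketch, not part of Walrus: `restore_db` and `RUN` are hypothetical names, and only the flags that appear in this guide are used. By default it prints the command instead of executing it.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the restore command from this guide. It prints
# the command by default and executes it only when RUN=1.
restore_db() {
  local checkpoint_dir="$1" db_path="$2" checkpoint_id="${3:-}"
  local cmd=(sudo -u walrus walrus-node restore
             --db-checkpoint-path "$checkpoint_dir"
             --db-path "$db_path")
  # Omitting --checkpoint-id restores from the latest checkpoint
  if [ -n "$checkpoint_id" ]; then
    cmd+=(--checkpoint-id "$checkpoint_id")
  fi
  if [ "${RUN:-0}" = "1" ]; then
    "${cmd[@]}"
  else
    echo "${cmd[*]}"
  fi
}

restore_db /opt/walrus/checkpoints /opt/walrus/db 2   # specific checkpoint
restore_db /opt/walrus/checkpoints /opt/walrus/db     # latest checkpoint
```

For a long restore, the script could be started inside a persistent session, for example one opened with `tmux new-session -s walrus-restore`, so that a dropped connection does not interrupt it.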