Sanoid#
We use Sanoid to:
- automatically take regular snapshots of ZFS Datasets
- automatically clean snapshots according to a retention policy
- sync datasets between servers thanks to the syncoid command
sanoid snapshot configuration#
/etc/sanoid/sanoid.conf contains the configuration for sanoid snapshots: how frequently to take them, and the retention policy (how long to keep them).
There are generally two kinds of templates:
- one for datasets that are synced from a different server. In this case we don't want to create snapshots, as we already receive the ones from the source. We only want to purge old snapshots.
- one for datasets where the source is this server. In this case we want to regularly create snapshots and purge old ones.
We then have different retention strategies based on the type of data.
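As an illustration of the two kinds of templates, here is a minimal sketch that writes an example configuration to a scratch file (not to /etc/sanoid). The template names, dataset names and retention numbers are hypothetical and only show the shape of the file; the real values live in the per-server sanoid.conf.

```shell
# Write an illustrative sanoid.conf to a scratch file (hypothetical values).
example_conf="${TMPDIR:-/tmp}/sanoid.conf.example"
cat > "$example_conf" <<'EOF'
# Template for datasets whose source is this server:
# take snapshots regularly and prune old ones.
[template_local_data]
frequently = 0
hourly = 24
daily = 30
monthly = 3
yearly = 0
autosnap = yes
autoprune = yes

# Template for datasets received from another server through syncoid:
# never create snapshots here, only prune the ones we received.
[template_synced_data]
frequently = 0
hourly = 24
daily = 30
monthly = 6
yearly = 0
autosnap = no
autoprune = yes

# Hypothetical datasets, each bound to one of the templates above.
[zfs-hdd/example-local]
use_template = local_data
recursive = yes

[zfs-hdd/example-synced]
use_template = synced_data
recursive = yes
EOF
echo "example written to $example_conf"
```

The only difference that matters between the two templates is autosnap: it is set to no for synced datasets, so that sanoid prunes but never creates snapshots there.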
sanoid checks#
We have a timer/service sanoid_check that checks that we have recent snapshots for datasets. This is useful to verify that sanoid is running, or that syncoid is doing its job.
The default is to check every ZFS dataset, except the ones you list with no_sanoid_checks: in the comments of your sanoid.conf file.
You can put more than one dataset per line, by separating them with ":".
For example:
# no_sanoid_checks:rpool/logs-nginx:
# no_sanoid_checks:rpool/obf-old:rpool/opf-old:
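To make the comment format concrete, here is a minimal sketch of how such lines could be parsed. It is not the actual sanoid_check script; the function name list_excluded_datasets is hypothetical, and the sketch assumes the marker sits at the start of a comment line, as in the example above.

```shell
# Hypothetical helper: print the datasets excluded from checks, one per line.
list_excluded_datasets() {
    conf_file="$1"
    # Keep only comment lines carrying the marker, strip the marker,
    # split the remainder on ":" and drop empty or whitespace-only entries.
    grep -E '^#[[:space:]]*no_sanoid_checks:' "$conf_file" \
        | sed -E 's/^#[[:space:]]*no_sanoid_checks://' \
        | tr ':' '\n' \
        | sed -E 's/^[[:space:]]+//; s/[[:space:]]+$//' \
        | grep -v '^$'
}

# Usage (assumption: run on a server that has this file):
#   list_excluded_datasets /etc/sanoid/sanoid.conf
```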
syncoid service and configuration#
Sanoid does not come with a systemd service for syncoid,
so we created one, see: confs/common/systemd/system/syncoid.service
The syncoid service can synchronize to or from a server.
Pull mode is always preferred, though. The idea is to avoid having elevated privileges on the remote server, so that an attacker who gains privileged access on one server can't use it to gain access to the other one (and, e.g., remove or encrypt all data, including backups).
- We use a dedicated operator user on the remote server we want to pull from, named after the pulling server (e.g. off2operator)
- We use the zfs allow command to give the hold,send permissions to this user
The service simply uses each line of /etc/sanoid/syncoid-args.conf as arguments to the syncoid command.
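The "one line, one syncoid call" behaviour can be sketched as follows. This is not the content of syncoid.service, only an illustration under assumptions: the function name run_syncoid_lines and the SYNCOID_CMD override are hypothetical, and skipping blank lines and # comments is an assumption about the file format.

```shell
# Hypothetical sketch: call syncoid once per line of an arguments file.
run_syncoid_lines() {
    args_file="$1"
    # SYNCOID_CMD is a hypothetical override, handy for dry runs (e.g. "echo").
    syncoid_cmd="${SYNCOID_CMD:-syncoid}"
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            ''|'#'*) continue ;;   # assumption: skip blank lines and comments
        esac
        # Intentionally unquoted: the line is split into separate arguments.
        # stdin is redirected so that ssh cannot swallow the rest of the file.
        $syncoid_cmd $line </dev/null
    done < "$args_file"
}

# Dry run (prints the arguments instead of syncing):
#   SYNCOID_CMD=echo run_syncoid_lines /etc/sanoid/syncoid-args.conf
```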
getting status#
You can use systemctl status sanoid.service and systemctl status syncoid.service to see the logs of the last synchronization.
You can also list the snapshots on the source and destination ZFS datasets to see if there are recent ones:
/usr/sbin/zfs list -t snap <pool>/<dataset/path>
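When a dataset has many snapshots, it can help to sort them by creation date and keep only the last few. A minimal sketch, where the function name latest_snapshots and the default of 5 lines are arbitrary choices:

```shell
# Hypothetical helper: show the most recent snapshots of a dataset.
latest_snapshots() {
    dataset="$1"
    count="${2:-5}"
    # -H: no header, -s creation: sort by creation time (oldest first),
    # so the last lines are the most recent snapshots.
    zfs list -H -t snapshot -o name,creation -s creation "$dataset" | tail -n "$count"
}

# Usage:
#   latest_snapshots <pool>/<dataset/path> 3
```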
Install#
Sanoid is installed by cloning the official repository, building the deb package and installing it.
It provides a sanoid systemd service and a timer unit that just have to be enabled.
For syncoid to be launched by systemd, we created a service (see syncoid service and configuration). This service is declared as a dependency of the sanoid service so that it runs just after it.
How to build and install sanoid deb#
See the install documentation. I followed the instructions exactly.
cd /opt
git clone https://github.com/jimsalterjrs/sanoid.git
cd sanoid
# checkout latest stable release (or stay on master for bleeding edge stuff, but expect bugs!)
git checkout $(git tag | grep "^v" | tail -n 1)
ln -s packages/debian .
apt install debhelper libcapture-tiny-perl libconfig-inifiles-perl pv lzop mbuffer build-essential git
dpkg-buildpackage -uc -us
sudo apt install ../sanoid_*_all.deb
How to enable sanoid service#
Create conf for sanoid and link it
ln -s /opt/openfoodfacts-infrastructure/confs/$SERVER_NAME/sanoid/sanoid.conf /etc/sanoid/
for unit in email-failures@.service sanoid_check.service sanoid_check.timer sanoid.service.d; \
do ln -s /opt/openfoodfacts-infrastructure/confs/off1/systemd/system/$unit /etc/systemd/system ; \
done
systemctl daemon-reload
systemctl enable --now sanoid_check.timer
systemctl enable --now sanoid.service
How to enable syncoid service#
Create conf for syncoid and link it
ln -s /opt/openfoodfacts-infrastructure/confs/$SERVER_NAME/sanoid/syncoid-args.conf /etc/sanoid/
Enable syncoid service:
ln -s /opt/openfoodfacts-infrastructure/confs/$SERVER_NAME/systemd/system/syncoid.service /etc/systemd/system
systemctl daemon-reload
systemctl enable --now syncoid.service
How to setup synchronization without using root#
Say we want to pull data from zfs-hdd, zfs-nvme and rpool from PROD_SERVER to BACKUP_SERVER.
creating operator on PROD_SERVER#
OPERATOR=${BACKUP_SERVER}operator
adduser $OPERATOR
# choose a random password (pwgen 16 16) and discard it
# copy public key
mkdir /home/$OPERATOR/.ssh
vim /home/$OPERATOR/.ssh/authorized_keys
# copy BACKUP_SERVER root public key
chown -R $OPERATOR: /home/$OPERATOR
chmod go-rwx -R /home/$OPERATOR/.ssh
Adding needed permissions to pull zfs syncs
zfs allow $OPERATOR hold,send zfs-hdd
zfs allow $OPERATOR hold,send zfs-nvme
zfs allow $OPERATOR hold,send rpool
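To double check the result, zfs allow called with only a dataset name prints the permissions currently delegated on it. A small sketch looping over the pools, where the function name show_delegations is hypothetical:

```shell
# Hypothetical helper: print the delegated permissions of each given pool.
show_delegations() {
    for pool in "$@"; do
        echo "== $pool =="
        zfs allow "$pool"
    done
}

# Usage, on PROD_SERVER:
#   show_delegations zfs-hdd zfs-nvme rpool
```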
test connection on BACKUP_SERVER#
On BACKUP_SERVER, test ssh connection:
OPERATOR=${BACKUP_SERVER}operator
ssh $OPERATOR@<ip for server>
config syncoid#
Prerequisites: sanoid is running on $PROD_SERVER and creating snapshots for the datasets you want to back up remotely, and sanoid and syncoid are already configured on BACKUP_SERVER.
We can now add lines to syncoid-args.conf on BACKUP_SERVER. They must use the --no-privilege-elevation and --no-sync-snap options (if you want to create a sync snap, you will also have to grant snapshot creation to the $OPERATOR user on $PROD_SERVER).
Use --recursive to also back up sub-datasets.
Don't forget to create a sane retention policy (with autosnap=no) in sanoid on $BACKUP_SERVER to remove old data.
Note: because of the 6h timeout, if you have big datasets, you may want to do the first synchronization before enabling the service.
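As a sketch of what such a line might look like, the helper below assembles one from its parts. The function name syncoid_args_line, the host name and the dataset paths in the usage comments are hypothetical; only the three syncoid options come from the text above.

```shell
# Hypothetical helper: print one syncoid-args.conf line for a pull
# from a remote dataset into a local one.
syncoid_args_line() {
    operator="$1"
    remote_host="$2"
    source_dataset="$3"
    target_dataset="$4"
    printf '%s\n' "--no-privilege-elevation --no-sync-snap --recursive ${operator}@${remote_host}:${source_dataset} ${target_dataset}"
}

# Usage on BACKUP_SERVER (hypothetical host and datasets):
#   syncoid_args_line "$OPERATOR" prod.example.org zfs-hdd/example zfs-hdd/backups/example
# For a first, manual synchronization of a big dataset, the same arguments
# can be passed to syncoid directly (word splitting is intended here):
#   syncoid $(syncoid_args_line "$OPERATOR" prod.example.org zfs-hdd/example zfs-hdd/backups/example)
```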