2023-05-23 move Images to ZFS#
We want to move the Open Food Facts images from regular storage on off1 to a ZFS dataset on off2, replicated on ovh3.
We want to move to ZFS because its snapshot and syncing capabilities make backups really easy. This was already done for products.
Strategy#
The strategy is the following:
- first, reverse the current ZFS syncing so that ovh3 is synced from off2
- mount the images dataset on off1 through NFS (NB: use the nolock option)
- for each top-level folder:
  - rsync the folder to the ZFS dataset
  - move the folder to a new name (xxx.old)
  - replace it with a symlink that points to the synced folder on the ZFS dataset (mounted on off1)
  - rsync again to be sure we didn't miss anything

  At the end we will have a product images folder full of symlinks.
- at this point we can move the image folder to image.old and NFS mount the ZFS dataset as image
We have two scripts implementing this for products that we can adapt for images: scripts/zfs/migration.sh and scripts/zfs/migration-ean8.sh
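The per-folder steps of the strategy can be sketched as a small shell function. This is only an illustration under assumptions, not the content of the project's migration scripts: `migrate_folder` and `sync_dir` are hypothetical names, and the `cp -a` fallback exists only so the sketch also runs on a machine without rsync.

```shell
#!/bin/sh
# Hypothetical sketch of the per-folder migration (not the project's script).

# sync_dir SRC DST: copy the content of SRC into DST, preserving attributes.
sync_dir() {
  if command -v rsync >/dev/null 2>&1; then
    rsync -a "$1/" "$2/"
  else
    # Fallback only so that the sketch runs where rsync is missing.
    cp -a "$1/." "$2/"
  fi
}

# migrate_folder SRC_ROOT DST_ROOT NAME
migrate_folder() {
  src="$1/$3"
  dst="$2/$3"

  # Already switched to a symlink: only re-sync what is left in NAME.old.
  if [ -L "$src" ]; then
    [ -d "$src.old" ] || return 0
    sync_dir "$src.old" "$dst"
    return $?
  fi
  [ -d "$src" ] || return 1

  mkdir -p "$dst" || return 1
  # First pass: copy the bulk of the data while the folder is still in use.
  sync_dir "$src" "$dst" || return 1
  # Move the original aside and point the old path at the synced copy.
  mv "$src" "$src.old" || return 1
  ln -s "$dst" "$src" || return 1
  # Second pass: pick up files written between the first pass and the switch.
  sync_dir "$src.old" "$dst"
}

# Example call, using paths that appear later in this report:
#   migrate_folder /srv2/off/html/images/products /mnt/off2/off-images/products 300
```

Note that the folder is briefly missing between the `mv` and the `ln -s`, so a request hitting that exact moment would fail. Running the function again on an already migrated folder only repeats the final sync.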
Doing it#
Verifications before reversing current syncing#
I first verify if off2 is in sync with ovh3:
(off2)$ sudo zfs list -t snapshot zfs-hdd/off/images
...
zfs-hdd/off/images@20230220220000 1.75G - 9.70T -
zfs-hdd/off/images@20230221070000 1.61G - 9.70T -
zfs-hdd/off/images@20230221183000 232K - 9.70T -
(ovh3)$ sudo zfs list -t snapshot rpool/off/images
...
rpool/off/images@20230221070000 2.21G - 10.1T -
rpool/off/images@20230221183000 6.76G - 10.1T -
rpool/off/images@20230516083018 22.3G - 10.1T -
off2 is missing the latest snapshot, so I sync it from ovh3 (we had to use -F):
zfs send -i rpool/off/images@20230221183000 rpool/off/images@20230516083018 |ssh off2.openfoodfacts.org zfs recv zfs-hdd/off/images -F
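The base of the incremental send above is the newest snapshot that exists on both sides. A minimal sketch of how to find it from two snapshot listings follows. The helper name `latest_common_snapshot` is hypothetical, and it assumes snapshot names are fixed-width timestamps as in the listings above, so that alphabetical order equals chronological order.

```shell
#!/bin/sh
# latest_common_snapshot FILE_A FILE_B
# Each file holds one snapshot name per line (dataset@tag), for example the
# output of: zfs list -H -o name -t snapshot <dataset>
# Prints the newest tag present in both files.
latest_common_snapshot() {
  lcs_a=$(mktemp) || return 1
  lcs_b=$(mktemp) || return 1
  # Keep only the part after "@" so that names from different pools compare.
  sed 's/.*@//' "$1" | LC_ALL=C sort > "$lcs_a"
  sed 's/.*@//' "$2" | LC_ALL=C sort > "$lcs_b"
  # comm -12 keeps the lines common to both sorted files.
  LC_ALL=C comm -12 "$lcs_a" "$lcs_b" | tail -n 1
  rm -f "$lcs_a" "$lcs_b"
}
```

With the two listings shown above as input, the common tags are 20230221070000 and 20230221183000, and the function prints the latter, which is the base used in the `zfs send -i` command.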
I also see that the last snapshot on ovh3 does not contain all modifications:
$ zfs list -po written rpool/off/images
WRITTEN
24130245744
$ zfs get -Hr written rpool/off/images
rpool/off/images written 22.5G -
...
TIMESTAMP=$(date +%Y%m%d%H%M%S)
zfs snapshot rpool/off/images@$TIMESTAMP
zfs send -i rpool/off/images@20230516083018 rpool/off/images@$TIMESTAMP |ssh off2.openfoodfacts.org zfs recv zfs-hdd/off/images -F
But after that, the dataset on ovh3 keeps being written to!
We assume that it's because of atime (access time), which must be stored.
Reverse current syncing#
Configuring#
On ovh3, I will use sanoid, but with the synced data template:
# /etc/sanoid/sanoid.conf
…
[rpool/off/images]
use_template=synced_data
recursive=no
…
On off2, I will use sanoid with the normal template:
# /etc/sanoid/sanoid.conf
…
[zfs-hdd/off/images]
use_template=prod_data
recursive=no
…
And configure syncoid to replicate to ovh3:
# /etc/sanoid/syncoid-args.conf
…
--no-sync-snap zfs-hdd/off/images root@ovh3.openfoodfacts.org:rpool/off/images
…
Failing#
Let's try a run of syncoid:
syncoid --no-sync-snap zfs-hdd/off/images root@ovh3.openfoodfacts.org:rpool/off/images
NEWEST SNAPSHOT: 20230523165657
INFO: no snapshots on source newer than 20230523165657 on target. Nothing to do, not syncing.
This made me think I should add a TimeoutStartSec to syncoid for such cases (e.g. 6h).
After stopping the stalled syncoid, I wait for the next run.
It's catastrophic: ovh3 does not serve images anymore. Looking at the processes, I see a zfs rollback on rpool/off/images. This is expected, since syncoid tries to roll the target back to its last snapshot before syncing. But in the meantime, NGINX does not have access to the images, and the rollback seems to take a long time to complete.
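The TimeoutStartSec idea mentioned above could be set through a systemd drop-in. The sketch below only writes the drop-in file. It assumes syncoid is started by a systemd unit called `syncoid.service`, which this report does not confirm, and `write_timeout_dropin` is a hypothetical helper name.

```shell
#!/bin/sh
# write_timeout_dropin DROPIN_DIR TIMEOUT
# Writes a drop-in that limits how long the service start may take.
write_timeout_dropin() {
  mkdir -p "$1" || return 1
  printf '[Service]\nTimeoutStartSec=%s\n' "$2" > "$1/timeout.conf"
}

# Example (as root), ASSUMING the unit is named syncoid.service:
#   write_timeout_dropin /etc/systemd/system/syncoid.service.d 6h
#   systemctl daemon-reload
```

With such a limit, a run that stalls would be stopped by systemd instead of blocking the following runs indefinitely.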
Disabling atime on ovh3#
I decided to disable atime on ovh3 to see if it makes the writes stop.
On ovh3 (useful resource):
$ zfs set atime=off rpool/off/images
$ # verifying
$ zfs get atime rpool/off/images
NAME PROPERTY VALUE SOURCE
rpool/off/images atime off local
$ mount |grep rpool/off/images
rpool/off/images on /rpool/off/images type zfs (rw,noatime,xattr,noacl)
I then take a snapshot:
TIMESTAMP=$(date +%Y%m%d%H%M%S)
zfs snapshot rpool/off/images@$TIMESTAMP
Now it seems nothing is written anymore on the disk:
$ zfs list -po written rpool/off/images
WRITTEN
0
So the guess about atime was right.
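The check used here can be wrapped in a small helper, for instance to confirm that the dataset stays quiet before enabling syncoid. This is a sketch: `written_bytes` and `is_quiet` are hypothetical names, and the only ZFS call is the `zfs list -po written` shown above, with `-H` added to drop the header line.

```shell
#!/bin/sh
# written_bytes DATASET: bytes written since the most recent snapshot.
written_bytes() {
  zfs list -H -p -o written "$1"
}

# is_quiet DATASET: succeeds when nothing was written since the last snapshot.
is_quiet() {
  [ "$(written_bytes "$1")" = "0" ]
}

# Example:
#   is_quiet rpool/off/images && echo "no pending writes"
```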
I synced back to off2:
zfs send -i rpool/off/images@20230523165657 rpool/off/images@20230531083016 |ssh off2.openfoodfacts.org zfs recv zfs-hdd/off/images -F
Syncing with syncoid#
Now that we have a stable ZFS dataset on ovh3, we can activate the syncoid sync.
NFS Mount off2 ZFS dataset on off1#
Let's mount the images ZFS volume from off2 on off1.
I first install the NFS server on off2 and enable NFS sharing on my dataset:
apt install nfs-kernel-server
zfs set sharenfs="rw=@10.0.0.1/32" zfs-hdd/off/images
Then on off1, to install NFS, I had to update the sources list in /etc/apt/sources.list and replace fr2.ftp.debian.org with archive.debian.org.
Then:
apt update
apt install nfs-common
Trying to mount:
mkdir /mnt/off2
mkdir /mnt/off2/off-images
mount -t nfs -o rw "10.0.0.2:/zfs-hdd/off/images" /mnt/off2/off-images
ls /mnt/off2/off-images
Adding to /etc/fstab:
…
# off2 NFS mounts
10.0.0.2:/zfs-hdd/off/images /mnt/off2/off-images nfs rw,nolock,nofail 0 0
…
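If this step is scripted, the fstab entry can be added idempotently so that re-running the setup does not duplicate it. A minimal sketch, where `ensure_line` is a hypothetical helper name:

```shell
#!/bin/sh
# ensure_line FILE LINE: append LINE to FILE unless an identical line exists.
ensure_line() {
  grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Example (as root), with the entry shown above:
#   ensure_line /etc/fstab \
#     '10.0.0.2:/zfs-hdd/off/images /mnt/off2/off-images nfs rw,nolock,nofail 0 0'
#   mount /mnt/off2/off-images
```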
Testing rsync#
I ran rsync manually on one folder (one of the 3?? codes) to see how long it takes. The second pass was fast enough.
Running script#
I used migration.sh as a base to write scripts/zfs/migration-images.sh and copied it to off1.
Running it works fine, but it had to be relaunched several times because rsync sometimes ended with a non-zero exit code.
EAN8 are the longest to migrate due to their sequentiality.
It took around 6 days.
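Relaunching by hand after each rsync failure could be replaced by a small retry wrapper. This is a sketch, not something that was used during the migration. `retry` and the `RETRY_DELAY` variable are hypothetical names.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS COMMAND [ARGS...]
# Runs COMMAND until it succeeds, at most MAX_ATTEMPTS times, sleeping
# RETRY_DELAY seconds (default 60) between attempts.
retry() {
  max=$1
  shift
  attempt=1
  while :; do
    "$@" && return 0
    if [ "$attempt" -ge "$max" ]; then
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "${RETRY_DELAY:-60}"
  done
}

# Example:
#   retry 10 ./migration-images.sh
```

This only helps if the script can safely be run again after a partial run, which matches how it was relaunched here.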
Switching old folder for new folder#
EAN8 folders continually get created, so I ended up with a single command to do last syncs and change the link to the NFS mount:
./migration-images.sh && \
./migration-images.sh && \
unlink /srv/off/html/images/products && \
ln -s /mnt/off2/off-images/products /srv/off/html/images/products
and do a last check that no non-migrated directory is left:
find /srv2/off/html/images/products.old/ -maxdepth 1 -type d -not -name '*.old'