update

Edwin Eefting
2022-01-07 09:55:48 +01:00
parent 303a6d3b1e
commit dc25bc711f
4 changed files with 144 additions and 0 deletions

46
Performance.md Normal file

@@ -0,0 +1,46 @@
## Performance tips
If you have a large number of datasets, it's important to keep the following tips in mind.
## Speeding up SSH
You can make your ssh connections persistent and greatly speed up zfs-autobackup:
On the server that initiates the backup, add this to your ~/.ssh/config:
```console
Host *
ControlPath ~/.ssh/control-master-%r@%h:%p
ControlMaster auto
ControlPersist 3600
```
Thanks @mariusvw :)
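To verify that connection sharing is actually being used after the first run, you can ask ssh to check the control master. A small sketch; the host name backupserver is just an example:
```sh
#ask the ssh control master whether it is running for this host
ssh -O check backupserver
```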
## Buffering
It might also help to use the --buffer option to enable IO buffering during the data transfer.
This can speed things up, since it smooths out the sudden IO bursts that are frequent during a zfs send or recv.
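For example, a minimal sketch of a run with buffering enabled. The buffer size shown here is an assumed example value (check zfs-autobackup --help for the exact format), and the host/dataset names mirror the other examples in this wiki:
```sh
#enable IO buffering during the transfer (16M is an assumed example size)
zfs-autobackup backup1 --buffer 16M --ssh-target backupserver backups/db1
```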
## Less work
You can make zfs-autobackup generate a lot less work by using --no-holds and --allow-empty.
This saves a lot of extra zfs commands per dataset.
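For example (flags from this page, host/dataset names taken from the other examples in this wiki):
```sh
#skip holds and always snapshot, reducing the number of zfs commands per dataset
zfs-autobackup backup1 --no-holds --allow-empty --ssh-target backupserver backups/db1
```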
## Some statistics
To get some idea of how fast zfs-autobackup is, I did some tests on my laptop, with a SKHynix_HFS512GD9TNI-L2B0B disk. I'm using ZFS 2.0.2.
I created 100 empty datasets and measured the total runtime of zfs-autobackup, using all the performance tips above (--no-holds, --allow-empty, ssh ControlMaster).
* without ssh: 15 seconds. (>6 datasets/s)
* either ssh-target or ssh-source=localhost: 20 seconds (5 datasets/s)
* both ssh-target and ssh-source=localhost: 24 seconds (4 datasets/s)
To be bold, I created 2500 datasets, and that was also no problem. So it should be possible to use zfs-autobackup with thousands of datasets.
If you need more performance, let me know.
NOTE: There is a performance regression in ZFS version 2: https://github.com/openzfs/zfs/issues/11560. Use --no-progress as a workaround.

22
Piping.md Normal file

@@ -0,0 +1,22 @@
## Transfer buffering, compression and rate limiting
If you're transferring over a slow link, it might be useful to use `--compress=zstd-fast`. This compresses the data before sending, so it uses less bandwidth. An alternative is `--zfs-compressed`: this transfers blocks that are already compressed on disk as-is. (`--compress` will usually compress much better but uses much more resources. `--zfs-compressed` uses the least resources, but can be a disadvantage if you want to use a different compression method on the target.)
You can also limit the datarate by using the `--rate` option.
The `--buffer` option might also help since it acts as an IO buffer: zfs send can vary wildly between completely idle and huge bursts of data. When zfs send is idle, the buffer will continue transferring data over the slow link.
It's also possible to add custom send or receive pipes with `--send-pipe` and `--recv-pipe`.
These options all work together and the buffer on the receiving side is only added if appropriate. When all options are active:
On the sending side:
```
zfs send -> send buffer -> custom send pipes -> compression -> transfer rate limiter
```
On the receiving side:
```
decompression -> custom recv pipes -> buffer -> zfs recv
```
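Putting it all together, a hedged sketch of what a combined invocation might look like. The rate and buffer sizes are assumed example values (check zfs-autobackup --help for the exact formats), and the host/dataset names mirror the other examples in this wiki:
```sh
#compress, rate-limit and buffer the transfer over a slow link
zfs-autobackup backup1 \
 --compress=zstd-fast \
 --rate 1M \
 --buffer 16M \
 --ssh-target backupserver backups/db1
```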

45
PrePost.md Normal file

@@ -0,0 +1,45 @@
## Running custom commands before and after snapshotting
You can run commands before and after the snapshot, for example to freeze databases, so that the on-disk data is consistent before snapshotting.
Note: a ZFS snapshot is atomic, so most of the time freezing isn't really needed. If you restore a snapshot, the application will just think the server (or application) had a regular crash. Most modern databases can handle this fine.
## Method 1: Use snapshot mode
It's possible to use zfs-autobackup in snapshot-only mode. This way you can just create a script that contains the pre and post steps:
```sh
#freeze stuff
some-freeze-command

#make the snapshots (snapshot-only mode: no target given, so nothing is sent yet)
zfs-autobackup backup1

#unfreeze stuff
some-unfreeze-command

#send the backup, using the snapshots we just made
zfs-autobackup backup1 --no-snapshot --ssh-target backupserver backups/db1
```
This has the disadvantage that you might have to do the error handling yourself. Also if the source is remote, you have to use the correct ssh command and escaping as well.
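A minimal sketch of what that error handling might look like with a plain POSIX shell script; `some-freeze-command` and `some-unfreeze-command` are the same placeholders as above:
```sh
#!/bin/sh
set -e

#make sure we always unfreeze, even if freezing or snapshotting fails
trap 'some-unfreeze-command' EXIT

#freeze stuff
some-freeze-command

#make the snapshots (no target given, so nothing is sent yet)
zfs-autobackup backup1

#unfreeze as soon as possible and disable the cleanup trap
some-unfreeze-command
trap - EXIT

#send the backup, using the snapshots we just made
zfs-autobackup backup1 --no-snapshot --ssh-target backupserver backups/db1
```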
## Method 2: Use --pre-snapshot-cmd and --post-snapshot-cmd
With this method, zfs-autobackup will handle the pre and post execution for you.
For example:
```sh
zfs-autobackup \
 --pre-snapshot-cmd 'some-freeze-command' \
 --post-snapshot-cmd 'some-unfreeze-command' \
 --ssh-target backupserver backup1 backups/db1
```
The way this works:
* The pre and post commands are always executed on the source side (via ssh if needed, see the example below).
* If a pre-command fails, zfs-autobackup will immediately execute the post-commands and exit with an error code.
* All post-commands are always executed, even if the pre-commands or the actual snapshot have failed. This way you can be sure that stuff is always cleaned up and unfrozen.
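For example, with a remote source the pre and post commands are executed on that source host over ssh. A hedged sketch; the host name dbserver is an assumption, the rest mirrors the examples above:
```sh
#freeze/unfreeze runs on dbserver, the snapshots are sent to backupserver
zfs-autobackup \
 --ssh-source dbserver \
 --pre-snapshot-cmd 'some-freeze-command' \
 --post-snapshot-cmd 'some-unfreeze-command' \
 --ssh-target backupserver backup1 backups/db1
```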

31
Problems.md Normal file

@@ -0,0 +1,31 @@
## Common problems you might encounter
> It keeps asking for my SSH password
You forgot to set up automatic login via SSH keys; see the example for how to do this.
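A minimal sketch of one common way to set this up; the user and host names are assumptions:
```sh
#generate a key pair (if you don't have one yet) and install the public key on the backup target
ssh-keygen -t ed25519
ssh-copy-id root@backupserver
```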
> cannot receive incremental stream: invalid backup stream
This usually means you've created a new snapshot on the target side during a backup. If you restart zfs-autobackup, it will automatically abort the invalid, partially received snapshot and start over.
> cannot receive incremental stream: destination has been modified since most recent snapshot
This means files have been modified on the target side somehow.
You can use --rollback to automatically roll back such changes. Also try destroying the target dataset and using --clear-mountpoint on the next run; this way it won't get mounted.
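For example, a hedged sketch of a run with --rollback; the host and dataset names mirror the other examples in this wiki:
```sh
#roll back any changes on the target datasets before receiving
zfs-autobackup backup1 --rollback --ssh-target backupserver backups/db1
```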
> internal error: Invalid argument
In some cases (Linux -> FreeBSD) this means certain properties are not fully supported on the target system.
Try using something like --filter-properties xattr or --ignore-transfer-errors.
> zfs receive fails, but the snapshot seems to be received successfully.
This happens if you transfer between different operating systems, ZFS versions or feature sets.
Try using the --ignore-transfer-errors option. This will ignore the error, but still checks whether the snapshot was actually received correctly.
> cannot receive incremental stream: kernel modules must be upgraded to receive this stream.
This happens if you forget to use --encrypt while the target datasets are already encrypted. (A very strange error message indeed.)