Compare commits

...

36 Commits

Author SHA1 Message Date
231f41e195 update 2020-03-17 23:52:57 +01:00
7c1546fb49 improved --rollback code. detect and show incompatible snapshots on target. added --destroy-incompatible option. fixes #34 2020-03-17 23:51:16 +01:00
b1dd2b55f8 improved error logging 2020-03-17 19:55:16 +01:00
4ed53eb03f fix linter isue 2020-03-15 23:02:03 +01:00
6f8c73b87f rc7 2020-03-15 22:59:21 +01:00
ee03da2f9b exposed --min-change value as a parameter. (was hardcoded at 200000) 2020-03-15 22:54:14 +01:00
e737d0a79f improved example and cleaned up 2020-03-15 21:42:09 +01:00
cbd281c79d Merge pull request #32 from mariusvw/feature/ssh-keygen
Feature/ssh keygen
2020-03-14 22:49:26 +01:00
dfd38985d1 Merge remote-tracking branch 'remotes/mariusvw/feature/ssh-config' 2020-03-14 22:46:53 +01:00
f1c15cec18 Merge pull request #30 from mariusvw/feature/issue-25
Issue #25, disable colors on non-tty
2020-03-14 22:15:43 +01:00
1bc35f5812 explained splitting of jobs 2020-03-14 22:14:11 +01:00
805a3147b5 added --no-send option. snapshots that are obsolete are now destroyed at the beginning of each dataset-transfer. this allows using --no-send as way to just thinout old snapshots. cleaned up stderr output when resuming. 2020-03-14 22:04:16 +01:00
944435cbd1 Added another ssh-keygen example without passphrase 2020-03-09 10:16:40 +01:00
022a7d75e2 Updated readme with ssh-keygen example 2020-03-09 10:13:48 +01:00
14ac525525 Issue #25, disable colors on non-tty 2020-03-08 23:55:07 +01:00
3a45951361 Updated README 2020-03-08 23:16:37 +01:00
2a300bbcba Added support for custom ssh client config 2020-03-08 23:16:11 +01:00
bdeb4c40fa Cleanup whitespace 2020-03-08 23:05:49 +01:00
e8b90abfde Cleaned whitespace 2020-03-08 22:05:48 +01:00
1d9c25d3b4 prevent emitting useless error messages in some cases when holding/release/destroying snapshots 2020-02-25 20:15:11 +01:00
56d7f8c754 rc5 2020-02-25 18:41:04 +01:00
ef5bca3de1 start at correct snapshot when full send 2020-02-25 18:35:35 +01:00
3b2a19d492 migrate/other-snapshot feature almost done 2020-02-25 00:58:25 +01:00
d2314c0143 imp is not used 2020-02-24 14:30:07 +01:00
f3a80991c9 fix release stuff 2020-02-24 14:20:42 +01:00
bd3321e879 actually set canmount=noauto instead of filtering it. 2020-02-23 23:36:34 +01:00
55e18cc613 fix #18 2020-02-23 23:00:37 +01:00
9d5534c11e create snapshots per pool. fixes #20 2020-02-23 22:25:47 +01:00
93d0823c82 fixes 2020-02-23 21:29:47 +01:00
0285eb31a7 fixes 2020-02-23 21:27:01 +01:00
4c4cd36f9f get rid of the _ . its confusing 2020-02-21 18:16:22 +01:00
9e8c6f7732 explained readonly property better 2020-02-20 09:40:50 +01:00
f305f00d91 fix 2020-02-20 01:21:00 +01:00
3e06a8e2fa --buffer not suppoert (yet/anyomre) 2020-02-20 01:17:25 +01:00
241716cf6d clearer 2020-02-20 01:13:26 +01:00
1b7f7fd140 added clearity 2020-02-20 01:09:13 +01:00
6 changed files with 1845 additions and 1621 deletions

364
README.md

@ -17,12 +17,25 @@
* More robust error handling.
* Prepared for future enhancements.
* Supports raw backups for encryption.
* Custom SSH client config.
## Introduction
ZFS autobackup is used to periodically back up ZFS filesystems to other locations. This is done using the very efficient zfs send and receive commands.
This is a tool I wrote to make replicating ZFS datasets easy and reliable. You can either use it as a backup tool or as a replication tool.
It has the following features:
You can select what to back up by setting a custom `ZFS property`. This allows you to set and forget: configure it so it backs up your entire pool, and you never have to worry about backups again. Even new datasets you create later will be backed up.
Other settings are just specified on the command line. This also makes it easier to set up and test zfs-autobackup and helps you fix any issues you might encounter. When you're done you can just copy/paste your command into a cron job or script.
Since it's using ZFS commands, you can see what it's actually doing by specifying `--debug`. This also helps a lot if you run into some strange problem or error. You can just copy-paste the command that fails and play around with it on the command line. (Also something I missed in other tools.)
An important feature that's missing from other tools is a reliable `--test` option: this allows you to see what zfs-autobackup will do and tune your parameters. It will do everything except make changes to your ZFS datasets.
Another nice thing is progress reporting with `--progress`. It's very useful with HUGE datasets, when you want to know how many hours or days it will take.
zfs-autobackup tries to be the easiest-to-use backup tool for ZFS.
## Features
* Works across operating systems: Tested with Linux, FreeBSD/FreeNAS and SmartOS.
* Works in combination with existing replication systems. (Like Proxmox HA)
@ -49,123 +62,97 @@ It has the following features:
## Installation
Use pip to install:
### Using pip
The recommended way on most servers is to use pip:
```console
[root@server ~]# pip install zfs-autobackup
[root@server ~]# pip install --upgrade zfs-autobackup
```
This can also be used to upgrade zfs-autobackup to the newest stable version.
### Using easy_install
On older servers you might have to use easy_install
```console
[root@server ~]# easy_install zfs-autobackup
```
It's also possible to just download <https://raw.githubusercontent.com/psy0rz/zfs_autobackup/v3/bin/zfs_autobackup> and run it directly.
### Direct download
## Usage
It's also possible to just download <https://raw.githubusercontent.com/psy0rz/zfs_autobackup/master/bin/zfs-autobackup> and run it directly.
The only requirement that is sometimes missing is the `argparse` Python module. Optionally you can install `colorama` for colors.
It should work with Python 2.7 and higher.
## Example
In this example we're going to back up a machine called `pve` to a machine called `backup`.
### Setup SSH login
zfs-autobackup needs passwordless login via ssh. This means generating an ssh key and copying it to the remote server.
#### Generate SSH key on `backup`
On the server that runs zfs-autobackup you need to create an SSH key. You only need to do this once.
Use the `ssh-keygen` command and leave the passphrase empty:
```console
[root@server ~]# zfs-autobackup --help
usage: zfs-autobackup [-h] [--ssh-source SSH_SOURCE] [--ssh-target SSH_TARGET]
[--keep-source KEEP_SOURCE] [--keep-target KEEP_TARGET]
[--no-snapshot] [--allow-empty] [--ignore-replicated]
[--no-holds] [--resume] [--strip-path STRIP_PATH]
[--buffer BUFFER] [--clear-refreservation]
[--clear-mountpoint]
[--filter-properties FILTER_PROPERTIES]
[--set-properties SET_PROPERTIES] [--rollback]
[--ignore-transfer-errors] [--raw] [--test] [--verbose]
[--debug] [--debug-output] [--progress]
backup_name target_path
ZFS autobackup 3.0-beta6
positional arguments:
backup_name Name of the backup (you should set the zfs property
"autobackup:backup-name" to true on filesystems you
want to backup
target_path Target ZFS filesystem
optional arguments:
-h, --help show this help message and exit
--ssh-source SSH_SOURCE
Source host to get backup from. (user@hostname)
Default None.
--ssh-target SSH_TARGET
Target host to push backup to. (user@hostname) Default
None.
--keep-source KEEP_SOURCE
Thinning schedule for old source snapshots. Default:
10,1d1w,1w1m,1m1y
--keep-target KEEP_TARGET
Thinning schedule for old target snapshots. Default:
10,1d1w,1w1m,1m1y
--no-snapshot dont create new snapshot (usefull for finishing
uncompleted backups, or cleanups)
--allow-empty if nothing has changed, still create empty snapshots.
--ignore-replicated Ignore datasets that seem to be replicated some other
way. (No changes since lastest snapshot. Usefull for
proxmox HA replication)
--no-holds Dont lock snapshots on the source. (Usefull to allow
proxmox HA replication to switches nodes)
--resume support resuming of interrupted transfers by using the
zfs extensible_dataset feature (both zpools should
have it enabled) Disadvantage is that you need to use
zfs recv -A if another snapshot is created on the
target during a receive. Otherwise it will keep
failing.
--strip-path STRIP_PATH
number of directory to strip from path (use 1 when
cloning zones between 2 SmartOS machines)
--buffer BUFFER Use mbuffer with specified size to speedup zfs
transfer. (e.g. --buffer 1G) Will also show nice
progress output.
--clear-refreservation
Filter "refreservation" property. (recommended, safes
space. same as --filter-properties refreservation)
--clear-mountpoint Filter "canmount" property. You still have to set
canmount=noauto on the backup server. (recommended,
prevents mount conflicts. same as --filter-properties
canmount)
--filter-properties FILTER_PROPERTIES
List of propererties to "filter" when receiving
filesystems. (you can still restore them with zfs
inherit -S)
--set-properties SET_PROPERTIES
List of propererties to override when receiving
filesystems. (you can still restore them with zfs
inherit -S)
--rollback Rollback changes on the target before starting a
backup. (normally you can prevent changes by setting
the readonly property on the target_path to on)
--ignore-transfer-errors
Ignore transfer errors (still checks if received
filesystem exists. usefull for acltype errors)
--raw For encrypted datasets, send data exactly as it exists
on disk.
--test dont change anything, just show what would be done
(still does all read-only operations)
--verbose verbose output
--debug Show zfs commands that are executed, stops after an
exception.
--debug-output Show zfs commands and their output/exit codes. (noisy)
--progress show zfs progress output (to stderr)
When a filesystem fails, zfs_backup will continue and report the number of
failures at that end. Also the exit code will indicate the number of failures.
root@backup:~# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:McJhCxvaxvFhO/3e8Lf5gzSrlTWew7/bwrd2U2EHymE root@backup
The key's randomart image is:
+---[RSA 2048]----+
| + = |
| + X * E . |
| . = B + o o . |
| . o + o o.|
| S o .oo|
| . + o= +|
| . ++==.|
| .+o**|
| .. +B@|
+----[SHA256]-----+
root@backup:~#
```
## Backup example
#### Copy SSH key to `pve`
In this example we're going to back up a machine called `pve` to our backup server.
Now you need to copy the public part of the key to `pve`.
It's important to choose a unique and consistent backup name. In this case we name our backup: `offsite1`.
The `ssh-copy-id` command is a handy tool to automate this. It will just ask for your password.
```console
root@backup:~# ssh-copy-id root@pve.server.com
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
Password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'root@pve.server.com'"
and check to make sure that only the key(s) you wanted were added.
root@backup:~#
```
### Select filesystems to backup
On the source zfs system set the ```autobackup:offsite``` zfs property to true:
It's important to choose a unique and consistent backup name. In this case we name our backup: `offsite1`.
On the source zfs system set the ```autobackup:offsite1``` zfs property to true:
```console
[root@pve ~]# zfs set autobackup:offsite1=true rpool
@ -197,14 +184,10 @@ rpool/swap autobackup:offsite1 false
### Running zfs-autobackup
There are 2 ways to run the backup, but the end result is always the same. It's just a matter of security (trust relations between the servers) and preference.
First install the ssh-key on the server that you specify with --ssh-source or --ssh-target.
#### Method 1: Run the script on the backup server and pull the data from the server specified by --ssh-source. This is usually the preferred way and prevents a hacked server from accessing the backup data
Run the script on the backup server and pull the data from the server specified by --ssh-source.
```console
[root@backup ~]# zfs-autobackup --ssh-source pve.server.com offsite1 backup/pve --progress --verbose --resume
[root@backup ~]# zfs-autobackup --ssh-source pve.server.com offsite1 backup/pve --progress --verbose
#### Settings summary
[Source] Datasets on: pve.server.com
@ -213,14 +196,14 @@ First install the ssh-key on the server that you specify with --ssh-source or --
[Source] Keep oldest of 1 week, delete after 1 month.
[Source] Keep oldest of 1 month, delete after 1 year.
[Source] Send all datasets that have 'autobackup:offsite1=true' or 'autobackup:offsite1=child'
[Target] Datasets are local
[Target] Keep the last 10 snapshots.
[Target] Keep oldest of 1 day, delete after 1 week.
[Target] Keep oldest of 1 week, delete after 1 month.
[Target] Keep oldest of 1 month, delete after 1 year.
[Target] Receive datasets under: backup/pve
#### Selecting
[Source] rpool: Selected (direct selection)
[Source] rpool/ROOT: Selected (inherited selection)
@ -228,15 +211,14 @@ First install the ssh-key on the server that you specify with --ssh-source or --
[Source] rpool/data: Selected (inherited selection)
[Source] rpool/data/vm-100-disk-0: Selected (inherited selection)
[Source] rpool/swap: Ignored (disabled)
#### Snapshotting
[Source] rpool: No changes since offsite1-20200218175435
[Source] rpool/ROOT: No changes since offsite1-20200218175435
[Source] rpool/data: No changes since offsite1-20200218175435
[Source] Creating snapshot offsite1-20200218180123
#### Transferring
[Target] backup/pve/rpool/ROOT/pve-1@offsite1-20200218175435: resuming
#### Sending and thinning
[Target] backup/pve/rpool/ROOT/pve-1@offsite1-20200218175435: receiving full
[Target] backup/pve/rpool/ROOT/pve-1@offsite1-20200218175547: receiving incremental
[Target] backup/pve/rpool/ROOT/pve-1@offsite1-20200218175706: receiving incremental
@ -247,50 +229,34 @@ First install the ssh-key on the server that you specify with --ssh-source or --
...
```
#### Method 2: Run the script on the server and push the data to the backup server specified by --ssh-target
```console
[root@pve ~]# zfs-autobackup --ssh-target backup.server.com offsite1 backup/pve --progress --verbose --resume
#### Settings summary
[Source] Datasets are local
[Source] Keep the last 10 snapshots.
[Source] Keep oldest of 1 day, delete after 1 week.
[Source] Keep oldest of 1 week, delete after 1 month.
[Source] Keep oldest of 1 month, delete after 1 year.
[Source] Send all datasets that have 'autobackup:offsite1=true' or 'autobackup:offsite1=child'
[Target] Datasets on: backup.server.com
[Target] Keep the last 10 snapshots.
[Target] Keep oldest of 1 day, delete after 1 week.
[Target] Keep oldest of 1 week, delete after 1 month.
[Target] Keep oldest of 1 month, delete after 1 year.
[Target] Receive datasets under: backup/pve
...
```
Note that this is called a "pull" backup: The backup server pulls the backup from the server. This is usually the preferred way.
It's also possible to let a server push its backup to the backup server. However, this has security implications. In that case you would set up the SSH keys the other way around and use the --ssh-target parameter on the server.
### Automatic backups
Now every time you run the command, zfs-autobackup will create a new snapshot and replicate your data.
Now every time you run the command, zfs-autobackup will create a new snapshot and replicate your data.
Older snapshots will eventually be deleted, depending on the --keep-source and --keep-target settings. (The defaults are shown above under the 'Settings summary')
Older snapshots will eventually be deleted, depending on the `--keep-source` and `--keep-target` settings. (The defaults are shown above under the 'Settings summary')
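As a rough illustration of how a schedule string such as `10,1d1w,1w1m,1m1y` corresponds to the rules printed under 'Settings summary', here is a minimal Python sketch. The helper `describe_schedule` and its parsing rules are assumptions made for this example and only cover the `d`, `w`, `m` and `y` units that appear in the default schedule; it is not the parser zfs-autobackup itself uses.

```python
import re

# Unit letters seen in the default schedule; other units are not handled here.
UNIT_NAMES = {"d": "day", "w": "week", "m": "month", "y": "year"}

# One rule looks like "<count><unit><count><unit>", e.g. "1d1w".
RULE_PATTERN = re.compile(r"(\d+)([dwmy])(\d+)([dwmy])")


def _span(count, unit):
    """Render a time span such as '1 week' or '2 weeks'."""
    name = UNIT_NAMES[unit]
    return "{} {}{}".format(count, name, "" if count == 1 else "s")


def describe_schedule(schedule):
    """Translate a thinning schedule string into readable rules (hypothetical helper)."""
    lines = []
    for rule in schedule.split(","):
        rule = rule.strip()
        if rule.isdigit():
            # A bare number: keep that many of the most recent snapshots.
            lines.append("Keep the last {} snapshots.".format(int(rule)))
            continue
        match = RULE_PATTERN.fullmatch(rule)
        if match is None:
            raise ValueError("unrecognised rule: {!r}".format(rule))
        period_count, period_unit, ttl_count, ttl_unit = match.groups()
        lines.append("Keep oldest of {}, delete after {}.".format(
            _span(int(period_count), period_unit),
            _span(int(ttl_count), ttl_unit)))
    return lines


for line in describe_schedule("10,1d1w,1w1m,1m1y"):
    print(line)
```

Run with the default schedule, this prints the same four rules that the 'Settings summary' lists for source and target.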
Once you've got the correct settings for your situation, you can just store the command in a cronjob. Or just create a script and run it manually when you need it.
Once you've got the correct settings for your situation, you can just store the command in a cronjob.
Or just create a script and run it manually when you need it.
## Tips
* Use ```--verbose``` to see details, otherwise zfs-autobackup will be quiet and only show errors, like a nice unix command.
* Use ```--debug``` if something goes wrong and you want to see the commands that are executed. This will also stop at the first error.
* Use ```--resume``` to be able to resume aborted backups. (not all zfs versions support this)
* Set the ```readonly``` property of the target filesystem to ```on```. This prevents changes on the target side. If there are changes the next backup will fail and will require a zfs rollback. (by using the --rollback option for example)
* You can split up the snapshotting and sending tasks by creating two cronjobs. Use ```--no-send``` for the snapshotter-cronjob and use ```--no-snapshot``` for the send-cronjob. This is useful if you only want to send at night or if your sends take too long.
* Set the ```readonly``` property of the target filesystem to ```on```. This prevents changes on the target side. (Normally, if there are changes the next backup will fail and will require a zfs rollback.) Note that readonly means you can't change the CONTENTS of the dataset directly. It's still possible to receive new datasets and manipulate properties etc.
* Use ```--clear-refreservation``` to save space on your backup server.
* Use ```--clear-mountpoint``` to prevent the target server from mounting the backed-up filesystem in the wrong place during a reboot.
* Use ```--resume``` to be able to resume aborted backups. (not all zfs versions support this)
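The tip about splitting snapshotting and sending into two cronjobs could look roughly like the following Python sketch, which builds the two command lines and formats them as crontab entries. The host name, backup name and target path are taken from the example above; the helper names and the cron times (hourly snapshots, one send at 02:30) are assumptions chosen purely for illustration.

```python
import shlex

# Arguments shared by both jobs, taken from the pull example above.
BASE_COMMAND = ["zfs-autobackup", "--ssh-source", "pve.server.com",
                "offsite1", "backup/pve"]


def snapshot_job(base):
    """Command for the snapshotter-cronjob: create snapshots, send nothing."""
    return base + ["--no-send"]


def send_job(base):
    """Command for the send-cronjob: send existing snapshots, create none."""
    return base + ["--no-snapshot"]


def cron_line(schedule, argv):
    """Format a crontab entry; each argument is shell-quoted for safety."""
    return "{} {}".format(schedule, " ".join(shlex.quote(arg) for arg in argv))


# Hypothetical timing: snapshot every hour, send once per night.
print(cron_line("0 * * * *", snapshot_job(BASE_COMMAND)))
print(cron_line("30 2 * * *", send_job(BASE_COMMAND)))
```

Keeping the shared arguments in one list means both jobs always use the same backup name and target path, which matters because the backup name has to stay consistent.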
### Speeding up SSH and preventing connection flooding
### Speeding up SSH
Add this to your ~/.ssh/config:
You can make your ssh connections persistent and greatly speed up zfs-autobackup:
On the backup-server add this to your ~/.ssh/config:
```console
Host *
@ -299,8 +265,6 @@ Host *
ControlPersist 3600
```
This will make all your ssh connections persistent and greatly speed up zfs-autobackup for jobs with short intervals.
Thanks @mariusvw :)
### Specifying ssh port or options
@ -321,16 +285,126 @@ Also uses compression on slow links.
Look in man ssh_config for many more options.
## Usage
Here you find all the options:
```console
[root@server ~]# zfs-autobackup --help
usage: zfs-autobackup [-h] [--ssh-config SSH_CONFIG] [--ssh-source SSH_SOURCE]
[--ssh-target SSH_TARGET] [--keep-source KEEP_SOURCE]
[--keep-target KEEP_TARGET] [--other-snapshots]
[--no-snapshot] [--no-send] [--min-change MIN_CHANGE]
[--allow-empty] [--ignore-replicated] [--no-holds]
[--resume] [--strip-path STRIP_PATH]
[--clear-refreservation] [--clear-mountpoint]
[--filter-properties FILTER_PROPERTIES]
[--set-properties SET_PROPERTIES] [--rollback]
[--destroy-incompatible] [--ignore-transfer-errors]
[--raw] [--test] [--verbose] [--debug] [--debug-output]
[--progress]
backup_name target_path
zfs-autobackup v3.0-rc8 - Copyright 2020 E.H.Eefting (edwin@datux.nl)
positional arguments:
backup_name Name of the backup (you should set the zfs property
"autobackup:backup-name" to true on filesystems you
want to backup
target_path Target ZFS filesystem
optional arguments:
-h, --help show this help message and exit
--ssh-config SSH_CONFIG
Custom ssh client config
--ssh-source SSH_SOURCE
Source host to get backup from. (user@hostname)
Default None.
--ssh-target SSH_TARGET
Target host to push backup to. (user@hostname) Default
None.
--keep-source KEEP_SOURCE
Thinning schedule for old source snapshots. Default:
10,1d1w,1w1m,1m1y
--keep-target KEEP_TARGET
Thinning schedule for old target snapshots. Default:
10,1d1w,1w1m,1m1y
--other-snapshots Send over other snapshots as well, not just the ones
created by this tool.
--no-snapshot Dont create new snapshots (usefull for finishing
uncompleted backups, or cleanups)
--no-send Dont send snapshots (usefull for cleanups, or if you
want a serperate send-cronjob)
--min-change MIN_CHANGE
Number of bytes written after which we consider a
dataset changed (default 200000)
--allow-empty If nothing has changed, still create empty snapshots.
(same as --min-change=0)
--ignore-replicated Ignore datasets that seem to be replicated some other
way. (No changes since lastest snapshot. Usefull for
proxmox HA replication)
--no-holds Dont lock snapshots on the source. (Usefull to allow
proxmox HA replication to switches nodes)
--resume Support resuming of interrupted transfers by using the
zfs extensible_dataset feature (both zpools should
have it enabled) Disadvantage is that you need to use
zfs recv -A if another snapshot is created on the
target during a receive. Otherwise it will keep
failing.
--strip-path STRIP_PATH
Number of directory to strip from path (use 1 when
cloning zones between 2 SmartOS machines)
--clear-refreservation
Filter "refreservation" property. (recommended, safes
space. same as --filter-properties refreservation)
--clear-mountpoint Set property canmount=noauto for new datasets.
(recommended, prevents mount conflicts. same as --set-
properties canmount=noauto)
--filter-properties FILTER_PROPERTIES
List of propererties to "filter" when receiving
filesystems. (you can still restore them with zfs
inherit -S)
--set-properties SET_PROPERTIES
List of propererties to override when receiving
filesystems. (you can still restore them with zfs
inherit -S)
--rollback Rollback changes to the latest target snapshot before
starting. (normally you can prevent changes by setting
the readonly property on the target_path to on)
--destroy-incompatible
Destroy incompatible snapshots on target. Use with
care! (implies --rollback)
--ignore-transfer-errors
Ignore transfer errors (still checks if received
filesystem exists. usefull for acltype errors)
--raw For encrypted datasets, send data exactly as it exists
on disk.
--test dont change anything, just show what would be done
(still does all read-only operations)
--verbose verbose output
--debug Show zfs commands that are executed, stops after an
exception.
--debug-output Show zfs commands and their output/exit codes. (noisy)
--progress show zfs progress output (to stderr)
When a filesystem fails, zfs_backup will continue and report the number of
failures at that end. Also the exit code will indicate the number of failures.
```
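Since the help text says the exit code indicates the number of failures, a wrapper script could use it to decide whether to raise an alert. The sketch below is one possible way to do that in Python; `describe_exit_code` and `run_backup` are hypothetical helpers, and the sketch assumes that every positive exit code is a failure count, which may not hold if the tool exits for other reasons.

```python
import subprocess


def describe_exit_code(returncode):
    """Turn an exit code into a short status message (hypothetical helper)."""
    if returncode == 0:
        return "backup finished without failures"
    if returncode < 0:
        # subprocess reports termination by a signal as a negative value on POSIX.
        return "terminated by signal {}".format(-returncode)
    return "{} filesystem(s) failed".format(returncode)


def run_backup(argv):
    """Run the given zfs-autobackup command line and describe its result."""
    completed = subprocess.run(argv)
    return describe_exit_code(completed.returncode)


# Example call (requires zfs-autobackup to be installed):
# print(run_backup(["zfs-autobackup", "--ssh-source", "pve.server.com",
#                   "offsite1", "backup/pve", "--verbose"]))
```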
## Troubleshooting
> ### cannot receive incremental stream: invalid backup stream
### It keeps asking for my SSH password
You forgot to set up automatic login via SSH keys. See the example above for how to do this.
### It says 'cannot receive incremental stream: invalid backup stream'
This usually means you've created a new snapshot on the target side during a backup:
* Solution 1: Restart zfs-autobackup and make sure you don't use --resume. If you did use --resume, be sure to "abort" the receive on the target side with zfs recv -A.
* Solution 2: Destroy the newly created snapshot and restart zfs-autobackup.
> ### internal error: Invalid argument
### It says 'internal error: Invalid argument'
In some cases (Linux -> FreeBSD) this means certain properties are not fully supported on the target system.


@ -1 +0,0 @@
zfs_autobackup

1622
bin/zfs-autobackup Executable file

File diff suppressed because it is too large

File diff suppressed because it is too large


@ -1 +1 @@
zfs_autobackup
zfs-autobackup


@ -15,3 +15,4 @@ source token
python3 -m twine check dist/*
python3 -m twine upload dist/*
git push --tags


@ -2,7 +2,7 @@ import setuptools
import bin.zfs_autobackup
import os
os.system("git tag -m'' -a v{}".format(bin.zfs_autobackup.VERSION))
os.system("git tag -m ' ' -a v{}".format(bin.zfs_autobackup.VERSION))
with open("README.md", "r") as fh:
long_description = fh.read()
@ -17,7 +17,7 @@ setuptools.setup(
long_description_content_type="text/markdown",
url="https://github.com/psy0rz/zfs_autobackup",
scripts=["bin/zfs_autobackup", "bin/zfs-autobackup"],
scripts=["bin/zfs-autobackup"],
packages=setuptools.find_packages(),
classifiers=[
"Programming Language :: Python :: 2",