Compare commits

...

33 Commits

Author SHA1 Message Date
8a960389d1 fix url for pypi 2020-03-31 20:01:28 +02:00
c7cd73ae1f Merge branch 'master' of github.com:psy0rz/zfs_autobackup 2020-03-31 19:33:26 +02:00
c8c1d0fd27 transparancy 2020-03-31 19:33:09 +02:00
c090979f3e spelling and links 2020-03-31 19:23:04 +02:00
3a4062c983 tried to clear up thinner documetion and output 2020-03-31 19:06:46 +02:00
bcf73c6e5c --min-change is now has 1 instead of 200000 as default. fixes #37 2020-03-29 23:47:26 +02:00
9cf5ce188a added thinner documentation. fixes #21 2020-03-29 23:23:47 +02:00
a226309ce5 spelling 2020-03-29 23:22:18 +02:00
231f41e195 update 2020-03-17 23:52:57 +01:00
7c1546fb49 improved --rollback code. detect and show incompatible snapshots on target. added --destroy-incompatible option. fixes #34 2020-03-17 23:51:16 +01:00
b1dd2b55f8 improved error logging 2020-03-17 19:55:16 +01:00
4ed53eb03f fix linter isue 2020-03-15 23:02:03 +01:00
6f8c73b87f rc7 2020-03-15 22:59:21 +01:00
ee03da2f9b exposed --min-change value as a parameter. (was hardcoded at 200000) 2020-03-15 22:54:14 +01:00
e737d0a79f improved example and cleaned up 2020-03-15 21:42:09 +01:00
cbd281c79d Merge pull request #32 from mariusvw/feature/ssh-keygen (Feature/ssh keygen) 2020-03-14 22:49:26 +01:00
dfd38985d1 Merge remote-tracking branch 'remotes/mariusvw/feature/ssh-config' 2020-03-14 22:46:53 +01:00
f1c15cec18 Merge pull request #30 from mariusvw/feature/issue-25 (Issue #25, disable colors on non-tty) 2020-03-14 22:15:43 +01:00
1bc35f5812 explained splitting of jobs 2020-03-14 22:14:11 +01:00
805a3147b5 added --no-send option. snapshots that are obsolete are now destroyed at the beginning of each dataset-transfer. this allows using --no-send as way to just thinout old snapshots. cleaned up stderr output when resuming. 2020-03-14 22:04:16 +01:00
944435cbd1 Added another ssh-keygen example without passphrase 2020-03-09 10:16:40 +01:00
022a7d75e2 Updated readme with ssh-keygen example 2020-03-09 10:13:48 +01:00
14ac525525 Issue #25, disable colors on non-tty 2020-03-08 23:55:07 +01:00
3a45951361 Updated README 2020-03-08 23:16:37 +01:00
2a300bbcba Added support for custom ssh client config 2020-03-08 23:16:11 +01:00
bdeb4c40fa Cleanup whitespace 2020-03-08 23:05:49 +01:00
e8b90abfde Cleaned whitespace 2020-03-08 22:05:48 +01:00
1d9c25d3b4 prevent emitting useless error messages in some cases when holding/release/destroying snapshots 2020-02-25 20:15:11 +01:00
56d7f8c754 rc5 2020-02-25 18:41:04 +01:00
ef5bca3de1 start at correct snapshot when full send 2020-02-25 18:35:35 +01:00
3b2a19d492 migrate/other-snapshot feature almost done 2020-02-25 00:58:25 +01:00
d2314c0143 imp is not used 2020-02-24 14:30:07 +01:00
f3a80991c9 fix release stuff 2020-02-24 14:20:42 +01:00
6 changed files with 604 additions and 355 deletions

451
README.md

@ -4,19 +4,20 @@
* Complete rewrite, cleaner object oriented code.
* Python 3 and 2 support.
* Installable via pip.
* Installable via [pip](https://pypi.org/project/zfs-autobackup/).
* Backwards compatible with your current backups and parameters.
* Progressive thinning (via a destroy schedule. default schedule should be fine for most people)
* Cleaner output, with optional color support (pip install colorama).
* Clear distinction between local and remote output.
* Summary at the beginning, displaying what will happen and the current thinning-schedule.
* More effient destroying/skipping snaphots on the fly. (no more space issues if your backup is way behind)
* More efficient destroying/skipping snapshots on the fly. (no more space issues if your backup is way behind)
* Progress indicator (--progress)
* Better property management (--set-properties and --filter-properties)
* Better resume handling, automaticly abort invalid resumes.
* Better resume handling, automatically abort invalid resumes.
* More robust error handling.
* Prepared for future enhanchements.
* Prepared for future enhancements.
* Supports raw backups for encryption.
* Custom SSH client config.
## Introduction
@ -28,9 +29,9 @@ Other settings are just specified on the commandline. This also makes it easier
Since it's using ZFS commands, you can see what it's actually doing by specifying `--debug`. This also helps a lot if you run into some strange problem or error. You can just copy-paste the command that fails and play around with it on the commandline. (also something I missed in other tools)
An imporant feature thats missing from other tools is a reliable `--test` option: This allows you to see what zfs-autobackup will do and tune your parameters. It will do everything, except make changes to your zfs datasets.
An important feature that's missing from other tools is a reliable `--test` option: This allows you to see what zfs-autobackup will do and tune your parameters. It will do everything, except make changes to your zfs datasets.
Another nice thing is progress reporting with `--progress`. Its very usefull with HUGE datasets, when you want to know how many hours/days it will take.
Another nice thing is progress reporting with `--progress`. It's very useful with HUGE datasets, when you want to know how many hours/days it will take.
zfs-autobackup tries to be the easiest to use backup tool for zfs.
@ -63,12 +64,14 @@ zfs-autobackup tries to be the easiest to use backup tool for zfs.
### Using pip
The recommended way on most servers is to use pip:
The recommended way on most servers is to use [pip](https://pypi.org/project/zfs-autobackup/):
```console
[root@server ~]# pip install zfs-autobackup
[root@server ~]# pip install --upgrade zfs-autobackup
```
This can also be used to upgrade zfs-autobackup to the newest stable version.
### Using easy_install
On older servers you might have to use easy_install
@ -87,13 +90,69 @@ It should work with python 2.7 and higher.
## Example
In this example we're going to backup a machine called `pve` to our backupserver.
In this example we're going to back up a machine called `pve` to a machine called `backup`.
Its important to choose a unique and consistent backup name. In this case we name our backup: `offsite1`.
### Setup SSH login
zfs-autobackup needs passwordless login via ssh. This means generating an ssh key and copying it to the remote server.
#### Generate SSH key on `backup`
On the server that runs zfs-autobackup you need to create an SSH key. You only need to do this once.
Use the `ssh-keygen` command and leave the passphrase empty:
```console
root@backup:~# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:McJhCxvaxvFhO/3e8Lf5gzSrlTWew7/bwrd2U2EHymE root@backup
The key's randomart image is:
+---[RSA 2048]----+
| + = |
| + X * E . |
| . = B + o o . |
| . o + o o.|
| S o .oo|
| . + o= +|
| . ++==.|
| .+o**|
| .. +B@|
+----[SHA256]-----+
root@backup:~#
```
#### Copy SSH key to `pve`
Now you need to copy the public part of the key to `pve`.
The `ssh-copy-id` command is a handy tool to automate this. It will just ask for your password.
```console
root@backup:~# ssh-copy-id root@pve.server.com
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
Password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'root@pve.server.com'"
and check to make sure that only the key(s) you wanted were added.
root@backup:~#
```
### Select filesystems to backup
On the source zfs system set the ```autobackup:offsite``` zfs property to true:
It's important to choose a unique and consistent backup name. In this case we name our backup: `offsite1`.
On the source zfs system set the ```autobackup:offsite1``` zfs property to true:
```console
[root@pve ~]# zfs set autobackup:offsite1=true rpool
@ -108,7 +167,7 @@ rpool/swap autobackup:offsite1 true
...
```
Because we dont want to backup everything, we can exclude certain filesystem by setting the property to false:
Because we don't want to back up everything, we can exclude certain filesystems by setting the property to false:
```console
[root@pve ~]# zfs set autobackup:offsite1=false rpool/swap
@ -125,13 +184,7 @@ rpool/swap autobackup:offsite1 false
### Running zfs-autobackup
Before you start, make sure you can login to the server without password, by using `SSH keys`. Look at the troubleshooting section for more info.
There are 2 ways to run the backup, but the endresult is always the same. Its just a matter of security (trust relations between the servers) and preference.
#### Method 1: Pull backup
Run the script on the backup server and pull the data from the server specfied by --ssh-source. This is usually the preferred way and prevents a hacked server from accesing the backup-data.
Run the script on the backup server and pull the data from the server specified by --ssh-source.
```console
[root@backup ~]# zfs-autobackup --ssh-source pve.server.com offsite1 backup/pve --progress --verbose
@ -139,18 +192,18 @@ Run the script on the backup server and pull the data from the server specfied b
#### Settings summary
[Source] Datasets on: pve.server.com
[Source] Keep the last 10 snapshots.
[Source] Keep oldest of 1 day, delete after 1 week.
[Source] Keep oldest of 1 week, delete after 1 month.
[Source] Keep oldest of 1 month, delete after 1 year.
[Source] Keep every 1 day, delete after 1 week.
[Source] Keep every 1 week, delete after 1 month.
[Source] Keep every 1 month, delete after 1 year.
[Source] Send all datasets that have 'autobackup:offsite1=true' or 'autobackup:offsite1=child'
[Target] Datasets are local
[Target] Keep the last 10 snapshots.
[Target] Keep oldest of 1 day, delete after 1 week.
[Target] Keep oldest of 1 week, delete after 1 month.
[Target] Keep oldest of 1 month, delete after 1 year.
[Target] Keep every 1 day, delete after 1 week.
[Target] Keep every 1 week, delete after 1 month.
[Target] Keep every 1 month, delete after 1 year.
[Target] Receive datasets under: backup/pve
#### Selecting
[Source] rpool: Selected (direct selection)
[Source] rpool/ROOT: Selected (inherited selection)
@ -158,14 +211,14 @@ Run the script on the backup server and pull the data from the server specfied b
[Source] rpool/data: Selected (inherited selection)
[Source] rpool/data/vm-100-disk-0: Selected (inherited selection)
[Source] rpool/swap: Ignored (disabled)
#### Snapshotting
[Source] rpool: No changes since offsite1-20200218175435
[Source] rpool/ROOT: No changes since offsite1-20200218175435
[Source] rpool/data: No changes since offsite1-20200218175435
[Source] Creating snapshot offsite1-20200218180123
#### Transferring
#### Sending and thinning
[Target] backup/pve/rpool/ROOT/pve-1@offsite1-20200218175435: receiving full
[Target] backup/pve/rpool/ROOT/pve-1@offsite1-20200218175547: receiving incremental
[Target] backup/pve/rpool/ROOT/pve-1@offsite1-20200218175706: receiving incremental
@ -176,147 +229,112 @@ Run the script on the backup server and pull the data from the server specfied b
...
```
#### Method 2: push backup
Note that this is called a "pull" backup: The backup server pulls the backup from the server. This is usually the preferred way.
Run the script on the server and push the data to the backup server specified by --ssh-target.
```console
[root@pve ~]# zfs-autobackup --ssh-target backup.server.com offsite1 backup/pve --progress --verbose
#### Settings summary
[Source] Datasets are local
[Source] Keep the last 10 snapshots.
[Source] Keep oldest of 1 day, delete after 1 week.
[Source] Keep oldest of 1 week, delete after 1 month.
[Source] Keep oldest of 1 month, delete after 1 year.
[Source] Send all datasets that have 'autobackup:offsite1=true' or 'autobackup:offsite1=child'
[Target] Datasets on: backup.server.com
[Target] Keep the last 10 snapshots.
[Target] Keep oldest of 1 day, delete after 1 week.
[Target] Keep oldest of 1 week, delete after 1 month.
[Target] Keep oldest of 1 month, delete after 1 year.
[Target] Receive datasets under: backup/pve
...
```
It's also possible to let a server push its backup to the backup server. However, this has security implications. In that case you would set up the SSH keys the other way around and use the `--ssh-target` parameter on the server.
### Automatic backups
Now everytime you run the command, zfs-autobackup will create a new snapshot and replicate your data.
Now every time you run the command, zfs-autobackup will create a new snapshot and replicate your data.
Older snapshots will evertually be deleted, depending on the `--keep-source` and `--keep-target` settings. (The defaults are shown above under the 'Settings summary')
Older snapshots will eventually be deleted, depending on the `--keep-source` and `--keep-target` settings. (The defaults are shown above under the 'Settings summary')
Once you've got the correct settings for your situation, you can just store the command in a cronjob.
Once you've got the correct settings for your situation, you can just store the command in a cronjob.
Or just create a script and run it manually when you need it.
### Thinning out obsolete snapshots
The thinner is the thing that destroys old snapshots on the source and target.
The thinner operates "stateless": There is nothing in the name or properties of a snapshot that indicates how long it will be kept. Every time zfs-autobackup runs, it will look at the timestamp of all the existing snapshots. From there it will determine which snapshots are obsolete according to your schedule. The advantage of this stateless system is that you can always change the schedule.
Note that the thinner will ONLY destroy snapshots that match the naming pattern of zfs-autobackup. If you use `--other-snapshots`, it won't destroy those snapshots after replicating them to the target.
#### Thinning schedule
The default thinning schedule is: `10,1d1w,1w1m,1m1y`.
The schedule consists of multiple rules separated by a `,`
A plain number specifies how many snapshots you want to always keep, regardless of time or interval.
The format of the other rules is: `<Interval><TTL>`.
* Interval: The minimum interval between the snapshots. Snapshots with intervals smaller than this will be destroyed.
* TTL: The maximum time to live of a snapshot. After that it will be destroyed.
* These are the time units you can use for interval and TTL:
* `y`: Years
* `m`: Months
* `w`: Weeks
* `d`: Days
* `h`: Hours
* `min`: Minutes
* `s`: Seconds
Since this might sound very complicated, the `--verbose` option will show you what it all means:
```console
[Source] Keep the last 10 snapshots.
[Source] Keep every 1 day, delete after 1 week.
[Source] Keep every 1 week, delete after 1 month.
[Source] Keep every 1 month, delete after 1 year.
```
A snapshot will only be destroyed if it is not needed anymore by ANY of the rules.
You can specify as many rules as you need. The order of the rules doesn't matter.
Keep in mind it's up to you to actually run zfs-autobackup often enough: If you want to keep hourly snapshots, you have to make sure you run it at least every hour.
However, it's no problem if you run it more or less often than that: The thinner will still do its best to choose an optimal set of snapshots to keep.
If you want to keep as few snapshots as possible, just specify 0. (`--keep-source=0` for example)
If you want to keep ALL the snapshots, just specify a very high number.
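To make the schedule format concrete, here is a minimal Python sketch that parses a schedule string such as `10,1d1w,1w1m,1m1y` into a keep-count and a list of rules. It is not the parser from zfs-autobackup itself: the names `parse_schedule`, `UNIT_SECONDS` and `RULE_RE` are hypothetical, and the lengths used for a month (30 days) and a year (365 days) are assumptions made for this example.

```python
import re

# Seconds per time unit. The month and year lengths are assumptions
# of this sketch, not values taken from zfs-autobackup.
UNIT_SECONDS = {
    "s": 1,
    "min": 60,
    "h": 3600,
    "d": 24 * 3600,
    "w": 7 * 24 * 3600,
    "m": 30 * 24 * 3600,
    "y": 365 * 24 * 3600,
}

# One rule is <Interval><TTL>, each being a number followed by a unit.
RULE_RE = re.compile(r"^(\d+)(s|min|h|d|w|m|y)(\d+)(s|min|h|d|w|m|y)$")


def parse_schedule(schedule):
    """Return (always_keep, rules).

    always_keep: number of newest snapshots to keep regardless of age.
    rules: list of (interval_seconds, ttl_seconds) tuples.
    """
    always_keep = 0
    rules = []
    for part in schedule.split(","):
        part = part.strip()
        if not part:
            continue
        if part.isdigit():
            # a plain number: how many snapshots to always keep
            always_keep = int(part)
            continue
        match = RULE_RE.match(part)
        if match is None:
            raise ValueError("invalid thinning rule: {!r}".format(part))
        amount, unit, ttl_amount, ttl_unit = match.groups()
        rules.append((int(amount) * UNIT_SECONDS[unit],
                      int(ttl_amount) * UNIT_SECONDS[ttl_unit]))
    return always_keep, rules
```

Calling `parse_schedule("10,1d1w,1w1m,1m1y")` yields a keep-count of 10 and three rules (daily for a week, weekly for a month, monthly for a year), which corresponds to the four lines printed by `--verbose` above.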
#### More details about the Thinner
We will give a practical example of how the thinner operates.
Say we have 3 thinner rules:
* We want to keep daily snapshots for 7 days.
* We want to keep weekly snapshots for 4 weeks.
* We want to keep monthly snapshots for 12 months.
So far we have taken 4 snapshots at random moments:
![thinner example](https://raw.githubusercontent.com/psy0rz/zfs_autobackup/master/doc/thinner.png)
For every rule, the thinner will divide the timeline in blocks and assign each snapshot to a block.
A block can only be assigned one snapshot: If multiple snapshots fall into the same block, the block is only assigned to the oldest snapshot that we want to keep.
The colors show to which block a snapshot belongs:
* Snapshot 1: This snapshot belongs to daily block 1, weekly block 0 and monthly block 0. However the daily block is too old.
* Snapshot 2: Since weekly block 0 and monthly block 0 already have a snapshot, it only belongs to daily block 4.
* Snapshot 3: This snapshot belongs to daily block 8 and weekly block 1.
* Snapshot 4: Since daily block 8 already has a snapshot, this one doesn't belong to anything and can be deleted right away. (it will be kept for now since it's the last snapshot)
zfs-autobackup will re-evaluate this on every run: As soon as a snapshot doesn't belong to any block anymore it will be destroyed.
Snapshots on the source that still have to be sent to the target won't be destroyed, of course. (If the target still wants them, according to the target schedule)
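The block assignment described above can be sketched in a few lines of Python. This is an illustration written for this explanation, not the `Thinner` class from zfs-autobackup: the function name `thin`, its parameters, and the use of plain integer timestamps in seconds are all assumptions. Rules are given as `(interval_seconds, ttl_seconds)` tuples.

```python
def thin(timestamps, always_keep, rules, now):
    """Split snapshot timestamps (in seconds) into (keeps, removes).

    timestamps: timestamps of the existing snapshots
    always_keep: number of newest snapshots to keep regardless of the rules
    rules: list of (interval_seconds, ttl_seconds) tuples
    now: current time in seconds
    """
    ordered = sorted(timestamps)  # oldest first

    # the newest `always_keep` snapshots are kept unconditionally
    keeps = set(ordered[-always_keep:]) if always_keep > 0 else set()

    # per rule: the block numbers that already have a snapshot assigned
    used_blocks = [set() for _ in rules]

    for ts in ordered:
        age = now - ts
        for index, (interval, ttl) in enumerate(rules):
            block = ts // interval  # which block of this rule the snapshot falls in
            if age <= ttl and block not in used_blocks[index]:
                # the oldest snapshot that is still within the TTL claims the block
                used_blocks[index].add(block)
                keeps.add(ts)

    removes = [ts for ts in ordered if ts not in keeps]
    return sorted(keeps), removes
```

Feeding this sketch four timestamps laid out like the example (one in daily block 1, one in daily block 4, two in daily block 8) with the three rules and `always_keep=0` keeps the first three and marks the fourth for removal. With `always_keep=1` the fourth is kept as well, because it is the newest snapshot.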
## Tips
* Use ```--verbose``` to see details, otherwise zfs-autobackup will be quiet and only show errors, like a nice unix command.
* Use ```--debug``` if something goes wrong and you want to see the commands that are executed. This will also stop at the first error.
* Use ```--resume``` to be able to resume aborted backups. (not all zfs versions support this)
* You can split up the snapshotting and sending tasks by creating two cronjobs. Use ```--no-send``` for the snapshotter-cronjob and use ```--no-snapshot``` for the send-cronjob. This is useful if you only want to send at night or if your send takes too long.
* Set the ```readonly``` property of the target filesystem to ```on```. This prevents changes on the target side. (Normally, if there are changes the next backup will fail and will require a zfs rollback.) Note that readonly means you can't change the CONTENTS of the dataset directly. It's still possible to receive new datasets and manipulate properties etc.
* Use ```--clear-refreservation``` to save space on your backup server.
* Use ```--clear-mountpoint``` to prevent the target server from mounting the backupped filesystem in the wrong place during a reboot.
* Use ```--resume``` to be able to resume aborted backups. (not all zfs versions support this)
## Usage
### Speeding up SSH
Here you find all the options:
You can make your ssh connections persistent and greatly speed up zfs-autobackup:
```console
[root@server ~]# zfs-autobackup --help
usage: zfs-autobackup [-h] [--ssh-source SSH_SOURCE] [--ssh-target SSH_TARGET]
[--keep-source KEEP_SOURCE] [--keep-target KEEP_TARGET]
[--no-snapshot] [--allow-empty] [--ignore-replicated]
[--no-holds] [--resume] [--strip-path STRIP_PATH]
[--buffer BUFFER] [--clear-refreservation]
[--clear-mountpoint]
[--filter-properties FILTER_PROPERTIES]
[--set-properties SET_PROPERTIES] [--rollback]
[--ignore-transfer-errors] [--raw] [--test] [--verbose]
[--debug] [--debug-output] [--progress]
backup_name target_path
ZFS autobackup 3.0-rc3
positional arguments:
backup_name Name of the backup (you should set the zfs property
"autobackup:backup-name" to true on filesystems you
want to backup
target_path Target ZFS filesystem
optional arguments:
-h, --help show this help message and exit
--ssh-source SSH_SOURCE
Source host to get backup from. (user@hostname)
Default None.
--ssh-target SSH_TARGET
Target host to push backup to. (user@hostname) Default
None.
--keep-source KEEP_SOURCE
Thinning schedule for old source snapshots. Default:
10,1d1w,1w1m,1m1y
--keep-target KEEP_TARGET
Thinning schedule for old target snapshots. Default:
10,1d1w,1w1m,1m1y
--no-snapshot dont create new snapshot (usefull for finishing
uncompleted backups, or cleanups)
--allow-empty if nothing has changed, still create empty snapshots.
--ignore-replicated Ignore datasets that seem to be replicated some other
way. (No changes since lastest snapshot. Usefull for
proxmox HA replication)
--no-holds Dont lock snapshots on the source. (Usefull to allow
proxmox HA replication to switches nodes)
--resume support resuming of interrupted transfers by using the
zfs extensible_dataset feature (both zpools should
have it enabled) Disadvantage is that you need to use
zfs recv -A if another snapshot is created on the
target during a receive. Otherwise it will keep
failing.
--strip-path STRIP_PATH
number of directory to strip from path (use 1 when
cloning zones between 2 SmartOS machines)
--clear-refreservation
Filter "refreservation" property. (recommended, safes
space. same as --filter-properties refreservation)
--clear-mountpoint Filter "canmount" property. You still have to set
canmount=noauto on the backup server. (recommended,
prevents mount conflicts. same as --filter-properties
canmount)
--filter-properties FILTER_PROPERTIES
List of propererties to "filter" when receiving
filesystems. (you can still restore them with zfs
inherit -S)
--set-properties SET_PROPERTIES
List of propererties to override when receiving
filesystems. (you can still restore them with zfs
inherit -S)
--rollback Rollback changes on the target before starting a
backup. (normally you can prevent changes by setting
the readonly property on the target_path to on)
--ignore-transfer-errors
Ignore transfer errors (still checks if received
filesystem exists. usefull for acltype errors)
--raw For encrypted datasets, send data exactly as it exists
on disk.
--test dont change anything, just show what would be done
(still does all read-only operations)
--verbose verbose output
--debug Show zfs commands that are executed, stops after an
exception.
--debug-output Show zfs commands and their output/exit codes. (noisy)
--progress show zfs progress output (to stderr)
When a filesystem fails, zfs_backup will continue and report the number of
failures at that end. Also the exit code will indicate the number of failures.
```
### Speeding up SSH and prevent connection flooding
Add this to your ~/.ssh/config:
On the backup-server add this to your ~/.ssh/config:
```console
Host *
@ -325,8 +343,6 @@ Host *
ControlPersist 3600
```
This will make all your ssh connections persistent and greatly speed up zfs-autobackup for jobs with short intervals.
Thanks @mariusvw :)
### Specifying ssh port or options
@ -347,24 +363,126 @@ Also uses compression on slow links.
Look in man ssh_config for many more options.
## Usage
Here you find all the options:
```console
[root@server ~]# zfs-autobackup --help
usage: zfs-autobackup [-h] [--ssh-config SSH_CONFIG] [--ssh-source SSH_SOURCE]
[--ssh-target SSH_TARGET] [--keep-source KEEP_SOURCE]
[--keep-target KEEP_TARGET] [--other-snapshots]
[--no-snapshot] [--no-send] [--min-change MIN_CHANGE]
[--allow-empty] [--ignore-replicated] [--no-holds]
[--resume] [--strip-path STRIP_PATH]
[--clear-refreservation] [--clear-mountpoint]
[--filter-properties FILTER_PROPERTIES]
[--set-properties SET_PROPERTIES] [--rollback]
[--destroy-incompatible] [--ignore-transfer-errors]
[--raw] [--test] [--verbose] [--debug] [--debug-output]
[--progress]
backup_name target_path
zfs-autobackup v3.0-rc8 - Copyright 2020 E.H.Eefting (edwin@datux.nl)
positional arguments:
backup_name Name of the backup (you should set the zfs property
"autobackup:backup-name" to true on filesystems you
want to backup
target_path Target ZFS filesystem
optional arguments:
-h, --help show this help message and exit
--ssh-config SSH_CONFIG
Custom ssh client config
--ssh-source SSH_SOURCE
Source host to get backup from. (user@hostname)
Default None.
--ssh-target SSH_TARGET
Target host to push backup to. (user@hostname) Default
None.
--keep-source KEEP_SOURCE
Thinning schedule for old source snapshots. Default:
10,1d1w,1w1m,1m1y
--keep-target KEEP_TARGET
Thinning schedule for old target snapshots. Default:
10,1d1w,1w1m,1m1y
--other-snapshots Send over other snapshots as well, not just the ones
created by this tool.
--no-snapshot Dont create new snapshots (usefull for finishing
uncompleted backups, or cleanups)
--no-send Dont send snapshots (usefull for cleanups, or if you
want a separate send-cronjob)
--min-change MIN_CHANGE
Number of bytes written after which we consider a
dataset changed (default 1)
--allow-empty If nothing has changed, still create empty snapshots.
(same as --min-change=0)
--ignore-replicated Ignore datasets that seem to be replicated some other
way. (No changes since lastest snapshot. Usefull for
proxmox HA replication)
--no-holds Dont lock snapshots on the source. (Usefull to allow
proxmox HA replication to switches nodes)
--resume Support resuming of interrupted transfers by using the
zfs extensible_dataset feature (both zpools should
have it enabled) Disadvantage is that you need to use
zfs recv -A if another snapshot is created on the
target during a receive. Otherwise it will keep
failing.
--strip-path STRIP_PATH
Number of directory to strip from path (use 1 when
cloning zones between 2 SmartOS machines)
--clear-refreservation
Filter "refreservation" property. (recommended, safes
space. same as --filter-properties refreservation)
--clear-mountpoint Set property canmount=noauto for new datasets.
(recommended, prevents mount conflicts. same as --set-
properties canmount=noauto)
--filter-properties FILTER_PROPERTIES
List of properties to "filter" when receiving
filesystems. (you can still restore them with zfs
inherit -S)
--set-properties SET_PROPERTIES
List of properties to override when receiving
filesystems. (you can still restore them with zfs
inherit -S)
--rollback Rollback changes to the latest target snapshot before
starting. (normally you can prevent changes by setting
the readonly property on the target_path to on)
--destroy-incompatible
Destroy incompatible snapshots on target. Use with
care! (implies --rollback)
--ignore-transfer-errors
Ignore transfer errors (still checks if received
filesystem exists. usefull for acltype errors)
--raw For encrypted datasets, send data exactly as it exists
on disk.
--test dont change anything, just show what would be done
(still does all read-only operations)
--verbose verbose output
--debug Show zfs commands that are executed, stops after an
exception.
--debug-output Show zfs commands and their output/exit codes. (noisy)
--progress show zfs progress output (to stderr)
When a filesystem fails, zfs_backup will continue and report the number of
failures at that end. Also the exit code will indicate the number of failures.
```
## Troubleshooting
### It keeps asking for my SSH password
You forgot to setup automatic login via SSH keys:
You forgot to set up automatic login via SSH keys. Look at the example to see how to do this.
* Create a SSH key on the server that you want to run zfs-autobackup on. Use `ssh-keygen`.
* Copy the public key to your clipboard. Get it with `cat /root/.ssh/id_rsa.pub`
* Add the key to the server you specified with --ssh-source or --ssh-target. Create and add it to `/root/.ssh/authorized_keys`
> ### cannot receive incremental stream: invalid backup stream
### It says 'cannot receive incremental stream: invalid backup stream'
This usually means you've created a new snapshot on the target side during a backup:
* Solution 1: Restart zfs-autobackup and make sure you dont use --resume. If you did use --resume, be sure to "abort" the recveive on the target side with zfs recv -A.
* Solution 1: Restart zfs-autobackup and make sure you don't use --resume. If you did use --resume, be sure to "abort" the receive on the target side with zfs recv -A.
* Solution 2: Destroy the newly created snapshot and restart zfs-autobackup.
> ### internal error: Invalid argument
### It says 'internal error: Invalid argument'
In some cases (Linux -> FreeBSD) this means certain properties are not fully supported on the target system.
@ -390,13 +508,13 @@ Put this command directly after the zfs_backup command in your cronjob:
zabbix-job-status backup_smartos01_fs1 daily $?
```
This will update the zabbix server with the exitcode and will also alert you if the job didnt run for more than 2 days.
This will update the zabbix server with the exit code and will also alert you if the job didn't run for more than 2 days.
## Backing up a proxmox cluster with HA replication
Due to the nature of proxmox we had to make a few enhancements to zfs-autobackup. This will probably also benefit other systems that use their own replication in combination with zfs-autobackup.
All data under rpool/data can be on multiple nodes of the cluster. The naming of those filesystem is unique over the whole cluster. Because of this we should backup rpool/data of all nodes to the same destination. This way we wont have duplicate backups of the filesystems that are replicated. Because of various options, you can even migrate hosts and zfs-autobackup will be fine. (and it will get the next backup from the new node automaticly)
All data under rpool/data can be on multiple nodes of the cluster. The naming of those filesystems is unique over the whole cluster. Because of this we should back up rpool/data of all nodes to the same destination. This way we won't have duplicate backups of the filesystems that are replicated. Because of various options, you can even migrate hosts and zfs-autobackup will be fine. (and it will get the next backup from the new node automatically)
In the example below we have 3 nodes, named h4, h5 and h6.
@ -422,6 +540,7 @@ Extra options needed for proxmox with HA:
* --no-holds: To allow proxmox to destroy our snapshots if a VM migrates to another node.
* --ignore-replicated: To ignore the replicated filesystems of proxmox on the receiving proxmox nodes. (e.g: only backup from the node where the VM is active)
* --min-change 200000: `--ignore-replicated` works by checking if there are no changes since the last snapshot. However, for some reason proxmox always has some small changes. (Probably house-keeping data or something? This always was fine and suddenly changed with an update)
I use the following backup script on the backup server:
@ -429,7 +548,7 @@ I use the following backup script on the backup server:
for H in h4 h5 h6; do
echo "################################### DATA $H"
#backup data filesystems to a common place
./zfs-autobackup --ssh-source root@$H data_smartos03 zones/backup/zfsbackups/pxe1_data --clear-refreservation --clear-mountpoint --ignore-transfer-errors --strip-path 2 --verbose --resume --ignore-replicated --no-holds $@
./zfs-autobackup --ssh-source root@$H data_smartos03 zones/backup/zfsbackups/pxe1_data --clear-refreservation --clear-mountpoint --ignore-transfer-errors --strip-path 2 --verbose --resume --ignore-replicated --min-change 200000 --no-holds $@
zabbix-job-status backup_$H""_data_smartos03 daily $? >/dev/null 2>/dev/null
echo "################################### RPOOL $H"

@ -13,22 +13,21 @@ import re
import traceback
import subprocess
import pprint
# import cStringIO
import time
import argparse
from pprint import pprint as p
import select
use_color=False
if sys.stdout.isatty():
try:
import colorama
use_color=True
except ImportError:
pass
import imp
try:
import colorama
use_color=True
except ImportError:
use_color=False
VERSION="3.0-rc4"
VERSION="3.0-rc9"
HEADER="zfs-autobackup v{} - Copyright 2020 E.H.Eefting (edwin@datux.nl)\n".format(VERSION)
class Log:
def __init__(self, show_debug=False, show_verbose=False):
@ -118,7 +117,7 @@ class ThinnerRule:
self.rule_str=rule_str
self.human_str="Keep oldest of {} {}{}, delete after {} {}{}.".format(
self.human_str="Keep every {} {}{}, delete after {} {}{}.".format(
period_amount, self.TIME_DESC[period_unit], period_amount!=1 and "s" or "", ttl_amount, self.TIME_DESC[ttl_unit], ttl_amount!=1 and "s" or "" )
@ -172,7 +171,6 @@ class Thinner:
objects: list of objects to thin. every object should have timestamp attribute.
keep_objects: objects to always keep (these should also be in normal objects list, so we can use them to perhaps delete other obsolete objects)
return( keeps, removes )
"""
@ -307,12 +305,14 @@ class ExecuteNode:
"""an endpoint to execute local or remote commands via ssh"""
def __init__(self, ssh_to=None, readonly=False, debug_output=False):
"""ssh_to: server you want to ssh to. none means local
readonly: only execute commands that dont make any changes (usefull for testing-runs)
def __init__(self, ssh_config=None, ssh_to=None, readonly=False, debug_output=False):
"""ssh_config: custom ssh config
ssh_to: server you want to ssh to. none means local
readonly: only execute commands that don't make any changes (useful for test runs)
debug_output: show output and exit codes of commands in debugging output.
"""
self.ssh_config=ssh_config
self.ssh_to=ssh_to
self.readonly=readonly
self.debug_output=debug_output
@ -347,7 +347,7 @@ class ExecuteNode:
def run(self, cmd, input=None, tab_split=False, valid_exitcodes=[ 0 ], readonly=False, hide_errors=False, pipe=False, return_stderr=False):
"""run a command on the node
readonly: make this True if the command doesnt make any changes and is safe to execute in testmode
readonly: make this True if the command doesn't make any changes and is safe to execute in testmode
pipe: Instead of executing, return a pipe-handle to be used to input to another run() command. (just like a | in linux)
input: Can be None, a string or a pipe-handle you got from another run()
return_stderr: return both stdout and stderr as a tuple
@ -357,12 +357,17 @@ class ExecuteNode:
#use ssh?
if self.ssh_to != None:
encoded_cmd.extend(["ssh".encode('utf-8'), self.ssh_to.encode('utf-8')])
encoded_cmd.append("ssh".encode('utf-8'))
if self.ssh_config != None:
encoded_cmd.extend(["-F".encode('utf-8'), self.ssh_config.encode('utf-8')])
encoded_cmd.append(self.ssh_to.encode('utf-8'))
#make sure the command gets all the data in utf8 format:
#(this is neccesary if LC_ALL=en_US.utf8 is not set in the environment)
#(this is necessary if LC_ALL=en_US.utf8 is not set in the environment)
for arg in cmd:
#add single quotes for remote commands to support spaces and other wierd stuff (remote commands are executed in a shell)
#add single quotes for remote commands to support spaces and other weird stuff (remote commands are executed in a shell)
encoded_cmd.append( ("'"+arg+"'").encode('utf-8'))
else:
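The new side of this hunk can be read as the following standalone sketch. `build_ssh_command` is a hypothetical helper name, not something in the diff. The local (non-ssh) branch is an assumption, because the hunk ends at `else:`. The sketch also skips the UTF-8 encoding step so the result stays easy to read.

```python
def build_ssh_command(cmd, ssh_to=None, ssh_config=None):
    """Return the argv list for a command (hypothetical helper, for illustration only)."""
    argv = []
    if ssh_to is not None:
        argv.append("ssh")
        # -F selects an alternative ssh client config file
        if ssh_config is not None:
            argv.extend(["-F", ssh_config])
        argv.append(ssh_to)
        # remote commands are executed in a shell, so every argument is single-quoted
        argv.extend("'" + arg + "'" for arg in cmd)
    else:
        # assumption: local commands are passed through unchanged
        argv.extend(cmd)
    return argv


print(build_ssh_command(["zfs", "list"], ssh_to="root@h4", ssh_config="/tmp/ssh_config"))
```

Note that this plain single-quoting, like the code in the hunk, does not escape a single quote that appears inside an argument.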
@ -481,7 +486,7 @@ class ExecuteNode:
class ZfsDataset():
"""a zfs dataset (filesystem/volume/snapshot/clone)
Note that a dataset doesnt have to actually exist (yet/anymore)
Note that a dataset doesn't have to actually exist (yet/anymore)
Also most properties are cached for performance-reasons, but also to allow --test to function correctly.
"""
@ -492,11 +497,10 @@ class ZfsDataset():
'volume': [ "canmount" ],
}
ZFS_MAX_UNCHANGED_BYTES=200000
def __init__(self, zfs_node, name, force_exists=None):
"""name: full path of the zfs dataset
exists: specifiy if you already know a dataset exists or not. for performance reasons. (othewise it will have to check with zfs list when needed)
exists: specify if you already know a dataset exists or not. for performance reasons. (otherwise it will have to check with zfs list when needed)
"""
self.zfs_node=zfs_node
self.name=name #full name
@ -509,6 +513,9 @@ class ZfsDataset():
return(self.name)
def __eq__(self, obj):
if not isinstance(obj, ZfsDataset):
return(False)
return(self.name == obj.name)
def verbose(self,txt):
@ -581,36 +588,35 @@ class ZfsDataset():
return(ZfsDataset(self.zfs_node, self.rstrip_path(1)))
def find_our_prev_snapshot(self, snapshot):
"""find our previous snapshot in this dataset. None if it doesnt exist"""
def find_prev_snapshot(self, snapshot, other_snapshots=False):
"""find previous snapshot in this dataset. None if it doesn't exist.
other_snapshots: set to true to also return snapshots that were not created by us. (is_ours)
"""
if self.is_snapshot:
raise(Exception("Please call this on a dataset."))
try:
index=self.find_our_snapshot_index(snapshot)
if index!=None and index>0:
return(self.our_snapshots[index-1])
else:
return(None)
except:
return(None)
index=self.find_snapshot_index(snapshot)
while index:
index=index-1
if other_snapshots or self.snapshots[index].is_ours():
return(self.snapshots[index])
return(None)
def find_our_next_snapshot(self, snapshot):
"""find our next snapshot in this dataset. None if it doesnt exist"""
def find_next_snapshot(self, snapshot, other_snapshots=False):
"""find next snapshot in this dataset. None if it doesn't exist"""
if self.is_snapshot:
raise(Exception("Please call this on a dataset."))
try:
index=self.find_our_snapshot_index(snapshot)
if index!=None and index>=0 and index<len(self.our_snapshots)-1:
return(self.our_snapshots[index+1])
else:
return(None)
except:
return(None)
index=self.find_snapshot_index(snapshot)
while index!=None and index<len(self.snapshots)-1:
index=index+1
if other_snapshots or self.snapshots[index].is_ours():
return(self.snapshots[index])
return(None)
@cached_property
@ -630,7 +636,7 @@ class ZfsDataset():
def create_filesystem(self, parents=False):
"""create a filesytem"""
"""create a filesystem"""
if parents:
self.verbose("Creating filesystem and parents")
self.zfs_node.run(["zfs", "create", "-p", self.name ])
@ -644,9 +650,13 @@ class ZfsDataset():
def destroy(self, fail_exception=False):
"""destroy the dataset. by default failures are not an exception, so we can continue making backups"""
self.verbose("Destroying")
self.release()
try:
self.zfs_node.run(["zfs", "destroy", self.name])
self.zfs_node.run(["zfs", "destroy", "-d", self.name])
self.invalidate()
self.force_exists=False
return(True)
@ -679,35 +689,56 @@ class ZfsDataset():
return(ret)
def is_changed(self):
def is_changed(self, min_changed_bytes=1):
"""dataset is changed since ANY latest snapshot ?"""
self.debug("Checking if dataset is changed")
#NOTE: filesystems can have a very small amount written without actual changes in some cases
if int(self.properties['written'])<=self.ZFS_MAX_UNCHANGED_BYTES:
if min_changed_bytes==0:
return(True)
if int(self.properties['written'])<min_changed_bytes:
return(False)
else:
return(True)
def is_ours(self):
"""return true if this snapshot is created by this backup_nanme"""
"""return true if this snapshot is created by this backup_name"""
if re.match("^"+self.zfs_node.backup_name+"-[0-9]*$", self.snapshot_name):
return(True)
else:
return(False)
@property
def _hold_name(self):
return("zfs_autobackup:"+self.zfs_node.backup_name)
@property
def holds(self):
"""get list of holds for dataset"""
output=self.zfs_node.run([ "zfs" , "holds", "-H", self.name ], valid_exitcodes=[ 0 ], tab_split=True, readonly=True)
return(map(lambda fields: fields[1], output))
def is_hold(self):
"""did we hold this snapshot?"""
return(self._hold_name in self.holds)
def hold(self):
"""hold dataset"""
self.debug("holding")
self.zfs_node.run([ "zfs" , "hold", "zfs_autobackup:"+self.zfs_node.backup_name, self.name ], valid_exitcodes=[ 0,1 ])
self.zfs_node.run([ "zfs" , "hold", self._hold_name, self.name ], valid_exitcodes=[ 0,1 ])
def release(self):
"""release dataset"""
self.debug("releasing")
self.zfs_node.run([ "zfs" , "release", "zfs_autobackup:"+self.zfs_node.backup_name, self.name ], valid_exitcodes=[ 0,1 ])
if self.zfs_node.readonly or self.is_hold():
self.debug("releasing")
self.zfs_node.run([ "zfs" , "release", self._hold_name, self.name ], valid_exitcodes=[ 0,1 ])
@property
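The threshold semantics of the new `is_changed(min_changed_bytes)` can be checked in isolation with a small sketch. Here `written_bytes` stands in for `int(self.properties['written'])`, and the free function is a hypothetical simplification of the method.

```python
def is_changed(written_bytes, min_changed_bytes=1):
    """Simplified stand-in for ZfsDataset.is_changed (illustration only)."""
    # 0 means: always treat the dataset as changed (this is what --allow-empty maps to)
    if min_changed_bytes == 0:
        return True
    # strictly less than the threshold counts as unchanged
    return written_bytes >= min_changed_bytes


for written, threshold in [(0, 1), (0, 0), (199999, 200000), (200000, 200000)]:
    print(written, threshold, is_changed(written, threshold))
```

One behavioural difference from the old code: the removed check used `<=` against `ZFS_MAX_UNCHANGED_BYTES`, so exactly 200000 written bytes counted as unchanged, whereas with `--min-change 200000` it now counts as changed.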
@ -757,22 +788,22 @@ class ZfsDataset():
def find_snapshot(self, snapshot):
"""find snapshot by snapshot (can be a snapshot_name or ZfsDataset)"""
"""find snapshot by snapshot (can be a snapshot_name or a different ZfsDataset )"""
if not isinstance(snapshot,ZfsDataset):
snapshot_name=snapshot
else:
snapshot_name=snapshot.snapshot_name
for snapshot in self.our_snapshots:
for snapshot in self.snapshots:
if snapshot.snapshot_name==snapshot_name:
return(snapshot)
return(None)
def find_our_snapshot_index(self, snapshot):
"""find our snapshot index by snapshot (can be a snapshot_name or ZfsDataset)"""
def find_snapshot_index(self, snapshot):
"""find snapshot index by snapshot (can be a snapshot_name or ZfsDataset)"""
if not isinstance(snapshot,ZfsDataset):
snapshot_name=snapshot
@ -780,7 +811,7 @@ class ZfsDataset():
snapshot_name=snapshot.snapshot_name
index=0
for snapshot in self.our_snapshots:
for snapshot in self.snapshots:
if snapshot.snapshot_name==snapshot_name:
return(index)
index=index+1
@ -789,24 +820,35 @@ class ZfsDataset():
@cached_property
def is_changed_ours(self):
"""dataset is changed since OUR latest snapshot?"""
self.debug("Checking if dataset is changed since our snapshot")
if not self.our_snapshots:
return(True)
def written_since_ours(self):
"""get number of bytes written since our last snapshot"""
self.debug("Getting bytes written since our last snapshot")
latest_snapshot=self.our_snapshots[-1]
cmd=[ "zfs", "get","-H" ,"-ovalue", "-p", "written@"+str(latest_snapshot), self.name ]
output=self.zfs_node.run(readonly=True, tab_split=False, cmd=cmd, valid_exitcodes=[ 0 ])
return(int(output[0]))
def is_changed_ours(self, min_changed_bytes=1):
"""dataset is changed since OUR latest snapshot?"""
if min_changed_bytes==0:
return(True)
if not self.our_snapshots:
return(True)
#NOTE: filesystems can have a very small amount written without actual changes in some cases
if int(output[0])<=self.ZFS_MAX_UNCHANGED_BYTES:
if self.written_since_ours<min_changed_bytes:
return(False)
return(True)
@cached_property
def recursive_datasets(self, types="filesystem,volume"):
"""get all datasets recursively under us"""
@ -824,7 +866,7 @@ class ZfsDataset():
"""returns a pipe with zfs send output for this snapshot
resume: Use resuming (both sides need to support it)
resume_token: resume sending from this token. (in that case we dont need to know snapshot names)
resume_token: resume sending from this token. (in that case we don't need to know snapshot names)
"""
#### build source command
@ -850,7 +892,7 @@ class ZfsDataset():
cmd.append("-P")
#resume a previous send? (dont need more parameters in that case)
#resume a previous send? (don't need more parameters in that case)
if resume_token:
cmd.extend([ "-t", resume_token ])
@ -868,7 +910,7 @@ class ZfsDataset():
# if args.buffer and args.ssh_source!="local":
# cmd.append("|mbuffer -m {}".format(args.buffer))
#NOTE: this doenst start the send yet, it only returns a subprocess.Pipe
#NOTE: this doesn't start the send yet, it only returns a subprocess.Pipe
return(self.zfs_node.run(cmd, pipe=True))
@ -883,7 +925,7 @@ class ZfsDataset():
cmd.extend(["zfs", "recv"])
#dont mount filesystem that is received
#don't mount filesystem that is received
cmd.append("-u")
for property in filter_properties:
@ -918,7 +960,8 @@ class ZfsDataset():
#check if transfer was really ok (exit codes have been wrong before due to bugs in zfs-utils and can be ignored by some parameters)
if not self.exists:
raise(Exception("Target doesnt exist after transfer, something went wrong."))
self.error("error during transfer")
raise(Exception("Target doesn't exist after transfer, something went wrong."))
# if args.buffer and args.ssh_target!="local":
# cmd.append("|mbuffer -m {}".format(args.buffer))
@ -939,7 +982,7 @@ class ZfsDataset():
if not prev_snapshot:
target_snapshot.verbose("receiving full".format(self.snapshot_name))
else:
#incemental
#incremental
target_snapshot.verbose("receiving incremental".format(self.snapshot_name))
#do it
@ -952,9 +995,13 @@ class ZfsDataset():
def rollback(self):
"""rollback to this snapshot"""
"""rollback to latest existing snapshot on this dataset"""
self.debug("Rolling back")
self.zfs_node.run(["zfs", "rollback", self.name])
for snapshot in reversed(self.snapshots):
if snapshot.exists:
self.zfs_node.run(["zfs", "rollback", snapshot.name])
return
def get_resume_snapshot(self, resume_token):
@ -979,38 +1026,73 @@ class ZfsDataset():
def thin(self, keeps=[]):
def thin(self, keeps=[], ignores=[]):
"""determines list of snapshots that should be kept or deleted based on the thinning schedule. cull the herd!
keep: list of snapshots to always keep (usually the last)
ignores: snapshots to completely ignore (usually incompatible target snapshots that are going to be destroyed anyway)
returns: ( keeps, obsoletes )
"""
return(self.zfs_node.thinner.thin(self.our_snapshots, keep_objects=keeps))
snapshots=[snapshot for snapshot in self.our_snapshots if snapshot not in ignores]
return(self.zfs_node.thinner.thin(snapshots, keep_objects=keeps))
def find_common_snapshot(self, target_dataset):
"""find latest coommon snapshot between us and target
"""find latest common snapshot between us and target
returns None if its an initial transfer
"""
if not target_dataset.our_snapshots:
if not target_dataset.snapshots:
#target has nothing yet
return(None)
else:
snapshot=self.find_snapshot(target_dataset.our_snapshots[-1].snapshot_name)
# snapshot=self.find_snapshot(target_dataset.snapshots[-1].snapshot_name)
if not snapshot:
#try to find another common snapshot as rollback-suggestion for admin
for target_snapshot in reversed(target_dataset.our_snapshots):
if self.find_snapshot(target_snapshot):
target_snapshot.error("Latest common snapshot, roll back to this.")
raise(Exception("Cant find latest target snapshot on source."))
target_dataset.error("Cant find common snapshot with target. ")
raise(Exception("You probablly need to delete the target dataset to fix this."))
# if not snapshot:
#try to find a common snapshot
for source_snapshot in reversed(self.snapshots):
if target_dataset.find_snapshot(source_snapshot):
source_snapshot.debug("common snapshot")
return(source_snapshot)
target_dataset.error("Can't find common snapshot with source.")
raise(Exception("You probably need to delete the target dataset to fix this."))
snapshot.debug("common snapshot")
def find_start_snapshot(self, common_snapshot, other_snapshots):
"""finds first snapshot to send"""
if not common_snapshot:
if not self.snapshots:
start_snapshot=None
else:
#start from beginning
start_snapshot=self.snapshots[0]
if not start_snapshot.is_ours() and not other_snapshots:
# try to start at a snapshot thats ours
start_snapshot=self.find_next_snapshot(start_snapshot, other_snapshots)
else:
start_snapshot=self.find_next_snapshot(common_snapshot, other_snapshots)
return(start_snapshot)
def find_incompatible_snapshots(self, common_snapshot):
"""returns a list of snapshots that is incompatible for a zfs recv onto the common_snapshot.
all direct followup snapshots with written=0 are compatible."""
ret=[]
if common_snapshot and self.snapshots:
followup=True
for snapshot in self.snapshots[self.find_snapshot_index(common_snapshot)+1:]:
if not followup or int(snapshot.properties['written'])!=0:
followup=False
ret.append(snapshot)
return(ret)
return(snapshot)
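The rule in the new `find_incompatible_snapshots` (direct follow-up snapshots with `written=0` are compatible, everything from the first changed one onward is not) can be sketched on plain data. The tuple representation and the name `find_incompatible` are assumptions made for this illustration.

```python
def find_incompatible(snapshots, common_index):
    """snapshots: list of (name, written_bytes) tuples, oldest first (illustration only)."""
    incompatible = []
    followup = True
    for name, written in snapshots[common_index + 1:]:
        # once one snapshot after the common one has written data,
        # it and all later snapshots are incompatible
        if not followup or written != 0:
            followup = False
            incompatible.append(name)
    return incompatible


target = [("backup-1", 10), ("backup-2", 0), ("manual", 5), ("backup-3", 0)]
print(find_incompatible(target, 0))
```

With `backup-1` as the common snapshot, `backup-2` is kept because nothing was written, while `manual` and `backup-3` are reported as incompatible.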
def get_allowed_properties(self, filter_properties, set_properties):
"""only returns lists of allowed properties for this dataset type"""
@ -1030,23 +1112,68 @@ class ZfsDataset():
return ( ( allowed_filter_properties, allowed_set_properties ) )
def sync_snapshots(self, target_dataset, show_progress=False, resume=True, filter_properties=[], set_properties=[], ignore_recv_exit_code=False, source_holds=True, rollback=False, raw=False):
"""sync this dataset's snapshots to target_dataset,"""
#determine start snapshot (the first snapshot after the common snapshot)
def sync_snapshots(self, target_dataset, show_progress=False, resume=True, filter_properties=[], set_properties=[], ignore_recv_exit_code=False, source_holds=True, rollback=False, raw=False, other_snapshots=False, no_send=False, destroy_incompatible=False):
"""sync this dataset's snapshots to target_dataset, while also thinning out old snapshots along the way."""
#determine common and start snapshot
target_dataset.debug("Determining start snapshot")
common_snapshot=self.find_common_snapshot(target_dataset)
if not common_snapshot:
#start from beginning
start_snapshot=self.our_snapshots[0]
else:
#roll target back to common snapshot
if rollback:
target_dataset.find_snapshot(common_snapshot).rollback()
start_snapshot=self.find_our_next_snapshot(common_snapshot)
start_snapshot=self.find_start_snapshot(common_snapshot, other_snapshots)
#should be destroyed before attempting zfs recv:
incompatible_target_snapshots=target_dataset.find_incompatible_snapshots(common_snapshot)
#resume?
#make target snapshot list the same as source, by adding virtual non-existing ones to the list.
target_dataset.debug("Creating virtual target snapshots")
source_snapshot=start_snapshot
while source_snapshot:
#create virtual target snapshot
virtual_snapshot=ZfsDataset(target_dataset.zfs_node, target_dataset.filesystem_name+"@"+source_snapshot.snapshot_name,force_exists=False)
target_dataset.snapshots.append(virtual_snapshot)
source_snapshot=self.find_next_snapshot(source_snapshot, other_snapshots)
#now let thinner decide what we want on both sides as final state (after all transfers are done)
self.debug("Create thinning list")
if self.our_snapshots:
(source_keeps, source_obsoletes)=self.thin(keeps=[self.our_snapshots[-1]])
else:
source_keeps=[]
source_obsoletes=[]
if target_dataset.our_snapshots:
(target_keeps, target_obsoletes)=target_dataset.thin(keeps=[target_dataset.our_snapshots[-1]], ignores=incompatible_target_snapshots)
else:
target_keeps=[]
target_obsoletes=[]
#on source: destroy all obsoletes before common. but after common, only delete snapshots that target also doesn't want to explicitly keep
before_common=True
for source_snapshot in self.snapshots:
if common_snapshot and source_snapshot.snapshot_name==common_snapshot.snapshot_name:
before_common=False
#never destroy common snapshot
else:
target_snapshot=target_dataset.find_snapshot(source_snapshot)
if (source_snapshot in source_obsoletes) and (before_common or (target_snapshot not in target_keeps)):
source_snapshot.destroy()
#on target: destroy everything thats obsolete, except common_snapshot
for target_snapshot in target_dataset.snapshots:
if (target_snapshot in target_obsoletes) and (not common_snapshot or target_snapshot.snapshot_name!=common_snapshot.snapshot_name):
if target_snapshot.exists:
target_snapshot.destroy()
#now actually transfer the snapshots, if we want
if no_send:
return
#resume?
resume_token=None
if 'receive_resume_token' in target_dataset.properties:
resume_token=target_dataset.properties['receive_resume_token']
@ -1058,46 +1185,33 @@ class ZfsDataset():
resume_token=None
#create virtual target snapshots
target_dataset.debug("Creating virtual target snapshots")
source_snapshot=start_snapshot
while source_snapshot:
#create virtual target snapshot
virtual_snapshot=ZfsDataset(target_dataset.zfs_node, target_dataset.filesystem_name+"@"+source_snapshot.snapshot_name,force_exists=False)
target_dataset.snapshots.append(virtual_snapshot)
source_snapshot=self.find_our_next_snapshot(source_snapshot)
#incompatible target snapshots?
if incompatible_target_snapshots:
if not destroy_incompatible:
for snapshot in incompatible_target_snapshots:
snapshot.error("Incompatible snapshot")
raise(Exception("Please destroy incompatible snapshots or use --destroy-incompatible."))
else:
for snapshot in incompatible_target_snapshots:
snapshot.verbose("Incompatible snapshot")
snapshot.destroy()
target_dataset.snapshots.remove(snapshot)
#now let thinner decide what we want on both sides as final state (after transfers are done)
self.debug("Create thinning list")
(source_keeps, source_obsoletes)=self.thin(keeps=[self.our_snapshots[-1]])
(target_keeps, target_obsoletes)=target_dataset.thin(keeps=[target_dataset.our_snapshots[-1]])
#stuff that is before common snapshot can be deleted rightaway
if common_snapshot:
for source_snapshot in self.our_snapshots:
if source_snapshot.snapshot_name==common_snapshot.snapshot_name:
break
#rollback target to latest?
if rollback:
target_dataset.rollback()
if source_snapshot in source_obsoletes:
source_snapshot.destroy()
for target_snapshot in target_dataset.our_snapshots:
if target_snapshot.snapshot_name==common_snapshot.snapshot_name:
break
if target_snapshot in target_obsoletes:
target_snapshot.destroy()
#now send/destroy the rest off the source
#now actually transfer the snapshots
prev_source_snapshot=common_snapshot
prev_target_snapshot=target_dataset.find_snapshot(common_snapshot)
source_snapshot=start_snapshot
while source_snapshot:
target_snapshot=target_dataset.find_snapshot(source_snapshot) #virtual
target_snapshot=target_dataset.find_snapshot(source_snapshot) #still virtual
#does target actually want it?
if target_snapshot in target_keeps:
( allowed_filter_properties, allowed_set_properties ) = self.get_allowed_properties(filter_properties, set_properties)
if target_snapshot not in target_obsoletes:
( allowed_filter_properties, allowed_set_properties ) = self.get_allowed_properties(filter_properties, set_properties) #NOTE: should we let transfer_snapshot handle this?
source_snapshot.transfer_snapshot(target_snapshot, prev_snapshot=prev_source_snapshot, show_progress=show_progress, resume=resume, filter_properties=allowed_filter_properties, set_properties=allowed_set_properties, ignore_recv_exit_code=ignore_recv_exit_code, resume_token=resume_token, raw=raw)
resume_token=None
@ -1110,29 +1224,26 @@ class ZfsDataset():
prev_source_snapshot.release()
target_dataset.find_snapshot(prev_source_snapshot).release()
#we may destroy the previous source snapshot now, if we dont want it anymore
if prev_source_snapshot and (prev_source_snapshot not in source_keeps):
# we may now destroy the previous source snapshot if its obsolete
if prev_source_snapshot in source_obsoletes:
prev_source_snapshot.destroy()
if prev_target_snapshot and (prev_target_snapshot not in target_keeps):
# destroy the previous target snapshot if obsolete (usually this is only the common_snapshot, the rest was already destroyed or will not be sent)
prev_target_snapshot=target_dataset.find_snapshot(common_snapshot)
if prev_target_snapshot in target_obsoletes:
prev_target_snapshot.destroy()
prev_source_snapshot=source_snapshot
prev_target_snapshot=target_snapshot
else:
source_snapshot.debug("skipped (target doesnt need it)")
source_snapshot.debug("skipped (target doesn't need it)")
#was it actually a resume?
if resume_token:
target_dataset.debug("aborting resume, since we dont want that snapshot anymore")
target_dataset.debug("aborting resume, since we don't want that snapshot anymore")
target_dataset.abort_resume()
resume_token=None
#destroy it if we also dont want it anymore:
if source_snapshot not in source_keeps:
source_snapshot.destroy()
resume_token=None
source_snapshot=self.find_our_next_snapshot(source_snapshot)
source_snapshot=self.find_next_snapshot(source_snapshot, other_snapshots)
@ -1140,7 +1251,7 @@ class ZfsDataset():
class ZfsNode(ExecuteNode):
"""a node that contains zfs datasets. implements global (systemwide/pool wide) zfs commands"""
def __init__(self, backup_name, zfs_autobackup, ssh_to=None, readonly=False, description="", debug_output=False, thinner=Thinner()):
def __init__(self, backup_name, zfs_autobackup, ssh_config=None, ssh_to=None, readonly=False, description="", debug_output=False, thinner=Thinner()):
self.backup_name=backup_name
if not description:
self.description=ssh_to
@ -1149,6 +1260,9 @@ class ZfsNode(ExecuteNode):
self.zfs_autobackup=zfs_autobackup #for logging
if ssh_config:
self.verbose("Using custom SSH config: {}".format(ssh_config))
if ssh_to:
self.verbose("Datasets on: {}".format(ssh_to))
else:
@ -1164,7 +1278,7 @@ class ZfsNode(ExecuteNode):
self.thinner=thinner
ExecuteNode.__init__(self, ssh_to=ssh_to, readonly=readonly, debug_output=debug_output)
ExecuteNode.__init__(self, ssh_config=ssh_config, ssh_to=ssh_to, readonly=readonly, debug_output=debug_output)
def reset_progress(self):
@ -1172,9 +1286,9 @@ class ZfsNode(ExecuteNode):
self._progress_total_bytes=0
self._progress_start_time=time.time()
def _parse_stderr_pipe(self, line, hide_errors):
"""try to parse progress output of a piped zfs recv -Pv """
def parse_zfs_progress(self, line, hide_errors, prefix):
"""try to parse progress output of zfs recv -Pv, and don't show it as error to the user """
#is it progress output?
progress_fields=line.rstrip().split("\t")
@ -1182,10 +1296,11 @@ class ZfsNode(ExecuteNode):
if (line.find("nvlist version")==0 or
line.find("resume token contents")==0 or
len(progress_fields)!=1 or
line.find("skipping ")==0):
line.find("skipping ")==0 or
re.match("send from .*estimated size is ", line)):
#always output for debugging offcourse
self.debug("STDERR|> "+line.rstrip())
self.debug(prefix+line.rstrip())
#actual usefull info
if len(progress_fields)>=3:
@ -1207,15 +1322,18 @@ class ZfsNode(ExecuteNode):
return
# #is it progress output?
# if progress_output.find("nv")
#normal output without progress stuff
#still do the normal stderr output handling
if hide_errors:
self.debug("STDERR|> "+line.rstrip())
self.debug(prefix+line.rstrip())
else:
self.error("STDERR|> "+line.rstrip())
self.error(prefix+line.rstrip())
def _parse_stderr_pipe(self, line, hide_errors):
self.parse_zfs_progress(line, hide_errors, "STDERR|> ")
def _parse_stderr(self, line, hide_errors):
self.parse_zfs_progress(line, hide_errors, "STDERR > ")
def verbose(self,txt):
self.zfs_autobackup.verbose("{} {}".format(self.description, txt))
@ -1231,23 +1349,21 @@ class ZfsNode(ExecuteNode):
return(self.backup_name+"-"+time.strftime("%Y%m%d%H%M%S"))
def consistent_snapshot(self, datasets, snapshot_name, allow_empty=True):
def consistent_snapshot(self, datasets, snapshot_name, min_changed_bytes):
"""create a consistent (atomic) snapshot of specified datasets, per pool.
allow_empty: Allow empty snapshots. (compared to our latest snapshot)
"""
pools={}
#collect snapshots that we want to make, per pool
for dataset in datasets:
if not allow_empty:
if not dataset.is_changed_ours:
dataset.verbose("No changes since {}".format(dataset.our_snapshots[-1].snapshot_name))
continue
if not dataset.is_changed_ours(min_changed_bytes):
dataset.verbose("No changes since {}".format(dataset.our_snapshots[-1].snapshot_name))
continue
snapshot=ZfsDataset(dataset.zfs_node, dataset.name+"@"+snapshot_name)
pool=dataset.split_path()[0]
if not pool in pools:
pools[pool]=[]
@ -1255,14 +1371,13 @@ class ZfsNode(ExecuteNode):
pools[pool].append(snapshot)
#add snapshot to cache (also usefull in testmode)
dataset.snapshots.append(snapshot)
dataset.snapshots.append(snapshot) #NOTE: this will trigger zfs list
if not pools:
self.verbose("No changes anywhere: not creating snapshots.")
return
#create consitent snapshot per pool
#create consistent snapshot per pool
for (pool_name, snapshots) in pools.items():
cmd=[ "zfs", "snapshot" ]
@ -1324,8 +1439,9 @@ class ZfsAutobackup:
def __init__(self):
parser = argparse.ArgumentParser(
description='ZFS autobackup '+VERSION,
description=HEADER,
epilog='When a filesystem fails, zfs_backup will continue and report the number of failures at the end. Also the exit code will indicate the number of failures.')
parser.add_argument('--ssh-config', default=None, help='Custom ssh client config')
parser.add_argument('--ssh-source', default=None, help='Source host to get backup from. (user@hostname) Default %(default)s.')
parser.add_argument('--ssh-target', default=None, help='Target host to push backup to. (user@hostname) Default %(default)s.')
parser.add_argument('--keep-source', type=str, default="10,1d1w,1w1m,1m1y", help='Thinning schedule for old source snapshots. Default: %(default)s')
@ -1334,26 +1450,28 @@ class ZfsAutobackup:
parser.add_argument('backup_name', help='Name of the backup (you should set the zfs property "autobackup:backup-name" to true on filesystems you want to back up)')
parser.add_argument('target_path', help='Target ZFS filesystem')
parser.add_argument('--no-snapshot', action='store_true', help='dont create new snapshot (usefull for finishing uncompleted backups, or cleanups)')
#Not appliciable anymore, version 3 alreadhy does optimal cleaning
# parser.add_argument('--no-send', action='store_true', help='dont send snapshots (usefull to only do a cleanup)')
parser.add_argument('--allow-empty', action='store_true', help='if nothing has changed, still create empty snapshots.')
parser.add_argument('--other-snapshots', action='store_true', help='Send over other snapshots as well, not just the ones created by this tool.')
parser.add_argument('--no-snapshot', action='store_true', help='Don\'t create new snapshots (useful for finishing uncompleted backups, or cleanups)')
parser.add_argument('--no-send', action='store_true', help='Don\'t send snapshots (useful for cleanups, or if you want a separate send-cronjob)')
parser.add_argument('--min-change', type=int, default=1, help='Number of bytes written after which we consider a dataset changed (default %(default)s)')
parser.add_argument('--allow-empty', action='store_true', help='If nothing has changed, still create empty snapshots. (same as --min-change=0)')
parser.add_argument('--ignore-replicated', action='store_true', help='Ignore datasets that seem to be replicated some other way. (No changes since latest snapshot. Useful for proxmox HA replication)')
parser.add_argument('--no-holds', action='store_true', help='Dont lock snapshots on the source. (Usefull to allow proxmox HA replication to switches nodes)')
parser.add_argument('--no-holds', action='store_true', help='Don\'t lock snapshots on the source. (Useful to allow proxmox HA replication to switch nodes)')
#not sure if this ever was usefull:
# parser.add_argument('--ignore-new', action='store_true', help='Ignore filesystem if there are already newer snapshots for it on the target (use with caution)')
parser.add_argument('--resume', action='store_true', help='support resuming of interrupted transfers by using the zfs extensible_dataset feature (both zpools should have it enabled) Disadvantage is that you need to use zfs recv -A if another snapshot is created on the target during a receive. Otherwise it will keep failing.')
parser.add_argument('--strip-path', default=0, type=int, help='number of directory to strip from path (use 1 when cloning zones between 2 SmartOS machines)')
parser.add_argument('--resume', action='store_true', help='Support resuming of interrupted transfers by using the zfs extensible_dataset feature (both zpools should have it enabled) Disadvantage is that you need to use zfs recv -A if another snapshot is created on the target during a receive. Otherwise it will keep failing.')
parser.add_argument('--strip-path', default=0, type=int, help='Number of directory to strip from path (use 1 when cloning zones between 2 SmartOS machines)')
# parser.add_argument('--buffer', default="", help='Use mbuffer with specified size to speedup zfs transfer. (e.g. --buffer 1G) Will also show nice progress output.')
# parser.add_argument('--destroy-stale', action='store_true', help='Destroy stale backups that have no more snapshots. Be sure to verify the output before using this! ')
parser.add_argument('--clear-refreservation', action='store_true', help='Filter "refreservation" property. (recommended, saves space. same as --filter-properties refreservation)')
parser.add_argument('--clear-mountpoint', action='store_true', help='Set property canmount=noauto for new datasets. (recommended, prevents mount conflicts. same as --set-properties canmount=noauto)')
parser.add_argument('--filter-properties', type=str, help='List of propererties to "filter" when receiving filesystems. (you can still restore them with zfs inherit -S)')
parser.add_argument('--filter-properties', type=str, help='List of properties to "filter" when receiving filesystems. (you can still restore them with zfs inherit -S)')
parser.add_argument('--set-properties', type=str, help='List of properties to override when receiving filesystems. (you can still restore them with zfs inherit -S)')
parser.add_argument('--rollback', action='store_true', help='Rollback changes on the target before starting a backup. (normally you can prevent changes by setting the readonly property on the target_path to on)')
parser.add_argument('--rollback', action='store_true', help='Rollback changes to the latest target snapshot before starting. (normally you can prevent changes by setting the readonly property on the target_path to on)')
parser.add_argument('--destroy-incompatible', action='store_true', help='Destroy incompatible snapshots on target. Use with care! (implies --rollback)')
parser.add_argument('--ignore-transfer-errors', action='store_true', help='Ignore transfer errors (still checks if received filesystem exists. useful for acltype errors)')
parser.add_argument('--raw', action='store_true', help='For encrypted datasets, send data exactly as it exists on disk.')
@ -1375,6 +1493,12 @@ class ZfsAutobackup:
if self.args.test:
self.args.verbose=True
if args.allow_empty:
args.min_change=0
if args.destroy_incompatible:
args.rollback=True
self.log=Log(show_debug=self.args.debug, show_verbose=self.args.verbose)
@ -1392,6 +1516,9 @@ class ZfsAutobackup:
self.log.verbose("#### "+title)
def run(self):
self.verbose (HEADER)
if self.args.test:
self.verbose("TEST MODE - SIMULATING WITHOUT MAKING ANY CHANGES")
@ -1399,14 +1526,14 @@ class ZfsAutobackup:
description="[Source]"
source_thinner=Thinner(self.args.keep_source)
source_node=ZfsNode(self.args.backup_name, self, ssh_to=self.args.ssh_source, readonly=self.args.test, debug_output=self.args.debug_output, description=description, thinner=source_thinner)
source_node=ZfsNode(self.args.backup_name, self, ssh_config=self.args.ssh_config, ssh_to=self.args.ssh_source, readonly=self.args.test, debug_output=self.args.debug_output, description=description, thinner=source_thinner)
source_node.verbose("Send all datasets that have 'autobackup:{}=true' or 'autobackup:{}=child'".format(self.args.backup_name, self.args.backup_name))
self.verbose("")
description="[Target]"
target_thinner=Thinner(self.args.keep_target)
target_node=ZfsNode(self.args.backup_name, self, ssh_to=self.args.ssh_target, readonly=self.args.test, debug_output=self.args.debug_output, description=description, thinner=target_thinner)
target_node=ZfsNode(self.args.backup_name, self, ssh_config=self.args.ssh_config, ssh_to=self.args.ssh_target, readonly=self.args.test, debug_output=self.args.debug_output, description=description, thinner=target_thinner)
target_node.verbose("Receive datasets under: {}".format(self.args.target_path))
self.set_title("Selecting")
@ -1423,7 +1550,7 @@ class ZfsAutobackup:
else:
self.set_title("Filtering already replicated filesystems")
for selected_source_dataset in selected_source_datasets:
if selected_source_dataset.is_changed():
if selected_source_dataset.is_changed(self.args.min_change):
source_datasets.append(selected_source_dataset)
else:
selected_source_dataset.verbose("Ignoring, already replicated")
@ -1431,10 +1558,14 @@ class ZfsAutobackup:
if not self.args.no_snapshot:
self.set_title("Snapshotting")
source_node.consistent_snapshot(source_datasets, source_node.new_snapshotname(), allow_empty=self.args.allow_empty)
source_node.consistent_snapshot(source_datasets, source_node.new_snapshotname(), min_changed_bytes=self.args.min_change)
self.set_title("Transferring")
if self.args.no_send:
self.set_title("Thinning")
else:
self.set_title("Sending and thinning")
if self.args.filter_properties:
filter_properties=self.args.filter_properties.split(",")
@ -1450,7 +1581,7 @@ class ZfsAutobackup:
filter_properties.append("refreservation")
if self.args.clear_mountpoint:
set_properties.append( "canmount=noauto" )
set_properties.append("canmount=noauto")
fail_count=0
for source_dataset in source_datasets:
@ -1461,13 +1592,13 @@ class ZfsAutobackup:
target_dataset=ZfsDataset(target_node, target_name)
#ensure parents exists
if not target_dataset.parent.exists:
if not self.args.no_send and not target_dataset.parent.exists:
target_dataset.parent.create_filesystem(parents=True)
source_dataset.sync_snapshots(target_dataset, show_progress=self.args.progress, resume=self.args.resume, filter_properties=filter_properties, set_properties=set_properties, ignore_recv_exit_code=self.args.ignore_transfer_errors, source_holds= not self.args.no_holds, rollback=self.args.rollback, raw=self.args.raw)
source_dataset.sync_snapshots(target_dataset, show_progress=self.args.progress, resume=self.args.resume, filter_properties=filter_properties, set_properties=set_properties, ignore_recv_exit_code=self.args.ignore_transfer_errors, source_holds= not self.args.no_holds, rollback=self.args.rollback, raw=self.args.raw, other_snapshots=self.args.other_snapshots, no_send=self.args.no_send, destroy_incompatible=self.args.destroy_incompatible)
except Exception as e:
fail_count=fail_count+1
source_dataset.error("DATASET FAILED: "+str(e))
self.error("DATASET FAILED: "+str(e))
if self.args.debug:
raise
@ -1477,7 +1608,7 @@ class ZfsAutobackup:
if self.args.test:
self.set_title("All tests successful.")
else:
self.set_title("All backups completed succesfully")
self.set_title("All backups completed successfully")
else:
self.error("{} datasets failed!".format(fail_count))
@ -1489,5 +1620,3 @@ class ZfsAutobackup:
if __name__ == "__main__":
zfs_autobackup=ZfsAutobackup()
sys.exit(zfs_autobackup.run())
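
The option handling changed in this file can be summarised in a small standalone sketch. It is not the project's code: `parse_args` and `property_lists` are hypothetical helpers, only the flags that appear in the hunks above are declared, the `--min-change` default of 1 is taken from the commit message, and falling back to empty lists when no properties are given is an assumption (that part of the file is elided in the diff).

```python
import argparse


def parse_args(argv):
    """Hypothetical helper: declares a subset of the flags seen in the diff."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--allow-empty', action='store_true')
    parser.add_argument('--min-change', type=int, default=1)  # default per commit message
    parser.add_argument('--rollback', action='store_true')
    parser.add_argument('--destroy-incompatible', action='store_true')
    parser.add_argument('--clear-refreservation', action='store_true')
    parser.add_argument('--clear-mountpoint', action='store_true')
    parser.add_argument('--filter-properties', type=str)
    parser.add_argument('--set-properties', type=str)
    args = parser.parse_args(argv)

    # Same post-processing as in the diff: one flag implies another setting.
    if args.allow_empty:
        args.min_change = 0
    if args.destroy_incompatible:
        args.rollback = True
    return args


def property_lists(args):
    """Hypothetical helper: build the filter/set lists passed on to the sync step."""
    # Assumption: an absent option yields an empty list.
    filter_properties = args.filter_properties.split(",") if args.filter_properties else []
    set_properties = args.set_properties.split(",") if args.set_properties else []

    # The convenience flags are shorthands for specific list entries.
    if args.clear_refreservation:
        filter_properties.append("refreservation")
    if args.clear_mountpoint:
        set_properties.append("canmount=noauto")
    return filter_properties, set_properties


if __name__ == "__main__":
    args = parse_args(['--destroy-incompatible', '--clear-mountpoint'])
    print(args.rollback, property_lists(args))
```

Normalising the implied options once, right after parsing, means the later code only has to look at `rollback` and `min_change`, and does not need to re-check `--destroy-incompatible` or `--allow-empty`.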

BIN
doc/thinner.odg Normal file

Binary file not shown.

BIN
doc/thinner.png Normal file

Binary file not shown.

Size: 22 KiB

@ -15,3 +15,4 @@ source token
python3 -m twine check dist/*
python3 -m twine upload dist/*
git push --tags

@ -2,7 +2,7 @@ import setuptools
import bin.zfs_autobackup
import os
os.system("git tag -m'' -a v{}".format(bin.zfs_autobackup.VERSION))
os.system("git tag -m ' ' -a v{}".format(bin.zfs_autobackup.VERSION))
with open("README.md", "r") as fh:
long_description = fh.read()