Compare commits

83 Commits

Author SHA1 Message Date
71270f8de6 bla 2019-11-10 01:00:26 +01:00
c98137ad42 Update README.md 2019-11-10 00:46:44 +01:00
a8d4c110ec fix 2019-11-10 00:44:52 +01:00
72b6213410 now works with both python 2 and 3 2019-11-10 00:41:18 +01:00
467b0588c9 clearer message 2019-11-09 23:53:30 +01:00
48ff1f7d2f destroys no longer fatal 2019-11-09 23:18:24 +01:00
29078b7c04 tips in case of missing last target snapshot on source 2019-10-29 00:32:55 +01:00
678012b255 termology 2019-10-28 21:34:52 +01:00
90b147aa13 better summary 2019-10-28 21:32:19 +01:00
1511642509 be very clear when running in testmode 2019-10-28 21:12:49 +01:00
a6878e1037 filter illegal properties per dataset type. change clear-options to filtering instead of setting 2019-10-28 20:54:01 +01:00
403ccb0a05 ditch abort hack. clean exit code handling 2019-10-28 20:16:18 +01:00
e455b42825 fix 2019-10-28 19:58:24 +01:00
0313876811 starting tests on production 2019-10-28 19:42:51 +01:00
4e525d97be wip 2019-10-28 19:17:41 +01:00
b56e1d1a84 wip 2019-10-28 18:55:19 +01:00
f114114993 wip 2019-10-28 18:50:43 +01:00
d367d9aa98 wip 2019-10-28 17:30:52 +01:00
ff55a6d413 wip 2019-10-28 15:05:29 +01:00
d80a636b12 wip 2019-10-28 13:25:21 +01:00
3fd80c9307 wip 2019-10-27 15:01:20 +01:00
70eda7a9a7 wip 2019-10-27 14:28:59 +01:00
71f2d1aa43 wip 2019-10-27 14:28:25 +01:00
b6fe4edb1c wip 2019-10-27 14:23:38 +01:00
9ae57a270f wip 2019-10-27 14:16:39 +01:00
48a55ebb5e wip 2019-10-27 13:39:11 +01:00
2a219fdcc5 wip 2019-10-27 13:11:54 +01:00
e7919489fb wip 2019-10-27 13:00:38 +01:00
17882449e0 wip 2019-10-27 12:39:42 +01:00
47337d5706 wip 2019-10-27 11:27:21 +01:00
c4bbce6fda wip 2019-10-27 11:16:41 +01:00
8763850917 wip 2019-10-26 00:36:08 +02:00
a589b2bf24 wip 2019-10-25 13:28:51 +02:00
1e9227869a wip 2019-10-25 13:25:53 +02:00
9d594305e3 wip 2019-10-24 00:58:33 +02:00
87d0354a67 wip 2019-10-24 00:29:49 +02:00
5fd92874e8 wip 2019-10-24 00:26:27 +02:00
66d7beb7ac wip 2019-10-24 00:17:50 +02:00
98b3902b4c wip 2019-10-24 00:04:18 +02:00
73214d4d2b wip 2019-10-23 23:44:00 +02:00
f259d01ec3 wip 2019-10-23 23:38:22 +02:00
5f5e2a8433 wip 2019-10-23 23:10:43 +02:00
66727c55b0 wip 2019-10-23 21:01:21 +02:00
673db7c014 wip 2019-10-23 00:23:20 +02:00
637963c046 wip 2019-10-22 23:13:29 +02:00
2d11229c26 wip 2019-10-22 23:10:27 +02:00
8fbbb59055 wip 2019-10-22 22:59:53 +02:00
052890a7e0 wip 2019-10-22 21:59:01 +02:00
34d0c5d67b completed progressive thinner class 2019-10-22 20:24:43 +02:00
63d2091712 wip 2019-10-22 17:43:15 +02:00
ebbc05d52b different approach 2019-10-22 02:43:55 +02:00
bf985998b3 Update README.md 2019-10-22 01:18:37 +02:00
6ff3cec0e1 wip 2019-10-21 21:40:26 +02:00
823616d455 wip 2019-10-21 15:46:19 +02:00
dd1476331b wip 2019-10-21 13:56:51 +02:00
058a189aa5 wip 2019-10-21 13:08:50 +02:00
04cc860db3 wip 2019-10-21 12:02:25 +02:00
96741ac843 wip 2019-10-21 01:42:15 +02:00
f5c8e558a3 wip 2019-10-21 01:25:37 +02:00
1af1c351bb created clean way to send and recv zfs filesystems 2019-10-21 00:34:28 +02:00
aed5d6f8a6 cleaned up piping 2019-10-20 23:44:08 +02:00
e83c297f92 rewriting output handling 2019-10-20 23:21:45 +02:00
d24cc5ba7b wip 2019-10-20 20:30:15 +02:00
91cf07f47d wip 2019-10-20 15:31:19 +02:00
57874e8e3e wip 2019-10-20 14:56:45 +02:00
fb1f0d90ad wip 2019-10-20 14:34:46 +02:00
5abd371329 wip 2019-10-20 02:23:22 +02:00
62b9d0ba39 wip 2019-10-20 01:47:00 +02:00
5e8c7fa968 wip 2019-10-20 00:31:31 +02:00
27f2397843 wip 2019-10-19 21:50:57 +02:00
afae972040 wip 2019-10-19 20:42:48 +02:00
71f23fede1 wip 2019-10-19 20:30:32 +02:00
5cb98589bf wip 2019-10-19 20:24:42 +02:00
6b50460542 wip 2019-10-19 19:45:04 +02:00
b98ffec10c wip 2019-10-19 19:04:13 +02:00
b97eed404a wip 2019-10-19 18:34:14 +02:00
fe39f42a9d wip 2019-10-19 16:47:28 +02:00
9ee5b2545c wip 2019-10-19 15:43:45 +02:00
1cbf92cabc wip 2019-10-19 14:45:24 +02:00
d12bff05ab wip 2019-10-19 13:52:34 +02:00
765dbf124a blah 2019-02-17 22:57:47 +01:00
cae8ec3e70 WIP 2017-07-30 00:52:31 +02:00
441a323fb2 refactoring for oop and a better diff-engine 2017-07-28 01:46:13 +02:00
3 changed files with 1361 additions and 1046 deletions

1
.gitignore vendored Normal file

@ -0,0 +1 @@
.vscode/settings.json

341
README.md

@ -1,325 +1,16 @@
# ZFS autobackup
(checkout v3.0-beta for the new cool stuff: https://github.com/psy0rz/zfs_autobackup/blob/v3/README.md)
Official releases: https://github.com/psy0rz/zfs_autobackup/releases
Introduction
============
ZFS autobackup is used to periodically back up ZFS filesystems to other locations. This is done using the very efficient zfs send and receive commands.
It has the following features:
* Works across operating systems: Tested with Linux, FreeBSD/FreeNAS and SmartOS.
* Works in combination with existing replication systems. (Like Proxmox HA)
* Automatically selects filesystems to back up by looking at a simple ZFS property. (recursive)
* Creates consistent snapshots. (takes all snapshots at once, atomic.)
* Multiple backup modes:
  * "push" local data to a backup-server via SSH.
  * "pull" remote data from a server via SSH and back it up locally.
  * Back up local data on the same server.
* Can be scheduled via a simple cronjob or run directly from commandline.
* Supports resuming of interrupted transfers. (via the zfs extensible_dataset feature)
* Backups and snapshots can be named to prevent conflicts. (multiple backups from and to the same filesystems are no problem)
* Always creates a new snapshot before starting.
* Checks everything but tries to continue on non-fatal errors when possible. (Reports error-count when done)
* Ability to 'finish' aborted backups to see what goes wrong.
* Easy to debug and has a test-mode. Actual unix commands are printed.
* Keeps latest X snapshots remote and locally. (default 30, configurable)
* Uses zfs-holds on important snapshots so they can't be accidentally destroyed.
* Easy installation:
  * Only one host needs the zfs_autobackup script. The other host just needs ssh and the zfs command.
  * Written in python and uses zfs-commands, no 3rd party dependencies or libraries.
  * No separate config files or properties. Just one command you can copy/paste in your backup script.
Usage
====
```
usage: zfs_autobackup [-h] [--ssh-source SSH_SOURCE] [--ssh-target SSH_TARGET]
[--keep-source KEEP_SOURCE] [--keep-target KEEP_TARGET]
[--no-snapshot] [--no-send] [--allow-empty]
[--ignore-replicated] [--no-holds] [--ignore-new]
[--resume] [--strip-path STRIP_PATH] [--buffer BUFFER]
[--clear-refreservation] [--clear-mountpoint]
[--filter-properties FILTER_PROPERTIES] [--rollback]
[--ignore-transfer-errors] [--test] [--verbose]
[--debug]
backup_name target_path
ZFS autobackup v2.4
positional arguments:
backup_name Name of the backup (you should set the zfs property
"autobackup:backup-name" to true on filesystems you
want to backup
target_path Target path
optional arguments:
-h, --help show this help message and exit
--ssh-source SSH_SOURCE
Source host to get backup from. (user@hostname)
Default local.
--ssh-target SSH_TARGET
Target host to push backup to. (user@hostname) Default
local.
--keep-source KEEP_SOURCE
Number of days to keep old snapshots on source.
Default 30.
--keep-target KEEP_TARGET
Number of days to keep old snapshots on target.
Default 30.
--no-snapshot dont create new snapshot (usefull for finishing
uncompleted backups, or cleanups)
--no-send dont send snapshots (usefull to only do a cleanup)
--allow-empty if nothing has changed, still create empty snapshots.
--ignore-replicated Ignore datasets that seem to be replicated some other
way. (No changes since lastest snapshot. Usefull for
proxmox HA replication)
--no-holds Dont lock snapshots on the source. (Usefull to allow
proxmox HA replication to switches nodes)
--ignore-new Ignore filesystem if there are already newer snapshots
for it on the target (use with caution)
--resume support resuming of interrupted transfers by using the
zfs extensible_dataset feature (both zpools should
have it enabled) Disadvantage is that you need to use
zfs recv -A if another snapshot is created on the
target during a receive. Otherwise it will keep
failing.
--strip-path STRIP_PATH
number of directory to strip from path (use 1 when
cloning zones between 2 SmartOS machines)
--buffer BUFFER Use mbuffer with specified size to speedup zfs
transfer. (e.g. --buffer 1G) Will also show nice
progress output.
--clear-refreservation
Set refreservation property to none for new
filesystems. Usefull when backupping SmartOS volumes.
(recommended)
--clear-mountpoint Sets canmount=noauto property, to prevent the received
filesystem from mounting over existing filesystems.
(recommended)
--filter-properties FILTER_PROPERTIES
Filter properties when receiving filesystems. Can be
specified multiple times. (Example: If you send data
from Linux to FreeNAS, you should filter xattr)
--rollback Rollback changes on the target before starting a
backup. (normally you can prevent changes by setting
the readonly property on the target_path to on)
--ignore-transfer-errors
Ignore transfer errors (still checks if received
filesystem exists. usefull for acltype errors)
--test dont change anything, just show what would be done
(still does all read-only operations)
--verbose verbose output
--debug debug output (shows commands that are executed)
When a filesystem fails, zfs_backup will continue and report the number of
failures at that end. Also the exit code will indicate the number of failures.
```
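The exit-code behaviour described at the end of the help text can be used from a wrapper script. The following is only a minimal sketch: the helper names `build_command`, `describe_exit_code` and `run_backup` are hypothetical and not part of zfs_autobackup, and it assumes the script sits in the current directory as in the examples of this README.

```python
import subprocess


def build_command(backup_name, target_path, ssh_source=None, extra_args=()):
    """Build the argument list for one zfs_autobackup run (hypothetical helper)."""
    cmd = ["./zfs_autobackup"]
    if ssh_source:
        cmd += ["--ssh-source", ssh_source]
    cmd += [backup_name, target_path]
    cmd += list(extra_args)
    return cmd


def describe_exit_code(code):
    """Turn the exit code into a status line; per the help text it counts failures."""
    if code == 0:
        return "backup ok"
    return "backup finished with {} failed filesystem(s)".format(code)


def run_backup(cmd):
    # Needs a real zfs_autobackup script and ZFS pools, so it is not called here.
    return describe_exit_code(subprocess.call(cmd))
```

Keeping command construction separate from execution makes it possible to log or inspect the exact command before anything is run.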
Backup example
==============
In this example we're going to back up a SmartOS machine called `smartos01` to our fileserver called `fs1`.
It's important to choose a unique and consistent backup name. In this case we name our backup: `smartos01_fs1`.
Select filesystems to backup
----------------------------
On the source zfs system set the ```autobackup:smartos01_fs1``` zfs property to true:
```
[root@smartos01 ~]# zfs set autobackup:smartos01_fs1=true zones
[root@smartos01 ~]# zfs get -t filesystem autobackup:smartos01_fs1
NAME PROPERTY VALUE SOURCE
zones autobackup:smartos01_fs1 true local
zones/1eb33958-72c1-11e4-af42-ff0790f603dd autobackup:smartos01_fs1 true inherited from zones
zones/3c71a6cd-6857-407c-880c-09225ce4208e autobackup:smartos01_fs1 true inherited from zones
zones/3c905e49-81c0-4a5a-91c3-fc7996f97d47 autobackup:smartos01_fs1 true inherited from zones
...
```
Because we don't want to back up everything, we can exclude certain filesystems by setting the property to false:
```
[root@smartos01 ~]# zfs set autobackup:smartos01_fs1=false zones/backup
[root@smartos01 ~]# zfs get -t filesystem autobackup:smartos01_fs1
NAME PROPERTY VALUE SOURCE
zones autobackup:smartos01_fs1 true local
zones/1eb33958-72c1-11e4-af42-ff0790f603dd autobackup:smartos01_fs1 true inherited from zones
...
zones/backup autobackup:smartos01_fs1 false local
zones/backup/fs1 autobackup:smartos01_fs1 false inherited from zones/backup
...
```
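To illustrate how a selection like the one above could be read by a script, here is a small sketch. It is not the code zfs_autobackup uses. It assumes tab-separated input as produced by `zfs get -H -o name,value,source -t filesystem autobackup:smartos01_fs1`, and the function name `select_filesystems` is made up for this example.

```python
def select_filesystems(zfs_get_output):
    """Return (name, kind) for every filesystem whose property value is 'true'.

    Expects one 'name<TAB>value<TAB>source' line per filesystem (assumed format).
    """
    selected = []
    for line in zfs_get_output.splitlines():
        if not line.strip():
            continue
        name, value, source = line.split("\t")
        if value == "true":
            # 'local' means the property was set on this filesystem itself.
            kind = "direct" if source == "local" else "inherited"
            selected.append((name, kind))
    return selected


# Sample input modelled on the listing above.
sample = "\n".join([
    "zones\ttrue\tlocal",
    "zones/1eb33958-72c1-11e4-af42-ff0790f603dd\ttrue\tinherited from zones",
    "zones/backup\tfalse\tlocal",
    "zones/backup/fs1\tfalse\tinherited from zones/backup",
])

for name, kind in select_filesystems(sample):
    print("Selected: {} ({} selection)".format(name, kind))
```

Filesystems set to false, or inheriting false from a parent, simply never end up in the result.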
Running zfs_autobackup
----------------------
There are 2 ways to run the backup, but the end result is always the same. It's just a matter of security (trust relations between the servers) and preference.
First install the ssh-key on the server that you specify with --ssh-source or --ssh-target.
Method 1: Run the script on the backup server and pull the data from the server specified by --ssh-source. This is usually the preferred way and prevents a hacked server from accessing the backup-data:
```
root@fs1:/home/psy# ./zfs_autobackup --ssh-source root@1.2.3.4 smartos01_fs1 fs1/zones/backup/zfsbackups/smartos01.server.com --verbose
Getting selected source filesystems for backup smartos01_fs1 on root@1.2.3.4
Selected: zones (direct selection)
Selected: zones/1eb33958-72c1-11e4-af42-ff0790f603dd (inherited selection)
Selected: zones/325dbc5e-2b90-11e3-8a3e-bfdcb1582a8d (inherited selection)
...
Ignoring: zones/backup (disabled)
Ignoring: zones/backup/fs1 (disabled)
...
Creating source snapshot smartos01_fs1-20151030203738 on root@1.2.3.4
Getting source snapshot-list from root@1.2.3.4
Getting target snapshot-list from local
Tranferring zones incremental backup between snapshots smartos01_fs1-20151030175345...smartos01_fs1-20151030203738
...
received 1.09MB stream in 1 seconds (1.09MB/sec)
Destroying old snapshots on source
Destroying old snapshots on target
All done
```
Method 2: Run the script on the server and push the data to the backup server specified by --ssh-target:
```
./zfs_autobackup --ssh-target root@2.2.2.2 smartos01_fs1 fs1/zones/backup/zfsbackups/smartos01.server.com --verbose --compress
...
All done
```
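Both methods send an incremental stream between two snapshots, as the `incremental backup between snapshots` line in the Method 1 output shows. The sketch below shows one common way to pick the base snapshot for such a transfer: take the newest snapshot that exists on both sides. This is an illustration under that assumption, not the actual zfs_autobackup logic, and the function names are hypothetical.

```python
def latest_common_snapshot(source_snapshots, target_snapshots):
    """Return the newest snapshot name present on both sides, or None.

    Both lists are assumed to be ordered oldest first.
    """
    on_target = set(target_snapshots)
    for name in reversed(source_snapshots):
        if name in on_target:
            return name
    return None


def plan_transfer(source_snapshots, target_snapshots):
    """Return (mode, base, newest) describing what would need to be sent."""
    if not source_snapshots:
        return ("nothing", None, None)
    newest = source_snapshots[-1]
    base = latest_common_snapshot(source_snapshots, target_snapshots)
    if base is None:
        return ("full", None, newest)          # no shared snapshot: full send
    if base == newest:
        return ("nothing", base, newest)       # target is already up to date
    return ("incremental", base, newest)       # send the changes base..newest


source = ["smartos01_fs1-20151030175345", "smartos01_fs1-20151030203738"]
target = ["smartos01_fs1-20151030175345"]
print(plan_transfer(source, target))
```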
Tips
====
* Set the ```readonly``` property of the target filesystem to ```on```. This prevents changes on the target side. If there are changes the next backup will fail and will require a zfs rollback. (by using the --rollback option for example)
* Use ```--clear-refreservation``` to save space on your backup server.
* Use ```--clear-mountpoint``` to prevent the target server from mounting the backed-up filesystem in the wrong place during a reboot. If this happens on systems like SmartOS or OpenIndiana, svc://filesystem/local won't be able to mount some stuff and you need to resolve these issues on the console.
Speeding up SSH and preventing connection flooding
--------------------------------------------------
Add this to your ~/.ssh/config:
```
Host *
ControlPath ~/.ssh/control-master-%r@%h:%p
ControlMaster auto
ControlPersist 3600
```
This will make all your ssh connections persistent and greatly speed up zfs_autobackup for jobs with short intervals.
Thanks @mariusvw :)
Specifying ssh port or options
------------------------------
The correct way to do this is by creating ~/.ssh/config:
```
Host smartos04
Hostname 1.2.3.4
Port 1234
user root
Compression yes
```
This way you can just specify "smartos04" as host.
The `Compression yes` line also enables SSH compression, which helps on slow links.
Look in man ssh_config for many more options.
Troubleshooting
===============
`cannot receive incremental stream: invalid backup stream`
This usually means you've created a new snapshot on the target side during a backup.
* Solution 1: Restart zfs_autobackup and make sure you don't use --resume. If you did use --resume, be sure to "abort" the receive on the target side with zfs recv -A.
* Solution 2: Destroy the newly created snapshot and restart zfs_autobackup.
`internal error: Invalid argument`
In some cases (Linux -> FreeBSD) this means certain properties are not fully supported on the target system.
Try using something like: --filter-properties xattr
Restore example
===============
Restoring can be done with simple zfs commands. For example, use this to restore a specific SmartOS disk image to a temporary restore location:
```
root@fs1:/home/psy# zfs send fs1/zones/backup/zfsbackups/smartos01.server.com/zones/a3abd6c8-24c6-4125-9e35-192e2eca5908-disk0@smartos01_fs1-20160110000003 | ssh root@2.2.2.2 "zfs recv zones/restore"
```
After that you can rename the disk image from the temporary location to the location of a new SmartOS machine you've created.
Monitoring with Zabbix-jobs
===========================
You can monitor backups by using my zabbix-jobs script. (https://github.com/psy0rz/stuff/tree/master/zabbix-jobs)
Put this command directly after the zfs_autobackup command in your cronjob:
```
zabbix-job-status backup_smartos01_fs1 daily $?
```
This will update the zabbix server with the exit code and will also alert you if the job didn't run for more than 2 days.
Backing up a proxmox cluster with HA replication
================================================
Due to the nature of proxmox we had to make a few enhancements to zfs_autobackup. This will probably also benefit other systems that use their own replication in combination with zfs_autobackup.
All data under rpool/data can be on multiple nodes of the cluster. The names of those filesystems are unique over the whole cluster. Because of this we should back up rpool/data of all nodes to the same destination. This way we won't have duplicate backups of the filesystems that are replicated. Because of various options, you can even migrate hosts and zfs_autobackup will be fine. (and it will get the next backup from the new node automatically)
In the example below we have 3 nodes, named h4, h5 and h6.
The backup will go to a machine named smartos03.
Preparing the proxmox nodes
---------------------------
On each node select the filesystems as follows:
```
root@h4:~# zfs set autobackup:h4_smartos03=true rpool
root@h4:~# zfs set autobackup:h4_smartos03=false rpool/data
root@h4:~# zfs set autobackup:data_smartos03=child rpool/data
```
* rpool will be backed up the usual way, and is named h4_smartos03. (each node will have a unique name)
* rpool/data will be excluded from the usual backup
* The CHILDREN of rpool/data will be selected for a cluster wide backup named data_smartos03. (each node uses the same backup name)
Preparing the backup server
---------------------------
Extra options needed for proxmox with HA:
* --no-holds: To allow proxmox to destroy our snapshots if a VM migrates to another node.
* --ignore-replicated: To ignore the replicated filesystems of proxmox on the receiving proxmox nodes. (i.e. only back up from the node where the VM is active)
I use the following backup script on the backup server:
```
for H in h4 h5 h6; do
  echo "################################### DATA $H"
  # back up data filesystems to a common place
  ./zfs_autobackup --ssh-source "root@$H" data_smartos03 zones/backup/zfsbackups/pxe1_data --clear-refreservation --clear-mountpoint --ignore-transfer-errors --strip-path 2 --verbose --resume --ignore-replicated --no-holds "$@"
  zabbix-job-status "backup_${H}_data_smartos03" daily $? >/dev/null 2>/dev/null
  echo "################################### RPOOL $H"
  # back up rpool to its own place
  ./zfs_autobackup --ssh-source "root@$H" "${H}_smartos03" "zones/backup/zfsbackups/$H" --verbose --clear-refreservation --clear-mountpoint --resume --ignore-transfer-errors "$@"
  zabbix-job-status "backup_${H}_smartos03" daily $? >/dev/null 2>/dev/null
done
```
# ZFS autobackup v3 - TEST VERSION
New in v3:
* Complete rewrite, cleaner object oriented code.
* Python 3 and 2 support.
* Backwards compatible with your current backups and parameters.
* Progressive thinning (via a destroy schedule. default schedule should be fine for most people)
* Cleaner output, with optional color support (pip install colorama).
* Clear distinction between local and remote output.
* Summary at the beginning, displaying what will happen and the current thinning-schedule.
* More efficient destroying/skipping of snapshots on the fly. (no more space issues if your backup is way behind)
* Progress indicator (--progress)
* Better property management (--set-properties and --filter-properties)
* Better resume handling, automatically aborts invalid resumes.
* More robust error handling.
* Prepared for future enhancements.
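
To give an idea of what progressive thinning via a destroy schedule can look like, here is a generic sketch. It is not taken from the v3 code, and the real schedule format and rules may differ. It assumes a schedule made of `(period, ttl)` pairs in seconds, where each rule keeps the oldest snapshot per period for snapshots younger than the ttl. The function name `thin` is hypothetical.

```python
def thin(timestamps, rules, now):
    """Split snapshot timestamps into (keep, destroy).

    timestamps: unix seconds, sorted oldest first.
    rules: list of (period, ttl) pairs in seconds (assumed schedule format).
    """
    seen_blocks = set()
    keep, destroy = [], []
    for ts in timestamps:
        age = now - ts
        wanted = False
        for period, ttl in rules:
            if age > ttl:
                continue  # too old for this rule
            block = (period, ts // period)
            if block not in seen_blocks:
                # First snapshot in this time block: this rule wants it.
                seen_blocks.add(block)
                wanted = True
        (keep if wanted else destroy).append(ts)
    return keep, destroy


HOUR, DAY = 3600, 86400
# Example schedule: one per hour for a day, one per day for a week.
rules = [(HOUR, DAY), (DAY, 7 * DAY)]
now = 10 * DAY
snapshots = [now - 8 * DAY, now - 3 * DAY, now - 3 * DAY + 600, now - 1800, now - 1200]
keep, destroy = thin(snapshots, rules, now)
print("keep:", keep)
print("destroy:", destroy)
```

A snapshot is destroyed only when no rule wants it, so recent snapshots stay dense while older ones are thinned out step by step.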

File diff suppressed because it is too large