Compare commits: v3.0-beta2...v3.0-beta3 (56 commits)
| SHA1 |
|---|
| 17445ec54a |
| 07a150618a |
| 067f3b92d1 |
| 71a394cfc7 |
| bfc36ac87f |
| ad47b26f56 |
| f38da17592 |
| d973905303 |
| 82465acd5b |
| 514131d67c |
| dfcae1613b |
| 67b21b4015 |
| 3907c850a6 |
| 3b9b96243b |
| 54235f455a |
| c176b968a9 |
| 921f7df0a5 |
| edee598cf8 |
| 80b3272f0f |
| 617e0fb69b |
| 46a85fd170 |
| 1f59229419 |
| fcd98e2d87 |
| dd8b2442ec |
| 291040eb2d |
| d12a132f3f |
| 2255e0e691 |
| 6a481ed6a4 |
| 11d051122b |
| 511311eee7 |
| fa405dce57 |
| bf37322aba |
| a7bf1e8af8 |
| 352c61fd00 |
| 0fe09ea535 |
| 64c9b84102 |
| 8a2e1d36d7 |
| a120fbb85f |
| 42bbecc571 |
| b8d744869d |
| c253a17b75 |
| d1fe00aee2 |
| 85d2e1a635 |
| 9455918708 |
| 5316737388 |
| c6afa33e62 |
| a8d0ff9f37 |
| cc1725e3be |
| 42b71bbc74 |
| 84d44a267a |
| ba89dc8bb2 |
| 62178e424e |
| b0ffdb4893 |
| cc45122e3e |
| e872d79677 |
| e74e50d4e8 |
186
README.md
@ -1,13 +1,19 @@
|
||||
# ZFS autobackup
|
||||
|
||||
(Check out v3.0-beta for the new cool stuff: https://github.com/psy0rz/zfs_autobackup/blob/v3/README.md)
|
||||
|
||||
Official releases: https://github.com/psy0rz/zfs_autobackup/releases
|
||||
|
||||
Introduction
|
||||
============
|
||||
|
||||
ZFS autobackup is used to periodically back up ZFS filesystems to other locations. This is done using the very efficient zfs send and receive commands.
|
||||
|
||||
It has the following features:
|
||||
* Automaticly selects filesystems to backup by looking at a simple ZFS property.
|
||||
* Creates consistent snapshots.
|
||||
* Works across operating systems: Tested with Linux, FreeBSD/FreeNAS and SmartOS.
|
||||
* Works in combination with existing replication systems. (Like Proxmox HA)
|
||||
* Automatically selects filesystems to back up by looking at a simple ZFS property. (recursive)
|
||||
* Creates consistent snapshots. (takes all snapshots at once, atomic.)
|
||||
* Multiple backup modes:
|
||||
* "push" local data to a backup-server via SSH.
|
||||
* "pull" remote data from a server via SSH and back it up locally.
|
||||
@ -16,32 +22,37 @@ It has the following features:
|
||||
* Supports resuming of interrupted transfers. (via the zfs extensible_dataset feature)
|
||||
* Backups and snapshots can be named to prevent conflicts. (multiple backups from and to the same filesystems are no problem)
|
||||
* Always creates a new snapshot before starting.
|
||||
* Checks everything and aborts on errors.
|
||||
* Checks everything but tries to continue on non-fatal errors when possible. (Reports error-count when done)
|
||||
* Ability to 'finish' aborted backups to see what goes wrong.
|
||||
* Easy to debug and has a test-mode. Actual unix commands are printed.
|
||||
* Keeps latest X snapshots remote and locally. (default 30, configurable)
|
||||
* Uses zfs-holds on important snapshots so they can't be accidentally destroyed.
|
||||
* Easy installation:
|
||||
* Only one host needs the zfs_autobackup script. The other host just needs ssh and the zfs command.
|
||||
* Written in python and uses zfs-commands, no 3rd party dependencys or libraries.
|
||||
* Written in python and uses zfs-commands, no 3rd party dependencies or libraries.
|
||||
* No separate config files or properties. Just one command you can copy/paste in your backup script.
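The "consistent snapshots" feature above can be pictured with a small sketch. It assumes, as the comments in the script state, that ZFS can only take atomic snapshots within a single pool, so the filesystems are grouped by pool and one `zfs snapshot` command is built per pool. The helper name `build_snapshot_commands` and the dataset and snapshot names are hypothetical; this is an illustration, not the script's actual code.

```python
from collections import OrderedDict


def build_snapshot_commands(filesystems, snapshot):
    """Hypothetical helper: build one 'zfs snapshot' command per pool."""
    pools = OrderedDict()
    for filesystem in filesystems:
        # The pool is the first path component of the dataset name.
        pool = filesystem.split("/")[0]
        pools.setdefault(pool, []).append(filesystem)

    commands = []
    for pool_filesystems in pools.values():
        cmd = ["zfs", "snapshot"]
        cmd.extend(fs + "@" + snapshot for fs in pool_filesystems)
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    # Made-up dataset and snapshot names, only to show the grouping.
    for cmd in build_snapshot_commands(["rpool", "rpool/data", "tank/vm"], "example-snapshot"):
        print(" ".join(cmd))
```

Each command snapshots all selected filesystems of one pool in a single call, which is what makes the snapshots of that pool consistent with each other.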
|
||||
|
||||
Usage
|
||||
====
|
||||
```
|
||||
usage: zfs_autobackup [-h] [--ssh-source SSH_SOURCE] [--ssh-target SSH_TARGET]
|
||||
[--ssh-cipher SSH_CIPHER] [--keep-source KEEP_SOURCE]
|
||||
[--keep-target KEEP_TARGET] [--no-snapshot] [--no-send]
|
||||
[--resume] [--strip-path STRIP_PATH] [--destroy-stale]
|
||||
[--keep-source KEEP_SOURCE] [--keep-target KEEP_TARGET]
|
||||
[--no-snapshot] [--no-send] [--allow-empty]
|
||||
[--ignore-replicated] [--no-holds] [--ignore-new]
|
||||
[--resume] [--strip-path STRIP_PATH] [--buffer BUFFER]
|
||||
[--clear-refreservation] [--clear-mountpoint]
|
||||
[--rollback] [--compress] [--test] [--verbose] [--debug]
|
||||
backup_name target_fs
|
||||
[--filter-properties FILTER_PROPERTIES] [--rollback]
|
||||
[--ignore-transfer-errors] [--test] [--verbose]
|
||||
[--debug]
|
||||
backup_name target_path
|
||||
|
||||
ZFS autobackup v2.1
|
||||
ZFS autobackup v2.4
|
||||
|
||||
positional arguments:
|
||||
backup_name Name of the backup (you should set the zfs property
|
||||
"autobackup:backup-name" to true on filesystems you
|
||||
want to backup
|
||||
target_fs Target filesystem
|
||||
target_path Target path
|
||||
|
||||
optional arguments:
|
||||
-h, --help show this help message and exit
|
||||
@ -51,8 +62,6 @@ optional arguments:
|
||||
--ssh-target SSH_TARGET
|
||||
Target host to push backup to. (user@hostname) Default
|
||||
local.
|
||||
--ssh-cipher SSH_CIPHER
|
||||
SSH cipher to use (default None)
|
||||
--keep-source KEEP_SOURCE
|
||||
Number of days to keep old snapshots on source.
|
||||
Default 30.
|
||||
@ -62,14 +71,26 @@ optional arguments:
|
||||
--no-snapshot dont create new snapshot (usefull for finishing
|
||||
uncompleted backups, or cleanups)
|
||||
--no-send dont send snapshots (usefull to only do a cleanup)
|
||||
--allow-empty if nothing has changed, still create empty snapshots.
|
||||
--ignore-replicated Ignore datasets that seem to be replicated some other
|
||||
way. (No changes since lastest snapshot. Usefull for
|
||||
proxmox HA replication)
|
||||
--no-holds Dont lock snapshots on the source. (Usefull to allow
|
||||
proxmox HA replication to switches nodes)
|
||||
--ignore-new Ignore filesystem if there are already newer snapshots
|
||||
for it on the target (use with caution)
|
||||
--resume support resuming of interrupted transfers by using the
|
||||
zfs extensible_dataset feature (both zpools should
|
||||
have it enabled)
|
||||
have it enabled) Disadvantage is that you need to use
|
||||
zfs recv -A if another snapshot is created on the
|
||||
target during a receive. Otherwise it will keep
|
||||
failing.
|
||||
--strip-path STRIP_PATH
|
||||
number of directory to strip from path (use 1 when
|
||||
cloning zones between 2 SmartOS machines)
|
||||
--destroy-stale Destroy stale backups that have no more snapshots. Be
|
||||
sure to verify the output before using this!
|
||||
--buffer BUFFER Use mbuffer with specified size to speedup zfs
|
||||
transfer. (e.g. --buffer 1G) Will also show nice
|
||||
progress output.
|
||||
--clear-refreservation
|
||||
Set refreservation property to none for new
|
||||
filesystems. Usefull when backupping SmartOS volumes.
|
||||
@ -77,14 +98,23 @@ optional arguments:
|
||||
--clear-mountpoint Sets canmount=noauto property, to prevent the received
|
||||
filesystem from mounting over existing filesystems.
|
||||
(recommended)
|
||||
--filter-properties FILTER_PROPERTIES
|
||||
Filter properties when receiving filesystems. Can be
|
||||
specified multiple times. (Example: If you send data
|
||||
from Linux to FreeNAS, you should filter xattr)
|
||||
--rollback Rollback changes on the target before starting a
|
||||
backup. (normally you can prevent changes by setting
|
||||
the readonly property on the target_fs to on)
|
||||
--compress use compression during zfs send/recv
|
||||
the readonly property on the target_path to on)
|
||||
--ignore-transfer-errors
|
||||
Ignore transfer errors (still checks if received
|
||||
filesystem exists. usefull for acltype errors)
|
||||
--test dont change anything, just show what would be done
|
||||
(still does all read-only operations)
|
||||
--verbose verbose output
|
||||
--debug debug output (shows commands that are executed)
|
||||
|
||||
When a filesystem fails, zfs_backup will continue and report the number of
|
||||
failures at that end. Also the exit code will indicate the number of failures.
|
||||
```
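As a rough illustration of how `--strip-path` and `target_path` from the usage text combine, the sketch below maps a source filesystem to its name on the target. `lstrip_path` mirrors the one-line helper in the script; `target_filesystem` and the dataset names are hypothetical and only wrap the concatenation the script performs.

```python
def lstrip_path(path, count):
    """Drop the first `count` components from a dataset path."""
    return "/".join(path.split("/")[count:])


def target_filesystem(target_path, source_filesystem, strip_path=0):
    """Hypothetical helper: name of the received filesystem on the target."""
    return target_path + "/" + lstrip_path(source_filesystem, strip_path)


if __name__ == "__main__":
    # Made-up names: with --strip-path 1 the leading "zones" is dropped.
    print(target_filesystem("backup/pool", "zones/vm1", strip_path=1))
    print(target_filesystem("backup/pool", "zones/vm1"))
```

Without stripping, the full source path is appended under the target path.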
|
||||
|
||||
Backup example
|
||||
@ -131,7 +161,7 @@ First install the ssh-key on the server that you specify with --ssh-source or --
|
||||
|
||||
Method 1: Run the script on the backup server and pull the data from the server specified by --ssh-source. This is usually the preferred way and prevents a hacked server from accessing the backup-data:
|
||||
```
|
||||
root@fs1:/home/psy# ./zfs_autobackup --ssh-source root@1.2.3.4 smartos01_fs1 fs1/zones/backup/zfsbackups/smartos01.server.com --verbose --compress
|
||||
root@fs1:/home/psy# ./zfs_autobackup --ssh-source root@1.2.3.4 smartos01_fs1 fs1/zones/backup/zfsbackups/smartos01.server.com --verbose
|
||||
Getting selected source filesystems for backup smartos01_fs1 on root@1.2.3.4
|
||||
Selected: zones (direct selection)
|
||||
Selected: zones/1eb33958-72c1-11e4-af42-ff0790f603dd (inherited selection)
|
||||
@ -160,12 +190,63 @@ All done
|
||||
```
|
||||
|
||||
Tips
|
||||
----
|
||||
====
|
||||
|
||||
* Set the ```readonly``` property of the target filesystem to ```on```. This prevents changes on the target side. If there are changes the next backup will fail and will require a zfs rollback. (by using the --rollback option for example)
|
||||
* Use ```--clear-refreservation``` to save space on your backup server.
|
||||
* Use ```--clear-mountpoint``` to prevent the target server from mounting the backed-up filesystem in the wrong place during a reboot. If this happens on systems like SmartOS or OpenIndiana, svc://filesystem/local won't be able to mount some stuff and you need to resolve these issues on the console.
|
||||
|
||||
Speeding up SSH and preventing connection flooding
|
||||
-----------------------------------------------
|
||||
|
||||
Add this to your ~/.ssh/config:
|
||||
```
|
||||
Host *
|
||||
ControlPath ~/.ssh/control-master-%r@%h:%p
|
||||
ControlMaster auto
|
||||
ControlPersist 3600
|
||||
```
|
||||
|
||||
This will make all your ssh connections persistent and greatly speed up zfs_autobackup for jobs with short intervals.
|
||||
|
||||
Thanks @mariusvw :)
|
||||
|
||||
|
||||
Specifying ssh port or options
|
||||
------------------------------
|
||||
|
||||
The correct way to do this is by creating ~/.ssh/config:
|
||||
```
|
||||
Host smartos04
|
||||
Hostname 1.2.3.4
|
||||
Port 1234
|
||||
user root
|
||||
Compression yes
|
||||
```
|
||||
|
||||
This way you can just specify "smartos04" as host.
|
||||
|
||||
The `Compression yes` line also enables SSH compression, which helps on slow links.
|
||||
|
||||
Look in man ssh_config for many more options.
|
||||
|
||||
Troubleshooting
|
||||
===============
|
||||
|
||||
`cannot receive incremental stream: invalid backup stream`
|
||||
|
||||
This usually means you've created a new snapshot on the target side during a backup.
|
||||
* Solution 1: Restart zfs_autobackup and make sure you don't use --resume. If you did use --resume, be sure to "abort" the receive on the target side with zfs recv -A.
|
||||
* Solution 2: Destroy the newly created snapshot and restart zfs_autobackup.
|
||||
|
||||
|
||||
`internal error: Invalid argument`
|
||||
|
||||
In some cases (Linux -> FreeBSD) this means certain properties are not fully supported on the target system.
|
||||
|
||||
Try using something like: --filter-properties xattr
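To make the effect of `--filter-properties` concrete: judging by the diff, each filtered property is passed to `zfs recv` as an `-x` argument. The following is a simplified sketch under that assumption; `build_recv_command` is a hypothetical name and the real command carries more options (resume support, verbosity, buffering).

```python
def build_recv_command(target_filesystem, filter_properties=None):
    """Hypothetical helper: build a simplified 'zfs recv' command line."""
    cmd = ["zfs", "recv", "-u"]
    # Every filtered property becomes its own "-x <property>" pair.
    for filter_property in filter_properties or []:
        cmd.extend(["-x", filter_property])
    cmd.append(target_filesystem)
    return cmd


if __name__ == "__main__":
    # Made-up target name, filtering xattr as suggested above.
    print(" ".join(build_recv_command("backup/pool/vm1", ["xattr"])))
```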
|
||||
|
||||
|
||||
Restore example
|
||||
===============
|
||||
|
||||
@ -178,19 +259,6 @@ root@fs1:/home/psy# zfs send fs1/zones/backup/zfsbackups/smartos01.server.com/z
|
||||
|
||||
After that you can rename the disk image from the temporary location to the location of a new SmartOS machine you've created.
|
||||
|
||||
Snapshotting example
|
||||
====================
|
||||
|
||||
Sending huge snapshots can't be resumed when a connection is interrupted: Next time zfs_autobackup is started, the whole snapshot will be transferred again. For this reason you might want to have multiple small snapshots.
|
||||
|
||||
The --no-send option can be useful for this. This way you can already create small snapshots every few hours:
|
||||
````
|
||||
[root@smartos2 ~]# zfs_autobackup --ssh-source root@smartos1 smartos1_freenas1 zones --verbose --ssh-cipher chacha20-poly1305@openssh.com --no-send
|
||||
````
|
||||
|
||||
Later when our freenas1 server is ready we can use the same command without the --no-send at freenas1. At that point the server will receive all the small snapshots up to that point.
|
||||
|
||||
|
||||
|
||||
Monitoring with Zabbix-jobs
|
||||
===========================
|
||||
@ -203,3 +271,55 @@ zabbix-job-status backup_smartos01_fs1 daily $?
|
||||
```
|
||||
|
||||
This will update the zabbix server with the exit code and will also alert you if the job didn't run for more than 2 days.
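If you want to interpret the exit code yourself before handing it to a monitoring system, a sketch like the one below may help. It assumes what the usage text and the script suggest: 0 means success, 255 is used by the script's `abort()` for fatal errors, and any other value is the number of failed filesystems. The function name `describe_exit_code` is hypothetical.

```python
def describe_exit_code(code):
    """Hypothetical helper: turn a zfs_autobackup exit code into a message."""
    if code == 0:
        return "ok"
    if code == 255:
        # abort() in the script exits with 255 on fatal errors.
        return "aborted (fatal error)"
    # Assumption: other values count the filesystems that failed.
    return "{0} filesystem(s) failed".format(code)


if __name__ == "__main__":
    for code in (0, 2, 255):
        print(code, describe_exit_code(code))
```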
|
||||
|
||||
|
||||
Backing up a proxmox cluster with HA replication
|
||||
==================================================
|
||||
|
||||
Due to the nature of proxmox we had to make a few enhancements to zfs_autobackup. This will probably also benefit other systems that use their own replication in combination with zfs_autobackup.
|
||||
|
||||
All data under rpool/data can be on multiple nodes of the cluster. The naming of those filesystems is unique over the whole cluster. Because of this we should back up rpool/data of all nodes to the same destination. This way we won't have duplicate backups of the filesystems that are replicated. Because of various options, you can even migrate hosts and zfs_autobackup will be fine. (and it will get the next backup from the new node automatically)
|
||||
|
||||
|
||||
In the example below we have 3 nodes, named h4, h5 and h6.
|
||||
|
||||
The backup will go to a machine named smartos03.
|
||||
|
||||
Preparing the proxmox nodes
|
||||
---------------------------
|
||||
|
||||
On each node select the filesystems as follows:
|
||||
```
|
||||
root@h4:~# zfs set autobackup:h4_smartos03=true rpool
|
||||
root@h4:~# zfs set autobackup:h4_smartos03=false rpool/data
|
||||
root@h4:~# zfs set autobackup:data_smartos03=child rpool/data
|
||||
|
||||
```
|
||||
|
||||
* rpool will be backed up the usual way, and is named h4_smartos03. (each node will have a unique name)
|
||||
* rpool/data will be excluded from the usual backup
|
||||
* The CHILDREN of rpool/data will be selected for a cluster-wide backup named data_smartos03. (each node uses the same backup name)
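The difference between `true`, `false` and `child` can be sketched as follows. The logic follows the selection code shown in the diff, but `select_filesystems` is a hypothetical standalone function, and its input (tuples of name, value and source as `zfs get` would report them, parents before children) is an assumption made for the example.

```python
def select_filesystems(properties):
    """Hypothetical helper: pick filesystems from (name, value, source) tuples."""
    prefix = "inherited from "
    direct = []    # filesystems where the property was set locally
    selected = []  # filesystems that will actually be backed up
    for name, value, source in properties:
        if value == "false":
            continue  # explicitly disabled
        if source == "local" and value in ("true", "child"):
            direct.append(name)
        if source == "local" and value == "true":
            selected.append(name)  # direct selection
        elif source.startswith(prefix) and value in ("true", "child"):
            # Only follow inheritance from a filesystem selected on this host.
            if source[len(prefix):] in direct:
                selected.append(name)
    return selected


if __name__ == "__main__":
    # Made-up datasets: "child" selects the children but not rpool/data itself.
    print(select_filesystems([
        ("rpool/data", "child", "local"),
        ("rpool/data/vm-100", "child", "inherited from rpool/data"),
    ]))
```

With `child`, the parent only marks where the selection starts; it is not part of the backup itself.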
|
||||
|
||||
|
||||
Preparing the backup server
|
||||
---------------------------
|
||||
|
||||
Extra options needed for proxmox with HA:
|
||||
* --no-holds: To allow proxmox to destroy our snapshots if a VM migrates to another node.
|
||||
* --ignore-replicated: To ignore the replicated filesystems of proxmox on the receiving proxmox nodes. (i.e. only back up from the node where the VM is active)
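A rough sketch of what `--ignore-replicated` does, based on the code in the diff: the ZFS `written` property is read for every selected filesystem, and filesystems that report `0` or `0B` (nothing written since their latest snapshot) are dropped from the backup. The function names and the sample values below are hypothetical.

```python
def unchanged_filesystems(written_rows):
    """Hypothetical helper: names whose 'written' value is zero.

    written_rows holds (filesystem, written) pairs, as printed by
    'zfs get -H -o name,value written'.
    """
    return [fs for fs, written in written_rows if written in ("0", "0B")]


def drop_replicated(source_filesystems, written_rows):
    """Hypothetical helper: keep only filesystems that have new data."""
    unchanged = set(unchanged_filesystems(written_rows))
    return [fs for fs in source_filesystems if fs not in unchanged]


if __name__ == "__main__":
    # Made-up values: vm-101 is a passive replica with nothing written.
    rows = [("rpool/data/vm-100", "1.2M"), ("rpool/data/vm-101", "0")]
    print(drop_replicated([fs for fs, _ in rows], rows))
```

On a node that only receives replicas, nothing is written between snapshots, so those filesystems are skipped and only the node where the VM is active sends a backup.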
|
||||
|
||||
|
||||
I use the following backup script on the backup server:
|
||||
```
|
||||
for H in h4 h5 h6; do
|
||||
echo "################################### DATA $H"
|
||||
#backup data filesystems to a common place
|
||||
./zfs_autobackup --ssh-source root@$H data_smartos03 zones/backup/zfsbackups/pxe1_data --clear-refreservation --clear-mountpoint --ignore-transfer-errors --strip-path 2 --verbose --resume --ignore-replicated --no-holds $@
|
||||
zabbix-job-status backup_$H""_data_smartos03 daily $? >/dev/null 2>/dev/null
|
||||
|
||||
echo "################################### RPOOL $H"
|
||||
#backup rpool to own place
|
||||
./zfs_autobackup --ssh-source root@$H $H""_smartos03 zones/backup/zfsbackups/$H --verbose --clear-refreservation --clear-mountpoint --resume --ignore-transfer-errors $@
|
||||
zabbix-job-status backup_$H""_smartos03 daily $? >/dev/null 2>/dev/null
|
||||
done
|
||||
```
|
||||
|
||||
595
zfs_autobackup
@ -1,8 +1,8 @@
|
||||
#!/usr/bin/env python
|
||||
#!/usr/bin/env python2
|
||||
# -*- coding: utf8 -*-
|
||||
|
||||
|
||||
|
||||
#(C)edwin@datux.nl -- Edwin Eefting
|
||||
#Release under GPL.
|
||||
|
||||
from __future__ import print_function
|
||||
import os
|
||||
@ -11,25 +11,26 @@ import re
|
||||
import traceback
|
||||
import subprocess
|
||||
import pprint
|
||||
import cStringIO
|
||||
import time
|
||||
|
||||
|
||||
def error(txt):
    print(txt, file=sys.stderr)


def verbose(txt):
    if args.verbose:
        print(txt)


def debug(txt):
    if args.debug:
        print(txt)


#fatal abort execution, exit code 255
def abort(txt):
    error(txt)
    sys.exit(255)
|
||||
|
||||
|
||||
|
||||
"""run a command. specify ssh user@host to run remotely"""
|
||||
def run(cmd, input=None, ssh_to="local", tab_split=False, valid_exitcodes=[ 0 ], test=False):
|
||||
@ -40,10 +41,6 @@ def run(cmd, input=None, ssh_to="local", tab_split=False, valid_exitcodes=[ 0 ],
|
||||
#use ssh?
|
||||
if ssh_to != "local":
|
||||
encoded_cmd.extend(["ssh", ssh_to])
|
||||
if args.ssh_cipher:
|
||||
encoded_cmd.extend(["-c", args.ssh_cipher])
|
||||
if args.compress:
|
||||
encoded_cmd.append("-C")
|
||||
|
||||
|
||||
#make sure the command gets all the data in utf8 format:
|
||||
@ -104,22 +101,24 @@ def zfs_get_selected_filesystems(ssh_to, backup_name):
|
||||
for source_filesystem in source_filesystems:
|
||||
(name,value,source)=source_filesystem
|
||||
if value=="false":
|
||||
verbose("Ignoring: {0} (disabled)".format(name))
|
||||
verbose("* Ignored : {0} (disabled)".format(name))
|
||||
|
||||
else:
|
||||
if source=="local":
|
||||
selected_filesystems.append(name)
|
||||
if source=="local" and ( value=="true" or value=="child"):
|
||||
direct_filesystems.append(name)
|
||||
verbose("Selected: {0} (direct selection)".format(name))
|
||||
elif source.find("inherited from ")==0:
|
||||
|
||||
if source=="local" and value=="true":
|
||||
selected_filesystems.append(name)
|
||||
verbose("* Selected: {0} (direct selection)".format(name))
|
||||
elif source.find("inherited from ")==0 and (value=="true" or value=="child"):
|
||||
inherited_from=re.sub("^inherited from ", "", source)
|
||||
if inherited_from in direct_filesystems:
|
||||
selected_filesystems.append(name)
|
||||
verbose("Selected: {0} (inherited selection)".format(name))
|
||||
verbose("* Selected: {0} (inherited selection)".format(name))
|
||||
else:
|
||||
verbose("Ignored: {0} (already a backup)".format(name))
|
||||
verbose("* Ignored : {0} (already a backup)".format(name))
|
||||
else:
|
||||
vebose("Ignored: {0} ({0})".format(source))
|
||||
verbose("* Ignored : {0} (only childs)".format(name))
|
||||
|
||||
return(selected_filesystems)
|
||||
|
||||
@ -130,7 +129,6 @@ def zfs_get_resumable_filesystems(ssh_to, filesystems):
|
||||
cmd=[ "zfs", "get", "-t", "volume,filesystem", "-o", "name,value", "-H", "receive_resume_token" ]
|
||||
cmd.extend(filesystems)
|
||||
|
||||
#TODO: get rid of ugly errors for non-existing target filesystems
|
||||
resumable_filesystems=run(ssh_to=ssh_to, tab_split=True, cmd=cmd, valid_exitcodes= [ 0,1 ] )
|
||||
|
||||
ret={}
|
||||
@ -166,23 +164,32 @@ test_snapshots={}
|
||||
|
||||
|
||||
|
||||
"""create snapshot on multiple filesystems at once (atomicly)"""
|
||||
"""create snapshot on multiple filesystems at once (atomicly per pool)"""
|
||||
def zfs_create_snapshot(ssh_to, filesystems, snapshot):
|
||||
|
||||
cmd=[ "zfs", "snapshot" ]
|
||||
|
||||
#collect per pool, zfs can only take atomic snapshots per pool
|
||||
pools={}
|
||||
for filesystem in filesystems:
|
||||
cmd.append(filesystem+"@"+snapshot)
|
||||
pool=filesystem.split('/')[0]
|
||||
if pool not in pools:
|
||||
pools[pool]=[]
|
||||
pools[pool].append(filesystem)
|
||||
|
||||
#in testmode we dont actually make changes, so keep them in a list to simulate
|
||||
if args.test:
|
||||
if not ssh_to in test_snapshots:
|
||||
test_snapshots[ssh_to]={}
|
||||
if not filesystem in test_snapshots[ssh_to]:
|
||||
test_snapshots[ssh_to][filesystem]=[]
|
||||
test_snapshots[ssh_to][filesystem].append(snapshot)
|
||||
for pool in pools:
|
||||
cmd=[ "zfs", "snapshot" ]
|
||||
for filesystem in pools[pool]:
|
||||
cmd.append(filesystem+"@"+snapshot)
|
||||
|
||||
run(ssh_to=ssh_to, tab_split=False, cmd=cmd, test=args.test)
|
||||
#in testmode we dont actually make changes, so keep them in a list to simulate
|
||||
# if args.test:
|
||||
# if not ssh_to in test_snapshots:
|
||||
# test_snapshots[ssh_to]={}
|
||||
# if not filesystem in test_snapshots[ssh_to]:
|
||||
# test_snapshots[ssh_to][filesystem]=[]
|
||||
# test_snapshots[ssh_to][filesystem].append(snapshot)
|
||||
|
||||
run(ssh_to=ssh_to, tab_split=False, cmd=cmd, test=args.test)
|
||||
|
||||
|
||||
"""get names of all snapshots for specified filesystems belonging to backup_name
|
||||
@ -194,13 +201,12 @@ def zfs_get_snapshots(ssh_to, filesystems, backup_name):
|
||||
ret={}
|
||||
|
||||
if filesystems:
|
||||
#TODO: get rid of ugly errors for non-existing target filesystems
|
||||
cmd=[
|
||||
"zfs", "list", "-d", "1", "-r", "-t" ,"snapshot", "-H", "-o", "name"
|
||||
]
|
||||
cmd.extend(filesystems)
|
||||
|
||||
snapshots=run(ssh_to=ssh_to, tab_split=False, cmd=cmd, valid_exitcodes=[ 0,1 ])
|
||||
snapshots=run(ssh_to=ssh_to, tab_split=False, cmd=cmd, valid_exitcodes=[ 0 ])
|
||||
|
||||
|
||||
for snapshot in snapshots:
|
||||
@ -211,23 +217,46 @@ def zfs_get_snapshots(ssh_to, filesystems, backup_name):
|
||||
ret[filesystem].append(snapshot_name)
|
||||
|
||||
#also add any test-snapshots that were created with --test mode
|
||||
if args.test:
|
||||
if ssh_to in test_snapshots:
|
||||
for filesystem in filesystems:
|
||||
if filesystem in test_snapshots[ssh_to]:
|
||||
if not filesystem in ret:
|
||||
ret[filesystem]=[]
|
||||
ret[filesystem].extend(test_snapshots[ssh_to][filesystem])
|
||||
# if args.test:
|
||||
# if ssh_to in test_snapshots:
|
||||
# for filesystem in filesystems:
|
||||
# if filesystem in test_snapshots[ssh_to]:
|
||||
# if not filesystem in ret:
|
||||
# ret[filesystem]=[]
|
||||
# ret[filesystem].extend(test_snapshots[ssh_to][filesystem])
|
||||
|
||||
return(ret)
|
||||
|
||||
|
||||
def default_tag():
    return("zfs_autobackup:"+args.backup_name)


"""hold a snapshot so it can't be destroyed accidentally by admin or other processes"""
def zfs_hold_snapshot(ssh_to, snapshot, tag=None):
    cmd=[
        "zfs", "hold", tag or default_tag(), snapshot
    ]

    run(ssh_to=ssh_to, test=args.test, tab_split=False, cmd=cmd, valid_exitcodes=[ 0, 1 ])


"""release a snapshot"""
def zfs_release_snapshot(ssh_to, snapshot, tag=None):
    cmd=[
        "zfs", "release", tag or default_tag(), snapshot
    ]

    run(ssh_to=ssh_to, test=args.test, tab_split=False, cmd=cmd, valid_exitcodes=[ 0, 1 ])
|
||||
|
||||
|
||||
|
||||
"""transfer a zfs snapshot from source to target. both can be either local or via ssh.
|
||||
|
||||
|
||||
TODO:
|
||||
|
||||
(partially implemented, local buffer is a bit more annoying to do)
|
||||
|
||||
buffering: specify buffer_size to use mbuffer (or alike) to apply buffering where necessary
|
||||
|
||||
local to local:
|
||||
@ -240,7 +269,6 @@ remote send -> remote buffer -> ssh -> local buffer -> local receive
|
||||
remote to remote:
|
||||
remote send -> remote buffer -> ssh -> local buffer -> ssh -> remote buffer -> remote receive
|
||||
|
||||
TODO: can we string together all the zfs sends and recvs, so that we only need to use 1 ssh connection? should be faster if there are many small snapshots
|
||||
|
||||
|
||||
|
||||
@ -253,26 +281,31 @@ def zfs_transfer(ssh_source, source_filesystem, first_snapshot, second_snapshot,
|
||||
|
||||
if ssh_source != "local":
|
||||
source_cmd.extend([ "ssh", ssh_source ])
|
||||
if args.ssh_cipher:
|
||||
source_cmd.extend(["-c", args.ssh_cipher])
|
||||
if args.compress:
|
||||
source_cmd.append("-C")
|
||||
|
||||
source_cmd.extend(["zfs", "send", ])
|
||||
|
||||
#all kind of performance options:
|
||||
source_cmd.append("-L") # large block support
|
||||
source_cmd.append("-e") # WRITE_EMBEDDED, more compact stream
|
||||
source_cmd.append("-c") # use compressed WRITE records
|
||||
if not args.resume:
|
||||
source_cmd.append("-D") # dedupped stream, sends less duplicate data
|
||||
|
||||
|
||||
|
||||
#only verbose in debug mode, lots of output
|
||||
if args.debug:
|
||||
if args.debug :
|
||||
source_cmd.append("-v")
|
||||
|
||||
|
||||
if not first_snapshot:
|
||||
txt="Initial transfer of "+source_filesystem+" snapshot "+second_snapshot
|
||||
txt=">>> Transfer: "+source_filesystem+"@"+second_snapshot
|
||||
else:
|
||||
txt="Incremental transfer of "+source_filesystem+" between snapshots "+first_snapshot+"..."+second_snapshot
|
||||
txt=">>> Transfer: "+source_filesystem+"@"+first_snapshot+"...@"+second_snapshot
|
||||
|
||||
if resume_token:
|
||||
source_cmd.extend([ "-t", resume_token ])
|
||||
verbose("RESUMING "+txt)
|
||||
txt=txt+" [RESUMED]"
|
||||
|
||||
else:
|
||||
source_cmd.append("-p")
|
||||
@ -285,22 +318,26 @@ def zfs_transfer(ssh_source, source_filesystem, first_snapshot, second_snapshot,
|
||||
else:
|
||||
source_cmd.append(source_filesystem + "@" + second_snapshot)
|
||||
|
||||
verbose(txt)
|
||||
verbose(txt)
|
||||
|
||||
if args.buffer and args.ssh_source!="local":
|
||||
source_cmd.append("|mbuffer -m {}".format(args.buffer))
|
||||
|
||||
|
||||
#### build target command
|
||||
target_cmd=[]
|
||||
|
||||
if ssh_target != "local":
|
||||
target_cmd.extend([ "ssh", ssh_target ])
|
||||
if args.ssh_cipher:
|
||||
target_cmd.extend(["-c", args.ssh_cipher])
|
||||
if args.compress:
|
||||
target_cmd.append("-C")
|
||||
|
||||
target_cmd.extend(["zfs", "recv", "-u" ])
|
||||
|
||||
#also verbose in --verbose mode so we can see the transfer speed when its completed
|
||||
if args.verbose or args.debug:
|
||||
# filter certain properties on receive (usefull for linux->freebsd in some cases)
|
||||
if args.filter_properties:
|
||||
for filter_property in args.filter_properties:
|
||||
target_cmd.extend([ "-x" , filter_property ])
|
||||
|
||||
if args.debug:
|
||||
target_cmd.append("-v")
|
||||
|
||||
if args.resume:
|
||||
@ -312,6 +349,8 @@ def zfs_transfer(ssh_source, source_filesystem, first_snapshot, second_snapshot,
|
||||
else:
|
||||
target_cmd.append(target_filesystem)
|
||||
|
||||
if args.buffer and args.ssh_target!="local":
|
||||
target_cmd.append("|mbuffer -m {}".format(args.buffer))
|
||||
|
||||
|
||||
#### make sure parent on target exists
|
||||
@ -332,44 +371,54 @@ def zfs_transfer(ssh_source, source_filesystem, first_snapshot, second_snapshot,
|
||||
source_proc.stdout.close() # Allow p1 to receive a SIGPIPE if p2 exits.
|
||||
target_proc.communicate()
|
||||
|
||||
if source_proc.returncode:
|
||||
raise(subprocess.CalledProcessError(source_proc.returncode, source_cmd))
|
||||
if not args.ignore_transfer_errors:
|
||||
if source_proc.returncode:
|
||||
raise(subprocess.CalledProcessError(source_proc.returncode, source_cmd))
|
||||
|
||||
#zfs recv sometimes gives an exitcode 1 while the transfer was successful, therefore we ignore exit 1's and do an extra check to see if the snapshot is there.
|
||||
if target_proc.returncode and target_proc.returncode!=1:
|
||||
raise(subprocess.CalledProcessError(target_proc.returncode, target_cmd))
|
||||
#zfs recv sometimes gives an exitcode 1 while the transfer was successful, therefore we ignore exit 1's and do an extra check to see if the snapshot is there.
|
||||
if target_proc.returncode and target_proc.returncode!=1:
|
||||
raise(subprocess.CalledProcessError(target_proc.returncode, target_cmd))
|
||||
|
||||
debug("Verifying if snapshot exists on target")
|
||||
run(ssh_to=ssh_target, cmd=["zfs", "list", target_filesystem+"@"+second_snapshot ])
|
||||
|
||||
|
||||
|
||||
"""get filesystems that where already backupped to a target. """
|
||||
def zfs_get_backupped_filesystems(ssh_to, backup_name, target_fs):
|
||||
#get all target filesystems that have received or inherited the backup propert, under the target_fs tree
|
||||
ret=run(ssh_to=ssh_to, tab_split=False, cmd=[
|
||||
"zfs", "get", "-r", "-t", "volume,filesystem", "-o", "name", "-s", "received,inherited", "-H", "autobackup:"+backup_name, target_fs
|
||||
#NOTE: unreliable when using with autobackup:bla=child
|
||||
# """get filesystems that where already backupped to a target. """
|
||||
# def zfs_get_backupped_filesystems(ssh_to, backup_name, target_path):
|
||||
# #get all target filesystems that have received or inherited the backup propert, under the target_path tree
|
||||
# ret=run(ssh_to=ssh_to, tab_split=False, valid_exitcodes=[ 0,1 ], cmd=[
|
||||
# "zfs", "get", "-r", "-t", "volume,filesystem", "-o", "name", "-s", "received,inherited", "-H", "autobackup:"+backup_name, target_path
|
||||
# ])
|
||||
#
|
||||
# return(ret)
|
||||
|
||||
"""get existing filesystems """
def zfs_get_existing_filesystems(ssh_to, target_path):
    #get all existing volumes and filesystems under the target_path tree
    ret=run(ssh_to=ssh_to, tab_split=False, valid_exitcodes=[ 0,1 ], cmd=[
        "zfs", "list", "-r", "-t", "volume,filesystem", "-o", "name", "-H", target_path
    ])

    return(ret)
|
||||
|
||||
|
||||
|
||||
"""get filesystems that were once backed up to target but are no longer selected on source
|
||||
|
||||
these are filesystems that are not in the list in target_filesystems.
|
||||
|
||||
this happens when filesystems are destroyed or unselected on the source.
|
||||
"""
|
||||
def get_stale_backupped_filesystems(ssh_to, backup_name, target_fs, target_filesystems):
|
||||
def get_stale_backupped_filesystems(backup_name, target_path, target_filesystems, existing_target_filesystems):
|
||||
|
||||
|
||||
backupped_filesystems=zfs_get_backupped_filesystems(ssh_to=ssh_to, backup_name=backup_name, target_fs=target_fs)
|
||||
|
||||
#determine backupped filesystems that are not in target_filesystems anymore
|
||||
stale_backupped_filesystems=[]
|
||||
for backupped_filesystem in backupped_filesystems:
|
||||
if backupped_filesystem not in target_filesystems:
|
||||
stale_backupped_filesystems.append(backupped_filesystem)
|
||||
for existing_target_filesystem in existing_target_filesystems:
|
||||
if existing_target_filesystem not in target_filesystems:
|
||||
stale_backupped_filesystems.append(existing_target_filesystem)
|
||||
|
||||
return(stale_backupped_filesystems)
|
||||
|
||||
@ -397,11 +446,50 @@ def lstrip_path(path, count):
|
||||
return("/".join(path.split("/")[count:]))
|
||||
|
||||
|
||||
"""get list of filesystems that are unchanged since the latest snapshot in the specified snapshot lists. """
def zfs_get_unchanged_snapshots(ssh_to, snapshots):

    ret=[]
    for ( filesystem, snapshot_list ) in snapshots.items():
        latest_snapshot=snapshot_list[-1]

        cmd=[ "zfs", "get","-H" ,"-ovalue", "written@"+latest_snapshot, filesystem ]

        output=run(ssh_to=ssh_to, tab_split=False, cmd=cmd, valid_exitcodes=[ 0 ])

        if output[0]=="0B" or output[0]=="0":
            ret.append(filesystem)

    return(ret)


"""get filesystems that have not changed since their latest snapshot."""
def zfs_get_unchanged_filesystems(ssh_to, filesystems):

    ret=[]
    cmd=[ "zfs", "get","-H" ,"-oname,value", "written" ]
    cmd.extend(filesystems)
    output=run(ssh_to=ssh_to, tab_split=True, cmd=cmd, valid_exitcodes=[ 0 ])

    for ( filesystem , written ) in output:
        if written=="0B" or written=="0":
            ret.append(filesystem)

    return(ret)
|
||||
|
||||
|
||||
|
||||
#fugly..
failures=0
#something failed, but we try to continue with the rest
def failed(txt):
    global failures
    failures=failures+1
    error("FAILURE: "+txt+"\n")
|
||||
|
||||
|
||||
def zfs_autobackup():
|
||||
|
||||
|
||||
|
||||
############## data gathering section
|
||||
|
||||
if args.test:
|
||||
@ -411,55 +499,96 @@ def zfs_autobackup():
|
||||
|
||||
### getting and determining source/target filesystems
|
||||
|
||||
# get selected filesystem on backup source
|
||||
# get selected filesystems on backup source
|
||||
verbose("Getting selected source filesystems for backup {0} on {1}".format(args.backup_name,args.ssh_source))
|
||||
source_filesystems=zfs_get_selected_filesystems(args.ssh_source, args.backup_name)
|
||||
|
||||
#nothing todo
|
||||
if not source_filesystems:
|
||||
error("No filesystems source selected, please do a 'zfs set autobackup:{0}=true' on {1}".format(args.backup_name,args.ssh_source))
|
||||
sys.exit(1)
|
||||
abort("No source filesystems selected, please do a 'zfs set autobackup:{0}=true' on {1}".format(args.backup_name,args.ssh_source))
|
||||
|
||||
if args.ignore_replicated:
|
||||
replicated_filesystems=zfs_get_unchanged_filesystems(args.ssh_source, source_filesystems)
|
||||
for replicated_filesystem in replicated_filesystems:
|
||||
if replicated_filesystem in source_filesystems:
|
||||
source_filesystems.remove(replicated_filesystem)
|
||||
verbose("* Already replicated: {}".format(replicated_filesystem))
|
||||
|
||||
if not source_filesystems:
|
||||
verbose("Nothing to do, all filesystems are already replicated.")
|
||||
sys.exit(0)
|
||||
|
||||
# determine target filesystems
|
||||
target_filesystems=[]
|
||||
for source_filesystem in source_filesystems:
|
||||
#append args.target_fs prefix and strip args.strip_path paths from source_filesystem
|
||||
target_filesystems.append(args.target_fs + "/" + lstrip_path(source_filesystem, args.strip_path))
|
||||
#append args.target_path prefix and strip args.strip_path paths from source_filesystem
|
||||
target_filesystems.append(args.target_path + "/" + lstrip_path(source_filesystem, args.strip_path))
|
||||
debug("Wanted target filesystems:\n"+str(pprint.pformat(target_filesystems)))
|
||||
|
||||
# get actual existing target filesystems. (including ones that might not be in the backupset anymore)
|
||||
verbose("Getting existing target filesystems")
|
||||
existing_target_filesystems=zfs_get_existing_filesystems(ssh_to=args.ssh_target, target_path=args.target_path)
|
||||
debug("Existing target filesystems:\n"+str(pprint.pformat(existing_target_filesystems)))
|
||||
common_target_filesystems=list(set(target_filesystems) & set(existing_target_filesystems))
|
||||
debug("Common target filesystems (target filesystems that also exist on source):\n"+str(pprint.pformat(common_target_filesystems)))
|
||||
|
||||
|
||||
### creating snapshots
|
||||
# this is one of the first things we do, so that in case of failures we still have snapshots.
|
||||
|
||||
#create new snapshot?
|
||||
if not args.no_snapshot:
|
||||
new_snapshot_name=args.backup_name+"-"+time.strftime("%Y%m%d%H%M%S")
|
||||
verbose("Creating source snapshot {0} on {1} ".format(new_snapshot_name, args.ssh_source))
|
||||
zfs_create_snapshot(args.ssh_source, source_filesystems, new_snapshot_name)
|
||||
|
||||
|
||||
### get resumable transfers
|
||||
### get resumable transfers from target
|
||||
resumable_target_filesystems={}
|
||||
if args.resume:
|
||||
verbose("Checking for aborted transfers that can be resumed")
|
||||
#Important: use target_filesystems, not existing_target_filesystems (during an initial transfer it is resumable but doesn't exist yet)
|
||||
resumable_target_filesystems=zfs_get_resumable_filesystems(args.ssh_target, target_filesystems)
|
||||
debug("Resumable filesystems: "+str(pprint.pformat(resumable_target_filesystems)))
|
||||
debug("Resumable filesystems:\n"+str(pprint.pformat(resumable_target_filesystems)))
|
||||
|
||||
|
||||
### get all snapshots of all selected filesystems on both source and target
|
||||
### get existing target snapshots
|
||||
target_snapshots={}
|
||||
if common_target_filesystems:
|
||||
verbose("Getting target snapshot-list from {0}".format(args.ssh_target))
|
||||
target_snapshots=zfs_get_snapshots(args.ssh_target, common_target_filesystems, args.backup_name)
|
||||
# except subprocess.CalledProcessError:
|
||||
# verbose("(ignoring errors, probably initial backup for this filesystem)")
|
||||
# pass
|
||||
debug("Target snapshots:\n" + str(pprint.pformat(target_snapshots)))
|
||||
|
||||
|
||||
### get existing source snapshots
|
||||
verbose("Getting source snapshot-list from {0}".format(args.ssh_source))
|
||||
source_snapshots=zfs_get_snapshots(args.ssh_source, source_filesystems, args.backup_name)
|
||||
debug("Source snapshots: " + str(pprint.pformat(source_snapshots)))
|
||||
debug("Source snapshots:\n" + str(pprint.pformat(source_snapshots)))
|
||||
|
||||
|
||||
### create new snapshots on source
|
||||
if not args.no_snapshot:
|
||||
#determine which filesystems changed since last snapshot
|
||||
if not args.allow_empty and not args.ignore_replicated:
|
||||
#determine which filesystems are unchanged since OUR snapshots. (not since ANY snapshot)
|
||||
unchanged_filesystems=zfs_get_unchanged_snapshots(args.ssh_source, source_snapshots)
|
||||
|
||||
else:
|
||||
unchanged_filesystems=[]
|
||||
|
||||
snapshot_filesystems=[]
|
||||
for source_filesystem in source_filesystems:
|
||||
if source_filesystem not in unchanged_filesystems:
|
||||
snapshot_filesystems.append(source_filesystem)
|
||||
else:
|
||||
verbose("* Not snapshotting {}, no changes found.".format(source_filesystem))
|
||||
|
||||
#create snapshots
|
||||
if snapshot_filesystems:
|
||||
new_snapshot_name=args.backup_name+"-"+time.strftime("%Y%m%d%H%M%S")
|
||||
verbose("Creating source snapshots {0} on {1} ".format(new_snapshot_name, args.ssh_source))
|
||||
zfs_create_snapshot(args.ssh_source, snapshot_filesystems, new_snapshot_name)
|
||||
else:
|
||||
verbose("No changes at all, not creating snapshot.")
|
||||
|
||||
#add it to the list of source filesystems
|
||||
for snapshot_filesystem in snapshot_filesystems:
|
||||
source_snapshots.setdefault(snapshot_filesystem,[]).append(new_snapshot_name)
|
||||
|
||||
|
||||
target_snapshots={}
|
||||
try:
|
||||
verbose("Getting target snapshot-list from {0}".format(args.ssh_target))
|
||||
target_snapshots=zfs_get_snapshots(args.ssh_target, target_filesystems, args.backup_name)
|
||||
except subprocess.CalledProcessError:
|
||||
verbose("(ignoring errors, probably initial backup for this filesystem)")
|
||||
pass
|
||||
debug("Target snapshots: " + str(pprint.pformat(target_snapshots)))
|
||||
|
||||
|
||||
#obsolete snapshots that may be removed
|
||||
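The snapshot step in the hunk above does two things: it skips filesystems that have no changes, and it names the new snapshot after the backup name plus a timestamp. A minimal sketch of that planning step, with no ZFS calls; `plan_snapshots` and its `now` parameter are hypothetical and exist only to make the example testable:

```python
import time


def plan_snapshots(backup_name, source_filesystems, unchanged_filesystems, now=None):
    """Return (snapshot_name, filesystems_to_snapshot)."""
    if now is None:
        now = time.localtime()
    # same naming scheme as the diff: <backup_name>-YYYYMMDDHHMMSS
    snapshot_name = backup_name + "-" + time.strftime("%Y%m%d%H%M%S", now)
    unchanged = set(unchanged_filesystems)
    to_snapshot = [fs for fs in source_filesystems if fs not in unchanged]
    return snapshot_name, to_snapshot


# Fixed timestamp and made-up dataset names, for illustration only.
fixed = time.strptime("2019-03-01 12:30:45", "%Y-%m-%d %H:%M:%S")
name, selected = plan_snapshots("offsite", ["rpool/a", "rpool/b"], ["rpool/b"], now=fixed)
print(name, selected)
```

If the returned list is empty, the caller can skip snapshot creation entirely, which matches the "No changes at all" branch in the diff.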
@ -472,166 +601,204 @@ def zfs_autobackup():
|
||||
|
||||
#determine which snapshots to send for each filesystem
|
||||
for source_filesystem in source_filesystems:
|
||||
target_filesystem=args.target_fs + "/" + lstrip_path(source_filesystem, args.strip_path)
|
||||
try:
|
||||
target_filesystem=args.target_path + "/" + lstrip_path(source_filesystem, args.strip_path)
|
||||
|
||||
if source_filesystem not in source_snapshots:
|
||||
#this happens if you use --no-snapshot and there are new filesystems without snapshots
|
||||
verbose("Skipping source filesystem {0}, no snapshots found".format(source_filesystem))
|
||||
else:
|
||||
|
||||
#incremental or initial send?
|
||||
if target_filesystem in target_snapshots and target_snapshots[target_filesystem]:
|
||||
#incremental mode, determine what to send and what is obsolete
|
||||
|
||||
#latest successfully sent snapshot, should be common on both source and target
|
||||
latest_target_snapshot=target_snapshots[target_filesystem][-1]
|
||||
|
||||
if latest_target_snapshot not in source_snapshots[source_filesystem]:
|
||||
#cant find latest target anymore. find first common snapshot and inform user
|
||||
error="Cant find latest target snapshot on source, did you destroy it accidently? "+source_filesystem+"@"+latest_target_snapshot
|
||||
for latest_target_snapshot in reversed(target_snapshots[target_filesystem]):
|
||||
if latest_target_snapshot in source_snapshots[source_filesystem]:
|
||||
error=error+"\nYou could solve this by rolling back to: "+target_filesystem+"@"+latest_target_snapshot;
|
||||
break
|
||||
|
||||
raise(Exception(error))
|
||||
|
||||
#send all new source snapshots that come AFTER the last target snapshot
|
||||
latest_source_index=source_snapshots[source_filesystem].index(latest_target_snapshot)
|
||||
send_snapshots=source_snapshots[source_filesystem][latest_source_index+1:]
|
||||
|
||||
#source snapshots that come BEFORE last target snapshot are obsolete
|
||||
source_obsolete_snapshots[source_filesystem]=source_snapshots[source_filesystem][0:latest_source_index]
|
||||
|
||||
#target snapshots that come BEFORE last target snapshot are obsolete
|
||||
latest_target_index=target_snapshots[target_filesystem].index(latest_target_snapshot)
|
||||
target_obsolete_snapshots[target_filesystem]=target_snapshots[target_filesystem][0:latest_target_index]
|
||||
if source_filesystem not in source_snapshots:
|
||||
#this happens if you use --no-snapshot and there are new filesystems without snapshots
|
||||
verbose("* Skipping source filesystem {0}, no snapshots found".format(source_filesystem))
|
||||
else:
|
||||
#initial mode, send all snapshots, nothing is obsolete:
|
||||
latest_target_snapshot=None
|
||||
send_snapshots=source_snapshots[source_filesystem]
|
||||
target_obsolete_snapshots[target_filesystem]=[]
|
||||
source_obsolete_snapshots[source_filesystem]=[]
|
||||
|
||||
#now actually send the snapshots
|
||||
if not args.no_send:
|
||||
#incremental or initial send?
|
||||
if target_filesystem in target_snapshots and target_snapshots[target_filesystem]:
|
||||
#incremental mode, determine what to send and what is obsolete
|
||||
|
||||
if send_snapshots and args.rollback and latest_target_snapshot:
|
||||
#roll back any changes on target
|
||||
debug("Rolling back target to latest snapshot.")
|
||||
run(ssh_to=args.ssh_target, test=args.test, cmd=["zfs", "rollback", target_filesystem+"@"+latest_target_snapshot ])
|
||||
#latest successfully sent snapshot, should be common on both source and target
|
||||
latest_target_snapshot=target_snapshots[target_filesystem][-1]
|
||||
|
||||
if latest_target_snapshot not in source_snapshots[source_filesystem]:
|
||||
#cant find latest target anymore. find first common snapshot and inform user
|
||||
error_msg="Cant find latest target snapshot on source for '{}', did you destroy/rename it?".format(source_filesystem)
|
||||
error_msg=error_msg+"\nLatest on target : "+target_filesystem+"@"+latest_target_snapshot
|
||||
error_msg=error_msg+"\nMissing on source: "+source_filesystem+"@"+latest_target_snapshot
|
||||
found=False
|
||||
for latest_target_snapshot in reversed(target_snapshots[target_filesystem]):
|
||||
if latest_target_snapshot in source_snapshots[source_filesystem]:
|
||||
error_msg=error_msg+"\nYou could solve this by rolling back to this common snapshot on target: "+target_filesystem+"@"+latest_target_snapshot
|
||||
found=True
|
||||
break
|
||||
if not found:
|
||||
error_msg=error_msg+"\nAlso could not find an earlier common snapshot to rollback to."
|
||||
else:
|
||||
if args.ignore_new:
|
||||
verbose("* Skipping source filesystem '{0}', target already has newer snapshots.".format(source_filesystem))
|
||||
continue
|
||||
|
||||
raise(Exception(error_msg))
|
||||
|
||||
#send all new source snapshots that come AFTER the last target snapshot
|
||||
latest_source_index=source_snapshots[source_filesystem].index(latest_target_snapshot)
|
||||
send_snapshots=source_snapshots[source_filesystem][latest_source_index+1:]
|
||||
|
||||
#source snapshots that come BEFORE last target snapshot are obsolete
|
||||
source_obsolete_snapshots[source_filesystem]=source_snapshots[source_filesystem][0:latest_source_index]
|
||||
|
||||
#target snapshots that come BEFORE last target snapshot are obsolete
|
||||
latest_target_index=target_snapshots[target_filesystem].index(latest_target_snapshot)
|
||||
target_obsolete_snapshots[target_filesystem]=target_snapshots[target_filesystem][0:latest_target_index]
|
||||
else:
|
||||
#initial mode, send all snapshots, nothing is obsolete:
|
||||
latest_target_snapshot=None
|
||||
send_snapshots=source_snapshots[source_filesystem]
|
||||
target_obsolete_snapshots[target_filesystem]=[]
|
||||
source_obsolete_snapshots[source_filesystem]=[]
|
||||
|
||||
#now actually send the snapshots
|
||||
if not args.no_send:
|
||||
|
||||
if send_snapshots and args.rollback and latest_target_snapshot:
|
||||
#roll back any changes on target
|
||||
debug("Rolling back target to latest snapshot.")
|
||||
run(ssh_to=args.ssh_target, test=args.test, cmd=["zfs", "rollback", target_filesystem+"@"+latest_target_snapshot ])
|
||||
|
||||
|
||||
for send_snapshot in send_snapshots:
|
||||
for send_snapshot in send_snapshots:
|
||||
|
||||
#resumable?
|
||||
if target_filesystem in resumable_target_filesystems:
|
||||
resume_token=resumable_target_filesystems.pop(target_filesystem)
|
||||
else:
|
||||
resume_token=None
|
||||
#resumable?
|
||||
if target_filesystem in resumable_target_filesystems:
|
||||
resume_token=resumable_target_filesystems.pop(target_filesystem)
|
||||
else:
|
||||
resume_token=None
|
||||
|
||||
zfs_transfer(
|
||||
ssh_source=args.ssh_source, source_filesystem=source_filesystem,
|
||||
first_snapshot=latest_target_snapshot, second_snapshot=send_snapshot,
|
||||
ssh_target=args.ssh_target, target_filesystem=target_filesystem,
|
||||
resume_token=resume_token
|
||||
)
|
||||
#hold the snapshot we're sending on the source
|
||||
if not args.no_holds:
|
||||
zfs_hold_snapshot(ssh_to=args.ssh_source, snapshot=source_filesystem+"@"+send_snapshot)
|
||||
|
||||
zfs_transfer(
|
||||
ssh_source=args.ssh_source, source_filesystem=source_filesystem,
|
||||
first_snapshot=latest_target_snapshot, second_snapshot=send_snapshot,
|
||||
ssh_target=args.ssh_target, target_filesystem=target_filesystem,
|
||||
resume_token=resume_token
|
||||
)
|
||||
|
||||
#hold the snapshot we just sent to the target
|
||||
zfs_hold_snapshot(ssh_to=args.ssh_target, snapshot=target_filesystem+"@"+send_snapshot)
|
||||
|
||||
|
||||
|
||||
#now that we successfully transferred this snapshot, the previous snapshot is obsolete:
|
||||
if latest_target_snapshot:
|
||||
target_obsolete_snapshots[target_filesystem].append(latest_target_snapshot)
|
||||
source_obsolete_snapshots[source_filesystem].append(latest_target_snapshot)
|
||||
#we just received a new filesystem?
|
||||
else:
|
||||
if args.clear_refreservation:
|
||||
debug("Clearing refreservation to save space.")
|
||||
#now that we successfully transferred this snapshot, the previous snapshot is obsolete:
|
||||
if latest_target_snapshot:
|
||||
zfs_release_snapshot(ssh_to=args.ssh_target, snapshot=target_filesystem+"@"+latest_target_snapshot)
|
||||
target_obsolete_snapshots[target_filesystem].append(latest_target_snapshot)
|
||||
|
||||
run(ssh_to=args.ssh_target, test=args.test, cmd=["zfs", "set", "refreservation=none", target_filesystem ])
|
||||
if not args.no_holds:
|
||||
zfs_release_snapshot(ssh_to=args.ssh_source, snapshot=source_filesystem+"@"+latest_target_snapshot)
|
||||
source_obsolete_snapshots[source_filesystem].append(latest_target_snapshot)
|
||||
#we just received a new filesystem?
|
||||
else:
|
||||
if args.clear_refreservation:
|
||||
debug("Clearing refreservation to save space.")
|
||||
|
||||
run(ssh_to=args.ssh_target, test=args.test, cmd=["zfs", "set", "refreservation=none", target_filesystem ])
|
||||
|
||||
|
||||
if args.clear_mountpoint:
|
||||
debug("Setting canmount=noauto to prevent auto-mounting in the wrong place. (ignoring errors)")
|
||||
if args.clear_mountpoint:
|
||||
debug("Setting canmount=noauto to prevent auto-mounting in the wrong place. (ignoring errors)")
|
||||
|
||||
run(ssh_to=args.ssh_target, test=args.test, cmd=["zfs", "set", "canmount=noauto", target_filesystem ], valid_exitcodes= [0, 1] )
|
||||
run(ssh_to=args.ssh_target, test=args.test, cmd=["zfs", "set", "canmount=noauto", target_filesystem ], valid_exitcodes= [0, 1] )
|
||||
|
||||
|
||||
latest_target_snapshot=send_snapshot
|
||||
|
||||
latest_target_snapshot=send_snapshot
|
||||
# failed, skip this source_filesystem
|
||||
except Exception as e:
|
||||
failed(str(e))
|
||||
|
||||
|
||||
############## cleanup section
|
||||
#we only do cleanups after everything is complete, to keep everything consistent (same snapshots everywhere)
|
||||
|
||||
|
||||
#find stale backups on target that have become obsolete
|
||||
verbose("Getting stale filesystems and snapshots from {0}".format(args.ssh_target))
|
||||
stale_target_filesystems=get_stale_backupped_filesystems(ssh_to=args.ssh_target, backup_name=args.backup_name, target_fs=args.target_fs, target_filesystems=target_filesystems)
|
||||
debug("Stale target filesystems: {0}".format("\n".join(stale_target_filesystems)))
|
||||
if not args.ignore_replicated:
|
||||
#find stale backups on target that have become obsolete
|
||||
|
||||
stale_target_snapshots=zfs_get_snapshots(args.ssh_target, stale_target_filesystems, args.backup_name)
|
||||
debug("Stale target snapshots: " + str(pprint.pformat(stale_target_snapshots)))
|
||||
target_obsolete_snapshots.update(stale_target_snapshots)
|
||||
stale_target_filesystems=get_stale_backupped_filesystems(backup_name=args.backup_name, target_path=args.target_path, target_filesystems=target_filesystems, existing_target_filesystems=existing_target_filesystems)
|
||||
debug("Stale target filesystems: {0}".format("\n".join(stale_target_filesystems)))
|
||||
|
||||
#determine stale filesystems that have no snapshots left (they can be destroyed)
|
||||
#TODO: prevent destroying filesystems that have underlying filesystems that are still active.
|
||||
stale_target_destroys=[]
|
||||
for stale_target_filesystem in stale_target_filesystems:
|
||||
if stale_target_filesystem not in stale_target_snapshots:
|
||||
stale_target_destroys.append(stale_target_filesystem)
|
||||
stale_target_snapshots=zfs_get_snapshots(args.ssh_target, stale_target_filesystems, args.backup_name)
|
||||
debug("Stale target snapshots: " + str(pprint.pformat(stale_target_snapshots)))
|
||||
target_obsolete_snapshots.update(stale_target_snapshots)
|
||||
|
||||
#determine stale filesystems that have no snapshots left (they can be destroyed)
|
||||
stale_target_destroys=[]
|
||||
for stale_target_filesystem in stale_target_filesystems:
|
||||
if stale_target_filesystem not in stale_target_snapshots:
|
||||
stale_target_destroys.append(stale_target_filesystem)
|
||||
|
||||
if stale_target_destroys:
|
||||
#NOTE: don't destroy automatically; not safe enough.
|
||||
# if args.destroy_stale:
|
||||
# verbose("Destroying stale filesystems on target {0}:\n{1}".format(args.ssh_target, "\n".join(stale_target_destroys)))
|
||||
# zfs_destroy(ssh_to=args.ssh_target, filesystems=stale_target_destroys, recursive=True)
|
||||
# else:
|
||||
verbose("Stale filesystems on {0}:\n{1}".format(args.ssh_target, "\n".join(stale_target_destroys)))
|
||||
else:
|
||||
verbose("NOTE: Cant determine stale target filesystems while using ignore_replicated.")
|
||||
|
||||
if stale_target_destroys:
|
||||
if args.destroy_stale:
|
||||
verbose("Destroying stale filesystems on target {0}:\n{1}".format(args.ssh_target, "\n".join(stale_target_destroys)))
|
||||
zfs_destroy(ssh_to=args.ssh_target, filesystems=stale_target_destroys, recursive=True)
|
||||
else:
|
||||
verbose("Stale filesystems on {0}, use --destroy-stale to destroy:\n{1}".format(args.ssh_target, "\n".join(stale_target_destroys)))
|
||||
|
||||
|
||||
#now actually destroy the old snapshots
|
||||
source_destroys=determine_destroy_list(source_obsolete_snapshots, args.keep_source)
|
||||
if source_destroys:
|
||||
verbose("Destroying old snapshots on source {0}:\n{1}".format(args.ssh_source, "\n".join(source_destroys)))
|
||||
zfs_destroy_snapshots(ssh_to=args.ssh_source, snapshots=source_destroys)
|
||||
try:
|
||||
zfs_destroy_snapshots(ssh_to=args.ssh_source, snapshots=source_destroys)
|
||||
except Exception as e:
|
||||
failed(str(e))
|
||||
|
||||
|
||||
target_destroys=determine_destroy_list(target_obsolete_snapshots, args.keep_target)
|
||||
if target_destroys:
|
||||
verbose("Destroying old snapshots on target {0}:\n{1}".format(args.ssh_target, "\n".join(target_destroys)))
|
||||
zfs_destroy_snapshots(ssh_to=args.ssh_target, snapshots=target_destroys)
|
||||
|
||||
|
||||
verbose("All done")
|
||||
|
||||
try:
|
||||
zfs_destroy_snapshots(ssh_to=args.ssh_target, snapshots=target_destroys)
|
||||
except Exception as e:
|
||||
failed(str(e))
|
||||
|
||||
|
||||
################################################################## ENTRY POINT
|
||||
|
||||
# parse arguments
|
||||
import argparse
|
||||
parser = argparse.ArgumentParser(description='ZFS autobackup v2.1')
|
||||
parser = argparse.ArgumentParser(
|
||||
description='ZFS autobackup v2.4',
|
||||
epilog='When a filesystem fails, zfs_backup will continue and report the number of failures at that end. Also the exit code will indicate the number of failures.')
|
||||
parser.add_argument('--ssh-source', default="local", help='Source host to get backup from. (user@hostname) Default %(default)s.')
|
||||
parser.add_argument('--ssh-target', default="local", help='Target host to push backup to. (user@hostname) Default %(default)s.')
|
||||
parser.add_argument('--ssh-cipher', default=None, help='SSH cipher to use (default %(default)s)')
|
||||
parser.add_argument('--keep-source', type=int, default=30, help='Number of days to keep old snapshots on source. Default %(default)s.')
|
||||
parser.add_argument('--keep-target', type=int, default=30, help='Number of days to keep old snapshots on target. Default %(default)s.')
|
||||
parser.add_argument('backup_name', help='Name of the backup (you should set the zfs property "autobackup:backup-name" to true on filesystems you want to backup')
|
||||
parser.add_argument('target_fs', help='Target filesystem')
|
||||
parser.add_argument('target_path', help='Target path')
|
||||
|
||||
parser.add_argument('--no-snapshot', action='store_true', help='dont create new snapshot (usefull for finishing uncompleted backups, or cleanups)')
|
||||
parser.add_argument('--no-send', action='store_true', help='dont send snapshots (usefull to only do a cleanup)')
|
||||
parser.add_argument('--resume', action='store_true', help='support resuming of interrupted transfers by using the zfs extensible_dataset feature (both zpools should have it enabled)')
|
||||
parser.add_argument('--allow-empty', action='store_true', help='if nothing has changed, still create empty snapshots.')
|
||||
parser.add_argument('--ignore-replicated', action='store_true', help='Ignore datasets that seem to be replicated some other way. (No changes since lastest snapshot. Usefull for proxmox HA replication)')
|
||||
parser.add_argument('--no-holds', action='store_true', help='Dont lock snapshots on the source. (Usefull to allow proxmox HA replication to switches nodes)')
|
||||
parser.add_argument('--ignore-new', action='store_true', help='Ignore filesystem if there are already newer snapshots for it on the target (use with caution)')
|
||||
|
||||
parser.add_argument('--resume', action='store_true', help='support resuming of interrupted transfers by using the zfs extensible_dataset feature (both zpools should have it enabled) Disadvantage is that you need to use zfs recv -A if another snapshot is created on the target during a receive. Otherwise it will keep failing.')
|
||||
parser.add_argument('--strip-path', default=0, type=int, help='number of directory to strip from path (use 1 when cloning zones between 2 SmartOS machines)')
|
||||
parser.add_argument('--buffer', default="", help='Use mbuffer with specified size to speedup zfs transfer. (e.g. --buffer 1G) Will also show nice progress output.')
|
||||
|
||||
|
||||
parser.add_argument('--destroy-stale', action='store_true', help='Destroy stale backups that have no more snapshots. Be sure to verify the output before using this! ')
|
||||
# parser.add_argument('--destroy-stale', action='store_true', help='Destroy stale backups that have no more snapshots. Be sure to verify the output before using this! ')
|
||||
parser.add_argument('--clear-refreservation', action='store_true', help='Set refreservation property to none for new filesystems. Usefull when backupping SmartOS volumes. (recommended)')
|
||||
parser.add_argument('--clear-mountpoint', action='store_true', help='Sets canmount=noauto property, to prevent the received filesystem from mounting over existing filesystems. (recommended)')
|
||||
parser.add_argument('--rollback', action='store_true', help='Rollback changes on the target before starting a backup. (normally you can prevent changes by setting the readonly property on the target_fs to on)')
|
||||
parser.add_argument('--filter-properties', action='append', help='Filter properties when receiving filesystems. Can be specified multiple times. (Example: If you send data from Linux to FreeNAS, you should filter xattr)')
|
||||
parser.add_argument('--rollback', action='store_true', help='Rollback changes on the target before starting a backup. (normally you can prevent changes by setting the readonly property on the target_path to on)')
|
||||
parser.add_argument('--ignore-transfer-errors', action='store_true', help='Ignore transfer errors (still checks if received filesystem exists. usefull for acltype errors)')
|
||||
|
||||
|
||||
parser.add_argument('--compress', action='store_true', help='use compression during zfs send/recv')
|
||||
parser.add_argument('--test', action='store_true', help='dont change anything, just show what would be done (still does all read-only operations)')
|
||||
parser.add_argument('--verbose', action='store_true', help='verbose output')
|
||||
parser.add_argument('--debug', action='store_true', help='debug output (shows commands that are executed)')
|
||||
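The incremental branch in the hunk above picks the latest target snapshot as the common base, sends every source snapshot after it, and marks everything before it as obsolete on both sides. The sketch below isolates that list arithmetic for a single filesystem. `plan_incremental` is a hypothetical helper, and it raises a plain `ValueError` where the script builds a longer error message:

```python
def plan_incremental(source_snapshots, target_snapshots):
    """Return (send, source_obsolete, target_obsolete) for one filesystem.

    Both lists are assumed to be ordered from oldest to newest.
    """
    if not target_snapshots:
        # initial mode: send everything, nothing is obsolete
        return list(source_snapshots), [], []

    latest = target_snapshots[-1]
    if latest not in source_snapshots:
        raise ValueError("latest target snapshot %s is missing on source" % latest)

    source_index = source_snapshots.index(latest)
    target_index = target_snapshots.index(latest)
    send = source_snapshots[source_index + 1:]        # snapshots AFTER the common one
    source_obsolete = source_snapshots[:source_index]  # snapshots BEFORE it on source
    target_obsolete = target_snapshots[:target_index]  # snapshots BEFORE it on target
    return send, source_obsolete, target_obsolete


# Made-up snapshot names, for illustration only.
source = ["b-1", "b-2", "b-3", "b-4"]
target = ["b-1", "b-2"]
print(plan_incremental(source, target))
```

The common snapshot itself is kept out of both obsolete lists until a newer one has been transferred, which is what the send loop in the diff does afterwards.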
@ -639,5 +806,23 @@ parser.add_argument('--debug', action='store_true', help='debug output (shows co
|
||||
#note args is the only global variable we use, since its a global readonly setting anyway
|
||||
args = parser.parse_args()
|
||||
|
||||
if args.ignore_replicated and args.allow_empty:
|
||||
abort("Cannot use allow_empty with ignore_replicated.")
|
||||
|
||||
zfs_autobackup()
|
||||
|
||||
try:
|
||||
zfs_autobackup()
|
||||
if not failures:
|
||||
verbose("All operations completed succesfully.")
|
||||
sys.exit(0)
|
||||
else:
|
||||
verbose("{} OPERATION(S) FAILED!".format(failures))
|
||||
#exit with the number of failures.
|
||||
sys.exit(min(255,failures))
|
||||
|
||||
except Exception as e:
|
||||
if args.debug:
|
||||
raise
|
||||
else:
|
||||
print(str(e))
|
||||
abort("FATAL ERROR")
|
||||
|
||||
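The new entry point exits with the number of failures, capped at 255. The cap matters because on POSIX systems only the low 8 bits of the exit status reach the parent process, so an uncapped count of 256 would be seen as 0, i.e. success. A small sketch of the mapping; `exit_status` is a hypothetical name:

```python
def exit_status(failures):
    """Map a failure count to a process exit status, capped at 255."""
    return min(255, failures)


for count in (0, 3, 256, 1000):
    print(count, "->", exit_status(count))
```

A wrapper script can then treat any non-zero status as "at least one filesystem failed" without having to parse the log output.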