cleaned up and improved select-code

restore progress. verify --destroy-incompatible output.
nicer help
2021-03-16 23:40:31 +01:00 · 2021-03-11 11:57:51 +01:00 · 2021-03-03 15:48:44 +01:00 · 2021-03-03 11:50:11 +01:00 · 2021-03-01 00:04:10 +01:00 · 2021-02-27 21:36:03 +01:00
10 changed files with 457 additions and 251 deletions
--- a/README.md
+++ b/README.md
@ -148,6 +148,8 @@ rpool/swap                              autobackup:offsite1  true
 ...
 ```

+ZFS properties are ```inherited``` by child datasets. Since we've set the property on the highest dataset, we're essentially backing up the whole pool.
+
 Because we don't want to backup everything, we can exclude certain filesystem by setting the property to false:

 ```console
@ -163,6 +165,13 @@ rpool/swap                              autobackup:offsite1  false
 ...
 ```

+The autobackup-property can have 3 values:
+ * ```true```: Backup the dataset and all its children 
+ * ```false```: Don't backup the dataset and all its children. (used to exclude certain datasets)
+ * ```child```: Only backup the children of the dataset, not the dataset itself.
+
+Only use the zfs-command to set these properties, not the zpool command. 
+
 ### Running zfs-autobackup

 Run the script on the backup server and pull the data from the server specified by --ssh-source.
@ -654,10 +663,10 @@ for HOST in $HOSTS; do
 done
 ```

-This script will also send the backup status to Zabbix. (if you've installed my zabbix-job-status script)
+This script will also send the backup status to Zabbix. (if you've installed my zabbix-job-status script https://github.com/psy0rz/stuff/tree/master/zabbix-jobs)

 # Sponsor list

 This project was sponsorred by:

-* (None so far)
+* JetBrains (Provided me with a license for their whole professional product line, https://www.jetbrains.com/pycharm/ )
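The three `autobackup` values added to the README above can be illustrated with a small sketch. This is not the project's actual selection code: the `local_values` mapping and both helper functions are hypothetical, and the sketch assumes that `child` selects only datasets that inherit the value, not the dataset where it is set locally.

```python
def effective_property(dataset, local_values):
    """Return (value, is_local) for a dataset.

    Walks up the dataset path until an ancestor with a locally set
    value is found, mimicking ZFS property inheritance.
    """
    path = dataset
    while True:
        if path in local_values:
            return local_values[path], path == dataset
        if "/" not in path:
            return None, False
        path = path.rsplit("/", 1)[0]


def is_selected(dataset, local_values):
    """Decide whether a dataset would be backed up (hypothetical rule)."""
    value, is_local = effective_property(dataset, local_values)
    if value == "true":
        return True
    if value == "child":
        # only datasets that inherit "child" are selected
        return not is_local
    return False


# hypothetical pool layout: values set locally with `zfs set`
local_values = {"rpool": "true", "rpool/swap": "false", "rpool/data": "child"}
datasets = ["rpool", "rpool/swap", "rpool/data", "rpool/data/vm1", "rpool/ROOT"]
selected = [d for d in datasets if is_selected(d, local_values)]
```

With this layout, `rpool/swap` is excluded explicitly, `rpool/data` is skipped while `rpool/data/vm1` is kept, and `rpool/ROOT` is selected through inheritance from `rpool`.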
--- a/tests/test_destroymissing.py
+++ b/tests/test_destroymissing.py
@ -96,7 +96,7 @@ class TestZfsNode(unittest2.TestCase):
                #now tries to destroy our own last snapshot (before the final destroy of the dataset)
                self.assertIn("fs1@test-20101111000000: Destroying", buf.getvalue())
                #but cant finish because still in use:
-                self.assertIn("fs1: Error during destoy missing", buf.getvalue())
+                self.assertIn("fs1: Error during --destroy-missing", buf.getvalue())

            shelltest("zfs destroy test_target1/clone1")

--- a/tests/test_externalfailures.py
+++ b/tests/test_externalfailures.py
@ -1,7 +1,7 @@
 from basetest import *


-class TestZfsNode(unittest2.TestCase):
+class TestExternalFailures(unittest2.TestCase):

    def setUp(self):
        prepare_zpools()
@ -259,8 +259,28 @@ test_target1/test_source2/fs2/sub@test-20101111000002
        with patch('time.strftime', return_value="20101111000001"):
            self.assertTrue(ZfsAutobackup("test test_target1 --verbose --allow-empty".split(" ")).run())

-    ############# TODO:
+    #UPDATE: of course the one thing that wasn't tested had a bug :(  (in ExecuteNode.run()).
    def test_ignoretransfererrors(self):

-        self.skipTest(
-            "todo: create some kind of situation where zfs recv exits with an error but transfer is still ok (happens in practice with acltype)")
+            self.skipTest("Not sure how to implement a test for this without some serious hacking and patching.")
+
+#         #recreate target pool without any features
+#         # shelltest("zfs set compress=on test_source1; zpool destroy test_target1; zpool create test_target1 -o feature@project_quota=disabled /dev/ram2")
+#
+#         with patch('time.strftime', return_value="20101111000000"):
+#             self.assertFalse(ZfsAutobackup("test test_target1 --verbose --allow-empty --no-progress".split(" ")).run())
+#
+#         r = shelltest("zfs list -H -o name -r -t all test_target1")
+#
+#         self.assertMultiLineEqual(r, """
+# test_target1
+# test_target1/test_source1
+# test_target1/test_source1/fs1
+# test_target1/test_source1/fs1@test-20101111000002
+# test_target1/test_source1/fs1/sub
+# test_target1/test_source1/fs1/sub@test-20101111000002
+# test_target1/test_source2
+# test_target1/test_source2/fs2
+# test_target1/test_source2/fs2/sub
+# test_target1/test_source2/fs2/sub@test-20101111000002
+#         """)
--- a/tests/test_zfsautobackup.py
+++ b/tests/test_zfsautobackup.py
@ -541,6 +541,30 @@ test_target1/test_source2/fs2/sub@test-20101111000000  canmount  -         -
            #should succeed by destroying incompatibles
            self.assertFalse(ZfsAutobackup("test test_target1 --verbose --allow-empty --destroy-incompatible".split(" ")).run())

+        r = shelltest("zfs list -H -o name -r -t all test_target1")
+        self.assertMultiLineEqual(r, """
+test_target1
+test_target1/test_source1
+test_target1/test_source1/fs1
+test_target1/test_source1/fs1@test-20101111000000
+test_target1/test_source1/fs1@compatible1
+test_target1/test_source1/fs1@compatible2
+test_target1/test_source1/fs1@test-20101111000001
+test_target1/test_source1/fs1@test-20101111000002
+test_target1/test_source1/fs1@test-20101111000003
+test_target1/test_source1/fs1/sub
+test_target1/test_source1/fs1/sub@test-20101111000000
+test_target1/test_source1/fs1/sub@test-20101111000001
+test_target1/test_source1/fs1/sub@test-20101111000002
+test_target1/test_source1/fs1/sub@test-20101111000003
+test_target1/test_source2
+test_target1/test_source2/fs2
+test_target1/test_source2/fs2/sub
+test_target1/test_source2/fs2/sub@test-20101111000000
+test_target1/test_source2/fs2/sub@test-20101111000001
+test_target1/test_source2/fs2/sub@test-20101111000002
+test_target1/test_source2/fs2/sub@test-20101111000003
+""")



--- a/tests/test_zfsautobackup31.py
+++ b/tests/test_zfsautobackup31.py
@ -0,0 +1,49 @@
+from basetest import *
+import time
+
+class TestZfsAutobackup31(unittest2.TestCase):
+
+    def setUp(self):
+        prepare_zpools()
+        self.longMessage=True
+
+    def test_no_thinning(self):
+
+        with patch('time.strftime', return_value="20101111000000"):
+            self.assertFalse(ZfsAutobackup("test test_target1 --verbose --allow-empty".split(" ")).run())
+
+        with patch('time.strftime', return_value="20101111000001"):
+            self.assertFalse(ZfsAutobackup("test test_target1 --verbose --allow-empty --keep-target=0 --keep-source=0 --no-thinning".split(" ")).run())
+
+            r=shelltest("zfs list -H -o name -r -t all "+TEST_POOLS)
+            self.assertMultiLineEqual(r,"""
+test_source1
+test_source1/fs1
+test_source1/fs1@test-20101111000000
+test_source1/fs1@test-20101111000001
+test_source1/fs1/sub
+test_source1/fs1/sub@test-20101111000000
+test_source1/fs1/sub@test-20101111000001
+test_source2
+test_source2/fs2
+test_source2/fs2/sub
+test_source2/fs2/sub@test-20101111000000
+test_source2/fs2/sub@test-20101111000001
+test_source2/fs3
+test_source2/fs3/sub
+test_target1
+test_target1/test_source1
+test_target1/test_source1/fs1
+test_target1/test_source1/fs1@test-20101111000000
+test_target1/test_source1/fs1@test-20101111000001
+test_target1/test_source1/fs1/sub
+test_target1/test_source1/fs1/sub@test-20101111000000
+test_target1/test_source1/fs1/sub@test-20101111000001
+test_target1/test_source2
+test_target1/test_source2/fs2
+test_target1/test_source2/fs2/sub
+test_target1/test_source2/fs2/sub@test-20101111000000
+test_target1/test_source2/fs2/sub@test-20101111000001
+""")
+
+
--- a/tests/test_zfsnode.py
+++ b/tests/test_zfsnode.py
@ -115,7 +115,7 @@ test_target1
    def test_supportedrecvoptions(self):
        logger=LogStub()
        description="[Source]"
-        #NOTE: this couldnt hang via ssh if we dont close filehandles properly. (which was a previous bug)
+        #NOTE: this could hang via ssh if we dont close filehandles properly. (which was a previous bug)
        node=ZfsNode("test", logger, description=description, ssh_to='localhost')
        self.assertIsInstance(node.supported_recv_options, list)

--- a/zfs_autobackup/ExecuteNode.py
+++ b/zfs_autobackup/ExecuteNode.py
@ -4,6 +4,7 @@ import subprocess

 from zfs_autobackup.LogStub import LogStub

+
 class ExecuteNode(LogStub):
    """an endpoint to execute local or remote commands via ssh"""

@ -44,22 +45,13 @@ class ExecuteNode(LogStub):
        else:
            self.error("STDERR|> " + line.rstrip())

-    def run(self, cmd, inp=None, tab_split=False, valid_exitcodes=None, readonly=False, hide_errors=False, pipe=False,
-            return_stderr=False):
-        """run a command on the node cmd: the actual command, should be a list, where the first item is the command
-        and the rest are parameters. input: Can be None, a string or a pipe-handle you got from another run()
-        tab_split: split tabbed files in output into a list valid_exitcodes: list of valid exit codes for this
-        command (checks exit code of both sides of a pipe) readonly: make this True if the command doesn't make any
-        changes and is safe to execute in testmode hide_errors: don't show stderr output as error, instead show it as
-        debugging output (use to hide expected errors) pipe: Instead of executing, return a pipe-handle to be used to
-        input to another run() command. (just like a | in linux) return_stderr: return both stdout and stderr as a
-        tuple. (only returns stderr from this side of the pipe)
-        """
+    def _encode_cmd(self, cmd):
+        """returns cmd in encoded and escaped form that can be used with popen."""

-        if not valid_exitcodes:
-            valid_exitcodes = [0]
+        encoded_cmd=[]

-        encoded_cmd = []
+        # make sure the command gets all the data in utf8 format:
+        # (this is necessary if LC_ALL=en_US.utf8 is not set in the environment)

        # use ssh?
        if self.ssh_to is not None:
@ -70,8 +62,6 @@ class ExecuteNode(LogStub):

            encoded_cmd.append(self.ssh_to.encode('utf-8'))

-            # make sure the command gets all the data in utf8 format:
-            # (this is necessary if LC_ALL=en_US.utf8 is not set in the environment)
            for arg in cmd:
                # add single quotes for remote commands to support spaces and other weird stuff (remote commands are
                # executed in a shell) and escape existing single quotes (bash needs ' to end the quoted string,
@ -83,6 +73,31 @@ class ExecuteNode(LogStub):
            for arg in cmd:
                encoded_cmd.append(arg.encode('utf-8'))

+        return encoded_cmd
+
+    def run(self, cmd, inp=None, tab_split=False, valid_exitcodes=None, readonly=False, hide_errors=False, pipe=False,
+            return_stderr=False):
+        """run a command on the node.
+
+        :param cmd: the actual command, should be a list, where the first item is the command
+                    and the rest are parameters.
+        :param inp: Can be None, a string or a pipe-handle you got from another run()
+        :param tab_split: split tabbed files in output into a list
+        :param valid_exitcodes: list of valid exit codes for this command (checks exit code of both sides of a pipe)
+                                Use [] to accept all exit codes.
+        :param readonly: make this True if the command doesn't make any changes and is safe to execute in testmode
+        :param hide_errors: don't show stderr output as error, instead show it as debugging output (use to hide expected errors)
+        :param pipe: Instead of executing, return a pipe-handle to be used to
+                     input to another run() command. (just like a | in linux)
+        :param return_stderr: return both stdout and stderr as a tuple. (normally only returns stdout)
+
+        """
+
+        if valid_exitcodes is None:
+            valid_exitcodes = [0]
+
+        encoded_cmd = self._encode_cmd(cmd)
+
        # debug and test stuff
        debug_txt = ""
        for c in encoded_cmd:
@ -196,4 +211,4 @@ class ExecuteNode(LogStub):
        if return_stderr:
            return output_lines, error_lines
        else:
-            return output_lines
+            return output_lines
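The switch from `if not valid_exitcodes` to `if valid_exitcodes is None` matters because an empty list is falsy in Python, so the old check silently replaced `[]` with `[0]`. The sketch below isolates that difference. `normalize_old`, `normalize_new` and `exit_code_ok` are hypothetical helpers, and the accept-all check is an assumption based on the new docstring ("Use [] to accept all exit codes"), not code shown in this diff.

```python
def normalize_old(valid_exitcodes=None):
    """Old behaviour: any falsy value, including [], becomes [0]."""
    if not valid_exitcodes:
        valid_exitcodes = [0]
    return valid_exitcodes


def normalize_new(valid_exitcodes=None):
    """New behaviour: only a missing argument becomes [0]."""
    if valid_exitcodes is None:
        valid_exitcodes = [0]
    return valid_exitcodes


def exit_code_ok(exit_code, valid_exitcodes):
    """Assumed check: an empty list accepts every exit code."""
    return not valid_exitcodes or exit_code in valid_exitcodes


# a caller that wants to ignore the exit code passes []
old_result = exit_code_ok(1, normalize_old([]))
new_result = exit_code_ok(1, normalize_new([]))
```

Under these assumptions the old normalization rejects exit code 1 even though the caller asked to accept everything, while the new one honours the empty list.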
--- a/zfs_autobackup/ZfsAutobackup.py
+++ b/zfs_autobackup/ZfsAutobackup.py
@ -12,7 +12,7 @@ from zfs_autobackup.ThinnerRule import ThinnerRule
 class ZfsAutobackup:
    """main class"""

-    VERSION = "3.0.1-beta7"
+    VERSION = "3.1-beta2"
    HEADER = "zfs-autobackup v{} - Copyright 2020 E.H.Eefting (edwin@datux.nl)".format(VERSION)

    def __init__(self, argv, print_arguments=True):
@ -23,16 +23,15 @@ class ZfsAutobackup:

        parser = argparse.ArgumentParser(
            description=self.HEADER,
-            epilog='When a filesystem fails, zfs_backup will continue and report the number of failures at that end. '
-                   'Also the exit code will indicate the number of failures. Full manual at: https://github.com/psy0rz/zfs_autobackup')
-        parser.add_argument('--ssh-config', default=None, help='Custom ssh client config')
-        parser.add_argument('--ssh-source', default=None,
-                            help='Source host to get backup from. (user@hostname) Default %(default)s.')
-        parser.add_argument('--ssh-target', default=None,
-                            help='Target host to push backup to. (user@hostname) Default  %(default)s.')
-        parser.add_argument('--keep-source', type=str, default="10,1d1w,1w1m,1m1y",
+            epilog='Full manual at: https://github.com/psy0rz/zfs_autobackup')
+        parser.add_argument('--ssh-config', metavar='CONFIG-FILE', default=None, help='Custom ssh client config')
+        parser.add_argument('--ssh-source', metavar='USER@HOST', default=None,
+                            help='Source host to get backup from.')
+        parser.add_argument('--ssh-target', metavar='USER@HOST', default=None,
+                            help='Target host to push backup to.')
+        parser.add_argument('--keep-source', metavar='SCHEDULE', type=str, default="10,1d1w,1w1m,1m1y",
                            help='Thinning schedule for old source snapshots. Default: %(default)s')
-        parser.add_argument('--keep-target', type=str, default="10,1d1w,1w1m,1m1y",
+        parser.add_argument('--keep-target', metavar='SCHEDULE', type=str, default="10,1d1w,1w1m,1m1y",
                            help='Thinning schedule for old target snapshots. Default: %(default)s')

        parser.add_argument('backup_name', metavar='backup-name',
@ -48,8 +47,10 @@ class ZfsAutobackup:
                            help='Don\'t create new snapshots (useful for finishing uncompleted backups, or cleanups)')
        parser.add_argument('--no-send', action='store_true',
                            help='Don\'t send snapshots (useful for cleanups, or if you want a serperate send-cronjob)')
-        #        parser.add_argument('--no-thinning', action='store_true', help='Don\'t run the thinner.')
-        parser.add_argument('--min-change', type=int, default=1,
+        parser.add_argument('--no-thinning', action='store_true', help="Do not destroy any snapshots.")
+        parser.add_argument('--no-holds', action='store_true',
+                            help='Don\'t hold snapshots. (Faster. Allows you to destroy common snapshot.)')
+        parser.add_argument('--min-change', metavar='BYTES', type=int, default=1,
                            help='Number of bytes written after which we consider a dataset changed (default %('
                                 'default)s)')
        parser.add_argument('--allow-empty', action='store_true',
@ -57,11 +58,9 @@ class ZfsAutobackup:
        parser.add_argument('--ignore-replicated', action='store_true',
                            help='Ignore datasets that seem to be replicated some other way. (No changes since '
                                 'lastest snapshot. Useful for proxmox HA replication)')
-        parser.add_argument('--no-holds', action='store_true',
-                            help='Don\'t hold snapshots. (Faster)')

        parser.add_argument('--resume', action='store_true', help=argparse.SUPPRESS)
-        parser.add_argument('--strip-path', default=0, type=int,
+        parser.add_argument('--strip-path', metavar='N', default=0, type=int,
                            help='Number of directories to strip from target path (use 1 when cloning zones between 2 '
                                 'SmartOS machines)')
        # parser.add_argument('--buffer', default="",  help='Use mbuffer with specified size to speedup zfs transfer.
@ -73,10 +72,10 @@ class ZfsAutobackup:
        parser.add_argument('--clear-mountpoint', action='store_true',
                            help='Set property canmount=noauto for new datasets. (recommended, prevents mount '
                                 'conflicts. same as --set-properties canmount=noauto)')
-        parser.add_argument('--filter-properties', type=str,
+        parser.add_argument('--filter-properties', metavar='PROPERTY,...', type=str,
                            help='List of properties to "filter" when receiving filesystems. (you can still restore '
                                 'them with zfs inherit -S)')
-        parser.add_argument('--set-properties', type=str,
+        parser.add_argument('--set-properties', metavar='PROPERTY=VALUE,...', type=str,
                            help='List of propererties to override when receiving filesystems. (you can still restore '
                                 'them with zfs inherit -S)')
        parser.add_argument('--rollback', action='store_true',
@ -84,7 +83,7 @@ class ZfsAutobackup:
                                 'prevent changes by setting the readonly property on the target_path to on)')
        parser.add_argument('--destroy-incompatible', action='store_true',
                            help='Destroy incompatible snapshots on target. Use with care! (implies --rollback)')
-        parser.add_argument('--destroy-missing', type=str, default=None,
+        parser.add_argument('--destroy-missing', metavar="SCHEDULE", type=str, default=None,
                            help='Destroy datasets on target that are missing on the source. Specify the time since '
                                 'the last snapshot, e.g: --destroy-missing 30d')
        parser.add_argument('--ignore-transfer-errors', action='store_true',
@ -103,14 +102,21 @@ class ZfsAutobackup:
                            help='Show zfs commands and their output/exit codes. (noisy)')
        parser.add_argument('--progress', action='store_true',
                            help='show zfs progress output. Enabled automaticly on ttys. (use --no-progress to disable)')
-        parser.add_argument('--no-progress', action='store_true', help=argparse.SUPPRESS) #needed to workaround a zfs recv -v bug
+        parser.add_argument('--no-progress', action='store_true', help=argparse.SUPPRESS) # needed to workaround a zfs recv -v bug
+
+        # parser.add_argument('--output-pipe', metavar="COMMAND", default=[], action='append',
+        #                     help='add zfs send output pipe command')
+        #
+        # parser.add_argument('--input-pipe', metavar="COMMAND", default=[], action='append',
+        #                     help='add zfs recv input pipe command')
+

        # note args is the only global variable we use, since its a global readonly setting anyway
        args = parser.parse_args(argv)

        self.args = args

-        #auto enable progress?
+        # auto enable progress?
        if sys.stderr.isatty() and not args.no_progress:
            args.progress = True

@ -148,48 +154,77 @@ class ZfsAutobackup:
        self.log.verbose("")
        self.log.verbose("#### " + title)

-    # sync datasets, or thin-only on both sides
-    # target is needed for this.
-    def sync_datasets(self, source_node, source_datasets):
+    # NOTE: this method also uses self.args. args that need extra processing are passed as function parameters:
+    def thin_missing_targets(self, target_dataset, used_target_datasets):
+        """thin target datasets that are missing on the source."""

-        description = "[Target]"
+        self.debug("Thinning obsolete datasets")

-        self.set_title("Target settings")
+        for dataset in target_dataset.recursive_datasets:
+            try:
+                if dataset not in used_target_datasets:
+                    dataset.debug("Missing on source, thinning")
+                    dataset.thin()

-        target_thinner = Thinner(self.args.keep_target)
-        target_node = ZfsNode(self.args.backup_name, self, ssh_config=self.args.ssh_config, ssh_to=self.args.ssh_target,
-                              readonly=self.args.test, debug_output=self.args.debug_output, description=description,
-                              thinner=target_thinner)
-        target_node.verbose("Receive datasets under: {}".format(self.args.target_path))
+            except Exception as e:
+                dataset.error("Error during thinning of missing datasets ({})".format(str(e)))

-        if self.args.no_send:
-            self.set_title("Thinning source and target")
-        else:
-            self.set_title("Sending and thinning")
+    # NOTE: this method also uses self.args. args that need extra processing are passed as function parameters:
+    def destroy_missing_targets(self, target_dataset, used_target_datasets):
+        """destroy target datasets that are missing on the source and that meet the requirements"""

-        # check if exists, to prevent vague errors
-        target_dataset = ZfsDataset(target_node, self.args.target_path)
-        if not target_dataset.exists:
-            self.error("Target path '{}' does not exist. Please create this dataset first.".format(target_dataset))
-            return 255
+        self.debug("Destroying obsolete datasets")

-        if self.args.filter_properties:
-            filter_properties = self.args.filter_properties.split(",")
-        else:
-            filter_properties = []
+        for dataset in target_dataset.recursive_datasets:
+            try:
+                if dataset not in used_target_datasets:

-        if self.args.set_properties:
-            set_properties = self.args.set_properties.split(",")
-        else:
-            set_properties = []
+                    # cant do anything without our own snapshots
+                    if not dataset.our_snapshots:
+                        if dataset.datasets:
+                            # its not a leaf, just ignore
+                            dataset.debug("Destroy missing: ignoring")
+                        else:
+                            dataset.verbose(
+                                "Destroy missing: has no snapshots made by us. (please destroy manually)")
+                    else:
+                        # past the deadline?
+                        deadline_ttl = ThinnerRule("0s" + self.args.destroy_missing).ttl
+                        now = int(time.time())
+                        if dataset.our_snapshots[-1].timestamp + deadline_ttl > now:
+                            dataset.verbose("Destroy missing: Waiting for deadline.")
+                        else:

-        if self.args.clear_refreservation:
-            filter_properties.append("refreservation")
+                            dataset.debug("Destroy missing: Removing our snapshots.")

-        if self.args.clear_mountpoint:
-            set_properties.append("canmount=noauto")
+                            # remove all our snapshots, except last, to save space in case we fail later on
+                            for snapshot in dataset.our_snapshots[:-1]:
+                                snapshot.destroy(fail_exception=True)
+
+                            # does it have other snapshots?
+                            has_others = False
+                            for snapshot in dataset.snapshots:
+                                if not snapshot.is_ours():
+                                    has_others = True
+                                    break
+
+                            if has_others:
+                                dataset.verbose("Destroy missing: Still in use by other snapshots")
+                            else:
+                                if dataset.datasets:
+                                    dataset.verbose("Destroy missing: Still has children here.")
+                                else:
+                                    dataset.verbose("Destroy missing.")
+                                    dataset.our_snapshots[-1].destroy(fail_exception=True)
+                                    dataset.destroy(fail_exception=True)
+
+            except Exception as e:
+                dataset.error("Error during --destroy-missing: {}".format(str(e)))
+
+    # NOTE: this method also uses self.args. args that need extra processing are passed as function parameters:
+    def sync_datasets(self, source_node, source_datasets, target_node):
+        """Sync datasets, or thin-only on both sides"""

-        # sync datasets
        fail_count = 0
        target_datasets = []
        for source_dataset in source_datasets:
@ -207,86 +242,36 @@ class ZfsAutobackup:
                        and not target_dataset.parent.exists:
                    target_dataset.parent.create_filesystem(parents=True)

-                # determine common zpool features
+                # determine common zpool features (cached, so no problem we call it often)
                source_features = source_node.get_zfs_pool(source_dataset.split_path()[0]).features
                target_features = target_node.get_zfs_pool(target_dataset.split_path()[0]).features
                common_features = source_features and target_features
-                # source_dataset.debug("Common features: {}".format(common_features))

+                # sync the snapshots of this dataset
                source_dataset.sync_snapshots(target_dataset, show_progress=self.args.progress,
-                                              features=common_features, filter_properties=filter_properties,
-                                              set_properties=set_properties,
+                                              features=common_features, filter_properties=self.filter_properties_list(),
+                                              set_properties=self.set_properties_list(),
                                              ignore_recv_exit_code=self.args.ignore_transfer_errors,
                                              holds=not self.args.no_holds, rollback=self.args.rollback,
-                                              raw=self.args.raw, other_snapshots=self.args.other_snapshots,
+                                              raw=self.args.raw, also_other_snapshots=self.args.other_snapshots,
                                              no_send=self.args.no_send,
-                                              destroy_incompatible=self.args.destroy_incompatible)
+                                              destroy_incompatible=self.args.destroy_incompatible,
+                                              no_thinning=self.args.no_thinning)
            except Exception as e:
                fail_count = fail_count + 1
                source_dataset.error("FAILED: " + str(e))
                if self.args.debug:
                    raise

-        # if not self.args.no_thinning:
-        self.thin_missing_targets(ZfsDataset(target_node, self.args.target_path), target_datasets)
+        target_path_dataset=ZfsDataset(target_node, self.args.target_path)
+        if not self.args.no_thinning:
+            self.thin_missing_targets(target_dataset=target_path_dataset, used_target_datasets=target_datasets)
+
+        if self.args.destroy_missing is not None:
+            self.destroy_missing_targets(target_dataset=target_path_dataset, used_target_datasets=target_datasets)

        return fail_count

-    def thin_missing_targets(self, target_dataset, used_target_datasets):
-        """thin/destroy target datasets that are missing on the source."""
-
-        self.debug("Thinning obsolete datasets")
-
-        for dataset in target_dataset.recursive_datasets:
-            try:
-                if dataset not in used_target_datasets:
-                    dataset.debug("Missing on source, thinning")
-                    dataset.thin()
-
-                    # destroy_missing enabled?
-                    if self.args.destroy_missing is not None:
-
-                        # cant do anything without our own snapshots
-                        if not dataset.our_snapshots:
-                            if dataset.datasets:
-                                dataset.debug("Destroy missing: ignoring")
-                            else:
-                                dataset.verbose(
-                                    "Destroy missing: has no snapshots made by us. (please destroy manually)")
-                        else:
-                            # past the deadline?
-                            deadline_ttl = ThinnerRule("0s" + self.args.destroy_missing).ttl
-                            now = int(time.time())
-                            if dataset.our_snapshots[-1].timestamp + deadline_ttl > now:
-                                dataset.verbose("Destroy missing: Waiting for deadline.")
-                            else:
-
-                                dataset.debug("Destroy missing: Removing our snapshots.")
-
-                                # remove all our snaphots, except last, to safe space in case we fail later on
-                                for snapshot in dataset.our_snapshots[:-1]:
-                                    snapshot.destroy(fail_exception=True)
-
-                                # does it have other snapshots?
-                                has_others = False
-                                for snapshot in dataset.snapshots:
-                                    if not snapshot.is_ours():
-                                        has_others = True
-                                        break
-
-                                if has_others:
-                                    dataset.verbose("Destroy missing: Still in use by other snapshots")
-                                else:
-                                    if dataset.datasets:
-                                        dataset.verbose("Destroy missing: Still has children here.")
-                                    else:
-                                        dataset.verbose("Destroy missing.")
-                                        dataset.our_snapshots[-1].destroy(fail_exception=True)
-                                        dataset.destroy(fail_exception=True)
-
-            except Exception as e:
-                dataset.error("Error during destoy missing ({})".format(str(e)))
-
    def thin_source(self, source_datasets):

        self.set_title("Thinning source")
@ -294,6 +279,44 @@ class ZfsAutobackup:
        for source_dataset in source_datasets:
            source_dataset.thin(skip_holds=True)

+    def filter_replicated(self, datasets):
+        if not self.args.ignore_replicated:
+            return datasets
+        else:
+            self.set_title("Filtering already replicated filesystems")
+            ret = []
+            for dataset in datasets:
+                if dataset.is_changed(self.args.min_change):
+                    ret.append(dataset)
+                else:
+                    dataset.verbose("Ignoring, already replicated")
+
+            return ret
+
+    def filter_properties_list(self):
+
+        if self.args.filter_properties:
+            filter_properties = self.args.filter_properties.split(",")
+        else:
+            filter_properties = []
+
+        if self.args.clear_refreservation:
+            filter_properties.append("refreservation")
+
+        return filter_properties
+
+    def set_properties_list(self):
+
+        if self.args.set_properties:
+            set_properties = self.args.set_properties.split(",")
+        else:
+            set_properties = []
+
+        if self.args.clear_mountpoint:
+            set_properties.append("canmount=noauto")
+
+        return set_properties
+
    def run(self):

        try:
@ -323,18 +346,8 @@ class ZfsAutobackup:
                        self.args.backup_name))
                return 255

-            source_datasets = []
-
            # filter out already replicated stuff?
-            if not self.args.ignore_replicated:
-                source_datasets = selected_source_datasets
-            else:
-                self.set_title("Filtering already replicated filesystems")
-                for selected_source_dataset in selected_source_datasets:
-                    if selected_source_dataset.is_changed(self.args.min_change):
-                        source_datasets.append(selected_source_dataset)
-                    else:
-                        selected_source_dataset.verbose("Ignoring, already replicated")
+            source_datasets = self.filter_replicated(selected_source_datasets)

            if not self.args.no_snapshot:
                self.set_title("Snapshotting")
@ -343,9 +356,37 @@ class ZfsAutobackup:

            # if target is specified, we sync the datasets, otherwise we just thin the source. (e.g. snapshot mode)
            if self.args.target_path:
-                fail_count = self.sync_datasets(source_node, source_datasets)
+
+                # create target_node
+                self.set_title("Target settings")
+                target_thinner = Thinner(self.args.keep_target)
+                target_node = ZfsNode(self.args.backup_name, self, ssh_config=self.args.ssh_config,
+                                      ssh_to=self.args.ssh_target,
+                                      readonly=self.args.test, debug_output=self.args.debug_output,
+                                      description="[Target]",
+                                      thinner=target_thinner)
+                target_node.verbose("Receive datasets under: {}".format(self.args.target_path))
+
+                if self.args.no_send:
+                    self.set_title("Thinning source and target")
+                else:
+                    self.set_title("Sending and thinning")
+
+                # check if exists, to prevent vague errors
+                target_dataset = ZfsDataset(target_node, self.args.target_path)
+                if not target_dataset.exists:
+                    raise(Exception(
+                        "Target path '{}' does not exist. Please create this dataset first.".format(target_dataset)))
+
+                # do the actual sync
+                fail_count = self.sync_datasets(
+                    source_node=source_node,
+                    source_datasets=source_datasets,
+                    target_node=target_node)
+
            else:
-                self.thin_source(source_datasets)
+                if not self.args.no_thinning:
+                    self.thin_source(source_datasets)
                fail_count = 0

            if not fail_count:
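
The two new helpers `filter_properties_list` and `set_properties_list` follow the same pattern: split an optional comma-separated option, then append one extra entry when a convenience flag is set. A minimal standalone sketch of that pattern follows. The function `build_property_list` and the sample `Namespace` values are hypothetical and only illustrate the idea; they are not part of zfs-autobackup.

```python
from argparse import Namespace


def build_property_list(csv_value, extra=None):
    """Split a comma-separated option into a list and append an optional extra entry.

    Hypothetical helper: mirrors the shape of filter_properties_list() and
    set_properties_list() in the diff above, but is not part of the project.
    """
    properties = csv_value.split(",") if csv_value else []
    if extra:
        properties.append(extra)
    return properties


# Sample arguments (made up for illustration).
args = Namespace(filter_properties="quota,refquota", clear_refreservation=True,
                 set_properties=None, clear_mountpoint=True)

filter_properties = build_property_list(
    args.filter_properties,
    "refreservation" if args.clear_refreservation else None)
set_properties = build_property_list(
    args.set_properties,
    "canmount=noauto" if args.clear_mountpoint else None)

print(filter_properties)
print(set_properties)
```

Keeping the two methods separate, as the diff does, is arguably more readable at the call site; the shared helper is only one possible way to avoid the duplication.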
--- a/zfs_autobackup/ZfsDataset.py
+++ b/zfs_autobackup/ZfsDataset.py
@ -90,6 +90,35 @@ class ZfsDataset:
        """true if this dataset is a snapshot"""
        return self.name.find("@") != -1

+    def is_selected(self, value, source, inherited, ignore_received):
+        """determine if dataset should be selected for backup (called from ZfsNode)"""
+
+        # sanity checks
+        if source not in ["local", "received", "-"]:
+            # probably a program error in zfs-autobackup or new feature in zfs
+            raise (Exception(
+                "{} autobackup-property has illegal source: '{}' (possible BUG)".format(self.name, source)))
+        if value not in ["false", "true", "child", "-"]:
+            # user error
+            raise (Exception(
+                "{} autobackup-property has illegal value: '{}'".format(self.name, value)))
+
+        # now determine if it's actually selected
+        if value == "false":
+            self.verbose("Ignored (disabled)")
+            return False
+        elif value == "true" or (value == "child" and inherited):
+            if source == "local":
+                self.verbose("Selected")
+                return True
+            elif source == "received":
+                if ignore_received:
+                    self.verbose("Ignored (local backup)")
+                    return False
+                else:
+                    self.verbose("Selected")
+                    return True
+
    @CachedProperty
    def parent(self):
        """get zfs-parent of this dataset. for snapshots this means it will get the filesystem/volume that it belongs
@ -102,10 +131,10 @@ class ZfsDataset:
        else:
            return ZfsDataset(self.zfs_node, self.rstrip_path(1))

-    def find_prev_snapshot(self, snapshot, other_snapshots=False):
+    def find_prev_snapshot(self, snapshot, also_other_snapshots=False):
        """find previous snapshot in this dataset. None if it doesn't exist.

-        other_snapshots: set to true to also return snapshots that where not created by us. (is_ours)
+        also_other_snapshots: set to true to also return snapshots that were not created by us. (see is_ours)
        """

        if self.is_snapshot:
@ -114,11 +143,11 @@ class ZfsDataset:
        index = self.find_snapshot_index(snapshot)
        while index:
            index = index - 1
-            if other_snapshots or self.snapshots[index].is_ours():
+            if also_other_snapshots or self.snapshots[index].is_ours():
                return self.snapshots[index]
        return None

-    def find_next_snapshot(self, snapshot, other_snapshots=False):
+    def find_next_snapshot(self, snapshot, also_other_snapshots=False):
        """find next snapshot in this dataset. None if it doesn't exist"""

        if self.is_snapshot:
@ -127,7 +156,7 @@ class ZfsDataset:
        index = self.find_snapshot_index(snapshot)
        while index is not None and index < len(self.snapshots) - 1:
            index = index + 1
-            if other_snapshots or self.snapshots[index].is_ours():
+            if also_other_snapshots or self.snapshots[index].is_ours():
                return self.snapshots[index]
        return None

@ -277,7 +306,6 @@ class ZfsDataset:
    def snapshots(self):
        """get all snapshots of this dataset"""

-
        if not self.exists:
            return []

@ -431,9 +459,6 @@ class ZfsDataset:

            cmd.append(self.name)

-        # if args.buffer and args.ssh_source!="local":
-        #     cmd.append("|mbuffer -m {}".format(args.buffer))
-
        # NOTE: this doesn't start the send yet, it only returns a subprocess.Pipe
        return self.zfs_node.run(cmd, pipe=True)

@ -489,15 +514,12 @@ class ZfsDataset:
        if self.zfs_node.readonly:
            self.force_exists = True

-        # check if transfer was really ok (exit codes have been wrong before due to bugs in zfs-utils and can be
-        # ignored by some parameters)
+        # check if transfer was really ok (exit codes have been wrong before due to bugs in zfs-utils, and some
+        # errors should be ignored; that's what ignore_exitcodes is for.)
        if not self.exists:
            self.error("error during transfer")
            raise (Exception("Target doesn't exist after transfer, something went wrong."))

-        # if args.buffer and  args.ssh_target!="local":
-        #     cmd.append("|mbuffer -m {}".format(args.buffer))
-
    def transfer_snapshot(self, target_snapshot, features, prev_snapshot=None, show_progress=False,
                          filter_properties=None, set_properties=None, ignore_recv_exit_code=False, resume_token=None,
                          raw=False):
@ -609,21 +631,24 @@ class ZfsDataset:
            target_dataset.error("Cant find common snapshot with source.")
            raise (Exception("You probably need to delete the target dataset to fix this."))

-    def find_start_snapshot(self, common_snapshot, other_snapshots):
-        """finds first snapshot to send"""
+    def find_start_snapshot(self, common_snapshot, also_other_snapshots):
+        """finds first snapshot to send
+        :rtype: ZfsDataset or None if we can't find it.
+        """

        if not common_snapshot:
            if not self.snapshots:
                start_snapshot = None
            else:
-                # start from beginning
+                # no common snapshot, start from beginning
                start_snapshot = self.snapshots[0]

-                if not start_snapshot.is_ours() and not other_snapshots:
+                if not start_snapshot.is_ours() and not also_other_snapshots:
                    # try to start at a snapshot thats ours
-                    start_snapshot = self.find_next_snapshot(start_snapshot, other_snapshots)
+                    start_snapshot = self.find_next_snapshot(start_snapshot, also_other_snapshots)
        else:
-            start_snapshot = self.find_next_snapshot(common_snapshot, other_snapshots)
+            # normal situation: start_snapshot is the one after the common snapshot
+            start_snapshot = self.find_next_snapshot(common_snapshot, also_other_snapshots)

        return start_snapshot

@ -659,50 +684,25 @@ class ZfsDataset:

        return allowed_filter_properties, allowed_set_properties

-    def sync_snapshots(self, target_dataset, features, show_progress=False, filter_properties=None, set_properties=None,
-                       ignore_recv_exit_code=False, holds=True, rollback=False, raw=False, other_snapshots=False,
-                       no_send=False, destroy_incompatible=False):
-        """sync this dataset's snapshots to target_dataset, while also thinning out old snapshots along the way."""
+    def _add_virtual_snapshots(self, source_dataset, source_start_snapshot, also_other_snapshots):
+        """add snapshots from source to our snapshot list. (just the in memory list, no disk operations)"""

-        if set_properties is None:
-            set_properties = []
-        if filter_properties is None:
-            filter_properties = []
-
-        # determine common and start snapshot
-        target_dataset.debug("Determining start snapshot")
-        common_snapshot = self.find_common_snapshot(target_dataset)
-        start_snapshot = self.find_start_snapshot(common_snapshot, other_snapshots)
-        # should be destroyed before attempting zfs recv:
-        incompatible_target_snapshots = target_dataset.find_incompatible_snapshots(common_snapshot)
-
-        # make target snapshot list the same as source, by adding virtual non-existing ones to the list.
-        target_dataset.debug("Creating virtual target snapshots")
-        source_snapshot = start_snapshot
-        while source_snapshot:
-            # create virtual target snapshot
-            virtual_snapshot = ZfsDataset(target_dataset.zfs_node,
-                                          target_dataset.filesystem_name + "@" + source_snapshot.snapshot_name,
+        self.debug("Creating virtual target snapshots")
+        snapshot = source_start_snapshot
+        while snapshot:
+            # create virtual target snapshot
+            # NOTE: with force_exists=False we're telling the dataset it doesn't exist yet (i.e. it's virtual)
+            virtual_snapshot = ZfsDataset(self.zfs_node,
+                                          self.filesystem_name + "@" + snapshot.snapshot_name,
                                          force_exists=False)
-            target_dataset.snapshots.append(virtual_snapshot)
-            source_snapshot = self.find_next_snapshot(source_snapshot, other_snapshots)
+            self.snapshots.append(virtual_snapshot)
+            snapshot = source_dataset.find_next_snapshot(snapshot, also_other_snapshots)

-        # now let thinner decide what we want on both sides as final state (after all transfers are done)
-        if self.our_snapshots:
-            self.debug("Create thinning list")
-            (source_keeps, source_obsoletes) = self.thin_list(keeps=[self.our_snapshots[-1]])
-        else:
-            source_obsoletes = []
+    def _pre_clean(self, common_snapshot, target_dataset, source_obsoletes, target_obsoletes, target_keeps):
+        """cleanup old stuff before starting snapshot syncing"""

-        if target_dataset.our_snapshots:
-            (target_keeps, target_obsoletes) = target_dataset.thin_list(keeps=[target_dataset.our_snapshots[-1]],
-                                                                        ignores=incompatible_target_snapshots)
-        else:
-            target_keeps = []
-            target_obsoletes = []
-
-        # on source: destroy all obsoletes before common. but after common, only delete snapshots that target also
-        # doesn't want to explicitly keep
+        # on source: destroy all obsoletes before common.
+        # But after common, only delete snapshots that target also doesn't want
        before_common = True
        for source_snapshot in self.snapshots:
            if common_snapshot and source_snapshot.snapshot_name == common_snapshot.snapshot_name:
@ -720,12 +720,9 @@ class ZfsDataset:
                if target_snapshot.exists:
                    target_snapshot.destroy()

-        # now actually transfer the snapshots, if we want
-        if no_send:
-            return
+    def _validate_resume_token(self, target_dataset, start_snapshot):
+        """validate and get (or destroy) resume token"""

-        # resume?
-        resume_token = None
        if 'receive_resume_token' in target_dataset.properties:
            resume_token = target_dataset.properties['receive_resume_token']
            # not valid anymore?
@ -733,9 +730,36 @@ class ZfsDataset:
            if not resume_snapshot or start_snapshot.snapshot_name != resume_snapshot.snapshot_name:
                target_dataset.verbose("Cant resume, resume token no longer valid.")
                target_dataset.abort_resume()
-                resume_token = None
+            else:
+                return resume_token
+
+    def _plan_sync(self, target_dataset, also_other_snapshots):
+        """plan where to start syncing and what to sync and what to keep"""
+
+        # determine common and start snapshot
+        target_dataset.debug("Determining start snapshot")
+        common_snapshot = self.find_common_snapshot(target_dataset)
+        start_snapshot = self.find_start_snapshot(common_snapshot, also_other_snapshots)
+        incompatible_target_snapshots = target_dataset.find_incompatible_snapshots(common_snapshot)
+
+        # let thinner decide what's obsolete on source
+        source_obsoletes = []
+        if self.our_snapshots:
+            source_obsoletes = self.thin_list(keeps=[self.our_snapshots[-1]])[1]
+
+        # let thinner decide keeps/obsoletes on target, AFTER the transfer would be done (by using virtual snapshots)
+        target_dataset._add_virtual_snapshots(self, start_snapshot, also_other_snapshots)
+        target_keeps = []
+        target_obsoletes = []
+        if target_dataset.our_snapshots:
+            (target_keeps, target_obsoletes) = target_dataset.thin_list(keeps=[target_dataset.our_snapshots[-1]],
+                                                                        ignores=incompatible_target_snapshots)
+
+        return common_snapshot, start_snapshot, source_obsoletes, target_obsoletes, target_keeps, incompatible_target_snapshots
+
+    def handle_incompatible_snapshots(self, incompatible_target_snapshots, destroy_incompatible):
+        """destroy incompatible snapshots on target before sync, or inform user what to do"""

-        # incompatible target snapshots?
        if incompatible_target_snapshots:
            if not destroy_incompatible:
                for snapshot in incompatible_target_snapshots:
@ -745,7 +769,33 @@ class ZfsDataset:
                for snapshot in incompatible_target_snapshots:
                    snapshot.verbose("Incompatible snapshot")
                    snapshot.destroy()
-                    target_dataset.snapshots.remove(snapshot)
+                    self.snapshots.remove(snapshot)
+
+    def sync_snapshots(self, target_dataset, features, show_progress, filter_properties, set_properties,
+                       ignore_recv_exit_code, holds, rollback, raw, also_other_snapshots,
+                       no_send, destroy_incompatible, no_thinning):
+        """sync this dataset's snapshots to target_dataset, while also thinning out old snapshots along the way."""
+
+        (common_snapshot, start_snapshot, source_obsoletes, target_obsoletes, target_keeps,
+         incompatible_target_snapshots) = \
+            self._plan_sync(target_dataset=target_dataset, also_other_snapshots=also_other_snapshots)
+
+        # NOTE: we do this because we don't want filesystems to fill up when backups keep failing.
+        # Also useful with no_send to still clean up stuff.
+        if not no_thinning:
+            self._pre_clean(
+                common_snapshot=common_snapshot, target_dataset=target_dataset,
+                target_keeps=target_keeps, target_obsoletes=target_obsoletes, source_obsoletes=source_obsoletes)
+
+        # now actually transfer the snapshots, if we want
+        if no_send:
+            return
+
+        # check if we can resume
+        resume_token = self._validate_resume_token(target_dataset, start_snapshot)
+
+        # handle incompatible stuff on target
+        target_dataset.handle_incompatible_snapshots(incompatible_target_snapshots, destroy_incompatible)

        # rollback target to latest?
        if rollback:
@ -780,15 +830,16 @@ class ZfsDataset:
                        prev_source_snapshot.release()
                        target_dataset.find_snapshot(prev_source_snapshot).release()

-                # we may now destroy the previous source snapshot if its obsolete
-                if prev_source_snapshot in source_obsoletes:
-                    prev_source_snapshot.destroy()
+                if not no_thinning:
+                    # we may now destroy the previous source snapshot if its obsolete
+                    if prev_source_snapshot in source_obsoletes:
+                        prev_source_snapshot.destroy()

                    # destroy the previous target snapshot if obsolete (usually this is only the common_snapshot,
                    # the rest was already destroyed or will not be send)
-                prev_target_snapshot = target_dataset.find_snapshot(prev_source_snapshot)
-                if prev_target_snapshot in target_obsoletes:
-                    prev_target_snapshot.destroy()
+                    prev_target_snapshot = target_dataset.find_snapshot(prev_source_snapshot)
+                    if prev_target_snapshot in target_obsoletes:
+                        prev_target_snapshot.destroy()

                prev_source_snapshot = source_snapshot
            else:
@ -799,4 +850,4 @@ class ZfsDataset:
                    target_dataset.abort_resume()
                    resume_token = None

-            source_snapshot = self.find_next_snapshot(source_snapshot, other_snapshots)
+            source_snapshot = self.find_next_snapshot(source_snapshot, also_other_snapshots)
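
The selection rules introduced by `ZfsDataset.is_selected` can be read as a small decision function over four inputs. The sketch below restates them as a standalone function so the cases are easy to check. It is an illustration rather than the project's code: it drops the logging, raises `ValueError` instead of a bare `Exception`, and returns an explicit `False` where the method in the diff falls through and returns `None`.

```python
def is_selected(value, source, inherited, ignore_received):
    """Standalone restatement of the selection rules from the diff (illustrative only)."""
    if source not in ("local", "received", "-"):
        raise ValueError("illegal source: '{}'".format(source))
    if value not in ("false", "true", "child", "-"):
        raise ValueError("illegal value: '{}'".format(value))

    if value == "false":
        # explicitly excluded
        return False

    if value == "true" or (value == "child" and inherited):
        if source == "local":
            return True
        if source == "received":
            # a received property means this dataset is itself a backup
            return not ignore_received

    # property not set ("-"), or "child" set directly on this dataset
    return False


print(is_selected("true", "local", inherited=False, ignore_received=True))
print(is_selected("child", "local", inherited=False, ignore_received=True))
print(is_selected("child", "local", inherited=True, ignore_received=True))
print(is_selected("true", "received", inherited=True, ignore_received=True))
```

The second and third calls show the `child` behaviour described in the README: the dataset that carries `child` itself is skipped, while datasets inheriting it are selected.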
--- a/zfs_autobackup/ZfsNode.py
+++ b/zfs_autobackup/ZfsNode.py
@ -194,7 +194,7 @@ class ZfsNode(ExecuteNode):
            self.run(cmd, readonly=False)

    @CachedProperty
-    def selected_datasets(self):
+    def selected_datasets(self, ignore_received=True):
        """determine filesystems that should be backupped by looking at the special autobackup-property, systemwide

           returns: list of ZfsDataset
@ -204,35 +204,32 @@ class ZfsNode(ExecuteNode):

        # get all source filesystems that have the backup property
        lines = self.run(tab_split=True, readonly=True, cmd=[
-            "zfs", "get", "-t", "volume,filesystem", "-o", "name,value,source", "-s", "local,inherited", "-H",
+            "zfs", "get", "-t", "volume,filesystem", "-o", "name,value,source", "-H",
            "autobackup:" + self.backup_name
        ])

-        # determine filesystems that should be actually backupped
+        # the list of selected ZfsDatasets that we return:
        selected_filesystems = []
-        direct_filesystems = []
+
+        # list of sources, used to resolve inherited sources
+        sources = {}
+
        for line in lines:
-            (name, value, source) = line
+            (name, value, raw_source) = line
            dataset = ZfsDataset(self, name)

-            if value == "false":
-                dataset.verbose("Ignored (disabled)")
-
+            # "resolve" inherited sources
+            sources[name] = raw_source
+            if raw_source.find("inherited from ") == 0:
+                inherited = True
+                inherited_from = re.sub("^inherited from ", "", raw_source)
+                source = sources[inherited_from]
            else:
-                if source == "local" and (value == "true" or value == "child"):
-                    direct_filesystems.append(name)
+                inherited = False
+                source = raw_source

-                if source == "local" and value == "true":
-                    dataset.verbose("Selected (direct selection)")
-                    selected_filesystems.append(dataset)
-                elif source.find("inherited from ") == 0 and (value == "true" or value == "child"):
-                    inherited_from = re.sub("^inherited from ", "", source)
-                    if inherited_from in direct_filesystems:
-                        selected_filesystems.append(dataset)
-                        dataset.verbose("Selected (inherited selection)")
-                    else:
-                        dataset.debug("Ignored (already a backup)")
-                else:
-                    dataset.verbose("Ignored (only childs)")
+            # determine whether this dataset is selected
+            if dataset.is_selected(value=value, source=source, inherited=inherited, ignore_received=ignore_received):
+                selected_filesystems.append(dataset)

-        return selected_filesystems
+        return selected_filesystems
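
The new loop in `selected_datasets` "resolves" an `inherited from <name>` source by looking up the source recorded earlier for `<name>`. A minimal sketch of that step on its own is shown below. The function name `resolve_sources` and the sample rows are hypothetical. The sketch assumes, as the diff appears to, that a parent is always listed before its children and that the dataset named after `inherited from` has the property set directly (`local` or `received`).

```python
PREFIX = "inherited from "


def resolve_sources(lines):
    """Turn (name, value, raw_source) rows into (name, value, source, inherited) rows.

    Hypothetical helper for illustration; mirrors the lookup done in the diff.
    """
    sources = {}   # raw source per dataset name, filled as we go
    resolved = []
    for name, value, raw_source in lines:
        sources[name] = raw_source
        if raw_source.startswith(PREFIX):
            inherited = True
            # use the source of the dataset we inherit from
            source = sources[raw_source[len(PREFIX):]]
        else:
            inherited = False
            source = raw_source
        resolved.append((name, value, source, inherited))
    return resolved


# Made-up rows in the shape of `zfs get -H -o name,value,source` output.
rows = [
    ("rpool", "true", "local"),
    ("rpool/data", "true", "inherited from rpool"),
    ("rpool/swap", "false", "local"),
]

for row in resolve_sources(rows):
    print(row)
```

If the ordering assumption does not hold, the dictionary lookup raises `KeyError`, which makes the failure visible instead of silently misclassifying a dataset.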
Author	SHA1	Message	Date
Edwin Eefting	cf72de7c28	cleanedup and improved select-code	2021-03-16 23:40:31 +01:00
Edwin Eefting	686bb48bda	restore progres. verify --destroy-incompabible output.	2021-03-11 11:57:51 +01:00
Edwin Eefting	6a48b8a2a9	nicer help	2021-03-03 15:48:44 +01:00
Edwin Eefting	477b66c342	split encoder	2021-03-03 11:50:11 +01:00
Edwin Eefting	a4155f970e	completed --no-thinning option. fixes #54	2021-03-01 00:04:10 +01:00
Edwin Eefting	0c9d14bf32	splitting up more stuff	2021-02-27 21:36:03 +01:00
Edwin Eefting	1f5955ccec	splitting up more stuff	2021-02-27 14:35:47 +01:00
Edwin Eefting	1b94a849db	splitting up more stuff	2021-02-26 20:10:39 +01:00
Edwin Eefting	98c40c6df5	splitting up more stuff	2021-02-26 20:10:20 +01:00
Edwin Eefting	b479ab9c98	more clear name	2021-02-21 22:25:42 +01:00
Edwin Eefting	a0fb205e75	cleaning up code	2021-02-21 22:08:35 +01:00
Edwin Eefting	d3ce222921	some more refactoring. splitting of smaller cleaner functions. started work on --no-thinning	2021-02-19 00:27:37 +01:00
DatuX	36e134eb75	Update README.md	2021-02-16 12:28:00 +01:00
Edwin Eefting	628cd75941	fix	2021-02-07 21:17:40 +01:00
Edwin Eefting	1da14c5c3b	forgot to document child	2021-02-07 21:09:03 +01:00
Edwin Eefting	c83d0fcff2	explain inheritance	2021-02-07 20:59:19 +01:00
Edwin Eefting	573af341b8	fixes	2021-02-07 18:04:00 +01:00
Edwin Eefting	a64168bee2	fixes	2021-02-07 18:00:58 +01:00
Edwin Eefting	c678ae5f9a	fixed --ignore-transfer-errors	2021-02-07 16:57:12 +01:00