test re-replication

allow re-replication of a backup with the same name. (dataset selection now filters on target_path instead of received-status where appropriate, and shows notes about this)
improved progress reporting. improved no_thinning performance.
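
For reference, the single-line progress reporting added in this commit can be sketched roughly as below. This is a minimal standalone sketch, not the project's code: it hard-codes the ANSI erase-line escape where the commit uses colorama, and the `stream` parameter and the `demo` helper are hypothetical additions so the output can be captured instead of printed.

```python
import io
import sys

# ANSI "erase entire line" sequence. Assumption: the terminal honours ANSI escapes.
CLEAR_LINE = "\x1b[2K"


def clear_progress(stream=sys.stderr):
    """Erase whatever progress text is currently on the line."""
    stream.write(CLEAR_LINE)
    stream.flush()


def progress(txt, stream=sys.stderr):
    """Write a progress message that stays on the same line.

    The trailing carriage return moves the cursor back to column 0,
    so the next message overwrites this one.
    """
    clear_progress(stream)
    stream.write(">>> {}\r".format(txt))
    stream.flush()


def demo():
    """Hypothetical helper: report progress over three items into a buffer."""
    buf = io.StringIO()
    items = ["a", "b", "c"]
    for count, _ in enumerate(items, start=1):
        progress("Analysing missing {}/{}".format(count, len(items)), stream=buf)
    clear_progress(buf)
    return buf.getvalue()
```

Writing to stderr by default keeps progress text out of piped stdout, and clearing the line once the loop ends avoids leaving a stale message on screen.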
2021-05-03 00:03:22 +02:00 · 2021-05-02 22:51:20 +02:00 · 2021-04-23 20:31:37 +02:00 · 2021-04-22 01:16:53 +02:00 · 2021-04-22 01:12:41 +02:00 · 2021-04-22 00:14:14 +02:00
13 changed files with 269 additions and 147 deletions
--- a/README.md
+++ b/README.md
@@ -17,7 +17,7 @@ Since its using ZFS commands, you can see what its actually doing by specifying

 An important feature thats missing from other tools is a reliable `--test` option: This allows you to see what zfs-autobackup will do and tune your parameters. It will do everything, except make changes to your system.

-zfs-autobackup tries to be the easiest to use backup tool for zfs.
+zfs-autobackup tries to be the easiest to use backup tool for zfs, with the most features.

 ## Features

@@ -32,7 +32,8 @@ zfs-autobackup tries to be the easiest to use backup tool for zfs.
  * "pull" remote data from a server via SSH and backup it locally.
  * Or even pull data from a server while pushing the backup to another server. (Zero trust between source and target server)
 * Can be scheduled via a simple cronjob or run directly from commandline.
-* Supports resuming of interrupted transfers. 
+* Supports resuming of interrupted transfers.
+* ZFS encryption support: Can decrypt / encrypt or even re-encrypt datasets during transfer.
 * Multiple backups from and to the same datasets are no problem.
 * Creates the snapshot before doing anything else. (assuring you at least have a snapshot if all else fails)
 * Checks everything but tries continue on non-fatal errors when possible. (Reports error-count when done)
@@ -42,7 +43,7 @@ zfs-autobackup tries to be the easiest to use backup tool for zfs.
 * Uses zfs-holds on important snapshots so they cant be accidentally destroyed.
 * Automatic resuming of failed transfers.
 * Can continue from existing common snapshots. (e.g. easy migration)
-* Gracefully handles destroyed datasets on source.
+* Gracefully handles datasets that no longer exist on source.
 * Easy installation:
  * Just install zfs-autobackup via pip, or download it manually.
  * Only needs to be installed on one side.
--- a/tests/test_cmdpipe.py
+++ b/tests/test_cmdpipe.py
@@ -9,26 +9,24 @@ class TestCmdPipe(unittest2.TestCase):
        p=CmdPipe(readonly=False, inp=None)
        err=[]
        out=[]
-        p.add(["ls", "-d", "/", "/", "/nonexistent"], stderr_handler=lambda line: err.append(line))
+        p.add(["ls", "-d", "/", "/", "/nonexistent"], stderr_handler=lambda line: err.append(line), exit_handler=lambda exit_code: self.assertEqual(exit_code,2))
        executed=p.execute(stdout_handler=lambda line: out.append(line))

        self.assertEqual(err, ["ls: cannot access '/nonexistent': No such file or directory"])
        self.assertEqual(out, ["/","/"])
        self.assertTrue(executed)
-        self.assertEqual(p.items[0]['process'].returncode,2)

    def test_input(self):
        """test stdinput"""
        p=CmdPipe(readonly=False, inp="test")
        err=[]
        out=[]
-        p.add(["echo", "test"], stderr_handler=lambda line: err.append(line))
+        p.add(["echo", "test"], stderr_handler=lambda line: err.append(line), exit_handler=lambda exit_code: self.assertEqual(exit_code,0))
        executed=p.execute(stdout_handler=lambda line: out.append(line))

        self.assertEqual(err, [])
        self.assertEqual(out, ["test"])
        self.assertTrue(executed)
-        self.assertEqual(p.items[0]['process'].returncode,0)

    def test_pipe(self):
        """test piped"""
@@ -37,9 +35,9 @@ class TestCmdPipe(unittest2.TestCase):
        err2=[]
        err3=[]
        out=[]
-        p.add(["echo", "test"], stderr_handler=lambda line: err1.append(line))
-        p.add(["tr", "e", "E"], stderr_handler=lambda line: err2.append(line))
-        p.add(["tr", "t", "T"], stderr_handler=lambda line: err3.append(line))
+        p.add(["echo", "test"], stderr_handler=lambda line: err1.append(line), exit_handler=lambda exit_code: self.assertEqual(exit_code,0))
+        p.add(["tr", "e", "E"], stderr_handler=lambda line: err2.append(line), exit_handler=lambda exit_code: self.assertEqual(exit_code,0))
+        p.add(["tr", "t", "T"], stderr_handler=lambda line: err3.append(line), exit_handler=lambda exit_code: self.assertEqual(exit_code,0))
        executed=p.execute(stdout_handler=lambda line: out.append(line))

        self.assertEqual(err1, [])
@@ -47,9 +45,6 @@ class TestCmdPipe(unittest2.TestCase):
        self.assertEqual(err3, [])
        self.assertEqual(out, ["TEsT"])
        self.assertTrue(executed)
-        self.assertEqual(p.items[0]['process'].returncode,0)
-        self.assertEqual(p.items[1]['process'].returncode,0)
-        self.assertEqual(p.items[2]['process'].returncode,0)

        #test str representation as well
        self.assertEqual(str(p), "(echo test) | (tr e E) | (tr t T)")
@@ -61,9 +56,9 @@ class TestCmdPipe(unittest2.TestCase):
        err2=[]
        err3=[]
        out=[]
-        p.add(["ls", "/nonexistent1"], stderr_handler=lambda line: err1.append(line))
-        p.add(["ls", "/nonexistent2"], stderr_handler=lambda line: err2.append(line))
-        p.add(["ls", "/nonexistent3"], stderr_handler=lambda line: err3.append(line))
+        p.add(["ls", "/nonexistent1"], stderr_handler=lambda line: err1.append(line), exit_handler=lambda exit_code: self.assertEqual(exit_code,2))
+        p.add(["ls", "/nonexistent2"], stderr_handler=lambda line: err2.append(line), exit_handler=lambda exit_code: self.assertEqual(exit_code,2))
+        p.add(["ls", "/nonexistent3"], stderr_handler=lambda line: err3.append(line), exit_handler=lambda exit_code: self.assertEqual(exit_code,2))
        executed=p.execute(stdout_handler=lambda line: out.append(line))

        self.assertEqual(err1, ["ls: cannot access '/nonexistent1': No such file or directory"])
@@ -71,9 +66,24 @@ class TestCmdPipe(unittest2.TestCase):
        self.assertEqual(err3, ["ls: cannot access '/nonexistent3': No such file or directory"])
        self.assertEqual(out, [])
        self.assertTrue(executed)
-        self.assertEqual(p.items[0]['process'].returncode,2)
-        self.assertEqual(p.items[1]['process'].returncode,2)
-        self.assertEqual(p.items[2]['process'].returncode,2)
+
+    def test_exitcode(self):
+        """test piped exitcodes """
+        p=CmdPipe(readonly=False)
+        err1=[]
+        err2=[]
+        err3=[]
+        out=[]
+        p.add(["bash", "-c", "exit 1"], stderr_handler=lambda line: err1.append(line), exit_handler=lambda exit_code: self.assertEqual(exit_code,1))
+        p.add(["bash", "-c", "exit 2"], stderr_handler=lambda line: err2.append(line), exit_handler=lambda exit_code: self.assertEqual(exit_code,2))
+        p.add(["bash", "-c", "exit 3"], stderr_handler=lambda line: err3.append(line), exit_handler=lambda exit_code: self.assertEqual(exit_code,3))
+        executed=p.execute(stdout_handler=lambda line: out.append(line))
+
+        self.assertEqual(err1, [])
+        self.assertEqual(err2, [])
+        self.assertEqual(err3, [])
+        self.assertEqual(out, [])
+        self.assertTrue(executed)

    def test_readonly_execute(self):
        """everything readonly, just should execute"""
--- a/tests/test_encryption.py
+++ b/tests/test_encryption.py
@@ -49,12 +49,12 @@ class TestZfsEncryption(unittest2.TestCase):
        self.prepare_encrypted_dataset("22222222", "test_target1/encryptedtarget")

        with patch('time.strftime', return_value="20101111000000"):
-            self.assertFalse(ZfsAutobackup("test test_target1 --verbose --no-progress --allow-empty".split(" ")).run())
-            self.assertFalse(ZfsAutobackup("test test_target1/encryptedtarget --verbose --no-progress --no-snapshot".split(" ")).run())
+            self.assertFalse(ZfsAutobackup("test test_target1 --verbose --no-progress --allow-empty --exclude-received".split(" ")).run())
+            self.assertFalse(ZfsAutobackup("test test_target1/encryptedtarget --verbose --no-progress --no-snapshot --exclude-received".split(" ")).run())

        with patch('time.strftime', return_value="20101111000001"):
-            self.assertFalse(ZfsAutobackup("test test_target1 --verbose --no-progress --allow-empty".split(" ")).run())
-            self.assertFalse(ZfsAutobackup("test test_target1/encryptedtarget --verbose --no-progress --no-snapshot".split(" ")).run())
+            self.assertFalse(ZfsAutobackup("test test_target1 --verbose --no-progress --allow-empty --exclude-received".split(" ")).run())
+            self.assertFalse(ZfsAutobackup("test test_target1/encryptedtarget --verbose --no-progress --no-snapshot --exclude-received".split(" ")).run())

        r = shelltest("zfs get -r -t filesystem encryptionroot test_target1")
        self.assertMultiLineEqual(r,"""
@@ -86,12 +86,12 @@ test_target1/test_source2/fs2/sub                                     encryption
        self.prepare_encrypted_dataset("22222222", "test_target1/encryptedtarget")

        with patch('time.strftime', return_value="20101111000000"):
-            self.assertFalse(ZfsAutobackup("test test_target1 --verbose --no-progress --decrypt --allow-empty".split(" ")).run())
-            self.assertFalse(ZfsAutobackup("test test_target1/encryptedtarget --verbose --no-progress --decrypt --no-snapshot".split(" ")).run())
+            self.assertFalse(ZfsAutobackup("test test_target1 --verbose --no-progress --decrypt --allow-empty --exclude-received".split(" ")).run())
+            self.assertFalse(ZfsAutobackup("test test_target1/encryptedtarget --verbose --no-progress --decrypt --no-snapshot --exclude-received".split(" ")).run())

        with patch('time.strftime', return_value="20101111000001"):
-            self.assertFalse(ZfsAutobackup("test test_target1 --verbose --no-progress --decrypt --allow-empty".split(" ")).run())
-            self.assertFalse(ZfsAutobackup("test test_target1/encryptedtarget --verbose --no-progress --decrypt --no-snapshot".split(" ")).run())
+            self.assertFalse(ZfsAutobackup("test test_target1 --verbose --no-progress --decrypt --allow-empty --exclude-received".split(" ")).run())
+            self.assertFalse(ZfsAutobackup("test test_target1/encryptedtarget --verbose --no-progress --decrypt --no-snapshot --exclude-received".split(" ")).run())

        r = shelltest("zfs get -r -t filesystem encryptionroot test_target1")
        self.assertEqual(r, """
@@ -121,12 +121,12 @@ test_target1/test_source2/fs2/sub                              encryptionroot  -
        self.prepare_encrypted_dataset("22222222", "test_target1/encryptedtarget")

        with patch('time.strftime', return_value="20101111000000"):
-            self.assertFalse(ZfsAutobackup("test test_target1 --verbose --no-progress --encrypt --debug --allow-empty".split(" ")).run())
-            self.assertFalse(ZfsAutobackup("test test_target1/encryptedtarget --verbose --no-progress --encrypt --debug --no-snapshot".split(" ")).run())
+            self.assertFalse(ZfsAutobackup("test test_target1 --verbose --no-progress --encrypt --debug --allow-empty --exclude-received".split(" ")).run())
+            self.assertFalse(ZfsAutobackup("test test_target1/encryptedtarget --verbose --no-progress --encrypt --debug --no-snapshot --exclude-received".split(" ")).run())

        with patch('time.strftime', return_value="20101111000001"):
-            self.assertFalse(ZfsAutobackup("test test_target1 --verbose --no-progress --encrypt --debug --allow-empty".split(" ")).run())
-            self.assertFalse(ZfsAutobackup("test test_target1/encryptedtarget --verbose --no-progress --encrypt --debug --no-snapshot".split(" ")).run())
+            self.assertFalse(ZfsAutobackup("test test_target1 --verbose --no-progress --encrypt --debug --allow-empty --exclude-received".split(" ")).run())
+            self.assertFalse(ZfsAutobackup("test test_target1/encryptedtarget --verbose --no-progress --encrypt --debug --no-snapshot --exclude-received".split(" ")).run())

        r = shelltest("zfs get -r -t filesystem encryptionroot test_target1")
        self.assertEqual(r, """
@@ -157,16 +157,16 @@ test_target1/test_source2/fs2/sub                              encryptionroot  -

        with patch('time.strftime', return_value="20101111000000"):
            self.assertFalse(ZfsAutobackup(
-                "test test_target1 --verbose --no-progress --decrypt --encrypt --debug --allow-empty".split(" ")).run())
+                "test test_target1 --verbose --no-progress --decrypt --encrypt --debug --allow-empty --exclude-received".split(" ")).run())
            self.assertFalse(ZfsAutobackup(
-                "test test_target1/encryptedtarget --verbose --no-progress --decrypt --encrypt --debug --no-snapshot".split(
+                "test test_target1/encryptedtarget --verbose --no-progress --decrypt --encrypt --debug --no-snapshot --exclude-received".split(
                    " ")).run())

        with patch('time.strftime', return_value="20101111000001"):
            self.assertFalse(ZfsAutobackup(
-                "test test_target1 --verbose --no-progress --decrypt --encrypt --debug --allow-empty".split(" ")).run())
+                "test test_target1 --verbose --no-progress --decrypt --encrypt --debug --allow-empty --exclude-received".split(" ")).run())
            self.assertFalse(ZfsAutobackup(
-                "test test_target1/encryptedtarget --verbose --no-progress --decrypt --encrypt --debug --no-snapshot".split(
+                "test test_target1/encryptedtarget --verbose --no-progress --decrypt --encrypt --debug --no-snapshot --exclude-received".split(
                    " ")).run())

        r = shelltest("zfs get -r -t filesystem encryptionroot test_target1")
--- a/tests/test_executenode.py
+++ b/tests/test_executenode.py
@@ -1,5 +1,5 @@
 from basetest import *
-from zfs_autobackup.ExecuteNode import ExecuteNode
+from zfs_autobackup.ExecuteNode import *

 print("THIS TEST REQUIRES SSH TO LOCALHOST")

@@ -15,7 +15,7 @@ class TestExecuteNode(unittest2.TestCase):
            self.assertEqual(node.run(["echo","test"]), ["test"])

        with self.subTest("error exit code"):
-            with self.assertRaises(subprocess.CalledProcessError):
+            with self.assertRaises(ExecuteError):
                node.run(["false"])

        #
@@ -81,29 +81,33 @@ class TestExecuteNode(unittest2.TestCase):
            nodeb.run(["true"], inp=output)

        with self.subTest("error on pipe input side"):
-            with self.assertRaises(subprocess.CalledProcessError):
+            with self.assertRaises(ExecuteError):
                output=nodea.run(["false"], pipe=True)
                nodeb.run(["true"], inp=output)

+        with self.subTest("error on both sides, ignore exit codes"):
+            output=nodea.run(["false"], pipe=True, valid_exitcodes=[])
+            nodeb.run(["false"], inp=output, valid_exitcodes=[])
+
        with self.subTest("error on pipe output side "):
-            with self.assertRaises(subprocess.CalledProcessError):
+            with self.assertRaises(ExecuteError):
                output=nodea.run(["true"], pipe=True)
                nodeb.run(["false"], inp=output)

        with self.subTest("error on both sides of pipe"):
-            with self.assertRaises(subprocess.CalledProcessError):
+            with self.assertRaises(ExecuteError):
                output=nodea.run(["false"], pipe=True)
                nodeb.run(["false"], inp=output)

        with self.subTest("check stderr on pipe output side"):
-            output=nodea.run(["true"], pipe=True)
-            (stdout, stderr)=nodeb.run(["ls", "nonexistingfile"], inp=output, return_stderr=True, valid_exitcodes=[0,2])
+            output=nodea.run(["true"], pipe=True, valid_exitcodes=[0])
+            (stdout, stderr)=nodeb.run(["ls", "nonexistingfile"], inp=output, return_stderr=True, valid_exitcodes=[2])
            self.assertEqual(stdout,[])
            self.assertRegex(stderr[0], "nonexistingfile" )

        with self.subTest("check stderr on pipe input side (should be only printed)"):
-            output=nodea.run(["ls", "nonexistingfile"], pipe=True)
-            (stdout, stderr)=nodeb.run(["true"], inp=output, return_stderr=True, valid_exitcodes=[0,2])
+            output=nodea.run(["ls", "nonexistingfile"], pipe=True, valid_exitcodes=[2])
+            (stdout, stderr)=nodeb.run(["true"], inp=output, return_stderr=True, valid_exitcodes=[0])
            self.assertEqual(stdout,[])
            self.assertEqual(stderr,[])

--- a/tests/test_zfsautobackup.py
+++ b/tests/test_zfsautobackup.py
@@ -590,10 +590,10 @@ test_target1/test_source2/fs2/sub@test-20101111000003
        #test all ssh directions

        with patch('time.strftime', return_value="20101111000000"):
-            self.assertFalse(ZfsAutobackup("test test_target1 --no-progress --verbose --allow-empty --ssh-source localhost".split(" ")).run())
+            self.assertFalse(ZfsAutobackup("test test_target1 --no-progress --verbose --allow-empty --ssh-source localhost --exclude-received".split(" ")).run())

        with patch('time.strftime', return_value="20101111000001"):
-            self.assertFalse(ZfsAutobackup("test test_target1 --no-progress --verbose --allow-empty --ssh-target localhost".split(" ")).run())
+            self.assertFalse(ZfsAutobackup("test test_target1 --no-progress --verbose --allow-empty --ssh-target localhost --exclude-received".split(" ")).run())

        with patch('time.strftime', return_value="20101111000002"):
            self.assertFalse(ZfsAutobackup("test test_target1 --no-progress --verbose --allow-empty --ssh-source localhost --ssh-target localhost".split(" ")).run())
--- a/tests/test_zfsautobackup31.py
+++ b/tests/test_zfsautobackup31.py
@@ -47,3 +47,27 @@ test_target1/test_source2/fs2/sub@test-20101111000001
 """)


+    def test_re_replication(self):
+        """test re-replication of something thats already a backup (new in v3.1-beta5)"""
+
+        shelltest("zfs create test_target1/a")
+        shelltest("zfs create test_target1/b")
+
+        with patch('time.strftime', return_value="20101111000000"):
+            self.assertFalse(ZfsAutobackup("test test_target1/a --no-progress --verbose --debug".split(" ")).run())
+
+        with patch('time.strftime', return_value="20101111000001"):
+            self.assertFalse(ZfsAutobackup("test test_target1/b --no-progress --verbose".split(" ")).run())
+
+            r=shelltest("zfs list -H -o name -r -t snapshot test_target1")
+            #NOTE: it wont backup test_target1/a/test_source2/fs2/sub to test_target1/b since it doesnt have the zfs_autobackup property anymore.
+            self.assertMultiLineEqual(r,"""
+test_target1/a/test_source1/fs1@test-20101111000000
+test_target1/a/test_source1/fs1/sub@test-20101111000000
+test_target1/a/test_source2/fs2/sub@test-20101111000000
+test_target1/b/test_source1/fs1@test-20101111000000
+test_target1/b/test_source1/fs1/sub@test-20101111000000
+test_target1/b/test_source2/fs2/sub@test-20101111000000
+test_target1/b/test_target1/a/test_source1/fs1@test-20101111000000
+test_target1/b/test_target1/a/test_source1/fs1/sub@test-20101111000000
+""")
--- a/tests/test_zfsnode.py
+++ b/tests/test_zfsnode.py
@@ -16,7 +16,7 @@ class TestZfsNode(unittest2.TestCase):
        node=ZfsNode("test", logger, description=description)

        with self.subTest("first snapshot"):
-            node.consistent_snapshot(node.selected_datasets, "test-1",100000)
+            node.consistent_snapshot(node.selected_datasets(exclude_paths=[], exclude_received=False), "test-1",100000)
            r=shelltest("zfs list -H -o name -r -t all "+TEST_POOLS)
            self.assertEqual(r,"""
 test_source1
@@ -35,7 +35,7 @@ test_target1


        with self.subTest("second snapshot, no changes, no snapshot"):
-            node.consistent_snapshot(node.selected_datasets, "test-2",1)
+            node.consistent_snapshot(node.selected_datasets(exclude_paths=[], exclude_received=False), "test-2",1)
            r=shelltest("zfs list -H -o name -r -t all "+TEST_POOLS)
            self.assertEqual(r,"""
 test_source1
@@ -53,7 +53,7 @@ test_target1
 """)

        with self.subTest("second snapshot, no changes, empty snapshot"):
-            node.consistent_snapshot(node.selected_datasets, "test-2",0)
+            node.consistent_snapshot(node.selected_datasets(exclude_paths=[], exclude_received=False), "test-2",0)
            r=shelltest("zfs list -H -o name -r -t all "+TEST_POOLS)
            self.assertEqual(r,"""
 test_source1
@@ -78,7 +78,7 @@ test_target1
        logger=LogStub()
        description="[Source]"
        node=ZfsNode("test", logger, description=description)
-        s=pformat(node.selected_datasets)
+        s=pformat(node.selected_datasets(exclude_paths=[], exclude_received=False))
        print(s)

        #basics
--- a/zfs_autobackup/CmdPipe.py
+++ b/zfs_autobackup/CmdPipe.py
@@ -17,12 +17,13 @@ class CmdPipe:
        self.readonly = readonly
        self._should_execute = True

-    def add(self, cmd, readonly=False, stderr_handler=None):
+    def add(self, cmd, readonly=False, stderr_handler=None, exit_handler=None):
        """adds a command to pipe"""

        self.items.append({
            'cmd': cmd,
-            'stderr_handler': stderr_handler
+            'stderr_handler': stderr_handler,
+            'exit_handler': exit_handler
        })

        if not readonly and self.readonly:
@@ -117,10 +118,15 @@ class CmdPipe:
            if eof_count == len(selectors) and done_count == len(self.items):
                break

-        # ret = []
+        #close filehandles
        last_stdout.close()
        for item in self.items:
            item['process'].stderr.close()
-            # ret.append(item['process'].returncode)
+
+        #call exit handlers
+        for item in self.items:
+            if item['exit_handler'] is not None:
+                item['exit_handler'](item['process'].returncode)
+

        return True
--- a/zfs_autobackup/ExecuteNode.py
+++ b/zfs_autobackup/ExecuteNode.py
@@ -5,6 +5,8 @@ import subprocess
 from zfs_autobackup.CmdPipe import CmdPipe
 from zfs_autobackup.LogStub import LogStub

+class ExecuteError(Exception):
+    pass

 class ExecuteNode(LogStub):
    """an endpoint to execute local or remote commands via ssh"""
@@ -108,9 +110,20 @@ class ExecuteNode(LogStub):
                error_lines.append(line.rstrip())
            self._parse_stderr(line, hide_errors)

+        # exit code handler
+        if valid_exitcodes is None:
+            valid_exitcodes = [0]
+
+        def exit_handler(exit_code):
+            if self.debug_output:
+                self.debug("EXIT   > {}".format(exit_code))
+
+            if (valid_exitcodes != []) and (exit_code not in valid_exitcodes):
+             raise (ExecuteError("Command '{}' returned exit code {} (valid codes: {})".format(" ".join(cmd), exit_code, valid_exitcodes)))
+
        # add command to pipe
        encoded_cmd = self._remote_cmd(cmd)
-        p.add(cmd=encoded_cmd, readonly=readonly, stderr_handler=stderr_handler)
+        p.add(cmd=encoded_cmd, readonly=readonly, stderr_handler=stderr_handler, exit_handler=exit_handler)

        # return pipe instead of executing?
        if pipe:
@@ -130,21 +143,8 @@ class ExecuteNode(LogStub):
        else:
            self.debug("CMDSKIP> {}".format(p))

-        # execute and verify exit codes
-        if p.execute(stdout_handler=stdout_handler) and valid_exitcodes is not []:
-            if valid_exitcodes is None:
-                valid_exitcodes = [0]
-
-            item_nr=1
-            for item in p.items:
-                exit_code=item['process'].returncode
-
-                if self.debug_output:
-                    self.debug("EXIT{}  > {}".format(item_nr, exit_code))
-
-                if exit_code not in valid_exitcodes:
-                    raise (subprocess.CalledProcessError(exit_code, " ".join(item['cmd'])))
-                item_nr=item_nr+1
+        # execute and call handlers in CmdPipe
+        p.execute(stdout_handler=stdout_handler)

        if return_stderr:
            return output_lines, error_lines
--- a/zfs_autobackup/LogConsole.py
+++ b/zfs_autobackup/LogConsole.py
@@ -46,3 +46,14 @@ class LogConsole:
            else:
                print("# " + txt)
            sys.stdout.flush()
+
+    def progress(self, txt):
+        """print progress output to stderr (stays on same line)"""
+        self.clear_progress()
+        print(">>> {}\r".format(txt), end='', file=sys.stderr)
+        sys.stderr.flush()
+
+    def clear_progress(self):
+        import colorama
+        print(colorama.ansi.clear_line(), end='', file=sys.stderr)
+        sys.stderr.flush()
--- a/zfs_autobackup/ZfsAutobackup.py
+++ b/zfs_autobackup/ZfsAutobackup.py
@@ -12,8 +12,8 @@ from zfs_autobackup.ThinnerRule import ThinnerRule
 class ZfsAutobackup:
    """main class"""

-    VERSION = "3.1-beta3"
-    HEADER = "zfs-autobackup v{} - Copyright 2020 E.H.Eefting (edwin@datux.nl)".format(VERSION)
+    VERSION = "3.1-beta5"
+    HEADER = "zfs-autobackup v{} - (c)2021 E.H.Eefting (edwin@datux.nl)".format(VERSION)

    def __init__(self, argv, print_arguments=True):

@@ -59,7 +59,6 @@ class ZfsAutobackup:
                            help='Ignore datasets that seem to be replicated some other way. (No changes since '
                                 'lastest snapshot. Useful for proxmox HA replication)')

-        parser.add_argument('--resume', action='store_true', help=argparse.SUPPRESS)
        parser.add_argument('--strip-path', metavar='N', default=0, type=int,
                            help='Number of directories to strip from target path (use 1 when cloning zones between 2 '
                                 'SmartOS machines)')
@@ -89,8 +88,6 @@ class ZfsAutobackup:
        parser.add_argument('--ignore-transfer-errors', action='store_true',
                            help='Ignore transfer errors (still checks if received filesystem exists. useful for '
                                 'acltype errors)')
-        parser.add_argument('--raw', action='store_true',
-                            help=argparse.SUPPRESS)

        parser.add_argument('--decrypt', action='store_true',
                            help='Decrypt data before sending it over.')
@@ -108,7 +105,8 @@ class ZfsAutobackup:
                            help='Show zfs commands and their output/exit codes. (noisy)')
        parser.add_argument('--progress', action='store_true',
                            help='show zfs progress output. Enabled automaticly on ttys. (use --no-progress to disable)')
-        parser.add_argument('--no-progress', action='store_true', help=argparse.SUPPRESS)  # needed to workaround a zfs recv -v bug
+        parser.add_argument('--no-progress', action='store_true',
+                            help=argparse.SUPPRESS)  # needed to workaround a zfs recv -v bug

        parser.add_argument('--send-pipe', metavar="COMMAND", default=[], action='append',
                            help='pipe zfs send output through COMMAND')
@@ -116,6 +114,11 @@ class ZfsAutobackup:
        parser.add_argument('--recv-pipe', metavar="COMMAND", default=[], action='append',
                            help='pipe zfs recv input through COMMAND')

+        parser.add_argument('--resume', action='store_true', help=argparse.SUPPRESS)
+        parser.add_argument('--raw', action='store_true', help=argparse.SUPPRESS)
+        parser.add_argument('--exclude-received', action='store_true',
+                            help=argparse.SUPPRESS)  # probably never needed anymore
+
        # note args is the only global variable we use, since its a global readonly setting anyway
        args = parser.parse_args(argv)

@@ -143,7 +146,8 @@ class ZfsAutobackup:
            self.verbose("NOTE: The --resume option isn't needed anymore (its autodetected now)")

        if args.raw:
-            self.verbose("NOTE: The --raw option isn't needed anymore (its autodetected now). Use --decrypt to explicitly send data decrypted.")
+            self.verbose(
+                "NOTE: The --raw option isn't needed anymore (its autodetected now). Also see --encrypt and --decrypt.")

        if args.target_path is not None and args.target_path[0] == "/":
            self.log.error("Target should not start with a /")
@@ -162,73 +166,99 @@ class ZfsAutobackup:
        self.log.verbose("")
        self.log.verbose("#### " + title)

+    def progress(self, txt):
+        self.log.progress(txt)
+
+    def clear_progress(self):
+        self.log.clear_progress()
+
    # NOTE: this method also uses self.args. args that need extra processing are passed as function parameters:
    def thin_missing_targets(self, target_dataset, used_target_datasets):
        """thin target datasets that are missing on the source."""

        self.debug("Thinning obsolete datasets")
+        missing_datasets = [dataset for dataset in target_dataset.recursive_datasets if
+                            dataset not in used_target_datasets]
+
+        count = 0
+        for dataset in missing_datasets:
+
+            count = count + 1
+            if self.args.progress:
+                self.progress("Analysing missing {}/{}".format(count, len(missing_datasets)))

-        for dataset in target_dataset.recursive_datasets:
            try:
-                if dataset not in used_target_datasets:
-                    dataset.debug("Missing on source, thinning")
-                    dataset.thin()
+                dataset.debug("Missing on source, thinning")
+                dataset.thin()

            except Exception as e:
                dataset.error("Error during thinning of missing datasets ({})".format(str(e)))

+        if self.args.progress:
+            self.clear_progress()
+
    # NOTE: this method also uses self.args. args that need extra processing are passed as function parameters:
    def destroy_missing_targets(self, target_dataset, used_target_datasets):
        """destroy target datasets that are missing on the source and that meet the requirements"""

        self.debug("Destroying obsolete datasets")

-        for dataset in target_dataset.recursive_datasets:
+        missing_datasets = [dataset for dataset in target_dataset.recursive_datasets if
+                            dataset not in used_target_datasets]
+
+        count = 0
+        for dataset in missing_datasets:
+
+            count = count + 1
+            if self.args.progress:
+                self.progress("Analysing destroy missing {}/{}".format(count, len(missing_datasets)))
+
            try:
-                if dataset not in used_target_datasets:
-
-                    # cant do anything without our own snapshots
-                    if not dataset.our_snapshots:
-                        if dataset.datasets:
-                            # its not a leaf, just ignore
-                            dataset.debug("Destroy missing: ignoring")
-                        else:
-                            dataset.verbose(
-                                "Destroy missing: has no snapshots made by us. (please destroy manually)")
+                # cant do anything without our own snapshots
+                if not dataset.our_snapshots:
+                    if dataset.datasets:
+                        # its not a leaf, just ignore
+                        dataset.debug("Destroy missing: ignoring")
                    else:
-                        # past the deadline?
-                        deadline_ttl = ThinnerRule("0s" + self.args.destroy_missing).ttl
-                        now = int(time.time())
-                        if dataset.our_snapshots[-1].timestamp + deadline_ttl > now:
-                            dataset.verbose("Destroy missing: Waiting for deadline.")
+                        dataset.verbose(
+                            "Destroy missing: has no snapshots made by us. (please destroy manually)")
+                else:
+                    # past the deadline?
+                    deadline_ttl = ThinnerRule("0s" + self.args.destroy_missing).ttl
+                    now = int(time.time())
+                    if dataset.our_snapshots[-1].timestamp + deadline_ttl > now:
+                        dataset.verbose("Destroy missing: Waiting for deadline.")
+                    else:
+
+                        dataset.debug("Destroy missing: Removing our snapshots.")
+
+                        # remove all our snapshots, except the last, to save space in case we fail later on
+                        for snapshot in dataset.our_snapshots[:-1]:
+                            snapshot.destroy(fail_exception=True)
+
+                        # does it have other snapshots?
+                        has_others = False
+                        for snapshot in dataset.snapshots:
+                            if not snapshot.is_ours():
+                                has_others = True
+                                break
+
+                        if has_others:
+                            dataset.verbose("Destroy missing: Still in use by other snapshots")
                        else:
-
-                            dataset.debug("Destroy missing: Removing our snapshots.")
-
-                            # remove all our snapshots, except the last, to save space in case we fail later on
-                            for snapshot in dataset.our_snapshots[:-1]:
-                                snapshot.destroy(fail_exception=True)
-
-                            # does it have other snapshots?
-                            has_others = False
-                            for snapshot in dataset.snapshots:
-                                if not snapshot.is_ours():
-                                    has_others = True
-                                    break
-
-                            if has_others:
-                                dataset.verbose("Destroy missing: Still in use by other snapshots")
+                            if dataset.datasets:
+                                dataset.verbose("Destroy missing: Still has children here.")
                            else:
-                                if dataset.datasets:
-                                    dataset.verbose("Destroy missing: Still has children here.")
-                                else:
-                                    dataset.verbose("Destroy missing.")
-                                    dataset.our_snapshots[-1].destroy(fail_exception=True)
-                                    dataset.destroy(fail_exception=True)
+                                dataset.verbose("Destroy missing.")
+                                dataset.our_snapshots[-1].destroy(fail_exception=True)
+                                dataset.destroy(fail_exception=True)

            except Exception as e:
                dataset.error("Error during --destroy-missing: {}".format(str(e)))

+        if self.args.progress:
+            self.clear_progress()
+
    # NOTE: this method also uses self.args. args that need extra processing are passed as function parameters:
    def sync_datasets(self, source_node, source_datasets, target_node):
        """Sync datasets, or thin-only on both sides
@ -238,9 +268,15 @@ class ZfsAutobackup:
        """

        fail_count = 0
+        count = 0
        target_datasets = []
        for source_dataset in source_datasets:

+            # stats
+            if self.args.progress:
+                count = count + 1
+                self.progress("Analysing dataset {}/{} ({} failed)".format(count, len(source_datasets), fail_count))
+
            try:
                # determine corresponding target_dataset
                target_name = self.args.target_path + "/" + source_dataset.lstrip_path(self.args.strip_path)
@ -268,15 +304,20 @@ class ZfsAutobackup:
                                              also_other_snapshots=self.args.other_snapshots,
                                              no_send=self.args.no_send,
                                              destroy_incompatible=self.args.destroy_incompatible,
-                                              output_pipes=self.args.send_pipe, input_pipes=self.args.recv_pipe, decrypt=self.args.decrypt, encrypt=self.args.encrypt)
+                                              output_pipes=self.args.send_pipe, input_pipes=self.args.recv_pipe,
+                                              decrypt=self.args.decrypt, encrypt=self.args.encrypt)
            except Exception as e:
                fail_count = fail_count + 1
                source_dataset.error("FAILED: " + str(e))
                if self.args.debug:
                    raise

+        if self.args.progress:
+            self.clear_progress()
+
        target_path_dataset = ZfsDataset(target_node, self.args.target_path)
-        self.thin_missing_targets(target_dataset=target_path_dataset, used_target_datasets=target_datasets)
+        if not self.args.no_thinning:
+            self.thin_missing_targets(target_dataset=target_path_dataset, used_target_datasets=target_datasets)

        if self.args.destroy_missing is not None:
            self.destroy_missing_targets(target_dataset=target_path_dataset, used_target_datasets=target_datasets)
@ -285,11 +326,10 @@ class ZfsAutobackup:

    def thin_source(self, source_datasets):

-        if not self.args.no_thinning:
-            self.set_title("Thinning source")
+        self.set_title("Thinning source")

-            for source_dataset in source_datasets:
-                source_dataset.thin(skip_holds=True)
+        for source_dataset in source_datasets:
+            source_dataset.thin(skip_holds=True)

    def filter_replicated(self, datasets):
        if not self.args.ignore_replicated:
@ -337,11 +377,12 @@ class ZfsAutobackup:
            if self.args.test:
                self.verbose("TEST MODE - SIMULATING WITHOUT MAKING ANY CHANGES")

+            ################ create source zfsNode
            self.set_title("Source settings")

            description = "[Source]"
            if self.args.no_thinning:
-                source_thinner=None
+                source_thinner = None
            else:
                source_thinner = Thinner(self.args.keep_source)
            source_node = ZfsNode(self.args.backup_name, self, ssh_config=self.args.ssh_config,
@ -352,8 +393,24 @@ class ZfsAutobackup:
                "'autobackup:{}=child')".format(
                    self.args.backup_name, self.args.backup_name))

+            ################# select source datasets
            self.set_title("Selecting")
-            selected_source_datasets = source_node.selected_datasets
+
+            # Note: Before version v3.1-beta5, we always used exclude_received. This was a problem if you wanted to replicate an existing backup to another host and use the same backup name/snapshots.
+            exclude_paths = []
+            exclude_received=self.args.exclude_received
+            if self.args.ssh_source == self.args.ssh_target:
+                if self.args.target_path:
+                    # target and source are the same, make sure to exclude target_path
+                    source_node.verbose("NOTE: Source and target are on the same host, excluding target-path")
+                    exclude_paths.append(self.args.target_path)
+                else:
+                    source_node.verbose("NOTE: Source and target are on the same host, excluding received datasets.")
+                    exclude_received=True
+
+
+            selected_source_datasets = source_node.selected_datasets(exclude_received=exclude_received,
+                                                                     exclude_paths=exclude_paths)
            if not selected_source_datasets:
                self.error(
                    "No source filesystems selected, please do a 'zfs set autobackup:{0}=true' on the source datasets "
@ -364,18 +421,20 @@ class ZfsAutobackup:
            # filter out already replicated stuff?
            source_datasets = self.filter_replicated(selected_source_datasets)

+            ################# snapshotting
            if not self.args.no_snapshot:
                self.set_title("Snapshotting")
                source_node.consistent_snapshot(source_datasets, source_node.new_snapshotname(),
                                                min_changed_bytes=self.args.min_change)

+            ################# sync
            # if target is specified, we sync the datasets, otherwise we just thin the source. (e.g. snapshot mode)
            if self.args.target_path:

                # create target_node
                self.set_title("Target settings")
                if self.args.no_thinning:
-                    target_thinner=None
+                    target_thinner = None
                else:
                    target_thinner = Thinner(self.args.keep_target)
                target_node = ZfsNode(self.args.backup_name, self, ssh_config=self.args.ssh_config,
@ -390,7 +449,7 @@ class ZfsAutobackup:
                # check if exists, to prevent vague errors
                target_dataset = ZfsDataset(target_node, self.args.target_path)
                if not target_dataset.exists:
-                    raise(Exception(
+                    raise (Exception(
                        "Target path '{}' does not exist. Please create this dataset first.".format(target_dataset)))

                # do the actual sync
@ -400,9 +459,10 @@ class ZfsAutobackup:
                    source_datasets=source_datasets,
                    target_node=target_node)

-            #no target specified, run in snapshot-only mode
+            # no target specified, run in snapshot-only mode
            else:
-                self.thin_source(source_datasets)
+                if not self.args.no_thinning:
+                    self.thin_source(source_datasets)
                fail_count = 0

            if not fail_count:
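
The selection step added to `ZfsAutobackup.py` above decides how to keep a backup from selecting its own target when source and target are on the same host. A minimal sketch of that decision as a standalone function, assuming an `args` object with the same attribute names as in the diff; `choose_exclusions` is a hypothetical name used only for illustration and is not part of zfs-autobackup:

```python
from argparse import Namespace


def choose_exclusions(args):
    """Return (exclude_received, exclude_paths), mirroring the logic in the diff.

    Hypothetical helper: the real code does this inline in ZfsAutobackup.
    """
    exclude_paths = []
    exclude_received = args.exclude_received

    # Only relevant when source and target are reached the same way (same host).
    if args.ssh_source == args.ssh_target:
        if args.target_path:
            # A target path is known: exclude it by path, so datasets with a
            # "received" property source elsewhere can still be re-replicated.
            exclude_paths.append(args.target_path)
        else:
            # No target path (snapshot-only mode): fall back to excluding
            # everything whose property source is "received".
            exclude_received = True

    return exclude_received, exclude_paths


# Same host, target path given: filter on the path only.
same_host = Namespace(exclude_received=False, ssh_source=None,
                      ssh_target=None, target_path="backup/pool")
print(choose_exclusions(same_host))

# Different hosts: nothing is excluded unless the user asked for it.
remote = Namespace(exclude_received=False, ssh_source=None,
                   ssh_target="backuphost", target_path="backup/pool")
print(choose_exclusions(remote))
```

Keeping the decision in one small function like this would make it straightforward to unit-test the same-host and remote cases separately.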
--- a/zfs_autobackup/ZfsDataset.py
+++ b/zfs_autobackup/ZfsDataset.py
@ -1,8 +1,8 @@
 import re
-import subprocess
 import time

 from zfs_autobackup.CachedProperty import CachedProperty
+from zfs_autobackup.ExecuteNode import ExecuteError


 class ZfsDataset:
@ -112,15 +112,16 @@ class ZfsDataset:
        """true if this dataset is a snapshot"""
        return self.name.find("@") != -1

-    def is_selected(self, value, source, inherited, ignore_received):
+    def is_selected(self, value, source, inherited, exclude_received, exclude_paths):
        """determine if dataset should be selected for backup (called from
        ZfsNode)

        Args:
+            :type exclude_paths: list of str
            :type value: str
            :type source: str
            :type inherited: bool
-            :type ignore_received: bool
+            :type exclude_received: bool
        """

        # sanity checks
@ -128,22 +129,30 @@ class ZfsDataset:
            # probably a program error in zfs-autobackup or new feature in zfs
            raise (Exception(
                "{} autobackup-property has illegal source: '{}' (possible BUG)".format(self.name, source)))
+
        if value not in ["false", "true", "child", "-"]:
            # user error
            raise (Exception(
                "{} autobackup-property has illegal value: '{}'".format(self.name, value)))

+        # our path starts with one of the excluded paths?
+        for exclude_path in exclude_paths:
+            if self.name.startswith(exclude_path):
+                # too noisy for verbose
+                self.debug("Excluded (in exclude list)")
+                return False
+
        # now determine if its actually selected
        if value == "false":
-            self.verbose("Ignored (disabled)")
+            self.verbose("Excluded (disabled)")
            return False
        elif value == "true" or (value == "child" and inherited):
            if source == "local":
                self.verbose("Selected")
                return True
            elif source == "received":
-                if ignore_received:
-                    self.verbose("Ignored (local backup)")
+                if exclude_received:
+                    self.verbose("Excluded (dataset already received)")
                    return False
                else:
                    self.verbose("Selected")
@ -250,7 +259,7 @@ class ZfsDataset:
            self.invalidate()
            self.force_exists = False
            return True
-        except subprocess.CalledProcessError:
+        except ExecuteError:
            if not fail_exception:
                return False
            else:
@ -563,7 +572,6 @@ class ZfsDataset:

        return self.zfs_node.run(cmd, pipe=True, readonly=True)

-
    def recv_pipe(self, pipe, features, filter_properties=None, set_properties=None, ignore_exit_code=False):
        """starts a zfs recv for this snapshot and uses pipe as input

@ -976,7 +984,6 @@ class ZfsDataset:
            :type ignore_recv_exit_code: bool
            :type holds: bool
            :type rollback: bool
-            :type raw: bool
            :type decrypt: bool
            :type also_other_snapshots: bool
            :type no_send: bool
--- a/zfs_autobackup/ZfsNode.py
+++ b/zfs_autobackup/ZfsNode.py
@ -10,6 +10,7 @@ from zfs_autobackup.Thinner import Thinner
 from zfs_autobackup.CachedProperty import CachedProperty
 from zfs_autobackup.ZfsPool import ZfsPool
 from zfs_autobackup.ZfsDataset import ZfsDataset
+from zfs_autobackup.ExecuteNode import ExecuteError


 class ZfsNode(ExecuteNode):
@ -81,7 +82,7 @@ class ZfsNode(ExecuteNode):

        try:
            self.run(cmd, hide_errors=True, valid_exitcodes=[0, 1])
-        except subprocess.CalledProcessError:
+        except ExecuteError:
            return False

        return True
@ -127,9 +128,8 @@ class ZfsNode(ExecuteNode):
                        bytes_left = self._progress_total_bytes - bytes_
                        minutes_left = int((bytes_left / (bytes_ / (time.time() - self._progress_start_time))) / 60)

-                        print(">>> {}% {}MB/s (total {}MB, {} minutes left)     \r".format(percentage, speed, int(
-                            self._progress_total_bytes / (1024 * 1024)), minutes_left), end='', file=sys.stderr)
-                        sys.stderr.flush()
+                        self.logger.progress("Transfer {}% {}MB/s (total {}MB, {} minutes left)".format(percentage, speed, int(
+                            self._progress_total_bytes / (1024 * 1024)), minutes_left))

            return

@ -197,8 +197,7 @@ class ZfsNode(ExecuteNode):
            self.verbose("Creating snapshots {} in pool {}".format(snapshot_name, pool_name))
            self.run(cmd, readonly=False)

-    @CachedProperty
-    def selected_datasets(self, ignore_received=True):
+    def selected_datasets(self, exclude_received, exclude_paths):
        """determine filesystems that should be backupped by looking at the special autobackup-property, systemwide

           returns: list of ZfsDataset
@ -233,7 +232,7 @@ class ZfsNode(ExecuteNode):
                source = raw_source

            # determine it
-            if dataset.is_selected(value=value, source=source, inherited=inherited, ignore_received=ignore_received):
+            if dataset.is_selected(value=value, source=source, inherited=inherited, exclude_received=exclude_received, exclude_paths=exclude_paths):
                selected_filesystems.append(dataset)

        return selected_filesystems
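
The `exclude_paths` check in `is_selected` compares names with a plain `str.startswith`. That also matches sibling datasets that merely share a name prefix, for example `backup/pool2` when `backup/pool` is excluded. Whether this matters depends on the pool layout. A possible stricter comparison is sketched below; `is_under_path` is a hypothetical helper, not something that exists in zfs-autobackup, and it assumes dataset names are plain `/`-separated paths without a snapshot suffix:

```python
def is_under_path(name, path):
    """True if dataset `name` is `path` itself or one of its descendants.

    Hypothetical helper for illustration only.
    """
    path = path.rstrip("/")
    # Requiring the "/" separator prevents "backup/pool2" from matching "backup/pool".
    return name == path or name.startswith(path + "/")


excluded = "backup/pool"
for name in ["backup/pool", "backup/pool/data", "backup/pool2"]:
    # Compare the plain prefix test with the separator-aware test.
    print(name, name.startswith(excluded), is_under_path(name, excluded))
```

For the third name the two tests disagree: the plain prefix test treats `backup/pool2` as excluded, while the separator-aware test does not.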
Author	SHA1	Message	Date
Edwin Eefting	8ea178af1f	test re-replication	2021-05-03 00:03:22 +02:00
Edwin Eefting	3e39e1553e	allow re-replication of a backup with the same name. (now filters on target_path instead of received-status when selecting when appropriate. also shows notes about this)	2021-05-02 22:51:20 +02:00
Edwin Eefting	f0cc2bca2a	improved progress reporting. improved no_thinning performance	2021-04-23 20:31:37 +02:00
Edwin Eefting	59b0c23a20	Merge branch 'master' of github.com:psy0rz/zfs_autobackup	2021-04-22 01:16:53 +02:00
Edwin Eefting	401a3f73cc	better handling of piped exit codes	2021-04-22 01:12:41 +02:00
Edwin Eefting	8ec5ed2f4f	extra test	2021-04-22 00:14:14 +02:00
DatuX	8318b2f9bf	Update README.md	2021-04-21 00:19:21 +02:00
Edwin Eefting	72b97ab2e8	doc. bump version	2021-04-21 00:04:58 +02:00