added console reference

nicer errors
refactored ZfsCheck.py for better SIGPIPE handling
2022-03-08 17:51:23 +01:00 · 2022-03-08 17:35:51 +01:00 · 2022-03-08 17:22:08 +01:00 · 2022-03-07 23:11:46 +01:00 · 2022-03-07 22:59:50 +01:00 · 2022-03-07 21:57:36 +01:00
40 changed files with 2553 additions and 1110 deletions
--- a/.github/FUNDING.yml
+++ b/.github/FUNDING.yml
@@ -0,0 +1,5 @@
+# These are supported funding model platforms
+
+github: psy0rz
+ko_fi: psy0rz
+custom: https://paypal.me/psy0rz
--- a/.github/workflows/codeql-analysis.yml
+++ b/.github/workflows/codeql-analysis.yml
@@ -0,0 +1,70 @@
+# For most projects, this workflow file will not need changing; you simply need
+# to commit it to your repository.
+#
+# You may wish to alter this file to override the set of languages analyzed,
+# or to provide custom queries or build logic.
+#
+# ******** NOTE ********
+# We have attempted to detect the languages in your repository. Please check
+# the `language` matrix defined below to confirm you have the correct set of
+# supported CodeQL languages.
+#
+name: "CodeQL"
+
+on:
+  push:
+    branches: [ master ]
+  pull_request:
+    # The branches below must be a subset of the branches above
+    branches: [ master ]
+  schedule:
+    - cron: '26 23 * * 3'
+
+jobs:
+  analyze:
+    name: Analyze
+    runs-on: ubuntu-latest
+    permissions:
+      actions: read
+      contents: read
+      security-events: write
+
+    strategy:
+      fail-fast: false
+      matrix:
+        language: [ 'python' ]
+        # CodeQL supports [ 'cpp', 'csharp', 'go', 'java', 'javascript', 'python', 'ruby' ]
+        # Learn more about CodeQL language support at https://git.io/codeql-language-support
+
+    steps:
+    - name: Checkout repository
+      uses: actions/checkout@v2
+
+    # Initializes the CodeQL tools for scanning.
+    - name: Initialize CodeQL
+      uses: github/codeql-action/init@v1
+      with:
+        languages: ${{ matrix.language }}
+        # If you wish to specify custom queries, you can do so here or in a config file.
+        # By default, queries listed here will override any specified in a config file.
+        # Prefix the list here with "+" to use these queries and those in the config file.
+        # queries: ./path/to/local/query, your-org/your-repo/queries@main
+
+    # Autobuild attempts to build any compiled languages  (C/C++, C#, or Java).
+    # If this step fails, then you should remove it and run the build manually (see below)
+    - name: Autobuild
+      uses: github/codeql-action/autobuild@v1
+
+    # ℹ️ Command-line programs to run using the OS shell.
+    # 📚 https://git.io/JvXDl
+
+    # ✏️ If the Autobuild fails above, remove it and uncomment the following three lines
+    #    and modify them (or add more) to build your code if your project
+    #    uses a compiled language
+
+    #- run: |
+    #   make bootstrap
+    #   make release
+
+    - name: Perform CodeQL Analysis
+      uses: github/codeql-action/analyze@v1
--- a/README.md
+++ b/README.md
@@ -2,6 +2,7 @@
 # ZFS autobackup

 [![Tests](https://github.com/psy0rz/zfs_autobackup/workflows/Regression%20tests/badge.svg)](https://github.com/psy0rz/zfs_autobackup/actions?query=workflow%3A%22Regression+tests%22) [![Coverage Status](https://coveralls.io/repos/github/psy0rz/zfs_autobackup/badge.svg)](https://coveralls.io/github/psy0rz/zfs_autobackup)  [![Python Package](https://github.com/psy0rz/zfs_autobackup/workflows/Upload%20Python%20Package/badge.svg)](https://pypi.org/project/zfs-autobackup/)
+[![CodeQL](https://github.com/psy0rz/zfs_autobackup/actions/workflows/codeql-analysis.yml/badge.svg)](https://github.com/psy0rz/zfs_autobackup/actions/workflows/codeql-analysis.yml)

 ## Introduction

@@ -13,764 +14,46 @@ You can select what to backup by setting a custom `ZFS property`. This makes it

 Other settings are just specified on the commandline: Simply setup and test your zfs-autobackup command and  fix all the issues you might encounter. When you're done you can just copy/paste your command to a cron or script.

-Since its using ZFS commands, you can see what it's actually doing by specifying `--debug`. This also helps a lot if you run into some strange problem or error. You can just copy-paste the command that fails and play around with it on the commandline. (something I missed in other tools)
+Since it's using ZFS commands, you can see what it's actually doing by specifying `--debug`. This also helps a lot if you run into some strange problem or error. You can just copy-paste the command that fails and play around with it on the commandline. (something I missed in other tools)

-An important feature thats missing from other tools is a reliable `--test` option: This allows you to see what zfs-autobackup will do and tune your parameters. It will do everything, except make changes to your system.
+An important feature that's missing from other tools is a reliable `--test` option: This allows you to see what zfs-autobackup will do and tune your parameters. It will do everything, except make changes to your system.

 ## Features

 * Works across operating systems: Tested with **Linux**, **FreeBSD/FreeNAS** and **SmartOS**.
 * Low learning curve: no complex daemons or services, no additional software or networking needed. (Only read this page)   
 * Plays nicely with existing replication systems. (Like Proxmox HA)
-* Automatically selects filesystems to backup by looking at a simple ZFS property. (recursive)
+* Automatically selects filesystems to backup by looking at a simple ZFS property. 
 * Creates consistent snapshots. (takes all snapshots at once, atomicly.)
 * Multiple backups modes:
  * Backup local data on the same server.
  * "push" local data to a backup-server via SSH.
  * "pull" remote data from a server via SSH and backup it locally.
-  * Or even pull data from a server while pushing the backup to another server. (Zero trust between source and target server)
-* Can be scheduled via a simple cronjob or run directly from commandline.
-* Supports resuming of interrupted transfers.
+  * "pull+push": Zero trust between source and target.
+* Can be scheduled via simple cronjob or run directly from commandline.
 * ZFS encryption support: Can decrypt / encrypt or even re-encrypt datasets during transfer.
 * Supports sending with compression. (Using pigz, zstd etc)
 * IO buffering to speed up transfer.
 * Bandwidth rate limiting.
 * Multiple backups from and to the same datasets are no problem.
-* Creates the snapshot before doing anything else. (assuring you at least have a snapshot if all else fails)
-* Checks everything but tries continue on non-fatal errors when possible. (Reports error-count when done)
+* Resillient to errors.
 * Ability to manually 'finish' failed backups to see whats going on.
 * Easy to debug and has a test-mode. Actual unix commands are printed.
-* Uses **progressive thinning** for older snapshots.
-* Uses zfs-holds on important snapshots so they cant be accidentally destroyed.
+* Uses progressive thinning for older snapshots.
+* Uses zfs-holds on important snapshots to prevent accidental deletion.
 * Automatic resuming of failed transfers.
-* Can continue from existing common snapshots. (e.g. easy migration)
+* Easy migration from other zfs backup systems to zfs-autobackup.
 * Gracefully handles datasets that no longer exist on source.
-* Support for ZFS sending/receiving through custom pipes.
+* Complete and clean logging. 
 * Easy installation:
  * Just install zfs-autobackup via pip.
  * Only needs to be installed on one side.
  * Written in python and uses zfs-commands, no special 3rd party dependency's or compiled libraries needed.
-  * No separate config files or properties. Just one zfs-autobackup command you can copy/paste in your backup script.
+  * No annoying config files or properties. 

-## Installation
+## Getting started

-You only need to install zfs-autobackup on the side that initiates the backup. The other side doesnt need any extra configration.
-
-### Using pip
-
-The recommended way on most servers is to use [pip](https://pypi.org/project/zfs-autobackup/):
-
-```console
-[root@server ~]# pip install --upgrade zfs-autobackup
-```
-
-This can also be used to upgrade zfs-autobackup to the newest stable version.
-
-To install the latest beta version add the `--pre` option.
-
-### Using easy_install
-
-On older servers you might have to use easy_install
-
-```console
-[root@server ~]# easy_install zfs-autobackup
-```
-
-## Example
-
-In this example we're going to backup a machine called `server1` to a machine called `backup`.
-
-### Setup SSH login
-
-zfs-autobackup needs passwordless login via ssh. This means generating an ssh key and copying it to the remote server.
-
-#### Generate SSH key on `backup`
-
-On the backup-server that runs zfs-autobackup you need to create an SSH key. You only need to do this once.
-
-Use the `ssh-keygen` command and leave the passphrase empty:
-
-```console
-root@backup:~# ssh-keygen
-Generating public/private rsa key pair.
-Enter file in which to save the key (/root/.ssh/id_rsa):
-Enter passphrase (empty for no passphrase):
-Enter same passphrase again:
-Your identification has been saved in /root/.ssh/id_rsa.
-Your public key has been saved in /root/.ssh/id_rsa.pub.
-The key fingerprint is:
-SHA256:McJhCxvaxvFhO/3e8Lf5gzSrlTWew7/bwrd2U2EHymE root@backup
-The key's randomart image is:
-+---[RSA 2048]----+
-|    + =          |
-|   + X *    E .  |
-|  . = B +  o o . |
-|   .   o +  o  o.|
-|        S o   .oo|
-|         . + o= +|
-|          . ++==.|
-|            .+o**|
-|           .. +B@|
-+----[SHA256]-----+
-root@backup:~#
-```
-
-#### Copy SSH key to `server1`
-
-Now you need to copy the public part of the key to `server1`
-
-The `ssh-copy-id` command is a handy tool to automate this. It will just ask for your password.
-
-```console
-root@backup:~# ssh-copy-id root@server1.server.com
-/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
-/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
-/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
-Password:
-
-Number of key(s) added: 1
-
-Now try logging into the machine, with:   "ssh 'root@server1.server.com'"
-and check to make sure that only the key(s) you wanted were added.
-
-root@backup:~#
-```
-This allows the backup-server to login to `server1` as root without password.
-
-### Select filesystems to backup
-
-Its important to choose a unique and consistent backup name. In this case we name our backup: `offsite1`.
-
-On the source zfs system set the ```autobackup:offsite1``` zfs property to true:
-
-```console
-[root@server1 ~]# zfs set autobackup:offsite1=true rpool
-[root@server1 ~]# zfs get -t filesystem,volume autobackup:offsite1
-NAME                                    PROPERTY             VALUE                SOURCE
-rpool                                   autobackup:offsite1  true                 local
-rpool/ROOT                              autobackup:offsite1  true                 inherited from rpool
-rpool/ROOT/server1-1                    autobackup:offsite1  true                 inherited from rpool
-rpool/data                              autobackup:offsite1  true                 inherited from rpool
-rpool/data/vm-100-disk-0                autobackup:offsite1  true                 inherited from rpool
-rpool/swap                              autobackup:offsite1  true                 inherited from rpool
-...
-```
-
-ZFS properties are ```inherited``` by child datasets. Since we've set the property on the highest dataset, we're essentially backupping the whole pool.
-
-Because we don't want to backup everything, we can exclude certain filesystem by setting the property to false:
-
-```console
-[root@server1 ~]# zfs set autobackup:offsite1=false rpool/swap
-[root@server1 ~]# zfs get -t filesystem,volume autobackup:offsite1
-NAME                                    PROPERTY             VALUE                SOURCE
-rpool                                   autobackup:offsite1  true                 local
-rpool/ROOT                              autobackup:offsite1  true                 inherited from rpool
-rpool/ROOT/server1-1                    autobackup:offsite1  true                 inherited from rpool
-rpool/data                              autobackup:offsite1  true                 inherited from rpool
-rpool/data/vm-100-disk-0                autobackup:offsite1  true                 inherited from rpool
-rpool/swap                              autobackup:offsite1  false                local
-...
-```
-
-The autobackup-property can have 3 values:
- * ```true```: Backup the dataset and all its children 
- * ```false```: Dont backup the dataset and all its children. (used to exclude certain datasets)
- * ```child```: Only backup the children off the dataset, not the dataset itself.
-
-Only use the zfs-command to set these properties, not the zpool command. 
-
-### Running zfs-autobackup
-
-Run the script on the backup server and pull the data from the server specified by --ssh-source.
-
-```console
-[root@backup ~]# zfs-autobackup --ssh-source server1.server.com offsite1 backup/server1 --progress --verbose
-
-  #### Settings summary
-  [Source] Datasets on: server1.server.com
-  [Source] Keep the last 10 snapshots.
-  [Source] Keep every 1 day, delete after 1 week.
-  [Source] Keep every 1 week, delete after 1 month.
-  [Source] Keep every 1 month, delete after 1 year.
-  [Source] Send all datasets that have 'autobackup:offsite1=true' or 'autobackup:offsite1=child'
-
-  [Target] Datasets are local
-  [Target] Keep the last 10 snapshots.
-  [Target] Keep every 1 day, delete after 1 week.
-  [Target] Keep every 1 week, delete after 1 month.
-  [Target] Keep every 1 month, delete after 1 year.
-  [Target] Receive datasets under: backup/server1
-
-  #### Selecting
-  [Source] rpool: Selected (direct selection)
-  [Source] rpool/ROOT: Selected (inherited selection)
-  [Source] rpool/ROOT/server1-1: Selected (inherited selection)
-  [Source] rpool/data: Selected (inherited selection)
-  [Source] rpool/data/vm-100-disk-0: Selected (inherited selection)
-  [Source] rpool/swap: Ignored (disabled)
-
-  #### Snapshotting
-  [Source] rpool: No changes since offsite1-20200218175435
-  [Source] rpool/ROOT: No changes since offsite1-20200218175435
-  [Source] rpool/data: No changes since offsite1-20200218175435
-  [Source] Creating snapshot offsite1-20200218180123
-
-  #### Sending and thinning
-  [Target] backup/server1/rpool/ROOT/server1-1@offsite1-20200218175435: receiving full
-  [Target] backup/server1/rpool/ROOT/server1-1@offsite1-20200218175547: receiving incremental
-  [Target] backup/server1/rpool/ROOT/server1-1@offsite1-20200218175706: receiving incremental
-  [Target] backup/server1/rpool/ROOT/server1-1@offsite1-20200218180049: receiving incremental
-  [Target] backup/server1/rpool/ROOT/server1-1@offsite1-20200218180123: receiving incremental
-  [Target] backup/server1/rpool/data@offsite1-20200218175435: receiving full
-  [Target] backup/server1/rpool/data/vm-100-disk-0@offsite1-20200218175435: receiving full
-  ...
-```
-
-Note that this is called a "pull" backup: The backup server pulls the backup from the server. This is usually the preferred way.
-
-Its also possible to let a server push its backup to the backup-server. However this has security implications. In that case you would setup the SSH keys the other way around and use the --ssh-target parameter on the server.
-
-### Automatic backups
-
-Now every time you run the command, zfs-autobackup will create a new snapshot and replicate your data.
-
-Older snapshots will eventually be deleted, depending on the `--keep-source` and `--keep-target` settings. (The defaults are shown above under the 'Settings summary')
-
-Once you've got the correct settings for your situation, you can just store the command in a cronjob.
-
-Or just create a script and run it manually when you need it.
-
-## Use as snapshot tool
-
-You can use zfs-autobackup to only make snapshots.
-
-Just dont specify the target-path:
-```console
-root@ws1:~# zfs-autobackup test --verbose
-  zfs-autobackup v3.0 - Copyright 2020 E.H.Eefting (edwin@datux.nl)
-
-  #### Source settings
-  [Source] Datasets are local
-  [Source] Keep the last 10 snapshots.
-  [Source] Keep every 1 day, delete after 1 week.
-  [Source] Keep every 1 week, delete after 1 month.
-  [Source] Keep every 1 month, delete after 1 year.
-  [Source] Selects all datasets that have property 'autobackup:test=true' (or childs of datasets that have 'autobackup:test=child')
-
-  #### Selecting
-  [Source] test_source1/fs1: Selected (direct selection)
-  [Source] test_source1/fs1/sub: Selected (inherited selection)
-  [Source] test_source2/fs2: Ignored (only childs)
-  [Source] test_source2/fs2/sub: Selected (inherited selection)
-
-  #### Snapshotting
-  [Source] Creating snapshots test-20200710125958 in pool test_source1
-  [Source] Creating snapshots test-20200710125958 in pool test_source2
-
-  #### Thinning source
-  [Source] test_source1/fs1@test-20200710125948: Destroying
-  [Source] test_source1/fs1/sub@test-20200710125948: Destroying
-  [Source] test_source2/fs2/sub@test-20200710125948: Destroying
-
-  #### All operations completed successfully
-  (No target_path specified, only operated as snapshot tool.)
-```
-
-This also allows you to make several snapshots during the day, but only backup the data at night when the server is not busy.
-
-**Note**: In this mode it doesnt take a specified target-schedule into account when thinning, it only knows a snapshot is the common snapshot by looking at the holds. So make sure your source-schedule keeps the snapshots you still want to transfer at a later point.
-
-## Thinning out obsolete snapshots
-
-The thinner is the thing that destroys old snapshots on the source and target.
-
-The thinner operates "stateless": There is nothing in the name or properties of a snapshot that indicates how long it will be kept. Everytime zfs-autobackup runs, it will look at the timestamp of all the existing snapshots. From there it will determine which snapshots are obsolete according to your schedule. The advantage of this stateless system is that you can always change the schedule.
-
-Note that the thinner will ONLY destroy snapshots that are matching the naming pattern of zfs-autobackup. If you use `--other-snapshots`, it wont destroy those snapshots after replicating them to the target.
-
-### Destroying missing datasets
-
-When a dataset has been destroyed or deselected on the source, but still exists on the target we call it a missing dataset. Missing datasets will be still thinned out according to the schedule.
-
-The final snapshot will never be destroyed, unless you specify a **deadline** with the `--destroy-missing` option:
-
-In that case it will look at the last snapshot we took and determine if is older than the deadline you specified. e.g: `--destroy-missing 30d` will start destroying things 30 days after the last snapshot.
-
-#### After the deadline
-
-When the deadline is passed, all our snapshots, except the last one will be destroyed. Irregardless of the normal thinning schedule.
-
-The dataset has to have the following properties to be finally really destroyed:
-
-* The dataset has no direct child-filesystems or volumes.
-* The only snapshot left is the last one created by zfs-autobackup.
-* The remaining snapshot has no clones.
-
-### Thinning schedule
-
-The default thinning schedule is: `10,1d1w,1w1m,1m1y`.
-
-The schedule consists of multiple rules separated by a `,`
-
-A plain number specifies how many snapshots you want to always keep, regardless of time or interval.
-
-The format of the other rules is: `<Interval><TTL>`.
-
-* Interval: The minimum interval between the snapshots. Snapshots with intervals smaller than this will be destroyed.
-* TTL: The maximum time to life time of a snapshot, after that they will be destroyed.
-* These are the time units you can use for interval and TTL:
-  * `y`: Years
-  * `m`: Months
-  * `d`: Days
-  * `h`: Hours
-  * `min`: Minutes
-  * `s`: Seconds
-
-Since this might sound very complicated, the `--verbose` option will show you what it all means:
-
-```console
-  [Source] Keep the last 10 snapshots.
-  [Source] Keep every 1 day, delete after 1 week.
-  [Source] Keep every 1 week, delete after 1 month.
-  [Source] Keep every 1 month, delete after 1 year.
-```
-
-A snapshot will only be destroyed if it not needed anymore by ANY of the rules.
-
-You can specify as many rules as you need. The order of the rules doesn't matter.
-
-Keep in mind its up to you to actually run zfs-autobackup often enough: If you want to keep hourly snapshots, you have to make sure you at least run it every hour.
-
-However, its no problem if you run it more or less often than that: The thinner will still keep an optimal set of snapshots to match your schedule as good as possible.
-
-If you want to keep as few snapshots as possible, just specify 0. (`--keep-source=0` for example)
-
-If you want to keep ALL the snapshots, just specify a very high number.
-
-### More details about the Thinner
-
-We will give a practical example of how the thinner operates.
-
-Say we want have 3 thinner rules:
-
-* We want to keep daily snapshots for 7 days.
-* We want to keep weekly snapshots for 4 weeks.
-* We want to keep monthly snapshots for 12 months.
-
-So far we have taken 4 snapshots at random moments:
-
-![thinner example](https://raw.githubusercontent.com/psy0rz/zfs_autobackup/master/doc/thinner.png)
-
-For every rule, the thinner will divide the timeline in blocks and assign each snapshot to a block.
-
-A block can only be assigned one snapshot: If multiple snapshots fall into the same block, it only assigns it to the oldest that we want to keep.
-
-The colors show to which block a snapshot belongs:
-
-* Snapshot 1: This snapshot belongs to daily block 1, weekly block 0 and monthly block 0. However the daily block is too old.
-* Snapshot 2: Since weekly block 0 and monthly block 0 already have a snapshot, it only belongs to daily block 4.
-* Snapshot 3: This snapshot belongs to daily block 8 and weekly block 1.
-* Snapshot 4: Since daily block 8 already has a snapshot, this one doesn't belong to anything and can be deleted right away. (it will be keeped for now since its the last snapshot)
-
-zfs-autobackup will re-evaluate this on every run: As soon as a snapshot doesn't belong to any block anymore it will be destroyed.
-
-Snapshots on the source that still have to be send to the target wont be destroyed off course. (If the target still wants them, according to the target schedule)
-
-## How zfs-autobackup handles encryption
-
-In normal operation datasets are transferred unaltered:
-
-* Source datasets that are encrypted will be send over as such and stay encrypted at the target side. (In ZFS this is called raw-mode) You dont need keys at the target side if you dont want to access the data.
-* Source datasets that are plain will stay that way on the target. (Even if the specified target-path IS encrypted.) 
-
-Basically you dont have to do anything or worry about anything. 
-
-### Decrypting/encrypting
-
-Things get different if you want to change the encryption-state of a dataset during transfer:
-
-* If you want to decrypt encrypted datasets before sending them, you should use the `--decrypt` option. Datasets will then be stored plain at the target.
-* If you want to encrypt plain datasets when they are received, you should use the `--encrypt` option. Datasets will then be stored encrypted at the target. (Datasets that are already encrypted will still be sent over unaltered in raw-mode.) 
-* If you also want re-encrypt encrypted datasets with the target-side encryption you can use both options. 
-
-Note 1: The --encrypt option will rely on inheriting encryption parameters from the parent datasets on the target side. You are responsible for setting those up and loading the keys. So --encrypt is no guarantee for encryption: If you dont set it up, it cant encrypt.
-
-Note 2: Decide what you want at an early stage: If you change the --encrypt or --decrypt parameter after the inital sync you might get weird and wonderfull errors. (nothing dangerous)
-
-**Some common errors while using zfs encryption:**
-
-```
-cannot receive incremental stream: kernel modules must be upgraded to receive this stream.
-```
-
-This happens if you forget to use --encrypt, while the target datasets are already encrypted. (Very strange error message indeed)
-
-
-## Transfer buffering, compression and rate limiting.
-
-If you're transferring over a slow link it might be useful to use `--compress=zstd-fast`. This will compress the data before sending, so it uses less bandwidth. An alternative to this is to use --zfs-compressed: This will transfer blocks that already have compression intact. (--compress will usually compress much better but uses much more resources. --zfs-compressed uses the least resources, but can be a disadvantage if you want to use a different compression method on the target.)
-
-You can also limit the datarate by using the `--rate` option.
-
-The `--buffer` option might also help since it acts as an IO buffer: zfs send can vary wildly between completely idle and huge bursts of data. When zfs send is idle, the buffer will continue transferring data over the slow link.
-
-It's also possible to add custom send or receive pipes with `--send-pipe` and `--recv-pipe`.
-
-These options all work together and the buffer on the receiving side is only added if appropriate. When all options are active:
-
-#### On the sending side:
-
-zfs send -> send buffer -> custom send pipes -> compression -> transfer rate limiter
-
-#### On the receiving side:
-decompression -> custom recv pipes -> buffer -> zfs recv
-
-## Running custom commands before and after snapshotting
-
-You can run commands before and after the snapshot to freeze databases to make the on for example to make the on-disk data consistent before snapshotting.
-
-The commands will be executed on the source side. Use the `--pre-snapshot-cmd` and `--post-snapshot-cmd` options for this.
-
-For example:
-
-```sh
-zfs-autobackup \
-    --pre-snapshot-cmd 'daemon -f jexec mysqljail1 mysql -s -e "set autocommit=0;flush logs;flush tables with read lock;\\! echo \$\$ > /tmp/mysql_lock.pid && sleep 60"' \
-    --pre-snapshot-cmd 'daemon -f jexec mysqljail2 mysql -s -e "set autocommit=0;flush logs;flush tables with read lock;\\! echo \$\$ > /tmp/mysql_lock.pid && sleep 60"' \
-    --post-snapshot-cmd 'pkill -F /jails/mysqljail1/tmp/mysql_lock.pid' \
-    --post-snapshot-cmd 'pkill -F /jails/mysqljail2/tmp/mysql_lock.pid' \
-    backupfs1
-```
-
-Failure handling during pre/post commands:
-
-* If a pre-command fails, zfs-autobackup will exit with an error. (after executing the post-commands)
-* All post-commands are always executed. Even if the pre-commands or actual snapshot have failed. This way you can be sure that stuff is always cleanedup and unfreezed.
-
-## Tips
-
-* Use ```--debug``` if something goes wrong and you want to see the commands that are executed. This will also stop at the first error.
-* You can split up the snapshotting and sending tasks by creating two cronjobs. Create a separate snapshotter-cronjob by just omitting target-path.
-* Set the ```readonly``` property of the target filesystem to ```on```. This prevents changes on the target side. (Normally, if there are changes the next backup will fail and will require a zfs rollback.) Note that readonly means you cant change the CONTENTS of the dataset directly. Its still possible to receive new datasets and manipulate properties etc.
-* Use ```--clear-refreservation``` to save space on your backup server.
-* Use ```--clear-mountpoint``` to prevent the target server from mounting the backupped filesystem in the wrong place during a reboot.
-
-### Performance tips
-
-If you have a large number of datasets its important to keep the following tips in mind.
-
-Also it might help to use the --buffer option to add IO buffering during the data transfer. This might speed up things since it smooths out sudden IO bursts that are frequent during a zfs send or recv.
-
-#### Some statistics
-
-To get some idea of how fast zfs-autobackup is, I did some test on my laptop, with a SKHynix_HFS512GD9TNI-L2B0B disk. I'm using zfs 2.0.2.  
-
-I created 100 empty datasets and measured the total runtime of zfs-autobackup. I used all the performance tips below. (--no-holds, --allow-empty, ssh ControlMaster)
-
-* without ssh: 15 seconds. (>6 datasets/s)
-* either ssh-target or ssh-source=localhost: 20 seconds (5 datasets/s)
-* both ssh-target and ssh-source=localhost: 24 seconds (4 datasets/s)
-
-To be bold I created 2500 datasets, but that also was no problem. So it seems it should be possible to use zfs-autobackup with thousands of datasets.
-
-If you need more performance let me know.
-
-NOTE: There is actually a performance regression in ZFS version 2: https://github.com/openzfs/zfs/issues/11560 Use --no-progress as workaround.
-
-#### Less work
-
-You can make zfs-autobackup generate less work by using --no-holds and --allow-empty.
-
-This saves a lot of extra zfs-commands per dataset.
-
-#### Speeding up SSH
-
-You can make your ssh connections persistent and greatly speed up zfs-autobackup:
-
-On the backup-server add this to your ~/.ssh/config:
-
-```console
-Host *
-    ControlPath ~/.ssh/control-master-%r@%h:%p
-    ControlMaster auto
-    ControlPersist 3600
-```
-
-Thanks @mariusvw :)
-
-### Specifying ssh port or options
-
-The correct way to do this is by creating ~/.ssh/config:
-
-```console
-Host smartos04
-    Hostname 1.2.3.4
-    Port 1234
-    user root
-    Compression yes
-```
-
-This way you can just specify "smartos04" as host.
-
-Also uses compression on slow links.
-
-Look in man ssh_config for many more options.
-
-## Usage
-
-```console
-usage: zfs-autobackup [-h] [--ssh-config CONFIG-FILE] [--ssh-source USER@HOST]
-                   [--ssh-target USER@HOST] [--keep-source SCHEDULE]
-                   [--keep-target SCHEDULE] [--pre-snapshot-cmd COMMAND]
-                   [--post-snapshot-cmd COMMAND] [--other-snapshots]
-                   [--no-snapshot] [--no-send] [--no-thinning] [--no-holds]
-                   [--min-change BYTES] [--allow-empty] [--ignore-replicated]
-                   [--strip-path N] [--clear-refreservation]
-                   [--clear-mountpoint] [--filter-properties PROPERTY,...]
-                   [--set-properties PROPERTY=VALUE,...] [--rollback]
-                   [--destroy-incompatible] [--destroy-missing SCHEDULE]
-                   [--ignore-transfer-errors] [--decrypt] [--encrypt]
-                   [--zfs-compressed] [--test] [--verbose] [--debug]
-                   [--debug-output] [--progress] [--send-pipe COMMAND]
-                   [--recv-pipe COMMAND] [--compress TYPE] [--rate DATARATE]
-                   [--buffer SIZE]
-                   backup-name [target-path]
-
-zfs-autobackup v3.1 - (c)2021 E.H.Eefting (edwin@datux.nl)
-
-positional arguments:
-  backup-name           Name of the backup (you should set the zfs property
-                        "autobackup:backup-name" to true on filesystems you
-                        want to backup
-  target-path           Target ZFS filesystem (optional: if not specified,
-                        zfs-autobackup will only operate as snapshot-tool on
-                        source)
-
-optional arguments:
-  -h, --help            show this help message and exit
-  --ssh-config CONFIG-FILE
-                        Custom ssh client config
-  --ssh-source USER@HOST
-                        Source host to get backup from.
-  --ssh-target USER@HOST
-                        Target host to push backup to.
-  --keep-source SCHEDULE
-                        Thinning schedule for old source snapshots. Default:
-                        10,1d1w,1w1m,1m1y
-  --keep-target SCHEDULE
-                        Thinning schedule for old target snapshots. Default:
-                        10,1d1w,1w1m,1m1y
-  --pre-snapshot-cmd COMMAND
-                        Run COMMAND before snapshotting (can be used multiple
-                        times.
-  --post-snapshot-cmd COMMAND
-                        Run COMMAND after snapshotting (can be used multiple
-                        times.
-  --other-snapshots     Send over other snapshots as well, not just the ones
-                        created by this tool.
-  --no-snapshot         Don't create new snapshots (useful for finishing
-                        uncompleted backups, or cleanups)
-  --no-send             Don't send snapshots (useful for cleanups, or if you
-                        want a serperate send-cronjob)
-  --no-thinning         Do not destroy any snapshots.
-  --no-holds            Don't hold snapshots. (Faster. Allows you to destroy
-                        common snapshot.)
-  --min-change BYTES    Number of bytes written after which we consider a
-                        dataset changed (default 1)
-  --allow-empty         If nothing has changed, still create empty snapshots.
-                        (same as --min-change=0)
-  --ignore-replicated   Ignore datasets that seem to be replicated some other
-                        way. (No changes since lastest snapshot. Useful for
-                        proxmox HA replication)
-  --strip-path N        Number of directories to strip from target path (use 1
-                        when cloning zones between 2 SmartOS machines)
-  --clear-refreservation
-                        Filter "refreservation" property. (recommended, safes
-                        space. same as --filter-properties refreservation)
-  --clear-mountpoint    Set property canmount=noauto for new datasets.
-                        (recommended, prevents mount conflicts. same as --set-
-                        properties canmount=noauto)
-  --filter-properties PROPERTY,...
-                        List of properties to "filter" when receiving
-                        filesystems. (you can still restore them with zfs
-                        inherit -S)
-  --set-properties PROPERTY=VALUE,...
-                        List of propererties to override when receiving
-                        filesystems. (you can still restore them with zfs
-                        inherit -S)
-  --rollback            Rollback changes to the latest target snapshot before
-                        starting. (normally you can prevent changes by setting
-                        the readonly property on the target_path to on)
-  --destroy-incompatible
-                        Destroy incompatible snapshots on target. Use with
-                        care! (implies --rollback)
-  --destroy-missing SCHEDULE
-                        Destroy datasets on target that are missing on the
-                        source. Specify the time since the last snapshot, e.g:
-                        --destroy-missing 30d
-  --ignore-transfer-errors
-                        Ignore transfer errors (still checks if received
-                        filesystem exists. useful for acltype errors)
-  --decrypt             Decrypt data before sending it over.
-  --encrypt             Encrypt data after receiving it.
-  --zfs-compressed      Transfer blocks that already have zfs-compression as-
-                        is.
-  --test                dont change anything, just show what would be done
-                        (still does all read-only operations)
-  --verbose             verbose output
-  --debug               Show zfs commands that are executed, stops after an
-                        exception.
-  --debug-output        Show zfs commands and their output/exit codes. (noisy)
-  --progress            show zfs progress output. Enabled automaticly on ttys.
-                        (use --no-progress to disable)
-  --send-pipe COMMAND   pipe zfs send output through COMMAND (can be used
-                        multiple times)
-  --recv-pipe COMMAND   pipe zfs recv input through COMMAND (can be used
-                        multiple times)
-  --compress TYPE       Use compression during transfer, defaults to zstd-adapt
-                        if TYPE is not specified. (gzip, pigz-fast, pigz-slow,
-                        zstd-fast, zstd-slow, zstd-adapt, xz, lzo, lz4)
-  --rate DATARATE       Limit data transfer rate (e.g. 128K. requires
-                        mbuffer.)
-  --buffer SIZE         Add zfs send and recv buffers to smooth out IO bursts.
-                        (e.g. 128M. requires mbuffer)
-
-Full manual at: https://github.com/psy0rz/zfs_autobackup
-
-```
-
-## Troubleshooting
-
-### It keeps asking for my SSH password
-
-You forgot to set up automatic login via SSH keys; see the example for how to do this.
-
-### It says 'cannot receive incremental stream: invalid backup stream'
-
-This usually means a new snapshot was created on the target side during a backup. If you restart zfs-autobackup, it will automatically abort the invalid, partially received snapshot and start over.
-
-### It says 'cannot receive incremental stream: destination has been modified since most recent snapshot'
-
-This means files have been modified on the target side somehow. 
-
-You can use --rollback to automatically roll back such changes. Also try destroying the target dataset and using --clear-mountpoint on the next run, so that it won't get mounted.
-
-### It says 'internal error: Invalid argument'
-
-In some cases (Linux -> FreeBSD) this means certain properties are not fully supported on the target system.
-
-Try using something like: --filter-properties xattr or --ignore-transfer-errors. 
-
-### zfs receive fails, but the snapshot seems to have been received successfully
-
-This happens if you transfer between different operating systems, ZFS versions, or feature sets.
-
-Try the --ignore-transfer-errors option. It ignores the error but still checks whether the snapshot was actually received correctly.
-
-## Restore example
-
-Restoring can be done with simple zfs commands. For example, use this to restore a specific SmartOS disk image to a temporary restore location:
-
-```console
-root@fs1:/home/psy#  zfs send fs1/zones/backup/zfsbackups/smartos01.server.com/zones/a3abd6c8-24c6-4125-9e35-192e2eca5908-disk0@smartos01_fs1-20160110000003 | ssh root@2.2.2.2 "zfs recv zones/restore"
-```
-
-After that you can rename the disk image from the temporary location to the location of a new SmartOS machine you've created.
-
-## Monitoring with Zabbix-jobs
-
-You can monitor backups by using my zabbix-jobs script. (<https://github.com/psy0rz/stuff/tree/master/zabbix-jobs>)
-
-Put this command directly after the zfs_backup command in your cronjob:
-
-```console
-zabbix-job-status backup_smartos01_fs1 daily $?
-```
-
-This will update the zabbix server with the exit code and will also alert you if the job didn't run for more than 2 days.
-
-## Backup a proxmox cluster with HA replication
-
-Due to the nature of proxmox we had to make a few enhancements to zfs-autobackup. This will probably also benefit other systems that use their own replication in combination with zfs-autobackup.
-
-All data under rpool/data can be on multiple nodes of the cluster. The names of those filesystems are unique across the whole cluster, so we should back up rpool/data of all nodes to the same destination. This way we won't have duplicate backups of the replicated filesystems. Because of the options used below, you can even migrate hosts and zfs-autobackup will be fine: it will automatically get the next backup from the new node.
-
-In the example below we have 3 nodes, named pve1, pve2 and pve3.
-
-### Preparing the proxmox nodes
-
-No preparation is needed; the script takes care of everything. You only need to set up the SSH keys so that the backup server can access the proxmox servers.
-
-TIP: make sure your backup server is firewalled and cannot be reached from any production machine.
-
-### SSH config on backup server
-
-I use ~/.ssh/config to specify how to reach the various hosts.
-
-In this example we are making an offsite copy and use port forwarding to reach the proxmox machines:
-```
-Host *
-    ControlPath ~/.ssh/control-master-%r@%h:%p
-    ControlMaster auto
-    ControlPersist 3600
-    Compression yes
-
-Host pve1
-    Hostname some.host.com
-    Port 10001
-
-Host pve2
-    Hostname some.host.com
-    Port 10002
-
-Host pve3
-    Hostname some.host.com
-    Port 10003
-```
-
-### Backup script
-
-I use the following backup script on the backup server.
-
-Adjust the variables HOSTS, TARGET and NAME to your needs.
-
-```shell
-#!/bin/bash
-
-HOSTS="pve1 pve2 pve3"
-TARGET=rpool/pvebackups
-NAME=prox
-
-zfs create -p $TARGET/data &>/dev/null
-for HOST in $HOSTS; do
-
-  echo "################################### RPOOL $HOST"
-
-  # enable backup
-  ssh $HOST "zfs set autobackup:rpool_$NAME=child rpool/ROOT"
-
-  #backup rpool to specific directory per host
-  zfs create -p $TARGET/rpools/$HOST &>/dev/null
-  zfs-autobackup --keep-source=1d1w,1w1m --ssh-source $HOST rpool_$NAME $TARGET/rpools/$HOST --clear-mountpoint --clear-refreservation --ignore-transfer-errors --strip-path 2 --verbose   --no-holds   $@
-
-  zabbix-job-status backup_$HOST""_rpool_$NAME daily $? >/dev/null 2>/dev/null
-
-
-  echo "################################### DATA $HOST"
-
-  # enable backup
-  ssh $HOST "zfs set autobackup:data_$NAME=child rpool/data"
-
-  #backup data filesystems to a common directory
-  zfs-autobackup --keep-source=1d1w,1w1m --ssh-source $HOST data_$NAME $TARGET/data --clear-mountpoint --clear-refreservation --ignore-transfer-errors --strip-path 2 --verbose  --ignore-replicated --min-change 300000 --no-holds   $@
-
-  zabbix-job-status backup_$HOST""_data_$NAME daily $? >/dev/null 2>/dev/null
-
-done
-```
-
-This script will also send the backup status to Zabbix (if you've installed my zabbix-job-status script: https://github.com/psy0rz/stuff/tree/master/zabbix-jobs).
+Please look at our wiki to [Get started](https://github.com/psy0rz/zfs_autobackup/wiki).

 # Sponsor list

--- a/setup.py
+++ b/setup.py
@ -18,7 +18,9 @@ setuptools.setup(
    entry_points={
        'console_scripts':
            [
-                'zfs-autobackup = zfs_autobackup:cli',
+                'zfs-autobackup = zfs_autobackup.ZfsAutobackup:cli',
+                'zfs-autoverify = zfs_autobackup.ZfsAutoverify:cli',
+                'zfs-check = zfs_autobackup.ZfsCheck:cli',
            ]
    },
    packages=setuptools.find_packages(),
--- a/tests/basetest.py
+++ b/tests/basetest.py
@ -11,6 +11,9 @@ import subprocess
 import time
 from pprint import *
 from zfs_autobackup.ZfsAutobackup import *
+from zfs_autobackup.ZfsAutoverify import *
+from zfs_autobackup.ZfsCheck import *
+from zfs_autobackup.util import *
 from mock import *
 import contextlib
 import sys
@ -62,6 +65,7 @@ def shelltest(cmd):
    """execute and print result as nice copypastable string for unit tests (adds extra newlines on top/bottom)"""

    ret=(subprocess.check_output("SUDO_ASKPASS=./password.sh sudo -A "+cmd , shell=True).decode('utf-8'))
+
    print("######### result of: {}".format(cmd))
    print(ret)
    print("#########")
--- a/tests/data/empty
+++ b/tests/data/empty
--- a/tests/data/partial
+++ b/tests/data/partial
@ -0,0 +1 @@
+(binary test data, not shown)
--- a/tests/data/whole
+++ b/tests/data/whole
--- a/tests/data/whole2
+++ b/tests/data/whole2
--- a/tests/data/whole_whole2
+++ b/tests/data/whole_whole2
--- a/tests/data/whole_whole2_partial
+++ b/tests/data/whole_whole2_partial
--- a/tests/run_test
+++ b/tests/run_test
@ -0,0 +1,5 @@
+#!/bin/bash
+
+#run one test. start from main directory
+
+python -m unittest discover tests "$@" -vvvf
--- a/tests/run_tests
+++ b/tests/run_tests
@ -18,6 +18,7 @@ if ! [ -e /root/.ssh/id_rsa ]; then
    ssh -oStrictHostKeyChecking=no localhost true || exit 1
 fi

+umount /tmp/ZfsCheck*

 coverage run --branch --source zfs_autobackup -m unittest discover -vvvvf $SCRIPTDIR $@ 2>&1
 EXIT=$?
--- a/tests/test_blockhasher.py
+++ b/tests/test_blockhasher.py
@ -0,0 +1,157 @@
+from basetest import *
+from zfs_autobackup.BlockHasher import BlockHasher
+
+
+# make VERY sure this works correctly under all circumstances.
+
+# sha1 sums of files, (bs=4096)
+# da39a3ee5e6b4b0d3255bfef95601890afd80709  empty
+# 642027d63bb0afd7e0ba197f2c66ad03e3d70de1  partial
+# 3c0bf91170d873b8e327d3bafb6bc074580d11b7  whole
+# 2e863f1fcccd6642e4e28453eba10d2d3f74d798  whole2
+# 959e6b58078f0cfd2fb3d37e978fda51820473ff  whole_whole2
+# 309ffffba2e1977d12f3b7469971f30d28b94bd8  whole_whole2_partial
+
+class TestBlockHasher(unittest2.TestCase):
+
+    def setUp(self):
+        pass
+
+    def test_empty(self):
+        block_hasher = BlockHasher(count=1)
+        self.assertEqual(
+            list(block_hasher.generate("tests/data/empty")),
+            []
+        )
+
+    def test_partial(self):
+        block_hasher = BlockHasher(count=1)
+        self.assertEqual(
+            list(block_hasher.generate("tests/data/partial")),
+            [(0, "642027d63bb0afd7e0ba197f2c66ad03e3d70de1")]
+        )
+
+    def test_whole(self):
+        block_hasher = BlockHasher(count=1)
+        self.assertEqual(
+            list(block_hasher.generate("tests/data/whole")),
+            [(0, "3c0bf91170d873b8e327d3bafb6bc074580d11b7")]
+        )
+
+    def test_whole2(self):
+        block_hasher = BlockHasher(count=1)
+        self.assertEqual(
+            list(block_hasher.generate("tests/data/whole_whole2")),
+            [
+                (0, "3c0bf91170d873b8e327d3bafb6bc074580d11b7"),
+                (1, "2e863f1fcccd6642e4e28453eba10d2d3f74d798")
+            ]
+        )
+
+    def test_wwp(self):
+        block_hasher = BlockHasher(count=1)
+        self.assertEqual(
+            list(block_hasher.generate("tests/data/whole_whole2_partial")),
+            [
+                (0, "3c0bf91170d873b8e327d3bafb6bc074580d11b7"),  # whole
+                (1, "2e863f1fcccd6642e4e28453eba10d2d3f74d798"),  # whole2
+                (2, "642027d63bb0afd7e0ba197f2c66ad03e3d70de1")  # partial
+            ]
+        )
+
+    def test_wwp_count2(self):
+        block_hasher = BlockHasher(count=2)
+        self.assertEqual(
+            list(block_hasher.generate("tests/data/whole_whole2_partial")),
+            [
+                (0, "959e6b58078f0cfd2fb3d37e978fda51820473ff"),  # whole_whole2
+                (1, "642027d63bb0afd7e0ba197f2c66ad03e3d70de1")  # partial
+            ]
+        )
+
+    def test_big(self):
+        block_hasher = BlockHasher(count=10)
+        self.assertEqual(
+            list(block_hasher.generate("tests/data/whole_whole2_partial")),
+            [
+                (0, "309ffffba2e1977d12f3b7469971f30d28b94bd8"),  # whole_whole2_partial
+            ])
+
+    def test_blockhash_compare(self):
+        #no errors
+        block_hasher = BlockHasher(count=1)
+        generator = block_hasher.generate("tests/data/whole_whole2_partial")
+        self.assertEqual([], list(block_hasher.compare("tests/data/whole_whole2_partial", generator)))
+
+        #compare file is smaller (EOF errors)
+        block_hasher = BlockHasher(count=1)
+        generator = block_hasher.generate("tests/data/whole_whole2_partial")
+        self.assertEqual(
+            [(1, '2e863f1fcccd6642e4e28453eba10d2d3f74d798', 'EOF'),
+             (2, '642027d63bb0afd7e0ba197f2c66ad03e3d70de1', 'EOF')],
+            list(block_hasher.compare("tests/data/whole", generator)))
+
+        #no errors, huge chunks
+        block_hasher = BlockHasher(count=10)
+        generator = block_hasher.generate("tests/data/whole_whole2_partial")
+        self.assertEqual([], list(block_hasher.compare("tests/data/whole_whole2_partial", generator)))
+
+        # different order to make sure seek functions are ok
+        block_hasher = BlockHasher(count=1)
+        checksums = list(block_hasher.generate("tests/data/whole_whole2_partial"))
+        checksums.reverse()
+        self.assertEqual([], list(block_hasher.compare("tests/data/whole_whole2_partial", checksums)))
+
+    def test_skip1(self):
+        block_hasher = BlockHasher(count=1, skip=1)
+        self.assertEqual(
+            list(block_hasher.generate("tests/data/whole_whole2_partial")),
+            [
+                (0, "3c0bf91170d873b8e327d3bafb6bc074580d11b7"),  # whole
+                # (1, "2e863f1fcccd6642e4e28453eba10d2d3f74d798"),  # whole2
+                (2, "642027d63bb0afd7e0ba197f2c66ad03e3d70de1")  # partial
+            ]
+        )
+
+        #should continue the pattern on the next file:
+        self.assertEqual(
+            list(block_hasher.generate("tests/data/whole_whole2_partial")),
+            [
+                # (0, "3c0bf91170d873b8e327d3bafb6bc074580d11b7"),  # whole
+                (1, "2e863f1fcccd6642e4e28453eba10d2d3f74d798"),  # whole2
+                # (2, "642027d63bb0afd7e0ba197f2c66ad03e3d70de1")  # partial
+            ]
+        )
+
+    def test_skip6(self):
+        block_hasher = BlockHasher(count=1, skip=6)
+        self.assertEqual(
+            list(block_hasher.generate("tests/data/whole_whole2_partial")),
+            [
+                (0, "3c0bf91170d873b8e327d3bafb6bc074580d11b7"),  # whole
+                # (1, "2e863f1fcccd6642e4e28453eba10d2d3f74d798"),  # whole2
+                # (2, "642027d63bb0afd7e0ba197f2c66ad03e3d70de1")  # partial
+            ]
+        )
+
+        #all blocks of next file are skipped
+        self.assertEqual(
+            list(block_hasher.generate("tests/data/whole_whole2_partial")),
+            [
+                # (0, "3c0bf91170d873b8e327d3bafb6bc074580d11b7"),  # whole
+                # (1, "2e863f1fcccd6642e4e28453eba10d2d3f74d798"),  # whole2
+                # (2, "642027d63bb0afd7e0ba197f2c66ad03e3d70de1")  # partial
+            ]
+        )
+
+        #first block of this one is the 6th to be skipped:
+        self.assertEqual(
+            list(block_hasher.generate("tests/data/whole_whole2_partial")),
+            [
+                # (0, "3c0bf91170d873b8e327d3bafb6bc074580d11b7"),  # whole
+                (1, "2e863f1fcccd6642e4e28453eba10d2d3f74d798"),  # whole2
+                # (2, "642027d63bb0afd7e0ba197f2c66ad03e3d70de1")  # partial
+            ]
+        )
+
+    #NOTE: compare doesn't use skip. That's the job of its input generator.
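
The tests above pin down the observable behaviour of block hashing: a file is read in chunks of `count` blocks, each chunk yields a `(chunk_nr, sha1)` pair, and `skip` leaves out chunks in a pattern that continues into the next file. A minimal sketch of that behaviour follows. It is not the project's `BlockHasher`: the class name `SketchBlockHasher`, the `bs=4096` default (taken from the "bs=4096" comment in the tests) and the internal `_position` counter are assumptions made for illustration, and it reads from an open stream instead of a file name.

```python
import hashlib
import io


class SketchBlockHasher:
    """Illustrative stand-in; not the project's BlockHasher."""

    def __init__(self, count=1, bs=4096, skip=0):
        self.chunk_size = count * bs  # bytes hashed per chunk
        self.skip = skip              # chunks to skip after each hashed chunk
        self._position = 0            # place in the hash-one/skip-N cycle, kept across streams

    def generate(self, stream):
        """Yield (chunk_nr, sha1_hexdigest) for each chunk that is not skipped."""
        chunk_nr = 0
        while True:
            data = stream.read(self.chunk_size)
            if not data:
                break
            if self._position == 0:
                yield chunk_nr, hashlib.sha1(data).hexdigest()
            self._position = (self._position + 1) % (self.skip + 1)
            chunk_nr += 1


# Two whole blocks plus a partial one, mirroring the layout of whole_whole2_partial.
data = b"a" * 4096 + b"b" * 4096 + b"c" * 100
hasher = SketchBlockHasher(count=1, skip=1)
first_pass = [nr for nr, _ in hasher.generate(io.BytesIO(data))]
second_pass = [nr for nr, _ in hasher.generate(io.BytesIO(data))]
print(first_pass, second_pass)
```

Keeping the cycle position on the hasher object rather than resetting it per file is what makes the skip pattern carry over from one file to the next, which is the behaviour `test_skip1` and `test_skip6` check for.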
--- a/tests/test_cmdpipe.py
+++ b/tests/test_cmdpipe.py
@ -9,8 +9,8 @@ class TestCmdPipe(unittest2.TestCase):
        p=CmdPipe(readonly=False, inp=None)
        err=[]
        out=[]
-        p.add(CmdItem(["ls", "-d", "/", "/", "/nonexistent"], stderr_handler=lambda line: err.append(line), exit_handler=lambda exit_code: self.assertEqual(exit_code,2)))
-        executed=p.execute(stdout_handler=lambda line: out.append(line))
+        p.add(CmdItem(["ls", "-d", "/", "/", "/nonexistent"], stderr_handler=lambda line: err.append(line), exit_handler=lambda exit_code: self.assertEqual(exit_code,2), stdout_handler=lambda line: out.append(line)))
+        executed=p.execute()

        self.assertEqual(err, ["ls: cannot access '/nonexistent': No such file or directory"])
        self.assertEqual(out, ["/","/"])
@ -21,8 +21,8 @@ class TestCmdPipe(unittest2.TestCase):
        p=CmdPipe(readonly=False, inp="test")
        err=[]
        out=[]
-        p.add(CmdItem(["cat"], stderr_handler=lambda line: err.append(line), exit_handler=lambda exit_code: self.assertEqual(exit_code,0)))
-        executed=p.execute(stdout_handler=lambda line: out.append(line))
+        p.add(CmdItem(["cat"], stderr_handler=lambda line: err.append(line), exit_handler=lambda exit_code: self.assertEqual(exit_code,0), stdout_handler=lambda line: out.append(line) ))
+        executed=p.execute()

        self.assertEqual(err, [])
        self.assertEqual(out, ["test"])
@ -37,8 +37,8 @@ class TestCmdPipe(unittest2.TestCase):
        out=[]
        p.add(CmdItem(["echo", "test"], stderr_handler=lambda line: err1.append(line), exit_handler=lambda exit_code: self.assertEqual(exit_code,0)))
        p.add(CmdItem(["tr", "e", "E"], stderr_handler=lambda line: err2.append(line), exit_handler=lambda exit_code: self.assertEqual(exit_code,0)))
-        p.add(CmdItem(["tr", "t", "T"], stderr_handler=lambda line: err3.append(line), exit_handler=lambda exit_code: self.assertEqual(exit_code,0)))
-        executed=p.execute(stdout_handler=lambda line: out.append(line))
+        p.add(CmdItem(["tr", "t", "T"], stderr_handler=lambda line: err3.append(line), exit_handler=lambda exit_code: self.assertEqual(exit_code,0), stdout_handler=lambda line: out.append(line)))
+        executed=p.execute()

        self.assertEqual(err1, [])
        self.assertEqual(err2, [])
@ -58,8 +58,8 @@ class TestCmdPipe(unittest2.TestCase):
        out=[]
        p.add(CmdItem(["ls", "/nonexistent1"], stderr_handler=lambda line: err1.append(line), exit_handler=lambda exit_code: self.assertEqual(exit_code,2)))
        p.add(CmdItem(["ls", "/nonexistent2"], stderr_handler=lambda line: err2.append(line), exit_handler=lambda exit_code: self.assertEqual(exit_code,2)))
-        p.add(CmdItem(["ls", "/nonexistent3"], stderr_handler=lambda line: err3.append(line), exit_handler=lambda exit_code: self.assertEqual(exit_code,2)))
-        executed=p.execute(stdout_handler=lambda line: out.append(line))
+        p.add(CmdItem(["ls", "/nonexistent3"], stderr_handler=lambda line: err3.append(line), exit_handler=lambda exit_code: self.assertEqual(exit_code,2), stdout_handler=lambda line: out.append(line)))
+        executed=p.execute()

        self.assertEqual(err1, ["ls: cannot access '/nonexistent1': No such file or directory"])
        self.assertEqual(err2, ["ls: cannot access '/nonexistent2': No such file or directory"])
@ -76,8 +76,8 @@ class TestCmdPipe(unittest2.TestCase):
        out=[]
        p.add(CmdItem(["bash", "-c", "exit 1"], stderr_handler=lambda line: err1.append(line), exit_handler=lambda exit_code: self.assertEqual(exit_code,1)))
        p.add(CmdItem(["bash", "-c", "exit 2"], stderr_handler=lambda line: err2.append(line), exit_handler=lambda exit_code: self.assertEqual(exit_code,2)))
-        p.add(CmdItem(["bash", "-c", "exit 3"], stderr_handler=lambda line: err3.append(line), exit_handler=lambda exit_code: self.assertEqual(exit_code,3)))
-        executed=p.execute(stdout_handler=lambda line: out.append(line))
+        p.add(CmdItem(["bash", "-c", "exit 3"], stderr_handler=lambda line: err3.append(line), exit_handler=lambda exit_code: self.assertEqual(exit_code,3), stdout_handler=lambda line: out.append(line)))
+        executed=p.execute()

        self.assertEqual(err1, [])
        self.assertEqual(err2, [])
@ -97,8 +97,8 @@ class TestCmdPipe(unittest2.TestCase):
            return True

        p.add(CmdItem(["echo", "test1"], stderr_handler=lambda line: err1.append(line), exit_handler=true_exit, readonly=True))
-        p.add(CmdItem(["echo", "test2"], stderr_handler=lambda line: err2.append(line), exit_handler=true_exit, readonly=True))
-        executed=p.execute(stdout_handler=lambda line: out.append(line))
+        p.add(CmdItem(["echo", "test2"], stderr_handler=lambda line: err2.append(line), exit_handler=true_exit, readonly=True, stdout_handler=lambda line: out.append(line)))
+        executed=p.execute()

        self.assertEqual(err1, [])
        self.assertEqual(err2, [])
@ -113,11 +113,63 @@ class TestCmdPipe(unittest2.TestCase):
        err2=[]
        out=[]
        p.add(CmdItem(["echo", "test1"], stderr_handler=lambda line: err1.append(line), readonly=False))
-        p.add(CmdItem(["echo", "test2"], stderr_handler=lambda line: err2.append(line), readonly=True))
-        executed=p.execute(stdout_handler=lambda line: out.append(line))
+        p.add(CmdItem(["echo", "test2"], stderr_handler=lambda line: err2.append(line), readonly=True, stdout_handler=lambda line: out.append(line)))
+        executed=p.execute()

        self.assertEqual(err1, [])
        self.assertEqual(err2, [])
        self.assertEqual(out, [])
        self.assertTrue(executed)

+    def test_no_handlers(self):
+        with self.assertRaises(Exception):
+            p=CmdPipe()
+            p.add(CmdItem([ "echo" ]))
+            p.execute()
+
+        #NOTE: this will give some resource warnings
+
+    def test_manual_pipes(self):
+
+        # manual piping means: a command in the pipe has a stdout_handler, which is responsible for sending the data into the next item of the pipe.
+
+        result=[]
+
+
+        def stdout_handler(line):
+            item2.process.stdin.write(line.encode('utf8'))
+
+            # item2.process.stdin.close()
+
+        item1=CmdItem(["echo", "test"], stdout_handler=stdout_handler)
+        item2=CmdItem(["tr", "e", "E"], stdout_handler=lambda line: result.append(line))
+
+        p=CmdPipe()
+        p.add(item1)
+        p.add(item2)
+        p.execute()
+
+        self.assertEqual(result, ["tEst"])
+
+    def test_multiprocess(self):
+
+        #don't do any piping at all, just run multiple processes and handle outputs
+
+        result1=[]
+        result2=[]
+        result3=[]
+
+        item1=CmdItem(["echo", "test1"], stdout_handler=lambda line: result1.append(line))
+        item2=CmdItem(["echo", "test2"], stdout_handler=lambda line: result2.append(line))
+        item3=CmdItem(["echo", "test3"], stdout_handler=lambda line: result3.append(line))
+
+        p=CmdPipe()
+        p.add(item1)
+        p.add(item2)
+        p.add(item3)
+        p.execute()
+
+        self.assertEqual(result1, ["test1"])
+        self.assertEqual(result2, ["test2"])
+        self.assertEqual(result3, ["test3"])
+
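
The "manual piping" idea in `test_manual_pipes` (a stdout handler that forwards each line into the next process's stdin) can be sketched with the standard `subprocess` module alone. This is not how `CmdPipe` is implemented; the function name `manual_pipe` is hypothetical, and Python one-liners stand in for `echo test` and `tr e E` so the sketch does not depend on those tools being installed.

```python
import subprocess
import sys

# Stand-ins for `echo test` and `tr e E`.
ECHO = [sys.executable, "-c", "print('test')"]
TR = [sys.executable, "-c",
      "import sys; sys.stdout.write(sys.stdin.read().replace('e', 'E'))"]


def manual_pipe(first_cmd, second_cmd):
    """Run two processes and copy the first one's stdout into the second one's stdin by hand."""
    first = subprocess.Popen(first_cmd, stdout=subprocess.PIPE)
    second = subprocess.Popen(second_cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    # This loop plays the role of the stdout handler: forward every line as-is.
    for line in first.stdout:
        second.stdin.write(line)
    second.stdin.close()  # signal EOF so the second process can finish
    first.wait()

    result = [line.decode("utf8").rstrip("\r\n") for line in second.stdout]
    second.wait()
    return result


result = manual_pipe(ECHO, TR)
print(result)
```

This simple loop is only suitable for small outputs: with large streams the second process could fill its stdout pipe while the parent is still writing, so a real implementation would need to read and write concurrently.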
--- a/tests/test_encryption.py
+++ b/tests/test_encryption.py
@ -32,6 +32,7 @@ class TestZfsEncryption(unittest2.TestCase):
    def prepare_encrypted_dataset(self, key, path, unload_key=False):

        # create encrypted source dataset
+        shelltest("rm /tmp/zfstest.key 2>/dev/null;true")
        shelltest("echo {} > /tmp/zfstest.key".format(key))
        shelltest("zfs create -o keylocation=file:///tmp/zfstest.key -o keyformat=passphrase -o encryption=on {}".format(path))

--- a/tests/test_executenode.py
+++ b/tests/test_executenode.py
@ -144,5 +144,70 @@ class TestExecuteNode(unittest2.TestCase):
        self.pipe(nodea, nodeb)


+    def test_cwd(self):
+
+        nodea=ExecuteNode(ssh_to="localhost", debug_output=True)
+        nodeb=ExecuteNode(debug_output=True)
+
+        #change to a directory with a space and execute a system pipe, check if all piped commands are executed in correct directory.
+        shelltest("mkdir '/tmp/space test' 2>/dev/null; true")
+        self.assertEqual(nodea.run(cmd=["pwd", ExecuteNode.PIPE, "cat"], cwd="/tmp/space test"), ["/tmp/space test"])
+        self.assertEqual(nodea.run(cmd=["cat", ExecuteNode.PIPE, "pwd"], cwd="/tmp/space test"), ["/tmp/space test"])
+        self.assertEqual(nodeb.run(cmd=["pwd", ExecuteNode.PIPE, "cat"], cwd="/tmp/space test"), ["/tmp/space test"])
+        self.assertEqual(nodeb.run(cmd=["cat", ExecuteNode.PIPE, "pwd"], cwd="/tmp/space test"), ["/tmp/space test"])
+
+    def test_script_handlers(self):
+
+        def test(node):
+            results = []
+            node.script(lines=["echo line1", "echo line2 1>&2", "exit 123"],
+                                  stdout_handler=lambda line: results.append(line),
+                                  stderr_handler=lambda line: results.append(line),
+                                  exit_handler=lambda exit_code: results.append(exit_code),
+                                  valid_exitcodes=[123]
+                                  )
+
+            self.assertEqual(results, ["line1", "line2", 123 ])
+
+        with self.subTest("remote"):
+            test(ExecuteNode(ssh_to="localhost", debug_output=True))
+        #
+        with self.subTest("local"):
+            test(ExecuteNode(debug_output=True))
+
+    def test_script_defaults(self):
+
+        result=[]
+        nodea=ExecuteNode(debug_output=True)
+        nodea.script(lines=["echo test"], stdout_handler=lambda line: result.append(line))
+
+        self.assertEqual(result, ["test"])
+
+    def test_script_pipe(self):
+
+        result=[]
+        nodea=ExecuteNode()
+        cmd_pipe=nodea.script(lines=["echo test"], pipe=True)
+        nodea.script(lines=["tr e E"], inp=cmd_pipe,stdout_handler=lambda line: result.append(line))
+
+        self.assertEqual(result, ["tEst"])
+
+
+    def test_mixed(self):
+
+        #should be able to mix run() and script()
+        node=ExecuteNode()
+
+        result=[]
+        pipe=node.run(["echo", "test"], pipe=True)
+        node.script(["tr e E"], inp=pipe, stdout_handler=lambda line: result.append(line))
+
+        self.assertEqual(result, ["tEst"])
+
+
+
+
+
+
 if __name__ == '__main__':
    unittest.main()
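
`test_cwd` checks that every command in a piped command line runs in the requested working directory, even when the path contains a space. One way to get that behaviour is to quote each argument and prefix the whole pipeline with a single `cd`. The sketch below shows that approach; it is an assumption about a possible technique, not a description of what `ExecuteNode` does. `PIPE` and `to_shell_line` are hypothetical names.

```python
import shlex

PIPE = object()  # hypothetical sentinel, standing in for ExecuteNode.PIPE


def to_shell_line(cmd, cwd=None):
    """Quote each argument, turn the PIPE sentinel into '|', and prefix an optional cd."""
    parts = ["|" if item is PIPE else shlex.quote(item) for item in cmd]
    line = " ".join(parts)
    if cwd is not None:
        # The cd runs in the shell itself, so every command in the pipeline inherits the directory.
        line = "cd {} && {}".format(shlex.quote(cwd), line)
    return line


shell_line = to_shell_line(["pwd", PIPE, "cat"], cwd="/tmp/space test")
print(shell_line)
```

Because the shell groups a pipeline more tightly than `&&`, the `cd` applies to both sides of the pipe, and the same string can be run locally or handed to `ssh` as the remote command.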
--- a/tests/test_log.py
+++ b/tests/test_log.py
@ -1,4 +1,4 @@
-import zfs_autobackup.LogConsole
+from zfs_autobackup.LogConsole import LogConsole
 from basetest import *


@ -8,7 +8,7 @@ class TestLog(unittest2.TestCase):
        """test with color output"""
        with OutputIO() as buf:
            with redirect_stdout(buf):
-                l=LogConsole(show_verbose=False, show_debug=False, color=True)
+                l= LogConsole(show_verbose=False, show_debug=False, color=True)
                l.verbose("verbose")
                l.debug("debug")

@ -46,7 +46,7 @@ class TestLog(unittest2.TestCase):
            self.assertEqual(list(buf.getvalue()), [' ', ' ', 'v', 'e', 'r', 'b', 'o', 's', 'e', '\n', '#', ' ', 'd', 'e', 'b', 'u', 'g', '\n', '!', ' ', 'e', 'r', 'r', 'o', 'r', '\n'])


-        zfs_autobackup.LogConsole.colorama=False
+        # zfs_autobackup.LogConsole.colorama=False



--- a/tests/test_scaling.py
+++ b/tests/test_scaling.py
@ -78,7 +78,8 @@ class TestZfsScaling(unittest2.TestCase):


            #this triggers if you make a change with an impact of more than O(snapshot_count/2)
-            expected_runs=743
+            expected_runs=636
+            print("EXPECTED RUNS: {}".format(expected_runs))
            print("ACTUAL RUNS: {}".format(run_counter))
            self.assertLess(abs(run_counter-expected_runs), dataset_count/2)

@ -92,6 +93,7 @@ class TestZfsScaling(unittest2.TestCase):


            #this triggers if you make a change with a performance impact of more than O(snapshot_count/2)
-            expected_runs=947
+            expected_runs=842
+            print("EXPECTED RUNS: {}".format(expected_runs))
            print("ACTUAL RUNS: {}".format(run_counter))
            self.assertLess(abs(run_counter-expected_runs), dataset_count/2)
--- a/tests/test_treehasher.py
+++ b/tests/test_treehasher.py
@ -0,0 +1,84 @@
+from basetest import *
+from zfs_autobackup.BlockHasher import BlockHasher
+
+
+# sha1 sums of files, (bs=4096)
+# da39a3ee5e6b4b0d3255bfef95601890afd80709  empty
+# 642027d63bb0afd7e0ba197f2c66ad03e3d70de1  partial
+# 3c0bf91170d873b8e327d3bafb6bc074580d11b7  whole
+# 2e863f1fcccd6642e4e28453eba10d2d3f74d798  whole2
+# 959e6b58078f0cfd2fb3d37e978fda51820473ff  whole_whole2
+# 309ffffba2e1977d12f3b7469971f30d28b94bd8  whole_whole2_partial
+
+
+class TestTreeHasher(unittest2.TestCase):
+
+    def test_treehasher(self):
+        shelltest("rm -rf /tmp/treehashertest; mkdir /tmp/treehashertest")
+        shelltest("cp tests/data/whole /tmp/treehashertest")
+        shelltest("mkdir /tmp/treehashertest/emptydir")
+        shelltest("mkdir /tmp/treehashertest/dir")
+        shelltest("cp tests/data/whole_whole2_partial /tmp/treehashertest/dir")
+
+        # it should ignore these:
+        shelltest("ln -s / /tmp/treehashertest/symlink")
+        shelltest("mknod /tmp/treehashertest/c c 1 1")
+        shelltest("mknod /tmp/treehashertest/b b 1 1")
+        shelltest("mkfifo /tmp/treehashertest/f")
+
+
+        block_hasher = BlockHasher(count=1, skip=0)
+        tree_hasher = TreeHasher(block_hasher)
+        with self.subTest("Test output, count 1, skip 0"):
+            self.assertEqual(list(tree_hasher.generate("/tmp/treehashertest")), [
+                ('whole', 0, '3c0bf91170d873b8e327d3bafb6bc074580d11b7'),
+                ('dir/whole_whole2_partial', 0, '3c0bf91170d873b8e327d3bafb6bc074580d11b7'),
+                ('dir/whole_whole2_partial', 1, '2e863f1fcccd6642e4e28453eba10d2d3f74d798'),
+                ('dir/whole_whole2_partial', 2, '642027d63bb0afd7e0ba197f2c66ad03e3d70de1')
+            ])
+
+        block_hasher = BlockHasher(count=1, skip=1)
+        tree_hasher = TreeHasher(block_hasher)
+        with self.subTest("Test output, count 1, skip 1"):
+            self.assertEqual(list(tree_hasher.generate("/tmp/treehashertest")), [
+                ('whole', 0, '3c0bf91170d873b8e327d3bafb6bc074580d11b7'),
+                # ('dir/whole_whole2_partial', 0, '3c0bf91170d873b8e327d3bafb6bc074580d11b7'),
+                ('dir/whole_whole2_partial', 1, '2e863f1fcccd6642e4e28453eba10d2d3f74d798'),
+                # ('dir/whole_whole2_partial', 2, '642027d63bb0afd7e0ba197f2c66ad03e3d70de1')
+            ])
+
+
+
+        block_hasher = BlockHasher(count=2)
+        tree_hasher = TreeHasher(block_hasher)
+
+        with self.subTest("Test output, count 2, skip 0"):
+            self.assertEqual(list(tree_hasher.generate("/tmp/treehashertest")), [
+                ('whole', 0, '3c0bf91170d873b8e327d3bafb6bc074580d11b7'),
+                ('dir/whole_whole2_partial', 0, '959e6b58078f0cfd2fb3d37e978fda51820473ff'),
+                ('dir/whole_whole2_partial', 1, '642027d63bb0afd7e0ba197f2c66ad03e3d70de1')
+            ])
+
+        with self.subTest("Test compare"):
+            generator = tree_hasher.generate("/tmp/treehashertest")
+            errors = list(tree_hasher.compare("/tmp/treehashertest", generator))
+            self.assertEqual(errors, [])
+
+        with self.subTest("Test mismatch"):
+            generator = list(tree_hasher.generate("/tmp/treehashertest"))
+            shelltest("cp tests/data/whole2 /tmp/treehashertest/whole")
+
+            self.assertEqual(list(tree_hasher.compare("/tmp/treehashertest", generator)),
+                             [('whole',
+                               0,
+                               '3c0bf91170d873b8e327d3bafb6bc074580d11b7',
+                               '2e863f1fcccd6642e4e28453eba10d2d3f74d798')])
+
+        with self.subTest("Test missing file compare"):
+            generator = list(tree_hasher.generate("/tmp/treehashertest"))
+            shelltest("rm /tmp/treehashertest/whole")
+
+            self.assertEqual(list(tree_hasher.compare("/tmp/treehashertest", generator)),
+                             [('whole', '-', '-', "ERROR: [Errno 2] No such file or directory: '/tmp/treehashertest/whole'")])
+
+
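Review note on the skip behaviour tested above: the "count 1, skip 1" subtest expects the skip counter to carry over from one file to the next. The sketch below is an idealized model of that selection rule, not the BlockHasher implementation itself. The function name `select_chunks` and the `(name, chunk_count)` input format are hypothetical, and boundary cases (for example files whose size is an exact multiple of the chunk size) are ignored.

```python
def select_chunks(files, skip):
    """Idealized model: hash one chunk, skip `skip` chunks, repeat, across files.

    files: list of (name, chunk_count) in the order the files are visited.
    Returns the (name, chunk_nr) pairs that would be hashed.
    """
    selected = []
    to_skip = 0  # chunks still to skip; carries over from one file to the next
    for name, chunk_count in files:
        for chunk_nr in range(chunk_count):
            if to_skip > 0:
                to_skip -= 1
            else:
                selected.append((name, chunk_nr))
                to_skip = skip
    return selected


# Same layout as the test tree: one 1-chunk file and one 3-chunk file.
tree = [("whole", 1), ("dir/whole_whole2_partial", 3)]
print(select_chunks(tree, skip=0))
print(select_chunks(tree, skip=1))
```

With skip=1 the model selects chunk 0 of `whole` and chunk 1 of `dir/whole_whole2_partial`, which is the same set the subtest leaves uncommented.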
--- a/tests/test_verify.py
+++ b/tests/test_verify.py
@ -0,0 +1,102 @@
+
+from basetest import *
+
+
+# test zfs-verify:
+# - when there is no common snapshot at all
+# - when encryption key not loaded
+# - --test mode
+# - --fs-compare methods
+# - on snapshots of datasets:
+#   - that are correct
+#   - that are different
+# - on snapshots of zvols
+#  - that are correct
+#  - that are different
+# - test all directions (local, remote/local, local/remote, remote/remote)
+#
+
+class TestZfsVerify(unittest2.TestCase):
+
+
+    def setUp(self):
+        self.skipTest("WIP")
+
+        prepare_zpools()
+
+        #create actual test files and data
+        shelltest("zfs create test_source1/fs1/ok_filesystem")
+        shelltest("cp tests/*.py /test_source1/fs1/ok_filesystem")
+
+        shelltest("zfs create test_source1/fs1/bad_filesystem")
+        shelltest("cp tests/*.py /test_source1/fs1/bad_filesystem")
+
+        shelltest("zfs create -V 1M test_source1/fs1/ok_zvol")
+        shelltest("dd if=/dev/urandom of=/dev/zvol/test_source1/fs1/ok_zvol count=1 bs=512k")
+
+        shelltest("zfs create -V 1M test_source1/fs1/bad_zvol")
+        shelltest("dd if=/dev/urandom of=/dev/zvol/test_source1/fs1/bad_zvol count=1 bs=512k")
+
+        #create backup
+        with patch('time.strftime', return_value="test-20101111000000"):
+            self.assertFalse(ZfsAutobackup("test test_target1 --verbose --no-progress --no-holds".split(" ")).run())
+
+        #Do an ugly hack to create a fault in the bad filesystem
+        #In zfs-autoverify it doesn't matter that the snapshot isn't actually the same snapshot, so this hack works
+        shelltest("zfs destroy test_target1/test_source1/fs1/bad_filesystem@test-20101111000000")
+        shelltest("zfs mount test_target1/test_source1/fs1/bad_filesystem")
+        shelltest("echo >> /test_target1/test_source1/fs1/bad_filesystem/test_verify.py")
+        shelltest("zfs snapshot test_target1/test_source1/fs1/bad_filesystem@test-20101111000000")
+
+        #do the same hack for the bad zvol
+        shelltest("zfs destroy test_target1/test_source1/fs1/bad_zvol@test-20101111000000")
+        shelltest("dd if=/dev/urandom of=/dev/zvol/test_target1/test_source1/fs1/bad_zvol count=1 bs=1")
+        shelltest("zfs snapshot test_target1/test_source1/fs1/bad_zvol@test-20101111000000")
+
+
+        # make sure we can't accidentally compare current data
+        shelltest("zfs mount test_target1/test_source1/fs1/ok_filesystem")
+        shelltest("rm /test_source1/fs1/ok_filesystem/*")
+        shelltest("rm /test_source1/fs1/bad_filesystem/*")
+        shelltest("dd if=/dev/zero of=/dev/zvol/test_source1/fs1/ok_zvol count=1 bs=512k")
+
+
+
+    def test_verify(self):
+
+
+        with self.subTest("default --test"):
+            self.assertFalse(ZfsAutoverify("test test_target1 --verbose --test".split(" ")).run())
+
+        with self.subTest("rsync, remote source and target. (not supported, all 6 fail)"):
+            self.assertEqual(6, ZfsAutoverify("test test_target1 --ssh-source=localhost --ssh-target=localhost --verbose --exclude-received --fs-compare=rsync".split(" ")).run())
+
+        def runchecked(testname, command):
+            with self.subTest(testname):
+                with OutputIO() as buf:
+                    result=None
+                    with redirect_stderr(buf):
+                        result=ZfsAutoverify(command.split(" ")).run()
+
+                    print(buf.getvalue())
+                    self.assertEqual(2,result)
+                    self.assertRegex(buf.getvalue(), "bad_filesystem: FAILED:")
+                    self.assertRegex(buf.getvalue(), "bad_zvol: FAILED:")
+
+        runchecked("rsync, remote source", "test test_target1 --ssh-source=localhost --verbose --exclude-received --fs-compare=rsync")
+        runchecked("rsync, remote target", "test test_target1 --ssh-target=localhost --verbose --exclude-received --fs-compare=rsync")
+        runchecked("rsync, local", "test test_target1 --verbose --exclude-received --fs-compare=rsync")
+
+        runchecked("tar, remote source and remote target",
+                   "test test_target1 --ssh-source=localhost --ssh-target=localhost --verbose --exclude-received --fs-compare=find")
+        runchecked("tar, remote source",
+                   "test test_target1 --ssh-source=localhost --verbose --exclude-received --fs-compare=find")
+        runchecked("tar, remote target",
+                   "test test_target1 --ssh-target=localhost --verbose --exclude-received --fs-compare=find")
+        runchecked("tar, local", "test test_target1 --verbose --exclude-received --fs-compare=find")
+
+        with self.subTest("no common snapshot"):
+            #destroy common snapshot, now 3 should fail
+            shelltest("zfs destroy test_source1/fs1/ok_zvol@test-20101111000000")
+            self.assertEqual(3, ZfsAutoverify("test test_target1 --verbose --exclude-received".split(" ")).run())
+
--- a/tests/test_zfsautobackup.py
+++ b/tests/test_zfsautobackup.py
@ -3,6 +3,8 @@ from zfs_autobackup.CmdPipe import CmdPipe
 from basetest import *
 import time

+from zfs_autobackup.LogConsole import  LogConsole
+

 class TestZfsAutobackup(unittest2.TestCase):

--- a/tests/test_zfsautobackup31.py
+++ b/tests/test_zfsautobackup31.py
@ -79,3 +79,19 @@ test_target1/b/test_target1/a/test_source1/fs1/sub@test-20101111000000
            self.assertFalse(
                ZfsAutobackup("test test_target1 --no-progress --verbose --debug --zfs-compressed".split(" ")).run())

+    def test_force(self):
+        """test 1:1 replication"""
+
+        shelltest("zfs set autobackup:test=true test_source1")
+
+        with patch('time.strftime', return_value="test-20101111000000"):
+            self.assertFalse(
+                ZfsAutobackup("test test_target1 --no-progress --verbose --debug --force --strip-path=1".split(" ")).run())
+
+            r=shelltest("zfs list -H -o name -r -t snapshot test_target1")
+            self.assertMultiLineEqual(r,"""
+test_target1@test-20101111000000
+test_target1/fs1@test-20101111000000
+test_target1/fs1/sub@test-20101111000000
+test_target1/fs2/sub@test-20101111000000
+""")
--- a/tests/test_zfscheck.py
+++ b/tests/test_zfscheck.py
@ -0,0 +1,216 @@
+from basetest import *
+from zfs_autobackup.BlockHasher import BlockHasher
+
+
+class TestZfsCheck(unittest2.TestCase):
+
+    def setUp(self):
+        pass
+
+
+    def test_volume(self):
+        prepare_zpools()
+
+        shelltest("zfs create -V200M test_source1/vol")
+        shelltest("zfs snapshot test_source1/vol@test")
+
+        with self.subTest("Generate"):
+            with OutputIO() as buf:
+                with redirect_stdout(buf):
+                    self.assertFalse(ZfsCheck("test_source1/vol@test".split(" "),print_arguments=False).run())
+
+                print(buf.getvalue())
+                self.assertEqual("""0	2c2ceccb5ec5574f791d45b63c940cff20550f9a
+1	2c2ceccb5ec5574f791d45b63c940cff20550f9a
+""", buf.getvalue())
+
+                #store on disk for next step, add one error.
+                with open("/tmp/testhashes", "w") as fh:
+                    fh.write(buf.getvalue()+"1\t2c2ceccb5ec5574f791d45b63c940cff20550f9X")
+
+        with self.subTest("Compare"):
+            with OutputIO() as buf:
+                with redirect_stdout(buf):
+                    self.assertEqual(1, ZfsCheck("test_source1/vol@test --check=/tmp/testhashes".split(" "),print_arguments=False).run())
+                print(buf.getvalue())
+                self.assertEqual("Chunk 1 failed: 2c2ceccb5ec5574f791d45b63c940cff20550f9X 2c2ceccb5ec5574f791d45b63c940cff20550f9a\n", buf.getvalue())
+
+    def test_filesystem(self):
+        prepare_zpools()
+
+        shelltest("cp tests/data/whole /test_source1/testfile")
+        shelltest("mkdir /test_source1/emptydir")
+        shelltest("mkdir /test_source1/dir")
+        shelltest("cp tests/data/whole2 /test_source1/dir/testfile")
+
+        #it should ignore these:
+        shelltest("ln -s / /test_source1/symlink")
+        shelltest("mknod /test_source1/c c 1 1")
+        shelltest("mknod /test_source1/b b 1 1")
+        shelltest("mkfifo /test_source1/f")
+
+        shelltest("zfs snapshot test_source1@test")
+
+        with self.subTest("Generate"):
+            with OutputIO() as buf:
+                with redirect_stdout(buf):
+                    self.assertFalse(ZfsCheck("test_source1@test".split(" "), print_arguments=False).run())
+
+                print(buf.getvalue())
+                self.assertEqual("""testfile	0	3c0bf91170d873b8e327d3bafb6bc074580d11b7
+dir/testfile	0	2e863f1fcccd6642e4e28453eba10d2d3f74d798
+""", buf.getvalue())
+
+                #store on disk for next step, add error
+                with open("/tmp/testhashes", "w") as fh:
+                    fh.write(buf.getvalue()+"dir/testfile	0	2e863f1fcccd6642e4e28453eba10d2d3f74d79X")
+
+        with self.subTest("Compare"):
+            with OutputIO() as buf:
+                with redirect_stdout(buf):
+                    self.assertEqual(1, ZfsCheck("test_source1@test --check=/tmp/testhashes".split(" "),print_arguments=False).run())
+
+                print(buf.getvalue())
+                self.assertEqual("dir/testfile: Chunk 0 failed: 2e863f1fcccd6642e4e28453eba10d2d3f74d79X 2e863f1fcccd6642e4e28453eba10d2d3f74d798\n", buf.getvalue())
+
+    def test_file(self):
+
+        with self.subTest("Generate"):
+            with OutputIO() as buf:
+                with redirect_stdout(buf):
+                    self.assertFalse(ZfsCheck("tests/data/whole".split(" "), print_arguments=False).run())
+
+                print(buf.getvalue())
+                self.assertEqual("""0	3c0bf91170d873b8e327d3bafb6bc074580d11b7
+""", buf.getvalue())
+
+                # store on disk for next step, add error
+                with open("/tmp/testhashes", "w") as fh:
+                    fh.write(buf.getvalue()+"0	3c0bf91170d873b8e327d3bafb6bc074580d11bX")
+
+        with self.subTest("Compare"):
+            with OutputIO() as buf:
+                with redirect_stdout(buf):
+                    self.assertEqual(1,ZfsCheck("tests/data/whole --check=/tmp/testhashes".split(" "), print_arguments=False).run())
+                print(buf.getvalue())
+                self.assertEqual("Chunk 0 failed: 3c0bf91170d873b8e327d3bafb6bc074580d11bX 3c0bf91170d873b8e327d3bafb6bc074580d11b7\n", buf.getvalue())
+
+    def test_tree(self):
+        shelltest("rm -rf /tmp/testtree; mkdir /tmp/testtree")
+        shelltest("cp tests/data/whole /tmp/testtree")
+        shelltest("cp tests/data/whole_whole2 /tmp/testtree")
+        shelltest("cp tests/data/whole2 /tmp/testtree")
+        shelltest("cp tests/data/partial /tmp/testtree")
+        shelltest("cp tests/data/whole_whole2_partial /tmp/testtree")
+
+        ####################################
+        with self.subTest("Generate, skip 1"):
+            with OutputIO() as buf:
+                with redirect_stdout(buf):
+                    self.assertFalse(ZfsCheck("/tmp/testtree --skip=1".split(" "), print_arguments=False).run())
+
+                #since order varies, just check count (there is one empty line for some reason, only when testing like this)
+                print(buf.getvalue().split("\n"))
+                self.assertEqual(len(buf.getvalue().split("\n")),4)
+
+        ######################################
+        with self.subTest("Compare, all incorrect, skip 1"):
+
+            # store on disk for next step, add error
+            with open("/tmp/testhashes", "w") as fh:
+                fh.write("""
+partial	0	642027d63bb0afd7e0ba197f2c66ad03e3d70deX
+whole	0	3c0bf91170d873b8e327d3bafb6bc074580d11bX
+whole2	0	2e863f1fcccd6642e4e28453eba10d2d3f74d79X
+whole_whole2	0	959e6b58078f0cfd2fb3d37e978fda51820473fX
+whole_whole2_partial	0	309ffffba2e1977d12f3b7469971f30d28b94bdX
+""")
+
+            with OutputIO() as buf:
+                with redirect_stdout(buf):
+                    self.assertEqual(ZfsCheck("/tmp/testtree --check=/tmp/testhashes --skip=1".split(" "), print_arguments=False).run(), 3)
+
+                print(buf.getvalue())
+                self.assertMultiLineEqual("""partial: Chunk 0 failed: 642027d63bb0afd7e0ba197f2c66ad03e3d70deX 642027d63bb0afd7e0ba197f2c66ad03e3d70de1
+whole2: Chunk 0 failed: 2e863f1fcccd6642e4e28453eba10d2d3f74d79X 2e863f1fcccd6642e4e28453eba10d2d3f74d798
+whole_whole2_partial: Chunk 0 failed: 309ffffba2e1977d12f3b7469971f30d28b94bdX 309ffffba2e1977d12f3b7469971f30d28b94bd8
+""",buf.getvalue())
+
+        ####################################
+        with self.subTest("Generate"):
+            with OutputIO() as buf:
+                with redirect_stdout(buf):
+                    self.assertFalse(ZfsCheck("/tmp/testtree".split(" "), print_arguments=False).run())
+
+                #file order on disk can vary, so sort it..
+                sorted=buf.getvalue().split("\n")
+                sorted.sort()
+                sorted="\n".join(sorted)+"\n"
+
+                print(sorted)
+                self.assertEqual("""
+partial	0	642027d63bb0afd7e0ba197f2c66ad03e3d70de1
+whole	0	3c0bf91170d873b8e327d3bafb6bc074580d11b7
+whole2	0	2e863f1fcccd6642e4e28453eba10d2d3f74d798
+whole_whole2	0	959e6b58078f0cfd2fb3d37e978fda51820473ff
+whole_whole2_partial	0	309ffffba2e1977d12f3b7469971f30d28b94bd8
+""", sorted)
+
+                # store on disk for next step, add error
+                with open("/tmp/testhashes", "w") as fh:
+                    fh.write(buf.getvalue() + "whole_whole2_partial	0	309ffffba2e1977d12f3b7469971f30d28b94bdX")
+
+        ####################################
+        with self.subTest("Compare"):
+            with OutputIO() as buf:
+                with redirect_stdout(buf):
+                    self.assertEqual(1, ZfsCheck("/tmp/testtree --check=/tmp/testhashes".split(" "),
+                                                 print_arguments=False).run())
+                print(buf.getvalue())
+                self.assertEqual(
+                    "whole_whole2_partial: Chunk 0 failed: 309ffffba2e1977d12f3b7469971f30d28b94bdX 309ffffba2e1977d12f3b7469971f30d28b94bd8\n",
+                    buf.getvalue())
+
+    def test_brokenpipe_cleanup_filesystem(self):
+        """Test whether everything is cleaned up correctly, in debug mode, when a pipe breaks."""
+
+        prepare_zpools()
+        shelltest("cp tests/data/whole /test_source1/testfile")
+        shelltest("zfs snapshot test_source1@test")
+
+        #breaks pipe when grep exits:
+        #important to use --debug, since that generates extra output which would be problematic if we didn't do correct SIGPIPE handling
+        shelltest("python -m zfs_autobackup.ZfsCheck test_source1@test --debug | grep -m1 'Hashing tree'")
+        # time.sleep(5)
+
+        #should NOT be mounted anymore if cleanup went ok:
+        self.assertNotRegex(shelltest("mount"), "test_source1@test")
+
+    def test_brokenpipe_cleanup_volume(self):
+
+        prepare_zpools()
+        shelltest("zfs create -V200M test_source1/vol")
+        shelltest("zfs snapshot test_source1/vol@test")
+
+        #breaks pipe when grep exits:
+        #important to use --debug, since that generates extra output which would be problematic if we didn't do correct SIGPIPE handling
+        shelltest("python -m zfs_autobackup.ZfsCheck test_source1/vol@test --debug | grep -m1 'Hashing file'")
+        # time.sleep(1)
+
+        r = shelltest("zfs list -H -o name -r -t all " + TEST_POOLS)
+        self.assertMultiLineEqual("""
+test_source1
+test_source1/fs1
+test_source1/fs1/sub
+test_source1/vol
+test_source1/vol@test
+test_source2
+test_source2/fs2
+test_source2/fs2/sub
+test_source2/fs3
+test_source2/fs3/sub
+test_target1
+""",r )
+
+
+
--- a/zfs_autobackup/BlockHasher.py
+++ b/zfs_autobackup/BlockHasher.py
@ -0,0 +1,127 @@
+import hashlib
+import os
+
+
+class BlockHasher():
+    """This class was created to checksum huge files and block devices (terabytes).
+    Instead of one sha1sum of the whole file, it generates sha1sums of chunks of the file.
+
+    The chunksize is count*bs (bs is the read blocksize from disk)
+
+    It's also possible to only read a certain percentage of blocks to just check a sample.
+
+    Input and output generators are in the format ( chunk_nr, hexdigest )
+
+    NOTE: skipping is only used on the generator side. The compare side just compares what it gets from the input generator.
+
+    """
+
+    def __init__(self, count=10000, bs=4096, hash_class=hashlib.sha1, skip=0):
+        self.count = count
+        self.bs = bs
+        self.chunk_size=bs*count
+        self.hash_class = hash_class
+
+        # self.coverage=coverage
+        self.skip=skip
+        self._skip_count=0
+
+        self.stats_total_bytes=0
+
+
+    def _seek_next_chunk(self, fh, fsize):
+        """Seek fh to the next chunk and update the skip counter.
+        Returns chunk_nr,
+        or False if the rest of the file should be skipped.
+
+
+        """
+
+        #ignore empty files
+        if fsize==0:
+            return False
+
+        # need to skip chunks?
+        if self._skip_count > 0:
+            chunks_left = ((fsize - fh.tell()) // self.chunk_size) + 1
+            # not enough chunks left in this file?
+            if self._skip_count >= chunks_left:
+                # skip rest of this file
+                self._skip_count = self._skip_count - chunks_left
+                return False
+            else:
+                # seek to next chunk, reset skip count
+                fh.seek(self.chunk_size * self._skip_count, os.SEEK_CUR)
+                self._skip_count = self.skip
+                return  fh.tell()//self.chunk_size
+        else:
+            # should read this chunk, reset skip count
+            self._skip_count = self.skip
+            return fh.tell() // self.chunk_size
+
+    def generate(self, fname):
+        """Generates checksums
+
+        yields(chunk_nr, hexdigest)
+
+        yields nothing for empty files.
+        """
+
+
+        with open(fname, "rb") as fh:
+
+            fh.seek(0, os.SEEK_END)
+            fsize=fh.tell()
+            fh.seek(0)
+
+            while fh.tell()<fsize:
+                chunk_nr=self._seek_next_chunk(fh, fsize)
+                if chunk_nr is False:
+                    return
+
+                #read chunk
+                hash = self.hash_class()
+                block_nr = 0
+                while block_nr != self.count:
+                    block=fh.read(self.bs)
+                    if block==b"":
+                        break
+                    hash.update(block)
+                    block_nr = block_nr + 1
+
+                yield (chunk_nr, hash.hexdigest())
+
+    def compare(self, fname, generator):
+        """reads from generator and compares blocks
+        Yields mismatches in the form: ( chunk_nr, hexdigest, actual_hexdigest)
+        Yields errors in the form: ( chunk_nr, hexdigest, "message" )
+
+        """
+
+        try:
+            checked = 0
+            with open(fname, "rb") as f:
+                for (chunk_nr, hexdigest) in generator:
+                    try:
+
+                        checked = checked + 1
+                        hash = self.hash_class()
+                        f.seek(int(chunk_nr) * self.bs * self.count)
+                        block_nr = 0
+                        for block in iter(lambda: f.read(self.bs), b""):
+                            hash.update(block)
+                            block_nr = block_nr + 1
+                            if block_nr == self.count:
+                                break
+
+                        if block_nr == 0:
+                            yield (chunk_nr, hexdigest, 'EOF')
+
+                        elif (hash.hexdigest() != hexdigest):
+                            yield (chunk_nr, hexdigest, hash.hexdigest())
+
+                    except Exception as e:
+                        yield ( chunk_nr , hexdigest, 'ERROR: '+str(e))
+
+        except Exception as e:
+            yield ( '-', '-', 'ERROR: '+ str(e))
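Review note on the chunking scheme in BlockHasher.py: the class docstring describes hashing a file in chunks of count*bs bytes rather than as a whole. The sketch below illustrates only that idea, under simplifying assumptions: there is no skip support and no compare side, and the names `chunk_digests` and `demo` are hypothetical helpers for this illustration, not part of zfs_autobackup.

```python
import hashlib
import os
import tempfile


def chunk_digests(path, count=2, bs=4096, hash_class=hashlib.sha1):
    """Yield (chunk_nr, hexdigest) for each chunk of count*bs bytes in path."""
    with open(path, "rb") as fh:
        chunk_nr = 0
        while True:
            digest = hash_class()
            blocks_read = 0
            # a chunk is at most `count` blocks of `bs` bytes each
            while blocks_read < count:
                block = fh.read(bs)
                if block == b"":
                    break
                digest.update(block)
                blocks_read += 1
            if blocks_read == 0:
                return  # end of file, nothing left to hash
            yield (chunk_nr, digest.hexdigest())
            chunk_nr += 1


def demo():
    """Hash a temporary file made of two full blocks and one partial block."""
    data = b"a" * 4096 + b"b" * 4096 + b"c" * 100
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(data)
    try:
        return list(chunk_digests(tmp.name, count=1, bs=4096))
    finally:
        os.unlink(tmp.name)


print(demo())
```

As in the docstring of `generate`, an empty file yields nothing here; a trailing partial block becomes its own, shorter chunk.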
--- a/zfs_autobackup/CliBase.py
+++ b/zfs_autobackup/CliBase.py
@ -0,0 +1,109 @@
+import argparse
+import os.path
+import sys
+
+from .LogConsole import LogConsole
+
+
+class CliBase(object):
+    """Base class for all cli programs
+    Overridden in subclasses that add stuff for the specific programs."""
+
+    # also used by setup.py
+    VERSION = "3.2-alpha2"
+    HEADER = "{} v{} - (c)2022 E.H.Eefting (edwin@datux.nl)".format(os.path.basename(sys.argv[0]), VERSION)
+
+    def __init__(self, argv, print_arguments=True):
+
+        self.parser=self.get_parser()
+        self.args = self.parse_args(argv)
+
+        # helps with investigating failed regression tests:
+        if print_arguments:
+            print("ARGUMENTS: " + " ".join(argv))
+
+    def parse_args(self, argv):
+        """parses the arguments and does additional checks, might print warnings or notes
+        Overridden in subclasses with extra checks.
+        """
+
+        args = self.parser.parse_args(argv)
+
+        if args.help:
+            self.parser.print_help()
+            sys.exit(255)
+
+        if args.version:
+            print(self.HEADER)
+            sys.exit(255)
+
+        # auto enable progress?
+        if sys.stderr.isatty() and not args.no_progress:
+            args.progress = True
+
+        if args.debug_output:
+            args.debug = True
+
+        if args.test:
+            args.verbose = True
+
+        if args.debug:
+            args.verbose = True
+
+        self.log = LogConsole(show_debug=args.debug, show_verbose=args.verbose, color=sys.stdout.isatty())
+
+        self.verbose(self.HEADER)
+        self.verbose("")
+
+        return args
+
+    def get_parser(self):
+        """build up the argument parser
+        Overridden in subclasses that add extra arguments
+        """
+
+        parser = argparse.ArgumentParser(description=self.HEADER, add_help=False,
+                                         epilog='Full manual at: https://github.com/psy0rz/zfs_autobackup')
+
+        # Basic options
+        group=parser.add_argument_group("Common options")
+        group.add_argument('--help', '-h', action='store_true', help='show help')
+        group.add_argument('--test', '--dry-run', '-n', action='store_true',
+                            help='Dry run, do not change anything, just show what would be done (still does all read-only '
+                                 'operations)')
+        group.add_argument('--verbose', '-v', action='store_true', help='verbose output')
+        group.add_argument('--debug', '-d', action='store_true',
+                            help='Show zfs commands that are executed, stops after an exception.')
+        group.add_argument('--debug-output', action='store_true',
+                            help='Show zfs commands and their output/exit codes. (noisy)')
+        group.add_argument('--progress', action='store_true',
+                            help='show zfs progress output. Enabled automatically on ttys. (use --no-progress to disable)')
+        group.add_argument('--no-progress', action='store_true',
+                            help=argparse.SUPPRESS)  # needed to workaround a zfs recv -v bug
+        group.add_argument('--version', action='store_true',
+                            help='Show version.')
+
+
+        return parser
+
+    def verbose(self, txt):
+        self.log.verbose(txt)
+
+    def warning(self, txt):
+        self.log.warning(txt)
+
+    def error(self, txt):
+        self.log.error(txt)
+
+    def debug(self, txt):
+        self.log.debug(txt)
+
+    def progress(self, txt):
+        self.log.progress(txt)
+
+    def clear_progress(self):
+        self.log.clear_progress()
+
+    def set_title(self, title):
+        self.log.verbose("")
+        self.log.verbose("#### " + title)
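Review note on the flag handling in `CliBase.parse_args`: `--debug-output` implies `--debug`, and both `--debug` and `--test` imply `--verbose`. The order of the checks matters, because debug has to be set before it is used to enable verbose. The sketch below reproduces only that chain with a stripped-down parser; the function name `parse` is hypothetical and the real parser defines more options.

```python
import argparse


def parse(argv):
    """Parse a reduced option set and apply the implied flags."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--test", "--dry-run", "-n", action="store_true")
    parser.add_argument("--verbose", "-v", action="store_true")
    parser.add_argument("--debug", "-d", action="store_true")
    parser.add_argument("--debug-output", action="store_true")
    args = parser.parse_args(argv)

    # order matters: --debug-output must enable debug before debug enables verbose
    if args.debug_output:
        args.debug = True
    if args.test or args.debug:
        args.verbose = True
    return args


args = parse(["--debug-output"])
print(args.debug, args.verbose)
```

Because the first long option string determines the attribute name, `--dry-run` and `-n` both end up in `args.test`.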
--- a/zfs_autobackup/CmdPipe.py
+++ b/zfs_autobackup/CmdPipe.py
@ -1,3 +1,17 @@
+# This is the low-level process execution code.
+# It makes piping and parallel process handling easier.
+
+# You can specify a handler for each line of stderr output for each item in the pipe.
+# Every item also has its own exitcode handler.
+
+# Normally you add a stdout_handler to the last item in the pipe.
+# However: You can also add stdout_handler to other items in a pipe. This will turn that item into a manual pipe: your
+# handler is responsible for sending data into the next item of the pipe. (available in item.next)
+
+# You can also use manual pipe mode to just execute multiple commands in parallel and handle their output in parallel,
+# without doing any actual pipe stuff. (because you don't HAVE to send data into the next item.)
+
+
 import subprocess
 import os
 import select
@ -11,17 +25,23 @@ except ImportError:
 class CmdItem:
    """one command item, to be added to a CmdPipe"""

-    def __init__(self, cmd, readonly=False, stderr_handler=None, exit_handler=None, shell=False):
+    def __init__(self, cmd, readonly=False, stderr_handler=None, exit_handler=None, stdout_handler=None, shell=False):
        """create item. caller has to make sure cmd is properly escaped when using shell.
+
+        If stdout_handler is None, it will connect the stdout to the stdin of the next item in the pipe, like
+        an actual system pipe. (no python overhead)
+
        :type cmd: list of str
        """

        self.cmd = cmd
        self.readonly = readonly
        self.stderr_handler = stderr_handler
+        self.stdout_handler = stdout_handler
        self.exit_handler = exit_handler
        self.shell = shell
        self.process = None
+        self.next = None #next item in pipe, set by CmdPipe

    def __str__(self):
        """return copy-pastable version of command."""
@ -84,72 +104,23 @@ class CmdPipe:
    def should_execute(self):
        return self._should_execute

-    def execute(self, stdout_handler):
-        """run the pipe. returns True all exit handlers returned true"""
+    def execute(self):
+        """run the pipe. returns True if all exit handlers returned true. (otherwise it will be False/None depending on the exit handlers' return values) """

        if not self._should_execute:
            return True

-        # first process should have actual user input as stdin:
-        selectors = []
+        selectors = self.__create()

-        # create processes
-        last_stdout = None
-        stdin = subprocess.PIPE
-        for item in self.items:
+        if not selectors:
+            raise (Exception("Cant use cmdpipe without any output handlers."))

-            item.create(stdin)
-            selectors.append(item.process.stderr)
-
-            if last_stdout is None:
-                # we're the first process in the pipe, do we have some input?
-                if self.inp is not None:
-                    # TODO: make streaming to support big inputs?
-                    item.process.stdin.write(self.inp.encode('utf-8'))
-                item.process.stdin.close()
-            else:
-                # last stdout was piped to this stdin already, so close it because we dont need it anymore
-                last_stdout.close()
-
-            last_stdout = item.process.stdout
-            stdin = last_stdout
-
-        # monitor last stdout as well
-        selectors.append(last_stdout)
-
-        while True:
-            # wait for output on one of the stderrs or last_stdout
-            (read_ready, write_ready, ex_ready) = select.select(selectors, [], [])
-            eof_count = 0
-            done_count = 0
-
-            # read line and call appropriate handlers
-            if last_stdout in read_ready:
-                line = last_stdout.readline().decode('utf-8').rstrip()
-                if line != "":
-                    stdout_handler(line)
-                else:
-                    eof_count = eof_count + 1
-
-            for item in self.items:
-                if item.process.stderr in read_ready:
-                    line = item.process.stderr.readline().decode('utf-8').rstrip()
-                    if line != "":
-                        item.stderr_handler(line)
-                    else:
-                        eof_count = eof_count + 1
-
-                if item.process.poll() is not None:
-                    done_count = done_count + 1
-
-            # all filehandles are eof and all processes are done (poll() is not None)
-            if eof_count == len(selectors) and done_count == len(self.items):
-                break
+        self.__process_outputs(selectors)

        # close filehandles
-        last_stdout.close()
        for item in self.items:
            item.process.stderr.close()
+            item.process.stdout.close()

        # call exit handlers
        success = True
@ -158,3 +129,86 @@ class CmdPipe:
                success=item.exit_handler(item.process.returncode) and success

        return success
+
+    def __process_outputs(self, selectors):
+        """watch all output selectors and call handlers"""
+
+        while True:
+            # wait for output on one of the stderrs or last_stdout
+            (read_ready, write_ready, ex_ready) = select.select(selectors, [], [])
+
+            eof_count = 0
+            done_count = 0
+
+            # read line and call appropriate handlers
+
+            for item in self.items:
+                if item.process.stdout in read_ready:
+                    line = item.process.stdout.readline().decode('utf-8').rstrip()
+                    if line != "":
+                        item.stdout_handler(line)
+                    else:
+                        eof_count = eof_count + 1
+                        if item.next:
+                            item.next.process.stdin.close()
+
+                if item.process.stderr in read_ready:
+                    line = item.process.stderr.readline().decode('utf-8').rstrip()
+                    if line != "":
+                        item.stderr_handler(line)
+                    else:
+                        eof_count = eof_count + 1
+
+
+                if item.process.poll() is not None:
+                    done_count = done_count + 1
+
+            # all filehandles are eof and all processes are done (poll() is not None)
+            if eof_count == len(selectors) and done_count == len(self.items):
+                break
+
+
+
+    def __create(self):
+        """create actual processes, do piping and return selectors."""
+
+        selectors = []
+        next_stdin = subprocess.PIPE  # means we write input via python instead of an actual system pipe
+        first = True
+        prev_item = None
+
+        for item in self.items:
+
+            # creates the actual subprocess via subprocess.Popen
+            item.create(next_stdin)
+
+            # previous process was piped into this one? don't forget to close its stdout
+            if next_stdin != subprocess.PIPE:
+                next_stdin.close()
+
+            if item.stderr_handler:
+                selectors.append(item.process.stderr)
+
+            # we're the first process in the pipe
+            if first:
+                if self.inp is not None:
+                    # write the input we have
+                    item.process.stdin.write(self.inp.encode('utf-8'))
+                item.process.stdin.close()
+                first = False
+
+            # manual stdout handling or pipe it to the next process?
+            if item.stdout_handler is None:
+                # no manual stdout handling, pipe it to the next process via system pipe
+                next_stdin = item.process.stdout
+            else:
+                # manual stdout handling via python
+                selectors.append(item.process.stdout)
+                # next process will get input from python:
+                next_stdin = subprocess.PIPE
+
+            if prev_item is not None:
+                prev_item.next = item
+
+            prev_item = item
+        return selectors
--- a/zfs_autobackup/ExecuteNode.py
+++ b/zfs_autobackup/ExecuteNode.py
@ -54,15 +54,16 @@ class ExecuteNode(LogStub):
        if cmd==self.PIPE:
            return('|')
        else:
-            return(cmd_quote(cmd))
+            return cmd_quote(cmd)

-    def _shell_cmd(self, cmd):
+    def _shell_cmd(self, cmd, cwd):
        """prefix specified ssh shell to command and escape shell characters"""

        ret=[]

        #add remote shell
        if not self.is_local():
+            #note: dont escape this part (executed directly without shell)
            ret=["ssh"]

            if self.ssh_config is not None:
@ -70,7 +71,17 @@ class ExecuteNode(LogStub):

            ret.append(self.ssh_to)

-        ret.append(" ".join(map(self._quote, cmd)))
+        #note: DO escape from here, executed in either local or remote shell.
+
+        shell_str=""
+
+        #add cwd change?
+        if cwd is not None:
+            shell_str=shell_str + "cd " + self._quote(cwd) + " && "
+
+        shell_str=shell_str + " ".join(map(self._quote, cmd))
+
+        ret.append(shell_str)

        return ret

@ -78,22 +89,26 @@ class ExecuteNode(LogStub):
        return self.ssh_to is None

    def run(self, cmd, inp=None, tab_split=False, valid_exitcodes=None, readonly=False, hide_errors=False,
-            return_stderr=False, pipe=False):
+            return_stderr=False, pipe=False, return_all=False, cwd=None):
        """run a command on the node , checks output and parses/handle output and returns it

-        Either uses a local shell (sh -c) or remote shell (ssh) to execute the command. Therefore the command can have stuff like actual pipes in it, if you dont want to use pipe=True to pipe stuff.
+        Takes care of proper quoting/escaping/ssh and logging of stdout/err/exit codes.
+
+        Either uses a local shell (sh -c) or remote shell (ssh) to execute the command.
+        Therefore the command can contain actual shell pipes, if you don't want to use pipe=True for piping.

        :param cmd: the actual command, should be a list, where the first item is the command
                    and the rest are parameters. use ExecuteNode.PIPE to add an unescaped |
                    (if you want to use system piping instead of python piping)
-        :param pipe: return CmdPipe instead of executing it.
+        :param pipe: return CmdPipe instead of executing it. (pipe this into another run() command via inp=...)
        :param inp: Can be None, a string or a CmdPipe that was previously returned.
        :param tab_split: split tabbed files in output into a list
-        :param valid_exitcodes: list of valid exit codes for this command (checks exit code of both sides of a pipe)
-                                Use [] to accept all exit codes. Default [0]
+        :param valid_exitcodes: list of valid exit codes for this command. Use [] to accept all exit codes. Default [0]
        :param readonly: make this True if the command doesn't make any changes and is safe to execute in testmode
        :param hide_errors: don't show stderr output as error, instead show it as debugging output (use to hide expected errors)
        :param return_stderr: return both stdout and stderr as a tuple. (normally only returns stdout)
+        :param return_all: return stdout, stderr and exit_code as a tuple. (normally only returns stdout)
+        :param cwd: Change current working directory before executing command.

        """

@ -106,6 +121,7 @@ class ExecuteNode(LogStub):

        # stderr parser
        error_lines = []
+        returned_exit_code=None

        def stderr_handler(line):
            if tab_split:
@ -128,23 +144,28 @@ class ExecuteNode(LogStub):

            return True

-        # add shell command and handlers to pipe
-        cmd_item=CmdItem(cmd=self._shell_cmd(cmd), readonly=readonly, stderr_handler=stderr_handler, exit_handler=exit_handler, shell=self.is_local())
-        cmd_pipe.add(cmd_item)
-
-        # return pipe instead of executing?
-        if pipe:
-            return cmd_pipe
-
        # stdout parser
        output_lines = []

-        def stdout_handler(line):
-            if tab_split:
-                output_lines.append(line.rstrip().split('\t'))
-            else:
-                output_lines.append(line.rstrip())
-            self._parse_stdout(line)
+        if pipe:
+            # don't specify an output handler, so stdout gets piped to the next process
+            stdout_handler=None
+        else:
+            # handle output manually, don't pipe it
+            def stdout_handler(line):
+                if tab_split:
+                    output_lines.append(line.rstrip().split('\t'))
+                else:
+                    output_lines.append(line.rstrip())
+                self._parse_stdout(line)
+
+        # add shell command and handlers to pipe
+        cmd_item=CmdItem(cmd=self._shell_cmd(cmd, cwd), readonly=readonly, stderr_handler=stderr_handler, exit_handler=exit_handler, shell=self.is_local(), stdout_handler=stdout_handler)
+        cmd_pipe.add(cmd_item)
+
+        # return CmdPipe instead of executing?
+        if pipe:
+            return cmd_pipe

        if cmd_pipe.should_execute():
            self.debug("CMD    > {}".format(cmd_pipe))
@ -152,10 +173,99 @@ class ExecuteNode(LogStub):
            self.debug("CMDSKIP> {}".format(cmd_pipe))

        # execute and calls handlers in CmdPipe
-        if not cmd_pipe.execute(stdout_handler=stdout_handler):
+        if not cmd_pipe.execute():
            raise(ExecuteError("Last command returned error"))

-        if return_stderr:
+        if return_all:
+            return output_lines, error_lines, cmd_item.process and cmd_item.process.returncode
+        elif return_stderr:
            return output_lines, error_lines
        else:
            return output_lines
+
+    def script(self, lines, inp=None, stdout_handler=None, stderr_handler=None, exit_handler=None, valid_exitcodes=None, readonly=False, hide_errors=False, pipe=False):
+        """Run a multiline script on the node.
+
+        This is much more low level than run() and allows for finer grained control.
+
+        Either uses a local shell (sh -c) or remote shell (ssh) to execute the command.
+        You need to do your own escaping/quoting.
+        It logs stderr and exit codes, but you should
+        specify your own stdout handler via stdout_handler.
+        Also specify the optional stderr/exit code handlers if you need them.
+        The stdout and stderr handlers are called for each line; the exit handler is called with the exit code.
+        It won't collect lines internally like run() does, so streams of data can be of unlimited size.
+
+        :param lines: list of lines of the actual script.
+        :param inp: Can be None, a string or a CmdPipe that was previously returned.
+        :param readonly: make this True if the command doesn't make any changes and is safe to execute in testmode
+        :param valid_exitcodes: list of valid exit codes for this command. Use [] to accept all exit codes. Default [0]
+        :param hide_errors: don't show stderr output as error, instead show it as debugging output (use to hide expected errors)
+        :param pipe: return CmdPipe instead of executing it. (pipe this into another run() command via inp=...)
+
+        """
+
+        # create new pipe?
+        if not isinstance(inp, CmdPipe):
+            cmd_pipe = CmdPipe(self.readonly, inp)
+        else:
+            # add stuff to existing pipe
+            cmd_pipe = inp
+
+        internal_stdout_handler=None
+        if stdout_handler is not None:
+            if self.debug_output:
+                def internal_stdout_handler(line):
+                    self.debug("STDOUT > " + line.rstrip())
+                    stdout_handler(line)
+            else:
+                internal_stdout_handler=stdout_handler
+
+        def internal_stderr_handler(line):
+            self._parse_stderr(line, hide_errors)
+            if stderr_handler is not None:
+                stderr_handler(line)
+
+        # exit code handler
+        if valid_exitcodes is None:
+            valid_exitcodes = [0]
+
+        def internal_exit_handler(exit_code):
+            if self.debug_output:
+                self.debug("EXIT   > {}".format(exit_code))
+
+            if exit_handler is not None:
+                exit_handler(exit_code)
+
+            if (valid_exitcodes != []) and (exit_code not in valid_exitcodes):
+                self.error("Script returned exit code {} (valid codes: {})".format(exit_code, valid_exitcodes))
+                return False
+
+            return True
+
+        #build command
+        cmd=[]
+
+        #add remote shell
+        if not self.is_local():
+            #note: dont escape this part (executed directly without shell)
+            cmd.append("ssh")
+
+            if self.ssh_config is not None:
+                cmd.extend(["-F", self.ssh_config])
+
+            cmd.append(self.ssh_to)
+
+        # convert to script
+        cmd.append("\n".join(lines))
+
+        # add shell command and handlers to pipe
+        cmd_item=CmdItem(cmd=cmd, readonly=readonly, stderr_handler=internal_stderr_handler, exit_handler=internal_exit_handler, stdout_handler=internal_stdout_handler, shell=self.is_local())
+        cmd_pipe.add(cmd_item)
+
+        self.debug("SCRIPT > {}".format(cmd_pipe))
+
+        if pipe:
+            return cmd_pipe
+        else:
+            return cmd_pipe.execute()
--- a/zfs_autobackup/LogConsole.py
+++ b/zfs_autobackup/LogConsole.py
@ -10,6 +10,7 @@ class LogConsole:
        self.last_log = ""
        self.show_debug = show_debug
        self.show_verbose = show_verbose
+        self._progress_uncleared=False

        if color:
            # try to use color, failback if colorama not available
@ -25,6 +26,7 @@ class LogConsole:
            self.colorama=False

    def error(self, txt):
+        self.clear_progress()
        if self.colorama:
            print(colorama.Fore.RED + colorama.Style.BRIGHT + "! " + txt + colorama.Style.RESET_ALL, file=sys.stderr)
        else:
@ -32,6 +34,7 @@ class LogConsole:
        sys.stderr.flush()

    def warning(self, txt):
+        self.clear_progress()
        if self.colorama:
            print(colorama.Fore.YELLOW + colorama.Style.BRIGHT + "  NOTE: " + txt + colorama.Style.RESET_ALL)
        else:
@ -40,6 +43,7 @@ class LogConsole:

    def verbose(self, txt):
        if self.show_verbose:
+            self.clear_progress()
            if self.colorama:
                print(colorama.Style.NORMAL + "  " + txt + colorama.Style.RESET_ALL)
            else:
@ -48,6 +52,7 @@ class LogConsole:

    def debug(self, txt):
        if self.show_debug:
+            self.clear_progress()
            if self.colorama:
                print(colorama.Fore.GREEN + "# " + txt + colorama.Style.RESET_ALL)
            else:
@ -57,10 +62,13 @@ class LogConsole:
    def progress(self, txt):
        """print progress output to stderr (stays on same line)"""
        self.clear_progress()
+        self._progress_uncleared=True
        print(">>> {}\r".format(txt), end='', file=sys.stderr)
        sys.stderr.flush()

    def clear_progress(self):
-        import colorama
-        print(colorama.ansi.clear_line(), end='', file=sys.stderr)
-        sys.stderr.flush()
+        if self._progress_uncleared:
+            import colorama
+            print(colorama.ansi.clear_line(), end='', file=sys.stderr)
+            # sys.stderr.flush()
+            self._progress_uncleared=False
--- a/zfs_autobackup/TreeHasher.py
+++ b/zfs_autobackup/TreeHasher.py
@ -0,0 +1,60 @@
+import itertools
+import os
+
+
+class TreeHasher():
+    """uses BlockHasher recursively on a directory tree
+
+    Input and output generators are in the format: ( relative-filepath, chunk_nr, hexdigest)
+
+    """
+
+    def __init__(self, block_hasher):
+        """
+
+        :type block_hasher: BlockHasher
+        """
+        self.block_hasher=block_hasher
+
+    def generate(self, start_path):
+        """Use BlockHasher on every file in a tree, yielding the results
+
+        note that it only checks the contents of actual files. It ignores metadata like permissions and mtimes.
+        It also ignores empty directories, symlinks and special files.
+        """
+
+        def walkerror(e):
+            raise e
+
+        for (dirpath, dirnames, filenames) in os.walk(start_path, onerror=walkerror):
+            for f in filenames:
+                file_path=os.path.join(dirpath, f)
+
+                if (not os.path.islink(file_path)) and os.path.isfile(file_path):
+                    for (chunk_nr, hexdigest) in self.block_hasher.generate(file_path):
+                        yield (os.path.relpath(file_path, start_path), chunk_nr, hexdigest)
+
+
+    def compare(self, start_path, generator):
+        """reads from generator and compares blocks
+
+        yields mismatches in the form: ( relative_filename, chunk_nr, compare_hexdigest, actual_hexdigest )
+        yields errors in the form:     ( relative_filename, chunk_nr, compare_hexdigest, "message" )
+
+        """
+
+        count=0
+
+        def filter_file_name(file_name, chunk_nr, hexdigest):
+            return (chunk_nr, hexdigest)
+
+
+        for file_name, group_generator in itertools.groupby(generator, lambda x: x[0]):
+            count=count+1
+            block_generator=itertools.starmap(filter_file_name, group_generator)
+            for ( chunk_nr, compare_hexdigest, actual_hexdigest) in self.block_hasher.compare(os.path.join(start_path,file_name), block_generator):
+                yield ( file_name, chunk_nr, compare_hexdigest, actual_hexdigest )
+
+
+
+
--- a/zfs_autobackup/ZfsAuto.py
+++ b/zfs_autobackup/ZfsAuto.py
@ -0,0 +1,117 @@
+import argparse
+import sys
+
+from .CliBase import CliBase
+
+
+class ZfsAuto(CliBase):
+    """Common base class for ZfsAutobackup and ZfsAutoverify."""
+
+    def __init__(self, argv, print_arguments=True):
+
+        self.hold_name = None
+        self.snapshot_time_format = None
+        self.property_name = None
+        self.exclude_paths = None
+
+        super(ZfsAuto, self).__init__(argv, print_arguments)
+
+    def parse_args(self, argv):
+        """parse common arguments, setup logging, check and adjust parameters"""
+
+        args = super(ZfsAuto, self).parse_args(argv)
+
+        if args.backup_name is None:
+            self.parser.print_usage()
+            self.log.error("Please specify BACKUP-NAME")
+            sys.exit(255)
+
+        if args.target_path is not None and args.target_path.startswith("/"):
+            self.log.error("Target should not start with a /")
+            sys.exit(255)
+
+        if args.ignore_replicated:
+            self.warning("--ignore-replicated has been renamed, using --exclude-unchanged")
+            args.exclude_unchanged = True
+
+        # Note: Before version v3.1-beta5, we always used exclude_received. This was a problem if you wanted to
+        # replicate an existing backup to another host and use the same backupname/snapshots. However, exclude_received
+        # may still need to be used to explicitly exclude a backup with the 'received' source property to avoid accidental
+        # recursive replication of a zvol that is currently being received in another session (as it will have changes).
+
+        self.exclude_paths = []
+        if args.ssh_source == args.ssh_target:
+            if args.target_path:
+                # target and source are the same, make sure to exclude target_path
+                self.verbose("NOTE: Source and target are on the same host, excluding target-path from selection.")
+                self.exclude_paths.append(args.target_path)
+            else:
+                if not args.exclude_received:
+                    self.verbose("NOTE: Source and target are on the same host, adding --exclude-received to commandline.")
+                    args.exclude_received = True
+
+        if args.test:
+            self.warning("TEST MODE - SIMULATING WITHOUT MAKING ANY CHANGES")
+
+        #format all the names
+        self.property_name = args.property_format.format(args.backup_name)
+        self.snapshot_time_format = args.snapshot_format.format(args.backup_name)
+        self.hold_name = args.hold_format.format(args.backup_name)
+
+        self.verbose("")
+        self.verbose("Selecting dataset property : {}".format(self.property_name))
+        self.verbose("Snapshot format            : {}".format(self.snapshot_time_format))
+
+        return args
+
+    def get_parser(self):
+
+        parser = super(ZfsAuto, self).get_parser()
+
+        #positional arguments
+        parser.add_argument('backup_name', metavar='BACKUP-NAME', default=None, nargs='?',
+                            help='Name of the backup to select')
+
+        parser.add_argument('target_path', metavar='TARGET-PATH', default=None, nargs='?',
+                            help='Target ZFS filesystem (optional)')
+
+
+
+        # SSH options
+        group=parser.add_argument_group("SSH options")
+        group.add_argument('--ssh-config', metavar='CONFIG-FILE', default=None, help='Custom ssh client config')
+        group.add_argument('--ssh-source', metavar='USER@HOST', default=None,
+                            help='Source host to pull backup from.')
+        group.add_argument('--ssh-target', metavar='USER@HOST', default=None,
+                            help='Target host to push backup to.')
+
+        group=parser.add_argument_group("String formatting options")
+        group.add_argument('--property-format', metavar='FORMAT', default="autobackup:{}",
+                            help='Dataset selection string format. Default: %(default)s')
+        group.add_argument('--snapshot-format', metavar='FORMAT', default="{}-%Y%m%d%H%M%S",
+                            help='ZFS Snapshot string format. Default: %(default)s')
+        group.add_argument('--hold-format', metavar='FORMAT', default="zfs_autobackup:{}",
+                            help='ZFS hold string format. Default: %(default)s')
+        group.add_argument('--strip-path', metavar='N', default=0, type=int,
+                           help='Number of directories to strip from target path (use 1 when cloning zones between 2 '
+                                'SmartOS machines)')
+
+        group=parser.add_argument_group("Selection options")
+        group.add_argument('--ignore-replicated', action='store_true', help=argparse.SUPPRESS)
+        group.add_argument('--exclude-unchanged', action='store_true',
+                            help='Exclude datasets that have no changes since any last snapshot. (Useful in combination with proxmox HA replication)')
+        group.add_argument('--exclude-received', action='store_true',
+                            help='Exclude datasets that have the origin of their autobackup: property as "received". '
+                                 'This can avoid recursive replication between two backup partners.')
+
+        return parser
+
+    def print_error_sources(self):
+        self.error(
+            "No source filesystems selected, please do a 'zfs set autobackup:{0}=true' on the source datasets "
+            "you want to select.".format(
+                self.args.backup_name))
+
+    def make_target_name(self, source_dataset):
+        """make target_name from a source_dataset"""
+        return self.args.target_path + "/" + source_dataset.lstrip_path(self.args.strip_path)
--- a/zfs_autobackup/ZfsAutobackup.py
+++ b/zfs_autobackup/ZfsAutobackup.py
@ -1,170 +1,34 @@
-import argparse
-import sys
 import time

+import argparse
+from signal import signal, SIGPIPE
+from .util import output_redir, sigpipe_handler
+
+from .ZfsAuto import ZfsAuto
+
 from . import compressors
 from .ExecuteNode import ExecuteNode
 from .Thinner import Thinner
 from .ZfsDataset import ZfsDataset
-from .LogConsole import LogConsole
 from .ZfsNode import ZfsNode
 from .ThinnerRule import ThinnerRule
 import os.path

-class ZfsAutobackup:
-    """main class"""
-
-    VERSION = "3.1.2-rc2"
-    HEADER = "zfs-autobackup v{} - (c)2021 E.H.Eefting (edwin@datux.nl)".format(VERSION)
+class ZfsAutobackup(ZfsAuto):
+    """The main zfs-autobackup class. Start here, at run() :)"""

    def __init__(self, argv, print_arguments=True):

-        # helps with investigating failed regression tests:
-        if print_arguments:
-            print("ARGUMENTS: " + " ".join(argv))
+        # NOTE: common options and parameters are in ZfsAuto
+        super(ZfsAutobackup, self).__init__(argv, print_arguments)

-        parser = argparse.ArgumentParser(
-            description=self.HEADER,
-            epilog='Full manual at: https://github.com/psy0rz/zfs_autobackup')
-        parser.add_argument('--ssh-config', metavar='CONFIG-FILE', default=None, help='Custom ssh client config')
-        parser.add_argument('--ssh-source', metavar='USER@HOST', default=None,
-                            help='Source host to get backup from.')
-        parser.add_argument('--ssh-target', metavar='USER@HOST', default=None,
-                            help='Target host to push backup to.')
-        parser.add_argument('--keep-source', metavar='SCHEDULE', type=str, default="10,1d1w,1w1m,1m1y",
-                            help='Thinning schedule for old source snapshots. Default: %(default)s')
-        parser.add_argument('--keep-target', metavar='SCHEDULE', type=str, default="10,1d1w,1w1m,1m1y",
-                            help='Thinning schedule for old target snapshots. Default: %(default)s')
+    def parse_args(self, argv):
+        """do extra checks on common args"""

-        parser.add_argument('backup_name', metavar='BACKUP-NAME', default=None, nargs='?',
-                            help='Name of the backup (you should set the zfs property "autobackup:backup-name" to '
-                                 'true on filesystems you want to backup')
-        parser.add_argument('target_path', metavar='TARGET-PATH', default=None, nargs='?',
-                            help='Target ZFS filesystem (optional: if not specified, zfs-autobackup will only operate '
-                                 'as snapshot-tool on source)')
+        args = super(ZfsAutobackup, self).parse_args(argv)

-        parser.add_argument('--pre-snapshot-cmd', metavar="COMMAND", default=[], action='append',
-                            help='Run COMMAND before snapshotting (can be used multiple times.')
-        parser.add_argument('--post-snapshot-cmd', metavar="COMMAND", default=[], action='append',
-                            help='Run COMMAND after snapshotting (can be used multiple times.')
-        parser.add_argument('--other-snapshots', action='store_true',
-                            help='Send over other snapshots as well, not just the ones created by this tool.')
-        parser.add_argument('--no-snapshot', action='store_true',
-                            help='Don\'t create new snapshots (useful for finishing uncompleted backups, or cleanups)')
-        parser.add_argument('--no-send', action='store_true',
-                            help='Don\'t send snapshots (useful for cleanups, or if you want a serperate send-cronjob)')
-        parser.add_argument('--no-thinning', action='store_true', help="Do not destroy any snapshots.")
-        parser.add_argument('--no-holds', action='store_true',
-                            help='Don\'t hold snapshots. (Faster. Allows you to destroy common snapshot.)')
-        parser.add_argument('--min-change', metavar='BYTES', type=int, default=1,
-                            help='Number of bytes written after which we consider a dataset changed (default %('
-                                 'default)s)')
-        parser.add_argument('--allow-empty', action='store_true',
-                            help='If nothing has changed, still create empty snapshots. (same as --min-change=0)')
-
-        parser.add_argument('--ignore-replicated', action='store_true', help=argparse.SUPPRESS)
-        parser.add_argument('--exclude-unchanged', action='store_true',
-                            help='Exclude datasets that have no changes since any last snapshot. (Useful in combination with proxmox HA replication)')
-        parser.add_argument('--exclude-received', action='store_true',
-                            help='Exclude datasets that have the origin of their autobackup: property as "received". '
-                                 'This can avoid recursive replication between two backup partners.')
-        parser.add_argument('--strip-path', metavar='N', default=0, type=int,
-                            help='Number of directories to strip from target path (use 1 when cloning zones between 2 '
-                                 'SmartOS machines)')
-
-        parser.add_argument('--clear-refreservation', action='store_true',
-                            help='Filter "refreservation" property. (recommended, safes space. same as '
-                                 '--filter-properties refreservation)')
-        parser.add_argument('--clear-mountpoint', action='store_true',
-                            help='Set property canmount=noauto for new datasets. (recommended, prevents mount '
-                                 'conflicts. same as --set-properties canmount=noauto)')
-        parser.add_argument('--filter-properties', metavar='PROPERTY,...', type=str,
-                            help='List of properties to "filter" when receiving filesystems. (you can still restore '
-                                 'them with zfs inherit -S)')
-        parser.add_argument('--set-properties', metavar='PROPERTY=VALUE,...', type=str,
-                            help='List of propererties to override when receiving filesystems. (you can still restore '
-                                 'them with zfs inherit -S)')
-        parser.add_argument('--rollback', action='store_true',
-                            help='Rollback changes to the latest target snapshot before starting. (normally you can '
-                                 'prevent changes by setting the readonly property on the target_path to on)')
-        parser.add_argument('--force', '-F', action='store_true',
-                            help='Use zfs -F option to force overwrite/rollback. (Usefull with --strip-path=1, but use with care)')
-        parser.add_argument('--destroy-incompatible', action='store_true',
-                            help='Destroy incompatible snapshots on target. Use with care! (implies --rollback)')
-        parser.add_argument('--destroy-missing', metavar="SCHEDULE", type=str, default=None,
-                            help='Destroy datasets on target that are missing on the source. Specify the time since '
-                                 'the last snapshot, e.g: --destroy-missing 30d')
-        parser.add_argument('--ignore-transfer-errors', action='store_true',
-                            help='Ignore transfer errors (still checks if received filesystem exists. useful for '
-                                 'acltype errors)')
-
-        parser.add_argument('--decrypt', action='store_true',
-                            help='Decrypt data before sending it over.')
-
-        parser.add_argument('--encrypt', action='store_true',
-                            help='Encrypt data after receiving it.')
-
-        parser.add_argument('--zfs-compressed', action='store_true',
-                            help='Transfer blocks that already have zfs-compression as-is.')
-
-        parser.add_argument('--test','--dry-run', '-n', action='store_true',
-                            help='Dry run, dont change anything, just show what would be done (still does all read-only '
-                                 'operations)')
-        parser.add_argument('--verbose','-v', action='store_true', help='verbose output')
-        parser.add_argument('--debug','-d', action='store_true',
-                            help='Show zfs commands that are executed, stops after an exception.')
-        parser.add_argument('--debug-output', action='store_true',
-                            help='Show zfs commands and their output/exit codes. (noisy)')
-        parser.add_argument('--progress', action='store_true',
-                            help='show zfs progress output. Enabled automaticly on ttys. (use --no-progress to disable)')
-        parser.add_argument('--no-progress', action='store_true',
-                            help=argparse.SUPPRESS)  # needed to workaround a zfs recv -v bug
-
-        parser.add_argument('--resume', action='store_true', help=argparse.SUPPRESS)
-        parser.add_argument('--raw', action='store_true', help=argparse.SUPPRESS)
-
-        # these things all do stuff by piping zfs send/recv IO
-        parser.add_argument('--send-pipe', metavar="COMMAND", default=[], action='append',
-                            help='pipe zfs send output through COMMAND (can be used multiple times)')
-        parser.add_argument('--recv-pipe', metavar="COMMAND", default=[], action='append',
-                            help='pipe zfs recv input through COMMAND (can be used multiple times)')
-        parser.add_argument('--compress', metavar='TYPE', default=None, nargs='?', const='zstd-fast',
-                            choices=compressors.choices(),
-                            help='Use compression during transfer, defaults to zstd-fast if TYPE is not specified. ({})'.format(
-                                ", ".join(compressors.choices())))
-        parser.add_argument('--rate', metavar='DATARATE', default=None,
-                            help='Limit data transfer rate (e.g. 128K. requires mbuffer.)')
-        parser.add_argument('--buffer', metavar='SIZE', default=None,
-                            help='Add zfs send and recv buffers to smooth out IO bursts. (e.g. 128M. requires mbuffer)')
-
-        parser.add_argument('--snapshot-format', metavar='FORMAT', default="{}-%Y%m%d%H%M%S",
-                            help='Snapshot naming format. Default: %(default)s')
-        parser.add_argument('--property-format', metavar='FORMAT', default="autobackup:{}",
-                            help='Select property naming format. Default: %(default)s')
-        parser.add_argument('--hold-format', metavar='FORMAT', default="zfs_autobackup:{}",
-                            help='Hold naming format. Default: %(default)s')
-
-        parser.add_argument('--version', action='store_true',
-                            help='Show version.')
-
-        # note args is the only global variable we use, since its a global readonly setting anyway
-        args = parser.parse_args(argv)
-
-        self.args = args
-
-        if args.version:
-            print(self.HEADER)
-            sys.exit(255)
-
-        # auto enable progress?
-        if sys.stderr.isatty() and not args.no_progress:
-            args.progress = True
-
-        if args.debug_output:
-            args.debug = True
-
-        if self.args.test:
-            self.args.verbose = True
+        if not args.no_holds:
+            self.verbose("Hold name                  : {}".format(self.hold_name))

        if args.allow_empty:
            args.min_change = 0
@ -172,14 +36,6 @@ class ZfsAutobackup:
        if args.destroy_incompatible:
            args.rollback = True

-        self.log = LogConsole(show_debug=self.args.debug, show_verbose=self.args.verbose, color=sys.stdout.isatty())
-        self.verbose(self.HEADER)
-
-        if args.backup_name==None:
-            parser.print_usage()
-            self.log.error("Please specify BACKUP-NAME")
-            sys.exit(255)
-
        if args.resume:
            self.warning("The --resume option isn't needed anymore (its autodetected now)")

@ -187,41 +43,99 @@ class ZfsAutobackup:
            self.warning(
                "The --raw option isn't needed anymore (its autodetected now). Also see --encrypt and --decrypt.")

-        if args.target_path is not None and args.target_path[0] == "/":
-            self.log.error("Target should not start with a /")
-            sys.exit(255)
-
        if args.compress and args.ssh_source is None and args.ssh_target is None:
            self.warning("Using compression, but transfer is local.")

        if args.compress and args.zfs_compressed:
            self.warning("Using --compress with --zfs-compressed, might be inefficient.")

-        if args.ignore_replicated:
-            self.warning("--ignore-replicated has been renamed, using --exclude-unchanged")
-            args.exclude_unchanged = True
+        return args

-    def verbose(self, txt):
-        self.log.verbose(txt)
+    def get_parser(self):
+        """extend common parser with extra stuff needed for zfs-autobackup"""

-    def warning(self, txt):
-        self.log.warning(txt)
+        parser = super(ZfsAutobackup, self).get_parser()

-    def error(self, txt):
-        self.log.error(txt)
+        group = parser.add_argument_group("Snapshot options")
+        group.add_argument('--no-snapshot', action='store_true',
+                           help='Don\'t create new snapshots (useful for finishing uncompleted backups, or cleanups)')
+        group.add_argument('--pre-snapshot-cmd', metavar="COMMAND", default=[], action='append',
+                           help='Run COMMAND before snapshotting (can be used multiple times)')
+        group.add_argument('--post-snapshot-cmd', metavar="COMMAND", default=[], action='append',
+                           help='Run COMMAND after snapshotting (can be used multiple times)')
+        group.add_argument('--min-change', metavar='BYTES', type=int, default=1,
+                           help='Only create snapshot if enough bytes are changed. (default %('
+                                'default)s)')
+        group.add_argument('--allow-empty', action='store_true',
+                           help='If nothing has changed, still create empty snapshots. (Faster. Same as --min-change=0)')
+        group.add_argument('--other-snapshots', action='store_true',
+                           help='Send over other snapshots as well, not just the ones created by this tool.')

-    def debug(self, txt):
-        self.log.debug(txt)
+        group = parser.add_argument_group("Transfer options")
+        group.add_argument('--no-send', action='store_true',
+                           help='Don\'t transfer snapshots (useful for cleanups, or if you want a separate send-cronjob)')
+        group.add_argument('--no-holds', action='store_true',
+                           help='Don\'t hold snapshots. (Faster. Allows you to destroy common snapshot.)')
+        group.add_argument('--clear-refreservation', action='store_true',
+                           help='Filter "refreservation" property. (recommended, saves space. same as '
+                                '--filter-properties refreservation)')
+        group.add_argument('--clear-mountpoint', action='store_true',
+                           help='Set property canmount=noauto for new datasets. (recommended, prevents mount '
+                                'conflicts. same as --set-properties canmount=noauto)')
+        group.add_argument('--filter-properties', metavar='PROPERTY,...', type=str,
+                           help='List of properties to "filter" when receiving filesystems. (you can still restore '
+                                'them with zfs inherit -S)')
+        group.add_argument('--set-properties', metavar='PROPERTY=VALUE,...', type=str,
+                           help='List of properties to override when receiving filesystems. (you can still restore '
+                                'them with zfs inherit -S)')
+        group.add_argument('--rollback', action='store_true',
+                           help='Rollback changes to the latest target snapshot before starting. (normally you can '
+                                'prevent changes by setting the readonly property on the target_path to on)')
+        group.add_argument('--force', '-F', action='store_true',
+                           help='Use zfs -F option to force overwrite/rollback. (Useful with --strip-path=1, but use with care)')
+        group.add_argument('--destroy-incompatible', action='store_true',
+                           help='Destroy incompatible snapshots on target. Use with care! (implies --rollback)')
+        group.add_argument('--ignore-transfer-errors', action='store_true',
+                           help='Ignore transfer errors (still checks if received filesystem exists. useful for '
+                                'acltype errors)')

-    def set_title(self, title):
-        self.log.verbose("")
-        self.log.verbose("#### " + title)
+        group.add_argument('--decrypt', action='store_true',
+                           help='Decrypt data before sending it over.')
+        group.add_argument('--encrypt', action='store_true',
+                           help='Encrypt data after receiving it.')

-    def progress(self, txt):
-        self.log.progress(txt)
+        group.add_argument('--zfs-compressed', action='store_true',
+                           help='Transfer blocks that already have zfs-compression as-is.')

-    def clear_progress(self):
-        self.log.clear_progress()
+        group = parser.add_argument_group("ZFS send/recv pipes")
+        group.add_argument('--compress', metavar='TYPE', default=None, nargs='?', const='zstd-fast',
+                           choices=compressors.choices(),
+                           help='Use compression during transfer, defaults to zstd-fast if TYPE is not specified. ({})'.format(
+                               ", ".join(compressors.choices())))
+        group.add_argument('--rate', metavar='DATARATE', default=None,
+                           help='Limit data transfer rate (e.g. 128K. requires mbuffer.)')
+        group.add_argument('--buffer', metavar='SIZE', default=None,
+                           help='Add zfs send and recv buffers to smooth out IO bursts. (e.g. 128M. requires mbuffer)')
+        group.add_argument('--send-pipe', metavar="COMMAND", default=[], action='append',
+                           help='pipe zfs send output through COMMAND (can be used multiple times)')
+        group.add_argument('--recv-pipe', metavar="COMMAND", default=[], action='append',
+                           help='pipe zfs recv input through COMMAND (can be used multiple times)')
+
+        group = parser.add_argument_group("Thinner options")
+        group.add_argument('--no-thinning', action='store_true', help="Do not destroy any snapshots.")
+        group.add_argument('--keep-source', metavar='SCHEDULE', type=str, default="10,1d1w,1w1m,1m1y",
+                           help='Thinning schedule for old source snapshots. Default: %(default)s')
+        group.add_argument('--keep-target', metavar='SCHEDULE', type=str, default="10,1d1w,1w1m,1m1y",
+                           help='Thinning schedule for old target snapshots. Default: %(default)s')
+        group.add_argument('--destroy-missing', metavar="SCHEDULE", type=str, default=None,
+                           help='Destroy datasets on target that are missing on the source. Specify the time since '
+                                'the last snapshot, e.g: --destroy-missing 30d')
+
+        # obsolete
+        parser.add_argument('--resume', action='store_true', help=argparse.SUPPRESS)
+        parser.add_argument('--raw', action='store_true', help=argparse.SUPPRESS)
+
+        return parser

    # NOTE: this method also uses self.args. args that need extra processing are passed as function parameters:
    def thin_missing_targets(self, target_dataset, used_target_datasets):
@ -245,8 +159,8 @@ class ZfsAutobackup:
            except Exception as e:
                dataset.error("Error during thinning of missing datasets ({})".format(str(e)))

-        if self.args.progress:
-            self.clear_progress()
+        # if self.args.progress:
+        #     self.clear_progress()

    # NOTE: this method also uses self.args. args that need extra processing are passed as function parameters:
    def destroy_missing_targets(self, target_dataset, used_target_datasets):
@ -305,10 +219,13 @@ class ZfsAutobackup:
                                dataset.destroy(fail_exception=True)

            except Exception as e:
+                # if self.args.progress:
+                #     self.clear_progress()
+
                dataset.error("Error during --destroy-missing: {}".format(str(e)))

-        if self.args.progress:
-            self.clear_progress()
+        # if self.args.progress:
+        #     self.clear_progress()

    def get_send_pipes(self, logger):
        """determine the zfs send pipe"""
@ -413,7 +330,7 @@ class ZfsAutobackup:
            try:
                # determine corresponding target_dataset
                target_name = self.make_target_name(source_dataset)
-                target_dataset = ZfsDataset(target_node, target_name)
+                target_dataset = target_node.get_dataset(target_name)
                target_datasets.append(target_dataset)

                # ensure parents exists
@ -425,8 +342,8 @@ class ZfsAutobackup:
                    target_dataset.parent.create_filesystem(parents=True)

                # determine common zpool features (cached, so no problem we call it often)
-                source_features = source_node.get_zfs_pool(source_dataset.split_path()[0]).features
-                target_features = target_node.get_zfs_pool(target_dataset.split_path()[0]).features
+                source_features = source_node.get_pool(source_dataset).features
+                target_features = target_node.get_pool(target_dataset).features
                common_features = source_features and target_features

                # sync the snapshots of this dataset
@ -442,15 +359,19 @@ class ZfsAutobackup:
                                              decrypt=self.args.decrypt, encrypt=self.args.encrypt,
                                              zfs_compressed=self.args.zfs_compressed, force=self.args.force)
            except Exception as e:
+                # if self.args.progress:
+                #     self.clear_progress()
+
                fail_count = fail_count + 1
                source_dataset.error("FAILED: " + str(e))
                if self.args.debug:
+                    self.verbose("Debug mode, aborting on first error")
                    raise

-        if self.args.progress:
-            self.clear_progress()
+        # if self.args.progress:
+        #     self.clear_progress()

-        target_path_dataset = ZfsDataset(target_node, self.args.target_path)
+        target_path_dataset = target_node.get_dataset(self.args.target_path)
        if not self.args.no_thinning:
            self.thin_missing_targets(target_dataset=target_path_dataset, used_target_datasets=target_datasets)

@ -494,22 +415,6 @@ class ZfsAutobackup:

        try:

-            if self.args.test:
-                self.warning("TEST MODE - SIMULATING WITHOUT MAKING ANY CHANGES")
-
-            #format all the names
-            property_name = self.args.property_format.format(self.args.backup_name)
-            snapshot_time_format = self.args.snapshot_format.format(self.args.backup_name)
-            hold_name = self.args.hold_format.format(self.args.backup_name)
-
-            self.verbose("")
-            self.verbose("Selecting dataset property : {}".format(property_name))
-            self.verbose("Snapshot format            : {}".format(snapshot_time_format))
-
-            if not self.args.no_holds:
-                self.verbose("Hold name                  : {}".format(hold_name))
-
-
            ################ create source zfsNode
            self.set_title("Source settings")

@ -518,44 +423,26 @@ class ZfsAutobackup:
                source_thinner = None
            else:
                source_thinner = Thinner(self.args.keep_source)
-            source_node = ZfsNode(snapshot_time_format=snapshot_time_format, hold_name=hold_name, logger=self, ssh_config=self.args.ssh_config,
+            source_node = ZfsNode(snapshot_time_format=self.snapshot_time_format, hold_name=self.hold_name, logger=self,
+                                  ssh_config=self.args.ssh_config,
                                  ssh_to=self.args.ssh_source, readonly=self.args.test,
                                  debug_output=self.args.debug_output, description=description, thinner=source_thinner)

-
            ################# select source datasets
            self.set_title("Selecting")
-
-            # Note: Before version v3.1-beta5, we always used exclude_received. This was a problem if you wanted to
-            # replicate an existing backup to another host and use the same backupname/snapshots. However, exclude_received
-            # may still need to be used to explicitly exclude a backup with the 'received' source to avoid accidental
-            # recursive replication of a zvol that is currently being received in another session (as it will have changes).
-            exclude_paths = []
-            exclude_received = self.args.exclude_received
-            if self.args.ssh_source == self.args.ssh_target:
-                if self.args.target_path:
-                    # target and source are the same, make sure to exclude target_path
-                    self.warning("Source and target are on the same host, excluding target-path from selection.")
-                    exclude_paths.append(self.args.target_path)
-                else:
-                    self.warning("Source and target are on the same host, excluding received datasets from selection.")
-                    exclude_received = True
-
-            source_datasets = source_node.selected_datasets(property_name=property_name,exclude_received=exclude_received,
-                                                                     exclude_paths=exclude_paths,
-                                                                     exclude_unchanged=self.args.exclude_unchanged,
-                                                                     min_change=self.args.min_change)
+            source_datasets = source_node.selected_datasets(property_name=self.property_name,
+                                                            exclude_received=self.args.exclude_received,
+                                                            exclude_paths=self.exclude_paths,
+                                                            exclude_unchanged=self.args.exclude_unchanged,
+                                                            min_change=self.args.min_change)
            if not source_datasets:
-                self.error(
-                    "No source filesystems selected, please do a 'zfs set autobackup:{0}=true' on the source datasets "
-                    "you want to select.".format(
-                        self.args.backup_name))
+                self.print_error_sources()
                return 255

            ################# snapshotting
            if not self.args.no_snapshot:
                self.set_title("Snapshotting")
-                snapshot_name=time.strftime(snapshot_time_format)
+                snapshot_name = time.strftime(self.snapshot_time_format)
                source_node.consistent_snapshot(source_datasets, snapshot_name,
                                                min_changed_bytes=self.args.min_change,
                                                pre_snapshot_cmds=self.args.pre_snapshot_cmd,
@ -571,7 +458,8 @@ class ZfsAutobackup:
                    target_thinner = None
                else:
                    target_thinner = Thinner(self.args.keep_target)
-                target_node = ZfsNode(snapshot_time_format=snapshot_time_format, hold_name=hold_name, logger=self, ssh_config=self.args.ssh_config,
+                target_node = ZfsNode(snapshot_time_format=self.snapshot_time_format, hold_name=self.hold_name,
+                                      logger=self, ssh_config=self.args.ssh_config,
                                      ssh_to=self.args.ssh_target,
                                      readonly=self.args.test, debug_output=self.args.debug_output,
                                      description="[Target]",
@ -581,7 +469,7 @@ class ZfsAutobackup:
                self.set_title("Synchronising")

                # check if exists, to prevent vague errors
-                target_dataset = ZfsDataset(target_node, self.args.target_path)
+                target_dataset = target_node.get_dataset(self.args.target_path)
                if not target_dataset.exists:
                    raise (Exception(
                        "Target path '{}' does not exist. Please create this dataset first.".format(target_dataset)))
@ -618,6 +506,7 @@ class ZfsAutobackup:
                self.verbose("")
                self.warning("TEST MODE - DID NOT MAKE ANY CHANGES!")

+            self.clear_progress()
            return fail_count

        except Exception as e:
@ -628,3 +517,15 @@ class ZfsAutobackup:
        except KeyboardInterrupt:
            self.error("Aborted")
            return 255
+
+
+def cli():
+    import sys
+
+    signal(SIGPIPE, sigpipe_handler)
+
+    sys.exit(ZfsAutobackup(sys.argv[1:], False).run())
+
+
+if __name__ == "__main__":
+    cli()
--- a/zfs_autobackup/ZfsAutoverify.py
+++ b/zfs_autobackup/ZfsAutoverify.py
@ -0,0 +1,314 @@
+# from util import activate_volume_snapshot, create_mountpoints, cleanup_mountpoint
+from signal import signal, SIGPIPE
+from .util import output_redir, sigpipe_handler
+
+from .ZfsAuto import ZfsAuto
+from .ZfsNode import ZfsNode
+import sys
+
+
+# # try to be as unix compatible as possible, while still having decent performance
+# def compare_trees_find(source_node, source_path, target_node, target_path):
+#     # find /tmp/zfstmp_pve1_1993135target/ -xdev -type f -print0 | xargs -0 md5sum | md5sum -c
+#
+#     #verify tree has atleast one file
+#
+#     stdout=source_node.run(["find", ".", "-type", "f",
+#                           ExecuteNode.PIPE, "head", "-n1",
+#                           ], cwd=source_path)
+#
+#     if not stdout:
+#         source_node.debug("No files, skipping check")
+#     else:
+#         pipe=source_node.run(["find", ".", "-type", "f", "-print0",
+#                               ExecuteNode.PIPE, "xargs", "-0", "md5sum"
+#                               ], pipe=True, cwd=source_path)
+#         stdout=target_node.run([ "md5sum", "-c", "--quiet"], inp=pipe, cwd=target_path, valid_exitcodes=[0,1])
+#
+#         if len(stdout):
+#             for line in stdout:
+#                 target_node.error("md5sum: "+line)
+#
+#             raise(Exception("Some files have checksum errors"))
+#
+#
+# def compare_trees_rsync(source_node, source_path, target_node, target_path):
+#     """use rsync to compare two trees.
+#      Advantage is that we can see which individual files differ.
+#      But requires rsync and cant do remote to remote."""
+#
+#     cmd = ["rsync", "-rcnq", "--info=COPY,DEL,MISC,NAME,SYMSAFE", "--msgs2stderr", "--delete" ]
+#
+#     #local
+#     if source_node.ssh_to is None and target_node.ssh_to is None:
+#         cmd.append("{}/".format(source_path))
+#         cmd.append("{}/".format(target_path))
+#         source_node.debug("Running rsync locally, on source.")
+#         stdout, stderr = source_node.run(cmd, return_stderr=True)
+#
+#     #source is local
+#     elif source_node.ssh_to is None and target_node.ssh_to is not None:
+#         cmd.append("{}/".format(source_path))
+#         cmd.append("{}:{}/".format(target_node.ssh_to, target_path))
+#         source_node.debug("Running rsync locally, on source.")
+#         stdout, stderr = source_node.run(cmd, return_stderr=True)
+#
+#     #target is local
+#     elif source_node.ssh_to is not None and target_node.ssh_to is None:
+#         cmd.append("{}:{}/".format(source_node.ssh_to, source_path))
+#         cmd.append("{}/".format(target_path))
+#         source_node.debug("Running rsync locally, on target.")
+#         stdout, stderr=target_node.run(cmd, return_stderr=True)
+#
+#     else:
+#         raise Exception("Source and target cant both be remote when verifying. (rsync limitation)")
+#
+#     if stderr:
+#         raise Exception("Dataset verify failed, see above list for differences")
+
+
+def verify_filesystem(source_snapshot, source_mnt, target_snapshot, target_mnt, method):
+    """Compare the contents of two zfs filesystem snapshots """
+
+    try:
+
+        # mount the snapshots
+        source_snapshot.mount(source_mnt)
+        target_snapshot.mount(target_mnt)
+
+        if method=='rsync':
+            compare_trees_rsync(source_snapshot.zfs_node, source_mnt, target_snapshot.zfs_node, target_mnt)
+        # elif method == 'tar':
+        #     compare_trees_tar(source_snapshot.zfs_node, source_mnt, target_snapshot.zfs_node, target_mnt)
+        elif method == 'find':
+            compare_trees_find(source_snapshot.zfs_node, source_mnt, target_snapshot.zfs_node, target_mnt)
+        else:
+            raise Exception("program error, unknown method")
+
+    finally:
+        source_snapshot.unmount()
+        target_snapshot.unmount()
+
+
+# def hash_dev(node, dev):
+#     """calculate md5sum of a device on a node"""
+#
+#     node.debug("Hashing volume {} ".format(dev))
+#
+#     cmd = [ "md5sum", dev ]
+#
+#     stdout = node.run(cmd)
+#
+#     if node.readonly:
+#         hashed=None
+#     else:
+#         hashed = stdout[0].split(" ")[0]
+#
+#     node.debug("Hash of volume {} is {}".format(dev, hashed))
+#
+#     return hashed
+
+
+
+# def deacitvate_volume_snapshot(snapshot):
+#     clone_name=get_tmp_clone_name(snapshot)
+#     clone=snapshot.zfs_node.get_dataset(clone_name)
+#     clone.destroy(deferred=True, verbose=False)
+
+def verify_volume(source_dataset, source_snapshot, target_dataset, target_snapshot):
+    """compare the contents of two zfs volume snapshots"""
+
+    # try:
+    source_dev= activate_volume_snapshot(source_snapshot)
+    target_dev= activate_volume_snapshot(target_snapshot)
+
+    source_hash= hash_dev(source_snapshot.zfs_node, source_dev)
+    target_hash= hash_dev(target_snapshot.zfs_node, target_dev)
+
+    if source_hash!=target_hash:
+        raise Exception("md5hash difference: {} != {}".format(source_hash, target_hash))
+
+    # finally:
+    #     deacitvate_volume_snapshot(source_snapshot)
+    #     deacitvate_volume_snapshot(target_snapshot)
+
+
+# class ZfsAutoChecksumVolume(ZfsAuto):
+#     def __init__(self, argv, print_arguments=True):
+#
+#         # NOTE: common options and parameters are in ZfsAuto
+#         super(ZfsAutoverify, self).__init__(argv, print_arguments)
+
+class ZfsAutoverify(ZfsAuto):
+    """The zfs-autoverify class, default arguments and stuff come from ZfsAuto"""
+
+    def __init__(self, argv, print_arguments=True):
+
+        # NOTE: common options and parameters are in ZfsAuto
+        super(ZfsAutoverify, self).__init__(argv, print_arguments)
+
+    def parse_args(self, argv):
+        """do extra checks on common args"""
+
+        args=super(ZfsAutoverify, self).parse_args(argv)
+
+        if args.target_path is None:
+            self.log.error("Please specify TARGET-PATH")
+            sys.exit(255)
+
+        return args
+
+    def get_parser(self):
+        """extend common parser with extra stuff needed for zfs-autoverify"""
+
+        parser=super(ZfsAutoverify, self).get_parser()
+
+        group=parser.add_argument_group("Verify options")
+        group.add_argument('--fs-compare', metavar='METHOD', default="find", choices=["find", "rsync"],
+                            help='Compare method to use for filesystems. (find, rsync) Default: %(default)s ')
+
+        return parser
+
+    def verify_datasets(self, source_mnt, source_datasets, target_node, target_mnt):
+
+        fail_count=0
+        count = 0
+        for source_dataset in source_datasets:
+
+            # stats
+            if self.args.progress:
+                count = count + 1
+                self.progress("Analysing dataset {}/{} ({} failed)".format(count, len(source_datasets), fail_count))
+
+            try:
+                # determine corresponding target_dataset
+                target_name = self.make_target_name(source_dataset)
+                target_dataset = target_node.get_dataset(target_name)
+
+                # find common snapshots to verify
+                source_snapshot = source_dataset.find_common_snapshot(target_dataset)
+                target_snapshot = target_dataset.find_snapshot(source_snapshot)
+
+                if source_snapshot is None or target_snapshot is None:
+                    raise Exception("Can't find common snapshot")
+
+                target_snapshot.verbose("Verifying...")
+
+                if source_dataset.properties['type']=="filesystem":
+                    verify_filesystem(source_snapshot, source_mnt, target_snapshot, target_mnt, self.args.fs_compare)
+                elif source_dataset.properties['type']=="volume":
+                    verify_volume(source_dataset, source_snapshot, target_dataset, target_snapshot)
+                else:
+                    raise(Exception("{} has unknown type {}".format(source_dataset, source_dataset.properties['type'])))
+
+
+            except Exception as e:
+                # if self.args.progress:
+                #     self.clear_progress()
+
+                fail_count = fail_count + 1
+                source_dataset.error("FAILED: " + str(e))
+                if self.args.debug:
+                    self.verbose("Debug mode, aborting on first error")
+                    raise
+
+        # if self.args.progress:
+        #     self.clear_progress()
+
+        return fail_count
+
+    def run(self):
+
+        source_node=None
+        source_mnt=None
+        target_node=None
+        target_mnt=None
+
+
+        try:
+
+            ################ create source zfsNode
+            self.set_title("Source settings")
+
+            description = "[Source]"
+            source_node = ZfsNode(snapshot_time_format=self.snapshot_time_format, hold_name=self.hold_name, logger=self,
+                                  ssh_config=self.args.ssh_config,
+                                  ssh_to=self.args.ssh_source, readonly=self.args.test,
+                                  debug_output=self.args.debug_output, description=description)
+
+            ################# select source datasets
+            self.set_title("Selecting")
+            source_datasets = source_node.selected_datasets(property_name=self.property_name,
+                                                            exclude_received=self.args.exclude_received,
+                                                            exclude_paths=self.exclude_paths,
+                                                            exclude_unchanged=self.args.exclude_unchanged,
+                                                            min_change=0)
+            if not source_datasets:
+                self.print_error_sources()
+                return 255
+
+            # create target_node
+            self.set_title("Target settings")
+            target_node = ZfsNode(snapshot_time_format=self.snapshot_time_format, hold_name=self.hold_name,
+                                  logger=self, ssh_config=self.args.ssh_config,
+                                  ssh_to=self.args.ssh_target,
+                                  readonly=self.args.test, debug_output=self.args.debug_output,
+                                  description="[Target]")
+            target_node.verbose("Verify datasets under: {}".format(self.args.target_path))
+
+            self.set_title("Verifying")
+
+            source_mnt, target_mnt= create_mountpoints(source_node, target_node)
+
+            fail_count = self.verify_datasets(
+                source_mnt=source_mnt,
+                source_datasets=source_datasets,
+                target_mnt=target_mnt,
+                target_node=target_node)
+
+            if not fail_count:
+                if self.args.test:
+                    self.set_title("All tests successful.")
+                else:
+                    self.set_title("All datasets verified ok")
+
+            else:
+                if fail_count != 255:
+                    self.error("{} dataset(s) failed!".format(fail_count))
+
+            if self.args.test:
+                self.verbose("")
+                self.warning("TEST MODE - DID NOT VERIFY ANYTHING!")
+
+            return fail_count
+
+        except Exception as e:
+            self.error("Exception: " + str(e))
+            if self.args.debug:
+                raise
+            return 255
+        except KeyboardInterrupt:
+            self.error("Aborted")
+            return 255
+        finally:
+
+            # cleanup
+            if source_mnt is not None:
+                cleanup_mountpoint(source_node, source_mnt)
+
+            if target_mnt is not None:
+                cleanup_mountpoint(target_node, target_mnt)
+
+
+
+
+def cli():
+    import sys
+
+    signal(SIGPIPE, sigpipe_handler)
+
+    sys.exit(ZfsAutoverify(sys.argv[1:], False).run())
+
+
+if __name__ == "__main__":
+    cli()
--- a/zfs_autobackup/ZfsCheck.py
+++ b/zfs_autobackup/ZfsCheck.py
@ -0,0 +1,310 @@
+from __future__ import print_function
+
+import time
+from signal import signal, SIGPIPE
+
+from . import util
+from .TreeHasher import TreeHasher
+from .BlockHasher import BlockHasher
+from .ZfsNode import ZfsNode
+from .util import *
+from .CliBase import CliBase
+
+
+class ZfsCheck(CliBase):
+
+    def __init__(self, argv, print_arguments=True):
+
+        # NOTE: common options argument parsing are in CliBase
+        super(ZfsCheck, self).__init__(argv, print_arguments)
+
+        self.node = ZfsNode(self.log, readonly=self.args.test, debug_output=self.args.debug_output)
+
+        self.block_hasher = BlockHasher(count=self.args.count, bs=self.args.block_size, skip=self.args.skip)
+
+    def get_parser(self):
+
+        parser = super(ZfsCheck, self).get_parser()
+
+        # positional arguments
+        parser.add_argument('target', metavar='TARGET', default=None, nargs='?', help='Target to checksum. (can be blockdevice, directory or ZFS snapshot)')
+
+        group = parser.add_argument_group('Checker options')
+
+        group.add_argument('--block-size', metavar="BYTES", default=4096, help="Read block-size, default %(default)s",
+                           type=int)
+        group.add_argument('--count', metavar="COUNT", default=int((100 * (1024 ** 2)) / 4096),
+                           help="Hash chunks of COUNT blocks. Default %(default)s. (CHUNK size is BYTES * COUNT)", type=int)  # 100MiB
+
+        group.add_argument('--check', '-c', metavar="FILE", default=None, const=True, nargs='?',
+                           help="Read hashes from STDIN (or FILE) and compare them")
+
+        group.add_argument('--skip', '-s', metavar="NUMBER", default=0, type=int,
+                           help="Skip this number of chunks after every hash. Default %(default)s")
+
+        return parser
+
+    def parse_args(self, argv):
+        args = super(ZfsCheck, self).parse_args(argv)
+
+        if args.test:
+            self.warning("TEST MODE - WILL ONLY DO READ-ONLY STUFF")
+
+        if args.target is None:
+            self.error("Please specify TARGET")
+            sys.exit(1)
+
+        self.verbose("Target               : {}".format(args.target))
+        self.verbose("Block size           : {} bytes".format(args.block_size))
+        self.verbose("Block count          : {}".format(args.count))
+        self.verbose("Effective chunk size : {} bytes".format(args.count*args.block_size))
+        self.verbose("Skip chunk count     : {} (checks {:.2f}% of data)".format(args.skip, 100.0/(1+args.skip)))
+        self.verbose("")
+
+
+        return args
+
+    def prepare_zfs_filesystem(self, snapshot):
+
+        mnt = "/tmp/" + tmp_name()
+        self.debug("Create temporary mount point {}".format(mnt))
+        self.node.run(["mkdir", mnt])
+        snapshot.mount(mnt)
+        return mnt
+
+    def cleanup_zfs_filesystem(self, snapshot):
+        mnt = "/tmp/" + tmp_name()
+        snapshot.unmount()
+        self.debug("Cleaning up temporary mount point")
+        self.node.run(["rmdir", mnt], hide_errors=True, valid_exitcodes=[])
+
+    # NOTE: https://www.google.com/search?q=Mount+Path+Limit+freebsd
+    # FreeBSD has limitations regarding path length, so we have to clone it so the path stays short
+    def prepare_zfs_volume(self, snapshot):
+        """clone volume, waits and tries to find out the /dev path to the volume, in a compatible way. (linux/freebsd/smartos)"""
+
+        clone_name = get_tmp_clone_name(snapshot)
+        clone = snapshot.clone(clone_name)
+
+        # TODO: add smartos location to this list as well
+        locations = [
+            "/dev/zvol/" + clone_name
+        ]
+
+        clone.debug("Waiting for /dev entry to appear in: {}".format(locations))
+        time.sleep(0.1)
+
+        start_time = time.time()
+        while time.time() - start_time < 10:
+            for location in locations:
+                if os.path.exists(location):
+                    return location
+
+                # fake it in testmode
+                if self.args.test:
+                    return location
+
+            time.sleep(1)
+
+        raise (Exception("Timeout while waiting for /dev entry to appear. (looking in: {})".format(locations)))
+
+    def cleanup_zfs_volume(self, snapshot):
+        """destroys the temporary clone of the volume snapshot"""
+        clone_name = get_tmp_clone_name(snapshot)
+        clone = snapshot.zfs_node.get_dataset(clone_name)
+        clone.destroy(deferred=True, verbose=False)
+
+    def generate_tree_hashes(self, prepared_target):
+
+        tree_hasher = TreeHasher(self.block_hasher)
+        self.debug("Hashing tree: {}".format(prepared_target))
+        for i in tree_hasher.generate(prepared_target):
+            yield i
+
+    def generate_tree_compare(self, prepared_target, input_generator=None):
+
+        tree_hasher = TreeHasher(self.block_hasher)
+        self.debug("Comparing tree: {}".format(prepared_target))
+        for i in tree_hasher.compare(prepared_target, input_generator):
+            yield i
+
+    def generate_file_hashes(self, prepared_target):
+
+        self.debug("Hashing file: {}".format(prepared_target))
+        for i in self.block_hasher.generate(prepared_target):
+            yield i
+
+    def generate_file_compare(self, prepared_target, input_generator=None):
+
+        self.debug("Comparing file: {}".format(prepared_target))
+        for i in self.block_hasher.compare(prepared_target, input_generator):
+            yield i
+
+    def generate_input(self):
+        """parse input lines and yield items to use in compare functions"""
+
+        if self.args.check is True:
+            input_fh=sys.stdin
+        else:
+            input_fh=open(self.args.check, 'r')
+
+        last_progress_time = time.time()
+        progress_checked = 0
+        progress_skipped = 0
+
+        line=input_fh.readline()
+        skip=0
+        while line:
+            i=line.rstrip().split("\t")
+            #ignores lines without tabs
+            if (len(i)>1):
+
+                if skip==0:
+                    progress_checked=progress_checked+1
+                    yield i
+                    skip=self.args.skip
+                else:
+                    skip=skip-1
+                    progress_skipped=progress_skipped+1
+
+                if self.args.progress and time.time() - last_progress_time > 1:
+                    last_progress_time = time.time()
+                    self.progress("Checked {} hashes (skipped {})".format(progress_checked, progress_skipped))
+
+            line=input_fh.readline()
+
+        self.verbose("Checked {} hashes (skipped {})".format(progress_checked, progress_skipped))
+
+    def print_hashes(self, hash_generator):
+        """prints hashes that are yielded by the specified hash_generator"""
+
+        last_progress_time = time.time()
+        progress_count = 0
+
+        for i in hash_generator:
+
+            if len(i) == 3:
+                print("{}\t{}\t{}".format(*i))
+            else:
+                print("{}\t{}".format(*i))
+            progress_count = progress_count + 1
+
+            if self.args.progress and time.time() - last_progress_time > 1:
+                last_progress_time = time.time()
+                self.progress("Generated {} hashes.".format(progress_count))
+
+            sys.stdout.flush()
+
+        self.verbose("Generated {} hashes.".format(progress_count))
+        self.clear_progress()
+
+        return 0
+
+    def print_errors(self, compare_generator):
+        """prints errors that are yielded by the specified compare_generator"""
+        errors = 0
+        for i in compare_generator:
+            errors = errors + 1
+
+            if len(i) == 4:
+                (file_name, chunk_nr, compare_hexdigest, actual_hexdigest) = i
+                print("{}: Chunk {} failed: {} {}".format(file_name, chunk_nr, compare_hexdigest, actual_hexdigest))
+            else:
+                (chunk_nr, compare_hexdigest, actual_hexdigest) = i
+                print("Chunk {} failed: {} {}".format(chunk_nr, compare_hexdigest, actual_hexdigest))
+
+            sys.stdout.flush()
+
+        self.verbose("Total errors: {}".format(errors))
+        self.clear_progress()
+
+        return errors
+
+    def prepare_target(self):
+
+        if "@" in self.args.target:
+            # zfs snapshot
+            snapshot=self.node.get_dataset(self.args.target)
+            if not snapshot.exists:
+                raise Exception("ZFS snapshot {} does not exist!".format(snapshot))
+            dataset_type = snapshot.parent.properties['type']
+
+            if dataset_type == 'volume':
+                return self.prepare_zfs_volume(snapshot)
+            elif dataset_type == 'filesystem':
+                return self.prepare_zfs_filesystem(snapshot)
+            else:
+                raise Exception("Unknown dataset type")
+        return self.args.target
+
+    def cleanup_target(self):
+        if "@" in self.args.target:
+            # zfs snapshot
+            snapshot=self.node.get_dataset(self.args.target)
+            if not snapshot.exists:
+                return
+
+            dataset_type = snapshot.parent.properties['type']
+
+            if dataset_type == 'volume':
+                self.cleanup_zfs_volume(snapshot)
+            elif dataset_type == 'filesystem':
+                self.cleanup_zfs_filesystem(snapshot)
+
+    def run(self):
+
+        compare_generator=None
+        hash_generator=None
+        try:
+            prepared_target=self.prepare_target()
+            is_dir=os.path.isdir(prepared_target)
+
+            #run as compare
+            if self.args.check is not None:
+                input_generator=self.generate_input()
+                if is_dir:
+                    compare_generator = self.generate_tree_compare(prepared_target, input_generator)
+                else:
+                    compare_generator=self.generate_file_compare(prepared_target, input_generator)
+                errors=self.print_errors(compare_generator)
+            #run as generator
+            else:
+                if is_dir:
+                    hash_generator = self.generate_tree_hashes(prepared_target)
+                else:
+                    hash_generator=self.generate_file_hashes(prepared_target)
+
+                errors=self.print_hashes(hash_generator)
+
+        except Exception as e:
+            self.error("Exception: " + str(e))
+            if self.args.debug:
+                raise
+            return 255
+        except KeyboardInterrupt:
+            self.error("Aborted")
+            return 255
+
+        finally:
+            #important to call check_output so that cleanup still functions in case of a broken pipe:
+            # util.check_output()
+
+            #close generators, to make sure files are not in use anymore when cleaning up
+            if hash_generator is not None:
+                hash_generator.close()
+            if compare_generator is not None:
+                compare_generator.close()
+            self.cleanup_target()
+
+        return errors
+
+
+def cli():
+    import sys
+    signal(SIGPIPE, sigpipe_handler)
+
+    sys.exit(ZfsCheck(sys.argv[1:], False).run())
+
+
+if __name__ == "__main__":
+    cli()
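The chunk skipping done in generate_input() above can be sketched in isolation. This is a minimal illustration, not the project's code: take_with_skip and coverage_percent are hypothetical names, and only the "yield one, then skip N" pattern and the 100/(1+skip) coverage figure are taken from the diff.

```python
def take_with_skip(items, skip):
    """Yield one item, then drop the next `skip` items, and repeat."""
    remaining = 0
    for item in items:
        if remaining == 0:
            yield item
            remaining = skip
        else:
            remaining -= 1


def coverage_percent(skip):
    """Share of items that get checked; float division also works on Python 2."""
    return 100.0 / (1 + skip)


checked = list(take_with_skip(range(10), skip=2))
print(checked)                                  # [0, 3, 6, 9]
print("{:.2f}%".format(coverage_percent(2)))    # 33.33%
```

With skip=0 every item is yielded, which matches the default of --skip.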
--- a/zfs_autobackup/ZfsDataset.py
+++ b/zfs_autobackup/ZfsDataset.py
@ -188,13 +188,15 @@ class ZfsDataset:
        parent according to path

        we cache this so everything in the parent that is cached also stays.
+
+        returns None if there is no parent.
        """
        if self.is_snapshot:
-            return ZfsDataset(self.zfs_node, self.filesystem_name)
+            return self.zfs_node.get_dataset(self.filesystem_name)
        else:
            stripped=self.rstrip_path(1)
            if stripped:
-                return ZfsDataset(self.zfs_node, stripped)
+                return self.zfs_node.get_dataset(stripped)
            else:
                return None

@ -268,7 +270,7 @@ class ZfsDataset:

        self.force_exists = True

-    def destroy(self, fail_exception=False):
+    def destroy(self, fail_exception=False, deferred=False, verbose=True):
        """destroy the dataset. by default failures are not an exception, so we
        can continue making backups

@ -276,13 +278,20 @@ class ZfsDataset:
            :type fail_exception: bool
        """

-        self.verbose("Destroying")
+        if verbose:
+            self.verbose("Destroying")
+        else:
+            self.debug("Destroying")

        if self.is_snapshot:
            self.release()

        try:
-            self.zfs_node.run(["zfs", "destroy", self.name])
+            if deferred and self.is_snapshot:
+                self.zfs_node.run(["zfs", "destroy", "-d", self.name])
+            else:
+                self.zfs_node.run(["zfs", "destroy", self.name])
+
            self.invalidate()
            self.force_exists = False
            return True
@ -378,7 +387,7 @@ class ZfsDataset:
        """
        ret = []
        for name in names:
-            ret.append(ZfsDataset(self.zfs_node, name))
+            ret.append(self.zfs_node.get_dataset(name))

        return ret

@ -641,7 +650,7 @@ class ZfsDataset:
        else:
            valid_exitcodes = [0]

-        self.zfs_node.reset_progress()
+        # self.zfs_node.reset_progress()
        self.zfs_node.run(cmd, inp=pipe, valid_exitcodes=valid_exitcodes)

        # invalidate cache, but we at least know we exist now
@ -735,7 +744,7 @@ class ZfsDataset:
            matches = re.findall("toname = .*@(.*)", line)
            if matches:
                snapshot_name = matches[0]
-                snapshot = ZfsDataset(self.zfs_node, self.filesystem_name + "@" + snapshot_name)
+                snapshot = self.zfs_node.get_dataset(self.filesystem_name + "@" + snapshot_name)
                snapshot.debug("resume token belongs to this snapshot")
                return snapshot

@ -789,10 +798,6 @@ class ZfsDataset:
            # target has nothing yet
            return None
        else:
-            # snapshot=self.find_snapshot(target_dataset.snapshots[-1].snapshot_name)
-
-            # if not snapshot:
-            # try to common snapshot
            for source_snapshot in reversed(self.snapshots):
                if target_dataset.find_snapshot(source_snapshot):
                    source_snapshot.debug("common snapshot")
@ -882,9 +887,7 @@ class ZfsDataset:
        while snapshot:
            # create virtual target snapsho
            # NOTE: with force_exist we're telling the dataset it doesnt exist yet. (e.g. its virtual)
-            virtual_snapshot = ZfsDataset(self.zfs_node,
-                                          self.filesystem_name + "@" + snapshot.snapshot_name,
-                                          force_exists=False)
+            virtual_snapshot = self.zfs_node.get_dataset(self.filesystem_name + "@" + snapshot.snapshot_name, force_exists=False)
            self.snapshots.append(virtual_snapshot)
            snapshot = source_dataset.find_next_snapshot(snapshot, also_other_snapshots)

@ -1118,3 +1121,64 @@ class ZfsDataset:
                    resume_token = None

            source_snapshot = self.find_next_snapshot(source_snapshot, also_other_snapshots)
+
+    def mount(self, mount_point):
+
+        self.debug("Mounting")
+
+        cmd = [
+            "mount", "-tzfs", self.name, mount_point
+        ]
+
+        self.zfs_node.run(cmd=cmd, valid_exitcodes=[0])
+
+    def unmount(self):
+
+        self.debug("Unmounting")
+
+        cmd = [
+            "umount", "-l", self.name
+        ]
+
+
+        self.zfs_node.run(cmd=cmd, valid_exitcodes=[0])
+
+    def clone(self, name):
+        """clones this snapshot and returns ZfsDataset of the clone"""
+
+        self.debug("Cloning to {}".format(name))
+
+        cmd = [
+            "zfs", "clone", self.name, name
+        ]
+
+        self.zfs_node.run(cmd=cmd, valid_exitcodes=[0])
+
+        return self.zfs_node.get_dataset(name, force_exists=True)
+
+    def set(self, prop, value):
+        """set a zfs property"""
+
+        self.debug("Setting {}={}".format(prop, value))
+
+        cmd = [
+            "zfs", "set", "{}={}".format(prop, value), self.name
+        ]
+
+        self.zfs_node.run(cmd=cmd, valid_exitcodes=[0])
+
+        self.invalidate()
+
+    def inherit(self, prop):
+        """inherit zfs property"""
+
+        self.debug("Inheriting property {}".format(prop))
+
+        cmd = [
+            "zfs", "inherit", prop, self.name
+        ]
+
+        self.zfs_node.run(cmd=cmd, valid_exitcodes=[0])
+
+        self.invalidate()
+
--- a/zfs_autobackup/ZfsNode.py
+++ b/zfs_autobackup/ZfsNode.py
@ -17,7 +17,7 @@ from .ExecuteNode import ExecuteError
 class ZfsNode(ExecuteNode):
    """a node that contains zfs datasets. implements global (systemwide/pool wide) zfs commands"""

-    def __init__(self, snapshot_time_format, hold_name, logger, ssh_config=None, ssh_to=None, readonly=False,
+    def __init__(self, logger, snapshot_time_format="", hold_name="", ssh_config=None, ssh_to=None, readonly=False,
                 description="",
                 debug_output=False, thinner=None):

@ -32,9 +32,9 @@ class ZfsNode(ExecuteNode):
            self.verbose("Using custom SSH config: {}".format(ssh_config))

        if ssh_to:
-            self.verbose("Datasets on: {}".format(ssh_to))
-        else:
-            self.verbose("Datasets are local")
+            self.verbose("SSH to: {}".format(ssh_to))
+        # else:
+        #     self.verbose("Datasets are local")

        if thinner is not None:
            rules = thinner.human_rules()
@ -48,6 +48,7 @@ class ZfsNode(ExecuteNode):

        # list of ZfsPools
        self.__pools = {}
+        self.__datasets = {}

        self._progress_total_bytes = 0
        self._progress_start_time = time.time()
@ -55,6 +56,7 @@ class ZfsNode(ExecuteNode):
        ExecuteNode.__init__(self, ssh_config=ssh_config, ssh_to=ssh_to, readonly=readonly, debug_output=debug_output)

    def thin(self, objects, keep_objects):
+        # NOTE: if thinning is disabled with --no-thinning, self.__thinner will be none.
        if self.__thinner is not None:
            return self.__thinner.thin(objects, keep_objects)
        else:
@ -92,17 +94,25 @@ class ZfsNode(ExecuteNode):

        return True

-    # TODO: also create a get_zfs_dataset() function that stores all the objects in a dict. This should optimize
-    #  caching a bit and is more consistent.
-    def get_zfs_pool(self, name):
-        """get a ZfsPool() object from specified name. stores objects internally to enable caching"""
+    def get_pool(self, dataset):
+        """get a ZfsPool() object from dataset. stores objects internally to enable caching"""

-        return self.__pools.setdefault(name, ZfsPool(self, name))
+        if not isinstance(dataset, ZfsDataset):
+            raise (Exception("{} is not a ZfsDataset".format(dataset)))

-    def reset_progress(self):
-        """reset progress output counters"""
-        self._progress_total_bytes = 0
-        self._progress_start_time = time.time()
+        zpool_name = dataset.name.split("/")[0]
+
+        return self.__pools.setdefault(zpool_name, ZfsPool(self, zpool_name))
+
+    def get_dataset(self, name, force_exists=None):
+        """get a ZfsDataset() object from name. stores objects internally to enable caching"""
+
+        return self.__datasets.setdefault(name, ZfsDataset(self, name, force_exists))
+
+    # def reset_progress(self):
+    #     """reset progress output counters"""
+    #     self._progress_total_bytes = 0
+    #     self._progress_start_time = time.time()

    def parse_zfs_progress(self, line, hide_errors, prefix):
        """try to parse progress output of zfs recv -Pv, and don't show it as error to the user """
@ -122,9 +132,15 @@ class ZfsNode(ExecuteNode):
            # actual useful info
            if len(progress_fields) >= 3:
                if progress_fields[0] == 'full' or progress_fields[0] == 'size':
+                    # Reset the total bytes and start the timer again (otherwise the MB/s
+                    # counter gets confused)
                    self._progress_total_bytes = int(progress_fields[2])
+                    self._progress_start_time = time.time()
                elif progress_fields[0] == 'incremental':
+                    # Reset the total bytes and start the timer again (otherwise the MB/s
+                    # counter gets confused)
                    self._progress_total_bytes = int(progress_fields[3])
+                    self._progress_start_time = time.time()
                elif progress_fields[1].isnumeric():
                    bytes_ = int(progress_fields[1])
                    if self._progress_total_bytes:
@ -178,7 +194,7 @@ class ZfsNode(ExecuteNode):
                continue

            # force_exist, since we're making it
-            snapshot = ZfsDataset(dataset.zfs_node, dataset.name + "@" + snapshot_name, force_exists=True)
+            snapshot = self.get_dataset(dataset.name + "@" + snapshot_name, force_exists=True)

            pool = dataset.split_path()[0]
            if pool not in pools:
@ -238,7 +254,7 @@ class ZfsNode(ExecuteNode):

        for line in lines:
            (name, value, raw_source) = line
-            dataset = ZfsDataset(self, name)
+            dataset = self.get_dataset(name, force_exists=True)

            # "resolve" inherited sources
            sources[name] = raw_source
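The get_pool()/get_dataset() caching above relies on dict.setdefault(). A small sketch of that pattern follows, assuming a stand-in Item class and Registry wrapper (both hypothetical, not part of zfs_autobackup). It shows that setdefault() returns the cached object on repeated lookups, but still evaluates its second argument each time, so a throwaway object is constructed on every cache hit. The get_lazy variant is one possible alternative, not something the diff uses.

```python
class Item(object):
    """Stand-in for a cached object; counts how often it is constructed."""
    created = 0

    def __init__(self, name):
        Item.created += 1
        self.name = name


class Registry(object):
    def __init__(self):
        self._items = {}

    def get_eager(self, name):
        # Same shape as the diff: the Item(...) argument is always evaluated.
        return self._items.setdefault(name, Item(name))

    def get_lazy(self, name):
        # Alternative: only construct when the name is not cached yet.
        if name not in self._items:
            self._items[name] = Item(name)
        return self._items[name]


registry = Registry()
a = registry.get_eager("pool/data")
b = registry.get_eager("pool/data")
print(a is b)          # True: both lookups return the cached object
print(Item.created)    # 2: the second call still built a throwaway Item

c = registry.get_lazy("pool/other")
d = registry.get_lazy("pool/other")
print(c is d)          # True
print(Item.created)    # 3: only one more construction
```

Whether the extra construction matters depends on how cheap the constructor is; for a lightweight object it is mostly a cosmetic cost.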
--- a/zfs_autobackup/__init__.py
+++ b/zfs_autobackup/__init__.py
@ -1,9 +1,3 @@



-def cli():
-    import sys
-    from .ZfsAutobackup import ZfsAutobackup
-
-    zfs_autobackup = ZfsAutobackup(sys.argv[1:], False)
-    sys.exit(zfs_autobackup.run())
--- a/zfs_autobackup/__main__.py
+++ b/zfs_autobackup/__main__.py
@ -4,7 +4,4 @@

 import sys

-if __name__ == "__main__":
-    from . import cli
-    cli()

--- a/zfs_autobackup/test.py
+++ b/zfs_autobackup/test.py
@ -0,0 +1,129 @@
+import os.path
+import os
+import subprocess
+import sys
+import time
+from signal import signal, SIGPIPE
+
+import util
+
+signal(SIGPIPE, util.sigpipe_handler)
+
+
+try:
+    print ("before first")
+    raise Exception("first")
+except Exception as e:
+    print ("before second")
+    raise Exception("second")
+finally:
+    print ("YO")
+
+def generator():
+
+    try:
+        util.deb('in generator')
+        print ("TRIGGER SIGPIPE")
+        sys.stdout.flush()
+        util.deb('after trigger')
+
+        # if False:
+        yield ("bla")
+        # yield ("bla")
+
+    except GeneratorExit as e:
+        util.deb('GENEXIT '+str(e))
+        raise
+
+    except Exception as e:
+        util.deb('EXCEPT '+str(e))
+    finally:
+        util.deb('FINALLY')
+        print("something else")
+        sys.stdout.flush()
+        util.deb('after print in finally WOOP!')
+
+
+util.deb('START')
+g=generator()
+util.deb('after generator')
+for bla in g:
+    # print ("received something")
+    util.deb('received from gen')
+    break
+    # raise Exception("moi")
+
+    pass
+raise Exception("moi")
+
+util.deb('after for')
+
+while True:
+    pass
+
+#
+# with open('test.py', 'rb') as fh:
+#
+#     # fsize = fh.seek(10000, os.SEEK_END)
+#     # print(fsize)
+#
+#     start=time.time()
+#     for i in range(0,1000000):
+#         # fh.seek(0, 0)
+#         fsize=fh.seek(0, os.SEEK_END)
+#         # fsize=fh.tell()
+#         # os.path.getsize('test.py')
+#     print(time.time()-start)
+#
+#
+#     print(fh.tell())
+#
+# sys.exit(0)
+#
+#
+#
+# checked=1
+# skipped=1
+# coverage=0.1
+#
+# max_skip=0
+#
+#
+# skipinarow=0
+# while True:
+#     total=checked+skipped
+#
+#     skip=coverage<random()
+#     if skip:
+#         skipped = skipped + 1
+#         print("S {:.2f}%".format(checked * 100 / total))
+#
+#         skipinarow = skipinarow+1
+#         if skipinarow>max_skip:
+#             max_skip=skipinarow
+#     else:
+#         skipinarow=0
+#         checked=checked+1
+#         print("C {:.2f}%".format(checked * 100 / total))
+#
+#     print(max_skip)
+#
+# skip=0
+# while True:
+#
+#     total=checked+skipped
+#     if skip>0:
+#         skip=skip-1
+#         skipped = skipped + 1
+#         print("S {:.2f}%".format(checked * 100 / total))
+#     else:
+#         checked=checked+1
+#         print("C {:.2f}%".format(checked * 100 / total))
+#
+#         #calc new skip
+#         skip=skip+((1/coverage)-1)*(random()*2)
+#         # print(skip)
+#         if skip> max_skip:
+#             max_skip=skip
+#
+#     print(max_skip)
--- a/zfs_autobackup/util.py
+++ b/zfs_autobackup/util.py
@ -0,0 +1,65 @@
+# root@psyt14s:/home/psy/zfs_autobackup# ls -lh /home/psy/Downloads/carimage.zip
+# -rw-rw-r-- 1 psy psy 990M Nov 26  2020 /home/psy/Downloads/carimage.zip
+# root@psyt14s:/home/psy/zfs_autobackup# time sha1sum /home/psy/Downloads/carimage.zip
+# a682e1a36e16fe0d0c2f011104f4a99004f19105  /home/psy/Downloads/carimage.zip
+#
+# real	0m2.558s
+# user	0m2.105s
+# sys	0m0.448s
+# root@psyt14s:/home/psy/zfs_autobackup# time python3 -m zfs_autobackup.ZfsCheck
+#
+# real	0m1.459s
+# user	0m0.993s
+# sys	0m0.462s
+
+# NOTE: surprisingly, sha1 via python3 is faster than the native sha1sum utility, even in the way we use it below!
+import os
+import platform
+import sys
+
+
+def tmp_name(suffix=""):
+    """create temporary name unique to this process and node. always returns the same result during the same execution"""
+
+    #we could use uuids but those are ugly and confusing
+    name="{}-{}-{}".format(
+        os.path.basename(sys.argv[0]).replace(" ","_"),
+        platform.node(),
+        os.getpid())
+    name=name+suffix
+    return name
+
+
+def get_tmp_clone_name(snapshot):
+    pool=snapshot.zfs_node.get_pool(snapshot)
+    return pool.name+"/"+tmp_name()
+
+
+
+def output_redir():
+    """use this after a BrokenPipeError to prevent further exceptions.
+    Redirects stdout/err to /dev/null
+    """
+
+    devnull = os.open(os.devnull, os.O_WRONLY)
+    os.dup2(devnull, sys.stdout.fileno())
+    os.dup2(devnull, sys.stderr.fileno())
+
+def sigpipe_handler(sig, stack):
+    #redirect output so we don't get more SIGPIPEs during cleanup. (which may try to write to stdout)
+    output_redir()
+    # NOTE: deb() is commented out below, so calling it here would raise a NameError
+    # deb('redir')
+
+# def check_output():
+#     """make sure stdout still functions. if its broken, this will trigger a SIGPIPE which will be handled by the sigpipe_handler."""
+#     try:
+#         print(" ")
+#         sys.stdout.flush()
+#     except Exception as e:
+#         pass
+
+# def deb(txt):
+#     with open('/tmp/debug.log', 'a') as fh:
+#         fh.write("DEB: "+txt+"\n")
+
+
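A property of tmp_name() that the ZfsCheck cleanup code depends on is that it returns the same name every time within one process, so cleanup_zfs_filesystem() can recompute the mount path instead of storing it. The sketch below restates the function from the diff in compact form to demonstrate this; the "-clone" suffix is only an illustrative value.

```python
import os
import platform
import sys


def tmp_name(suffix=""):
    """Name built from program name, hostname and PID: stable within one process."""
    base = os.path.basename(sys.argv[0]).replace(" ", "_")
    return "{}-{}-{}{}".format(base, platform.node(), os.getpid(), suffix)


# "Prepare" and "cleanup" each compute the path independently and still agree.
mount_point = "/tmp/" + tmp_name()
same_again = "/tmp/" + tmp_name()
print(mount_point == same_again)    # True

# Illustrative suffix: distinguishes several temporary names in one process.
print(tmp_name("-clone").endswith("-clone"))    # True
```

Because the PID is part of the name, two concurrent runs on the same host get different names, while a process that reuses a PID after a crash could collide with leftovers from the earlier run.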
Author	SHA1	Message	Date
Edwin Eefting	244509a006	added console reference	2022-03-08 17:51:23 +01:00
Edwin Eefting	f9d3576752	nicer errors	2022-03-08 17:35:51 +01:00
Edwin Eefting	75161c1bd2	refactorred ZfsCheck.py for better sigpipe handling	2022-03-08 17:22:08 +01:00
Edwin Eefting	5d7d6f6a6c	remove random	2022-03-07 23:11:46 +01:00
Edwin Eefting	7c372cf211	test check skipping	2022-03-07 22:59:50 +01:00
Edwin Eefting	8854303b7a	test skipping	2022-03-07 21:57:36 +01:00
Edwin Eefting	233745c345	reworking block skipper	2022-03-07 21:08:56 +01:00
Edwin Eefting	b68ca19e5f	wip	2022-03-07 19:34:13 +01:00
Edwin Eefting	28ed44b1c8	wip	2022-03-07 19:34:01 +01:00
Edwin Eefting	1cedea5f5f	zfscheck wip	2022-02-23 21:31:00 +01:00
Edwin Eefting	d99c202e75	fix	2022-02-23 21:21:07 +01:00
Edwin Eefting	44c6896ddd	merged v3.1.2-rc2	2022-02-23 20:43:49 +01:00
Edwin Eefting	8276d07feb	fix	2022-02-22 19:52:16 +01:00
Edwin Eefting	82ad7c2480	more tests	2022-02-22 19:25:15 +01:00
Edwin Eefting	f29cf13db3	test compare as well	2022-02-22 18:48:51 +01:00
Edwin Eefting	0c6c75bf58	cleaner progress clearing	2022-02-22 18:41:54 +01:00
Edwin Eefting	f4e81bddb7	progress output	2022-02-22 18:00:06 +01:00
Edwin Eefting	f530cf40f3	fixes. supports stdin	2022-02-22 17:40:38 +01:00
Edwin Eefting	e7e1590919	can also be used on paths and files now	2022-02-22 17:18:15 +01:00
Edwin Eefting	0d882ec031	comparing input now functions	2022-02-22 16:59:08 +01:00
Edwin Eefting	6a58a294a3	now yields errors and mismatches	2022-02-22 14:47:15 +01:00
Edwin Eefting	3f755fcc69	moved tests	2022-02-21 22:38:56 +01:00
Edwin Eefting	d7d76032de	more tests	2022-02-21 22:37:13 +01:00
Edwin Eefting	b7e10242b9	itertools is nice :)	2022-02-21 21:39:03 +01:00
Edwin Eefting	bcc7983492	tree compare	2022-02-21 17:51:23 +01:00
Edwin Eefting	490b293ba1	block compare	2022-02-21 14:27:22 +01:00
Edwin Eefting	2d42d1d1a5	forgot a test	2022-02-21 14:02:45 +01:00
Edwin Eefting	a2f85690a3	extract BlockHasher and TreeHasher classes	2022-02-21 13:49:05 +01:00
Edwin Eefting	a807ec320e	zfs-check broken pipe handling tests for volumes	2022-02-21 13:01:45 +01:00
Edwin Eefting	3e6a327647	zfs-check broken pipe handling tests	2022-02-21 12:31:19 +01:00
Edwin Eefting	ed61f03b4b	zfs-check fixes and tests	2022-02-21 11:40:40 +01:00
Edwin Eefting	f397e7be59	python2 compat	2022-02-21 11:01:07 +01:00
Edwin Eefting	b60dd4c109	wip (will usse zfs-check to do actual hashing)	2022-02-21 00:46:54 +01:00
Edwin Eefting	10a85ff0b7	fixes	2022-02-21 00:46:36 +01:00
Edwin Eefting	770389156a	test basicas of zfscheck	2022-02-21 00:44:38 +01:00
Edwin Eefting	bb9ce25a37	correct brokenpipe handling	2022-02-21 00:02:30 +01:00
Edwin Eefting	2fe008acf5	zfs-check basic version complete	2022-02-20 18:03:17 +01:00
Edwin Eefting	14c45d2b34	zfs check initial version (wip)	2022-02-20 17:39:17 +01:00
Edwin Eefting	a115f0bd17	zfs check initial version (wip)	2022-02-20 17:30:02 +01:00
Edwin Eefting	626c84fe47	test data	2022-02-20 13:04:49 +01:00
Edwin Eefting	4d27b3b6ea	incremental block hasher (for zfs-verify)	2022-02-20 12:59:43 +01:00
Edwin Eefting	3ca1bce9b2	extracted clibase class (for zfs-check tool)	2022-02-20 11:32:43 +01:00
Edwin Eefting	f0d00aa4e8	extracted clibase class (for zfs-check tool)	2022-02-20 11:03:57 +01:00
Edwin Eefting	60560b884b	cleaned up progress stuff	2022-02-19 18:10:10 +01:00
DatuX	af9d768410	Merge pull request #118 from xrobau/master Fix MB/s calculations on multiple transfers	2022-02-19 18:00:02 +01:00
DatuX	f990c2565a	Update README.md	2022-02-19 08:09:16 +01:00
DatuX	af179fa424	Update README.md	2022-02-19 08:03:05 +01:00
DatuX	355aa0e84b	Create codeql-analysis.yml	2022-02-19 07:45:55 +01:00
Rob Thomas	494b41f4f1	Fix MB/s calculations on multiple transfers	2022-02-17 16:15:05 +10:00
Edwin Eefting	ef532d3ffb	cleanup	2022-02-09 14:25:22 +01:00
Edwin Eefting	7109873884	added pipe=true parameter to script	2022-02-09 14:18:10 +01:00
Edwin Eefting	acb0172ddf	more tests	2022-02-09 12:24:24 +01:00
DatuX	53db61de96	Merge pull request #116 from parke/master Fix two typos in README.md.	2022-02-05 08:40:55 +01:00
parke	3a947e5fee	Fix two typos in README.md.	2022-02-04 22:50:47 -08:00
Edwin Eefting	8233e7b35e	script mode testing and fixes	2022-01-29 10:10:18 +01:00
Edwin Eefting	e1fb7a37be	script mode testing and fixes	2022-01-28 23:59:50 +01:00
Edwin Eefting	2ffd3baf77	cmdpipe manual piping/parallel executing tested and done	2022-01-27 18:22:20 +01:00
Edwin Eefting	a8b43c286f	suppress exclude recieved warning when its already specified. #101	2022-01-27 16:12:17 +01:00
Edwin Eefting	609ad19dd9	refactorred stdout piping a bit to allow manual piping	2022-01-27 13:02:41 +01:00
Edwin Eefting	f2761ecee8	Merge remote-tracking branch 'origin/master'	2022-01-27 11:16:32 +01:00
Edwin Eefting	86706ca24f	script mode wip	2022-01-27 11:16:19 +01:00
Edwin Eefting	88d856d813	previous changes and this fix improved caching (less runs in test_scaling.py)	2022-01-27 11:02:11 +01:00
Edwin Eefting	81d0bee7ae	comments	2022-01-26 23:59:13 +01:00
Edwin Eefting	fa3f44a045	replaced tar verification with much better find/md5sum.	2022-01-24 23:25:55 +01:00
Edwin Eefting	02dca218b8	ExecuteNode.py now supports running from a certain directory	2022-01-24 23:08:09 +01:00
Edwin Eefting	89ed1e012d	cleanup	2022-01-24 17:22:44 +01:00
Edwin Eefting	ff9beae427	create temporary clone to verify volumes	2022-01-24 16:55:20 +01:00
Edwin Eefting	302a9ecd86	more consistent creation of ZfsDataset and ZfsPool via ZfsNode.get_dataset() and ZfsNode.get_pool()	2022-01-24 16:29:32 +01:00
Edwin Eefting	c0086f8953	added tar-mode. moved static methods. more compatible /dev checking without udevadm	2022-01-24 13:53:32 +01:00
Edwin Eefting	ddd82b935b	show test output	2022-01-24 12:31:28 +01:00
Edwin Eefting	51d6731aa8	settle udev devices when	2022-01-24 11:46:34 +01:00
Edwin Eefting	36f2b672bd	more zfs-verify tests	2022-01-24 11:41:51 +01:00
Edwin Eefting	81a785b360	more zfs-verify tests	2022-01-24 11:37:42 +01:00
Edwin Eefting	670532ef31	pythonversion agnostic	2022-01-24 11:02:56 +01:00
Edwin Eefting	dd55ca4079	zfs-autoverify wip (basics start to function)	2022-01-24 00:18:27 +01:00
Edwin Eefting	f66957d867	zfs-autoverify wip	2022-01-23 23:01:53 +01:00
Edwin Eefting	69975b37fb	zfs-autoverify wip	2022-01-23 21:36:56 +01:00
Edwin Eefting	c299626d18	debug mode implies verbose mode now	2022-01-23 21:22:46 +01:00
Edwin Eefting	7b4f10080f	zfs-verify wip (not functional yet)	2022-01-19 00:11:27 +01:00
Edwin Eefting	787e3dba9c	zfs-verify stuff	2022-01-18 23:46:08 +01:00
Edwin Eefting	86d504722c	zfs-verify stuff	2022-01-18 20:54:19 +01:00
Edwin Eefting	6791bc4abd	ready to implement zfs-autoverify	2022-01-18 01:02:01 +01:00
Edwin Eefting	db5186bf38	ready to implement zfs-autoverify	2022-01-18 00:11:52 +01:00
Edwin Eefting	d2b183bb27	move more ulgy stuff to parse_args	2022-01-17 23:34:22 +01:00
Edwin Eefting	033fcf68f7	move exclude_paths and exclude_received to common	2022-01-17 23:10:35 +01:00
Edwin Eefting	14d45667de	fixes	2022-01-17 22:54:27 +01:00
Edwin Eefting	f2a3221911	fixes	2022-01-17 22:34:18 +01:00
Edwin Eefting	8baee52ab1	greatly improved output of help (divided into sections)	2022-01-17 22:26:42 +01:00
Edwin Eefting	d114f63f29	extract common stuff to prepare for zfs-autoverify	2022-01-17 21:19:40 +01:00
DatuX	b36b64cc94	Update FUNDING.yml	2022-01-12 00:20:30 +01:00
DatuX	5a70172a50	Update FUNDING.yml	2022-01-12 00:20:17 +01:00
DatuX	f635e8cd67	Create FUNDING.yml	2022-01-12 00:16:01 +01:00
Edwin Eefting	0e362e5d89	moved stuff to wiki	2022-01-07 11:56:32 +01:00
Edwin Eefting	f2ab2938b0	moved stuff to wiki	2022-01-07 11:54:36 +01:00
Edwin Eefting	2d96d13125	moved stuff to wiki	2022-01-07 11:53:29 +01:00
Edwin Eefting	883984fda3	Revert "Initial ZFS clones support" Woops accidently committed this, still need to review/change it before comitting. This reverts commit `e11c332808`.	2022-01-04 22:48:25 +01:00
Edwin Eefting	db2625b08c	fix #101	2022-01-04 22:26:44 +01:00
Phil Krylov	e11c332808	Initial ZFS clones support	2021-12-21 20:09:41 +01:00