It's yet another tool: mbuffer

mbuffer reads data from an input and writes it to one or more outputs. The more important part, though, is the buffering: you can tell it to use a given amount of memory as a buffer, and the default is usually 2 MBytes. Typically the input is stdin and the output is stdout, so it works well in a pipe like zfs send | mbuffer | zfs receive, but files and TCP connections are possible as well, so it can replace cat and netcat. Tape drives and autoloaders have some support too, but I have no experience with that.

Pipes in shell scripts or directly on the command line are so common that you barely think about them, but they have a serious limitation: they block nearly instantly if the other end of the pipe is not ready to receive the data. This is good and bad at the same time. Good: the sender immediately notices when the receiver has failed, for example. Bad: unless both ends of the pipe can send and receive data at the same time, you lose throughput.

Imagine a sender that averages 1 MByte/s of output with a chunk size of 1 MByte. It tries to shove 1 MByte into the pipe as fast as possible, then has some work to do for nearly a second (e.g. waiting on a hard disk), and then tries to send another 1 MByte chunk. Suppose the receiving end of the pipe does not use the same chunk size and instead has to work for about a tenth of a second after every 100 KBytes. On paper the receiver can also accept about 1 MByte/s, but the reality looks completely different: the sender generates 1 MByte of data and sends the first 100 KBytes through the pipe, then it locks up, waiting for the receiver for about 0.1 seconds, sends the next 100 KBytes, waits again... repeat ten times. In total, generating this 1 MByte and sending it through the pipe takes roughly 2 seconds, so the data rate has just been halved. The generation only takes one second; the second one is wasted trying to shove that data into the pipe.
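To make that arithmetic concrete, here is a small toy model in Python. It is only a sketch: the constants and the function name are made up for illustration, and it assumes a pipe with no buffering at all (real pipes hold a small amount of data) and perfectly regular timings. Because the receiver is idle when the first piece arrives, the model comes out at 1.9 seconds per chunk rather than a full 2.

```python
# Toy model of the example above; all names and numbers are illustrative.
CHUNK_KB = 1000      # sender produces 1 MByte at a time
PIECE_KB = 100       # receiver accepts 100 KByte at a time
GENERATE_MS = 1000   # sender needs about 1 s to produce a chunk
PROCESS_MS = 100     # receiver needs about 0.1 s per piece


def unbuffered_total_ms(chunks: int) -> int:
    """Milliseconds until `chunks` chunks have been handed to the receiver
    through a pipe that cannot hold any data (a simplification)."""
    now = 0                # sender's clock in milliseconds
    receiver_free_at = 0   # earliest time the receiver accepts another piece
    for _ in range(chunks):
        now += GENERATE_MS                       # sender is busy producing
        for _ in range(CHUNK_KB // PIECE_KB):
            now = max(now, receiver_free_at)     # block until receiver is ready
            receiver_free_at = now + PROCESS_MS  # receiver works on this piece
    return now


total = unbuffered_total_ms(10)
rate = 10 * CHUNK_KB * 1000 // total
print(f"10 MByte in {total / 1000:.1f} s, about {rate} KByte/s")
```

With these numbers the model reports 19.0 seconds for 10 MByte, or about 526 KByte/s, even though both sides could handle roughly 1 MByte/s on their own.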

If you insert mbuffer between those two programs, it looks completely different. As long as the internal buffer in mbuffer isn't full, it continues to accept new data from the sender. At the same time, whenever the receiver is ready to accept data, mbuffer can send data from its internal buffer, unless it's empty. With the 2 MByte default buffer size that mbuffer uses, the above example should run at roughly the full speed of 1 MByte/s, though with a data chunk size of 1 MByte I'd probably increase the buffer size to 4 MBytes (mbuffer -m 4M), just to avoid possible choke points.
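The effect of the buffer can be sketched with a toy Python model of the same sender and receiver. This is not how mbuffer is implemented; the function name, the constants, and the idea of treating the buffer as a fixed number of 100 KByte slots are assumptions made purely for illustration. A buffer size of 0 stands for a plain pipe that cannot hold any data.

```python
# Toy model: sender -> buffer -> receiver. Names and numbers are illustrative.
CHUNK_KB = 1000      # sender produces 1 MByte at a time
PIECE_KB = 100       # receiver accepts 100 KByte at a time
GENERATE_MS = 1000   # sender needs about 1 s to produce a chunk
PROCESS_MS = 100     # receiver needs about 0.1 s per piece


def total_ms(chunks: int, buffer_kb: int) -> int:
    """Milliseconds until the last piece reaches the receiver."""
    slots = buffer_kb // PIECE_KB   # how many pieces the buffer can hold
    handed = []                     # handed[i]: time piece i reaches the receiver
    now = 0                         # sender's clock in milliseconds
    for _ in range(chunks):
        now += GENERATE_MS          # sender is busy producing a chunk
        for _ in range(CHUNK_KB // PIECE_KB):
            i = len(handed)
            receiver_ready = handed[-1] + PROCESS_MS if handed else 0
            if slots == 0:
                # Plain pipe: the sender blocks until the receiver takes the piece.
                now = max(now, receiver_ready)
                handed.append(now)
            else:
                # Buffered: the sender only blocks while every slot is occupied,
                # i.e. until piece i - slots has left the buffer.
                if i >= slots:
                    now = max(now, handed[i - slots])
                handed.append(max(now, receiver_ready))
    return handed[-1]


for buffer_kb in (0, 2000, 4000):
    ms = total_ms(10, buffer_kb)
    print(f"buffer {buffer_kb:>4} KByte: 10 MByte in {ms / 1000:.1f} s")
```

In this idealized model, 10 MByte take 19.0 seconds without a buffer and 10.9 seconds with a 2 or 4 MByte buffer; the trailing 0.9 seconds is the receiver draining the final chunk. The model has no jitter, so 2 and 4 MBytes come out identical here. The extra margin only starts to matter once the timings are less regular than in this sketch.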

The downside is obvious: all that data has to be copied around in memory twice as often, increasing CPU usage. Another downside is that the sending process may think the receiver got all the data when it is actually still sitting in mbuffer. This means that mbuffer mainly shines in scenarios where both sides have possible choke points, e.g. because they read from and write to relatively slow disks, or because there's a somewhat unreliable network connection like Wi-Fi in the middle that can choke on retransmissions.

Another nice bonus: mbuffer displays a running status of the amount of data that went through the pipe:

$ mbuffer < /dev/zero > /dev/null
in @ 9529 MiB/s, out @ 9511 MiB/s, 18.8 GiB total, buffer   1% full ^C
mbuffer: error: error closing input: Bad file descriptor
mbuffer: warning: error during output to <stdout>: canceled
summary: 18.8 GiByte in  1.8sec - average of 10.4 GiB/s

Reading from /dev/zero and writing to /dev/null is obviously only useful as a benchmark, but it works as a nice example.

Homepage: The first version is from 2001 and it is still actively being developed, though the basic feature set has been stable for a long time now. It's available on at least Debian stable and NixOS.