This is one meaning of "streaming": provide the input in incremental chunks and, each time, return the updated results. Per Mark Sutton's algorithm, this can be done without retaining the full input; only finite buffers are needed to store the intermediate state.
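To make the chunked interface concrete, here is a minimal sketch of the general pattern. It is not Mark Sutton's algorithm, only a generic illustration: it keeps a running mean of `x[t] * x[t - lag]` for a few lags while holding on to just the most recent `max_lag + 1` samples. The class name `StreamingLagProducts` and its methods are hypothetical and do not refer to anything in this codebase.

```python
from collections import deque


class StreamingLagProducts:
    """Hypothetical example: running mean of x[t] * x[t - lag] for lag = 0..max_lag."""

    def __init__(self, max_lag):
        self.max_lag = max_lag
        # Finite buffer: only the most recent max_lag + 1 samples are kept.
        self.recent = deque(maxlen=max_lag + 1)
        # Intermediate state: one running sum and one count per lag.
        self.sums = [0.0] * (max_lag + 1)
        self.counts = [0] * (max_lag + 1)

    def update(self, chunk):
        """Consume one chunk of samples and return the updated results."""
        for x in chunk:
            self.recent.append(x)
            # recent[-1] is x itself; recent[-1 - lag] is the sample lag steps earlier.
            for lag in range(len(self.recent)):
                self.sums[lag] += x * self.recent[-1 - lag]
                self.counts[lag] += 1
        return self.result()

    def result(self):
        """Current estimate per lag (None where no pairs have been seen yet)."""
        return [s / c if c else None for s, c in zip(self.sums, self.counts)]


# Feeding the data in one chunk or in several gives the same answer.
one_shot = StreamingLagProducts(max_lag=2)
full = one_shot.update([1, 2, 3, 4])

chunked = StreamingLagProducts(max_lag=2)
chunked.update([1, 2])
partial = chunked.update([3, 4])
```

Memory use here depends only on `max_lag`, not on how much input has been consumed, which is the property this issue is after.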
This does not make much sense in the realm of HDF5 inputs, so it should probably only be done after #10.
It's also important to measure whether adding this incremental capability and memory efficiency forces us to trade away some performance. If the performance difference is large, we may decide to maintain two implementations.