Incorperate dask to improve multiprocessing capabilities and arbitrarily scale the library! (and make a Xarray terrain engine?)
While we are very happy with the new support of N dimension parameter grids added in version 2.0, accumulating a parameter across the 288 5-minute time steps in a day for example may be computationally intensive. The USGS's supercomputing resources are one way to get around this, **but it is entirely feasible to scale this library up to a next-generational level of performance by natively supporting `dask.distributed` parallelization!**
For example, if `fcpgtools` was able to check for the presence of `dask.distributed` in the virtual environment, we could then chunk our N-dimensional `xarray.DataArray` along the time dimension, and then apply the accumulate parameter function at M time steps simultaneously (where M is the # of cores being utilized by a `dask.distributed.Client()`.
This could be even more efficient if we supported accumulation functions in `xarray` nmatively, rather than swapping in and out of custom `pysheds.Raster/Grid` objects. That said, most the performance gain is possible with the existing `pysheds` terrain engine given integration of `dask.distributed` and using `dask.distributed.futures` to rebuild our N dimension output in the desired way.
Very exciting stuff! We could image a work where hundreds of cores could process a timeseries of `xarray` data at once with only one time step loaded into memory per core.
One limitation is that we can only chunk along the time axis, as accumulation algos need to be able to cross lat/long chunk boundaries, which would actually slow down performance due to communication between multiprocesses.
issue