Streamflow Percentile Calculator
All of the streamflow maps compare recent data (whether real-time, daily, or averaged over longer periods) to historical streamflow for the same day of the year, so pre-computed percentile classes must be available for that comparison. This requires percentile bins reflecting historical values (or the lack thereof) for every active gage in the country, for each day of the year.
I would recommend that the code for this calculation live in its own repository, and possibly be its own Python package, because the scale of the computation (and possibly of the data that needs to be stored) will require deployment in the cloud. A separate package would also allow the code to be developed, tested, reviewed, and managed independently of other aspects of the water-replacement work.
Some of the things this calculator needs to be able to do are:
- Handle reporting of streamflow under various parameter codes in different locations
- Resolve low/high flows not measured properly (and/or other non-physical data)
- Filter sites so that historical streamflow percentiles are only calculated for those with long enough data histories
- Reconcile any management of streams with historic and modern streamflow values (e.g., how do we handle a formerly unmanaged site that is now highly controlled?)
- Ultimately determine percentile bins (Low, <10, 10-24, 25-75, 76-90, >90, High) for each site for each day of the year. This needs to be flexible, as we do not know whether these values will be generated annually for the whole year, monthly for the upcoming month, or daily for the next day
- Only use "Approved" streamflow values in the percentile class calculations
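To make the requirements above more concrete, here is a minimal sketch of the core calculation for a single site, using only the Python standard library. Everything named here is an assumption for illustration: the record layout `(date, flow, status)`, the status string `"Approved"`, the 10-year minimum record length, the rule that missing or negative flows are discarded, the linear-interpolation percentile method, and the reading of "Low" and "High" as flows below the historical minimum or above the historical maximum for that day. None of these choices has been settled.

```python
from collections import defaultdict
from datetime import date

MIN_YEARS = 10  # assumed minimum number of values per day of year; not a decided threshold


def percentile(sorted_vals, p):
    """Percentile p (0-100) of an ascending list, by linear interpolation."""
    k = (len(sorted_vals) - 1) * p / 100
    lo = int(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def build_thresholds(records, min_years=MIN_YEARS):
    """Return {(month, day): thresholds} for one site.

    records: iterable of (date, flow, status) tuples (hypothetical layout).
    """
    by_day = defaultdict(list)
    for day, flow, status in records:
        # Only approved values are used; the status string is an assumption.
        if status != "Approved":
            continue
        # Placeholder rule for non-physical data: drop missing or negative flows.
        if flow is None or flow < 0:
            continue
        by_day[(day.month, day.day)].append(flow)

    thresholds = {}
    for key, flows in by_day.items():
        # Skip days without a long enough history. Note that Feb 29 will have
        # roughly a quarter as many values as other days.
        if len(flows) < min_years:
            continue
        flows.sort()
        thresholds[key] = {
            "min": flows[0],
            "p10": percentile(flows, 10),
            "p25": percentile(flows, 25),
            "p75": percentile(flows, 75),
            "p90": percentile(flows, 90),
            "max": flows[-1],
        }
    return thresholds


def classify(flow, t):
    """Map a recent flow value to one of the seven percentile bins."""
    if flow < t["min"]:
        return "Low"
    if flow < t["p10"]:
        return "<10"
    if flow < t["p25"]:
        return "10-24"
    if flow <= t["p75"]:
        return "25-75"
    if flow <= t["p90"]:
        return "76-90"
    if flow <= t["max"]:
        return ">90"
    return "High"
```

Because `build_thresholds` keys its output by `(month, day)`, the same function could serve an annual, monthly, or daily run by restricting which keys are computed or stored. Handling of differing parameter codes and of sites whose management has changed is not addressed in this sketch.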