Runoff calculation perfect overlap bug
Currently, the runoff calculation function does the following for a given HUC with a list of sites that intersect it and a dictionary of site runoff data:
- Determines which intersecting sites have ANY data.
- Uses those intersecting sites to determine (based on proportions and weights) which sites should be used to calculate runoff, and PRIORITIZES sites that closely overlap the HUC, REGARDLESS OF HOW MUCH DATA WERE COLLECTED AT THOSE SITES.
- If a site has near-perfect overlap, the function ignores all other runoff data and calculates HUC-level runoff using only that site, and only for the dates on which that site has data.
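The behavior described above can be illustrated with a toy sketch. This is not the hyswap implementation: the function name `current_behavior_sketch`, the `overlap` mapping, the 0.9 threshold, and the sample values are all hypothetical, chosen only to show how the short-circuit drops dates.

```python
import pandas as pd

def current_behavior_sketch(site_runoff, overlap, threshold=0.9):
    """Toy model of the short-circuit; names and threshold are hypothetical."""
    # Keep only sites that have ANY data, regardless of record length.
    with_data = {s: r for s, r in site_runoff.items() if len(r.dropna()) > 0}
    # If any site nearly matches the HUC, return its record and nothing else.
    for site, series in with_data.items():
        if overlap[site] >= threshold:
            return series.dropna()
    # Otherwise fall back to an overlap-weighted mean across sites.
    df = pd.DataFrame(with_data)
    weights = pd.Series({s: overlap[s] for s in df.columns})
    weighted_sum = df.mul(weights, axis=1).sum(axis=1, min_count=1)
    weight_total = df.notna().astype(float).mul(weights, axis=1).sum(axis=1)
    return (weighted_sum / weight_total).dropna()

# Made-up data: one site with near-perfect overlap but a single year of record,
# and one partially overlapping site with a longer record.
years = pd.Index([1902, 1903, 1904], name="year")
site_runoff = {
    "near_perfect": pd.Series([float("nan"), 10.0, float("nan")], index=years),
    "partial": pd.Series([4.0, 5.0, 6.0], index=years),
}
overlap = {"near_perfect": 0.98, "partial": 0.40}

result = current_behavior_sketch(site_runoff, overlap)
print(result)
```

In this toy case only the single year covered by the near-perfect site comes back, even though the other site has data for all three years.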
I noticed this issue when composing the validation notebook for runoff. I started with one HUC (01010004), which has runoff data in WW from 1902 to 2022. I ran the same HUC through the hyswap runoff function and found that the returned dataset only had runoff estimates for 1903. This is because in 1903 a streamgage basin with near-perfect overlap with the HUC was collecting data. The hyswap function identified this near-perfect overlap, used the runoff from that streamgage to represent runoff for the HUC, and ignored all other dates and runoff estimates.
This can be fixed by first calculating runoff from the near-perfect overlap site. If that site does not cover the entire date range, we then calculate estimated runoff for the remaining dates using all other gages. Finally, we stitch the two together: dates with near-perfect site overlap use that site's runoff, and all other dates use the multi-gage estimate.
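A minimal sketch of the stitching step in the fix proposed above, assuming both results are available as pandas Series indexed by date. The function name `stitch_runoff` and the sample values are hypothetical, not part of hyswap.

```python
import pandas as pd

def stitch_runoff(perfect, weighted):
    """Prefer near-perfect-overlap values; fill other dates from the weighted estimate."""
    # combine_first keeps non-null values from `perfect` and fills the gaps
    # (and any dates missing from its index) with values from `weighted`.
    return perfect.combine_first(weighted).sort_index()

# Made-up data: the near-perfect site only reported in 1903.
years = pd.Index(range(1902, 1906), name="year")
perfect = pd.Series([float("nan"), 10.0, float("nan"), float("nan")], index=years)
weighted = pd.Series([4.0, 5.0, 6.0, 7.0], index=years)

stitched = stitch_runoff(perfect, weighted)
print(stitched)
```

Here 1903 keeps the near-perfect site's value, and the other years fall back to the multi-gage estimate, so the full date range is represented.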