red_river_2018 Conversion to Zarr
Checklist for Workflow associated with dataset conversion:

-----

Dataset Name: Very High-Resolution Dynamic Downscaling of Regional Climate for Use in Long-term Hydrologic Planning along the Red River Valley System

https://cida.usgs.gov/thredds/catalog/demo/thredds/red_river_2018/catalog.html
https://cida.usgs.gov/thredds/catalog.html?dataset=cida.usgs.gov/red_river_2018

```
Dataset {
    Float32 HFX[Time = 315576][south_north = 165][west_east = 248];
    Float32 LH[Time = 315576][south_north = 165][west_east = 248];
    Float32 PBLH[Time = 315576][south_north = 165][west_east = 248];
    Float32 Q2[Time = 315576][south_north = 165][west_east = 248];
    Float32 RAINC[Time = 315576][south_north = 165][west_east = 248];
    Float32 RAINNC[Time = 315576][south_north = 165][west_east = 248];
    Float32 SWDOWN[Time = 315576][south_north = 165][west_east = 248];
    Float32 T2[Time = 315576][south_north = 165][west_east = 248];
    Float32 U10[Time = 315576][south_north = 165][west_east = 248];
    Float32 V10[Time = 315576][south_north = 165][west_east = 248];
    String time_char[Time = 315576];
    Int32 Time[Time = 315576];
    Float32 XLAT[south_north = 165][west_east = 248];
    Float32 XLONG[south_north = 165][west_east = 248];
} red_river_2018;
```

Recommend working from OPeNDAP

-----

- [x] Identify Source Data location and access (check the [dataset spreadsheet](https://doimspp.sharepoint.com/:x:/r/sites/gs-wma-impd-nhgf2/_layouts/15/doc2.aspx?sourcedoc=%7BE4F87A7F-EC9C-4164-BDC1-70337E34B9EB%7D&file=STAC%20Dataset%20Tracking.xlsx&action=default&mobileredirect=true&DefaultItemOpen=1&isSPOFile=1&clickparams=eyJBcHBOYW1lIjoiVGVhbXMtRGVza3RvcCIsIkFwcFZlcnNpb24iOiIyNy8yMjEwMjgwNzIwMCIsIkhhc0ZlZGVyYXRlZFVzZXIiOmZhbHNlfQ%3D%3D&cid=5b00836a-9892-4d0e-8057-8f656abefa6c&wdOrigin=TEAMS-ELECTRON.p2p_ns.bim&wdExp=TEAMS-CONTROL&wdhostclicktime=1697653648774&web=1))
  - >
- [x] Collect ownership information (Who do we ask questions of if we have problems?)
  - >
- [ ] Create new workflow notebook from template; stash in the `./workflows` folder tree in an appropriate spot.
  - [x] Identify landing spot on S3 (currently somewhere in: https://s3.console.aws.amazon.com/s3/buckets/nhgf-development?prefix=workspace/&region=us-west-2)
  - [x] Calculate chunking, layout, compression, etc
  - [x] Run notebook
  - [x] Read test (pattern to be determined by the dataset)
- [ ] Create STAC catalog entry;
  - [ ] Verify all metadata
  - [ ] Create entry
- [x] Reportage
  - [x] add notebook and the dask performance report to the repo
  - [ ] Calculate summary statistics on output (compression ratio, total size)
  - [ ] Save STAC JSON snippet to repo
- [ ] Merge and close the issue.
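
For the "Calculate chunking" and "Calculate summary statistics" items, the arithmetic can be sketched from the dimensions in the dataset description above. This is only a rough sketch, not the workflow notebook's actual method: the candidate chunk shape and the helper names (`nbytes`, `n_chunks`, `compression_ratio`) are hypothetical and chosen purely for illustration. The stored size needed for the compression ratio has to be measured on the Zarr output, so no value is assumed for it here.

```python
from math import ceil, prod

# Dimensions and dtype taken from the OPeNDAP dataset description in this issue.
DIMS = {"Time": 315576, "south_north": 165, "west_east": 248}
ITEMSIZE = 4   # Float32 is 4 bytes per value
N_VARS = 10    # the 3-D data variables: HFX, LH, PBLH, Q2, RAINC, RAINNC, SWDOWN, T2, U10, V10

# ASSUMPTION: a hypothetical candidate chunk shape, for illustration only.
CANDIDATE = {"Time": 720, "south_north": 55, "west_east": 62}


def nbytes(shape, itemsize=ITEMSIZE):
    """Uncompressed size in bytes of an array (or chunk) with the given shape."""
    return prod(shape.values()) * itemsize


def n_chunks(dims, chunks):
    """Number of chunks per variable, counting partial chunks at the edges."""
    return prod(ceil(dims[name] / chunks[name]) for name in dims)


def compression_ratio(uncompressed_bytes, stored_bytes):
    """Ratio of uncompressed size to the size actually stored (e.g. on S3)."""
    return uncompressed_bytes / stored_bytes


var_bytes = nbytes(DIMS)                  # one 3-D variable, uncompressed
total_bytes = var_bytes * N_VARS          # all ten 3-D variables
chunk_bytes = nbytes(CANDIDATE)           # one candidate chunk, uncompressed
chunks_per_var = n_chunks(DIMS, CANDIDATE)

print(f"per variable: {var_bytes / 1e9:.2f} GB uncompressed")
print(f"all 3-D variables: {total_bytes / 1e9:.2f} GB uncompressed")
print(f"candidate chunk: {chunk_bytes / 2**20:.2f} MiB, {chunks_per_var} chunks per variable")
```

Once the conversion has run, `compression_ratio(total_bytes, stored_bytes)` gives the summary statistic, where `stored_bytes` is the measured size of the output store.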