hawaii_2018 Conversion to Zarr

Checklist for the dataset conversion workflow:

Dataset Name: Very fine resolution dynamically downscaled climate data for Hawaii

https://cida.usgs.gov/thredds/catalog.html?dataset=cida.usgs.gov/hawaii_hawaii_present
https://cida.usgs.gov/thredds/catalog.html?dataset=cida.usgs.gov/hawaii_maui_present
https://cida.usgs.gov/thredds/catalog.html?dataset=cida.usgs.gov/hawaii_hawaii_rcp45
https://cida.usgs.gov/thredds/catalog.html?dataset=cida.usgs.gov/hawaii_maui_rcp45
https://cida.usgs.gov/thredds/catalog.html?dataset=cida.usgs.gov/hawaii_hawaii_rcp85
https://cida.usgs.gov/thredds/catalog.html?dataset=cida.usgs.gov/hawaii_maui_rcp85

https://cida.usgs.gov/thredds/catalog/demo/thredds/hawaii_2018/catalog.html

The source is a large collection of WRF output. For example, the OPeNDAP dataset descriptor for hawaii_hawaii_present:

Dataset {
    Float32 HGT[south_north = 205][west_east = 180];
    Float32 LANDMASK[south_north = 205][west_east = 180];
    Float32 XLAT[south_north = 205][west_east = 180];
    Float32 XLONG[south_north = 205][west_east = 180];
    Int16 CFRACL[Time = 175296][south_north = 205][west_east = 180];
    Int16 CFRACT[Time = 175296][south_north = 205][west_east = 180];
    Float32 FGDP[Time = 175296][south_north = 205][west_east = 180];
    Float32 GLW[Time = 175296][south_north = 205][west_east = 180];
    Float32 GRDFLX[Time = 175296][south_north = 205][west_east = 180];
    Float32 GSW[Time = 175296][south_north = 205][west_east = 180];
    Float32 HFX[Time = 175296][south_north = 205][west_east = 180];
    Int32 I_RAINNC[Time = 175296][south_north = 205][west_east = 180];
    Float32 LAI[Time = 175296][south_north = 205][west_east = 180];
    Float32 LH[Time = 175296][south_north = 205][west_east = 180];
    Int16 LU_INDEX[Time = 175296][south_north = 205][west_east = 180];
    Float32 LWP[Time = 175296][south_north = 205][west_east = 180];
    Float32 PSFC[Time = 175296][south_north = 205][west_east = 180];
    Float32 Q2[Time = 175296][south_north = 205][west_east = 180];
    Float32 RAINNC[Time = 175296][south_north = 205][west_east = 180];
    Float32 SNOW[Time = 175296][south_north = 205][west_east = 180];
    Int16 SNOWC[Time = 175296][south_north = 205][west_east = 180];
    Float32 SNOWH[Time = 175296][south_north = 205][west_east = 180];
    Float32 T2[Time = 175296][south_north = 205][west_east = 180];
    Float32 TSK[Time = 175296][south_north = 205][west_east = 180];
    Int32 Time[Time = 175296];
    Float32 U10[Time = 175296][south_north = 205][west_east = 180];
    Float32 V10[Time = 175296][south_north = 205][west_east = 180];
} hawaii_hawaii_present;

The collection may be too large to convert over OPeNDAP: roughly 15 GB per year, split into two domains, for 20 years.
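
As a rough cross-check on the size concern, the sketch below computes the uncompressed in-memory size of the time-varying variables listed in the descriptor above for hawaii_hawaii_present. It assumes the standard item sizes for the listed types (2 bytes for Int16, 4 bytes for Int32 and Float32) and ignores the four static 2-D fields and the Time coordinate. The result is an uncompressed figure, so it is not directly comparable to the on-disk size quoted above.

```python
import math

# Dimension lengths copied from the descriptor for hawaii_hawaii_present.
DIMS = {"Time": 175296, "south_north": 205, "west_east": 180}

# Bytes per element for each declared type (standard sizes, assumed here).
ITEMSIZE = {"Int16": 2, "Int32": 4, "Float32": 4}

# Time-varying variables grouped by declared type.
VARS = {
    "Int16": ["CFRACL", "CFRACT", "LU_INDEX", "SNOWC"],
    "Int32": ["I_RAINNC"],
    "Float32": [
        "FGDP", "GLW", "GRDFLX", "GSW", "HFX", "LAI", "LH", "LWP", "PSFC",
        "Q2", "RAINNC", "SNOW", "SNOWH", "T2", "TSK", "U10", "V10",
    ],
}

# Number of elements in one [Time][south_north][west_east] variable.
cells = math.prod(DIMS.values())

# Sum over types: item size * number of variables * elements per variable.
total_bytes = sum(ITEMSIZE[t] * len(names) * cells for t, names in VARS.items())

print(f"{total_bytes / 1e9:.1f} GB uncompressed for one domain and scenario")
```

For this one domain and scenario the total comes to about 517.5 GB uncompressed, which supports testing access speed before committing to a full conversion over OPeNDAP.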

Recommend testing OPeNDAP access speed first and switching to direct file access only if absolutely necessary.
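
One way to run that speed test is to time a small read and extrapolate. The sketch below is a minimal example, not the project's workflow: `OPENDAP_URL`, `read_opendap_slice`, and `time_read` are hypothetical names, the endpoint is a guess based on the usual THREDDS `/thredds/dodsC/` layout, and the choice of `T2` and 24 time steps is arbitrary. It assumes xarray with an OPeNDAP-capable backend (netCDF4 or pydap) is installed.

```python
import time

# Assumed endpoint: THREDDS usually serves OPeNDAP under /thredds/dodsC/,
# but confirm the exact URL on the catalog page before using it.
OPENDAP_URL = "https://cida.usgs.gov/thredds/dodsC/hawaii_hawaii_present"


def read_opendap_slice(url, variable="T2", n_steps=24):
    """Load the first n_steps time steps of one variable over OPeNDAP."""
    import xarray as xr  # imported lazily so time_read has no dependencies

    # decode_times=False avoids depending on how the Time units are encoded.
    ds = xr.open_dataset(url, decode_times=False)
    return ds[variable].isel(Time=slice(0, n_steps)).load()


def time_read(read_fn):
    """Call read_fn() and return (nbytes, seconds, megabytes_per_second)."""
    start = time.perf_counter()
    data = read_fn()  # must return an object exposing .nbytes
    seconds = max(time.perf_counter() - start, 1e-9)  # guard against a zero interval
    return data.nbytes, seconds, data.nbytes / seconds / 1e6
```

A trial run would look like `time_read(lambda: read_opendap_slice(OPENDAP_URL))`. Dividing the estimated total size by the measured rate gives a first guess at how long a full transfer would take, which is the number to weigh against direct access.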


  • Identify Source Data location and access (check the dataset spreadsheet)
  • Collect ownership information (Who do we ask questions of if we have problems?)
  • Create new workflow notebook from template; stash in the ./workflows folder tree in an appropriate spot.
  • Create STAC catalog entry:
    • Verify all metadata
    • Create entry
  • Reporting
    • Add the notebook and the Dask performance report to the repo
    • Calculate summary statistics on output (compression ratio, total size)
    • Save STAC JSON snippet to repo
  • Merge and close the issue.
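
The summary statistics step in the checklist above can be sketched as follows. This is an illustration under stated assumptions: the Zarr output is assumed to be a directory store on a local filesystem (an object store would need a different way to list sizes), and `stored_bytes` and `summarize` are hypothetical helper names. The uncompressed size is passed in by the caller, for example from `xarray.Dataset.nbytes` on the opened output.

```python
import os


def stored_bytes(store_path):
    """Total size in bytes of every file under a local Zarr directory store."""
    total = 0
    for root, _dirs, files in os.walk(store_path):
        for name in files:
            # Chunk files and metadata files are both counted.
            total += os.path.getsize(os.path.join(root, name))
    return total


def summarize(store_path, uncompressed_bytes):
    """Return total stored size and compression ratio for the store."""
    on_disk = stored_bytes(store_path)
    if on_disk == 0:
        raise ValueError(f"no files found under {store_path!r}")
    return {
        "total_size_bytes": on_disk,
        "uncompressed_bytes": uncompressed_bytes,
        # Ratio above 1 means the store is smaller than the raw arrays.
        "compression_ratio": uncompressed_bytes / on_disk,
    }
```

The returned dictionary can be written next to the notebook so the reported numbers are reproducible from the store itself.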
Edited by Andrew Laws