" diffs = [d2 - d1 for d1, d2 in zip(dim_vals, dim_vals[1:])]\n",
" unique_steps = np.unique(diffs)\n",
" # set step - if all steps are the same length\n",
" # datacube spec specifies to use null for irregularly spaced steps\n",
" if len(unique_steps)==1:\n",
" step = unique_steps[0]\n",
" else:\n",
" step = None\n",
" return(step)"
]
},
{
"cell_type": "markdown",
"id": "00a5e041-081d-428d-ac2e-75d16de205e6",
"metadata": {},
"source": [
"#### user input needed - you will need to copy all of the dimensions from above into the dict and fill in the appropriate attributes(type, axis, extent):"
"#### user inpds['bottom_top_stag'].values.min()ut needed - you will need to copy all of the dimensions from above into the dict and fill in the appropriate attributes(type, axis, extent):"
]
},
{
...
...
"# dimension name should come from the coordinates printed above\n",
"# we do not recommend including redundant dimensions (do not include x,y if you have lon,lat)\n",
"# note that the extent of each dimension should be pulled from the dataset\n",
This is a workflow for transforming the CONUS404 daily zarr dataset into a [STAC collection](https://github.com/radiantearth/stac-spec/blob/master/collection-spec/collection-spec.md). We use the [datacube extension](https://github.com/stac-extensions/datacube) to define the spatial and temporal dimensions of the zarr store, as well as the variables it contains.
So that this workflow can scale to many datasets, a few simplifying suggestions and assumptions are made:
1. For USGS data, we can use the CC0-1.0 license. For all other data we can use Unlicense. Ref: https://spdx.org/licenses/
2. We assume all coordinates use the WGS84 datum unless otherwise specified.
# When writing data to Zarr, Xarray sets this attribute on all variables based on the variable dimensions. When reading a Zarr group, Xarray looks for this attribute on all arrays, raising an error if it can't be found.
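As a minimal sketch, xarray stores the dimension names in each Zarr array's attributes under the key `_ARRAY_DIMENSIONS` (the dimension names below are illustrative):

```python
# Sketch: the attribute xarray writes on each Zarr array so that dimension
# names survive the round trip (dimension names here are illustrative).
attrs = {"_ARRAY_DIMENSIONS": ["time", "lat", "lon"]}

# On read, xarray maps each axis of the stored array to these names,
# so a (time, lat, lon) array is reconstructed with labeled dimensions.
print(attrs["_ARRAY_DIMENSIONS"])  # ['time', 'lat', 'lon']
```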
#### user input needed - you will need to copy all of the dimensions from above into the dict and fill in the appropriate attributes (type, axis, extent):
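A sketch of what the filled-in dict might look like, following the datacube extension's `cube:dimensions` layout. The dimension names and extent values below are illustrative placeholders, not the real CONUS404 values — pull the actual extents from the dataset:

```python
# Illustrative cube:dimensions dict; all extents below are placeholders
# and must be pulled from the dataset's coordinates.
dims_dict = {
    "time": {
        "type": "temporal",
        "extent": ["1979-10-01T00:00:00Z", "2022-10-01T00:00:00Z"],  # placeholder
    },
    "lon": {"type": "spatial", "axis": "x", "extent": [-138.7, -57.1]},  # placeholder
    "lat": {"type": "spatial", "axis": "y", "extent": [17.6, 57.8]},  # placeholder
}
```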
# spec says that the keys of cube:dimensions and cube:variables should be unique together; a key like lat should not be both a dimension and a variable.
# we will drop all values in dims from vars
vars = [v for v in vars if v not in dims]
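A minimal, self-contained sketch of that filtering step, using illustrative dimension and variable names:

```python
# Illustrative names; in the workflow these come from the dataset's
# coordinates and data variables.
dims = ["time", "lat", "lon"]
vars = ["time", "lat", "lon", "crs", "T2", "PREC_ACC_NC"]

# drop anything that is already a dimension so the keys of
# cube:dimensions and cube:variables stay unique together
vars = [v for v in vars if v not in dims]
print(vars)  # ['crs', 'T2', 'PREC_ACC_NC']
```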
# Microsoft Planetary Computer includes coordinates and crs as variables here: https://planetarycomputer.microsoft.com/dataset/daymet-annual-na
# we will keep those in the var list
# create dictionary of dataset variables and associated dimensions
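A sketch of building that dictionary, using a plain mapping in place of `ds[v].dims` lookups on the xarray dataset. The variable and dimension names are illustrative, and the data/auxiliary split shown is a simple heuristic, not the spec's definition:

```python
# Stand-in for ds[v].dims on the real dataset (illustrative names).
dataset_dims = {
    "T2": ("time", "y", "x"),
    "PREC_ACC_NC": ("time", "y", "x"),
    "crs": (),
}

# one cube:variables entry per variable: its dimensions plus a type;
# here we mark dimensionless variables like crs as auxiliary (heuristic)
vars_dict = {
    name: {"dimensions": list(dims), "type": "data" if dims else "auxiliary"}
    for name, dims in dataset_dims.items()
}
print(vars_dict["T2"]["dimensions"])  # ['time', 'y', 'x']
```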