
Abundant data are currently available to describe the landscapes around us, yet the raw forms of these data are not always useful for scientific research and need to be processed into appropriate spatial units for analysis. Studies of streams suggest that a stream and its condition can be characterized by accounting for the landscape draining to a stream segment and the landscape upstream and/or downstream of the stream segment (network).
This Python package is intended to assist with summarization of landscape information to stream watershed drainages (local summaries). Methods are built in a generalized way and are intended to support efforts for any stream network that has polygon-based drainage watersheds. The output of these methods can be used to calculate stream network summaries using [xstrm](https://doi.org/10.5066/P9P8P7Z0).
Currently this package supports zonal summaries for both grid-based data (e.g. TIFF, ESRI Raster, IMG) and point data. Summary results are returned as a dataframe, with an option to export them as a CSV file. An example of summary inputs and outputs is shown below under the quick start section. Additional illustrations are included in the 'tests/test_data' folder in the repository. Metadata for each summary are also built in JSON format, capturing information about the processing environment, parameters, data inputs/outputs, and processing date.
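To give a rough idea of the kind of metadata record described above, here is a minimal sketch. The function name `build_metadata`, the field names, and the output file name are illustrative assumptions and do not reflect the actual schema written by xstrm_local.

```python
import json
import platform
import sys
from datetime import datetime, timezone


def build_metadata(summary_name, parameters, inputs, outputs):
    """Assemble a JSON-serializable record (hypothetical field names)."""
    return {
        "summary_name": summary_name,
        "processing_environment": {
            "python_version": sys.version.split()[0],
            "platform": platform.platform(),
        },
        "parameters": parameters,
        "data_inputs": inputs,
        "data_outputs": outputs,
        "processing_date": datetime.now(timezone.utc).isoformat(),
    }


metadata = build_metadata(
    summary_name="summary_testing1",
    parameters={"stats": "mean count", "all_touched": False},
    inputs=["./tests/test_data/test-grid-data.tif", "./tests/test_data/test-poly.shp"],
    outputs=["summary_testing1.csv"],  # hypothetical output file name
)
metadata_json = json.dumps(metadata, indent=2)
```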
Daniel Wieferich - dwieferich@usgs.gov
## USGS Software Release Information
Wieferich, D.J., Gressler B., Krause K., Wieczorek M., McDonald, S. 2022. xstrm_local Version-1.1.0. U.S. Geological Survey software release. https://doi.org/10.5066/P98BOGI9.
The following USGS programs and science centers contributed to these efforts: Science Analytics and Synthesis Program, Eastern Ecological Science Center, National Climate Adaptation Science Center, USGS Maryland-Delaware-D.C. Water Science Center, and USGS Lower Mississippi-Gulf Water Science Center.
The requirements.txt file lists a condensed set of packages for general use, while requirements_dev.txt lists the full set of packages used in development.
Before installing xstrm_local, it is recommended to have the operating-system-specific dependencies for GDAL in place. It is also recommended to create a dedicated Python environment.
#### Installation for Linux and Mac
After activating your new Python environment, install the package from the main branch with pip using the command below. Users can also install working branches of the repository by adding @branch_name after xstrm_local.git.
```sh
# Install Package from main branch
pip install git+https://code.usgs.gov/sas/bioscience/xstrm_local.git
```
or
```sh
# Install Package from working branch
pip install git+https://code.usgs.gov/sas/bioscience/xstrm_local.git@branch_name
```
#### Installation for Windows
For Windows machines it may be easiest to build a local environment using Conda and the xstrm_local.yml file that is included in the package. An example using Anaconda Prompt is included below. Before following these steps, clone or download the xstrm_local repository to the local machine, then use Anaconda Prompt (or a similar shell) to change directories to the top level of the repository.
```sh
# First download or clone the repository to the local machine
# Navigate into the xstrm_local directory, replace 'path/to' with the path to the directory
cd path/to/xstrm_local
# Example creating a Conda environment using Python 3.8 in Windows
conda env create -f xstrm_local.yml
# Activate the new environment called 'xstrm_local'
conda activate xstrm_local
# Install the xstrm_local package metadata
pip install -e . --no-deps
```
#### Summarizing Rasters and other Grid Based Data to Zone Polygons
The grid_to_poly module focuses on summarization of rasters (geo-referenced images) and other grid-based data, such as NetCDF files, to zones of interest. These efforts are primarily intended to support summarization of landscape variables to local watershed zones, but could be used in a more general sense. This module uses rasterstats and multiprocessing methods. Basic use is demonstrated below.
**Within Python**: Note that this example will execute as is if the user downloads xstrm_local and sets the xstrm_local directory as the working directory.
```python
from xstrm_local import grid_to_poly as gtp
test_grid = './tests/test_data/test-grid-data.tif'
test_zones = './tests/test_data/test-poly.shp'
# Initiate summary object
# This example summarizes categorical data
summary = gtp.ZonalSummary(
    summary_name='summary_testing1',
    summary_short_name='tt1',
    stats='mean count',
    grid_data=test_grid,
    zone_data=test_zones,
    zone_id_col="FEATUREID",
    categorical=True,
    all_touched=False,
    cores=2
)
# Process zonal summary, export results
summary.process_zonal_stats(export_summary_data=True, export_metadata=True)
```
**Visual representation of above grid to polygon summarization example**
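
Conceptually, a grid to polygon summary groups grid cells by the zone they fall in and then computes statistics for each group. The sketch below illustrates that idea in plain Python. It is not the package's implementation (which, as noted above, relies on rasterstats), and it assumes the zones have already been rasterized into a grid of zone ids. The function name `zonal_summary` and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def zonal_summary(values, zones, nodata=None, categorical=False):
    """Summarize grid cell values by zone id.

    values and zones are equally shaped 2D lists; zones holds the id of the
    zone each cell falls in (None for cells outside every zone).
    """
    cells = defaultdict(list)
    for value_row, zone_row in zip(values, zones):
        for value, zone_id in zip(value_row, zone_row):
            # Skip cells outside all zones and cells flagged as nodata
            if zone_id is None or value == nodata:
                continue
            cells[zone_id].append(value)

    results = {}
    for zone_id, zone_values in cells.items():
        if categorical:
            # Categorical grids: count the cells of each class per zone
            results[zone_id] = dict(Counter(zone_values))
        else:
            # Continuous grids: simple descriptive statistics per zone
            results[zone_id] = {
                "count": len(zone_values),
                "mean": sum(zone_values) / len(zone_values),
            }
    return results


# Toy 3x3 grid and a matching grid of zone ids (illustrative values only)
grid = [
    [1, 1, 2],
    [1, 2, 2],
    [3, 3, -9],
]
zone_ids = [
    [101, 101, 102],
    [101, 101, 102],
    [103, 103, 103],
]

continuous = zonal_summary(grid, zone_ids, nodata=-9)
classes = zonal_summary(grid, zone_ids, nodata=-9, categorical=True)
```

In the categorical case each zone receives a count of cells per class rather than a mean, which is the general distinction the `categorical` flag draws in this sketch.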

#### Summarizing Point Data to Zone Polygons
The point_to_poly module focuses on summarization of point data to polygon zones of interest. These efforts are intended to support summarization of landscape variables to local watersheds. This module uses geopandas and multiprocessing methods. Basic use is demonstrated below.
**Within Python**: Note that this example will execute as is if the user downloads xstrm_local and sets the xstrm_local directory as the working directory.
```python
from xstrm_local import point_to_poly as ptp
test_points = './tests/test_data/test_points.shp'
test_zones = './tests/test_data/test-poly.shp'
# this example shows four summaries to perform
test_stats = [
    {"col": "t1", "type": "max", "weight": None},
    {"col": None, "type": "count", "weight": None},
    {"col": "t3", "type": "categorical", "weight": None},
    {"col": None, "type": "density", "weight": "AreaSqKM"},
]
# Initiate summary object
summary = ptp.ZonalSummary(
    summary_name='summary_testing1',
    summary_short_name='tt1',
    zone_id_col="FEATUREID",
    summaries=test_stats,
    point_data=test_points,
    zone_data=test_zones,
    nodata=-9,
    calc_nodata=True
)
# Process zonal summary, export results
summary.process_zonal_stats(export_summary_data=True, export_metadata=True)
```
**Visual representation of above point to polygon summarization example**
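
The four summary types in the example (max, count, categorical, density) can be illustrated with a small plain-Python sketch. It is not the package's implementation (which, as noted above, relies on geopandas). The helper names and toy data are hypothetical, and treating density as the point count divided by the `AreaSqKM` weight is an assumption made for illustration.

```python
from collections import Counter


def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def summarize_points(points, zones):
    """Compute max, count, class counts and density for each zone."""
    results = {}
    for zone_id, zone in zones.items():
        hits = [p for p in points
                if point_in_polygon(p["x"], p["y"], zone["polygon"])]
        t1_values = [p["t1"] for p in hits]
        results[zone_id] = {
            "t1_max": max(t1_values) if t1_values else None,
            "count": len(hits),
            "t3_classes": dict(Counter(p["t3"] for p in hits)),
            # Assumption: density = number of points / zone area
            "density": len(hits) / zone["AreaSqKM"],
        }
    return results


# Two square zones and three points (illustrative values only)
zones = {
    1: {"polygon": [(0, 0), (2, 0), (2, 2), (0, 2)], "AreaSqKM": 4.0},
    2: {"polygon": [(2, 0), (4, 0), (4, 2), (2, 2)], "AreaSqKM": 4.0},
}
points = [
    {"x": 0.5, "y": 0.5, "t1": 3, "t3": "a"},
    {"x": 1.5, "y": 1.0, "t1": 7, "t3": "b"},
    {"x": 3.0, "y": 1.0, "t1": 5, "t3": "a"},
]
zone_summaries = summarize_points(points, zones)
```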

Pytest is used to test this code. To run the tests, first clone the repository and navigate to the xstrm_local directory. Ensure your environment is activated and all development requirements from requirements_dev.txt are installed, then run python -m pytest. When adding new tests, start the name of the new module and of each test function with the prefix 'test_'.
```sh
# After cloning the repository navigate to the directory
cd xstrm_local
# Run pytests
python -m pytest
```
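As an illustration of the naming convention, a new test module might look like the sketch below. The file name `tests/test_example.py` and the helper `mean_of` are hypothetical stand-ins and are not part of xstrm_local.

```python
# Hypothetical file: tests/test_example.py
# pytest collects modules and functions whose names start with "test_".


def mean_of(values):
    """Stand-in helper for illustration; not part of xstrm_local."""
    return sum(values) / len(values)


def test_mean_of_returns_average():
    assert mean_of([1, 2, 3]) == 2
```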
## Versioning
Bumpversion is used to version this code. To version, create a development environment that includes bumpversion and use the appropriate command below.
#### Small adjustments to code, improved documentation, and/or updated tests
``bumpversion patch --allow-dirty``
#### Improved methods within modules
``bumpversion minor --allow-dirty``
#### New module, or release
``bumpversion major --allow-dirty``
## Copyright and License
This USGS product is considered to be in the U.S. public domain, and is licensed under the [Unlicense](https://unlicense.org/).
This software is preliminary or provisional and is subject to revision. It is being provided to meet the need for timely best science. The software has not received final approval by the U.S. Geological Survey (USGS). No warranty, expressed or implied, is made by the USGS or the U.S. Government as to the functionality of the software and related material nor shall the fact of release constitute any such warranty. The software is provided on the condition that neither the USGS nor the U.S. Government shall be held liable for any damages resulting from the authorized or unauthorized use of the software.
## Acknowledgements
* This work was supported by funding from the [USGS Community for Data Integration (CDI)](https://www.usgs.gov/centers/community-for-data-integration-cdi). The CDI project (FY2016) National Stream Summarization: Standardizing Stream-Landscape Summaries Project and all those involved contributed guidance and concepts used in this effort.
* This package was created with [Cookiecutter](https://github.com/audreyr/cookiecutter) and the [audreyr/cookiecutter-pypackage](https://github.com/audreyr/cookiecutter-pypackage) project template.
The official USGS software release can be found at https://doi.org/10.5066/P98BOGI9. The main branch will have the most up-to-date version of the code.