
Software Review: GARDEN

Bellino, Jason C. requested to merge jbellino/flow-vectors:software-review into 1.0.0

This merge request serves to document the software review process for "GARDEN - A flow vector tool for EDEN". The review was based on the required steps for a "software release" found at https://www.usgs.gov/products/software/software-management/types-software-review. License, disclaimer, and metadata (code.json) files were copied from https://code.chs.usgs.gov/software/software-management.git and project-specific updates were made where applicable.

The first section, "Overview", is a summary of activities performed during the review. A checklist for the official release of software is enumerated below in the section titled "Software Release Checklist". As tasks in the checklist are completed, the corresponding items in this document should be updated.

All review steps were completed by Jason Bellino (jbellino@usgs.gov).

Overview

Administrative Security Review [required]

A security review of the code, including all commits, was conducted and no issues were found.

All software must have an administrative security review before it is made publicly available by any method. This type of review ensures personal, private, or otherwise sensitive information is not included in the repository. Types of sensitive information include:

  • Personally identifiable information (PII)
  • Absolute file system paths
  • Internal server host names or IP addresses
  • Usernames/passwords

from https://www.usgs.gov/products/software/software-management/types-software-review
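The categories listed above lend themselves to a rough automated first pass. The following Python sketch is illustrative only and is not the procedure used in this review: the regular expressions and the `scan_text` helper are hypothetical, deliberately simple, and will produce both false positives and misses, so they would supplement rather than replace a manual read of each commit.

```python
import re

# Hypothetical patterns for illustration only; a real review needs broader
# rules and a manual read of every commit.
PATTERNS = {
    "ipv4_address": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "email_address": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "absolute_path": re.compile(r"(?:[A-Za-z]:\\|/(?:home|Users)/)[^\s'\"]+"),
    "password_assignment": re.compile(r"(?i)\b(?:password|passwd|pwd)\s*[=:]\s*\S+"),
}


def scan_text(text):
    """Return (line_number, category, matched_text) for each suspicious match."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for category, pattern in PATTERNS.items():
            for match in pattern.finditer(line):
                findings.append((line_number, category, match.group(0)))
    return findings
```

To cover project history rather than only the current working tree, the same function could be applied to the text of each commit's diff.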

Code Review [required]

Code was checked for structural quality and some minor formatting changes were made to conform to PEP 8 guidelines and improve readability of multiline mathematical operations. No unit tests were available for this package at the time of review and no major issues were found.

Pylint code score: 8.84/10.

Code reviews ensure structural code quality and should be performed frequently throughout the stages of software development. This might mean different things depending on the individual project/team but some typical quality checks include:

  • Coding standards
  • Unit tests passing
  • User input cleansing
  • Memory leaks
  • Vulnerabilities
  • Optimizations

from https://www.usgs.gov/products/software/software-management/types-software-review

This code was written for Python 3.9 with vulnerabilities listed here and here. A dependency requirement was added to the environment.yml to use Python 3.12.3 instead, avoiding known vulnerabilities associated with CPython in previous versions (CVE-2023-6597, CVE-2024-0450, etc.). No changes to the code base were necessary to satisfy this upgrade. The only currently-known vulnerability for built-in libraries used in this repository is:

The following external packages were also checked for vulnerabilities with no critical issues found:

| Name | Version | Latest Version | Description | Vulnerability Analysis |
|---|---|---|---|---|
| pandas | 2.2.1 | 2.2.2 | Powerful data structures for data analysis, time series, and statistics. | https://security.snyk.io/package/pip/pandas |
| xarray | 2024.2.0 | 2024.5.0 | N-D labeled arrays and datasets in Python. | https://security.snyk.io/package/pip/xarray |
| numpy | 1.26.4 | 1.26.4 | Fundamental package for array computing in Python. | https://security.snyk.io/package/pip/numpy |
| paramiko | 3.4.0 | 3.4.0 | SSH2 protocol library. | https://security.snyk.io/package/pip/paramiko |
| matplotlib | 3.8.3 | 3.9.0 | Python plotting package. | https://security.snyk.io/package/pip/matplotlib |
| cartopy | 0.22.0 | 0.23.0 | A Python library for cartographic visualizations with Matplotlib. | https://security.snyk.io/package/pip/cartopy |
| netcdf4 | 1.6.5 | 1.6.5 | Provides an object-oriented python interface to the netCDF version 4 library. | https://security.snyk.io/package/pip/netcdf4 |
| openpyxl | 3.1.2 | 3.1.2 | A Python library to read/write Excel 2010 xlsx/xlsm files. | https://security.snyk.io/package/pip/openpyxl |
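To see at a glance which pinned packages trail their latest release, the versions in the table above can be compared programmatically. This is a minimal sketch that assumes purely numeric dotted version strings, which holds for every entry listed here; the helper names `parse_version` and `outdated` are hypothetical, and a being-behind result says nothing by itself about vulnerabilities.

```python
# Pinned and latest versions copied from the table above.
PACKAGES = {
    "pandas": ("2.2.1", "2.2.2"),
    "xarray": ("2024.2.0", "2024.5.0"),
    "numpy": ("1.26.4", "1.26.4"),
    "paramiko": ("3.4.0", "3.4.0"),
    "matplotlib": ("3.8.3", "3.9.0"),
    "cartopy": ("0.22.0", "0.23.0"),
    "netcdf4": ("1.6.5", "1.6.5"),
    "openpyxl": ("3.1.2", "3.1.2"),
}


def parse_version(version):
    """Split a purely numeric dotted version such as '2024.2.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def outdated(packages):
    """Return the sorted names whose pinned version is older than the latest version."""
    return sorted(
        name
        for name, (pinned, latest) in packages.items()
        if parse_version(pinned) < parse_version(latest)
    )


print(outdated(PACKAGES))  # ['cartopy', 'matplotlib', 'pandas', 'xarray']
```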

Domain Review [required]

Code was checked for internal consistency, proper logic and usage, and suitability of the libraries chosen. No similar outputs from other sources are available to compare for testing code accuracy; however, algorithms for computing vectors are applied using standard tools. Output files located in the "sample-output" directory were recreated with the applicable commands and the results were identical.
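One way to confirm that regenerated outputs match the committed samples is to compare file digests. The sketch below is an assumption about how such a check could be done, not a record of the method used in this review; `compare_directories` and `file_digest` are hypothetical helpers. A byte-for-byte comparison is also stricter than a scientific comparison, since some file formats embed creation timestamps that differ between otherwise identical runs.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_directories(reference_dir, candidate_dir):
    """Return relative paths that are missing from or differ in candidate_dir."""
    reference_dir, candidate_dir = Path(reference_dir), Path(candidate_dir)
    mismatches = []
    reference_files = sorted(p for p in reference_dir.rglob("*") if p.is_file())
    for reference_file in reference_files:
        relative = reference_file.relative_to(reference_dir)
        candidate_file = candidate_dir / relative
        if not candidate_file.is_file():
            mismatches.append(str(relative))
        elif file_digest(reference_file) != file_digest(candidate_file):
            mismatches.append(str(relative))
    return mismatches
```

An empty list from `compare_directories("sample-output", "regenerated-output")` would indicate identical files, where `regenerated-output` is a placeholder name for wherever the rerun results were written.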


Software Release Checklist

from https://code.chs.usgs.gov/software/software-management/-/blob/main/software-release-checklist.md?ref_type=heads

Provisional Release

A provisional release is for the entirety of the project. The default branch (i.e. master or main) will be used for the provisional release.

Provisional releases allow the source code to be made available publicly. This may be in the timely interest of best science, to facilitate external domain and/or technical code reviews (in pursuance of a subsequent official release), or for any other reason deemed acceptable by your center director or equivalent or delegated authority.

A provisional release must exist in a group hierarchy on https://code.usgs.gov/ (i.e., it may not exist in your personal namespace). Top-level groups should reflect current high-level USGS organizational structure (e.g., science center, mission area). Additional nested groups can be created as appropriate.

A provisional release may not be cited or referenced by other official USGS Information Products (e.g., manuscripts, data releases). Data produced by a provisional release may not be used to support any other official USGS Information Product (e.g., manuscript).

Checklist

  • Reviews
    • Administrative security review
  • Administrative files
    • Readme
    • Disclaimer
    • License
    • Metadata
  • Request product be made publicly available on a provisional basis

Reviews

Administrative security review

  • Ensure review inspects each commit in project history
  • If development workflow included incremental peer reviews, this step is already complete

Administrative Files

Readme

A README.md file (casing and extension are important) should exist in the top-level of the project. This file should include introductory information about the project (e.g., title, description, use-case, recommended citation). It should include links to additional detailed information such as the disclaimer statement, license information, documentation, user guide, and related manuscript(s).

This file should be in markdown format to ensure it renders properly on GitLab.

Disclaimer

This FSP provisional disclaimer must be included in a file located at the top-level of the project called DISCLAIMER.md. Both casing and file extension are important.

This disclaimer may additionally be embedded in the README.md file, but this is not necessary and can lead to consistency problems later if the content is updated in one place and not the other. It is instead recommended to link to the disclaimer from the README.md file.

License

An appropriate open-source license must be included in a file located at the top-level of the project called LICENSE.md. Both casing and file extension are important.

Selecting the appropriate license may depend on the project, its provenance, its dependencies, or other factors. A default starting point is provided, but may not be appropriate in all cases. If you are not sure which license to select, please consult with the Office of the Solicitor for legal guidance.

Metadata

Project metadata are stored and maintained in a file at the top-level of the project called code.json. This file is in JSON format. Its top-level element should be the releases array as defined in the code.json schema. An example code.json template file is provided to help get you started. You may use an online validator tool to check basic syntax and you may validate against the defined schema using a custom command line interface application. Additional information about individual fields is provided below:

  • name: Should be a short, human readable name for the project. This should match the value provided when creating the project in GitLab.
  • organization: Must always be U.S. Geological Survey; casing and punctuation are important.
  • description: This may be a longer description of the project. It should be no more than 1-2 sentences. Verbose descriptions may exist in the README.md file.
  • version: This should be a semantic version number for the release (e.g., 1.0.0). This should not include a leading v (i.e., v1.0.0) or other identifier. Eventually a Git tag should be created for the indicated release.
  • status: Must be one of the enumerated values listed below:
    • Ideation
    • Development
    • Alpha
    • Beta
    • Release Candidate
    • Production
    • Archival
  • permissions.license.name: If using the default template provided above, this should be Public Domain, CC0-1.0. Otherwise it should be the name of the selected license.
  • permissions.license.URL: A link to the LICENSE.md file stored in this project
    • Must use the raw variant of the file
    • Must reference the main or master branch (as appropriate, this will differ for an official release, see below)
  • homepageURL: A link to the project homepage
    • Must be publicly accessible
    • Must point to a hosting platform with an authority to operate
      • May point to the project on GitLab
      • May point to a project home page elsewhere on the usgs.gov website
  • downloadURL: A link to download a ZIP archive of the project source code
    • Must point to the main or master branch (as appropriate, this will differ for an official release, see below)
  • disclaimerURL: A link to the DISCLAIMER.md file stored in this project
    • Must use the raw variant of the file
    • Must point to the main or master branch (as appropriate, this will differ for an official release, see below)
  • repositoryURL: A link to this project on GitLab
    • Must include the .git extension
  • tags: An array of topical/domain tags relevant to the project
    • If the project leverages or is related to AI/ML in any way, this array must include the tag usg-artificial-intelligence. This tag is short for U.S. Government Artificial Intelligence (i.e., do not use usgs-artificial-intelligence).
  • languages: An array of the programming languages used within this project
  • date.metadataLastUpdated: An ISO datestamp of when the code.json file was last modified. Be sure to update this value whenever you modify anything in this file.

Additional fields are also available. See the official code.gov metadata schema for additional details.
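Several of the field rules above are mechanical enough to check in code before submitting a release request. The sketch below is a hedged illustration, not the validator referenced in this document: `check_release` is a hypothetical helper, it covers only four of the fields, and its version pattern accepts only plain MAJOR.MINOR.PATCH strings even though semantic versioning also permits pre-release suffixes.

```python
import re

# Status values enumerated in the field list above.
ALLOWED_STATUSES = {
    "Ideation", "Development", "Alpha", "Beta",
    "Release Candidate", "Production", "Archival",
}

# Simplification: plain MAJOR.MINOR.PATCH only, with no leading "v".
PLAIN_SEMVER = re.compile(r"^\d+\.\d+\.\d+$")


def check_release(release):
    """Return a list of human-readable problems found in one release object."""
    problems = []
    if release.get("organization") != "U.S. Geological Survey":
        problems.append("organization must be exactly 'U.S. Geological Survey'")
    if not PLAIN_SEMVER.match(str(release.get("version", ""))):
        problems.append("version must look like 1.0.0 with no leading 'v'")
    if release.get("status") not in ALLOWED_STATUSES:
        problems.append("status must be one of the enumerated values")
    if not str(release.get("repositoryURL", "")).endswith(".git"):
        problems.append("repositoryURL must include the .git extension")
    return problems
```

Calling `check_release` on each object loaded with `json.load` from code.json returns an empty list when these four rules are satisfied; full validation against the schema is still needed.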

Note that the top-level element in this file is an array. This means it may contain more than one release object for your project; for example, if this project has been under development for a long time, there may be multiple released versions. In this case, release objects should be ordered with the most recently released version appearing first, and so-on in reverse chronological order. For example:

[
  {
    // ... release 3.0.0, status Development
  },
  {
    // ... release 2.0.0, status Production
  },
  {
    // ... release 1.0.0, status Archival
  }
]

This metadata evolves over time. In the example above, the release tag for version 1.0.0 would include metadata for only that release, likely with a status of Production. The release tag for version 2.0.0 would then contain two release objects: 2.0.0 first with status Production, followed by 1.0.0 with status Archival, and so on.
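The ordering rule can be checked with a few lines of Python. This sketch assumes that version numbers increase with release date, so that "most recent first" is equivalent to "highest version first", and that versions are plain numeric MAJOR.MINOR.PATCH strings; `version_key` and `is_newest_first` are hypothetical names.

```python
def version_key(release):
    """Turn a release's version string such as '2.0.0' into a comparable tuple."""
    return tuple(int(part) for part in release["version"].split("."))


def is_newest_first(releases):
    """Return True if every release has a higher version than the one after it."""
    keys = [version_key(release) for release in releases]
    return all(earlier > later for earlier, later in zip(keys, keys[1:]))


releases = [
    {"version": "3.0.0", "status": "Development"},
    {"version": "2.0.0", "status": "Production"},
    {"version": "1.0.0", "status": "Archival"},
]
print(is_newest_first(releases))  # True
```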


Official Release

The checklist below documents the standard steps/artifacts necessary to complete an official release of a software product. Individual science centers, offices, etc. may have additional requirements. Check with your local management to determine if any additional steps apply.

Official releases must reference immutable Git tags. As you work towards an official release, you can use a release candidate branch to facilitate this process. The release candidate branch should have the same name as the eventual release tag. Upon completing the official release process, all other branches, tags, refs, etc. in the associated GitLab project are automatically released as provisional products as well. The primary distinctions between a provisional release and an official release are the contents of the DISCLAIMER.md file, the code.json status property, and the ability to cite/reference the released product.

Checklist

  • Create IPDS record
    • Document reviews and approvals in this record as you proceed
  • Reserve a DOI
    • This should point to the immutable tag to be created later; it will not resolve until the tag is created, and that is okay
  • Complete checklist for provisional release (above)
    • It is acceptable to skip submitting the issue to request the provisional release when directly pursuing an official release
  • Create a release-candidate branch for the targeted release
    • If intending to release an immutable tag version 1.0.0, create a release candidate branch 1.0.0.
  • Reviews
    • Technical code review
      • Should be completed by somebody with adequate understanding of the programming languages used in the project
      • May be facilitated/reconciled via GitLab issues and merge requests against the release candidate branch
      • Must be documented in the IPDS record
    • Domain Review
      • Should be completed by somebody with adequate understanding of the scientific subject matter covered in the released product
      • Solid documentation and testing suites can aid this review process
      • Must be documented in the IPDS record
  • Reconcile main or master branch
    • Merge changes from the release candidate branch back into the main or master branch (as appropriate)
  • Prepare final release changes (in the release candidate branch)
    • Update DISCLAIMER.md file to include the official verbiage
    • Update the code.json file...
      • The version field should be updated appropriately
      • The status field can be updated appropriately
      • URLs should point to the immutable tagged version (i.e., not main or master)
    • Do not merge these changes back in to the main or master branch
  • Request product be made publicly available as an official USGS Information Product
    • Submit a new issue using the GitLab Official Release issue template.
    • The Branch or Tag should indicate the intended target release tag even though it does not exist yet. The corresponding release candidate branch will be used to finalize the release.
    • An administrator will contact you to correct any errors in the release package and facilitate the final release of the product. During this process, you will be instructed to create an immutable Git tag and delete the release candidate branch.
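The requirement in the checklist above that code.json URLs point to the immutable tag rather than to main or master can be spot-checked before submitting the release issue. The sketch below is an assumption-laden illustration: `mutable_ref_fields` is a hypothetical helper, the example URLs are placeholders, and the check simply looks for a path segment named main or master, so it would misfire on a group or project that happens to carry one of those names.

```python
from urllib.parse import urlparse


def mutable_ref_fields(urls, mutable_refs=("main", "master")):
    """Return the sorted field names whose URL path contains a mutable branch name."""
    flagged = []
    for field, url in urls.items():
        segments = urlparse(url).path.split("/")
        if any(segment in mutable_refs for segment in segments):
            flagged.append(field)
    return sorted(flagged)


# Placeholder URLs for a hypothetical project; the path layout is an assumption.
example_urls = {
    "disclaimerURL": "https://code.usgs.gov/example-group/example-project/-/raw/main/DISCLAIMER.md",
    "licenseURL": "https://code.usgs.gov/example-group/example-project/-/raw/1.0.0/LICENSE.md",
}
print(mutable_ref_fields(example_urls))  # ['disclaimerURL']
```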

References

  1. https://code.chs.usgs.gov/software/software-management/-/issues/new
  2. https://www.usgs.gov/software-management/types-software-review
  3. https://docs.gitlab.com/ee/user/markdown.html
  4. https://code.chs.usgs.gov/software/software-management/-/raw/main/administrative_templates/DISCLAIMER_preliminary.md
  5. https://code.chs.usgs.gov/software/software-management/-/raw/main/administrative_templates/LICENSE.md
  6. https://www.json.org/json-en.html
  7. https://github.com/GSA/code-gov-data/blob/master/schemas/schema-2.0.0.json
  8. https://code.chs.usgs.gov/software/software-management/-/raw/main/administrative_templates/code.json
  9. https://jsonformatter.curiousconcept.com/
  10. https://code.chs.usgs.gov/ghsc/hazdev/inventory-validator
  11. https://semver.org/
  12. https://git-scm.com/book/en/v2/Git-Basics-Tagging
  13. https://code.chs.usgs.gov/software/software-management/-/raw/main/administrative_templates/DISCLAIMER_official.md
Edited by Bellino, Jason C.