Commit 754c18d1 authored by Powers, Peter M.

linter fixes and adjustments

parent 0d67b02c
Merge request !525: Docs move
@@ -14,3 +14,6 @@ MD013:
  heading_line_length: 100
  # Number of characters for code blocks
  code_block_line_length: 100
  # Line length requirement disabled for
  code_blocks: false
  tables: false
@@ -6,7 +6,7 @@ TODO

## Build & Run Options

* [Build and run locally](#build-and-run-locally)
* [Run with Docker](#run-with-docker)

## Build and Run Locally
@@ -74,14 +74,14 @@ docker pull usgs/nshmp-haz
### Docker Memory on Mac

By default, Docker Desktop for Mac is set to use 2 GB runtime memory. To run *nshmp-haz*, the
memory available to Docker must be [increased](https://docs.docker.com/docker-for-mac/#advanced)
to a minimum of 4 GB.
### Running

The *nshmp-haz* application may be run as a Docker container, which eliminates the need to install
Git, Java, or other dependencies besides Docker. A public image is available on
Docker Hub at [https://hub.docker.com/r/usgs/nshmp-haz](https://hub.docker.com/r/usgs/nshmp-haz)
and can be run with:
@@ -118,8 +118,8 @@ Where: (TODO links below need checking)
* `RETURN_PERIOD`, in years, is only required when running a disaggregation
* Other arguments:
  * (required) The absolute path to a GeoJSON or CSV [site(s)](site-specification) file
    * CSV example: `$(pwd)/my-csv-sites.csv:/app/sites.csv`
    * GeoJSON example: `$(pwd)/my-geojson-sites.geojson:/app/sites.geojson`
  * (optional) The absolute path to a [configuration](calculation-configuration) file
@@ -144,4 +144,3 @@ Where:
* `JAVA_XMS` is the initial memory for the JVM (default: system)
* `JAVA_XMX` is the maximum memory for the JVM (default: 8g)
@@ -47,6 +47,7 @@ __`performance`__
   `.threadCount` | `String` | `ALL` | [`ThreadCount`](http://usgs.github.io/nshmp-haz/javadoc/index.html?gov/usgs/earthquake/nshmp/calc/ThreadCount.html)
## Notes

1. `hazard.truncationLevel`: This value is only used if the `hazard.exceedanceModel` requires a
   limit (e.g. `TRUNCATION_UPPER_ONLY`)
2. `hazard.gmmUncertainty`: If values for additional epistemic uncertainty on ground motion have
@@ -76,7 +77,8 @@ __`performance`__
## Default Intensity Measure Levels (IMLs)

Units of PGV IMLs are cm/s; all other IMTs are in units of g. Spectral acceleration IMTs that are
not listed use the values of the next highest spectral period.

IMT | IMLs
-----------|-----
@@ -4,12 +4,21 @@ The following provides basic guidance on how to set up command-line use of nshmp
## Required Software

* Java 11 JDK: [Oracle](https://www.oracle.com/java/technologies/javase-jdk11-downloads.html) or
  [Amazon Corretto](https://docs.aws.amazon.com/corretto/latest/corretto-11-ug/downloads-list.html)
* [Git](https://git-scm.com/downloads)
  * Git is a distributed version control system. The USGS uses a [GitLab](https://docs.gitlab.com)
    [instance](https://code.usgs.gov/) to host projects and facilitate sharing and collaborative
    development of code. Git is included in the macOS
    [developer tools](https://developer.apple.com/xcode/).
  * Windows users may want to consider [Git for Windows](https://git-for-windows.github.io) or
    [GitHub Desktop](https://desktop.github.com), both of which include a Linux-like terminal
    (Git BASH) in which subsequent commands listed here will work.

Other project dependencies are managed with [Gradle](https://gradle.org/), which does not
require a separate installation. Gradle is clever about finding Java, but some users may have to
explicitly define a `JAVA_HOME` environment variable. For example, on Unix-like systems with
`bash` as the default shell, one might add the following to `~/.bash_profile`:
```bash
# macOS
export JAVA_HOME="$(/usr/libexec/java_home -v 11)"

# Linux
export JAVA_HOME=/usr/lib/jvm/jdk-11.0.6.jdk
```
On Windows systems, environment variables are set through the `System Properties > Advanced >
Environment Variables...` control panel. Depending on where Java is installed, `JAVA_HOME`
might be:
```bash
JAVA_HOME C:\Program Files\Java\jdk-11.0.6.jdk
```

@@ -26,7 +37,11 @@
## Set Up Git

Follow the [GitLab instructions](https://docs.gitlab.com/ee/topics/git/). Some users may find it
easier to use [Git for Windows](https://git-for-windows.github.io) or
[GitHub Desktop](https://desktop.github.com). These desktop applications install required system
components and are helpful for managing communication between local and remote repositories and
viewing file diffs as one makes changes.
## Get the Code
@@ -37,4 +52,9 @@ git clone https://code.usgs.gov/ghsc/nshmp/nshmp-haz.git
## Eclipse Integration (Optional)

Eclipse provides automatic compilation, syntax highlighting, and integration with Git, among
other useful features. To build or modify *nshmp-haz* using [Eclipse](http://www.eclipse.org/),
install the [Eclipse IDE for Java Developers](https://www.eclipse.org/downloads/packages/) or,
if you plan on developing web services, the
[Eclipse IDE for Enterprise Java and Web Developers](https://www.eclipse.org/downloads/packages/).
Import the project into Eclipse: `File > Import > Gradle > Existing Gradle Project`
### Abstract

Probabilistic seismic hazard analysis (PSHA; Cornell, 1968) is elegant in its relative simplicity.
However, in the more than 40 years since its publication, the methodology has come to be applied
to increasingly complex and non-standard source and ground motion models. For example, the third
Uniform California Earthquake Rupture Forecast ([UCERF3](http://pubs.usgs.gov/of/2013/1165/))
upended the notion of discrete faults as independent sources, and the USGS national seismic hazard
model uses temporally clustered sources. Moreover, as the logic trees typically employed in PSHAs
to capture epistemic uncertainty grow larger, so too does the demand for a more complete
understanding of uncertainty. At the USGS, there are additional requirements to support source
model mining, deaggregation, and map-making, often through the use of dynamic web applications.
Implementations of the PSHA methodology commonly iterate over all sources that influence the
hazard at a site and sequentially build a single hazard curve. Such a linear PSHA computational
pipeline, however, proves difficult to maintain and modify to support the additional complexity of
new models, hazard products, and analyses. The functional programming paradigm offers some relief.
The functional approach breaks calculations down into their component parts or steps, storing
intermediate results as immutable objects, making it easier to: chain actions together; preserve
intermediate data or results that may still be relevant (e.g. as in a deaggregation); and leverage
the concurrency supported by many modern programming languages.
#### Traditional PSHA formulation (after Baker, 2013):

![image](images/psha-formula.png "PSHA formulation of Baker (2013)")
Briefly, the rate, *λ*, of exceeding an intensity measure, *IM*, level may be computed as a
summation of the rate of exceeding such a level for all relevant earthquake sources (discretized
in magnitude, *M*, and distance, *R*). This formulation relies on models of ground motion that
give the probability that an intensity measure level of interest will be exceeded conditioned on
the occurrence of a particular earthquake. Such models are commonly referred to as:
* __Intensity measure relationships__
* __Attenuation relationships__
* __Ground motion prediction equations (GMPEs)__
* __Ground motion models (GMMs)__
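The rate summation described above is commonly written in a discretized form such as the
following (a transcription of the standard formulation after Baker, 2013; the linked image is
the authoritative statement, and its exact symbols may differ slightly):

```latex
\lambda(IM > x) \;=\; \sum_{i=1}^{n_{src}} \lambda(M_i > m_{min})
  \sum_{j=1}^{n_M} \sum_{k=1}^{n_R}
  P(IM > x \mid m_j, r_k)\; P(M_i = m_j)\; P(R_i = r_k)
```

Here the outer sum runs over sources, and the inner sums run over the discretized magnitude and
distance bins of each source; the conditional probability term is supplied by a GMM.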
The parameterization of modern models (e.g. NGA-West2; Bozorgnia et al., 2014) extends to much
more than magnitude and distance, including, but not limited to:
* __Multiple distance metrics__ (e.g. rJB, rRup, rX, rY)
* __Fault geometry__ (e.g. dip, width, rupture depth, hypocentral depth)
@@ -20,19 +41,25 @@ The parameterization of modern models (e.g. NGA-West2; Bozorgnia et al., 2014) e
#### Simple, yes, but used for so much more…

While this formulation is relatively straightforward and is typically presented with examples for
a single site, using a single GMM, and a nominal number of sources, modern PSHAs commonly include:
* Multiple thousands of sources (e.g. the 2014 USGS NSHM in the Central & Eastern US includes all
  smoothed seismicity sources out to 1000 km from a site).
* Different source types, the relative contributions of which are important, and the GMM
  parameterizations of which may be different.
* Sources (and associated ruptures – source filling or floating) represented by logic trees of
  magnitude-frequency distributions (MFDs).
* Source MFDs subject to logic trees of uncertainty on Mmax, total rate (for the individual
  source, or over a region, e.g. as in UCERF3), or other properties of the distribution.
* Logic trees of magnitude scaling relations for each source.
* Source models that do not adhere to the traditional formulation (e.g. cluster models of the
  NSHM).
* Logic trees of ground motion models.
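The logic trees that appear throughout the list above can be represented generically as weighted
branches. A minimal sketch in Java (all class names here are hypothetical stand-ins, not actual
nshmp-haz types):

```java
import java.util.List;

// Hypothetical sketch of a weighted logic-tree branch; not an actual
// nshmp-haz class.
public class LogicTreeSketch {

  static final class Branch<T> {
    final T value;
    final double weight;
    Branch(T value, double weight) {
      this.value = value;
      this.weight = weight;
    }
  }

  public static void main(String[] args) {
    // e.g. epistemic uncertainty on Mmax; branch weights sum to 1.0
    List<Branch<Double>> mMax = List.of(
        new Branch<>(7.5, 0.25),
        new Branch<>(8.0, 0.50),
        new Branch<>(8.5, 0.25));

    // Weights of the alternatives combine to the total model weight
    double totalWeight = mMax.stream().mapToDouble(b -> b.weight).sum();
    System.out.println(totalWeight); // prints 1.0 (0.25 and 0.5 are exact in binary)
  }
}
```

The same generic structure can hold MFDs, scaling relations, or GMMs as branch values.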
#### And further extended to support…

* Response Spectra, Conditional Mean Spectra – multiple intensity measure types (IMTs; e.g. PGA,
  PGD, PGV, multiple SAs)
* Deaggregation
* Banded deaggregation (multiple deaggregations at varying IMLs)
* Maps – many thousands of sites
@@ -40,7 +67,8 @@ While this formulation is relatively straightforward and is typically presented
#### How are such calculations managed?

* PSHA codes typically compute hazard in a linear fashion, looping over all relevant sources for
  a site.
* Adding additional GMMs, logic trees, IMTs, and sites is addressed with more outer loops:
```PHP
foreach IMT {
```

@@ -55,23 +83,37 @@ foreach IMT {

```PHP
  }
}
```
* Support for secondary analyses, such as deaggregation, is supplied by a separate code or codes
  and can require repeating many of the steps performed to generate an initial hazard curve.
#### What about scalability, maintenance, and performance?
* Although scalability can be addressed for secondary products, such as maps, by distributing
  individual site calculations over multiple processors and threads, it is often difficult to
  parallelize an individual site calculation. This hampers one’s ability to leverage multi-core
  systems in the face of ever more complex source and ground motion models and their respective
  logic trees.
* A linear pipeline complicates testing, requiring end-to-end tests rather than tests of discrete
  calculations.
* Multiple codes repeating identical tasks invite error and complicate maintenance by multiple
  individuals.
#### Enter functional programming…
* <http://en.wikipedia.org/wiki/Functional_programming>
* Functional programming languages have been around for some time (e.g. Haskell, Lisp, R), and
  fundamental aspects of functional programming/design are common in many languages. For example,
  a cornerstone of the functional paradigm is the anonymous (or lambda) function; in Matlab, one
  may write `sqr = @(x) x.^2;`.
* In Matlab, one may pass function ‘handles’ (references) to other functions as arguments. This
  is also possible in JavaScript, where such handles serve as callbacks. Given the rise in
  popularity of the functional style, Java 8 recently added constructs in the form of the function
  and streaming APIs, and libraries exist for other languages.
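The Java 8 constructs mentioned above make the Matlab example a one-liner:

```java
import java.util.function.DoubleUnaryOperator;

public class LambdaSketch {
  public static void main(String[] args) {
    // Java 8 analogue of Matlab's `sqr = @(x) x.^2;`
    DoubleUnaryOperator sqr = x -> x * x;
    System.out.println(sqr.applyAsDouble(3.0)); // prints 9.0
  }
}
```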
#### How do PSHA and related calculations leverage such an approach?
Break the traditional PSHA formulation down into discrete steps and preserve the data associated
with each step:
* **[1]** Source & Site parameterization
* **[2]** Ground motion calculation (mean and standard deviation only)
@@ -92,10 +134,12 @@ The functional pipeline can be processed stepwise:
#### Benefits:

* It’s possible to build a single calculation pipeline that will handle a standard hazard curve
  calculation and all of its extensions without repetition.
* Pipeline performance scales with available hardware.
* No redundant code.
* Can add or remove transforms or data at any point in the pipeline, or build new pipelines
  without adversely affecting existing code.
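As a minimal sketch of the stepwise pipeline idea, the steps can be chained with Java streams,
preserving each step’s results as immutable objects. All types here are hypothetical stand-ins
rather than actual nshmp-haz classes, and the ground motion "model" is a toy:

```java
import java.util.List;
import java.util.stream.Collectors;

public class PipelineSketch {

  // Immutable intermediate result of the ground motion step
  static final class GroundMotion {
    final double mean;   // ln(median ground motion), toy value
    final double sigma;  // aleatory standard deviation, toy value
    GroundMotion(double mean, double sigma) {
      this.mean = mean;
      this.sigma = sigma;
    }
  }

  // Step [2]: a toy ground motion "model" in magnitude and distance;
  // not a real GMM.
  static GroundMotion groundMotion(double magnitude, double distance) {
    double mean = -2.0 + 0.9 * magnitude - 1.1 * Math.log(distance);
    return new GroundMotion(mean, 0.65);
  }

  public static void main(String[] args) {
    // Step [1]: sources, parameterized here by magnitude only
    List<Double> magnitudes = List.of(5.5, 6.0, 6.5, 7.0);

    // Steps chained as stream transforms; the intermediate list of
    // ground motions is preserved and could feed a deaggregation
    // without recomputation.
    List<GroundMotion> gms = magnitudes.stream()
        .map(m -> groundMotion(m, 15.0))
        .collect(Collectors.toList());

    System.out.println(gms.size()); // prints 4, one per source
  }
}
```

Because each stage is a pure transform over immutable data, the same stream can be run in
parallel or extended with further stages without touching earlier ones.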
#### Drawbacks:
@@ -104,6 +148,9 @@ The functional pipeline can be processed stepwise:
#### References

* Baker, J.W. (2013). An Introduction to Probabilistic Seismic Hazard Analysis (PSHA), White
  Paper, Version 2.0, 79 pp.
* Bozorgnia, Y., et al. (2014). NGA-West2 Research Project, *Earthquake Spectra*, Vol. 30, No. 3,
  pp. 973–987.
* Cornell, C.A. (1968). Engineering seismic risk analysis, *Bulletin of the Seismological Society
  of America*, Vol. 58, No. 5, pp. 1583–1606.
# Sidebar

[Home](home)
[Building & Running](building-&-running)
   [Configuration](calculation-configuration)
@@ -13,6 +15,6 @@
[API Docs](http://usgs.github.io/nshmp-haz/javadoc)
[License](https://github.com/usgs/nshmp-haz/blob/master/LICENSE.md)
![USGS logo](images/usgs-icon.png) &nbsp;[U.S. Geological Survey](https://www.usgs.gov)
National Seismic Hazard Mapping Project ([NSHMP](https://earthquake.usgs.gov/hazards/))