Commit 7a01d24c authored by Asquith, William H.

renamed to .md and edits to the markdown format

parent 94186ccf
README (path ./visGWDBmrva/READMEoutput.md)
Authors: William H. Asquith, Ronald C. Seanor
Contributors: Virginia L. McGuire, Wade H. Kress
Point of contact: William H. Asquith (wasquith@usgs.gov)
***
This README file is about the output of the visGWDB groundwater informatics application
in the R language. This README file is subordinate to the `README.md` file in this same
directory.
Asquith, W.H., Seanor, R.C., McGuire, V.L., and Kress, W.H., 2020.
Methods to quality assure, plot, summarize, interpolate, and extend
groundwater-level information---Examples for the Mississippi River Valley
alluvial aquifer: USGS bureau approved and revision to
Environmental Modelling & Software.
This README file is located at the root level of visGWDB because the `./output/`
subdirectory can be freely deleted as it is created at run time. The `./output/`
subdirectory should contain the following files and subdirectories. There are or have
been many inquiries into groundwater level informatics. To this end, the output from
visGWDB is multifaceted and described herein. For reference, the `./output20190901.zip`
file contains a compressed version of the `./output/` subdirectory for a default test of the
visGWDB framework.
***
# MANIFEST OF SUBDIRECTORIES AND FILES:
1. `./hydrograph_legend.pdf` --- This portable document format graphic is made by the `../include/visGWDB_hydrolegend.R` script. The `README.md` file within the `../include/` subdirectory has a few more details.
2. `./outliers/` --- Part of the visGWDB development motivation was the detection of outliers in the database. This subdirectory contains salient information, and the information therein is dependent on the contents of the GWtrim object, the wells processed in the master loop within `visGWDB.R`, and other outlier and statistics settings in `../include/visGWDB_control.R`. The subdirectory contains a tab-delimited table of the outliers (if any are found) in the file `GWoutliers.txt`. The directory also contains two RData binary files of the `GWoutliers.txt` table. `GWoutliers.RData` is just a version of this table, and `spGWoutliers.RData` is a format configured for the _R_ package **sp** (spatial analysis) (Pebesma and Bivand, 2005, 2018).
3. `./pdfs/` --- This directory contains the well-specific hydrographs. One portable document format file per well will be present. Graphical settings are found in `../include/visGWDB_control.R`. The critical logical switches for these are `manyPDFs` and `showVIS`, which are already set within the `visGWDB.R` script.
*** An important point about the support vector machine (SVM) lines, and by association
the identification of 'outliers' by this technique, is that the method has a modest
amount of stochastic behavior. This means that subsequent runs will
show slightly different results. This is associated with how the SVM passes through
its input data set during support vector identification in relation to k-fold
cross validation. Because outliers are identified in an either-in-or-out style,
the list of outliers by the SVM could change slightly. The visGWDB implementation
of the package kernlab SVM does not use k-fold cross validation because of the
enormous computational run times when large databases with many wells are processed.
***
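
Because the SVM-based outlier list can differ slightly between runs, it can be useful to compare the outlier tables from two runs. The following R sketch is illustrative only: the column names `site` and `date` and the small in-memory tables are assumptions, not the actual layout of `GWoutliers.txt`, so consult the header of that file for the real column names.

```r
# Hypothetical outlier tables from two runs. In practice each might be read
# from a run's tab-delimited GWoutliers.txt with read.delim().
# The column names 'site' and 'date' are assumptions for illustration.
run1 <- data.frame(site = c("A", "A", "B"),
                   date = c("2001-05-01", "2003-07-15", "2010-01-20"),
                   stringsAsFactors = FALSE)
run2 <- data.frame(site = c("A", "B", "B"),
                   date = c("2001-05-01", "2010-01-20", "2012-03-02"),
                   stringsAsFactors = FALSE)

# Build one key per flagged measurement so the runs can be compared as sets.
key1 <- paste(run1$site, run1$date, sep = "|")
key2 <- paste(run2$site, run2$date, sep = "|")

both      <- intersect(key1, key2)  # flagged in both runs
only_run1 <- setdiff(key1, key2)    # flagged only in the first run
only_run2 <- setdiff(key2, key1)    # flagged only in the second run
```

Measurements appearing in `both` are flagged consistently, whereas those in `only_run1` or `only_run2` are the borderline cases affected by the run-to-run variation.
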
4. `./pobs/` --- This directory contains the pseudo-observations. These are not actually named in the paper by Asquith and others (2020). However, the computations are described therein in the context of site-specific and regional-specific generalized additive models (GAM) and support vector machines. These pseudo-observations are estimates from all four of these models for the dates of the observations. These are then combined by a weighted mean to form the PSEUDO estimates in the far right columns of the files. The directory contains comma-separated files with column names summarized in the file `./rgam/aHEADER.txt`. The regional GAM estimates are also written in the `./output/rgam/` directory if the logical switches in the code are set. The site-specific GAM and site-specific SVM are set to missing if the absolute difference between the models is less than the `check.delta.between.models` setting in `../include/visGWDB_control.R`. The same rule applies to the regional GAM and regional SVM values.
5. `./pozo/` --- This directory contains the monthly estimates of water levels. These are the estimates that form the central theme of the paper by Asquith and others (2020). The directory contains comma-separated files with column names summarized in the file `./pozo/aHEADER.txt`. The critical logical switch to create these output files is `actually.writePOZO` in `../include/visGWDB_control.R`, along with additional logical switches that are already set in the `visGWDB.R` script. The time range, and even the choice of a monthly time step, for the contents of this directory are controlled by the `../include/visGWDB_auxcode.R` file. The site-specific GAM and site-specific SVM are set to missing if the absolute difference between the models is less than the `check.delta.between.models` setting in `../include/visGWDB_control.R`. The same rule applies to the regional GAM and regional SVM values.
6. `./rgam/` --- This directory contains information on the regional generalized additive model (rGAM) time-series estimation of pseudo-observations for the dates of the measurements. This model is based on the size of the neighborhood around the target or subject well, a minimum number of samples, and the maximum number of wells to permit into the statistics. The settings are in `../include/visGWDB_control.R`. The directory contains comma-separated files with column names summarized in the file `./rgam/aHEADER.txt`. This directory does not contain the regional support vector machine (rSVM) estimates. Because rSVMs do not have elaborate diagnostics on prediction uncertainties relative to the rGAMs, it was felt that having these rGAM output tables could be useful to others. The contents of this directory are not the subject of the paper by Asquith and others (2020), though the computations are described indirectly in that paper. The critical logical switch to create these output files is `actually.writePOBSRGAM` in `../include/visGWDB_control.R`, along with additional logical switches that are already set in `visGWDB.R`. The estimates are also within the files of the `./output/pobs/` subdirectory.
7. `./stats/` --- This directory contains several different types of tab-delimited files. The headers in the files explain the columns therein. The tables have one record per well and hence are not time-series oriented like the contents of the other subdirectories. Also, a couple of RData binary files are provided as versions of the statistics table `GWstats.txt`. These are `GWstats.RData` and `spGWstats.RData`, of which the latter is for the _R_ package **sp** (spatial analysis) (Pebesma and Bivand, 2005, 2018). Note that not all of these files will be present, as their presence can be dependent on other code settings.
   * The `GWstats.txt` table is a major output of visGWDB. The table contains one record per well, with many statistics for each well along with results of the L-moments, hypothesis testing, and other features described in Asquith and others (2020).
   * The `GWlatlon.txt` file is a simple file of the site information by site badge, with the decimal latitude and longitude referenced to the North American Datum of 1983 (NAD83).
   * The `GWmeta.txt` file contains the metadata for the wells (non time series) processed, whereas an overall summary of the GWtrim object used by visGWDB (the database) is provided in `GWtrim.pdf` (still experimental) and within `GWtrim.txt`.
   * The `GWtrim.pdf` file contains boxplots showing the distribution of water-level altitudes for all years, then for all months, and then by level-source agency code.
   * The `GWtrim.txt` file contains summary statistics of the input database.
   * The `GWwave.txt` file is an experimental table (mostly for the developer) in which the trigonometric results of the site-specific and regional generalized additive modeling of the water levels are preserved.
8. `./vartmp/` --- This directory serves as a temporary directory used by the `../include/visGWDBpostMloop.R` script. This directory does not really contain user-level information for further inquiries, and is to be thought of as analogous in purpose to the `/var/tmp/` directory on Unix-like operating systems.
9. `./zz_sessionInfo.txt` (a text file) --- This is a summary of the R session information. The version used by the author (Asquith) during locking down of this code base is shown in the `README.md` file. The `zz_` prefix is used to force the file to sort to the bottom of the directory and otherwise not to be threaded among the subdirectories.
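
The combination of model estimates described above for `./pobs/` and `./pozo/` can be sketched in R as follows. This is a minimal sketch under stated assumptions, not the visGWDB implementation: the function name `combine_pseudo`, its argument names, and the equal default weights are hypothetical, and only the setting name `check.delta.between.models` comes from this README. The masking rule follows the wording of this README (a model pair is set to missing when its absolute difference is less than the setting).

```r
# Hypothetical helper: combine four model estimates for one date into a single
# pseudo-observation. The equal weights are an assumption for illustration;
# the weights actually used by visGWDB are not stated in this README.
combine_pseudo <- function(gam, svm, rgam, rsvm,
                           check.delta.between.models,
                           weights = c(1, 1, 1, 1)) {
  # Site-specific pair: set both to missing per the rule worded in this README.
  if (isTRUE(abs(gam - svm) < check.delta.between.models)) {
    gam <- NA_real_
    svm <- NA_real_
  }
  # Regional pair: the same rule is applied.
  if (isTRUE(abs(rgam - rsvm) < check.delta.between.models)) {
    rgam <- NA_real_
    rsvm <- NA_real_
  }
  est  <- c(gam, svm, rgam, rsvm)
  keep <- !is.na(est)
  if (!any(keep)) return(NA_real_)  # nothing left to average
  # Weighted mean over the estimates that remain.
  weighted.mean(est[keep], weights[keep])
}

# Example with made-up water-level altitudes and a made-up setting of 1.
pseudo <- combine_pseudo(gam = 100, svm = 100.5, rgam = 101, rsvm = 104,
                         check.delta.between.models = 1)
```

In this example the site-specific pair differs by 0.5 and is set to missing, so `pseudo` is the weighted mean of the two regional estimates.
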
# REFERENCES
Pebesma, E.J., Bivand, R.S., 2005, Classes and methods for spatial data in R,
R News, v. 5, no. 2, accessed May 6, 2019, at
https://cran.r-project.org/doc/Rnews/.

Pebesma, E.J., Bivand, R.S., 2018, sp---Classes and methods for spatial data,
R package version 1.3-1, dated June 5, 2018, accessed May 6, 2019, at
https://CRAN.R-project.org/package=sp.