Commit 73f407dd authored by Laura A DeCicco

Merge pull request #135 from ldecicco-USGS/master

Minimize examples to web services
parents c59ec8c0 474e6c08
......@@ -2,7 +2,7 @@ Package: dataRetrieval
Type: Package
Title: Retrieval Functions for USGS and EPA Hydrologic and Water Quality Data
Version: 2.3.1
Date: 2015-07-17
Date: 2015-08-13
Authors@R: c( person("Robert", "Hirsch", role = c("aut"),
email = "rhirsch@usgs.gov"),
person("Laura", "DeCicco", role = c("aut","cre"),
......@@ -36,4 +36,5 @@ Suggests:
testthat
VignetteBuilder: knitr
BuildVignettes: true
BugReports: https://github.com/USGS-R/dataRetrieval/issues
URL: https://github.com/USGS-R/dataRetrieval, http://pubs.usgs.gov/tm/04/a10/
......@@ -84,16 +84,17 @@
#' inactiveAndAcitive <- constructNWISURL(inactiveAndAcitive, "00060", "2014-01-01", "2014-01-10",'dv')
#' inactiveAndAcitive <- importWaterML1(inactiveAndAcitive)
#'
#' Timezone change with specified local timezone:
#' tzURL <- constructNWISURL("04027000", c("00300","63680"), "2011-11-05", "2011-11-07","uv")
#' tzIssue <- importWaterML1(tzURL, TRUE, "America/Chicago")
#'
#'
#' }
#' filePath <- system.file("extdata", package="dataRetrieval")
#' fileName <- "WaterML1Example.xml"
#' fullPath <- file.path(filePath, fileName)
#' imporFile <- importWaterML1(fullPath,TRUE)
#'
#'#Timezone change with specified local timezone:
#' tzURL <- constructNWISURL("04027000", c("00300","63680"), "2011-11-05", "2011-11-07","uv")
#' tzIssue <- importWaterML1(tzURL, TRUE, "America/Chicago")
#'
importWaterML1 <- function(obs_url,asDateTime=FALSE, tz=""){
if(file.exists(obs_url)){
......
#' Daily Value USGS NWIS Data Retrieval
#'
#' Imports data from NWIS web service. This function gets the data from here: \url{http://waterservices.usgs.gov/}
#' Information on parameter and statistics codes can be found here: \url{http://help.waterdata.usgs.gov}
#'
#' @param siteNumber character USGS site number. This is usually an 8 digit number. Multiple sites can be requested with a character vector.
#' @param parameterCd character of USGS parameter code(s). This is usually an 5 digit number.
......
......@@ -19,8 +19,7 @@
#' @param \dots named arguments for the base name for any other parameter code. The
#'form of the name must be like pXXXXX, where XXXXX is the parameter code.
#' @return A dataset like \code{data} with selected columns renamed.
#' @note The following statistics codes are converted by \code{renameNWISColumns}. See
#'\url{http://help.waterdata.usgs.gov} for more information about USGS codes.
#' @note The following statistics codes are converted by \code{renameNWISColumns}.
#'\describe{
#'\item{00001}{Maximum value, suffix: Max}
#'\item{00002}{Minimum value, suffix: Min}
......
......@@ -77,3 +77,6 @@ This software is provided "AS IS."
[
![CC0](http://i.creativecommons.org/p/zero/1.0/88x31.png)
](http://creativecommons.org/publicdomain/zero/1.0/)
CRAN statistics:
[![](http://cranlogs.r-pkg.org/badges/dataRetrieval)](http://cran.rstudio.com/web/packages/dataRetrieval/index.html)
......@@ -241,18 +241,15 @@ legend("topleft", variableInfo$param_units,
# parameterCd=c("00010","00060"),
# hasDataTypeCd="dv")
## ----dataExample------------------------------------------
dischargeWI <- readNWISdata(service="dv",
stateCd="WI",
parameterCd="00060",
drainAreaMin="50",
statCd="00003")
names(dischargeWI)
nrow(dischargeWI)
siteInfo <- attr(dischargeWI, "siteInfo")
head(siteInfo)
## ----dataExample, eval=FALSE------------------------------
# dischargeWI <- readNWISdata(service="dv",
# stateCd="WI",
# parameterCd="00060",
# drainAreaMin="50",
# statCd="00003")
#
# siteInfo <- attr(dischargeWI, "siteInfo")
#
## ----NJChloride, eval=FALSE-------------------------------
#
......@@ -266,28 +263,27 @@ head(siteInfo)
# characteristicName="pH")
#
## ----meta1, eval=TRUE-------------------------------------
attr(dischargeWI, "url")
attr(dischargeWI, "queryTime")
siteInfo <- attr(dischargeWI, "siteInfo")
## ----meta2, eval=TRUE-------------------------------------
names(attributes(dischargeWI))
## ----meta3, eval=TRUE-------------------------------------
siteInfo <- attr(dischargeWI, "siteInfo")
head(siteInfo)
variableInfo <- attr(dischargeWI, "variableInfo")
## ----meta1, eval=FALSE------------------------------------
#
# attr(dischargeWI, "url")
#
# attr(dischargeWI, "queryTime")
#
# siteInfo <- attr(dischargeWI, "siteInfo")
#
## ----meta2, eval=FALSE------------------------------------
#
# names(attributes(dischargeWI))
#
## ----meta3, eval=FALSE------------------------------------
#
# siteInfo <- attr(dischargeWI, "siteInfo")
#
# variableInfo <- attr(dischargeWI, "variableInfo")
#
#
## ----meta5, eval=FALSE------------------------------------
# comment(peakData)
......
......@@ -2,7 +2,7 @@
%\VignetteEngine{knitr::knitr}
%\VignetteDepends{}
%\VignetteSuggests{xtable, testthat}
%\VignetteImports{XML, RCurl, reshape2,lubridate,plyr}
%\VignetteImports{XML, RCurl, reshape2,lubridate,plyr,utils,stats}
%\VignettePackage{dataRetrieval}
\documentclass[a4paper,11pt]{article}
......@@ -182,9 +182,7 @@ library(dataRetrieval)
%------------------------------------------------------------
\section{Introduction to dataRetrieval}
%------------------------------------------------------------
The dataRetrieval package was created to simplify the process of loading hydrologic data into the R environment. It is designed to retrieve the major data types of U.S. Geological Survey (USGS) hydrologic data that are available on the Web, as well as data from the Water Quality Portal (WQP), which currently houses water quality data from the Environmental Protection Agency (EPA), U.S. Department of Agriculture (USDA), and USGS. Direct USGS data is obtained from a service called the National Water Information System (NWIS). A lot of useful information about NWIS can be obtained here:
\url{http://help.waterdata.usgs.gov/}
The dataRetrieval package was created to simplify the process of loading hydrologic data into the R environment. It is designed to retrieve the major data types of U.S. Geological Survey (USGS) hydrologic data that are available on the Web, as well as data from the Water Quality Portal (WQP), which currently houses water quality data from the Environmental Protection Agency (EPA), U.S. Department of Agriculture (USDA), and USGS. Direct USGS data is obtained from a service called the National Water Information System (NWIS).
For information on getting started in R and installing the package, see (\ref{sec:appendix1}): Getting Started. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government.
......@@ -281,13 +279,8 @@ The USGS organizes hydrologic data in a standard structure. Streamgages are loc
Once the siteNumber is known, the next required input for USGS data retrievals is the \enquote{parameter code}. This is a 5-digit code that specifies the measured parameter being requested. For example, parameter code 00631 represents \enquote{Nitrate plus nitrite, water, filtered, milligrams per liter as nitrogen}, with units of \enquote{mg/l as N}.
A useful place to discover USGS codes information, along with other NWIS information is:
\url{http://help.waterdata.usgs.gov/codes-and-parameters}
Not every station will measure all parameters. A short list of commonly measured parameters is shown in Table \ref{tab:params}.
<<tableParameterCodes, echo=FALSE,results='asis'>>=
pCode <- c('00060', '00065', '00010','00045','00400')
shortName <- c("Discharge [ft$^3$/s]","Gage height [ft]","Temperature [C]", "Precipitation [in]", "pH")
......@@ -315,9 +308,7 @@ Casrn stands for \enquote{Chemical Abstracts Service (CAS) Registry Number}. Mor
\url{http://www.cas.org/content/chemical-substances/faqs}
For unit values data (sensor data measured at regular time intervals such as 15 minutes or hourly), knowing the parameter code and siteNumber is enough to make a request for data. For most variables that are measured on a continuous basis, the USGS also stores the historical data as daily values. These daily values are statistical summaries of the continuous data, e.g. maximum, minimum, mean, or median. The different statistics are specified by a 5-digit statistics code. A complete list of statistic codes can be found here:
\url{http://help.waterdata.usgs.gov/code/stat_cd_nm_query?stat_nm_cd=%25&fmt=html&inline=true}
For unit values data (sensor data measured at regular time intervals such as 15 minutes or hourly), knowing the parameter code and siteNumber is enough to make a request for data. For most variables that are measured on a continuous basis, the USGS also stores the historical data as daily values. These daily values are statistical summaries of the continuous data, e.g. maximum, minimum, mean, or median. The different statistics are specified by a 5-digit statistics code.
Some common codes are shown in Table \ref{tab:stat}.
......@@ -455,9 +446,6 @@ parameterCd <- "00618"
parameterINFO <- readNWISpCode(parameterCd)
@
More information on parameter codes can obtained from:
\url{http://help.waterdata.usgs.gov/codes-and-parameters/parameters}
\FloatBarrier
......@@ -481,9 +469,7 @@ discharge <- readNWISdv(siteNumber,
parameterCd, startDate, endDate)
@
The column \texttt{"}datetime\texttt{"} in the returned data frame is automatically imported as a variable of class \texttt{"}Date\texttt{"} in R. Each requested parameter has a value and remark code column. The names of these columns depend on the requested parameter and stat code combinations. USGS daily value qualification codes are often \texttt{"}A\texttt{"} (approved for publication) or \texttt{"}P\texttt{"} (provisional data subject to revision). A more complete list of daily value qualification codes can be found here:
\url{http://help.waterdata.usgs.gov/codes-and-parameters/daily-value-qualification-code-dv_rmk_cd}
The column \texttt{"}datetime\texttt{"} in the returned data frame is automatically imported as a variable of class \texttt{"}Date\texttt{"} in R. Each requested parameter has a value and remark code column. The names of these columns depend on the requested parameter and stat code combinations. USGS daily value qualification codes are often \texttt{"}A\texttt{"} (approved for publication) or \texttt{"}P\texttt{"} (provisional data subject to revision).
Another example would be a request for mean and maximum daily temperature and discharge in early 2012:
......@@ -757,17 +743,14 @@ For NWIS data, the function \texttt{readNWISdata} can be used. The argument list
The \texttt{"..."} argument allows the user to create their own queries based on the instructions found in the web links above. The links provide instructions on how to create a URL to request data. Perhaps you want sites only in Wisconsin, with a drainage area greater than 50 mi$^2$, and the most recent daily discharge data. That request would be done as follows:
<<dataExample>>=
<<dataExample, eval=FALSE>>=
dischargeWI <- readNWISdata(service="dv",
stateCd="WI",
parameterCd="00060",
drainAreaMin="50",
statCd="00003")
names(dischargeWI)
nrow(dischargeWI)
siteInfo <- attr(dischargeWI, "siteInfo")
head(siteInfo)
@
......@@ -815,7 +798,7 @@ dataPH <- readWQPdata(statecode="US:55",
%------------------------------------------------------------
All data frames returned from the Web services have some form of associated metadata. This information is included as attributes to the data frame. All data frames will have a \texttt{url} (returning a character of the url used to obtain the data), \texttt{siteInfo} (returning a data frame with information on sites), and \texttt{queryTime} (returning a POSIXct datetime) attributes. For example, the url and query time used to obtain the data can be found as follows:
<<meta1, eval=TRUE>>=
<<meta1, eval=FALSE>>=
attr(dischargeWI, "url")
......@@ -827,7 +810,7 @@ siteInfo <- attr(dischargeWI, "siteInfo")
Depending on the format that the data was obtained (RDB, WaterML1, etc), there will be additional information embedded in the data frame as attributes. To discover the available attributes:
<<meta2, eval=TRUE>>=
<<meta2, eval=FALSE>>=
names(attributes(dischargeWI))
......@@ -835,10 +818,9 @@ names(attributes(dischargeWI))
For data obtained from \texttt{readNWISuv}, \texttt{readNWISdv}, \texttt{readNWISgwl} there are two attributes that are particularly useful: \texttt{siteInfo} and \texttt{variableInfo}.
<<meta3, eval=TRUE>>=
<<meta3, eval=FALSE>>=
siteInfo <- attr(dischargeWI, "siteInfo")
head(siteInfo)
variableInfo <- attr(dischargeWI, "variableInfo")
......
......@@ -88,15 +88,16 @@ inactiveAndAcitive <- c("07334200","05212700")
inactiveAndAcitive <- constructNWISURL(inactiveAndAcitive, "00060", "2014-01-01", "2014-01-10",'dv')
inactiveAndAcitive <- importWaterML1(inactiveAndAcitive)
Timezone change with specified local timezone:
tzURL <- constructNWISURL("04027000", c("00300","63680"), "2011-11-05", "2011-11-07","uv")
tzIssue <- importWaterML1(tzURL, TRUE, "America/Chicago")
}
filePath <- system.file("extdata", package="dataRetrieval")
fileName <- "WaterML1Example.xml"
fullPath <- file.path(filePath, fileName)
imporFile <- importWaterML1(fullPath,TRUE)
#Timezone change with specified local timezone:
tzURL <- constructNWISURL("04027000", c("00300","63680"), "2011-11-05", "2011-11-07","uv")
tzIssue <- importWaterML1(tzURL, TRUE, "America/Chicago")
}
\seealso{
\code{\link{renameNWISColumns}}
......
......@@ -48,7 +48,6 @@ queryTime \tab POSIXct \tab The time the data was returned \cr
}
\description{
Imports data from NWIS web service. This function gets the data from here: \url{http://waterservices.usgs.gov/}
Information on parameter and statistics codes can be found here: \url{http://help.waterdata.usgs.gov}
}
\examples{
siteNumber <- '04085427'
......
......@@ -44,8 +44,7 @@ function reads information from the header and the arguments in the call to
to rename those columns.
}
\note{
The following statistics codes are converted by \code{renameNWISColumns}. See
\url{http://help.waterdata.usgs.gov} for more information about USGS codes.
The following statistics codes are converted by \code{renameNWISColumns}.
\describe{
\item{00001}{Maximum value, suffix: Max}
\item{00002}{Minimum value, suffix: Min}
......
......@@ -36,15 +36,16 @@ test_that("External importRDB1 tests", {
expect_is(qwData$sample_dt, 'Date')
expect_is(qwData$startDateTime, 'POSIXct')
iceSite <- '04024430'
start <- "2014-11-09"
end <- "2014-11-28"
urlIce <- constructNWISURL(iceSite,"00060",start, end,"uv",format="tsv")
ice <- importRDB1(urlIce)
expect_that(sum(is.na(ice$X01_00060)) > 0, is_true())
iceNoConvert <- importRDB1(urlIce, convertType=FALSE)
expect_that(sum(iceNoConvert$X01_00060 == "Ice") > 0, is_true())
#This data got deleted:
# iceSite <- '04024430'
# start <- "2014-11-09"
# end <- "2014-11-28"
# urlIce <- constructNWISURL(iceSite,"00060",start, end,"uv",format="tsv")
# ice <- importRDB1(urlIce)
# expect_that(sum(is.na(ice$X01_00060)) > 0, is_true())
#
# iceNoConvert <- importRDB1(urlIce, convertType=FALSE)
# expect_that(sum(iceNoConvert$X01_00060 == "Ice") > 0, is_true())
})
context("importRDB")
......
......@@ -182,9 +182,7 @@ library(dataRetrieval)
%------------------------------------------------------------
\section{Introduction to dataRetrieval}
%------------------------------------------------------------
The dataRetrieval package was created to simplify the process of loading hydrologic data into the R environment. It is designed to retrieve the major data types of U.S. Geological Survey (USGS) hydrologic data that are available on the Web, as well as data from the Water Quality Portal (WQP), which currently houses water quality data from the Environmental Protection Agency (EPA), U.S. Department of Agriculture (USDA), and USGS. Direct USGS data is obtained from a service called the National Water Information System (NWIS). A lot of useful information about NWIS can be obtained here:
\url{http://help.waterdata.usgs.gov/}
The dataRetrieval package was created to simplify the process of loading hydrologic data into the R environment. It is designed to retrieve the major data types of U.S. Geological Survey (USGS) hydrologic data that are available on the Web, as well as data from the Water Quality Portal (WQP), which currently houses water quality data from the Environmental Protection Agency (EPA), U.S. Department of Agriculture (USDA), and USGS. Direct USGS data is obtained from a service called the National Water Information System (NWIS).
For information on getting started in R and installing the package, see (\ref{sec:appendix1}): Getting Started. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government.
......@@ -281,13 +279,8 @@ The USGS organizes hydrologic data in a standard structure. Streamgages are loc
Once the siteNumber is known, the next required input for USGS data retrievals is the \enquote{parameter code}. This is a 5-digit code that specifies the measured parameter being requested. For example, parameter code 00631 represents \enquote{Nitrate plus nitrite, water, filtered, milligrams per liter as nitrogen}, with units of \enquote{mg/l as N}.
A useful place to discover USGS codes information, along with other NWIS information is:
\url{http://help.waterdata.usgs.gov/codes-and-parameters}
Not every station will measure all parameters. A short list of commonly measured parameters is shown in Table \ref{tab:params}.
<<tableParameterCodes, echo=FALSE,results='asis'>>=
pCode <- c('00060', '00065', '00010','00045','00400')
shortName <- c("Discharge [ft$^3$/s]","Gage height [ft]","Temperature [C]", "Precipitation [in]", "pH")
......@@ -315,9 +308,7 @@ Casrn stands for \enquote{Chemical Abstracts Service (CAS) Registry Number}. Mor
\url{http://www.cas.org/content/chemical-substances/faqs}
For unit values data (sensor data measured at regular time intervals such as 15 minutes or hourly), knowing the parameter code and siteNumber is enough to make a request for data. For most variables that are measured on a continuous basis, the USGS also stores the historical data as daily values. These daily values are statistical summaries of the continuous data, e.g. maximum, minimum, mean, or median. The different statistics are specified by a 5-digit statistics code. A complete list of statistic codes can be found here:
\url{http://help.waterdata.usgs.gov/code/stat_cd_nm_query?stat_nm_cd=%25&fmt=html&inline=true}
For unit values data (sensor data measured at regular time intervals such as 15 minutes or hourly), knowing the parameter code and siteNumber is enough to make a request for data. For most variables that are measured on a continuous basis, the USGS also stores the historical data as daily values. These daily values are statistical summaries of the continuous data, e.g. maximum, minimum, mean, or median. The different statistics are specified by a 5-digit statistics code.
Some common codes are shown in Table \ref{tab:stat}.
......@@ -455,9 +446,6 @@ parameterCd <- "00618"
parameterINFO <- readNWISpCode(parameterCd)
@
More information on parameter codes can obtained from:
\url{http://help.waterdata.usgs.gov/codes-and-parameters/parameters}
\FloatBarrier
......@@ -481,9 +469,7 @@ discharge <- readNWISdv(siteNumber,
parameterCd, startDate, endDate)
@
The column \texttt{"}datetime\texttt{"} in the returned data frame is automatically imported as a variable of class \texttt{"}Date\texttt{"} in R. Each requested parameter has a value and remark code column. The names of these columns depend on the requested parameter and stat code combinations. USGS daily value qualification codes are often \texttt{"}A\texttt{"} (approved for publication) or \texttt{"}P\texttt{"} (provisional data subject to revision). A more complete list of daily value qualification codes can be found here:
\url{http://help.waterdata.usgs.gov/codes-and-parameters/daily-value-qualification-code-dv_rmk_cd}
The column \texttt{"}datetime\texttt{"} in the returned data frame is automatically imported as a variable of class \texttt{"}Date\texttt{"} in R. Each requested parameter has a value and remark code column. The names of these columns depend on the requested parameter and stat code combinations. USGS daily value qualification codes are often \texttt{"}A\texttt{"} (approved for publication) or \texttt{"}P\texttt{"} (provisional data subject to revision).
Another example would be a request for mean and maximum daily temperature and discharge in early 2012:
......@@ -757,17 +743,14 @@ For NWIS data, the function \texttt{readNWISdata} can be used. The argument list
The \texttt{"..."} argument allows the user to create their own queries based on the instructions found in the web links above. The links provide instructions on how to create a URL to request data. Perhaps you want sites only in Wisconsin, with a drainage area greater than 50 mi$^2$, and the most recent daily discharge data. That request would be done as follows:
<<dataExample>>=
<<dataExample, eval=FALSE>>=
dischargeWI <- readNWISdata(service="dv",
stateCd="WI",
parameterCd="00060",
drainAreaMin="50",
statCd="00003")
names(dischargeWI)
nrow(dischargeWI)
siteInfo <- attr(dischargeWI, "siteInfo")
head(siteInfo)
@
......@@ -815,7 +798,7 @@ dataPH <- readWQPdata(statecode="US:55",
%------------------------------------------------------------
All data frames returned from the Web services have some form of associated metadata. This information is included as attributes to the data frame. All data frames will have a \texttt{url} (returning a character of the url used to obtain the data), \texttt{siteInfo} (returning a data frame with information on sites), and \texttt{queryTime} (returning a POSIXct datetime) attributes. For example, the url and query time used to obtain the data can be found as follows:
<<meta1, eval=TRUE>>=
<<meta1, eval=FALSE>>=
attr(dischargeWI, "url")
......@@ -827,7 +810,7 @@ siteInfo <- attr(dischargeWI, "siteInfo")
Depending on the format that the data was obtained (RDB, WaterML1, etc), there will be additional information embedded in the data frame as attributes. To discover the available attributes:
<<meta2, eval=TRUE>>=
<<meta2, eval=FALSE>>=
names(attributes(dischargeWI))
......@@ -835,10 +818,9 @@ names(attributes(dischargeWI))
For data obtained from \texttt{readNWISuv}, \texttt{readNWISdv}, \texttt{readNWISgwl} there are two attributes that are particularly useful: \texttt{siteInfo} and \texttt{variableInfo}.
<<meta3, eval=TRUE>>=
<<meta3, eval=FALSE>>=
siteInfo <- attr(dischargeWI, "siteInfo")
head(siteInfo)
variableInfo <- attr(dischargeWI, "variableInfo")
......