diff --git a/DESCRIPTION b/DESCRIPTION index 305373212f2af53b30d35e7b7dd38d27b7290025..8293f31e2c91da73e535941cdef4798e62c7d21b 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -40,7 +40,8 @@ Suggests: rmarkdown, testthat, knitr, - covr + covr, + adc BugReports: https://code.usgs.gov/water/wrtdsplus/-/issues VignetteBuilder: knitr BuildVignettes: true diff --git a/vignettes/workflow-1_add-custom-variable.Rmd b/vignettes/workflow-1_add-custom-variable.Rmd index abae6bcca81266108016ed3a276aa1b04c60aca5..06b49c42c0edc0331182401218c7fda304d8814e 100644 --- a/vignettes/workflow-1_add-custom-variable.Rmd +++ b/vignettes/workflow-1_add-custom-variable.Rmd @@ -1,9 +1,8 @@ --- title: "Workflow 1: Adding Custom Variable" -date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > - %\VignetteIndexEntry{workflow-1_add-custom-variable} + %\VignetteIndexEntry{Workflow 1: Adding Custom Variable} %\VignetteEncoding{UTF-8} %\VignetteEngine{knitr::rmarkdown} editor_options: @@ -19,7 +18,3 @@ knitr::opts_chunk$set( ) ``` -```{r} -library(WRTDSplus) -``` - diff --git a/vignettes/workflow-2_replace-discharge.Rmd b/vignettes/workflow-2_replace-discharge.Rmd index a09a4d7081c1eabfd3d4cf454f96ed7ede15c5b9..a41edfd114065438bdfd8cdb43d008e5687a0278 100644 --- a/vignettes/workflow-2_replace-discharge.Rmd +++ b/vignettes/workflow-2_replace-discharge.Rmd @@ -2,7 +2,7 @@ title: "Workflow 2: Replace Discharge with Custom Variable" output: rmarkdown::html_vignette vignette: > - %\VignetteIndexEntry{workflow-2_replace-discharge} + %\VignetteIndexEntry{Workflow 2: Replace Discharge with Custom Variable} %\VignetteEncoding{UTF-8} %\VignetteEngine{knitr::rmarkdown} editor_options: @@ -14,10 +14,180 @@ knitr::opts_chunk$set( collapse = TRUE, message = FALSE, warning = FALSE, - comment = "#>" + comment = "#>", + eval = requireNamespace("adc") ) ``` +# Overview + +This workflow describes how to substitute discharge with a custom variable using the [WRTDSplus package](https://rconnect.chs.usgs.gov/WRTDSplus/). To do this we add a custom variable as a column named 'Z' to both the `Daily` and `Sample` dataframes within an `eList` object. Then we run the Weighted Regressions on Time, Discharge, and Season (WRTDS) model using the function `modelEstimation_ReplaceQ()` from the `WRTDSplus` package. + +In this example, our custom variable will be quickflow, which we calculate using the `bf_sep_lh()` function from the [adc](https://github.com/TxWRI/adc) package. + +For more information on alternative discharge methods see: <https://rconnect.usgs.gov/EGRET/articles/AlternativeQMethod.html> + +# Getting Started + +To start, you need to install and load the [WRTDSplus package](https://rconnect.chs.usgs.gov/WRTDSplus/) and the [EGRET package](https://rconnect.usgs.gov/EGRET/). + ```{r} +library(EGRET) library(WRTDSplus) ``` + +Next, you need to create the `INFO`, `Daily`, and `Sample` dataframes that make up an `eList` object. For more information on `eList` objects see the `EGRET` [vignette](https://rconnect.usgs.gov/EGRET/articles/EGRET.html) or the `EGRET` [user guide](https://pubs.usgs.gov/tm/04/a10/). For this example, we will use Total Phosphorus measurements from 'Dorn Creek at Cty M near Waunakee, WI' for water years 2012 to 2021. All data will be compiled through NWIS. + +```{r} +# Creating INFO dataframe from NWIS +site <- "05427930" +pcode <- "00665" +INFO <- readNWISInfo(site, pcode, interactive = FALSE) +# Adding additional inputs to the INFO dataframe +INFO$shortName <- "Dorn Creek" +INFO$staAbbrev <- "DCM" +INFO$paramShortName <- "Total Phosphorus" +INFO$constitAbbrev <- "TP" +INFO$windowY <- 7 +INFO$windowS <- 0.25 +INFO$windowQ <- 2 +INFO$minNumObs <- 100 + +# Creating Sample dataframe from NWIS +start_date <- "2012-10-01" +end_date <- "2021-09-30" +Sample <- readNWISSample(site, pcode, start_date, end_date) + +# Creating Daily dataframe from NWIS +Daily <- readNWISDaily(site, "00060", start_date, end_date) +``` + +# Add Variable to Replace Discharge + +Next, you need to add or create the variable you would like to substitute with discharge. For this example, we will substitute discharge with quickflow. To calculate quickflow for this example we are using the `bf_sep_lh()` function from the [adc](https://github.com/TxWRI/adc) package. + +```{r} +# Calculate baseflow and quickflow +Daily$bf <- adc::bf_sep_lh(Daily$Q, a = 0.925, n = 3) +Daily$qf <- Daily$Q - Daily$bf +``` + +After you create your new variable, you need to add it to the `Daily` and `Sample` dataframes as a column named '*Z*'. Than create an `eList` object using your `INFO`, `Daily`, and `Sample` dataframes. The values added to column '*Z*' will be modeled "as is". If you want to apply any transformations to the value this must be done prior to making the `eList`. This is different from how EGRET handles concentration and discharge values, which are stored un-transformed in the `eList` and than log transformed while using `modelEstimation()`. + +For this example, we want to replace discharge with the log of quickflow. We will apply the log transformation prior to creating our `eList`. To avoid taking the log of zero, we offset quickflow values by a constant value. + +```{r fig.width = 7, fig.height = 6} +# Calculating the offset for when quickflow is 0 +Qq05 <- quantile(Daily$qf[Daily$qf > 0], probs = 0.05) + +# Transforming quickflow values +Daily$LogQq <- log(Daily$qf + Qq05) #applying offset +Daily$Z <- Daily$LogQq + +# Adding log of quickflow to the Sample dataframe +Sample <- dplyr::left_join(Sample, + Daily[, c("Date", "Z", "qf")], + by = "Date") + +# Create eList +eList <- mergeReport(INFO, Daily, Sample) +``` + +# Selecting Variable Windows + +Before we run our model estimation we need to pick a value for the half-window width for our custom variable. To start, one to two standard deviations is recommended. + +```{r windows} +# Summary Statistics to help us determine the half-window for our custom variable +sd(Daily$Z, na.rm = T) +``` + +The standard deviation for our custom variable is 1.8, which we will use as our half-window width for quickflow (input `windowZ` for `modelEstimation_ReplaceQ()`). + +Comparing models using different value for `windowZ` may help you determine the best value to use. You can use the function `errorStats()` to compare models (See [Compare Model to Traditional WRTDS Model](#compare) for more details on how to compare models). + +# Run WRTDS with Discharge Replacement + +You can now run the WRTDS model using the function `modelEstimation_ReplaceQ()` from the `WRTDSplus` package. This function will use the values in the column 'Z', instead of discharge. You can view the results as you normally would using the `EGRET` package. + +```{r fig.width = 7, fig.height = 6} +eList_qf <- modelEstimation_ReplaceQ(eList, windowZ = 1.8) +plotConcHist(eList_qf) +``` + +# Compare Model to Traditional WRTDS Model {#compare} + +Let's now see how our model using quickflow compares to the same model using discharge. + +```{r} +# Run model using discharge +eList_Q <- modelEstimation(eList) + +# Compare models +errorStats(eList_Q) + +errorStats(eList_qf) +``` + +The Rsquared value for log(Concentration) increases from 0.626 to 0.692 and the root mean square error (rmse) decreases from 0.445 to 0.404 when discharge is replaced with quickflow. Replacing discharge in the WRTDS model with quickflow does produce a slightly better quality model. + +## Comparison Plots + +**Plots of Observed Concentration versus Estimated Concentration** + +*left - model with discharge, right - model with quickflow* + +```{r fig.width = 9, fig.height = 6} +par(mfrow = c(1, 2)) +plotConcPred(eList_Q) +plotConcPred(eList_qf) +``` + +**Plot of the Residuals from WRTDS (in log concentration units) versus Time** + +*left - model with discharge, right - model with quickflow* + +```{r fig.width = 9, fig.height = 6} +par(mfrow = c(1, 2)) +plotResidTime(eList_Q) +plotResidTime(eList_qf) +``` + +**Plot of the Residuals from WRTDS versus the Estimated Values (all in log concentration units)** + +*left - model with discharge, right - model with quickflow* + +```{r fig.width = 9, fig.height = 6} +par(mfrow = c(1, 2)) +plotResidPred(eList_Q) +plotResidPred(eList_qf) +``` + +**Plots Comparing Estimated Concentrations between the Models** + +```{r fig.width = 7, fig.height = 7} +plot(eList_Q[["Sample"]]$ConcHat, eList_qf[["Sample"]]$ConcHat, pch = 20, + ylab = paste("Quickflow Model Estimated Concentration in", INFO$param.units), + xlab = paste("Discharge Model Estimated Concentration in", INFO$param.units), + main = paste(INFO$shortName, "\n", INFO$paramShortName, "\n", + "Comparing Estimated Concentrations between Models")) +abline(a = 0 , b = 1) +``` + +By plotting the estimated concentrations of Total Phosphorous from both models against each other with a 1:1 line we see that overall the estimated values between models are similar. At lower concentrations, there is greater variability between the model concentration estimates then at higher concentrations. + +```{r fig.width = 7, fig.height = 7} +# Calculate mean estimated concentration for each model +AnnualResults_Q <- setupYears(eList_Q$Daily) +AnnualResults_qf <- setupYears(eList_qf$Daily) + +plot(AnnualResults_Q$DecYear, AnnualResults_Q$Conc, + type = "b", col = "black", pch = 20, + ylab = paste("Mean Estimated Concentration in", INFO$param.units), xlab = "", + main = paste(INFO$shortName, "\n", INFO$paramShortName, "\n", + "Concentration Estimates versus Water Year")) +points(AnnualResults_qf$DecYear, AnnualResults_qf$Conc, + type = "b", col = "red", pch = 20) +``` + +This time series over water year compares the annual mean estimated concentrations between the models, where the values from the model using discharge are in black and the values from the quickflow model are in red. For all water years, except 2015, the annual mean estimated concentrations from the discharge model are larger then the average values from the quickflow model.