Error reading WaterML 2 data
Created by: jirikadlec2
I would like to use the dataRetrieval package for parsing WaterML 2.0 files. I'm using the importWaterML2 function. The following example fails:
importWaterML2("http://worldwater.byu.edu/app/index.php/cartagena_jim/services/cuahsi_1_1.asmx/GetValues?location=cartagena_jim2:BC-C&variable=cartagena_jim2:IDWW-Streamflow&startDate=&endDate=")
The error is:
Error in `$<-.data.frame`(`*tmp*`, "identifier", value = c("phenomena", :
  replacement has 920 rows, data has 92
The second example also fails with a similar error:
importWaterML2("http://www.waterml2.org/KiWIS-WML2-Example.wml")
Error in `$<-.data.frame`(`*tmp*`, "identifier", value = character(0)) :
  replacement has 0 rows, data has 30
Looking at the source code, the error occurs in https://github.com/USGS-R/dataRetrieval/blob/master/R/importWaterML2.r#L154-156
id <- as.character(xpathApply(chunk, "//gml:identifier", xmlValue, namespaces = chunkNS))
DF2$identifier <- rep(id, nrow(DF2))
The difference between the first example file and the USGS WaterML2 files is that the USGS files have a single gml:identifier element, while my example has several. For instance, the example file has 10 identifier elements (phenomena, method, quality, censorCode, ...), so rep(id, nrow(DF2)) produces 10 x 92 = 920 values for a data frame with 92 rows.
The second example file (http://www.waterml2.org/KiWIS-WML2-Example.wml) fails for the opposite reason: it has no gml:identifier elements at all, so id is empty and the replacement has 0 rows for a data frame with 30 rows.
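Both failures can be reproduced without any XML parsing. The following is a minimal sketch in base R; the data frames and the identifier names (id1 to id10) are placeholders I made up to match the row counts in the errors, not values read from the actual files.

```r
# Stand-ins for the parsed time series; no WaterML file is read here.
DF_many <- data.frame(value = seq_len(92))
DF_none <- data.frame(value = seq_len(30))

# Ten identifiers (placeholder names), as in the first example file.
id_many <- paste0("id", 1:10)
# No identifiers, as in the KiWIS example file.
id_none <- character(0)

# rep() repeats the whole vector nrow() times, so the lengths are
# 10 * 92 = 920 and 0 * 30 = 0 instead of 92 and 30.
len_many <- length(rep(id_many, nrow(DF_many)))
len_none <- length(rep(id_none, nrow(DF_none)))

# Assigning either vector as a column fails; capture the messages.
msg_many <- tryCatch({
  DF_many$identifier <- rep(id_many, nrow(DF_many))
  "ok"
}, error = function(e) conditionMessage(e))

msg_none <- tryCatch({
  DF_none$identifier <- rep(id_none, nrow(DF_none))
  "ok"
}, error = function(e) conditionMessage(e))

print(msg_many)
print(msg_none)
```

The assignment only works when id has exactly one element, which is the case for the USGS files.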
To fix this issue, it would be nice if importWaterML2 could handle both cases: "no identifiers found" and "more than one identifier".
A temporary fix could be to add an extra function parameter identifier = TRUE, so that calling importWaterML2(url, identifier = FALSE) returns the data.frame without any identifier values.
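To illustrate the idea, here is a rough sketch of how the identifier step could be guarded. add_identifier is a hypothetical helper, not an existing dataRetrieval function, and collapsing several identifiers into one string is only one possible choice; the maintainers may prefer something else.

```r
# Hypothetical helper, NOT part of dataRetrieval: attaches the identifier
# column only when requested, and tolerates zero or several identifiers.
add_identifier <- function(df, id, identifier = TRUE) {
  # identifier = FALSE: return the data frame without an identifier column.
  if (!identifier) {
    return(df)
  }
  # Assumption: no identifier becomes NA, several are joined with "; ".
  label <- if (length(id) == 0) NA_character_ else paste(id, collapse = "; ")
  df$identifier <- rep(label, nrow(df))
  df
}

df <- data.frame(value = 1:3)

one  <- add_identifier(df, "USGS.01646500")        # placeholder identifier
many <- add_identifier(df, c("phenomena", "method"))
none <- add_identifier(df, character(0))
off  <- add_identifier(df, c("phenomena", "method"), identifier = FALSE)
```

With this guard, the single-identifier case behaves as before, while the other two cases return a data frame with the same number of rows as the input instead of raising an error.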