
Error reading WaterML 2 data

Created by: jirikadlec2

I would like to use the dataRetrieval package for parsing WaterML 2.0 files. I'm using the importWaterML2 function. The following example fails:

importWaterML2("http://worldwater.byu.edu/app/index.php/cartagena_jim/services/cuahsi_1_1.asmx/GetValues?location=cartagena_jim2:BC-C&variable=cartagena_jim2:IDWW-Streamflow&startDate=&endDate=")

The error is: Error in `$<-.data.frame`(`*tmp*`, "identifier", value = c("phenomena", : replacement has 920 rows, data has 92

The second example also fails with a similar error: importWaterML2("http://www.waterml2.org/KiWIS-WML2-Example.wml")

Error in `$<-.data.frame`(`*tmp*`, "identifier", value = character(0)) : replacement has 0 rows, data has 30

Looking at the source code, the error occurs in https://github.com/USGS-R/dataRetrieval/blob/master/R/importWaterML2.r#L154-156

id <- as.character(xpathApply(chunk, "//gml:identifier", xmlValue, namespaces = chunkNS))
DF2$identifier <- rep(id, nrow(DF2))

The difference between the first example file and the USGS WaterML2 output is that the USGS file has a single gml:identifier element, whereas my example has several. For instance, the example file has 10 identifier elements (phenomena, method, quality, censorCode, ...), so rep(id, nrow(DF2)) produces 10 x 92 = 920 values for a data frame with 92 rows.

The problem with the second example WaterML2 file (http://www.waterml2.org/KiWIS-WML2-Example.wml) is that this file doesn't have any gml:identifier elements.
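Both failures can be reproduced without any network access. The sketch below uses plain data frames as stand-ins for the parsed time series; the row counts (92 and 30) come from the error messages above, while the identifier values `id1` ... `id10` are placeholders, not the actual contents of the file.

```r
# Stand-ins for the parsed time series (row counts taken from the errors above).
df_many <- data.frame(value = seq_len(92))
df_none <- data.frame(value = seq_len(30))

# Case 1: ten gml:identifier elements (placeholder names, not the real ones).
id_many <- paste0("id", 1:10)
n_repeated <- length(rep(id_many, nrow(df_many)))  # 10 * 92 values

err_many <- tryCatch({
  df_many$identifier <- rep(id_many, nrow(df_many))
  NULL
}, error = function(e) conditionMessage(e))

# Case 2: no gml:identifier elements at all, so id is character(0).
id_none <- character(0)

err_none <- tryCatch({
  df_none$identifier <- rep(id_none, nrow(df_none))
  NULL
}, error = function(e) conditionMessage(e))

print(n_repeated)
print(err_many)
print(err_none)
```

In both cases the assignment fails because the replacement vector's length does not match the number of rows in the data frame.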

To fix this issue, it would be helpful if importWaterML2 could handle both cases: "no identifiers found" and "more than one identifier".

A temporary fix could be to add an extra function parameter, identifier = TRUE. If I call importWaterML2(url, identifier = FALSE), then it would return the data.frame without any identifier values.
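As a rough illustration of what such handling might look like, here is a minimal sketch. The helper name `attach_identifier` is hypothetical and is not part of dataRetrieval. Filling with `NA` when no identifier is found, and joining multiple identifiers with ";", are assumptions made for the example; the maintainers may prefer a different convention.

```r
# Hypothetical helper, not part of dataRetrieval: attaches an identifier
# column while tolerating zero, one, or several gml:identifier values.
attach_identifier <- function(df, id, identifier = TRUE) {
  # identifier = FALSE: return the data frame without an identifier column.
  if (!identifier) {
    return(df)
  }

  if (length(id) == 0) {
    # Assumption: no identifier found -> fill with NA.
    id_value <- NA_character_
  } else {
    # Assumption: several identifiers -> collapse into one string.
    id_value <- paste(unique(id), collapse = ";")
  }

  # id_value always has length 1, so rep() matches the row count exactly.
  df$identifier <- rep(id_value, nrow(df))
  df
}

# Usage with a stand-in data frame:
example_df <- data.frame(value = 1:3)
attach_identifier(example_df, character(0))
attach_identifier(example_df, c("phenomena", "method"))
attach_identifier(example_df, "phenomena", identifier = FALSE)
```

Inside importWaterML2, the line `DF2$identifier <- rep(id, nrow(DF2))` could then be replaced by a call of this kind, with the `identifier` argument passed through from the function signature.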
