Commit 4a556459 authored by Asquith, William H.

initial commit

parent 0296213d
# README (path ./visGWDBmrva/inst/www/perl/README.md)
#### Author: William H. Asquith
#### Point of contact: William H. Asquith (wasquith@usgs.gov)
***
***
# DISCUSSION
This README file describes the contents of the above path. One or more **Perl 5** scripts (https://www.perl.org, accessed on October 22, 2019) are provided in this directory. The lead author's primary development platforms are macOS and other Unix-like operating systems, on which Perl is typically available without separate installation, in contrast to Windows. The Perl code and examples here are not expected to be platform independent.
1. Script `catCSV.pl` --- This script concatenates all comma-separated value (CSV) files in a given directory. For example, consider the `./visGWDBmrva/output/pozo` directory of monthly water levels. At the operating system terminal, the author typically changes to that directory and issues the command `../../inst/perl/catCSV.pl > ../allpozo.csv`, which looks up two and then down two directories to the **Perl** script and saves a continuous concatenation of all `.csv` files in the file `./visGWDBmrva/output/allpozo.csv`. This file has been prepared for importation into _R_ using the `read.csv()` command. The script does some additional processing that a naive use of `cat *.csv > ../altpozo.csv` would not do: the `allpozo.csv` file has only one column label line, instead of the repeated label lines in the `altpozo.csv` file, and `NA` entries are replaced with empty fields.
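
As a quick sanity check on the concatenation described above, the following minimal sketch counts the label lines in a file. The helper name `count_label_lines` is hypothetical and is not part of this directory. The sketch assumes that a label line can be recognized by the token `SITE_BADGE`, which is the pattern `catCSV.pl` itself uses to skip label lines.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of catCSV.pl): count the lines of a file
# that contain the given label token.
sub count_label_lines {
    my ($path, $token) = @_;
    open(my $fh, '<', $path) or die "Cannot open '$path': $!\n";
    my $n = grep { /\Q$token\E/ } <$fh>; # grep in scalar context returns a count
    close($fh);
    return $n;
}

# Usage: pass a CSV file name on the command line.
# ASSUMPTION: the label line contains the token SITE_BADGE.
if (@ARGV) {
    my $n = count_label_lines($ARGV[0], 'SITE_BADGE');
    print "$ARGV[0]: $n label line(s)\n";
}
```

Run against `allpozo.csv`, the count would be expected to be one, whereas a naively concatenated `altpozo.csv` would be expected to show one label line per input file.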
#!/usr/bin/perl -w
use strict;
# AUTHOR: William H. Asquith (USGS, Texas) (May 2018)
# FOR WHOM: me
# PURPOSE: To allow Asquith to run massive full period of record visualization of entire
#   groundwater databases with visGWDB producing pseudo-observation export files
#   and, in post-processing with this script, concatenate all of the files together.
#   (The file count is likely to be too big for a wildcard Unix pipeline,
#   hence this script was written.) By replacing NAs with empty (NULL) fields, the
#   output can be pushed into Esri ArcGIS and other packages. In general, it is hoped
#   that NAs were not originally written into the .csv files, but the script
#   will strip them out if they were.
my @csv = glob("*.csv"); # glob names of all CSV files in the current directory
my $labels = undef;      # holds the label line of the first file encountered
foreach my $csv (@csv) { # process each of those CSV files
    open(my $fh, '<', $csv) or die "ERROR: Cannot open '$csv': $!\n"; # open for reading
    if(! defined $labels) { # if the labels are not yet defined, execute this block
        $labels = <$fh>;    # read the first line from the file
        if(! defined $labels) {
            # One cause would be ../../inst/perl/visGWDB_cat.pl > junk.csv because
            # junk.csv is formed instantly, it seems, and hence is in the glob;
            # use redirection to, say, junk.txt instead.
            print STDERR "ERROR: File '$csv' has a missing header\n";
            close($fh);
            next;
        }
        chomp($labels);    # remove the line ending
        print "$labels\n"; # print the labels to standard output
    }
    while(<$fh>) { # now read the remaining lines in the file
        next if(/SITE_BADGE/); # skip the label line
        chomp;                 # remove the line ending
        s/,NA,/,,/g; s/,NA,/,,/g; # two global passes per line are needed to strip adjacent NAs
        s/,NA$/,/; # finally strip a trailing NA
        s/^NA,/,/; # finally strip a leading NA (should not need: SITE_BADGE)
        print "$_\n"; # and print the line to standard output
    }
    close($fh);
}
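
The two `s/,NA,/,,/g` passes in the script above are needed because adjacent `NA` fields share a comma: a single global pass consumes that comma and so can only replace every other `NA` in a run. The sketch below illustrates this on a made-up record. The helpers `strip_na` and `strip_na_lookaround` are hypothetical names for illustration only; the second is an alternative single-pass formulation using lookarounds and is not what `catCSV.pl` does.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: apply the same substitutions as catCSV.pl to one
# line, with a configurable number of global ",NA," passes.
sub strip_na {
    my ($line, $passes) = @_;
    $line =~ s/,NA,/,,/g for 1 .. $passes; # interior NAs
    $line =~ s/,NA$/,/;                    # trailing NA
    $line =~ s/^NA,/,/;                    # leading NA
    return $line;
}

# Hypothetical alternative: one pass with lookarounds. An NA is removed
# only when bounded by a comma or by the start or end of the line.
sub strip_na_lookaround {
    my ($line) = @_;
    $line =~ s/(?<![^,])NA(?![^,])//g;
    return $line;
}

my $sample = "A1,NA,NA,NA,3.2,NA"; # made-up record, not real data
print strip_na($sample, 1), "\n";        # A1,,NA,,3.2,  (one NA survives)
print strip_na($sample, 2), "\n";        # A1,,,,3.2,
print strip_na_lookaround($sample), "\n"; # A1,,,,3.2,
```

Two passes are sufficient for a run of any length, because after the first pass every surviving `NA` has commas on both sides. Neither approach accounts for quoted CSV fields that might legitimately contain the text `NA`.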