Multiple file formats

Created by: emthompson-usgs

Have StreamCollection handle the case where the same record is provided in multiple file formats (either at different processing levels or at the same processing level). This seems to come up only with data from CESMD. The new code should probably run AFTER the existing grouping of traces in the StreamCollection __init__ method (either at the end of __group_by_net_sta_inst or in a new method). I added some test data to gmprocess/data/testdata/duplicate_records. There are 14 files, but they cover only two stations (both for the same earthquake). Some things to consider:

  1. Station US.A020 accounts for 12 of the files; each channel is in a separate file, and all are SMC format. Each channel has three files: two for V1 and one for V2.
  2. Station CE.23837 includes two COSMOS formats (*.V1 and *.V1C). The network code cannot be parsed from the *.V1 file, so it ends up as "ZZ" and the record is treated as a different station from the one in the *.V1C file. We essentially need a way to fix this, perhaps by comparing station codes (ignoring network codes). Since station codes alone are not guaranteed to be unique, also check the lat/lon: if the two locations are very close, assume they are the same station and discard one of the records.
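
A rough sketch of the check described in item 2, to make the idea concrete. It does not use the gmprocess or ObsPy APIs: `Record`, `same_station`, `drop_duplicate_stations`, and the 0.5 km tolerance are all hypothetical names and values chosen for illustration. Preferring the record whose network code was parsed (i.e. not "ZZ") is also an assumption; the issue only says to discard one of them.

```python
import math
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical stand-in for the metadata of one parsed file."""
    network: str
    station: str
    latitude: float
    longitude: float
    source_file: str


UNKNOWN_NETWORK = "ZZ"   # code that shows up when the network cannot be parsed
MAX_SEPARATION_KM = 0.5  # assumed tolerance, not a value taken from gmprocess


def distance_km(a, b):
    """Great-circle (haversine) distance between two records, in km."""
    lat1, lon1, lat2, lon2 = map(
        math.radians, (a.latitude, a.longitude, b.latitude, b.longitude)
    )
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def same_station(a, b, max_km=MAX_SEPARATION_KM):
    """Match on station code only (network ignored), then confirm by location."""
    return a.station == b.station and distance_km(a, b) <= max_km


def drop_duplicate_stations(records):
    """Keep one record per physical station, preserving input order."""
    kept = []
    for rec in records:
        for i, other in enumerate(kept):
            if same_station(rec, other):
                # Assumed preference: keep the record with a parsed network code.
                if other.network == UNKNOWN_NETWORK and rec.network != UNKNOWN_NETWORK:
                    kept[i] = rec
                break
        else:
            # No already-kept record matched, so this is a new station.
            kept.append(rec)
    return kept
```

In StreamCollection this kind of comparison would presumably run on the grouped streams rather than on a flat list, and the tolerance would need to be tuned against the files in duplicate_records.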