Update links authored by Hotovec-Ellis, Alicia Jean
REDPy automatically generates many output files at the end of each run of `redpy-catfill`, `redpy-backfill`, and `redpy-force-plot`. These outputs are designed to help you navigate the catalog and more easily notice patterns within and across families. Some scripts may also allow you to manually generate additional outputs. REDPy automatically generates many output files at the end of each run of [`redpy-catfill`](Console-Scripts#redpy-catfill), [`redpy-backfill`](Console-Scripts#redpy-backfill), and [`redpy-force-plot`](Console-Scripts#redpy-force-plot). These outputs are designed to help you navigate the catalog and more easily notice patterns within and across families. Some scripts may also allow you to manually generate additional outputs.
[[_TOC_]] [[_TOC_]]
...@@ -54,22 +54,22 @@ REDPy/ ...@@ -54,22 +54,22 @@ REDPy/
``` ```
## Interactive Timelines ## Interactive Timelines
The `overview.html`, `overview_meta.html`, and `overview_recent.html` files are interactive plots with a shared time axis (i.e., panning or zooming in one panel will update the rest) that can be opened in a web browser. `overview_recent.html` shares the same format as `overview.html` but for only the last `recplot` days, and shows all families active within that period in the occurrence timeline (instead of families with at least `minplot` members). Meanwhile, `overview_meta.html` condenses all plots into tabs so many runs can appear on a `meta.html` page at once. This plot shows the last `mrecplot` days with at least `mminplot` members. The ending time of the plots is set to the time of the latest trigger by default, but can also be set to the time the timeline was rendered by setting `bokehendtime=now`. The `overview.html`, `overview_meta.html`, and `overview_recent.html` files are interactive plots with a shared time axis (i.e., panning or zooming in one panel will update the rest) that can be opened in a web browser. `overview_recent.html` shares the same format as `overview.html` but for only the last [`recplot`](Inputs-and-Settings#configuration-file) days, and shows all families active within that period in the occurrence timeline (instead of families with at least [`minplot`](Inputs-and-Settings#configuration-file) members). Meanwhile, `overview_meta.html` condenses all plots into tabs so many runs can appear on a `meta.html` page at once. This plot shows the last [`mrecplot`](Inputs-and-Settings#configuration-file) days with at least [`mminplot`](Inputs-and-Settings#configuration-file) members. The ending time of the plots is set to the time of the latest trigger by default, but can also be set to the time the timeline was rendered by setting [`bokehendtime=now`](Inputs-and-Settings#configuration-file).
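
As a rough illustration of how the time span of a "recent" timeline follows from these settings, here is a minimal Python sketch. It is not REDPy's own code: the function name `recent_window`, the placeholder value `"latest"` for the default behavior, and the example trigger times are all assumptions made for this example. Only `recplot` and `bokehendtime=now` come from the description above.

```python
from datetime import datetime, timedelta

def recent_window(trigger_times, recplot, bokehendtime="latest", now=None):
    """Return (start, end) of a timeline that spans the last `recplot` days.

    `bokehendtime="latest"` is a hypothetical stand-in for the default
    behavior (end at the latest trigger); "now" ends at render time.
    """
    if bokehendtime == "now":
        end = now if now is not None else datetime.now()
    else:
        end = max(trigger_times)
    return end - timedelta(days=recplot), end

# Illustrative trigger times (not real data)
triggers = [
    datetime(2017, 1, 1),
    datetime(2017, 1, 20, 12),
    datetime(2017, 1, 25, 6),
]

start, end = recent_window(triggers, recplot=14)
recent = [t for t in triggers if start <= t <= end]

# Ending at render time instead of at the latest trigger
start_now, end_now = recent_window(
    triggers, recplot=14, bokehendtime="now", now=datetime(2017, 2, 1)
)
```

With the default, the window is anchored to the data, so a run that has not triggered in a while still shows its last active period; with `bokehendtime=now` the same run would show a quiet (or empty) window.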
Below is an annotated screenshot of the `overview_recent.html` output of the suggested default run: Below is an annotated screenshot of the `overview_recent.html` output of the suggested default run:
<img src="https://code.usgs.gov/ahotovec-ellis/REDPy/-/raw/main/img/output-overview.png" alt="Annotated overview_recent.html" /> <img src="https://code.usgs.gov/ahotovec-ellis/REDPy/-/raw/main/img/output-overview.png" alt="Annotated overview_recent.html" />
1. **Title** (here, 'REDPy Catalog') is set in `title`. It is also used for the `<title>` of the page. 1. **Title** (here, 'REDPy Catalog') is set in [`title`](Inputs-and-Settings#configuration-file). It is also used for the `<title>` of the page.
2. The **navigation bar** at top right has options to pan, zoom, tap, reset, and save. 2. The **navigation bar** at top right has options to pan, zoom, tap, reset, and save.
By default, these timelines have four panels, the layout of which can be customized in `plotformat`: By default, these timelines have four panels, the layout of which can be customized in [`plotformat`](Inputs-and-Settings#configuration-file):
3. The **Repeaters vs. Orphans** plot shows the number/counts of these two classifications within temporal bins defined by `binday`, `binhr`, and `mbinhr`. The total number of triggers in that hour will be the sum of these two lines, which can be displayed instead by setting `timeline_vs=triggers` (also changes the plot title to **Repeaters vs. Triggers**). 3. The **Repeaters vs. Orphans** plot shows the number/counts of these two classifications within temporal bins defined by [`binday`](Inputs-and-Settings#configuration-file), [`binhr`](Inputs-and-Settings#configuration-file), and [`mbinhr`](Inputs-and-Settings#configuration-file). The total number of triggers in that hour will be the sum of these two lines, which can be displayed instead by setting [`timeline_vs=triggers`](Inputs-and-Settings#configuration-file) (also changes the plot title to **Repeaters vs. Triggers**).
4. The **Frequency Index** (FI) plot has a point for every repeating event with a quantity related to its frequency content, a ratio of energy in an upper and lower frequency band (see [Buurman and West (2010)](https://pubs.usgs.gov/pp/1769/chapters/p1769_chapter02.pdf) for more details on FI). Tectonic-type events usually have FI>0 and 'long period' earthquakes have FI<0 with the default settings, but note that the value is dependent on your choice of frequency bands (`filomin`, etc.) and many other factors. This plot is useful for quickly identifying the character of repeating seismicity and how the frequency content evolves with time. The dots are colored by the number of available channels during the correlation window that are not in a data gap; hover over a dot for additional event information (e.g., event time, FI, family), or click it to open the corresponding family page. 4. The **Frequency Index** (FI) plot has a point for every repeating event with a quantity related to its frequency content, a ratio of energy in an upper and lower frequency band (see [Buurman and West (2010)](https://pubs.usgs.gov/pp/1769/chapters/p1769_chapter02.pdf) for more details on FI). Tectonic-type events usually have FI>0 and 'long period' earthquakes have FI<0 with the default settings, but note that the value is dependent on your choice of frequency bands ([`filomin`](Inputs-and-Settings#configuration-file), etc.) and many other factors. This plot is useful for quickly identifying the character of repeating seismicity and how the frequency content evolves with time. The dots are colored by the number of available channels during the correlation window that are not in a data gap; hover over a dot for additional event information (e.g., event time, FI, family), or click it to open the corresponding family page.
5. The **Occurrence Timeline** has horizontal lines corresponding to families, with endpoints at the times of the first and last events in that family. This line shows through when no members are active and will have an arrow if the family extends off the bounds of the plot. Colored bars correspond to time bins with activity within that family, colored by either the number of events within that bin (see color scale at top left) or the average FI (with color scale span controlled by `fispanlow` and `fispanhigh`). This bin width is controlled by `dybin`, `hrbin`, and `mhrbin`. By default, color by rate and FI are tabbed. The number to the right of the bars corresponds to the number of total members within the family. Hovering the mouse over a family will display a preview waveform (core event at `plotsta` station) and the family's ID number. Clicking here will open a [more detailed page about the family](#family-pages). 5. The **Occurrence Timeline** has horizontal lines corresponding to families, with endpoints at the times of the first and last events in that family. This line shows through when no members are active and will have an arrow if the family extends off the bounds of the plot. Colored bars correspond to time bins with activity within that family, colored by either the number of events within that bin (see color scale at top left) or the average FI (with color scale span controlled by [`fispanlow`](Inputs-and-Settings#configuration-file) and [`fispanhigh`](Inputs-and-Settings#configuration-file)). This bin width is controlled by [`dybin`](Inputs-and-Settings#configuration-file), [`hrbin`](Inputs-and-Settings#configuration-file), and [`mhrbin`](Inputs-and-Settings#configuration-file). By default, color by rate and FI are tabbed. The number to the right of the bars corresponds to the number of total members within the family. 
Hovering the mouse over a family will display a preview waveform (core event at [`plotsta`](Inputs-and-Settings#configuration-file) station) and the family's ID number. Clicking here will open a [more detailed page about the family](#family-pages).
6. The **Family Longevity** plot orders the families in the occurrence timeline by the length of time they are active, and can be useful for identifying times when many families die or are created. If the starting time of a family is before the date of the start of `overview_recent.html`, an arrow will indicate that the family extends off the plot. 6. The **Family Longevity** plot orders the families in the occurrence timeline by the length of time they are active, and can be useful for identifying times when many families die or are created. If the starting time of a family is before the date of the start of `overview_recent.html`, an arrow will indicate that the family extends off the plot.
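
To make the Frequency Index in panel 4 more concrete, the sketch below computes an FI-like quantity for two synthetic signals. It assumes the common formulation of FI as the base-10 logarithm of the ratio of mean spectral amplitude in an upper band to that in a lower band. The function name, band limits, and test signals are illustrative assumptions, and this is not necessarily how REDPy computes the value internally.

```python
import numpy as np

def frequency_index(waveform, samprate, lower_band, upper_band):
    """log10 of mean spectral amplitude in upper_band over lower_band (Hz)."""
    spectrum = np.abs(np.fft.rfft(waveform))
    freqs = np.fft.rfftfreq(len(waveform), d=1.0 / samprate)
    lower = spectrum[(freqs >= lower_band[0]) & (freqs <= lower_band[1])]
    upper = spectrum[(freqs >= upper_band[0]) & (freqs <= upper_band[1])]
    return np.log10(np.mean(upper) / np.mean(lower))

samprate = 100.0                      # samples per second (assumed)
t = np.arange(0, 10, 1.0 / samprate)  # 10 s of synthetic data

# Mostly 2 Hz energy: stands in for a 'long period' style event
low_freq_event = np.sin(2 * np.pi * 2.0 * t) + 0.1 * np.sin(2 * np.pi * 12.0 * t)
# Mostly 12 Hz energy: stands in for a higher-frequency event
high_freq_event = 0.1 * np.sin(2 * np.pi * 2.0 * t) + np.sin(2 * np.pi * 12.0 * t)

# Illustrative bands only; the real bands come from your configuration
lower_band, upper_band = (1.0, 3.0), (8.0, 15.0)
fi_low = frequency_index(low_freq_event, samprate, lower_band, upper_band)
fi_high = frequency_index(high_freq_event, samprate, lower_band, upper_band)
```

Here `fi_low` comes out negative and `fi_high` positive, which mirrors the sign convention described above. Changing the band limits shifts both values, which is why FI should only be compared between runs that share the same bands.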
...@@ -93,15 +93,15 @@ Below this is a multi-paneled image `families/fam0.png`: ...@@ -93,15 +93,15 @@ Below this is a multi-paneled image `families/fam0.png`:
5. The normalized sum of the Fourier **amplitude spectra** (post-filtering) over all channels for both the single core event and all events together. Intended to quickly summarize the strongest frequencies in the signal across all observations. 5. The normalized sum of the Fourier **amplitude spectra** (post-filtering) over all channels for both the single core event and all events together. Intended to quickly summarize the strongest frequencies in the signal across all observations.
6. **Timeline of amplitude** (on the same preview station `printsta` only) of each event with time. 6. **Timeline of amplitude** (on the same preview station [`printsta`](Inputs-and-Settings#configuration-file) only) of each event with time.
7. **Timeline of inter-event time** (time between successive members of the family) in hours. Note the logarithmic scale on the y-axis. 7. **Timeline of inter-event time** (time between successive members of the family) in hours. Note the logarithmic scale on the y-axis.
8. **Timeline of cross-correlation coefficient** relative to the best correlated event (i.e., the event that has the maximum sum across rows in the stored correlation matrix). Open circles at the bottom mean that no value is stored for that pair (either not computed or below `cmin`). This is intended to help visualize how the waveforms are changing with time. A black symbol will denote which event is the current core event, which may be different from the best correlated. *Note that the coefficient plotted here is what is stored in the table, which is across all stations used.* 8. **Timeline of cross-correlation coefficient** relative to the best correlated event (i.e., the event that has the maximum sum across rows in the stored correlation matrix). Open circles at the bottom mean that no value is stored for that pair (either not computed or below [`cmin`](Inputs-and-Settings#configuration-file)). This is intended to help visualize how the waveforms are changing with time. A black symbol will denote which event is the current core event, which may be different from the best correlated. *Note that the coefficient plotted here is what is stored in the table, which is across all stations used.*
If `checkcomcat=False` (by default), this will be the end of the page. If it is `True`, and if a match to at least one member in the family was found, more of the page will be rendered: If [`checkcomcat=False`](Inputs-and-Settings#configuration-file) (by default), this will be the end of the page. If it is `True`, and if a match to at least one member in the family was found, more of the page will be rendered:
9. A **map of matched local events** will have the locations of local matched events as red dots. Locations given in `stalats` and `stalons` are plotted as black triangles. The average depth of the family (usually relative to sea level) is given at the top of the map along with the number of matches found. This image is `families/map0.png` and is only rendered if a local match is found. 9. A **map of matched local events** will have the locations of [local matched events](#associated-locations) as red dots. Locations given in [`stalats`](Inputs-and-Settings#configuration-file) and [`stalons`](Inputs-and-Settings#configuration-file) are plotted as black triangles. The average depth of the family (usually relative to sea level) is given at the top of the map along with the number of matches found. This image is `families/map0.png` and is only rendered if a local match is found.
10. A **list of matched events** is shown for both local (black) and more distant (red; regional and teleseismic distances) matches. All matches are listed, including possible conflicting matches, along with the best matching phase arrival. This list scrolls to conserve space. Totals are listed at the bottom. If no matches are found the list will be empty. 10. A **list of matched events** is shown for both local (black) and more distant (red; regional and teleseismic distances) matches. All matches are listed, including possible conflicting matches, along with the best matching phase arrival. This list scrolls to conserve space. Totals are listed at the bottom. If no matches are found the list will be empty.
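
The quantities behind the inter-event time and cross-correlation timelines (panels 7 and 8) can be sketched with a few lines of NumPy. The event times, the small correlation matrix, and the `cmin` value below are made up for illustration. Representing "not stored" pairs as 0 is also an assumption of this sketch, not a statement about REDPy's table layout.

```python
import numpy as np

# Event times in hours since the first member (illustrative values)
event_times_hr = np.array([0.0, 0.5, 2.0, 26.0])

# Panel 7: time between successive members of the family, in hours
inter_event_hr = np.diff(event_times_hr)

# Stored correlation matrix; 0 marks pairs with no stored value
cmin = 0.7
ccc = np.array([
    [1.00, 0.95, 0.95, 0.00],
    [0.95, 1.00, 0.00, 0.71],
    [0.95, 0.00, 1.00, 0.00],
    [0.00, 0.71, 0.00, 1.00],
])

# Panel 8: the best correlated event has the maximum sum across its row
best = int(np.argmax(ccc.sum(axis=1)))
relative = ccc[best]       # coefficients plotted relative to that event
missing = relative < cmin  # these would appear as open circles
```

In this example event 0 is the best correlated, and the last event has no stored value relative to it, so it would be drawn as an open circle at the bottom of the plot.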
...@@ -119,7 +119,7 @@ Several text-based catalogs are written to the output directory: ...@@ -119,7 +119,7 @@ Several text-based catalogs are written to the output directory:
`catalog_triggers.txt`: Dates of all triggers that made it past the junk filter. Includes deleted events (and subsequent matches to deleted families), expired and current orphans, and all repeaters. The time is the original trigger time, which may be slightly different from the event time listed in `catalog.txt`. `catalog_triggers.txt`: Dates of all triggers that made it past the junk filter. Includes deleted events (and subsequent matches to deleted families), expired and current orphans, and all repeaters. The time is the original trigger time, which may be slightly different from the event time listed in `catalog.txt`.
`swarm.csv` and `swarm_triggers.csv`: These files can be read by [Swarm](https://volcanoes.usgs.gov/software/swarm/download.shtml) v2.8.5+ using the "tagging" feature to annotate the interactive helicorders. It marks each repeating event with a label that has the `groupname` and the family it belongs to (so for the default run, Family 1 would be labeled as 'default1'). The station listed is the one referenced by `printsta` in the configuration, and can be changed using global find/replace in a text editor to change which station or channel the tags should appear on. Colors can be chosen for families or types of interest by adding lines to the `EventClassifications.config` file in the Swarm folder. For example, adding the line: `swarm.csv` and `swarm_triggers.csv`: These files can be read by [Swarm](https://volcanoes.usgs.gov/software/swarm/download.shtml) v2.8.5+ using the "tagging" feature to annotate the interactive helicorders. It marks each repeating event with a label that has the [`groupname`](Inputs-and-Settings#configuration-file) and the family it belongs to (so for the default run, Family 1 would be labeled as 'default1'). The station listed is the one referenced by [`printsta`](Inputs-and-Settings#configuration-file) in the configuration, and can be changed using global find/replace in a text editor to change which station or channel the tags should appear on. Colors can be chosen for families or types of interest by adding lines to the `EventClassifications.config` file in the Swarm folder. For example, adding the line:
`default1, #ffff00` `default1, #ffff00`
...@@ -129,7 +129,7 @@ changes the appearance of members of the `default1` family to be yellow to stand ...@@ -129,7 +129,7 @@ changes the appearance of members of the `default1` family to be yellow to stand
### Reports ### Reports
Some families (especially large ones) may have interesting features that a user might want to investigate in more detail. A "report" can be generated with `redpy-create-report` for a given family (or list of families) that has more information than the standard family page. Some families (especially large ones) may have interesting features that a user might want to investigate in more detail. A "report" can be generated with [`redpy-create-report`](Console-Scripts#redpy-create-report) for a given family (or list of families) that has more information than the standard family page.
Below is an annotated screenshot of the `reports/report-0.html` output for a Family 0 report with no flags (`redpy-create-report 0`) after the default run completes: Below is an annotated screenshot of the `reports/report-0.html` output for a Family 0 report with no flags (`redpy-create-report 0`) after the default run completes:
<img src="https://code.usgs.gov/ahotovec-ellis/REDPy/-/raw/main/img/output-reports.png" alt="Annotated report page for Family 0" /> <img src="https://code.usgs.gov/ahotovec-ellis/REDPy/-/raw/main/img/output-reports.png" alt="Annotated report page for Family 0" />
...@@ -142,17 +142,17 @@ Below is an annotated screenshot of the `reports/report-0.html` output for a Fam ...@@ -142,17 +142,17 @@ Below is an annotated screenshot of the `reports/report-0.html` output for a Fam
4. Instead of showing only the core and stack of the waveforms, images of **all waveforms** on all channels are shown (`reports/0-reportwaves.png`). Time is on the x-axis, and each row of pixels on the y-axis corresponds to a member of the family. Color is by amplitude, with white at 0, red for negative amplitudes, and blue for positive amplitudes. Each waveform is normalized, with some cropping of amplitudes toward the end of the waveform. This view takes up a significant amount of space (note that I've cropped out most of the next 6 panels!) but can help assess which channels have the most signal or see subtle changes in the waveforms with time. 4. Instead of showing only the core and stack of the waveforms, images of **all waveforms** on all channels are shown (`reports/0-reportwaves.png`). Time is on the x-axis, and each row of pixels on the y-axis corresponds to a member of the family. Color is by amplitude, with white at 0, red for negative amplitudes, and blue for positive amplitudes. Each waveform is normalized, with some cropping of amplitudes toward the end of the waveform. This view takes up a significant amount of space (note that I've cropped out most of the next 6 panels!) but can help assess which channels have the most signal or see subtle changes in the waveforms with time.
5. The **timeline of amplitude** plot is now interactive (pan, zoom), and amplitudes for all channels are now shown instead of just the single `printsta` channel. As the title suggests, click on the names in the legend to toggle hiding or showing each channel to isolate which you want to see. 5. The **timeline of amplitude** plot is now interactive (pan, zoom), and amplitudes for all channels are now shown instead of just the single [`printsta`](Inputs-and-Settings#configuration-file) channel. As the title suggests, click on the names in the legend to toggle hiding or showing each channel to isolate which you want to see.
6. The **timeline of inter-event time** is now interactive, but otherwise matches the family page. 6. The **timeline of inter-event time** is now interactive, but otherwise matches the family page.
7. The **timeline of cross-correlation coefficient** is also interactive, but now is explicitly relative to the core event. Values below `cmin` have been filled in, unless the `-s` flag has been used to "skip" recalculating the full dense cross-correlation matrix. 7. The **timeline of cross-correlation coefficient** is also interactive, but now is explicitly relative to the core event. Values below [`cmin`](Inputs-and-Settings#configuration-file) have been filled in, unless the `-s` flag has been used to "skip" recalculating the full dense cross-correlation matrix.
These interactive timelines can be accessed by themselves in `reports/0-report-bokeh.html`. These interactive timelines can be accessed by themselves in `reports/0-report-bokeh.html`.
8. The **stored cross-correlation matrix** has rows and columns for each member, with darker colors corresponding to more similar waveforms and lighter colors to less similar. The color switches from yellow to white at `cmin`, which for the stored matrix corresponds to places where either the correlation value is below `cmin` for that pair, _or_ it was never calculated in the first place. 8. The **stored cross-correlation matrix** has rows and columns for each member, with darker colors corresponding to more similar waveforms and lighter colors to less similar. The color switches from yellow to white at [`cmin`](Inputs-and-Settings#configuration-file), which for the stored matrix corresponds to places where either the correlation value is below [`cmin`](Inputs-and-Settings#configuration-file) for that pair, _or_ it was never calculated in the first place.
9. The **full cross-correlation matrix** contains values for _all_ pairs, including those that were missing from the stored matrix. Values below `cmin` are still cropped at white, but values exist. For large families (>1000 members or so) this matrix can take a very long time to create. If you choose to skip re-calculating the full matrix with `-s`, this part of the image will not be included in `reports/0-reportcmat.png`. 9. The **full cross-correlation matrix** contains values for _all_ pairs, including those that were missing from the stored matrix. Values below [`cmin`](Inputs-and-Settings#configuration-file) are still cropped at white, but values exist. For large families (>1000 members or so) this matrix can take a very long time to create. If you choose to skip re-calculating the full matrix with `-s`, this part of the image will not be included in `reports/0-reportcmat.png`.
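
The layout of the all-waveforms image in item 4 (one row of pixels per family member, each waveform normalized) can be mimicked as follows. This is a hedged sketch rather than the report code: `waveform_image` and the sample values are hypothetical, and the cropping of amplitudes toward the end of the waveform mentioned above is not reproduced.

```python
import numpy as np

def waveform_image(waveforms):
    """Stack waveforms as rows and scale each row to a peak amplitude of 1."""
    waveforms = np.asarray(waveforms, dtype=float)
    peak = np.max(np.abs(waveforms), axis=1, keepdims=True)
    peak[peak == 0] = 1.0  # leave all-zero rows untouched
    return waveforms / peak

# Two short, made-up waveforms with very different raw amplitudes
members = [
    [0.0, 2.0, -4.0, 1.0],
    [0.0, 10.0, -5.0, 2.5],
]
image = waveform_image(members)
# A diverging colormap centered on 0 (for example matplotlib's "RdBu")
# would then draw negative values red, zero white, and positive values blue.
```

Normalizing per row is what lets small and large events be compared by shape in the same image, at the cost of hiding their relative amplitudes, which the timeline of amplitude shows instead.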
If you use the `-o` flag, the waveform plots in 4 and correlation matrices in 8 and/or 9 will be ordered with OPTICS, which is used to pick the core event. This ordering is based on similarity from the correlation matrix, such that similar events will be close to each other in the order, allowing visualization of possible sub-families within the family. The ordering is applied to both the waveform images and the correlation matrices. Note the tight groups of dark colors surrounded by lighter colors in the matrix on the right: If you use the `-o` flag, the waveform plots in 4 and correlation matrices in 8 and/or 9 will be ordered with OPTICS, which is used to pick the core event. This ordering is based on similarity from the correlation matrix, such that similar events will be close to each other in the order, allowing visualization of possible sub-families within the family. The ordering is applied to both the waveform images and the correlation matrices. Note the tight groups of dark colors surrounded by lighter colors in the matrix on the right:
...@@ -162,14 +162,14 @@ If you use the `-m` flag, two .npy (numpy format) files will be output: `reports ...@@ -162,14 +162,14 @@ If you use the `-m` flag, two .npy (numpy format) files will be output: `reports
### Associated Locations ### Associated Locations
If `checkcomcat=True`, three files will be saved that contain external catalog locations for the three distance and magnitude thresholds defined in your [configuration](Inputs-and-Settings#configuration-file): `external_local.txt`, `external_regional.txt`, and `external_teleseismic.txt`. These are updated with each run to span the same time as the first and last triggers. For long-lived runs, these files can grow to be relatively large, but save a significant amount of time querying the original catalog over the web. You can replace these files with custom catalogs with the same format. However, be careful if the files do not completely bracket the triggers of your run as the code may attempt to append what it thinks is missing. If you want, you can also use these files as inputs into `redpy-compare-catalog`, which outputs by default to `matches.csv` in your current directory. If [`checkcomcat=True`](Inputs-and-Settings#configuration-file), three files will be saved that contain external catalog locations for the three distance and magnitude thresholds defined in your [configuration](Inputs-and-Settings#configuration-file): `external_local.txt`, `external_regional.txt`, and `external_teleseismic.txt`. These are updated with each run to span the same time as the first and last triggers. For long-lived runs, these files can grow to be relatively large, but save a significant amount of time querying the original catalog over the web. You can replace these files with custom catalogs with the same format. However, be careful if the files do not completely bracket the triggers of your run as the code may attempt to append what it thinks is missing. If you want, you can also use these files as inputs into [`redpy-compare-catalog`](Console-Scripts#redpy-compare-catalog), which outputs by default to `matches.csv` in your current directory.
Furthermore, if you run `redpy-write-family-locations`, `family_locations.csv` will be written to the output directory for that run with a summary of the median of locations contained within each [family page](Outputs#family-pages). Furthermore, if you run [`redpy-write-family-locations`](Console-Scripts#redpy-write-family-locations), `family_locations.csv` will be written to the output directory for that run with a summary of the median of locations contained within each [family page](Outputs#family-pages).
### Higher Quality Outputs ### Higher Quality Outputs
Editable, publication quality (or at least nearer to it than a flat screenshot) .pdf versions of the timeline and family images may be created with `redpy-create-pdf-timeline` and `redpy-create-pdf-family`. The timeline will be created in the main output directory as `overview.pdf`, and there are several flags that control its appearance. The family image is created in `families/fam*.pdf` where `*` is the family number you specified. It will look almost identical to `families/fam*.png`, and you can control the time span (e.g., you want multiple families to share the same time axes for comparison, or you want to zoom in on a time of interest in a long-lived family). Editable, publication quality (or at least nearer to it than a flat screenshot) .pdf versions of the timeline and family images may be created with [`redpy-create-pdf-timeline`](Console-Scripts#redpy-create-pdf-timeline) and [`redpy-create-pdf-family`](Console-Scripts#redpy-create-pdf-family). The timeline will be created in the main output directory as `overview.pdf`, and there are several flags that control its appearance. The family image is created in `families/fam*.pdf` where `*` is the family number you specified. It will look almost identical to `families/fam*.png`, and you can control the time span (e.g., you want multiple families to share the same time axes for comparison, or you want to zoom in on a time of interest in a long-lived family).
### Contents of Junk Table ### Contents of Junk Table
The contents of the "junk" table may be output for troubleshooting purposes (i.e., to check that triggers included here have been excluded from consideration correctly). A text catalog `catalog_junk.txt` is created in the main outputs directory with the trigger times and "junk type" for each entry. The type is related to which thresholds were exceeded: `freq` means that the frequency index filter for teleseisms flagged that trigger for having too many channels (more than `teleok`) with FI below the threshold (`telefi`); `kurt` means that the kurtosis (sharpness/spikiness) of the waveform or its frequency spectrum exceeded its threshold; and `both` means that the trigger failed both the frequency and kurtosis tests. A new folder `junk/` is created that has images of waveforms with filenames containing the time of the trigger (`YYYYmmddHHMMSS.SSSS` corresponding to year, month, day, hour, minute, and decimal second) and the junk type. The contents of the ["junk" table](How-REDPy-Works#junk-table) may be output for troubleshooting purposes (i.e., to check that triggers included here have been excluded from consideration correctly). A text catalog `catalog_junk.txt` is created in the main outputs directory with the trigger times and "junk type" for each entry. The type is related to which thresholds were exceeded: `freq` means that the frequency index filter for teleseisms flagged that trigger for having too many channels (more than [`teleok`](Inputs-and-Settings#configuration-file)) with FI below the threshold ([`telefi`](Inputs-and-Settings#configuration-file)); `kurt` means that the kurtosis (sharpness/spikiness check) of the waveform or its frequency spectrum exceeded its threshold; and `both` means that the trigger failed both the frequency and kurtosis tests. A new folder `junk/` is created that has images of waveforms with filenames containing the time of the trigger (`YYYYmmddHHMMSS.SSSS` corresponding to year, month, day, hour, minute, and decimal second) and the junk type.
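
The junk types described above can be read as a simple decision rule, and the timestamp portion of the filenames can be parsed with the standard library. The sketch below is only an illustration under stated assumptions: `junk_type` and `parse_junk_time` are hypothetical helpers, the kurtosis test is reduced to a single precomputed flag, the threshold values are invented, and the exact layout of the filenames beyond the timestamp pattern is not assumed.

```python
from datetime import datetime

def junk_type(fi_per_channel, telefi, teleok, kurtosis_failed):
    """Classify a trigger as 'freq', 'kurt', 'both', or None (not junk)."""
    # Frequency test: too many channels (more than teleok) with FI below telefi
    low_fi_channels = sum(1 for fi in fi_per_channel if fi < telefi)
    freq_failed = low_fi_channels > teleok
    if freq_failed and kurtosis_failed:
        return "both"
    if freq_failed:
        return "freq"
    if kurtosis_failed:
        return "kurt"
    return None

def parse_junk_time(stamp):
    """Parse a YYYYmmddHHMMSS.SSSS trigger time string into a datetime."""
    return datetime.strptime(stamp, "%Y%m%d%H%M%S.%f")

# Illustrative values only: two of three channels fall below telefi
label = junk_type([-2.0, -1.5, 0.3], telefi=-1.0, teleok=1, kurtosis_failed=False)
when = parse_junk_time("20170102030405.1234")
```

Parsing the timestamps this way makes it straightforward to line up entries in `catalog_junk.txt` with the images in `junk/` when checking whether a particular trigger was excluded for the expected reason.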
\ No newline at end of file \ No newline at end of file