Commits · ee606ebb6445d443f368c12d3b931893ecbc7d16 · ghsc / National Geomagnetism Program / geomag-algorithms

Jan 07, 2025
- Changed metadata_class to typehint of MetadataCategory since this will not be used for flags only. · 92050fa3
  Wernle, Alexandra Nicole authored 3 months ago
  
  92050fa3
- Added mode for managing existing spikes. Added logic to create_spike_metadata... · b94c90b4
  Wernle, Alexandra Nicole authored 3 months ago
  
  Added mode for managing existing spikes. Added logic to create_spike_metadata to allow empty arrays.
  b94c90b4
- Pydantic v2 requires model_dump · e10ecb92
  Wernle, Alexandra Nicole authored 4 months ago
  
  e10ecb92
- Added error handling for when window > period. · 5f0c832b
  Wernle, Alexandra Nicole authored 4 months ago
  
  5f0c832b
- Making attributes Optional and added CustomUTCDateTimeType. · 08251027
  Wernle, Alexandra Nicole authored 4 months ago
  
  08251027
- Making attributes Optional in pydantic style. · b84089ae
  Wernle, Alexandra Nicole authored 4 months ago
  
  b84089ae
- Changed add_empty_channels to False. Changed check_existing_metadata to return... · 000a7ef5
  Wernle, Alexandra Nicole authored 4 months ago
  
  Changed add_empty_channels to False. Changed check_existing_metadata to return whole metadata object.
  000a7ef5
- Split run into more functions. Added more functionality such that preexisting... · ca17a45e
  Wernle, Alexandra Nicole authored 4 months ago
  
  Split run into more functions. Added more functionality such that preexisting spikes are handled here instead of only flag-spikes.
  ca17a45e
- Added pass at end of loop. · aed7b732
  Wernle, Alexandra Nicole authored 5 months ago
  
  aed7b732
- Changed channels argument to str. Added add_empty_channels to get_timeseries.... · 7e3b1140
  Wernle, Alexandra Nicole authored 5 months ago
  
  Changed channels argument to str. Added add_empty_channels to get_timeseries. Changed split_stream_by_day to split stream, not individual traces.
  7e3b1140
- Fixed Trace import. · 777e7d45
  Wernle, Alexandra Nicole authored 5 months ago
  
  777e7d45
- Changed channels typehints, added channel to check_existing_metadata. · c6764189
  Wernle, Alexandra Nicole authored 5 months ago
  
  c6764189
- New SpikesAlgorithm Class to create spike metadata. · 70b9b650
  Wernle, Alexandra Nicole authored 6 months ago
  
  70b9b650
- New MetadataAlgorithm Class for other metadata algorithms to inherit from. · 9ba30d74
  Wernle, Alexandra Nicole authored 6 months ago
  
  9ba30d74
Sep 23, 2024

Function get_nearest_time() behaves as expected now · ffdb80da

Erin (Josh) Rigler authored 6 months ago

A FilterAlgorithm.py module function get_nearest_time() is supposed to return
the nearest *allowed* time for a given filter "step". This worked fine with
steps for second and minute data, whose allowed times are the tops of seconds
and minutes. However, for hourly (and daily, and any "average" type step),
things failed since the allowed times are the center of the interval (for
example, for hourly data, which is the average of all minute samples from
00 to 59, the allowed time is 29:30).

One consequence was that if a user specified an interval with a start and
end time that did not encompass a given hour's allowed center time (e.g.,
start=AA:29:31, end=BB:29:29), the algorithm would still return a sample
for time BB:29:30. More generally, requests for average type data would
include an extra sample.

In this fixed version, if start=AA:29:31, and end=BB:29:29, nothing is
returned, as intended (and as always worked for non-average type steps).
Furthermore, if start=AA:29:30, and end=BB:29:29, a sample for hour AA is
generated; if start=AA:29:31, and end=BB:29:30, a sample for hour BB is
generated, and if start=AA:29:30, and end=BB:29:30, samples for both AA
and BB are generated, all as intended.
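
The four cases above can be reproduced with a small standalone sketch. It is
not the FilterAlgorithm.py code: `allowed_times()`, `hms()`, and the
`period`/`offset` parameters are hypothetical names, and allowed times are
assumed to sit at whole seconds, 29 minutes 30 seconds past each hour.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def allowed_times(start, end, period=3600, offset=1770):
    """List allowed sample times t with start <= t <= end.

    Hypothetical helper, not the library's API. Allowed times are modeled
    as EPOCH + offset + k * period seconds; the defaults place hourly
    averages at HH:29:30. Inputs are assumed to fall on whole seconds.
    """
    start_s = int((start - EPOCH).total_seconds())
    end_s = int((end - EPOCH).total_seconds())
    # Smallest k whose allowed time is at or after start (ceiling division).
    k = -((offset - start_s) // period)
    times = []
    t = offset + k * period
    while t <= end_s:
        times.append(EPOCH + timedelta(seconds=t))
        t += period
    return times


def hms(hour, minute, second):
    """Build a UTC time on an arbitrary example day."""
    return datetime(2024, 9, 23, hour, minute, second, tzinfo=timezone.utc)


# The four cases from the message above, with AA=01 and BB=02.
neither = allowed_times(hms(1, 29, 31), hms(2, 29, 29))
only_aa = allowed_times(hms(1, 29, 30), hms(2, 29, 29))
only_bb = allowed_times(hms(1, 29, 31), hms(2, 29, 30))
both = allowed_times(hms(1, 29, 30), hms(2, 29, 30))
```

Here `neither` is empty, `only_aa` and `only_bb` each hold one sample time,
and `both` holds two.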

ffdb80da

Sep 12, 2024

Fix FilterAlgorithm.align_trace() method and unit tests · 0e3949f2

Erin (Josh) Rigler authored 6 months ago

The method FilterAlgorithm.align_trace() pre-processes the `trace.data` input
array for subsequent processing by FilterAlgorithm.firfilter() such that the
first input array element corresponds to a time step on which output samples
must fall (as defined in the dictionary `step`) minus half the fir window
width. In short, it ensures that output samples fall on desired time steps.

Prior to this fix, it only worked as intended when input trace's starttime fell
on an even time step. A bug became obvious when attempting to filter data from
non-Geomag stations that did not have nice time stamps. This fix addresses that
issue, and also ensures that `align_trace()` does what was claimed in its own
original docstrings, which is to handle trailing misalignments as well.

Note: one thing `align_trace()` does NOT do is ensure that all needed input
data are available to generate desired outputs. The user is responsible for
providing this, but can use the FilterAlgorithm.get_input_interval() method
to calculate the actual required input starttime and endtime. The method
`align_trace()` will trim or pad with NaNs only enough to align time stamps,
and may actually result in `firfilter()` outputs that are NaNs if the input
trace was not adequate.
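
The alignment rule can be illustrated with a simplified, trim-only sketch.
This is not the `align_trace()` implementation: `align_by_trimming()` and all
of its parameters are hypothetical, times are plain integer seconds, and no
NaN padding is attempted.

```python
def align_by_trimming(starttime, delta, data, period, half_width, offset=0):
    """Trim samples so the first and last sit half a window from a step.

    Hypothetical, trim-only sketch in integer seconds. Output steps are
    assumed to fall at offset + k * period, and half_width is half the
    FIR window width in seconds.
    """
    n = len(data)
    endtime = starttime + (n - 1) * delta
    # Seconds forward from starttime to the next t with t + half_width on a step.
    lead = (offset - half_width - starttime) % period
    # Seconds back from endtime to the last t with t - half_width on a step.
    trail = (endtime - half_width - offset) % period
    if lead % delta or trail % delta:
        raise ValueError("samples never land on the required time stamps")
    n_lead = lead // delta
    n_trail = trail // delta
    # Guard against a negative stop index wrapping around on short inputs.
    stop = max(n - n_trail, n_lead)
    return starttime + lead, data[n_lead:stop]


# 200 one-second samples starting at t=7 s, filtered to one-minute steps
# with a 91-sample window (half width 45 s).
samples = list(range(200))
new_start, aligned = align_by_trimming(7, 1, samples, period=60, half_width=45)
```

In this example the first kept sample is at 15 s (15 + 45 = 60) and the last
is at 165 s (165 - 45 = 120), so the trimmed input supports output samples at
60 s and 120 s. As the note above says, alignment alone does not guarantee
that enough input exists for any output.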

0e3949f2

Sep 11, 2024

Changes to AdjustedAlgorithm class · 24fa5d71

Erin (Josh) Rigler authored 9 months ago

- no longer forces default inchannels and outchannels, and lets the
  AdjustedMatrix.process() method "do the right thing" if neither is
  explicitly set by the user
- some mostly aesthetic changes that more cleanly separate the vector
  and scalar (F) adjustments
- modified AdjustedAlgorithm.can_produce_data():
  a. if any non-F inchannels cannot produce data, return False
  b. if F inchannel can produce data, return True as long as all
     non-F inchannels can also produce data
  c. if this is not desirable (e.g., you want to treat non-F and F
     channels independently), it is necessary to create two instances
     of AdjustedAlgorithm, one for non-F inchannels, and one for F.
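
A minimal sketch of rules a-c, for illustration only. The function name and
its arguments are hypothetical, and two behaviors are assumptions that the
list above does not spell out: a missing F does not block output when all
non-F inchannels are available, and an F-only instance depends on F alone.

```python
def can_produce_adjusted(inchannels, available):
    """Decide whether adjusted data can be produced (hypothetical sketch).

    inchannels: channel names the algorithm was configured with.
    available: set of channel names that can currently produce data.
    """
    non_f = [channel for channel in inchannels if channel != "F"]
    if not non_f:
        # Assumption: an F-only instance (rule c) depends on F alone.
        return all(channel in available for channel in inchannels)
    # Rule a: any missing non-F inchannel blocks output.
    # Rule b: F counts only when all non-F inchannels are also available.
    # Assumption: a missing F by itself does not block output.
    return all(channel in available for channel in non_f)


vector_and_f = ["H", "E", "Z", "F"]
all_present = can_produce_adjusted(vector_and_f, {"H", "E", "Z", "F"})
z_missing = can_produce_adjusted(vector_and_f, {"H", "E", "F"})
f_only_present = can_produce_adjusted(["F"], {"F"})
f_only_missing = can_produce_adjusted(["F"], set())
```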

24fa5d71

Aug 16, 2024
- Ran lint · 3f05d06c
  Wilbur, Spencer Franklin authored 9 months ago
  
  3f05d06c
- Added several changes to allow for users to retrieve one minute and one hour... · 9ebcc9f6
  Wilbur, Spencer Franklin authored 9 months ago
  
  Added several changes to allow for users to retrieve one minute and one hour data from the available one-second data via the FDSN client.
  9ebcc9f6
May 28, 2024

Add can_produce_data() method to SqDistAlgorithm · 2bd15768

Erin (Josh) Rigler authored 10 months ago

This simply returns True from can_produce_data(), which is appropriate
because 1) a stateful algorithm should be able to produce data as long
as it starts from a valid state, and 2) SqDistAlgorithm itself checks
this state. If we ever implement a different stateful algorithm, it
should do something similar.

2bd15768

Apr 22, 2024
- Black re-reformatting · 81bc3f8b
  Geels, Brendan Ryan authored 11 months ago
  
  81bc3f8b
Apr 17, 2024
- Black re-reformatting · 93a7050c
  Geels, Brendan Ryan authored 11 months ago
  
  93a7050c
Dec 08, 2022

Set min_count_end to starttime - delta · e5740ed3

Erin (Josh) Rigler authored 2 years ago

I was lazy the first time and didn't subtract delta from starttime
when setting the default min_count_end in the configure method. This
led to "flakiness" in Dst when long gaps were encountered in input
observatories...exactly the kind of issue min_count_end was intended
to mitigate. It should work correctly now.

e5740ed3

Nov 22, 2022

Small fix requiring a length-1 list to be indexed · 5e1ca4cc

Erin (Josh) Rigler authored 2 years ago

Writing unit tests for Controller configurations is tricky, so I
didn't do it, and this tiny-but-impactful bug slipped in. I did
test this on the staging server this time, and the processing pipeline
should work after this is deployed.

5e1ca4cc

Now process inchannels, not outchannels · e7f9f8ef

Erin (Josh) Rigler authored 2 years ago

The original version of AverageAlgorithm.py seemed to be written
to process the Controller's `--outchannels`. It "worked" for several
years because `--outchannels` defaulted to `--inchannels`. But what
we actually always wanted was to process `--inchannels`.

Mostly this was just a matter of semantics, but when I recently added
the ability to average fewer than the full complement of inputs, it
was also necessary to change the `can_produce_data()` method to check
for "any" instead of "all" channels, which in turn required that the
AverageAlgorithm class properly instantiate its `_inchannels` and
`_outchannels` class variables, instead of just setting them to `None`.

To briefly explain the different changes in this commit:

- added `Algorithm.__init__(self, inchannels=[channel])` to ensure that
  can_produce_data() would work when run via programmatic interface.
- added `Algorithm.configure(self, arguments)` to AverageAlgorithm.configure()
  to ensure that can_produce_data() would work when run via Controller.py.
- changed all variations of `outchannel` to `inchannel`
- cleaned up where class variables were modified inside process() method

e7f9f8ef

Nov 18, 2022

Make AverageAlgorithm require **any** inputs · b8523fc4

Erin (Josh) Rigler authored 2 years ago

Recent attempts to change the AverageAlgorithm to allow fewer than the full
complement of inputs when calculating a multi-station average were thwarted
by the default can_produce_data() method in the parent Algorithm class, which
required all inputs to be present, when what we now want is **any** inputs to
be present.
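
The difference between the two checks, as a minimal sketch. The function
names are hypothetical, and the observatory codes are only example inputs.

```python
def can_produce_all(inputs, available):
    """Parent-style check: every input must be present."""
    return all(name in available for name in inputs)


def can_produce_any(inputs, available):
    """Check wanted here: at least one input must be present."""
    return any(name in available for name in inputs)


observatories = ["HON", "SJG", "HER", "KAK"]  # example inputs only
available = {"HON", "KAK"}  # two of the four inputs have data

strict = can_produce_all(observatories, available)
relaxed = can_produce_any(observatories, available)
```

With two of four inputs present, `strict` is False and `relaxed` is True.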

b8523fc4

Nov 16, 2022
- Don't modify algorithm state inside process() method · 219be6e5
  Erin (Josh) Rigler authored 2 years ago
  
  219be6e5
Nov 15, 2022
- Better way to create array of UTCDateTimes · d20c1df9
  Erin (Josh) Rigler authored 2 years ago
  
  d20c1df9
Nov 14, 2022

Use self.observatories to determine number of inputs · 0dcdead5

Erin (Josh) Rigler authored 2 years ago

Previously used the number of traces in the input Stream. This could
be fewer than the number of desired observatories if a given input's
data was completely missing.

0dcdead5

Add start/end options to AverageAlgorithm's min_count · c946f502

Erin (Josh) Rigler authored 2 years ago

The recently added `min_count` option to AverageAlgorithm.py leads to some
undesirable behavior when realtime data, with asynchronous inputs, are being
processed. By adding the ability to specify an interval over which `min_count`
is applied, some of this undesirable behavior can be mitigated.

In particular, if the `realtime` option is specified via the controller, and
`min_count` is defined, the minimum number of inputs will be allowed only for
time steps prior to `(UTCDateTime.now() - realtime)`; the full complement of
inputs will be required to calculate averages more recent than that. One
drawback is that if an input observatory goes offline for an extended period,
the Dst index will be calculated with a persistent lag `realtime` seconds long.

A user can always override this admittedly ad-hoc default behavior using the
`min_count_start` and `min_count_end` options.
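
A rough sketch of how such a window could select the required number of
inputs per time step. `required_count()` is a hypothetical helper, not the
AverageAlgorithm code, and the inclusive bounds are an assumption.

```python
from datetime import datetime, timedelta, timezone


def required_count(t, n_inputs, min_count, min_count_start, min_count_end):
    """Inputs required at time t (hypothetical helper).

    Inside [min_count_start, min_count_end] the relaxed min_count applies;
    outside it the full complement of inputs is required. Treating both
    bounds as inclusive is an assumption.
    """
    if min_count_start <= t <= min_count_end:
        return min_count
    return n_inputs


# A fixed stand-in for UTCDateTime.now(), so the example is repeatable.
now = datetime(2022, 11, 14, 12, 0, 0, tzinfo=timezone.utc)
realtime = timedelta(seconds=600)
window_start = now - timedelta(days=1)
window_end = now - realtime  # relaxed rule ends `realtime` seconds ago

older = required_count(now - timedelta(hours=1), 4, 2, window_start, window_end)
recent = required_count(now - timedelta(seconds=60), 4, 2, window_start, window_end)
```

Here the hour-old step needs only 2 of 4 inputs, while the minute-old step
still needs all 4.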

c946f502

Nov 09, 2022

Add --average-min-count option to AverageAlgorithm · 4d3d8bd5

Erin (Josh) Rigler authored 2 years ago

- added a `min_count` keyword to the AverageAlgorithm's __init__() method;
- added a --average-min-count option via the add_arguments() classmethod;
- set `self.min_count = arguments.average_min_count` in configure() method;
- refactored process() method to respect `min_count`, but now it defaults
  to a requirement that all inputs be valid (this modified a recent merge
  by @awernle that only required that any inputs be valid);
- modified AverageAlgorithm_test.py to properly assess things in light
  of the change to default behavior just mentioned.
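
The `min_count` rule can be sketched in plain Python. This shows only the
counting logic under stated assumptions: `average_with_min_count()` is a
hypothetical name, inputs are equal-length lists with NaN marking gaps, and
any other processing in the real process() method is ignored.

```python
import math


def average_with_min_count(series, min_count=None):
    """Average inputs per time step, requiring min_count valid values.

    Hypothetical sketch. series is a list of equal-length lists, one per
    input, with NaN marking missing samples. min_count defaults to the
    number of inputs, so by default every input must be valid.
    """
    if min_count is None:
        min_count = len(series)
    averages = []
    for values in zip(*series):
        valid = [value for value in values if not math.isnan(value)]
        if valid and len(valid) >= min_count:
            averages.append(sum(valid) / len(valid))
        else:
            averages.append(math.nan)
    return averages


inputs = [
    [1.0, math.nan, 3.0],
    [3.0, 5.0, math.nan],
]
strict = average_with_min_count(inputs)  # all inputs required
relaxed = average_with_min_count(inputs, min_count=1)  # any input suffices
```

With the default, only the first time step yields a value; with
`min_count=1`, every step with at least one valid input is averaged.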

4d3d8bd5

Nov 02, 2022
- Renamed variables and put into paragraphs. Asserted that the count correctly... · aa8f6ab2
  Wernle, Alexandra Nicole authored 2 years ago
  
  Renamed variables and put into paragraphs. Asserted that the count correctly shows where there are fewer input data.
  aa8f6ab2
- Fixed linting errors · 0edcd320
  Wernle, Alexandra Nicole authored 2 years ago
  
  0edcd320
- Changed dst_tot to calculate average regardless of nan values and added a new... · 69637b4d
  Wernle, Alexandra Nicole authored 2 years ago
  
  Changed dst_tot to calculate average regardless of nan values and added a new stream that counts available observatories per timestep
  69637b4d
Mar 18, 2022
- Black formatting updates for ** operation, remove typing_extensions now that python3.8+ · 92254a3b
  Jeremy M Fee authored 3 years ago
  
  92254a3b
Sep 16, 2021
- Check for E-Field channels in can_produce_data · fdf2d967
  Cain, Payton David authored 3 years ago
  
  fdf2d967
Mar 26, 2021
- cast to list in adjusted package, store reading times in comment · b08a989e
  Cain, Payton David authored 4 years ago
  
  b08a989e
Mar 15, 2021
- Make algorithm an optional parameter for run methods · af8ea2e6
  Cain, Payton David authored 4 years ago
  
  af8ea2e6
Mar 10, 2021
- Default algorithm's matrix to identity with no statefile · d22e983b
  Cain, Payton David authored 4 years ago
  
  d22e983b
Mar 03, 2021
- Raise value error for bad readings, set endtime with field, update json parsing · 2f74ece5
  Cain, Payton David authored 4 years ago
  
  2f74ece5
