Build an MTH5 and Operate the Aurora Pipeline

This notebook pulls MT miniSEED data from the IRIS Dataselect web service and produces MTH5 out of it. It outlines the process of making an MTH5 file, generating a processing config, and running the Aurora processor.

It assumes that aurora, mth5, and mt_metadata have all been installed.

In this “new” version, the workflow has changed somewhat.

  1. The process_mth5 call works with a dataset dataframe, rather than a single run_id

  2. The config object is now based on the mt_metadata.base Base class

  3. Remote reference processing is supported (at least in theory)

Flow of this notebook

Section 1: Here we do imports and construct a table of the data that we will access to build the mth5. Note that there is no explanation here as to the table source – a future update can show how to create such a table from IRIS data_availability tools

Seciton 2: the metadata and the data are accessed, and the mth5 is created and stored.

Section 3:

[1]:
# Required imports for the program.
from pathlib import Path
import pandas as pd
from mth5.clients.make_mth5 import MakeMTH5
from mth5 import mth5, timeseries
from mt_metadata.utils.mttime import get_now_utc, MTime
from aurora.config import BANDS_DEFAULT_FILE
from aurora.config.config_creator import ConfigCreator
from aurora.pipelines.process_mth5 import process_mth5
from aurora.tf_kernel.dataset import DatasetDefinition
from aurora.tf_kernel.helpers import extract_run_summary_from_mth5
2022-06-03 17:47:11,454 [line 135] mth5.setup_logger - INFO: Logging file can be found /home/kkappler/software/irismt/mth5/logs/mth5_debug.log

Build an MTH5 file from information extracted by IRIS

[2]:
# Set path so MTH5 file builds to current working directory.
default_path = Path().cwd()
default_path
[2]:
PosixPath('/home/kkappler/software/irismt/aurora/docs/notebooks')
[3]:
# Initialize the Make MTH5 code.
m = MakeMTH5(mth5_version='0.1.0')
m.client = "IRIS"

Specify the data to access from IRIS

Note that here we explicitly prescribe the data, but this dataframe could be built from IRIS data availability tools in a programatic way

[4]:
# Generate data frame of FDSN Network, Station, Location, Channel, Startime, Endtime codes of interest

CAS04LQE = ['8P', 'CAS04', '', 'LQE', '2020-06-02T19:00:00', '2020-07-13T19:00:00']
CAS04LQN = ['8P', 'CAS04', '', 'LQN', '2020-06-02T19:00:00', '2020-07-13T19:00:00']
CAS04BFE = ['8P', 'CAS04', '', 'LFE', '2020-06-02T19:00:00', '2020-07-13T19:00:00']
CAS04BFN = ['8P', 'CAS04', '', 'LFN', '2020-06-02T19:00:00', '2020-07-13T19:00:00']
CAS04BFZ = ['8P', 'CAS04', '', 'LFZ', '2020-06-02T19:00:00', '2020-07-13T19:00:00']

request_list = [CAS04LQE, CAS04LQN, CAS04BFE, CAS04BFN, CAS04BFZ]

# Turn list into dataframe
request_df =  pd.DataFrame(request_list, columns=m.column_names)
[5]:
# Inspect the dataframe
print(request_df)
  network station location channel                start                  end
0      8P   CAS04              LQE  2020-06-02T19:00:00  2020-07-13T19:00:00
1      8P   CAS04              LQN  2020-06-02T19:00:00  2020-07-13T19:00:00
2      8P   CAS04              LFE  2020-06-02T19:00:00  2020-07-13T19:00:00
3      8P   CAS04              LFN  2020-06-02T19:00:00  2020-07-13T19:00:00
4      8P   CAS04              LFZ  2020-06-02T19:00:00  2020-07-13T19:00:00
[6]:
# Request the inventory information from IRIS
inventory = m.get_inventory_from_df(request_df, data=False)
[7]:
# Inspect the inventory
inventory
[7]:
(Inventory created at 2022-06-04T00:47:20.278242Z
        Created by: ObsPy 1.2.2
                    https://www.obspy.org
        Sending institution: MTH5
        Contains:
                Networks (1):
                        8P
                Stations (1):
                        8P.CAS04 (Corral Hollow, CA, USA)
                Channels (5):
                        8P.CAS04..LFZ, 8P.CAS04..LFN, 8P.CAS04..LFE, 8P.CAS04..LQN,
                        8P.CAS04..LQE,
 0 Trace(s) in Stream:
)

Builds an MTH5 file from the user defined database. Note: Intact keeps the MTH5 open after it is done building

With the mth5 object set, we are ready to actually request the data from the fdsn client (IRIS) and save it to an MTH5 file. This process builds an MTH5 file and can take some time depending on how much data is requested.

Note: interact keeps the MTH5 open after it is done building

[8]:
interact = True
mth5_object = m.make_mth5_from_fdsnclient(request_df, interact=interact)
2022-06-03 17:47:31,516 [line 591] mth5.mth5.MTH5.open_mth5 - WARNING: 8P_CAS04.h5 will be overwritten in 'w' mode
2022-06-03 17:47:31,939 [line 656] mth5.mth5.MTH5._initialize_file - INFO: Initialized MTH5 0.1.0 file /home/kkappler/software/irismt/aurora/docs/notebooks/8P_CAS04.h5 in mode w
2022-06-03 17:48:28,522 [line 121] mt_metadata.base.metadata.station.add_run - WARNING: Run a is being overwritten with current information
2022-06-03T17:48:28 [line 136] obspy_stages.create_filter_from_stage - INFO: Converting PoleZerosResponseStage electric_si_units to a CoefficientFilter
2022-06-03T17:48:28 [line 136] obspy_stages.create_filter_from_stage - INFO: Converting PoleZerosResponseStage electric_dipole_92.000 to a CoefficientFilter
2022-06-03T17:48:28 [line 136] obspy_stages.create_filter_from_stage - INFO: Converting PoleZerosResponseStage electric_si_units to a CoefficientFilter
2022-06-03T17:48:28 [line 136] obspy_stages.create_filter_from_stage - INFO: Converting PoleZerosResponseStage electric_dipole_92.000 to a CoefficientFilter
More or less runs have been requested by the user than are defined in the metadata. Runs will be defined but only the requested run extents contain time series data based on the users request.
2022-06-03 17:48:31,009 [line 224] mth5.timeseries.run_ts.RunTS.validate_metadata - WARNING: start time of dataset 2020-06-02T19:00:00+00:00 does not match metadata start 2020-06-02T18:41:43+00:00 updating metatdata value to 2020-06-02T19:00:00+00:00
2022-06-03 17:48:51,497 [line 462] mth5.timeseries.run_ts.RunTS.from_obspy_stream - WARNING: could not find ey
2022-06-03 17:49:04,164 [line 462] mth5.timeseries.run_ts.RunTS.from_obspy_stream - WARNING: could not find ex
2022-06-03 17:49:04,272 [line 462] mth5.timeseries.run_ts.RunTS.from_obspy_stream - WARNING: could not find ey
2022-06-03 17:49:04,770 [line 235] mth5.timeseries.run_ts.RunTS.validate_metadata - WARNING: end time of dataset 2020-07-13T19:00:00+00:00 does not match metadata end 2020-07-13T21:46:12+00:00 updating metatdata value to 2020-07-13T19:00:00+00:00

Examine and Update the MTH5 object

With the open MTH5 Object, we can start to examine what is in it. For example, we retrieve the filename and file_version. You can additionally do things such as getting the station information and edit it by setting a new value, in this case the declination model.

[9]:
mth5_object
[9]:
/:
====================
    |- Group: Survey
    ----------------
        |- Group: Filters
        -----------------
            |- Group: coefficient
            ---------------------
                |- Group: electric_analog_to_digital
                ------------------------------------
                |- Group: electric_dipole_92.000
                --------------------------------
                |- Group: electric_si_units
                ---------------------------
                |- Group: magnetic_analog_to_digital
                ------------------------------------
            |- Group: fap
            -------------
            |- Group: fir
            -------------
            |- Group: time_delay
            --------------------
                |- Group: electric_time_offset
                ------------------------------
                |- Group: hx_time_offset
                ------------------------
                |- Group: hy_time_offset
                ------------------------
                |- Group: hz_time_offset
                ------------------------
            |- Group: zpk
            -------------
                |- Group: electric_butterworth_high_pass
                ----------------------------------------
                    --> Dataset: poles
                    ....................
                    --> Dataset: zeros
                    ....................
                |- Group: electric_butterworth_low_pass
                ---------------------------------------
                    --> Dataset: poles
                    ....................
                    --> Dataset: zeros
                    ....................
                |- Group: magnetic_butterworth_low_pass
                ---------------------------------------
                    --> Dataset: poles
                    ....................
                    --> Dataset: zeros
                    ....................
        |- Group: Reports
        -----------------
        |- Group: Standards
        -------------------
            --> Dataset: summary
            ......................
        |- Group: Stations
        ------------------
            |- Group: CAS04
            ---------------
                |- Group: Transfer_Functions
                ----------------------------
                |- Group: a
                -----------
                    --> Dataset: ex
                    .................
                    --> Dataset: ey
                    .................
                    --> Dataset: hx
                    .................
                    --> Dataset: hy
                    .................
                    --> Dataset: hz
                    .................
                |- Group: b
                -----------
                    --> Dataset: ex
                    .................
                    --> Dataset: ey
                    .................
                    --> Dataset: hx
                    .................
                    --> Dataset: hy
                    .................
                    --> Dataset: hz
                    .................
                |- Group: c
                -----------
                    --> Dataset: ex
                    .................
                    --> Dataset: ey
                    .................
                    --> Dataset: hx
                    .................
                    --> Dataset: hy
                    .................
                    --> Dataset: hz
                    .................
                |- Group: d
                -----------
                    --> Dataset: ex
                    .................
                    --> Dataset: ey
                    .................
                    --> Dataset: hx
                    .................
                    --> Dataset: hy
                    .................
                    --> Dataset: hz
                    .................
        --> Dataset: channel_summary
        ..............................
        --> Dataset: tf_summary
        .........................

Need documentation:

Is the cell below required? Or is it an example of some metadata specification?

[10]:
# Edit and update the MTH5 metadata
s = mth5_object.get_station("CAS04")
print(s.metadata.location.declination.model)
s.metadata.location.declination.model = 'IGRF'
print(s.metadata.location.declination.model)
s.write_metadata()    # writes to file mth5_filename
IGRF-13
IGRF
[11]:
# Collect information from the MTh5 Object and use it in the config files.
mth5_filename = mth5_object.filename
version = mth5_object.file_version
print(mth5_filename)
/home/kkappler/software/irismt/aurora/docs/notebooks/8P_CAS04.h5
[12]:
# Get the available stations and runs from the MTH5 object
mth5_object.channel_summary.summarize()
ch_summary = mth5_object.channel_summary.to_dataframe()
ch_summary
[12]:
survey station run latitude longitude elevation component start end n_samples sample_rate measurement_type azimuth tilt units hdf5_reference run_hdf5_reference station_hdf5_reference
0 CONUS South CAS04 a 37.633351 -121.468382 329.3875 ex 2020-06-02 19:00:00+00:00 2020-06-02 22:07:46+00:00 11267 1.0 electric 13.2 0.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
1 CONUS South CAS04 a 37.633351 -121.468382 329.3875 ey 2020-06-02 19:00:00+00:00 2020-06-02 22:07:46+00:00 11267 1.0 electric 103.2 0.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
2 CONUS South CAS04 a 37.633351 -121.468382 329.3875 hx 2020-06-02 19:00:00+00:00 2020-06-02 22:07:46+00:00 11267 1.0 magnetic 13.2 0.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
3 CONUS South CAS04 a 37.633351 -121.468382 329.3875 hy 2020-06-02 19:00:00+00:00 2020-06-02 22:07:46+00:00 11267 1.0 magnetic 103.2 0.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
4 CONUS South CAS04 a 37.633351 -121.468382 329.3875 hz 2020-06-02 19:00:00+00:00 2020-06-02 22:07:46+00:00 11267 1.0 magnetic 0.0 90.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
5 CONUS South CAS04 b 37.633351 -121.468382 329.3875 ex 2020-06-02 22:24:55+00:00 2020-06-12 17:52:23+00:00 847649 1.0 electric 13.2 0.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
6 CONUS South CAS04 b 37.633351 -121.468382 329.3875 ey 2020-06-02 22:24:55+00:00 2020-06-12 17:52:23+00:00 847649 1.0 electric 103.2 0.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
7 CONUS South CAS04 b 37.633351 -121.468382 329.3875 hx 2020-06-02 22:24:55+00:00 2020-06-12 17:52:23+00:00 847649 1.0 magnetic 13.2 0.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
8 CONUS South CAS04 b 37.633351 -121.468382 329.3875 hy 2020-06-02 22:24:55+00:00 2020-06-12 17:52:23+00:00 847649 1.0 magnetic 103.2 0.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
9 CONUS South CAS04 b 37.633351 -121.468382 329.3875 hz 2020-06-02 22:24:55+00:00 2020-06-12 17:52:23+00:00 847649 1.0 magnetic 0.0 90.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
10 CONUS South CAS04 c 37.633351 -121.468382 329.3875 ex 2020-06-12 18:32:17+00:00 2020-07-01 17:32:59+00:00 1638043 1.0 electric 13.2 0.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
11 CONUS South CAS04 c 37.633351 -121.468382 329.3875 ey 2020-06-12 18:32:17+00:00 2020-07-01 17:32:59+00:00 1638043 1.0 electric 0.0 0.0 counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
12 CONUS South CAS04 c 37.633351 -121.468382 329.3875 hx 2020-06-12 18:32:17+00:00 2020-07-01 17:32:59+00:00 1638043 1.0 magnetic 13.2 0.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
13 CONUS South CAS04 c 37.633351 -121.468382 329.3875 hy 2020-06-12 18:32:17+00:00 2020-07-01 17:32:59+00:00 1638043 1.0 magnetic 103.2 0.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
14 CONUS South CAS04 c 37.633351 -121.468382 329.3875 hz 2020-06-12 18:32:17+00:00 2020-07-01 17:32:59+00:00 1638043 1.0 magnetic 0.0 90.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
15 CONUS South CAS04 d 37.633351 -121.468382 329.3875 ex 2020-07-01 19:36:55+00:00 2020-07-13 19:00:00+00:00 1034586 1.0 electric 0.0 0.0 counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
16 CONUS South CAS04 d 37.633351 -121.468382 329.3875 ey 2020-07-01 19:36:55+00:00 2020-07-13 19:00:00+00:00 1034586 1.0 electric 0.0 0.0 counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
17 CONUS South CAS04 d 37.633351 -121.468382 329.3875 hx 2020-07-01 19:36:55+00:00 2020-07-13 19:00:00+00:00 1034586 1.0 magnetic 13.2 0.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
18 CONUS South CAS04 d 37.633351 -121.468382 329.3875 hy 2020-07-01 19:36:55+00:00 2020-07-13 19:00:00+00:00 1034586 1.0 magnetic 103.2 0.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
19 CONUS South CAS04 d 37.633351 -121.468382 329.3875 hz 2020-07-01 19:36:55+00:00 2020-07-13 19:00:00+00:00 1034586 1.0 magnetic 0.0 90.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>

If MTH5 file already exists you can start here

If you dont want to execute the previous code to get data again

[4]:
#interact = False
if interact:
    pass
else:
    h5_path = default_path.joinpath("8P_CAS04.h5")
    mth5_object = mth5.MTH5(file_version="0.1.0")
    #mth5_object.open_mth5(config.stations.local.mth5_path, mode="a")
    mth5_object.open_mth5(h5_path, mode="a")
    ch_summary = mth5_object.channel_summary.to_dataframe()


Generate an Aurora Configuration file using MTH5 as an input

Up to this point, we have used mth5 and mt_metadata, but haven’t yet used aurora. So we will use the MTH5 that we just created (and examined and updated) as input into Aurora.

First, we get a list of the available runs to process from the MTH5

[57]:
ch_summary
[57]:
survey station run latitude longitude elevation component start end n_samples sample_rate measurement_type azimuth tilt units hdf5_reference run_hdf5_reference station_hdf5_reference
0 CONUS South CAS04 a 37.633351 -121.468382 329.3875 ex 2020-06-02 19:00:00+00:00 2020-06-02 22:07:46+00:00 11267 1.0 electric 13.2 0.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
1 CONUS South CAS04 a 37.633351 -121.468382 329.3875 ey 2020-06-02 19:00:00+00:00 2020-06-02 22:07:46+00:00 11267 1.0 electric 103.2 0.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
2 CONUS South CAS04 a 37.633351 -121.468382 329.3875 hx 2020-06-02 19:00:00+00:00 2020-06-02 22:07:46+00:00 11267 1.0 magnetic 13.2 0.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
3 CONUS South CAS04 a 37.633351 -121.468382 329.3875 hy 2020-06-02 19:00:00+00:00 2020-06-02 22:07:46+00:00 11267 1.0 magnetic 103.2 0.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
4 CONUS South CAS04 a 37.633351 -121.468382 329.3875 hz 2020-06-02 19:00:00+00:00 2020-06-02 22:07:46+00:00 11267 1.0 magnetic 0.0 90.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
5 CONUS South CAS04 b 37.633351 -121.468382 329.3875 ex 2020-06-02 22:24:55+00:00 2020-06-12 17:52:23+00:00 847649 1.0 electric 13.2 0.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
6 CONUS South CAS04 b 37.633351 -121.468382 329.3875 ey 2020-06-02 22:24:55+00:00 2020-06-12 17:52:23+00:00 847649 1.0 electric 103.2 0.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
7 CONUS South CAS04 b 37.633351 -121.468382 329.3875 hx 2020-06-02 22:24:55+00:00 2020-06-12 17:52:23+00:00 847649 1.0 magnetic 13.2 0.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
8 CONUS South CAS04 b 37.633351 -121.468382 329.3875 hy 2020-06-02 22:24:55+00:00 2020-06-12 17:52:23+00:00 847649 1.0 magnetic 103.2 0.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
9 CONUS South CAS04 b 37.633351 -121.468382 329.3875 hz 2020-06-02 22:24:55+00:00 2020-06-12 17:52:23+00:00 847649 1.0 magnetic 0.0 90.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
10 CONUS South CAS04 c 37.633351 -121.468382 329.3875 ex 2020-06-12 18:32:17+00:00 2020-07-01 17:32:59+00:00 1638043 1.0 electric 13.2 0.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
11 CONUS South CAS04 c 37.633351 -121.468382 329.3875 ey 2020-06-12 18:32:17+00:00 2020-07-01 17:32:59+00:00 1638043 1.0 electric 0.0 0.0 counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
12 CONUS South CAS04 c 37.633351 -121.468382 329.3875 hx 2020-06-12 18:32:17+00:00 2020-07-01 17:32:59+00:00 1638043 1.0 magnetic 13.2 0.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
13 CONUS South CAS04 c 37.633351 -121.468382 329.3875 hy 2020-06-12 18:32:17+00:00 2020-07-01 17:32:59+00:00 1638043 1.0 magnetic 103.2 0.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
14 CONUS South CAS04 c 37.633351 -121.468382 329.3875 hz 2020-06-12 18:32:17+00:00 2020-07-01 17:32:59+00:00 1638043 1.0 magnetic 0.0 90.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
15 CONUS South CAS04 d 37.633351 -121.468382 329.3875 ex 2020-07-01 19:36:55+00:00 2020-07-13 19:00:00+00:00 1034586 1.0 electric 0.0 0.0 counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
16 CONUS South CAS04 d 37.633351 -121.468382 329.3875 ey 2020-07-01 19:36:55+00:00 2020-07-13 19:00:00+00:00 1034586 1.0 electric 0.0 0.0 counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
17 CONUS South CAS04 d 37.633351 -121.468382 329.3875 hx 2020-07-01 19:36:55+00:00 2020-07-13 19:00:00+00:00 1034586 1.0 magnetic 13.2 0.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
18 CONUS South CAS04 d 37.633351 -121.468382 329.3875 hy 2020-07-01 19:36:55+00:00 2020-07-13 19:00:00+00:00 1034586 1.0 magnetic 103.2 0.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
19 CONUS South CAS04 d 37.633351 -121.468382 329.3875 hz 2020-07-01 19:36:55+00:00 2020-07-13 19:00:00+00:00 1034586 1.0 magnetic 0.0 90.0 digital counts <HDF5 object reference> <HDF5 object reference> <HDF5 object reference>
[17]:
available_runs = ch_summary.run.unique()
sr = ch_summary.sample_rate.unique()
if len(sr) != 1:
    print('Only one sample rate per run is available')
available_stations = ch_summary.station.unique()
[7]:
mth5_object.filename

[7]:
PosixPath('/home/kkappler/software/irismt/aurora/docs/notebooks/8P_CAS04.h5')
[ ]:
available_stations[0]

Now, we condense the channel summary into a run_summary.

This method takes an iterable of mth5_objs or h5_paths

If the mth5 file is closed, you can get the run summary this way

[15]:
from aurora.tf_kernel.helpers import extract_run_summaries_from_mth5s
h5_path = default_path.joinpath("8P_CAS04.h5")
run_summary = extract_run_summaries_from_mth5s([h5_path,])
2022-06-03 17:53:53,098 [line 731] mth5.mth5.MTH5.close_mth5 - INFO: Flushing and closing /home/kkappler/software/irismt/aurora/docs/notebooks/8P_CAS04.h5

Otherwise, work from the mth5_object

[16]:
run_summary = extract_run_summary_from_mth5(mth5_object)

[17]:
print(run_summary.columns)
Index(['station_id', 'run_id', 'start', 'end', 'sample_rate', 'input_channels',
       'output_channels', 'channel_scale_factors', 'mth5_path'],
      dtype='object')
[18]:
print(run_summary[['station_id', 'run_id', 'start', 'end', ]])
  station_id run_id                     start                       end
0      CAS04      a 2020-06-02 19:00:00+00:00 2020-06-02 22:07:46+00:00
1      CAS04      b 2020-06-02 22:24:55+00:00 2020-06-12 17:52:23+00:00
2      CAS04      c 2020-06-12 18:32:17+00:00 2020-07-01 17:32:59+00:00
3      CAS04      d 2020-07-01 19:36:55+00:00 2020-07-13 19:00:00+00:00

There are no reference stations, set all rows to False

[19]:
run_summary["remote"] = False

Make an aurora configuration file (and then save that json file.)

[20]:
    cc = ConfigCreator()#config_path=CONFIG_PATH)
    config = cc.create_run_processing_object(emtf_band_file=BANDS_DEFAULT_FILE,
                                        sample_rate=sr[0]
                                        )
    config.stations.from_dataset_dataframe(run_summary)
    for decimation in config.decimations:
        decimation.estimator.engine = "RME"

Take a look at the config:

[21]:
config
[21]:
{
    "processing": {
        "decimations": [
            {
                "decimation_level": {
                    "anti_alias_filter": "default",
                    "bands": [
                        {
                            "band": {
                                "decimation_level": 0,
                                "frequency_max": 0,
                                "frequency_min": 0,
                                "index_max": 30,
                                "index_min": 25
                            }
                        },
                        {
                            "band": {
                                "decimation_level": 0,
                                "frequency_max": 0,
                                "frequency_min": 0,
                                "index_max": 24,
                                "index_min": 20
                            }
                        },
                        {
                            "band": {
                                "decimation_level": 0,
                                "frequency_max": 0,
                                "frequency_min": 0,
                                "index_max": 19,
                                "index_min": 16
                            }
                        },
                        {
                            "band": {
                                "decimation_level": 0,
                                "frequency_max": 0,
                                "frequency_min": 0,
                                "index_max": 15,
                                "index_min": 13
                            }
                        },
                        {
                            "band": {
                                "decimation_level": 0,
                                "frequency_max": 0,
                                "frequency_min": 0,
                                "index_max": 12,
                                "index_min": 10
                            }
                        },
                        {
                            "band": {
                                "decimation_level": 0,
                                "frequency_max": 0,
                                "frequency_min": 0,
                                "index_max": 9,
                                "index_min": 8
                            }
                        },
                        {
                            "band": {
                                "decimation_level": 0,
                                "frequency_max": 0,
                                "frequency_min": 0,
                                "index_max": 7,
                                "index_min": 6
                            }
                        },
                        {
                            "band": {
                                "decimation_level": 0,
                                "frequency_max": 0,
                                "frequency_min": 0,
                                "index_max": 5,
                                "index_min": 5
                            }
                        }
                    ],
                    "decimation.factor": 1.0,
                    "decimation.level": 0,
                    "decimation.method": "default",
                    "decimation.sample_rate": 1.0,
                    "estimator.engine": "RME",
                    "estimator.estimate_per_channel": true,
                    "extra_pre_fft_detrend_type": "linear",
                    "input_channels": [
                        "hx",
                        "hy"
                    ],
                    "output_channels": [
                        "hz",
                        "ex",
                        "ey"
                    ],
                    "prewhitening_type": "first difference",
                    "reference_channels": [
                        "hx",
                        "hy"
                    ],
                    "regression.max_iterations": 10,
                    "regression.max_redescending_iterations": 2,
                    "regression.minimum_cycles": 10,
                    "window.num_samples": 128,
                    "window.overlap": 32,
                    "window.type": "boxcar"
                }
            },
            {
                "decimation_level": {
                    "anti_alias_filter": "default",
                    "bands": [
                        {
                            "band": {
                                "decimation_level": 1,
                                "frequency_max": 0,
                                "frequency_min": 0,
                                "index_max": 17,
                                "index_min": 14
                            }
                        },
                        {
                            "band": {
                                "decimation_level": 1,
                                "frequency_max": 0,
                                "frequency_min": 0,
                                "index_max": 13,
                                "index_min": 11
                            }
                        },
                        {
                            "band": {
                                "decimation_level": 1,
                                "frequency_max": 0,
                                "frequency_min": 0,
                                "index_max": 10,
                                "index_min": 9
                            }
                        },
                        {
                            "band": {
                                "decimation_level": 1,
                                "frequency_max": 0,
                                "frequency_min": 0,
                                "index_max": 8,
                                "index_min": 7
                            }
                        },
                        {
                            "band": {
                                "decimation_level": 1,
                                "frequency_max": 0,
                                "frequency_min": 0,
                                "index_max": 6,
                                "index_min": 6
                            }
                        },
                        {
                            "band": {
                                "decimation_level": 1,
                                "frequency_max": 0,
                                "frequency_min": 0,
                                "index_max": 5,
                                "index_min": 5
                            }
                        }
                    ],
                    "decimation.factor": 4.0,
                    "decimation.level": 1,
                    "decimation.method": "default",
                    "decimation.sample_rate": 0.25,
                    "estimator.engine": "RME",
                    "estimator.estimate_per_channel": true,
                    "extra_pre_fft_detrend_type": "linear",
                    "input_channels": [
                        "hx",
                        "hy"
                    ],
                    "output_channels": [
                        "hz",
                        "ex",
                        "ey"
                    ],
                    "prewhitening_type": "first difference",
                    "reference_channels": [
                        "hx",
                        "hy"
                    ],
                    "regression.max_iterations": 10,
                    "regression.max_redescending_iterations": 2,
                    "regression.minimum_cycles": 10,
                    "window.num_samples": 128,
                    "window.overlap": 32,
                    "window.type": "boxcar"
                }
            },
            {
                "decimation_level": {
                    "anti_alias_filter": "default",
                    "bands": [
                        {
                            "band": {
                                "decimation_level": 2,
                                "frequency_max": 0,
                                "frequency_min": 0,
                                "index_max": 17,
                                "index_min": 14
                            }
                        },
                        {
                            "band": {
                                "decimation_level": 2,
                                "frequency_max": 0,
                                "frequency_min": 0,
                                "index_max": 13,
                                "index_min": 11
                            }
                        },
                        {
                            "band": {
                                "decimation_level": 2,
                                "frequency_max": 0,
                                "frequency_min": 0,
                                "index_max": 10,
                                "index_min": 9
                            }
                        },
                        {
                            "band": {
                                "decimation_level": 2,
                                "frequency_max": 0,
                                "frequency_min": 0,
                                "index_max": 8,
                                "index_min": 7
                            }
                        },
                        {
                            "band": {
                                "decimation_level": 2,
                                "frequency_max": 0,
                                "frequency_min": 0,
                                "index_max": 6,
                                "index_min": 6
                            }
                        },
                        {
                            "band": {
                                "decimation_level": 2,
                                "frequency_max": 0,
                                "frequency_min": 0,
                                "index_max": 5,
                                "index_min": 5
                            }
                        }
                    ],
                    "decimation.factor": 4.0,
                    "decimation.level": 2,
                    "decimation.method": "default",
                    "decimation.sample_rate": 0.0625,
                    "estimator.engine": "RME",
                    "estimator.estimate_per_channel": true,
                    "extra_pre_fft_detrend_type": "linear",
                    "input_channels": [
                        "hx",
                        "hy"
                    ],
                    "output_channels": [
                        "hz",
                        "ex",
                        "ey"
                    ],
                    "prewhitening_type": "first difference",
                    "reference_channels": [
                        "hx",
                        "hy"
                    ],
                    "regression.max_iterations": 10,
                    "regression.max_redescending_iterations": 2,
                    "regression.minimum_cycles": 10,
                    "window.num_samples": 128,
                    "window.overlap": 32,
                    "window.type": "boxcar"
                }
            },
            {
                "decimation_level": {
                    "anti_alias_filter": "default",
                    "bands": [
                        {
                            "band": {
                                "decimation_level": 3,
                                "frequency_max": 0,
                                "frequency_min": 0,
                                "index_max": 22,
                                "index_min": 18
                            }
                        },
                        {
                            "band": {
                                "decimation_level": 3,
                                "frequency_max": 0,
                                "frequency_min": 0,
                                "index_max": 17,
                                "index_min": 14
                            }
                        },
                        {
                            "band": {
                                "decimation_level": 3,
                                "frequency_max": 0,
                                "frequency_min": 0,
                                "index_max": 13,
                                "index_min": 10
                            }
                        },
                        {
                            "band": {
                                "decimation_level": 3,
                                "frequency_max": 0,
                                "frequency_min": 0,
                                "index_max": 9,
                                "index_min": 7
                            }
                        },
                        {
                            "band": {
                                "decimation_level": 3,
                                "frequency_max": 0,
                                "frequency_min": 0,
                                "index_max": 6,
                                "index_min": 5
                            }
                        }
                    ],
                    "decimation.factor": 4.0,
                    "decimation.level": 3,
                    "decimation.method": "default",
                    "decimation.sample_rate": 0.015625,
                    "estimator.engine": "RME",
                    "estimator.estimate_per_channel": true,
                    "extra_pre_fft_detrend_type": "linear",
                    "input_channels": [
                        "hx",
                        "hy"
                    ],
                    "output_channels": [
                        "hz",
                        "ex",
                        "ey"
                    ],
                    "prewhitening_type": "first difference",
                    "reference_channels": [
                        "hx",
                        "hy"
                    ],
                    "regression.max_iterations": 10,
                    "regression.max_redescending_iterations": 2,
                    "regression.minimum_cycles": 10,
                    "window.num_samples": 128,
                    "window.overlap": 32,
                    "window.type": "boxcar"
                }
            }
        ],
        "id": "None-None",
        "stations.local.id": "CAS04",
        "stations.local.mth5_path": "/home/kkappler/software/irismt/aurora/docs/notebooks/8P_CAS04.h5",
        "stations.local.remote": false,
        "stations.local.runs": [
            {
                "run": {
                    "id": "a",
                    "input_channels": [
                        {
                            "channel": {
                                "id": "hx",
                                "scale_factor": 1.0
                            }
                        },
                        {
                            "channel": {
                                "id": "hy",
                                "scale_factor": 1.0
                            }
                        }
                    ],
                    "output_channels": [
                        {
                            "channel": {
                                "id": "ex",
                                "scale_factor": 1.0
                            }
                        },
                        {
                            "channel": {
                                "id": "ey",
                                "scale_factor": 1.0
                            }
                        },
                        {
                            "channel": {
                                "id": "hz",
                                "scale_factor": 1.0
                            }
                        }
                    ],
                    "sample_rate": 1.0,
                    "time_periods": [
                        {
                            "time_period": {
                                "end": "2020-06-02T22:07:46+00:00",
                                "start": "2020-06-02T19:00:00+00:00"
                            }
                        }
                    ]
                }
            },
            {
                "run": {
                    "id": "b",
                    "input_channels": [
                        {
                            "channel": {
                                "id": "hx",
                                "scale_factor": 1.0
                            }
                        },
                        {
                            "channel": {
                                "id": "hy",
                                "scale_factor": 1.0
                            }
                        }
                    ],
                    "output_channels": [
                        {
                            "channel": {
                                "id": "ex",
                                "scale_factor": 1.0
                            }
                        },
                        {
                            "channel": {
                                "id": "ey",
                                "scale_factor": 1.0
                            }
                        },
                        {
                            "channel": {
                                "id": "hz",
                                "scale_factor": 1.0
                            }
                        }
                    ],
                    "sample_rate": 1.0,
                    "time_periods": [
                        {
                            "time_period": {
                                "end": "2020-06-12T17:52:23+00:00",
                                "start": "2020-06-02T22:24:55+00:00"
                            }
                        }
                    ]
                }
            },
            {
                "run": {
                    "id": "c",
                    "input_channels": [
                        {
                            "channel": {
                                "id": "hx",
                                "scale_factor": 1.0
                            }
                        },
                        {
                            "channel": {
                                "id": "hy",
                                "scale_factor": 1.0
                            }
                        }
                    ],
                    "output_channels": [
                        {
                            "channel": {
                                "id": "ex",
                                "scale_factor": 1.0
                            }
                        },
                        {
                            "channel": {
                                "id": "ey",
                                "scale_factor": 1.0
                            }
                        },
                        {
                            "channel": {
                                "id": "hz",
                                "scale_factor": 1.0
                            }
                        }
                    ],
                    "sample_rate": 1.0,
                    "time_periods": [
                        {
                            "time_period": {
                                "end": "2020-07-01T17:32:59+00:00",
                                "start": "2020-06-12T18:32:17+00:00"
                            }
                        }
                    ]
                }
            },
            {
                "run": {
                    "id": "d",
                    "input_channels": [
                        {
                            "channel": {
                                "id": "hx",
                                "scale_factor": 1.0
                            }
                        },
                        {
                            "channel": {
                                "id": "hy",
                                "scale_factor": 1.0
                            }
                        }
                    ],
                    "output_channels": [
                        {
                            "channel": {
                                "id": "ex",
                                "scale_factor": 1.0
                            }
                        },
                        {
                            "channel": {
                                "id": "ey",
                                "scale_factor": 1.0
                            }
                        },
                        {
                            "channel": {
                                "id": "hz",
                                "scale_factor": 1.0
                            }
                        }
                    ],
                    "sample_rate": 1.0,
                    "time_periods": [
                        {
                            "time_period": {
                                "end": "2020-07-13T19:00:00+00:00",
                                "start": "2020-07-01T19:36:55+00:00"
                            }
                        }
                    ]
                }
            }
        ],
        "stations.remote": []
    }
}

Run the Aurora Pipeline using the input MTh5 and Confiugration File

[22]:
dataset_definition = DatasetDefinition()
dataset_definition.df = run_summary
dataset_definition.df
[22]:
station_id run_id start end sample_rate input_channels output_channels channel_scale_factors mth5_path remote
0 CAS04 a 2020-06-02 19:00:00+00:00 2020-06-02 22:07:46+00:00 1.0 [hx, hy] [ex, ey, hz] {'ex': 1.0, 'ey': 1.0, 'hx': 1.0, 'hy': 1.0, '... /home/kkappler/software/irismt/aurora/docs/not... False
1 CAS04 b 2020-06-02 22:24:55+00:00 2020-06-12 17:52:23+00:00 1.0 [hx, hy] [ex, ey, hz] {'ex': 1.0, 'ey': 1.0, 'hx': 1.0, 'hy': 1.0, '... /home/kkappler/software/irismt/aurora/docs/not... False
2 CAS04 c 2020-06-12 18:32:17+00:00 2020-07-01 17:32:59+00:00 1.0 [hx, hy] [ex, ey, hz] {'ex': 1.0, 'ey': 1.0, 'hx': 1.0, 'hy': 1.0, '... /home/kkappler/software/irismt/aurora/docs/not... False
3 CAS04 d 2020-07-01 19:36:55+00:00 2020-07-13 19:00:00+00:00 1.0 [hx, hy] [ex, ey, hz] {'ex': 1.0, 'ey': 1.0, 'hx': 1.0, 'hy': 1.0, '... /home/kkappler/software/irismt/aurora/docs/not... False

Select the runs you want to process

use the run_list variable

[23]:
run_list = ['b',]
new_df = dataset_definition.restrict_runs_by_station("CAS04", run_list, overwrite=True)
print(new_df)
dataset_definition.df
   index station_id run_id                     start  \
0      1      CAS04      b 2020-06-02 22:24:55+00:00

                        end  sample_rate input_channels output_channels  \
0 2020-06-12 17:52:23+00:00          1.0       [hx, hy]    [ex, ey, hz]

                               channel_scale_factors  \
0  {'ex': 1.0, 'ey': 1.0, 'hx': 1.0, 'hy': 1.0, '...

                                           mth5_path  remote
0  /home/kkappler/software/irismt/aurora/docs/not...   False
[23]:
index station_id run_id start end sample_rate input_channels output_channels channel_scale_factors mth5_path remote
0 1 CAS04 b 2020-06-02 22:24:55+00:00 2020-06-12 17:52:23+00:00 1.0 [hx, hy] [ex, ey, hz] {'ex': 1.0, 'ey': 1.0, 'hx': 1.0, 'hy': 1.0, '... /home/kkappler/software/irismt/aurora/docs/not... False
[24]:
show_plot = True
tf_cls = process_mth5(config,
                    dataset_definition,
                    units="MT",
                    show_plot=show_plot,
                    z_file_path=None,
                    return_collection=False
                )
Processing config indicates 4 decimation levels
fix this so that it gets from config based on station_id, without caring if local or remote
/home/kkappler/software/irismt/aurora/aurora/pipelines/time_series_helpers.py:224: RuntimeWarning: invalid value encountered in true_divide
  stft_obj[channel_id].data /= calibration_response
Processing band 25.728968s
Processing band 19.929573s
Processing band 15.164131s
Processing band 11.746086s
Processing band 9.195791s
Processing band 7.362526s
Processing band 5.856115s
Processing band 4.682492s
GET PLOTTER FROM MTpy
OK, now ser linewidth and markersize
../_images/notebooks_operate_aurora_45_3.png
/home/kkappler/anaconda2/envs/py37/lib/python3.7/site-packages/pandas/core/indexing.py:1732: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)
fix this so that it gets from config based on station_id, without caring if local or remote
/home/kkappler/software/irismt/aurora/aurora/pipelines/time_series_helpers.py:224: RuntimeWarning: invalid value encountered in true_divide
  stft_obj[channel_id].data /= calibration_response
Processing band 102.915872s
Processing band 85.631182s
Processing band 68.881694s
Processing band 54.195827s
Processing band 43.003958s
Processing band 33.310722s
GET PLOTTER FROM MTpy
OK, now ser linewidth and markersize
../_images/notebooks_operate_aurora_45_8.png
/home/kkappler/anaconda2/envs/py37/lib/python3.7/site-packages/pandas/core/indexing.py:1732: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)
fix this so that it gets from config based on station_id, without caring if local or remote
/home/kkappler/software/irismt/aurora/aurora/pipelines/time_series_helpers.py:224: RuntimeWarning: invalid value encountered in true_divide
  stft_obj[channel_id].data /= calibration_response
Processing band 411.663489s
Processing band 342.524727s
Processing band 275.526776s
Processing band 216.783308s
Processing band 172.015831s
Processing band 133.242890s
GET PLOTTER FROM MTpy
OK, now ser linewidth and markersize
../_images/notebooks_operate_aurora_45_13.png
/home/kkappler/anaconda2/envs/py37/lib/python3.7/site-packages/pandas/core/indexing.py:1732: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)
fix this so that it gets from config based on station_id, without caring if local or remote
/home/kkappler/software/irismt/aurora/aurora/pipelines/time_series_helpers.py:224: RuntimeWarning: invalid value encountered in true_divide
  stft_obj[channel_id].data /= calibration_response
Processing band 1514.701336s
Processing band 1042.488956s
Processing band 723.371271s
Processing band 532.971560s
Processing band 412.837995s
GET PLOTTER FROM MTpy
OK, now ser linewidth and markersize
../_images/notebooks_operate_aurora_45_18.png
2022-06-03 17:55:09,287 [line 731] mth5.mth5.MTH5.close_mth5 - INFO: Flushing and closing /home/kkappler/software/irismt/aurora/docs/notebooks/8P_CAS04.h5
['b']
[25]:
dataset_definition.df
2022-06-03 17:58:34,548 [line 113] mth5.groups.base.Run.__str__ - WARNING: MTH5 file is closed and cannot be accessed.
2022-06-03 17:58:34,830 [line 113] mth5.groups.base.Run.__str__ - WARNING: MTH5 file is closed and cannot be accessed.
[25]:
index station_id run_id start end sample_rate input_channels output_channels channel_scale_factors mth5_path remote mth5_obj run run_dataarray stft
0 1 CAS04 b 2020-06-02 22:24:55+00:00 2020-06-12 17:52:23+00:00 1.0 [hx, hy] [ex, ey, hz] {'ex': 1.0, 'ey': 1.0, 'hx': 1.0, 'hy': 1.0, '... /home/kkappler/software/irismt/aurora/docs/not... False HDF5 file is closed and cannot be accessed. MTH5 file is closed and cannot be accessed. [[<xarray.DataArray ()>\narray(-7814.971096500... None
[26]:

type(tf_cls)
[26]:
mt_metadata.transfer_functions.core.TF

Write the transfer functions generated by the Aurora pipeline

[27]:
 tf_cls.write_tf_file(fn="emtfxml_test.xml", file_type="emtfxml")
2022-06-03 17:58:38,607 [line 197] mt_metadata.transfer_functions.io.readwrite.write_file - INFO: Wrote emtfxml_test.xml
[27]:
EMTFXML(station='CAS04', latitude=37.63, longitude=-121.47, elevation=329.39)
[28]:
tf_cls.write_tf_file(fn="emtfxml_test.xml", file_type="edi")
2022-06-03 17:58:39,327 [line 197] mt_metadata.transfer_functions.io.readwrite.write_file - INFO: Wrote emtfxml_test.xml
[28]:
EMTFXML(station='CAS04', latitude=37.63, longitude=-121.47, elevation=329.39)
[29]:
 tf_cls.write_tf_file(fn="emtfxml_test.xml", file_type="zmm")
2022-06-03 17:58:39,876 [line 197] mt_metadata.transfer_functions.io.readwrite.write_file - INFO: Wrote emtfxml_test.xml
[29]:
EMTFXML(station='CAS04', latitude=37.63, longitude=-121.47, elevation=329.39)