main

main.apply_all_flags(df_prov, qa_flags, qa_events, params)

Combine flags from multiple sources and quality checks.

Applies manual flags, automatic post-GCE checks, and evaluates provisional/GCE flagging.

Parameters:
  • df_prov – pandas DataFrame. Expects provisional/post-GCE precipitation data.

  • qa_flags – pandas DataFrame. Expects a column of boolean values for each flag.

  • qa_events – pandas DataFrame. Expects a column of boolean values for each flag.

  • params – dict. A dictionary of methods and their parameters to execute.

  • unapplied_rules – dict. A dict of quality checks to perform based on flagging, primarily GCE flagging.

Returns:

instance of class qaqc.ApplyFlags().
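
As a rough illustration of what combining boolean flag columns from several sources can look like, the sketch below merges them row by row with a logical OR. Plain Python lists stand in for DataFrame columns. The helper `combine_flag_columns`, the flag names, and the OR rule are all assumptions made for this example; the actual combination logic lives in qaqc.ApplyFlags and may differ.

```python
def combine_flag_columns(*sources):
    """Merge several {flag_name: [bool, ...]} mappings row by row.

    Hypothetical helper: a flag is set for a row if ANY source sets it.
    """
    combined = {}
    for source in sources:
        for name, column in source.items():
            if name in combined:
                if len(column) != len(combined[name]):
                    raise ValueError(f"length mismatch for flag {name!r}")
                # Row-wise OR with the values collected so far.
                combined[name] = [a or b for a, b in zip(combined[name], column)]
            else:
                combined[name] = list(column)
    return combined


# Made-up flag names and values, for illustration only.
manual = {"suspect": [False, True, False]}
automatic = {"suspect": [True, False, False], "missing": [False, False, True]}
flags = combine_flag_columns(manual, automatic)
```

A flag that appears in only one source is carried over unchanged, so sources do not need to share the same set of columns.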

main.load_data(strtyr, endyr, fname_base='MS043PPT_PPT_L1_5min_', data_path='../config.yaml', **kwargs)

Load provisional/post-GCE precipitation data.

See data_transfer.LoadProvisionalData() and data_transfer.LoadProvisionalData.load_ppt_data().

Returns:

instance of data_transfer.LoadProvisionalData()

main.main(strtyr, endyr, fname_base='MS043PPT_PPT_L1_5min_', data_path='./config.yaml', qa_params='./qa_param.yaml', probes={'all_param'}, keep_col_name=['Value', 'Flag_Value'], probeid_col='Parameter', output_dir='processed_data', write_csv=True, **kwargs)

Main process that loads data and runs all qa checks.

  1. load provisional/post-GCE data

  2. select data for specific probe and what QA to run on it

  3. run rules and tests on data

  4. combine all flags from tests: manual flags, rules applied above, GCE, and our Notes DataBase of events/flags.

Any keyword argument accepted by pandas.read_csv can be supplied to this function and is passed through to `pandas.read_csv()`.

Parameters:
  • strtyr – int. First water year to be processed

  • endyr – int. Last water year to be processed

  • fname_base – str. Prefix for all files containing provisional/post-GCE data

  • data_path – str. Path to directory containing data files. Will accept .yaml or .yml

  • qa_params – Dict of manual and automatic qa rules and probe parameters. Will load from yaml if type is str

  • probes – set. A set of probe names to run. If ‘all_data’ or ‘all_param’ is passed, all unique probes are run

  • kwargs – dict. Accepts any valid kwargs for pd.read_csv
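
The behaviour described for probes can be sketched as follows. This is a minimal illustration, not the module's code: the helper `select_probes`, the probe names, and the choice to silently ignore unknown names are assumptions made for the example.

```python
# Keywords that, per the parameter description, mean "run every probe".
ALL_KEYWORDS = {"all_data", "all_param"}


def select_probes(requested, available):
    """Return the probe names to process, in sorted order.

    Hypothetical helper: `available` is any iterable of probe ids found in
    the data (duplicates allowed); `requested` is the `probes` argument.
    """
    requested = set(requested)
    unique = sorted(set(available))
    if requested & ALL_KEYWORDS:
        return unique
    # Assumption: names not present in the data are dropped, not reported.
    return [probe for probe in unique if probe in requested]


# Made-up probe ids, for illustration only.
found_in_data = ["PROBE_B", "PROBE_A", "PROBE_B", "PROBE_C"]
everything = select_probes({"all_param"}, found_in_data)
subset = select_probes({"PROBE_C", "PROBE_A"}, found_in_data)
```

Sorting the unique names keeps the processing order stable between runs, which is convenient when comparing output files.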

main.qc()
main.qc_cross_probe(xacc, xppt, param_auto, probe, flag)
main.qc_provisional(df, params)

Do quality checks on provisional/post-GCE data.

params is a dictionary where each key matches a method name of qaqc.QaRules() and the values are the keyword arguments necessary to execute the data check.

Parameters:
  • df – pandas DataFrame. Expects provisional/post-GCE precipitation data

  • params – dict. A dictionary of methods and their parameters to execute

Returns:

pd.DataFrames of flags and events; dict of checks not performed; calculated moving window values.
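
The params convention described above (keys are method names, values are keyword arguments) can be sketched with `getattr` dispatch. `DemoRules` is a stand-in for qaqc.QaRules, and its method, the `run_checks` helper, and the sample values are invented for this example. Collecting unmatched keys in a separate dict is also an assumption; it is not necessarily how the module decides which checks are deferred.

```python
class DemoRules:
    """Stand-in for qaqc.QaRules; its method is invented for this sketch."""

    def __init__(self, values):
        self.values = values

    def above_threshold(self, threshold):
        # Flag every value strictly greater than the threshold.
        return [value > threshold for value in self.values]


def run_checks(rules, params):
    """Call each method named in `params` with its keyword arguments."""
    results, unapplied = {}, {}
    for name, kwargs in params.items():
        method = getattr(rules, name, None)
        if callable(method):
            results[name] = method(**kwargs)
        else:
            # Assumption: keys with no matching method are set aside.
            unapplied[name] = kwargs
    return results, unapplied


params = {
    "above_threshold": {"threshold": 2.0},
    "not_a_method": {"window": 3},
}
results, unapplied = run_checks(DemoRules([0.5, 3.1, 2.0]), params)
```

Keeping the check names in a dict means a new check can be enabled from the YAML parameters without changing the calling code, provided a method with that name exists.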

main.write_output_data(ap_flg_df, acc_df, site, probe, strtyr=2019, endyr=2024, fname_base='MS043PPT_PPT_L1_5min_', file_path='../config.yaml', output_dir='/processed_data')

Write data to output file. Intended to be used after QAQC has been performed.

See data_transfer.WriteProvisionalData() and data_transfer.WriteProvisionalData.write_ouput_files().

Parameters:
  • ap_flg_df – Instance of qaqc.ApplyFlags containing data to be written to output file.

  • site – str. Site name to output

  • probe

  • strtyr – int. First water year of data to write.

  • endyr – int. Last water year of data to write.

  • fname_base – str. Base name of output file.

  • file_path – str. Path to config.yaml containing path to data files and header information.

Returns:

None