main
- main.apply_all_flags(df_prov, qa_flags, qa_events, params)
Combine flags from multiple sources and quality checks.
Applies manual flags, automatic post-GCE checks, and evaluates provisional/GCE flagging.
- Parameters:
df_prov – pandas DataFrame. Expects provisional/post-GCE precipitation data.
qa_flags – pandas DataFrame. Expects a column of boolean values for each flag.
qa_events – pandas DataFrame. Expects a column of boolean values for each flag.
params – dict. A dictionary of methods and their parameters to execute.
unapplied_rules – dict. A dict of quality checks to perform based on flagging, primarily GCE flagging.
- Returns:
Instance of class qaqc.ApplyFlags().
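As an illustration of combining boolean flag columns from several sources, here is a minimal sketch. It assumes each source is a DataFrame of booleans indexed by timestamp. The helper `combine_flags`, the column names `spike` and `maintenance`, and the summary column `any_flag` are hypothetical; this is not the `qaqc.ApplyFlags` implementation.

```python
import pandas as pd


def combine_flags(*flag_frames):
    """Join boolean flag frames column-wise and add an 'any_flag' summary column."""
    # Align on the index; entries missing from a source count as "not flagged".
    combined = pd.concat(flag_frames, axis=1).fillna(False).astype(bool)
    # A row is flagged overall if any individual check flagged it.
    combined["any_flag"] = combined.any(axis=1)
    return combined


# Hypothetical 5-minute records with one flag column per source.
idx = pd.date_range("2020-01-01", periods=3, freq="5min")
qa_flags = pd.DataFrame({"spike": [False, True, False]}, index=idx)
qa_events = pd.DataFrame({"maintenance": [False, False, True]}, index=idx)

flags = combine_flags(qa_flags, qa_events)
```

Keeping one column per source alongside the summary column preserves which check raised each flag.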
- main.load_data(strtyr, endyr, fname_base='MS043PPT_PPT_L1_5min_', data_path='../config.yaml', **kwargs)
Load provisional/post-GCE precipitation data.
See data_transfer.LoadProvisionalData() and data_transfer.LoadProvisionalData.load_ppt_data().
- Returns:
Instance of data_transfer.LoadProvisionalData().
- main.main(strtyr, endyr, fname_base='MS043PPT_PPT_L1_5min_', data_path='./config.yaml', qa_params='./qa_param.yaml', probes={'all_param'}, keep_col_name=['Value', 'Flag_Value'], probeid_col='Parameter', output_dir='processed_data', write_csv=True, **kwargs)
Main process that loads data and runs all qa checks.
Load provisional/post-GCE data.
Select data for a specific probe and the QA checks to run on it.
Run rules and tests on the data.
Combine all flags from the tests: manual flags, the rules applied above, GCE, and our Notes DataBase of events/flags.
Any keyword argument accepted by pandas.read_csv() can be supplied to this function and will be passed through to it.
- Parameters:
strtyr – int. First water year to be processed
endyr – int. Last water year to be processed
fname_base – str. Prefix for all files containing provisional/post-GCE data
data_path – str. Path to directory containing data files. Will accept .yaml or .yml
qa_params – dict or str. Manual and automatic QA rules and probe parameters. Loaded from a YAML file if a str is passed
probes – set. A set of probe names to run. If ‘all_data’ or ‘all_param’ is passed, all unique probes are run
kwargs – dict. Accepts any valid kwargs for pd.read_csv
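The keyword pass-through described above can be sketched as follows. The function `load_csv`, the inline sample data, and the `Date` column are hypothetical stand-ins for illustration; only `pandas.read_csv` and its `parse_dates` and `index_col` options are real pandas API.

```python
import io

import pandas as pd


def load_csv(source, **kwargs):
    """Hypothetical loader: forwards every extra keyword argument to pandas.read_csv."""
    return pd.read_csv(source, **kwargs)


# Made-up sample records standing in for a provisional data file.
raw = io.StringIO(
    "Date,Parameter,Value\n"
    "2020-01-01 00:00,PPT_1,0.0\n"
    "2020-01-01 00:05,PPT_1,0.2\n"
)

# parse_dates and index_col are ordinary pandas.read_csv options.
df = load_csv(raw, parse_dates=["Date"], index_col="Date")
```

Forwarding `**kwargs` unchanged means the caller controls parsing without the wrapper having to list every `read_csv` option.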
- main.qc_provisional(df, params)
Do quality checks on provisional/post-GCE data.
params is a dictionary where each key matches a method name of qaqc.QaRules() and the values are the keyword arguments necessary to execute the data check.
- Parameters:
df – pandas DataFrame. Expects provisional/post-GCE precipitation data
params – dict. A dictionary of methods and their parameters to execute
- Returns:
pd.DataFrames of flags and events; dict of checks not performed; calculated moving window values.
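A params dictionary keyed by method name can be dispatched roughly as in the sketch below. `ExampleRules`, its two checks, and `run_checks` are hypothetical stand-ins written for this illustration; they do not reproduce `qaqc.QaRules` or the checks it provides.

```python
class ExampleRules:
    """Hypothetical stand-in for a rules class; not the real qaqc.QaRules."""

    def __init__(self, values):
        self.values = values

    def range_check(self, low, high):
        # Flag values outside the closed interval [low, high].
        return [not (low <= v <= high) for v in self.values]

    def step_check(self, max_step):
        # Flag values that jump more than max_step from the previous value.
        flags = [False]
        for prev, cur in zip(self.values, self.values[1:]):
            flags.append(abs(cur - prev) > max_step)
        return flags


def run_checks(rules, params):
    """Call each method named in params with its keyword arguments."""
    results, not_run = {}, {}
    for name, kwargs in params.items():
        method = getattr(rules, name, None)
        if callable(method):
            results[name] = method(**kwargs)
        else:
            # Keep unknown checks so the caller can see what was skipped.
            not_run[name] = kwargs
    return results, not_run


params = {
    "range_check": {"low": 0.0, "high": 5.0},
    "step_check": {"max_step": 2.0},
    "unknown_check": {"x": 1},
}
results, not_run = run_checks(ExampleRules([0.1, 0.3, 9.0, 0.2]), params)
```

Returning the skipped entries separately mirrors the idea of reporting checks that were not performed, so a misspelled key does not fail silently.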
- main.write_output_data(ap_flg_df, acc_df, site, probe, strtyr=2019, endyr=2024, fname_base='MS043PPT_PPT_L1_5min_', file_path='../config.yaml', output_dir='/processed_data')
Write data to output file. Intended to be used after QAQC has been performed.
See data_transfer.WriteProvisionalData() and data_transfer.WriteProvisionalData.write_ouput_files().
- Parameters:
ap_flg_df – Instance of qaqc.ApplyFlags containing data to be written to output file.
site – str. Site name to output
probe – str. Probe name to output
strtyr – int. First water year of data to write.
endyr – int. Last water year of data to write.
fname_base – str. Base name of output file.
file_path – str. Path to config.yaml containing path to data files and header information.
- Returns:
None