CEN01 - Clog Cross Comparison: Composite Scores

Composite weighted clog ranking for CEN01 compared to all other probes.

To untangle the final clog assignments, the component contributions must be broken out. Multiple components from each compared probe contribute to the final product. Sometimes fixing a problem requires adjusting the parameters used, or adjusting how the components are compared or aggregated. Below, the goal is to look at the combined flagging, take any unexpected or undesired flags, break them down into their input parts, and make adjustments that improve the outputs. It is a bit tangled, and is as meticulous to carry out as it is to read.

In the process of carrying out this assessment, a number of changes were made to the parameters of individual probe comparisons, as well as broader systemic changes to how clog and flagging scores are tallied and assessed.

[1]:
import pandas as pd
import matplotlib.pyplot as plt

# Jupyter magic to make plots display interactively
# requires ipympl (IPython matplotlib) and nodejs
from ipywidgets.embed import embed_minimal_html
%matplotlib widget

import sys
sys.path.append("../")
from post_gce_qc import qaqc, data_transfer, cross_probe_qc, main


Process the Data

QA the Data

[2]:
# Get QA/QC'd data for all probes
# main.main runs Clog QA, so this ONLY runs other QA routines
all_probes = main.load_data(2019, 2024, fname_base='MS00413_PPT_L1_5min_', data_path='../config_new.yaml')

params = qaqc._load_yaml('../qa_param.yaml')
probes = params.keys()

flagged = {}
for probe in probes:
    site = probe[:3]
    nprobe = probe[-2:]
    df = all_probes.pivot_on_probe(all_probes.df, site, nprobe)

    param = params[probe]

    qa_flags, qa_events = main.qc_provisional(df, param)
    flags = main.apply_all_flags(df, qa_flags, qa_events, param)

    print(f'All quality checks and quality assurance rules applied to {probe}\n------------------\n')

    flagged[probe] = flags

Loading all PPT data from ../config_new.yaml

All quality checks and quality assurance rules applied to VAR_02
------------------

All quality checks and quality assurance rules applied to UPL_01
------------------

All quality checks and quality assurance rules applied to UPL_02
------------------

All quality checks and quality assurance rules applied to CEN_01
------------------

All quality checks and quality assurance rules applied to CEN_02
------------------

All quality checks and quality assurance rules applied to CS2_02
------------------

All quality checks and quality assurance rules applied to PRI_03
------------------

All quality checks and quality assurance rules applied to H15_02
------------------

All quality checks and quality assurance rules applied to GSM_02
------------------

Create CrossTables and Get Params

[3]:
# Get parameters for probe
params = qaqc._load_yaml('../qa_param.yaml')

probe = 'CEN_01'
param = params[probe]

# build pivot table for cross site comparison
xppt = cross_probe_qc.BuildXTable.assemble_cross_table(flagged, ppt_col='adj_precip')
xacc = cross_probe_qc.BuildXTable.assemble_wy_acc(xppt)
[4]:
# initiate cross probe quality checks for CEN01
xprobe = cross_probe_qc.XProbesQc(xacc.index, probe)
# create table of ratios with CEN01 as base
xprobe.set_accum_ratio(xacc)

Find Clog Events for Each Probe

[5]:
xprobe.set_x_clogs(xppt, xacc, param['auto_flag']['flag_x_clogs'])

Get Weighted Clog ID

[6]:
eventwt, Uwt, Cwt, = xprobe.get_weight_x_clog(param['auto_flag']['weight_x_clogs'])
xprobe.flag_x_clogs(eventwt, Uwt, Cwt)

Assess Clogs

[7]:
plt.close('all')
[8]:
ax1 = xacc[['CEN_01', 'CEN_02']].plot(grid=True, legend=True)
xacc.loc[xprobe.event.clog==True, ['CEN_01']].plot(grid=True, linestyle='', marker='.', ax=ax1, label='clogs')
[8]:
<Axes: xlabel='Date'>
[13]:
plt.close(2)
[9]:
# plt.close('all')
# ax1 = xacc[['CEN_01', 'CEN_02']].plot(grid=True, legend=True)
# xacc.loc[xprobe.event.clog==True, ['CEN_01']].plot(grid=True, linestyle='', marker='.', ax=ax1, label='clogs')

#plt.figure()
xprobe.ratio[['H15_02', 'CEN_02',]].plot(grid=True, legend=True)
xprobe.ratio.loc[xprobe.event.clog==True, 'CEN_02'].plot(grid=True, linestyle='', marker='.')
[9]:
<Axes: xlabel='Date'>

CEN SH 10/29/18

CEN SH is the dominant probe driving most of the clogs. Some of these clogs are fairly problematic, so this probe needs to be carefully inspected to make sure the relationship is properly identifying clogs.

[14]:
day = pd.to_datetime('10/29/18')

flagged['CEN_01'].apply_QaRules_flags(xprobe.event, xprobe.flags)

flagged['CEN_01'].plot_flagged_day(day, 'CEN_01', tdelta='1D', auto_qa_event=xprobe.event, paired_tank=flagged['CEN_02'].data.tank_height)
[14]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'CEN_01 - 2018-10-29 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)
[23]:
day = pd.to_datetime('10/29/18 1000')
end = day + pd.to_timedelta('10h')
pd.options.display.min_rows = 35

xprobe.clog.loc[day:end]
[23]:
CEN_02 CS2_02 PRI_03 UPL_02 UPL_01 VAR_02 H15_02 GSM_02
Date
2018-10-29 10:00:00 False True <NA> False False False False True
2018-10-29 10:05:00 False True <NA> False False False False True
2018-10-29 10:10:00 False True <NA> False False False False True
2018-10-29 10:15:00 False True <NA> False False False False True
2018-10-29 10:20:00 False True <NA> False False False False True
2018-10-29 10:25:00 False True <NA> False False False False True
2018-10-29 10:30:00 False True <NA> False False False False True
2018-10-29 10:35:00 False True <NA> False False False False True
2018-10-29 10:40:00 True True <NA> False False False False True
2018-10-29 10:45:00 True True <NA> False False False False True
2018-10-29 10:50:00 True True <NA> False False False False True
2018-10-29 10:55:00 True True <NA> False False False False True
2018-10-29 11:00:00 True True <NA> False False False False True
2018-10-29 11:05:00 True True <NA> False False False False True
2018-10-29 11:10:00 True True <NA> False False False False True
2018-10-29 11:15:00 True True <NA> False False False False True
2018-10-29 11:20:00 True True <NA> False False False False True
... ... ... ... ... ... ... ... ...
2018-10-29 18:40:00 True True <NA> False False False False True
2018-10-29 18:45:00 True True <NA> False False False False True
2018-10-29 18:50:00 True True <NA> False False False False True
2018-10-29 18:55:00 True True <NA> False False False False True
2018-10-29 19:00:00 True True <NA> False False False False True
2018-10-29 19:05:00 False True <NA> False False False False True
2018-10-29 19:10:00 False False <NA> False False False False True
2018-10-29 19:15:00 False False <NA> False False False False True
2018-10-29 19:20:00 False False <NA> False False False False True
2018-10-29 19:25:00 False False <NA> False False False False True
2018-10-29 19:30:00 False False <NA> False False False False True
2018-10-29 19:35:00 False False <NA> False False False False True
2018-10-29 19:40:00 False False <NA> False False False False True
2018-10-29 19:45:00 False False <NA> False False False False True
2018-10-29 19:50:00 False False <NA> False False False False True
2018-10-29 19:55:00 False False <NA> False False False False True
2018-10-29 20:00:00 False False <NA> False False False False True

121 rows × 8 columns

OK, so even CEN SH starts ID’ing this clog late. And there are basically no U flags, which are most of the point of a clog event. So first let’s look at U flags and see if it is a weighting problem or an ID problem.

U Flags: Thresholds and Alignment of Non-0 Precip

[24]:
xprobe.U.loc[day:end]
[24]:
CEN_02 CS2_02 PRI_03 UPL_02 UPL_01 VAR_02 H15_02 GSM_02
Date
2018-10-29 10:00:00 False True <NA> False False False False True
2018-10-29 10:05:00 False False <NA> False False False False True
2018-10-29 10:10:00 False False <NA> False False False False True
2018-10-29 10:15:00 False True <NA> False False False False True
2018-10-29 10:20:00 False False <NA> False False False False False
2018-10-29 10:25:00 False False <NA> False False False False True
2018-10-29 10:30:00 False True <NA> False False False False True
2018-10-29 10:35:00 False False <NA> False False False False True
2018-10-29 10:40:00 True False <NA> False False False False True
2018-10-29 10:45:00 False True <NA> False False False False True
2018-10-29 10:50:00 False False <NA> False False False False True
2018-10-29 10:55:00 False False <NA> False False False False True
2018-10-29 11:00:00 True True <NA> False False False False False
2018-10-29 11:05:00 False False <NA> False False False False True
2018-10-29 11:10:00 False False <NA> False False False False False
2018-10-29 11:15:00 False False <NA> False False False False False
2018-10-29 11:20:00 False False <NA> False False False False False
... ... ... ... ... ... ... ... ...
2018-10-29 18:40:00 False False <NA> False False False False False
2018-10-29 18:45:00 False False <NA> False False False False False
2018-10-29 18:50:00 False False <NA> False False False False False
2018-10-29 18:55:00 False False <NA> False False False False False
2018-10-29 19:00:00 False False <NA> False False False False False
2018-10-29 19:05:00 False False <NA> False False False False False
2018-10-29 19:10:00 False False <NA> False False False False False
2018-10-29 19:15:00 False False <NA> False False False False False
2018-10-29 19:20:00 False False <NA> False False False False False
2018-10-29 19:25:00 False False <NA> False False False False False
2018-10-29 19:30:00 False False <NA> False False False False False
2018-10-29 19:35:00 False False <NA> False False False False False
2018-10-29 19:40:00 False False <NA> False False False False False
2018-10-29 19:45:00 False False <NA> False False False False False
2018-10-29 19:50:00 False False <NA> False False False False False
2018-10-29 19:55:00 False False <NA> False False False False False
2018-10-29 20:00:00 False False <NA> False False False False False

121 rows × 8 columns

So I can make it more permissive by allowing U flags at scores >= 66 and upping the lower sites to a weight of 8, so that one lower site plus CEN SH (weight 58) reaches 66 and creates a U flag. Creating a clog in the first place still requires 2 of the lower sites under the more restrictive > 66 criterion.

But that will still create a pretty sparse handful of U flags. So we need to dig into the parameters that set the CEN SH U flag.
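
As a rough sketch of the weighting arithmetic (not the actual `get_weight_x_clog` code): this assumes the composite score is simply the sum of the weights of the probes that flag at a timestep. The helper `weighted_score` and the probe names `lower_a` and `lower_b` are made up for illustration. The weight of 7 is the current low elevation weight noted further down, and `CEN_02` is assumed to be the CEN SH probe.

```python
import pandas as pd

def weighted_score(flags: pd.DataFrame, weights: dict) -> pd.Series:
    """Sum the weights of every probe whose flag is True at each timestep."""
    w = pd.Series(weights)
    return flags[w.index].astype(int).mul(w, axis=1).sum(axis=1)

# toy flags: CEN SH alone, CEN SH + 1 lower site, CEN SH + 2 lower sites
idx = pd.date_range('2018-10-29 10:40', periods=3, freq='5min')
flags = pd.DataFrame({
    'CEN_02': [True, True, True],
    'lower_a': [False, True, True],    # hypothetical low elevation probe
    'lower_b': [False, False, True],   # hypothetical low elevation probe
}, index=idx)

current = weighted_score(flags, {'CEN_02': 58, 'lower_a': 7, 'lower_b': 7})
proposed = weighted_score(flags, {'CEN_02': 58, 'lower_a': 8, 'lower_b': 8})

clog_current = current > 66     # strict criterion
u_proposed = proposed >= 66     # more permissive U criterion
clog_proposed = proposed > 66   # clog criterion stays strict

print(pd.DataFrame({'current': current, 'proposed': proposed,
                    'U (>= 66)': u_proposed, 'clog (> 66)': clog_proposed}))
```

With a weight of 8, CEN SH plus one lower site lands exactly on 66, so it passes the >= 66 U test but not the > 66 clog test.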

[20]:
p2pqc = xprobe.get_Probe2ProbeXQc_inst(xppt, xacc, 'CEN_02')

pair_param = param['auto_flag']['flag_x_clogs']['CEN_02']['clog_pair_flagging_wrap']

pair_param['n_std'] = 2.5

clog, U, C = p2pqc.clog_pair_flagging_wrap('CEN_01', 'CEN_02', **pair_param)
[25]:
U.loc[day:end]
[25]:
Date
2018-10-29 10:00:00    False
2018-10-29 10:05:00    False
2018-10-29 10:10:00    False
2018-10-29 10:15:00    False
2018-10-29 10:20:00    False
2018-10-29 10:25:00    False
2018-10-29 10:30:00    False
2018-10-29 10:35:00    False
2018-10-29 10:40:00     True
2018-10-29 10:45:00    False
2018-10-29 10:50:00    False
2018-10-29 10:55:00    False
2018-10-29 11:00:00     True
2018-10-29 11:05:00    False
2018-10-29 11:10:00    False
2018-10-29 11:15:00    False
2018-10-29 11:20:00    False
                       ...
2018-10-29 18:40:00    False
2018-10-29 18:45:00    False
2018-10-29 18:50:00    False
2018-10-29 18:55:00    False
2018-10-29 19:00:00    False
2018-10-29 19:05:00    False
2018-10-29 19:10:00    False
2018-10-29 19:15:00    False
2018-10-29 19:20:00    False
2018-10-29 19:25:00    False
2018-10-29 19:30:00    False
2018-10-29 19:35:00    False
2018-10-29 19:40:00    False
2018-10-29 19:45:00    False
2018-10-29 19:50:00    False
2018-10-29 19:55:00    False
2018-10-29 20:00:00    False
Length: 121, dtype: bool[pyarrow]
[26]:
pair_param['n_std'] = 1.5

clog, U, C = p2pqc.clog_pair_flagging_wrap('CEN_01', 'CEN_02', **pair_param)
[27]:
U.loc[day:end]
[27]:
Date
2018-10-29 10:00:00    False
2018-10-29 10:05:00    False
2018-10-29 10:10:00    False
2018-10-29 10:15:00    False
2018-10-29 10:20:00    False
2018-10-29 10:25:00    False
2018-10-29 10:30:00    False
2018-10-29 10:35:00    False
2018-10-29 10:40:00     True
2018-10-29 10:45:00    False
2018-10-29 10:50:00    False
2018-10-29 10:55:00    False
2018-10-29 11:00:00     True
2018-10-29 11:05:00    False
2018-10-29 11:10:00    False
2018-10-29 11:15:00    False
2018-10-29 11:20:00    False
                       ...
2018-10-29 18:40:00    False
2018-10-29 18:45:00    False
2018-10-29 18:50:00    False
2018-10-29 18:55:00    False
2018-10-29 19:00:00    False
2018-10-29 19:05:00    False
2018-10-29 19:10:00    False
2018-10-29 19:15:00    False
2018-10-29 19:20:00    False
2018-10-29 19:25:00    False
2018-10-29 19:30:00    False
2018-10-29 19:35:00    False
2018-10-29 19:40:00    False
2018-10-29 19:45:00    False
2018-10-29 19:50:00    False
2018-10-29 19:55:00    False
2018-10-29 20:00:00    False
Length: 121, dtype: bool[pyarrow]
[28]:
pair_param['precision_val'] = 0.4

clog, U, C = p2pqc.clog_pair_flagging_wrap('CEN_01', 'CEN_02', **pair_param)
[30]:
U.loc[day:end]
[30]:
Date
2018-10-29 10:00:00    False
2018-10-29 10:05:00    False
2018-10-29 10:10:00    False
2018-10-29 10:15:00    False
2018-10-29 10:20:00    False
2018-10-29 10:25:00    False
2018-10-29 10:30:00    False
2018-10-29 10:35:00    False
2018-10-29 10:40:00     True
2018-10-29 10:45:00    False
2018-10-29 10:50:00    False
2018-10-29 10:55:00    False
2018-10-29 11:00:00     True
2018-10-29 11:05:00    False
2018-10-29 11:10:00    False
2018-10-29 11:15:00    False
2018-10-29 11:20:00    False
                       ...
2018-10-29 18:40:00    False
2018-10-29 18:45:00    False
2018-10-29 18:50:00    False
2018-10-29 18:55:00    False
2018-10-29 19:00:00    False
2018-10-29 19:05:00    False
2018-10-29 19:10:00    False
2018-10-29 19:15:00    False
2018-10-29 19:20:00    False
2018-10-29 19:25:00    False
2018-10-29 19:30:00    False
2018-10-29 19:35:00    False
2018-10-29 19:40:00    False
2018-10-29 19:45:00    False
2018-10-29 19:50:00    False
2018-10-29 19:55:00    False
2018-10-29 20:00:00    False
Length: 121, dtype: bool[pyarrow]
[49]:
plt.close(6)
[50]:
plt.figure()
for n, pval in enumerate([0.1, 0.2, 0.3, 0.4, 0.5]):
    ax1 = plt.subplot(5,1,n+1)
    _, precip_run_std = qaqc.QaRules.calc_rolling_mean(p2pqc.TOT, precision=pval, wind='1h', nstd=1.5)
    p2pqc.TOT.loc[day:end].plot(grid=True, legend=True, ax=ax1, linestyle='', marker='.')
    precip_run_std.loc[day:end].plot(grid=True, legend=True, ax=ax1)
    plt.title(pval)
[56]:
plt.tight_layout()

Wow, so, by my count, there should be 13 U’s. Let’s check how many we have.

[52]:
U.loc[day:end][U==True].count()
[52]:
10
[55]:
p2pqc.TOT.loc[day:end, 'CEN_02'][U.loc[day:end]].plot(grid=True, label='U', marker='X', linestyle='', legend=True)
[55]:
<Axes: title={'center': '0.5'}, xlabel='Date'>
[57]:
p2pqc.TOT.loc[day:end, 'CEN_02'][clog.loc[day:end]].plot(grid=True, label='clog', marker='*', linestyle='', legend=True)
[57]:
<Axes: title={'center': '0.5'}, xlabel='Date'>
[60]:
xprobe.flags.loc[day:end, 'U'][xprobe.flags['U']==True].count()
[60]:
1

OK, so there is at least an hour’s worth of U flags from CENT SH, but there aren’t enough other probes to support it. This is a weighting problem.

I suspect a lot of the problem is because, at the 5 min level, the 0 values at the pair overlap with precip at the base site.

Here’s the actual code:

base, match = pair

non0 = self.TOT[match] > 0
return (precip_run_avg[match] >= precip_run_avg[base]) & non0 & clog

Certainly, this >0 problem is exacerbated at CS2MET, since 10 out of every 15 minutes have 0 precip, so the chances of them lining up are low.
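
To make the alignment problem concrete, here is a toy sketch with made-up data. It assumes a 15 min gauge shows up on the 5 min grid as one non-zero value every third step, and that the clog and running-average tests pass for the whole hour. The names `match_precip` and `avg_ok` are placeholders, not part of the module.

```python
import pandas as pd

idx = pd.date_range('2018-10-29 10:00', periods=12, freq='5min')

# hypothetical 15 min gauge on a 5 min grid: precip lands on every third step
match_precip = pd.Series(0.0, index=idx)
match_precip.iloc[::3] = 0.5

clog = pd.Series(True, index=idx)     # pretend the whole hour is a clog
avg_ok = pd.Series(True, index=idx)   # pretend the running-average test always passes

# same shape as the return statement above: the > 0 mask gates everything
non0 = match_precip > 0
U = avg_ok & non0 & clog

print(f'{int(U.sum())} of {len(U)} timesteps can get a U flag')
```

Even in this best case, only a third of the timesteps can ever carry a U flag from that probe, and those have to coincide with U flags from the other probes to add up.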

But the weighting is probably a problem too. In the current weighting scheme, probes within a site are weighted 58, and the low elevation south ridge sites are weighted 7 each. So, to get a U flag or a clog, 2 of the low elevation ridge sites are required (58 + 7 = 65 falls short of the > 66 criterion, while 58 + 7 + 7 = 72 passes). In this case, if they were weighted heavier we’d get some more U flags.

Currently, it’s a clog event with almost no U flags, which makes no sense. Let’s look at the parameters at CS2MET and make sure they are flagging as much as possible as U during the clog.

[64]:
xprobe.U.loc[day:end, 'CS2_02'][xprobe.U['CS2_02']==True].count()
[64]:
9

That’s not bad…but how do they line up with CEN SH?

[70]:
xprobe.U[xprobe.U['CS2_02']==True].loc[day:end]
[70]:
CEN_02 CS2_02 PRI_03 UPL_02 UPL_01 VAR_02 H15_02 GSM_02
Date
2018-10-29 10:00:00 False True <NA> False False False False True
2018-10-29 10:15:00 False True <NA> False False False False True
2018-10-29 10:30:00 False True <NA> False False False False True
2018-10-29 10:45:00 False True <NA> False False False False True
2018-10-29 11:00:00 True True <NA> False False False False False
2018-10-29 12:15:00 False True <NA> False False False False True
2018-10-29 15:00:00 False True <NA> False False False False True
2018-10-29 15:15:00 False True <NA> False False False False True
2018-10-29 15:30:00 True True <NA> False False False False True

OK, let’s try to get more True values out of CS2 in hopes that it will overlap with more of the CENT U values.

[71]:
p2pqc = xprobe.get_Probe2ProbeXQc_inst(xppt, xacc, 'CS2_02')

pair_param = param['auto_flag']['flag_x_clogs']['CS2_02']['clog_pair_flagging_wrap']

clog, U, C = p2pqc.clog_pair_flagging_wrap('CEN_01', 'CS2_02', **pair_param)
[83]:
plt.close(6)
[84]:
plt.figure()
for n, wind in enumerate(['0.75h', '1h', '1.5h', '2h', '2.5h']):
    ax1 = plt.subplot(5,1,n+1)
    _, precip_run_std = qaqc.QaRules.calc_rolling_mean(p2pqc.TOT, precision=0.254, wind=wind, nstd=1.5)
    p2pqc.TOT.loc[day:end].plot(grid=True, legend=True, ax=ax1, linestyle='', marker='.')
    precip_run_std.loc[day:end].plot(grid=True, legend=True, ax=ax1)
    plt.title(wind)

[87]:
plt.tight_layout()
[89]:
p2pqc.TOT.loc[day:end, 'CS2_02'][U.loc[day:end]].plot(grid=True, label='U', marker='X', linestyle='', legend=True)
[89]:
<Axes: title={'center': '2.5h'}, xlabel='Date'>

Well, every moment where it was raining got a flag. It’s just that the flags didn’t line up very well across sites. Let’s try looking at Mack and see if it’s lacking in flagging.

[90]:
xprobe.U.loc[day:end, 'GSM_02'][xprobe.U['GSM_02']==True].count()
[90]:
31
[97]:
plt.close(7)
[98]:
xppt.loc[day:end, ['CEN_01', 'CEN_02', 'GSM_02']].plot(grid=True, legend=True, marker='.', linestyle='')
[98]:
<Axes: xlabel='Date'>

Proposed change to the code

If we create a 20 min centered running window for precip, it will ensure that there are only flags where precip occurred recently, but give it a 10 min grace period in both directions. This has the added benefit that for the NOAH IV’s at CS2MET and PRIM, which are on 15 min, it will apply flagging to the timesteps in between the 15 min increments.

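non0 = self.TOT[match] > 0
raw = non0.copy()  # shift the raw mask so the +/- 10 min window does not compound
for shift in [-2, -1, 1, 2]:
    non0 |= raw.shift(shift, fill_value=False)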
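
A standalone sketch of that widening on toy data, assuming 5 min timesteps and that the intended offsets are -2, -1, 1 and 2 steps (10 min either side). The series `precip` is made up, and the rolling-max version is just an equivalent way to write the same centered window, not what the module does.

```python
import pandas as pd

idx = pd.date_range('2018-10-29 10:00', periods=12, freq='5min')

# toy precip: two isolated non-zero timesteps
precip = pd.Series(0.0, index=idx)
precip.iloc[[3, 9]] = 0.5

non0 = precip > 0

# OR the original mask with copies shifted by up to 2 steps in each direction
widened = non0.copy()
for shift in (-2, -1, 1, 2):
    widened |= non0.shift(shift, fill_value=False)

# the same thing written as a centered 5 step rolling max
rolled = non0.astype(int).rolling(5, center=True, min_periods=1).max().astype(bool)

print(pd.DataFrame({'non0': non0, 'widened': widened, 'rolled': rolled}))
```

Each non-zero timestep now covers the 2 steps before and after it, so a gauge reporting every 15 min ends up with a continuous mask while it is raining.
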
[118]:
import importlib

# reload the edited module so the code change takes effect
cross_probe_qc = importlib.reload(cross_probe_qc)
[119]:
# initiate cross probe quality checks for CEN01
xprobe = cross_probe_qc.XProbesQc(xacc.index, probe)
# create table of ratios with CEN01 as base
xprobe.set_accum_ratio(xacc)
[120]:
xprobe.set_x_clogs(xppt, xacc, param['auto_flag']['flag_x_clogs'])
[121]:
eventwt, Uwt, Cwt, = xprobe.get_weight_x_clog(param['auto_flag']['weight_x_clogs'])
xprobe.flag_x_clogs(eventwt, Uwt, Cwt)
[128]:
xprobe.flags[xprobe.flags['U']==True].loc[day:end, 'U'].count()
[128]:
20
[129]:
xprobe.flags[xprobe.flags['U']==True].loc[day:end, 'U']
[129]:
Date
2018-10-29 10:40:00    True
2018-10-29 10:45:00    True
2018-10-29 10:50:00    True
2018-10-29 10:55:00    True
2018-10-29 11:00:00    True
2018-10-29 11:05:00    True
2018-10-29 11:10:00    True
2018-10-29 11:15:00    True
2018-10-29 14:50:00    True
2018-10-29 14:55:00    True
2018-10-29 15:00:00    True
2018-10-29 15:05:00    True
2018-10-29 15:10:00    True
2018-10-29 15:15:00    True
2018-10-29 15:20:00    True
2018-10-29 15:25:00    True
2018-10-29 15:30:00    True
2018-10-29 15:35:00    True
2018-10-29 15:40:00    True
2018-10-29 15:45:00    True
Name: U, dtype: bool[pyarrow]
[130]:
day = pd.to_datetime('10/29/18 1000')
end = day + pd.to_timedelta('10h')

xprobe.U.loc[day:end]
[130]:
CEN_02 CS2_02 PRI_03 UPL_02 UPL_01 VAR_02 H15_02 GSM_02
Date
2018-10-29 10:00:00 False True <NA> False False False False True
2018-10-29 10:05:00 False True <NA> False False False False True
2018-10-29 10:10:00 False True <NA> False False False False True
2018-10-29 10:15:00 False True <NA> False False False False True
2018-10-29 10:20:00 False True <NA> False False False False True
2018-10-29 10:25:00 False True <NA> False False False False True
2018-10-29 10:30:00 False True <NA> False False False False True
2018-10-29 10:35:00 False True <NA> False False False False True
2018-10-29 10:40:00 True True <NA> False False False False True
2018-10-29 10:45:00 True True <NA> False False False False True
2018-10-29 10:50:00 True True <NA> False False False False True
2018-10-29 10:55:00 True True <NA> False False False False True
2018-10-29 11:00:00 True True <NA> False False False False True
2018-10-29 11:05:00 True True <NA> False False False False True
2018-10-29 11:10:00 True True <NA> False False False False True
2018-10-29 11:15:00 True True <NA> False False False False True
2018-10-29 11:20:00 True False <NA> False False False False True
... ... ... ... ... ... ... ... ...
2018-10-29 18:40:00 False False <NA> False False False False False
2018-10-29 18:45:00 False False <NA> False False False False False
2018-10-29 18:50:00 False False <NA> False False False False False
2018-10-29 18:55:00 False False <NA> False False False False False
2018-10-29 19:00:00 False False <NA> False False False False False
2018-10-29 19:05:00 False False <NA> False False False False False
2018-10-29 19:10:00 False False <NA> False False False False False
2018-10-29 19:15:00 False False <NA> False False False False False
2018-10-29 19:20:00 False False <NA> False False False False False
2018-10-29 19:25:00 False False <NA> False False False False False
2018-10-29 19:30:00 False False <NA> False False False False False
2018-10-29 19:35:00 False False <NA> False False False False False
2018-10-29 19:40:00 False False <NA> False False False False False
2018-10-29 19:45:00 False False <NA> False False False False False
2018-10-29 19:50:00 False False <NA> False False False False False
2018-10-29 19:55:00 False False <NA> False False False False False
2018-10-29 20:00:00 False False <NA> False False False False False

121 rows × 8 columns

[131]:
xprobe.flags.loc[day:end, 'U']
[131]:
Date
2018-10-29 10:00:00    False
2018-10-29 10:05:00    False
2018-10-29 10:10:00    False
2018-10-29 10:15:00    False
2018-10-29 10:20:00    False
2018-10-29 10:25:00    False
2018-10-29 10:30:00    False
2018-10-29 10:35:00    False
2018-10-29 10:40:00     True
2018-10-29 10:45:00     True
2018-10-29 10:50:00     True
2018-10-29 10:55:00     True
2018-10-29 11:00:00     True
2018-10-29 11:05:00     True
2018-10-29 11:10:00     True
2018-10-29 11:15:00     True
2018-10-29 11:20:00    False
                       ...
2018-10-29 18:40:00    False
2018-10-29 18:45:00    False
2018-10-29 18:50:00    False
2018-10-29 18:55:00    False
2018-10-29 19:00:00    False
2018-10-29 19:05:00    False
2018-10-29 19:10:00    False
2018-10-29 19:15:00    False
2018-10-29 19:20:00    False
2018-10-29 19:25:00    False
2018-10-29 19:30:00    False
2018-10-29 19:35:00    False
2018-10-29 19:40:00    False
2018-10-29 19:45:00    False
2018-10-29 19:50:00    False
2018-10-29 19:55:00    False
2018-10-29 20:00:00    False
Name: U, Length: 121, dtype: bool[pyarrow]
[132]:
plt.close(8)
[133]:
flagged['CEN_01'].apply_QaRules_flags(xprobe.event, xprobe.flags)


day = pd.to_datetime('10/29/18')

flagged['CEN_01'].plot_flagged_day(day, 'CEN_01', tdelta='1D', auto_qa_event=xprobe.event, paired_tank=flagged['CEN_02'].data.tank_height)
[133]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'CEN_01 - 2018-10-29 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

OK! That finally looks acceptable: U’s during the 2 pulses and C for the catchup.

11/23/18: Stair-Stepping Mini-Clogs and U Window Size

[134]:
day = pd.to_datetime('11/23/18 1800')

flagged['CEN_01'].plot_flagged_day(day, 'CEN_01', tdelta='12h', auto_qa_event=xprobe.event, paired_tank=flagged['CEN_02'].data.tank_height)
[134]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'CEN_01 - 2018-11-23 18:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)
[143]:
xppt.loc[day:end, ['CEN_01','CEN_02', 'GSM_02', 'CS2_02']].plot(grid=True, legend=True, linestyle='', marker='.')
[143]:
<Axes: xlabel='Date'>
[144]:
day = pd.to_datetime('11/23/18 1800')
end = day +  pd.to_timedelta('7h')
xprobe.U[day:end]
[144]:
CEN_02 CS2_02 PRI_03 UPL_02 UPL_01 VAR_02 H15_02 GSM_02
Date
2018-11-23 18:00:00 False True <NA> False False False False False
2018-11-23 18:05:00 False True <NA> False False False False False
2018-11-23 18:10:00 False True <NA> False False False False False
2018-11-23 18:15:00 False True <NA> False False False False False
2018-11-23 18:20:00 False True <NA> False False False False False
2018-11-23 18:25:00 False True <NA> False False False False False
2018-11-23 18:30:00 False True <NA> False False False False False
2018-11-23 18:35:00 False True <NA> False False False False False
2018-11-23 18:40:00 False True <NA> False False False False False
2018-11-23 18:45:00 False True <NA> False False False False False
2018-11-23 18:50:00 False True <NA> False False False False False
2018-11-23 18:55:00 False True <NA> False False False False False
2018-11-23 19:00:00 False True <NA> False False False False False
2018-11-23 19:05:00 False True <NA> False False False False False
2018-11-23 19:10:00 False True <NA> False False False False False
2018-11-23 19:15:00 False True <NA> False False False False False
2018-11-23 19:20:00 False True <NA> False False False False False
... ... ... ... ... ... ... ... ...
2018-11-23 23:40:00 False False <NA> False False False False False
2018-11-23 23:45:00 True False <NA> False False False False False
2018-11-23 23:50:00 True False <NA> False False False False False
2018-11-23 23:55:00 True False <NA> False False False False False
2018-11-24 00:00:00 True False <NA> False False False False False
2018-11-24 00:05:00 True False <NA> False False False False False
2018-11-24 00:10:00 True False <NA> False False False False False
2018-11-24 00:15:00 True False <NA> False False False False False
2018-11-24 00:20:00 True False <NA> False False False False False
2018-11-24 00:25:00 True False <NA> False False False False False
2018-11-24 00:30:00 False False <NA> False False False False False
2018-11-24 00:35:00 False False <NA> False False False False False
2018-11-24 00:40:00 False False <NA> False False False False True
2018-11-24 00:45:00 False False <NA> False False False False True
2018-11-24 00:50:00 False False <NA> False False False False True
2018-11-24 00:55:00 False False <NA> False False False False False
2018-11-24 01:00:00 False False <NA> False False False False False

85 rows × 8 columns

[146]:
p2pqc = xprobe.get_Probe2ProbeXQc_inst(xppt, xacc, 'CEN_02')

pair_param = param['auto_flag']['flag_x_clogs']['CEN_02']['clog_pair_flagging_wrap']

clog, U, C = p2pqc.clog_pair_flagging_wrap('CEN_01', 'CEN_02', **pair_param)
[153]:
plt.close(12)
[154]:
plt.figure()
for n, wind in enumerate(['0.5h', '0.75h', '1h', '1.5h', '2h']):
    ax1 = plt.subplot(5,1,n+1)
    _, precip_run_std = qaqc.QaRules.calc_rolling_mean(p2pqc.TOT, precision=0.2, wind=wind, nstd=1.5)
    p2pqc.TOT.loc[day:end].plot(grid=True, legend=True, ax=ax1, linestyle='', marker='.')
    precip_run_std.loc[day:end].plot(grid=True, legend=True, ax=ax1)
    plt.title(wind)
[155]:
plt.tight_layout()
[274]:
import importlib

# reload the edited module so the code change takes effect
cross_probe_qc = importlib.reload(cross_probe_qc)
[275]:
# initiate cross probe quality checks for CEN01
xprobe = cross_probe_qc.XProbesQc(xacc.index, probe)
# create table of ratios with CEN01 as base
xprobe.set_accum_ratio(xacc)
[276]:
xprobe.set_x_clogs(xppt, xacc, param['auto_flag']['flag_x_clogs'])
[277]:
p2pqc = xprobe.get_Probe2ProbeXQc_inst(xppt, xacc, 'CEN_02')

pair_param = param['auto_flag']['flag_x_clogs']['CEN_02']['clog_pair_flagging_wrap']
pair_param['window'] = '0.5h'

clog, U, C = p2pqc.clog_pair_flagging_wrap('CEN_01', 'CEN_02', **pair_param)
[278]:
U.loc[day:end]
[278]:
Date
2018-11-23 18:00:00     True
2018-11-23 18:05:00     True
2018-11-23 18:10:00    False
2018-11-23 18:15:00    False
2018-11-23 18:20:00    False
2018-11-23 18:25:00    False
2018-11-23 18:30:00    False
2018-11-23 18:35:00    False
2018-11-23 18:40:00     True
2018-11-23 18:45:00     True
2018-11-23 18:50:00     True
2018-11-23 18:55:00    False
2018-11-23 19:00:00    False
2018-11-23 19:05:00    False
2018-11-23 19:10:00    False
2018-11-23 19:15:00    False
2018-11-23 19:20:00    False
                       ...
2018-11-23 23:40:00     True
2018-11-23 23:45:00     True
2018-11-23 23:50:00     True
2018-11-23 23:55:00     True
2018-11-24 00:00:00     True
2018-11-24 00:05:00     True
2018-11-24 00:10:00     True
2018-11-24 00:15:00     True
2018-11-24 00:20:00     True
2018-11-24 00:25:00     True
2018-11-24 00:30:00    False
2018-11-24 00:35:00    False
2018-11-24 00:40:00    False
2018-11-24 00:45:00    False
2018-11-24 00:50:00    False
2018-11-24 00:55:00    False
2018-11-24 01:00:00    False
Length: 85, dtype: bool[pyarrow]
[279]:
xprobe.U['CEN_02'] = U
[280]:
p2pqc = xprobe.get_Probe2ProbeXQc_inst(xppt, xacc, 'GSM_02')

pair_param = param['auto_flag']['flag_x_clogs']['GSM_02']['clog_pair_flagging_wrap']
pair_param['window'] = '0.5h'

clog, U, C = p2pqc.clog_pair_flagging_wrap('CEN_01', 'GSM_02', **pair_param)
[281]:
xprobe.U['GSM_02'] = U
[282]:
params = qaqc._load_yaml('../qa_param.yaml')
param = params[probe]
[283]:
clog, Uwt, Cwt = xprobe.get_weight_x_clog(param['auto_flag']['weight_x_clogs'])

xprobe.flag_x_clogs(clog, Uwt, Cwt)

flagged['CEN_01'].apply_QaRules_flags(xprobe.event, xprobe.flags)
[284]:
plt.close(13)
[285]:
day = pd.to_datetime('11/23/18 1800')

flagged['CEN_01'].plot_flagged_day(day, 'CEN_01', tdelta='12h', auto_qa_event=xprobe.event, paired_tank=flagged['CEN_02'].data.tank_height)
[285]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'CEN_01 - 2018-11-23 18:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

12/18/18 Ideal Example

This one looks pretty good. The catchup/cumulative is doubled, so the C flag is followed by an E/Set0. The U’s only show up during the tank increases, but the whole period seems to have a clog event. This is the ideal flagging.

[286]:
day = pd.to_datetime('12/18/18')

flagged['CEN_01'].plot_flagged_day(day, 'CEN_01', tdelta='8D', auto_qa_event=xprobe.event, paired_tank=flagged['CEN_02'].data.tank_height)
[286]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'CEN_01 - 2018-12-18 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

2/12/19: Delayed Flagging and Low Clog Score

The flagging looks correct, but starts pretty late. Let’s see if that could be adjusted to start flagging earlier.

[287]:
day = pd.to_datetime('2/12/19')

flagged['CEN_01'].plot_flagged_day(day, 'CEN_01', tdelta='10D', auto_qa_event=xprobe.event, paired_tank=flagged['CEN_02'].data.tank_height)
[287]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'CEN_01 - 2019-02-12 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)
[318]:
plt.close(16)
[319]:
end = day + pd.to_timedelta('10D')
strt = day - pd.to_timedelta('30D')

xprobe.ratio.loc[strt:end].plot(grid=True, legend=True)
[319]:
<Axes: xlabel='Date'>
[320]:
for prb in xprobe.clog:
    pratio = xprobe.ratio.loc[strt:end, prb]
    pclog = xprobe.clog.loc[strt:end, prb]
    pratio[pclog].plot(grid=True, linestyle='', marker='.')

OK, so only UPLO and CENT ID this clog, and UPLO starts pretty late. So, unless we can get another site to come in earlier, it will continue to be late. Taking sites from top to bottom, CS2MET has some false clogging, so that can’t be tuned any more to catch this clog. HI15 and VARA both look promising. VARA was very difficult to parameterize. I’ll review that document, and focus on trying to get HI15 to ID the clog.

HI15 Clog ID: Param Adjustment

[322]:
p2p = xprobe.get_Probe2ProbeXQc_inst(xppt, xacc, 'H15_02')
[322]:
<Axes: xlabel='Date'>
[330]:
plt.close(17)
[331]:
plt.figure()
xprobe.ratio.H15_02.plot(grid=True)

# test lower window precisions
for precision in (0.025, 0.03, 0.06, 0.08):
    hclog = p2p.set_clog_event(pair=('CEN_01', 'H15_02'), min_accum=50, lowest_normal_ratio=-0.1, rolling_window='8D', window_precision=precision)
    xprobe.ratio.H15_02[hclog].plot(grid=True, linestyle='', marker='.', label=f'precision {precision}', legend=True)

# test original window precision
xprobe.ratio.H15_02[xprobe.clog.H15_02].plot(grid=True, linestyle='', marker='.', label='precision 0.12', legend=True)
[331]:
<Axes: xlabel='Date'>

There are only 2 flags that appear marginal at 0.06. It’s unclear from the parameterization doc why such a large number was chosen. Let’s see if we can revise to something closer to 0.06. This probably won’t fix this event, but it will start events earlier.

[332]:
plt.figure()
xprobe.ratio.H15_02.plot(grid=True)

# test lower window precisions in finer steps
for precision in (0.035, 0.04, 0.045, 0.05, 0.055, 0.06, 0.065):
    hclog = p2p.set_clog_event(pair=('CEN_01', 'H15_02'), min_accum=50, lowest_normal_ratio=-0.1, rolling_window='8D', window_precision=precision)
    xprobe.ratio.H15_02[hclog].plot(grid=True, linestyle='', marker='.', label=f'precision {precision}', legend=True)
[332]:
<Axes: xlabel='Date'>

0.065 avoids all false flagging. 0.055 still just barely includes the clog in question, but not enough to improve the clog ID. At 0.055 there is only 1 questionable flag.

I’ll reset this, but it doesn’t fix this problem.
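
To make the sensitivity trade-off concrete, here is a minimal sketch of a rolling-window drop test on synthetic data. The internals of set_clog_event are not shown in this notebook, so rolling_drop_flags is a hypothetical stand-in: it assumes window_precision is the minimum fall below the rolling maximum of the ratio that counts as a flag. Under that assumption, a lower precision flags a slow decline earlier and more often.

```python
import numpy as np
import pandas as pd

def rolling_drop_flags(ratio, rolling_window="8D", window_precision=0.06):
    """Hypothetical stand-in (not set_clog_event): flag points where the ratio
    sits more than window_precision below its rolling maximum."""
    rolling_max = ratio.rolling(rolling_window).max()
    return (rolling_max - ratio) > window_precision

# Synthetic ratio: flat, then a slow ten-day decline, then recovery.
idx = pd.date_range("2019-01-01", periods=30, freq="D")
ratio = pd.Series(0.0, index=idx)
ratio.iloc[10:20] = -np.linspace(0.01, 0.10, 10)

for precision in (0.025, 0.055, 0.12):
    flags = rolling_drop_flags(ratio, window_precision=precision)
    first = flags.idxmax() if flags.any() else None
    print(f"precision {precision}: {int(flags.sum())} flagged, first at {first}")
```

The same pattern shows why a smaller precision starts events earlier but also picks up marginal dips.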

VARA Clog ID

After a close review of the parameterization document, it doesn’t seem likely that tweaking any of these values will help; there are a myriad of false clogs when the sensitivity is increased.

UPLO Increased Sensitivity

After looking at the parameterization docs, making this any more sensitive would produce a ton of false flags.

4/6/19 Ideal Example

[333]:
day = pd.to_datetime('4/6/19')

flagged['CEN_01'].plot_flagged_day(day, 'CEN_01', tdelta='12D', auto_qa_event=xprobe.event, paired_tank=flagged['CEN_02'].data.tank_height)
[333]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'CEN_01 - 2019-04-06 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

1/8/20 Detune: remove clog

[334]:
day = pd.to_datetime('1/8/20')

flagged['CEN_01'].plot_flagged_day(day, 'CEN_01', tdelta='8D', auto_qa_event=xprobe.event, paired_tank=flagged['CEN_02'].data.tank_height)
[334]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'CEN_01 - 2020-01-08 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)
[335]:
end = day + pd.to_timedelta('8D')

clog.loc[day:end]
[335]:
Date
2020-01-08 00:00:00     8
2020-01-08 00:05:00     8
2020-01-08 00:10:00     8
2020-01-08 00:15:00     8
2020-01-08 00:20:00     8
2020-01-08 00:25:00     8
2020-01-08 00:30:00     8
2020-01-08 00:35:00     8
2020-01-08 00:40:00     8
2020-01-08 00:45:00     8
2020-01-08 00:50:00     8
2020-01-08 00:55:00     8
2020-01-08 01:00:00     8
2020-01-08 01:05:00     8
2020-01-08 01:10:00     8
2020-01-08 01:15:00     8
2020-01-08 01:20:00     8
                       ..
2020-01-15 22:40:00    20
2020-01-15 22:45:00    20
2020-01-15 22:50:00    20
2020-01-15 22:55:00    20
2020-01-15 23:00:00    20
2020-01-15 23:05:00    20
2020-01-15 23:10:00    20
2020-01-15 23:15:00    20
2020-01-15 23:20:00    20
2020-01-15 23:25:00    20
2020-01-15 23:30:00    20
2020-01-15 23:35:00    20
2020-01-15 23:40:00    20
2020-01-15 23:45:00    20
2020-01-15 23:50:00    20
2020-01-15 23:55:00    20
2020-01-16 00:00:00    20
Length: 2305, dtype: int64[pyarrow]
[336]:
xprobe.clog.loc[day:end]
[336]:
CEN_02 CS2_02 PRI_03 UPL_02 UPL_01 VAR_02 H15_02 GSM_02
Date
2020-01-08 00:00:00 False True <NA> False False False False False
2020-01-08 00:05:00 False True <NA> False False False False False
2020-01-08 00:10:00 False True <NA> False False False False False
2020-01-08 00:15:00 False True <NA> False False False False False
2020-01-08 00:20:00 False True <NA> False False False False False
2020-01-08 00:25:00 False True <NA> False False False False False
2020-01-08 00:30:00 False True <NA> False False False False False
2020-01-08 00:35:00 False True <NA> False False False False False
2020-01-08 00:40:00 False True <NA> False False False False False
2020-01-08 00:45:00 False True <NA> False False False False False
2020-01-08 00:50:00 False True <NA> False False False False False
2020-01-08 00:55:00 False True <NA> False False False False False
2020-01-08 01:00:00 False True <NA> False False False False False
2020-01-08 01:05:00 False True <NA> False False False False False
2020-01-08 01:10:00 False True <NA> False False False False False
2020-01-08 01:15:00 False True <NA> False False False False False
2020-01-08 01:20:00 False True <NA> False False False False False
... ... ... ... ... ... ... ... ...
2020-01-15 22:40:00 False False <NA> True True False False False
2020-01-15 22:45:00 False False <NA> True True False False False
2020-01-15 22:50:00 False False <NA> True True False False False
2020-01-15 22:55:00 False False <NA> True True False False False
2020-01-15 23:00:00 False False <NA> True True False False False
2020-01-15 23:05:00 False False <NA> True True False False False
2020-01-15 23:10:00 False False <NA> True True False False False
2020-01-15 23:15:00 False False <NA> True True False False False
2020-01-15 23:20:00 False False <NA> True True False False False
2020-01-15 23:25:00 False False <NA> True True False False False
2020-01-15 23:30:00 False False <NA> True True False False False
2020-01-15 23:35:00 False False <NA> True True False False False
2020-01-15 23:40:00 False False <NA> True True False False False
2020-01-15 23:45:00 False False <NA> True True False False False
2020-01-15 23:50:00 False False <NA> True True False False False
2020-01-15 23:55:00 False False <NA> True True False False False
2020-01-16 00:00:00 False False <NA> True True False False False

2305 rows × 8 columns

[344]:
plt.close(22)
[345]:
end = day + pd.to_timedelta('10D')
strt = day - pd.to_timedelta('30D')

xprobe.ratio.loc[strt:end].plot(grid=True, legend=True)
[345]:
<Axes: xlabel='Date'>
[346]:
for prb in xprobe.clog:
    pratio = xprobe.ratio.loc[strt:end, prb]
    pclog = xprobe.clog.loc[strt:end, prb]
    pratio[pclog].plot(grid=True, linestyle='', marker='.')
[347]:
plt.figure()
clog.loc[day:end].plot(grid=True)
[347]:
<Axes: xlabel='Date'>

OK, it looks like UPLO and CS2MET were legitimately getting a lot more precip than CENT. I’m surprised VARA isn’t triggering this as well, but it was probably detuned to avoid those two large reverse clogs (clogs at VARA). What is confusing is why CENT SH kicks in as a clog. Once that kicks in, the total clog score is well above 66. And what’s most confusing is that it seems like the SA got more precip during this storm.
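
To see how one extra probe can push the total past the cutoff, here is a rough sketch of a weighted vote. The weights are hypothetical (the real ones come from qa_param.yaml via get_weight_x_clog), and treating a missing comparison as "no vote" is an assumption, not necessarily what flag_x_clogs does. Only the threshold of 66 is taken from the note above.

```python
import pandas as pd

# Hypothetical per-probe weights, for illustration only.
weights = pd.Series({"CEN_02": 50, "CS2_02": 8, "UPL_02": 10, "UPL_01": 10})
threshold = 66  # composite score above which a clog is declared

idx = pd.date_range("2020-01-08", periods=3, freq="D")
clog = pd.DataFrame(
    {
        "CEN_02": [False, False, True],
        "CS2_02": [True, False, True],
        "UPL_02": [False, True, True],
        "UPL_01": [False, True, pd.NA],
    },
    index=idx,
).astype("boolean")

# Assumption: a missing comparison counts as no vote, so it cannot raise the score.
score = clog.fillna(False).astype(int).mul(weights, axis=1).sum(axis=1)
is_clog = score > threshold
print(pd.DataFrame({"score": score, "clog": is_clog}))
```

With weights like these, the outlying sites alone stay far below the cutoff, and it is the heavily weighted co-located probe that tips the composite over.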

[352]:
flagged['CEN_01'].event.loc[day:end].QaRule_flag.unique()
[352]:
<ArrowExtensionArray>
[                      '',  'UUUUUUUUUUUUUUUUUUUUU',     'UUUUUUUUUUUUUUUUUU',
                    'CCC',                     'UU',       'UUUUUUUUUUUUUUUU',
                      'U',                  'CCCCC',     'UUUUUUUUUUUUUUUUCC',
                     'CC',                 'UUUUUU',                 'UCCCCC',
 'CCCCCCCCCCCCCCCCCCCCCC',                  'UUUUU',                    'UUU',
                 'UCCCUU',                'CUCCCCC',              'UUUUCCCCC',
   'UUUUUUUUUUUUUUUUUUUU',      'UUUUUUUUUUUUUUUUU',        'UUUUUUUUUUUUUUU',
 'CCCCCCCCCCCCCCCCCCCCUU',                 'CCCCCC',                      'C',
  'UUUUUUUUUUUUUUUUUUUCC',                   'CUUU', 'CCCCUUUUUUUUUUUUCUUUUU',
    'UUUUUUUUUUUUUUUUUUU',    'CUUUUUUUUUUUUUUUUUU',                    'UCC',
                   'UUUU',    'CUUUUUUUUUUUUUUUUCC',                   'CUCC',
              'CUUUCCCCC',              'UUUUCCCUU',   'CCCCCCCCCCCCCCCCCUCC',
   'CCCCCCCCCCCCCCCCCUUU',    'CCCCCCCCCCCCCCCCCUU',                'CUCCCUU',
 'CCCCUUUUUUUUUUUUCUUUCC']
Length: 40, dtype: string[pyarrow]

OK, I’ve rerun things a lot… Let’s get fresh data.

Rerun (Fresh Data)

[353]:
# Get QA/QC'd data for all probes
# main.main runs Clog QA, so this ONLY runs other QA routines
all_probes = main.load_data(2019, 2024, fname_base='MS00413_PPT_L1_5min_', data_path='../config_new.yaml')

params = qaqc._load_yaml('../qa_param.yaml')
probes = params.keys()

flagged = {}
for probe in probes:
    site = probe[:3]
    nprobe = probe[-2:]
    df = all_probes.pivot_on_probe(all_probes.df, site, nprobe)

    param = params[probe]

    qa_flags, qa_events = main.qc_provisional(df, param)
    flags = main.apply_all_flags(df, qa_flags, qa_events, param)

    print(f'All quality checks and quality assurance rules applied to {probe}\n------------------\n')

    flagged[probe] = flags
Loading all PPT data from ../config_new.yaml

All quality checks and quality assurance rules applied to VAR_02
------------------

All quality checks and quality assurance rules applied to UPL_01
------------------

All quality checks and quality assurance rules applied to UPL_02
------------------

All quality checks and quality assurance rules applied to CEN_01
------------------

All quality checks and quality assurance rules applied to CEN_02
------------------

All quality checks and quality assurance rules applied to CS2_02
------------------

All quality checks and quality assurance rules applied to PRI_03
------------------

All quality checks and quality assurance rules applied to H15_02
------------------

All quality checks and quality assurance rules applied to GSM_02
------------------

[354]:
# build pivot table for cross site comparison
xppt = cross_probe_qc.BuildXTable.assemble_cross_table(flagged, ppt_col='adj_precip')
xacc = cross_probe_qc.BuildXTable.assemble_wy_acc(xppt)
[355]:
# Get parameters for probe
params = qaqc._load_yaml('../qa_param.yaml')

probe = 'CEN_01'
param = params[probe]
[356]:
# initiate cross probe quality checks for CEN01
xprobe = cross_probe_qc.XProbesQc(xacc.index, probe)
# create table of ratios with CEN01 as base
xprobe.set_accum_ratio(xacc)
[357]:
xprobe.set_x_clogs(xppt, xacc, param['auto_flag']['flag_x_clogs'])
[358]:
eventwt, Uwt, Cwt = xprobe.get_weight_x_clog(param['auto_flag']['weight_x_clogs'])
xprobe.flag_x_clogs(eventwt, Uwt, Cwt)

Dig in to Data/Flags

[361]:
flagged['CEN_01'].event.loc[day:end].QaRule_flag.unique()
[361]:
<ArrowExtensionArray>
['']
Length: 1, dtype: string[pyarrow]
[363]:
flagged['CEN_02'].event.loc[day:end].QaRule_flag.unique()
[363]:
<ArrowExtensionArray>
['']
Length: 1, dtype: string[pyarrow]
[364]:
flagged['CEN_02'].event.loc[day:end].explanation.unique()
[364]:
<ArrowExtensionArray>
['', 'QaRule AutoFlag: drain_event; ']
Length: 2, dtype: string[pyarrow]

Where is this drain?

[375]:
# select only the rows explained by a drain event
events = flagged['CEN_02'].event.loc[day:end]
events[events['explanation'] == 'QaRule AutoFlag: drain_event; ']
[375]:
prov_flag QaRule_flag manual_flag final_flag event_code explanation
Date
2020-01-12 14:05:00 <NA> DRAIN QaRule AutoFlag: drain_event;
2020-01-13 22:50:00 <NA> DRAIN QaRule AutoFlag: drain_event;
2020-01-15 12:20:00 <NA> DRAIN QaRule AutoFlag: drain_event;
2020-01-15 16:50:00 <NA> DRAIN QaRule AutoFlag: drain_event;
[370]:
plt.close(24)
[391]:
plt.figure()
flagged['CEN_02'].data.tank_height.loc[strt:end].plot(grid=True)
[391]:
<Axes: xlabel='Date'>

OK, two problems here. First, this is not a drain. The threshold for drains is set at -25… Ahh, but drain events are set separately from neg_tank_delta. OK, fixed in the source code for set_drain_event.

But the other problem is this weird flat line and dip back in December.
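
For the first problem, this is roughly what a single-threshold drain test looks like. It is only a sketch: find_drains and its default are illustrative, not the package's set_drain_event, and the tank values are made up. The assumption is that a drain is any single-step fall in tank height of at least 25 mm, so small negative noise is ignored.

```python
import pandas as pd

def find_drains(tank_height, drain_threshold=-25.0):
    """Hypothetical helper: mark steps where tank height falls by at least
    |drain_threshold| mm between consecutive readings."""
    delta = tank_height.diff()
    return delta <= drain_threshold

# Made-up 5-minute tank heights (mm) with one abrupt fall and some noise.
idx = pd.date_range("2023-10-25 15:30", periods=5, freq="5min")
tank = pd.Series([73.80, 74.27, 41.93, 41.88, 42.05], index=idx)

drains = find_drains(tank)
print(tank[drains])
```

Keeping the drain test and the negative-delta test on one shared threshold is one way to avoid the mismatch described above.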

[376]:
plt.figure()
flagged['CEN_02'].data.tank_height.loc[strt:end].plot(grid=True)
[376]:
<Axes: xlabel='Date'>
[377]:
flagged['CEN_01'].data.tank_height.loc[strt:end].plot(grid=True)
[377]:
<Axes: xlabel='Date'>
[378]:
ax1 = xacc.loc[strt:end, ['CEN_01', 'CEN_02']].plot(grid=True, legend=True)

OK, so that’s a little bit of missing data. Shouldn’t really impact things. Let’s re-graph that with a 0 start point.

[387]:
strt = pd.to_datetime('12/11/19 1200')
end = strt + pd.to_timedelta('24h')
flagged['CEN_02'].event.loc[strt:end]
[387]:
prov_flag QaRule_flag manual_flag final_flag event_code explanation
Date
2019-12-11 12:00:00 <NA>
2019-12-11 12:05:00 <NA>
2019-12-11 12:10:00 <NA>
2019-12-11 12:15:00 <NA>
2019-12-11 12:20:00 <NA>
2019-12-11 12:25:00 <NA>
2019-12-11 12:30:00 <NA>
2019-12-11 12:35:00 MMM M
2019-12-11 12:40:00 MMM M
2019-12-11 12:45:00 MMM M
2019-12-11 12:50:00 MMM M
2019-12-11 12:55:00 MMM M
2019-12-11 13:00:00 MMM M
2019-12-11 13:05:00 MMM M
2019-12-11 13:10:00 MMM M
2019-12-11 13:15:00 MMM M
2019-12-11 13:20:00 MMM M
... ... ... ... ... ... ...
2019-12-12 10:40:00 MMM M
2019-12-12 10:45:00 MMM M
2019-12-12 10:50:00 MMM M
2019-12-12 10:55:00 MMM M
2019-12-12 11:00:00 MMM M
2019-12-12 11:05:00 MMM M
2019-12-12 11:10:00 MMM M
2019-12-12 11:15:00 MMM M
2019-12-12 11:20:00 MMM M
2019-12-12 11:25:00 MMM M
2019-12-12 11:30:00 <NA>
2019-12-12 11:35:00 <NA>
2019-12-12 11:40:00 <NA>
2019-12-12 11:45:00 <NA>
2019-12-12 11:50:00 <NA>
2019-12-12 11:55:00 <NA>
2019-12-12 12:00:00 <NA>

289 rows × 6 columns

[388]:
flagged['CEN_02'].data.loc[strt:end]
[388]:
tank_height precip adj_precip
Date
2019-12-11 12:00:00 55.279999 0.0 0.0
2019-12-11 12:05:00 54.959999 0.0 0.0
2019-12-11 12:10:00 55.279999 0.0 0.0
2019-12-11 12:15:00 55.279999 0.0 0.0
2019-12-11 12:20:00 55.119999 0.0 0.0
2019-12-11 12:25:00 55.119999 0.0 0.0
2019-12-11 12:30:00 55.279999 0.0 0.0
2019-12-11 12:35:00 55.279999 0.0 <NA>
2019-12-11 12:40:00 55.279999 0.0 <NA>
2019-12-11 12:45:00 55.279999 0.0 <NA>
2019-12-11 12:50:00 55.279999 0.0 <NA>
2019-12-11 12:55:00 55.279999 0.0 <NA>
2019-12-11 13:00:00 55.279999 0.0 <NA>
2019-12-11 13:05:00 55.279999 0.0 <NA>
2019-12-11 13:10:00 55.279999 0.0 <NA>
2019-12-11 13:15:00 55.279999 0.0 <NA>
2019-12-11 13:20:00 55.279999 0.0 <NA>
... ... ... ...
2019-12-12 10:40:00 55.279999 0.0 <NA>
2019-12-12 10:45:00 55.279999 0.0 <NA>
2019-12-12 10:50:00 55.279999 0.0 <NA>
2019-12-12 10:55:00 55.279999 0.0 <NA>
2019-12-12 11:00:00 55.279999 0.0 <NA>
2019-12-12 11:05:00 55.279999 0.0 <NA>
2019-12-12 11:10:00 55.279999 0.0 <NA>
2019-12-12 11:15:00 55.279999 0.0 <NA>
2019-12-12 11:20:00 55.279999 0.0 <NA>
2019-12-12 11:25:00 55.279999 0.0 <NA>
2019-12-12 11:30:00 82.300003 26.860001 26.800001
2019-12-12 11:35:00 83.099998 0.8 0.8
2019-12-12 11:40:00 83.800003 0.7 0.4
2019-12-12 11:45:00 84.599998 0.8 0.8
2019-12-12 11:50:00 84.900002 0.3 0.0
2019-12-12 11:55:00 85.300003 0.4 0.4
2019-12-12 12:00:00 85.599998 0.3 0.0

289 rows × 3 columns

[397]:
plt.close(30)
[465]:
strt = pd.to_datetime('1/4/20')
end = pd.to_datetime('1/17/20')
[398]:
cacc = xacc.loc[strt:end, ['CEN_01', 'CEN_02']]
cacc -= cacc.iloc[0]


cacc.plot(grid=True, legend=True)
[398]:
<Axes: xlabel='Date'>
[399]:
cratio = (cacc['CEN_01'] - cacc['CEN_02'])/ cacc['CEN_01']

plt.figure()
cratio.plot(grid=True)
[399]:
<Axes: xlabel='Date'>

So, even though CEN01 accumulates more, I guess there are a few little periods where CEN02 accumulates more. This creates a few periods of dropping ratio. I’m going to recalculate with the whole record, but I guess this is just a subtle drop over a long period.

[466]:
plt.close(41)
[467]:
cratio = (xacc['CEN_01'] - xacc['CEN_02'])/ xacc['CEN_01']

plt.figure()
cratio[strt:end].plot(grid=True)
[467]:
<Axes: xlabel='Date'>

Wow, when you isolate it like that, it really does look like a dramatic downward trend.
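
The two ratio calculations can disagree quite a bit, and a toy case helps show why. This sketch uses synthetic constant rainfall (10 mm/day, with the paired gauge catching 30% more for the last 30 days), and accum_ratio just mirrors the form of cratio above. On the whole record the earlier accumulation damps the ratio; after zeroing at the window start the full difference shows up immediately.

```python
import pandas as pd

idx = pd.date_range("2019-10-01", periods=120, freq="D")
base = pd.Series(10.0, index=idx)   # synthetic: 10 mm/day at the base gauge
pair = base.copy()
pair.iloc[90:] = 13.0               # paired gauge catches 30% more at the end

acc = pd.DataFrame({"base": base.cumsum(), "pair": pair.cumsum()})

def accum_ratio(acc):
    """(base - pair) / base, the same form as cratio."""
    return (acc["base"] - acc["pair"]) / acc["base"]

whole = accum_ratio(acc)                              # ratio on the full accumulation
rebased = accum_ratio(acc.iloc[90:] - acc.iloc[89])   # zeroed at the window start

print("whole record, last day:", whole.iloc[-1])
print("rebased window, last day:", rebased.iloc[-1])
```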

Adjust CEN Parameter

So this looks like it’s a legitimate “clog”. Let’s see if we can detune it a little.

[497]:
p2pqc = xprobe.get_Probe2ProbeXQc_inst(xppt, xacc, 'CEN_02')

pair_param = param['auto_flag']['flag_x_clogs']['CEN_02']['clog_pair_flagging_wrap']
pair_param['window_precision'] = 0.02

clog, U, C = p2pqc.clog_pair_flagging_wrap('CEN_01', 'CEN_02', **pair_param)
[498]:
xprobe.clog['CEN_02'] = clog
xprobe.U['CEN_02'] = U
xprobe.C['CEN_02'] = C

clog, uwt, cwt = xprobe.get_weight_x_clog(param['auto_flag']['weight_x_clogs'])
xprobe.flag_x_clogs(clog, uwt, cwt)
[499]:
plt.close(42)
[500]:
ax1 = xacc[['CEN_01', 'CEN_02']].plot(grid=True, legend=True)
xacc.loc[xprobe.event.clog==True, ['CEN_01']].plot(grid=True, linestyle='', marker='.', ax=ax1, label='clogs')
[500]:
<Axes: xlabel='Date'>
[446]:
end = day + pd.to_timedelta('10D')
strt = day - pd.to_timedelta('30D')

xprobe.ratio.loc[strt:end].plot(grid=True, legend=True)

for prb in xprobe.clog:
    pratio = xprobe.ratio.loc[strt:end, prb]
    pclog = xprobe.clog.loc[strt:end, prb]
    pratio[pclog].plot(grid=True, linestyle='', marker='.')

OK, that got rid of that clog.

Check 2019 Clogs Still Work

[447]:
flagged['CEN_01'].apply_QaRules_flags(xprobe.event, xprobe.flags)

day = pd.to_datetime('10/29/18')

flagged['CEN_01'].plot_flagged_day(day, 'CEN_01', tdelta='1D', auto_qa_event=xprobe.event, paired_tank=flagged['CEN_02'].data.tank_height)
[447]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'CEN_01 - 2018-10-29 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)
[448]:
day = pd.to_datetime('11/23/18 1800')

flagged['CEN_01'].plot_flagged_day(day, 'CEN_01', tdelta='12h', auto_qa_event=xprobe.event, paired_tank=flagged['CEN_02'].data.tank_height)
[448]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'CEN_01 - 2018-11-23 18:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

OK, that one went away. So the manual flags that went with it need to go away too.

[449]:
day = pd.to_datetime('12/18/18')

flagged['CEN_01'].plot_flagged_day(day, 'CEN_01', tdelta='8D', auto_qa_event=xprobe.event, paired_tank=flagged['CEN_02'].data.tank_height)
[449]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'CEN_01 - 2018-12-18 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)
[450]:
day = pd.to_datetime('4/6/19')

flagged['CEN_01'].plot_flagged_day(day, 'CEN_01', tdelta='12D', auto_qa_event=xprobe.event, paired_tank=flagged['CEN_02'].data.tank_height)
[450]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'CEN_01 - 2019-04-06 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

10/23/20 Remove with manual flags

[452]:
plt.close(38)
[453]:
day = pd.to_datetime('10/23/20 1200')

flagged['CEN_01'].plot_flagged_day(day, 'CEN_01', tdelta='1D', auto_qa_event=xprobe.event, paired_tank=flagged['CEN_02'].data.tank_height)
[453]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'CEN_01 - 2020-10-23 12:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

10/25/21: Lowest Ratio, GCE Missing During Clog, and Day 0 Clogs

[501]:
plt.close(39)
[502]:
day = pd.to_datetime('10/23/21')

flagged['CEN_01'].plot_flagged_day(day, 'CEN_01', tdelta='14D', auto_qa_event=xprobe.event, paired_tank=flagged['CEN_02'].data.tank_height)
[502]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'CEN_01 - 2021-10-23 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)
[512]:
plt.close(44)
[513]:
end = day + pd.to_timedelta('12D')
strt = day - pd.to_timedelta('8D')

xprobe.ratio.loc[strt:end].plot(grid=True, legend=True)

for prb in xprobe.clog:
    pratio = xprobe.ratio.loc[strt:end, prb]
    pclog = xprobe.clog.loc[strt:end, prb]
    pratio[pclog].plot(grid=True, linestyle='', marker='.')

I don’t understand why that registers as a dropping ratio; it clearly looks like an uptrend at GSM and CEN. I think I’ll need to actually graph out the running means to make sense of this one. Ahhh! But both of those are below the lowest normal ratio! Let’s take a look at the whole record and adjust.

[518]:
plt.close(47)
[519]:
plt.figure()
plt.subplot(211)
xprobe.ratio['CEN_02'].plot(grid=True)

ax1 = plt.subplot(212)
xacc[['CEN_01', 'CEN_02']].plot(grid=True, legend=True, ax=ax1)
[519]:
<Axes: xlabel='Date'>

So either the minimum accumulation needs to be way bigger (currently 35 mm), or the lowest normal needs to change.

However, the bigger issue is that there is a real clog that isn’t being flagged. The NA values should be flagged as a real clog, since the drain on the SA was left open (Issue #82).

This is tricky. The clog looks like it starts before there is any accumulation. So, replacing the NA with zero will result in a ratio of (0-pair)/0. Because that divides by 0, the clog still can’t be identified. We could try to add a tiny amount at the start of each water year to avoid divide by 0 issues.

OK, on closer inspection, the first rain event is caught. So this isn’t a zero problem, the problem is that both gauges won’t register a ratio until they are above 35 mm, which is well after the clog. This will require a manual clog flag and adjusted lower limits.
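
Both effects are easy to reproduce on a toy water-year start. This is a sketch with made-up accumulations: the ratio uses the (base - pair)/base form from earlier, and the gating rule (no ratio until both gauges reach min_accum) is my reading of the behavior described above, not the package code.

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2021-10-01", periods=6, freq="D")
acc = pd.DataFrame(
    {"base": [0.0, 0.0, 0.0, 10.0, 30.0, 60.0],    # base gauge catches nothing at first
     "pair": [0.0, 12.0, 25.0, 40.0, 55.0, 80.0]},
    index=idx,
)

# With zero base accumulation the ratio is 0/0 (NaN) or -x/0 (-inf).
raw = (acc["base"] - acc["pair"]) / acc["base"]

min_accum = 35  # mm, the current minimum accumulation
ready = (acc["base"] >= min_accum) & (acc["pair"] >= min_accum)
gated = raw.where(ready)  # NaN until both gauges clear the minimum

print(pd.DataFrame({"raw": raw, "gated": gated}))
```

By the time the gated ratio appears, the early clog is already over, which is why this one needs a manual flag.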

[609]:
p2pqc = xprobe.get_Probe2ProbeXQc_inst(xppt, xacc, 'CEN_02')

pair_param = param['auto_flag']['flag_x_clogs']['CEN_02']['clog_pair_flagging_wrap']
pair_param['lowest_normal_ratio'] = -0.355

clog, U, C = p2pqc.clog_pair_flagging_wrap('CEN_01', 'CEN_02', **pair_param)
[610]:
plt.close(46)
[611]:
plt.figure()
xprobe.ratio['CEN_02'].plot(grid=True)
xprobe.ratio.loc[clog, 'CEN_02'].plot(grid=True, linestyle='', marker='.')
[611]:
<Axes: xlabel='Date'>
[613]:
p2pqc = xprobe.get_Probe2ProbeXQc_inst(xppt, xacc, 'H15_02')

pair_param = param['auto_flag']['flag_x_clogs']['H15_02']['clog_pair_flagging_wrap']
pair_param['lowest_normal_ratio'] = -0.245

clog, U, C = p2pqc.clog_pair_flagging_wrap('CEN_01', 'H15_02', **pair_param)
[616]:
plt.close(47)
[617]:
plt.figure()
xprobe.ratio['H15_02'].plot(grid=True)
xprobe.ratio.loc[clog, 'H15_02'].plot(grid=True, linestyle='', marker='.')
[617]:
<Axes: xlabel='Date'>
[620]:
p2pqc = xprobe.get_Probe2ProbeXQc_inst(xppt, xacc, 'GSM_02')

pair_param = param['auto_flag']['flag_x_clogs']['GSM_02']['clog_pair_flagging_wrap']
pair_param['lowest_normal_ratio'] = -0.643

clog, U, C = p2pqc.clog_pair_flagging_wrap('CEN_01', 'GSM_02', **pair_param)
[621]:
plt.close(48)
[622]:
plt.figure()
xprobe.ratio['GSM_02'].plot(grid=True)
xprobe.ratio.loc[clog, 'GSM_02'].plot(grid=True, linestyle='', marker='.')
[622]:
<Axes: xlabel='Date'>
[740]:
# Get new parameters for probe
params = qaqc._load_yaml('../qa_param.yaml')

probe = 'CEN_01'
param = params[probe]
[727]:
xprobe.set_x_clogs(xppt, xacc, param['auto_flag']['flag_x_clogs'])

clog, uwt, cwt = xprobe.get_weight_x_clog(param['auto_flag']['weight_x_clogs'])
xprobe.flag_x_clogs(clog, uwt, cwt)
[730]:
plt.close(53)
[731]:
xprobe.ratio.loc[strt:end].plot(grid=True, legend=True)

for prb in xprobe.clog:
    pratio = xprobe.ratio.loc[strt:end, prb]
    pclog = xprobe.clog.loc[strt:end, prb]
    pratio[pclog].plot(grid=True, linestyle='', marker='.')

OK, that looks better. Let’s check the big picture.

[733]:
plt.close(50)
[734]:
ax1 = xacc[['CEN_01', 'CEN_02']].plot(grid=True, legend=True)
xacc.loc[xprobe.event.clog==True, ['CEN_01']].plot(grid=True, linestyle='', marker='.', ax=ax1, label='clogs')

[734]:
<Axes: xlabel='Date'>

Well this seems to have fixed a lot. But we lost the clog in October 2018. Let’s look at the ratio there and see what we need to do to add it back.

Lost Oct 2018 clog

[626]:
strt = pd.to_datetime('10/22/18')
end = strt + pd.to_timedelta('17D')

xprobe.ratio.loc[strt:end].plot(grid=True, legend=True)

for prb in xprobe.clog:
    pratio = xprobe.ratio.loc[strt:end, prb]
    pclog = xprobe.clog.loc[strt:end, prb]
    pratio[pclog].plot(grid=True, linestyle='', marker='.')

UPLO, HI15, and VARA missed a few storm cycles, so they have a wave pattern that won’t let them flag the precip. Mack and CS2MET both ID the clog, but without CENT, they can’t trigger it. Let’s see if I can retune it again to be a little more sensitive.

[723]:
p2pqc = xprobe.get_Probe2ProbeXQc_inst(xppt, xacc, 'CEN_02')

pair_param = param['auto_flag']['flag_x_clogs']['CEN_02']['clog_pair_flagging_wrap']
pair_param['rolling_window'] = '6D'
pair_param['window_precision'] = 0.018

clog, U, C = p2pqc.clog_pair_flagging_wrap('CEN_01', 'CEN_02', **pair_param)
[724]:
plt.close(52)
[725]:
plt.figure()
xprobe.ratio['CEN_02'].plot(grid=True)
xprobe.ratio.loc[clog, 'CEN_02'].plot(grid=True, linestyle='', marker='.')
[725]:
<Axes: xlabel='Date'>

Every time I make a change, it just seems to reintroduce a lot of overflagging. Plus, the changes only marginally flag the day we want them to. I think this will have to be a manual flag.

Quick Code Check

The current creation of the ACC table is a little idiosyncratic. It mostly seems to be trying to create clear distinctions of class methods, but let’s quickly double-check that there isn’t a time penalty.

[551]:
# Current setup
pd.options.display.min_rows = 15

xppt['CEN_01'].groupby(pd.Grouper(freq='YE-SEP')).cumsum()
[551]:
Date
2018-10-01 00:05:00            0.0
2018-10-01 00:10:00            0.0
2018-10-01 00:15:00            0.0
2018-10-01 00:20:00            0.0
2018-10-01 00:25:00            0.0
2018-10-01 00:30:00            0.0
2018-10-01 00:35:00            0.0
                          ...
2024-09-30 23:30:00    2059.400146
2024-09-30 23:35:00    2059.400146
2024-09-30 23:40:00    2059.400146
2024-09-30 23:45:00    2059.400146
2024-09-30 23:50:00    2059.400146
2024-09-30 23:55:00    2059.400146
2024-10-01 00:00:00            0.0
Name: CEN_01, Length: 631296, dtype: float[pyarrow]
[591]:
def calc_wy_acc(data_series):
    return data_series.groupby(pd.Grouper(freq='YE-SEP')).cumsum()
[592]:
%timeit xppt.transform(calc_wy_acc)
56.3 ms ± 1.23 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
[595]:
%timeit xppt.apply(calc_wy_acc)
59.5 ms ± 6.13 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
[596]:
%timeit xppt.groupby(pd.Grouper(freq='YE-SEP')).cumsum()
36.2 ms ± 1.1 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
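
The timing only matters if the column-wise and whole-frame versions give the same answer, so here is a small check on synthetic data. As an assumption for portability, it groups on an explicit water-year key (October through September, labelled by the ending year) instead of pd.Grouper(freq='YE-SEP'); the frame and column names are made up.

```python
import pandas as pd

idx = pd.date_range("2018-09-29", "2018-10-03", freq="D")
ppt = pd.DataFrame(
    {"A": [1.0, 2.0, 3.0, 4.0, 5.0], "B": [0.5, 0.5, 0.5, 0.5, 0.5]},
    index=idx,
)

# Water year labelled by the calendar year in which it ends (Oct 1 to Sep 30).
water_year = pd.Series(idx.year + (idx.month >= 10).astype(int), index=idx)

# Column-by-column cumulative sum versus one groupby on the whole frame.
per_column = ppt.apply(lambda s: s.groupby(water_year).cumsum())
whole_frame = ppt.groupby(water_year).cumsum()

assert per_column.equals(whole_frame)
print(whole_frame)
```

The accumulation resets on October 1 in both versions, so the faster whole-frame groupby should be a safe swap if the real table behaves the same way.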

10/23/23: Overflag

[743]:
# Get new parameters for probe
params = qaqc._load_yaml('../qa_param.yaml')

probe = 'CEN_01'
param = params[probe]
[744]:
site = 'CEN'
nprobe = '01'

df = all_probes.pivot_on_probe(all_probes.df, site, nprobe)

qa_flags, qa_events = main.qc_provisional(df, param)
flagged[probe] = main.apply_all_flags(df, qa_flags, qa_events, param)
[745]:
xprobe.set_x_clogs(xppt, xacc, param['auto_flag']['flag_x_clogs'])

clog, uwt, cwt = xprobe.get_weight_x_clog(param['auto_flag']['weight_x_clogs'])
xprobe.flag_x_clogs(clog, uwt, cwt)
[746]:
flagged['CEN_01'].apply_QaRules_flags(xprobe.event, xprobe.flags)
[755]:
plt.close(55)
[756]:
day = pd.to_datetime('10/23/23')

flagged['CEN_01'].plot_flagged_day(day, 'CEN_01', tdelta='13D', auto_qa_event=xprobe.event, paired_tank=flagged['CEN_02'].data.tank_height)
[756]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'CEN_01 - 2023-10-23 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

Something crazy is going on with the shelter. Let’s take a look at the notes.

[760]:
strt, end = pd.to_datetime('10/25/23 1500'), pd.to_datetime('10/30/23 0930')

df = all_probes.pivot_on_probe(all_probes.df, 'CEN', '02')

pd.options.display.min_rows = 30
df[strt:end]
[760]:
INST INST_Flag TOT TOT_Flag ACC ACC_Flag
Date
2023-10-25 15:00:00 73.010002 <NA> 0.0 <NA> 160.619995 <NA>
2023-10-25 15:05:00 73.050003 <NA> 0.03 <NA> 160.649994 <NA>
2023-10-25 15:10:00 72.93 <NA> 0.0 <NA> 160.649994 <NA>
2023-10-25 15:15:00 73.760002 <NA> 0.71 <NA> 161.360001 <NA>
2023-10-25 15:20:00 73.800003 <NA> 0.04 <NA> 161.399994 <NA>
2023-10-25 15:25:00 73.75 <NA> 0.0 <NA> 161.399994 <NA>
2023-10-25 15:30:00 73.800003 <NA> 0.0 <NA> 161.399994 <NA>
2023-10-25 15:35:00 74.269997 <NA> 0.47 <NA> 161.869995 <NA>
2023-10-25 15:40:00 41.93 <NA> 0.0 R 161.869995 R
2023-10-25 15:45:00 41.880001 <NA> 0.0 <NA> 161.869995 <NA>
2023-10-25 15:50:00 42.049999 <NA> 0.12 <NA> 161.990005 <NA>
2023-10-25 15:55:00 42.02 <NA> 0.0 <NA> 161.990005 <NA>
2023-10-25 16:00:00 42.029999 <NA> 0.0 <NA> 161.990005 <NA>
2023-10-25 16:05:00 42.040001 <NA> 0.0 <NA> 161.990005 <NA>
2023-10-25 16:10:00 42.040001 <NA> 0.0 <NA> 161.990005 <NA>
... ... ... ... ... ... ...
2023-10-30 08:20:00 53.130001 <NA> 0.0 <NA> 172.389999 <NA>
2023-10-30 08:25:00 53.23 <NA> 0.0 <NA> 172.389999 <NA>
2023-10-30 08:30:00 53.349998 <NA> 0.9 W 173.289993 W
2023-10-30 08:35:00 53.419998 <NA> 0.07 <NA> 173.360001 <NA>
2023-10-30 08:40:00 53.43 <NA> 0.01 <NA> 173.369995 <NA>
2023-10-30 08:45:00 53.639999 <NA> 0.21 <NA> 173.580002 <NA>
2023-10-30 08:50:00 53.419998 <NA> 0.0 <NA> 173.580002 <NA>
2023-10-30 08:55:00 53.41 <NA> 0.0 <NA> 173.580002 <NA>
2023-10-30 09:00:00 53.43 <NA> 0.0 <NA> 173.580002 <NA>
2023-10-30 09:05:00 53.43 <NA> 0.0 <NA> 173.580002 <NA>
2023-10-30 09:10:00 40.970001 <NA> 0.0 R 173.580002 R
2023-10-30 09:15:00 40.98 <NA> 0.01 <NA> 173.589996 <NA>
2023-10-30 09:20:00 40.959999 <NA> 0.0 <NA> 173.589996 <NA>
2023-10-30 09:25:00 41.0 <NA> 0.02 <NA> 173.610001 <NA>
2023-10-30 09:30:00 41.200001 <NA> 0.2 <NA> 173.809998 <NA>

1375 rows × 6 columns

[762]:
plt.close(56)
[763]:
strt, end = pd.to_datetime('10/20/23 1500'), pd.to_datetime('11/10/23 0930')

xprobe.ratio.loc[strt:end].plot(grid=True, legend=True)

for prb in xprobe.clog:
    pratio = xprobe.ratio.loc[strt:end, prb]
    pclog = xprobe.clog.loc[strt:end, prb]
    pratio[pclog].plot(grid=True, linestyle='', marker='.')

Wow, they all agree.

[764]:
flagged['CEN_01'].plot_flagged_day(day, 'CEN_01', tdelta='13D', auto_qa_event=xprobe.event, paired_tank=flagged['VAR_02'].data.tank_height)
[764]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'CEN_01 - 2023-10-23 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)
[768]:
plt.close(58)
[769]:
flagged['CEN_01'].plot_flagged_day(day, 'CEN_01', tdelta='13D', auto_qa_event=xprobe.event, paired_tank=flagged['UPL_02'].data.tank_height)
[769]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'CEN_01 - 2023-10-23 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)
[783]:
plt.close(61)
[784]:
day = pd.to_datetime('10/23/23')
flagged['CEN_01'].plot_flagged_day(day, 'CEN_01', tdelta='13D', auto_qa_event=xprobe.event, paired_tank=flagged['H15_02'].data.tank_height)
[784]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'CEN_01 - 2023-10-23 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

So CEN SH, UPLO, VARA, and HI15 all agree that CEN SA missed a storm from 10/23 - 10/26. But what about this storm from 11/1 - 11/5?

[777]:
plt.close(59)
[778]:
day = pd.to_datetime('10/28/23')
flagged['CEN_01'].plot_flagged_day(day, 'CEN_01', tdelta='8D', auto_qa_event=xprobe.event, paired_tank=flagged['H15_02'].data.tank_height)
[778]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'CEN_01 - 2023-10-28 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)
[776]:
plt.close(60)
[779]:
day = pd.to_datetime('10/31/23')

flagged['CEN_01'].plot_flagged_day(day, 'CEN_01', tdelta='6D', auto_qa_event=xprobe.event, paired_tank=flagged['CEN_02'].data.tank_height)
[779]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'CEN_01 - 2023-10-31 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

Wow, so UPLO and VARA wildly outpace CENT during the clog. In the last 2 graphs I moved the start date forward to just compare the second half where it appears to be overflagging. However, while CENT shelter seems to track well, even HI15 seems to outpace CENT during this period. So it is hard to argue that the clog shouldn’t continue for a bit. Plus, this early in the water year it takes a while to catch up if it gets behind.

I’ll manually unflag the second.
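
A quick bit of arithmetic shows why falling behind early in the water year lingers in the ratio. The numbers are illustrative: the base gauge misses 20 mm, and the ratio is the same (base - pair)/base form used throughout. The same deficit is a huge fraction of a small accumulation and a small fraction of a large one.

```python
def accum_ratio(base_acc, pair_acc):
    """(base - pair) / base on water-year accumulations, in mm."""
    return (base_acc - pair_acc) / base_acc

missed = 20.0  # mm the base gauge failed to catch (illustrative)
for pair_acc in (60.0, 200.0, 1000.0):
    base_acc = pair_acc - missed
    ratio = accum_ratio(base_acc, pair_acc)
    print(f"pair={pair_acc:7.1f} mm  base={base_acc:7.1f} mm  ratio={ratio:+.3f}")
```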