UPL_02 - Generate Parameters for Clog Comparison

UPL_02 (shelter-top rain gauge) is compared to each site for clogs, with each site given a value of 1 if it indicates a clog and 0 if it does not. Each site’s value is then multiplied by a weight. When the sum of all weighted values exceeds 66, UPL_02 is considered clogged.
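As a rough illustration of the weighted vote described above, here is a minimal sketch. It is not the code in cross_probe_qc; the function name and the GSM_02 weight are made up for the example, while the threshold of 66 and the weight of 58 for UPL_01 and UPL_04 come from this notebook.

```python
from typing import Mapping

CLOG_THRESHOLD = 66  # composite threshold stated in this notebook


def composite_clog(site_flags: Mapping[str, int],
                   weights: Mapping[str, float],
                   threshold: float = CLOG_THRESHOLD) -> bool:
    """Return True when the weighted sum of per-site clog flags exceeds the threshold."""
    score = sum(weights[site] * flag for site, flag in site_flags.items())
    return score > threshold


# UPL_01 and UPL_04 use the weight of 58 chosen later in this notebook.
# The GSM_02 weight is a hypothetical placeholder for illustration only.
example_weights = {"UPL_01": 58, "UPL_04": 58, "GSM_02": 20}

alone = composite_clog({"UPL_01": 1, "UPL_04": 0, "GSM_02": 0}, example_weights)
confirmed = composite_clog({"UPL_01": 1, "UPL_04": 0, "GSM_02": 1}, example_weights)
print(alone, confirmed)
```

With these numbers a single site at weight 58 cannot trigger a composite clog on its own, but it can once any second site with enough weight agrees.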

To compare each site to UPL02, a number of parameters must be determined. This Jupyter Notebook determines the correct parameters for each pair. Ideally each pair does a good job of identifying clogs on its own. UPL_02 has several known clogs, so hopefully we can use those as guidelines for good parameters and use similar parameters for UPL01.

The weighting must also be developed for this site. It will vary depending on how well the other probes seem to correlate.

NOTE: This site is difficult to parameterize. No probe matches it consistently. Even lower elevation sites sometimes get more rain than it does, even though over a whole year they get less. This means that there are many storms where the ratio dips, but it is simply due to the geographic distribution of rainfall. If these sites are parameterized to catch all the clogs at UPLO, they also flag many false positives where UPLO received less precip.

First we must get clean data.

[193]:
# must install ipympl (Ipython-matplotlib) and nodejs
from ipywidgets.embed import embed_minimal_html
from ipywidgets import Layout
import matplotlib.pyplot as plt

# Jupyter magic to make plots display interactive
%matplotlib ipympl

# expand all plots to comfortable viewing size
#plt.rcParams['figure.figsize'] = [5, 2.5]
plt.rcParams['figure.dpi'] = 150
Layout(width='600px', height='400px')

import pandas as pd

import sys
sys.path.append("../")
from post_gce_qc import qaqc, data_transfer, cross_probe_qc, main
[2]:
all_flags = main.main(2019, 2024, probes={'all_params'}, data_path='../config_new.yaml', qa_params='../qa_param.yaml',
                      fname_base='MS00413_PPT_L1_5min_', write_csv=False)
Loading all PPT data from ../config_new.yaml

Load data from VAR_02

VAR_02: All quality checks and quality assurance rules applied
------------------

Load data from UPL_01

UPL_01: All quality checks and quality assurance rules applied
------------------

Load data from UPL_02

UPL_02: All quality checks and quality assurance rules applied
------------------

Load data from UPL_04

214: UserWarning: No existing flags found. qaqc.ApplyFlags.apply_GCE_flags was designed to fill in where there are not other flags. Consider running qaqc.ApplyFlags.apply_QaRules_flags first.
UPL_04: All quality checks and quality assurance rules applied
------------------

Load data from CEN_01

CEN_01: All quality checks and quality assurance rules applied
------------------

Load data from CEN_02

CEN_02: All quality checks and quality assurance rules applied
------------------

Load data from CEN_04

214: UserWarning: No existing flags found. qaqc.ApplyFlags.apply_GCE_flags was designed to fill in where there are not other flags. Consider running qaqc.ApplyFlags.apply_QaRules_flags first.
CEN_04: All quality checks and quality assurance rules applied
------------------

Load data from CS2_02

CS2_02: All quality checks and quality assurance rules applied
------------------

Load data from PRI_03

PRI_03: All quality checks and quality assurance rules applied
------------------

Load data from PRI_01

214: UserWarning: No existing flags found. qaqc.ApplyFlags.apply_GCE_flags was designed to fill in where there are not other flags. Consider running qaqc.ApplyFlags.apply_QaRules_flags first.
PRI_01: All quality checks and quality assurance rules applied
------------------

Load data from H15_02

H15_02: All quality checks and quality assurance rules applied
------------------

Load data from GSM_02

GSM_02: All quality checks and quality assurance rules applied
------------------

Generating cross probe tables

Checking for flagging consistency on VAR_02

304: UserWarning: Precip set to 0 without E flag or manual flag. E flag added
352: UserWarning: More than one flag assigned at the same time. Only one flag is retained by precedence.
Checking for flagging consistency on UPL_01

Checking for flagging consistency on UPL_02

Checking for flagging consistency on UPL_04

Performing cross probe on CEN_01

Checking for flagging consistency on CEN_01

Performing cross probe on CEN_02

352: UserWarning: More than one flag assigned at the same time. Only one flag is retained by precedence.
Checking for flagging consistency on CEN_02

Checking for flagging consistency on CEN_04

Performing cross probe on CS2_02

Checking for flagging consistency on CS2_02

352: UserWarning: More than one flag assigned at the same time. Only one flag is retained by precedence.
Performing cross probe on PRI_03

Checking for flagging consistency on PRI_03

Checking for flagging consistency on PRI_01

Checking for flagging consistency on H15_02

352: UserWarning: More than one flag assigned at the same time. Only one flag is retained by precedence.
352: UserWarning: More than one flag assigned at the same time. Only one flag is retained by precedence.
Checking for flagging consistency on GSM_02

[3]:
# 1. build pivot table for cross site comparison
xppt = cross_probe_qc.BuildXTable.assemble_cross_table(all_flags, ppt_col='adj_precip')
xacc = cross_probe_qc.BuildXTable.assemble_wy_acc(xppt)
[18]:
# 2. create ACC ratios
xprobe = cross_probe_qc.XProbesQc(xacc.index, 'UPL_02')
xprobe.set_accum_ratio(xacc)

UPL01

We will start with this probe which should be highly correlated. Let’s use the relationship between CEN stand alone and shelter as a starting point.

Finding and Filling Gaps

There are some gaps with ratio jumps. I try to track them down and fix any issues.
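One way to track gaps down without scrolling through tables is to list every contiguous run of missing values. This is a small pandas sketch, not part of post_gce_qc; the helper name `missing_runs` and the toy series are assumptions for illustration.

```python
import pandas as pd


def missing_runs(series: pd.Series) -> pd.DataFrame:
    """Summarize each contiguous run of missing values: start, end, row count."""
    is_na = series.isna()
    # The run id increases by one every time the missing/present state flips.
    run_id = is_na.astype(int).diff().ne(0).cumsum()
    rows = [
        {"start": block.index[0], "end": block.index[-1], "n_rows": len(block)}
        for _, block in series[is_na].groupby(run_id[is_na])
    ]
    return pd.DataFrame(rows, columns=["start", "end", "n_rows"])


# Toy 5-minute series with two gaps (made-up values, not real probe data).
idx = pd.date_range("2020-01-29 13:00", periods=8, freq="5min")
toy = pd.Series([0.0, 0.1, None, None, 0.0, None, None, None],
                index=idx, dtype="float64")
gaps = missing_runs(toy)
print(gaps)
```

Sorting the result by `n_rows` would put the long, unexplained gaps at the top of the list.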

4/14/19 - Manual Flag

[19]:
plt.close(1)
xprobe.plot_clog_wind_thresholds('UPL_01', xppt, xacc, -0.345, min_accum=[34], days=[8], prec=[0.018, 0.036, 0.054])

OK, not clog, but a gap with a jump. Let’s check on it.

[20]:
day = pd.to_datetime('4/14/19')
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='3D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[20]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2019-04-14 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

Well 40 mm of precip in 5 minutes is pretty suspicious. Let’s grab some raw data.

[22]:
strt = pd.to_datetime('4/15/19 1200')
end = strt + pd.to_timedelta('1h')
all_flags['UPL_02'].data[strt:end]
[22]:
tank_height precip adj_precip
Date
2019-04-15 12:00:00 516.5 0.0 0.0
2019-04-15 12:05:00 364.0 0.0 0.0
2019-04-15 12:10:00 14.6 0.0 0.0
2019-04-15 12:15:00 14.27 0.0 0.0
2019-04-15 12:20:00 14.6 0.33 0.0
2019-04-15 12:25:00 14.6 0.0 0.0
2019-04-15 12:30:00 14.6 0.0 <NA>
2019-04-15 12:35:00 14.6 0.0 <NA>
2019-04-15 12:40:00 14.6 0.0 <NA>
2019-04-15 12:45:00 56.799999 42.200001 42.0
2019-04-15 12:50:00 57.459999 0.66 0.4
2019-04-15 12:55:00 57.290001 0.0 0.0
2019-04-15 13:00:00 56.799999 0.0 0.0
[24]:
all_flags['UPL_02'].event.loc[strt:end]
[24]:
prov_flag tank_flag QaRule_flag manual_flag final_flag event_code explanation
Date
2019-04-15 12:00:00 <NA> <NA>
2019-04-15 12:05:00 R R DRAIN QaRule AutoFlag: drain_event; QaRule AutoFlag:...
2019-04-15 12:10:00 R R DRAIN QaRule AutoFlag: drain_event; QaRule AutoFlag:...
2019-04-15 12:15:00 R <NA> DRAIN QaRule AutoFlag: drain_event;
2019-04-15 12:20:00 <NA> <NA> DRAIN QaRule AutoFlag: drain_event;
2019-04-15 12:25:00 <NA> <NA>
2019-04-15 12:30:00 MM M M M
2019-04-15 12:35:00 MM M M M
2019-04-15 12:40:00 MM M M M
2019-04-15 12:45:00 <NA> <NA>
2019-04-15 12:50:00 <NA> <NA>
2019-04-15 12:55:00 <NA> <NA>
2019-04-15 13:00:00 <NA> <NA>

OK, so the autoflags are catching the drain, but the recharge is too far after the drain for them to capture it. And the provisional flags got rid of the data, but they ended 5 minutes too early… I’ll make a manual flag for this.
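The pattern here, a large value on the first timestep after a missing period, can be searched for directly. The sketch below is only an assumption about how one might look for it; `jumps_after_gaps` and the 10 mm limit are hypothetical, and the toy series is patterned on the adj_precip values in the table above.

```python
import pandas as pd


def jumps_after_gaps(adj_precip: pd.Series, limit_mm: float) -> pd.Series:
    """Return values that directly follow a missing timestep and exceed limit_mm."""
    follows_gap = adj_precip.isna().shift(1, fill_value=False)
    is_large = adj_precip.fillna(0) > limit_mm
    return adj_precip[follows_gap & is_large]


# Toy series patterned on 2019-04-15 12:20 through 12:50 (typed in by hand).
idx = pd.date_range("2019-04-15 12:20", periods=7, freq="5min")
toy = pd.Series([0.0, 0.0, None, None, None, 42.0, 0.4],
                index=idx, dtype="float64")

# limit_mm=10 is an arbitrary illustration, not a tuned threshold.
hits = jumps_after_gaps(toy, limit_mm=10.0)
print(hits)
```

Each timestamp returned is a candidate for a manual flag where the missing-data flag stopped one step short.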

1/29/20 - Mystery Missing

[25]:
plt.close(3)
xprobe.plot_clog_wind_thresholds('UPL_01', xppt, xacc, -0.345, min_accum=[34], days=[8], prec=[0.018, 0.036, 0.054])

Another gap with a big jump. Let’s dig in.

[26]:
day = pd.to_datetime('1/29/20')

plt.close(4)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='30D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[26]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2020-01-29 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)
[43]:
strt = pd.to_datetime('1/29/20 1300')
end = pd.to_datetime('2/20/20 1400')

pd.options.display.min_rows = 30

all_flags['UPL_01'].data[strt:end]
[43]:
tank_height precip adj_precip
Date
2020-01-29 13:00:00 489.700012 0.0 0.0
2020-01-29 13:05:00 489.799988 0.0 0.0
2020-01-29 13:10:00 489.799988 0.0 0.0
2020-01-29 13:15:00 489.799988 0.0 0.0
2020-01-29 13:20:00 489.899994 0.0 0.0
2020-01-29 13:25:00 490.0 0.1 0.1
2020-01-29 13:30:00 490.0 0.0 0.0
2020-01-29 13:35:00 489.899994 0.0 0.0
2020-01-29 13:40:00 490.0 0.0 0.0
2020-01-29 13:45:00 490.0 0.0 0.0
2020-01-29 13:50:00 490.100006 0.1 0.1
2020-01-29 13:55:00 490.200012 0.1 0.1
2020-01-29 14:00:00 490.200012 0.0 <NA>
2020-01-29 14:05:00 490.200012 0.0 <NA>
2020-01-29 14:10:00 490.200012 0.0 <NA>
... ... ... ...
2020-02-20 12:50:00 490.200012 0.0 <NA>
2020-02-20 12:55:00 490.200012 0.0 <NA>
2020-02-20 13:00:00 490.200012 0.0 <NA>
2020-02-20 13:05:00 490.200012 0.0 <NA>
2020-02-20 13:10:00 490.200012 0.0 <NA>
2020-02-20 13:15:00 490.200012 0.0 <NA>
2020-02-20 13:20:00 490.200012 0.0 <NA>
2020-02-20 13:25:00 490.200012 0.0 <NA>
2020-02-20 13:30:00 490.200012 0.0 <NA>
2020-02-20 13:35:00 490.200012 0.0 <NA>
2020-02-20 13:40:00 24.690001 0.0 0.0
2020-02-20 13:45:00 24.690001 0.0 0.0
2020-02-20 13:50:00 24.690001 0.0 0.0
2020-02-20 13:55:00 24.76 0.0 0.0
2020-02-20 14:00:00 24.610001 0.0 0.0

6349 rows × 3 columns

Definitely missing data. Let’s look at where the flag is coming from.

[44]:
all_flags['UPL_01'].event[strt:end]
[44]:
prov_flag tank_flag QaRule_flag manual_flag final_flag event_code explanation
Date
2020-01-29 13:00:00 <NA> <NA>
2020-01-29 13:05:00 <NA> <NA>
2020-01-29 13:10:00 <NA> <NA>
2020-01-29 13:15:00 <NA> <NA>
2020-01-29 13:20:00 <NA> <NA>
2020-01-29 13:25:00 <NA> <NA>
2020-01-29 13:30:00 <NA> <NA>
2020-01-29 13:35:00 <NA> <NA>
2020-01-29 13:40:00 <NA> <NA>
2020-01-29 13:45:00 <NA> <NA>
2020-01-29 13:50:00 <NA> <NA>
2020-01-29 13:55:00 <NA> <NA>
2020-01-29 14:00:00 MM M M M
2020-01-29 14:05:00 MM M M M
2020-01-29 14:10:00 MM M M M
... ... ... ... ... ... ... ...
2020-02-20 12:50:00 MM M M M
2020-02-20 12:55:00 MM M M M
2020-02-20 13:00:00 MM M M M
2020-02-20 13:05:00 MM M M M
2020-02-20 13:10:00 MM M M M
2020-02-20 13:15:00 MM M M M
2020-02-20 13:20:00 MM M M M
2020-02-20 13:25:00 MM M M M
2020-02-20 13:30:00 MM M M M
2020-02-20 13:35:00 MM M M M
2020-02-20 13:40:00 R <NA>
2020-02-20 13:45:00 <NA> <NA>
2020-02-20 13:50:00 <NA> <NA>
2020-02-20 13:55:00 <NA> <NA>
2020-02-20 14:00:00 <NA> <NA>

6349 rows × 7 columns

I can’t find any note about what is going on here. I will need to check in with Adam.

5/18/21 - Mystery Missing

[45]:
plt.close(5)
xprobe.plot_clog_wind_thresholds('UPL_01', xppt, xacc, -0.345, min_accum=[34], days=[8], prec=[0.018, 0.036, 0.054])
[46]:
strt = pd.to_datetime('5/18/21 1200')
end = pd.to_datetime('7/8/21 0100')

all_flags['UPL_02'].data[strt:end]
[46]:
tank_height precip adj_precip
Date
2021-05-18 12:00:00 57.669998 0.15 0.0
2021-05-18 12:05:00 57.169998 0.0 0.0
2021-05-18 12:10:00 57.169998 0.0 <NA>
2021-05-18 12:15:00 57.169998 0.0 <NA>
2021-05-18 12:20:00 57.169998 0.0 <NA>
2021-05-18 12:25:00 57.169998 0.0 <NA>
2021-05-18 12:30:00 57.169998 0.0 <NA>
2021-05-18 12:35:00 57.169998 0.0 <NA>
2021-05-18 12:40:00 57.169998 0.0 <NA>
2021-05-18 12:45:00 57.169998 0.0 <NA>
2021-05-18 12:50:00 57.169998 0.0 <NA>
2021-05-18 12:55:00 57.169998 0.0 <NA>
2021-05-18 13:00:00 57.169998 0.0 <NA>
2021-05-18 13:05:00 57.169998 0.0 <NA>
2021-05-18 13:10:00 57.169998 0.0 <NA>
... ... ... ...
2021-07-07 23:50:00 135.0 0.0 <NA>
2021-07-07 23:55:00 135.0 0.0 <NA>
2021-07-08 00:00:00 135.0 0.0 <NA>
2021-07-08 00:05:00 165.800003 30.700001 30.4
2021-07-08 00:10:00 166.0 0.2 0.0
2021-07-08 00:15:00 166.0 0.0 0.0
2021-07-08 00:20:00 166.0 0.0 0.0
2021-07-08 00:25:00 166.0 0.0 0.0
2021-07-08 00:30:00 165.699997 0.0 0.0
2021-07-08 00:35:00 166.0 0.0 0.0
2021-07-08 00:40:00 165.699997 0.0 0.0
2021-07-08 00:45:00 166.0 0.0 0.0
2021-07-08 00:50:00 166.0 0.0 0.0
2021-07-08 00:55:00 165.800003 0.0 0.0
2021-07-08 01:00:00 165.699997 0.0 0.4

14557 rows × 3 columns

[47]:
all_flags['UPL_02'].event[strt:end]
[47]:
prov_flag tank_flag QaRule_flag manual_flag final_flag event_code explanation
Date
2021-05-18 12:00:00 <NA> <NA>
2021-05-18 12:05:00 <NA> <NA>
2021-05-18 12:10:00 MM M M M
2021-05-18 12:15:00 MM M M M
2021-05-18 12:20:00 MM M M M
2021-05-18 12:25:00 MM M M M
2021-05-18 12:30:00 MM M M M
2021-05-18 12:35:00 MM M M M
2021-05-18 12:40:00 MM M M M
2021-05-18 12:45:00 MM M M M
2021-05-18 12:50:00 MM M M M
2021-05-18 12:55:00 MM M M M
2021-05-18 13:00:00 MM M M M
2021-05-18 13:05:00 MM M M M
2021-05-18 13:10:00 MM M M M
... ... ... ... ... ... ... ...
2021-07-07 23:50:00 MM M M M
2021-07-07 23:55:00 MM M M M
2021-07-08 00:00:00 MM M M M
2021-07-08 00:05:00 W R
2021-07-08 00:10:00 <NA> <NA>
2021-07-08 00:15:00 <NA> <NA>
2021-07-08 00:20:00 <NA> <NA>
2021-07-08 00:25:00 <NA> <NA>
2021-07-08 00:30:00 <NA> <NA>
2021-07-08 00:35:00 <NA> <NA>
2021-07-08 00:40:00 <NA> <NA>
2021-07-08 00:45:00 <NA> <NA>
2021-07-08 00:50:00 <NA> <NA>
2021-07-08 00:55:00 <NA> <NA>
2021-07-08 01:00:00 <NA> <NA>

14557 rows × 7 columns

Min Accumulation

Find the minimum WY accumulation before the ratio stabilizes.
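To make the idea concrete, here is a sketch of a water-year accumulation ratio that is hidden until enough precip has accumulated. It is an assumption about the general approach, not the code behind `set_accum_ratio` or `plot_wy_start`. The October 1 water-year start, the choice of which gauge is the reference, and applying the minimum to the reference accumulation are all assumptions.

```python
import pandas as pd


def water_year(index: pd.DatetimeIndex):
    """Water year label per timestamp (assumed Oct 1 start, named for the ending year)."""
    return index.year.to_numpy() + (index.month.to_numpy() >= 10).astype(int)


def masked_ratio(target: pd.Series, reference: pd.Series, min_accum: float) -> pd.Series:
    """Ratio of water-year accumulations, hidden until the reference reaches min_accum."""
    wy = water_year(target.index)
    target_acc = target.groupby(wy).cumsum()
    reference_acc = reference.groupby(wy).cumsum()
    ratio = target_acc / reference_acc
    # Early in the water year small totals make the ratio jumpy, so mask it.
    return ratio.where(reference_acc >= min_accum)


# Made-up daily precip (mm) spanning a water-year boundary.
idx = pd.date_range("2022-09-29", periods=5, freq="D")
reference = pd.Series([5.0, 5.0, 10.0, 30.0, 30.0], index=idx)
target = pd.Series([4.0, 4.0, 5.0, 15.0, 30.0], index=idx)

ratio = masked_ratio(target, reference, min_accum=34)
print(ratio)
```

The accumulation restarts on October 1, so the ratio stays masked until the new water year passes the minimum.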

[49]:
xprobe.plot_wy_start(xacc, 'UPL_01')
[52]:
plt.tight_layout()

Moving Window

Fall 2018 False Clogs

[54]:
plt.close(13)
xprobe.plot_clog_wind_thresholds('UPL_01', xppt, xacc, -0.345, min_accum=[48], days=[8], prec=[0.018, 0.036, 0.054])

OK, WY 19 has a rocky start. Let’s zoom in and see if it’s maybe legit in any way.

[55]:
day = pd.to_datetime('11/21/18')

plt.close(14)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='15D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[55]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2018-11-21 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

Well, that shouldn’t be flagged. Let’s back off of 0.018 just a little, because some of the other spots it’s flagging look great.

[57]:
plt.close(15)
xprobe.plot_clog_wind_thresholds('UPL_01', xppt, xacc, -0.345, min_accum=[58], days=[8], prec=[0.02, 0.025, 0.03, 0.035])

So a small adjustment to the min accumulation and increasing the precision to 0.03 works well for WY 19. Let’s check out some other years.

11/25/19 Clog

[71]:
plt.close(16)
xprobe.plot_clog_wind_thresholds('UPL_01', xppt, xacc, -0.4, min_accum=[58], days=[8], prec=[0.02, 0.025, 0.03, 0.035])
[76]:
day = pd.to_datetime('11/25/19')

plt.close(17)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='3D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[76]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2019-11-25 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

That’s … maybe a clog. The stand alone is definitely outpacing the shelter, with a little clog just after noon on 11/26. The notes (ID 8145) show that the windscreen was partly blocking the orifice and wasn’t fixed until 12/5. Let’s look at some more examples.

12/5/19 clog

[84]:
plt.close(18)
xprobe.plot_clog_wind_thresholds('UPL_01', xppt, xacc, -0.4, min_accum=[58], days=[8], prec=[0.02, 0.025, 0.03, 0.035])
[404]:
day = pd.to_datetime('12/9/19')

plt.close(116)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='32D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[404]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2019-12-09 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)
[80]:
plt.tight_layout()

Between 12/5/19 and 1/9/20 a clog formed beneath the tipping bucket, preventing water from reaching the tank. We seem to have found the clog, but need to flag it more aggressively. This will be a good use of the lowest_normal_ratio. I would almost argue that 0.025 is doing a better job.

[87]:
plt.close(20)
xprobe.plot_clog_wind_thresholds('UPL_01', xppt, xacc, -0.32, min_accum=[58], days=[8], prec=[0.02, 0.025, 0.027, 0.03, 0.035])

OK, that one’s looking good. Let’s look at the next clog.

May 2020 Clog

[97]:
plt.close(23)
xprobe.plot_clog_wind_thresholds('UPL_01', xppt, xacc, -0.32, min_accum=[58], days=[6, 8, 10, 12], prec=[0.02, 0.025, 0.027, 0.03, 0.035])
487: RuntimeWarning: More than 20 figures have been opened. Figures created through the pyplot interface (`matplotlib.pyplot.figure`) are retained until explicitly closed and may consume too much memory. (To control this warning, see the rcParam `figure.max_open_warning`). Consider using `matplotlib.pyplot.close()`.

OK, I’m gonna graph this backwards so we can see all the lines better, but UPLO01 will be the base!

[408]:
day = pd.to_datetime('5/1/20')

plt.close(117)
all_flags['UPL_01'].plot_flagged_day(day, 'UPL_01', tdelta='23D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_02'].data.tank_height)
[408]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_01 - 2020-05-01 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

This is a real clog where the tube going between the orifice funnel and the tank got plugged with needles. But this sort of clog leads to low, gradual drips, so the ratio drops very gradually. It seems that longer moving windows handle this better. Remember, the key is when the current ratio is below the moving average by a set amount (precision). A small precision often overflags, but creating a longer moving average creates a more stable number that these gradual drops get below. So it allows larger precision values to still be used. I’m tempted to manually flag, but the delayed precip dripping through the clog during dry periods would be really hard to flag right manually. Maybe I can stretch out the window size even further.
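The interaction between window length and precision can be shown with a toy ratio that declines slowly. This is a sketch of the rule as described in the paragraph above (current ratio below the moving average by more than the precision), assuming a trailing window. The real logic in `XProbesQc` may differ, and `flag_ratio_drop` is a made-up name.

```python
import pandas as pd


def flag_ratio_drop(ratio: pd.Series, window: str, precision: float) -> pd.Series:
    """True where the ratio sits more than `precision` below its trailing moving average."""
    moving_avg = ratio.rolling(window).mean()
    return ratio < (moving_avg - precision)


# Toy daily ratio: flat for 10 days, then dropping 0.005 per day for 20 days.
idx = pd.date_range("2020-05-01", periods=30, freq="D")
values = [1.0] * 10 + [1.0 - 0.005 * k for k in range(1, 21)]
ratio = pd.Series(values, index=idx)

short_window = flag_ratio_drop(ratio, "6D", precision=0.027)
long_window = flag_ratio_drop(ratio, "14D", precision=0.027)
print(short_window.any(), long_window.any())
```

During a steady decline the trailing mean lags the current value by roughly the slope times half the window length, so the 6-day mean never gets 0.027 above the ratio while the 14-day mean does.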

[122]:
plt.close(28)
xprobe.plot_clog_wind_thresholds('UPL_01', xppt, xacc, -0.32, min_accum=[58], days=[10, 12, 14, 16], prec=[0.02, 0.025, 0.027, 0.03, 0.035])

Flag values when they are 0.027 below the 14-day running average of the ratio? Nothing but a ridiculously low precision value seems to catch the second set of storms during the clog. Let’s take another look at that second period. Again, I’ll use UPLO01 as the base so it displays better!

[117]:
day = pd.to_datetime('5/18/20 1800')

plt.close(25)
all_flags['UPL_01'].plot_flagged_day(day, 'UPL_01', tdelta='23D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_02'].data.tank_height)
[117]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_01 - 2020-05-18 18:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)
[124]:
day = pd.to_datetime('5/18/20 1800')

plt.close(29)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='23D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[124]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2020-05-18 18:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

Final flagging:

  1. NotesDB has a QUALTY Event_Code, which fast_bridge will assign to the whole period 5/1-6/10

  2. Provisional has assigned Q flags to the whole period 5/1-6/10

  3. I manually flagged the last storm as undercatch 6/6-6/10

  4. The auto flags are catching 5/13-5/25

This is all pretty messy. There is a huge and unexplained chunk of missing data that seems to reset everything oddly for this WY. It would be nice to get that filled somehow.

11/10/20 Clog?

[125]:
plt.close(30)
xprobe.plot_clog_wind_thresholds('UPL_01', xppt, xacc, -0.32, min_accum=[58], days=[14], prec=[0.02, 0.025, 0.027, 0.03, 0.035])
[130]:
day = pd.to_datetime('11/10/20')

plt.close(31)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='2D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[130]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2020-11-10 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

The ratio certainly looks like a clog, and it is undercollecting. Maybe there won’t be a consensus on this one amongst the other sites. It’s only a 4 mm difference at its peak.

1/3/22 Clog?

[131]:
plt.close(32)
xprobe.plot_clog_wind_thresholds('UPL_01', xppt, xacc, -0.32, min_accum=[58], days=[14], prec=[0.02, 0.025, 0.027, 0.03, 0.035])
[136]:
day = pd.to_datetime('1/3/22')

plt.close(33)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='3D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[136]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2022-01-03 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

This is similar to the last one, where the shelter is definitely being outpaced by the stand alone. However, the straight line is a classic low dripping clog, and it is over an inch off. This seems to be an early warning of an impending clog.

October 2022 False Clog

[140]:
plt.close(35)
[141]:
plt.close(34)
xprobe.plot_clog_wind_thresholds('UPL_01', xppt, xacc, -0.32, min_accum=[58], days=[8,14], prec=[0.02, 0.025, 0.027, 0.03, 0.035])
[146]:
day = pd.to_datetime('10/23/22')

plt.close(35)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='7D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[146]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2022-10-23 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

The ratio really looks like a clog, but it’s just being outpaced a bit. No real clog signal. I think this is mostly because it’s early in the water year.

11/1/22 Mini Clog

[409]:
#plt.close(42)
xprobe.plot_clog_wind_thresholds('UPL_01', xppt, xacc, -0.32, min_accum=[58], days=[8,14], prec=[0.02, 0.025, 0.027, 0.03, 0.035])
[147]:
day = pd.to_datetime('11/1/22')

plt.close(40)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='1D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[147]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2022-11-01 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

This is another straight line during a heavy rainstorm, which indicates it may have a partial clog and diminished intensities, with the total spread out over a longer period.

2/14/23 clog

[148]:
plt.close(41)
xprobe.plot_clog_wind_thresholds('UPL_01', xppt, xacc, -0.32, min_accum=[58], days=[8,14], prec=[0.02, 0.025, 0.027, 0.03, 0.035])
[151]:
day = pd.to_datetime('2/14/23')

plt.close(42)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='3D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[151]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2023-02-14 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

So the ~50 mm of delayed precip is doubled, which has been flagged E and set to 0, so that’s good. Most of this period is missing, but the timestep before the delayed precip isn’t, so this gets flagged as a clog and the delayed precip should get a C. That seems like a descriptive set of flags.

October 2023 Double Clog

[410]:
#plt.close(43)
xprobe.plot_clog_wind_thresholds('UPL_01', xppt, xacc, -0.32, min_accum=[45,58], days=[8,14], prec=[0.02, 0.025, 0.027, 0.03, 0.035])
[413]:
day = pd.to_datetime('10/9/23')

plt.close(120)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='1.5D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[413]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2023-10-09 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

Wow, that looks like a real clog! Great!

The next one’s so big I have to reverse the graph and use UPLO01 as the base!

[164]:
day = pd.to_datetime('10/22/23')

plt.close(44)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='12D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[164]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2023-10-22 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

What a clog! That’s doing a great job of catching it.

2/15/24 Clog

[166]:
plt.close(45)
xprobe.plot_clog_wind_thresholds('UPL_01', xppt, xacc, -0.32, min_accum=[58], days=[14], prec=[0.02, 0.025, 0.027, 0.03, 0.035])
[167]:
day = pd.to_datetime('2/14/24')

plt.close(46)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='5D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[167]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2024-02-14 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

This one seems a little under flagged, but it’s still catching it. Hopefully adding other sites will help make up for it.

5/23/24 clog

[168]:
plt.close(47)
xprobe.plot_clog_wind_thresholds('UPL_01', xppt, xacc, -0.32, min_accum=[58], days=[14], prec=[0.02, 0.025, 0.027, 0.03, 0.035])
[171]:
day = pd.to_datetime('5/3/24')

plt.close(48)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='6D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[171]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2024-05-03 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

Big clog, slightly underflagged, with some serious delayed precip. I unclogged this one with a fish tape and the water came bursting through.

I’m pretty satisfied with that. Let’s move on.

UPL_04

This is the tipping bucket. It has limited years, but should be pretty close to the shelter since it literally measures the same rain, sitting beneath the shelter funnel and emptying into the shelter tank. However, it does consistently measure more precip than the tank, likely due to differences in the precision and calibration of the two instruments.

[172]:
plt.figure()
xprobe.ratio.UPL_04.plot(grid=True)
[172]:
<Axes: xlabel='Date'>
[178]:
plt.close(50)
xprobe.plot_clog_wind_thresholds('UPL_04', xppt, xacc, -10, min_accum=[10, 15], days=[6,8], prec=[0.015, 0.018, 0.02, 0.023, 0.025])

I can’t find any documentation for this clog, but it was clearly identified by UPL01 stand alone above, so it seems correct. I cannot explain why the tipping bucket recorded it but the tank didn’t. This would indicate that the tipping bucket wasn’t properly draining into the tank, most likely due to some sort of clog. A 6 day window does a better job of capturing these two clogs. A 15 mm minimum year to date accumulation is necessary to prevent an initial false clog. If you zoom in, from a precision of 0.018 up, the clog is on and off repeatedly. Only a precision of 0.015 is on and stays on.
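Rather than zooming in by eye, the on-and-off behavior could be quantified by counting separate flagged episodes for each precision. This is a sketch under the assumption that the clog flags are available as a boolean series; `count_flag_episodes` and both toy series are made up for illustration.

```python
import pandas as pd


def count_flag_episodes(flags: pd.Series) -> int:
    """Count separate runs of True in a boolean flag series."""
    flags = flags.astype(bool)
    # An episode starts where a True follows a False (or opens the series).
    starts = flags & ~flags.shift(1, fill_value=False)
    return int(starts.sum())


# Made-up flags: one stays on through the clog, the other flickers.
steady = pd.Series([False, True, True, True, True, True, False])
flicker = pd.Series([False, True, False, True, True, False, True])

print(count_flag_episodes(steady), count_flag_episodes(flicker))
```

A precision that yields one episode per known clog is behaving like the 0.015 case described above, while several episodes inside one clog point to flicker.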

Now let’s look at the other clog in WY23.

[194]:
plt.close(51)
xprobe.plot_clog_wind_thresholds('UPL_04', xppt, xacc, -10, min_accum=[15, 25], days=[6], prec=[0.015, 0.018, 0.02,])

Let’s see if we can figure out why there is this descending ratio.

[195]:
xacc[['UPL_02', 'UPL_01', 'UPL_04']].plot(grid=True, legend=True)
[195]:
<Axes: xlabel='Date'>
[196]:
day = pd.to_datetime('10/21/22')

plt.close(53)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='6D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[196]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2022-10-21 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)
[185]:
day = pd.to_datetime('10/21/22 1500')
end = pd.to_datetime('10/21/22 2200')

all_flags['UPL_02'].data.loc[day:end]
[185]:
tank_height precip adj_precip
Date
2022-10-21 15:00:00 65.0 0.35 0.4
2022-10-21 15:05:00 65.07 0.07 0.0
2022-10-21 15:10:00 65.050003 0.0 0.0
2022-10-21 15:15:00 65.529999 0.46 0.4
2022-10-21 15:20:00 65.529999 0.0 <NA>
2022-10-21 15:25:00 65.529999 0.0 <NA>
2022-10-21 15:30:00 65.529999 0.0 <NA>
2022-10-21 15:35:00 65.529999 0.0 <NA>
2022-10-21 15:40:00 65.529999 0.0 <NA>
2022-10-21 15:45:00 65.529999 0.0 <NA>
2022-10-21 15:50:00 65.529999 0.0 <NA>
2022-10-21 15:55:00 65.529999 0.0 <NA>
2022-10-21 16:00:00 65.529999 0.0 <NA>
2022-10-21 16:05:00 65.529999 0.0 <NA>
2022-10-21 16:10:00 65.529999 0.0 <NA>
... ... ... ...
2022-10-21 20:50:00 65.529999 0.0 <NA>
2022-10-21 20:55:00 65.529999 0.0 <NA>
2022-10-21 21:00:00 65.529999 0.0 <NA>
2022-10-21 21:05:00 65.529999 0.0 <NA>
2022-10-21 21:10:00 65.529999 0.0 <NA>
2022-10-21 21:15:00 65.529999 0.0 <NA>
2022-10-21 21:20:00 65.529999 0.0 <NA>
2022-10-21 21:25:00 65.529999 0.0 <NA>
2022-10-21 21:30:00 65.529999 0.0 <NA>
2022-10-21 21:35:00 65.529999 0.0 <NA>
2022-10-21 21:40:00 65.529999 0.0 <NA>
2022-10-21 21:45:00 65.529999 0.0 <NA>
2022-10-21 21:50:00 65.709999 0.18 0.0
2022-10-21 21:55:00 94.800003 29.09 28.800001
2022-10-21 22:00:00 95.099998 0.3 0.0

85 rows × 3 columns

[186]:
all_flags['UPL_02'].event.loc[day:end]
[186]:
prov_flag tank_flag QaRule_flag manual_flag final_flag event_code explanation
Date
2022-10-21 15:00:00 <NA> <NA>
2022-10-21 15:05:00 <NA> <NA>
2022-10-21 15:10:00 <NA> <NA>
2022-10-21 15:15:00 <NA> <NA>
2022-10-21 15:20:00 MM M M M
2022-10-21 15:25:00 MM M M M
2022-10-21 15:30:00 MM M M M
2022-10-21 15:35:00 MM M M M
2022-10-21 15:40:00 MM M M M
2022-10-21 15:45:00 MM M M M
2022-10-21 15:50:00 MM M M M
2022-10-21 15:55:00 MM M M M
2022-10-21 16:00:00 MM M M M
2022-10-21 16:05:00 MM M M M
2022-10-21 16:10:00 MM M M M
... ... ... ... ... ... ... ...
2022-10-21 20:50:00 MM M M M
2022-10-21 20:55:00 MM M M M
2022-10-21 21:00:00 MM M M M
2022-10-21 21:05:00 MM M M M
2022-10-21 21:10:00 MM M M M
2022-10-21 21:15:00 MM M M M
2022-10-21 21:20:00 MM M M M
2022-10-21 21:25:00 MM M M M
2022-10-21 21:30:00 MM M M M
2022-10-21 21:35:00 MM M M M
2022-10-21 21:40:00 MM M M M
2022-10-21 21:45:00 MM M M M
2022-10-21 21:50:00 <NA> <NA>
2022-10-21 21:55:00 J R
2022-10-21 22:00:00 W <NA>

85 rows × 7 columns

[188]:
all_flags['UPL_04'].event.loc[day:end]
[188]:
prov_flag tank_flag QaRule_flag manual_flag final_flag event_code explanation
Date
2022-10-21 15:00:00 <NA>
2022-10-21 15:05:00 <NA>
2022-10-21 15:10:00 <NA>
2022-10-21 15:15:00 <NA>
2022-10-21 15:20:00 M
2022-10-21 15:25:00 M
2022-10-21 15:30:00 M
2022-10-21 15:35:00 M
2022-10-21 15:40:00 M
2022-10-21 15:45:00 M
2022-10-21 15:50:00 M
2022-10-21 15:55:00 M
2022-10-21 16:00:00 M
2022-10-21 16:05:00 M
2022-10-21 16:10:00 M
... ... ... ... ... ... ... ...
2022-10-21 20:50:00 M
2022-10-21 20:55:00 M
2022-10-21 21:00:00 M
2022-10-21 21:05:00 M
2022-10-21 21:10:00 M
2022-10-21 21:15:00 M
2022-10-21 21:20:00 M
2022-10-21 21:25:00 M
2022-10-21 21:30:00 M
2022-10-21 21:35:00 M
2022-10-21 21:40:00 M
2022-10-21 21:45:00 M
2022-10-21 21:50:00 <NA>
2022-10-21 21:55:00 <NA>
2022-10-21 22:00:00 <NA>

85 rows × 7 columns

So the tipping bucket accumulates first, but before the shelter tank catches up, the data becomes missing. When it comes back, the tank catches up, but now the tipping bucket is at a real deficit, which takes a while to correct. Ideally, we would find the missing data on the card downloads. If not, then maybe the minimum accumulation can be increased.

[198]:
plt.close(54)
xprobe.plot_clog_wind_thresholds('UPL_04', xppt, xacc, -10, min_accum=[40,60,80, 100], days=[6], prec=[0.015, 0.018, 0.02,])

That doesn’t really improve things much. I guess we’ll have to depend on the fact that no other site is likely to flag this, especially because it is when the water year accumulation is so low, so it will be below the minimum for most other sites.
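A bit of arithmetic suggests why raising the minimum accumulation does not help much. If one gauge is short by a fixed amount, the ratio only recovers as the accumulation grows. The sketch below assumes the deficit is about the size of the 28.8 mm catch-up value recorded by the tank at 2022-10-21 21:55; that figure and the helper name are assumptions, not a measured deficit.

```python
def ratio_after_deficit(accum_mm: float, deficit_mm: float) -> float:
    """Ratio of a gauge missing `deficit_mm` to one that recorded `accum_mm` in full."""
    return (accum_mm - deficit_mm) / accum_mm


# Assumed deficit: the tank's catch-up value after the missing period.
deficit_mm = 28.8

# The first four accumulations match the min_accum values tried above.
for accum_mm in (40, 60, 80, 100, 500):
    print(accum_mm, round(ratio_after_deficit(accum_mm, deficit_mm), 3))
```

Even at 100 mm of accumulation the ratio is still far from 1, so any minimum in the range tried above leaves the offset visible.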

Since this probe shares the orifice funnel with UPLO_02, almost every time it is clogged, the tank gauge (UPL_02) will also be clogged. If this probe reports a clog, it should only need a single other probe to confirm it, so, like UPL_01, it will get a weight of 58, only 8 below the threshold of 66 for identifying a composite clog.

GSM_02

These two probes often track together. The problem will be that sometimes Mack accumulates more precip and sometimes UPLO does. Let’s look at the ratio and see what it looks like.

[201]:
plt.figure()
xprobe.ratio['GSM_02'].plot(grid=True)
[201]:
<Axes: xlabel='Date'>
[277]:
xacc[['UPL_02', 'GSM_02']].plot(grid=True, legend=True)
[277]:
<Axes: xlabel='Date'>

There are some periods of real agreement and some periods of disagreement. Some years they are pretty close, but other years one pulls way ahead of the other. Let’s try to narrow this down.
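
For reference, the ratio axis label printed later in this notebook reads (Base − Pair)/Base. A minimal sketch of that calculation on water-year accumulations is below. The function name `accumulation_ratio` and the sample values are made up for illustration, and the real `xprobe.ratio` may handle zero accumulation differently.

```python
import pandas as pd


def accumulation_ratio(base_acc, pair_acc):
    """Return (base - pair) / base, leaving NaN where the base accumulation is zero."""
    base = base_acc.where(base_acc != 0)  # avoid dividing by zero at the start of a water year
    return (base - pair_acc) / base


idx = pd.date_range('2022-10-01', periods=4, freq='D')
upl = pd.Series([0.0, 10.0, 20.0, 20.0], index=idx)  # base stalls on the last day
gsm = pd.Series([0.0, 8.0, 16.0, 24.0], index=idx)   # pair keeps accumulating
ratio = accumulation_ratio(upl, gsm)
print(ratio.round(2).tolist())
```

A positive ratio means the base (UPLO) is ahead of the pair; the ratio drops and turns negative when the pair pulls ahead, which is the same signature whether the cause is a clog at UPLO or a storm that simply favored Mack.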

Min Accumulation

[202]:
xprobe.plot_wy_start(xacc, 'GSM_02')

Moving Window

The magic number is 9 clogs found by UPL_01. I think I’m going to need to break this down by water year to make any sense of it.

2024

There should be three clogs in this water year: October, February, May.

[206]:
plt.close(62)
xprobe.plot_clog_wind_thresholds('GSM_02', xppt, xacc, -10, min_accum=[45], days=[6, 8, 10], prec=[0.02,0.04, 0.06, 0.08])

6 days at 0.02 precision caught two of them correctly, but the fall is all messed up. Let’s recalibrate off of this.

[208]:
plt.close(63)
xprobe.plot_clog_wind_thresholds('GSM_02', xppt, xacc, -10, min_accum=[45], days=[6, 12, 14], prec=[0.02, 0.03, 0.04, 0.06])

I think we have to go with 6D and 0.02 to capture the spring clogs, and just deal with the fall being messed up.

2023

There should be 3 clogs: October, November, February.

[216]:
plt.close(64)
xprobe.plot_clog_wind_thresholds('GSM_02', xppt, xacc, -10, min_accum=[45], days=[6, 14], prec=[0.02, 0.03, 0.04, 0.06])

That’s really bad. We need to back off on the sensitivity and get rid of some of these false clogs.

[218]:
plt.close(65)
xprobe.plot_clog_wind_thresholds('GSM_02', xppt, xacc, -10, min_accum=[45], days=[6, 14], prec=[0.06,0.08, 0.1, 0.12])
[254]:
day = pd.to_datetime('10/25/22')

plt.close(70)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='3D', auto_qa_event=xprobe.event, paired_tank=all_flags['GSM_02'].data.tank_height)
[254]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2022-10-25 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)
[241]:
day = pd.to_datetime('11/4/22')

plt.close(66)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='3D', auto_qa_event=xprobe.event, paired_tank=all_flags['GSM_02'].data.tank_height)
[241]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2022-11-04 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

This first one is a legitimate pattern where Mack is outpacing UPLO. We want these sorts of dips to get flagged; we just want the weighting of Mack to be low enough that it doesn’t lead to a final flag. The second one is another case where Mack accumulation is outpacing UPLO. 0.08 precision with a 6D window seems like the best option here. That will knock out most of our 2024 clogs, including a lot of the ridiculousness in the fall. Let’s see how the next year looks.

2022 False Clog

[243]:
plt.close(67)
xprobe.plot_clog_wind_thresholds('GSM_02', xppt, xacc, -10, min_accum=[45], days=[6,14], prec=[0.06,0.08, 0.1, 0.12])
[245]:
day = pd.to_datetime('10/25/21')

plt.close(68)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='3D', auto_qa_event=xprobe.event, paired_tank=all_flags['GSM_02'].data.tank_height)
[245]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2021-10-25 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

That certainly looks like the ratio of a clog, and GSM is getting a ton more precip than UPLO. Let’s check UPLO stand alone, but this is likely a legitimate case where GSM greatly outpaced UPLO, so we want it to flag this sort of thing.

[246]:
plt.close(69)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='3D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[246]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2021-10-25 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

Yep, that is a storm where GSM just got a lot more precip. No clog, but the appearance of one. It’s telling that both 14D and 6D windows flag it about the same.

2020

This is the next water year with clogs. They should be in: November, December and May.

[256]:
plt.close(71)
xprobe.plot_clog_wind_thresholds('GSM_02', xppt, xacc, -0.43, min_accum=[45], days=[6,8,14], prec=[0.06,0.08, 0.1, 0.12])

May is completely missed and this December clog is way too short. Unfortunately, the relationship between these two probes isn’t very tight. I don’t think we can make it much better.

2018 False Clog

[257]:
plt.close(72)
xprobe.plot_clog_wind_thresholds('GSM_02', xppt, xacc, -0.43, min_accum=[45], days=[6,8,14], prec=[0.06,0.08, 0.1, 0.12])
[258]:
day = pd.to_datetime('10/28/18')

plt.close(73)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='3D', auto_qa_event=xprobe.event, paired_tank=all_flags['GSM_02'].data.tank_height)
[258]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2018-10-28 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)
[364]:
day = pd.to_datetime('11/22/18')

plt.close(74)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='3D', auto_qa_event=xprobe.event, paired_tank=all_flags['GSM_02'].data.tank_height)
[364]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2018-11-22 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

Both cases of Mack outpacing the accumulation at UPLO. I can accept that, as long as the weighting for Mack is low.

CS2_02

Let’s see if the same parameters work well for CS2MET, since it has a relatively similar geographic position to Mack.

[271]:
plt.close(75)
xprobe.plot_clog_wind_thresholds('CS2_02', xppt, xacc, -0.43, min_accum=[45], days=[6,8,14], prec=[0.06,0.08, 0.1, 0.12, 0.14])

I think we can up the threshold to 0.1, but keep using a 6D window. That would keep 3 big important clogs, but get rid of all the other clogs, which are all false.

PRI_03

Let’s see if the same is true for the PRIM NOAHIV.

[276]:
plt.close(76)
xprobe.plot_clog_wind_thresholds('PRI_03', xppt, xacc, -0.43, min_accum=[45], days=[6,8,14], prec=[0.06,0.08, 0.1, 0.12, 0.14])

0.1 and a 6D window keep the clog in 2023 and 2019, but they add one in fall of 2021. Let’s check it out.

[280]:
plt.close(78)
xprobe.plot_clog_wind_thresholds('PRI_03', xppt, xacc, -0.43, min_accum=[45, 70], days=[6], prec=[0.06,0.08, 0.1, 0.12, 0.14])

Upping the minimum accumulation got rid of that bad flagging there. Let’s make sure the rest looks good.

[292]:
plt.close(79)
xprobe.plot_clog_wind_thresholds('PRI_03', xppt, xacc, -0.3, min_accum=[70], days=[6], prec=[0.08, 0.1, 0.11, 0.12])

Increasing the minimum accumulation added a flag in fall of 2018. We can still get rid of that by choosing a larger threshold, 0.12.

PRI_01

Let’s try to use the same parameters for the other PRIMET rain gauge.

[295]:
plt.close(80)
xprobe.plot_clog_wind_thresholds('PRI_01', xppt, xacc, -0.38, min_accum=[70], days=[6], prec=[0.06, 0.08, 0.1, 0.11])

Well, we can actually use a lower threshold for this one, 0.08. But otherwise, the settings match up well.

VAR_02

Vara is very messy because it has a ton of clogs. And, in an average year, it gets more precip. But it is at the same elevation as UPLO. So, since sometimes it agrees with UPLO and sometimes it is independent, I’m guessing that it will be useful about half the time. Let’s take a look.

Min Accumulation

[299]:
xprobe.plot_wy_start(xacc, 'VAR_02')

While there is a lot of oscillation, most stabilize around 20 mm and seem to respond to honest changes in accumulation between the gauges after that.
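
The idea of a minimum accumulation can be sketched as follows. This assumes the usual convention that a water year runs from 1 October through 30 September and is named for the year in which it ends, and it assumes the minimum is applied to the base gauge; neither is confirmed by the package code. `water_year`, `wy_accumulation`, and `mask_below_min_accum` are hypothetical names.

```python
import pandas as pd


def water_year(index):
    """Water year label for each timestamp: October onward belongs to the next year."""
    return index.year.to_numpy() + (index.month.to_numpy() >= 10)


def wy_accumulation(ppt):
    """Cumulative precipitation that resets at the start of each water year."""
    return ppt.groupby(water_year(ppt.index)).cumsum()


def mask_below_min_accum(ratio, base_acc, min_accum):
    """Hide the ratio until the base gauge has accumulated at least min_accum mm."""
    return ratio.where(base_acc >= min_accum)


idx = pd.date_range('2022-09-29', periods=4, freq='D')
ppt = pd.Series([5.0, 5.0, 5.0, 5.0], index=idx)  # made-up daily totals, mm
acc = wy_accumulation(ppt)
print(acc.tolist())  # accumulation restarts on 1 October
```

Masking the ratio until the accumulation passes the chosen minimum keeps the early-season oscillation, where a few mm swing the ratio wildly, from being read as a clog.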

Moving Window

Let’s look at this in chunks.

2019 - 2020

This period should have three clogs, all in water year 20.

[310]:
plt.close(88)
xprobe.plot_clog_wind_thresholds('VAR_02', xppt, xacc, -0.38, min_accum=[25], days=[6, 8 , 10, 14], prec=[0.08, 0.1, 0.11, 0.12])

At a broad view, the size of the window doesn’t make much of a difference. We get false clog flagging in 2018 until the threshold bumps up to 0.12. And the clog in December of 2019 is always caught. On closer inspection (see below) the length of the clog identified in 2020 depends on the window size or precision. The clog is in NotesDB as being in the tipping bucket and ending on 1/9/20. From 12/17/19 on, the funnel was likely overflowing into the tank with less precip lost to evaporation and spillage from the funnel overflow. Here, it is interrupted by a clog at VARA. However, a 6D window flags for an unreasonably short period of time. By upping the minimum accumulation to 40 mm, the 2018 false clog goes away even at lower precision thresholds. This allows us to use lower precision values to increase the length of time flagged, leaving us with a window of 8D or greater and a precision of 0.08 or greater for all but a 14D window.

[312]:
plt.close(89)
xprobe.plot_clog_wind_thresholds('VAR_02', xppt, xacc, -0.38, min_accum=[40], days=[6, 8 , 10, 14], prec=[0.08, 0.1, 0.11, 0.12])
[328]:
day = pd.to_datetime('12/4/19')
end = pd.to_datetime('1/9/20')

plt.close(90)
clog = xacc.loc[day:end, ['UPL_01', 'UPL_02', 'VAR_02']] - xacc.loc[day, ['UPL_01', 'UPL_02', 'VAR_02']]
clog.plot(grid=True, legend=True)
[328]:
<Axes: xlabel='Date'>

2021 - 2022

There should be no flags here.

[329]:
plt.close(91)
xprobe.plot_clog_wind_thresholds('VAR_02', xppt, xacc, -0.38, min_accum=[40], days=[6, 8 , 10, 14], prec=[0.08, 0.1, 0.11, 0.12])

Most of this period is missing data for this gauge. 2021 the gauge was clogged and wouldn’t drain, so it was capped. 2022 the sensor died. To get rid of the false flags here, the precision needs to go up and the window size needs to be either 8D or 14D.

2023-2024

There should be 6 clogs during this period, three in each water year.

[336]:
plt.close(92)
xprobe.plot_clog_wind_thresholds('VAR_02', xppt, xacc, -0.38, min_accum=[40], days=[6, 8 , 10, 14], prec=[0.08, 0.1, 0.11, 0.12, 0.2])

First, there is a very clear pattern of reverse clogs (clogging at VARA) in 2023. We would hope to see this in other years as well, knowing that VARA was chronically clogged. This does, however, make the signal for clogs at UPLO harder to decipher.

Even upping the precision threshold to 0.2 (the ratio would have to be >0.2 below the running avg ratio), there are still false positives at the start of WY 23. Even at 0.12, only one of the 9 clogs is identified. By upping the threshold to 0.2, no clogs are ID’d and there are still false positives. That renders this comparison completely useless. I suggest that we choose a middle path, allowing more false positives, using a low weight for this probe, and hoping that the signal is clearer in future water years with less false clogging.
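
To make the role of the window and precision concrete, here is a minimal sketch of a moving-window test: a timestamp is flagged when the ratio falls more than `precision` below its running average over the window. It is an illustration only. The trailing window that includes the current value, the function name `flag_ratio_dips`, and the sample ratio are all assumptions, and the actual logic in `cross_probe_qc` may differ.

```python
import pandas as pd


def flag_ratio_dips(ratio, window='6D', precision=0.08):
    """Flag timestamps where the ratio is more than `precision` below its trailing mean."""
    running = ratio.rolling(window, min_periods=1).mean()
    return ratio < (running - precision)


idx = pd.date_range('2023-01-01', periods=10, freq='D')
# Made-up ratio: steady at 0.10, then a dip over the last three days.
ratio = pd.Series([0.10] * 7 + [-0.05, -0.10, -0.10], index=idx)

n_flagged = {prec: int(flag_ratio_dips(ratio, '6D', prec).sum()) for prec in (0.08, 0.2)}
print(n_flagged)
```

With this made-up series, a precision of 0.08 flags all three days of the dip, while 0.2 flags none of them. This is the same trade-off seen above: raising the threshold removes false positives but also stops identifying real dips.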

Final Selection

Only 1 of 9 clogs was successfully identified and several false positives remain. It is apparent that the fall pattern can take a long time to reach the year’s final, stable ratio. This makes this probe hard to use as a comparison to UPLO. Hopefully, future years without clogs will improve this comparison, and may make it worth revisiting these parameters. For that reason, the weighting should not be as low as the low elevation sites like PRIM, but cannot be very high due to the false positives.

Let’s try 15 and see how the composite scores look.

[338]:
plt.close(93)
xprobe.plot_clog_wind_thresholds('VAR_02', xppt, xacc, -0.28, min_accum=[40], days=[8], prec=[0.12])

H15_02

This is a mid-elevation site, 1,000 ft below UPLO, and mid-slope on the McCrae ridge that forms the northern boundary of H.J.A. It is likely to track UPLO in a similar pattern to VARA, but with only one known clog. We will start by testing the parameters used for VARA and see if they work for this probe as well.

[339]:
plt.close(94)
xprobe.plot_clog_wind_thresholds('H15_02', xppt, xacc, -0.28, min_accum=[40], days=[8], prec=[0.12])

There’s a lot of up and down there, but there are no false positives, so there is room to tighten the precision on this comparison.

We’ll need to break this down into smaller bits to assess.

2019

[341]:
plt.close(95)
xprobe.plot_clog_wind_thresholds('H15_02', xppt, xacc, -0.28, min_accum=[40], days=[6,8,10,14], prec=[0.02,0.04, 0.06, 0.08, 0.1, 0.12])

That’s a tough one. These are all probably false positives, but 0.04 seems to be flagging the correct pattern. 8 or 6 day windows seem to resolve the events more finely, breaking up individual dips in the ratio. Despite all these false positives, there may be some worth to using 0.04 at 8D.

2020

There should be 3 clogs in this water year.

[342]:
plt.close(96)
xprobe.plot_clog_wind_thresholds('H15_02', xppt, xacc, -0.28, min_accum=[40], days=[6,8,10,14], prec=[0.02,0.04, 0.06, 0.08, 0.1, 0.12])

The 0.04 threshold actually is catching the first clog in November. Too bad it is adding a false clog in October and only the 0.02 precision catches the clog in May. If we need to back off on these false clogs, we need to go all the way to 0.12, but the longer 14D window does a better job there at 0.08 precision.

2021

There shouldn’t be any clogs this year.

Target is to hopefully either use 0.12/14D or 0.04/8D.

[343]:
plt.close(97)
xprobe.plot_clog_wind_thresholds('H15_02', xppt, xacc, -0.28, min_accum=[40], days=[6,8,10,14], prec=[0.02,0.04, 0.06, 0.08, 0.1, 0.12])

At 8D and 0.04 that adds 4 more false positives, and 2 of them aren’t a very notable dip, just a gradual drop. 0.08 with a 14D window catches the one false positive, but none of the others. That will still keep 2 false positives in 2018 as well, while doing a decent job of catching the one clog in December 2019, skipping the earlier November clog.

2022

This should also have no clogs.

[344]:
plt.close(98)
xprobe.plot_clog_wind_thresholds('H15_02', xppt, xacc, -0.28, min_accum=[40], days=[6,8,10,14], prec=[0.02,0.04, 0.06, 0.08, 0.1, 0.12])

14D at 0.08 works fine for this one.

2023

We should see 3 clogs here.

[348]:
plt.close(99)
xprobe.plot_clog_wind_thresholds('H15_02', xppt, xacc, -0.28, min_accum=[40], days=[8,14], prec=[0.02,0.04, 0.06, 0.08, 0.1, 0.12])

Well, that barely catches 1 out of 3. But no false positives.

2024

Should be 3 clogs.

[351]:
plt.close(100)
xprobe.plot_clog_wind_thresholds('H15_02', xppt, xacc, -0.22, min_accum=[40], days=[8,14], prec=[0.02,0.04, 0.06, 0.08, 0.1, 0.12])

Again, it only catches one of the clogs. Pretty disappointing, but no false positives at our chosen parameters.

CEN_02

This sits on a ridge between VARA and UPLO about 1,000 ft below UPLO. So it should get less rain, but may get more consistent patterns of rain, i.e. get rain when UPLO gets rain. The relationship between CEN and UPLO was reasonably reliable. We will start by using the parameters generated for CEN_02: 0.03/8D.

[353]:
plt.close(101)
xprobe.plot_clog_wind_thresholds('CEN_02', xppt, xacc, -0.22, min_accum=[93], days=[8], prec=[0.02, 0.03, 0.04])

Well, not very encouraging so far. We’ll need to take this year by year.

2019

No clogs.

[355]:
plt.close(102)
xprobe.plot_clog_wind_thresholds('CEN_02', xppt, xacc, -0.22, min_accum=[93], days=[6,8,10], prec=[0.02, 0.03, 0.04, 0.08])

The problem is that these are all the sorts of dips we want it to catch… We can back off to 0.08 if needed, and even backing off to 0.04 gets rid of 2 more false flag events. Let’s see how this shakes out with real clogs in 2020.

[365]:
day = pd.to_datetime('11/22/18')

plt.close(104)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='1D', auto_qa_event=xprobe.event, paired_tank=all_flags['CEN_02'].data.tank_height)
[365]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2018-11-22 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)
[366]:
day = pd.to_datetime('11/27/18')

plt.close(105)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='1D', auto_qa_event=xprobe.event, paired_tank=all_flags['CEN_02'].data.tank_height)
[366]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2018-11-27 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

Those aren’t clogs, but CEN is legitimately accumulating more rain. I don’t feel super comfortable unflagging them, but we don’t want the composite score to reflect them either.

2020

3 clogs.

[368]:
plt.close(106)
xprobe.plot_clog_wind_thresholds('CEN_02', xppt, xacc, 0.1, min_accum=[93], days=[6,8,10], prec=[0.02, 0.03, 0.04, 0.08])

Well… not amazing. None of them capture the start of the November clog… and none of them capture the May clog without making it super sensitive.

2021

Should be no clogs. Let’s test a higher threshold and see if it is viable.

[370]:
plt.close(107)
xprobe.plot_clog_wind_thresholds('CEN_02', xppt, xacc, 0.1, min_accum=[93], days=[6,8,10], prec=[0.03, 0.04, 0.08, 0.09])

That takes care of all of the false positives for that year. It will also get rid of most of the false positives in 2018.

Let’s double check that will still work in 2020.

[371]:
plt.close(108)
xprobe.plot_clog_wind_thresholds('CEN_02', xppt, xacc, 0.1, min_accum=[93], days=[6,8,10], prec=[0.03, 0.04, 0.08, 0.09])

That seems like the answer. Let’s keep moving forward with 2022.

2022

No clogs.

[373]:
plt.close(109)
xprobe.plot_clog_wind_thresholds('CEN_02', xppt, xacc, 0.1, min_accum=[93], days=[6,8,10], prec=[0.03, 0.04, 0.08, 0.09])
[378]:
day = pd.to_datetime('10/19/21')

plt.close(110)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='15D', auto_qa_event=xprobe.event, paired_tank=all_flags['CEN_02'].data.tank_height)
[378]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2021-10-19 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

It’s impressive how often CENT gets more precip than UPLO during an individual storm like this. It mostly seems like a minimum accumulation issue.

[380]:
plt.close(111)
xprobe.plot_clog_wind_thresholds('CEN_02', xppt, xacc, -0.14, min_accum=[93], days=[6,8,10], prec=[0.03, 0.04, 0.08, 0.09])

2023

3 clogs.

[385]:
plt.close(112)
xprobe.plot_clog_wind_thresholds('CEN_02', xppt, xacc, -0.14, min_accum=[90], days=[6,8,10], prec=[0.03, 0.04, 0.08, 0.09])

Missing all the clogs…great.

2024

3 clogs.

[392]:
plt.close(113)
xprobe.plot_clog_wind_thresholds('CEN_02', xppt, xacc, -0.14, min_accum=[90], days=[6,8,10], prec=[0.025,0.03, 0.04, 0.08, 0.09])

Final Selection

This probe was tuned to minimize false positives, but this led to the ID of very few clogs. It may be difficult to ID many of the clogs at UPLO from other sites. The weight of this probe shouldn’t be very high, but due to the lack of false positives, there also isn’t much harm in giving it a slightly increased weight.
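
The trade-off being judged by eye in these plots, clogs caught versus false positives, could also be tallied. The sketch below is one hypothetical way to do that and is not part of `post_gce_qc`: `score_flags` counts how many known clog periods contain at least one flag, and how many separate runs of flags fall outside every known period. The dates are made up.

```python
import pandas as pd


def score_flags(flags, known_clogs):
    """Return (known clogs with at least one flag, flagged runs outside every known clog).

    flags: boolean Series on a DatetimeIndex; known_clogs: list of (start, end).
    """
    in_known = pd.Series(False, index=flags.index)
    caught = 0
    for start, end in known_clogs:
        span = (flags.index >= pd.Timestamp(start)) & (flags.index <= pd.Timestamp(end))
        in_known |= span
        caught += int(flags[span].any())
    # A false event is a run of consecutive flagged timestamps outside every known clog.
    false_flags = flags & ~in_known
    run_starts = false_flags & ~false_flags.shift(fill_value=False)
    return caught, int(run_starts.sum())


idx = pd.date_range('2020-05-01', periods=10, freq='D')
flags = pd.Series(idx.day.isin([2, 3, 6, 7]), index=idx)        # made-up flags
known = [('2020-05-02', '2020-05-03'), ('2020-05-09', '2020-05-10')]  # made-up clog periods
print(score_flags(flags, known))
```

Here one of the two made-up clogs is caught and one false event is reported. Running a tally like this over a grid of window and precision values would give a table version of the plots above, though it does not measure how much of each clog is flagged.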

[393]:
plt.close(114)
xprobe.plot_clog_wind_thresholds('CEN_02', xppt, xacc, -0.14, min_accum=[90], days=[8], prec=[0.09])

CEN_01

Let’s see if we can use the same values for the stand alone rain gauge at CEN. In theory, it should be about the same, but it did have a number of clogs in 2019, which may affect how well it functions.

[417]:
plt.close(121)
xprobe.plot_clog_wind_thresholds('CEN_01', xppt, xacc, -0.14, min_accum=[90], days=[8], prec=[0.09])

This gauge has a clog in October of 24, so it is missing an additional flag during that period.

Let’s just briefly test a lower precision.

[418]:
plt.close(122)
xprobe.plot_clog_wind_thresholds('CEN_01', xppt, xacc, -0.14, min_accum=[90], days=[8, 14], prec=[0.06, 0.09])

Not an improvement. We’ll just stick with the parameters from CEN_02.

Composite Clogs

Let’s see what gets flagged when aggregating a composite of what all sites flag.

[444]:
params = qaqc._load_yaml('../qa_param.yaml')

fnc_params = params['UPL_02']['auto_flag']['flag_x_clogs']
wt_params = params['UPL_02']['auto_flag']['weight_x_clogs']
[445]:
# compare against each probe for clogs against the base probe
xprobe.set_x_clogs(xppt, xacc, fnc_params)

# Get the weighted value for each site to decide on final flags.
eventwt, Uwt, Cwt, = xprobe.get_weight_x_clog(wt_params)
xprobe.flag_x_clogs(eventwt, Uwt, Cwt)
[429]:
plt.close(123)
xacc[['UPL_02']].plot(grid=True, legend=True)
xacc.loc[xprobe.event.clog, 'UPL_02'].plot(grid=True, linestyle='', marker='.')
[429]:
<Axes: xlabel='Date'>

So we caught 5 out of 9… lovely. And all substantially underflagged. Let’s try changing the weighting. All of these are tuned to avoid false positives, but catch very few of the actual clogs… except for UPL_01. As a rule, it always takes at least 2 probes to confirm a clog. But maybe this site needs to be an exception to that. Let’s see what it would look like to give UPL_01 a weight of 67, which exceeds the threshold of 66 on its own.
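
As a quick illustration of what that change does over time, the sketch below scores a small made-up table of per-probe clog indicators (1 = clog, 0 = not) with UPL_01 at 58 and then at 67. `composite_flags` is a hypothetical stand-in for the weighting step, assuming the score is a weighted sum that must exceed 66; only three probes are included to keep it short.

```python
import pandas as pd

idx = pd.date_range('2024-02-15', periods=3, freq='D')
# 1 = probe reports a clog at that timestamp, 0 = it does not (made-up values)
clogs = pd.DataFrame({'UPL_01': [1, 1, 0],
                      'GSM_02': [0, 1, 1],
                      'CEN_02': [0, 0, 1]}, index=idx)


def composite_flags(clogs, weights, threshold=66):
    """Weighted sum across probes at each timestamp, flagged where it exceeds the threshold."""
    score = clogs.mul(pd.Series(weights)).sum(axis=1)
    return score > threshold


before = composite_flags(clogs, {'UPL_01': 58, 'GSM_02': 9, 'CEN_02': 9})
after = composite_flags(clogs, {'UPL_01': 67, 'GSM_02': 9, 'CEN_02': 9})
print(before.tolist(), after.tolist())
```

On the first day only UPL_01 reports a clog: at 58 it is not flagged, at 67 it is. The cost is that any false positive from UPL_01 now becomes a final flag with no confirmation.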

[430]:
wt_params
[430]:
{'UPL_01': 58,
 'UPL_04': 58,
 'GSM_02': 9,
 'CS2_02': 9,
 'PRI_03': 4,
 'PRI_01': 5,
 'VAR_02': 15,
 'H15_02': 10,
 'CEN_02': 9,
 'CEN_01': 9}
[447]:
wt_params['UPL_01'] = 67
wt_params
[447]:
{'UPL_01': 67,
 'UPL_04': 58,
 'GSM_02': 9,
 'CS2_02': 9,
 'PRI_03': 4,
 'PRI_01': 5,
 'VAR_02': 15,
 'H15_02': 10,
 'CEN_02': 9,
 'CEN_01': 9}
[448]:
# Get the weighted value for each site to decide on final flags.
eventwt, Uwt, Cwt, = xprobe.get_weight_x_clog(wt_params)
xprobe.flag_x_clogs(eventwt, Uwt, Cwt)
[434]:
plt.close(124)
xacc[['UPL_02']].plot(grid=True, legend=True)
xacc.loc[xprobe.event.clog, 'UPL_02'].plot(grid=True, linestyle='', marker='.')
[434]:
<Axes: xlabel='Date'>

We will go through these one by one to verify them.

10/10/18 - Adjust Params to Remove

[449]:
day = pd.to_datetime('10/10/18')

plt.close(127)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='1D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[449]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2018-10-10 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)
[440]:
plt.close(126)
xprobe.plot_clog_wind_thresholds('UPL_01', xppt, xacc, -0.32, min_accum=[48, 58, 68], days=[14], prec=[0.027, 0.028, 0.029, 0.03])

So, to get rid of that stupid spot in 2018, either the weight needs to be reduced, or the precision needs to increase to 0.03.

[450]:
fnc_params['UPL_01']['clog_pair_flagging_wrap']['window_precision'] = 0.03
[460]:
# compare against each probe for clogs against the base probe
xprobe.set_x_clogs(xppt, xacc, fnc_params)

# Get the weighted value for each site to decide on final flags.
eventwt, Uwt, Cwt, = xprobe.get_weight_x_clog(wt_params)
xprobe.flag_x_clogs(eventwt, Uwt, Cwt)

all_flags['UPL_02'].flags[['U', 'C']] |= xprobe.flags

11/25/19 - Correctly Flagging

To visualize, UPL_01 (stand alone) will be used as the base to highlight the clog in UPL_02 (shelter).

[453]:
day = pd.to_datetime('11/25/19')

plt.close(128)
all_flags['UPL_01'].plot_flagged_day(day, 'UPL_01', tdelta='5D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_02'].data.tank_height)
[453]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_01 - 2019-11-25 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

Let’s zoom in on that.

[461]:
day = pd.to_datetime('11/24/19')

plt.close(129)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='2.5D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[461]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2019-11-24 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

Well, that looked better with a window precision of 0.027, but I think it’s getting the job done.

12/11/19 - Correctly Flagging

This one must also be visualized with UPL_01 as the base since it collects so much more precip than the shelter.

[465]:
day = pd.to_datetime('12/9/19')

plt.close(132)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='32D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[465]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2019-12-09 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

A few more C’s for delayed precip than I would have expected, but otherwise pretty good. A couple mm difference probably starts 2 days prior, but this seems like it’s performing well in the big picture, catching the majority of the mm where it is lagging.

5/2/20 - Significant Under Flagging

[466]:
day = pd.to_datetime('5/2/20')

plt.close(133)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='32D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[466]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2020-05-02 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

We can see minimal new flagging on top of the provisional Q flags. In final flagging the C’s and U’s will take precedence, and anywhere with a clog event can only be assigned a C, a U, or nothing. So a lot of these Q’s will be wiped out. To view this well, the stand alone will need to be used as a base.
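
The precedence rule described here can be written out as a small sketch. It is not the package’s final-flag code: `resolve_final_flags` is a hypothetical name, blank flags are represented as empty strings for simplicity, and the sample values are made up.

```python
import pandas as pd


def resolve_final_flags(prov_flag, clog_event, clog_flag):
    """Inside a clog event keep only a C or U (or nothing); elsewhere keep the provisional flag."""
    inside = clog_flag.where(clog_flag.isin(['C', 'U']), '')
    return prov_flag.where(~clog_event, inside)


idx = pd.date_range('2020-05-02', periods=4, freq='5min')
prov = pd.Series(['Q', 'Q', 'Q', ''], index=idx)         # provisional flags
event = pd.Series([False, True, True, True], index=idx)  # True inside the clog event
auto = pd.Series(['', 'U', '', 'C'], index=idx)          # cross-probe clog flags
final = resolve_final_flags(prov, event, auto)
print(final.tolist())
```

The third timestamp shows the effect mentioned above: it sits inside the clog event with no U or C assigned, so its provisional Q is wiped out rather than carried through.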

[472]:
day = pd.to_datetime('5/2/20')

plt.close(134)
all_flags['UPL_01'].plot_flagged_day(day, 'UPL_01', tdelta='40D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_02'].data.tank_height)
[472]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_01 - 2020-05-02 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

This is way underflagged. The second section in June already has a manual flag in place. The whole period looks like it needs this treatment.

11/10/20 - Minor Clog

This one is the beginning of a clog, where the shelter measures precip at a linear rate throughout a storm, but the stand alone measures more intense pulses in an arcing pattern.

[474]:
day = pd.to_datetime('11/10/20')

plt.close(135)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='2D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[474]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2020-11-10 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

10/24/21 - Removed By Precision Adjustment

This flag, which was a single timestamp, was removed when the window precision was increased to 0.03.

[475]:
day = pd.to_datetime('10/24/21')

plt.close(136)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='2D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[475]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2021-10-24 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

1/3/22 - Under Flagging

[491]:
day = pd.to_datetime('1/3/22')

plt.close(137)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='3D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[491]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2022-01-03 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

If anything, this looks underflagged. This is similar to the minor clog above: the shelter has a linear accumulation, likely due to a clog in the funnel. So there is a lot of undercatch, and moments of delayed accumulation. The problem with manually flagging it is that everything will get one flag, instead of calculating U or C as it goes.

[492]:
day = pd.to_datetime('1/6/22 1310')

plt.close(138)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='2D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[492]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2022-01-06 13:10:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

This is a relatively subtle and minor clog. The magnitude of undercatch and delayed precip is relatively small. And manually flagging it would be imprecise, giving a single flag to the whole period. I’ll leave this as is.

10/22/22 - Minor Clog

[514]:
day = pd.to_datetime('10/25/22')

plt.close(143)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='1.5D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[514]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2022-10-25 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

This is a minor clog, but it is flagging it correctly. These sorts of errors are caught more easily early in the water year. The nature of the ratio of water year accumulation is that it gets less sensitive as the total accumulation increases, hence why this October example is so heavily flagged, where a similar example in January was under flagged.
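
The loss of sensitivity follows from the ratio itself. With ratio = (Base − Pair)/Base, if the pair gains some precip that the base misses while the base accumulation stays put, the ratio changes by minus the missed amount divided by the base accumulation. A short worked example, using a made-up 5 mm of missed catch and the `window_precision` of 0.03 set for UPL_01 earlier in this notebook (`ratio_drop` is a hypothetical helper):

```python
def ratio_drop(base_acc_mm, missed_mm):
    """Change in (base - pair) / base when the pair gains missed_mm that the base misses."""
    return -missed_mm / base_acc_mm


WINDOW_PRECISION = 0.03  # value set for UPL_01 earlier in this notebook

for base_acc in (100, 1000):
    drop = ratio_drop(base_acc, 5)
    print(base_acc, drop, abs(drop) > WINDOW_PRECISION)
```

The same 5 mm of undercatch moves the ratio by 0.05 when 100 mm have accumulated, but only by 0.005 at 1000 mm, well inside a precision of 0.03. This ignores the smoothing from the running average, so it is only a rough guide to when a dip would be flagged.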

11/4/22 - Bad Flag Manually Removed!

[518]:
day = pd.to_datetime('11/4/22 0000')

plt.close(144)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='2D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[518]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2022-11-04 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

This is a false positive. I will remove it with a manual flag.

2/13/23 - Missing Data!

This is flagged correctly when there is data to be flagged. Missing data must have a flag of M; however, there is no event code.

[525]:
day = pd.to_datetime('2/13/23 0000')

plt.close(145)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='3D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[525]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2023-02-13 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

There actually may not have been a clog here. The data is just missing. And then the next piece of information is this jump in tank level, which is a cumulative precipitation since the last known measurement. So really, this is all working correctly. Just a little unfortunate.
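
A tiny made-up example shows why a gap produces this pattern. If each increment is taken against the last known tank height, the first reading after the gap carries everything that accumulated while the data were missing. This is only an illustration of the effect, not how the package derives precip from tank height.

```python
import numpy as np
import pandas as pd

idx = pd.date_range('2023-02-13', periods=6, freq='5min')
# Made-up tank heights (mm) with three missing readings.
tank = pd.Series([100.0, 100.5, np.nan, np.nan, np.nan, 104.0], index=idx)

# Difference against the last known height, so the first reading after the gap
# holds the cumulative change across the whole gap.
increment = tank - tank.ffill().shift()
print(increment.tolist())
```

The last value is 3.5 mm assigned to a single 5-minute step, which looks like delayed precip from a clog even though the real timing inside the gap is simply unknown.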

10/9/23 - Bad Flag Manually Removed

[531]:
day = pd.to_datetime('10/10/23 0400')

plt.close(146)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='3D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[531]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2023-10-10 04:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

That’s a bad flag right at the beginning of the water year. There is actually a very small clog just before it where there isn’t enough accumulation to catch it, so it is added manually. This zone is probably responding to that difference in water year accumulation. It was manually removed.

10/24/23 - Correctly Flagging

It ends slightly prematurely, but captures the event pretty well overall.

[499]:
day = pd.to_datetime('10/24/23 0000')

plt.close(139)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='9D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[499]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2023-10-24 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

2/15/24 - Massive Under-Flag

This is way under flagging!

[533]:
day = pd.to_datetime('2/15/24 0000')

plt.close(147)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='7D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[533]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2024-02-15 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)

That’s real bad! That huge clog needs more flagging. Let’s take a look at the ratios and figure out why it’s not.

[538]:
day = pd.to_datetime('2/1/24 0000')

xprobe.plot_x_clogs(day, tdelta='20D')
[538]:
<Axes: xlabel='Date', ylabel='Ratio ($\\frac{Base-Pair}{Base}$)'>

It’s so frustrating when all of the sites have a ratio that shows the pattern, but none of them properly ID it. Let’s look at the broader pattern and see if it makes any sense.

[539]:
plt.figure()
xprobe.ratio['UPL_01'].plot(grid=True)
[539]:
<Axes: xlabel='Date'>

That steep downward trajectory in January is killing us! Based on the progression of clogs, there was probably a gradual build up of material in the funnel hose that was leading to some consistent undercatch. Let’s see if we could even tweak the numbers to get there.

[541]:
plt.close(153)
xprobe.plot_clog_wind_thresholds('UPL_01', xppt, xacc, -0.32, min_accum=[58], days=[6, 14], prec=[0.027, 0.028, 0.029, 0.03])
[543]:
day = pd.to_datetime('2/15/24 0000')
end = day + pd.to_timedelta('2D')
all_flags['UPL_02'].event[day:]
[543]:
prov_flag tank_flag QaRule_flag manual_flag final_flag event_code explanation
Date
2024-02-15 00:00:00 <NA> T
2024-02-15 00:05:00 <NA> T
2024-02-15 00:10:00 <NA> T
2024-02-15 00:15:00 <NA> T
2024-02-15 00:20:00 <NA> T
2024-02-15 00:25:00 <NA> T
2024-02-15 00:30:00 <NA> T
2024-02-15 00:35:00 <NA> T
2024-02-15 00:40:00 <NA> T
2024-02-15 00:45:00 <NA> T
2024-02-15 00:50:00 <NA> T
2024-02-15 00:55:00 <NA> T
2024-02-15 01:00:00 <NA> T
2024-02-15 01:05:00 <NA> T
2024-02-15 01:10:00 <NA> T
... ... ... ... ... ... ... ...
2024-09-30 22:50:00 <NA> Q Q
2024-09-30 22:55:00 <NA> Q Q
2024-09-30 23:00:00 <NA> Q Q
2024-09-30 23:05:00 <NA> Q Q
2024-09-30 23:10:00 <NA> Q Q
2024-09-30 23:15:00 <NA> Q Q
2024-09-30 23:20:00 <NA> Q Q
2024-09-30 23:25:00 <NA> Q Q
2024-09-30 23:30:00 <NA> Q Q
2024-09-30 23:35:00 <NA> Q Q
2024-09-30 23:40:00 <NA> Q Q
2024-09-30 23:45:00 <NA> Q Q
2024-09-30 23:50:00 <NA> Q Q
2024-09-30 23:55:00 <NA> Q Q
2024-10-01 00:00:00 <NA> Q Q

65953 rows × 7 columns

This is a real problem. I could apply a manual flag to the whole period, but it would only get one flag, U or C, and this clog clearly goes back and forth between the two. The only solution I can think of is to identify broad zones of U and C and write different manual flags for each. I’m ending with 4 manual flags, 2 each of U and of C.
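
The zone approach can be sketched like this. The helper `apply_manual_zones` is hypothetical, and the dates below are placeholders chosen only to show the mechanics; they are not the four zones actually written into the manual flags.

```python
import pandas as pd


def apply_manual_zones(flags, zones):
    """Overwrite flags inside each (start, end, code) zone; later zones win on overlap."""
    out = flags.copy()
    for start, end, code in zones:
        out.loc[start:end] = code  # label slicing on a DatetimeIndex includes both ends
    return out


idx = pd.date_range('2024-02-15', periods=6, freq='D')
flags = pd.Series([''] * 6, index=idx)
# Placeholder zones: one stretch of undercatch (U), then one of delayed catch (C).
zones = [(pd.Timestamp('2024-02-15'), pd.Timestamp('2024-02-16'), 'U'),
         (pd.Timestamp('2024-02-17'), pd.Timestamp('2024-02-18'), 'C')]
manual = apply_manual_zones(flags, zones)
print(manual.tolist())
```

Splitting the period into zones keeps some distinction between undercatch and delayed catch, at the cost of being coarser than the timestamp-by-timestamp U or C assignment the automated flagging would have made.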

5/5/24 - Correctly Flagging

[535]:
day = pd.to_datetime('5/5/24 0000')

plt.close(148)
all_flags['UPL_02'].plot_flagged_day(day, 'UPL_02', tdelta='4D', auto_qa_event=xprobe.event, paired_tank=all_flags['UPL_01'].data.tank_height)
[535]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
 <Axes: title={'center': 'UPL_02 - 2024-05-05 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)