Assigning Final Flags
The provisional data is checked by GCE, which assigns multiple flags to the data. The QaRules class then performs a number of quality checks on the data, assigning flags with each check. These accumulated flags are then applied to the final format by apply_QaRules_flags. Then flags are manually added by apply_manual_flags. Additionally, there are provisional flags that are imported using import_provisional_data, but are not applied. All of these flags need to be combined into one final flag. This section explores how to do that.
[1]:
import pandas as pd
import matplotlib.pyplot as plt
# Jupyter magic to make plots interactive
# requires ipympl (ipython-matplotlib) and nodejs to be installed
from ipywidgets.embed import embed_minimal_html
%matplotlib widget
import sys
sys.path.append("../")
from post_gce_qc import qaqc, data_transfer, cross_probe_qc, main
[20]:
all_flags = main.main(2019, 2024, probes={'all_params'}, data_path='../config_new.yaml', qa_params='../qa_param.yaml',
fname_base='MS00413_PPT_L1_5min_', write_csv=False)
Loading all PPT data from ../config_new.yaml
Load data from VAR_02
All quality checks and quality assurance rules applied to VAR_02
------------------
Load data from UPL_01
All quality checks and quality assurance rules applied to UPL_01
------------------
Load data from UPL_02
All quality checks and quality assurance rules applied to UPL_02
------------------
Load data from UPL_04
All quality checks and quality assurance rules applied to UPL_04
------------------
Load data from CEN_01
All quality checks and quality assurance rules applied to CEN_01
------------------
Load data from CEN_02
All quality checks and quality assurance rules applied to CEN_02
------------------
Load data from CEN_04
All quality checks and quality assurance rules applied to CEN_04
------------------
Load data from CS2_02
All quality checks and quality assurance rules applied to CS2_02
------------------
Load data from PRI_03
All quality checks and quality assurance rules applied to PRI_03
------------------
Load data from PRI_01
All quality checks and quality assurance rules applied to PRI_01
------------------
Load data from H15_02
All quality checks and quality assurance rules applied to H15_02
------------------
Load data from GSM_02
All quality checks and quality assurance rules applied to GSM_02
------------------
Check NA has M Flag
[3]:
cf = all_flags['CEN_01'].flags
[9]:
(cf.SetNA == cf.M).all()
[9]:
False
[10]:
cf[cf.SetNA != cf.M]
[10]:
| Date | Q | U | C | * | SetNA | Set0 | E | M |
|---|---|---|---|---|---|---|---|---|
| 2019-07-01 12:45:00 | False | False | False | False | True | False | False | False |
| 2019-07-01 13:00:00 | False | False | False | False | True | False | False | False |
| 2019-07-01 13:15:00 | False | False | False | False | True | False | False | False |
| 2019-07-01 13:30:00 | False | False | False | False | True | False | False | False |
| 2019-07-01 13:45:00 | False | False | False | False | True | False | False | False |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2024-04-10 12:45:00 | False | False | False | False | True | False | False | False |
| 2024-04-10 13:00:00 | False | False | False | False | True | False | False | False |
| 2024-04-10 13:15:00 | False | False | False | False | True | False | False | False |
| 2024-04-10 13:30:00 | False | False | False | False | True | False | False | False |
| 2024-09-06 16:15:00 | False | False | False | False | True | False | False | False |
2196 rows × 8 columns
[11]:
cf[cf.SetNA != cf.M].describe()
[11]:
| | Q | U | C | * | SetNA | Set0 | E | M |
|---|---|---|---|---|---|---|---|---|
| count | 2196 | 2196 | 2196 | 2196 | 2196 | 2196 | 2196 | 2196 |
| unique | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| top | False | False | False | False | True | False | False | False |
| freq | 2196 | 2196 | 2196 | 2196 | 2196 | 2196 | 2196 | 2196 |
[12]:
cf.describe()
[12]:
| | Q | U | C | * | SetNA | Set0 | E | M |
|---|---|---|---|---|---|---|---|---|
| count | 210432 | 210432 | 210432 | 210432 | 210432 | 210431 | 210431 | 210432 |
| unique | 1 | 2 | 2 | 1 | 2 | 2 | 2 | 1 |
| top | False | False | False | False | False | False | False | False |
| freq | 210432 | 209887 | 210414 | 210432 | 208236 | 209841 | 210338 | 210432 |
Ok, so it looks like there are no Missing flags. That’s an easy fix. Right now apply_NAN_val sets self.event['final_flag'] = 'M', but it doesn’t touch flags. Let’s see if we can stick with the convention of using this boolean data and make that column true instead.
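A minimal sketch of that change, using a toy frame rather than the real pipeline: wherever SetNA is true, set the boolean M column as well. The frame, its dates, and the column order are illustrative assumptions; how apply_NAN_val stores its state internally is not shown in this notebook.

```python
import pandas as pd

# Toy stand-in for one probe's boolean flag table (illustrative only).
flag_cols = ['Q', 'U', 'C', '*', 'SetNA', 'Set0', 'E', 'M']
index = pd.date_range('2019-07-01 12:45', periods=4, freq='15min')
flags = pd.DataFrame(False, index=index, columns=flag_cols)
flags.loc[index[:2], 'SetNA'] = True

# Mirror SetNA into the boolean M column instead of only writing a string flag.
flags.loc[flags['SetNA'], 'M'] = True

# The check from the cell above should now pass on the toy data.
print((flags['SetNA'] == flags['M']).all())
```

With the boolean column kept in sync, the same `cf.SetNA == cf.M` comparison can serve as a regression check after the fix.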
But are there any instances where there is an additional flag assigned with SetNA?
[14]:
cf[cf.SetNA == True].describe()
[14]:
| | Q | U | C | * | SetNA | Set0 | E | M |
|---|---|---|---|---|---|---|---|---|
| count | 2196 | 2196 | 2196 | 2196 | 2196 | 2196 | 2196 | 2196 |
| unique | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| top | False | False | False | False | True | False | False | False |
| freq | 2196 | 2196 | 2196 | 2196 | 2196 | 2196 | 2196 | 2196 |
[17]:
c = cf.columns
c.isin(['SetNA', 'Set0'])
[17]:
array([False, False, False, False, True, True, False, False])
[66]:
cf[cf.SetNA == True].any()
[66]:
Q False
U False
C False
* False
SetNA True
Set0 False
E False
M True
dtype: bool
[69]:
cf.loc[cf.SetNA == True, ~cf.columns.isin(['SetNA', 'Set0'])].any().any()
[69]:
True
[71]:
(cf.M & ~cf.SetNA).any()
[71]:
False
[72]:
for probe, flags in all_flags.items():
    # True where a probe has an M flag without a matching SetNA
    m_wo_setNA = (flags.flags['M'] & ~flags.flags['SetNA']).any()
    print(f'{probe}: {m_wo_setNA}')
VAR_02: True
UPL_01: False
UPL_02: False
UPL_04: False
CEN_01: False
CEN_02: False
CEN_04: False
CS2_02: False
PRI_03: False
PRI_01: False
H15_02: False
GSM_02: False
OK, so VARA has some places where there is an M flag without SetNA. Let’s look at why. Maybe we can also test where SetNA doesn’t have an M.
[96]:
for probe, flags in all_flags.items():
    # True where exactly one of M and SetNA is set (mismatch in either direction)
    m_xor_setNA = (flags.flags['M'] ^ flags.flags['SetNA']).any()
    print(f'{probe}: {m_xor_setNA}')
VAR_02: True
UPL_01: False
UPL_02: False
UPL_04: False
CEN_01: False
CEN_02: False
CEN_04: False
CS2_02: False
PRI_03: False
PRI_01: False
H15_02: False
GSM_02: False
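As a sanity check on the XOR test, here is a small sketch on made-up boolean Series (not pipeline data). It shows that `^` is true exactly where one of the two one-directional checks is true, so a single XOR covers both "M without SetNA" and "SetNA without M".

```python
import pandas as pd

# Four rows covering every combination of the two flags (made-up values).
m = pd.Series([True, True, False, False])
set_na = pd.Series([True, False, True, False])

m_without_na = m & ~set_na      # M set, SetNA missing
na_without_m = ~m & set_na      # SetNA set, M missing
mismatch = m ^ set_na           # either kind of disagreement

print(mismatch.tolist())
print(mismatch.equals(m_without_na | na_without_m))
```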
So at least we know this XOR works and there aren’t any new sites. But let’s see how this was flagged.
[97]:
v = all_flags['VAR_02']
v.event[v.flags['M'] ^ v.flags['SetNA']]
[97]:
| prov_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|
| Date | ||||||
| 2019-09-24 12:30:00 | <NA> | E | EM | ManualFlag: Sensor removed and rewired; ManualFlag: False precip during tank fluctuation; | ||
| 2019-09-24 12:45:00 | RQ | E | EM | ManualFlag: Sensor removed and rewired; ManualFlag: False precip during tank fluctuation; | ||
| 2019-09-24 13:00:00 | <NA> | E | EM | ManualFlag: Sensor removed and rewired; ManualFlag: False precip during tank fluctuation; | ||
| 2019-09-24 13:15:00 | <NA> | E | EM | ManualFlag: Sensor removed and rewired; ManualFlag: False precip during tank fluctuation; | ||
| 2019-09-25 09:00:00 | <NA> | E | EM | ManualFlag: Sensor removed and rewired; ManualFlag: False precip during tank fluctuation; | ||
| 2019-09-25 09:15:00 | <NA> | E | EM | ManualFlag: Sensor removed and rewired; ManualFlag: False precip during tank fluctuation; | ||
| 2019-09-25 09:30:00 | <NA> | E | EM | ManualFlag: Sensor removed and rewired; ManualFlag: False precip during tank fluctuation; | ||
| 2019-09-25 09:45:00 | <NA> | E | EM | ManualFlag: Sensor removed and rewired; ManualFlag: False precip during tank fluctuation; | ||
| 2019-09-25 10:00:00 | <NA> | E | EM | ManualFlag: Sensor removed and rewired; ManualFlag: False precip during tank fluctuation; | ||
| 2019-09-25 10:15:00 | <NA> | E | EM | ManualFlag: Sensor removed and rewired; ManualFlag: False precip during tank fluctuation; | ||
| 2019-09-25 10:30:00 | <NA> | E | EM | ManualFlag: Sensor removed and rewired; ManualFlag: False precip during tank fluctuation; | ||
| 2019-09-25 10:45:00 | <NA> | E | EM | ManualFlag: Sensor removed and rewired; ManualFlag: False precip during tank fluctuation; | ||
| 2019-09-25 11:00:00 | <NA> | E | EM | ManualFlag: Sensor removed and rewired; ManualFlag: False precip during tank fluctuation; | ||
| 2019-09-25 11:15:00 | <NA> | E | EM | ManualFlag: Sensor removed and rewired; ManualFlag: False precip during tank fluctuation; | ||
| 2019-09-25 11:30:00 | <NA> | E | EM | ManualFlag: Sensor removed and rewired; ManualFlag: False precip during tank fluctuation; | ||
| 2019-09-25 11:45:00 | <NA> | E | EM | ManualFlag: Sensor removed and rewired; ManualFlag: False precip during tank fluctuation; | ||
| 2019-09-25 12:00:00 | <NA> | E | EM | ManualFlag: Sensor removed and rewired; ManualFlag: False precip during tank fluctuation; | ||
| 2019-09-25 12:15:00 | <NA> | E | EM | ManualFlag: Sensor removed and rewired; ManualFlag: False precip during tank fluctuation; | ||
| 2019-09-25 12:30:00 | <NA> | E | EM | ManualFlag: Sensor removed and rewired; ManualFlag: False precip during tank fluctuation; |
OK, so the M flag was written first. The E flag overwrote the manual flag, but the explanation is additive and the final flags are based on the boolean, so it retained both. It looks like the manual flags overlap, but the M flag was taken from NotesDB, and doesn’t match the raw data below. I’ll edit the manual flag, but we still need a rule here that will be consistent.
The M flag should take precedence. All of the data around those E flags will be gone, so it won’t make sense otherwise.
[93]:
plt.close(1)
[94]:
v.data.loc[pd.to_datetime('9/10/19'):pd.to_datetime('9/26/19'), 'tank_height'].plot(grid=True, marker='.')
[94]:
<Axes: xlabel='Date'>
[95]:
plt.figure()
v.data.loc[pd.to_datetime('9/10/19'):pd.to_datetime('9/26/19'), 'precip'].plot(grid=True, marker='.', legend=True)
v.data.loc[pd.to_datetime('9/10/19'):pd.to_datetime('9/26/19'), 'adj_precip'].plot(grid=True, marker='.', legend=True)
[95]:
<Axes: xlabel='Date'>
This is a unique instance where the raw tank data was -184. GCE automatically replaces this data, and that process forward fills tank values, so the last recorded value is displayed. I then asked Adam to add a manual flag to catch a pre-disconnect spike, which was likely due to messing with the wiring.
SetZero Must Have Flag
Whenever a value is set to 0, it should have a corresponding flag to explain it. Is there ever a case where this shouldn’t be an E flag?
[27]:
cf[cf.Set0 == True].any()
[27]:
Q False
U True
C False
* False
SetNA False
Set0 True
E True
M False
dtype: bool
Whoa, U or E?
[31]:
all_flags['CEN_01'].event[(cf.U == True) & (cf.E == True)]
[31]:
| prov_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|
| Date |
OK, looks like there are no U’s when there’s an E.
[35]:
all_flags['CEN_01'].event[(cf.Set0 == True) & (cf.E == True)].describe()
[35]:
| prov_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|
| count | 0 | 93 | 93 | 93 | 93 | 93 |
| unique | 0 | 1 | 1 | 1 | 1 | 1 |
| top | NaN | E | E | INTPRO | QaRule AutoFlag: duplicate; | |
| freq | NaN | 93 | 93 | 93 | 93 | 93 |
[37]:
all_flags['CEN_01'].event[(cf.Set0 == True) & (cf.U == True)].describe()
[37]:
| prov_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|
| count | 497 | 497 | 497 | 497 | 497 | 497 |
| unique | 1 | 1 | 1 | 1 | 1 | 1 |
| top | MMM | U | U | CLOG | ManualFlag: Valve left open at gauge. Not enough accumulated precip for autoflagging.; | |
| freq | 497 | 497 | 497 | 497 | 497 | 497 |
OK, so that was all a manual flag during a clog…
OK, so what’s happening here is there is a manual flag being added with the E, but then the clog analysis is adding a U flag. So is it more clear to be estimated as 0 with a clog event_code or to let the U shine through?
Also, how do I write something to catch this case?
Overall, I think having an E flag should mean we think the number is real. So this should all get a U flag.
So this works, but there needs to be an exception for manual flags.
[63]:
ce = all_flags['CEN_01'].event
has_manual_flag = (ce['manual_flag'] != '')
(cf.Set0 & ~cf.E & ~has_manual_flag).any()
[63]:
False
[64]:
for probe, flags in all_flags.items():
    has_manual_flag = (flags.event['manual_flag'] != '')
    any_missing = (flags.flags.Set0 & ~flags.flags.E & ~has_manual_flag).any()
    print(f'{probe}: {any_missing}\n')
VAR_02: True
UPL_01: False
UPL_02: False
UPL_04: False
CEN_01: False
CEN_02: False
CEN_04: False
CS2_02: False
PRI_03: False
PRI_01: False
H15_02: False
GSM_02: False
[65]:
vf = all_flags['VAR_02'].flags
ve = all_flags['VAR_02'].event
has_manual_flag = (ve['manual_flag'] != '')
missing_e = vf.Set0 & ~vf.E & ~has_manual_flag
ve[missing_e]
[65]:
| prov_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|
| Date | ||||||
| 2022-07-03 16:35:00 | MMM | INSREM | ManualFlag: Sensor reinstallation. Tank raised following empty state while disconnected.; | |||
| 2022-07-03 16:40:00 | MMM | INSREM | ManualFlag: Sensor reinstallation. Tank raised following empty state while disconnected.; | |||
| 2022-07-03 16:45:00 | MMM | INSREM | ManualFlag: Sensor reinstallation. Tank raised following empty state while disconnected.; | |||
| 2022-07-13 16:35:00 | <NA> | MAINTE | ManualFlag: Sensor standpipe drained for repair and tank refilled by bucket.; |
So that will be rewritten with an E flag. I think that makes sense.
Two issues coming up: in the last section we had dueling manual flags, and here we have a contradicting provisional flag.
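The two checks used so far could be bundled into one small report per probe. This is only a sketch: `consistency_report` is a hypothetical helper, not part of post_gce_qc, and it assumes a boolean flag table plus a string `manual_flag` Series that is empty where no manual flag exists, as in the cells above.

```python
import pandas as pd

def consistency_report(flags, manual_flag):
    """Count rows that break the two conventions explored above (hypothetical helper)."""
    has_manual = manual_flag != ''
    return {
        # M and SetNA should always agree
        'M_xor_SetNA': int((flags['M'] ^ flags['SetNA']).sum()),
        # a value set to 0 should carry an E flag unless a manual flag explains it
        'Set0_without_E': int((flags['Set0'] & ~flags['E'] & ~has_manual).sum()),
    }

# Made-up example: row 1 has M without SetNA, row 2 has Set0 with no E or manual flag.
toy_flags = pd.DataFrame({
    'M':     [True, True, False, False],
    'SetNA': [True, False, False, False],
    'Set0':  [False, False, True, True],
    'E':     [False, False, False, False],
})
toy_manual = pd.Series(['', '', '', 'U'])

report = consistency_report(toy_flags, toy_manual)
print(report)
```

Looping this over `all_flags.items()` would give one dictionary per probe, which is easier to scan than separate True/False listings.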
Finding Multiple Flags
First, we’ll look at the flag DataFrame and see if the boolean data can be leveraged to quickly ID where there are multiple flags.
[21]:
cf = all_flags['CEN_01'].flags
[22]:
col = ~cf.columns.isin(['Set0', 'SetNA'])
n_flags = cf.loc[:, col].sum(axis=1)
(n_flags > 1).any()
[22]:
False
[23]:
cf[n_flags>1]
[23]:
| Date | Q | U | C | * | SetNA | Set0 | E | M |
|---|---|---|---|---|---|---|---|---|
[24]:
all_flags['CEN_01'].event[n_flags>1]
[24]:
| prov_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|
| Date |
[136]:
for probe, flags in all_flags.items():
    col = ~flags.flags.columns.isin(['Set0', 'SetNA'])
    n_flags = flags.flags.loc[:, col].sum(axis=1)
    print(f'{probe}: {(n_flags).max()}')
VAR_02: 3
UPL_01: 1
UPL_02: 2
UPL_04: 0
CEN_01: 1
CEN_02: 2
CEN_04: 0
CS2_02: 2
PRI_03: 2
PRI_01: 0
H15_02: 2
GSM_02: 1
Let’s see how much of this can be fixed if manual flags clear all flags except the manual assignment.
[152]:
import sys
del sys.modules['post_gce_qc.qaqc']
del sys.modules['post_gce_qc']
del sys.modules['post_gce_qc.main']
del all_flags
from post_gce_qc import qaqc, main
[153]:
all_flags = main.main(2019, 2024, probes={'all_params'}, data_path='../config_new.yaml', qa_params='../qa_param.yaml',
fname_base='MS00413_PPT_L1_5min_', write_csv=False)
Loading all PPT data from ../config_new.yaml
Load data from VAR_02
All quality checks and quality assurance rules applied to VAR_02
------------------
Load data from UPL_01
All quality checks and quality assurance rules applied to UPL_01
------------------
Load data from UPL_02
All quality checks and quality assurance rules applied to UPL_02
------------------
Load data from UPL_04
All quality checks and quality assurance rules applied to UPL_04
------------------
Load data from CEN_01
All quality checks and quality assurance rules applied to CEN_01
------------------
Load data from CEN_02
All quality checks and quality assurance rules applied to CEN_02
------------------
Load data from CEN_04
All quality checks and quality assurance rules applied to CEN_04
------------------
Load data from CS2_02
All quality checks and quality assurance rules applied to CS2_02
------------------
Load data from PRI_03
All quality checks and quality assurance rules applied to PRI_03
------------------
Load data from PRI_01
All quality checks and quality assurance rules applied to PRI_01
------------------
Load data from H15_02
All quality checks and quality assurance rules applied to H15_02
------------------
Load data from GSM_02
All quality checks and quality assurance rules applied to GSM_02
------------------
[154]:
for probe, flags in all_flags.items():
    col = ~flags.flags.columns.isin(['Set0', 'SetNA'])
    n_flags = flags.flags.loc[:, col].sum(axis=1)
    print(f'{probe}: {(n_flags).max()}')
VAR_02: 2
UPL_01: 1
UPL_02: 2
UPL_04: 0
CEN_01: 1
CEN_02: 2
CEN_04: 0
CS2_02: 2
PRI_03: 2
PRI_01: 0
H15_02: 1
GSM_02: 1
OK, still a lot of duplicate flags. Let’s dig in.
[155]:
v = all_flags['VAR_02']
col = ~v.flags.columns.isin(['Set0', 'SetNA'])
n_flags = v.flags.loc[:, col].sum(axis=1)
v.event[n_flags>1]
[155]:
| prov_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|
| Date | ||||||
| 2024-01-25 15:05:00 | JR | E | C | CE | INTPRO | ApplyFlags AutoFlag: prorate precip during diurnal tank fluctuations |
| 2024-01-25 15:10:00 | <NA> | E | C | CE | INTPRO | ApplyFlags AutoFlag: prorate precip during diurnal tank fluctuations |
| 2024-01-25 15:15:00 | <NA> | E | C | CE | INTPRO | ApplyFlags AutoFlag: prorate precip during diurnal tank fluctuations |
OK, this is because manual flags have to be applied to clean the data before the prorating process is applied. So the prorating flag is on top of the manual flag. Therefore, this is a case where the manual flags also need to be applied after. Let’s look at another site and see if it’s any different.
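A rough sketch of what "applying manual flags again afterwards" could look like on the boolean table. `reapply_manual_flags` is a hypothetical helper, not an existing function in post_gce_qc, and it assumes each row carries at most one single-character manual flag and that Set0/SetNA should be left untouched.

```python
import pandas as pd

def reapply_manual_flags(flags, manual_flag, ignore=('Set0', 'SetNA')):
    """Let the manual flag win on every row that has one (hypothetical helper)."""
    out = flags.copy()
    flag_cols = [c for c in out.columns if c not in ignore]
    for code in manual_flag[manual_flag != ''].unique():
        rows = manual_flag == code
        out.loc[rows, flag_cols] = False   # drop auto-assigned flags on these rows
        out.loc[rows, code] = True         # keep only the manual assignment
    return out

# Made-up rows: row 0 has a manual C that prorating later overlaid with E.
toy_flags = pd.DataFrame({
    'C':    [True, False],
    'E':    [True, True],
    'Set0': [False, True],
})
toy_manual = pd.Series(['C', ''])

fixed = reapply_manual_flags(toy_flags, toy_manual)
print(fixed)
```

Rows without a manual flag pass through unchanged, so this step would be safe to run after every automated stage.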
[157]:
u = all_flags['CEN_02']
col = ~u.flags.columns.isin(['Set0', 'SetNA'])
n_flags = u.flags.loc[:, col].sum(axis=1)
u.event[n_flags>1]
[157]:
| prov_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|
| Date | ||||||
| 2022-03-03 10:05:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; | |
| 2024-05-16 15:30:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; |
[215]:
u = all_flags['UPL_02']
col = ~u.flags.columns.isin(['Set0', 'SetNA'])
n_flags = u.flags.loc[:, col].sum(axis=1)
u.event[n_flags>1]
[215]:
| prov_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|
| Date | ||||||
| 2019-04-15 12:45:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; | |
| 2019-09-12 16:00:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; | |
| 2019-09-12 16:15:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; | |
| 2021-11-03 14:50:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; |
So we’re having some sort of drain event issue. Not sure where the E is coming from.
Ah, I see that flag_drains assigns an E when it removes data. How unconventional. That’s an easy change, however. apply_NAN_val then applies the NAN flag and sets the flag to missing.
[216]:
multi_flag = n_flags > 1
for s in range(-2,3):
    multi_flag |= multi_flag.shift(s)
u.data[multi_flag]
[216]:
| tank_height | precip | adj_precip | |
|---|---|---|---|
| Date | |||
| 2019-04-15 12:00:00 | 516.5 | 0.0 | 0.0 |
| 2019-04-15 12:15:00 | 14.27 | 0.0 | 0.0 |
| 2019-04-15 12:30:00 | 14.27 | 0.0 | <NA> |
| 2019-04-15 12:45:00 | 56.799999 | 42.200001 | <NA> |
| 2019-04-15 13:00:00 | 56.799999 | 0.0 | 0.0 |
| 2019-04-15 13:15:00 | 56.150002 | 0.0 | 0.0 |
| 2019-04-15 13:30:00 | 56.150002 | 0.0 | 0.0 |
| 2019-09-12 15:15:00 | 336.200012 | 0.0 | 0.0 |
| 2019-09-12 15:30:00 | 336.0 | 0.0 | 0.0 |
| 2019-09-12 15:45:00 | 14.85 | 0.0 | 0.0 |
| 2019-09-12 16:00:00 | 47.07 | 7.69 | <NA> |
| 2019-09-12 16:15:00 | 59.330002 | 2.94 | <NA> |
| 2019-09-12 16:30:00 | 61.279999 | 0.31 | 0.0 |
| 2019-09-12 16:45:00 | 61.59 | 0.0 | 0.0 |
| 2019-09-12 17:00:00 | 61.900002 | 0.0 | 0.0 |
| 2021-11-03 14:35:00 | 215.800003 | 0.0 | 0.0 |
| 2021-11-03 14:40:00 | 215.800003 | 0.0 | 0.0 |
| 2021-11-03 14:45:00 | 43.240002 | 0.0 | 0.0 |
| 2021-11-03 14:50:00 | 53.709999 | 10.47 | <NA> |
| 2021-11-03 14:55:00 | 53.849998 | 0.14 | 0.0 |
| 2021-11-03 15:00:00 | 53.860001 | 0.01 | 0.0 |
| 2021-11-03 15:05:00 | 54.009998 | 0.15 | 0.0 |
[217]:
u.event[multi_flag]
[217]:
| prov_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|
| Date | ||||||
| 2019-04-15 12:00:00 | <NA> | DRAIN | QaRule AutoFlag: drain_event; | |||
| 2019-04-15 12:15:00 | <NA> | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |||
| 2019-04-15 12:30:00 | MMM | M | DRAIN | QaRule AutoFlag: drain_event; | ||
| 2019-04-15 12:45:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; | |
| 2019-04-15 13:00:00 | <NA> | |||||
| 2019-04-15 13:15:00 | <NA> | |||||
| 2019-04-15 13:30:00 | <NA> | |||||
| 2019-09-12 15:15:00 | <NA> | |||||
| 2019-09-12 15:30:00 | <NA> | |||||
| 2019-09-12 15:45:00 | <NA> | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |||
| 2019-09-12 16:00:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; | |
| 2019-09-12 16:15:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; | |
| 2019-09-12 16:30:00 | <NA> | |||||
| 2019-09-12 16:45:00 | <NA> | |||||
| 2019-09-12 17:00:00 | <NA> | |||||
| 2021-11-03 14:35:00 | <NA> | |||||
| 2021-11-03 14:40:00 | <NA> | |||||
| 2021-11-03 14:45:00 | RR | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |||
| 2021-11-03 14:50:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; | |
| 2021-11-03 14:55:00 | <NA> | Q | Q | DRAIN | QaRule AutoFlag: drain_event; | |
| 2021-11-03 15:00:00 | <NA> | |||||
| 2021-11-03 15:05:00 | <NA> |
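One note on the context window used above: because the loop updates multi_flag in place on each pass, the shifts compound, which is why the tables show three rows on either side of a flagged row rather than two. If a fixed width is wanted, a centered rolling maximum is one option. The sketch below uses made-up timestamps, and `widen_mask` is a hypothetical helper.

```python
import pandas as pd

def widen_mask(mask, k):
    """Mark every row within k steps of a True row (hypothetical helper)."""
    window = 2 * k + 1
    # rolling max over a centered window; min_periods=1 keeps the edges defined
    widened = mask.astype(int).rolling(window, center=True, min_periods=1).max()
    return widened.astype(bool)

# Made-up mask with a single flagged row in the middle.
idx = pd.date_range('2019-04-15 11:00', periods=15, freq='15min')
multi = pd.Series(False, index=idx)
multi.iloc[7] = True

wide = widen_mask(multi, 2)
print(wide[wide].index)
```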
Ok, let’s check another site.
[161]:
u = all_flags['CS2_02']
col = ~u.flags.columns.isin(['Set0', 'SetNA'])
n_flags = u.flags.loc[:, col].sum(axis=1)
u.event[n_flags>1]
[161]:
| prov_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|
| Date | ||||||
| 2018-10-09 16:00:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; | |
| 2018-11-27 16:15:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2018-11-27 16:30:00 | QR | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2018-12-18 15:30:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2019-02-25 06:30:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-25 06:45:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-25 07:00:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-25 07:15:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-25 07:30:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-25 07:45:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-25 08:00:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-25 08:15:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-25 08:30:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-25 08:45:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-25 09:00:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| ... | ... | ... | ... | ... | ... | ... |
| 2020-01-20 12:00:00 | QR | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2021-02-08 10:30:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2021-09-29 15:45:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2021-11-08 11:00:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2022-04-18 09:45:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2022-09-08 10:15:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2023-03-02 18:30:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2023-12-03 09:00:00 | QR | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2023-12-06 15:00:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2024-01-09 11:30:00 | QR | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2024-01-17 13:30:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2024-01-28 09:15:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2024-02-21 18:15:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2024-03-06 17:45:00 | QR | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2024-05-01 18:30:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; |
328 rows × 6 columns
[163]:
u.event[n_flags>1].final_flag.unique()
[163]:
<ArrowExtensionArray>
['EM', '*M']
Length: 2, dtype: string[pyarrow]
That’s a unique case. I would argue that * is more descriptive as to why there is no data. However, it violates the convention. Yet again, this can be fixed. It seems to indicate that these should trigger warnings when they occur.
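A minimal sketch of such a warning, built on the same n_flags count used in the cells above. `warn_on_multiple_flags` is a hypothetical helper, and the message wording is just a placeholder.

```python
import warnings
import pandas as pd

def warn_on_multiple_flags(flags, probe, ignore=('Set0', 'SetNA')):
    """Warn if any row carries more than one flag; return the row count (hypothetical helper)."""
    cols = ~flags.columns.isin(list(ignore))
    n_flags = flags.loc[:, cols].sum(axis=1)
    n_multi = int((n_flags > 1).sum())
    if n_multi:
        warnings.warn(f'{probe}: {n_multi} rows carry more than one flag')
    return n_multi

# Made-up table: only the first row has two flags (* and M).
toy_flags = pd.DataFrame({
    '*':     [True, False, False],
    'M':     [True, True, False],
    'SetNA': [True, True, False],
})

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    n_rows = warn_on_multiple_flags(toy_flags, 'CS2_02')

print(n_rows, [str(w.message) for w in caught])
```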
[164]:
u = all_flags['PRI_03']
col = ~u.flags.columns.isin(['Set0', 'SetNA'])
n_flags = u.flags.loc[:, col].sum(axis=1)
u.event[n_flags>1]
[164]:
| prov_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|
| Date | ||||||
| 2018-11-28 20:00:00 | QR | E | EM | DRAIN | QaRule AutoFlag: drain_event; | |
| 2019-02-28 16:45:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2019-04-09 23:15:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2019-09-17 18:00:00 | QR | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2019-12-10 20:30:00 | QR | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2020-04-07 23:45:00 | QR | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2020-07-27 18:30:00 | QR | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2020-11-25 17:45:00 | QR | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2021-02-18 20:00:00 | QR | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2021-09-29 21:45:00 | QR | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2021-12-23 20:30:00 | QR | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2022-09-21 15:00:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2022-12-29 13:45:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2023-11-06 14:15:00 | QR | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |
| 2024-01-09 11:15:00 | QR | E | EM | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; |
So what would the rule here be? There should be a screen warning whenever there is more than one flag. But there should also be a way to choose the final flag, an order of precedence:
1. Manual flag trumps all
2. M trumps any non-manual flag
3. E trumps any flag that isn’t 1 or 2…?
Actually, that ignores clogs. Clogs are a unique case and should always take precedence as a descriptor. So, an amended order of precedence:
1. Manual flag trumps all
2. M trumps any non-manual flag
3. U trumps all but 1 or 2
4. C trumps all but 1-3
5. E trumps any flag that isn’t 1-4…?
That only leaves Q and *. The * is basically unused, with the exception of the above manual flag, so that seems pretty good to me.
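The order of precedence above can be sketched as a single pass that writes flags from lowest to highest priority, so the higher ones overwrite. This is a simplified illustration: `resolve_final_flag` is a hypothetical helper, the relative order of Q and * at the bottom is an assumption, and it ignores the extra condition that U should only win during clog events.

```python
import pandas as pd

# Highest precedence first; Q and * placement is an assumption.
PRECEDENCE = ['M', 'U', 'C', 'E', 'Q', '*']

def resolve_final_flag(flags, manual_flag):
    """Collapse a boolean flag table into one flag per row (hypothetical helper)."""
    final = pd.Series('', index=flags.index, dtype=object)
    # Write lowest precedence first so higher-precedence flags overwrite it.
    for code in reversed(PRECEDENCE):
        final[flags[code]] = code
    # A manual flag trumps everything else.
    has_manual = manual_flag != ''
    final[has_manual] = manual_flag[has_manual]
    return final

# Made-up rows: M+E, C+E, *+M with a manual *, and an unflagged row.
toy_flags = pd.DataFrame(False, index=range(4), columns=PRECEDENCE)
toy_flags.loc[0, ['M', 'E']] = True
toy_flags.loc[1, ['C', 'E']] = True
toy_flags.loc[2, ['*', 'M']] = True
toy_manual = pd.Series(['', '', '*', ''])

final = resolve_final_flag(toy_flags, toy_manual)
print(final.tolist())
```

Keeping the order in one list makes it easy to revise the precedence later without touching the logic.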
[196]:
u = all_flags['CS2_02']
col = ~u.flags.columns.isin(['Set0', 'SetNA'])
n_flags = u.flags.loc[:, col].sum(axis=1)
[176]:
has_multi_flag = n_flags > 1
# use the mask just built for this probe, not a mask left over from another cell
has_manual_flag = u.event.loc[has_multi_flag, 'manual_flag'] != ''
manual_flags = u.event.loc[has_multi_flag, 'manual_flag'].unique()
manual_flags
[176]:
<ArrowExtensionArray>
['', '*']
Length: 2, dtype: string[pyarrow]
[185]:
u.event[u.event['manual_flag'].str.contains('*', regex=False)]
[185]:
| prov_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|
| Date | ||||||
| 2019-02-25 06:30:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-25 06:45:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-25 07:00:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-25 07:15:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-25 07:30:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-25 07:45:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-25 08:00:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-25 08:15:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-25 08:30:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-25 08:45:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-25 09:00:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-25 09:15:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-25 09:30:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-25 09:45:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-25 10:00:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| ... | ... | ... | ... | ... | ... | ... |
| 2019-02-28 07:30:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-28 07:45:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-28 08:00:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-28 08:15:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-28 08:30:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-28 08:45:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-28 09:00:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-28 09:15:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-28 09:30:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-28 09:45:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-28 10:00:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-28 10:15:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-28 10:30:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-28 10:45:00 | <NA> | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; | ||
| 2019-02-28 11:00:00 | QM | * | *M | ManualFlag: Snowdown overflowed noahiv with snow.; |
307 rows × 6 columns
[200]:
def _clear_all_flags_but(flags, rows, keep_cols):
    # On the selected rows, set keep_cols to True and clear every other column
    clear_cols = ~flags.columns.isin(keep_cols)
    flags.loc[rows, keep_cols] = True
    flags.loc[rows, clear_cols] = False
    return flags
for mf in manual_flags:
    if mf != '':
        # first, clear anything that isn't the manual flag
        # because this draws the manual flags from events, any other clearing does not have an effect
        has_manual_flag = u.event.manual_flag.str.contains(mf, regex=False)
        assign_manual_flag = has_manual_flag & has_multi_flag
        u.flags = _clear_all_flags_but(u.flags, assign_manual_flag, [mf])
    elif mf == '':
        # in ORDER of PRECEDENCE: keep only the following and clear all else
        precedence = {'M': {'keep_col':['M', 'SetNA'], 'has': has_multi_flag},
                      'U': {'keep_col':['U'], 'has': has_multi_flag & u.event.event_code.str.contains('CLOG')},
                      'C': {'keep_col':['C'], 'has': has_multi_flag},
                      'E': {'keep_col':['E', 'Set0'], 'has': has_multi_flag},
                      }
        for flg, f_param in precedence.items():
            # Keep only the current flag; all other flags paired with it are cleared.
            # U is only kept on multi-flag rows whose event code marks a clog.
            assign_flag = u.flags[flg] & f_param['has']
            u.flags = _clear_all_flags_but(u.flags, assign_flag, f_param['keep_col'])
'''
# Next, keep only the M flag, all other flags paired with M will be cleared.
assign_M_flag = u.flags['M'] & has_multi_flag
u.flags = _clear_all_flags_but(u.flags, assign_M_flag, ['M', 'SetNA'])
# From the remaining multi-flag rows, keep only clog flags (U)
has_clog = u.event.event_code.str.contains('CLOG')
assign_U_flag = u.flags['U'] & has_clog & has_multi_flag
u.flags = _clear_all_flags_but(u.flags, assign_U_flag, ['U'])
# From the remaining multi-flag rows, keep only C flags
assign_C_flag = u.flags['C'] & has_multi_flag
u.flags = _clear_all_flags_but(u.flags, assign_C_flag, ['C'])
# From the remaining multi-flag rows, keep only the E flag
assign_E_flag = u.flags['E'] & has_multi_flag
u.flags = _clear_all_flags_but(u.flags, assign_E_flag, ['E', 'Set0'])
'''
[192]:
u.flags[has_multi_flag]
[192]:
| Q | U | C | * | SetNA | Set0 | E | M | |
|---|---|---|---|---|---|---|---|---|
| Date | ||||||||
| 2018-10-09 16:00:00 | False | False | False | False | True | False | False | True |
| 2018-11-27 16:15:00 | False | False | False | False | True | False | False | True |
| 2018-11-27 16:30:00 | False | False | False | False | True | False | False | True |
| 2018-12-18 15:30:00 | False | False | False | False | True | False | False | True |
| 2019-02-25 06:30:00 | False | False | False | True | False | False | False | False |
| 2019-02-25 06:45:00 | False | False | False | True | False | False | False | False |
| 2019-02-25 07:00:00 | False | False | False | True | False | False | False | False |
| 2019-02-25 07:15:00 | False | False | False | True | False | False | False | False |
| 2019-02-25 07:30:00 | False | False | False | True | False | False | False | False |
| 2019-02-25 07:45:00 | False | False | False | True | False | False | False | False |
| 2019-02-25 08:00:00 | False | False | False | True | False | False | False | False |
| 2019-02-25 08:15:00 | False | False | False | True | False | False | False | False |
| 2019-02-25 08:30:00 | False | False | False | True | False | False | False | False |
| 2019-02-25 08:45:00 | False | False | False | True | False | False | False | False |
| 2019-02-25 09:00:00 | False | False | False | True | False | False | False | False |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2020-01-20 12:00:00 | False | False | False | False | True | False | False | True |
| 2021-02-08 10:30:00 | False | False | False | False | True | False | False | True |
| 2021-09-29 15:45:00 | False | False | False | False | True | False | False | True |
| 2021-11-08 11:00:00 | False | False | False | False | True | False | False | True |
| 2022-04-18 09:45:00 | False | False | False | False | True | False | False | True |
| 2022-09-08 10:15:00 | False | False | False | False | True | False | False | True |
| 2023-03-02 18:30:00 | False | False | False | False | True | False | False | True |
| 2023-12-03 09:00:00 | False | False | False | False | True | False | False | True |
| 2023-12-06 15:00:00 | False | False | False | False | True | False | False | True |
| 2024-01-09 11:30:00 | False | False | False | False | True | False | False | True |
| 2024-01-17 13:30:00 | False | False | False | False | True | False | False | True |
| 2024-01-28 09:15:00 | False | False | False | False | True | False | False | True |
| 2024-02-21 18:15:00 | False | False | False | False | True | False | False | True |
| 2024-03-06 17:45:00 | False | False | False | False | True | False | False | True |
| 2024-05-01 18:30:00 | False | False | False | False | True | False | False | True |
328 rows × 8 columns
[194]:
col = ~u.flags.columns.isin(['Set0', 'SetNA'])
n_flags = u.flags.loc[:, col].sum(axis=1)
u.event[n_flags>1]
[194]:
| prov_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|
| Date |
That looks like it works! Let’s check it on another probe.
[201]:
u = all_flags['CEN_02']
col = ~u.flags.columns.isin(['Set0', 'SetNA'])
n_flags = u.flags.loc[:, col].sum(axis=1)
[202]:
u.event[n_flags>1]
[202]:
| prov_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|
| Date | ||||||
| 2022-03-03 10:05:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; | |
| 2024-05-16 15:30:00 | <NA> | E | EM | DRAIN | QaRule AutoFlag: drain_event; |
[203]:
has_multi_flag = n_flags > 1
manual_flags = u.event.loc[has_multi_flag, 'manual_flag'].unique()
[204]:
for mf in manual_flags:
    if mf != '':
        # first, clear anything that isn't the manual flag
        # because this draws the manual flags from events, any other clearing does not have an effect
        has_manual_flag = u.event.manual_flag.str.contains(mf, regex=False)
        assign_manual_flag = has_manual_flag & has_multi_flag
        u.flags = _clear_all_flags_but(u.flags, assign_manual_flag, [mf])
    elif mf == '':
        # in ORDER of PRECEDENCE: keep only the following and clear all else
        precedence = {'M': {'keep_col':['M', 'SetNA'], 'has': has_multi_flag},
                      'U': {'keep_col':['U'], 'has': has_multi_flag & u.event.event_code.str.contains('CLOG')},
                      'C': {'keep_col':['C'], 'has': has_multi_flag},
                      'E': {'keep_col':['E', 'Set0'], 'has': has_multi_flag},
                      }
        for flg, f_param in precedence.items():
            # on multi-flag rows that carry this flag, keep only it (and its paired Set column);
            # all lower-precedence flags on those rows are cleared
            assign_flag = u.flags[flg] & f_param['has']
            u.flags = _clear_all_flags_but(u.flags, assign_flag, f_param['keep_col'])
[205]:
u.flags[has_multi_flag]
[205]:
| Q | U | C | * | SetNA | Set0 | E | M | |
|---|---|---|---|---|---|---|---|---|
| Date | ||||||||
| 2022-03-03 10:05:00 | False | False | False | False | True | False | False | True |
| 2024-05-16 15:30:00 | False | False | False | False | True | False | False | True |
[212]:
n_flags = u.flags.loc[:, col].sum(axis=1)
u.event[n_flags>1]
[212]:
| prov_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|
| Date |
Let’s look at this another way. The boolean dataframe appears to be behaving well: it holds all the QaRule flags, and the manual flags correctly cleared whatever had been set before they were assigned. So let’s look at the events, where we have a record of all the flags that were applied at any stage.
[25]:
ce = all_flags['CEN_01'].event
[32]:
ce[(ce['QaRule_flag'] != ce['manual_flag'])&(ce['manual_flag']!= '')&(ce['QaRule_flag']!= '')]
[32]:
| prov_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|
| Date |
[42]:
ce[(ce.prov_flag == ce.QaRule_flag)]# & (~ce.prov_flag.isna())]
[42]:
| prov_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|
| Date |
Those ways of looking at the data don’t seem that helpful. The boolean logic above seems to be handling things well. I’m pretty happy with the logic.
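To make the precedence idea concrete, here is a minimal, self-contained sketch on a toy boolean table. It is not the package code: the column names mirror u.flags, but the values are made up, the helper resolve_by_precedence is hypothetical, and it leaves out both the CLOG condition on U and the paired SetNA/Set0 columns.

```python
import pandas as pd

# Toy boolean flag table; column names mirror u.flags, the values are invented.
flags = pd.DataFrame(
    {
        "Q": [True, False, True, False],
        "U": [False, True, False, False],
        "C": [False, True, False, False],
        "E": [True, False, False, True],
        "M": [True, False, False, False],
    },
    index=pd.date_range("2022-03-03 10:00", periods=4, freq="5min"),
)

# Highest precedence first. Assumed order for this sketch: M, U, C, E, then Q.
PRECEDENCE = ["M", "U", "C", "E", "Q"]


def resolve_by_precedence(flags, precedence):
    """Keep only the highest-precedence flag that is set in each row."""
    resolved = pd.DataFrame(False, index=flags.index, columns=flags.columns)
    undecided = pd.Series(True, index=flags.index)
    for col in precedence:
        # Rows that carry this flag and have not been claimed by a higher one.
        winner = flags[col] & undecided
        resolved.loc[winner, col] = True
        undecided &= ~winner
    return resolved


resolved = resolve_by_precedence(flags, PRECEDENCE)
# Same check as in the cells above: no row should carry more than one flag.
print(resolved.sum(axis=1).max())
```

Unlike the in-place clearing above, this version builds a fresh table, which makes it easy to compare the result against the original flags.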
GCE Flags During Clogs¶
The cross-probe comparison is a substantial body of work developed to identify clogs. All clogs are flagged with either a U flag for undercatch, a C flag for delayed precip, or no flag for a dry period, where a majority of other rain gauges show that no rain occurred and therefore no rain was missed.
In GCE, clogs are flagged manually, as either Missing (M) or Questionable (Q). The methods developed above to remove duplicate flags give precedence to the M flag, so all known cases where GCE manually applied M flags to clogs were removed by manual flags so that the clog analysis could be conducted. The Q flag has the lowest precedence, so U and C flags will overwrite it. However, this leaves Q flags in periods where the clog analysis determined it was a dry period, which is contrary to the logic developed for clogs. Will it be confusing to sometimes have a Q with a CLOG event_code? Not every clog gets a manual flag in GCE, so it will be inconsistent.
We probably don’t want to wipe all Q flags, especially since they may identify a problem that gets missed by these post-GCE methods. There is no reason to throw away good work done to flag the data unless we have a known flag to replace it with. I think the CLOG event_code is a great example of having a known replacement, even when that replacement is no flag at all. So let’s try selecting on event_code and making sure we only have U or C flags during CLOGs. This way, we keep the Q’s from GCE until we have a known replacement.
CEN_01 is the only site that is completely parameterized for clog analysis, so let’s see what this would look like at CEN_01.
[374]:
site = 'CEN_01'
clog = all_flags[site].event['event_code'] == 'CLOG'
u = all_flags[site].event['QaRule_flag'].str.contains('U')
c = all_flags[site].event['QaRule_flag'].str.contains('C')
u |= all_flags[site].event['manual_flag'].str.contains('U')
c |= all_flags[site].event['manual_flag'].str.contains('C')
all_flags[site].event[clog & ~u & ~c]
[374]:
| prov_flag | tank_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|---|
| Date | |||||||
| 2018-10-30 15:40:00 | <NA> | <NA> | CLOG | QaRule AutoFlag: clog; | |||
| 2018-10-30 15:45:00 | <NA> | <NA> | CLOG | QaRule AutoFlag: clog; | |||
| 2018-10-30 15:50:00 | <NA> | <NA> | CLOG | QaRule AutoFlag: clog; | |||
| 2018-10-30 15:55:00 | <NA> | <NA> | CLOG | QaRule AutoFlag: clog; | |||
| 2018-10-30 16:00:00 | <NA> | <NA> | CLOG | QaRule AutoFlag: clog; | |||
| 2018-10-30 16:05:00 | <NA> | <NA> | CLOG | QaRule AutoFlag: clog; | |||
| 2018-10-30 16:10:00 | <NA> | <NA> | CLOG | QaRule AutoFlag: clog; | |||
| 2018-10-30 16:15:00 | <NA> | <NA> | CLOG | QaRule AutoFlag: clog; | |||
| 2018-10-30 16:20:00 | <NA> | <NA> | CLOG | QaRule AutoFlag: clog; | |||
| 2018-10-30 16:25:00 | <NA> | <NA> | CLOG | QaRule AutoFlag: clog; | |||
| 2018-10-30 16:30:00 | <NA> | <NA> | CLOG | QaRule AutoFlag: clog; | |||
| 2018-10-30 16:35:00 | <NA> | <NA> | CLOG | QaRule AutoFlag: clog; | |||
| 2018-10-30 16:40:00 | <NA> | <NA> | CLOG | QaRule AutoFlag: clog; | |||
| 2018-10-30 16:45:00 | <NA> | <NA> | CLOG | QaRule AutoFlag: clog; | |||
| 2018-10-30 16:50:00 | <NA> | <NA> | CLOG | QaRule AutoFlag: clog; | |||
| ... | ... | ... | ... | ... | ... | ... | ... |
| 2024-04-04 12:35:00 | <NA> | T | CLOG | QaRule AutoFlag: clog; | |||
| 2024-04-04 12:40:00 | <NA> | T | CLOG | QaRule AutoFlag: clog; | |||
| 2024-04-04 12:45:00 | <NA> | T | CLOG | QaRule AutoFlag: clog; | |||
| 2024-04-04 12:50:00 | <NA> | T | CLOG | QaRule AutoFlag: clog; | |||
| 2024-04-04 13:00:00 | <NA> | T | CLOG | QaRule AutoFlag: clog; | |||
| 2024-04-04 13:05:00 | <NA> | T | CLOG | QaRule AutoFlag: clog; | |||
| 2024-04-04 13:10:00 | <NA> | T | CLOG | QaRule AutoFlag: clog; | |||
| 2024-04-04 13:15:00 | <NA> | T | CLOG | QaRule AutoFlag: clog; | |||
| 2024-04-04 13:20:00 | <NA> | T | CLOG | QaRule AutoFlag: clog; | |||
| 2024-04-04 13:55:00 | <NA> | T | CLOG | QaRule AutoFlag: clog; | |||
| 2024-04-04 14:00:00 | <NA> | T | CLOG | QaRule AutoFlag: clog; | |||
| 2024-04-04 14:10:00 | <NA> | T | CLOG | QaRule AutoFlag: clog; | |||
| 2024-04-04 14:20:00 | <NA> | T | CLOG | QaRule AutoFlag: clog; | |||
| 2024-04-04 14:30:00 | <NA> | T | CLOG | QaRule AutoFlag: clog; | |||
| 2024-04-04 14:40:00 | <NA> | T | CLOG | QaRule AutoFlag: clog; |
6812 rows × 7 columns
OK, lots of NO FLAG instances during dry periods during a clog.
[375]:
all_flags[site].event[clog & ~u & ~c].prov_flag.unique()
[375]:
<ArrowExtensionArray>
[<NA>, 'MM']
Length: 2, dtype: string[pyarrow]
[376]:
all_flags[site].event[clog & ~u & ~c].tank_flag.unique()
[376]:
<ArrowExtensionArray>
[<NA>, 'M', 'T']
Length: 3, dtype: string[pyarrow]
Well, at least at CEN_01, the GCE manual flags were all Missing, not Q. The T flag indicates the orifice temp is below 4C, so it can help ID clogs, but is inconsistent: the orifice temp can drop below freezing without any precip falling, so there is no clog. The orifice temp can also be above freezing without the clog breaking.
So it seems reasonable to enforce that all CLOG event_codes can only be paired with a U or a C flag.
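A minimal sketch of that rule, assuming a boolean flag table like u.flags and an event_code series on the same index. The helper enforce_clog_flags and the toy values are hypothetical, not part of post_gce_qc.

```python
import pandas as pd

idx = pd.date_range("2018-10-30 15:40", periods=4, freq="5min")
event_code = pd.Series(["CLOG", "CLOG", "CLOG", ""], index=idx)
flags = pd.DataFrame(
    {
        "Q": [True, True, False, True],
        "U": [True, False, False, False],
        "C": [False, False, False, False],
        "M": [False, False, False, False],
    },
    index=idx,
)


def enforce_clog_flags(flags, event_code, allowed=("U", "C")):
    """During CLOG events, clear every flag column except the allowed ones."""
    out = flags.copy()
    is_clog = event_code == "CLOG"
    other_cols = out.columns[~out.columns.isin(list(allowed))]
    out.loc[is_clog, other_cols] = False
    return out


cleaned = enforce_clog_flags(flags, event_code)
print(cleaned)
```

Rows outside a clog are untouched, so a Q assigned by GCE elsewhere survives until something more specific replaces it.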
Including Provisional Flagging¶
There are a lot of provisional flags that don’t exist in the final format. These flags should be assessed and converted into one of the final flag types. Further, GCE will assign more than one flag at a timestep, plus there are separate flags for the tank and the precip, and they often do not match. So these flag combinations need to be carefully interpreted, converted, and added to the data. However, care needs to be taken that they do not overwrite more descriptive flags that have already been assigned by this QA process.
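The "do not overwrite" requirement can be sketched as follows. This is only an illustration: the PROV_TO_FINAL mapping is hypothetical (the real conversion is worked out below), and the toy event table uses plain strings with an empty string meaning "no final flag yet".

```python
import pandas as pd

event = pd.DataFrame(
    {
        "prov_flag": ["Q", None, "Q", "M", "T"],
        "final_flag": ["", "E", "U", "", ""],
    },
    index=pd.date_range("2020-09-01 17:00", periods=5, freq="5min"),
)

# Hypothetical mapping from provisional flags to final flag types.
# Provisional flags without an entry (here "T") are left unconverted.
PROV_TO_FINAL = {"Q": "Q", "M": "M"}

converted = event["prov_flag"].map(PROV_TO_FINAL)

# Only fill timesteps that have no final flag yet, so that more descriptive
# flags already assigned by the QA process are never overwritten.
unflagged = event["final_flag"] == ""
fill = unflagged & converted.notna()
event.loc[fill, "final_flag"] = converted[fill]
print(event)
```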
[43]:
ce.prov_flag.unique()
[43]:
<ArrowExtensionArray>
[<NA>, 'RR', 'MMM', 'JQ', 'WR', 'MR', 'JT', 'WT']
Length: 8, dtype: string[pyarrow]
Let’s look at all the sites and see if they have similar combinations.
[50]:
for probe, flags in all_flags.items():
print(f'{probe}: {flags.event.prov_flag.unique()}\n')
VAR_02: <ArrowExtensionArray>
[ <NA>, 'WQ', 'RT', 'RQ', 'JQ', 'MMM', 'WT', 'RR', 'JR', 'QR', 'WR',
'QQ', 'EQ', 'JT', 'QT', 'ET', 'ER', 'MQ']
Length: 18, dtype: string[pyarrow]
UPL_01: <ArrowExtensionArray>
[<NA>, 'RR', 'MMM', 'JR', 'WT', 'JT', 'WQ']
Length: 7, dtype: string[pyarrow]
UPL_02: <ArrowExtensionArray>
[<NA>, 'RR', 'MMM', 'WT', 'JR', 'WQ', 'RQ', 'WR', 'RT', 'QQ']
Length: 10, dtype: string[pyarrow]
UPL_04: <ArrowExtensionArray>
[<NA>, 'M', 'T', 'Q']
Length: 4, dtype: string[pyarrow]
CEN_01: <ArrowExtensionArray>
[<NA>, 'RR', 'MMM', 'JQ', 'WR', 'MR', 'JT', 'WT']
Length: 8, dtype: string[pyarrow]
CEN_02: <ArrowExtensionArray>
[<NA>, 'RR', 'WQ', 'MMM', 'WR', 'JR']
Length: 6, dtype: string[pyarrow]
CEN_04: <ArrowExtensionArray>
['M', 'T', <NA>]
Length: 3, dtype: string[pyarrow]
CS2_02: <ArrowExtensionArray>
[<NA>, 'QR', 'QM', 'QQ']
Length: 4, dtype: string[pyarrow]
PRI_03: <ArrowExtensionArray>
[<NA>, 'QQ', 'QR', 'QM']
Length: 4, dtype: string[pyarrow]
PRI_01: <ArrowExtensionArray>
[<NA>, 'Q', 'M', 'T']
Length: 4, dtype: string[pyarrow]
H15_02: <ArrowExtensionArray>
[<NA>]
Length: 1, dtype: string[pyarrow]
GSM_02: <ArrowExtensionArray>
[<NA>, 'RR', 'MMM', 'JR']
Length: 4, dtype: string[pyarrow]
I think we will need to separate the tank and precip flags to be able to interpret these combinations.
T Flag¶
Let’s start with T. T is triggered by:
col_ORI_SA_AVG<3.5='T'
So, most instances of this should be covered by U and C flags inserted by the clogging routine. T’s shouldn’t need a flag of their own; they should be superseded by whatever the cross-probe comparison assigns.
Let’s start with the ET combo at VARA. When would a frozen orifice get an estimate flag?
[129]:
T = all_flags['VAR_02'].event['prov_flag'].str.contains('ET')
all_flags['VAR_02'].data[T]
[129]:
| tank_height | precip | adj_precip | |
|---|---|---|---|
| Date | |||
| 2020-11-07 18:35:00 | 658.099976 | 6.0 | <NA> |
| 2020-12-24 02:50:00 | 677.0 | 5.4 | <NA> |
| 2021-02-04 05:10:00 | 729.0 | 4.6 | <NA> |
OK, those are removed from the final data, but I can’t make heads or tails as to why from that snippet. I’ll load the VARA data set and dig in.
[119]:
prov = data_transfer.LoadProvisionalData(strtyr=2019, endyr=2024, file_n='../config_new.yaml',
fname_base='MS00413_PPT_L1_5min_')
prov.load_ppt_data()
df = prov.pivot_on_probe(prov.df, 'VAR', '02')
[130]:
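pd.options.display.min_rows = 30
# widen the mask to include neighboring timesteps for context
# (T is updated in place, so the shifts compound and the window ends up +/- 3 steps)
for s in range(-2,3):
    T |= T.shift(s)
df[T]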
[130]:
| INST | INST_Flag | TOT | TOT_Flag | ACC | ACC_Flag | |
|---|---|---|---|---|---|---|
| Date | ||||||
| 2020-11-07 18:20:00 | 657.900024 | T | 0.0 | <NA> | 24173.300781 | <NA> |
| 2020-11-07 18:25:00 | 660.0 | T | 0.0 | <NA> | 24173.300781 | <NA> |
| 2020-11-07 18:30:00 | 652.099976 | T | 0.0 | R | 24173.300781 | R |
| 2020-11-07 18:35:00 | 658.099976 | T | 6.0 | E | 24179.300781 | E |
| 2020-11-07 18:40:00 | 659.299988 | T | 7.2 | W | 24186.5 | W |
| 2020-11-07 18:45:00 | 657.799988 | T | 0.0 | <NA> | 24186.5 | <NA> |
| 2020-11-07 18:50:00 | 659.0 | T | 0.0 | <NA> | 24186.5 | <NA> |
| 2020-12-24 02:35:00 | 678.5 | T | 0.0 | <NA> | 30933.800781 | <NA> |
| 2020-12-24 02:40:00 | 677.0 | T | 0.0 | <NA> | 30933.800781 | <NA> |
| 2020-12-24 02:45:00 | 671.599976 | T | 0.0 | R | 30933.800781 | R |
| 2020-12-24 02:50:00 | 677.0 | T | 5.4 | E | 30939.199219 | E |
| 2020-12-24 02:55:00 | 678.5 | T | 6.9 | W | 30946.099609 | W |
| 2020-12-24 03:00:00 | 677.5 | T | 0.0 | <NA> | 30946.099609 | <NA> |
| 2020-12-24 03:05:00 | 677.799988 | T | 0.0 | <NA> | 30946.099609 | <NA> |
| 2021-02-04 04:55:00 | 728.900024 | T | 0.0 | <NA> | 34103.601562 | <NA> |
| 2021-02-04 05:00:00 | 728.900024 | T | 0.0 | <NA> | 34103.601562 | <NA> |
| 2021-02-04 05:05:00 | 724.400024 | T | 0.0 | R | 34103.601562 | R |
| 2021-02-04 05:10:00 | 729.0 | T | 4.6 | E | 34108.199219 | E |
| 2021-02-04 05:15:00 | 729.900024 | T | 5.5 | W | 34113.699219 | W |
| 2021-02-04 05:20:00 | 728.900024 | T | 0.0 | <NA> | 34113.699219 | <NA> |
| 2021-02-04 05:25:00 | 729.700012 | T | 0.0 | <NA> | 34113.699219 | <NA> |
OK. Good to know that T is only applied to the tank. The precip still gets its own flag.
All three of these instances fall within the period where the sensor was producing totally bogus data:
- start: '10/1/20 0005'
  end: '10/1/21 0000'
  replace_wNAN: True
  replace_w0: False
  flag: M
  explanation: Sensor failure. Rain gauge capped. Data removed.
So this is still in the provisional data, and is producing erroneous numbers.
Here we were just looking at T in the tank but E in the precip flag column. It may be necessary to split the tank flags out. But the T’s overall indicate an ambiguous case. They are triggered by:
col_ORI_SH_AVG<3.5='T'
This will occur in 2 cases: 1) where the orifice is frozen and snow likely collects in the funnel instead of accumulating in the tank, or 2) where temperatures are cold but there is no precip. Similarly, the orifice temp will rise above this threshold for some time before any precipitation in the funnel melts. So the T flag is a warning that a gauge’s heater may be broken, which makes it invaluable operationally. It is also a warning that something may be wrong in the data, but not that something is definitely wrong.
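One way to quantify how ambiguous T is would be to count how often it coincides with a detected clog. The sketch below uses invented tank_flag and event_code values on a shared index; the column names follow the event tables above, and the counts only illustrate the bookkeeping.

```python
import pandas as pd

idx = pd.date_range("2020-12-24 02:35", periods=6, freq="5min")
tank_flag = pd.Series(["T", "T", "T", "", "", "T"], index=idx)
event_code = pd.Series(["", "CLOG", "CLOG", "CLOG", "", ""], index=idx)

has_t = tank_flag.str.contains("T", regex=False)
is_clog = event_code == "CLOG"

# T during a clog: the cold orifice coincides with missed precip.
t_and_clog = int((has_t & is_clog).sum())
# T without a clog: cold, but nothing was missed.
t_without_clog = int((has_t & ~is_clog).sum())
# Clog without T: the orifice warmed up before the clog broke.
clog_without_t = int((~has_t & is_clog).sum())

print(t_and_clog, t_without_clog, clog_without_t)
```

If the last two counts are large on real data, that would support treating T as an operational warning and not as a data flag.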
But these R, E, W flag sequences seem to pose a different problem.
E Following R¶
Let’s look into this pattern. They seem to be similar to an F flag following a J, where the F flag simply duplicates the J flag precip.
[132]:
last_r = df.TOT_Flag.str.contains('R').shift(1)
reboundE = df.TOT_Flag.str.contains('E') & last_r
for s in range(-2,3):
    reboundE |= reboundE.shift(s)
df[reboundE]
[132]:
| INST | INST_Flag | TOT | TOT_Flag | ACC | ACC_Flag | |
|---|---|---|---|---|---|---|
| Date | ||||||
| 2020-09-01 16:55:00 | 34.139999 | Q | 0.0 | <NA> | 5017.830078 | <NA> |
| 2020-09-01 17:00:00 | 33.759998 | Q | 0.0 | R | 5017.830078 | R |
| 2020-09-01 17:05:00 | 28.889999 | Q | 0.0 | R | 5017.830078 | R |
| 2020-09-01 17:10:00 | 42.790001 | Q | 13.9 | E | 5031.72998 | E |
| 2020-09-01 17:15:00 | 27.129999 | Q | 13.9 | <NA> | 5045.629883 | <NA> |
| 2020-09-01 17:20:00 | 27.120001 | Q | 13.9 | <NA> | 5059.529785 | <NA> |
| 2020-09-01 17:25:00 | 27.379999 | Q | 13.9 | <NA> | 5073.430176 | <NA> |
| 2020-11-07 18:20:00 | 657.900024 | T | 0.0 | <NA> | 24173.300781 | <NA> |
| 2020-11-07 18:25:00 | 660.0 | T | 0.0 | <NA> | 24173.300781 | <NA> |
| 2020-11-07 18:30:00 | 652.099976 | T | 0.0 | R | 24173.300781 | R |
| 2020-11-07 18:35:00 | 658.099976 | T | 6.0 | E | 24179.300781 | E |
| 2020-11-07 18:40:00 | 659.299988 | T | 7.2 | W | 24186.5 | W |
| 2020-11-07 18:45:00 | 657.799988 | T | 0.0 | <NA> | 24186.5 | <NA> |
| 2020-11-07 18:50:00 | 659.0 | T | 0.0 | <NA> | 24186.5 | <NA> |
| 2020-12-24 02:35:00 | 678.5 | T | 0.0 | <NA> | 30933.800781 | <NA> |
| 2020-12-24 02:40:00 | 677.0 | T | 0.0 | <NA> | 30933.800781 | <NA> |
| 2020-12-24 02:45:00 | 671.599976 | T | 0.0 | R | 30933.800781 | R |
| 2020-12-24 02:50:00 | 677.0 | T | 5.4 | E | 30939.199219 | E |
| 2020-12-24 02:55:00 | 678.5 | T | 6.9 | W | 30946.099609 | W |
| 2020-12-24 03:00:00 | 677.5 | T | 0.0 | <NA> | 30946.099609 | <NA> |
| 2020-12-24 03:05:00 | 677.799988 | T | 0.0 | <NA> | 30946.099609 | <NA> |
| 2021-02-04 04:55:00 | 728.900024 | T | 0.0 | <NA> | 34103.601562 | <NA> |
| 2021-02-04 05:00:00 | 728.900024 | T | 0.0 | <NA> | 34103.601562 | <NA> |
| 2021-02-04 05:05:00 | 724.400024 | T | 0.0 | R | 34103.601562 | R |
| 2021-02-04 05:10:00 | 729.0 | T | 4.6 | E | 34108.199219 | E |
| 2021-02-04 05:15:00 | 729.900024 | T | 5.5 | W | 34113.699219 | W |
| 2021-02-04 05:20:00 | 728.900024 | T | 0.0 | <NA> | 34113.699219 | <NA> |
| 2021-02-04 05:25:00 | 729.700012 | T | 0.0 | <NA> | 34113.699219 | <NA> |
| 2021-05-05 12:30:00 | 778.0 | Q | 0.0 | <NA> | 35212.0 | <NA> |
| 2021-05-05 12:35:00 | 778.0 | Q | 0.0 | <NA> | 35212.0 | <NA> |
| 2021-05-05 12:40:00 | 532.400024 | R | 0.0 | R | 35212.0 | R |
| 2021-05-05 12:45:00 | 604.599976 | R | 72.199997 | E | 35284.199219 | E |
| 2021-05-05 12:50:00 | 600.799988 | Q | 68.400002 | W | 35352.601562 | W |
| 2021-05-05 12:55:00 | 598.299988 | Q | 0.0 | <NA> | 35352.601562 | <NA> |
| 2021-05-05 13:00:00 | 655.0 | R | 54.200001 | J | 35406.800781 | J |
| 2021-05-26 14:10:00 | 779.0 | Q | 0.0 | <NA> | 38058.859375 | <NA> |
| 2021-05-26 14:15:00 | 778.900024 | Q | 0.0 | <NA> | 38058.859375 | <NA> |
| 2021-05-26 14:20:00 | 16.360001 | R | 0.0 | R | 38058.859375 | R |
| 2021-05-26 14:25:00 | 729.700012 | Q | 713.340027 | E | 38772.199219 | E |
| 2021-05-26 14:30:00 | 735.299988 | Q | 718.940002 | W | 39491.140625 | W |
| 2021-05-26 14:35:00 | 725.299988 | Q | 0.0 | R | 39491.140625 | R |
| 2021-05-26 14:40:00 | 736.599976 | Q | 11.3 | <NA> | 39502.441406 | <NA> |
| 2021-06-25 09:55:00 | 701.599976 | Q | 0.3 | <NA> | 65285.058594 | <NA> |
| 2021-06-25 10:00:00 | 701.099976 | Q | 0.3 | <NA> | 65285.359375 | <NA> |
| 2021-06-25 10:05:00 | 696.900024 | Q | 0.3 | R | 65285.660156 | R |
| 2021-06-25 10:10:00 | 701.599976 | Q | 4.7 | E | 65290.359375 | E |
| 2021-06-25 10:15:00 | 704.700012 | Q | 7.8 | W | 65298.160156 | W |
| 2021-06-25 10:20:00 | 702.700012 | Q | 0.0 | <NA> | 65298.160156 | <NA> |
| 2021-06-25 10:25:00 | 700.5 | Q | 0.0 | R | 65298.160156 | R |
| 2021-07-03 18:40:00 | 430.899994 | Q | 0.0 | <NA> | 75933.601562 | <NA> |
| 2021-07-03 18:45:00 | 432.399994 | Q | 0.0 | <NA> | 75933.601562 | <NA> |
| 2021-07-03 18:50:00 | 427.100006 | Q | 0.0 | R | 75933.601562 | R |
| 2021-07-03 18:55:00 | 489.399994 | R | 62.299999 | E | 75995.898438 | E |
| 2021-07-03 19:00:00 | 482.899994 | Q | 55.799999 | W | 76051.703125 | W |
| 2021-07-03 19:05:00 | 489.299988 | Q | 6.4 | <NA> | 76058.101562 | <NA> |
| 2021-07-03 19:10:00 | 485.600006 | Q | 0.0 | R | 76058.101562 | R |
All but one of these are during the period of sensor failure. But the E, and usually the W, seem bogus. It’s exactly like the J and F problem, except that there the J was real precip and the F following it was fake; here both seem fake. Let’s double check that this data is getting filtered. Unfortunately, none of the other sites seem to have this phenomenon, so I have just the one example where the data isn’t crazy to go off of.
[213]:
all_flags['VAR_02'].event[reboundE]
[213]:
| prov_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|
| Date | ||||||
| 2020-09-01 16:55:00 | <NA> | |||||
| 2020-09-01 17:00:00 | RQ | |||||
| 2020-09-01 17:05:00 | RQ | |||||
| 2020-09-01 17:10:00 | EQ | E | E | INTPRO | ApplyFlags AutoFlag: prorate precip during diurnal tank fluctuations | |
| 2020-09-01 17:15:00 | <NA> | E | E | INTPRO | ApplyFlags AutoFlag: prorate precip during diurnal tank fluctuations | |
| 2020-09-01 17:20:00 | <NA> | E | E | INTPRO | ApplyFlags AutoFlag: prorate precip during diurnal tank fluctuations | |
| 2020-09-01 17:25:00 | <NA> | E | E | INTPRO | ApplyFlags AutoFlag: prorate precip during diurnal tank fluctuations | |
| 2020-11-07 18:20:00 | <NA> | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2020-11-07 18:25:00 | <NA> | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2020-11-07 18:30:00 | RT | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2020-11-07 18:35:00 | ET | E | M | M | INTPRO | QaRule AutoFlag: diurnal_flux; ManualFlag: Sensor failure. Rain gauge capped. Data removed.; |
| 2020-11-07 18:40:00 | WT | E | M | M | INTPRO | QaRule AutoFlag: diurnal_flux; ManualFlag: Sensor failure. Rain gauge capped. Data removed.; |
| 2020-11-07 18:45:00 | <NA> | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2020-11-07 18:50:00 | <NA> | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2020-12-24 02:35:00 | <NA> | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2020-12-24 02:40:00 | <NA> | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2020-12-24 02:45:00 | RT | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2020-12-24 02:50:00 | ET | E | M | M | INTPRO | QaRule AutoFlag: diurnal_flux; ManualFlag: Sensor failure. Rain gauge capped. Data removed.; |
| 2020-12-24 02:55:00 | WT | E | M | M | INTPRO | QaRule AutoFlag: diurnal_flux; ManualFlag: Sensor failure. Rain gauge capped. Data removed.; |
| 2020-12-24 03:00:00 | <NA> | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2020-12-24 03:05:00 | <NA> | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2021-02-04 04:55:00 | <NA> | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2021-02-04 05:00:00 | <NA> | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2021-02-04 05:05:00 | RT | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2021-02-04 05:10:00 | ET | E | M | M | INTPRO | QaRule AutoFlag: diurnal_flux; ManualFlag: Sensor failure. Rain gauge capped. Data removed.; |
| 2021-02-04 05:15:00 | WT | E | M | M | INTPRO | QaRule AutoFlag: diurnal_flux; ManualFlag: Sensor failure. Rain gauge capped. Data removed.; |
| 2021-02-04 05:20:00 | <NA> | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2021-02-04 05:25:00 | <NA> | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2021-05-05 12:30:00 | <NA> | U | M | M | QaRule AutoFlag: overflow; ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | |
| 2021-05-05 12:35:00 | <NA> | U | M | M | QaRule AutoFlag: overflow; ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | |
| 2021-05-05 12:40:00 | RR | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2021-05-05 12:45:00 | ER | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2021-05-05 12:50:00 | WQ | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2021-05-05 12:55:00 | <NA> | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2021-05-05 13:00:00 | JR | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2021-05-26 14:10:00 | <NA> | U | M | M | QaRule AutoFlag: overflow; ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | |
| 2021-05-26 14:15:00 | <NA> | U | M | M | QaRule AutoFlag: overflow; ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | |
| 2021-05-26 14:20:00 | RR | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2021-05-26 14:25:00 | EQ | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2021-05-26 14:30:00 | WQ | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2021-05-26 14:35:00 | RQ | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2021-05-26 14:40:00 | <NA> | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2021-06-25 09:55:00 | <NA> | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2021-06-25 10:00:00 | <NA> | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2021-06-25 10:05:00 | RQ | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2021-06-25 10:10:00 | EQ | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2021-06-25 10:15:00 | WQ | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2021-06-25 10:20:00 | <NA> | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2021-06-25 10:25:00 | RQ | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2021-07-03 18:40:00 | <NA> | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2021-07-03 18:45:00 | <NA> | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2021-07-03 18:50:00 | RQ | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; | ||
| 2021-07-03 18:55:00 | ER | E | M | M | INTPRO | QaRule AutoFlag: diurnal_flux; ManualFlag: Sensor failure. Rain gauge capped. Data removed.; |
| 2021-07-03 19:00:00 | WQ | E | M | M | INTPRO | QaRule AutoFlag: diurnal_flux; ManualFlag: Sensor failure. Rain gauge capped. Data removed.; |
| 2021-07-03 19:05:00 | <NA> | E | M | M | INTPRO | QaRule AutoFlag: diurnal_flux; ManualFlag: Sensor failure. Rain gauge capped. Data removed.; |
| 2021-07-03 19:10:00 | RQ | M | M | ManualFlag: Sensor failure. Rain gauge capped. Data removed.; |
OK, so both the E flag and the repeating values after it were caught. Now let’s look at that day as a whole.
[214]:
day = pd.to_datetime('9/1/20')
all_flags['VAR_02'].plot_flagged_day(day, 'VARA_02', paired_tank=all_flags['UPL_01'].data.tank_height)
[214]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
<Axes: title={'center': 'VARA_02 - 2020-09-01 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)
OK, so all the instances of this happening are batshit crazy… After talking to Adam, these E flags are being inserted by simple_pre.m. He was able to find the snippet here where the E is triggered in the “NOT_RAINING” section:
%%% similar to above, if the Diff > 0 or = 0 then the base and
%%% accumulation reflect this change and it is accepted
if Diff3 >= 0
    Rebounding = 0;
    Baseline3 = CompareBase; % a positive change in summer does not change the baseline
    if Diff3 < 4
        Flag3 = ''; % probably is ok
        CorrDiff3 = 0;
        Recent_Diffs = Recent_Diffs + Diff3;
    % if the Difference is > 4 and the last measurement was ok it's a "J"
    elseif Diff3 > 4 && strcmp(CompareFlag,'') ==1
        Flag3 = 'J';
        %fprintf(1,'%s%s%s%4.2f\n','a summer J was triggered on ', cHumanDate3, ' due to a gain of ', Diff3);
        CorrDiff3 = Diff3;
    % there is the chance that it could be the "rebound" from a reset
    elseif Diff3 > 4 && strcmp(CompareFlag,'R')==1
        Flag3 = 'E';
        CorrDiff3 = Diff3;
        %Results{i,9} = 'MAINTE';
        %fprintf(1,'%s%s%s\n','the maintenance flag was applied to ', cHumanDate3, ' because a positive difference occured following a reset')
    % do not upgrade the recent diffs if a J occurs
    elseif Diff3 > 4 && strcmp(CompareFlag,'E') ==1
        Flag3 = '';
        CorrDiff3 = Diff3;
        Recent_Diffs = Recent_Diffs + Diff3;
        % fprintf(fid_log,'%s%s%s%4.2f\n','post-reset gain is counted from ', cHumanDate3, ' with a magnitude of ', Diff3);
    % if the diff > 4 and the last measurement was a J, especially in the summer, we question it...
    elseif Diff3 > 4 && strcmp(CompareFlag,'J') ==1
        Flag3 = 'Q';
        CorrDiff3 = 0;
        % we add it to the recent diffs though, to move us towards rain
        Recent_Diffs = Recent_Diffs + Diff3;
        % AK commented out this line. Fox said this was legacy from
        % testing period.
        % fprintf(fid_log,'%s%s%s%4.2f\n','post-J gain is counted from ', cHumanDate3, ' with a magnitude of ', Diff3);
    % if Adam's code gives empty and the original flag is empty, we accept it and add to recent diffs
    elseif Diff3 > 4 && strcmp(OriFlag3,'""')==0
        Flag3 = '';
        CorrDiff3 = 0;
        Recent_Diffs = Recent_Diffs + Diff3;
    end
I think that logic is flawed: an E that follows an R seems to always be spurious. Looking back at Finding Multiple Flags, this never seems to occur during recharge after a drain, and we don’t have many examples. On the one hand, the drain and recharge methods should catch this case, and it never shows up outside of these wildly erratic periods. On the other hand, under that logic it should be a Q flag at best, since nothing is being estimated. And in practice, the values are all bad.
Since there are so few examples, I will change the flag to reflect the logic and replace the E with a Q. Placing a Q still puts a warning on these bogus numbers, but it doesn’t automatically remove them, in case the numbers are sometimes real.
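As a rough sketch of that replacement (not the actual post_gce_qc code), assume the provisional flags sit in a pandas Series with one flag per 5-minute timestep. The function name requalify_rebound_e and the toy data are made up for illustration, and the example uses a plain object-dtype Series rather than the string[pyarrow] dtype of the real tables.

```python
import pandas as pd


def requalify_rebound_e(flags: pd.Series) -> pd.Series:
    """Replace an 'E' that directly follows an 'R' with 'Q' (hypothetical helper)."""
    follows_reset = flags.shift(1) == "R"  # previous timestep was a reset
    is_e = flags == "E"
    return flags.mask(is_e & follows_reset, "Q")


# Toy flags: the first E follows an R, the second one does not.
idx = pd.date_range("2021-07-03 18:45", periods=5, freq="5min")
prov = pd.Series([None, "R", "E", "W", "E"], index=idx, dtype="object")
fixed = requalify_rebound_e(prov)
print(fixed)
```

Only the E right after the R becomes a Q; an E anywhere else is left alone so it can be handled separately.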
Separate Tank Flags¶
Let’s try to make this clearer. Currently the tank flags are added to the provisional flagging string. Let’s give them their own column and try to interpret these GCE flags.
[280]:
import sys
del sys.modules['post_gce_qc.qaqc']
del sys.modules['post_gce_qc']
del sys.modules['post_gce_qc.main']
del all_flags
from post_gce_qc import qaqc, main
[281]:
all_flags = main.main(2019, 2024, probes={'all_params'}, data_path='../config_new.yaml', qa_params='../qa_param.yaml',
fname_base='MS00413_PPT_L1_5min_', write_csv=False)
Loading all PPT data from ../config_new.yaml
Load data from VAR_02
VAR_02: All quality checks and quality assurance rules applied
------------------
Load data from UPL_01
UPL_01: All quality checks and quality assurance rules applied
------------------
Load data from UPL_02
UPL_02: All quality checks and quality assurance rules applied
------------------
Load data from UPL_04
149: UserWarning: No existing flags found. qaqc.ApplyFlags.apply_GCE_flags was designed to fill in where there are not other flags. Consider running qaqc.ApplyFlags.apply_QaRules_flags first.
UPL_04: All quality checks and quality assurance rules applied
------------------
Load data from CEN_01
CEN_01: All quality checks and quality assurance rules applied
------------------
Load data from CEN_02
CEN_02: All quality checks and quality assurance rules applied
------------------
Load data from CEN_04
149: UserWarning: No existing flags found. qaqc.ApplyFlags.apply_GCE_flags was designed to fill in where there are not other flags. Consider running qaqc.ApplyFlags.apply_QaRules_flags first.
CEN_04: All quality checks and quality assurance rules applied
------------------
Load data from CS2_02
CS2_02: All quality checks and quality assurance rules applied
------------------
Load data from PRI_03
PRI_03: All quality checks and quality assurance rules applied
------------------
Load data from PRI_01
149: UserWarning: No existing flags found. qaqc.ApplyFlags.apply_GCE_flags was designed to fill in where there are not other flags. Consider running qaqc.ApplyFlags.apply_QaRules_flags first.
PRI_01: All quality checks and quality assurance rules applied
------------------
Load data from H15_02
H15_02: All quality checks and quality assurance rules applied
------------------
Load data from GSM_02
GSM_02: All quality checks and quality assurance rules applied
------------------
Generating cross probe tables
292: UserWarning: Precip set to 0 without E flag or manual flag. E flag added
340: UserWarning: More than one flag assigned at the same time. Only one flag is retained by precedence.
340: UserWarning: More than one flag assigned at the same time. Only one flag is retained by precedence.
Performing cross probe and final checks on CEN_01
Performing cross probe and final checks on CEN_02
340: UserWarning: More than one flag assigned at the same time. Only one flag is retained by precedence.
Performing cross probe and final checks on CS2_02
340: UserWarning: More than one flag assigned at the same time. Only one flag is retained by precedence.
Performing cross probe and final checks on PRI_03
340: UserWarning: More than one flag assigned at the same time. Only one flag is retained by precedence.
[287]:
U = all_flags['CEN_01'].flags.U
u = all_flags['CEN_01'].event.QaRule_flag.str.contains('U')
all_flags['CEN_01'].event[u]
[287]:
| prov_flag | tank_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|---|
| Date | |||||||
| 2018-12-18 06:50:00 | <NA> | M | MU | U | CLOG | ManualFlag: remove gce m flag during clog period to include rapid unclog and catch up of cumulative precip.; QaRule AutoFlag: clog; | |
| 2018-12-18 06:55:00 | <NA> | M | MU | U | CLOG | ManualFlag: remove gce m flag during clog period to include rapid unclog and catch up of cumulative precip.; QaRule AutoFlag: clog; | |
| 2018-12-18 07:00:00 | <NA> | M | MU | U | CLOG | ManualFlag: remove gce m flag during clog period to include rapid unclog and catch up of cumulative precip.; QaRule AutoFlag: clog; | |
| 2018-12-18 07:05:00 | <NA> | M | MU | U | CLOG | ManualFlag: remove gce m flag during clog period to include rapid unclog and catch up of cumulative precip.; QaRule AutoFlag: clog; | |
| 2018-12-18 07:10:00 | <NA> | M | MU | U | CLOG | ManualFlag: remove gce m flag during clog period to include rapid unclog and catch up of cumulative precip.; QaRule AutoFlag: clog; | |
| 2018-12-18 07:15:00 | <NA> | M | MU | U | CLOG | ManualFlag: remove gce m flag during clog period to include rapid unclog and catch up of cumulative precip.; QaRule AutoFlag: clog; | |
| 2018-12-18 07:20:00 | <NA> | M | MU | U | CLOG | ManualFlag: remove gce m flag during clog period to include rapid unclog and catch up of cumulative precip.; QaRule AutoFlag: clog; | |
| 2018-12-18 07:25:00 | <NA> | M | MU | U | CLOG | ManualFlag: remove gce m flag during clog period to include rapid unclog and catch up of cumulative precip.; QaRule AutoFlag: clog; | |
| 2018-12-18 07:30:00 | <NA> | M | MU | U | CLOG | ManualFlag: remove gce m flag during clog period to include rapid unclog and catch up of cumulative precip.; QaRule AutoFlag: clog; | |
| 2018-12-18 07:35:00 | <NA> | M | MU | U | CLOG | ManualFlag: remove gce m flag during clog period to include rapid unclog and catch up of cumulative precip.; QaRule AutoFlag: clog; | |
| 2018-12-18 07:40:00 | <NA> | M | MU | U | CLOG | ManualFlag: remove gce m flag during clog period to include rapid unclog and catch up of cumulative precip.; QaRule AutoFlag: clog; | |
| 2018-12-18 07:45:00 | <NA> | M | MU | U | CLOG | ManualFlag: remove gce m flag during clog period to include rapid unclog and catch up of cumulative precip.; QaRule AutoFlag: clog; | |
| 2018-12-18 07:50:00 | <NA> | M | MU | U | CLOG | ManualFlag: remove gce m flag during clog period to include rapid unclog and catch up of cumulative precip.; QaRule AutoFlag: clog; | |
| 2018-12-18 07:55:00 | <NA> | M | MU | U | CLOG | ManualFlag: remove gce m flag during clog period to include rapid unclog and catch up of cumulative precip.; QaRule AutoFlag: clog; | |
| 2018-12-18 08:00:00 | <NA> | M | MU | U | CLOG | ManualFlag: remove gce m flag during clog period to include rapid unclog and catch up of cumulative precip.; QaRule AutoFlag: clog; | |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 2024-04-04 07:55:00 | <NA> | T | U | U | CLOG | QaRule AutoFlag: clog; | |
| 2024-04-04 08:00:00 | <NA> | T | U | U | CLOG | QaRule AutoFlag: clog; | |
| 2024-04-04 08:05:00 | <NA> | T | U | U | CLOG | QaRule AutoFlag: clog; | |
| 2024-04-04 08:10:00 | <NA> | T | U | U | CLOG | QaRule AutoFlag: clog; | |
| 2024-04-04 08:15:00 | <NA> | T | U | U | CLOG | QaRule AutoFlag: clog; | |
| 2024-04-04 08:20:00 | <NA> | T | U | U | CLOG | QaRule AutoFlag: clog; | |
| 2024-04-04 08:25:00 | <NA> | T | U | U | CLOG | QaRule AutoFlag: clog; | |
| 2024-04-04 08:40:00 | <NA> | T | U | U | CLOG | QaRule AutoFlag: clog; | |
| 2024-04-04 08:45:00 | <NA> | T | U | U | CLOG | QaRule AutoFlag: clog; | |
| 2024-04-04 13:25:00 | <NA> | T | U | U | CLOG | QaRule AutoFlag: clog; | |
| 2024-04-04 13:30:00 | <NA> | T | U | U | CLOG | QaRule AutoFlag: clog; | |
| 2024-04-04 13:35:00 | <NA> | T | U | U | CLOG | QaRule AutoFlag: clog; | |
| 2024-04-04 13:40:00 | <NA> | T | U | U | CLOG | QaRule AutoFlag: clog; | |
| 2024-04-04 13:45:00 | <NA> | T | U | U | CLOG | QaRule AutoFlag: clog; | |
| 2024-04-04 13:50:00 | <NA> | T | U | U | CLOG | QaRule AutoFlag: clog; |
3017 rows × 7 columns
That looks like it worked and flags are being assigned correctly. Let’s get back to VARA flags as our example.
[290]:
all_flags['VAR_02'].event['prov_flag'].unique()
[290]:
<ArrowExtensionArray>
[<NA>, 'W', 'R', 'J', 'MM', 'Q', 'E', 'M']
Length: 8, dtype: string[pyarrow]
[291]:
all_flags['VAR_02'].event['tank_flag'].unique()
[291]:
<ArrowExtensionArray>
[<NA>, 'R', 'Q', 'T', 'M']
Length: 5, dtype: string[pyarrow]
That’s an easier set of flags to track down. Let’s try to make some sense of it.
E Flags¶
In theory, this means a value is being estimated. So far, this seems to be limited to the crazy periods where the probe at VARA was dead and giving erratic numbers. But let’s see if we can figure out what’s going on.
Precip Flag¶
[321]:
e = all_flags['VAR_02'].event['prov_flag'].str.contains('E')
all_flags['VAR_02'].event[e]
[321]:
| prov_flag | tank_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|---|
| Date | |||||||
| 2020-09-01 17:10:00 | E | Q | E | E | INTPRO | ApplyFlags AutoFlag: prorate precip during diurnal tank fluctuations; | |
| 2020-11-07 18:35:00 | E | T | E | M | M | QaRule AutoFlag: diurnal_flux; ManualFlag: sensor failure. rain gauge capped. data removed.; | |
| 2020-12-24 02:50:00 | E | T | E | M | M | QaRule AutoFlag: diurnal_flux; ManualFlag: sensor failure. rain gauge capped. data removed.; | |
| 2021-02-04 05:10:00 | E | T | E | M | M | QaRule AutoFlag: diurnal_flux; ManualFlag: sensor failure. rain gauge capped. data removed.; | |
| 2021-05-05 12:45:00 | E | R | M | M | ManualFlag: sensor failure. rain gauge capped. data removed.; | ||
| 2021-05-26 14:25:00 | E | Q | M | M | ManualFlag: sensor failure. rain gauge capped. data removed.; | ||
| 2021-06-25 10:10:00 | E | Q | M | M | ManualFlag: sensor failure. rain gauge capped. data removed.; | ||
| 2021-07-03 18:55:00 | E | R | E | M | M | QaRule AutoFlag: diurnal_flux; ManualFlag: sensor failure. rain gauge capped. data removed.; |
[322]:
for s in range(-2,3):
    e |= e.shift(s)
df[e]
4: UserWarning: Boolean Series key will be reindexed to match DataFrame index.
[322]:
| INST | INST_Flag | TOT | TOT_Flag | ACC | ACC_Flag | |
|---|---|---|---|---|---|---|
| Date | ||||||
| 2020-09-01 16:55:00 | 34.139999 | Q | 0.0 | <NA> | 5017.830078 | <NA> |
| 2020-09-01 17:00:00 | 33.759998 | Q | 0.0 | R | 5017.830078 | R |
| 2020-09-01 17:05:00 | 28.889999 | Q | 0.0 | R | 5017.830078 | R |
| 2020-09-01 17:10:00 | 42.790001 | Q | 13.9 | E | 5031.72998 | E |
| 2020-09-01 17:15:00 | 27.129999 | Q | 13.9 | <NA> | 5045.629883 | <NA> |
| 2020-09-01 17:20:00 | 27.120001 | Q | 13.9 | <NA> | 5059.529785 | <NA> |
| 2020-09-01 17:25:00 | 27.379999 | Q | 13.9 | <NA> | 5073.430176 | <NA> |
| 2020-11-07 18:20:00 | 657.900024 | T | 0.0 | <NA> | 24173.300781 | <NA> |
| 2020-11-07 18:25:00 | 660.0 | T | 0.0 | <NA> | 24173.300781 | <NA> |
| 2020-11-07 18:30:00 | 652.099976 | T | 0.0 | R | 24173.300781 | R |
| 2020-11-07 18:35:00 | 658.099976 | T | 6.0 | E | 24179.300781 | E |
| 2020-11-07 18:40:00 | 659.299988 | T | 7.2 | W | 24186.5 | W |
| 2020-11-07 18:45:00 | 657.799988 | T | 0.0 | <NA> | 24186.5 | <NA> |
| 2020-11-07 18:50:00 | 659.0 | T | 0.0 | <NA> | 24186.5 | <NA> |
| 2020-12-24 02:35:00 | 678.5 | T | 0.0 | <NA> | 30933.800781 | <NA> |
| 2020-12-24 02:40:00 | 677.0 | T | 0.0 | <NA> | 30933.800781 | <NA> |
| 2020-12-24 02:45:00 | 671.599976 | T | 0.0 | R | 30933.800781 | R |
| 2020-12-24 02:50:00 | 677.0 | T | 5.4 | E | 30939.199219 | E |
| 2020-12-24 02:55:00 | 678.5 | T | 6.9 | W | 30946.099609 | W |
| 2020-12-24 03:00:00 | 677.5 | T | 0.0 | <NA> | 30946.099609 | <NA> |
| 2020-12-24 03:05:00 | 677.799988 | T | 0.0 | <NA> | 30946.099609 | <NA> |
| 2021-02-04 04:55:00 | 728.900024 | T | 0.0 | <NA> | 34103.601562 | <NA> |
| 2021-02-04 05:00:00 | 728.900024 | T | 0.0 | <NA> | 34103.601562 | <NA> |
| 2021-02-04 05:05:00 | 724.400024 | T | 0.0 | R | 34103.601562 | R |
| 2021-02-04 05:10:00 | 729.0 | T | 4.6 | E | 34108.199219 | E |
| 2021-02-04 05:15:00 | 729.900024 | T | 5.5 | W | 34113.699219 | W |
| 2021-02-04 05:20:00 | 728.900024 | T | 0.0 | <NA> | 34113.699219 | <NA> |
| 2021-02-04 05:25:00 | 729.700012 | T | 0.0 | <NA> | 34113.699219 | <NA> |
| 2021-05-05 12:30:00 | 778.0 | Q | 0.0 | <NA> | 35212.0 | <NA> |
| 2021-05-05 12:35:00 | 778.0 | Q | 0.0 | <NA> | 35212.0 | <NA> |
| 2021-05-05 12:40:00 | 532.400024 | R | 0.0 | R | 35212.0 | R |
| 2021-05-05 12:45:00 | 604.599976 | R | 72.199997 | E | 35284.199219 | E |
| 2021-05-05 12:50:00 | 600.799988 | Q | 68.400002 | W | 35352.601562 | W |
| 2021-05-05 12:55:00 | 598.299988 | Q | 0.0 | <NA> | 35352.601562 | <NA> |
| 2021-05-05 13:00:00 | 655.0 | R | 54.200001 | J | 35406.800781 | J |
| 2021-05-26 14:10:00 | 779.0 | Q | 0.0 | <NA> | 38058.859375 | <NA> |
| 2021-05-26 14:15:00 | 778.900024 | Q | 0.0 | <NA> | 38058.859375 | <NA> |
| 2021-05-26 14:20:00 | 16.360001 | R | 0.0 | R | 38058.859375 | R |
| 2021-05-26 14:25:00 | 729.700012 | Q | 713.340027 | E | 38772.199219 | E |
| 2021-05-26 14:30:00 | 735.299988 | Q | 718.940002 | W | 39491.140625 | W |
| 2021-05-26 14:35:00 | 725.299988 | Q | 0.0 | R | 39491.140625 | R |
| 2021-05-26 14:40:00 | 736.599976 | Q | 11.3 | <NA> | 39502.441406 | <NA> |
| 2021-06-25 09:55:00 | 701.599976 | Q | 0.3 | <NA> | 65285.058594 | <NA> |
| 2021-06-25 10:00:00 | 701.099976 | Q | 0.3 | <NA> | 65285.359375 | <NA> |
| 2021-06-25 10:05:00 | 696.900024 | Q | 0.3 | R | 65285.660156 | R |
| 2021-06-25 10:10:00 | 701.599976 | Q | 4.7 | E | 65290.359375 | E |
| 2021-06-25 10:15:00 | 704.700012 | Q | 7.8 | W | 65298.160156 | W |
| 2021-06-25 10:20:00 | 702.700012 | Q | 0.0 | <NA> | 65298.160156 | <NA> |
| 2021-06-25 10:25:00 | 700.5 | Q | 0.0 | R | 65298.160156 | R |
| 2021-07-03 18:40:00 | 430.899994 | Q | 0.0 | <NA> | 75933.601562 | <NA> |
| 2021-07-03 18:45:00 | 432.399994 | Q | 0.0 | <NA> | 75933.601562 | <NA> |
| 2021-07-03 18:50:00 | 427.100006 | Q | 0.0 | R | 75933.601562 | R |
| 2021-07-03 18:55:00 | 489.399994 | R | 62.299999 | E | 75995.898438 | E |
| 2021-07-03 19:00:00 | 482.899994 | Q | 55.799999 | W | 76051.703125 | W |
| 2021-07-03 19:05:00 | 489.299988 | Q | 6.4 | <NA> | 76058.101562 | <NA> |
| 2021-07-03 19:10:00 | 485.600006 | Q | 0.0 | R | 76058.101562 | R |
All of the E’s follow R’s, which we dealt with above.
Tank Flag¶
Now let’s see about where the tank is flagged E.
[323]:
e = all_flags['VAR_02'].event['tank_flag'].str.contains('E')
all_flags['VAR_02'].event[e]
[323]:
| prov_flag | tank_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|---|
| Date |
OK, so we aren’t seeing this anywhere. None of the other sites have any E’s and all of the ones at VARA show up after an R, which is already dealt with.
The only place it used to show up was in filled data. GCE used to fill missing data by linear interpolation between the last known tank value and the current value. This practice has ended, but in case the code gets turned back on at some point, all E’s will be turned to Missing. The E’s following R’s will already have been turned to Q’s.
Q Flag¶
VAR_02¶
[324]:
man = all_flags['VAR_02'].event['manual_flag'] != ''
qar = all_flags['VAR_02'].event['QaRule_flag'] != ''
q = all_flags['VAR_02'].event['prov_flag'].str.contains('Q')
all_flags['VAR_02'].event[q & ~qar & ~man]
[324]:
| prov_flag | tank_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|---|
| Date | |||||||
| 2022-10-01 00:10:00 | Q | Q | Q | ||||
| 2022-10-25 14:05:00 | Q | R | Q | ||||
| 2022-12-15 10:25:00 | Q | R | Q | ||||
| 2023-01-20 10:40:00 | Q | Q | Q | ||||
| 2023-12-05 15:20:00 | Q | Q | Q | ||||
| 2024-03-07 11:20:00 | Q | R | Q |
[326]:
soloq = q & ~qar & ~man
for s in range(-2,3):
    soloq |= soloq.shift(s)
all_flags['VAR_02'].event[soloq]
[326]:
| prov_flag | tank_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|---|
| Date | |||||||
| 2022-09-30 23:55:00 | <NA> | Q | |||||
| 2022-10-01 00:00:00 | <NA> | Q | |||||
| 2022-10-01 00:05:00 | <NA> | Q | |||||
| 2022-10-01 00:10:00 | Q | Q | Q | ||||
| 2022-10-01 00:15:00 | <NA> | Q | |||||
| 2022-10-01 00:20:00 | <NA> | Q | |||||
| 2022-10-01 00:25:00 | <NA> | Q | |||||
| 2022-10-25 13:50:00 | <NA> | Q | |||||
| 2022-10-25 13:55:00 | <NA> | Q | |||||
| 2022-10-25 14:00:00 | J | Q | |||||
| 2022-10-25 14:05:00 | Q | R | Q | ||||
| 2022-10-25 14:10:00 | <NA> | Q | |||||
| 2022-10-25 14:15:00 | <NA> | Q | |||||
| 2022-10-25 14:20:00 | <NA> | Q | |||||
| 2022-12-15 10:10:00 | <NA> | Q | |||||
| 2022-12-15 10:15:00 | <NA> | Q | |||||
| 2022-12-15 10:20:00 | J | Q | |||||
| 2022-12-15 10:25:00 | Q | R | Q | ||||
| 2022-12-15 10:30:00 | <NA> | Q | |||||
| 2022-12-15 10:35:00 | <NA> | Q | |||||
| 2022-12-15 10:40:00 | <NA> | Q | |||||
| 2023-01-20 10:25:00 | <NA> | Q | |||||
| 2023-01-20 10:30:00 | <NA> | Q | |||||
| 2023-01-20 10:35:00 | J | R | |||||
| 2023-01-20 10:40:00 | Q | Q | Q | ||||
| 2023-01-20 10:45:00 | <NA> | Q | |||||
| 2023-01-20 10:50:00 | <NA> | Q | |||||
| 2023-01-20 10:55:00 | <NA> | Q | |||||
| 2023-12-05 15:05:00 | <NA> | Q | |||||
| 2023-12-05 15:10:00 | <NA> | Q | |||||
| 2023-12-05 15:15:00 | J | Q | |||||
| 2023-12-05 15:20:00 | Q | Q | Q | ||||
| 2023-12-05 15:25:00 | <NA> | Q | |||||
| 2023-12-05 15:30:00 | R | R | |||||
| 2023-12-05 15:35:00 | <NA> | Q | |||||
| 2024-03-07 11:05:00 | <NA> | Q | |||||
| 2024-03-07 11:10:00 | <NA> | Q | |||||
| 2024-03-07 11:15:00 | J | Q | |||||
| 2024-03-07 11:20:00 | Q | R | Q | ||||
| 2024-03-07 11:25:00 | <NA> | R | |||||
| 2024-03-07 11:30:00 | <NA> | Q | |||||
| 2024-03-07 11:35:00 | <NA> | Q |
[327]:
all_flags['VAR_02'].data[soloq]
[327]:
| tank_height | precip | adj_precip | |
|---|---|---|---|
| Date | |||
| 2022-09-30 23:55:00 | 36.279999 | 0.0 | 0.0 |
| 2022-10-01 00:00:00 | 36.27 | 0.0 | 0.0 |
| 2022-10-01 00:05:00 | 36.27 | 0.0 | 0.0 |
| 2022-10-01 00:10:00 | 36.279999 | 0.01 | 0.0 |
| 2022-10-01 00:15:00 | 36.290001 | 0.02 | 0.0 |
| 2022-10-01 00:20:00 | 36.290001 | 0.0 | 0.0 |
| 2022-10-01 00:25:00 | 36.290001 | 0.0 | 0.0 |
| 2022-10-25 13:50:00 | 56.869999 | 0.0 | 0.0 |
| 2022-10-25 13:55:00 | 56.869999 | 0.0 | 0.0 |
| 2022-10-25 14:00:00 | 74.809998 | 17.940001 | 17.800001 |
| 2022-10-25 14:05:00 | 115.900002 | 41.09 | 41.200001 |
| 2022-10-25 14:10:00 | 117.099998 | 1.2 | 1.2 |
| 2022-10-25 14:15:00 | 117.0 | 0.0 | 0.0 |
| 2022-10-25 14:20:00 | 117.099998 | 0.0 | 0.0 |
| 2022-12-15 10:10:00 | 66.93 | 0.0 | 0.0 |
| 2022-12-15 10:15:00 | 66.970001 | 0.0 | 0.0 |
| 2022-12-15 10:20:00 | 77.400002 | 10.28 | 10.2 |
| 2022-12-15 10:25:00 | 221.100006 | 143.699997 | 143.600006 |
| 2022-12-15 10:30:00 | 221.100006 | 0.0 | 0.0 |
| 2022-12-15 10:35:00 | 221.100006 | 0.0 | 0.0 |
| 2022-12-15 10:40:00 | 221.100006 | 0.0 | 0.0 |
| 2023-01-20 10:25:00 | 169.800003 | 0.0 | 0.0 |
| 2023-01-20 10:30:00 | 169.800003 | 0.0 | 0.0 |
| 2023-01-20 10:35:00 | 225.199997 | 55.200001 | 55.200001 |
| 2023-01-20 10:40:00 | 231.300003 | 6.1 | 6.0 |
| 2023-01-20 10:45:00 | 234.199997 | 2.9 | 3.0 |
| 2023-01-20 10:50:00 | 236.5 | 2.3 | 2.2 |
| 2023-01-20 10:55:00 | 236.5 | 0.0 | 0.0 |
| 2023-12-05 15:05:00 | 221.100006 | 0.0 | 0.0 |
| 2023-12-05 15:10:00 | 221.100006 | 0.0 | 0.0 |
| 2023-12-05 15:15:00 | 231.800003 | 10.6 | 10.6 |
| 2023-12-05 15:20:00 | 247.100006 | 15.3 | 15.2 |
| 2023-12-05 15:25:00 | 247.100006 | 0.0 | 0.0 |
| 2023-12-05 15:30:00 | 8.71 | 0.0 | 0.0 |
| 2023-12-05 15:35:00 | 8.71 | 0.0 | 0.0 |
| 2024-03-07 11:05:00 | 223.399994 | 0.0 | 0.0 |
| 2024-03-07 11:10:00 | 223.300003 | 0.0 | 0.0 |
| 2024-03-07 11:15:00 | 232.199997 | 8.7 | 8.6 |
| 2024-03-07 11:20:00 | 288.100006 | 55.900002 | 56.0 |
| 2024-03-07 11:25:00 | 345.200012 | 57.099998 | 57.0 |
| 2024-03-07 11:30:00 | 344.5 | 0.0 | 0.0 |
| 2024-03-07 11:35:00 | 344.5 | 0.0 | 0.0 |
[328]:
df[soloq]
1: UserWarning: Boolean Series key will be reindexed to match DataFrame index.
[328]:
| INST | INST_Flag | TOT | TOT_Flag | ACC | ACC_Flag | |
|---|---|---|---|---|---|---|
| Date | ||||||
| 2022-09-30 23:55:00 | 36.279999 | Q | 0.0 | <NA> | 495.049988 | <NA> |
| 2022-10-01 00:00:00 | 36.27 | Q | 0.0 | <NA> | 495.049988 | <NA> |
| 2022-10-01 00:05:00 | 36.27 | Q | 0.0 | <NA> | 0.0 | <NA> |
| 2022-10-01 00:10:00 | 36.279999 | Q | 0.01 | Q | 0.01 | Q |
| 2022-10-01 00:15:00 | 36.290001 | Q | 0.02 | <NA> | 0.03 | <NA> |
| 2022-10-01 00:20:00 | 36.290001 | Q | 0.0 | <NA> | 0.03 | <NA> |
| 2022-10-01 00:25:00 | 36.290001 | Q | 0.0 | <NA> | 0.03 | <NA> |
| 2022-10-25 13:50:00 | 56.869999 | Q | 0.0 | <NA> | 26.83 | <NA> |
| 2022-10-25 13:55:00 | 56.869999 | Q | 0.0 | <NA> | 26.83 | <NA> |
| 2022-10-25 14:00:00 | 74.809998 | Q | 17.940001 | J | 44.77 | J |
| 2022-10-25 14:05:00 | 115.900002 | R | 41.09 | Q | 85.860001 | Q |
| 2022-10-25 14:10:00 | 117.099998 | Q | 1.2 | <NA> | 87.059998 | <NA> |
| 2022-10-25 14:15:00 | 117.0 | Q | 0.0 | <NA> | 87.059998 | <NA> |
| 2022-10-25 14:20:00 | 117.099998 | Q | 0.0 | <NA> | 87.059998 | <NA> |
| 2022-12-15 10:10:00 | 66.93 | Q | 0.0 | <NA> | 420.399994 | <NA> |
| 2022-12-15 10:15:00 | 66.970001 | Q | 0.0 | <NA> | 420.399994 | <NA> |
| 2022-12-15 10:20:00 | 77.400002 | Q | 10.28 | J | 430.679993 | J |
| 2022-12-15 10:25:00 | 221.100006 | R | 143.699997 | Q | 574.380005 | Q |
| 2022-12-15 10:30:00 | 221.100006 | Q | 0.0 | <NA> | 574.380005 | <NA> |
| 2022-12-15 10:35:00 | 221.100006 | Q | 0.0 | <NA> | 574.380005 | <NA> |
| 2022-12-15 10:40:00 | 221.100006 | Q | 0.0 | <NA> | 574.380005 | <NA> |
| 2023-01-20 10:25:00 | 169.800003 | Q | 0.0 | <NA> | 743.960022 | <NA> |
| 2023-01-20 10:30:00 | 169.800003 | Q | 0.0 | <NA> | 743.960022 | <NA> |
| 2023-01-20 10:35:00 | 225.199997 | R | 55.200001 | J | 799.159973 | J |
| 2023-01-20 10:40:00 | 231.300003 | Q | 6.1 | Q | 805.26001 | Q |
| 2023-01-20 10:45:00 | 234.199997 | Q | 2.9 | <NA> | 808.159973 | <NA> |
| 2023-01-20 10:50:00 | 236.5 | Q | 2.3 | <NA> | 810.460022 | <NA> |
| 2023-01-20 10:55:00 | 236.5 | Q | 0.0 | <NA> | 810.460022 | <NA> |
| 2023-12-05 15:05:00 | 221.100006 | Q | 0.0 | <NA> | 667.450012 | <NA> |
| 2023-12-05 15:10:00 | 221.100006 | Q | 0.0 | <NA> | 667.450012 | <NA> |
| 2023-12-05 15:15:00 | 231.800003 | Q | 10.6 | J | 678.049988 | J |
| 2023-12-05 15:20:00 | 247.100006 | Q | 15.3 | Q | 693.349976 | Q |
| 2023-12-05 15:25:00 | 247.100006 | Q | 0.0 | <NA> | 693.349976 | <NA> |
| 2023-12-05 15:30:00 | 8.71 | R | 0.0 | R | 693.349976 | R |
| 2023-12-05 15:35:00 | 8.71 | Q | 0.0 | <NA> | 693.349976 | <NA> |
| 2024-03-07 11:05:00 | 223.399994 | Q | 0.0 | <NA> | 1608.079956 | <NA> |
| 2024-03-07 11:10:00 | 223.300003 | Q | 0.0 | <NA> | 1608.079956 | <NA> |
| 2024-03-07 11:15:00 | 232.199997 | Q | 8.7 | J | 1616.780029 | J |
| 2024-03-07 11:20:00 | 288.100006 | R | 55.900002 | Q | 1672.680054 | Q |
| 2024-03-07 11:25:00 | 345.200012 | R | 57.099998 | <NA> | 1729.780029 | <NA> |
| 2024-03-07 11:30:00 | 344.5 | Q | 0.0 | <NA> | 1729.780029 | <NA> |
| 2024-03-07 11:35:00 | 344.5 | Q | 0.0 | <NA> | 1729.780029 | <NA> |
OK, so looking only at places where no other flag was assigned, all but one of the examples should hopefully be caught by the clog cross-probe QA once it is set up for this site.
One thing that does stand out here is that INST has a constant Q flag. This is from a manual flag Adam added:
col_Date>=datenum('10/01/2022 00:00:00')&col_Date<=datenum('09/19/2024 14:15:00')='Q'
Most of this period will have U and C flags with a CLOG event, but if I apply the GCE Q’s to the data, the whole period will be filled with Q wherever there isn’t a U or C. For example, rain-free periods will still have a CLOG event code but no flag, unless I fill from GCE. Then all the rain-free periods will be filled with Q’s.
Since the manual flags are on the tank, the carefully curated manual flags will be lost unless tank flags are included. On the other hand, this kind of fills in the more sophisticated clog analysis being performed here.
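To make that trade-off concrete, here is a small sketch of what filling from the tank flag would do. The column names match the event table, but the six rows are invented, and the fill rule (use the tank Q only where final_flag is empty) is my assumption about how the fill would work, not the existing apply_GCE_flags behaviour.

```python
import pandas as pd

# Invented slice of a clog period: a blanket Q on the tank,
# with U and C from the clog check on only some timesteps.
idx = pd.date_range("2022-10-01 00:00", periods=6, freq="5min")
event = pd.DataFrame(
    {
        "tank_flag": ["Q"] * 6,
        "final_flag": ["", "U", "U", "C", "", ""],
    },
    index=idx,
)

# Fill from the tank flag only where nothing else has been assigned.
unflagged = event["final_flag"] == ""
tank_q = event["tank_flag"] == "Q"
filled = event["final_flag"].mask(unflagged & tank_q, "Q")
print(filled)
```

The U and C flags survive, but every rain-free timestep inside the period picks up a Q.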
Let’s look at a different example and see if anything stands out.
CEN_01¶
[329]:
site = 'CEN_01'
man = all_flags[site].event['manual_flag'] != ''
qar = all_flags[site].event['QaRule_flag'] != ''
q = all_flags[site].event['prov_flag'].str.contains('Q')
all_flags[site].event[q & ~qar & ~man]
[329]:
| prov_flag | tank_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|---|
| Date |
[349]:
site = 'CEN_01'
man = all_flags[site].event['manual_flag'] != ''
qar = all_flags[site].event['QaRule_flag'] != ''
q = all_flags[site].event['tank_flag'].str.contains('Q')
all_flags[site].event[q & ~qar & ~man]
[349]:
| prov_flag | tank_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|---|
| Date | |||||||
| 2019-09-07 19:00:00 | <NA> | Q | |||||
| 2019-09-07 19:05:00 | <NA> | Q | |||||
| 2019-09-07 19:10:00 | <NA> | Q | |||||
| 2019-09-07 19:15:00 | <NA> | Q | |||||
| 2019-09-07 19:20:00 | <NA> | Q | |||||
| 2019-09-07 19:25:00 | <NA> | Q | |||||
| 2019-09-07 19:30:00 | <NA> | Q | |||||
| 2019-09-07 19:35:00 | <NA> | Q | |||||
| 2019-09-07 19:40:00 | <NA> | Q | |||||
| 2019-09-07 19:45:00 | <NA> | Q | |||||
| 2019-09-07 19:50:00 | <NA> | Q | |||||
| 2019-09-07 19:55:00 | <NA> | Q | |||||
| 2019-09-07 20:00:00 | <NA> | Q | |||||
| 2019-09-07 20:05:00 | <NA> | Q | |||||
| 2019-09-07 20:10:00 | <NA> | Q | |||||
| ... | ... | ... | ... | ... | ... | ... | ... |
| 2019-09-08 12:05:00 | <NA> | Q | |||||
| 2019-09-08 12:10:00 | <NA> | Q | |||||
| 2019-09-08 12:15:00 | <NA> | Q | |||||
| 2019-09-08 12:20:00 | <NA> | Q | |||||
| 2019-09-08 12:25:00 | <NA> | Q | |||||
| 2019-09-08 12:30:00 | <NA> | Q | |||||
| 2019-09-08 12:35:00 | <NA> | Q | |||||
| 2019-09-08 12:40:00 | <NA> | Q | |||||
| 2019-09-08 12:45:00 | <NA> | Q | |||||
| 2019-09-08 12:50:00 | <NA> | Q | |||||
| 2019-09-08 12:55:00 | <NA> | Q | |||||
| 2019-09-08 13:00:00 | <NA> | Q | |||||
| 2019-09-08 13:05:00 | <NA> | Q | |||||
| 2019-09-08 13:10:00 | <NA> | Q | |||||
| 2019-09-08 13:15:00 | J | Q |
220 rows × 7 columns
OK, so all of these Q’s on the tank come from one manual flag:
col_Date>=datenum('09/07/2019 19:00:00')&col_Date<=datenum('09/08/2019 13:15:00')='Q'
I can’t find any notes or Bitbucket issue to go with it. It does raise the question, however, of how this manual flag relates to the precip flags. This site specifically makes it look like the tank flags can be ignored, but let’s look at another site.
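One quick way to confirm that the Q’s come from that single manual flag is to rebuild the flagged range in pandas and compare it against the rows found above. This is only a sketch: matches_manual_range is a made-up helper, and it assumes a regular 5-minute timestep with no gaps.

```python
import pandas as pd


def matches_manual_range(flagged_index, start, end, freq="5min"):
    """True if flagged_index is exactly the regular range start..end (hypothetical helper)."""
    expected = pd.date_range(start, end, freq=freq)
    return flagged_index.equals(expected)


# Range taken from the GCE manual flag expression for CEN_01.
start = pd.Timestamp("2019-09-07 19:00:00")
end = pd.Timestamp("2019-09-08 13:15:00")
expected = pd.date_range(start, end, freq="5min")
n_expected = len(expected)
print(n_expected)

# Stand-ins for the index of the filtered event table.
print(matches_manual_range(expected, start, end))       # complete range
print(matches_manual_range(expected[:-1], start, end))  # one timestep missing
```

The range holds 220 five-minute timesteps, which agrees with the 220 rows in the table above.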
CEN_02¶
[336]:
site = 'CEN_02'
man = all_flags[site].event['manual_flag'] != ''
qar = all_flags[site].event['QaRule_flag'] != ''
q = all_flags[site].event['prov_flag'].str.contains('Q')
all_flags[site].event[q & ~qar & ~man]
[336]:
| prov_flag | tank_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|---|
| Date | |||||||
| 2018-10-01 00:10:00 | Q | <NA> | Q |
[348]:
del soloq
soloq = q & ~qar & ~man
for s in range(-4,3):
    soloq |= soloq.shift(s)
all_flags[site].event[soloq]
[348]:
| prov_flag | tank_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|---|
| Date | |||||||
| 2018-10-01 00:05:00 | <NA> | <NA> | |||||
| 2018-10-01 00:10:00 | Q | <NA> | Q | ||||
| 2018-10-01 00:15:00 | <NA> | <NA> | |||||
| 2018-10-01 00:20:00 | <NA> | <NA> | |||||
| 2018-10-01 00:25:00 | <NA> | <NA> |
[338]:
all_flags[site].data[soloq]
[338]:
| tank_height | precip | adj_precip | |
|---|---|---|---|
| Date | |||
| 2018-10-01 00:05:00 | 43.990002 | 0.0 | 0.0 |
| 2018-10-01 00:10:00 | 44.32 | 0.33 | 0.0 |
| 2018-10-01 00:15:00 | 44.150002 | 0.16 | 0.4 |
| 2018-10-01 00:20:00 | 44.150002 | 0.0 | 0.0 |
| 2018-10-01 00:25:00 | 43.990002 | 0.0 | 0.0 |
Well, that looks like sensor bounce that should be thrown out… but the Q is on the wrong timestep after the rounding. This seems to come from this chunk of code, which runs before the loop starts, when there are only 2 measurements:
% If the difference is small
if Diff2 <= 0 && abs(Diff2)/10 < abs(Diff) + 0.1
    % Flag as acceptable
    Flag2 = '';
    % Do not count negative to corrected value, CorrDiff2 is 0
    CorrDiff2 = 0;
    % Baseline 2 represents the small loss
    Baseline2 = Baseline + Diff2;
    % Recent Diffs includes this small loss.
    Recent_Diffs = Recent_Diffs + Diff2;
%%% If all these cases fail (for example a NaN in the data, which is the
%%% fail case I can think of), then the Flag is 'M' and the data is just
%%% repeated from the previous good measurement to keep the program
%%% running.
elseif isnan(Baseline2)
    Baseline2 = Baseline; Flag2 = 'M'; RawGauge2 = RawGauge; CorrDiff2 = 0; Diff2 = 0;
%%% If some other condition occurs that we didn't think of...
else Baseline2 = Baseline; Flag2 = 'Q'; RawGauge2 = RawGauge; CorrDiff2 = Diff2; Recent_Diffs = Recent_Diffs + Diff2;
end
So basically, whenever there is precip before there are 3 measurements, the precip will be flagged Q. That seems reasonable; it isn’t really enough data to calculate precip.
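A sketch of how these start-up Q’s could be recognised, assuming processing restarts with each water year on 1 October (which is what the timestamps suggest, but I have not confirmed it in the MATLAB code). The helper name in_startup_window and the n_steps default are hypothetical.

```python
import pandas as pd


def in_startup_window(index: pd.DatetimeIndex, n_steps: int = 3) -> pd.Series:
    """Mark the first n_steps timesteps of each water year (hypothetical helper)."""
    # Water year labelled by the calendar year in which it ends (Oct-Sep).
    water_year = (index.year + (index.month >= 10).astype(int)).to_numpy()
    # Position of each timestep within its water year: 0, 1, 2, ...
    position = pd.Series(0, index=index).groupby(water_year).cumcount()
    return position < n_steps


# Same timestamps as the CEN_02 example, where the Q landed on 00:10.
idx = pd.date_range("2018-10-01 00:05", periods=5, freq="5min")
startup = in_startup_window(idx)
print(startup)
```

A Q that falls inside this window and has no other flag is probably just the start-up case rather than a real data problem.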
Let’s take a look at the tank flags for this site and see how they line up.
[350]:
site = 'CEN_02'
man = all_flags[site].event['manual_flag'] != ''
qar = all_flags[site].event['QaRule_flag'] != ''
q = all_flags[site].event['tank_flag'].str.contains('Q')
all_flags[site].event[q & ~qar & ~man]
[350]:
| prov_flag | tank_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|---|
| Date | |||||||
| 2019-02-23 15:10:00 | <NA> | Q | |||||
| 2019-02-23 15:15:00 | <NA> | Q | |||||
| 2019-02-23 15:20:00 | <NA> | Q | |||||
| 2019-02-23 15:25:00 | <NA> | Q | |||||
| 2019-02-23 15:30:00 | <NA> | Q | |||||
| 2019-02-23 15:35:00 | <NA> | Q | |||||
| 2019-02-23 15:40:00 | <NA> | Q | |||||
| 2019-02-23 15:45:00 | <NA> | Q | |||||
| 2019-02-23 15:50:00 | <NA> | Q | |||||
| 2019-02-23 15:55:00 | <NA> | Q | |||||
| 2019-02-23 16:00:00 | <NA> | Q | |||||
| 2019-02-23 16:05:00 | <NA> | Q | |||||
| 2019-02-23 16:10:00 | <NA> | Q | |||||
| 2019-02-23 16:15:00 | <NA> | Q | |||||
| 2019-02-23 16:20:00 | <NA> | Q | |||||
| ... | ... | ... | ... | ... | ... | ... | ... |
| 2019-03-26 07:50:00 | <NA> | Q | |||||
| 2019-03-26 07:55:00 | <NA> | Q | |||||
| 2019-03-26 08:00:00 | <NA> | Q | |||||
| 2019-03-26 08:05:00 | <NA> | Q | |||||
| 2019-03-26 08:10:00 | <NA> | Q | |||||
| 2019-03-26 08:15:00 | <NA> | Q | |||||
| 2019-03-26 08:20:00 | <NA> | Q | |||||
| 2019-03-26 08:25:00 | <NA> | Q | |||||
| 2019-03-26 08:30:00 | <NA> | Q | |||||
| 2019-03-26 08:35:00 | <NA> | Q | |||||
| 2019-03-26 08:40:00 | <NA> | Q | |||||
| 2019-03-26 08:45:00 | <NA> | Q | |||||
| 2019-03-26 08:50:00 | <NA> | Q | |||||
| 2019-03-26 08:55:00 | <NA> | Q | |||||
| 2019-03-26 09:00:00 | <NA> | Q |
8579 rows × 7 columns
Well here again we have a manual flag…
col_Date>=datenum('02/23/2019 15:05:00')&col_Date<=datenum('03/26/2019 09:00:00')='Q';
This does seem to line up with a clog event. There is a note on 3/16/19 that says:
NEEDLES CLOGGING ORIFICE
Some needles were clogging orifice; Could explain cooler shelter orifice temp (standing water); Could explain gauge float stagnation for multiple weeks.
So, on the one hand, if this is a real clog, then it will probably be caught by the clogging routine once this probe is parametrized for that. In that case, those U and C flags will take precedence over the Q flag and these flags will be moot. On the other hand, we don’t want to miss the clog when Adam has carefully flagged the event. It is concerning, though, that the note and the flag don’t end at the same time.
I think it makes the most sense to keep the flags until they’ve been superseded. If they seem really confusing or unnecessary, they can always be superseded by a manual flag.
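One way to express "keep a flag until it is superseded" is a simple precedence merge. The sketch below is only an illustration: the precedence order (manual over QaRule over tank over provisional) and the helper name `combine_flags` are my assumptions, not the behaviour of `post_gce_qc`.

```python
import pandas as pd

# Assumed precedence, highest first (hypothetical, not taken from post_gce_qc).
PRECEDENCE = ["manual_flag", "QaRule_flag", "tank_flag", "prov_flag"]

def combine_flags(event: pd.DataFrame) -> pd.Series:
    """Return the first non-empty flag per row, scanning PRECEDENCE in order."""
    final = pd.Series("", index=event.index, dtype=object)
    for col in PRECEDENCE:
        candidate = event[col].fillna("").astype(str)
        # Only fill rows that no higher-precedence column has claimed yet.
        final = final.where(final != "", candidate)
    return final

# Toy frame that mimics the column names of the event table.
event = pd.DataFrame(
    {
        "prov_flag": ["Q", None, "Q"],
        "tank_flag": [None, "Q", "Q"],
        "QaRule_flag": ["", "", "U"],
        "manual_flag": ["", "M", ""],
    },
    index=pd.date_range("2019-02-23 15:10", periods=3, freq="5min"),
)
print(combine_flags(event).tolist())
```

With this ordering a lone Q survives, while a U from a QaRule or an M from a manual flag replaces it, which is the "keep until superseded" behaviour described above.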
CEN_04¶
[370]:
site = 'CEN_04'
man = all_flags[site].event['manual_flag'] != ''
qar = all_flags[site].event['QaRule_flag'] != ''
q = all_flags[site].event['prov_flag'].str.contains('Q')
all_flags[site].event[q & ~qar & ~man]
[370]:
| prov_flag | tank_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|---|
| Date |
[371]:
q = all_flags[site].event['tank_flag'].str.contains('Q')
all_flags[site].event[q & ~qar & ~man]
[371]:
| prov_flag | tank_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|---|
| Date |
Nothing to see there.
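The same three-line mask gets rebuilt for every site, so a small helper would keep it consistent. This is just a sketch: `unexplained_q` is a hypothetical name that does not exist in `post_gce_qc`, and the toy frame only mimics the column names shown above.

```python
import pandas as pd

def unexplained_q(event: pd.DataFrame, column: str) -> pd.Series:
    """Rows where `column` contains Q and no QaRule or manual flag is present."""
    # na=False treats missing flags as "no Q" instead of propagating <NA>.
    has_q = event[column].str.contains("Q", na=False).astype(bool)
    no_qar = event["QaRule_flag"].fillna("") == ""
    no_man = event["manual_flag"].fillna("") == ""
    return has_q & no_qar & no_man

# Toy frame with nullable string flags, as in the tables above.
event = pd.DataFrame(
    {
        "prov_flag": pd.array(["Q", None, "Q"], dtype="string"),
        "tank_flag": pd.array([None, "Q", None], dtype="string"),
        "QaRule_flag": ["", "", "U"],
        "manual_flag": ["", "", ""],
    }
)
print(unexplained_q(event, "prov_flag").tolist())
print(unexplained_q(event, "tank_flag").tolist())
```

The helper returns a plain boolean mask, so it can be used directly to index both the `event` and `data` frames for a site.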
UPL_01¶
[368]:
site = 'UPL_01'
man = all_flags[site].event['manual_flag'] != ''
qar = all_flags[site].event['QaRule_flag'] != ''
q = all_flags[site].event['prov_flag'].str.contains('Q')
all_flags[site].event[q & ~qar & ~man]
[368]:
| prov_flag | tank_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|---|
| Date | |||||||
| 2018-10-01 00:10:00 | Q | <NA> | Q | ||||
| 2020-10-01 00:10:00 | Q | <NA> | Q |
Both of those are before there are 3 timesteps to measure a baseline off of. So this is the same as we discovered before, and it is probably reasonable to question tank derivatives with such a small sample.
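Since these start-of-water-year Q’s show up at several sites, it may help to separate them from the unexpected ones. A minimal sketch, assuming the water year starts at midnight on 1 October (consistent with the timestamps above) and using a 15 minute warm-up window that I picked purely for illustration; `in_water_year_warmup` is a hypothetical helper.

```python
import pandas as pd

def in_water_year_warmup(index, warmup="15min") -> pd.Series:
    """True for timestamps within `warmup` of midnight on 1 October."""
    idx = pd.DatetimeIndex(index)
    since_midnight = idx - idx.normalize()
    # Assumption: the water year begins on 1 October.
    is_oct1 = (idx.month == 10) & (idx.day == 1)
    return pd.Series(is_oct1 & (since_midnight < pd.Timedelta(warmup)), index=idx)

idx = pd.date_range("2020-09-30 23:55", periods=5, freq="5min")
print(in_water_year_warmup(idx).tolist())
```

A mask like this could be combined with the Q mask (for example `q & ~in_water_year_warmup(q.index)`) to list only the Q’s that are not explained by the short baseline.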
[369]:
q = all_flags[site].event['tank_flag'].str.contains('Q')
all_flags[site].event[q & ~qar & ~man]
[369]:
| prov_flag | tank_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|---|
| Date | |||||||
| 2021-10-10 03:15:00 | <NA> | Q | |||||
| 2021-10-10 03:20:00 | <NA> | Q | |||||
| 2021-10-10 03:25:00 | <NA> | Q | |||||
| 2021-10-10 03:30:00 | <NA> | Q | |||||
| 2021-10-10 03:35:00 | <NA> | Q | |||||
| 2021-10-10 03:40:00 | <NA> | Q | |||||
| 2021-10-10 03:45:00 | <NA> | Q | |||||
| 2021-10-10 03:50:00 | <NA> | Q | |||||
| 2021-10-10 03:55:00 | <NA> | Q | |||||
| 2021-10-10 04:00:00 | <NA> | Q | |||||
| 2021-10-10 04:05:00 | <NA> | Q | |||||
| 2021-10-10 04:10:00 | <NA> | Q | |||||
| 2021-10-10 04:15:00 | <NA> | Q | |||||
| 2021-10-10 04:20:00 | <NA> | Q | |||||
| 2021-10-10 04:25:00 | <NA> | Q | |||||
| ... | ... | ... | ... | ... | ... | ... | ... |
| 2021-10-12 15:45:00 | <NA> | Q | |||||
| 2021-10-12 15:50:00 | <NA> | Q | |||||
| 2021-10-12 15:55:00 | <NA> | Q | |||||
| 2021-10-12 16:00:00 | <NA> | Q | |||||
| 2021-10-12 16:05:00 | <NA> | Q | |||||
| 2021-10-12 16:10:00 | <NA> | Q | |||||
| 2021-10-12 16:15:00 | <NA> | Q | |||||
| 2021-10-12 16:20:00 | <NA> | Q | |||||
| 2021-10-12 16:25:00 | <NA> | Q | |||||
| 2021-10-12 16:30:00 | <NA> | Q | |||||
| 2021-10-12 16:35:00 | <NA> | Q | |||||
| 2021-10-12 16:40:00 | <NA> | Q | |||||
| 2021-10-12 16:45:00 | <NA> | Q | |||||
| 2021-10-12 16:50:00 | <NA> | Q | |||||
| 2021-10-12 16:55:00 | <NA> | Q |
338 rows × 7 columns
Well, that’s a manual flag of a known clog…again, without these, the provisional data really suffers, and we wouldn’t want to accidentally miss the clog. But this will interject Q’s into the U’s and C’s of the clog event.
UPL_02¶
[355]:
site = 'UPL_02'
man = all_flags[site].event['manual_flag'] != ''
qar = all_flags[site].event['QaRule_flag'] != ''
q = all_flags[site].event['prov_flag'].str.contains('Q')
all_flags[site].event[q & ~qar & ~man]
[355]:
| prov_flag | tank_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|---|
| Date | |||||||
| 2019-09-12 16:00:00 | Q | <NA> | Q | ||||
| 2020-10-01 00:10:00 | Q | <NA> | Q | ||||
| 2022-10-01 00:10:00 | Q | <NA> | Q | ||||
| 2024-05-08 17:15:00 | Q | Q | Q |
Two of those are the predictable second timestep of the water year. But two are caused by something else, so let’s take a look.
[359]:
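soloq = q & ~qar & ~man
# Widen the mask to neighbouring timesteps; each pass shifts the already-widened mask, so the window compounds
for s in range(-3,3):
    soloq |= soloq.shift(s)
all_flags[site].event[soloq]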
[359]:
| prov_flag | tank_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|---|
| Date | |||||||
| 2019-09-12 15:30:00 | <NA> | <NA> | |||||
| 2019-09-12 15:35:00 | R | R | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |||
| 2019-09-12 15:40:00 | R | R | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | |||
| 2019-09-12 15:45:00 | R | <NA> | DRAIN | QaRule AutoFlag: drain_event; | |||
| 2019-09-12 15:50:00 | <NA> | <NA> | E | M | DRAIN | QaRule AutoFlag: drain_event; | |
| 2019-09-12 15:55:00 | J | <NA> | |||||
| 2019-09-12 16:00:00 | Q | <NA> | Q | ||||
| 2019-09-12 16:05:00 | <NA> | <NA> | |||||
| 2019-09-12 16:10:00 | <NA> | <NA> | |||||
| 2019-09-12 16:15:00 | <NA> | <NA> | |||||
| 2020-09-30 23:40:00 | <NA> | <NA> | |||||
| 2020-09-30 23:45:00 | <NA> | <NA> | |||||
| 2020-09-30 23:50:00 | <NA> | <NA> | |||||
| 2020-09-30 23:55:00 | <NA> | <NA> | |||||
| 2020-10-01 00:00:00 | <NA> | <NA> | |||||
| 2020-10-01 00:05:00 | <NA> | <NA> | |||||
| 2020-10-01 00:10:00 | Q | <NA> | Q | ||||
| 2020-10-01 00:15:00 | <NA> | <NA> | |||||
| 2020-10-01 00:20:00 | <NA> | <NA> | |||||
| 2020-10-01 00:25:00 | <NA> | <NA> | |||||
| 2022-09-30 23:40:00 | <NA> | <NA> | |||||
| 2022-09-30 23:45:00 | <NA> | <NA> | |||||
| 2022-09-30 23:50:00 | <NA> | <NA> | |||||
| 2022-09-30 23:55:00 | <NA> | <NA> | |||||
| 2022-10-01 00:00:00 | <NA> | <NA> | |||||
| 2022-10-01 00:05:00 | <NA> | <NA> | |||||
| 2022-10-01 00:10:00 | Q | <NA> | Q | ||||
| 2022-10-01 00:15:00 | <NA> | <NA> | |||||
| 2022-10-01 00:20:00 | <NA> | <NA> | |||||
| 2022-10-01 00:25:00 | <NA> | <NA> | |||||
| 2024-05-08 16:45:00 | <NA> | Q | |||||
| 2024-05-08 16:50:00 | <NA> | Q | |||||
| 2024-05-08 16:55:00 | <NA> | Q | |||||
| 2024-05-08 17:00:00 | <NA> | Q | |||||
| 2024-05-08 17:05:00 | <NA> | Q | |||||
| 2024-05-08 17:10:00 | J | R | |||||
| 2024-05-08 17:15:00 | Q | Q | Q | ||||
| 2024-05-08 17:20:00 | <NA> | Q | |||||
| 2024-05-08 17:25:00 | <NA> | Q | |||||
| 2024-05-08 17:30:00 | <NA> | Q |
[360]:
all_flags[site].data[soloq]
[360]:
| tank_height | precip | adj_precip | |
|---|---|---|---|
| Date | |||
| 2019-09-12 15:30:00 | 336.0 | 0.0 | 0.0 |
| 2019-09-12 15:35:00 | 119.5 | 0.0 | 0.0 |
| 2019-09-12 15:40:00 | 14.85 | 0.0 | 0.0 |
| 2019-09-12 15:45:00 | 14.85 | 0.0 | 0.0 |
| 2019-09-12 15:50:00 | 28.91 | 14.06 | <NA> |
| 2019-09-12 15:55:00 | 39.380001 | 10.47 | 10.400001 |
| 2019-09-12 16:00:00 | 47.07 | 7.69 | 7.6 |
| 2019-09-12 16:05:00 | 52.459999 | 5.39 | 5.2 |
| 2019-09-12 16:10:00 | 56.389999 | 3.93 | 3.6 |
| 2019-09-12 16:15:00 | 59.330002 | 2.94 | 2.8 |
| 2020-09-30 23:40:00 | 130.600006 | 0.0 | 0.0 |
| 2020-09-30 23:45:00 | 130.600006 | 0.0 | 0.0 |
| 2020-09-30 23:50:00 | 130.899994 | 0.0 | 0.0 |
| 2020-09-30 23:55:00 | 130.600006 | 0.0 | 0.0 |
| 2020-10-01 00:00:00 | 130.600006 | 0.0 | 0.0 |
| 2020-10-01 00:05:00 | 130.600006 | 0.0 | 0.0 |
| 2020-10-01 00:10:00 | 130.899994 | 0.3 | 0.0 |
| 2020-10-01 00:15:00 | 130.800003 | 0.2 | 0.4 |
| 2020-10-01 00:20:00 | 130.899994 | 0.1 | 0.0 |
| 2020-10-01 00:25:00 | 130.600006 | 0.0 | 0.0 |
| 2022-09-30 23:40:00 | 71.559998 | 0.0 | 0.0 |
| 2022-09-30 23:45:00 | 71.5 | 0.0 | 0.0 |
| 2022-09-30 23:50:00 | 71.550003 | 0.0 | 0.0 |
| 2022-09-30 23:55:00 | 71.470001 | 0.0 | 0.0 |
| 2022-10-01 00:00:00 | 71.480003 | 0.0 | 0.0 |
| 2022-10-01 00:05:00 | 71.519997 | 0.0 | 0.0 |
| 2022-10-01 00:10:00 | 71.550003 | 0.03 | 0.0 |
| 2022-10-01 00:15:00 | 71.459999 | 0.0 | 0.0 |
| 2022-10-01 00:20:00 | 71.459999 | 0.0 | 0.0 |
| 2022-10-01 00:25:00 | 71.519997 | 0.0 | 0.0 |
| 2024-05-08 16:45:00 | 243.899994 | 0.0 | 0.0 |
| 2024-05-08 16:50:00 | 243.899994 | 0.0 | 0.0 |
| 2024-05-08 16:55:00 | 244.0 | 0.1 | 0.0 |
| 2024-05-08 17:00:00 | 244.0 | 0.0 | 0.0 |
| 2024-05-08 17:05:00 | 243.899994 | 0.0 | 0.0 |
| 2024-05-08 17:10:00 | 294.399994 | 50.400002 | 50.400002 |
| 2024-05-08 17:15:00 | 299.799988 | 5.4 | 5.2 |
| 2024-05-08 17:20:00 | 300.399994 | 0.6 | 0.8 |
| 2024-05-08 17:25:00 | 300.5 | 0.1 | 0.0 |
| 2024-05-08 17:30:00 | 300.600006 | 0.1 | 0.0 |
OK, the first case was an overdrain where water was added to the tank. It looks like the tank level continued to jump up for a while in large increments, so I added a manual flag to replace the Q with an M (removing the data) and add a Q to the rest of the sequence.
The next two cases are the second time step of the water year, so expected.
The last case is a large sudden jump. From the check sheets, this was a clog that was busted. Hopefully, once this probe is parameterized for clog comparisons, this will be caught and flagged C. Even if the 50.4 mm currently flagged J gets a C flag, GCE adding a Q flag to the 5.2 mm after the unclog would make sense. So I think it is OK to leave this.
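A note on the context window: because the loop `soloq |= soloq.shift(s)` shifts the mask it has already widened, `range(-3,3)` ends up showing six rows before and three rows after each hit, which is what the tables above contain. If a fixed window is wanted, a sketch like the following shifts the original mask each time instead. `with_context` is a hypothetical helper, not part of `post_gce_qc`.

```python
import pandas as pd

def with_context(mask: pd.Series, before: int = 3, after: int = 3) -> pd.Series:
    """Expand True values to cover `before` earlier rows and `after` later rows."""
    base = mask.fillna(False).astype(bool)
    out = base.copy()
    for s in range(-before, after + 1):
        # Always shift the original mask so the window does not compound.
        out |= base.shift(s, fill_value=False)
    return out

mask = pd.Series([False] * 10)
mask.iloc[5] = True
print(with_context(mask, before=2, after=1).tolist())
```

Passing `fill_value=False` keeps the shifted series boolean at the edges instead of introducing missing values.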
UPL_04¶
[361]:
site = 'UPL_04'
man = all_flags[site].event['manual_flag'] != ''
qar = all_flags[site].event['QaRule_flag'] != ''
q = all_flags[site].event['prov_flag'].str.contains('Q')
all_flags[site].event[q & ~qar & ~man]
[361]:
| prov_flag | tank_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|---|
| Date | |||||||
| 2024-05-08 17:10:00 | Q | Q |
[363]:
soloq = q & ~qar & ~man
for s in range(-2,3):
    soloq |= soloq.shift(s)
all_flags[site].event[soloq]
[363]:
| prov_flag | tank_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|---|
| Date | |||||||
| 2024-05-08 16:55:00 | <NA> | ||||||
| 2024-05-08 17:00:00 | <NA> | ||||||
| 2024-05-08 17:05:00 | <NA> | ||||||
| 2024-05-08 17:10:00 | Q | Q | |||||
| 2024-05-08 17:15:00 | <NA> | ||||||
| 2024-05-08 17:20:00 | <NA> | ||||||
| 2024-05-08 17:25:00 | <NA> |
[365]:
all_flags[site].data[soloq]
[365]:
| tank_height | precip | adj_precip | |
|---|---|---|---|
| Date | |||
| 2024-05-08 16:55:00 | <NA> | 0.0 | 0.0 |
| 2024-05-08 17:00:00 | <NA> | 0.0 | 0.0 |
| 2024-05-08 17:05:00 | <NA> | 0.0 | 0.0 |
| 2024-05-08 17:10:00 | <NA> | 20.719999 | 20.719999 |
| 2024-05-08 17:15:00 | <NA> | 1.63 | 1.63 |
| 2024-05-08 17:20:00 | <NA> | 0.0 | 0.0 |
| 2024-05-08 17:25:00 | <NA> | 0.0 | 0.0 |
This is the same clog discussed above in UPL_02 that should be caught in the future once the clog QC is run on it. In the meantime, a Q doesn’t seem like an unreasonable placeholder. I do worry that this will be confusing in the final data. Maybe we need a check to enforce only U or C during CLOGS. See GCE Flags During Clogs.
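The "only U or C during clogs" check could look something like the sketch below. It assumes clog rows are marked with an `event_code` of `'CLOG'`, which is a guess on my part (only `DRAIN` appears in the tables above), and `clog_flag_violations` is a hypothetical helper.

```python
import pandas as pd

ALLOWED_DURING_CLOG = {"U", "C"}  # flags the text says should remain during a clog

def clog_flag_violations(event: pd.DataFrame, clog_code: str = "CLOG") -> pd.DataFrame:
    """Return clog rows whose final flag is something other than U or C."""
    # Assumption: clog timesteps carry `clog_code` in the event_code column.
    in_clog = event["event_code"].fillna("") == clog_code
    allowed = event["final_flag"].fillna("").isin(ALLOWED_DURING_CLOG)
    return event[in_clog & ~allowed]

event = pd.DataFrame(
    {
        "final_flag": ["U", "Q", "C", "Q"],
        "event_code": ["CLOG", "CLOG", "CLOG", ""],
    },
    index=pd.date_range("2024-05-08 17:00", periods=4, freq="5min"),
)
print(clog_flag_violations(event))
```

An empty result would mean no stray flags during clogs; any rows returned are candidates for review or for a manual flag.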
CS2_02¶
[377]:
site = 'CS2_02'
man = all_flags[site].event['manual_flag'] != ''
qar = all_flags[site].event['QaRule_flag'] != ''
q = all_flags[site].event['prov_flag'].str.contains('Q')
all_flags[site].event[q & ~qar & ~man]
[377]:
| prov_flag | tank_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|---|
| Date | |||||||
| 2019-04-08 01:00:00 | Q | Q | Q | ||||
| 2019-04-08 01:15:00 | Q | Q | Q | ||||
| 2019-04-08 01:30:00 | Q | Q | Q | ||||
| 2019-04-08 01:45:00 | Q | Q | Q | ||||
| 2019-04-08 02:00:00 | Q | Q | Q | ||||
| 2019-04-08 02:15:00 | Q | Q | Q | ||||
| 2019-04-08 02:30:00 | Q | Q | Q | ||||
| 2019-04-08 02:45:00 | Q | Q | Q | ||||
| 2019-04-08 03:00:00 | Q | Q | Q | ||||
| 2019-04-08 03:15:00 | Q | Q | Q | ||||
| 2019-04-09 16:45:00 | Q | R | Q | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | ||
| 2019-04-09 17:00:00 | Q | <NA> | Q | DRAIN | QaRule AutoFlag: drain_event; | ||
| 2019-04-09 17:15:00 | Q | <NA> | Q | DRAIN | QaRule AutoFlag: drain_event; | ||
| 2019-04-09 17:30:00 | Q | <NA> | Q | ||||
| 2019-04-09 17:45:00 | Q | <NA> | Q | ||||
| ... | ... | ... | ... | ... | ... | ... | ... |
| 2021-12-27 12:00:00 | Q | <NA> | Q | ||||
| 2021-12-27 12:15:00 | Q | <NA> | Q | ||||
| 2021-12-27 12:30:00 | Q | <NA> | Q | ||||
| 2021-12-27 12:45:00 | Q | <NA> | Q | ||||
| 2021-12-27 13:00:00 | Q | <NA> | Q | ||||
| 2021-12-27 13:15:00 | Q | <NA> | Q | ||||
| 2021-12-27 13:30:00 | Q | <NA> | Q | ||||
| 2021-12-27 14:00:00 | Q | R | Q | ||||
| 2021-12-27 14:15:00 | Q | <NA> | Q | ||||
| 2021-12-27 14:30:00 | Q | <NA> | Q | ||||
| 2021-12-27 14:45:00 | Q | <NA> | Q | ||||
| 2021-12-27 15:00:00 | Q | <NA> | Q | ||||
| 2022-08-09 18:00:00 | Q | <NA> | Q | ||||
| 2023-05-16 14:30:00 | Q | <NA> | Q | ||||
| 2024-08-17 15:15:00 | Q | <NA> | Q |
327 rows × 7 columns
[378]:
all_flags[site].data[q & ~qar & ~man]
[378]:
| tank_height | precip | adj_precip | |
|---|---|---|---|
| Date | |||
| 2019-04-08 01:00:00 | 408.359985 | 0.51 | 0.5 |
| 2019-04-08 01:15:00 | 408.869995 | 0.51 | 0.5 |
| 2019-04-08 01:30:00 | 409.709991 | 0.76 | 0.75 |
| 2019-04-08 01:45:00 | 410.329987 | 0.51 | 0.5 |
| 2019-04-08 02:00:00 | 411.119995 | 1.02 | 1.0 |
| 2019-04-08 02:15:00 | 411.440002 | 0.25 | 0.25 |
| 2019-04-08 02:30:00 | 411.679993 | 0.25 | 0.25 |
| 2019-04-08 02:45:00 | 411.690002 | 0.0 | 0.0 |
| 2019-04-08 03:00:00 | 411.769989 | 0.0 | 0.0 |
| 2019-04-08 03:15:00 | 411.890015 | 0.25 | 0.25 |
| 2019-04-09 16:45:00 | 67.669998 | 0.0 | 0.0 |
| 2019-04-09 17:00:00 | 67.690002 | 0.0 | 0.0 |
| 2019-04-09 17:15:00 | 67.669998 | 0.0 | 0.0 |
| 2019-04-09 17:30:00 | 67.709999 | 0.0 | 0.0 |
| 2019-04-09 17:45:00 | 67.650002 | 0.0 | 0.0 |
| ... | ... | ... | ... |
| 2021-12-27 12:00:00 | 104.400002 | 0.0 | 0.0 |
| 2021-12-27 12:15:00 | 104.400002 | 0.0 | 0.0 |
| 2021-12-27 12:30:00 | 104.400002 | 0.0 | 0.0 |
| 2021-12-27 12:45:00 | 104.400002 | 0.0 | 0.0 |
| 2021-12-27 13:00:00 | 104.400002 | 0.0 | 0.0 |
| 2021-12-27 13:15:00 | 104.389999 | 0.0 | 0.0 |
| 2021-12-27 13:30:00 | 104.419998 | 0.0 | 0.0 |
| 2021-12-27 14:00:00 | 83.080002 | 0.25 | 0.25 |
| 2021-12-27 14:15:00 | 74.970001 | 24.889999 | 24.75 |
| 2021-12-27 14:30:00 | 74.919998 | 0.0 | 0.0 |
| 2021-12-27 14:45:00 | 74.93 | 0.0 | 0.0 |
| 2021-12-27 15:00:00 | 74.940002 | 0.0 | 0.0 |
| 2022-08-09 18:00:00 | 190.550003 | 14.22 | 14.0 |
| 2023-05-16 14:30:00 | 134.300003 | 18.799999 | 18.75 |
| 2024-08-17 15:15:00 | 177.029999 | 13.21 | 13.0 |
327 rows × 3 columns
Ok, the first chunk on 4/8/19 is because the tank level is too high. The GCE number (381 mm) is just a little more conservative than the 412 mm used here for tank overflow. Most of the rest seem to be manual flags for high precip. Tank value changes > 12.7 get an R, but precip > 13 gets a Q. I want to see a little more.
[383]:
Q = q & ~qar & ~man
for s in range(-2,3):
    Q |= Q.shift(s)
pd.options.display.min_rows = 50
all_flags[site].event[Q]
[383]:
| prov_flag | tank_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation | |
|---|---|---|---|---|---|---|---|
| Date | |||||||
| 2019-04-08 00:15:00 | <NA> | Q | |||||
| 2019-04-08 00:30:00 | <NA> | Q | |||||
| 2019-04-08 00:45:00 | <NA> | Q | |||||
| 2019-04-08 01:00:00 | Q | Q | Q | ||||
| 2019-04-08 01:15:00 | Q | Q | Q | ||||
| 2019-04-08 01:30:00 | Q | Q | Q | ||||
| 2019-04-08 01:45:00 | Q | Q | Q | ||||
| 2019-04-08 02:00:00 | Q | Q | Q | ||||
| 2019-04-08 02:15:00 | Q | Q | Q | ||||
| 2019-04-08 02:30:00 | Q | Q | Q | ||||
| 2019-04-08 02:45:00 | Q | Q | Q | ||||
| 2019-04-08 03:00:00 | Q | Q | Q | ||||
| 2019-04-08 03:15:00 | Q | Q | Q | ||||
| 2019-04-08 03:30:00 | Q | Q | U | U | QaRule AutoFlag: overflow; | ||
| 2019-04-08 03:45:00 | Q | Q | U | U | QaRule AutoFlag: overflow; | ||
| 2019-04-08 04:00:00 | Q | Q | U | U | QaRule AutoFlag: overflow; | ||
| 2019-04-09 16:00:00 | Q | Q | U | U | QaRule AutoFlag: overflow; | ||
| 2019-04-09 16:15:00 | Q | Q | U | U | QaRule AutoFlag: overflow; | ||
| 2019-04-09 16:30:00 | Q | Q | U | U | QaRule AutoFlag: overflow; | ||
| 2019-04-09 16:45:00 | Q | R | Q | DRAIN | QaRule AutoFlag: drain_event; QaRule AutoFlag: neg_delta_tank; | ||
| 2019-04-09 17:00:00 | Q | <NA> | Q | DRAIN | QaRule AutoFlag: drain_event; | ||
| 2019-04-09 17:15:00 | Q | <NA> | Q | DRAIN | QaRule AutoFlag: drain_event; | ||
| 2019-04-09 17:30:00 | Q | <NA> | Q | ||||
| 2019-04-09 17:45:00 | Q | <NA> | Q | ||||
| 2019-04-09 18:00:00 | Q | <NA> | Q | ||||
| ... | ... | ... | ... | ... | ... | ... | ... |
| 2021-12-27 15:00:00 | Q | <NA> | Q | ||||
| 2021-12-27 15:15:00 | <NA> | <NA> | |||||
| 2021-12-27 15:30:00 | <NA> | <NA> | |||||
| 2021-12-27 15:45:00 | <NA> | <NA> | |||||
| 2022-08-09 17:15:00 | <NA> | <NA> | |||||
| 2022-08-09 17:30:00 | <NA> | <NA> | |||||
| 2022-08-09 17:45:00 | <NA> | <NA> | |||||
| 2022-08-09 18:00:00 | Q | <NA> | Q | ||||
| 2022-08-09 18:15:00 | <NA> | <NA> | |||||
| 2022-08-09 18:30:00 | <NA> | <NA> | |||||
| 2022-08-09 18:45:00 | <NA> | <NA> | |||||
| 2023-05-16 13:45:00 | <NA> | <NA> | |||||
| 2023-05-16 14:00:00 | <NA> | <NA> | |||||
| 2023-05-16 14:15:00 | <NA> | <NA> | |||||
| 2023-05-16 14:30:00 | Q | <NA> | Q | ||||
| 2023-05-16 14:45:00 | <NA> | <NA> | |||||
| 2023-05-16 15:00:00 | <NA> | <NA> | |||||
| 2023-05-16 15:15:00 | <NA> | <NA> | |||||
| 2024-08-17 14:30:00 | <NA> | <NA> | |||||
| 2024-08-17 14:45:00 | <NA> | <NA> | |||||
| 2024-08-17 15:00:00 | <NA> | <NA> | |||||
| 2024-08-17 15:15:00 | Q | <NA> | Q | ||||
| 2024-08-17 15:30:00 | <NA> | <NA> | |||||
| 2024-08-17 15:45:00 | <NA> | <NA> | |||||
| 2024-08-17 16:00:00 | <NA> | <NA> |
363 rows × 7 columns
[384]:
all_flags[site].data[Q]
[384]:
| tank_height | precip | adj_precip | |
|---|---|---|---|
| Date | |||
| 2019-04-08 00:15:00 | 406.970001 | 0.51 | 0.5 |
| 2019-04-08 00:30:00 | 407.429993 | 0.51 | 0.5 |
| 2019-04-08 00:45:00 | 407.890015 | 0.51 | 0.5 |
| 2019-04-08 01:00:00 | 408.359985 | 0.51 | 0.5 |
| 2019-04-08 01:15:00 | 408.869995 | 0.51 | 0.5 |
| 2019-04-08 01:30:00 | 409.709991 | 0.76 | 0.75 |
| 2019-04-08 01:45:00 | 410.329987 | 0.51 | 0.5 |
| 2019-04-08 02:00:00 | 411.119995 | 1.02 | 1.0 |
| 2019-04-08 02:15:00 | 411.440002 | 0.25 | 0.25 |
| 2019-04-08 02:30:00 | 411.679993 | 0.25 | 0.25 |
| 2019-04-08 02:45:00 | 411.690002 | 0.0 | 0.0 |
| 2019-04-08 03:00:00 | 411.769989 | 0.0 | 0.0 |
| 2019-04-08 03:15:00 | 411.890015 | 0.25 | 0.25 |
| 2019-04-08 03:30:00 | 412.040009 | 0.0 | 0.0 |
| 2019-04-08 03:45:00 | 412.119995 | 0.25 | 0.25 |
| 2019-04-08 04:00:00 | 412.109985 | 0.0 | 0.0 |
| 2019-04-09 16:00:00 | 412.609985 | 0.0 | 0.0 |
| 2019-04-09 16:15:00 | 412.640015 | 0.0 | 0.0 |
| 2019-04-09 16:30:00 | 412.640015 | 0.0 | 0.0 |
| 2019-04-09 16:45:00 | 67.669998 | 0.0 | 0.0 |
| 2019-04-09 17:00:00 | 67.690002 | 0.0 | 0.0 |
| 2019-04-09 17:15:00 | 67.669998 | 0.0 | 0.0 |
| 2019-04-09 17:30:00 | 67.709999 | 0.0 | 0.0 |
| 2019-04-09 17:45:00 | 67.650002 | 0.0 | 0.0 |
| 2019-04-09 18:00:00 | 67.650002 | 0.0 | 0.0 |
| ... | ... | ... | ... |
| 2021-12-27 15:00:00 | 74.940002 | 0.0 | 0.0 |
| 2021-12-27 15:15:00 | 74.93 | 0.0 | 0.0 |
| 2021-12-27 15:30:00 | 74.940002 | 0.0 | 0.0 |
| 2021-12-27 15:45:00 | 74.980003 | 0.0 | 0.0 |
| 2022-08-09 17:15:00 | 176.339996 | 0.0 | 0.0 |
| 2022-08-09 17:30:00 | 176.339996 | 0.0 | 0.0 |
| 2022-08-09 17:45:00 | 176.330002 | 0.0 | 0.0 |
| 2022-08-09 18:00:00 | 190.550003 | 14.22 | 14.0 |
| 2022-08-09 18:15:00 | 193.050003 | 2.29 | 2.25 |
| 2022-08-09 18:30:00 | 195.759995 | 2.79 | 2.75 |
| 2022-08-09 18:45:00 | 196.130005 | 0.51 | 0.5 |
| 2023-05-16 13:45:00 | 108.410004 | 0.0 | 0.0 |
| 2023-05-16 14:00:00 | 108.370003 | 0.0 | 0.0 |
| 2023-05-16 14:15:00 | 115.730003 | 7.11 | 7.0 |
| 2023-05-16 14:30:00 | 134.300003 | 18.799999 | 18.75 |
| 2023-05-16 14:45:00 | 136.490005 | 2.03 | 2.0 |
| 2023-05-16 15:00:00 | 136.660004 | 0.25 | 0.25 |
| 2023-05-16 15:15:00 | 136.660004 | 0.0 | 0.0 |
| 2024-08-17 14:30:00 | 162.809998 | 5.33 | 5.25 |
| 2024-08-17 14:45:00 | 162.679993 | 0.25 | 0.25 |
| 2024-08-17 15:00:00 | 163.690002 | 1.02 | 1.0 |
| 2024-08-17 15:15:00 | 177.029999 | 13.21 | 13.0 |
| 2024-08-17 15:30:00 | 178.220001 | 1.27 | 1.25 |
| 2024-08-17 15:45:00 | 178.380005 | 0.0 | 0.0 |
| 2024-08-17 16:00:00 | 178.460007 | 0.25 | 0.25 |
363 rows × 3 columns
[381]:
all_flags[site].data[Q].to_clipboard()
[382]:
all_flags[site].event[Q].to_clipboard()
So the overflow on 4/8-4/9 is flagged with U, but is bookended by Q’s through a manual flag. I don’t mind that. It makes sense that we would have some uncertainty as we approach maximum tank levels, and as the overflow is resolved.
The last three are some truly large increases, though there is no reason to think they aren’t legit. In fact, given the time of year, they could be sudden thunderstorms where we would expect such a downpour. But I’m OK leaving 3 isolated Q flags.
The only thing I’m unsure about is the manual flag from 12/24-12/27/21. It has a note and a Bitbucket issue identifying that it was overtopped with snow. However, why wasn’t the drain recharge caught? I would hope that clog analysis would ID this and replace most of the Q’s with U’s anyway. But the drain is troubling.
Overall this looks OK, and I’m good with moving these Q’s to final data, but let’s dig in on the drain, because that’s a problem.
[388]:
df = prov.pivot_on_probe(prov.df, 'CS2', '02')
rules = qaqc.QaRules(df, {'precision':0.25})
[387]:
strt, end = pd.to_datetime('12/24/21 1115'), pd.to_datetime('12/27/21 1545')
df[strt:end]
[387]:
| INST | INST_Flag | TOT | TOT_Flag | ACC | ACC_Flag | |
|---|---|---|---|---|---|---|
| Date | ||||||
| 2021-12-24 11:15:00 | 83.589996 | <NA> | 0.51 | <NA> | 799.080017 | <NA> |
| 2021-12-24 11:30:00 | 84.010002 | <NA> | 0.25 | <NA> | 799.340027 | <NA> |
| 2021-12-24 11:45:00 | 84.209999 | <NA> | 0.25 | <NA> | 799.590027 | <NA> |
| 2021-12-24 12:00:00 | 84.339996 | <NA> | 0.25 | Q | 799.849976 | Q |
| 2021-12-24 12:15:00 | 84.489998 | <NA> | 0.0 | Q | 799.849976 | Q |
| 2021-12-24 12:30:00 | 84.660004 | <NA> | 0.25 | Q | 800.099976 | Q |
| 2021-12-24 12:45:00 | 84.839996 | <NA> | 0.25 | Q | 800.349976 | Q |
| 2021-12-24 13:00:00 | 85.07 | <NA> | 0.25 | Q | 800.609985 | Q |
| 2021-12-24 13:15:00 | 85.629997 | <NA> | 0.51 | Q | 801.119995 | Q |
| 2021-12-24 13:30:00 | 86.620003 | <NA> | 1.02 | Q | 802.130005 | Q |
| 2021-12-24 13:45:00 | 87.57 | <NA> | 0.76 | Q | 802.890015 | Q |
| 2021-12-24 14:00:00 | 87.970001 | <NA> | 0.51 | Q | 803.400024 | Q |
| 2021-12-24 14:15:00 | 87.980003 | <NA> | 0.0 | Q | 803.400024 | Q |
| 2021-12-24 14:30:00 | 88.110001 | <NA> | 0.25 | Q | 803.659973 | Q |
| 2021-12-24 14:45:00 | 88.269997 | <NA> | 0.25 | Q | 803.909973 | Q |
| 2021-12-24 15:00:00 | 88.440002 | <NA> | 0.0 | Q | 803.909973 | Q |
| 2021-12-24 15:15:00 | 88.699997 | <NA> | 0.25 | Q | 804.159973 | Q |
| 2021-12-24 15:30:00 | 88.849998 | <NA> | 0.25 | Q | 804.419983 | Q |
| 2021-12-24 15:45:00 | 89.330002 | <NA> | 0.51 | Q | 804.929993 | Q |
| 2021-12-24 16:00:00 | 89.449997 | <NA> | 0.0 | Q | 804.929993 | Q |
| 2021-12-24 16:15:00 | 89.839996 | <NA> | 0.51 | Q | 805.429993 | Q |
| 2021-12-24 16:30:00 | 90.440002 | <NA> | 0.51 | Q | 805.940002 | Q |
| 2021-12-24 16:45:00 | 90.949997 | <NA> | 0.51 | Q | 806.450012 | Q |
| 2021-12-24 17:00:00 | 91.449997 | <NA> | 0.51 | Q | 806.960022 | Q |
| 2021-12-24 17:15:00 | 91.809998 | <NA> | 0.51 | Q | 807.469971 | Q |
| ... | ... | ... | ... | ... | ... | ... |
| 2021-12-27 09:30:00 | 104.389999 | <NA> | 0.0 | Q | 820.929993 | Q |
| 2021-12-27 09:45:00 | 104.389999 | <NA> | 0.0 | Q | 820.929993 | Q |
| 2021-12-27 10:00:00 | 104.379997 | <NA> | 0.0 | Q | 820.929993 | Q |
| 2021-12-27 10:15:00 | 104.389999 | <NA> | 0.0 | Q | 820.929993 | Q |
| 2021-12-27 10:30:00 | 104.379997 | <NA> | 0.0 | Q | 820.929993 | Q |
| 2021-12-27 10:45:00 | 104.379997 | <NA> | 0.0 | Q | 820.929993 | Q |
| 2021-12-27 11:00:00 | 104.389999 | <NA> | 0.0 | Q | 820.929993 | Q |
| 2021-12-27 11:15:00 | 104.389999 | <NA> | 0.0 | Q | 820.929993 | Q |
| 2021-12-27 11:30:00 | 104.389999 | <NA> | 0.0 | Q | 820.929993 | Q |
| 2021-12-27 11:45:00 | 104.379997 | <NA> | 0.0 | Q | 820.929993 | Q |
| 2021-12-27 12:00:00 | 104.400002 | <NA> | 0.0 | Q | 820.929993 | Q |
| 2021-12-27 12:15:00 | 104.400002 | <NA> | 0.0 | Q | 820.929993 | Q |
| 2021-12-27 12:30:00 | 104.400002 | <NA> | 0.0 | Q | 820.929993 | Q |
| 2021-12-27 12:45:00 | 104.400002 | <NA> | 0.0 | Q | 820.929993 | Q |
| 2021-12-27 13:00:00 | 104.400002 | <NA> | 0.0 | Q | 820.929993 | Q |
| 2021-12-27 13:15:00 | 104.389999 | <NA> | 0.0 | Q | 820.929993 | Q |
| 2021-12-27 13:30:00 | 104.419998 | <NA> | 0.0 | Q | 820.929993 | Q |
| 2021-12-27 14:00:00 | 83.080002 | R | 0.25 | Q | 821.179993 | Q |
| 2021-12-27 14:15:00 | 74.970001 | <NA> | 24.889999 | Q | 846.070007 | Q |
| 2021-12-27 14:30:00 | 74.919998 | <NA> | 0.0 | Q | 846.070007 | Q |
| 2021-12-27 14:45:00 | 74.93 | <NA> | 0.0 | Q | 846.070007 | Q |
| 2021-12-27 15:00:00 | 74.940002 | <NA> | 0.0 | Q | 846.070007 | Q |
| 2021-12-27 15:15:00 | 74.93 | <NA> | 0.0 | <NA> | 846.070007 | <NA> |
| 2021-12-27 15:30:00 | 74.940002 | <NA> | 0.0 | <NA> | 846.070007 | <NA> |
| 2021-12-27 15:45:00 | 74.980003 | <NA> | 0.0 | <NA> | 846.070007 | <NA> |
306 rows × 6 columns
[392]:
rules.drain_recharge_flagging_wrap(event_window=3, drain_threshold=-25, max_recharge=2.67, runavg_nstd=2, runavg_wind=4)
pd.options.display.min_rows = 20
rules.qa_events[strt:end]
[392]:
| drain_event | neg_delta_tank | overflow | duplicate | tank_empty | clog | diurnal_flux | over_intensity | |
|---|---|---|---|---|---|---|---|---|
| Date | ||||||||
| 2021-12-24 11:15:00 | False | False | False | False | False | False | False | False |
| 2021-12-24 11:30:00 | False | False | False | False | False | False | False | False |
| 2021-12-24 11:45:00 | False | False | False | False | False | False | False | False |
| 2021-12-24 12:00:00 | False | False | False | False | False | False | False | False |
| 2021-12-24 12:15:00 | False | False | False | False | False | False | False | False |
| 2021-12-24 12:30:00 | False | False | False | False | False | False | False | False |
| 2021-12-24 12:45:00 | False | False | False | False | False | False | False | False |
| 2021-12-24 13:00:00 | False | False | False | False | False | False | False | False |
| 2021-12-24 13:15:00 | False | False | False | False | False | False | False | False |
| 2021-12-24 13:30:00 | False | False | False | False | False | False | False | False |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2021-12-27 13:15:00 | False | False | False | False | False | False | False | False |
| 2021-12-27 13:30:00 | False | False | False | False | False | False | False | False |
| 2021-12-27 14:00:00 | False | False | False | False | False | False | False | False |
| 2021-12-27 14:15:00 | False | False | False | False | False | False | False | False |
| 2021-12-27 14:30:00 | False | False | False | False | False | False | False | False |
| 2021-12-27 14:45:00 | False | False | False | False | False | False | False | False |
| 2021-12-27 15:00:00 | False | False | False | False | False | False | False | False |
| 2021-12-27 15:15:00 | False | False | False | False | False | False | False | False |
| 2021-12-27 15:30:00 | False | False | False | False | False | False | False | False |
| 2021-12-27 15:45:00 | False | False | False | False | False | False | False | False |
306 rows × 8 columns
[393]:
rules.drain_recharge_flagging_wrap(event_window=3, drain_threshold=-15, max_recharge=2.67, runavg_nstd=2, runavg_wind=4)
rules.qa_events[strt:end]
[393]:
| drain_event | neg_delta_tank | overflow | duplicate | tank_empty | clog | diurnal_flux | over_intensity | |
|---|---|---|---|---|---|---|---|---|
| Date | ||||||||
| 2021-12-24 11:15:00 | False | False | False | False | False | False | False | False |
| 2021-12-24 11:30:00 | False | False | False | False | False | False | False | False |
| 2021-12-24 11:45:00 | False | False | False | False | False | False | False | False |
| 2021-12-24 12:00:00 | False | False | False | False | False | False | False | False |
| 2021-12-24 12:15:00 | False | False | False | False | False | False | False | False |
| 2021-12-24 12:30:00 | False | False | False | False | False | False | False | False |
| 2021-12-24 12:45:00 | False | False | False | False | False | False | False | False |
| 2021-12-24 13:00:00 | False | False | False | False | False | False | False | False |
| 2021-12-24 13:15:00 | False | False | False | False | False | False | False | False |
| 2021-12-24 13:30:00 | False | False | False | False | False | False | False | False |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2021-12-27 13:15:00 | False | False | False | False | False | False | False | False |
| 2021-12-27 13:30:00 | False | False | False | False | False | False | False | False |
| 2021-12-27 14:00:00 | True | True | False | False | False | False | False | False |
| 2021-12-27 14:15:00 | True | False | False | False | False | False | False | False |
| 2021-12-27 14:30:00 | True | False | False | False | False | False | False | False |
| 2021-12-27 14:45:00 | True | False | False | False | False | False | False | False |
| 2021-12-27 15:00:00 | False | False | False | False | False | False | False | False |
| 2021-12-27 15:15:00 | False | False | False | False | False | False | False | False |
| 2021-12-27 15:30:00 | False | False | False | False | False | False | False | False |
| 2021-12-27 15:45:00 | False | False | False | False | False | False | False | False |
306 rows × 8 columns
[394]:
rules.drain_recharge_flagging_wrap(event_window=3, drain_threshold=-8, max_recharge=2.67, runavg_nstd=2, runavg_wind=4)
rules.qa_events[strt:end]
[394]:
| drain_event | neg_delta_tank | overflow | duplicate | tank_empty | clog | diurnal_flux | over_intensity | |
|---|---|---|---|---|---|---|---|---|
| Date | ||||||||
| 2021-12-24 11:15:00 | False | False | False | False | False | False | False | False |
| 2021-12-24 11:30:00 | False | False | False | False | False | False | False | False |
| 2021-12-24 11:45:00 | False | False | False | False | False | False | False | False |
| 2021-12-24 12:00:00 | False | False | False | False | False | False | False | False |
| 2021-12-24 12:15:00 | False | False | False | False | False | False | False | False |
| 2021-12-24 12:30:00 | False | False | False | False | False | False | False | False |
| 2021-12-24 12:45:00 | False | False | False | False | False | False | False | False |
| 2021-12-24 13:00:00 | False | False | False | False | False | False | False | False |
| 2021-12-24 13:15:00 | False | False | False | False | False | False | False | False |
| 2021-12-24 13:30:00 | False | False | False | False | False | False | False | False |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2021-12-27 13:15:00 | False | False | False | False | False | False | False | False |
| 2021-12-27 13:30:00 | False | False | False | False | False | False | False | False |
| 2021-12-27 14:00:00 | True | True | False | False | False | False | False | False |
| 2021-12-27 14:15:00 | True | True | False | False | False | False | False | False |
| 2021-12-27 14:30:00 | True | False | False | False | False | False | False | False |
| 2021-12-27 14:45:00 | True | False | False | False | False | False | False | False |
| 2021-12-27 15:00:00 | False | False | False | False | False | False | False | False |
| 2021-12-27 15:15:00 | False | False | False | False | False | False | False | False |
| 2021-12-27 15:30:00 | False | False | False | False | False | False | False | False |
| 2021-12-27 15:45:00 | False | False | False | False | False | False | False | False |
306 rows × 8 columns
OK, the drain was too small to register; I need to drop the definition of a drain down to an 8 mm drop. Now both timesteps are ID’d as a negative change in tank depth. Let’s check, but they should have an NA flag.
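To see why the threshold matters, here are the tank readings around the drain (copied from the INST column above) with their step-to-step change. This is a simplified stand-in that assumes `drain_threshold` is compared against the per-timestep change in tank level; the real `drain_recharge_flagging_wrap` also takes an event window and running-average parameters, so its logic may differ.

```python
import pandas as pd

# Tank readings (INST) around the drain on 2021-12-27, from the table above.
tank = pd.Series(
    [104.389999, 104.419998, 83.080002, 74.970001, 74.919998],
    index=pd.to_datetime([
        "2021-12-27 13:15", "2021-12-27 13:30", "2021-12-27 14:00",
        "2021-12-27 14:15", "2021-12-27 14:30",
    ]),
    name="INST",
)
delta = tank.diff()  # change since the previous reading
print(delta.round(2))

for threshold in (-25, -15, -8):
    hits = delta[delta < threshold]
    print(threshold, list(hits.index.strftime("%H:%M")))
```

The drop happens in two steps of about -21.34 and -8.11, so -25 catches neither, -15 catches only the 14:00 step, and -8 catches both, which matches the `neg_delta_tank` column in the three runs above.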
[395]:
rules.qa_flags[strt:end]
[395]:
| Date | Q | U | C | SetNA | Set0 | E | M |
|---|---|---|---|---|---|---|---|
| 2021-12-24 11:15:00 | False | False | False | False | False | False | False |
| 2021-12-24 11:30:00 | False | False | False | False | False | False | False |
| 2021-12-24 11:45:00 | False | False | False | False | False | False | False |
| 2021-12-24 12:00:00 | False | False | False | False | False | False | False |
| 2021-12-24 12:15:00 | False | False | False | False | False | False | False |
| 2021-12-24 12:30:00 | False | False | False | False | False | False | False |
| 2021-12-24 12:45:00 | False | False | False | False | False | False | False |
| 2021-12-24 13:00:00 | False | False | False | False | False | False | False |
| 2021-12-24 13:15:00 | False | False | False | False | False | False | False |
| 2021-12-24 13:30:00 | False | False | False | False | False | False | False |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 2021-12-27 13:15:00 | False | False | False | False | False | False | False |
| 2021-12-27 13:30:00 | False | False | False | False | False | False | False |
| 2021-12-27 14:00:00 | False | False | False | True | False | False | True |
| 2021-12-27 14:15:00 | False | False | False | True | False | True | True |
| 2021-12-27 14:30:00 | False | False | False | False | False | False | False |
| 2021-12-27 14:45:00 | False | False | False | False | False | False | False |
| 2021-12-27 15:00:00 | False | False | False | False | False | False | False |
| 2021-12-27 15:15:00 | False | False | False | False | False | False | False |
| 2021-12-27 15:30:00 | False | False | False | False | False | False | False |
| 2021-12-27 15:45:00 | False | False | False | False | False | False | False |
306 rows × 7 columns
Yahoo. Fixed. The qa parameters have been adjusted accordingly.
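As a minimal sketch of the threshold change described above, the function below marks a timestep as a drain when the tank depth falls by at least `min_drop` millimetres from the previous timestep. The function name, the per-timestep comparison, the sample depths, and the earlier 10 mm threshold are all assumptions for illustration; none of them are taken from the QaRules class or from qa_param.yaml.

```python
def flag_drains(depths, min_drop=8.0):
    """Return one boolean per depth: True where the tank fell by at
    least `min_drop` (mm) since the previous timestep."""
    if not depths:
        return []
    flags = [False]  # the first timestep has no predecessor to compare with
    for previous, current in zip(depths, depths[1:]):
        flags.append(previous - current >= min_drop)
    return flags


# Illustrative tank depths in mm (made-up values, not gauge data).
depths = [120.0, 120.1, 111.5, 103.0, 103.2]

old_flags = flag_drains(depths, min_drop=10.0)  # hypothetical earlier threshold
new_flags = flag_drains(depths, min_drop=8.0)   # 8 mm threshold from the text
print(old_flags)
print(new_flags)
```

With the hypothetical 10 mm threshold neither falling timestep is caught; with 8 mm both are, which is the behaviour reported above.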
PRI_03¶
Let’s check the other NOAH IV and see how we did.
[396]:
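site = 'PRI_03'
# Rows that already carry a manual flag or a QaRule flag
man = all_flags[site].event['manual_flag'] != ''
qar = all_flags[site].event['QaRule_flag'] != ''
# Rows whose provisional flag includes Q
q = all_flags[site].event['prov_flag'].str.contains('Q')
# Provisional Q flags not accounted for by a QaRule or manual flag
all_flags[site].event[q & ~qar & ~man]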
[396]:
| Date | prov_flag | tank_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation |
|---|---|---|---|---|---|---|---|
| 2018-10-05 21:30:00 | Q | Q |  |  | Q |  |  |
| 2018-10-05 22:30:00 | Q | Q |  |  | Q |  |  |
| 2018-10-06 01:00:00 | Q | Q |  |  | Q |  |  |
| 2018-10-06 02:00:00 | Q | Q |  |  | Q |  |  |
| 2018-10-06 02:15:00 | Q | Q |  |  | Q |  |  |
| 2018-10-06 03:00:00 | Q | Q |  |  | Q |  |  |
| 2018-10-06 03:15:00 | Q | Q |  |  | Q |  |  |
| 2018-10-06 03:45:00 | Q | Q |  |  | Q |  |  |
| 2018-10-06 04:00:00 | Q | R |  |  | Q |  |  |
| 2018-10-06 04:15:00 | Q | Q |  |  | Q |  |  |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 2024-09-11 10:15:00 | Q | R |  |  | Q |  |  |
| 2024-09-11 10:30:00 | Q | Q |  |  | Q |  |  |
| 2024-09-11 10:45:00 | Q | Q |  |  | Q |  |  |
| 2024-09-11 17:15:00 | Q | Q |  |  | Q |  |  |
| 2024-09-11 17:30:00 | Q | R |  |  | Q |  |  |
| 2024-09-11 17:45:00 | Q | Q |  |  | Q |  |  |
| 2024-09-11 18:00:00 | Q | Q |  |  | Q |  |  |
| 2024-09-25 12:30:00 | Q | Q |  |  | Q |  |  |
| 2024-09-25 13:30:00 | Q | R |  |  | Q |  |  |
| 2024-09-25 13:45:00 | Q | Q |  |  | Q |  |  |
5818 rows × 7 columns
Well, upon closer inspection, the tank is 100% flagged. This criterion seems to be driving the tank flagging:
x>=15 & x<=403.86='Q'
And all precip values >=2 increments (0.25*2) are being flagged as well:
x>=0.511811 & x<=24.9987
This will need to be changed in provisional.
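To see why such a wide range criterion flags nearly everything, here is a small sketch that applies the two ranges quoted above to a few values. `range_flag` is a hypothetical helper, not GCE syntax or a GCE function; the sample values are made up, and assigning 'Q' for the precipitation range is an assumption, since only the tank criterion above shows the flag letter.

```python
def range_flag(x, low, high, flag="Q"):
    """Return `flag` when low <= x <= high, otherwise an empty string."""
    return flag if low <= x <= high else ""


# Tank criterion quoted above: x>=15 & x<=403.86='Q'
tank_depths = [20.0, 150.5, 390.0]  # made-up depths
tank_flags = [range_flag(x, 15, 403.86) for x in tank_depths]

# Precip criterion quoted above: x>=0.511811 & x<=24.9987
precip = [0.0, 0.25, 0.75, 3.0]  # made-up amounts
precip_flags = [range_flag(x, 0.511811, 24.9987) for x in precip]

# Share of tank readings that end up flagged
share_flagged = sum(f == "Q" for f in tank_flags) / len(tank_flags)
print(tank_flags, precip_flags, share_flagged)
```

Any tank reading inside the range gets a Q, so a gauge that normally sits between the two bounds is flagged at every timestep.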
PRI_01¶
[397]:
site = 'PRI_01'
man = all_flags[site].event['manual_flag'] != ''
qar = all_flags[site].event['QaRule_flag'] != ''
q = all_flags[site].event['prov_flag'].str.contains('Q')
all_flags[site].event[q & ~qar & ~man]
[397]:
| Date | prov_flag | tank_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation |
|---|---|---|---|---|---|---|---|
| 2019-02-27 10:50:00 | Q |  |  |  | Q |  |  |
| 2019-02-27 10:55:00 | Q |  |  |  | Q |  |  |
| 2019-02-27 11:00:00 | Q |  |  |  | Q |  |  |
| 2019-02-27 11:05:00 | Q |  |  |  | Q |  |  |
| 2019-02-27 11:10:00 | Q |  |  |  | Q |  |  |
| 2019-02-27 11:15:00 | Q |  |  |  | Q |  |  |
| 2019-02-27 11:20:00 | Q |  |  |  | Q |  |  |
| 2019-02-27 11:25:00 | Q |  |  |  | Q |  |  |
| 2019-02-27 11:30:00 | Q |  |  |  | Q |  |  |
| 2019-02-27 11:35:00 | Q |  |  |  | Q |  |  |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 2021-12-27 15:00:00 | Q |  |  |  | Q |  |  |
| 2021-12-27 15:05:00 | Q |  |  |  | Q |  |  |
| 2021-12-27 15:10:00 | Q |  |  |  | Q |  |  |
| 2021-12-27 15:15:00 | Q |  |  |  | Q |  |  |
| 2021-12-27 15:20:00 | Q |  |  |  | Q |  |  |
| 2021-12-27 15:25:00 | Q |  |  |  | Q |  |  |
| 2021-12-27 15:30:00 | Q |  |  |  | Q |  |  |
| 2022-08-09 17:55:00 | Q |  |  |  | Q |  |  |
| 2023-05-16 14:40:00 | Q |  |  |  | Q |  |  |
| 2024-08-17 15:05:00 | Q |  |  |  | Q |  |  |
2801 rows × 7 columns
[398]:
all_flags[site].event[q & ~qar & ~man].to_clipboard()
[399]:
all_flags[site].data[q & ~qar & ~man].to_clipboard()
OK, most of these are manual flags.
The first one was the snowdown. Power was out and the tipping bucket was unheated despite large amounts of snow. Hopefully the clog routine will replace most of these with U’s.
This one should be changed to M because the tips were caused by the removal of a wasp nest bumping the bucket, not rain. I’ll add it to manual flags.
| Date | Time | `.event` flag | `.event` flag | `.data` value | `.data` value |
|---|---|---|---|---|---|
| 4/29/19 | 18:50 | Q | Q | 6.1 | 6.1 |
| 4/29/19 | 18:55 | Q | Q | 0 | 0 |
| 4/29/19 | 19:00 | Q | Q | 0 |  |
12/12-12/13/21 was a malfunctioning heater during a snow storm. Hopefully that will mostly be replaced by U’s.
On 12/24-12/27/21 the funnel was overtopped by snow that accumulated on the table mount. Hopefully this will also be replaced by U’s.
That leaves the following:
| Date | Time | `.event` flag | `.event` flag | `.data` value | `.data` value |
|---|---|---|---|---|---|
| 8/9/22 | 17:55 | Q | Q | 8.64 | 8.64 |
| 5/16/23 | 14:40 | Q | Q | 8.38 | 8.38 |
| 8/17/24 | 15:05 | Q | Q | 7.91 | 7.91 |
| 9/30/20 | 13:55 | Q | Q | 6.1 | 6.1 |
| 1/28/21 | 12:40 | Q | Q | 4.83 | 4.83 |
I think all of these are just high precipitation rates from heavy rainfall. All but one could easily be spring squalls or summer thunderstorms. I think this checks out.
H15¶
[401]:
site = 'H15_02'
man = all_flags[site].event['manual_flag'] != ''
qar = all_flags[site].event['QaRule_flag'] != ''
q = all_flags[site].event['prov_flag'].str.contains('Q')
all_flags[site].event[q & ~qar & ~man]
[401]:
| Date | prov_flag | tank_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation |
|---|---|---|---|---|---|---|---|
GSM_02¶
[402]:
site = 'GSM_02'
man = all_flags[site].event['manual_flag'] != ''
qar = all_flags[site].event['QaRule_flag'] != ''
q = all_flags[site].event['prov_flag'].str.contains('Q')
all_flags[site].event[q & ~qar & ~man]
[402]:
| Date | prov_flag | tank_flag | QaRule_flag | manual_flag | final_flag | event_code | explanation |
|---|---|---|---|---|---|---|---|
OK, I think all of the Q values that will be passed on from provisional to final data make sense:
Q flags have been removed when event_code == CLOG, allowing only U, C, or no flag
Q flags on the tank will be passed on to precip, ensuring that manual flags from provisional are present in final when they aren’t superseded by a more specific flag such as U or C.
Q flags are retained when there is precip but fewer than three tank values in a water year, since this is too few values for a trend in tank levels to be clear
A few issues have been addressed, and the current state seems great.
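The first two rules above can be sketched as a per-row decision. This is only an illustration under stated assumptions: `resolve_q` is a hypothetical function, U and C are assumed to come from the QaRule flag column, flags are assumed to be strings of single-letter codes, and manual flags and the water-year rule are left out.

```python
SPECIFIC_FLAGS = ("U", "C")  # assumed to be the more specific flags


def resolve_q(prov_flag, tank_flag, qa_rule_flag, event_code):
    """Decide whether a row ends up with U, C, Q, or no flag."""
    # A more specific QA flag takes precedence over Q.
    # U is checked before C, which is an arbitrary choice in this sketch.
    for flag in SPECIFIC_FLAGS:
        if flag in qa_rule_flag:
            return flag
    # During a clog event Q is dropped, leaving U, C, or nothing.
    if event_code == "CLOG":
        return ""
    # Otherwise a Q on either the precip or the tank is passed on.
    if "Q" in prov_flag or "Q" in tank_flag:
        return "Q"
    return ""


print(resolve_q("Q", "", "", ""))       # provisional Q kept
print(resolve_q("", "Q", "", ""))       # tank Q passed on to precip
print(resolve_q("Q", "", "U", ""))      # superseded by U
print(resolve_q("Q", "", "", "CLOG"))   # Q removed during a clog
```

Writing the precedence as an ordered series of checks keeps the rule order explicit, so a later rule can be added without rewriting the earlier ones.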
J and R Flags¶
J flags are for jumps in the tank value. Unfortunately the threshold for these is set quite low, so they show up all over the place. I wouldn’t automatically give them a Q, because that would overflag by a lot.
R is the flip side: it marks a tank drop, or the recharge following a tank drop. It’s not consistent enough to use on its own, and it should be superseded by the drain and recharge flagging routines already in place.
So neither of these flags will be placed in the final dataset.
W Flags¶
W flags are only used to signal that simple_pre.m is switching states between NOT_RAINING and RAINING (or Winter). There is no benefit from passing this flag on to final data and it would not be meaningful to the user.
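Dropping J, R, and W before building the final flag could look like the sketch below. It assumes provisional flags are strings of single-letter codes, which the `str.contains('Q')` filters above suggest; `strip_dropped_flags` is a hypothetical helper, not part of post_gce_qc.

```python
DROPPED_FLAGS = "JRW"  # flags that are not carried into the final dataset


def strip_dropped_flags(flag):
    """Remove J, R, and W codes from a flag string, keeping the rest in order."""
    return "".join(code for code in flag if code not in DROPPED_FLAGS)


# Illustrative provisional flag strings (made-up combinations).
examples = ["Q", "QJ", "R", "JRW", ""]
cleaned = [strip_dropped_flags(flag) for flag in examples]
print(cleaned)
```

On a pandas flag column the same function could be applied with `Series.map`.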
[ ]: