GSM Parameters¶
Test parameters to use for GSM (Gauging Station Mack) QA
[1]:
import pandas as pd
import matplotlib.pyplot as plt
# Jupyter magic to make plots display interactively;
# requires ipympl (IPython-matplotlib) and nodejs
from ipywidgets.embed import embed_minimal_html
%matplotlib widget
import sys
sys.path.append("../")
from post_gce_qc import qaqc, data_transfer, cross_probe_qc, main
[2]:
prov = data_transfer.LoadProvisionalData(file_n='../config_new.yaml', strtyr=2019, endyr=2025, fname_base='MS00413_PPT_L1_5min_')
prov.load_ppt_data()
df = prov.pivot_on_probe(prov.df, 'GSM', '02')
[3]:
df.ACC.plot(grid=True)
[3]:
<Axes: xlabel='Date'>
[4]:
qa = qaqc.QaRules(df, {'precision':0.2})
Double Precip¶
[5]:
qa.flag_double_precip(min_ppt_nprecision=2, flat_tank_nprecision=1, flat_ppt_nprecision=1)
[6]:
qa.qa_events.duplicate[qa.qa_events.duplicate==True]
[6]:
Series([], Name: duplicate, dtype: bool[pyarrow])
Let’s check for empty tanks, since they are the most likely moments to have doubled precip amounts.
[7]:
plt.figure()
df.INST.plot(grid=True)
[7]:
<Axes: xlabel='Date'>
The tank level never gets anywhere near 0. We can assume that this check would work if there were any logger reboots, such as after a power loss or a new program install.
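The same check can be done numerically instead of by eye. This is a minimal sketch on synthetic values standing in for `df.INST`; the threshold `near_zero_mm` is a hypothetical choice, not a parameter of `post_gce_qc`.

```python
import pandas as pd

# Synthetic 5-minute tank levels (mm); a stand-in for df.INST
idx = pd.date_range("2022-11-22 14:10", periods=6, freq="5min")
inst = pd.Series([254.2, 254.9, 255.7, 45.17, 45.9, 46.27], index=idx)

near_zero_mm = 1.0  # hypothetical threshold for "tank effectively empty"

min_level = inst.min()
near_zero = inst[inst <= near_zero_mm]

print(f"minimum tank level: {min_level} mm")
print(f"readings at or below {near_zero_mm} mm: {len(near_zero)}")
```

If `near_zero` comes back empty, there is no empty-tank moment at which a doubled precip value could have been produced.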
Empty Tank¶
[8]:
qa.flag_empty_tank(pause_nsteps=2)
[9]:
qa.df_orig[qa.qa_events.tank_empty==True]
[9]:
| Date | INST | INST_Flag | TOT | TOT_Flag | ACC | ACC_Flag |
|---|---|---|---|---|---|---|
That makes sense: the tank was always above zero.
Drain Recharge¶
[10]:
# use defaults
qa.drain_recharge_flagging_wrap()
[11]:
pd.options.display.min_rows = 30
qa.df_orig[qa.qa_events.drain_event==True]
[11]:
| Date | INST | INST_Flag | TOT | TOT_Flag | ACC | ACC_Flag |
|---|---|---|---|---|---|---|
| 2018-10-23 10:10:00 | 28.16 | R | 0.0 | R | 51.200001 | R |
| 2018-10-23 10:15:00 | 28.16 | <NA> | 0.0 | <NA> | 51.200001 | <NA> |
| 2018-10-23 10:20:00 | 28.16 | <NA> | 0.0 | <NA> | 51.200001 | <NA> |
| 2018-11-14 13:05:00 | 39.32 | R | 0.0 | R | 204.440002 | R |
| 2018-11-14 13:10:00 | 39.32 | <NA> | 0.0 | <NA> | 204.440002 | <NA> |
| 2018-11-14 13:15:00 | 39.32 | <NA> | 0.0 | <NA> | 204.440002 | <NA> |
| 2018-12-05 08:50:00 | 35.669998 | R | 0.0 | R | 417.619995 | R |
| 2018-12-05 08:55:00 | 35.669998 | <NA> | 0.0 | <NA> | 417.619995 | <NA> |
| 2018-12-05 09:00:00 | 35.669998 | <NA> | 0.0 | <NA> | 417.619995 | <NA> |
| 2018-12-28 14:55:00 | 180.399994 | R | 0.0 | R | 756.450012 | R |
| 2018-12-28 15:00:00 | 50.110001 | R | 0.0 | R | 756.450012 | R |
| 2018-12-28 15:05:00 | 50.110001 | <NA> | 0.0 | R | 756.450012 | R |
| 2018-12-28 15:10:00 | 50.110001 | <NA> | 0.0 | <NA> | 756.450012 | <NA> |
| 2019-01-16 10:05:00 | 38.59 | R | 0.0 | R | 847.340027 | R |
| 2019-01-16 10:10:00 | 38.779999 | <NA> | 0.19 | <NA> | 847.530029 | <NA> |
| ... | ... | ... | ... | ... | ... | ... |
| 2025-01-07 11:55:00 | 48.830002 | <NA> | 0.0 | <NA> | 1371.97998 | <NA> |
| 2025-02-19 15:40:00 | 160.199997 | R | 0.0 | R | 1663.0 | R |
| 2025-02-19 15:45:00 | 30.360001 | R | 0.0 | R | 1663.0 | R |
| 2025-02-19 15:50:00 | 30.540001 | <NA> | 0.0 | R | 1663.0 | R |
| 2025-02-19 15:55:00 | 30.540001 | <NA> | 0.0 | <NA> | 1663.0 | <NA> |
| 2025-03-12 12:50:00 | 133.0 | R | 0.0 | R | 1889.560059 | R |
| 2025-03-12 12:55:00 | 29.08 | R | 0.0 | R | 1889.560059 | R |
| 2025-03-12 13:00:00 | 29.08 | <NA> | 0.0 | R | 1889.560059 | R |
| 2025-03-12 13:05:00 | 29.26 | <NA> | 0.18 | <NA> | 1889.73999 | <NA> |
| 2025-04-01 09:15:00 | 244.5 | R | 0.0 | R | 2346.780029 | R |
| 2025-04-01 09:20:00 | 28.530001 | R | 0.0 | R | 2346.780029 | R |
| 2025-04-01 09:25:00 | 28.530001 | <NA> | 0.0 | R | 2346.780029 | R |
| 2025-04-01 09:30:00 | 28.530001 | <NA> | 0.0 | <NA> | 2346.780029 | <NA> |
| 2025-06-17 13:40:00 | 178.399994 | <NA> | 0.0 | <NA> | 2497.25 | <NA> |
| 2025-06-17 13:45:00 | 178.399994 | <NA> | 0.0 | <NA> | 2497.25 | <NA> |
251 rows × 6 columns
[12]:
qa.df_orig[(qa.qa_events.drain_event==True)&(qa.df_orig.TOT > 0)]
[12]:
| Date | INST | INST_Flag | TOT | TOT_Flag | ACC | ACC_Flag |
|---|---|---|---|---|---|---|
| 2019-01-16 10:10:00 | 38.779999 | <NA> | 0.19 | <NA> | 847.530029 | <NA> |
| 2019-09-25 09:05:00 | 45.529999 | <NA> | 0.18 | <NA> | 2011.949951 | <NA> |
| 2020-03-31 11:55:00 | 56.900002 | <NA> | 0.19 | <NA> | 1434.060059 | <NA> |
| 2020-06-23 14:05:00 | 55.400002 | <NA> | 0.19 | <NA> | 1878.469971 | <NA> |
| 2020-11-18 11:55:00 | 55.77 | <NA> | 0.18 | <NA> | 428.73999 | <NA> |
| 2021-09-28 08:20:00 | 34.560001 | <NA> | 0.18 | <NA> | 2196.310059 | <NA> |
| 2021-12-19 16:45:00 | 41.880001 | <NA> | 0.18 | <NA> | 745.039978 | <NA> |
| 2021-12-19 16:50:00 | 42.060001 | <NA> | 0.18 | <NA> | 745.219971 | <NA> |
| 2022-01-13 14:55:00 | 48.830002 | <NA> | 0.18 | <NA> | 1176.140015 | <NA> |
| 2022-11-22 14:30:00 | 45.900002 | <NA> | 0.73 | <NA> | 368.619995 | <NA> |
| 2022-11-22 14:35:00 | 46.27 | <NA> | 0.37 | <NA> | 368.98999 | <NA> |
| 2023-03-29 17:40:00 | 34.189999 | <NA> | 0.18 | <NA> | 1404.469971 | <NA> |
| 2023-12-05 14:55:00 | 31.639999 | <NA> | 0.18 | <NA> | 741.25 | <NA> |
| 2024-01-18 15:55:00 | 27.620001 | <NA> | 0.19 | <NA> | 1358.599976 | <NA> |
| 2025-03-12 13:05:00 | 29.26 | <NA> | 0.18 | <NA> | 1889.73999 | <NA> |
Those are all marginal amounts of precip. Let’s take a closer look at the largest one, 0.73 mm on 2022-11-22 14:30.
[13]:
strt, end = pd.to_datetime('11/22/22 1330'), pd.to_datetime('11/22/22 1530')
qa.df_orig.loc[strt:end]
[13]:
| Date | INST | INST_Flag | TOT | TOT_Flag | ACC | ACC_Flag |
|---|---|---|---|---|---|---|
| 2022-11-22 13:30:00 | 249.600006 | <NA> | 0.5 | <NA> | 361.790009 | <NA> |
| 2022-11-22 13:35:00 | 249.800003 | <NA> | 0.2 | <NA> | 361.98999 | <NA> |
| 2022-11-22 13:40:00 | 250.199997 | <NA> | 0.4 | <NA> | 362.390015 | <NA> |
| 2022-11-22 13:45:00 | 250.699997 | <NA> | 0.5 | <NA> | 362.890015 | <NA> |
| 2022-11-22 13:50:00 | 251.5 | <NA> | 0.8 | <NA> | 363.690002 | <NA> |
| 2022-11-22 13:55:00 | 251.800003 | <NA> | 0.3 | <NA> | 363.98999 | <NA> |
| 2022-11-22 14:00:00 | 252.899994 | <NA> | 1.1 | <NA> | 365.089996 | <NA> |
| 2022-11-22 14:05:00 | 253.699997 | <NA> | 0.8 | <NA> | 365.890015 | <NA> |
| 2022-11-22 14:10:00 | 254.199997 | <NA> | 0.5 | <NA> | 366.390015 | <NA> |
| 2022-11-22 14:15:00 | 254.899994 | <NA> | 0.7 | <NA> | 367.089996 | <NA> |
| 2022-11-22 14:20:00 | 255.699997 | <NA> | 0.8 | <NA> | 367.890015 | <NA> |
| 2022-11-22 14:25:00 | 45.169998 | R | 0.0 | R | 367.890015 | R |
| 2022-11-22 14:30:00 | 45.900002 | <NA> | 0.73 | <NA> | 368.619995 | <NA> |
| 2022-11-22 14:35:00 | 46.27 | <NA> | 0.37 | <NA> | 368.98999 | <NA> |
| 2022-11-22 14:40:00 | 46.82 | <NA> | 0.55 | <NA> | 369.540009 | <NA> |
| 2022-11-22 14:45:00 | 47.369999 | <NA> | 0.55 | <NA> | 370.089996 | <NA> |
| 2022-11-22 14:50:00 | 47.73 | <NA> | 0.36 | <NA> | 370.450012 | <NA> |
| 2022-11-22 14:55:00 | 48.279999 | <NA> | 0.55 | <NA> | 371.0 | <NA> |
| 2022-11-22 15:00:00 | 48.650002 | <NA> | 0.37 | <NA> | 371.369995 | <NA> |
| 2022-11-22 15:05:00 | 48.830002 | <NA> | 0.18 | <NA> | 371.549988 | <NA> |
| 2022-11-22 15:10:00 | 49.380001 | <NA> | 0.55 | <NA> | 372.100006 | <NA> |
| 2022-11-22 15:15:00 | 49.380001 | <NA> | 0.0 | <NA> | 372.100006 | <NA> |
| 2022-11-22 15:20:00 | 49.560001 | <NA> | 0.18 | <NA> | 372.279999 | <NA> |
| 2022-11-22 15:25:00 | 49.560001 | <NA> | 0.0 | <NA> | 372.279999 | <NA> |
| 2022-11-22 15:30:00 | 49.560001 | <NA> | 0.0 | <NA> | 372.279999 | <NA> |
Ok, that doesn’t look like recharge. Let’s see how it was flagged, and focus on precip that was flagged E as missing or flagged Q.
[14]:
qa.qa_flags[(qa.qa_events.drain_event==True)&(qa.df_orig.TOT > 0)]
[14]:
| Date | Q | U | C | SetNA | Set0 | E |
|---|---|---|---|---|---|---|
| 2019-01-16 10:10:00 | True | False | False | False | False | False |
| 2019-09-25 09:05:00 | True | False | False | False | False | False |
| 2020-03-31 11:55:00 | False | False | False | False | False | False |
| 2020-06-23 14:05:00 | True | False | False | False | False | False |
| 2020-11-18 11:55:00 | True | False | False | False | False | False |
| 2021-09-28 08:20:00 | False | False | False | False | False | False |
| 2021-12-19 16:45:00 | False | False | False | False | False | False |
| 2021-12-19 16:50:00 | False | False | False | False | False | False |
| 2022-01-13 14:55:00 | False | False | False | False | False | False |
| 2022-11-22 14:30:00 | False | False | False | False | False | False |
| 2022-11-22 14:35:00 | False | False | False | False | False | False |
| 2023-03-29 17:40:00 | True | False | False | False | False | False |
| 2023-12-05 14:55:00 | False | False | False | False | False | False |
| 2024-01-18 15:55:00 | False | False | False | False | False | False |
| 2025-03-12 13:05:00 | False | False | False | False | False | False |
Good, 11/22/22 wasn’t flagged. We’ll test the 5 that got a Q.
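Each flagged timestamp is inspected with the same kind of two-hour window. A small helper can build that window in one place. In this sketch `event_window` is a hypothetical name (not part of `post_gce_qc`) and the frame is synthetic, standing in for `qa.df_orig`.

```python
import pandas as pd

def event_window(frame, when, before="1h", after="1h"):
    """Return the rows of `frame` within [when - before, when + after]."""
    when = pd.Timestamp(when)
    start = when - pd.Timedelta(before)
    end = when + pd.Timedelta(after)
    # Label slicing on a sorted DatetimeIndex includes both endpoints
    return frame.loc[start:end]

# Synthetic 5-minute record standing in for qa.df_orig
idx = pd.date_range("2019-01-16 08:00", "2019-01-16 12:00", freq="5min")
demo = pd.DataFrame({"TOT": 0.0}, index=idx)
demo.loc[pd.Timestamp("2019-01-16 10:10"), "TOT"] = 0.19

win = event_window(demo, "2019-01-16 10:10")
print(win.index[0], win.index[-1], len(win))
```

Looping over the flagged timestamps and calling the helper for each one avoids retyping the start and end times by hand.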
[15]:
strt, end = pd.to_datetime('1/16/19 0915'), pd.to_datetime('1/16/19 1115')
qa.df_orig.loc[strt:end]
[15]:
| Date | INST | INST_Flag | TOT | TOT_Flag | ACC | ACC_Flag |
|---|---|---|---|---|---|---|
| 2019-01-16 09:15:00 | 141.0 | <NA> | 0.0 | <NA> | 847.340027 | <NA> |
| 2019-01-16 09:20:00 | 141.0 | <NA> | 0.0 | <NA> | 847.340027 | <NA> |
| 2019-01-16 09:25:00 | 141.0 | <NA> | 0.0 | <NA> | 847.340027 | <NA> |
| 2019-01-16 09:30:00 | 141.0 | <NA> | 0.0 | <NA> | 847.340027 | <NA> |
| 2019-01-16 09:35:00 | 141.0 | <NA> | 0.0 | <NA> | 847.340027 | <NA> |
| 2019-01-16 09:40:00 | 141.0 | <NA> | 0.0 | <NA> | 847.340027 | <NA> |
| 2019-01-16 09:45:00 | 141.0 | <NA> | 0.0 | <NA> | 847.340027 | <NA> |
| 2019-01-16 09:50:00 | 141.0 | <NA> | 0.0 | <NA> | 847.340027 | <NA> |
| 2019-01-16 09:55:00 | 141.0 | <NA> | 0.0 | <NA> | 847.340027 | <NA> |
| 2019-01-16 10:00:00 | 141.0 | <NA> | 0.0 | <NA> | 847.340027 | <NA> |
| 2019-01-16 10:05:00 | 38.59 | R | 0.0 | R | 847.340027 | R |
| 2019-01-16 10:10:00 | 38.779999 | <NA> | 0.19 | <NA> | 847.530029 | <NA> |
| 2019-01-16 10:15:00 | 38.779999 | <NA> | 0.0 | <NA> | 847.530029 | <NA> |
| 2019-01-16 10:20:00 | 38.77 | <NA> | 0.0 | <NA> | 847.530029 | <NA> |
| 2019-01-16 10:25:00 | 38.77 | <NA> | 0.0 | <NA> | 847.530029 | <NA> |
| 2019-01-16 10:30:00 | 38.77 | <NA> | 0.0 | <NA> | 847.530029 | <NA> |
| 2019-01-16 10:35:00 | 38.77 | <NA> | 0.0 | <NA> | 847.530029 | <NA> |
| 2019-01-16 10:40:00 | 38.77 | <NA> | 0.0 | <NA> | 847.530029 | <NA> |
| 2019-01-16 10:45:00 | 38.77 | <NA> | 0.0 | <NA> | 847.530029 | <NA> |
| 2019-01-16 10:50:00 | 38.77 | <NA> | 0.0 | <NA> | 847.530029 | <NA> |
| 2019-01-16 10:55:00 | 38.77 | <NA> | 0.0 | <NA> | 847.530029 | <NA> |
| 2019-01-16 11:00:00 | 38.77 | <NA> | 0.0 | <NA> | 847.530029 | <NA> |
| 2019-01-16 11:05:00 | 38.77 | <NA> | 0.0 | <NA> | 847.530029 | <NA> |
| 2019-01-16 11:10:00 | 38.77 | <NA> | 0.0 | <NA> | 847.530029 | <NA> |
| 2019-01-16 11:15:00 | 38.77 | <NA> | 0.0 | <NA> | 847.530029 | <NA> |
OK, that random rain after the drain definitely seems suspicious. It is possibly bounce-back following the drain.
[16]:
strt = pd.to_datetime('9/25/19 0805')
end = strt + pd.to_timedelta('2h')
qa.df_orig.loc[strt:end]
[16]:
| Date | INST | INST_Flag | TOT | TOT_Flag | ACC | ACC_Flag |
|---|---|---|---|---|---|---|
| 2019-09-25 08:05:00 | 288.5 | <NA> | 0.0 | <NA> | 2011.77002 | <NA> |
| 2019-09-25 08:10:00 | 288.5 | <NA> | 0.0 | <NA> | 2011.77002 | <NA> |
| 2019-09-25 08:15:00 | 288.5 | <NA> | 0.0 | <NA> | 2011.77002 | <NA> |
| 2019-09-25 08:20:00 | 288.5 | <NA> | 0.0 | <NA> | 2011.77002 | <NA> |
| 2019-09-25 08:25:00 | 288.5 | <NA> | 0.0 | <NA> | 2011.77002 | <NA> |
| 2019-09-25 08:30:00 | 288.5 | <NA> | 0.0 | <NA> | 2011.77002 | <NA> |
| 2019-09-25 08:35:00 | 288.5 | <NA> | 0.0 | <NA> | 2011.77002 | <NA> |
| 2019-09-25 08:40:00 | 288.5 | <NA> | 0.0 | <NA> | 2011.77002 | <NA> |
| 2019-09-25 08:45:00 | 288.5 | <NA> | 0.0 | <NA> | 2011.77002 | <NA> |
| 2019-09-25 08:50:00 | 288.5 | <NA> | 0.0 | <NA> | 2011.77002 | <NA> |
| 2019-09-25 08:55:00 | 45.349998 | R | 0.0 | R | 2011.77002 | R |
| 2019-09-25 09:00:00 | 45.349998 | <NA> | 0.0 | <NA> | 2011.77002 | <NA> |
| 2019-09-25 09:05:00 | 45.529999 | <NA> | 0.18 | <NA> | 2011.949951 | <NA> |
| 2019-09-25 09:10:00 | 45.529999 | <NA> | 0.0 | <NA> | 2011.949951 | <NA> |
| 2019-09-25 09:15:00 | 45.529999 | <NA> | 0.0 | <NA> | 2011.949951 | <NA> |
| 2019-09-25 09:20:00 | 45.529999 | <NA> | 0.0 | <NA> | 2011.949951 | <NA> |
| 2019-09-25 09:25:00 | 45.529999 | <NA> | 0.0 | <NA> | 2011.949951 | <NA> |
| 2019-09-25 09:30:00 | 45.529999 | <NA> | 0.0 | <NA> | 2011.949951 | <NA> |
| 2019-09-25 09:35:00 | 45.529999 | <NA> | 0.0 | <NA> | 2011.949951 | <NA> |
| 2019-09-25 09:40:00 | 45.529999 | <NA> | 0.0 | <NA> | 2011.949951 | <NA> |
| 2019-09-25 09:45:00 | 45.529999 | <NA> | 0.0 | <NA> | 2011.949951 | <NA> |
| 2019-09-25 09:50:00 | 45.529999 | <NA> | 0.0 | <NA> | 2011.949951 | <NA> |
| 2019-09-25 09:55:00 | 45.529999 | <NA> | 0.0 | <NA> | 2011.949951 | <NA> |
| 2019-09-25 10:00:00 | 45.529999 | <NA> | 0.0 | <NA> | 2011.949951 | <NA> |
| 2019-09-25 10:05:00 | 45.529999 | <NA> | 0.0 | <NA> | 2011.949951 | <NA> |
Lone precip right after a drain. Definitely suspicious.
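The pattern in these windows (a sharp tank drop, then a single small precip value with zeros on both sides) can be written as a boolean mask. This is a sketch under stated assumptions: the helper name, `drop_mm`, and `lookahead` are illustrative choices, and this is not how `drain_recharge_flagging_wrap` works internally. The values are rounded from the 2019-09-25 window.

```python
import pandas as pd

def lone_precip_after_drain(inst, tot, drop_mm=50.0, lookahead=3):
    """Mark isolated precip values within `lookahead` steps after a large tank drop."""
    # A drain shows up as a large negative step in tank level
    drained = (inst.diff() <= -drop_mm).astype(int)
    # True for the `lookahead` steps that follow each drain step
    recently_drained = (
        drained.shift(1, fill_value=0)
        .rolling(lookahead, min_periods=1)
        .max()
        .astype(bool)
    )
    # Isolated precip: a positive value with zero precip on both sides
    isolated = (tot > 0) & (tot.shift(1) == 0) & (tot.shift(-1) == 0)
    return recently_drained & isolated

idx = pd.date_range("2019-09-25 08:45", periods=7, freq="5min")
inst = pd.Series([288.5, 288.5, 45.35, 45.35, 45.53, 45.53, 45.53], index=idx)
tot = pd.Series([0.0, 0.0, 0.0, 0.0, 0.18, 0.0, 0.0], index=idx)

mask = lone_precip_after_drain(inst, tot)
print(tot[mask])
```

Precip that continues after the drain, as on 2022-11-22, fails the isolation test and is left alone by this mask.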
[17]:
strt = pd.to_datetime('6/23/20 1305')
end = strt + pd.to_timedelta('2h')
qa.df_orig.loc[strt:end]
[17]:
| Date | INST | INST_Flag | TOT | TOT_Flag | ACC | ACC_Flag |
|---|---|---|---|---|---|---|
| 2020-06-23 13:05:00 | 247.699997 | <NA> | 0.0 | <NA> | 1878.280029 | <NA> |
| 2020-06-23 13:10:00 | 247.699997 | <NA> | 0.0 | <NA> | 1878.280029 | <NA> |
| 2020-06-23 13:15:00 | 247.699997 | <NA> | 0.0 | <NA> | 1878.280029 | <NA> |
| 2020-06-23 13:20:00 | 247.699997 | <NA> | 0.0 | <NA> | 1878.280029 | <NA> |
| 2020-06-23 13:25:00 | 247.699997 | <NA> | 0.0 | <NA> | 1878.280029 | <NA> |
| 2020-06-23 13:30:00 | 247.5 | <NA> | 0.0 | <NA> | 1878.280029 | <NA> |
| 2020-06-23 13:35:00 | 247.699997 | <NA> | 0.0 | <NA> | 1878.280029 | <NA> |
| 2020-06-23 13:40:00 | 247.699997 | <NA> | 0.0 | <NA> | 1878.280029 | <NA> |
| 2020-06-23 13:45:00 | 247.699997 | <NA> | 0.0 | <NA> | 1878.280029 | <NA> |
| 2020-06-23 13:50:00 | 247.699997 | <NA> | 0.0 | <NA> | 1878.280029 | <NA> |
| 2020-06-23 13:55:00 | 55.209999 | R | 0.0 | R | 1878.280029 | R |
| 2020-06-23 14:00:00 | 55.209999 | <NA> | 0.0 | <NA> | 1878.280029 | <NA> |
| 2020-06-23 14:05:00 | 55.400002 | <NA> | 0.19 | <NA> | 1878.469971 | <NA> |
| 2020-06-23 14:10:00 | 55.400002 | <NA> | 0.0 | <NA> | 1878.469971 | <NA> |
| 2020-06-23 14:15:00 | 55.400002 | <NA> | 0.0 | <NA> | 1878.469971 | <NA> |
| 2020-06-23 14:20:00 | 55.400002 | <NA> | 0.0 | <NA> | 1878.469971 | <NA> |
| 2020-06-23 14:25:00 | 55.400002 | <NA> | 0.0 | <NA> | 1878.469971 | <NA> |
| 2020-06-23 14:30:00 | 55.400002 | <NA> | 0.0 | <NA> | 1878.469971 | <NA> |
| 2020-06-23 14:35:00 | 55.400002 | <NA> | 0.0 | <NA> | 1878.469971 | <NA> |
| 2020-06-23 14:40:00 | 55.400002 | <NA> | 0.0 | <NA> | 1878.469971 | <NA> |
| 2020-06-23 14:45:00 | 55.400002 | <NA> | 0.0 | <NA> | 1878.469971 | <NA> |
| 2020-06-23 14:50:00 | 55.400002 | <NA> | 0.0 | <NA> | 1878.469971 | <NA> |
| 2020-06-23 14:55:00 | 55.389999 | <NA> | 0.0 | <NA> | 1878.469971 | <NA> |
| 2020-06-23 15:00:00 | 55.389999 | <NA> | 0.0 | <NA> | 1878.469971 | <NA> |
| 2020-06-23 15:05:00 | 55.389999 | <NA> | 0.0 | <NA> | 1878.469971 | <NA> |
[18]:
strt = pd.to_datetime('11/18/20 1055')
end = strt + pd.to_timedelta('2h')
qa.df_orig.loc[strt:end]
[18]:
| Date | INST | INST_Flag | TOT | TOT_Flag | ACC | ACC_Flag |
|---|---|---|---|---|---|---|
| 2020-11-18 10:55:00 | 306.299988 | <NA> | 0.0 | <NA> | 427.459991 | <NA> |
| 2020-11-18 11:00:00 | 306.299988 | <NA> | 0.0 | <NA> | 427.459991 | <NA> |
| 2020-11-18 11:05:00 | 306.299988 | <NA> | 0.0 | <NA> | 427.459991 | <NA> |
| 2020-11-18 11:10:00 | 306.5 | <NA> | 0.2 | <NA> | 427.660004 | <NA> |
| 2020-11-18 11:15:00 | 306.600006 | <NA> | 0.1 | <NA> | 427.76001 | <NA> |
| 2020-11-18 11:20:00 | 306.600006 | <NA> | 0.0 | <NA> | 427.76001 | <NA> |
| 2020-11-18 11:25:00 | 306.799988 | <NA> | 0.2 | <NA> | 427.959991 | <NA> |
| 2020-11-18 11:30:00 | 307.0 | <NA> | 0.2 | <NA> | 428.160004 | <NA> |
| 2020-11-18 11:35:00 | 307.200012 | <NA> | 0.2 | <NA> | 428.359985 | <NA> |
| 2020-11-18 11:40:00 | 307.399994 | <NA> | 0.2 | <NA> | 428.559998 | <NA> |
| 2020-11-18 11:45:00 | 55.59 | R | 0.0 | R | 428.559998 | R |
| 2020-11-18 11:50:00 | 55.59 | <NA> | 0.0 | <NA> | 428.559998 | <NA> |
| 2020-11-18 11:55:00 | 55.77 | <NA> | 0.18 | <NA> | 428.73999 | <NA> |
| 2020-11-18 12:00:00 | 55.77 | <NA> | 0.0 | <NA> | 428.73999 | <NA> |
| 2020-11-18 12:05:00 | 55.77 | <NA> | 0.0 | <NA> | 428.73999 | <NA> |
| 2020-11-18 12:10:00 | 55.77 | <NA> | 0.0 | <NA> | 428.73999 | <NA> |
| 2020-11-18 12:15:00 | 55.77 | <NA> | 0.0 | <NA> | 428.73999 | <NA> |
| 2020-11-18 12:20:00 | 55.77 | <NA> | 0.0 | <NA> | 428.73999 | <NA> |
| 2020-11-18 12:25:00 | 55.77 | <NA> | 0.0 | <NA> | 428.73999 | <NA> |
| 2020-11-18 12:30:00 | 55.77 | <NA> | 0.0 | <NA> | 428.73999 | <NA> |
| 2020-11-18 12:35:00 | 55.77 | <NA> | 0.0 | <NA> | 428.73999 | <NA> |
| 2020-11-18 12:40:00 | 55.77 | <NA> | 0.0 | <NA> | 428.73999 | <NA> |
| 2020-11-18 12:45:00 | 55.77 | <NA> | 0.0 | <NA> | 428.73999 | <NA> |
| 2020-11-18 12:50:00 | 55.77 | <NA> | 0.0 | <NA> | 428.73999 | <NA> |
| 2020-11-18 12:55:00 | 55.77 | <NA> | 0.0 | <NA> | 428.73999 | <NA> |
OK, last check: the drain events should be pretty straightforward, and they shouldn’t show up on odd jiggles in the tank data.
[29]:
f = qaqc.ApplyFlags(df.index, precision=0.2)
f.import_provisional_data(df)
f.apply_QaRules_flags(qa.qa_events, qa.qa_flags)
[26]:
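# close earlier figures; a separate loop variable keeps `f` (the ApplyFlags object) intact
for fig_num in range(3, 78):
    plt.close(fig_num)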
[43]:
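# Loop used to review every drain-event day (disabled here to show a single day):
#for day in pd.to_datetime(pd.Series(qa.qa_events.loc[qa.qa_events.drain_event==True,'drain_event'].index)).dt.date.unique():
day = '2025-06-17'  # the day shown in the plot output below
dy = pd.to_datetime(day)
f.plot_flagged_day(dy, 'GSM_02', auto_qa_event=qa.qa_events)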
[43]:
(<Axes: xlabel='Date', ylabel='Precip (mm)'>,
<Axes: title={'center': 'GSM_02 - 2025-06-17 00:00:00'}, xlabel='Date', ylabel='Tank Height (mm)'>)
I checked all 74 graphs. This was the only one that looked mis-flagged as a drain. In every other case the drain event was assigned to an actual drain, and most of the precip was left unflagged, either because there was none or because it was a small amount during a heavy rain.
The problem graph above didn’t flag any data; it only assigned a drain event. Given how steady the tank is before and after, I don’t think I can keep it from being marked as a drain.
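Reviewing one plot per drain day needs the list of distinct dates that carry a drain event. The sketch below does that step on a synthetic boolean series standing in for `qa.qa_events.drain_event`. The plotting call is left out because it depends on `post_gce_qc`.

```python
import pandas as pd

idx = pd.to_datetime([
    "2025-03-12 12:50", "2025-03-12 12:55", "2025-04-01 09:15",
    "2025-04-01 09:20", "2025-06-17 13:40", "2025-06-17 13:45",
])
drain_event = pd.Series([True, True, True, False, True, True], index=idx)

# Keep flagged timestamps, truncate each to midnight, drop repeats
drain_days = drain_event[drain_event].index.normalize().unique()

for day in drain_days:
    print(day.date())
```

Each `day` could then be handed to `f.plot_flagged_day(day, 'GSM_02', auto_qa_event=qa.qa_events)` as in the cell above.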
Repeating Value Precip¶
This seems mostly to be a problem for the LT420/LPRefineMe sensors, but it is an artifact of simple_pre.m whenever there is signal bounce, so let’s make sure we’re catching anything there is to catch.
[32]:
# try defaults
qa.flag_repeating_val_precip()
[33]:
qa.df_orig[qa.qa_events.duplicate==True]
[33]:
| Date | INST | INST_Flag | TOT | TOT_Flag | ACC | ACC_Flag |
|---|---|---|---|---|---|---|
[34]:
# up the tank oscillations to 2x precision
qa.flag_repeating_val_precip(min_ppt_nprecision=0, flat_ppt_nprecision=0, flat_tank_nprecision=2, min_number_repeating=5)
[35]:
qa.df_orig[qa.qa_events.duplicate==True]
[35]:
| Date | INST | INST_Flag | TOT | TOT_Flag | ACC | ACC_Flag |
|---|---|---|---|---|---|---|
| 2019-04-07 12:10:00 | 278.100006 | <NA> | 0.2 | <NA> | 1478.880005 | <NA> |
| 2019-04-07 12:15:00 | 278.299988 | <NA> | 0.2 | <NA> | 1479.079956 | <NA> |
| 2019-04-07 12:20:00 | 278.5 | <NA> | 0.2 | <NA> | 1479.280029 | <NA> |
| 2019-04-07 12:25:00 | 278.700012 | <NA> | 0.2 | <NA> | 1479.47998 | <NA> |
| 2019-04-07 12:30:00 | 278.899994 | <NA> | 0.2 | <NA> | 1479.680054 | <NA> |
| 2019-04-07 23:25:00 | 316.5 | <NA> | 0.2 | <NA> | 1517.280029 | <NA> |
| 2019-04-07 23:30:00 | 316.700012 | <NA> | 0.2 | <NA> | 1517.47998 | <NA> |
| 2019-04-07 23:35:00 | 316.899994 | <NA> | 0.2 | <NA> | 1517.680054 | <NA> |
| 2019-04-07 23:40:00 | 317.100006 | <NA> | 0.2 | <NA> | 1517.880005 | <NA> |
| 2019-04-07 23:45:00 | 317.299988 | <NA> | 0.2 | <NA> | 1518.079956 | <NA> |
| 2020-01-29 16:00:00 | 478.200012 | <NA> | 0.2 | <NA> | 1006.75 | <NA> |
| 2020-01-29 16:05:00 | 478.399994 | <NA> | 0.2 | <NA> | 1006.950012 | <NA> |
| 2020-01-29 16:10:00 | 478.600006 | <NA> | 0.2 | <NA> | 1007.150024 | <NA> |
| 2020-01-29 16:15:00 | 478.799988 | <NA> | 0.2 | <NA> | 1007.349976 | <NA> |
| 2020-01-29 16:20:00 | 479.0 | <NA> | 0.2 | <NA> | 1007.549988 | <NA> |
| 2020-03-29 08:40:00 | 170.300003 | <NA> | 0.2 | <NA> | 1331.369995 | <NA> |
| 2020-03-29 08:45:00 | 170.5 | <NA> | 0.2 | <NA> | 1331.569946 | <NA> |
| 2020-03-29 08:50:00 | 170.699997 | <NA> | 0.2 | <NA> | 1331.77002 | <NA> |
| 2020-03-29 08:55:00 | 170.899994 | <NA> | 0.2 | <NA> | 1331.969971 | <NA> |
| 2020-03-29 09:00:00 | 171.100006 | <NA> | 0.2 | <NA> | 1332.170044 | <NA> |
| 2021-02-15 01:50:00 | 182.100006 | <NA> | 0.2 | <NA> | 1456.77002 | <NA> |
| 2021-02-15 01:55:00 | 182.300003 | <NA> | 0.2 | <NA> | 1456.969971 | <NA> |
| 2021-02-15 02:00:00 | 182.5 | <NA> | 0.2 | <NA> | 1457.170044 | <NA> |
| 2021-02-15 02:05:00 | 182.699997 | <NA> | 0.2 | <NA> | 1457.369995 | <NA> |
| 2021-02-15 02:10:00 | 182.899994 | <NA> | 0.2 | <NA> | 1457.569946 | <NA> |
| 2024-10-17 05:10:00 | 99.099998 | <NA> | 0.2 | <NA> | 19.559999 | <NA> |
| 2024-10-17 05:15:00 | 99.300003 | <NA> | 0.2 | <NA> | 19.76 | <NA> |
| 2024-10-17 05:20:00 | 99.5 | <NA> | 0.2 | <NA> | 19.959999 | <NA> |
| 2024-10-17 05:25:00 | 99.699997 | <NA> | 0.2 | <NA> | 20.16 | <NA> |
| 2024-10-17 05:30:00 | 99.900002 | <NA> | 0.2 | <NA> | 20.360001 | <NA> |
| 2024-11-20 07:30:00 | 326.600006 | <NA> | 0.2 | <NA> | 602.549988 | <NA> |
| 2024-11-20 07:35:00 | 326.799988 | <NA> | 0.2 | <NA> | 602.75 | <NA> |
| 2024-11-20 07:40:00 | 327.0 | <NA> | 0.2 | <NA> | 602.950012 | <NA> |
| 2024-11-20 07:45:00 | 327.200012 | <NA> | 0.2 | <NA> | 603.150024 | <NA> |
| 2024-11-20 07:50:00 | 327.399994 | <NA> | 0.2 | <NA> | 603.349976 | <NA> |
That all looks like real precip. Best to go back to the default parameters.
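Steady rain trips this check because it produces runs of identical precip values, which can be seen by measuring run lengths. The sketch below shows run-length detection on synthetic values. It illustrates the general idea only and is an assumption, not the implementation of `flag_repeating_val_precip`.

```python
import pandas as pd

def repeating_runs(tot, min_number_repeating=5):
    """Mark nonzero values that sit in a run of identical consecutive values."""
    # A new run starts wherever the value differs from the previous step
    run_id = (tot != tot.shift()).cumsum()
    run_len = tot.groupby(run_id).transform("count")
    return (tot > 0) & (run_len >= min_number_repeating)

idx = pd.date_range("2019-04-07 12:05", periods=8, freq="5min")
tot = pd.Series([0.0, 0.2, 0.2, 0.2, 0.2, 0.2, 0.1, 0.0], index=idx)

flagged = repeating_runs(tot)
print(tot[flagged])
```

A tank that rises by exactly one precision step every five minutes yields such a run, so run length alone cannot separate real steady rain from signal bounce.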
Tank Fluctuations¶
[36]:
qa.flag_precip_during_tank_flux(tank_col='INST', ppt_col='TOT', fluxprecision=5, accprecision=5,
zero_precision=2, fluxwindow=289, accwindow='1D', extend_ahead=3)
[37]:
qa.df_orig[qa.qa_events.diurnal_flux==True]
[37]:
| Date | INST | INST_Flag | TOT | TOT_Flag | ACC | ACC_Flag |
|---|---|---|---|---|---|---|
This is what happened with the PAT sensor at H15 too. Let’s try using those parameters.
[38]:
qa.flag_precip_during_tank_flux(tank_col='INST', ppt_col='TOT', fluxprecision=4, accprecision=2,
zero_precision=3, fluxwindow=289, accwindow='1D', extend_ahead=3)
[39]:
qa.df_orig[qa.qa_events.diurnal_flux==True]
[39]:
| Date | INST | INST_Flag | TOT | TOT_Flag | ACC | ACC_Flag |
|---|---|---|---|---|---|---|
| 2022-04-27 00:00:00 | 40.240002 | <NA> | 0.18 | <NA> | 1789.969971 | <NA> |
| 2022-04-27 00:05:00 | 40.419998 | <NA> | 0.18 | <NA> | 1790.150024 | <NA> |
| 2022-04-27 00:15:00 | 40.610001 | <NA> | 0.19 | <NA> | 1790.339966 | <NA> |
[41]:
strt = pd.to_datetime('4/26/22 2300')
end = strt + pd.to_timedelta('2h')
qa.df_orig.loc[strt:end]
[41]:
| Date | INST | INST_Flag | TOT | TOT_Flag | ACC | ACC_Flag |
|---|---|---|---|---|---|---|
| 2022-04-26 23:00:00 | 39.869999 | <NA> | 0.0 | <NA> | 1789.609985 | <NA> |
| 2022-04-26 23:05:00 | 39.869999 | <NA> | 0.0 | <NA> | 1789.609985 | <NA> |
| 2022-04-26 23:10:00 | 39.869999 | <NA> | 0.0 | <NA> | 1789.609985 | <NA> |
| 2022-04-26 23:15:00 | 39.869999 | <NA> | 0.0 | <NA> | 1789.609985 | <NA> |
| 2022-04-26 23:20:00 | 39.869999 | <NA> | 0.0 | <NA> | 1789.609985 | <NA> |
| 2022-04-26 23:25:00 | 39.869999 | <NA> | 0.0 | <NA> | 1789.609985 | <NA> |
| 2022-04-26 23:30:00 | 39.869999 | <NA> | 0.0 | <NA> | 1789.609985 | <NA> |
| 2022-04-26 23:35:00 | 39.869999 | <NA> | 0.0 | <NA> | 1789.609985 | <NA> |
| 2022-04-26 23:40:00 | 40.060001 | <NA> | 0.18 | <NA> | 1789.790039 | <NA> |
| 2022-04-26 23:45:00 | 40.060001 | <NA> | 0.0 | <NA> | 1789.790039 | <NA> |
| 2022-04-26 23:50:00 | 40.060001 | <NA> | 0.0 | <NA> | 1789.790039 | <NA> |
| 2022-04-26 23:55:00 | 40.060001 | <NA> | 0.0 | <NA> | 1789.790039 | <NA> |
| 2022-04-27 00:00:00 | 40.240002 | <NA> | 0.18 | <NA> | 1789.969971 | <NA> |
| 2022-04-27 00:05:00 | 40.419998 | <NA> | 0.18 | <NA> | 1790.150024 | <NA> |
| 2022-04-27 00:10:00 | 40.419998 | <NA> | 0.0 | <NA> | 1790.150024 | <NA> |
| 2022-04-27 00:15:00 | 40.610001 | <NA> | 0.19 | <NA> | 1790.339966 | <NA> |
| 2022-04-27 00:20:00 | 40.610001 | <NA> | 0.0 | <NA> | 1790.339966 | <NA> |
| 2022-04-27 00:25:00 | 40.599998 | <NA> | 0.0 | <NA> | 1790.339966 | <NA> |
| 2022-04-27 00:30:00 | 40.610001 | <NA> | 0.0 | <NA> | 1790.339966 | <NA> |
| 2022-04-27 00:35:00 | 40.790001 | <NA> | 0.18 | <NA> | 1790.52002 | <NA> |
| 2022-04-27 00:40:00 | 40.790001 | <NA> | 0.0 | <NA> | 1790.52002 | <NA> |
| 2022-04-27 00:45:00 | 40.970001 | <NA> | 0.18 | <NA> | 1790.699951 | <NA> |
| 2022-04-27 00:50:00 | 40.970001 | <NA> | 0.0 | <NA> | 1790.699951 | <NA> |
| 2022-04-27 00:55:00 | 40.970001 | <NA> | 0.0 | <NA> | 1790.699951 | <NA> |
| 2022-04-27 01:00:00 | 41.16 | <NA> | 0.19 | <NA> | 1790.890015 | <NA> |
[44]:
plt.figure()
qa.df_orig.loc[strt:end, 'INST'].plot(grid=True)
[44]:
<Axes: xlabel='Date'>
OK, we definitely don’t want to flag that. Let’s use the H15 params, but tweak them to not show this.
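One quick way to confirm that the window above is real accumulation and not a tank oscillation is to compare the net tank rise with the recorded precip. This is a diagnostic sketch, not what `flag_precip_during_tank_flux` computes. The tolerance is assumed to be the 0.2 precision passed to `QaRules`, and the values are rounded from the table above.

```python
import pandas as pd

idx = pd.date_range("2022-04-26 23:35", periods=8, freq="5min")
inst = pd.Series([39.87, 40.06, 40.06, 40.06, 40.06, 40.24, 40.42, 40.42], index=idx)
tot = pd.Series([0.0, 0.18, 0.0, 0.0, 0.0, 0.18, 0.18, 0.0], index=idx)

precision = 0.2  # assumed tolerance, same value as the QaRules precision

net_rise = inst.iloc[-1] - inst.iloc[0]   # change in tank level over the window
total_precip = tot.sum()                  # precip recorded over the same window
never_falls = bool((inst.diff().dropna() >= 0).all())
consistent = abs(net_rise - total_precip) <= precision

print(f"net rise {net_rise:.2f} mm, precip {total_precip:.2f} mm")
print(f"tank never falls: {never_falls}, rise matches precip: {consistent}")
```

A tank that only rises, by about as much as the recorded precip, points to light rain and not to a diurnal swing.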
[51]:
qa.qa_events.diurnal_flux = False
[52]:
qa.flag_precip_during_tank_flux(tank_col='INST', ppt_col='TOT', fluxprecision=4, accprecision=3,
zero_precision=2, fluxwindow=289, accwindow='1D', extend_ahead=3)
qa.df_orig[qa.qa_events.diurnal_flux==True]
[52]:
| Date | INST | INST_Flag | TOT | TOT_Flag | ACC | ACC_Flag |
|---|---|---|---|---|---|---|