Temporal masking

deterministic probabilistic

The computation of many metrics is based on a time step by time step evaluation, after which a temporal summary statistic is calculated across the study period. If metrics are required for several sub-periods of a given study period, and these sub-periods overlap, computing the metrics on each sub-period separately would repeat the same time step by time step computations several times. To avoid these redundant computations, evalhyd offers a temporal masking functionality where masks are applied to the given study period to generate temporal subsets (i.e. sub-periods). This way, the time step by time step computations are performed only once, and the temporal summary statistic is then computed for each sub-period from the relevant time steps.
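The idea can be sketched in plain Python, independently of evalhyd. This is a minimal illustration rather than the library's implementation: the mean squared error is chosen only because its per-time-step part is easy to isolate, and the series values, mask names, and the helper `masked_mean` are all assumptions made for the example.

```python
# Small illustrative series (values are made up for this sketch).
q_obs = [351, 367, 377, 378, 330, 324]
q_prd = [312, 335, 358, 342, 327, 327]

# Two overlapping sub-periods expressed as boolean masks (hypothetical).
masks = {
    "first_four": [True, True, True, True, False, False],
    "last_four": [False, False, True, True, True, True],
}

# Step 1: the time step by time step computation, performed only once.
squared_errors = [(o - p) ** 2 for o, p in zip(q_obs, q_prd)]


def masked_mean(values, mask):
    """Average the values at the time steps where the mask is True."""
    kept = [v for v, m in zip(values, mask) if m]
    return sum(kept) / len(kept)


# Step 2: the temporal summary statistic, computed for each sub-period
# by reusing the squared errors from step 1.
mse = {name: masked_mean(squared_errors, mask) for name, mask in masks.items()}
```

The squared errors at indices 2 and 3 belong to both sub-periods but are computed a single time.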

A temporal mask consists of a vector of boolean values of the same length as the study period, where True (or 1) values indicate time steps to include in the temporal subset, and False (or 0) values indicate time steps to exclude from it.
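As a hedged sketch of how such a vector might be built, the snippet below derives a mask from a date range using only the Python standard library. The daily time axis, the dates, and the variable names are assumptions made for illustration and are not part of evalhyd.

```python
from datetime import date, timedelta

# Hypothetical daily study period of six time steps (dates are illustrative).
start = date(2020, 1, 1)
times = [start + timedelta(days=i) for i in range(6)]

# Sub-period of interest: 2 to 4 January 2020, bounds included.
first_day, last_day = date(2020, 1, 2), date(2020, 1, 4)

# One boolean per time step: True to include it, False to exclude it.
mask = [first_day <= t <= last_day for t in times]
```

The resulting list has one entry per time step of the study period, which is the property a temporal mask must satisfy.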

To illustrate, let’s look at the simple dataset provided below.

time index       0           1           2           3           4           5
observations     351         367         377         378         330         324
predictions      312         335         358         342         327         327
mask             True        True        False       True        False       True

In this example, the mask restricts the computation of the desired metrics to the time steps at indices 0, 1, 3, and 5 of both the observations and the predictions.

As a result, the following two calls to evalhyd yield the same result (shown first in Python, then in R):

>>> import evalhyd
>>> import numpy as np
>>> (
...     evalhyd.evald(
...         q_obs=np.array([351, 367, 377, 378, 330, 324]),
...         q_prd=np.array([312, 335, 358, 342, 327, 327]),
...         metrics=["NSE"],
...         t_msk=np.array([[[True, True, False, True, False, True]]])
...     )
...     == evalhyd.evald(
...         q_obs=np.array([351, 367, 378, 324]),
...         q_prd=np.array([312, 335, 342, 327]),
...         metrics=["NSE"]
...     )
... )
True

> all(
+     evalhyd::evald(
+         q_obs = c(351, 367, 377, 378, 330, 324),
+         q_prd = c(312, 335, 358, 342, 327, 327),
+         metrics = c("NSE"),
+         t_msk = array(c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE), dim = c(1, 1, 6))
+     )[[1]]
+     == evalhyd::evald(
+         q_obs = c(351, 367, 378, 324),
+         q_prd = c(312, 335, 342, 327),
+         metrics = c("NSE")
+     )[[1]]
+ )
[1] TRUE
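The same equivalence can be checked without evalhyd. The sketch below assumes the standard definition of the Nash-Sutcliffe efficiency (one minus the ratio of the sum of squared errors to the sum of squared deviations of the observations from their mean); the function `nse` is a hypothetical helper written for this illustration, not evalhyd's implementation.

```python
def nse(q_obs, q_prd, mask=None):
    """Nash-Sutcliffe efficiency over the time steps kept by the mask."""
    if mask is None:
        # No mask given: keep every time step.
        mask = [True] * len(q_obs)
    obs = [o for o, m in zip(q_obs, mask) if m]
    prd = [p for p, m in zip(q_prd, mask) if m]
    mean_obs = sum(obs) / len(obs)
    sum_sq_err = sum((o - p) ** 2 for o, p in zip(obs, prd))
    sum_sq_dev = sum((o - mean_obs) ** 2 for o in obs)
    return 1 - sum_sq_err / sum_sq_dev


q_obs = [351, 367, 377, 378, 330, 324]
q_prd = [312, 335, 358, 342, 327, 327]
mask = [True, True, False, True, False, True]

# Masking the full series ...
masked = nse(q_obs, q_prd, mask)
# ... is equivalent to evaluating the manually subset series.
subset = nse([351, 367, 378, 324], [312, 335, 342, 327])
```

Both calls operate on the time steps at indices 0, 1, 3, and 5, so `masked` and `subset` hold the same value.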

Tip

If temporal masks cannot be easily generated by the user, conditional masking may be an easier alternative.