Temporal masking
deterministic probabilistic
The computation of many metrics is based on a time step by time step
evaluation, after which a temporal summary statistic is calculated across
the study period. If metrics are required for several sub-periods of a given
study period, and these sub-periods happen to overlap, computing the metrics
on each sub-period separately could repeat the same time step by time step
computations several times. To avoid these redundant computations, evalhyd
offers a temporal masking functionality where a mask is applied to the given
study period to generate temporal subsets (i.e. sub-periods). This way, the
time step by time step computations are performed once, before the temporal
summary statistic is computed for each sub-period from the relevant time
steps.
A temporal mask consists of a vector of boolean values of the same length
as the study period, where True (or 1) values indicate time steps to
include in the temporal subset, and False (or 0) values indicate time
steps to exclude from the temporal subset.
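The principle described above can be sketched in plain Python. This is only an illustration of the idea, not evalhyd's actual implementation: the mean squared error is used as the example metric, the two overlapping masks are hypothetical, and the helper `masked_mean` is introduced here for illustration only.

```python
# Six-step dataset used in this section.
obs = [351, 367, 377, 378, 330, 324]
prd = [312, 335, 358, 342, 327, 327]

# Step 1: time step by time step computation, performed only once.
squared_errors = [(p - o) ** 2 for o, p in zip(obs, prd)]

# Two overlapping sub-periods expressed as boolean masks (hypothetical).
masks = {
    "first_four": [True, True, True, True, False, False],
    "last_four": [False, False, True, True, True, True],
}


def masked_mean(values, mask):
    """Average the values whose mask entry is True (illustrative helper)."""
    kept = [v for v, keep in zip(values, mask) if keep]
    return sum(kept) / len(kept)


# Step 2: temporal summary statistic, computed per sub-period by reusing
# the per-time-step values from step 1.
mse = {name: masked_mean(squared_errors, mask) for name, mask in masks.items()}
print(mse)
```

The squared errors at indices 2 and 3 belong to both sub-periods but are computed a single time, which is the redundancy that masking is meant to avoid.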
To illustrate, let’s look at the simple dataset provided below.
time index    0     1     2      3     4      5
observations  351   367   377    378   330    324
predictions   312   335   358    342   327    327
mask          True  True  False  True  False  True
In this example, the mask provided results in only the time steps at
indices 0, 1, 3, and 5 of the observations and of the predictions being
considered for the computation of the desired metrics.
As a result, the following two calls to evalhyd yield the same result:
>>> import evalhyd
>>> import numpy as np
>>> (
...     evalhyd.evald(
...         q_obs=np.array([351, 367, 377, 378, 330, 324]),
...         q_prd=np.array([312, 335, 358, 342, 327, 327]),
...         metrics=["NSE"],
...         t_msk=np.array([[[True, True, False, True, False, True]]])
...     )
...     == evalhyd.evald(
...         q_obs=np.array([351, 367, 378, 324]),
...         q_prd=np.array([312, 335, 342, 327]),
...         metrics=["NSE"]
...     )
... )
True
> all(
+     evalhyd::evald(
+         q_obs = c(351, 367, 377, 378, 330, 324),
+         q_prd = c(312, 335, 358, 342, 327, 327),
+         metrics = c("NSE"),
+         t_msk = array(c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE), dim = c(1, 1, 6))
+     )[[1]]
+     == evalhyd::evald(
+         q_obs = c(351, 367, 378, 324),
+         q_prd = c(312, 335, 342, 327),
+         metrics = c("NSE")
+     )[[1]]
+ )
[1] TRUE
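The equivalence shown above can also be checked by hand without evalhyd. The sketch below uses a textbook formulation of the Nash-Sutcliffe efficiency written in plain Python. The functions `nse` and `apply_mask` are illustrative helpers, not part of the evalhyd API, and the way evalhyd computes the metric internally is not assumed here.

```python
def nse(obs, prd):
    """Nash-Sutcliffe efficiency (textbook formula, illustrative helper)."""
    mean_obs = sum(obs) / len(obs)
    numerator = sum((o - p) ** 2 for o, p in zip(obs, prd))
    denominator = sum((o - mean_obs) ** 2 for o in obs)
    return 1 - numerator / denominator


def apply_mask(values, mask):
    """Keep only the values whose mask entry is True (illustrative helper)."""
    return [v for v, keep in zip(values, mask) if keep]


obs = [351, 367, 377, 378, 330, 324]
prd = [312, 335, 358, 342, 327, 327]
mask = [True, True, False, True, False, True]

# Masking the full series keeps the time steps at indices 0, 1, 3, and 5.
obs_kept = apply_mask(obs, mask)
prd_kept = apply_mask(prd, mask)

# The metric on the masked series matches the metric on the shortened series.
nse_masked = nse(obs_kept, prd_kept)
nse_subset = nse([351, 367, 378, 324], [312, 335, 342, 327])
print(nse_masked == nse_subset)
```

Here the mask is applied before the metric is computed, which is enough to show why the two calls agree. It does not reproduce the saving in computations that temporal masking is designed to provide.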
Tip
If temporal masks cannot be easily generated by the user, conditional masking may be an easier alternative.
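When the timestamps of the study period are available, a temporal mask for a given sub-period can also be built with a few lines of plain Python. The sketch below is only an illustration: the daily dates are hypothetical (the example dataset in this section has no dates attached), and `period_mask` is a helper named here for convenience, not an evalhyd function.

```python
from datetime import date, timedelta

# Hypothetical daily timestamps for a six-step study period.
start = date(2020, 1, 1)
times = [start + timedelta(days=i) for i in range(6)]


def period_mask(times, first, last):
    """Return True for time steps within [first, last], False otherwise."""
    return [first <= t <= last for t in times]


# Two overlapping sub-periods of the study period.
mask_a = period_mask(times, date(2020, 1, 1), date(2020, 1, 4))
mask_b = period_mask(times, date(2020, 1, 3), date(2020, 1, 6))
print(mask_a)
print(mask_b)
```

Each resulting list has the same length as the study period, as a temporal mask requires. It would still need to be converted to the array shape expected by the t_msk argument before being passed to evalhyd.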