Memoisation
Across metrics
deterministic probabilistic
Since certain evaluation metrics require the same intermediate computations, there is scope for some optimisation by storing these intermediate computations so that they can be used by all the evaluation metrics requiring them (i.e. the concept of memoisation in computing).
For example, the deterministic metrics NSE and KGE both require the quadratic error between the observations and their arithmetic mean, so it is more efficient to compute this quadratic error only once and reuse it in the computation of both NSE and KGE.
evalhyd implements such an approach to compute its evaluation metrics. This is why it is recommended to request all evaluation metrics of interest in a single call to evalhyd rather than in several separate calls.
That is to say, prefer:
>>> evalhyd.evald(obs, prd, ["NSE", "KGE"])
> evalhyd::evald(obs, prd, c("NSE", "KGE"))
$ ./evalhyd evald "obs.csv" "prd.csv" "NSE" "KGE"
over:
>>> evalhyd.evald(obs, prd, ["NSE"])
>>> evalhyd.evald(obs, prd, ["KGE"])
> evalhyd::evald(obs, prd, c("NSE"))
> evalhyd::evald(obs, prd, c("KGE"))
$ ./evalhyd evald "obs.csv" "prd.csv" "NSE"
$ ./evalhyd evald "obs.csv" "prd.csv" "KGE"
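The idea can be illustrated with a toy sketch in pure Python. This is not evalhyd's actual implementation: the class `ToyEvaluator`, its attribute `quad_obs`, and the counter `n_calls` are hypothetical names introduced here only to show an intermediate result being computed once and shared by two metrics. The KGE formula used is assumed to be the original formulation (correlation, variability ratio, bias ratio).

```python
from functools import cached_property
from math import sqrt


class ToyEvaluator:
    """Toy illustration of memoisation; not evalhyd's implementation."""

    def __init__(self, obs, prd):
        self.obs = list(obs)
        self.prd = list(prd)
        self.n_calls = 0  # how many times the shared term was computed

    @cached_property
    def quad_obs(self):
        # quadratic errors between the observations and their mean,
        # computed on first access and then stored for reuse
        self.n_calls += 1
        mean_obs = sum(self.obs) / len(self.obs)
        return [(o - mean_obs) ** 2 for o in self.obs]

    def nse(self):
        num = sum((o - p) ** 2 for o, p in zip(self.obs, self.prd))
        return 1 - num / sum(self.quad_obs)

    def kge(self):
        n = len(self.obs)
        mean_obs = sum(self.obs) / n
        mean_prd = sum(self.prd) / n
        quad_prd = [(p - mean_prd) ** 2 for p in self.prd]
        cov = sum((o - mean_obs) * (p - mean_prd)
                  for o, p in zip(self.obs, self.prd))
        r = cov / sqrt(sum(self.quad_obs) * sum(quad_prd))  # correlation
        alpha = sqrt(sum(quad_prd) / sum(self.quad_obs))    # variability ratio
        beta = mean_prd / mean_obs                          # bias ratio
        return 1 - sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


ev = ToyEvaluator(obs=[1.0, 2.0, 3.0, 4.0], prd=[1.2, 1.9, 3.3, 3.8])
print(ev.nse(), ev.kge())
print(ev.n_calls)  # 1: the shared term was computed once for both metrics
```

Two separate evaluator objects, one per metric, would each recompute `quad_obs`, which mirrors the cost of making two separate calls.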
Across masks
deterministic probabilistic
In addition, most evaluation metrics first perform intermediate computations on each time step individually (e.g. errors between individual observations and their corresponding predictions), before performing some reduction across all time steps (e.g. arithmetic mean of these individual errors).
If different sub-periods of the entire study period are needed (i.e. using the temporal masking or the conditional masking functionalities), and these sub-periods happen to overlap, it is recommended to provide several masks at once to evalhyd rather than one mask at a time. Indeed, evalhyd applies the masks only after the intermediate computations on individual time steps have been performed, which reduces the computation time by avoiding repeating these intermediate computations on the same time steps.
That is to say, prefer:
>>> import numpy as np
>>> res = evalhyd.evald(
...     obs, prd, ["NSE"],
...     t_msk=np.array([[[True, True, False, True, False, True],
...                      [False, True, True, True, False, True]]])
... )
> evalhyd::evald(
+     obs, prd, c("NSE"),
+     t_msk = array(
+         data = rbind(c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE),
+                      c(FALSE, TRUE, TRUE, TRUE, FALSE, TRUE)),
+         dim = c(1, 2, 6)
+     )
+ )
over:
>>> res = evalhyd.evald(
...     obs, prd, ["NSE"],
...     t_msk=np.array([[[True, True, False, True, False, True]]])
... )
>>> res = evalhyd.evald(
...     obs, prd, ["NSE"],
...     t_msk=np.array([[[False, True, True, True, False, True]]])
... )
> evalhyd::evald(
+     obs, prd, c("NSE"),
+     t_msk = array(
+         data = c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE),
+         dim = c(1, 1, 6)
+     )
+ )
> evalhyd::evald(
+     obs, prd, c("NSE"),
+     t_msk = array(
+         data = c(FALSE, TRUE, TRUE, TRUE, FALSE, TRUE),
+         dim = c(1, 1, 6)
+     )
+ )
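The principle of applying masks after the per-time-step computations can be sketched in pure Python as follows. This is only an illustration under simplifying assumptions, not evalhyd's implementation: the function `mse_per_mask` is a hypothetical helper, and the mean squared error is used in place of NSE because its reduction is a plain arithmetic mean.

```python
def mse_per_mask(obs, prd, masks):
    """Toy illustration: one pass over the time steps, then one
    masked reduction per mask (hypothetical helper, not evalhyd)."""
    # intermediate computation on each time step, performed only once
    sq_err = [(o - p) ** 2 for o, p in zip(obs, prd)]

    results = []
    for mask in masks:
        # the mask is applied only at the reduction stage
        kept = [e for e, keep in zip(sq_err, mask) if keep]
        results.append(sum(kept) / len(kept) if kept else float("nan"))
    return results


obs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
prd = [2.0, 3.0, 3.0, 2.0, 5.0, 6.0]
masks = [
    [True, True, False, True, False, True],
    [False, True, True, True, False, True],
]
print(mse_per_mask(obs, prd, masks))  # [1.5, 1.25]
```

Calling such a function once per mask would recompute the squared errors for the overlapping time steps each time, which is the repeated work that providing all masks at once avoids.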