Handling of missing data

deterministic probabilistic

Missing data must be flagged as “not a number” regardless of the utility used in the evalhyd stack. Missing data can be present both in the streamflow observations and/or the streamflow predictions. All time steps where either observations or predictions are missing are not taken into account in the computation of the metrics (a method sometimes referred to as pairwise deletion).

That is to say that these two calls to evalhyd yield the same result:

>>> import evalhyd
>>> import numpy as np
>>> (
...     evalhyd.evald(np.array([7, 3, 3, np.nan, 5]), np.array([5, 4, 3, 5, np.nan]), ["KGE"])
...     == evalhyd.evald(np.array([7, 3, 3]), np.array([5, 4, 3]), ["KGE"])
... )
True
> all(
+    evalhyd::evald(c(7, 3, 3, NA, 5), c(5, 4, 3, 5, NA), c("KGE"))[[1]]
+    == evalhyd::evald(c(7, 3, 3), c(5, 4, 3), c("KGE"))[[1]]
+ )
[1] TRUE

probabilistic-only

For a given site, evalhyd expects that the time indices in the streamflow predictions and in the streamflow observations correspond to the same datetime. This means that, when evaluating several lead times at once, the time series of streamflow observations will typically be longer than any of the time series of streamflow predictions. Indeed, missing predictions will exist for the longer lead times at the beginning of the time series, because the forecast datetime plus the lead time will not correspond to the first observed datetime (required for shorter lead times), but a later one. In turn, missing predictions will exist for the shorter lead times at the end of the time series, because the forecast datetime plus the lead time will not correspond to the last observed datetime (required for longer lead times), but an earlier one. For these datetimes, missing predictions must be filled with “not a number” value to make sure that the length of the observations and predictions are the same, so that a single time series of streamflow observations can be used across the prediction lead times evaluated.

To illustrate, let’s look at the simple example below where streamflow forecasts (only one ensemble member shown) are issued daily from the 1st until the 4th of January 2017 with a one-day, two-day, and three-day lead time.

time index       0           1           2           3           4           5
datetime         2017-01-02  2017-01-03  2017-01-04  2017-01-05  2017-01-06  2017-01-07

observations     351         367         377         378         330         324

predictions
1-day lead time  312         335         358         342         NaN         NaN
2-day lead time  NaN         341         364         351         332         NaN
3-day lead time  NaN         NaN         361         358         327         327

For a complete forecast evaluation, the one-day lead time will require observations from the 2nd until the 5th of January, the two-day lead time from the 3rd until the 6th, and the three-day lead time from the 4th until the 7th. Since only one time series of streamflow observations is expected by evalhyd, it must start on the 2nd and end on the 7th of January, while the streamflow predictions must “flag” those time steps for which they are not making predictions as “not a number” (identified as NaN in the above example). Indeed, each forecast can only feature 4 predicted values if forecasts were issued only on 4 consecutive days.

This example translates into the following call to evalhyd (where the ensemble member shown above is duplicated four times for the only purpose of illustration):

>>> import evalhyd
>>> import numpy as np
>>> res = evalhyd.evalp(
...     q_obs=np.array([[351, 367, 377, 378, 330, 324]]),
...     q_prd=np.array([[[[312, 335, 358, 342, np.nan, np.nan],
...                       [312, 335, 358, 342, np.nan, np.nan],
...                       [312, 335, 358, 342, np.nan, np.nan],
...                       [312, 335, 358, 342, np.nan, np.nan]],
...                      [[np.nan, 341, 364, 351, 332, np.nan],
...                       [np.nan, 341, 364, 351, 332, np.nan],
...                       [np.nan, 341, 364, 351, 332, np.nan],
...                       [np.nan, 341, 364, 351, 332, np.nan]],
...                      [[np.nan, np.nan, 361, 358, 327, 327],
...                       [np.nan, np.nan, 361, 358, 327, 327],
...                       [np.nan, np.nan, 361, 358, 327, 327],
...                       [np.nan, np.nan, 361, 358, 327, 327]]]]),
...     metrics=["CRPS_FROM_ECDF"]
... )
> res <- evalhyd::evalp(
+     q_obs = rbind(c(351, 367, 377, 378, 330, 324)),
+     q_prd = aperm(
+         array(
+             data = c(
+                 rbind(c(312, 335, 358, 342, NA, NA),
+                       c(312, 335, 358, 342, NA, NA),
+                       c(312, 335, 358, 342, NA, NA),
+                       c(312, 335, 358, 342, NA, NA)),
+                 rbind(c(NA, 341, 364, 351, 332, NA),
+                       c(NA, 341, 364, 351, 332, NA),
+                       c(NA, 341, 364, 351, 332, NA),
+                       c(NA, 341, 364, 351, 332, NA)),
+                 rbind(c(NA, NA, 361, 358, 327, 327),
+                       c(NA, NA, 361, 358, 327, 327),
+                       c(NA, NA, 361, 358, 327, 327),
+                       c(NA, NA, 361, 358, 327, 327))
+             ),
+             dim = c(4, 6, 3, 1)
+         ),
+         perm = c(4, 3, 1, 2)
+     ),
+     metrics = c("CRPS_FROM_ECDF")
+ )