Probabilistic metrics
Tip
All the metrics listed below are accessible via evalp, the probabilistic entry point of evalhyd.
For example, the Brier score can be computed as follows, in C++, Python, R, and from the command line, respectively:
#include <xtensor/xtensor.hpp>
#include <xtensor/xio.hpp>
#include <evalhyd/evalp.hpp>
xt::xtensor<double, 2> obs = {{4.7, 4.3, 5.5, 2.7, 4.1}};
xt::xtensor<double, 4> prd = {{{{5.3, 4.2, 5.7, 2.3, 3.1},
                                {4.3, 4.2, 4.7, 4.3, 3.3},
                                {5.3, 5.2, 5.7, 2.3, 3.9}}}};
xt::xtensor<double, 2> thr = {{4., 5.}};
std::cout << evalhyd::evalp(obs, prd, {"BS"}, thr, "high")[0] << std::endl;
// {{{{{ 0.222222, 0.133333}}}}}
>>> import numpy
>>> obs = numpy.array(
...     [[4.7, 4.3, 5.5, 2.7, 4.1]]
... )
>>> prd = numpy.array(
...     [[[[5.3, 4.2, 5.7, 2.3, 3.1],
...        [4.3, 4.2, 4.7, 4.3, 3.3],
...        [5.3, 5.2, 5.7, 2.3, 3.9]]]]
... )
>>> thr = numpy.array([[4., 5.]])
>>> import evalhyd
>>> evalhyd.evalp(obs, prd, ["BS"], thr, events="high")
[array([[[[[0.22222222, 0.13333333]]]]])]
> obs <- rbind(
+ c(4.7, 4.3, 5.5, 2.7, 4.1)
+ )
> prd <- array(
+ rbind(
+ c(5.3, 4.2, 5.7, 2.3, 3.1),
+ c(4.3, 4.2, 4.7, 4.3, 3.3),
+ c(5.3, 5.2, 5.7, 2.3, 3.9)
+ ),
+ dim = c(1, 1, 3, 5)
+ )
> thr <- rbind(
+ c(4., 5.)
+ )
> library(evalhyd)
> evalhyd::evalp(obs, prd, c("BS"), thr, events="high")
[[1]]
, , 1, 1, 1
[,1]
[1,] 0.2222222
, , 1, 1, 2
[,1]
[1,] 0.1333333
$ ./evalhyd evalp "./obs/" "./prd/" "BS" --q_thr "./thr/" --events "high"
{{{{{ 0.222222, 0.133333}}}}}
BS
Brier Score ("BS"), originally derived by Brier (1950), but computed as per Wilks (2011):
\[BS = \frac{1}{n} \sum_{k=1}^{n} (o_k - y_k)^2\]
where, for a dichotomous event, \(y_k\) is the event forecast probability, \(o_k\) is the observed event outcome, and \(n\) is the number of time steps.
| Required inputs | Output shape |
|---|---|
| | |
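The Brier score of the introductory example can be reproduced by hand with NumPy. The sketch below is not the evalhyd implementation: it assumes that the forecast probability \(y_k\) is the fraction of ensemble members forecasting the event, and it uses simplified array shapes (time only for the observations, members by time for the predictions). The variable names are illustrative.

```python
import numpy as np

# Same data as the introductory example, with the leading
# singleton dimensions dropped for readability.
obs = np.array([4.7, 4.3, 5.5, 2.7, 4.1])            # (time,)
prd = np.array([[5.3, 4.2, 5.7, 2.3, 3.1],
                [4.3, 4.2, 4.7, 4.3, 3.3],
                [5.3, 5.2, 5.7, 2.3, 3.9]])          # (members, time)
thr = np.array([4.0, 5.0])                           # (thresholds,)

# "high" events: the event occurs when the value is greater than
# or equal to the threshold (the threshold itself is included).
o = (obs[None, :] >= thr[:, None]).astype(float)     # (thresholds, time)

# Assumption: forecast probability = fraction of members with the event.
y = (prd[None, :, :] >= thr[:, None, None]).mean(axis=1)  # (thresholds, time)

# Mean squared difference between outcome and forecast probability.
bs = ((o - y) ** 2).mean(axis=1)
print(bs)  # [0.22222222 0.13333333]
```

The two values match those returned by evalp for the thresholds 4 and 5.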
BSS
Brier Skill Score ("BSS"), computed as per Wilks (2011):
\[BSS = 1 - \frac{BS}{BS_{reference}}\]
where \(BS_{reference} = \frac{1}{n} \sum_{k=1}^{n} (o_k - \bar{o})^2\) (see footnote 2), \(o_k\) is the observed event outcome, \(n\) is the number of time steps, and \(\bar{o}\) is the mean observed event occurrence for the study period.
| Required inputs | Output shape |
|---|---|
| | |
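As an illustration of the skill score, the sketch below computes it from a series of event outcomes and forecast probabilities. The function name `brier_skill_score` is hypothetical, and the inputs are the outcomes and probabilities obtained for the threshold 4 in the introductory example (assuming the forecast probability is the fraction of members forecasting the event). The zero-reference case is not handled here.

```python
import numpy as np

def brier_skill_score(o, y):
    """Hypothetical helper: BSS from outcomes o (0/1) and probabilities y."""
    o = np.asarray(o, dtype=float)
    y = np.asarray(y, dtype=float)
    bs = np.mean((o - y) ** 2)
    # Reference: climatological forecast, i.e. the mean event occurrence.
    bs_ref = np.mean((o - o.mean()) ** 2)
    return 1.0 - bs / bs_ref

# Outcomes and forecast probabilities for threshold 4 ("high" events).
o = [1, 1, 1, 0, 1]
y = [1, 1, 1, 1 / 3, 0]

bss = brier_skill_score(o, y)
print(round(bss, 6))  # -0.388889
```

A negative value indicates that, for this very short series, the ensemble performs worse than the climatological reference.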
BS_CRD
Calibration-Refinement Decomposition of the Brier Score ("BS_CRD") into the three components reliability, resolution, and uncertainty [returned in this order].
| Required inputs | Output shape |
|---|---|
| | |
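The three components can be illustrated with the usual textbook form of the decomposition, in which the time steps are grouped by distinct forecast probability and \(BS = reliability - resolution + uncertainty\). The sketch below is written under that assumption and is not taken from the evalhyd source. It reuses the outcomes and probabilities of the introductory example for the threshold 4.

```python
import numpy as np

o = np.array([1.0, 1.0, 1.0, 0.0, 1.0])      # observed event outcomes
y = np.array([1.0, 1.0, 1.0, 1 / 3, 0.0])    # forecast probabilities

n = o.size
o_bar = o.mean()                             # overall event frequency

rel = 0.0
res = 0.0
for y_i in np.unique(y):                     # one group per distinct probability
    sel = (y == y_i)
    n_i = sel.sum()                          # number of forecasts in the group
    o_bar_i = o[sel].mean()                  # observed frequency in the group
    rel += n_i * (y_i - o_bar_i) ** 2
    res += n_i * (o_bar_i - o_bar) ** 2
rel /= n
res /= n
unc = o_bar * (1.0 - o_bar)

bs = np.mean((o - y) ** 2)
print(rel, res, unc)
# The components recombine into the Brier score.
print(np.isclose(rel - res + unc, bs))  # True
```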
BS_LBD
Likelihood-Base rate Decomposition of the Brier Score ("BS_LBD") into the three components type 2 bias, discrimination, and sharpness (a.k.a. refinement) [returned in this order].
| Required inputs | Output shape |
|---|---|
| | |
REL_DIAG
X and Y axes of the reliability diagram ("REL_DIAG") and ordinates of its associated sampling histogram: forecast probabilities (X), observed frequencies (Y), and number of forecasts for each forecast probability [returned in this order].
| Required inputs | Output shape |
|---|---|
| | |
CRPS_FROM_BS
Continuous Ranked Probability Score computed from 101 Brier Scores ("CRPS_FROM_BS"), i.e. using the observed minimum, the 99 observed percentiles, and the observed maximum as streamflow thresholds.
| Required inputs | Output shape |
|---|---|
| | |
CRPS_FROM_ECDF
Continuous Ranked Probability Score computed from the Empirical Cumulative Distribution Function ("CRPS_FROM_ECDF"), i.e. constructed from the ensemble member predictions.
| Required inputs | Output shape |
|---|---|
| | |
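For a single time step, the CRPS of an ensemble seen through its empirical distribution can be written as \(E|X - x| - \frac{1}{2} E|X - X'|\), where \(X\) and \(X'\) are independent draws from the ensemble and \(x\) is the observation. The sketch below uses that identity. The function name `crps_ensemble` is hypothetical, and how evalhyd evaluates the integral internally or aggregates over time is not shown here.

```python
import numpy as np

def crps_ensemble(members, obs):
    """Hypothetical helper: CRPS of one ensemble against one observation."""
    x = np.asarray(members, dtype=float)
    # Mean absolute error of the members against the observation.
    term1 = np.mean(np.abs(x - obs))
    # Half the mean absolute difference between all pairs of members.
    term2 = 0.5 * np.mean(np.abs(x[:, None] - x[None, :]))
    return term1 - term2

# First time step of the introductory example.
crps = crps_ensemble([5.3, 4.3, 5.3], 4.7)
print(round(crps, 6))  # 0.311111
```

With a single member, the score reduces to the absolute error, which is a convenient sanity check.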
QS
Quantile Scores ("QS") where the ensemble member predictions are treated as quantiles.
| Required inputs | Output shape |
|---|---|
| | |
CRPS_FROM_QS
Continuous Ranked Probability Score computed from the Quantile Scores ("CRPS_FROM_QS").
| Required inputs | Output shape |
|---|---|
| | |
CONT_TBL
Cells of the Contingency Table ("CONT_TBL"), i.e. the hits \(a\), the false alarms \(b\), the misses \(c\), and the correct rejections \(d\), in this order.
| Required inputs | Output shape |
|---|---|
| | |
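The four cells can be illustrated for a single yes/no forecast series. The sketch below is a simplification: it counts the cells for one ensemble member only, whereas how evalhyd turns a full ensemble into yes/no forecasts is not covered here. The function name `contingency_table` is hypothetical.

```python
import numpy as np

def contingency_table(obs_event, prd_event):
    """Hypothetical helper: hits, false alarms, misses, correct rejections."""
    o = np.asarray(obs_event, dtype=bool)
    p = np.asarray(prd_event, dtype=bool)
    a = int(np.sum(p & o))      # hits: forecast yes, observed yes
    b = int(np.sum(p & ~o))     # false alarms: forecast yes, observed no
    c = int(np.sum(~p & o))     # misses: forecast no, observed yes
    d = int(np.sum(~p & ~o))    # correct rejections: forecast no, observed no
    return a, b, c, d

obs = np.array([4.7, 4.3, 5.5, 2.7, 4.1])
member = np.array([5.3, 4.2, 5.7, 2.3, 3.1])   # first member of the example

# "high" events with threshold 4 (threshold value included).
cells = contingency_table(obs >= 4.0, member >= 4.0)
print(cells)  # (3, 0, 1, 1)
```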
POD
Probability Of Detection ("POD"), also known as “hit rate”, derived from the contingency table.
| Required inputs | Output shape |
|---|---|
| | |
POFD
Probability Of False Detection ("POFD"), also known as “false alarm rate”, derived from the contingency table.
| Required inputs | Output shape |
|---|---|
| | |
FAR
False Alarm Ratio ("FAR"), derived from the contingency table.
| Required inputs | Output shape |
|---|---|
|
CSI
Critical Success Index ("CSI"), derived from the contingency table.
| Required inputs | Output shape |
|---|---|
| | |
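The four scores derived from the contingency table (POD, POFD, FAR, CSI) have standard definitions in terms of the hits \(a\), false alarms \(b\), misses \(c\), and correct rejections \(d\). The sketch below uses those textbook definitions with made-up counts. The function name `categorical_scores` is hypothetical, and divisions by zero are not handled.

```python
def categorical_scores(a, b, c, d):
    """Hypothetical helper: scores from contingency table cells."""
    pod = a / (a + c)        # hits over all observed events
    pofd = b / (b + d)       # false alarms over all observed non-events
    far = b / (a + b)        # false alarms over all forecast events
    csi = a / (a + b + c)    # hits over all events forecast and/or observed
    return pod, pofd, far, csi

# Illustrative counts, not taken from the example data.
pod, pofd, far, csi = categorical_scores(a=3, b=1, c=1, d=5)
print(pod, far, csi)  # 0.75 0.25 0.6
```

Note that POFD and FAR share the same numerator but differ in their denominator, which is why the two are easily confused.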
ROCSS
Relative Operating Characteristic Skill Score ("ROCSS"), derived from the contingency table, and based on computing the area under the ROC curve.
| Required inputs | Output shape |
|---|---|
| | |
RANK_HIST
Frequencies of the Rank Histogram ("RANK_HIST"), also known as the Talagrand diagram.
| Required inputs | Output shape |
|---|---|
| | |
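A rank histogram counts, for each time step, where the observation falls among the sorted ensemble members. The sketch below is a minimal illustration on the data of the introductory example. It assumes the rank is the number of members strictly below the observation and does not treat ties specially, which may differ from what evalhyd does.

```python
import numpy as np

obs = np.array([4.7, 4.3, 5.5, 2.7, 4.1])            # (time,)
prd = np.array([[5.3, 4.2, 5.7, 2.3, 3.1],
                [4.3, 4.2, 4.7, 4.3, 3.3],
                [5.3, 5.2, 5.7, 2.3, 3.9]])          # (members, time)

n_members = prd.shape[0]

# Rank of each observation: number of members strictly below it,
# which ranges from 0 to n_members.
ranks = (prd < obs[None, :]).sum(axis=0)

# Relative frequency of each of the n_members + 1 possible ranks.
counts = np.bincount(ranks, minlength=n_members + 1)
freq = counts / obs.size
print(freq)  # [0.  0.4 0.4 0.2]
```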
DS
Delta score ("DS") as per Candille and Talagrand (2005).
| Required inputs | Output shape |
|---|---|
| | |
AS
Alpha score ("AS") as per Renard et al. (2010).
| Required inputs | Output shape |
|---|---|
| | |
CR
Coverage ratio ("CR"), i.e. the proportion of observations falling within the predictive intervals. It is a measure of the reliability of the predictions.
| Required inputs | Output shape |
|---|---|
| | |
AW
Average width ("AW") of the predictive interval(s). It is a measure of the sharpness of the predictions.
| Required inputs | Output shape |
|---|---|
| | |
AWN
Average width of the predictive interval(s) normalised by the mean observation ("AWN", see footnote 2), computed as per Bourgin et al. (2015).
| Required inputs | Output shape |
|---|---|
| | |
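The three interval-based metrics (coverage ratio, average width, and normalised average width) can be sketched from the bounds of a predictive interval. In the sketch below, the interval is taken as the ensemble minimum and maximum purely for illustration. This choice, the inclusive bounds, and the variable names are assumptions, not the way evalhyd derives intervals from confidence levels.

```python
import numpy as np

obs = np.array([4.7, 4.3, 5.5, 2.7, 4.1])            # (time,)
prd = np.array([[5.3, 4.2, 5.7, 2.3, 3.1],
                [4.3, 4.2, 4.7, 4.3, 3.3],
                [5.3, 5.2, 5.7, 2.3, 3.9]])          # (members, time)

# Assumption: interval bounds are the ensemble minimum and maximum.
lower = prd.min(axis=0)
upper = prd.max(axis=0)

# Coverage ratio: share of observations inside the interval.
cr = np.mean((obs >= lower) & (obs <= upper))

# Average width, and average width normalised by the mean observation.
aw = np.mean(upper - lower)
awn = aw / obs.mean()

print(cr, round(aw, 2), round(awn, 4))  # 0.8 1.16 0.2723
```

Only the last observation (4.1) falls outside its interval (3.1 to 3.9), hence the coverage of 0.8.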
WS
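Winkler score ("WS"), also known as interval score, computed as per Gneiting and Raftery (2007):
\[WS = \frac{1}{n} \sum_{k=1}^{n} \left[ (u_k - l_k) + \frac{2}{\alpha} (l_k - x_k) \mathbb{1}\{x_k < l_k\} + \frac{2}{\alpha} (x_k - u_k) \mathbb{1}\{x_k > u_k\} \right]\]
where, for a given confidence level, \(\alpha\) is the portion not included in the central predictive interval, \(u_k\) and \(l_k\) are the upper and lower bounds of the predictive interval at time step \(k\), respectively, \(x_k\) is the observation at time step \(k\), \(\mathbb{1}\{\cdot\}\) equals 1 when the condition holds and 0 otherwise, and \(n\) is the number of time steps.
| Required inputs | Output shape |
|---|---|
| | |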
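The interval score of Gneiting and Raftery (2007) adds to the interval width a penalty, scaled by \(2/\alpha\), for each observation falling outside the interval. The sketch below implements that definition directly. The function name `winkler_score`, the bounds, and the value of \(\alpha\) are illustrative assumptions.

```python
import numpy as np

def winkler_score(obs, lower, upper, alpha):
    """Hypothetical helper: mean interval score over the time steps."""
    x = np.asarray(obs, dtype=float)
    l = np.asarray(lower, dtype=float)
    u = np.asarray(upper, dtype=float)
    width = u - l
    # Penalties apply only when the observation is outside the interval.
    below = (2.0 / alpha) * (l - x) * (x < l)
    above = (2.0 / alpha) * (x - u) * (x > u)
    return np.mean(width + below + above)

obs = [4.7, 4.3, 5.5, 2.7, 4.1]
lower = [4.3, 4.2, 4.7, 2.3, 3.1]     # illustrative bounds
upper = [5.3, 5.2, 5.7, 4.3, 3.9]

ws = winkler_score(obs, lower, upper, alpha=0.2)
print(round(ws, 2))  # 1.56
```

Here the widths sum to 5.8 and the single miss (4.1 above 3.9) adds a penalty of 2.0, giving 7.8 / 5 = 1.56.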
ES
Energy score ("ES"), a multivariate (i.e. multisite) generalisation of the continuous ranked probability score.
| Required inputs | Output shape |
|---|---|
| | |
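For one time step, the energy score is commonly written as \(E\|X - x\| - \frac{1}{2} E\|X - X'\|\), where \(X\) and \(X'\) are ensemble members taken as vectors across sites, \(x\) is the vector of observations, and \(\|\cdot\|\) is the Euclidean norm. The sketch below follows that common definition. The function name `energy_score` and the second site's values are hypothetical, and the evalhyd implementation may differ in its details.

```python
import numpy as np

def energy_score(members, obs):
    """Hypothetical helper: energy score for one time step.

    members: array of shape (members, sites); obs: array of shape (sites,).
    """
    x = np.asarray(members, dtype=float)
    y = np.asarray(obs, dtype=float)
    # Mean Euclidean distance between each member and the observation.
    term1 = np.mean(np.linalg.norm(x - y, axis=1))
    # Half the mean Euclidean distance between all pairs of members.
    diffs = x[:, None, :] - x[None, :, :]
    term2 = 0.5 * np.mean(np.linalg.norm(diffs, axis=2))
    return term1 - term2

# With a single site, the energy score reduces to the CRPS.
es_one_site = energy_score([[5.3], [4.3], [5.3]], [4.7])
print(round(es_one_site, 6))  # 0.311111

# Two sites: the second column is made up for illustration.
es_two_sites = energy_score([[5.3, 2.0], [4.3, 2.5], [5.3, 1.5]], [4.7, 2.2])
```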
Footnotes
- [1] The threshold value is included in the definition of the events both for low flow and high flow events, i.e. where a streamflow observation/prediction value is equal to the threshold value, the event is considered to have occurred.
- [2] The metric value returned is \(-\infty\) when the reference/climatology/normalisation value is zero.