Bootstrapping
deterministic probabilistic
A bootstrapping method is available in evalhyd to assess the sampling uncertainty in the computed evaluation metrics. It follows a non-overlapping block bootstrapping approach (see e.g. Clark et al. (2021)) where blocks are taken to be full years of data. For a given period, the bootstrap method randomly draws with replacement from the years it contains.
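The drawing of yearly blocks described above can be sketched in a few lines of NumPy. This is an illustrative sketch only, not evalhyd's internal implementation: the helper names `draw_year_samples` and `resample_series` and the toy data are assumptions made for the example.

```python
import numpy as np


def draw_year_samples(years, n_samples, len_sample, seed=None):
    """Draw whole years with replacement.

    Returns an array of year labels of shape (n_samples, len_sample).
    """
    rng = np.random.default_rng(seed)
    unique_years = np.unique(years)
    return rng.choice(unique_years, size=(n_samples, len_sample), replace=True)


def resample_series(values, years, sampled_years):
    """Concatenate the full, unbroken block of values for each sampled year."""
    return np.concatenate([values[years == y] for y in sampled_years])


# hypothetical toy data: 3 "years" of 4 time steps each
years = np.repeat([2001, 2002, 2003], 4)
values = np.arange(12, dtype=float)

samples = draw_year_samples(years, n_samples=5, len_sample=3, seed=42)
first = resample_series(values, years, samples[0])
```

Each row of `samples` lists the years making up one bootstrap sample. Because every year is copied as a whole, the ordering of the time steps within a year is left intact.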
Note
While evalhyd requires full years of data in order to preserve seasonal
patterns and intra-annual auto-correlation, the choice of the start of
the year is left to the user (e.g. hydrological years, calendar years,
etc.). The first date and time provided via the parameter dts is used
to define the start of the year.
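To illustrate how the first date and time can define the start of the year, the sketch below assigns each time step to a yearly block counted from that first value. It is a simplified illustration and not evalhyd's own code: the function name `year_block_index` and the `"%Y-%m-%d %H:%M:%S"` string format are assumptions made for the example.

```python
from datetime import datetime


def year_block_index(dts, fmt="%Y-%m-%d %H:%M:%S"):
    """Return, for each datetime string, the index of the year block it
    falls in, where block 0 starts at the first datetime provided."""
    parsed = [datetime.strptime(d, fmt) for d in dts]
    start = parsed[0]
    # position of the year start within a calendar year
    anchor = (start.month, start.day, start.hour, start.minute, start.second)
    blocks = []
    for t in parsed:
        # a time step earlier in the calendar year than the anchor
        # still belongs to the block that started the previous year
        before_anchor = (t.month, t.day, t.hour, t.minute, t.second) < anchor
        blocks.append(t.year - start.year - int(before_anchor))
    return blocks


# hypothetical hydrological years starting on 1 October
dts = [
    "2010-10-01 00:00:00",
    "2011-09-30 00:00:00",
    "2011-10-01 00:00:00",
    "2012-03-15 00:00:00",
]
blocks = year_block_index(dts)  # [0, 0, 1, 1]
```

With a series starting on 1 October, 30 September of the following calendar year still belongs to the first block, and the second block starts on the next 1 October.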
The bootstrap thus provides an estimate of the sampling uncertainty of the evaluation metrics, i.e. the influence of the choice of the study period on the metric values.
The bootstrap method is configurable through three parameters:
| Parameter | Description | Possible values |
|---|---|---|
| n_samples | The number of random samples to generate. | any integer |
| len_sample | The length of one sample in number of blocks (i.e. years). | any integer |
| summary | The statistics to summarise the sampling distribution (i.e. across the samples). | 0 for no summary; 1 for mean & standard deviation; 2 for percentiles 5, 10, 25, 50, 75, 90, 95 |
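The three ways of summarising the sampling distribution can be sketched with NumPy as follows. This is a hedged illustration and not evalhyd's implementation: the function `summarise`, its string mode names, and the use of the population standard deviation (NumPy's default, `ddof=0`) are assumptions made for the example.

```python
import numpy as np

PERCENTILES = (5, 10, 25, 50, 75, 90, 95)


def summarise(sample_values, mode):
    """Summarise metric values computed on each bootstrap sample.

    mode is a hypothetical label: "none", "mean_std" or "percentiles".
    """
    sample_values = np.asarray(sample_values, dtype=float)
    if mode == "none":
        # keep one metric value per sample
        return sample_values
    if mode == "mean_std":
        # two values: mean and standard deviation across the samples
        return np.array([sample_values.mean(), sample_values.std()])
    if mode == "percentiles":
        # seven values, one per percentile
        return np.percentile(sample_values, PERCENTILES)
    raise ValueError(f"unknown summary mode: {mode!r}")


# hypothetical metric values obtained on four bootstrap samples
metric_per_sample = [1.0, 2.0, 3.0, 4.0]
raw = summarise(metric_per_sample, "none")
mean_std = summarise(metric_per_sample, "mean_std")
pct = summarise(metric_per_sample, "percentiles")
```

The size of the result depends on the mode: one value per sample without a summary, two values for mean and standard deviation, and seven values for the percentiles.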
Hint
The seed of the random generator is configurable through the seed parameter.
Note
Since the sampling is performed with replacement, the number of samples and the length of a sample have no upper limit.
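Both points can be illustrated with NumPy's random generator, used here as a stand-in since evalhyd's own generator is not shown in this section. The three-year toy period and the seed value are assumptions made for the example.

```python
import numpy as np

# hypothetical study period containing only three years
years = np.array([2001, 2002, 2003])

# the same seed gives the same draw, so results are reproducible
draw_a = np.random.default_rng(7).choice(years, size=10, replace=True)
draw_b = np.random.default_rng(7).choice(years, size=10, replace=True)
same_seed_same_draw = np.array_equal(draw_a, draw_b)

# with replacement, a sample of 10 years can be drawn from 3 years;
# without replacement, the same request cannot be satisfied
try:
    np.random.default_rng(7).choice(years, size=10, replace=False)
    without_replacement_ok = True
except ValueError:
    without_replacement_ok = False
```

Here `same_seed_same_draw` is `True` and `without_replacement_ok` is `False`: only sampling with replacement allows a sample longer than the number of available years.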
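Examples using the bootstrapping functionality are provided below for Python, R, and the command line interface, in each case for deterministic (evald) and probabilistic (evalp) evaluation.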
>>> import evalhyd
>>> res = evalhyd.evald(
... obs, prd, ["NSE"],
... bootstrap={"n_samples": 100, "len_sample": 10, "summary": 0},
... dts=dts
... )
>>> res = evalhyd.evalp(
... obs, prd, ["CRPS_FROM_ECDF"],
... bootstrap={"n_samples": 100, "len_sample": 10, "summary": 0},
... dts=dts
... )
> res <- evalhyd::evald(
+ obs, prd, c("NSE"),
+ bootstrap = list(n_samples = 100, len_sample = 10, summary = 0),
+ dts = dts
+ )
> res <- evalhyd::evalp(
+ obs, prd, c("CRPS_FROM_ECDF"),
+ bootstrap = list(n_samples = 100, len_sample = 10, summary = 0),
+ dts = dts
+ )
$ ./evalhyd evald "obs.csv" "prd.csv" "NSE" --to_file \
> --bootstrap "n_samples" 100 "len_sample" 10 "summary" 0 --dts "dts.csv"
$ ./evalhyd evalp "./obs" "./prd" "CRPS_FROM_ECDF" --to_file \
> --bootstrap "n_samples" 100 "len_sample" 10 "summary" 0 --dts "dts.csv"