evalhyd evald

Evaluate deterministic predictions.

Usage

evalhyd evald [OPTIONS] q_obs q_prd metrics...

Positionals

q_obs <TEXT:FILE>: Path to streamflow observations CSV file. Time steps with missing observations must be assigned NAN values. Those time steps will be ignored both in the observations and in the predictions before the metrics are computed.

Important

The CSV file must feature one line and as many columns as there are time steps in the study period [shape: (1, time)].

q_prd <TEXT:FILE>: Path to streamflow predictions CSV file. Time steps with missing predictions must be assigned NAN values. Those time steps will be ignored both in the observations and in the predictions before the metrics are computed.

Important

The CSV file must feature one line or more and as many columns as there are time steps in the study period [shape: (series, time)].
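The two input layouts can be sketched with Python's standard `csv` module. This is only an illustration: the values and variable names are made up, and the missing-value marker is written literally as `NAN`, following the wording above.

```python
import csv
import io

NAN = "NAN"  # missing-value marker, spelled as in the text above


def to_csv(rows):
    """Serialise rows as CSV text: one line per series, one column per time step."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Hypothetical data: 5 time steps, observation missing at the third step.
q_obs = [[4.7, 4.3, NAN, 2.7, 4.1]]      # shape (1, time)
q_prd = [[5.3, 4.2, 5.7, 2.3, NAN],      # shape (series, time)
         [4.9, 4.0, 5.1, 2.9, 3.8]]

# Every prediction series must cover the same time steps as the observations.
assert all(len(row) == len(q_obs[0]) for row in q_prd)

q_obs_csv = to_csv(q_obs)
q_prd_csv = to_csv(q_prd)
```

Writing `q_obs_csv` and `q_prd_csv` to files named, say, `q_obs.csv` and `q_prd.csv` would give the two positional inputs.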

metrics <TEXT ...>: List of evaluation metrics to compute.

Note

For each computed metric, the output shape is (series, subsets, samples). Since CSV files are intrinsically two-dimensional (i.e. lines and columns), the series are stacked on top of one another. For example, the output shape (2 series, 4 subsets, 3 samples) is stored in a CSV file containing eight lines and three columns (where the first four lines correspond to the four subsets and three samples of the first series, and the last four lines correspond to those of the second series).
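As a sketch of how the stacked lines map back onto the three dimensions, the helper below regroups already-parsed CSV rows into nested lists. The function name `unstack` and the placeholder values are illustrative and not part of evalhyd.

```python
def unstack(rows, n_series, n_subsets):
    """Regroup stacked CSV rows into out[series][subset][sample]."""
    if len(rows) != n_series * n_subsets:
        raise ValueError("expected n_series * n_subsets rows")
    # Rows are stacked series by series, each series holding n_subsets rows.
    return [rows[s * n_subsets:(s + 1) * n_subsets] for s in range(n_series)]


# Placeholder cells encode (series, subset, sample) so the layout is visible:
# 2 series x 4 subsets give 8 rows of 3 columns each.
rows = [[f"s{s}u{u}m{m}" for m in range(3)]
        for s in range(2) for u in range(4)]

out = unstack(rows, n_series=2, n_subsets=4)
```

With this layout, `out[1][2]` is the seventh line of the file, i.e. the third subset of the second series.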

See also

Deterministic metrics

Optionals

--q_thr <TEXT:FILE>: Path to streamflow thresholds CSV file. If the number of thresholds differs across series, NAN can be set as threshold for those series with fewer thresholds. Predictions and thresholds must feature the same number of series.

Important

The CSV file must feature as many lines as in q_prd and as many columns as there are thresholds to consider [shape: (series, thresholds)].
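When series have different numbers of thresholds, the rows need padding to a common width. A minimal sketch, assuming the padding marker is written literally as `NAN` (the helper name and threshold values are hypothetical):

```python
def pad_thresholds(per_series, fill="NAN"):
    """Pad each series' thresholds with `fill` up to the longest row."""
    width = max(len(thr) for thr in per_series)
    return [list(thr) + [fill] * (width - len(thr)) for thr in per_series]


# Hypothetical thresholds: three for the first series, two for the second.
q_thr = pad_thresholds([[690.0, 534.0, 445.0],
                        [650.0, 500.0]])
```

Each row of `q_thr` then corresponds to one line of the thresholds CSV file, in the same order as the series in q_prd.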

--events <TEXT>: A string specifying the type of streamflow events to consider for threshold exceedance-based metrics. It can either be set as "high" when flooding conditions/high flow events are evaluated (i.e. event occurring when streamflow goes above threshold) or as "low" when drought conditions/low flow events are evaluated (i.e. event occurring when streamflow goes below threshold). It must be provided if q_thr is provided.

--transform <TEXT>: The transformation to apply to both streamflow observations and predictions prior to the calculation of the metrics.

See also

Transformation

--exponent <FLOAT>: The value of the exponent n to use when the transform is the power function. If not provided (or set to a value of 1), the streamflow observations and predictions remain untransformed.

--epsilon <FLOAT>: The value of the small constant ε to add to both the streamflow observations and predictions prior to the calculation of the metrics when the transform is the reciprocal function, the natural logarithm, or the power function with a negative exponent (since none are defined for 0). If not provided, one hundredth of the mean of the streamflow observations is used as value for epsilon, as recommended by Pushpalatha et al. (2012).
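The default value of epsilon and its role in the logarithmic transformation can be illustrated as follows. This is a sketch of the arithmetic only, not evalhyd's implementation; in particular, excluding missing observations from the mean is an assumption made here.

```python
import math


def default_epsilon(q_obs):
    """One hundredth of the mean observed streamflow.

    Assumption: missing (NaN) observations are left out of the mean.
    """
    valid = [q for q in q_obs if not math.isnan(q)]
    return sum(valid) / len(valid) / 100


def log_transform(q, epsilon):
    """Natural logarithm of streamflow shifted by epsilon, defined even for q == 0."""
    return [math.log(v + epsilon) for v in q]


q_obs = [0.0, 2.0, 4.0, float("nan"), 6.0]   # hypothetical observations
eps = default_epsilon(q_obs)                 # mean of valid values is 3.0

valid_obs = [q for q in q_obs if not math.isnan(q)]
transformed = log_transform(valid_obs, eps)
```

Without the shift, the zero flow in `q_obs` would make the logarithm undefined; with it, every transformed value is finite.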

--t_msk <TEXT:FILE>: Path to CSV file containing the temporal subsets. Each subset consists of a series of 1/0 flags indicating which time steps to include/discard. If neither t_msk nor m_cdt is provided, no subsetting is performed. If provided, as many subsets as there are observed time series must be provided.

Important

The CSV file must feature as many lines as there are prediction series times temporal subsets, and as many columns as there are time steps in the study period [shape: (series, subsets, time)]. For example, for five predictions series and two temporal subsets, the first five lines must correspond to the five series for the first subset, and the last five lines to the five series for the second subset.
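The line ordering described above (all series for the first subset, then all series for the second subset, and so on) can be sketched as a flattening step. The helper name and mask values are illustrative, and the sketch assumes that 1 keeps a time step and 0 discards it.

```python
def stack_masks(masks):
    """Flatten masks[series][subset] (lists of 0/1 flags over time) into CSV rows.

    Rows are emitted subset by subset, each subset listing every series in order.
    """
    n_series = len(masks)
    n_subsets = len(masks[0])
    return [masks[s][u] for u in range(n_subsets) for s in range(n_series)]


# Hypothetical masks: 2 series, 2 subsets, 4 time steps.
masks = [
    [[1, 1, 0, 0], [0, 0, 1, 1]],   # series 0: subset 0, subset 1
    [[1, 0, 1, 0], [0, 1, 0, 1]],   # series 1: subset 0, subset 1
]
t_msk_rows = stack_masks(masks)
```

Note that this ordering groups lines by subset, whereas the metric output groups lines by series.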

See also

Temporal masking

--m_cdt <TEXT:FILE>: Path to CSV file containing the masking conditions. Each condition consists of a string and can be specified on observed streamflow values/statistics (mean, median, quantile), or on time indices. If provided in combination with t_msk, the latter takes precedence. If neither m_cdt nor t_msk is provided, no subsetting is performed. If provided, as many conditions as there are observed time series must be provided.

Important

The CSV file must feature as many lines as there are prediction series, and as many columns as there are masking conditions [shape: (series, subsets)].

See also

Conditional masking

--bootstrap <TEXT ...>: The values for the parameters of the bootstrapping method used to estimate the sampling uncertainty in the evaluation of the predictions. It takes three parameters: "n_samples" the number of random samples; "len_sample" the length of one sample in number of years; "summary" the statistics to return to characterise the sampling distribution. If not provided, no bootstrapping is performed. If provided, dts must also be provided.

Parameter example:

--bootstrap "n_samples" 100 "len_sample" 10 "summary" 0
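When the command is assembled from a script, the dependencies between options stated above (events with q_thr, dts with bootstrap) can be checked before the tool is called. The helper below is a hypothetical wrapper, not part of evalhyd; it only builds the argument list using the options documented in this section.

```python
def evald_args(q_obs, q_prd, metrics, q_thr=None, events=None,
               bootstrap=None, dts=None):
    """Build the argument list for `evalhyd evald` (illustrative helper)."""
    if q_thr is not None and events is None:
        raise ValueError("--events must be provided when --q_thr is provided")
    if bootstrap is not None and dts is None:
        raise ValueError("--dts must be provided when --bootstrap is provided")

    args = ["evalhyd", "evald", q_obs, q_prd, *metrics]
    if q_thr is not None:
        args += ["--q_thr", q_thr, "--events", events]
    if bootstrap is not None:
        # Key/value pairs follow the flag, as in the parameter example above.
        args.append("--bootstrap")
        for key in ("n_samples", "len_sample", "summary"):
            args += [key, str(bootstrap[key])]
        args += ["--dts", dts]
    return args


args = evald_args(
    "q_obs.csv", "q_prd.csv", ["NSE"],
    bootstrap={"n_samples": 100, "len_sample": 10, "summary": 0},
    dts="dts.csv",
)
```

On a machine where evalhyd is installed, the resulting list could be passed to `subprocess.run(args, check=True)`.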

Examples

$ evalhyd evald "q_obs.csv" "q_prd.csv" "NSE"
{{{ 0.625477}},
 {{ 0.043416}},
 {{ 0.663645}}}

$ evalhyd evald "q_obs.csv" "q_prd.csv" "NSE" --transform "sqrt"
{{{ 0.60338 }},
 {{-0.006811}},
 {{ 0.697281}}}

$ evalhyd evald "q_obs.csv" "q_prd.csv" "NSE" --transform "log" --epsilon .5
{{{ 0.581342}},
 {{-0.045892}},
 {{ 0.714327}}}

$ evalhyd evald "q_obs.csv" "q_prd.csv" "NSE" --transform "pow" --exponent .8
{{{ 0.617575}},
 {{ 0.023426}},
 {{ 0.67871 }}}

$ evalhyd evald "q_obs.csv" "q_prd.csv" "NSE" \
> --bootstrap "n_samples" 5 "len_sample" 10 "summary" 0 --dts "dts.csv"
{{{ 0.625477,  0.625477,  0.625477,  0.625477,  0.625477}},
 {{ 0.043416,  0.043416,  0.043416,  0.043416,  0.043416}},
 {{ 0.663645,  0.663645,  0.663645,  0.663645,  0.663645}}}