Performance metrics

class pystran.Evaluation(observed, modelled)[source]

Class for deriving different evaluation criteria.

References

[E1](1, 2, 3) Gupta H.V., Sorooshian S., Yapo P.O. (1998), Toward improved calibration of hydrologic models: Multiple and noncommensurable measures of information, Water Resources Research, pp. 751-763
[E2] H. Hauduc, M. B. Neumann, D. Muschalla, V. Gamerith, S. Gillot and P.A. Vanrolleghem (2011), Towards quantitative quality criteria to evaluate simulation results in wastewater treatment – A critical review. Proceedings 8th symposium on systems analysis and integrated assessment (Watermatex 2011)
AME()[source]

Absolute Maximum Error

Notes

The absolute maximum error indicates the maximum error of the model [E1]. This criterion is very sensitive to outliers.

  • range: [0, inf]
  • optimum: 0
  • category: Absolute criteria
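
As an illustration, the absolute maximum error can be sketched in plain Python as the largest absolute residual. The helper name `absolute_maximum_error` and the sample data are hypothetical, and the library's own implementation may differ.

```python
def absolute_maximum_error(observed, modelled):
    """Largest absolute difference between paired observed and modelled values."""
    return max(abs(o - m) for o, m in zip(observed, modelled))


observed = [1.0, 2.0, 3.0, 4.0]
modelled = [1.1, 1.9, 3.2, 9.0]  # the last value is an outlier

# A single outlier determines the result, however good the fit is elsewhere.
print(absolute_maximum_error(observed, modelled))
```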
APBIAS()[source]

Absolute Percent Bias

Notes

Useful in combination with PBIAS: e.g. if PBIAS is small and APBIAS is very large, one could conclude that the volumes are acceptable but the timing is off (continuous gap).

  • range: [0, inf]
  • optimum: 0
  • category: Total Relative error criteria
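
The interplay with PBIAS described above can be sketched as follows. This is a minimal illustration that assumes both criteria are sums of residuals (observed minus modelled) expressed as a percentage of the observed total; the function names and data are hypothetical and not taken from the library.

```python
def percent_bias(observed, modelled):
    """Signed sum of residuals as a percentage of the observed total (assumed form)."""
    return 100.0 * sum(o - m for o, m in zip(observed, modelled)) / sum(observed)


def absolute_percent_bias(observed, modelled):
    """Sum of absolute residuals as a percentage of the observed total (assumed form)."""
    return 100.0 * sum(abs(o - m) for o, m in zip(observed, modelled)) / sum(observed)


observed = [1.0, 5.0, 1.0, 1.0]
modelled = [1.0, 1.0, 5.0, 1.0]  # same total volume, peak one step late

print(percent_bias(observed, modelled))           # signed errors cancel
print(absolute_percent_bias(observed, modelled))  # absolute errors accumulate
```

A small signed value next to a large absolute value is the pattern the note refers to: the volumes match but the timing does not.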
BIAS(optim=False)[source]

Bias E[obs-mod]

Parameters:

optim : bool

if True, the objective is translated to be used in optimizations where the algorithm seeks a minimum value

Notes

  • range: [-inf, inf]
  • optimum: 0
  • category: Total Relative error criteria
CrBal(optim=False)[source]

Balance Criterion

Parameters:

optim : bool

if True, the objective is translated to be used in optimizations where the algorithm seeks a minimum value (1 - CrBal)

Notes

[E9] use the balance criterion to measure the ability of the model to reproduce the same cumulative value as observed. The difference between the inverse fractions penalises larger differences between observed and modelled cumulative values.

  • range: [-inf, 1]
  • optimum: 1
  • category: Total Relative error criteria
HighFDCE()[source]

Flow Duration Curve based high flow criterion

Notes

Uses the upper part (lowest percentiles) of the Flow Duration Curve to focus on high flow regimes. Always use in combination with a second criterion to make sure the timing of the model is also satisfying. Used in [E15].

IA(optim=False)[source]

Index of agreement

Parameters:

optim : bool

if True, the objective is translated to be used in optimizations where the algorithm seeks a minimum value

Notes

Index of agreement is the ratio of the sum of squared errors (SSE) and the largest potential error with respect to the mean of the observed values [E4]. This is sensitive to the model mean and to the peak values, and is insensitive to low magnitude values.

  • range: [0, 1]
  • optimum: 1
  • category: comparison with reference model
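
A minimal sketch of the index of agreement, assuming Willmott's usual form in which the ratio of SSE to the largest potential error is subtracted from one (consistent with the optimum of 1 listed above). The function name and the data are hypothetical.

```python
def index_of_agreement(observed, modelled):
    """One minus SSE divided by the largest potential error (assumed form)."""
    obs_mean = sum(observed) / len(observed)
    sse = sum((o - m) ** 2 for o, m in zip(observed, modelled))
    # Largest potential error: distances of both series to the observed mean.
    potential = sum(
        (abs(m - obs_mean) + abs(o - obs_mean)) ** 2
        for o, m in zip(observed, modelled)
    )
    return 1.0 - sse / potential


observed = [1.0, 2.0, 3.0]

print(index_of_agreement(observed, observed))         # perfect match
print(index_of_agreement(observed, [2.0, 2.0, 2.0]))  # constant mean model
```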
LowFDCE()[source]

Flow Duration Curve based low flow criterion

Notes

Uses the lower part (highest percentiles) of the Flow Duration Curve to focus on low flow regimes. Always use in combination with a second criterion to make sure the timing of the model is also satisfying. Used in [E15].

MAE()[source]

Mean Absolute Error

Notes

The mean absolute error indicates the average magnitude of the model error (accuracy) [E4]. Taking the absolute value avoids error compensation, but does not indicate the direction of the deviation.

  • range: [0, inf]
  • optimum: 0
  • category: Absolute criteria

References

[E4](1, 2, 3) Willmott C.J., Ackleson S.G., Davis R.E., Feddema J.J., Klink K.M., Legates D.R., O’Donnell J. and Rowe C.M. (1985) Statistics for the evaluation and comparison of models. Journal of Geophysical Research, 90(C5), 8995-9005.
MAPE()[source]

Mean Absolute Percent Error

Notes

The mean absolute percent error used by [E3] is close to MARE. However, the errors are relative to the predicted values instead of the observed values. Consequently, the under-predicted values are penalised (for a similar error). This is an interesting criterion for situations in which one wants to determine a risk to reach concentration limits.

  • range: [0, inf]
  • optimum: 0
  • category: Relative criteria
MARE()[source]

Mean Absolute Relative Error

Notes

The mean absolute relative error is similar to the Mean Relative Error, but avoids the compensation of errors [E7].

  • range: [0, inf]
  • optimum: 0
  • category: Relative criteria

References

[E7] Petersen B., Gernaey K., Henze M. and Vanrolleghem P.A. (2002) Evaluation of an ASM1 model calibration procedure on a municipal-industrial wastewater treatment plant. Journal of Hydroinformatics, 4(1), 15-38.
ME(optim=False)[source]

Mean Error

Parameters:

optim : bool

if True, the objective is translated to be used in optimizations where the algorithm seeks a minimum value

Notes

The mean of residuals allows highlighting the existence of systematic bias, i.e. characteristic of a model leading to systematic over- or under-prediction [E1]. However, with this criterion errors can compensate each other, so no information on the magnitude of the errors is obtained.

  • range: [-inf, inf]
  • optimum: 0
  • category: Absolute criteria

References

[E3](1, 2, 3) Power M. (1993) The predictive validation of ecological and environmental models. Ecological Modelling, 68(1-2), 33-50.
MPE(optim=False)[source]

Mean Percent Error

Parameters:

optim : bool

if True, the objective is translated to be used in optimizations where the algorithm seeks a minimum value

See also

MRE

Notes

The mean percent error [E3] provides the average relative model error. However, negative and positive errors can compensate for each other.

  • range: [-inf, inf]
  • optimum: 0
  • category: Relative criteria
MRE(optim=False)[source]

Mean Relative Error

Parameters:

optim : bool

if True, the objective is translated to be used in optimizations where the algorithm seeks a minimum value

Notes

The mean relative error [E5] provides the average relative model error. However, negative and positive errors can compensate for each other.

  • range: [-inf, inf]
  • optimum: 0
  • category: Relative criteria
MSDE()[source]

Mean Square Derivative Error

Notes

The mean square derivative error is the square of the differences of predicted and observed variations between two time steps [E5]. This criterion penalizes noisy time series and series with timing error; it thus allows evaluating the peak’s timing.

  • range: [0, inf]
  • optimum: 0
  • category: Absolute criteria
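
The idea of comparing step-to-step variations can be sketched as below. The function name is hypothetical, and averaging over the number of steps is an assumption about the normalisation.

```python
def mean_square_derivative_error(observed, modelled):
    """Mean squared difference between observed and modelled step changes."""
    obs_steps = [b - a for a, b in zip(observed, observed[1:])]
    mod_steps = [b - a for a, b in zip(modelled, modelled[1:])]
    squared = [(o - m) ** 2 for o, m in zip(obs_steps, mod_steps)]
    return sum(squared) / len(squared)


observed = [0.0, 1.0, 4.0, 1.0, 0.0]
offset = [o + 2.0 for o in observed]  # constant bias, same dynamics
shifted = [0.0, 0.0, 1.0, 4.0, 1.0]   # same shape, peak one step late

print(mean_square_derivative_error(observed, offset))   # bias is not penalised
print(mean_square_derivative_error(observed, shifted))  # timing error is penalised
```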
MSE()[source]

Mean Squared Error

Notes

The mean square error avoids error compensations and emphasises high errors [E4].

  • range: [0, inf]
  • optimum: 0
  • category: Absolute criteria
MSLE()[source]

Mean Squared Logarithm Error

Notes

The mean square logarithm error is the sum of the squares of the differences of the natural logarithm of the predicted and observed value [E5]. It emphasises low magnitude errors.

  • range: [0, inf]
  • optimum: 0
  • category: Absolute criteria
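
A minimal sketch of why the logarithm shifts the emphasis to low magnitudes, assuming the squared log differences are averaged over the series and all values are strictly positive. The function name and data are hypothetical.

```python
import math


def mean_square_logarithm_error(observed, modelled):
    """Mean of squared differences between natural logs (values must be > 0)."""
    squared = [(math.log(m) - math.log(o)) ** 2 for o, m in zip(observed, modelled)]
    return sum(squared) / len(squared)


# The same absolute error of 1.0 weighs far more at a low magnitude.
low = mean_square_logarithm_error([1.0], [2.0])
high = mean_square_logarithm_error([100.0], [101.0])
print(low, high)
```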

References

[E5](1, 2, 3, 4, 5, 6, 7) Dawson C.W., Abrahart R.J. and See L.M. (2010) HydroTest: Further development of a web resource for the standardised assessment of hydrological models. Environmental Modelling and Software, 25(11), 1481-1482.
MSRE()[source]

Mean Square Relative Error

Notes

The mean square relative error avoids compensation of errors and emphasises larger relative errors [E5].

  • range: [0, inf]
  • optimum: 0
  • category: Relative criteria
MSSoE()[source]

Mean Squared Sorted Errors

Notes

The mean square error of sorted errors is calculated based on sorted observed and predicted data (van Griensven and Bauwens, 2003). Observations and predictions are sorted independently one from the other. The sorted series are then compared (comparison of their cumulative distributions) and it is evaluated whether the model reproduces the same distribution as the observed data.

The time of occurrence of a given value of the variable is not accounted for in the MSSoE method.

  • range: [0, inf]
  • optimum: 0
  • category: Absolute criteria, timestep independent
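
The independent sorting step described above can be sketched as follows. The function name and data are hypothetical; the sketch assumes both series have the same length.

```python
def mean_square_sorted_error(observed, modelled):
    """Mean squared error between the independently sorted series."""
    pairs = zip(sorted(observed), sorted(modelled))
    return sum((o - m) ** 2 for o, m in pairs) / len(observed)


observed = [1.0, 5.0, 2.0, 3.0]
reordered = [3.0, 1.0, 5.0, 2.0]  # same values, different timing
biased = [2.0, 6.0, 3.0, 4.0]     # every value 1.0 too high

print(mean_square_sorted_error(observed, reordered))  # timing is ignored
print(mean_square_sorted_error(observed, biased))     # distribution differs
```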

References

[E6] van Griensven A. and Bauwens W. (2003) Multiobjective autocalibration for semidistributed water quality models. Water Resources Research, 39(12), SWC91-SWC99.
MeAPE()[source]

Median Absolute Percent Error

Notes

Median of the absolute relative error expressed in percentage [E5]. This criterion is less affected by outliers and by the shape of the error distribution than the MARE criterion.

  • range: [0, inf]
  • optimum: 0
  • category: Relative criteria
NSC()[source]

Number of Sign Changes of the residuals

Notes

The number of sign changes [E1] counts the number of times the sign of the residual (Oi-Pi) changes. The minimum value is zero and the maximum is n. A value close to zero indicates a systematic error (an over- or under-estimating model) but a more consistent model. A value close to n indicates a random error.

  • range: [0, nsize]
  • optimum: /
  • category: Absolute criteria
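
Counting the sign changes of the residuals can be sketched as below. The function name is hypothetical, and skipping residuals that are exactly zero is an assumption made for this illustration.

```python
def number_of_sign_changes(observed, modelled):
    """Count how often consecutive non-zero residuals (Oi - Pi) switch sign."""
    residuals = [o - m for o, m in zip(observed, modelled)]
    signs = [r > 0 for r in residuals if r != 0]  # assumption: ignore exact zeros
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


observed = [2.0, 3.0, 4.0, 5.0]
underestimating = [1.0, 2.0, 3.0, 4.0]  # residuals always positive
alternating = [3.0, 2.0, 5.0, 4.0]      # residuals flip at every step

print(number_of_sign_changes(observed, underestimating))  # systematic error
print(number_of_sign_changes(observed, alternating))      # random-like error
```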
NSE(optim=False)[source]

Nash-Sutcliffe Efficiency criterion

Parameters:

optim : bool

if True, the objective is translated to be used in optimizations where the algorithm seeks a minimum value

Notes

Widely used criterion in hydrology, with values ranging from -inf to 1. A zero value means the model is no better than the ‘no knowledge’ model, which is characterised by the mean of the observations. Sensitive to extreme values.

  • range: [-inf, 1]
  • optimum: 1
  • category: comparison with reference model
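
The comparison with the ‘no knowledge’ model can be sketched with the standard Nash-Sutcliffe formula. The function name and data are hypothetical; this is an illustration, not the library's code.

```python
def nash_sutcliffe_efficiency(observed, modelled):
    """One minus SSE divided by the squared deviations from the observed mean."""
    obs_mean = sum(observed) / len(observed)
    sse = sum((o - m) ** 2 for o, m in zip(observed, modelled))
    obs_variation = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - sse / obs_variation


observed = [1.0, 2.0, 3.0, 4.0]
mean_model = [2.5, 2.5, 2.5, 2.5]  # the 'no knowledge' model
reversed_model = [4.0, 3.0, 2.0, 1.0]

print(nash_sutcliffe_efficiency(observed, observed))        # optimum
print(nash_sutcliffe_efficiency(observed, mean_model))      # no better than the mean
print(nash_sutcliffe_efficiency(observed, reversed_model))  # worse than the mean
```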
NSE_BIAS()[source]

Combination of Nash-Sutcliffe and BIAS

Notes

The criterion is gaining importance because of the combined effect and is proposed in [E16]. Here an adaptation is implemented by taking the absolute value of the bias, to make the function symmetrical around the optimal value.

References

[E16] Viney, N.R., J. Perraud, J. Vaze, F.H.S. Chiew, D.A. Post and A. Yang (2009b). The usefulness of bias constraints in model calibration for regionalisation to ungauged catchments. Proceedings, MODSIM 200
NSE_FDChigh(w1=1.0, w2=1.0)[source]

Nash-Sutcliffe (mod) + high flow; same weighting factor: if the error is larger, both are larger too!

Parameters:

w1 : float (0-1)

weighting factor 1, NSE

w2 : float (0-1)

weighting factor 2, FDC

NSE_FDClow(w1=1.0, w2=1.0)[source]

Nash-Sutcliffe (mod) + low flow; same weighting factor: if the error is larger, both are larger too!

Parameters:

w1 : float (0-1)

weighting factor 1, NSE

w2 : float (0-1)

weighting factor 2, FDC

NSE_boxcox(optim=False, llambda=0.25)[source]

Nash-Sutcliffe Efficiency criterion with boxcox transformed values

Parameters:

optim : bool

if True, the objective is translated to be used in optimizations where the algorithm seeks a minimum value

Notes

Widely used criterion in hydrology, with values ranging from -inf to 1. A zero value means the model is no better than the ‘no knowledge’ model, which is characterised by the mean of the observations.

Model residuals typically increase with higher flow values. This means that the model residual variance or standard deviation typically increases with increasing flow. It also means that the higher flow values receive more weight in the goodness-of-fit statistics [E10].

  • range: [-inf, 1]
  • optimum: 1
  • category: comparison with reference model
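
To illustrate how a Box-Cox transformation reduces the weight of high flows, the sketch below applies the standard transform (x**llambda - 1) / llambda before computing a Nash-Sutcliffe efficiency. The helper names are hypothetical, the values must be positive, and whether the library transforms the data in exactly this way is an assumption.

```python
def box_cox(values, llambda=0.25):
    """Standard Box-Cox transform for llambda != 0 and positive values."""
    return [(x ** llambda - 1.0) / llambda for x in values]


def nse_box_cox(observed, modelled, llambda=0.25):
    """Nash-Sutcliffe efficiency computed on Box-Cox transformed series."""
    obs_t = box_cox(observed, llambda)
    mod_t = box_cox(modelled, llambda)
    obs_mean = sum(obs_t) / len(obs_t)
    sse = sum((o - m) ** 2 for o, m in zip(obs_t, mod_t))
    return 1.0 - sse / sum((o - obs_mean) ** 2 for o in obs_t)


flows = [1.0, 16.0, 81.0, 256.0]

# The transformed series is far more evenly spaced than the raw flows.
print(box_cox(flows))
print(nse_box_cox(flows, flows))
```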

References

[E10] Willems, P. A Time Series Tool to Support the Multi-criteria Performance Evaluation of Rainfall-runoff Models. Environmental Modelling & Software 24, no. 3 (March 2009): 311–321. http://linkinghub.elsevier.com/retrieve/pii/S1364815208001606.
NSE_log(optim=False)[source]

Nash-Sutcliffe Efficiency criterion with logarithmic values

Parameters:

optim : bool

if True, the objective is translated to be used in optimizations where the algorithm seeks a minimum value

Notes

Widely used criterion in hydrology, with values ranging from -inf to 1. A zero value means the model is no better than the ‘no knowledge’ model, which is characterised by the mean of the observations. The log values of the observed and modelled values are used to give more emphasis to the lower values.

  • range: [-inf, 1]
  • optimum: 1
  • category: comparison with reference model
NSE_sqrt(optim=False)[source]

Nash-Sutcliffe Efficiency criterion with root values

Parameters:

optim : bool

if True, the objective is translated to be used in optimizations where the algorithm seeks a minimum value

Notes

Widely used criterion in hydrology, with values ranging from -inf to 1. A zero value means the model is no better than the ‘no knowledge’ model, which is characterised by the mean of the observations. The root values of the observed and modelled values are used to give more emphasis to the lower values.

  • range: [-inf, 1]
  • optimum: 1
  • category: comparison with reference model
PBIAS(optim=False)[source]

Percent Bias

Parameters:

optim : bool

if True, the objective is translated to be used in optimizations where the algorithm seeks a minimum value

Notes

The percent bias [E5] and relative volume error are the sum of errors related to the sum of observed values, expressed as relative value or in percentage. This criterion measures an overall adequacy, but the errors can be compensated.

(Also known as DEVRV, the Deviation of runoff volumes; from Statistical evaluation of WATFLOOD, Angela MacLean, University of Waterloo)

  • range: [-inf, inf]
  • optimum: 0
  • category: Total Relative error criteria
PDIFF(optim=False)[source]

Peak Difference

Parameters:

optim : bool

if True, the objective is translated to be used in optimizations where the algorithm seeks a minimum value

Notes

This criterion evaluates how well the highest modelled value matches the highest observed value in percent. However, it does not take into account whether the max(Oi) and max(Pi) occur at the same time-step i.

Consequently, in case of multiple events on the same time-series, first the single events must be extracted from the whole time series to have less chance to mix up with peaks from another event.

  • range: [-inf, inf]
  • optimum: 0
  • category: single event
PEP(optim=False)[source]

Percent Error In Peak

Parameters:

optim : bool

if True, the objective is translated to be used in optimizations where the algorithm seeks a minimum value

Notes

This criterion evaluates how well the highest modelled value matches the highest observed value in percent. However, it does not take into account whether the max(Oi) and max(Pi) occur at the same time-step i.

Consequently, in case of multiple events on the same time-series, first the single events must be extracted from the whole time series to have less chance to mix up with peaks from another event.

  • range: [-inf, inf]
  • optimum: 0
  • category: single event
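
A minimal sketch of a peak comparison in percent, assuming the difference between the two maxima is taken as observed minus modelled and divided by the observed peak. The function name and the sign convention are assumptions for illustration.

```python
def percent_error_in_peak(observed, modelled):
    """Difference between the two maxima as a percentage of the observed peak."""
    peak_obs = max(observed)
    return 100.0 * (peak_obs - max(modelled)) / peak_obs


observed = [1.0, 10.0, 2.0, 1.0]
on_time = [1.0, 8.0, 2.0, 1.0]  # peak too low, correct time step
late = [1.0, 2.0, 8.0, 1.0]     # peak too low, one step late

# Both models score the same because the timing of the peak is ignored.
print(percent_error_in_peak(observed, on_time))
print(percent_error_in_peak(observed, late))
```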
PI(optim=False)[source]

Coefficient of Persistence

Parameters:

optim : bool

if True, the objective is translated to be used in optimizations where the algorithm seeks a minimum value

Notes

The coefficient of persistence is close to the NSE criterion, but the simplistic model used is the last observed value instead of the mean of observed values [E12].

  • range: [0, 1]
  • optimum: 1
  • category: comparison with reference model
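
The role of the last observed value as reference model can be sketched as below. The function name is hypothetical, and starting the sums at the second time step is an assumption that follows from the reference model needing a previous observation.

```python
def coefficient_of_persistence(observed, modelled):
    """One minus SSE divided by the SSE of the 'last observed value' model."""
    # The first time step has no previous observation, so it is left out.
    sse = sum((o - m) ** 2 for o, m in zip(observed[1:], modelled[1:]))
    naive = sum((b - a) ** 2 for a, b in zip(observed, observed[1:]))
    return 1.0 - sse / naive


observed = [1.0, 2.0, 4.0, 7.0]
persistence = [1.0, 1.0, 2.0, 4.0]  # each value repeats the previous observation

print(coefficient_of_persistence(observed, observed))     # optimum
print(coefficient_of_persistence(observed, persistence))  # equals the reference model
```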
R4MS4E()[source]

Root 4 Mean Square 4 Error

See also

RMSE

Notes

To put even more emphasis on the larger errors, the fourth root of the mean of the errors raised to the fourth power is used [E5].

RAE()[source]

Relative Absolute Error

Notes

The RAE compares the sum of the absolute residuals to the residuals of the no knowledge model (mean of observed values) [E11]. This criterion does not allow error compensation.

  • range: [0, inf]
  • optimum: 0
  • category: comparison with reference model
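
As a sketch of this comparison, the ratio below divides the summed absolute residuals of the model by those of the mean of the observations. The function name and data are hypothetical.

```python
def relative_absolute_error(observed, modelled):
    """Summed absolute residuals relative to those of the observed-mean model."""
    obs_mean = sum(observed) / len(observed)
    model_error = sum(abs(o - m) for o, m in zip(observed, modelled))
    reference_error = sum(abs(o - obs_mean) for o in observed)
    return model_error / reference_error


observed = [1.0, 2.0, 3.0, 4.0]

print(relative_absolute_error(observed, observed))              # optimum
print(relative_absolute_error(observed, [2.5, 2.5, 2.5, 2.5]))  # no knowledge model
```

Values above 1 would indicate a model that performs worse than the mean of the observations.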

References

[E11](1, 2) Legates D.R. and McCabe G.J. (1999) Evaluating the use of ‘goodness-of-fit’ measures in hydrologic and hydroclimatic model validation. Water Resources Research, 35(1), 233-241
RCOEF(optim=False)[source]

Correlation coefficient

Parameters:

optim : bool

if True, the objective is translated to be used in optimizations where the algorithm seeks a minimum value

Notes

Used to describe how well a regression line fits a set of data; compares variability in observed and modelled values. In general not the best criterion to check model performance, see more details in [E11].

  • range: [0, 1]
  • optimum: 1
  • category: comparison with reference model
RFLAUT(theta=1, method='biomath')[source]

First (or higher) lag autocorrelation; higher values of theta give the higher lag

Parameters:

theta : int

lag to calculate

method : biomath|gupta|anders

method to calculate the correlation

Notes

Calculates the first lag of the autocorrelation of the residuals, according to the version proposed by [E1] when method ‘gupta1998’ is chosen. The default is the biomath version, as proposed by Gujer (2008); more information is given in [E12].

  • range: [0, 1]
  • optimum: 0
  • category: others

References

[E13] Cierkens, Katrijn. Investigating Bioprocess Model Output Uncertainty as Function of Input Data Quantity and Model Structure. Ghent University, 2010.

RMAE()[source]

Relative Mean Absolute Error

Notes

The relative mean absolute error is the sum of absolute errors related to the sum of observed data [E8]. The difference with the PBIAS and RVE is that errors are not compensated.

  • range: [0, inf]
  • optimum: 0
  • category: Total Relative error criteria

References

[E8](1, 2) Elliott J.A., Irish A.E., Reynolds C.S. and Tett P. (2000) Modelling freshwater phytoplankton communities: an exercise in validation. Ecological Modelling, 128(1), 19-26.
RMSE()[source]

Root Mean Square Error

Notes

The root mean square error is an absolute criterion that is often used [E4]. It indicates the overall agreement between predicted and observed data. The square allows avoiding error compensation and emphasises larger errors. The root provides a criterion in actual units. Consequently, this quality criterion can be compared to the MAE to provide information on the prominence of outliers in the dataset.
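
The comparison with the MAE mentioned above can be sketched as follows. The function names and data are hypothetical; both series carry the same mean absolute error, but only one contains an outlier.

```python
import math


def mean_absolute_error(observed, modelled):
    """Average magnitude of the residuals."""
    return sum(abs(o - m) for o, m in zip(observed, modelled)) / len(observed)


def root_mean_square_error(observed, modelled):
    """Square root of the mean squared residual, in the units of the data."""
    squared = [(o - m) ** 2 for o, m in zip(observed, modelled)]
    return math.sqrt(sum(squared) / len(squared))


observed = [0.0, 0.0, 0.0, 0.0]
evenly_off = [1.0, 1.0, 1.0, 1.0]   # same error at every step
one_outlier = [0.0, 0.0, 0.0, 4.0]  # one large error

for modelled in (evenly_off, one_outlier):
    mae = mean_absolute_error(observed, modelled)
    rmse = root_mean_square_error(observed, modelled)
    # The gap between RMSE and MAE grows when outliers dominate the error.
    print(mae, rmse)
```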

RMSE_boxcox(llambda=0.25)[source]

Root Mean Square Error with Box-Cox transformed values

Notes

The root mean square error is an absolute criterion that is often used [E4]. It indicates the overall agreement between predicted and observed data. The square allows avoiding error compensation and emphasises larger errors. The root provides a criterion in actual units. Consequently, this quality criterion can be compared to the MAE to provide information on the prominence of outliers in the dataset. Also applied in [E14].

RMSE_log(llambda=0.25)[source]

Root Mean Square Error with Box-Cox transformed values

Notes

The root mean square error is an absolute criterion that is often used [E4]. It indicates the overall agreement between predicted and observed data. The square allows avoiding error compensation and emphasises larger errors. The root provides a criterion in actual units. Consequently, this quality criterion can be compared to the MAE to provide information on the prominence of outliers in the dataset.

RRMSE()[source]

Relative Root Mean Square Error

See also

RMSE

Notes

The relative Root Mean Square Error is the Root Mean Square Error divided by the mean of the observations.

RSR()[source]

RMSE-observation standard deviation ratio

Notes

The RMSE-observation standard deviation ratio is the RMSE of the predicted data divided by the RMSE of the no knowledge model (mean of observed values), [E12]. It is a scaled criterion that emphasises larger errors and can be, as for MAE and RMSE, compared to the RAE to indicate the influence of larger errors.

  • range: [0, inf]
  • optimum: 0
  • category: comparison with reference model

References

[E12](1, 2, 3) Moriasi D.N., Arnold J.G., Van Liew M.W., Bingner R.L., Harmel R.D. and Veith T.L. (2007) Model evaluation guidelines for systematic quantification of accuracy in watershed simulations. Transactions of the ASABE, 50(3), 885-900
RVE()[source]

Relative Volume Error

Notes

The relative volume error is the sum of errors related to the sum of observed values, expressed as relative value or in percentage. This criterion measures an overall adequacy, but the errors can be compensated.

  • range: [-inf, inf]
  • optimum: 0
  • category: Total Relative error criteria
SARE()[source]

Sum of Absolute Relative Error

Notes

  • range: [0, inf]
  • optimum: 0
  • category: Relative criteria
SFDCE()[source]

Slope of the flow duration curve error

Notes

Based on the FDC, measures how well the model captures the distribution of mid-level flow. The slope of a watershed’s flow duration curve indicates the variability, or flashiness, of its flow magnitudes. The SFDCE metric is thus simply the absolute error in the slope of the flow duration curve between the 30 and 70 percentile flows.

References

[E14]van Werkhoven, Kathryn, Thorsten Wagener, Patrick Reed, and Yong Tang. Sensitivity-guided Reduction of Parametric Dimensionality for Multi-objective Calibration of Watershed Models. Advances in Water Resources 32, no. 8 (2009): 1154–1169. http://dx.doi.org/10.1016/j.advwatres.2009.03.002.
SSE()[source]

Sum of Squared Errors (of prediction)

Notes

  • range: [0, inf]
  • optimum: 0
  • category: Absolute criteria
TMC()[source]

Total Mass Controller

Notes

[E6] use the Total Mass Controller criterion as an objective function. This criterion compares the cumulative predicted and observed values.

  • range: [0, inf]
  • optimum: 0
  • category: Total Relative error criteria
ThInC()[source]

Theil's Inequality Coefficient

Notes

Theil’s inequality coefficient used by [E3] and [E8] is the mean square error divided by the sum of observed data. This criterion avoids error compensation and emphasises larger errors.

  • range: [0, inf]
  • optimum: 0
  • category: Total Relative error criteria
check_boxcox(llambda)[source]

Function to evaluate the effect of the Box-Cox transformation applied to the data, cf. the WETSPRO tool and the hydromad application

infodict()[source]

Prepares information dictionary