# AUC: a misleading measure of the performance of predictive distribution models

@article{Lobo2008AUCAM,
  title={AUC: a misleading measure of the performance of predictive distribution models},
  author={Jorge M. Lobo and Alberto Jim{\'e}nez-Valverde and Raimundo Real},
  journal={Global Ecology and Biogeography},
  year={2008},
  volume={17},
  pages={145--151}
}

The area under the receiver operating characteristic (ROC) curve, known as the AUC, is currently considered to be the standard method to assess the accuracy of predictive distribution models. It avoids the supposed subjectivity in the threshold selection process, when continuous probability derived scores are converted to a binary presence‐absence variable, by summarizing overall model performance over all possible thresholds. In this manuscript we review some of the features of this measure…
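As an illustration of what "summarizing overall model performance over all possible thresholds" means in practice, the following is a minimal sketch and not code from the paper. It sweeps every distinct score as a presence/absence threshold, integrates the resulting ROC curve with the trapezoid rule, and checks the result against the equivalent pairwise (rank-based) reading of the AUC. The function names and the toy scores and labels are hypothetical and chosen only for this example.

```python
from itertools import product


def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per threshold.

    labels use 1 for presence and 0 for absence; a site is predicted
    "present" when its score is greater than or equal to the threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted present
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def auc_pairwise(scores, labels):
    """Share of (presence, absence) pairs in which the presence scores higher.

    Ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: eight sites, four presences and four absences.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

print(auc_trapezoid(roc_points(scores, labels)))  # 0.8125
print(auc_pairwise(scores, labels))               # 0.8125
```

Both routes give the same number because the AUC depends only on how presences rank relative to absences, not on the score values themselves or on any single chosen threshold.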

#### 2,269 Citations

Insights into the area under the receiver operating characteristic curve (AUC) as a discrimination measure in species distribution modelling

- Mathematics
- 2012

Aim: The area under the receiver operating characteristic (ROC) curve (AUC) is a widely used statistic for assessing the discriminatory capacity of species distribution models. Here, I used simulated…

Recommendations for using the relative operating characteristic (ROC)

- Mathematics
- Landscape Ecology
- 2013

The relative operating characteristic (ROC) is a widely-used method to measure diagnostic signals including predictions of land changes, species distributions, and ecological niches. The ROC measures…

Novel Nonparametric Methods For ROC Curves

- Mathematics
- 2016

The receiver operating characteristic (ROC) curve is a widely used graphical method for evaluating the discriminating power of a diagnostic test or a statistical model in various areas such as…

Revisiting the ROC curve for diagnostic applications with an unbalanced class distribution

- Mathematics
- 2013 8th International Workshop on Systems, Signal Processing and their Applications (WoSSPA)
- 2013

This communication investigates the impact on classifier evaluation of a high asymmetry between positive and negative classes. It points out some necessary precautions when reporting classifier…

A new concordant partial AUC and partial c statistic for imbalanced data in the evaluation of machine learning algorithms

- Medicine, Computer Science
- BMC Medical Informatics and Decision Making
- 2020

The concordant partial area under the ROC curve was proposed; unlike previous partial-measure alternatives, it maintains the characteristics of the AUC.

Plotting receiver operating characteristic and precision–recall curves from presence and background data

- Medicine
- Ecology and Evolution
- 2021

The proposed PB‐based ROC/PR plots can provide valuable complements to the existing model assessment methods, and they also provide an additional way to estimate the constant c (or species prevalence) from presence and background data.

Prevalence affects the evaluation of discrimination capacity in presence-absence species distribution models

- Mathematics
- 2021

The aim of this study is to understand how prevalence (the ratio of instances of presence to total sample size) affects the estimation of three discrimination indexes commonly used in distribution…

Rethinking receiver operating characteristic analysis applications in ecological niche modeling

- Computer Science
- 2008

It is shown that, when comparing two ROCs, using the AUC systematically undervalues models that do not provide predictions across the entire spectrum of proportional areas in the study area.

Threshold-dependence as a desirable attribute for discrimination assessment: implications for the evaluation of species distribution models

- Mathematics
- Biodiversity and Conservation
- 2013

Species distribution modelling has become a common approach in ecology in the last decades. As in any modelling exercise, evaluation of the predicted suitability surfaces is a key process, and the…

Sample size for the evaluation of presence-absence models

- Mathematics
- 2020

The effect of the training dataset sample size has been shown to have profound outcomes on the performance of species distribution models. However, the effects that the testing dataset…

#### References

SHOWING 1-10 OF 70 REFERENCES

Modifying ROC Curves to Incorporate Predicted Probabilities

- Mathematics
- 2005

The area under the ROC curve (AUC) is becoming a popular measure for the evaluation of classifiers, even more than other more classical measures, such as error/accuracy, logloss/entropy or precision.…

The meaning and use of the area under a receiver operating characteristic (ROC) curve.

- Medicine
- Radiology
- 1982

A representation and interpretation of the area under a receiver operating characteristic (ROC) curve obtained by the "rating" method, or by mathematical predictions based on patient characteristics,…

The use of the area under the ROC curve in the evaluation of machine learning algorithms

- Computer Science
- Pattern Recognit.
- 1997

AUC exhibits a number of desirable properties when compared to overall accuracy: increased sensitivity in Analysis of Variance (ANOVA) tests; a standard error that decreased as both AUC and the number of test samples increased; independence from the decision threshold; and invariance to a priori class probabilities.

Beware the Null Hypothesis: Critical Value Tables for Evaluating Classifiers

- Computer Science
- ECML
- 2005

This paper provides tables with critical values pre-computed for the normal distribution, the t-distribution, etc., for the performance metrics of binary classification: accuracy, F-measure, area under the ROC curve (AUC), and true positives in the top ten.

Receiver-operating characteristic (ROC) plots: a fundamental evaluation tool in clinical medicine.

- Chemistry, Medicine
- Clinical Chemistry
- 1993

Receiver-operating characteristic (ROC) plots provide a pure index of accuracy by demonstrating the limits of a test's ability to discriminate between alternative states of health over the complete spectrum of operating conditions.

A comparison of goodness-of-fit tests for the logistic regression model.

- Mathematics, Medicine
- Statistics in Medicine
- 1997

An examination of the performance of the tests when the correct model has a quadratic term but a model containing only the linear term has been fit shows that the Pearson chi-square, the unweighted sum-of-squares, the Hosmer-Lemeshow decile of risk, the smoothed residual sum-of-squares and Stukel's score test have power exceeding 50 per cent to detect moderate departures from linearity.

Selecting thresholds of occurrence in the prediction of species distributions

- Biology, Environmental Science
- 2005

Twelve approaches to determining thresholds were compared using two species in Europe and artificial neural networks, and the modelling results were assessed using four indices: sensitivity, specificity, overall prediction success and Cohen's kappa statistic.

Evaluating predictive models of species’ distributions: criteria for selecting optimal models

- Mathematics
- 2003

The Genetic Algorithm for Rule-Set Prediction (GARP) is one of several current approaches to modeling species’ distributions using occurrence records and environmental data. Because of…

Coefficient Kappa: Some Uses, Misuses, and Alternatives

- Mathematics
- 1981

This paper considers some appropriate and inappropriate uses of coefficient kappa and alternative kappa-like statistics. Discussion is restricted to the descriptive characteristics of these…

Measuring the accuracy of diagnostic systems.

- Computer Science, Medicine
- Science
- 1988

For diagnostic systems used to distinguish between two classes of events, analysis in terms of the "relative operating characteristic" of signal detection theory provides a precise and valid measure of diagnostic accuracy.