On the suitability of resampling techniques for the class imbalance problem in credit scoring

  • General Paper
Journal of the Operational Research Society

Abstract

In real-life credit scoring applications, defaulters are very commonly under-represented relative to non-defaulters, yet this class imbalance has received little attention. The present paper investigates the suitability and performance of several resampling techniques applied in conjunction with statistical and artificial intelligence prediction models over five real-world credit data sets, which have been artificially modified to produce different imbalance ratios (the proportion of defaulter to non-defaulter examples). Experimental results demonstrate that resampling consistently improves on the performance obtained with the original imbalanced data. In general, over-sampling techniques also perform better than any under-sampling approach.
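
To make the two families of resampling concrete, the following is a minimal sketch of random over-sampling and random under-sampling on a toy data set. It is an illustration only: the abstract does not say which resampling algorithms the paper evaluates, so these two are the simplest representatives of each family and not the authors' method. The function names, the toy data, the label coding (1 for defaulter, 0 for non-defaulter) and the definition of the imbalance ratio as majority count divided by minority count are all assumptions made for this example.

```python
import random


def imbalance_ratio(labels, minority=1):
    """Majority count divided by minority count (an assumed definition)."""
    n_min = sum(1 for label in labels if label == minority)
    n_maj = len(labels) - n_min
    return n_maj / n_min


def _split_indices(labels, minority):
    """Return (minority indices, majority indices)."""
    min_idx = [i for i, label in enumerate(labels) if label == minority]
    maj_idx = [i for i, label in enumerate(labels) if label != minority]
    return min_idx, maj_idx


def random_oversample(rows, labels, minority=1, seed=0):
    """Duplicate randomly chosen minority rows until the classes are balanced."""
    rng = random.Random(seed)
    min_idx, maj_idx = _split_indices(labels, minority)
    n_extra = max(0, len(maj_idx) - len(min_idx))
    extra = [rng.choice(min_idx) for _ in range(n_extra)]
    keep = maj_idx + min_idx + extra
    return [rows[i] for i in keep], [labels[i] for i in keep]


def random_undersample(rows, labels, minority=1, seed=0):
    """Discard randomly chosen majority rows until the classes are balanced."""
    rng = random.Random(seed)
    min_idx, maj_idx = _split_indices(labels, minority)
    kept_maj = rng.sample(maj_idx, min(len(min_idx), len(maj_idx)))
    keep = kept_maj + min_idx
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Hypothetical toy data: 2 defaulters (label 1) and 8 non-defaulters (label 0).
rows = [[i, 2 * i] for i in range(10)]
labels = [1, 1] + [0] * 8

over_rows, over_labels = random_oversample(rows, labels)
under_rows, under_labels = random_undersample(rows, labels)

print("original ratio:", imbalance_ratio(labels))
print("after over-sampling:", over_labels.count(1), "vs", over_labels.count(0))
print("after under-sampling:", under_labels.count(1), "vs", under_labels.count(0))
```

Over-sampling keeps every original example and grows the training set, whereas under-sampling shrinks it by discarding majority examples. In practice resampling would be applied to the training partition only, so that the test data keep the original class distribution.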

Acknowledgements

This work was partially supported by the Spanish Ministry of Education and Science under grant TIN2009-14205 and by the Generalitat Valenciana under grant PROMETEO/2010/028.

About this article

Cite this article

Marqués, A., García, V. & Sánchez, J. On the suitability of resampling techniques for the class imbalance problem in credit scoring. J Oper Res Soc 64, 1060–1070 (2013). https://doi.org/10.1057/jors.2012.120

  • DOI: https://doi.org/10.1057/jors.2012.120
