Abstract
To manage model risk, financial institutions need validation processes that monitor the quality of their models on an ongoing basis. Validation can be considered from both a quantitative and a qualitative point of view. Backtesting and benchmarking are key quantitative validation tools, and the focus of this paper. In backtesting, the predicted risk measurements (PD, LGD, EAD) are contrasted with observed measurements, using a workbench of available test statistics to evaluate the calibration, discrimination and stability of the model. Timely detection of reduced performance is crucial, since it directly affects profitability and risk management strategies. The aim of benchmarking is to compare internal risk measurements with external risk measurements so as to better gauge the quality of the internal rating system. This paper focuses on the quantitative PD validation process within a Basel II context. We set forth a traffic light indicator approach that employs all relevant statistical tests to quantitatively validate the PD model in use, and document this approach with a real-life case study. The methodology and tests presented summarize the authors' statistical expertise and their experience of business practices observed worldwide.
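As a rough illustration of the calibration side of backtesting, the sketch below compares a forecast PD with the observed number of defaults in one rating grade using a one-sided binomial test, and maps the p-value to a traffic light colour. It is a minimal sketch under stated assumptions, not the procedure of the paper: defaults are assumed independent (no asset correlation), and the function names and the 5% and 1% thresholds are hypothetical choices made for illustration.

```python
from math import exp, lgamma, log


def log_binom_pmf(k, n, p):
    """Log of P(X = k) for X ~ Binomial(n, p), with 0 < p < 1."""
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + k * log(p) + (n - k) * log(1 - p))


def binomial_tail(n_obligors, n_defaults, pd_forecast):
    """P(X >= n_defaults) if the forecast PD is correct.

    Assumes independent defaults, which is a simplification.
    """
    return sum(exp(log_binom_pmf(k, n_obligors, pd_forecast))
               for k in range(n_defaults, n_obligors + 1))


def traffic_light(n_obligors, n_defaults, pd_forecast,
                  amber=0.05, red=0.01):
    """Map the one-sided p-value to a colour.

    The amber and red thresholds are illustrative assumptions.
    """
    p_value = binomial_tail(n_obligors, n_defaults, pd_forecast)
    if p_value < red:
        return "red", p_value
    if p_value < amber:
        return "amber", p_value
    return "green", p_value


if __name__ == "__main__":
    # Hypothetical grade: 1000 obligors with a forecast PD of 1%.
    for observed in (10, 17, 30):
        colour, p_value = traffic_light(1000, observed, 0.01)
        print(observed, colour, round(p_value, 4))
```

A small p-value means the observed defaults are unlikely under the forecast PD, which points to underestimated risk. Because the independence assumption ignores correlation between obligors, a test of this kind will tend to flag a model more often than one that allows for correlated defaults.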
Notes
Note that actually 1−ρ is given.
Cite this article
Castermans, G., Martens, D., Gestel, T. et al. An overview and framework for PD backtesting and benchmarking. J Oper Res Soc 61, 359–373 (2010). https://doi.org/10.1057/jors.2009.69
DOI: https://doi.org/10.1057/jors.2009.69