INTRODUCTION

Whether the Sharpe ratio is an appropriate performance index for ranking financial products remains a controversial question among both academics and practitioners (see Adcock et al1). The academic criticisms of the ratio are well known: although a trade-off ratio based on mean and variance is fully compatible with normally distributed returns (or, in general, with elliptical returns), it may lead to incorrect evaluations when returns exhibit heavy tails (see, for example, Leland,2 Bernardo and Ledoit,3 Campbell and Kräussl4). During the last two decades, numerous alternative ratios have been proposed in the literature (see, for example, Biglova et al,5 Menn et al6 among others); a recent review has classified more than 100 of them (see Cogneau and Hübner7). Nevertheless, more sophisticated tools require more information for proper implementation, and this is not without cost, both in time and effort. Furthermore, the Sharpe ratio is still a popular and widespread tool both in academia and practice (see Kouwenberg,8 Ammann and Moerth,9 Yu et al,10 Chincarini and Kim,11 Hemminki and Puttonen12). Thus, a crucial question faced by many practitioners is to quantify the difference in results between sophisticated decision aid systems and the classical Sharpe ratio.

A first attempt to answer this question was made by Eling and Schuhmacher13 using hedge fund data. They show that the rank correlations between the Sharpe ratio and a set of performance ratios based on downside risk measures are virtually equal to 1, leading to the apparent conclusion that the Sharpe ratio produces rankings very similar to those of other performance measures. That research, however, does not take into account the class of tailor-made ratios, as considered in Sortino and Satchell,14 Biglova et al,5 and Farinelli et al.15 These measures are especially relevant for investors in hedge funds (that is, sophisticated investors, such as pension funds and endowments; see Agarwal and Naik16) who might seek a different risk profile from that of, for example, investors in mutual funds. Fitting a performance measure to investor preferences is exactly what tailor-made performance ratios accomplish.

The aim of this article is to go a step further than previous research and study the possible mismatch between the Sharpe ratio and tailor-made ratios. Specifically, we focus on the families of Sortino–Satchell (see Sortino and Satchell14), Farinelli–Tibiletti (see Farinelli and Tibiletti17, 18), and Rachev ratios (see Biglova et al5). The parameters in these ratios allow flexibility in choosing which part of the return distribution to focus on, so that the ratios can be tailored to the financial products under consideration and/or the investor risk profile. For the purpose of our analysis, we consider a large international database consisting of 4048 hedge funds.

Our empirical analysis confirms Eling and Schuhmacher's13 results. When the tailor-made ratios describe moderate investment styles and the quantitative analysis concerns the entire return distribution (as in the case of the ratios analyzed in Eling and Schuhmacher13), rankings are not too dissimilar to those established with the Sharpe ratio. However, as the parameters move to extreme values, making the ratios tailored to more aggressive investment styles, discrepancies with the Sharpe ratio ranking can be observed. As expected, the most discordant results are achieved for aggressive Rachev ratios, where only extreme tail events are taken into consideration.

The remainder of this article is organized as follows. The following section provides an overview of the tailor-made performance ratios. The data and methodology are then presented, and afterwards we analyze the results. In the final section we conclude.

TAILOR-MADE PERFORMANCE RATIOS

The challenge in ranking financial prospects is to choose a ratio that is able not only to identify the best return/risk trade-off, but also to match the investor's goals and/or the investment style of the financial products under consideration. Tailor-made performance ratios are designed to meet exactly this requirement. In the following, we deal with the one-size Sharpe ratio and with tailor-made performance ratios belonging to three families: the Sortino–Satchell, the Farinelli–Tibiletti and the Rachev ratios. We first define these ratios as they will be used throughout this article.

The classical Sharpe ratio can be calculated as (see Sharpe19):

$$\Phi_{\text{Sharpe}}(x) = \frac{E(x - r_f)}{\sigma(x)} \qquad (1)$$

where x denotes the monthly fund return, σ denotes the standard deviation and r_f is the risk-free monthly interest rate, which serves as the benchmark. Using the standard deviation as a measure of risk means that upside and downside deviations from the benchmark are equally weighted. Therefore, this ratio is a good match for investors with a moderate investment style whose main concern is controlling the stability of returns around the benchmark. Its use is questionable, however, if the investment style is more aggressive and focused on the trade-off between large favorable and large unfavorable deviations from the benchmark.
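As a minimal sketch of the calculation described above, the Sharpe ratio can be computed from a series of monthly returns with the Python standard library. The function name `sharpe_ratio`, the sample data, and the use of the sample (n−1) standard deviation are assumptions made for illustration; the article does not state which estimator is used.

```python
from statistics import mean, stdev

def sharpe_ratio(returns, risk_free):
    """Mean excess return divided by the sample standard deviation of returns."""
    excess = [r - risk_free for r in returns]
    # Subtracting a constant risk-free rate does not change the standard deviation.
    return mean(excess) / stdev(returns)

# Hypothetical monthly returns in per cent; 0.35 is the monthly risk-free
# rate quoted later in the article.
monthly_returns = [1.2, -0.4, 0.9, 2.1, -1.3, 0.6]
print(round(sharpe_ratio(monthly_returns, 0.35), 4))
```

Because the denominator treats deviations above and below the mean alike, mirroring every return around the mean leaves the risk term unchanged.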

The Sortino–Satchell ratio is defined as:

$$\Phi_{\text{SS}}(x) = \frac{E(x - r_f)}{\left(E\left[\left((x - r_f)^{-}\right)^{q}\right]\right)^{1/q}} \qquad (2)$$

with (x − r_f)^− = max(r_f − x, 0) and q>0. This ratio replaces the standard deviation as a measure of risk with the left partial moment of order q; therefore, the only volatility that is penalized is the ‘harmful’ volatility below the benchmark. The original Sortino–Satchell ratio (see Sortino and Satchell14) is defined for q=2; the ratio was later extended to q⩾1 (see Biglova et al5; Rachev et al20), and more recently to q>0 (see Farinelli and Tibiletti,18 and Farinelli et al15; see also Famy,21 Gemmill et al,22 or Pedersen and Rudholm-Alfvin23 for applications of this ratio).
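The replacement of the standard deviation by a left partial moment of order q can be sketched as follows. This is an illustrative implementation under the assumption that the risk term is the q-th root of the mean of the shortfalls below the benchmark raised to the power q; the function names and sample data are hypothetical.

```python
from statistics import mean

def left_partial_moment_root(returns, benchmark, q):
    """q-th root of the mean of max(benchmark - r, 0) ** q."""
    shortfalls = [max(benchmark - r, 0.0) ** q for r in returns]
    return mean(shortfalls) ** (1.0 / q)

def sortino_satchell(returns, benchmark, q=2.0):
    """Mean excess return over the root of the left partial moment of order q."""
    # Undefined (division by zero) if no return falls below the benchmark.
    return (mean(returns) - benchmark) / left_partial_moment_root(returns, benchmark, q)

# Hypothetical monthly returns in per cent.
monthly_returns = [1.2, -0.4, 0.9, 2.1, -1.3, 0.6]
for q in (0.8, 2.0, 2.5):
    print(q, round(sortino_satchell(monthly_returns, 0.35, q), 4))
```

Only returns below the benchmark enter the denominator, so a fund with large upside swings is not penalized for them.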

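The Farinelli–Tibiletti ratio (see Farinelli and Tibiletti;17, 18 Menn et al,6 pp. 208–209) can be calculated as:

$$\Phi_{\text{FT}}(x) = \frac{\left(E\left[\left((x - r_f)^{+}\right)^{p}\right]\right)^{1/p}}{\left(E\left[\left((x - r_f)^{-}\right)^{q}\right]\right)^{1/q}} \qquad (3)$$

where (x − r_f)^+ = max(x − r_f, 0), (x − r_f)^− = max(r_f − x, 0), and p, q>0. If p=q=1, the index reduces to the so-called Omega index introduced in Keating and Shadwick.24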

The parameters p and q can be balanced to match the agent's attitude toward the consequences of overperforming or underperforming. It is known (see Fishburn25) that the higher p and q, the higher the agent's preference for (in the case of expected gains, parameter p) or dislike of (in the case of expected losses, parameter q) extreme events. If the agent's main concern is that the investment fund might miss the target, without particular regard to by how much, then a small value (that is, 0<q<1) for the left order is appropriate. However, if small deviations below the benchmark are relatively harmless compared to large deviations (catastrophic events), then a large value (that is, q>1) for the left order is recommended. The right order p is chosen analogously and should capture the relative appreciation for outcomes above the benchmark.
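The role of the two orders can be illustrated with a short sketch. It assumes the ratio is the p-th root of the right partial moment of order p divided by the q-th root of the left partial moment of order q, both taken around the benchmark; the function names and sample data are hypothetical.

```python
from statistics import mean

def partial_moment_root(deviations, order):
    """order-th root of the mean of the non-negative deviations ** order."""
    return mean([d ** order for d in deviations]) ** (1.0 / order)

def farinelli_tibiletti(returns, benchmark, p, q):
    gains = [max(r - benchmark, 0.0) for r in returns]   # outcomes above the benchmark
    losses = [max(benchmark - r, 0.0) for r in returns]  # outcomes below the benchmark
    return partial_moment_root(gains, p) / partial_moment_root(losses, q)

# Hypothetical monthly returns in per cent.
monthly_returns = [1.2, -0.4, 0.9, 2.1, -1.3, 0.6]
print("Omega (p=q=1):", round(farinelli_tibiletti(monthly_returns, 0.35, 1.0, 1.0), 4))
print("p=2.8, q=0.8:", round(farinelli_tibiletti(monthly_returns, 0.35, 2.8, 0.8), 4))
print("p=0.8, q=2.5:", round(farinelli_tibiletti(monthly_returns, 0.35, 0.8, 2.5), 4))
```

A larger q gives more weight to the largest shortfalls in the denominator, while a larger p gives more weight to the largest gains in the numerator.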

Instead of measuring over- and underperformance with respect to the benchmark, Rachev ratios (see Biglova et al5) draw attention to extreme events. The ratio is defined as follows:

$$\Phi_{\text{Rachev}}(x) = \frac{ES_{\alpha}(r_f - x)}{ES_{\beta}(x - r_f)} \qquad (4)$$

with α, β∈(0, 1) and VaR_c(x) ≔ −inf{z | P(x ≤ z) > c}, interpreted as the smallest value to be added to the random profit and loss x to avoid negative results with probability at least 1−c. Formula (4) is based on the expected shortfall ES_c(x) = −E[x | x ≤ −VaR_c(x)], also known as tail conditional expectation or Conditional VaR (CVaR) (see Acerbi and Tasche26): it measures the expected value of profit and loss, given that the VaR has been exceeded. By changing the sign in the ES, the Rachev ratio can be interpreted as the ratio of the expected tail return above a certain level, that is, the VaRα%, divided by the expected tail loss below a certain level, that is, the VaRβ%. In other words, this ratio rewards extreme returns adjusted for extreme losses. The STARR ratio (also called CVaR ratio, see Favre and Galeano,27 Martin et al28) is a special case of the Rachev ratio. For example, STARR (5 per cent)=Rachev ratio with (α, β)≔(1, 0.05). We analyze the Rachev ratio for different parameters α and β; the lower they are, the more the focus is concentrated on the extreme tails.
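The tail-based construction can be sketched with an empirical expected shortfall. The estimator below (minus the mean of the worst ⌈c·n⌉ observations) is one common sample version and is an assumption here, as are the function names and data; the article does not state which estimator is used.

```python
import math
from statistics import mean

def expected_shortfall(pnl, c):
    """Empirical ES_c: minus the mean of the worst ceil(c * n) outcomes."""
    n = len(pnl)
    # The small tolerance guards against floating-point noise in c * n.
    k = max(1, math.ceil(c * n - 1e-9))
    return -mean(sorted(pnl)[:k])

def rachev_ratio(returns, risk_free, alpha, beta):
    """Expected tail gain above the benchmark over expected tail loss below it."""
    upside = expected_shortfall([risk_free - r for r in returns], alpha)
    downside = expected_shortfall([r - risk_free for r in returns], beta)
    return upside / downside

# Hypothetical monthly returns in per cent.
monthly_returns = [1.2, -0.4, 0.9, 2.1, -1.3, 0.6, 3.4, -2.2, 0.1, 0.8]
print("alpha=beta=0.5:", round(rachev_ratio(monthly_returns, 0.35, 0.5, 0.5), 4))
print("alpha=beta=0.1:", round(rachev_ratio(monthly_returns, 0.35, 0.1, 0.1), 4))
```

With α=1 the numerator becomes the mean excess return, which matches the STARR special case mentioned above.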

In conclusion, by properly balancing parameters p, q, α and β, we can tailor the ratios to investor style and/or capture different features of the financial products under consideration. As the parameters tend toward the extreme, the corresponding ratios shift to describe a more ‘extreme’ investment style. Specifically, if our goal is to focus on extreme events at the tails (high stakes/huge losses), which calls for an aggressive ratio, parameters p and q in the Farinelli–Tibiletti ratios are fixed at high values, whereas parameters α and β in the Rachev ratios are fixed at low values.

DATA AND METHODOLOGY

We consider hedge fund data provided by the Center for International Securities and Derivatives Markets (CISDM). We decided not to employ the hedge fund data that Eling and Schuhmacher13 used in their analysis because the CISDM database is larger and its use more widespread.29, 30

The database contains 4048 hedge funds reporting monthly returns, net of fees, for the time period of January 1996 – December 2005. Table 1 contains descriptive statistics on the return distributions of the hedge funds. On the basis of the Jarque–Bera test, the assumption of normally distributed hedge fund returns must be rejected for 37.67 per cent (43.60 per cent) of the funds at the 1 per cent (5 per cent) significance level.
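For reference, the Jarque–Bera statistic combines sample skewness S and kurtosis K as JB = n/6 · (S² + (K − 3)²/4) and is asymptotically chi-square distributed with two degrees of freedom under normality. The sketch below is a plain-Python illustration with hypothetical data, not the authors' code; the asymptotic p-value is only an approximation for series of at most 120 monthly observations.

```python
import math

def jarque_bera(returns):
    """Return the Jarque-Bera statistic and its asymptotic chi-square(2) p-value."""
    n = len(returns)
    mu = sum(returns) / n
    m2 = sum((r - mu) ** 2 for r in returns) / n  # biased central moments
    m3 = sum((r - mu) ** 3 for r in returns) / n
    m4 = sum((r - mu) ** 4 for r in returns) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    statistic = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Survival function of a chi-square variable with 2 degrees of freedom.
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value

# Hypothetical monthly returns with one large loss.
monthly_returns = [1.2, -0.4, 0.9, 2.1, -1.3, 0.6, 0.3, 0.7, -9.5, 1.1, 0.4, 0.8]
stat, p = jarque_bera(monthly_returns)
print(round(stat, 3), round(p, 4), "reject at 5%" if p < 0.05 else "do not reject at 5%")
```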

Table 1 Descriptive statistics for 4048 hedge fund return distributions

The findings reported in the following section were generated by first using the measures presented in the section ‘Tailor-made performance ratios’ to determine hedge fund performance. To produce results comparable to those of Eling and Schuhmacher,13 we chose a minimal acceptable return equal to the risk-free monthly interest rate (r_f) of 0.35 per cent. Next, for each performance measure, the funds were ranked on the basis of the measured values. Finally, the rank correlations between the performance measures were calculated. This research design is of high practical relevance, as funds are regularly ranked on the basis of risk-adjusted performance measures in order to benchmark the success of a fund against that of other funds, and to serve as the basis for investment decisions.
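The ranking and correlation steps can be sketched as follows. The article does not name the rank correlation coefficient, so Spearman's coefficient (the Pearson correlation of the ranks, with tied values given their average rank) is assumed here; the function names and fund scores are hypothetical.

```python
from statistics import mean

def average_ranks(values):
    """Rank from 1 (smallest) upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks

def spearman(scores_a, scores_b):
    """Pearson correlation of the two rank vectors."""
    ra, rb = average_ranks(scores_a), average_ranks(scores_b)
    ma, mb = mean(ra), mean(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    return cov / (var_a * var_b) ** 0.5

# Hypothetical scores of five funds under two performance measures.
sharpe_scores = [0.42, 0.15, 0.30, 0.08, 0.55]
other_scores = [1.90, 1.10, 1.05, 0.70, 2.40]
print(round(spearman(sharpe_scores, other_scores), 4))
```

A value of 1 means both measures order the funds identically, whatever the scale of the underlying scores.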

A large number of different parameter combinations were included in the analysis: For the Sortino–Satchell ratio, the parameter q is varied between 0.01 and 10. For the Farinelli–Tibiletti ratio, the parameters p and q are both varied between 0.01 and 10. For the Rachev ratio, the parameters α and β are varied between 0.1 per cent and 90 per cent.
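One way to build such a parameter grid is sketched below. The two linear segments (0.01 to 1 and 1 to 10) follow the article's later remark about different scaling on either side of 1; the number of steps per segment and the function name are assumptions, since the article does not report the grid resolution.

```python
def two_scale_grid(low=0.01, mid=1.0, high=10.0, steps=10):
    """Linear grid from low to mid, followed by a coarser linear grid from mid to high."""
    lower = [low + i * (mid - low) / steps for i in range(steps)]
    upper = [mid + i * (high - mid) / steps for i in range(steps + 1)]
    return lower + upper

q_grid = two_scale_grid()
print(len(q_grid), q_grid[0], q_grid[-1])

# Every (p, q) pair for a two-parameter ratio such as Farinelli-Tibiletti.
pq_grid = [(p, q) for p in q_grid for q in q_grid]
print(len(pq_grid))
```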

FINDINGS

Figure 1 presents the rank correlation between the ranking resulting from the Sharpe ratio and that of the Sortino–Satchell ratio for different parameters q.

Figure 1 Sortino–Satchell ratio.

The value assigned to parameter q appears to have little effect on the hedge fund ranking. The rank correlation stays close to 1 throughout, and for high values of q it appears to level off at a lower bound of about 0.965 (an analysis of higher values of q, available upon request, confirms this lower bound). For the original Sortino–Satchell ratio (q=2), the rank correlation is 0.98, which confirms the high rank correlation found by Eling and Schuhmacher13 for this measure. Note that it is not common to consider values of q much lower than 1. For example, Fishburn25 reports that, in practice, values of q range from slightly less than 1 to 4, while Farinelli et al15 use a value of q=0.8 to describe an aggressive investor, and a value of q=2.5 for a conservative investor. For all these values of q, the rank correlations are very close to 1. This is convincing evidence that the Sortino–Satchell and the Sharpe ratios lead to similar rankings.

Next, the Farinelli–Tibiletti ratio is analyzed. The upper part of Figure 2 presents the rank correlation between the Sharpe ratio and the Farinelli–Tibiletti ratio depending on the parameter p (with q=1); the lower part of the figure shows the rank correlation depending on the parameter q (with p=1).

Figure 2 Farinelli–Tibiletti ratio (upper part: 0.01<p<10, with q=1; lower part: 0.01<q<10, with p=1).

Again, our results are in line with conjectures deriving from the study of the influence of the parameters (see Fishburn25). As expected, the highest rank correlations occur for values of p close to 1. For p=q=1, the Farinelli–Tibiletti ratio coincides with the Omega index, which Eling and Schuhmacher13 showed to produce rankings similar to those derived by the Sharpe ratio.

Figure 3 presents rank correlations between the Sharpe ratio and the Farinelli–Tibiletti ratio for different combinations of p and q (the kink at p, q=1 is due to the different scaling between 0.01<p, q<1 and 1<p, q<10).

Figure 3 Farinelli–Tibiletti ratio (0.01<p, q<10).

Rank correlations with the Sharpe ratio are more sensitive to the choice of parameters for the Farinelli–Tibiletti ratio than for the Sortino–Satchell ratio, but they remain relatively high, especially for reasonable values of p and q. For example, Farinelli et al15 use values of p=2.8 and q=0.8 to describe an aggressive investor, which here results in a rank correlation of 0.92 with the Sharpe ratio. A conservative investor is described by p=0.8 and q=2.5, which gives a rank correlation of 0.95.31 In both cases, the parameters are chosen according to Fishburn25 and expected utility theory, that is, conservative (p<1, q>1) and aggressive (p>1, q<1). If p (<1) tends toward 0, the ratio describes a conservative investor who is more interested in gaining small returns than in seeking high stakes. According to Fishburn,25 a conservative ratio should express aversion to high losses, and thus the parameter shaping the attitude toward negative returns should be q>1. Conversely, as p>1 increases, the ratio describes a more aggressive investor hoping to profit from a high-stakes strategy. Therefore, an aggressive ratio with p>1 should show indifference to high losses, thus q<1.

However, the Farinelli–Tibiletti ratio is a flexible tool that can be used in various ways. The parameters can be chosen so that the ratio can be read as the trade-off between moderate gain/moderate risk or between high stakes/huge losses. In such a case, p and q go hand in hand, that is, p<1 goes with q<1, and p>1 goes with q>1. The ratio can then be interpreted as the price of one unit of return for one unit of loss, where returns and losses are weighted by p and q. As the ratio moves to extreme investment styles, rank correlations with the moderate Sharpe ratio decrease. This is most evident for p and q close to 10, where the rank correlation falls to 0.59. It is worth noting that this occurs in correspondence with the case where the ratio detects the trade-off between high stakes/huge losses.

Finally, we consider the Rachev ratio. Figure 4 shows the rank correlation between the Sharpe ratio and the Rachev ratio for different combinations of the parameters α and β.

Figure 4 Rachev ratio.

Among the tailor-made ratios, Rachev ratios are the most different from the Sharpe ratio. In all the previous analyses, the entire return distribution is taken into account, although with a different emphasis given to the tails (according to parameters p and q, that is, both for return and risk). In contrast, Rachev ratios with α and β less than 0.5 disregard part of the upside variability in the evaluation of return and, likewise, part of the downside variability in the evaluation of risk. Recall that the Rachev ratio can be interpreted as the trade-off between the expected return above the VaRα%, that is, the CVaRα%, and the expected loss below the VaRβ%, that is, the CVaRβ%.

Again, our expectations are confirmed: the highest rank correlation is achieved for values of α and β close to 0.5, which corresponds to a moderate ratio that trades off the expected returns above the median against those below it. In this situation, the Rachev ratio acts similarly to the Omega index (note that it collapses into the Omega if the distribution is symmetrical). Conversely, as α and β decrease, central data are removed from the analysis of return and risk. The ratio becomes more aggressive, providing only the trade-off between the expected high stakes and the expected huge losses. In such circumstances, the Rachev ratio is focused merely on the tails, whereas the Sharpe ratio is focused on the ‘stability’ around central values. Therefore, the two ratios show the biggest divergence in the way they capture information from the data, and as expected their rank correlation shrinks to 0.51. Moreover, when β tends toward 1, the denominator tends toward the mean (attained at β=1), which clearly fails to be an accurate measure of risk, so that the ratio itself is no longer a valid return/risk trade-off; the range 0.5<β<1 is thus not relevant. In conclusion, when Rachev ratios are tailored to moderate investment styles (that is, for α and β close to 0.5), the rank correlation is about 0.90, whereas when they are fitted to more aggressive investment styles (that is, for α and β close to 0), the rank correlation falls to 0.51.
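The remark about β tending toward 1 can be made explicit with a one-line derivation. It assumes the expected shortfall definition given in the section on tailor-made ratios, and that for c=1 the conditioning event covers the whole distribution:

$$ES_{1}(x - r_f) = -E[x - r_f] = r_f - E(x)$$

Under these assumptions the denominator at β=1 depends only on the mean return and not on its dispersion, and it becomes negative for any fund whose mean return exceeds r_f.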

We can conclude that among the three families of tailor-made ratios analyzed here, the Sortino–Satchell is the one that behaves most like the Sharpe ratio. There may be two reasons for this: first, the Sortino–Satchell ratio captures the attitude toward gains with the mean, as does the Sharpe ratio; second, the choice of q varies in accordance with Fishburn's25 approach, that is, the greater the aversion to huge losses, the higher q>1; and the lower the aversion to huge losses, the lower q<1, which is compatible with expected utility theory. The biggest discrepancies with the Sharpe ratio are found for Farinelli–Tibiletti and Rachev ratios fitting ‘extreme’ investment styles. Specifically, for the Farinelli–Tibiletti ratio the worst mismatch occurs when the ratio is built to act as a trade-off between moderate gains/moderate losses or between high stakes/huge losses, so that the parameter regulating aversion to huge losses no longer follows the Fishburn25 paradigm. Because the Rachev ratio is by definition a trade-off between gains and losses, its largest discrepancy from the Sharpe ratio occurs when it is set up for the most aggressive investment style, that is, small α and β. In this case, the rank correlation between the two measures falls as low as 0.51.

CONCLUSION

Whether using the Sharpe ratio to rank funds is advisable remains an open question in academia and among practitioners. The empirical analysis carried out here confirms the results of Eling and Schuhmacher;13 as long as tailor-made ratios describe moderate and conservative investment styles, the rank correlation with the Sharpe ratio ranking is close to 1. However, if ratios such as Farinelli–Tibiletti or Rachev are tailored to describe more aggressive investment styles, the rank correlation is drastically reduced and the use of the Sharpe ratio becomes questionable.