
Journal of Derivatives & Hedge Funds (2008) 14, 9–30. doi:10.1057/jdhf.2008.3

On comparing hedge fund strategies using new Hausman-based estimators

Practical applications

To correct specification errors in financial models of returns, the authors propose new procedures that generate strong instruments from the higher moments and cumulants of the explanatory variables. These instruments rehabilitate a well-known estimator, the Generalized Method of Moments (GMM), and yield two new estimators, called the GMM-C and the GMM-hm. The authors also develop a new indicator for gauging measurement errors in financial models of hedge fund returns, which appears quite promising for the financial practitioner.

François-Éric Racicot1 and Raymond Théoret2

Correspondence: François-Éric Racicot, Department of Administrative Sciences, University of Quebec (Outaouais), (UQO), 101 St-Jean-Bosco Street, Gatineau (Hull), Quebec J8X 3X7, Canada. Tel: +1 819 595-3900 ext. 1727, Fax: +1 819 773 1797, E-mail: francoiseric.racicot@uqo.ca

1François-Éric Racicot, PhD, is Associate Professor of Finance at the Department of Administrative Sciences, University of Quebec - Outaouais (UQO). His research interests focus on the problems of measurement errors and specification errors in financial models of returns. He is also interested in developing new methods for forecasting financial time series.

2Raymond Théoret PhD, is Professor of Finance at the Department of Finance, University of Quebec - Montreal (UQAM). His research focuses on the volatility of bank income in relation with stock market performance, self-enforcing labour contracts and problems of measurement errors in financial models.

Received 4 March 2008; Revised 4 March 2008.


Abstract

This paper proposes new Hausman-based estimators that rely on higher moments and cumulants. Our study yields a new indicator signalling the presence of specification errors in financial models. We apply our battery of tests to a sample of 21 HFR hedge fund strategies over the period 1990–2005. Our tests reveal that it is preferable to account for specification errors when estimating financial models of returns. Indeed, the performance ranking of hedge fund strategies may change significantly once specification errors are accounted for.

Keywords:

hedge fund returns, alpha of Jensen, financial models, cumulants, higher moments, specification errors, aggregation bias, Hausman-C, GMM-C


INTRODUCTION

The presence of specification errors is an important problem when estimating economic and financial models, but the available solutions remain limited. The Generalised Method of Moments (GMM) is often used to remove specification errors, but this procedure requires a judicious choice of instruments. The instruments used in the majority of financial studies, however, are quite poor.1 Even the Chen–Roll–Ross2 instruments are not very reliable.

In this paper, we revisit the Fama and French (F&F) model in the context of the estimation of hedge fund returns. We resort to two new sets of instruments based respectively on higher moments and cumulants of the explanatory variables. Higher moments and cumulants are quite promising as tools to analyse the distribution of returns or of other financial variables for which the asymmetry or kurtosis cannot be neglected. Higher moments and cumulants thus qualify as instrumental variables (IV) to estimate financial models like the F&F one by two-stage least squares (TSLS) or GMM.

Furthermore, the Hausman3 test is often invoked to detect specification errors in an estimated model. It is less known that a specific version of the Hausman test may be equivalent to a TSLS procedure.4, 5, 6 Resorting to this equivalence, we show how we can use our new set of instruments to generate innovative versions of financial models that give direct information on the severity of the specification errors. These developments allow us to build indicators of specification errors, one for each endogenous variable.

This paper is organised as follows. The next section proposes two new sets of instruments based respectively on higher moments and cumulants, and gives the econometric and financial foundations of these instruments. It is shown how these instruments may be integrated in a Hausman test to give way to indicators of the specification errors in the frame of the F&F model. These indicators are related to the spread between the coefficients estimated by ordinary least squares (OLS) and by an IV method. They give information on the degree of overstatement or understatement of the coefficients estimated by the OLS procedure. The subsequent section examines the empirical validity of these Hausman-based estimators for calibrating the F&F model. Our sample is a series of 22 monthly Hedge Fund Research (HFR) indices observed over the period 1990–2005, quite a long period for hedge fund returns. The final section concludes.


HAUSMAN SPECIFICATION TESTS BASED ON HIGHER MOMENTS AND CUMULANTS7

Test based on higher moments

The augmented F&F8, 9, 10 model is a purely empirical model, which may be written as:

$$R_{pt} - R_{ft} = \alpha + \beta_1 (R_{mt} - R_{ft}) + \beta_2 SMB_t + \beta_3 HML_t + \beta_4 UMD_t + \varepsilon_t \tag{1}$$

with $R_{pt} - R_{ft}$ being the excess return of a portfolio, $R_{ft}$ being the risk-free return; $R_{mt} - R_{ft}$ the market risk premium; SMB a portfolio that mimics the 'small firm anomaly', which is long in the returns of selected small firms and short in the returns of selected big firms; HML a portfolio that mimics the 'value stock anomaly', which is long in returns of stocks of selected firms having a high (book value/market value) ratio (value stocks) and short in selected stocks having a low (book value/market value) ratio (growth stocks); UMD a portfolio that mimics the 'momentum anomaly', which is long in returns of selected stocks having a persistent upward trend and short in stocks having a persistent downward trend.

To explain the return of a stock or of a portfolio of stocks, the F&F model adds to the unique factor retained by the CAPM, the market risk premium, three other factors that are assumed to represent market anomalies: the small firm anomaly, the book value to market value anomaly and the momentum anomaly.11

We postulate that equation (1) contains specification errors. These errors might be due to many causes5 but the most plausible ones are the omission of relevant variables, the aggregation level of the data or simply an incorrect functional form. As a result of these errors, the risk factors become endogenous and the condition of orthogonality between these risk factors and the innovation term in equation (1) is violated: the estimators of the coefficients of this equation are no longer unbiased and consistent. To purge the coefficients of these biases, we must first regress the independent variables on instrumental variables (IV). The estimation method used in this paper, which is based on the Hausman test, will be explained below. The problem lies in the judicious choice of instruments.

As we said before, it is difficult to find valuable instruments for the excess returns of the mimicking portfolios. Being long in some stocks and short in others, their cash flows are similar to those of hedge funds. Higher moments of returns, like asymmetry and kurtosis, might have a great influence on these returns. This suggests the use of higher moments of the variables on the RHS of equation (1) as IV. An econometric theory is indeed in construction on this subject. Following Durbin12 and Pal,13 Dagenais and Dagenais14 showed that higher moments15 of independent variables of a regression might be valid instruments to remove errors-in-variables or, more generally, specification errors. But instead of defining higher moments as in these papers, we will first adopt a method more akin to asset-pricing theory, which defines higher moments of returns by powers of these returns. We will subsequently present the instruments proposed by Dagenais and Dagenais16 in the section dedicated to the Hausman test based on cumulants.

The method of asset pricing based on higher moments is not new. Samuelson,17 Rubinstein18 and Kraus and Litzenberger19 laid the foundations of the three-moment and four-moment CAPM.20 The three-moment CAPM integrated the asymmetry of returns in the analysis whereas the four-moment CAPM added their kurtosis.

The n-moment CAPM can be written as follows:

[Equation (2): image unavailable in the source]

A test on $\alpha_2$ is a test on skewness preferences in asset pricing, a test on $\alpha_3$ is a test on kurtosis preferences, and so on. The higher moments are consequently powers of returns in this approach. We therefore use a financial theory, the n-moment CAPM, to give substance to the method of Dagenais and Dagenais for correcting specification errors. Let us return to the variable SMB, which we want to correct for the problem of specification errors. In the first pass of the regression, this variable will be regressed on:

[Equation (3): image unavailable in the source]

where the $F_i$ are the variables on the RHS of the F&F equation (equation (1)), including SMB, and their powers stand for the higher moments of these variables: $F_{it}^{2}$ stands for the skewness of factor $F_i$; $F_{it}^{3}$, for its kurtosis, and so on. The variables appearing on the RHS of equation (3) will serve as IV in the first pass of the Hausman tests.
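As a minimal sketch of how such a set of higher-moment instruments might be assembled, the snippet below raises each factor series to the powers 2 to 5 and stacks the results. The function name `higher_moment_instruments`, the simulated factor matrix `F` and the choice of NumPy are assumptions made for illustration; the classical instruments that the paper also uses are not included here.

```python
import numpy as np

def higher_moment_instruments(factors, max_power=5):
    """Return powers 2..max_power of every factor column, stacked side by side."""
    f = np.asarray(factors, dtype=float)
    # One (n x k) block per power: [F^2 | F^3 | ... | F^max_power]
    return np.hstack([f ** p for p in range(2, max_power + 1)])

rng = np.random.default_rng(0)
F = rng.standard_normal((192, 4))    # simulated stand-in for the four F&F factors
Z_hm = higher_moment_instruments(F)  # 4 factors x 4 powers = 16 instrument columns
print(Z_hm.shape)
```

With 192 monthly observations and four factors, this produces a 192 by 16 instrument matrix.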

To detect specification errors in our sample of hedge funds, we could use the original Hausman h test.21 To explain this test, let us suppose the following classical model:

$$Y = X\beta + \varepsilon \tag{4}$$

with $Y$ an $(n \times 1)$ vector representing the dependent variable; $X$, an $(n \times k)$ matrix of the explanatory variables; $\beta$, a $(k \times 1)$ vector of the parameters and $\varepsilon \sim \text{iid}(0, \sigma^2)$.

Hausman22 compares two sets of estimates of the parameters vector, say, $\hat{\beta}_{OLS}$, the least-squares estimator (OLS), and $\hat{\beta}_{A}$, an alternative estimator, which can take a variety of forms but which, for our purposes, is the IV estimator designated by $\hat{\beta}_{IV}$. The hypotheses to test are $H_0$, being in our case the absence of specification errors, and $H_1$, being the presence of specification errors. The vector of estimates $\hat{\beta}_{IV}$ is consistent under both $H_0$ and $H_1$, whereas $\hat{\beta}_{OLS}$ is consistent under $H_0$ but inconsistent under $H_1$. Under $H_0$, $\hat{\beta}_{IV}$ is less efficient than $\hat{\beta}_{OLS}$.

Hausman wants to verify if 'endogeneity' of some variables,23 the variables measured with errors in our case, has any significant effect on the estimation of the vector of parameters. To do so, he defines the following vector of contrasts or distances: $\hat{\beta}_{IV} - \hat{\beta}_{OLS}$. The test statistic may be written as follows:

$$h = (\hat{\beta}_{IV} - \hat{\beta}_{OLS})' \left[ Var(\hat{\beta}_{IV}) - Var(\hat{\beta}_{OLS}) \right]^{-1} (\hat{\beta}_{IV} - \hat{\beta}_{OLS}) \sim \chi^2(g) \tag{5}$$

with $Var(\hat{\beta}_{IV})$ and $Var(\hat{\beta}_{OLS})$ being consistent estimates of the covariance matrices of $\hat{\beta}_{IV}$ and $\hat{\beta}_{OLS}$. $g$ is the number of potentially endogenous regressors. $H_0$ will be rejected if the p-value of this test is less than $\alpha$, with $\alpha$ being the critical threshold of the test, say 5 per cent.
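The quadratic form behind the h statistic can be sketched in a few lines. The function name `hausman_h` and the numbers below are purely illustrative and are not taken from the paper. The use of a pseudo-inverse is an assumption of this sketch, chosen only so that the computation does not fail when the difference of covariance matrices is singular.

```python
import numpy as np

def hausman_h(b_iv, b_ols, v_iv, v_ols):
    """Quadratic form of the contrast vector weighted by the covariance difference."""
    d = np.asarray(b_iv, dtype=float) - np.asarray(b_ols, dtype=float)
    v = np.asarray(v_iv, dtype=float) - np.asarray(v_ols, dtype=float)
    # Pseudo-inverse (an assumption of this sketch) guards against a singular difference.
    return float(d @ np.linalg.pinv(v) @ d)

# Illustrative inputs only: two coefficients, diagonal covariance matrices.
h = hausman_h(
    b_iv=[0.5, 0.1],
    b_ols=[0.6, 0.1],
    v_iv=np.diag([0.02, 0.01]),
    v_ols=np.diag([0.01, 0.005]),
)
print(h)
```

The resulting value would then be compared with a chi-square critical value with g degrees of freedom.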

According to MacKinnon,24 this test might run into difficulties if the matrix $[Var(\hat{\beta}_{IV}) - Var(\hat{\beta}_{OLS})]$, which weights the vector of contrasts, is not positive definite. Fortunately, there is an alternative way to do the Hausman test, which is much easier. This test goes as follows.

Assume a four-variable linear model, which, in this paper, is the general form of the F&F model with four factors:

$$y_t = \alpha + \beta_1 x_{1t}^{*} + \beta_2 x_{2t}^{*} + \beta_3 x_{3t}^{*} + \beta_4 x_{4t}^{*} + \varepsilon_t \tag{6}$$

with $\varepsilon_t \sim N(0, \sigma^2)$.

The variables $x_{it}^{*}$ (note 25) are measured with errors, that is:

$$x_{it}^{*} = x_{it} - \upsilon_{it}, \quad i = 1, \dots, 4 \tag{7}$$

with $x_{it}$, the corresponding observed variables, which are measured with errors. By substituting equation (7) into equation (6), we have:

$$y_t = \alpha + \beta_1 x_{1t} + \beta_2 x_{2t} + \beta_3 x_{3t} + \beta_4 x_{4t} + \varepsilon_t^{*} \tag{8}$$

with $\varepsilon_t^{*} = \varepsilon_t - \beta_1 \upsilon_{1t} - \beta_2 \upsilon_{2t} - \beta_3 \upsilon_{3t} - \beta_4 \upsilon_{4t}$. As seen before, estimating the coefficients of equation (8) by the OLS method yields biased and inconsistent coefficients because the explanatory variables are correlated with the innovation.

Consistent estimators can be found if we can identify an instrument vector $z_t$, which is correlated with every explanatory variable but not with the innovation of equation (8). Then we regress these four explanatory variables on $z_t$. We have:

$$x_{it} = \hat{x}_{it} + \hat{w}_{it}, \quad i = 1, \dots, 4 \tag{9}$$

with $\hat{x}_{it}$, the value of $x_{it}$ estimated with the vector of instruments, and $\hat{w}_{it}$, the residuals of the regression of $x_{it}$ on $z_t$. Substituting equation (9) into equation (8), we have:

$$y_t = \alpha + \sum_{i=1}^{4} \beta_i \hat{x}_{it} + \sum_{i=1}^{4} \beta_i \hat{w}_{it} + \varepsilon_t^{*} \tag{10}$$

The explanatory variables of this equation are, on the one hand, the estimated values of $x_{it}$, obtained by regressing these four variables on the vector of instruments $z_t$, and on the other hand, the respective residuals of these regressions. Equation (10) is therefore an augmented version of equation (8), which might be viewed as an auxiliary or artificial regression.

Racicot26 applies this approach to the market model. He postulates that the t-test on the new variable $\hat{w}$ is asymptotically normally distributed. According to Pindyck and Rubinfeld,4 this test is adequate. Racicot also postulates in this context that the model resulting from the addition of the artificial variable may be considered as a new model in itself, so we have a new alpha for this model.

We can show that:

[Equation (11): image unavailable in the source]

When there is no specification error, $\sigma_{\upsilon_i}^{2} = 0$ and OLS yields a consistent estimator for the parameter of $\hat{w}_{it}$ in equation (10), that is $\beta_i$. When there are specification errors, $\sigma_{\upsilon_i}^{2} \neq 0$ and therefore this estimator is not consistent.

We can thus build the following test to detect the presence of specification errors. As we do not know a priori if there are such errors, we replace the coefficients of the $\hat{w}_{it}$ in equation (10) with $\theta_i$. We have:

$$y_t = \alpha + \sum_{i=1}^{4} \beta_i \hat{x}_{it} + \sum_{i=1}^{4} \theta_i \hat{w}_{it} + \varepsilon_t^{*} \tag{12}$$

But according to equation (9), $\hat{x}_{it} = x_{it} - \hat{w}_{it}$. We can therefore rewrite equation (12) as follows:

$$y_t = \alpha + \sum_{i=1}^{4} \beta_i x_{it} + \sum_{i=1}^{4} (\theta_i - \beta_i) \hat{w}_{it} + \varepsilon_t^{*} \tag{13}$$

If there is no specification error for $x_{it}$, then $\theta_i = \beta_i$. If there are specification errors, $\theta_i \neq \beta_i$ and the coefficients of the residual terms $\hat{w}_{it}$ will not be zero.

There is more information that we can draw from equation (13). Indeed, if the estimated coefficient $(\theta_i - \beta_i)$ is significantly positive, then the estimated coefficient of the corresponding explanatory variable $x_{it}$ is overstated in the OLS run. Therefore, the estimated coefficient for this variable will decrease in equation (13). On the other hand, if the estimated coefficient $(\theta_i - \beta_i)$ is significantly negative, then the estimated coefficient of the corresponding explanatory variable $x_{it}$ is understated in the OLS run. Therefore, the estimated coefficient for this variable will increase in equation (13). These effects of specification errors produced by equation (13) are thus very informative.

We must notice that the coefficients $\beta_i$ estimated by equation (13) are identical to those produced by a TSLS procedure using the same instruments. Equation (13) is therefore another way to set up a TSLS. But in view of the useful information produced by equation (13), this equation opens the door to new financial models. We should therefore prefer this formulation to the one represented by a TSLS to estimate the augmented F&F model. And we thus have a new empirical formulation for the F&F model.

We therefore proceed as follows to test for specification errors. First, we regress the observed explanatory variables $x_{it}$ on the instruments vector to obtain the residuals $\hat{w}_{it}$. Then, we regress $y_t$ on the observed explanatory variables $x_{it}$ and on the residuals $\hat{w}_{it}$. This is an auxiliary or artificial regression. If the coefficient of the residuals of an explanatory variable is significantly different from 0, we may conclude that there is a specification error related to this explanatory variable. We may resort to the Wald test (F-test) to see if the whole set of $(\theta_i - \beta_i)$ coefficients is significantly different from zero.
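The two passes described above can be sketched on simulated data with a single regressor measured with error. Everything in this snippet is an assumption made for illustration: the data-generating process, the helper `ols`, and the two simulated instruments, which are not the higher-moment instruments of the paper. It is not the authors' code.

```python
import numpy as np

def ols(y, X):
    """OLS coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(42)
n = 500
z = rng.standard_normal((n, 2))                             # simulated instruments
x_true = z @ np.array([1.0, 0.5]) + rng.standard_normal(n)  # unobserved regressor
x = x_true + rng.standard_normal(n)                         # observed with error
y = 1.0 + 2.0 * x_true + 0.5 * rng.standard_normal(n)

const = np.ones(n)
Z = np.column_stack([const, z])

# Step 1: first-pass regression of x on the instruments, keep the residuals.
w_hat = x - Z @ ols(x, Z)

# Step 2: artificial regression of y on x and the first-pass residuals.
alpha_hat, beta_hat, phi_hat = ols(y, np.column_stack([const, x, w_hat]))

# Benchmark: plain OLS of y on the mismeasured x.
beta_ols = ols(y, np.column_stack([const, x]))[1]
print(f"beta_ols={beta_ols:.3f}  beta_artificial={beta_hat:.3f}  phi={phi_hat:.3f}")
```

In this simulated setting the measurement error attenuates the OLS slope, so the coefficient on the residuals comes out negative and the slope rises in the artificial regression, which matches the "understated" case described in the text.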

We can generalise the former procedure to the case of k explanatory variables. Let $X$ be an $(n \times k)$ matrix of explanatory variables that is not orthogonal to the innovation, and let $Z$ be an $(n \times s)$ matrix of instruments $(s > k)$. To perform the Hausman test based on an artificial regression, we first regress $X$ on $Z$ to obtain $\hat{X}$, that is:

$$\hat{X} = Z (Z'Z)^{-1} Z' X = P_Z X \tag{14}$$

where $P_Z$ is the 'predicted value maker'. Having performed this regression, we compute the matrix of residuals $\hat{w}$:

$$\hat{w} = X - \hat{X} = X - P_Z X = (I - P_Z) X \tag{15}$$

Then we perform the following artificial regression:

$$Y = X\beta + \hat{w}\lambda + \varepsilon \tag{16}$$

An F-test on the $\lambda$ coefficients will indicate if they are significant as a group. A t-test on individual coefficients will indicate if the corresponding $\beta$ is understated or overstated, as discussed previously.

The vector of $\beta$ estimated by equation (16) is identical to the TSLS estimates, that is:

$$\hat{\beta} = \hat{\beta}_{TSLS} = (X' P_Z X)^{-1} X' P_Z Y \tag{17}$$
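The equality between the coefficients of the artificial regression and the TSLS estimates can be checked numerically. The sketch below uses simulated data with no constant term; the dimensions, coefficients and variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
n, k, s = 300, 2, 5
Z = rng.standard_normal((n, s))                                   # instruments
X = Z[:, :k] + 0.5 * Z[:, k:k + 2] + rng.standard_normal((n, k))  # regressors
Y = X @ np.array([1.0, -0.5]) + rng.standard_normal(n)

# First pass: fitted values P_Z X and residuals (I - P_Z) X.
X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
W_hat = X - X_hat

# Artificial regression of Y on [X, W_hat]; the first k coefficients are beta.
coef = np.linalg.lstsq(np.column_stack([X, W_hat]), Y, rcond=None)[0]
beta_artificial, lam = coef[:k], coef[k:]

# TSLS: (X' P_Z X)^{-1} X' P_Z Y, using X' P_Z = X_hat'.
beta_tsls = np.linalg.solve(X_hat.T @ X, X_hat.T @ Y)

print(np.allclose(beta_artificial, beta_tsls))
```

The equality is algebraic rather than statistical: the fitted values and the residuals are orthogonal by construction, so it holds for any sample in which the stacked regressor matrix has full column rank.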

To detect specification errors in the augmented F&F model, we will run two sets of regressions. First, we will run the OLS one, that is:

$$R_{pt} - R_{ft} = \alpha + \beta_1 (R_{mt} - R_{ft}) + \beta_2 SMB_t + \beta_3 HML_t + \beta_4 UMD_t + \varepsilon_t \tag{18}$$

Then, we will run the following artificial regression explained previously:

$$R_{pt} - R_{ft} = \alpha^{*} + \beta_1^{*} (R_{mt} - R_{ft}) + \beta_2^{*} SMB_t + \beta_3^{*} HML_t + \beta_4^{*} UMD_t + \sum_{i=1}^{4} \phi_i \hat{w}_{it} + \varepsilon_t^{*} \tag{19}$$

The estimated coefficients $\phi_i$ will allow detecting specification errors, and their signs will indicate if the coefficient of the corresponding variable is overstated or understated in the OLS regression.

As said previously, the $\beta^{*}$ estimated by equation (19) are equivalent to the TSLS estimates. But we prefer equation (19) because it gives more information on the problem of specification errors. Equation (19) is thus our new empirical version of the augmented F&F model. The $\phi_i$ are factors of correction of the risk exposure of a fund to the ith factor of risk. If $\phi_i$ is positive, then the exposure to the ith risk factor is overstated in the OLS regression. The beta associated with this factor will thus decrease in the artificial regression, and vice versa if $\phi_i$ is negative. Moreover, according to our previous developments, we expect a high positive correlation between $(\hat{\beta}_i - \hat{\beta}_i^{*})$, that is the estimated error on the coefficient of factor i, and $\hat{\phi}_i$, the estimated coefficient of the corresponding artificial variable $\hat{w}_i$.

We can sum up the former arguments by the following empirical equation:

$$Spread_{is} = \pi_0 + \pi_1 \hat{\phi}_{is} + \varsigma_{is} \tag{20}$$

where $Spread_{is} = \hat{\beta}_{is} - \hat{\beta}_{is}^{*}$, s being a hedge fund strategy and $\varsigma_{is}$ being the innovation of the estimation. According to equation (20), $\phi$ may thus be viewed as an indicator of overstatement or understatement of the OLS estimation for the coefficient associated with the factor i for the strategy s. We will estimate this equation for the most important risk factors in the empirical section of this paper. That constitutes our variant of the original Hausman test. The goodness of fit of equation (20) will give information about the severity of the specification errors for an explanatory variable, as shown in the empirical section.
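The link between the spread and the coefficient of the artificial variable can be illustrated with a simulation. The sketch assumes a single factor observed with error and five hypothetical strategies with made-up exposures; none of this is the paper's data. In this one-regressor case the spread equals the coefficient on the residuals times the ratio of the residual sum of squares to the total sum of squares of the observed factor, so the cross-strategy regression fits exactly. With several factors the relation is no longer exact, which is what makes the goodness of fit informative.

```python
import numpy as np

def ols(y, X):
    """OLS coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(1)
n = 400
const = np.ones(n)
z = rng.standard_normal(n)                  # simulated instrument
x_true = z + rng.standard_normal(n)         # unobserved factor
x = x_true + rng.standard_normal(n)         # factor observed with error
Z = np.column_stack([const, z])
w_hat = x - Z @ ols(x, Z)                   # first-pass residuals

spreads, phis = [], []
for beta_s in (-1.0, 0.2, 0.5, 1.0, 1.8):   # hypothetical strategy exposures
    y = 0.1 + beta_s * x_true + 0.3 * rng.standard_normal(n)
    b_ols = ols(y, np.column_stack([const, x]))[1]
    _, b_star, phi = ols(y, np.column_stack([const, x, w_hat]))
    spreads.append(b_ols - b_star)
    phis.append(phi)

spreads, phis = np.array(spreads), np.array(phis)

# Cross-strategy regression of the spread on phi.
intercept, slope = ols(spreads, np.column_stack([np.ones(len(phis)), phis]))
ratio = (w_hat @ w_hat) / np.sum((x - x.mean()) ** 2)
print(f"slope={slope:.4f}  ratio={ratio:.4f}  intercept={intercept:.2e}")
```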

Test based on cumulants

It is possible to refine further these developments on instruments. Indeed, Durbin,12 Pal13 and more recently Dagenais and Dagenais14 have proposed new instruments based on cumulants of order higher than two. Racicot26 has generalised these methods to financial models of returns.

These new instruments are inspired by the works of Kendall and Stuart27 on moments and cumulants. Cumulants28 are more and more popular in the recent risk literature, and they therefore can be used as relevant instruments by following the logic of the preceding section.

Fundamentally, Dagenais and Dagenais proposed combining the estimators of Durbin and Pal, which are based on cumulants and are very innovative from this point of view, to form a new estimator corrected for specification errors, and especially errors in variables. As the developments are complex, we will only give the principles in this section. The details may be found in Dagenais and Dagenais, in Racicot and Théoret,29 in Coën et al.30 and in Coën and Racicot.31

Let us assume once more the following general form for a model:

$$Y = X\beta + \varepsilon \tag{21}$$

where $Y$ is the $(n \times 1)$ vector representing the explained variable and $X$ is the $(n \times k)$ matrix of the explanatory variables. We suspect specification errors in the explanatory variables, which might create inconsistency in the estimation of the vector $\beta$. To correct for this problem, Durbin proposed to use as instruments the following product: $x * x$, where $x$ is the matrix $X$ expressed in deviation from the means of the explanatory variables and where the symbol $*$ designates the Hadamard element-by-element matrix multiplication operator. Pal went in the same direction by proposing as instruments the cumulant based on the cubes of $x$ instead of the squares used by Durbin. Dagenais and Dagenais combined these instruments to obtain a matrix of instruments, say $Z$, based on the cumulants and co-cumulants of $x$ and $y$, these being the matrix $X$ and the vector $Y$ expressed in deviation from the mean. Note that the matrix $Z$ may be decomposed into $k$ vectors or series, so that $Z = [z_1 \; z_2 \; \dots \; z_k]$. The vector $z_1$, which is built with the first explanatory variable $x_1$ of equation (21), is thus the instrument of the first explanatory variable, and so on.
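The building blocks mentioned above, Hadamard squares and cubes of the regressors in deviation from their means, can be sketched as follows. This is a deliberately simplified illustration: the function name is hypothetical, and the full cumulant and co-cumulant construction of Dagenais and Dagenais, which also involves $y$, is not reproduced here.

```python
import numpy as np

def durbin_pal_building_blocks(X):
    """Element-by-element squares and cubes of X in deviation from column means."""
    X = np.asarray(X, dtype=float)
    x = X - X.mean(axis=0)   # deviations from the means
    z_squares = x * x        # Hadamard product x * x (Durbin-type squares)
    z_cubes = x * x * x      # Hadamard cubes (the ingredient of Pal-type instruments)
    return z_squares, z_cubes

X_demo = np.array([[1.0, 2.0],
                   [3.0, 6.0]])   # toy matrix: 2 observations, 2 regressors
z_sq, z_cu = durbin_pal_building_blocks(X_demo)
print(z_sq)
print(z_cu)
```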

We regress each of the explanatory variables on the matrix $Z$ to obtain $\hat{x}$. We have:

$$\hat{x} = Z (Z'Z)^{-1} Z' x = P_Z x \tag{22}$$

This equation yields the residuals that are introduced in equation (10); we denote them $\hat{w}_{i}^{c}$ to distinguish them from the $\hat{w}_{i}$ in equation (19). We thus obtain equation (23):

$$R_{pt} - R_{ft} = \alpha^{*} + \beta_1^{*} (R_{mt} - R_{ft}) + \beta_2^{*} SMB_t + \beta_3^{*} HML_t + \beta_4^{*} UMD_t + \sum_{i=1}^{4} \phi_i \hat{w}_{it}^{c} + \varepsilon_t^{*} \tag{23}$$

This equation is proposed as a new benchmark to eliminate specification errors. As we will see in the following section, the instruments defined by Z are quite good. These estimators have the advantage of requiring no extraneous information to the model.


EMPIRICAL RESULTS

Description of the sample

Our sample of hedge funds comprises the monthly returns of 22 HFR indices classified by categories or groups of categories. The observation period runs from January 1990 to December 2005, for a total of 192 observations. The risk factors that appear in the F&F equation, that is the market risk premium and the three mimicking portfolios SMB, HML and UMD, are drawn from French's website.32 We used as instruments, among others, the Chen–Roll–Ross2 factors: industrial production, the consumer price index, the spread between long- and short-term bonds, the spread between BBB and AAA corporate bonds and the dividend yield of the S&P500. These factors are drawn from the databases of the Federal Reserve Bulletin and the Federal Reserve Bank of St. Louis.

A first glance at the database of hedge funds

We can get a first glance at our sample of hedge fund returns by looking at Table 1, which gives the descriptive statistics of the 22 HFR indices selected. The period of analysis runs from January 1990 to December 2005. The hedge fund indices are sorted by the R2 associated to the OLS estimation of the F&F model over the period 1990–2005. We see that the F&F model performs poorly for very specialised hedge fund strategies, like the fixed income arbitrage, convertibles and macro ones. But it seems relevant to explain the returns of the strategies that dominate the hedge fund industry, that is, the equity hedge, fund of funds and equity non-hedge strategies.


At an annualised value of 14.5 per cent over the 1990–2005 period, the mean return of the hedge fund composite index was higher than the 11.5 per cent realised by the S&P500. There is, however, a great diversity of returns over the strategies. The return of the short-selling index was a meagre 4 per cent, whereas the equity hedge index, associated with the most important strategy in the hedge fund industry, obtained a return as high as 17.5 per cent.

A stylised fact of the hedge fund returns is the high degree of kurtosis of their distribution. Actually, the kurtosis of the returns of the hedge fund composite index was 5.30 over the 1990–2005 period compared to the 3.73 observed for the S&P500. In Table 1, kurtosis ranges from a high of 14.71 for the merger arbitrage index to a low of 2.46 for the market timing one. Incidentally, the equity hedge strategy, the most important one in the hedge fund industry, had a kurtosis of 3.92 over the period 1990–2005, a level comparable to the S&P500.

The choice of instruments

As previously stated, this paper distinguishes three types of instruments: the classical instruments, the higher moments of the explanatory variables and the cumulants of the regressors, as proposed by Dagenais and Dagenais14 and transposed to financial models by Racicot.26 The classical instruments include the traditional predetermined variables and the Chen–Roll–Ross factors. The higher moment instruments add to the classical ones the higher moments of the regressors up to the fifth order (equation (3)). The derivation of cumulants, designated by zi, was explained in the preceding section.

Table 2 gives the adjusted R2 of the regressions of the risk factors of the F&F model (endogenous variables) on the classical instruments. As we can see, these instruments are quite poor, the adjusted R2 being under 0.10 for each risk factor of the F&F model. Resorting only to the classical instruments is inappropriate for explaining the returns of the market risk premium33 and of the mimicking portfolios, which have a high degree of kurtosis. Non-linear instruments such as higher moments and cumulants are required.


Table 3 confirms this allegation for the higher moment instruments. The adjusted R2 for the regressions of the endogenous variables on these instruments is in a range of 0.76–0.82. The higher moments thus seem to be very good instruments. But to qualify as such, instruments must also be uncorrelated with the innovation term of the OLS estimation of the F&F model. Figure 1 gives the distribution of the adjusted R2 of the residuals of the equations of 21 indices regressed on the higher moment IV. This adjusted R2 is not high enough to cause a problem.

Figure 1.

Adjusted R2 distribution of the regression of the OLS residuals of the F&F model on the higher moment instruments
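The two checks discussed here, relevance (a high adjusted R2 when a risk factor is regressed on the instruments) and exogeneity (a low adjusted R2 when the OLS residuals are regressed on the same instruments), rely on the same statistic. A minimal sketch on simulated data follows; the function `adjusted_r2` and all series are assumptions made for illustration and do not reproduce the paper's tables.

```python
import numpy as np

def adjusted_r2(y, Z):
    """Adjusted R2 of an OLS regression of y on Z plus a constant."""
    n = len(y)
    D = np.column_stack([np.ones(n), Z])
    resid = y - D @ np.linalg.lstsq(D, y, rcond=None)[0]
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    p = D.shape[1] - 1                      # number of regressors, constant excluded
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

rng = np.random.default_rng(3)
n = 192
Z = rng.standard_normal((n, 5))                   # simulated instruments
factor = Z[:, 0] + 0.3 * rng.standard_normal(n)   # series related to the instruments
noise = rng.standard_normal(n)                    # series unrelated to the instruments

relevance = adjusted_r2(factor, Z)    # expected to be high
exogeneity = adjusted_r2(noise, Z)    # expected to be close to zero
print(f"relevance={relevance:.2f}  exogeneity={exogeneity:.2f}")
```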



Finally, Table 4 shows the regressions of the four endogenous variables of the F&F model on the cumulants. The R2 of the regressions are in a range of 0.64–0.75 and are thus high. As we can see in Table 4, and as said previously when discussing equation (23), each risk factor has its own instrument. For instance, $z_1$ is the instrument of the market risk premium. When regressing the market risk premium on the $z_i$, $z_1$ has a coefficient of 1 and the other $z_i$ have a coefficient equal to 0. $z_2$ is the instrument of the SMB factor, the regression of SMB on the $z_i$ giving a coefficient of 1 to $z_2$ and coefficients of 0 to the other $z_i$, and so on. We may thus consider the first column of Table 4 as the principal component of the market risk premium, the second column as the principal component of the SMB factor, and so on. The cumulant instruments are therefore very appealing in view of their characteristics. Moreover, a regression of the residuals of the F&F equation on the $z_i$ for each index reveals that the adjusted R2 is always 0. The cumulants are superior to the higher moments from this point of view. And being built with the cumulants of the explanatory variables of the sample, they do not require information extraneous to this sample.


Comparison of estimation methods for the sample of 21 hedge funds indices

For the following evaluation of specification errors, the OLS method will serve as the benchmark to estimate equation (1). Two IV methods using classical instruments are used: the TSLS and the GMM. For the GMM, we use the Newey–West matrix as the weighting matrix. We then estimate the F&F model with the TSLS and GMM methods using higher moments as instruments. We designate these estimations respectively by TSLS-hm and GMM-hm. We present only the GMM estimation for the IV method using cumulants, labelled GMM-C, because this estimation is exactly identified. We also estimate our two versions of the Hausman equation: equation (19), which we call HAUS-hm, and equation (23), which we call HAUS-C. We isolate these estimations in Table 5, which presents our estimations, because they are respectively equivalent to the TSLS-hm and GMM-C estimations.34


To estimate equation (1), we chose as the market portfolio not the S&P500 but the excess return of the HFR hedge fund weighted composite index because this benchmark is more akin to the style of the hedge funds strategies than the S&P500 index. Table 5 reports the mean results computed over strategies for the estimation of equation (1) according to the chosen estimation methods. We will first discuss the OLS estimation and then compare the OLS results with those of the IV methods.

At 0.51, the mean R2 associated with the OLS estimation is quite moderate but, as we noticed in Table 1, it varies greatly from one strategy to the other. The constant of the regression of the F&F model is Jensen's alpha, which is a measure of the performance of a portfolio manager. We notice in Table 5 that when we use the hedge fund composite index instead of the S&P500 as the benchmark, the mean alpha is no longer very high or significant at the 5 per cent level. Studies that use the S&P500 as the market portfolio display a much higher alpha, which seems to contradict the market efficiency hypothesis. On the basis of this hypothesis, it thus seems more relevant to use the weighted composite index as the benchmark.

At 0.54, the mean beta of the 'average strategy' is quite moderate, but Table 6 reveals that its dispersion across strategies is important, the beta being as high as 1.86 for the equity non-hedge index and as low as −1.98 for the short-selling index. Incidentally, the short-selling index is the only one for which the beta is negative. After the benchmark risk premium, the HML factor has the most important impact on hedge fund returns. But at 0.06, its mean impact is not substantial. Once again, Table 8 reveals that the dispersion of this coefficient over strategies is not negligible. For instance, the loading of the HML factor is as high as 0.63 for the short-selling strategy. Finally, the two other factors do not seem very important for explaining hedge fund returns over our period of analysis.35



Table 5 reveals that the results of the IV methods resorting to classical instruments are disappointing. The R2 related to these methods is very low, and the majority of their estimated coefficients are not significant at the 5 per cent level. Actually, the recourse to pure classical instruments should be avoided when estimating hedge fund returns.

The IV methods using higher moments and cumulants as instruments perform much better. Their results are close to those of the OLS method, indicating that specification errors are low at the aggregation level of Table 5. The picture, however, will change when shifting to the Hausman tests by strategy.

Table 5 shows that the beta estimated by the three valuable IV methods is about the same as the OLS one. This coefficient thus seems to be correctly estimated by the OLS. The GMM-C method suggests that the mean Jensen's alpha is somewhat overstated by the OLS regression, a result not supported by the higher moment methods. There is no great variation in the estimated coefficient of the HML factor from one method to the other, although the three IV methods suggest that the UMD loading was understated by the OLS equation, the degree of understatement being much higher for the higher moment methods.

Hausman tests for the beta and HML loadings

The preceding section reveals that the two main factors that impact on hedge fund returns are the weighted composite index and the HML factor. We explained previously how to do a modified version of the Hausman test by using an artificial regression incorporating the residuals of the regressions of the endogenous variables on the instruments. These regressions are useful because the estimated coefficients of the residuals are indicators of the degree of understatement or overstatement of the coefficient of the corresponding endogenous variable. They are all the more useful because the estimated factor loadings of these regressions are the same as the corresponding TSLS estimates. The spread between the OLS coefficient for a variable and its corresponding coefficient in the artificial regression is thus actually a measure of the specification error (overstatement or understatement) on this variable, as suggested by equation (20).
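The artificial regression described above can be sketched numerically. The following is a minimal illustration on simulated data, not the authors' code: the function names (`ols`, `tsls`, `hausman_artificial_regression`), the single instrument `z` and all parameter values are assumptions made for the example. It checks that the loadings of the artificial regression coincide with the TSLS ones and computes phi and the spread for a factor observed with error.

```python
import numpy as np

def ols(y, X):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def tsls(y, X, Z):
    """Two-stage least squares: project X on Z, then regress y on the projection."""
    X_hat = Z @ ols(X, Z)
    return ols(y, X_hat)

def hausman_artificial_regression(y, X, Z, suspect_cols):
    """Regress y on X plus the first-stage residuals of the suspect columns.

    Returns (beta, phi): beta holds the loadings on X and phi the
    coefficients on the first-stage residuals.
    """
    X_suspect = X[:, suspect_cols]
    W_hat = X_suspect - Z @ ols(X_suspect, Z)   # first-stage residuals
    coef = ols(y, np.column_stack([X, W_hat]))
    k = X.shape[1]
    return coef[:k], coef[k:]

# Simulated data (illustrative only): one factor observed with error.
rng = np.random.default_rng(0)
T = 500
z = rng.normal(size=T)                    # hypothetical instrument
x_true = 0.8 * z + rng.normal(size=T)     # latent factor
x_obs = x_true + rng.normal(size=T)       # observed factor = latent + error
y = 0.5 + 1.0 * x_true + rng.normal(scale=0.5, size=T)

X = np.column_stack([np.ones(T), x_obs])  # constant + suspect regressor
Z = np.column_stack([np.ones(T), z])      # constant + instrument

beta_ols = ols(y, X)
beta_iv = tsls(y, X, Z)
beta_haus, phi = hausman_artificial_regression(y, X, Z, suspect_cols=[1])
spread = beta_ols[1] - beta_haus[1]       # OLS loading minus corrected loading

print("OLS:", beta_ols)
print("TSLS:", beta_iv)
print("HAUS:", beta_haus, "phi:", phi, "spread:", spread)
```

In this sketch only the suspect column receives a residual term, because the constant is itself an instrument and its first-stage residual would be identically zero.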

Tables 6 and 7 report the results of the estimations of the artificial regressions (19) and (23) for the benchmark risk premium, here the excess return of the hedge fund composite index. Table 6 resorts to cumulants to compute the residuals included in the artificial regressions and Table 7 uses the higher moments to do so. The strategies appearing in bold are those for which the estimated phi is significant at the 10 per cent level.


In Table 6, where the beta is estimated by the HAUS-C method, we see that this coefficient is significantly overstated by the OLS method for five strategies and significantly understated for three strategies. We must thus disaggregate by strategy to observe specification errors because we did not suspect a specification error for the beta averaged over strategies (Table 5). As we notice in Table 6, the phi coefficient associated with the residuals of the benchmark risk premium is an indicator of the degree of overstatement or understatement of the estimated beta. When phi is positive, the spread between the OLS and HAUS-C coefficients is positive: the coefficient is overstated in this case. And when phi is negative, the spread between the OLS and HAUS-C coefficients is negative: the coefficient is understated in this case. If we regress the spread over the phi appearing in Table 6, we obtain the following regression36:

[Equation not reproduced: estimated regression of the spread on phi.]

The R2 of this regression is 0.99, which is almost a perfect fit. In Figure 2, we can observe the close linear relationship between phi and the spread between the OLS and HAUS-C betas. As we said previously, such a good fit suggests that the specification errors are far from being negligible for the betas of hedge funds sorted by strategy.
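The link between the spread and phi has a simple algebraic counterpart, sketched here under assumed notation that may differ from that of equation (20): X is the matrix of regressors, \hat{W} the matrix of first-stage residuals of the suspect regressors, b_H and \hat{\varphi} the coefficients of the artificial regression, and e its residuals.

```latex
% Artificial regression, with least-squares residuals e orthogonal to X
y = X b_{H} + \hat{W}\hat{\varphi} + e , \qquad X'e = 0 .

% Premultiply by (X'X)^{-1}X' to recover the OLS estimator
b_{OLS} = (X'X)^{-1}X'y = b_{H} + (X'X)^{-1}X'\hat{W}\,\hat{\varphi}
\quad\Longrightarrow\quad
b_{OLS} - b_{H} = (X'X)^{-1}X'\hat{W}\,\hat{\varphi} .

% One demeaned regressor x with first-stage residual \hat{w}:
% x'\hat{w} = \hat{w}'\hat{w}, so that
b_{OLS} - b_{H} = \frac{\hat{w}'\hat{w}}{x'x}\,\hat{\varphi}
                = (1 - R^{2}_{1})\,\hat{\varphi} ,
% where R^2_1 is the R^2 of the first-stage regression of x on the instruments.
```

In the single-regressor case the factor multiplying \hat{\varphi} is non-negative, so the spread and phi share the same sign, in line with the reading of Table 6 given above. With several regressors the spread on one coefficient is a linear combination of all the estimated phi.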

Figure 2. Relation between the spread and phi estimated by HAUS-C for the benchmark risk premium

Table 7 gives the same information as Table 6 for the beta coefficient except that the HAUS-hm method is used in this table. We notice that this regression identifies fewer strategies with significant specification errors, that is, three instead of eight for the HAUS-C method. We must also notice that the direction of the specification error depends on the choice of instruments used to remove it.

According to the HAUS-hm regression, three strategies seem to suffer from significant specification errors: the convertibles, market timing and distressed securities strategies. These funds were also chosen by the HAUS-C method as candidates for specification errors (Table 6). According to the HAUS-C method, the betas of the convertibles and distressed securities strategies are overstated and that of the market timing strategy is understated. We obtain the inverse correction when using the HAUS-hm method for those funds. This situation is all the more problematic since we know that the beta estimated by the HAUS-C method is the same as the TSLS one using the cumulants as instruments and that the beta estimated by the HAUS-hm method is the same as the TSLS one using the higher moments of the explanatory variables as instruments. The betas estimated by the artificial regressions thus have a sound foundation, being related to a well-known IV method, and the correction of specification errors is thus related to the choice of instruments and not to the version of the Hausman test by itself.

We might suspect that the relation between the spread and phi is less tight in Table 7 than in Table 6 because the HAUS-hm method reports fewer strategies with specification errors. The regression between these two variables is, in this case:

[Equation not reproduced: estimated regression of the spread on phi.]

where the R2 is 0.81. As expected, the phi coefficient is less significant in the case of the HAUS-hm method than in the case of the HAUS-C one, and the R2 is much lower. The HAUS-hm method thus gives less evidence of specification errors for the estimated betas of the strategies than the HAUS-C procedure. Figure 3 gives the relation between the spread and the phi appearing in Table 7, and we see that this relationship is less tight than in Figure 2.
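The cross-strategy regression of the spread on phi can be reproduced along the following lines once the per-strategy estimates are available. This is only a sketch: the helper name `regress_spread_on_phi` is hypothetical, including a constant is an assumption, and the inputs below are made-up placeholders rather than the values of Tables 6 and 7.

```python
import numpy as np

def regress_spread_on_phi(phi, spread):
    """OLS of the spread on a constant and phi; returns (coef, t_stats, r2)."""
    phi = np.asarray(phi, dtype=float)
    spread = np.asarray(spread, dtype=float)
    A = np.column_stack([np.ones_like(phi), phi])
    coef = np.linalg.lstsq(A, spread, rcond=None)[0]
    resid = spread - A @ coef
    # Classical (homoskedastic) standard errors of the two coefficients
    sigma2 = resid @ resid / (len(phi) - A.shape[1])
    std_err = np.sqrt(np.diag(sigma2 * np.linalg.inv(A.T @ A)))
    r2 = 1.0 - resid @ resid / np.sum((spread - spread.mean()) ** 2)
    return coef, coef / std_err, r2

# Placeholder inputs (NOT the paper's estimates): 21 made-up strategies.
rng = np.random.default_rng(1)
phi = rng.uniform(-1.0, 1.0, size=21)
spread = 0.3 * phi + 0.01 * rng.normal(size=21)

coef, t_stats, r2 = regress_spread_on_phi(phi, spread)
print("intercept, slope:", coef)
print("t-statistics:", t_stats)
print("R2:", r2)
```

Replacing the placeholder arrays with the phi and spread columns of a given table would give the slope, t-statistics and R2 discussed in the text, up to the choice of standard errors.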

Figure 3. Relation between the spread and phi estimated by HAUS-hm for the benchmark risk premium

We do not report here the results of the artificial regressions for the SMB factor because this factor is not significant for the majority of the strategies. Figure 4 reports the association between the spread of the SMB coefficients estimated by the OLS and HAUS-C methods and the corresponding phi. There is no evidence of a linear relation between these two variables and we conclude that there are no specification errors for this variable.

Figure 4. Relation between the spread and phi estimated by HAUS-C for the SMB risk factor

Tables 8 and 9 give the same information as Tables 6 and 7 for the HML factor. The HAUS-C method identifies five strategies with overstated HML coefficients and three strategies with understated HML coefficients. The presence of specification errors is thus again not negligible for this factor and the regression of the spread on phi in the case of the HML factor confirms this:


[Equation not reproduced: estimated regression of the spread on phi.]

with an R2 of 0.94. Figure 5 shows the close relationship between these two variables.

Figure 5. Relation between the spread and phi estimated by HAUS-C for the HML risk factor

The HAUS-hm method (Table 9) identifies nine strategies with significant specification errors for the HML coefficient. Six of these strategies were also suspected of specification errors by the HAUS-C method. But once more, when one method concludes that a coefficient is overstated, the other method states that the same coefficient is understated. As said when analysing the beta coefficient, the analysis of specification errors depends greatly on the choice of instruments. The regression of the spread on the phi coefficient, viewed as an indicator of the severity of specification errors, gives the following result:

[Equation not reproduced: estimated regression of the spread on phi.]

with an R2 of 0.95. This regression indicates that specification errors contaminate the estimation of the HML factor, which is confirmed by Figure 6.

Figure 6. Relation between the spread and phi estimated by HAUS-hm for the HML risk factor

To sum up, our version of the Hausman test proves very useful for detecting specification errors for the estimated coefficients of the F&F model. Very often, the HAUS-C and HAUS-hm methods identify the same strategies as candidates for specification errors for a given risk factor. The problem lies in the choice of the instruments because the correction of these errors by the IV methods is very much conditioned by this choice. On the basis of our previous developments, the cumulants of the variables of the F&F model seem more relevant than the higher moments to remove specification errors from a financial model because these cumulants are strictly orthogonal to the innovation of the F&F model, which is not the case for the higher moments.


CONCLUSIONS

In this paper, we proposed new versions of the Hausman test to detect and remove specification errors while estimating financial or economic models. These new tests also constitute new empirical versions of these models. These tests are equivalent to a TSLS procedure and give way to an indicator of the spread between the OLS and corrected coefficients of an explanatory variable, which is a measure of the specification error.

To set up these tests, we resorted to two new sets of instruments: higher moments and cumulants. These instruments prove to perform better than the instruments frequently used in the empirical financial literature, like the Chen–Roll–Ross2 instruments. Unlike the classical instruments, they take into account the high degree of kurtosis that is present in the distributions of financial returns.

As shown in this paper, the correction of specification errors in a financial model depends heavily on a judicious choice of instruments. On the basis of our empirical work, the choice of cumulants over higher moments seems preferable in view of the strict orthogonality between these instruments and the innovation term of the estimated financial models and of the parsimonious character of these instruments, their number being limited to the number of explanatory variables of a model.

In summary, our study reveals that we must account for specification errors when estimating a financial model. The method we suggest to do so looks promising because it integrates new developments in the theory of financial risk with the estimation methods of financial models.


References and Notes

  1. Watson, C.T. (2003) 'GMM and the Fama and French Model: The Role of Instruments', Working Paper, Economics Department, UCLA.
  2. Chen, N.F., Roll, R. and Ross, S. (1986) 'Economic Forces and the Stock Market', Journal of Business, Vol. 59, No. 3, pp. 572–621.
  3. Hausman, J.A. (1978) 'Specification Tests in Econometrics', Econometrica, Vol. 46, pp. 1251–1271.
  4. Pindyck, R.S. and Rubinfeld, D.L. (1998) 'Econometric Models and Economic Forecasts', 4th edn, Irwin-McGraw-Hill, New York.
  5. Spencer, D.E. and Berk, K.N. (1981) 'A Limited Information Specification Test', Econometrica, Vol. 49, No. 4, pp. 1079–1085.
  6. Wu, D. (1973) 'Alternative Tests of Independence Between Stochastic Regressors and Disturbances', Econometrica, Vol. 41, pp. 733–750.
  7. For this section, see also Racicot, Théoret and Coën (2007), Racicot and Théoret (2007), Théoret and Racicot (2007), Racicot and Théoret (2006), Coën, Racicot and Théoret (2006a, b), Coën, Desfleurs, Hübner and Racicot (2005), Racicot and Théoret (2004) and Racicot (2003).
  8. Fama, E.F. and French, K.R. (1992) 'The Cross-Section of Expected Stock Returns', Journal of Finance, Vol. 47, No. 2, pp. 427–465.
  9. Fama, E.F. and French, K.R. (1993) 'Common Risk Factors in the Returns on Stocks and Bonds', Journal of Financial Economics, Vol. 33, No. 1, pp. 3–56.
  10. Fama, E.F. and French, K.R. (1997) 'Industry Costs of Equity', Journal of Financial Economics, Vol. 43, No. 2, pp. 153–193.
  11. The original F&F model contained only the first two 'anomalies'. The momentum anomaly, which is due to Carhart (1997) and Jegadeesh and Titman (1993), was introduced subsequently to form the augmented F&F model.
  12. Durbin, J. (1954) 'Errors in Variables', International Statistical Review, Vol. 22, No. 1/3, pp. 23–32.
  13. Pal, M. (1980) 'Consistent Moment Estimators of Regression Coefficients in the Presence of Errors in Variables', Journal of Econometrics, Vol. 14, No. 3, pp. 349–364.
  14. Dagenais, M.G. and Dagenais, D.L. (1997) 'Higher Moment Estimators for Linear Regression Models with Errors in the Variables', Journal of Econometrics, Vol. 76, No. 1–2, pp. 193–221.
  15. The reader will excuse us for conflating higher moments and cumulants at this stage for the sake of the presentation. We will come back to the distinction between these two concepts later.
  16. See Ref. 13.
  17. Samuelson, P.A. (1970) 'The Fundamental Approximation Theorem of Portfolio Analysis in Terms of Means, Variances and Higher Moments', Review of Economic Studies, Vol. 37, No. 4, pp. 537–542.
  18. Rubinstein, M. (1973) 'The Fundamental Theorem of Parameter-Preference Security Valuation', Journal of Financial and Quantitative Analysis, Vol. 8, No. 1, pp. 61–69.
  19. Kraus, A. and Litzenberger, R. (1976) 'Skewness Preference and the Valuation of Risk Assets', Journal of Finance, Vol. 31, pp. 1085–1100.
  20. See also Jurczenko and Maillet (2006) for multi-moment asset pricing models.
  21. On the Hausman test, see: Hausman (1978), Wu (1973), MacKinnon (1992) and Pindyck and Rubinfeld (1998). A very good presentation of the version of the Hausman test using an artificial regression in the context of correction of errors in variables may be found in Pindyck and Rubinfeld (1998). This presentation is done for one explanatory variable.
  22. For a very good exposition on developments around Hausman specification test, see Racicot, F.-É., Théoret, R. and Coën, A. (2007). A new empirical version of the Fama and French model based on the Hausman specification test: An application to hedge funds. Best paper award, Proceedings of the Global Finance Conference, Melbourne, Australia.
  23. Therefore, the Hausman test is an orthogonality test, that is, it aims to verify whether plim (1/T) X'ε = 0 in large samples.
  24. MacKinnon, J.G. (1992) 'Model Specification Tests and Artificial Regressions', Journal of Economic Literature, Vol. 30, No. 1, pp. 102–146.
  25. As done usually in econometrics, we use the asterisks for the unobserved variables.
  26. Racicot, F.E. (2003) 'On Measurement Errors in Economic and Financial Variables', in Three Essays on the Analysis of Economic and Financial Data, Chapter 3, PhD thesis. (ESG-UQAM), 2003 (published in French).
  27. Kendall, M.G. and Stuart, A. (1963) 'The Advanced Theory of Statistics', Vol. 1, Charles Griffin, London.
  28. See, for example, Malevergne and Sornette (2005) on the use of cumulants as measures of risk.
  29. Racicot, F.É. and Théoret, R. (2004) 'Numerical calculus in quantitative and empirical finance (Le calcul numérique en finance empirique et quantitative)', 2nd edn, Presses de l'Université du Québec (PUQ), Québec.
  30. Coën, A., Desfleurs, A., Hübner, G. and Racicot, F.E. (2005) 'A Reappraisal of the Performance of Hedge Funds in the Presence of Errors in Variables', in Gregoriou, G. N., Papageorgiou, N., Hübner, G. and Rouah, F. (eds.), 'Hedge Funds: Insights in Performance Measurement, Risk Analysis and Portfolio Allocation', John Wiley & Sons, New Jersey.
  31. Coën, A. and Racicot, F.E. (2007) 'Capital Asset Pricing Models Revisited: Evidence from Errors in Variables', Economics Letters, Vol. 95, No. 3, pp. 443–450.
  32. The address of French's website is: http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html.
  33. Here the excess return of the hedge fund composite index.
  34. Following the works of Chan and Faff (2005) and our own test, we resorted only to non-iterated GMM, iterated forms being problematic.
  35. Let us notice that the impact of the SMB factor is important when using the S&P500 as the market portfolio in the F&F model but that its influence vanishes when resorting to the hedge fund composite index as benchmark.
  36. Please note that the t-statistics of the estimated coefficients are in parentheses.