This study derives its motivation from the studies by Butterworth and Holmes1 and Dunis et al.2 Butterworth and Holmes1 investigate the use of a theoretical no-arbitrage model to test for weak-form efficiency in the spread between FTSE100 and FTSEMID250. They conclude that ‘While the inter-market spread is found to trade within its transactions costs limits on the majority of occasions, large and sustained deviations from fair value exist in both directions, resulting in the triggering of spread arbitrage transactions’. This suggests that, with some effective filtering, a profitable arbitrage model could be built.

Dunis et al 2 investigate the trading of the West Texas Intermediate (WTI)-Brent spread with a similar fair value model; this time the model is based on the cointegration vector of the two underlying series. This fair value model produces an out-of-sample return of 17.46 per cent; however, this is not the only model investigated. Dunis et al 2 also investigate the Autoregressive Moving Average (ARMA), Generalised Autoregressive Conditional Heteroskedasticity (GARCH), Moving Average Convergence Divergence (MACD) and Neural Network Regression (NNR) models. Every model tested produced positive out-of-sample returns inclusive of transactions costs.

This study extends the work of Dunis et al 2 in two ways. First, we test the same trading rule models on a portfolio of three spreads, thus testing the efficiency of more markets. By looking at the drawdown and standard deviation of these portfolios, we will also be able to draw some conclusions as to the diversifying effect of spread portfolios.

The second way we have extended previous work is by the development of a new filtering technique. Dunis et al 2 look at threshold and correlation filters, and in this study this is developed further with the hybrid filter. This filter uses a combination of inputs from both a threshold and a correlation filter to further refine the trading rules that are used. The correlation filter is explained in the section entitled ‘The correlation filter’, and the hybrid filter is explained in the section entitled ‘The hybrid filter’.

The case for spread trading has been made by many academics, most of whom call on reduced margin requirements3, 4, 5, 6 or consistently tradable patterns7, 8, 9, 10 to encourage interest in spread trading. However, these same studies fail to explicitly explain that with reduced margin comes reduced potential reward. The reason for the reduced margin when trading a spread is the reduced chance of large moves. This is because the two legs of the spread are highly correlated over the long term and will therefore move in generally the same direction.

Figures A1 and A2 show the probability density function (PDF) of the WTI and the Brent series differences, respectively. Figure A3 shows the PDF of the WTI-Brent spread differences. It is evident11 that the average change of the spread is smaller than the average change for either underlying by a significant amount. This validates the decision to offer a lower margin for spreads. In contrast, the kurtosis of the spread PDF is extremely high (11.48), indicating that large moves of the spread are consistent features. Further, the maximum move of the spread is similar to the maximum move of each of the underlying series. This seems to indicate that if spread trading rules were selective enough to pick out these large movements, the potential profitability of the rule would not be that much smaller than that of a rule on a single market, but could still retain the ability of the spread to hedge the position should a large, unfavourable, information-driven move occur on either leg.

BACKGROUND

Spread trading was first formally introduced into the finance literature by Working,12 who investigated the effects of the cost of storage on pricing relationships. It was demonstrated that futures traders could profit from the existence of abnormalities in the pricing relationships among futures contracts of different expiries.

Meland13 gives further justification for research into spread trading, stating that ‘although spread trading has been used to speculate on the cost of carry between different futures contracts, spread trading also serves the functions of arbitrage and hedging, together with providing a vital source of market liquidity’. It is therefore surprising that although there has been interest in cash-futures arbitrage,14, 15, 16, 17 inter- and intra-commodity spread trading has been largely ignored among the academic fraternity.1, 18, 19, 20, 21, 22

Studies such as those by Sweeney,23 Pruitt and White,24 and Dunis25 directly support the use of technical trading rules as a means of trading financial markets. Trading rules such as moving averages, filters and patterns seem to generate returns above the conventional buy and hold strategy. Similarly, Lukac and Brorsen26 carried out a comprehensive test of futures market trading and found that all but one of the trading rules tested generated significantly abnormal returns. In contrast, Sullivan et al 27 investigated the performance of technical trading rules over a 100-year period of the Dow Jones Industrial Average and conclude that ‘there is no evidence that any trading rule outperforms [the benchmark buy and hold strategy] over the sample period’.

With the increasing processing power of computers, rule-induced trading has become far easier to implement and test. Kaastra and Boyd28 investigated the use of Neural Networks for forecasting financial and economic time series. They concluded that the large amount of data needed to develop working forecasting models involved too much trial and error. In contrast, Chen et al 29 study the 30-year US Treasury bond using a neural network approach. The results prove to be good, with an average buy prediction accuracy of 67 per cent and an average annualised return on investment of 17.3 per cent.

In recent years, there has been an expansion in the use of computer trading techniques, which has once again called into doubt the efficiency of even very liquid financial markets. Kaastra and Boyd28 suggest that it is possible to achieve abnormal returns on the Morgan Stanley High Technology 35 index using a Gaussian mixture neural network trading model. Lindemann et al 30 justified the use of the same model to successfully trade the EUR/USD exchange rate, an exchange rate noted for its liquidity.

Butterworth and Holmes21 state that ‘an analysis of spread trading is important since it contributes to the economics of arbitrage and serves as an alternative to cash-futures arbitrage for testing for futures market efficiency’. They test a fair value model on the FTSE100 – FTSEMID250 spread, and conclude that ‘while there are many deviations from fair value, these are generally quite small in actual magnitude, indicating that both contracts tend to be efficiently priced’. This statement does not indicate whether these deviations from fair value can be captured by either a more sophisticated trading rule or an appropriate filter.

Recently, Dunis et al 2 looked at the trading of the WTI-Brent spread using a variety of techniques, and concluded that it is possible, using only past prices, to generate abnormal out-of-sample profits. They also investigate the use of filters in improving the Sharpe ratios of the trading rules; the two filters used are a threshold filter based on the size of the predicted market move and a correlation filter based on a rolling correlation. This study further extends this work in two ways: first by testing the profitability of this trading rule on spread portfolios, and second by introducing a further logical development of a correlation filter, namely, a hybrid filter, the exact description of which is given in the section entitled ‘The hybrid filter’.

DATA AND METHODOLOGY

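The spread returns series is calculated in the following way:

$$\Delta S_t = \frac{\text{Leg}_{1,t} - \text{Leg}_{1,t-1}}{\text{Leg}_{1,t-1}} - \frac{\text{Leg}_{2,t} - \text{Leg}_{2,t-1}}{\text{Leg}_{2,t-1}}$$

where ΔS t is the spread return at time t; Leg 1,t is the price of Leg 1 at time t; Leg 1,t−1 is the price of Leg 1 at time t−1; Leg 2,t is the price of Leg 2 at time t; and Leg 2,t−1 is the price of Leg 2 at time t−1.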

This convention allows for the calculation of annualised returns and annualised standard deviation to be carried out in the usual manner.
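As a minimal sketch of this calculation (not the code used in the study), the spread return can be computed as the percentage change of Leg 1 minus the percentage change of Leg 2. This reading is an assumption inferred from the variable definitions above; the function name `spread_returns` and the sample prices are hypothetical.

```python
def spread_returns(leg1, leg2):
    """Daily spread returns for two equal-length price lists.

    Assumption: spread return = percentage change of Leg 1
    minus percentage change of Leg 2.
    """
    if len(leg1) != len(leg2):
        raise ValueError("both legs must have the same number of observations")
    returns = []
    for t in range(1, len(leg1)):
        r1 = (leg1[t] - leg1[t - 1]) / leg1[t - 1]  # return on the long leg
        r2 = (leg2[t] - leg2[t - 1]) / leg2[t - 1]  # return on the short leg
        returns.append(r1 - r2)
    return returns


# Hypothetical prices, for illustration only.
leg1_prices = [60.0, 61.2, 60.6]
leg2_prices = [58.0, 58.6, 58.6]
print(spread_returns(leg1_prices, leg2_prices))
```

Because the result is expressed as a return series, it can be passed directly to standard annualisation routines.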

Data set

For all trading rules, except the NNR rule, the data have been split into two subsections; these are as shown in Table 1.

Table 1 In-Sample and Out-of-Sample trading periods

The first subset (the in-sample subset) is used to test and optimise the models. The second subset (the out-of-sample subset) is used as an unseen data set, to test our optimised models.

In the case of the NNR model, and in order to avoid overfitting, the data will be split into three subsets, as is standard in the literature (for example Lindemann et al 30 and Kaastra and Boyd28); these are as shown in Table 2.

Table 2 In-Sample and Out-of-Sample trading periods for NNR model

The NNR model is trained slightly differently from the other models. The training data set is used to train the network, the minimisation of the error function being the criterion optimised. The training of the network is stopped when the profit on the test data set is at a maximum. This model is then traded on the validation subset, which for comparison purposes is identical to the out-of-sample data set used for the other models. This technique restricts the amount of noise that the model will fit, while also ensuring that the structure inherent in the training and test subsets is modelled. Further explanation of this is presented in the section entitled ‘Neural Network Regression’.

Table 3 shows details of the time series used to form the portfolio.

Table 3 Time Series

The series shown in Table 3 are then combined to form the following spreads:

  1. WTI Crude versus Brent Crude;
  2. WTI Crude versus Unleaded Gasoline; and
  3. WTI Crude versus Heating Oil.

It is evident from Table 3 that these time series can be traded as spreads, as they are denominated in the same currency and each leg of the spread has an identical fixing time to the opposing leg. The portfolio that has been traded is an equally weighted portfolio of the three spreads shown above, and therefore each spread return is given a one-third weighting for its effect on the portfolio.
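The equal weighting can be illustrated with a short sketch. It assumes that each spread's return series has already been computed and aligned by date; the function name `portfolio_returns` and the sample figures are hypothetical.

```python
def portfolio_returns(spread_series):
    """Equally weighted portfolio return for each period.

    spread_series: list of return lists, one per spread, all the same length.
    """
    n_spreads = len(spread_series)
    length = len(spread_series[0])
    if any(len(s) != length for s in spread_series):
        raise ValueError("all spread return series must have the same length")
    weight = 1.0 / n_spreads  # one-third when three spreads are traded
    return [sum(weight * s[t] for s in spread_series) for t in range(length)]


# Hypothetical daily returns for three spreads.
example = [[0.03, 0.00], [0.00, -0.03], [0.03, 0.03]]
print(portfolio_returns(example))
```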

Constructing a continuous spread series

Trading on futures markets is slightly more complex than trading on cash markets, because futures contracts have limited lifetimes. If a trader takes a position on a futures contract that subsequently expires, he can take the same position on the next available contract. This is called ‘rolling forward’. The problem of rolling forward is that two contracts of different expiry may not (and invariably do not) have the same price. When the roll-forward technique is applied to the futures time series, it will cause the time series to exhibit periodic blips in the price of the futures contract. Although the cost of carry (which actually causes the pricing differential) can be mathematically taken out of each contract, this does not leave us with exactly tradable futures series.

In this study, as we are dealing with futures spreads, we have rolled forward both contracts on the first trading day of the maturity month of the earliest maturing contract. As both contracts are on largely similar underlyings, the short leg roll forward will cancel out the long leg roll forward.31 We are therefore left with a tradable time series with no periodic roll-forward price blips.

Transactions costs

The bid-ask spread is an average of four intra-day bid-ask spreads, and is presented here as a percentage of the underlying price. The bid-ask spread of each contract, as a percentage of the contract price, can be seen in Table 3. The cost of trading the spread is the total cost of trading one leg in addition to the total cost of trading the other leg. For example, the cost of trading the WTI-Brent spread is (0.0289 per cent +0.0940 per cent)=0.1229 per cent.32
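The cost arithmetic can be sketched as follows. The summation of the two leg costs follows the text; charging the full spread cost once per trade when netting returns is an assumption made for illustration, and both function names are hypothetical.

```python
def spread_trading_cost(leg_costs_pct):
    """Total cost (in per cent) of trading a spread: the sum of the leg costs."""
    return sum(leg_costs_pct)


def net_return_pct(gross_return_pct, leg_costs_pct, n_trades):
    """Return net of costs.

    Assumption: the full spread cost is charged once per trade.
    """
    return gross_return_pct - n_trades * spread_trading_cost(leg_costs_pct)


# The two leg costs quoted in the text for the WTI-Brent spread.
cost = spread_trading_cost([0.0289, 0.0940])
print(round(cost, 4))  # 0.1229
```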

TRADING DECISION MODELS

The trading decision models have been arranged such that each generic set of trading rules is used to form a portfolio of trading models. There are therefore four trading model portfolios; these are described below.

Fair value cointegration trading rule

The fair value model that has been used in this study is the cointegration fair value model as used by Dunis et al. 2 The model itself is based on the cointegration test,33 which shows the long-run relationship between multiple assets. This can be extended to a trading model as shown below.

Taking the in-sample cointegrating vectors as

α 12+β 12=Cointegrating Vector of Leg 1 with respect to Leg 2.

Using this vector we can find the residuals of the cointegrating equation. This is achieved in the following way:

where Leg 1 is the long leg of the spread and Leg 2 is the short leg of the spread.

Any deviation of Leg 2 from the theoretical price imposed on it by Leg 1 could be seen as a deviation from fair value. We can therefore present a trading rule as follows:

If μ t<0 then go long the spread, until μ t=0 is regained.

If μ t>0 then go short the spread, until μ t=0 is regained.
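A minimal sketch of this rule is given below. It assumes that the residual is defined as the price of Leg 1 minus the fair value implied by Leg 2, that is, μ t = Leg 1,t − (α 12 + β 12 Leg 2,t); this definition is an assumption chosen to be consistent with the trading rule, and the function name and sample values are hypothetical. A position of +1 denotes long the spread, −1 short the spread and 0 out of the market.

```python
def fair_value_positions(leg1, leg2, alpha, beta):
    """Position for each period under the fair value rule.

    Assumption: residual mu_t = leg1_t - (alpha + beta * leg2_t), where
    alpha and beta come from the in-sample cointegrating regression.
    """
    positions = []
    for p1, p2 in zip(leg1, leg2):
        mu = p1 - (alpha + beta * p2)
        if mu < 0:
            positions.append(1)    # below fair value: long the spread
        elif mu > 0:
            positions.append(-1)   # above fair value: short the spread
        else:
            positions.append(0)    # fair value regained
    return positions


# Hypothetical prices and coefficients.
print(fair_value_positions([10.0, 12.0, 11.0], [10.0, 10.0, 10.0], 1.0, 1.0))
```

Because the position depends only on the sign of the residual, a long position is held for as long as the residual stays negative and is closed when the residual returns to zero.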

MACD trading rule

The main problem with the fair value cointegration approach is that the fair value is static: it is estimated once, over the in-sample period. This is a problem, because any fundamental change in the underlying relationship could cause a massive drawdown, resulting in the trader being priced out of the market. A logical extension of this model would be to regularly re-estimate the fair value based on the most recent data. However, this would be a computationally intensive undertaking for the fair value cointegration model; a faster method is to use an n-day moving average as a proxy for the fair value price.

The ‘reverse moving average’ in which the traditional rule positions are reversed therefore provides the trader with a dynamic model for exploring the situation wherein markets are not trending but mean-reverting; this rule should also help limit the potential problem of large drawdowns affecting the results.

The formalism for the traditional moving average is as follows:

$$MA_t = \frac{1}{N} \sum_{n=1}^{N} p_{t-n}$$

The trader should go long if p t >MA t , and the trader should go short if p t <MA t , where p t−n is the price at time t−n; n=(1, 2, …, N); p t is the price at time t; and N is the number of days of the moving average.

The reverse moving average rule will therefore be that the trader should go long if p t <MA t and should go short if p t >MA t , with MA t calculated in the same way.
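The reverse rule can be sketched as follows. The sketch assumes that the moving average at time t is the simple mean of the N prices preceding t (following the index n = 1, …, N), and that no position is taken until N prices are available; the function names are hypothetical.

```python
def moving_average(prices, t, n_days):
    """Simple mean of the n_days prices preceding time t (p_{t-1} ... p_{t-N})."""
    window = prices[t - n_days:t]
    return sum(window) / n_days


def reverse_ma_positions(prices, n_days):
    """Reverse moving average rule: long below the average, short above it."""
    positions = []
    for t in range(len(prices)):
        if t < n_days:
            positions.append(0)  # assumption: stay out until N prices exist
            continue
        ma = moving_average(prices, t, n_days)
        if prices[t] < ma:
            positions.append(1)    # price below average: expect reversion upwards
        elif prices[t] > ma:
            positions.append(-1)   # price above average: expect reversion downwards
        else:
            positions.append(0)
    return positions


# Hypothetical spread levels with a 2-day moving average.
print(reverse_ma_positions([1.0, 2.0, 3.0, 1.0, 4.0], 2))
```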

Traditional regression analysis

The third set of trading decision models to be investigated comprises traditional regression analysis models, that is, ARMA and GARCH models. An ARMA(10, 10) model was used to estimate the percentage change in the spread, and a restricted model was estimated using the Akaike information criterion as the optimising parameter (over the in-sample period). Autocorrelation was then tested for and removed with the addition of lags of the percentage change in the spread. If heteroskedasticity was present, an alternative GARCH(1, 1) model was similarly estimated. The final models are those free from heteroskedasticity and autocorrelation, with an optimised Akaike information criterion. These models were then used to forecast the out-of-sample period using a ‘one size fits all’ estimation.

Neural Network Regression

The most basic type of NNR model, which is used in this study, is the MultiLayer Perceptron (MLP). As explained in the study by Lindemann et al,34 the network has three layers: the input layer (explanatory variables), the output layer (the model estimation of the time series) and the hidden layer. The number of nodes in the hidden layer defines the amount of complexity that the model can fit. The input and hidden layers also include a bias node,35 which has a fixed value of 1.

The network processes information as follows:

  1. The input nodes contain the values of the explanatory variables (in this case 10 lagged values of the spread).
  2. These values are transmitted to the hidden layer as the weighted sum of its inputs.
  3. The hidden layer passes the information through a non-linear activation function and, if the calculated value is above a threshold value, onto the output layer.

The connections between neurons for a single output neuron in the net are shown in Figure 1 :

where

$x_t^{[n]}$ (n = 1, 2, …, k+1) are the model inputs (including the input bias node) at time t;

$h_t^{[m]}$ (m = 1, 2, …, j+1) are the hidden nodes outputs (including the hidden bias node);

$\tilde{y}_t$ is the MLP model output (the estimate of the change in the spread);

$u_{jk}$ and $w_j$ are the network weights;

$S(x) = 1/(1+e^{-x})$ is the transfer sigmoid function; and

$F(x) = \sum_i x_i$ is a linear function (denoted ⊘ in the figure).

Figure 1: A single-output, fully connected MLP model.

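The error function to be minimised is

$$E(u_{jk}, w_j) = \frac{1}{T} \sum_{t=1}^{T} \left( y_t - \tilde{y}_t(u_{jk}, w_j) \right)^2$$

with y t (the change in the spread) being the target value, $\tilde{y}_t$ the MLP model output and T the number of observations in the training subset.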
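The forward pass of such a network can be sketched as below. This is an illustration only: it assumes the standard logistic sigmoid S(x) = 1/(1+e^(−x)) in the hidden layer, a linear output node, a mean squared error criterion, and it omits the hidden-layer threshold mentioned above. The function names and the weight layout (bias weight stored last in each row) are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic transfer function S(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def mlp_forward(inputs, hidden_weights, output_weights):
    """Output of a single-output MLP with one hidden layer.

    inputs: k explanatory values (for example, lagged spread changes).
    hidden_weights: j rows of k+1 weights (last weight multiplies the bias node).
    output_weights: j+1 weights (last weight multiplies the hidden bias node).
    """
    x = list(inputs) + [1.0]  # append the input bias node, fixed at 1
    hidden = [sigmoid(sum(u * xi for u, xi in zip(row, x)))
              for row in hidden_weights]
    h = hidden + [1.0]        # append the hidden bias node, fixed at 1
    return sum(w * hi for w, hi in zip(output_weights, h))  # linear output


def mean_squared_error(targets, estimates):
    """Assumed error criterion: average squared forecast error."""
    return sum((y - e) ** 2 for y, e in zip(targets, estimates)) / len(targets)


# Hypothetical weights: two inputs, two hidden nodes.
print(mlp_forward([0.1, -0.2], [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]], [1.0, 1.0, 0.25]))
```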

The training of the neural network is of utmost importance, as it is possible for the network to learn the training data subset exactly (commonly referred to as overfitting). For this reason the network training must be stopped early. This is achieved by dividing the data set into three different components (as described in the section entitled ‘Data set’). First, a training subset is used to optimise the model; the ‘back propagation of errors’ algorithm is used to establish optimal weights from the initial random weights.

Second, a test subset is used to stop the training subset from being overfitted. Optimisation of the training subset is stopped when the test subset is at maximum positive return. These two subsets are the equivalent of the in-sample subset for all other models. This technique will prevent the model from overfitting the data, while also ensuring that any structure inherent in the spread is captured.

Finally, a validation subset is used to simulate future values of the time series, which for comparison is the same as the out-of-sample subset of the other models.
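The stopping procedure can be sketched generically as below. The sketch keeps the weights that gave the highest trading profit on the test subset. The `patience` parameter (the number of epochs without improvement before stopping) is an assumption introduced for illustration, as are the function name and the two callables; the study does not specify how the maximum is detected.

```python
def train_with_early_stopping(train_step, test_profit, max_epochs, patience):
    """Train until the test-subset profit stops improving.

    train_step(): runs one training epoch and returns the current weights.
    test_profit(weights): returns the trading profit on the test subset.
    patience: epochs without improvement tolerated before stopping (assumption).
    """
    best_profit = float("-inf")
    best_weights = None
    epochs_since_best = 0
    for _ in range(max_epochs):
        weights = train_step()
        profit = test_profit(weights)
        if profit > best_profit:
            best_profit = profit
            best_weights = weights
            epochs_since_best = 0
        else:
            epochs_since_best += 1
            if epochs_since_best >= patience:
                break  # test profit has peaked; stop before overfitting
    return best_weights, best_profit
```

The returned weights would then be applied, unchanged, to the validation subset.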

As the starting point for each network is a set of random weights, we have used a committee of 10 networks to arrive at a trading decision (the average change estimate decides on the trading position taken). This helps to overcome the problem of local minima affecting the training procedure. The trading model predicts the change in the spread from one closing price to the next, and therefore the average result of all trading models was used as the forecast of the change in the spread.
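A minimal sketch of the committee decision follows. It assumes that a positive average forecast leads to a long spread position and a negative one to a short position; the function names and sample forecasts are hypothetical.

```python
def committee_forecast(forecasts):
    """Average of the individual network forecasts of the spread change."""
    return sum(forecasts) / len(forecasts)


def committee_position(forecasts):
    """Trading position implied by the committee average.

    Assumption: long (+1) if the average forecast is positive,
    short (-1) if negative, flat (0) if exactly zero.
    """
    average = committee_forecast(forecasts)
    if average > 0:
        return 1
    if average < 0:
        return -1
    return 0


# Hypothetical forecasts from three of the committee's networks.
print(committee_position([0.02, -0.01, 0.01]))
```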

FILTER METHODOLOGIES

The filter methodologies employed in this study are as follows.

The threshold filter

The threshold filter is constructed as follows:

If (Δy t)² < x t then stay out of the market; otherwise take the decision of the trading rule, where Δy t is the estimate of the percentage change in the spread, and x t is the size of the threshold filter, which has been optimised in-sample.

In the case of the MACD and fair value models, the position is held until the moving average or fair value is regained.
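The threshold rule can be sketched as below, reading the condition as a comparison of the squared forecast change with the threshold. The sketch does not implement the holding behaviour described for the MACD and fair value models; the function name and the sample values are hypothetical.

```python
def threshold_filter(forecast_change, rule_position, threshold):
    """Apply the threshold filter to one trading decision.

    forecast_change: estimated percentage change in the spread.
    rule_position: +1 (long), -1 (short) or 0 from the unfiltered rule.
    threshold: filter size, optimised in-sample.
    """
    if forecast_change ** 2 < threshold:
        return 0              # forecast move too small: stay out of the market
    return rule_position      # otherwise follow the trading rule


print(threshold_filter(0.01, 1, 0.001))   # small forecast move: filtered out
print(threshold_filter(0.05, -1, 0.001))  # large forecast move: rule followed
```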

The correlation filter

In addition to the application of a threshold filter, the trading rules have been filtered in terms of their correlation. This methodology is explained in the study by Dunis et al 2 and is reproduced below.

A rolling 30-day correlation is produced from the two legs of the spread. The change of this series is then calculated. From this there is a binary output of either 0 if the change in the correlation is above X c or 1 if the change in the correlation is below X c, where X c is the correlation filter level, which is optimised in-sample. This binary series is then multiplied by the returns series of the trading model.

By using this filter, it should also be possible to filter out initial moves away from fair value, which are generally harder to predict than moves back to fair value. Figure 2 presents an example of the entry and exit points of the filter when X c=0.

Figure 2: Operation of the correlation filter.

Figure 2 shows that a market entry is triggered the day after the drop in correlation. The market exit is triggered the day after the correlation starts to rise.
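The filter can be sketched as follows. Several details are assumptions: the correlation is computed directly on the two series passed in, a change exactly equal to the filter level is treated as an out-of-market signal, a window with no variation is given a correlation of zero, and the signal is applied to the following day's return to reflect the one-day delay described above. All function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # assumption: a flat window carries no correlation signal
    return cov / math.sqrt(vx * vy)


def correlation_filter(leg1, leg2, window=30, level=0.0):
    """Binary signal: 1 if the change in rolling correlation is below `level`."""
    n = len(leg1)
    corr = [None] * n
    for t in range(window - 1, n):
        corr[t] = pearson(leg1[t - window + 1:t + 1], leg2[t - window + 1:t + 1])
    flags = [0] * n
    for t in range(window, n):
        change = corr[t] - corr[t - 1]
        flags[t] = 1 if change < level else 0
    return flags


def apply_filter(flags, model_returns):
    """Multiply returns by the previous day's signal (entry the day after)."""
    filtered = [0.0]
    for t in range(1, len(model_returns)):
        filtered.append(flags[t - 1] * model_returns[t])
    return filtered
```

In use, `window` would be 30 and `level` would be the in-sample optimised value of X c.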

The hybrid filter

The hybrid filter has been included here because the threshold and correlation filters seem to filter trades under different circumstances. The hybrid filter is simply a combination of signals from both filters. The hybrid filter has not been optimised itself, but uses the parameters of the threshold and correlation filters. This filter can be represented as follows.

If either filter shows out of the market signals then stay out of the market; otherwise take the decision of the trading rule.
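The combination rule amounts to a logical AND of the two filters' in-market signals, as the following sketch shows. The function name and the Boolean encoding of the filter signals are hypothetical.

```python
def hybrid_filter(rule_position, threshold_in_market, correlation_in_market):
    """Combine the threshold and correlation filter signals.

    rule_position: +1, -1 or 0 from the unfiltered trading rule.
    threshold_in_market / correlation_in_market: True if the respective
    filter allows a trade, False if it signals out of the market.
    """
    if not (threshold_in_market and correlation_in_market):
        return 0              # either filter says stay out
    return rule_position      # both filters agree: follow the trading rule


print(hybrid_filter(1, True, False))   # correlation filter blocks the trade
print(hybrid_filter(-1, True, True))   # both filters allow the trade
```

No additional parameter is optimised here; the two input signals come from the already optimised threshold and correlation filters.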

RESULTS

The results for all four trading techniques can be seen below. Filters are optimised in terms of the in-sample leverage factor.36 This has been used because when trading futures markets, a trader will only need to invest a margin (around 6 per cent of the contract price for the period under review); therefore the underlying investment should be based on a measure of drawdown.

Table 4 shows the trading statistics for the fair value cointegration portfolio. The table covers the percentage of annualised return, the percentage of annualised standard deviation, the percentage of maximum drawdown, the Sharpe ratio and a leverage factor.37
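The first four statistics can be sketched as below. The conventions used are assumptions: 252 trading days per year, arithmetic (non-compounded) returns, a zero risk-free rate in the Sharpe ratio, and maximum drawdown measured on the cumulative sum of returns. The leverage factor is omitted because its definition is given in a note that is not reproduced in this section. All names are hypothetical.

```python
import math

TRADING_DAYS = 252  # assumption: trading days per year


def annualised_return(returns):
    """Mean daily return scaled to one year."""
    return TRADING_DAYS * sum(returns) / len(returns)


def annualised_std(returns):
    """Sample standard deviation of daily returns scaled to one year."""
    mean = sum(returns) / len(returns)
    variance = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
    return math.sqrt(TRADING_DAYS * variance)


def max_drawdown(returns):
    """Largest peak-to-trough fall in cumulative return (a negative number)."""
    cumulative = peak = worst = 0.0
    for r in returns:
        cumulative += r
        peak = max(peak, cumulative)
        worst = min(worst, cumulative - peak)
    return worst


def sharpe_ratio(returns):
    """Annualised return over annualised risk (zero risk-free rate assumed)."""
    return annualised_return(returns) / annualised_std(returns)


# Hypothetical daily returns.
sample = [0.02, -0.01, -0.03, 0.05]
print(max_drawdown(sample))
```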

Table 4 Fair-value cointegration portfolio results

Cointegration fair value portfolio

It is evident from Table 4 that the out-of-sample portfolio does produce returns sufficient to cover transactions costs even without the use of a filter. It is also evident that the use of filtering methodologies is vindicated. From the in-sample results the threshold filter would have been chosen (decision based on highest leverage factor); the out-of-sample results confirm this to be the best performer.

MACD portfolio

It is evident from Table 5 that the out-of-sample portfolio does produce returns sufficient to cover transactions costs even without the use of a filter. It is also evident that the use of filtering methodologies is not vindicated. From the in-sample results the unfiltered model would have been chosen (decision based on highest leverage factor); the out-of-sample results, however, show the threshold filter to be the best-performing model.

Table 5 MACD portfolio results

Traditional regression analysis portfolio

It is evident from Table 6 that the out-of-sample portfolio does not produce returns sufficient to cover transactions costs without the use of a filter. It is also evident that the use of filtering methodologies is vindicated. From the in-sample results the correlation filter would have been chosen (decision based on highest leverage factor); the out-of-sample results also show the correlation filter to be the best performer.

Table 6 Traditional regression analysis portfolio results

NNR portfolio

It is evident from Table 7 that the out-of-sample portfolio does produce returns sufficient to cover transactions costs even without the use of a filter. It is also evident that the use of filtering methodologies is vindicated. From the in-sample results the hybrid filter would have been chosen (decision based on highest leverage factor); the out-of-sample results also show the hybrid filter to be the best-performing model type.

Table 7 NNR portfolio results

CONCLUSIONS

It can be concluded from these results that the use of the various filtering techniques, the hybrid filter in particular, is vindicated. This is evident as the selection decision we make is proven to be incorrect only once. This is the case of the MACD model, for which the unfiltered model is chosen based on the in-sample leverage factor; however, the out-of-sample statistics show that a better choice would have been the threshold filter.

In all other models, the selection of the best in-sample filtering technique leads to the best out-of-sample performance. The hybrid filter was selected on one of these occasions, and proved to be the best out-of-sample performer.

It can also be concluded that the use of the fair value cointegration model is vindicated, returning an out-of-sample leverage factor of 1.38. The best model type for trading spreads, based on this evidence, seems to be the NNR model, which returns an out-of-sample leverage factor of above 7 and an out-of-sample Sharpe ratio of above 1.5. It can therefore be concluded that the various filters shown here, and in particular the hybrid filter, can provide the spread trader with an advantage over plain trading decision models.

Furthermore, drawing on the results of the study by Salcedo,10 it can be concluded that spread portfolios offer diversification benefits. Salcedo10 shows a maximum out-of-sample leverage factor of 4.6, for the MACD model; here the out-of-sample leverage factor has been increased to 7.3 – this time for the NNR model.