Monday, 22 February 2016

Backtesting in Algorithmic Trading

Backtesting is what we have been doing in our previous posts: using historical market data to see how our strategy would have performed over different timeframes. The higher the frequency of the algo, the harder it is to tell whether the backtest gives realistic results, because execution becomes the key advantage of the strategy and a change of a few milliseconds in any of the parameters can greatly affect performance.

Michael Halls Moore gives three key reasons for backtesting:
1. Filtration
We can filter strategies down to those that meet our needs, using performance metrics such as the Sharpe ratio and maximum relative drawdown/drawdown duration, as provided in Quantopian.

2. Modelling
We can relate theory to practice by testing (with no risk) under realistic market microstructure conditions, and see the impact of illiquidity.

3. Strategy Optimisation
We can change different parameters in the strategy and see the impact of that change in the overall performance, as we did in our previous posts.
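
To make the filtration metrics a bit more concrete, here is a minimal sketch of how the Sharpe ratio, maximum relative drawdown and drawdown duration could be computed from a list of per-period returns. This is not Quantopian's implementation; the function names, the 252-periods-per-year default and the zero risk-free rate are my own assumptions.

```python
import math


def sharpe_ratio(returns, periods_per_year=252, risk_free_rate=0.0):
    """Annualised Sharpe ratio from per-period simple returns.

    Assumes `risk_free_rate` is an annual rate spread evenly over periods.
    """
    excess = [r - risk_free_rate / periods_per_year for r in returns]
    mean = sum(excess) / len(excess)
    # Sample variance (n - 1 in the denominator).
    variance = sum((r - mean) ** 2 for r in excess) / (len(excess) - 1)
    std = math.sqrt(variance)
    if std == 0:
        return float("nan")
    return mean / std * math.sqrt(periods_per_year)


def max_drawdown(returns):
    """Return (max relative drawdown, longest drawdown duration in periods)."""
    equity = 1.0   # equity curve starts at 1
    peak = 1.0     # highest equity seen so far
    worst = 0.0
    duration = 0
    longest = 0
    for r in returns:
        equity *= 1.0 + r
        if equity >= peak:
            peak = equity
            duration = 0
        else:
            duration += 1
            longest = max(longest, duration)
            worst = max(worst, (peak - equity) / peak)
    return worst, longest
```

A strategy could then be kept or discarded by comparing these numbers against whatever thresholds suit you, for example a minimum Sharpe ratio and a maximum tolerated drawdown.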

Interestingly, most biases in backtesting flatter the performance, so the overall PnL from a backtest should be treated as a best case rather than an expectation. One cause is optimisation bias (also called curve fitting or data snooping), where parameters are tuned until they generate the highest profit on the historical data. More data and fewer parameters can help reduce this bias, but longer timeframes bring their own problem: older data can relate to a different market or regulatory structure and hence no longer be applicable.

As I learnt from Michael Halls Moore, a sensitivity analysis can be used to help reduce optimisation bias. This involves adjusting parameters incrementally and plotting the resulting surface of performance. A volatile (jumpy) parameter surface suggests the chosen parameter is an artefact of the past data rather than a real effect. Going to check how this works out on Quantopian, and will update later!
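
Until I try this on Quantopian, here is a rough sketch of what a sensitivity analysis could look like in plain Python. Everything in it is an assumption made for illustration: the toy moving-average crossover strategy, the synthetic random-walk prices, and the "roughness" score (the mean absolute change in performance between neighbouring parameter pairs), which is just one simple way to put a number on how jumpy the surface is.

```python
import random


def sma_crossover_return(prices, fast, slow):
    """Summed simple returns of a long-only moving-average crossover.

    The signal at bar t uses prices up to and including t, and is applied
    to the return from t to t + 1, so no future data is used.
    """
    total = 0.0
    for t in range(slow - 1, len(prices) - 1):
        fast_avg = sum(prices[t - fast + 1 : t + 1]) / fast
        slow_avg = sum(prices[t - slow + 1 : t + 1]) / slow
        if fast_avg > slow_avg:
            total += prices[t + 1] / prices[t] - 1.0
    return total


def sensitivity_grid(prices, fast_values, slow_values):
    """Performance for every (fast, slow) pair with fast < slow."""
    return {
        (f, s): sma_crossover_return(prices, f, s)
        for f in fast_values
        for s in slow_values
        if f < s
    }


def roughness(grid, fast_values, slow_values):
    """Mean absolute performance change between neighbouring grid points."""
    diffs = []
    for i, f in enumerate(fast_values):
        for j, s in enumerate(slow_values):
            if (f, s) not in grid:
                continue
            neighbours = []
            if i + 1 < len(fast_values):
                neighbours.append((fast_values[i + 1], s))
            if j + 1 < len(slow_values):
                neighbours.append((f, slow_values[j + 1]))
            for key in neighbours:
                if key in grid:
                    diffs.append(abs(grid[key] - grid[(f, s)]))
    return sum(diffs) / len(diffs) if diffs else 0.0


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    prices = [100.0]
    for _ in range(500):
        prices.append(prices[-1] * (1.0 + rng.gauss(0.0, 0.01)))

    fast_values = [5, 10, 15, 20]
    slow_values = [30, 40, 50, 60]
    grid = sensitivity_grid(prices, fast_values, slow_values)
    for (f, s), perf in sorted(grid.items()):
        print(f"fast={f:2d} slow={s:2d} return={perf:+.4f}")
    print("roughness:", roughness(grid, fast_values, slow_values))
```

If the best parameter pair sits on a sharp spike while its neighbours perform very differently, I would be suspicious of it; a broad plateau of similar results is more reassuring.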

Look-ahead bias, where the backtest uses data that would not have been known at that point in time, can be introduced in various ways. One I found particularly interesting was trading strategies that use maximum and minimum values. These values should be lagged by at least one period because they can only be calculated at the end of a period. A separate risk is shorting constraints: ignoring the lack of liquidity in certain equities, or market regulations such as the US 2008 shorting ban, can inflate backtest results.
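
Here is a small sketch of what lagging a maximum by one period means in practice. The function names and the breakout rule are hypothetical, chosen only to illustrate the point: the rolling high for each bar is built from the bars strictly before it, so the current bar's own high never feeds into the signal.

```python
def rolling_high_lagged(highs, window):
    """Highest high over the `window` bars strictly BEFORE each bar.

    Entry i is None until `window` completed bars are available.
    """
    lagged = []
    for i in range(len(highs)):
        if i < window:
            lagged.append(None)
        else:
            # Slice ends at i (exclusive), so bar i itself is left out.
            lagged.append(max(highs[i - window : i]))
    return lagged


def breakout_signals(closes, highs, window):
    """True where the close exceeds the lagged rolling high."""
    prior_highs = rolling_high_lagged(highs, window)
    return [
        prior is not None and close > prior
        for close, prior in zip(closes, prior_highs)
    ]


highs = [10, 12, 11, 15, 14]
closes = [9, 11, 10.5, 14, 13]
print(rolling_high_lagged(highs, 2))        # [None, None, 12, 12, 15]
print(breakout_signals(closes, highs, 2))   # [False, False, False, True, False]
```

The same idea applies to rolling minimums: shift the window back by one bar before comparing against the current price.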

Backtesting should be performed on complete data, though such data is typically expensive and mostly used by institutions. Some sources, such as Yahoo Finance, only include assets that did not end up delisting, which means the strategy is only being tested on assets that survived. Equities are prone to this survivorship bias; a shorter timeframe can reduce the chance of delisted assets affecting the results. Equity data can also have OHLC prices (open, high, low and close) that are skewed by outliers from small orders on different exchanges. Trades involving FX markets are also challenging in this respect, as the bid and ask prices on one exchange differ from those on another, and therefore consolidated price (and transaction cost) data from multiple ECNs is not ideal.

Transaction costs can eat into the profits of algorithms, particularly those that rely on high-volume, small-margin trades, and should therefore be modelled. The fixed costs are the easiest to implement in backtests.

  • broker commission, fees to clear and settle trades (fixed)
  • taxes (fixed)
  • slippage: affected by latency (the time difference between signal and execution) and can be high for assets with high volatility. Momentum and contrarian strategies are sensitive to high slippage, as you are trying to buy assets that are already moving in the expected direction or the opposite direction respectively.
  • market impact (can be reduced by breaking orders down into smaller volumes) and spread (wider for illiquid assets)
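
As a rough illustration of how these costs could be plugged into a backtest, here is a minimal sketch. The cost model and every default in it (a flat commission per trade, paying half the quoted spread, slippage as a fixed number of basis points of notional) are assumptions I picked for the example, not figures from any broker; taxes and market impact are left out.

```python
def trade_cost(quantity, price, commission=1.0, spread=0.02, slippage_bps=5.0):
    """Estimated cost of one trade under a deliberately simple model.

    commission   -- flat fee per trade (assumed)
    spread       -- quoted bid-ask spread per share; we pay half of it
    slippage_bps -- slippage as basis points of the traded notional
    """
    shares = abs(quantity)
    notional = shares * price
    half_spread_cost = shares * spread / 2.0
    slippage_cost = notional * slippage_bps / 10_000.0
    return commission + half_spread_cost + slippage_cost


def net_pnl(trades, **cost_params):
    """Net PnL for a list of (quantity, price) trades.

    Positive quantity is a buy and negative is a sell; the position is
    assumed to be flat once all trades are done.
    """
    gross = -sum(quantity * price for quantity, price in trades)
    costs = sum(trade_cost(q, p, **cost_params) for q, p in trades)
    return gross - costs


# Buy 100 shares at 50.00, sell them at 50.10: 10.00 gross profit.
round_trip = [(100, 50.00), (-100, 50.10)]
print(net_pnl(round_trip))
```

With these assumed numbers most of the gross profit on the round trip disappears into costs, which is exactly why high-volume, small-margin strategies are the most exposed.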
