Sunday 21 February 2016

Strategy Formulation and Evaluation in Algo Trading

Overview

One of the many benefits of automated trading is the ability to verify, and subsequently optimise, a strategy by testing it on historical data. This process is known as backtesting, a topic we will focus on in future blog posts. Strategy formulation and evaluation are the key focus of this blog post.

Automation also means traders do not need to constantly monitor prices. Risk management benefits too: position sizes, leverage, etc. can be adjusted dynamically depending on market movements.

With various performance metrics being tracked, capital allocation is more efficient, as strategies are easier to compare than with traditional profit and loss tracking, which masks drawdown. Drawdown refers to the decline from a peak to the subsequent trough in the value of an investment.

Algo trading also removes psychological biases, which sometimes act as motivators in trading and erode the performance of a trading strategy. However, human judgement remains important when determining whether it is sensible and logical to modify a strategy's parameters in response to external influences.

Strategy Formulation

There are two main approaches to coming up with a trading strategy. The first is the data mining approach: we apply numerous parameters to a time series and use whichever appear to work in a strategy. Whilst this may appear to be effective, it is difficult and time consuming to identify the reason behind any later erosion in performance. This results in arbitrary, non-logical optimisation, where further irrelevant parameters are added that may produce profits only in the short term, leading to a cycle of poor performance.

In our opinion, a better approach is the hypothesis testing method. Traders first observe trends and correlations and come up with a hypothesis. The null hypothesis is the random walk hypothesis. We then apply statistical tests, such as Variance-Ratio tests, to see whether the data reject the null in favour of our hypothesis. This test and further statistical models will be explained in future blog posts. Always remember that a statistical test never proves a hypothesis: even if the null cannot be rejected at the chosen level of significance, the hypothesis could still be true. Try using a larger dataset, and think of additional relevant parameters to refine the hypothesis. Because the strategy rests on an explicit hypothesis, we are also able to revisit and refine it if profitability erodes in the future.
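
To make the hypothesis testing idea concrete, here is a minimal sketch of a variance ratio test in Python. It is a simplified version (overlapping returns, no small-sample bias correction, and a standard error that assumes homoskedastic returns), so treat it as an illustration rather than a production test. The function name `variance_ratio` and the simulated series are our own assumptions for the example.

```python
import numpy as np
from math import erf, sqrt


def variance_ratio(log_prices, q):
    """Simplified variance ratio test of the random walk null.

    Returns (vr, z, p_value). Under a random walk vr is close to 1;
    vr < 1 hints at mean reversion, vr > 1 at trending behaviour.
    """
    log_prices = np.asarray(log_prices, dtype=float)
    r1 = np.diff(log_prices)                 # one-period log returns
    n = r1.size
    mu = r1.mean()
    var_1 = np.sum((r1 - mu) ** 2) / n       # variance of 1-period returns

    rq = log_prices[q:] - log_prices[:-q]    # overlapping q-period returns
    var_q = np.sum((rq - q * mu) ** 2) / (rq.size * q)

    vr = var_q / var_1
    # Asymptotic standard error, assuming homoskedastic returns
    se = sqrt(2.0 * (2 * q - 1) * (q - 1) / (3.0 * q * n))
    z = (vr - 1.0) / se
    # Two-sided p-value from the standard normal distribution
    p_value = 2.0 * (1.0 - 0.5 * (1.0 + erf(abs(z) / sqrt(2.0))))
    return vr, z, p_value


rng = np.random.default_rng(0)
# Simulated random walk in log prices (the null hypothesis holds)
random_walk = np.cumsum(rng.normal(0.0, 0.01, 5000))
# Simulated white noise in log prices (strongly mean reverting)
mean_reverting = rng.normal(0.0, 0.01, 5000)

vr_rw, z_rw, p_rw = variance_ratio(random_walk, q=4)
vr_mr, z_mr, p_mr = variance_ratio(mean_reverting, q=4)
print(f"random walk:    VR={vr_rw:.3f}, p={p_rw:.3f}")
print(f"mean reverting: VR={vr_mr:.3f}, p={p_mr:.3f}")
```

A small p-value means the data reject the random walk null; a large one only means we failed to reject it, which is why a larger dataset is worth trying before discarding an idea.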

We use Python to code up our algorithms. Python is one of the fastest growing languages in the finance industry due to its development speed. We also found it the easiest to learn, and it has a rich ecosystem of libraries relevant to statistical modelling. The most useful are NumPy, SciPy, matplotlib, statsmodels, scikit-learn, IPython and pandas. We will expand on coding in future blog posts. For those with an extensive programming background, C++ is most useful in terms of speed, making it effective for High Frequency Trading. Python execution can be optimised to approach the speed of C, for example by vectorising with NumPy, though at the cost of more complex code.

Ideas can be sourced from academic journals, trading forums, finance magazines and textbooks. The aim is to establish a consistent approach to sourcing, testing and executing strategies, so that ideas arrive in consistent quantity and a framework can be developed to accept or reject them without psychological and emotional attachment. Numerous quant finance specialists recommend avoiding cognitive biases in strategy selection, which can come down to something as simple as a personal preference for certain asset classes. Such preferences must always be justified logically, using considerations such as leverage and capital constraints.

From our experience, many websites and forums refer to technical analysis for trade ideas. Technical analysis is essentially the use of signals to enter trades, drawing on behavioural finance to predict trends and mean reversion in prices. It seems to receive less emphasis in the quant finance literature. Nevertheless, programming and modelling make it possible to statistically evaluate the profitability of a strategy based on technical analysis.

Academic journals can provide strong fundamental trading ideas. However, extensive testing is needed to obtain a realistic replica of the strategy: use more up-to-date data and liquid asset classes, and take transaction costs such as fees, spread and slippage into consideration.
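
As a rough illustration of how much transaction costs can eat into a published result, the sketch below compares the gross and net return of a single long round trip. The function name and all cost figures (fee rate, half spread, slippage) are hypothetical values chosen for the example, not estimates for any real market.

```python
def round_trip_returns(entry_price, exit_price, fee_rate=0.001,
                       half_spread=0.0005, slippage=0.0002):
    """Return (gross, net) fractional returns for one long round trip.

    All cost parameters are hypothetical fractions of the mid price.
    """
    # We buy above the mid price and sell below it
    buy_price = entry_price * (1.0 + half_spread + slippage)
    sell_price = exit_price * (1.0 - half_spread - slippage)
    # Fees are charged on the traded value at both entry and exit
    cash_out = buy_price * (1.0 + fee_rate)
    cash_in = sell_price * (1.0 - fee_rate)

    gross = exit_price / entry_price - 1.0
    net = cash_in / cash_out - 1.0
    return gross, net


gross, net = round_trip_returns(100.0, 101.0)
print(f"gross: {gross:.4%}, net: {net:.4%}")
```

The more often a strategy trades, the more often this gap is paid, which is why a strategy that looks profitable in a paper can disappear once realistic costs are included.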

Strategy Evaluation

Below are important criteria we have compiled, inspired by Quantopian and Michael Halls-Moore, to assess strategy performance:


  • Approach - The main types of trading strategies we have come across fall under momentum/directional, mean reversion or hedged (market neutral). Each type has a different profit and loss profile. For instance, momentum strategies tend to rely on a small number of large winning trades to generate profits, even if the majority of trades are losing ones. Mean reversion tends to be the opposite: most trades are winners, though mis-timing the reversion can lead to severe losses. We should always be able to identify the reason behind the market movement we are trying to exploit, so we can judge whether the strategy still holds after an external event, e.g. a regulatory change. This also gives a better understanding of whether the strategy only applies to a certain asset class or time period, which avoids wasting time and resources backtesting and refining ineffective strategies. It is important that a strategy does not come with numerous parameters, as this leads to optimisation bias.
  • Parameters - If you've learnt regression then you should be familiar with optimisation bias, also known as curve fitting. In an ordinary least squares regression, adding parameters never worsens the in-sample fit, even if a parameter is almost unrelated and improves the fit by less than a percent. R squared does not consider the relevance of each individual parameter. Therefore it is important to stick with strategies with minimal parameters and ensure datasets are large enough to test such parameters.
  • Benchmark - Depending on your strategy, there is usually an appropriate benchmark to measure the strategy against average market performance. After all, there is no point in designing and optimising a strategy that returns 5% when the benchmark is returning over 10%. It is recommended to use a performance benchmark that is composed of the underlying asset class the strategy is based on. For instance, if the strategy trades the most liquid equities on the ASX, the S&P/ASX 200 is an effective benchmark. This leads to terminology such as "alpha" and "beta". Alpha is a risk adjusted measure of return: the return the strategy earned beyond what its exposure to the benchmark would explain, so a positive alpha indicates the return was worth the risk. Beta measures how sensitive the strategy's returns are to movements in the market; a beta above 1 means the strategy tends to move more than the market, and below 1 less. For fixed income funds, you should compare performance relative to a basket of bonds, fixed income products or the risk free rate.
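
For the benchmark criterion, alpha and beta can be estimated by regressing the strategy's excess returns on the benchmark's excess returns. The sketch below is one common way to do this with NumPy; the function name `alpha_beta` and the simulated return series are assumptions made for illustration.

```python
import numpy as np


def alpha_beta(strategy_returns, benchmark_returns, risk_free=0.0):
    """Estimate per-period alpha and beta by least squares.

    risk_free is the per-period risk free rate (assumed constant here).
    """
    s = np.asarray(strategy_returns, dtype=float) - risk_free
    b = np.asarray(benchmark_returns, dtype=float) - risk_free
    # Fit s = alpha + beta * b; polyfit returns [slope, intercept]
    beta, alpha = np.polyfit(b, s, 1)
    return alpha, beta


rng = np.random.default_rng(1)
# Hypothetical daily benchmark returns
bench = rng.normal(0.0004, 0.01, size=250)
# Hypothetical strategy: 1.5x the benchmark plus a small constant edge
strat = 0.0002 + 1.5 * bench

alpha, beta = alpha_beta(strat, bench)
print(f"alpha per period: {alpha:.5f}, beta: {beta:.2f}")
```

Note that the alpha returned here is per period (daily in this example), so it needs to be scaled before comparing it with annual figures.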


  • Sharpe Ratio - This metric is a reward-to-risk ratio, that is, how much return in excess of the risk free rate is achieved per unit of volatility the investor encounters. For a higher frequency strategy, volatility (the standard deviation of returns) should be sampled at a correspondingly higher frequency.
  • Volatility - This important metric is embodied in the Sharpe Ratio and is useful for identifying whether hedging should be undertaken. For instance, high volatility in the underlying assets leads to higher volatility in the equity curve, reducing the Sharpe ratio. The equity curve is a graphical representation of the changes in value of a trading account, and a positive slope implies a profitable trading strategy.
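
The Sharpe ratio and its dependence on sampling frequency can be sketched as follows. The function name, the sample returns and the choice of 252 trading days per year are assumptions for the example, and the square-root-of-time annualisation assumes returns are roughly independent from one period to the next.

```python
import numpy as np


def sharpe_ratio(returns, risk_free_per_period=0.0, periods_per_year=252):
    """Annualised Sharpe ratio from per-period returns.

    periods_per_year must match the sampling frequency of the returns
    (e.g. 252 for daily data, 12 for monthly data).
    """
    excess = np.asarray(returns, dtype=float) - risk_free_per_period
    vol = excess.std(ddof=1)          # sample standard deviation
    if vol == 0:
        return float("nan")           # undefined without any volatility
    return np.sqrt(periods_per_year) * excess.mean() / vol


# Hypothetical daily returns of a strategy
daily_returns = np.array([0.010, -0.005, 0.020, 0.000, 0.015, -0.010, 0.005])

sharpe_annual = sharpe_ratio(daily_returns, periods_per_year=252)
sharpe_per_period = sharpe_ratio(daily_returns, periods_per_year=1)
print(f"annualised Sharpe: {sharpe_annual:.2f}")
```

Using the wrong `periods_per_year` for the data's sampling frequency is an easy way to overstate or understate the ratio.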

  • Maximum Drawdown - Maximum drawdown refers to the largest peak-to-trough percentage drop on the equity curve. A run of consecutive losing trades leads to an extended drawdown, which is common in momentum strategies. It is wise to avoid giving up during such drawdowns if historical backtesting shows they are to be expected, though it is up to the trader to determine, in advance, what drawdown percentage is acceptable before exiting the strategy, similar to setting a stop loss on a trade.
  • Frequency - Frequency is related to technological skills, Sharpe ratio and transaction costs. It also introduces the concept of the technology stack, which refers to the operating system, supporting programs and runtime environments needed to run applications such as those written in Python. Higher frequency strategies are naturally more demanding in terms of programming expertise, cost and storage requirements, as intraday tick and order book data are often required.
  • Liquidity - We only have experience with highly liquid instruments when it comes to algorithmic trading. We must always ensure the strategy is scalable in case of increased capital allocation in the future. 
  • Leverage - Futures, options and swaps are leveraged derivatives. Leverage magnifies gains and losses, so positions in these instruments can be highly volatile and can easily result in margin calls, which require capital (and patience).
  • Technology - In order to generate the above metrics, a database engine is needed for data storage. Options include MySQL or Oracle, accessed via application code (R, MATLAB, Excel) that queries the database and provides users with tools to analyse the data. The code can be written in Python, as previously mentioned, or in C++ or Java.
  • Data - The type of data stored is important, as this is what the strategy will use for entry/exit points and profitability calculations. Fundamental data, that is, macro metrics such as inflation and interest rates, along with corporate actions, earnings reports, etc., is helpful in valuing a company or asset based on its fundamentals. As it does not involve time series of asset prices, storage requirements are minimal unless multiple companies are being analysed simultaneously. Asset price data is considered the norm for finance quants and can cover equities, fixed income, commodities and forex prices. If intraday data is also included, storage requirements increase significantly, and we must also consider cleaning the data for increased accuracy. Qualitative data can also be included; subscriptions to media feeds fall under this category. A newer trend in data analysis involves machine learning, using classifiers to gauge investor sentiment, and due to the qualitative nature of such data it is often stored in NoSQL document databases.
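
Several of the criteria above can be computed directly from an equity curve. As an example, here is a minimal sketch of the maximum drawdown calculation; the function name and the account values are made up for illustration.

```python
import numpy as np


def max_drawdown(equity):
    """Largest peak-to-trough fractional drop on an equity curve.

    Returns a non-positive number, e.g. -0.25 for a 25% drawdown.
    """
    equity = np.asarray(equity, dtype=float)
    # Highest account value seen up to and including each point
    running_peak = np.maximum.accumulate(equity)
    # Fractional distance below that peak at each point
    drawdowns = equity / running_peak - 1.0
    return drawdowns.min()


# Hypothetical account values over time
equity_curve = [100, 120, 90, 110, 130, 104]
mdd = max_drawdown(equity_curve)
print(f"maximum drawdown: {mdd:.1%}")  # -25.0% (from 120 down to 90)
```

Comparing the drawdown seen in live trading with the worst drawdown in the backtest is one way to decide whether a losing run is still within expectations.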

The returns of a strategy on their own do not tell us directly what is and isn't working, as they give no insight into capital requirements, leverage, volatility, benchmarks, etc. It is therefore advisable to consider the risk characteristics of a strategy prior to evaluating returns. The way we like to think of it is that returns represent the y variable, the dependent variable, and the risk characteristics help explain it. Sites such as Quantopian have backtesting platforms embedded, so these metrics are generated automatically, allowing us to focus solely on strategy formulation and optimisation.
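
Sticking with the regression view, the curve fitting warning under the Parameters criterion can be checked with a small experiment: fit a model with one relevant variable, then add twenty irrelevant ones and compare the in-sample R squared. Everything here (the simulated data, the number of noise variables, the function names) is an assumption for illustration, and adjusted R squared is shown only as one simple way to penalise extra parameters.

```python
import numpy as np


def r_squared(X, y):
    """In-sample R squared of an OLS fit of y on X (with intercept)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    residuals = y - X1 @ coef
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_obs, n_params):
    """R squared penalised for the number of explanatory variables."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_params - 1)


rng = np.random.default_rng(42)
n = 60
signal = rng.normal(size=n)                  # the one relevant variable
y = 0.5 * signal + rng.normal(size=n)        # simulated returns
noise_features = rng.normal(size=(n, 20))    # 20 irrelevant variables

r2_small = r_squared(signal.reshape(-1, 1), y)
r2_big = r_squared(np.column_stack([signal, noise_features]), y)

print(f"1 parameter:   R2={r2_small:.3f}, "
      f"adjusted={adjusted_r_squared(r2_small, n, 1):.3f}")
print(f"21 parameters: R2={r2_big:.3f}, "
      f"adjusted={adjusted_r_squared(r2_big, n, 21):.3f}")
```

The raw R squared cannot go down when the noise variables are added, even though they carry no information about the returns, which is the trap a heavily parameterised strategy falls into.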
