Using Returns in Pairs Trading

This blog article is taken from our book [1].

In most entry-level materials on pairs trading, such as [2], a mean reverting basket is usually constructed using this relationship:

\(P_t - \gamma Q_t = Z_t, \textrm{(eq. 1)}\)

where \(P_t\) is the price of asset \(P\) at time \(t\), \(Q_t\) the price of asset \(Q\) at time \(t\), and \(Z_t\) the price of the mean reverting asset to trade. One way to find \(\gamma\) is to use cointegration. There are numerous problems with this approach, as detailed in [1]. To mention a few: the identified portfolios are dense; execution involves considerable transaction costs; the resultant portfolios behave like insignificant, non-tradable noise; and cointegration is too stringent and often unnecessary a requirement to satisfy.

This article highlights one important problem: it is much better to work in the space of (log) returns than in the space of prices. Therefore, we would like to build a mean reverting portfolio using a similar relationship to (eq. 1) but in returns rather than in prices.

The Benefits of Using Log Returns

When we compare the prices of two assets, [… TODO …]

 

A Model for a Mean Reverting Synthetic Asset

Let’s assume prices are log-normally distributed, a popular assumption in quantitative finance, especially in options pricing. Then prices are always positive, satisfying the “limited liability” condition of stocks, while the upside is unlimited and may go to infinity [5]. We have:

\(P_t = P_0\exp(r_{P,t}) \\ Q_t = Q_0\exp(r_{Q,t}), \textrm{(eq. 2)}\)

where \(r_{P,t}\) is the return of asset \(P\) between times 0 and \(t\); likewise for asset \(Q\).

Instead of applying a relationship, e.g., cointegration (possible, but not a very good way), to the pair of prices, we can apply it to the returns. This is possible because, just like prices, the cumulative returns \(r_{P,t}\) and \(r_{Q,t}\) are random walks, hence \(I(1)\) series. We have (dropping the time subscript):

\(r_P - \gamma r_Q = Z, \textrm{(eq. 3)}\)

Of course, this \(\gamma\) is a different coefficient, and this \(Z\) a different white noise.
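
As a concrete illustration (this sketch is not from the original article and not AlgoQuant code), \(\gamma\) in (eq. 3) can be estimated by a no-intercept least-squares fit of \(r_P\) on \(r_Q\); the resulting spread \(Z = r_P - \gamma r_Q\) is the series whose mean reversion we test and trade. All names and data below are illustrative only.

import java.util.Arrays;

public final class ReturnSpread {

    /** Cumulative log returns r_t = log(price_t / price_0). */
    static double[] logReturns(double[] prices) {
        double[] r = new double[prices.length];
        for (int t = 0; t < prices.length; t++) {
            r[t] = Math.log(prices[t] / prices[0]);
        }
        return r;
    }

    /** No-intercept OLS slope of rP on rQ, i.e., gamma in (eq. 3). */
    static double gamma(double[] rP, double[] rQ) {
        double num = 0., den = 0.;
        for (int t = 0; t < rP.length; t++) {
            num += rP[t] * rQ[t];
            den += rQ[t] * rQ[t];
        }
        return num / den;
    }

    public static void main(String[] args) {
        double[] p = {100, 101, 99.5, 102, 103};   // toy prices for P
        double[] q = {50, 50.6, 49.9, 51.2, 51.4}; // toy prices for Q

        double[] rP = logReturns(p), rQ = logReturns(q);
        double g = gamma(rP, rQ);

        // the spread Z = rP - gamma * rQ is the series we test/trade for mean reversion
        double[] z = new double[rP.length];
        for (int t = 0; t < z.length; t++) {
            z[t] = rP[t] - g * rQ[t];
        }
        System.out.println("gamma = " + g + ", spread = " + Arrays.toString(z));
    }
}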

Remove the Common Risk Factors

Let’s consider this scenario. Suppose the oil price suddenly drops by half (as is developing in the current market). Exxon Mobil (XOM), being an oil company, follows suit. American Airlines (AAL), on the other hand, saves on fuel costs and may rise. The naive (eq. 3) will show a big disequilibrium and signal a trade on the pair. However, this disequilibrium is spurious. Both XOM and AAL are simply reacting to the new market/oil regime and adjusting their “fair” prices accordingly. (Eq. 3) fails to account for the oil factor common to both companies. Mean reversion trading should trade only on idiosyncratic risks that are not affected by systematic risks.

To improve upon (eq. 3), we need to remove systematic risks or common risk factors from the equation. Let’s consider CAPM. It says:

\(r = r_f + \beta (r_M - r_f) + \epsilon, \textrm{(eq. 4)}\)

The asset return, \(r\), and the idiosyncratic term, \(\epsilon\), are normally distributed random variables. The average market return, \(r_M\), and the risk-free rate, \(r_f\), are constants.

Substituting (eq. 4) into the L.H.S. of (eq. 3) and grouping some constants, we have:

\((r_P - \beta_P (r_M-r_f)) - \gamma (r_Q - \beta_Q (r_M-r_f)) = \epsilon + \mathrm{constant}\)

To simplify things:

\((r_P - \beta_P r_M) - \gamma (r_Q - \beta_Q r_M) = \epsilon + \gamma_0, \textrm{(eq. 5)}\)

where \(\gamma_0\) is a constant.
(Eq. 5) removes the market/oil effect from the pair. When the market simply moves to a new regime, our pair should not change its value. In general, for \(n\) assets, we have:

\(\gamma_0 + \sum_{i=1}^{n}\gamma_i (r_i - \beta_i r_M) = \epsilon, \textrm{(eq. 6)}\)

For \(n\) assets and \(m\) common risk factors, we have:

\(\gamma_0 + \sum_{i=1}^{n}\gamma_i (r_i - \sum_{j=1}^{m}\beta_{ij}F_j) = \epsilon, \textrm{(eq. 7)}\)
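
To make (eq. 5) concrete for a single pair, here is a minimal, illustrative sketch (plain Java, not AlgoQuant code) that estimates each asset’s \(\beta\) by regressing its returns on the market returns and then forms the beta-hedged residual spread. The hedge ratio \(\gamma\) and the toy return series are assumptions for illustration only.

public final class BetaHedgedSpread {

    /** OLS slope (with intercept) of y on x: an estimate of beta in CAPM. */
    static double beta(double[] y, double[] x) {
        int n = y.length;
        double mx = 0., my = 0.;
        for (int t = 0; t < n; t++) { mx += x[t]; my += y[t]; }
        mx /= n; my /= n;
        double cov = 0., var = 0.;
        for (int t = 0; t < n; t++) {
            cov += (x[t] - mx) * (y[t] - my);
            var += (x[t] - mx) * (x[t] - mx);
        }
        return cov / var;
    }

    public static void main(String[] args) {
        // toy period-by-period returns for P, Q and the market M
        double[] rP = {0.010, -0.004, 0.007, -0.012, 0.009};
        double[] rQ = {0.008, -0.003, 0.005, -0.010, 0.007};
        double[] rM = {0.006, -0.002, 0.004, -0.008, 0.005};

        double betaP = beta(rP, rM);
        double betaQ = beta(rQ, rM);
        double gamma = 1.0; // assumed hedge ratio; in practice estimated as in (eq. 3)

        // residual (idiosyncratic) spread of (eq. 5): common market moves cancel out
        for (int t = 0; t < rP.length; t++) {
            double z = (rP[t] - betaP * rM[t]) - gamma * (rQ[t] - betaQ * rM[t]);
            System.out.printf("t=%d  spread=%.6f%n", t, z);
        }
    }
}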

Trade on Dollar Values

It is easy to see that if we use (eq. 1) to trade the pair, to long (short) \(Z\), we buy (sell) 1 share of \(P\) and sell (buy) \(\gamma\) shares of \(Q\). How do we trade using (eqs. 3, 5, 6, 7)? When we work in the log-return space, we trade, for each stock \(i\), \(\gamma_i\) dollars’ worth of shares. That is, we trade for each stock \(\gamma_i/P_i\) shares, where \(P_i\) is the current price of stock \(i\).

Let’s rewrite (eq. 3) in the price space.

\(\log(P/P_0) - \gamma \log(Q/Q_0) = Z\)

The L.H.S. is

\(\log(P/P_0) - \gamma \log(Q/Q_0) = \log(1 + \frac{P-P_0}{P_0}) - \gamma \log(1 + \frac{Q-Q_0}{Q_0})\)

Using the relationship \(\log(1+r) \approx r, r \ll 1\), we have

\(\log(1 + \frac{P-P_0}{P_0}) - \gamma \log(1 + \frac{Q-Q_0}{Q_0}) \approx \frac{P-P_0}{P_0} - \gamma \frac{Q-Q_0}{Q_0} \\ = (\frac{P}{P_0} -1) - \gamma (\frac{Q}{Q_0} -1) \\ = \frac{1}{P_0}P - \gamma \frac{1}{Q_0}Q + \mathrm{constant} \\ = Z\)

Dropping the constant, we have:

\(\frac{1}{P_0}P - \gamma \frac{1}{Q_0}Q = Z, \textrm{(eq. 8)}\)

That is, to long \(Z\), we buy \(\frac{1}{P_0}\) shares of \(P\) at price \(P_0\) and sell \(\frac{\gamma}{Q_0}\) shares of \(Q\) at price \(Q_0\), i.e., one dollar’s worth of \(P\) against \(\gamma\) dollars’ worth of \(Q\). We can easily extend (eq. 8) to the general cases: we trade, for each stock \(i\), \(\gamma_i/P_i\) shares.
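
A tiny, hypothetical helper (not AlgoQuant code) makes the position sizing explicit: given the log-return weights \(\gamma_i\) and the current prices, the share counts are simply \(\gamma_i/P_i\).

public final class DollarPositions {

    /** shares_i = gamma_i / currentPrice_i */
    static double[] sharesToTrade(double[] gammas, double[] currentPrices) {
        double[] shares = new double[gammas.length];
        for (int i = 0; i < gammas.length; i++) {
            shares[i] = gammas[i] / currentPrices[i];
        }
        return shares;
    }

    public static void main(String[] args) {
        double[] gammas = {1.0, -0.8};   // e.g., long 1 dollar of P, short 0.8 dollars of Q
        double[] prices = {100.0, 50.0}; // current prices of P and Q
        double[] shares = sharesToTrade(gammas, prices);
        System.out.println(java.util.Arrays.toString(shares)); // [0.01, -0.016]
    }
}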

References:

  1. Numerical Methods in Quantitative Trading, Dr. Haksun Li, Dr. Ken W. Yiu, Dr. Kevin H. Sun
  2. Pairs Trading: Quantitative Methods and Analysis, Ganapathy Vidyamurthy
  3. Identifying Small Mean Reverting Portfolios, Alexandre d’Aspremont
  4. Developing high-frequency equities trading models, Infantino
  5. The Econometrics of Financial Markets, John Y. Campbell, Andrew W. Lo, & A. Craig MacKinlay

On Some Practical Issues when Using AlgoQuant to Compute the Markowitz Efficient Frontier

Markowitz suggested in his Nobel Prize-winning paper, Markowitz (1952), that when one selects a portfolio, one should consider both the return and the risk of the portfolio. Most of us, if not all, are risk-averse: if there are two portfolios with the same return but different risks (in this article, by risk we mean the standard deviation of the portfolio return), we would choose the one with the smaller risk without any hesitation. Therefore, given a set of risky assets and an expected return, we are interested in finding their best combination, i.e., the weights that minimize the risk of the portfolio. If we find the minimum risk for every level of return, we can draw a curve on the risk-return plane. This curve is the famous efficient frontier.

Assume there are \(n\) risky assets whose return vector and covariance matrix are \(R\) and \(\Sigma\), respectively. The points on the efficient frontier are then computed by solving the following problem:

\(\min_{w\in\mathbb{R}^{n}} w^{\top}\Sigma w \)

\(\text{s.t.} \sum_{i=1}^{n}w_{i}=1\)

\(R^{\top}w=\mu\)

where \(\mu\) is the pre-defined expected return and \(w\) is the weight vector. The above problem can be solved using Lagrange multipliers. We denote this problem as “Problem 1”.
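
For completeness, here is a sketch of the standard Lagrange-multiplier solution of Problem 1 (a textbook derivation, not anything AlgoQuant-specific). Writing \(\mathbf{1}\) for the vector of ones, the Lagrangian is

\(\mathcal{L}(w,\lambda_{1},\lambda_{2}) = w^{\top}\Sigma w + \lambda_{1}(1-\mathbf{1}^{\top}w) + \lambda_{2}(\mu-R^{\top}w)\)

Setting the gradient with respect to \(w\) to zero gives

\(2\Sigma w = \lambda_{1}\mathbf{1} + \lambda_{2}R \quad\Rightarrow\quad w = \tfrac{1}{2}\Sigma^{-1}(\lambda_{1}\mathbf{1} + \lambda_{2}R)\)

Substituting this \(w\) back into the two constraints \(\mathbf{1}^{\top}w=1\) and \(R^{\top}w=\mu\) yields a 2-by-2 linear system in \(\lambda_{1}\) and \(\lambda_{2}\), so each point on the frontier is available in closed form.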

In AlgoQuant, we use another approach to compute the efficient frontier. The problem we solve is based on the utility function:

\(\min_{w\in\mathbb{R}^{n}} q\times w^{\top}\Sigma w-R^{\top}w \)

\(\text{s.t.} \sum_{i=1}^{n}w_{i}=1\)

\(R^{\top}w=\mu\)

\(R\), \(\Sigma\), \(w\) and \(\mu\) are the same parameters as in Problem 1. The newly added parameter, \(q\), is the risk-aversion coefficient. This problem is denoted “Problem 2”.

The larger the \(q\), the less risk the investor is willing to take. Although most of us are risk-averse, the degree of risk aversion differs among individuals. As a result, a coefficient that describes the degree of risk aversion is introduced. Note that some papers use a risk-tolerance coefficient instead, for example Steinbach (2001). Risk tolerance is the reciprocal of risk aversion, and it is applied to the return term in the objective function rather than to the risk term. For usages of the risk-aversion coefficient in portfolio optimization, please see page 75 of Lee and Lee (2010) and page 159 of Bodie et al. (2008).

It can be seen that Problem 2 and Problem 1 are equivalent: with the constraint \(R^{\top}w=\mu\), the second term in Problem 2’s objective function is a constant, so it does not affect the optimization result. If Problem 2 and Problem 1 are equivalent, why even bother including the risk-aversion coefficient?

When computing the efficient frontier, the two problems are equivalent. However, the solutions of the two problems differ when the constraint \(R^{\top}w=\mu\) is removed.

 

Problem 1′:

\(\min_{w\in\mathbb{R}^{n}} w^{\top}\Sigma w \)

\(\text{s.t.} \sum_{i=1}^{n}w_{i}=1\)

Problem 2′:

\(\min_{w\in\mathbb{R}^{n}} q\times w^{\top}\Sigma w-R^{\top}w \)

\(\text{s.t.} \sum_{i=1}^{n}w_{i}=1\)

The solution of Problem 1′ is the minimum variance portfolio: the portfolio on the efficient frontier with the smallest variance (i.e., the leftmost point on the curve). On the other hand, the solution of Problem 2′ is the optimal portfolio given the utility function \(R^{\top}w-q\times w^{\top}\Sigma w\). This optimal portfolio is also on the efficient frontier. Thus different values of \(q\) lead to different optimal portfolios on the efficient frontier. When \(q\) is infinity, the return term in the objective function is negligible compared to the risk term; the only consideration in the optimization is the risk, so the portfolio corresponding to \(q=\infty\) is the minimum variance portfolio. As \(q\) decreases to 0, the weight of the risk term in the objective function also decreases, and the corresponding portfolio moves up and to the right along the efficient frontier. Finally, when \(q\) is 0, we are just maximizing the expected return, without any constraint on the risk.
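
To make the minimum variance portfolio concrete, here is a small self-contained sketch (plain Java, not the AlgoQuant API) that computes the closed-form solution of Problem 1′, \(w = \Sigma^{-1}\mathbf{1} / (\mathbf{1}^{\top}\Sigma^{-1}\mathbf{1})\), for the same covariance matrix as in the appendix example below.

import java.util.Arrays;

public final class MinimumVariancePortfolio {

    /** Solves A x = b by Gaussian elimination with partial pivoting (A is small and dense). */
    static double[] solve(double[][] A, double[] b) {
        int n = b.length;
        double[][] M = new double[n][n + 1];
        for (int i = 0; i < n; i++) {
            System.arraycopy(A[i], 0, M[i], 0, n);
            M[i][n] = b[i];
        }
        for (int col = 0; col < n; col++) {
            int piv = col;
            for (int r = col + 1; r < n; r++) {
                if (Math.abs(M[r][col]) > Math.abs(M[piv][col])) piv = r;
            }
            double[] tmp = M[col]; M[col] = M[piv]; M[piv] = tmp;
            for (int r = col + 1; r < n; r++) {
                double f = M[r][col] / M[col][col];
                for (int c = col; c <= n; c++) M[r][c] -= f * M[col][c];
            }
        }
        double[] x = new double[n];
        for (int r = n - 1; r >= 0; r--) {
            double s = M[r][n];
            for (int c = r + 1; c < n; c++) s -= M[r][c] * x[c];
            x[r] = s / M[r][r];
        }
        return x;
    }

    public static void main(String[] args) {
        // the same covariance matrix as in the appendix example
        double[][] sigma = {
            {0.1, 0.03, -0.08, 0.05},
            {0.03, 0.2, 0.02, 0.03},
            {-0.08, 0.02, 0.3, 0.2},
            {0.05, 0.03, 0.2, 0.9}
        };
        double[] ones = {1, 1, 1, 1};

        double[] x = solve(sigma, ones); // x = sigma^{-1} * 1
        double sum = 0.;
        for (double v : x) sum += v;
        double[] w = new double[x.length];
        for (int i = 0; i < x.length; i++) w[i] = x[i] / sum; // normalize so the weights sum to 1

        System.out.println("minimum variance weights = " + Arrays.toString(w));
    }
}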

From the above discussion, it can be seen that varying \(q\) in Problem 2′ can also trace out the efficient frontier. However, this is not the approach used in AlgoQuant, because the points move along the curve very slowly as \(q\) increases. For example, the portfolios corresponding to \(q = 0.1\) and \(q = 10\) are very close on the efficient frontier. As a result, varying the expected return in Problem 2 is a more convenient way to draw the efficient frontier.

When \(q\) is unknown, AlgoQuant provides a method to find the optimal \(q\). In AlgoQuant’s MarkowitzPortfolio class, there is a method called getOptimalRiskAversionCoefficient. This method tries different risk-aversion coefficients and selects the one whose corresponding optimal portfolio has the largest Sharpe ratio, given a risk-free rate. The corresponding portfolio is the tangency portfolio on the efficient frontier.

So far three different portfolios have been discussed: the minimum variance portfolio, the tangency portfolio, and the optimal portfolio. They are connected as follows. The optimal portfolio is found by solving Problem 2′ given a risk-aversion coefficient \(q\), and every portfolio on the efficient frontier is an optimal portfolio for some \(q\). The optimal portfolio with the largest Sharpe ratio (given a risk-free rate) is the tangency portfolio, and the optimal portfolio with the smallest risk is the minimum variance portfolio. Moreover, the weights of any optimal portfolio are a linear combination of the weights of the minimum variance portfolio and the tangency portfolio.

By accepting an extra input parameter \(q\), AlgoQuant has more flexibility to give a client a specific optimal portfolio: if a client knows his risk-aversion coefficient \(q\), the optimal portfolio corresponding to that \(q\) can be computed by AlgoQuant; if he doesn’t, AlgoQuant can always recommend a portfolio on the efficient frontier according to his expected return.

Reference:

Appendix:

The following code is an example of computing the efficient frontier in AlgoQuant. In this example, Problem 2 is solved with different expected returns and a fixed \(q\).

public void generateFrontierUsingAlgoQuant() throws Exception {
    System.out.println("generating efficient frontier");
    Matrix sigma = new DenseMatrix(new double[][]{
        {0.1, 0.03, -0.08, 0.05},
        {0.03, 0.2, 0.02, 0.03},
        {-0.08, 0.02, 0.3, 0.2},
        {0.05, 0.03, 0.2, 0.9}
    });
    final Vector mu 
            = new DenseVector(new double[]{0.08, 0.09, 0.1, 0.11});

    double q = 1;
    double[] expRs = DoubleUtils.seq(0.07, 0.14, 0.005);
    double[] returns = new double[expRs.length];
    double[] stdevs = new double[expRs.length];
    int i = 0;
    final int n = mu.size();
    for (final double expR : expRs) {
        // we constrain the expected return to compute the point
        // on the frontier
        WeightConstraints expectedReturnConstraint 
                = getExpReturnConstraint(mu, expR);

        MarkowitzPortfolio mp = new MarkowitzPortfolio(mu,
                sigma, expectedReturnConstraint);
        mp.setRiskAversionCoefficient(q);
        double ret = mp.getPortfolioReturn();
        double stdev = Math.sqrt(mp.getPortfolioVariance());
        Vector w = mp.getWeights();

        System.out.println("exp return = " + expR);
        System.out.println("q = " + q);
        System.out.println("w = " + w);
        System.out.println("return = " + ret + " = "
                + w.innerProduct(mu));
        System.out.println("stdev = " + stdev + " = "
                + Math.sqrt(w.innerProduct(sigma.multiply(w))));

        returns[i] = ret;
        stdevs[i] = stdev;
        i++;
    }

    // print out for plotting in Excel
    System.out.println("Summary:");
    System.out.println("  returns = ");
    for (double ret : returns) {
        System.out.println(ret);
    }
    System.out.println("  stdevs = ");
    for (double std : stdevs) {
        System.out.println(std);
    }
}

private WeightConstraints getExpReturnConstraint(
        final Vector mu, 
        final double expR) {
    return new WeightConstraints() {
        @Override
        public LinearGreaterThanConstraints getLinearGreaterThanConstraints() {
            return null;
        }

        @Override
        public LinearLessThanConstraints getLinearLessThanConstraints() {
            return null;
        }

        @Override
        public LinearEqualityConstraints getLinearEqualityConstraints() {
            return new LinearEqualityConstraints(
                    new DenseMatrix(
                    mu.minus(expR).toArray(), 1, mu.size()),
                    new DenseVector(0.));
        }
    };
}

The Role of Technology in Quantitative Trading Research

I posted my presentation titled “The Role of Technology in Quantitative Trading Research” presented in

You can find the powerpoint here.

Abstract:

A technology is needed to streamline the quantitative trading research process. Typically, quants/traders may take weeks, if not months, to go from idea generation to strategy deployment. This means not only lost trading opportunities, but also a lengthy, tedious, error-prone process marred with ad-hoc decisions and primitive tools. From the organization’s perspective, comparing the paper performances of different traders is like comparing apples to oranges. The success of the firm relies on hiring the right geniuses. Our solution is a technological process that standardizes and automates most of the mechanical steps in quantitative trading research. Creating a new trading strategy should be as easy and fun as playing with Legos, assembling simpler ideas into bigger ones. Consequently, traders can focus their attention on what they are supposed to be best at: imagining new trading ideas/strategies.

Excerpts:

  • In reality, the research process for a quantitative trading strategy, from conceptual design to actual execution, is very time consuming, e.g., months. The backtesting step, in the broadest sense, takes the longest time. There are too many details that we can include in the backtesting code, to name just a few: data cleaning and preparation, mathematical algorithms, mock market simulation, execution and slippage assumptions, parameter calibration, sensitivity analysis, and, worst of all, debugging. In practice, most people ignore many details and make unfortunate “approximations”. This is one major reason why real and paper P&Ls differ.
  • Before AlgoQuant, there was no publicly available quantitative trading research platform that relieves quants/traders of coding up those “infrastructural” components. Most existing tools either lack extensive built-in math libraries, lack modern programming language support, or lack plug-and-play “trading toolboxes”.
  • Technology can change the game by enhancing productivity. Imagine a system that automates all those tedious and mundane tasks (data cleaning, mock market, calibration, and mathematics) and runs them for you on a parallel grid of hundreds of CPUs. You could save 80% of your coding time and focus your attention on trading ideas and analysis. Jim, using Matlab, may find a successful trading strategy in 3 months. You, equipped with the proper technology, may find 3 strategies in a month! The success of a hedge fund shall no longer rely on hiring geniuses.
  • After we code up a strategy and choose a parameter set, there is a whole suite of analyses we can run and many measures we can compute to evaluate the strategy. For instance, we can see how the strategy performs on historical data, on simulated data generated from Monte Carlo simulation (parametric) or bootstrapping (non-parametric), as well as on scenario data (hand crafted). We can construct the P&L distribution (it is unfortunate that historical P&L seems to be the popular performance measure; we traders do not really care about what we could have made in the past but only about our bonuses in the future, so what we really want to see is the distribution of future P&L, with its uncertainty, not the historical P&L); we can do sensitivity analysis of the parameters; we can compute the many performance statistics. All of these are very CPU-intensive tasks. Using AlgoQuant, you simply feed your strategy into the system. AlgoQuant runs all these tasks on a parallel grid and generates a nice report for you.
  • The academic community publishes very good papers on quantitative trading strategies. Unfortunately they are by-and-large unexplored. First, they are very difficult to understand because they are written for peer reviewers not laymen. Second, they are very difficult to reproduce because most authors do not publish source code. Third, they are very difficult to apply in real trading because the source code is not meant for public use, even if available.

Mean Reversion vs. Trend Following

AlgoQuant 0.0.5 just got released!

This release is particularly exciting because you no longer need a license to use AlgoQuant. AlgoQuantCore now disappears forever. The source of the entire AlgoQuant project is now available:
http://www.numericalmethod.com/trac/numericalmethod/browser/algoquant

Maybe even more exciting is that we ship this release with two quantitative trading strategies: one mean reversion strategy and one trend following strategy. More information can be found here:
http://www.numericalmethod.com/trac/numericalmethod/wiki/AlgoQuant#TradingStrategies

The question remains: when do you do mean reversion, and when do you do trend following?

I will leave this to the reader to figure it out. 😀

 

Strategy Optimization

Trading strategy optimization is an important aspect of, and a challenging problem in, algorithmic trading. It requires determining a set of optimal solutions with respect to multiple objectives, where the objective functions are often multimodal, non-convex, and non-smooth. Moreover, the objective functions are subject to various constraints, many of which are typically non-linear and discontinuous. Conventional methods, such as Nelder-Mead and BFGS, cannot cope with these realistic problem settings. A solution is the stochastic search heuristic called differential evolution. In this blog piece, I will look at the performance issue of applying differential evolution to optimizing a quantitative trading strategy.

Briefly, differential evolution (DE) is a genetic algorithm that searches for a function minimum by maintaining a pool of candidate parameter sets (local searches) and combining them to try to escape local minima. When applying differential evolution to optimize a trading strategy, we need to maintain and simulate a pool of parameter sets. A typical procedure looks like this (a runnable sketch of the DE loop itself follows the pseudo-code):

Decide the parameters, x, of an algorithmic trading strategy S(x);
Choose an objective function, e.g., Sharpe-ratio, or Omega;
Choose a calibration window, L, e.g., most recent 12 months;
Choose a calibration frequency, f, e.g., every 3 months;
For each f month {
	Prepare the data set for the last L months;
	Initialize a pool of different parameter sets;
	For each DE iteration {
		Apply the DE operators to generate the next generation parameter sets;
		For each parameter set {
			Simulate the strategy;
			Measure the performance;
		}

		Select the best performing parameter sets;
	}

	Simulate trading the strategy using the optimized parameters from the DE optimization algorithm;
}
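
For readers unfamiliar with the algorithm itself, here is a minimal, generic sketch of the DE inner loop (illustrative Java, unrelated to AlgoQuant’s implementation). In strategy optimization the objective would be, e.g., the negative omega of a backtest over the calibration window; here a toy function stands in for it.

import java.util.Random;
import java.util.function.ToDoubleFunction;

public final class DifferentialEvolutionSketch {

    static double[] minimize(ToDoubleFunction<double[]> objective,
                             double[] lower, double[] upper,
                             int populationSize, int generations, long seed) {
        Random rng = new Random(seed);
        int dim = lower.length;
        double F = 0.8, CR = 0.9; // standard DE control parameters

        // initialize the pool of parameter sets uniformly within the bounds
        double[][] pop = new double[populationSize][dim];
        double[] fitness = new double[populationSize];
        for (int i = 0; i < populationSize; i++) {
            for (int d = 0; d < dim; d++) {
                pop[i][d] = lower[d] + rng.nextDouble() * (upper[d] - lower[d]);
            }
            fitness[i] = objective.applyAsDouble(pop[i]); // e.g., simulate the strategy
        }

        for (int g = 0; g < generations; g++) {
            for (int i = 0; i < populationSize; i++) {
                // pick three distinct members different from i
                int a, b, c;
                do { a = rng.nextInt(populationSize); } while (a == i);
                do { b = rng.nextInt(populationSize); } while (b == i || b == a);
                do { c = rng.nextInt(populationSize); } while (c == i || c == a || c == b);

                // mutation plus binomial crossover, clamped to the bounds
                double[] trial = pop[i].clone();
                int jRand = rng.nextInt(dim);
                for (int d = 0; d < dim; d++) {
                    if (d == jRand || rng.nextDouble() < CR) {
                        double v = pop[a][d] + F * (pop[b][d] - pop[c][d]);
                        trial[d] = Math.min(upper[d], Math.max(lower[d], v));
                    }
                }

                // greedy selection: keep the better of parent and trial
                double trialFitness = objective.applyAsDouble(trial);
                if (trialFitness <= fitness[i]) {
                    pop[i] = trial;
                    fitness[i] = trialFitness;
                }
            }
        }

        int best = 0;
        for (int i = 1; i < populationSize; i++) {
            if (fitness[i] < fitness[best]) best = i;
        }
        return pop[best];
    }

    public static void main(String[] args) {
        // toy objective standing in for "minus omega of a backtest"
        double[] best = minimize(
                x -> Math.pow(x[0] - 50, 2) + Math.pow(x[1] - 200, 2),
                new double[]{2, 20}, new double[]{100, 400},
                16, 80, 1234L);
        System.out.printf("best parameters ~ (%.1f, %.1f)%n", best[0], best[1]);
    }
}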

There are two, among many, difficulties in running the above procedure as a script in Matlab/R. The first one is coding. My personal opinion is that it requires Herculean effort to simulate non-trivial trading strategies (or any complex systems) in Matlab/R and make sure that the simulation is correct. There is simply no easy process or tool available to systematically and automatically check and test the correctness of the scripts. Worse, most quants/traders cannot code. They know nothing about software testing. In practice, most quants/traders simply think/assume/fantasize that their code is correct. Actually, many programmers in the financial industry cannot code either. (The good ones go to GooG, M$FT, AAPL… or startups.)

More importantly, the second difficulty, which this discussion focuses on, is performance. It is more than obvious that, in general, executing Matlab/R scripts is very slow: Matlab/R interprets a script line by line. For our algorithmic trading strategy optimization script, the bottlenecks are:

  1. Each strategy simulation is slow because of looping over times (or dates, or ticks).
  2. We need to simulate the strategy many times in each DE iteration.
  3. We run many of these DE iterations, as in any genetic algorithm.

In other words, if you try to code up the above pseudo-code in Matlab/R, it may take days, if it finishes at all, to come up with results. If your trading strategy works with tick-by-tick data, then it is hopeless.

To speed up the process, we can run the differential evolution algorithm in parallel on a grid. Matlab/R does allow you to parallelize your code. However, realistically, I have never seen any quants/traders writing parallel Matlab/R code. Even if they wanted to do it, few, if any, understand multi-threaded concurrent programming well enough to get the code right. In addition, we can simply move away from Matlab/R to a real programming language, e.g., C++/C#/Java. Compiled code always runs faster than scripts. (Please!!! No VBA. Only managers use it…)

Algo Quant addresses the trading strategy optimization problem by

  1. Enabling easy coding of a complex trading strategy in Java. To code a strategy, a quant/trader simply puts together components, e.g., signals, mathematics, event handlers, from the library.
  2. Optimizing a strategy or a portfolio by running the differential evolution optimization algorithm in parallel.

I demonstrate the performance of Algo Quant using a simple moving average crossover strategy. This strategy maintains two moving averages, a fast moving average and a slow moving average. When the fast moving average crosses the slow moving average from below, we enter a long position; when the fast moving average crosses the slow moving average from above, we enter a short position. There is theoretical and empirical evidence that both the fast and slow moving window sizes should not be too big. See Haksun’s course note, lecture 6.
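
For concreteness, the crossover rule itself is only a few lines of code. The following is an illustrative sketch (not the strategy shipped with AlgoQuant, which is linked further below): +1 when the fast SMA crosses the slow SMA from below, -1 when it crosses from above, 0 otherwise.

public final class SmaCrossoverSignal {

    /** Simple moving average of the last 'window' prices ending at index t (inclusive). */
    static double sma(double[] prices, int t, int window) {
        double sum = 0.;
        for (int i = t - window + 1; i <= t; i++) {
            sum += prices[i];
        }
        return sum / window;
    }

    /** Crossover signals; the first slowWindow entries are 0 because the slow SMA is undefined there. */
    static int[] signals(double[] prices, int fastWindow, int slowWindow) {
        int[] sig = new int[prices.length];
        for (int t = slowWindow; t < prices.length; t++) {
            double fastPrev = sma(prices, t - 1, fastWindow);
            double slowPrev = sma(prices, t - 1, slowWindow);
            double fastNow = sma(prices, t, fastWindow);
            double slowNow = sma(prices, t, slowWindow);
            if (fastPrev <= slowPrev && fastNow > slowNow) {
                sig[t] = +1; // fast crosses slow from below: go long
            } else if (fastPrev >= slowPrev && fastNow < slowNow) {
                sig[t] = -1; // fast crosses slow from above: go short
            }
        }
        return sig;
    }

    public static void main(String[] args) {
        double[] prices = {10, 10.2, 10.1, 10.4, 10.6, 10.3, 10.0, 9.8, 9.9, 10.5, 10.9};
        int[] sig = signals(prices, 2, 5); // toy (fast, slow) window sizes
        System.out.println(java.util.Arrays.toString(sig));
    }
}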

Most traders choose the fast and slow moving window sizes by 1) guessing, 2) hearsay, 3) backtesting (with a few more or less randomly chosen parameters) in Bloomberg. Some traders recalibrate periodically by updating the parameters.

Let’s first try the hearsay approach. A popular choice is (50, 200). I simulate this strategy trading the S&P 500 futures, VFINX, from 2000/1/1 to 2011/5/31 with the Yahoo data. The source code is here.

Here is the P&L generated using Algo Quant.

P&L for a simple moving average crossover strategy

We have: pnl = 50.4; sharpe = 0.080829; omega = 1.285375

There is no reason that (50, 200) is the optimal parameter set or that it even works at all. Intuitively, if we periodically update the pair (more flexibility), we could potentially generate a better P&L. For example, I trade this simple moving average crossover strategy using the optimal parameters, dynamically calibrated every 3 months using the data from the last 12 months. As the objective function to determine optimality, I use omega. The source code is here.

As expected, this dynamically calibrated strategy does generate a (much) better P&L than the static strategy using randomly guessed parameters.

P&L for a dynamically calibrated simple moving average crossover strategy

We have: pnl = 106.65; sharpe = 0.309824; omega = 2.816514

In terms of performance (computing speed), it took 32.5 minutes (1954877 ms) to finish the simulation over 10 and a half years on my workstation (dual E5520 @ 2.27GHz) with 12GB of memory. (As a side note, using the other parallel brute force algorithm took only 6.6 minutes or 395334 ms.) Our parallel differential evolution algorithm maintains a pool of 16 parameter sets. In each of the 80 iterations, I run 16 simulations in parallel, one on each core. The picture shows that my computer is working hard, utilizing all its resources to search for the historically optimal parameter sets. Redoing this simulate-test-and-pick procedure in Matlab/R would probably take days, if not forever.

CPU usage

As a disclaimer, I do not claim that this dynamically calibrated SMA crossover strategy generates alpha on the S&P 500 future. In fact, I just happened to pick (f = 3, L = 12) by chance. My point is to compare Algo Quant’s performance of the parallel differential evolution on strategy optimization to that of Matlab/R.

The Right (and Wrong) Way to Run an Algorithmic Trading Group

I would like to share with you the unique vision that Numerical Method Inc. has about running an algorithmic trading group. To get an edge over competing funds, we emphasize 1) the research process and 2) technology, rather than hiring more intelligent people.

Currently, the majority of quant funds run like cheap arcade booths: the traders are given workstations and data, and they do whatever it takes to crank out strategies. The only contributing factor to profit is luck: luck in finding the right people and/or luck in finding the right strategies. I had a conversation with an executive from a large financial organization two years ago when they started to build an algorithmic trading group. He said, “Haksun, we need to hire some very smart people to be better than Renaissance.” I repeatedly hear something similar from various portfolio managers and hedge fund owners.

Staffing “very smart” people in this ad-hoc, unmethodical, non-scientific process in search of profitable trading strategies is merely a lottery in disguise.

  1. The main reason is that having “very smart” people is not a necessary condition for “very profitable” trading strategies. If there is any relationship at all, it is at best a sufficient condition.
  2. It is very difficult to hire the “very smart” people because
    1. They are difficult to identify among the numerous pretentiously smart people.
    2. The competition for them is a very fierce battle. The best examples are the fight between Microsoft and Google over Dr. Lee Kai-Fu, and the dispute between Renaissance Technologies and Millennium Partners over two former employees.
  3. The very best people are driven by passion rather than money, e.g., Gates, Jobs. They cannot be hired.

The key to success in running an algorithmic trading group, as in warfare, lies in process (tactics) and technology (weapons), not in star traders. Analogously, knights, despite being elite warriors, were superseded by cheap infantry; German tanks, despite being the best engineered of WWII, were outnumbered by cheap USSR tanks. The better traders do not get their profitable strategies from higher beings. Speaking as an AI scientist, they are “better” only because their search covers a bigger space (more knowledge) and is faster (more efficient). We have created a process and a technology that enable a good trader to be just as profitable as the “very smart” traders.

Process

In terms of the algorithmic trading research process, there is usually very little standardization even within one firm, let alone in the industry. Most quant fund houses do not invest in building research technology (with the famous exception of Blackrock).

This is best illustrated by the languages they use to do backtesting. When there are 6 traders, they could be using Matlab, R, VBA, C++, Java, and C#. The first consequence of this everyone-uses-their-favorite-language approach is that there is absolutely no sharing of code. Such a firm would write the same volatility calculation function 6 times. Suppose trader A comes up with a new way of measuring volatility; trader B cannot leverage it. Trader C cannot quickly prototype a new trading idea by combining the mean-reverting signal from trader D with the trend following signal from trader E. The productivity is very low.

The management is not able to compare strategies from two traders. The 6 traders all make different “simplifications” in backtesting. For instance, they clean the same data set in different ways; they use their own “proprietary” bid-ask, execution, and (market vs. limit) order models for computing historical P&L; they make different assumptions about liquidity and market impact. Mainly to simplify coding, they make all sorts of guesses about the details of executing their strategies. The management does not have the time to question every single detail in their backtesting, hence the lack of understanding and confidence. They simply resort to “trusting” the reports.

Worse, while algorithmic traders may be excellent mathematicians, they are usually bad programmers. I have yet to see an algorithmic trader who understands modern programming concepts such as interface vs. inheritance, memory models, thread safety, design patterns, testing, and software engineering. They usually produce code that is very long, unstructured, and poorly documented. The code must have bugs. Therefore, the management cannot trust their backtesting results and the performance reports.

Our solution to make the trading research process systematic has three steps. Firstly, we mandate that all traders do their backtesting in the same language, e.g., Java. Secondly, we mandate that all traders contribute their code to a research library. Thirdly, the firm invests in a common backtesting infrastructure by expanding this research library. The advantages are the following.

  1. The traders can focus on what they are supposedly good at – generating innovative trading strategies. They no longer bother with the IT grunt work.
  2. They can quickly prototype a trading idea by putting together components, e.g., signals, filters, modules, from the research library.
  3. They can share code with colleagues and be understood because all conform to the same standard.
  4. They can compare strategies because the performance measures are computed from simulations making the same assumptions.

Over time, as the algorithmic trading firm invests in, creates, and expands the research library and backtesting system, this IT infrastructure becomes its most valuable asset. Suppose a star trader takes 3 months to test a good idea and make it profitable. With this standardized process and technology, a good trader is able to rapidly prototype 30 strategies in 3 days in parallel on a cluster of computers. The profitability of the firm depends not on hiring Einstein, but on good and hardworking people leveraging the infrastructure.

Technology

There are many vendors that sell backtesting tools. However, IMHO, these backtesters are no more than augmented for-loops over historical data. Some better ones come with various features, e.g., importing data, cleaning data, statistics and signal computation, optimization, real-time paper trading. Some even go one step further and provide brokerage services for real trading. The major problem with all these backtesters is that you cannot code, hence simulate, true quantitative trading strategies with them. Suppose you want to replicate Elliott and Malcolm’s pair trading strategy; you will need to work with SDEs, the EM algorithm, MLE, and the Kalman filter. Most commercial backtesters are built for amateurs and simply cannot do it.

There are a few more professional backtesters that provide a link to Matlab or R for data analytics. I have a lot of complaints about doing data analysis in Matlab/R. Many traders use them only because they cannot code. (It is extremely rare to find someone who is mathematically talented and can code.) Matlab/R is for amateurs; VBA is a joke. The problems are:

  1. They are very slow. These scripting languages are interpreted line by line. They are not built for parallel computing. (OK, I know Matlab/R can do parallel computing, but who uses it in practice? Which traders understand immutability well enough to write good parallel code?)
  2. They do not handle large amounts of data well. How do you handle two years’ worth of EUR/USD tick-by-tick data in Matlab/R?
  3. There are no modern software engineering tools built for Matlab/R. How do you know your code is correct? Most traders just assume their code is correct because they are not trained programmers.
  4. The code cannot be debugged easily. OK, Matlab comes with a toy debugger somewhat better than gdb. It does not compare to NetBeans, Eclipse, or IntelliJ IDEA.
  5. How do you debug a large Matlab/R application with 50 pages of scripts anyway? I usually just give up.

You can tell that Matlab/R is not fit for financial modelling simply by observing that no serious person/bank/fund does option pricing in Matlab/R. My former employers did not price our portfolios using Matlab/R on my workstation; they deployed a few thousand computers in a grid to run the pricing code over the whole night! Likewise, we need a cluster of computers to run our trading models for many different scenarios with many different parameters before we are confident enough to bet a few hundred million on them.

Algo Quant is a first attempt by Numerical Method Inc. to solve the technology problem. Algo Quant is an embodiment of the systematic algorithmic trading research process that we discussed. Algo Quant is a suite of algorithmic trading research tools (with source code) for trading idea development, quick prototyping, data preparation, in-sample calibration, out-of-sample backtesting, and even automatic trading strategy generation. Algo Quant is backed by SuanShu, a powerful modern library of mathematics, statistics and data mining. For more information, please check out this.

If you are a passionate quant/trader/programmer who shares our vision to revolutionize the algorithmic trading industry, please join us! If you are a hedge fund that wants to hire us to implement this scientific trading research process in house, please contact us.

Gain Capital FX rate data

For most non-professional FX traders, one hurdle to starting your own algorithmic trading research is tick-by-tick data. Gain Capital Group has been very kind and does the community a very good service by providing historical rate data on their website: http://ratedata.gaincapital.com/. The data look OK.
Unfortunately, one serious problem that catches the eye is the very ad-hoc formats of the data files. For example,
  • The newer csv files store data by weeks (e.g., in year 2010) while some older csv files store data by quarters (e.g., in year 2000).
  • Some zipped files contain folders while others contain csv files.
  • Some zipped files contain other zipped files.
  • Some csv files have headers while others don’t.
  • The orderings of the columns are not the same across all csv files. Some csv files even have missing columns.
  • The timestamp formats are not all the same.
  • Worst of all, the csv files have different encodings!
In order to process these “raw” zipped data files into a useful and, more importantly, usable format for research, I recommend the following steps.
  1. First unzip the raw zipped files recursively until all the plain csv files are stored in one single folder. This handles the situations where zipped files are nested inside zipped files and folders inside folders. (A minimal sketch of this step appears after this list.)
  2. Read each csv file using an appropriate parser (depending on the encoding, format, headers, etc.), row by row.
  3. Group the rows of the same date together and write them out as one csv file, e.g., AUDCAD.2007-10-7.csv.zip.
  4. Zip the csv file for easy archive.
  5. Write an adapter to read the processed zipped csv files and convert the data into whatever format your backtesting tool reads.
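
For illustration, here is a minimal, hypothetical sketch of step 1 using only java.util.zip. It is not the AlgoQuant implementation mentioned below; the folder names are assumptions, and it assumes file names do not collide when the folders are flattened.

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public final class RecursiveCsvExtractor {

    /** Unzips 'zip' into 'outputDir', flattening folders and recursing into nested zips. */
    static void extract(File zip, File outputDir) throws IOException {
        try (ZipFile zf = new ZipFile(zip)) {
            Enumeration<? extends ZipEntry> entries = zf.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.isDirectory()) {
                    continue; // folders are flattened; only files matter
                }
                String name = new File(entry.getName()).getName(); // drop folder prefixes
                File out = new File(outputDir, name);
                try (InputStream in = zf.getInputStream(entry);
                     FileOutputStream os = new FileOutputStream(out)) {
                    in.transferTo(os);
                }
                if (name.toLowerCase().endsWith(".zip")) {
                    extract(out, outputDir); // a zip inside a zip: recurse
                    out.delete();            // keep only the extracted csv files
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        File outputDir = new File("./csv-flat");        // hypothetical output folder
        outputDir.mkdirs();
        File rawDir = new File("./gaincapital-raw");    // hypothetical folder of downloaded zips
        for (File f : rawDir.listFiles((d, n) -> n.endsWith(".zip"))) {
            extract(f, outputDir);
        }
    }
}
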
I have written some code to automate these steps, and it is now available in Algo Quant for free. Specifically, the important Java classes are:
  1. GainCapitalFxCsvFilesExtractor.java: unzip all csv files into one folder
  2. GainCapitalFXRawFile.java: read the Gain Capital csv files of various formats, save and zip the data by dates.
  3. GainCapitalFXRawFile.java: read the zipped csv data files.
  4. GainCapitalFXCacheFactory.java: the data feed/adapter to read the zipped csv data files into the Algo Quant backtesting system.
Here is a sample usage:
To unzip the raw zipped files from Gain Capital, we do
    public void test_0010() {
        GainCapitalFxCsvFilesExtracter extracter = new GainCapitalFxCsvFilesExtracter();
        String outputDirName = "./log/tmpGCcsv";
        String tmpDirName = "./log/tmpDir";
        int nCSV = extracter.extract("./test-data/gaincapital/2000", outputDirName,
            tmpDirName);
        assertEquals(18, nCSV);

        // clean up
        NMIO.deleteDir(new File(outputDirName));
        NMIO.deleteDir(new File(tmpDirName));
    }
To extract the rows from the raw files, then group, save, and zip them by dates, we do
    public void test_0030() throws IOException {
        String input = "C:/Temp/GCcsv";
        String output = "C:/Temp/byDates";

        File outputDir = new File(output);
        if (!outputDir.canWrite()) {
            outputDir.mkdir();
        }

        File inputDir = new File(input);

        //reading the file names
        File[] csvs = inputDir.listFiles(new FilenameFilter() {

            public boolean accept(File dir, String name) {
                return name.matches(".*[.]csv");
            }
        });

        //these pairs were already processed in a previous run, e.g., before the program crashed
        Set<String> done = new HashSet<String>();
//        done.add("AUD_USD");
//        done.add("CHF_JPY");

        Set<String> pairs = new HashSet<String>();
        for (File csv : csvs) {
            String pair = csv.getName().substring(0, 7);
            if (pair.matches("..._...")) {
                if (!done.contains(pair)) {
                    pairs.add(pair);
                }
            }
        }

        GainCapitalFxDailyCsvZipFileConverter converter = new
              GainCapitalFxDailyCsvZipFileConverter();
        for (String pair : pairs) {
            String ccy1 = pair.substring(0, 3);
            String ccy2 = pair.substring(4, 7);
            System.out.println(String.format("working on %s/%s", ccy1, ccy2));
            converter.process(
                    ccy1, ccy2,
                    input,
                    String.format("%s/%s%s", output, ccy1, ccy2));
        }
    }
Have fun using the Gain Capital FX data for your own trading research!