Using Returns in Pairs Trading

This blog article is taken from our book [1].

In most entry-level materials on pairs trading, such as [2], a mean reverting basket is usually constructed via this relationship:

\(P_t - \gamma Q_t = Z_t, \textrm{(eq. 1)}\)

, where \(P_t\) is the price of asset \(P\) at time \(t\), \(Q_t\) the price of asset \(Q\) at time \(t\), and \(Z_t\) the price of the mean reverting asset to trade. One way to find \(\gamma\) is to use cointegration. There are numerous problems with this approach, as detailed in [1]. To mention a few: the identified portfolios are dense; executions involve considerable transaction costs; the resultant portfolios behave like insignificant and non-tradable noise; and cointegration is too stringent, and often unnecessary, a requirement to satisfy.

This article highlights one important problem: it is much better to work in the space of (log) returns than in the space of prices. Therefore, we would like to build a mean reverting portfolio using a similar relationship to (eq. 1) but in returns rather than in prices.

The Benefits of Using Log Returns

When we compare the prices of two assets, [… TODO …]

 

A Model for a Mean Reverting Synthetic Asset

Let's assume prices are log-normally distributed, which is a popular assumption in quantitative finance, especially in options pricing. Then prices are always positive, satisfying the "limited liability" condition of stocks, while the upside is unlimited and may go to infinity [5]. We have:

\(P_t = P_0\exp(r_{P,t}) \\ Q_t = Q_0\exp(r_{Q,t}), \textrm{(eq. 2)}\)

\(r_{P,t}\) is the return for asset \(P\) between times 0 and t; likewise for asset \(Q\).

Instead of applying a relationship such as cointegration (possible, but not a very good way) to the pair of prices, we can apply it to the returns. This is possible because the cumulative returns \(r_{P,t}\) and \(r_{Q,t}\) are, just like prices, random walks and hence \(I(1)\) series. We have (dropping the time subscript):

\(r_P - \gamma r_Q = Z, \textrm{(eq. 3)}\)

Of course, this \(\gamma\) is a different coefficient, and this \(Z\) a different white noise.
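
To make (eq. 3) concrete, here is a minimal Java sketch (my own illustration, not code from [1] or AlgoQuant; it assumes a plain OLS slope estimate and toy prices) that builds the log returns of (eq. 2) and estimates \(\gamma\) by regressing \(r_P\) on \(r_Q\):

// A minimal sketch (assumptions: plain OLS, no stationarity testing of the residual).
public final class ReturnsHedgeRatio {

    // Log returns r_t = log(price_t / price_0), as in eq. 2.
    static double[] logReturns(double[] prices) {
        double[] r = new double[prices.length];
        for (int t = 0; t < prices.length; t++) {
            r[t] = Math.log(prices[t] / prices[0]);
        }
        return r;
    }

    // OLS slope of rP on rQ (with intercept): gamma = cov(rP, rQ) / var(rQ).
    static double hedgeRatio(double[] rP, double[] rQ) {
        double meanP = 0, meanQ = 0;
        for (int t = 0; t < rP.length; t++) { meanP += rP[t]; meanQ += rQ[t]; }
        meanP /= rP.length;
        meanQ /= rQ.length;
        double cov = 0, var = 0;
        for (int t = 0; t < rP.length; t++) {
            cov += (rP[t] - meanP) * (rQ[t] - meanQ);
            var += (rQ[t] - meanQ) * (rQ[t] - meanQ);
        }
        return cov / var;
    }

    public static void main(String[] args) {
        double[] p = {100, 101, 103, 102, 105};    // toy prices for P
        double[] q = {50, 50.4, 51.5, 51.2, 52.6}; // toy prices for Q
        double gamma = hedgeRatio(logReturns(p), logReturns(q));
        System.out.println("gamma = " + gamma);
        // Z = r_P - gamma * r_Q is then the candidate mean reverting series (eq. 3).
    }
}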

Remove the Common Risk Factors

Let's consider this scenario. Suppose the oil price suddenly drops by half (as is developing in the current market). Exxon Mobil (XOM), being an oil company, follows suit. American Airlines (AAL), on the other hand, saves on fuel costs and may rise. The naive (eq. 3) will show a big disequilibrium and signal a trade on the pair. However, this disequilibrium is spurious. Both XOM and AAL are simply reacting to the new market/oil regime and adjusting their "fair" prices accordingly. (Eq. 3) fails to account for the oil factor common to both companies. Mean reversion trading should trade only on idiosyncratic risks that are not affected by systematic risks.

To improve upon (eq. 3), we need to remove systematic risks or common risk factors from the equation. Let’s consider CAPM. It says:

\(r = r_f + \beta (r_M - r_f) + \epsilon, \textrm{(eq. 4)}\)

The asset return, \(r\), and the idiosyncratic term, \(\epsilon\), are normally distributed random variables. The average market return, \(r_M\), and the risk-free rate, \(r_f\), are constants.

Substituting (eq. 4) into the L.H.S. of (eq. 3) and grouping some constants, we have:

\((r_P - \beta_P (r_M - r_f)) - \gamma (r_Q - \beta_Q (r_M - r_f)) = \epsilon + \mathrm{constant}\)

To simplify things:

\((r_P - \beta_P r_M) - \gamma (r_Q - \beta_Q r_M) = \epsilon + \gamma_0, \textrm{(eq. 5)}\)

where \(\gamma_0\) is a constant.
(Eq. 5) removes the market/oil effect from the pair. When the market simply moves to a new regime, our pair should not change its value. In general, for \(n\) assets, we have:

\(\gamma_0 + \sum_{i=1}^{n}\gamma_i (r_i - \beta_i r_M) = \epsilon, \textrm{(eq. 6)}\)

For \(n\) assets and \(m\) common risk factors, we have:

\(\gamma_0 + \sum_{i=1}^{n}\gamma_i (r_i - \sum_{j=1}^{m}\beta_{ij}F_j) = \epsilon, \textrm{(eq. 7)}\)
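
To illustrate (eqs. 5-7), here is a minimal sketch (again my own toy code, assuming the betas have already been estimated, e.g., by regressing each asset's returns on the market) that strips the market factor from each asset's returns, leaving the idiosyncratic returns on which the mean reverting combination should be built:

// A minimal sketch for a single market factor as in eq. 5/6. Extending to m factors
// as in eq. 7 just subtracts a sum of beta_ij * F_j,t instead of beta_i * rM_t.
public final class ResidualReturns {

    // residual[i][t] = r_i,t - beta_i * rM_t
    static double[][] removeMarketFactor(double[][] returns, double[] beta, double[] marketReturns) {
        int n = returns.length;       // number of assets
        int T = marketReturns.length; // number of observations
        double[][] residual = new double[n][T];
        for (int i = 0; i < n; i++) {
            for (int t = 0; t < T; t++) {
                residual[i][t] = returns[i][t] - beta[i] * marketReturns[t];
            }
        }
        return residual;
    }

    public static void main(String[] args) {
        double[][] r = {
            {0.010, -0.020, 0.015},  // asset 1 returns (toy numbers)
            {0.008, -0.018, 0.012}   // asset 2 returns (toy numbers)
        };
        double[] beta = {1.2, 0.9};            // CAPM betas, assumed pre-estimated
        double[] rM = {0.005, -0.015, 0.010};  // market returns
        double[][] residual = removeMarketFactor(r, beta, rM);
        // The mean reverting combination in eq. 6 is built on these residuals,
        // i.e., gamma_0 + sum_i gamma_i * residual[i][t].
        System.out.println(java.util.Arrays.deepToString(residual));
    }
}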

Trade on Dollar Values

It is easy to see how to trade the pair using (eq. 1): to long (short) \(Z\), we buy (sell) 1 share of \(P\) and sell (buy) \(\gamma\) shares of \(Q\). How do we trade using (eqs. 3, 5, 6, 7)? When we work in the log-return space, we trade a dollar value of \(\gamma_i\) in each stock \(i\). That is, for each stock we trade \(\gamma_i/P_i\) shares, where \(P_i\) is the current price of stock \(i\).

Let’s rewrite (eq. 3) in the price space.

\(\log(P/P_0) - \gamma \log(Q/Q_0) = Z\)

The L.H.S. is

\(\log(P/P_0) - \gamma \log(Q/Q_0) = \log(1 + \frac{P-P_0}{P_0}) - \gamma \log(1 + \frac{Q-Q_0}{Q_0})\)

Using the approximation \(\log(1+r) \approx r, r \ll 1\), we have

\(\log(1 + \frac{P-P_0}{P_0}) - \gamma \log(1 + \frac{Q-Q_0}{Q_0}) \approx \frac{P-P_0}{P_0} - \gamma \frac{Q-Q_0}{Q_0} \\ = (\frac{P}{P_0} - 1) - \gamma (\frac{Q}{Q_0} - 1) \\ = \frac{1}{P_0}P - \gamma \frac{1}{Q_0}Q + \mathrm{constant} \\ = Z\)

Dropping the constant, we have:

\(\frac{1}{P_0}P - \gamma \frac{1}{Q_0}Q = Z, \textrm{(eq. 8)}\)

That is, to long \(Z\) we buy \(\frac{1}{P_0}\) shares of \(P\) at price \(P_0\) and sell \(\frac{\gamma}{Q_0}\) shares of \(Q\) at price \(Q_0\). We can easily extend (eq. 8) to the general cases: for each stock \(i\), we trade \(\gamma_i/P_i\) shares.
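
As a small sketch of this recipe (the notional scaling and variable names are my own assumptions), converting the return-space weights \(\gamma_i\) into share counts is a one-liner per stock:

// A minimal sketch: converting return-space weights gamma_i into share counts (eq. 8).
public final class DollarValueTrading {

    // shares_i = notional * gamma_i / P_i, where P_i is the current price of stock i.
    static double[] toShares(double[] gamma, double[] prices, double notional) {
        double[] shares = new double[gamma.length];
        for (int i = 0; i < gamma.length; i++) {
            shares[i] = notional * gamma[i] / prices[i];
        }
        return shares;
    }

    public static void main(String[] args) {
        double[] gamma = {1.0, -0.8};     // e.g., long P, short gamma of Q
        double[] prices = {105.0, 52.6};  // current prices P_0, Q_0
        double[] shares = toShares(gamma, prices, 10000); // 10,000 of notional per unit weight
        System.out.println(java.util.Arrays.toString(shares));
    }
}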

References:

  1. Numerical Methods in Quantitative Trading, Dr. Haksun Li, Dr. Ken W. Yiu, Dr. Kevin H. Sun
  2. Pairs Trading: Quantitative Methods and Analysis, Ganapathy Vidyamurthy
  3. Identifying Small Mean Reverting Portfolios, Alexandre d’Aspremont
  4. Developing high-frequency equities trading models, Infantino
  5. The Econometrics of Financial Markets, John Y. Campbell, Andrew W. Lo, & A. Craig MacKinlay

Change of Measure/Girsanov’s Theorem Explained

Change of Measure, or Girsanov's Theorem, is an important theorem in real analysis and quantitative finance. Unfortunately, I never really understood it until long after I had left school. I blamed the professors and the textbook authors, of course. The textbook version usually goes like this.

Given a probability space \((\Omega,\mathcal{F},P)\) and a non-negative random variable Z satisfying \(\mathbb{E}(Z) = 1\) (why 1?), we define a new probability measure Q by the formula, for all \(A \in \mathcal{F}\),

\(Q(A) = \int _A Z(\omega)dP(\omega)\)

Any random variable X, a measurable process adapted to the natural filtration of \(\mathcal{F}\), now has two expectations: one under the original probability measure P, denoted \(\mathbb{E}_P(X)\), and the other under the new probability measure Q, denoted \(\mathbb{E}_Q(X)\). They are related to each other by the formula

\(\mathbb{E}_Q(X) = \mathbb{E}_P(XZ)\)

If \(P(Z > 0) = 1\), then P and Q agree on the null sets. We say Z is the Radon-Nikodym derivative of Q with respect to P, and we write \(Z = \frac{dQ}{dP}\). To remove the mean, μ, of a normal random variable (and, by extension, the drift of a Brownian motion), we take X standard normal under P and define

\(Z=\exp \left ( -\mu X - \frac{1}{2} \mu^2 \right )\)

Then, under the probability measure Q, the random variable Y = X + μ is standard normal. In particular, \(\mathbb{E}_Q(Y) = 0\) (so what?).

This text made no sense to me when I first read it in school. It was very frustrating that the text was filled with unfamiliar terms like probability space and adaptation, and scary symbols like integration and \(\frac{dQ}{dP}\). (I knew what \(\frac{dy}{dx}\) meant when y was a function and x a variable. But what on earth was dQ over dP?)

Now that I have become a professor teaching students in finance and financial math, I get rid of all the jargon and rigor. I focus on the intuition rather than the mathematical details (traders are not mathematicians). Here is my layman's version.

We are given a probability measure P. A probability measure is just a function that assigns numbers to random outcomes, e.g., 0.5 to head and 0.5 to tail for a fair coin. There could be another measure Q that assigns different numbers to the head and tail, say, 0.6 and 0.4 (an unfair coin)! Assume P and Q are equivalent, meaning that they agree on which events are possible (positive probability) and which events have probability 0. Is there a relation between P and Q? It turns out the answer is a resounding yes!

Let's define \(Z=\frac{Q}{P}\). Z here is a function, because P and Q are just functions. In the coin example, Z evaluates to 0.6/0.5 = 1.2 on a head and 0.4/0.5 = 0.8 on a tail. Then we have

\(\mathbb{E}_Q(X) = \mathbb{E}_P(XZ)\)

This is intuitively true by doing some symbol cancellation. Forget about the proof, even though it is quite easy (about two lines); we traders don't care about proofs. Therefore, the distribution of X under Q is (by plugging the indicator function into the last equation):

\(Q(X \in A) = \mathbb{E}_Q(I(X \in A)) = \mathbb{E}_P(I(X \in A)Z)\)

Moreover, setting X = 1, we have (recall that Z is itself a random variable):

\(\mathbb{E}_Q(1) = 1 = \mathbb{E}_P(Z)\)
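
As a quick sanity check with the coin example, let X pay 1 on a head and 0 on a tail (my own choice of payoff). Then \(\mathbb{E}_P(XZ) = 0.5 \times 1 \times \frac{0.6}{0.5} + 0.5 \times 0 \times \frac{0.4}{0.5} = 0.6 = \mathbb{E}_Q(X)\), and \(\mathbb{E}_P(Z) = 0.5 \times \frac{0.6}{0.5} + 0.5 \times \frac{0.4}{0.5} = 1\), as required.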

These results hold in general, in particular for Gaussian random variables and hence Brownian motion. Suppose we have a random (i.e., stochastic) process generated by (adapted to) a Brownian motion and it has a drift μ under a probability measure P. We can find an equivalent measure Q so that, under Q, this random process has zero drift. Wiki has a picture that shows the same random process under the two different measures: each of the 30 paths in the picture has a different probability under P and Q.

The change of measure, Z, is a function of the original drift (as would be guessed) and is given by:

\(Z=\exp \left ( -\mu X - \frac{1}{2} \mu^2 \right )\)

For a zero-drift process, hence no expected increment, the expectation of the future value of the process is the same as the current value (a layman's way of saying that the process is a martingale). Therefore, with the ability to remove the drift of any random process (by finding a suitable Q using the Z formula), we are ready to do options pricing.

Now, if you understand my presentation and go back to the textbook version, you should have a much better understanding and an easier read, I hope.


Trading and Investment as a Science

Here is the synopsis of my presentation at HKSFA, September 2012. The presentation can be downloaded from here.

1.

Many people lose money playing the stock market. The strategies they use are nothing but superstitions. There is no scientific reason why, for example, buying on a breakout of the 250-day moving average would make money. Trading profits do not come from wishful thinking, ad-hoc decisions, gambling, and hearsay, but from diligent, systematic study.
• Moving average as a superstitious trading strategy.

2.

Many professionals make money playing the stock market. One approach to investment decisions or trading strategies is to treat them as a science. Before we make the first trade, we want to know how much money we expect to make. We want to know in what situations the strategy will make (or lose) money, and how much.
• Moving average as a scientific trading strategy

3.

There are many mathematical tools and theories that we can use to quantify, analyse, and verify a trading strategy. We will showcase some popular ones.
• Markov chain (a trend-following strategy)
• Cointegration (a mean-reversion strategy)
• Stochastic differential equations (the best trading strategy, ever!)
• Extreme value theory (risk management, stop-loss)
• Monte Carlo simulation (what are the success factors in a trading strategy?)

Data Mining

The good quant trading models reveal the nature of the market; the bad ones are merely statistical artifacts.

One of the most popular ways to create a spurious trading model is data snooping, or data mining. Suppose we want to create a model to trade AAPL daily. We download some data, e.g., 100 days of AAPL prices, from Yahoo. If we work hard enough with the data, we will find a curve (model) that explains the data very well. For example, the following curve perfectly fits the data.

Suppose the prices are \(\{ x_1, x_2, \dots, x_n \}\) at times \(1, 2, \dots, n\). The fitted curve is:

\(x(t) = \frac{(t-2)(t-3)\dots(t-n)}{(1-2)(1-3)\dots(1-n)}x_1 + \frac{(t-1)(t-3)\dots(t-n)}{(2-1)(2-3)\dots(2-n)}x_2 + \dots + \frac{(t-1)(t-2)\dots(t-n+1)}{(n-1)(n-2)\dots(n-n+1)}x_n\)
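
This is just the Lagrange interpolation polynomial through the n observed points. A minimal Java sketch (my own toy illustration with made-up numbers, not real AAPL data) shows that it reproduces every in-sample point exactly, which is exactly why it says nothing about the future:

// A minimal sketch: Lagrange interpolation "perfectly fits" any n in-sample points.
public final class PerfectFit {

    // Evaluate the Lagrange polynomial through points (1, x[0]), ..., (n, x[n-1]) at time t.
    static double lagrange(double[] x, double t) {
        int n = x.length;
        double sum = 0;
        for (int i = 1; i <= n; i++) {
            double basis = 1;
            for (int j = 1; j <= n; j++) {
                if (j != i) basis *= (t - j) / (double) (i - j);
            }
            sum += basis * x[i - 1];
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] prices = {10, 12, 11, 15, 14};  // toy "prices", not real data
        for (int t = 1; t <= prices.length; t++) {
            // Fits every observed point exactly...
            System.out.printf("t=%d observed=%.1f fitted=%.1f%n", t, prices[t - 1], lagrange(prices, t));
        }
        // ...but the extrapolation one step ahead is typically wild, i.e., worthless out of sample.
        System.out.printf("t=%d extrapolated=%.1f%n", prices.length + 1, lagrange(prices, prices.length + 1));
    }
}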

Of course, most of us are judicious enough to avoid this obviously over-fitting formula. Unfortunately, some may fall into the same trap in disguise. Let's say we want to understand what factors contribute to the AAPL price movements or returns. (We now have 99 returns.) We come up with a list of 99 possible factors, such as P/E, capitalization, dividends, etc. One very popular method to find significant factors is linear regression. So, we have

\(r_t = \alpha + \beta_1f_{1t} + \dots + \beta_{99}f_{99t} + \epsilon_t\)

Guess how well this fits? The goodness-of-fit (R-squared) turns out to be 100%, a perfect fit! It can be proved that this regression is complete nonsense: with 99 observations and 99 regressors plus an intercept, the system can be solved exactly, so the residuals are zero by construction. Even if we throw in random values for those 99 factors, we will still end up with a perfect-fit regression. Consequently, the coefficients and t-stats mean nothing.
Could we do a "smaller" regression on a small subset of factors, e.g., one factor at a time, and hope to identify the most significant factor? This step-wise regression turns out to be spurious as well. For a large enough pool of factors, there is a high probability of finding (the most) "significant" factors even when the factor values are randomly generated.

Suppose we happen to regress returns on capitalization only and find that this factor is significant. Even so, we may in fact be doing some form of data snooping. This is because there are thousands of other people testing the same or different factors using the same data set, i.e., AAPL prices from Yahoo. This community, taken as a whole, is doing exactly the step-wise regression described in the last paragraph. In summary, empirical evidence alone is not sufficient to justify a trading model.

To avoid data snooping in designing a trading strategy, Numerical Method Inc. recommends to our clients a four-step procedure.

  1. Hypothesis: we start with an insight, a theory, or a common sense about how the market works.
  2. Modeling: translate the insight in English into mathematics (in Greek).
  3. Application: in-sample calibration and out-of-sample backtesting.
  4. Analysis: understand and explain the winning vs. losing trades.

In steps 1 and 2, we explicitly write down the model assumptions, deduce the model properties, and compute the P&L distribution. We prove that, under those assumptions, the strategy will always make money (on average). Whether these assumptions are true can be verified against data using techniques such as hypothesis testing. Given the model parameters, we know exactly how much money we expect to make. This is all done before we even look at a particular data set. In other words, we avoid data snooping by touching the data set only at the calibration step, after we have already created the trading model.

An example of creating a trend following strategy using this procedure can be found in lecture 1 of the course “Introduction to Algorithmic Trading Strategies”.

 

The Role of Technology in Quantitative Trading Research

I have posted my presentation titled "The Role of Technology in Quantitative Trading Research".

You can find the powerpoint here.

Abstract:

A technology is needed to streamline the quantitative trading research process. Typically, it takes quants/traders weeks, if not months, to go from idea generation to strategy deployment. This means not only lost trading opportunities, but also a lengthy, tedious, error-prone process marred with ad-hoc decisions and primitive tools. From the organization's perspective, comparing the paper performances of different traders is like comparing apples to oranges. The success of the firm relies on hiring the right geniuses. Our solution is a technological process that standardizes and automates most of the mechanical steps in quantitative trading research. Creating a new trading strategy should be as easy and fun as playing with Legos, assembling simpler ideas together. Consequently, traders can focus their attention on what they are supposed to be best at: imagining new trading ideas/strategies.

Excerpts:

  • In reality, the research process for a quantitative trading strategy, from conceptual design to actual execution, is very time consuming, e.g., months. The backtesting step, in the broadest sense, takes the longest time. There are too many details that we can include in the backtesting code, to name just a few: data cleaning and preparation, mathematical algorithms, mock market simulation, execution and slippage assumptions, parameter calibration, sensitivity analysis, and, worst of all, debugging. In practice, most people will ignore many details and make unfortunate "approximations". This is one major reason why real and paper P&L's differ.
  • Before AlgoQuant, there was no publicly available quantitative trading research platform that relieves quants/traders from coding up those "infrastructural" components. Most of the existing tools are either lacking extensive built-in math libraries, or lacking modern programming language support, or lacking plug-and-play "trading toolboxes".
  • Technology can change the game by enhancing productivity. Imagine a system that automates, on a parallel grid of hundreds of CPUs, all those tedious and mundane tasks: data cleaning, mock market simulation, calibration, and mathematics. You could save 80% of the coding time and focus your attention on trading ideas and analysis. Jim, using Matlab, may find a successful trading strategy in 3 months. You, equipped with the proper technology, may find 3 strategies in a month! The success of a hedge fund shall no longer rely on hiring geniuses.
  • After we code up a strategy and choose a parameter set, there is a whole suite of analysis that we can run and many measures that we can compute to evaluate the strategy. For instance, we can see how the strategy performs on historical data, on simulated data generated from Monte Carlo simulation (parametric) or bootstrapping (non-parametric), as well as on scenario data (hand crafted). We can construct the P&L distribution. (It is unfortunate that historical P&L seems to be the popular performance measure; we traders do not really care about what we could have made in the past but only about our bonuses in the future, so what we really want to see is the future P&L distribution, with its uncertainty, not the historical P&L.) We can do sensitivity analysis of the parameters; we can compute the many performance statistics. All these are very CPU-intensive tasks. Using AlgoQuant, you simply feed your strategy into the system. AlgoQuant runs all these tasks on a parallel grid and generates a nice report for you.
  • The academic community publishes very good papers on quantitative trading strategies. Unfortunately, they are by and large unexplored. First, they are very difficult to understand because they are written for peer reviewers, not laymen. Second, they are very difficult to reproduce because most authors do not publish source code. Third, they are very difficult to apply in real trading because the source code, even if available, is not meant for public use.

Mean Reversion vs. Trend Following

AlgoQuant 0.0.5 just got released!

This release is particularly exciting because you no longer need a license to use AlgoQuant. AlgoQuantCore now disappears forever. The source of the entire AlgoQuant project is now available:
http://www.numericalmethod.com/trac/numericalmethod/browser/algoquant

Maybe even more exciting is that we ship this release with two quantitative trading strategies: one mean reversion strategy and one trend following strategy. More information can be found here:
http://www.numericalmethod.com/trac/numericalmethod/wiki/AlgoQuant#TradingStrategies

The question remains: when do you do mean reversion, and when do you do trend following?

I will leave this to the reader to figure it out. 😀

 

Quantitative Trading: Economist Approach vs. Mathematician Approach

Thank you, Lewis, for introducing me to the field of "Quantitative Equity Portfolio Management" (QEPM). It opened my eyes to the other end of the spectrum of "quantitative trading." Apparently, what Lewis considers quantitative trading is very different from what I consider quantitative trading. I call the former the economist approach and the latter the mathematician approach. This blog piece makes a very brief comparison and points out some new research directions that take advantage of both.

Briefly, the economist approach is a two-step approach. The first step tries to predict the exceptional excess return, alpha, by examining its relationships with macroeconomic factors, such as momentum, dividends, growth, and fundamentals. The second step is capital allocation. The focus in the economist approach is on identifying the "right" economic factors. The mathematics employed is relatively simple: linear regression, (constrained) quadratic programming. The trading horizon is month-on-month, quarter-on-quarter, or even years. An example is the factor model in QEPM.

In contrast, the mathematician approach tries to predict short-term price movements by building sophisticated mathematical models for, e.g., the price time series. The focus is on finding the right mathematics to better describe the statistical properties of the price process, e.g., stochastic calculus, Markov chains. Macroeconomic and fundamental factors are not often used. The trading horizon is intra-day or even seconds. An example is volatility arbitrage on different intraday time scales.

One way to appreciate the differences is to look at the trading horizons. When trading at high frequency, the company fundamentals certainly have little relevance because, e.g., the quarterly earnings do not change second-by-second. The statistical properties of the price process dominate at these time scales. As we increase the trading horizon to days, months, quarters, and even years, the macroeconomic information becomes more relevant and important.

Recent research has combined the advantages of both: the utilization of macroeconomic information from the economist approach and the sophistication of mathematical modeling from the mathematician approach. One example of such a hybrid approach is a Markov switching model on dual-beta modeling. The trading horizon is daily. This trading strategy imposes advanced modeling on the time-series properties of beta, which itself is computed from macroeconomic information using economic theory such as CAPM.

 

Java vs C++ performance

It is very unfortunate that some people are still not aware of the fact that Java performance is comparable to that of C++. This blog piece collects the evidence to support this claim.

The wrong perception about Java's slowness is by and large because Java 1 in 1995 was indeed slower than C++. Java has improved a lot since then, e.g., HotSpot. It is now version 6 and will soon be version 7. Java is now a competitive technology compared to C/C++. In fact, in order to realistically optimize C/C++, you need to find the "right" programmer to code it. This programmer needs to be aware of all the performance issues of C/C++, profiling, and code optimization techniques such as loop unrolling, and may even need to write code snippets in assembly. An average Joe coding in C/C++ is probably not any faster than one coding in Java.

(I am in general against code optimization techniques because they make the code unreadable to humans, hence unmaintainable, such as a lot of the FORTRAN/C/C++ code found in Netlib and Statlib.)

More importantly, most modern software runs on multiple cores. Code optimization techniques are dwarfed by parallel computing technologies. It is significantly easier and more efficient (and more enjoyable) to write concurrent programming code in Java than in C++. Therefore, to code high performance software, I personally prefer to code for multi-core, multi-CPU, and cloud in Java rather than doing code optimization in C/C++.
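
As a tiny illustration of how little ceremony multi-core code needs in Java (a toy example of mine, not a claim about any particular trading workload), here is a sketch that fans independent simulations out over all cores with the standard ExecutorService:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// A toy sketch: running independent simulations on all cores with an ExecutorService.
public final class ParallelDemo {
    public static void main(String[] args) throws Exception {
        int nTasks = 1000;
        ExecutorService pool = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

        List<Future<Double>> results = new ArrayList<Future<Double>>();
        for (int i = 0; i < nTasks; i++) {
            final int seed = i;
            results.add(pool.submit(new Callable<Double>() {
                public Double call() {
                    return simulateOnePath(seed); // stand-in for an expensive, independent simulation
                }
            }));
        }

        double total = 0;
        for (Future<Double> f : results) {
            total += f.get(); // block until each task finishes
        }
        pool.shutdown();
        System.out.println("average = " + total / nTasks);
    }

    static double simulateOnePath(int seed) {
        return new java.util.Random(seed).nextGaussian();
    }
}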

(I AM NOT SURE WHY FORTRAN SURVIVES IN 2011. HOW ARE YOU SUPPOSED TO READ THOUSANDS OF LINES OF CODE ALL IN UPPER/LOWER CASE WITH A BUNCH OF C'S AND GOTO'S EVERYWHERE?)

Briefly, my conclusion is that, among the general purpose programming languages (hence excluding Scala, etc.), we should use Java instead of C/C++, FORTRAN, assembly, etc. whenever possible, because Java is the easiest programming language to learn and work with, without a sacrifice in performance.

(For me, Java is an easier language than C# because the Java IDE technology is far better than its C# counterparts.)

The evidence I have collected is listed below. Please feel free to expand the list.

  1. Java has recently won some major benchmark competitions.
    1. http://developer.yahoo.com/blogs/hadoop/posts/2008/07/apache_hadoop_wins_terabyte_sort_benchmark/
    2. http://developer.yahoo.com/blogs/hadoop/posts/2009/05/hadoop_sorts_a_petabyte_in_162/
    3. http://news.cnet.com/8301-13846_3-10242392-62.html
  2. Recent independent studies seem to show that Java performance for high performance computing (HPC) is similar to FORTRAN on computation intensive benchmarks.
  3. http://blog.cfelde.com/2010/06/c-vs-java-performance/
  4. http://www.amazon.com/Fixed-Income-Analytics-Developer-Circa/lm/R3FV39FJRU3FE9

 

Open Source Trading Software Or Not?

I have built a few trading systems from scratch during my years with investment banks, so I have learnt from the many mistakes made. I am recently reviewing some open source systems, and would like to share some thoughts.

In general, I am against building software in house. Funds and banks are not software firms. They should focus on making money instead of IT development. Producing software is best left to Microsoft, Google, Apple, and the IT vendors. These good folks spend their lifetimes doing nothing other than building software. There is no chance that a (small) financial organization can build better systems than they do.

Now, there is an exception to this rule of thumb. When it comes to building trading systems, I suggest that we do it in house. Alternatively, you can hire a vendor and retain them on a contract basis to co-develop the trading system, if you do not have the proper talent in house. (It is notoriously difficult to hire very talented programmers in finance because the top quality guys go to GOOG, MSFT, AAPL, or startups.)

The worst is to download an open source trading system and simply hope that it will work. Let's examine why. Firstly, for a piece of software as complicated as a trading system, it is unlikely that the downloaded code works out of the box. There are often too many configurations to do. Even with proper documentation (which is a big assumption), there is still too much plumbing to set up before you can make the first trade, e.g., connectivity to exchanges, data sources, news and announcements, writing and testing strategies in their framework, book-keeping, middle office, clearance. Chances are that you will end up paying for the vendor's services to speed up the setup process.

Secondly, you may think that you have the option to edit their (open) source code if you like. The reality is that you probably cannot do so. There are two problems. The first problem is that, without the authors walking through the code with you, it is very difficult to understand the source code, the architecture, the assumptions, the hacks and quick-fixes, and even the comments. For the open source trading system that I am playing with, I don't think I could reasonably edit it without spending a month or two studying it, unassisted.

The second and biggest problem is that, even if you can edit the code to your taste, how do you merge your changes with the vendor's future releases? The vendor knows absolutely nothing about your changes. Whenever they make an update, you will have to manually merge all the changes, assuming that this is even possible. You will then find yourself either 1) rewriting the same code over and over again or 2) maintaining a branch separate from the vendor's altogether. Neither is satisfactory. Chances are that you will end up paying the vendor to write the code for you.

Thirdly, assuming that you cannot reasonably edit their code even though it is open source, you and your trading strategies must then work under their framework and with all their assumptions and limitations. There is no one standard way to build a trading system. There are too many possibilities: order routing, order placement, order cancellation, order filling, error handling, alerting.

Take order cancel-replace as an example. There are at least two ways to do it. Number one: you cancel an order, wait for a confirmation, and then place a new order. The problem is that this method is very slow, so you may miss the trading opportunity. You may even miss the confirmation, so that you never send out the replacement order. Number two: you place the new order before sending in the cancellation message. The problem in this case is that you may over-fill your order when both the old and new orders are executed. There is no single "correct" way to do it. It all depends on the traders and the trading strategies being executed. Another detail is whether you still want to cancel the order if it is already partially filled. The open source trading code I am reading now does not address this at all. Chances are that you will end up paying the vendor to customize the code for your applications.
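
To make the two sequencings concrete, here is a hypothetical sketch; the OrderGateway interface and its method names are made up for illustration only and do not correspond to any real system:

// A hypothetical sketch of the two cancel-replace sequencings discussed above.
interface OrderGateway {
    void cancel(long orderId);
    boolean awaitCancelAck(long orderId, long timeoutMillis); // true if the ack arrives in time
    long place(String symbol, double price, int quantity);    // returns the new order id
}

final class CancelReplace {

    // Approach 1: cancel, wait for the confirmation, then place the new order.
    // Safe against over-filling, but slow; the opportunity (or the ack) may be missed.
    static long cancelThenReplace(OrderGateway gw, long oldId, String symbol, double px, int qty) {
        gw.cancel(oldId);
        if (!gw.awaitCancelAck(oldId, 500)) {
            return -1; // no ack: do NOT send the replacement, or we risk a duplicate position
        }
        return gw.place(symbol, px, qty);
    }

    // Approach 2: place the new order first, then cancel the old one.
    // Fast, but both orders may execute, over-filling the position.
    static long replaceThenCancel(OrderGateway gw, long oldId, String symbol, double px, int qty) {
        long newId = gw.place(symbol, px, qty);
        gw.cancel(oldId);
        return newId;
    }
}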

Fourth, even if the trading system is open source, you cannot reasonably understand the code enough to modify it. You can at best be only a user of the system. In that case, what do you do in the future when you want to add components to enhance your trading process? For instance, you may want independent, and hence external, modules such as order aggregators, flow controls, safety guards, etc. You will have to build these components around a framework that you don't understand. Chances are that you will end up paying the vendor to build these add-on modules for you.

Fifth, some trading systems, for the sake of generality, take a script as a trading strategy. The script may be in a proprietary language such as TradeStation EasyLanguage or in a popular language like Ruby. Either way, the problem with scripting is that it is very slow, relatively speaking. The competition in the algorithmic trading space is very fierce. Many firms, e.g., GETCO, spend a lot of resources tweaking their systems or colocation just to get a few milliseconds of edge over competitors. Executing scripts line-by-line is very slow. Not being able to tweak, hence optimize, your system makes it even slower. You are already losing the trading game to others in speed. Chances are that you will end up paying the vendor for direct access to the underlying API.

Sixth, some solid trading systems come with gateway access to many exchanges/ECNs. Many don't. The smaller ones, such as the open source ones, might at best have implemented the FIX protocol. While FIX is a popular protocol in finance, many exchanges and ECNs have their own protocols. Some ECNs may even have different APIs for liquidity takers and liquidity providers. I cannot imagine the vendor open-sourcing many of these gateway APIs. (Some major businesses live on providing these APIs for fees, not for free, after all.) Chances are that you will end up paying the vendor to develop gateway access to the particular markets that you want to trade.

In other words, whether it is open source or not is not very relevant. You will end up paying big money either for the software with support or for the services. In either case, you will need a copy of the source code so that you can modify it as a last resort. Therefore, from the IT perspective, it does not make sense to buy a piece of trading software from a vendor.

Also from the business perspective, you should build the trading system in-house. Or, at least, you need an in-house IT team to work with the 3rd-party vendor to co-develop the trading system. The main reason is that a trading system is an embodiment of many business secrets, such as trading strategies, capital allocation, execution optimization, stop-loss handling, etc. For instance, when designing a trading system, one question is how to optimize execution to reduce the transaction costs, such as the bid-ask spread. You may want to split the order. You may have some tick prediction mechanism. Etc. You do not want to outsource this business logic or proprietary knowledge to a 3rd-party vendor. They are best kept in-house.

Going back to my general principle that a fund is not a software house and should therefore do as little IT as possible, where is the line drawn? My suggestion is that we do not build a trading system entirely from scratch. You want to buy as many "standard" components as possible, and only the "standard" components. For instance, it makes no sense to reinvent another FIX engine because some vendors have done it so well. The same goes for the historical data database, the communication bus, analytic tools, the backtesting system, market gateways and APIs, GUI controls, etc. What you should build in-house are only the parts that incorporate your business knowledge, e.g., execution optimization, as discussed above.

In conclusion, a trading system is an embodiment of the business secrets of a fund or a bank. You should not outsource this work to a 3rd-party vendor. You should develop it in-house. Over time, you grow your internal team of trading system experts. More importantly, your trading system, albeit starting small, will eventually grow into a powerful weapon that gives you an edge over competitors. After all, algorithmic trading is a game of technology competition.

Strategy Optimization

Trading strategy optimization is an important aspect of, and a challenging problem in, algorithmic trading. It requires determining a set of optimal solutions with respect to multiple objectives, where the objective functions are often multimodal, non-convex, and non-smooth. Moreover, the objective functions are subject to various constraints, many of which are typically non-linear and discontinuous. Conventional methods, such as Nelder-Mead or BFGS, cannot cope with these realistic problem settings. A solution is the stochastic search heuristic called differential evolution. In this blog piece, I look at the performance issue of applying differential evolution to optimizing a quantitative trading strategy.

Briefly, differential evolution (DE) is a genetic algorithm that searches for a function minimum by maintaining a pool of candidate solutions (local searches) and combining them to try to escape local minima. When applying differential evolution to optimize a trading strategy, it needs to maintain and simulate a pool of parameter sets. A typical procedure looks like this:

Decide the parameters, x, of an algorithmic trading strategy S(x);
Choose an objective function, e.g., Sharpe ratio or Omega;
Choose a calibration window, L, e.g., the most recent 12 months;
Choose a calibration frequency, f, e.g., every 3 months;
For every f months {
	Prepare the data set for the last L months;
	Initialize a pool of different parameter sets;
	For each DE iteration {
		Apply the DE operators to generate the next generation of parameter sets;
		For each parameter set {
			Simulate the strategy;
			Measure the performance;
		}
		Select the best performing parameter sets;
	}
	Simulate trading the strategy using the optimized parameters from the DE optimization;
}
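
To make the inner DE loop concrete, here is a minimal, self-contained differential evolution sketch in Java (my own toy implementation of the standard DE/rand/1/bin scheme; the objective is a stand-in test function, not a strategy simulation):

import java.util.Random;

// A toy differential evolution (DE/rand/1/bin) sketch. The "objective" stands in for
// the strategy-simulation-and-performance-measurement step in the pseudo-code above.
public final class DifferentialEvolution {

    static double objective(double[] x) {
        // Stand-in objective: a multimodal test function (Rastrigin). In strategy
        // optimization this would be, e.g., the negative Omega from a backtest.
        double sum = 10 * x.length;
        for (double xi : x) sum += xi * xi - 10 * Math.cos(2 * Math.PI * xi);
        return sum;
    }

    public static void main(String[] args) {
        int dim = 2, popSize = 16, generations = 80;
        double F = 0.8, CR = 0.9; // differential weight and crossover rate
        Random rng = new Random(12345);

        // Initialize a pool of parameter sets uniformly in [-5, 5]^dim.
        double[][] pop = new double[popSize][dim];
        double[] fitness = new double[popSize];
        for (int i = 0; i < popSize; i++) {
            for (int d = 0; d < dim; d++) pop[i][d] = -5 + 10 * rng.nextDouble();
            fitness[i] = objective(pop[i]);
        }

        for (int g = 0; g < generations; g++) {
            for (int i = 0; i < popSize; i++) {
                // Pick three distinct members a, b, c, all different from i.
                int a, b, c;
                do { a = rng.nextInt(popSize); } while (a == i);
                do { b = rng.nextInt(popSize); } while (b == i || b == a);
                do { c = rng.nextInt(popSize); } while (c == i || c == a || c == b);

                // Mutation + binomial crossover.
                double[] trial = pop[i].clone();
                int jRand = rng.nextInt(dim); // ensure at least one mutated coordinate
                for (int d = 0; d < dim; d++) {
                    if (d == jRand || rng.nextDouble() < CR) {
                        trial[d] = pop[a][d] + F * (pop[b][d] - pop[c][d]);
                    }
                }

                // Greedy selection: keep the better of the parent and the trial.
                double trialFitness = objective(trial); // the expensive step; parallelize in practice
                if (trialFitness < fitness[i]) {
                    pop[i] = trial;
                    fitness[i] = trialFitness;
                }
            }
        }

        int best = 0;
        for (int i = 1; i < popSize; i++) if (fitness[i] < fitness[best]) best = i;
        System.out.printf("best x = (%.4f, %.4f), f = %.6f%n", pop[best][0], pop[best][1], fitness[best]);
    }
}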

There are two difficulties, among many, in running this procedure in Matlab/R. The first one is coding. My personal opinion is that it requires Herculean effort to simulate non-trivial trading strategies (or any complex systems) in Matlab/R and make sure that the simulation is correct. There is simply no easy process or tool available to systematically and automatically check and test the correctness of the scripts. Worse, most quants/traders cannot code. They know nothing about software testing. In practice, most quants/traders simply think/assume/fantasize that their code is correct. Actually, many programmers in the financial industry cannot code either. (The good ones go to GOOG, MSFT, AAPL... or startups.)

More importantly, the second difficulty, which this discussion focuses on, is performance. I believe it is more than obvious that, in general, executing Matlab/R scripts is very slow. Matlab/R interprets and executes a script line-by-line. For our algorithmic trading strategy optimization script, the bottlenecks are:

  1. Each strategy simulation is slow because of looping over times (or dates, or ticks).
  2. We need to simulate the strategy many times in each DE iteration.
  3. We run many of these DE iterations, as in any genetic algorithm.

In other words, if you try to code up the above pseudo-code in Matlab/R, it may take days, if it finishes at all, to come up with results. If your trading strategy works with tick-by-tick data, then it is hopeless.

To speed up the process, we can run the differential evolution algorithm in parallel on a grid. Matlab/R does allow you to parallelize your code. However, realistically, I have never seen any quants/traders writing parallel Matlab/R code. Even if they wanted to do it, few, if any, understand multi-threaded concurrent programming well enough to get the code right. In addition, we can simply move away from Matlab/R to a real programming language, e.g., C++/C#/Java. Compiled code always runs faster than scripts. (Please!!! No VBA. Only managers use it...)

AlgoQuant addresses the trading strategy optimization problem by

  1. Enabling easy coding of a complex trading strategy in Java. To code a strategy, a quant/trader simply puts together components from the library, e.g., signals, mathematics, event handlers.
  2. Optimizing a strategy or a portfolio by running the differential evolution optimization algorithm in parallel.

I demonstrate the performance of AlgoQuant using a simple moving average crossover strategy. This strategy maintains two moving averages: a fast moving average and a slow moving average. When the fast moving average crosses the slow moving average from below, we enter a long position; when the fast moving average crosses it from above, we enter a short position. There is theoretical and empirical evidence showing that both the fast and slow moving window sizes should not be too big. See lecture 6 of Haksun's course notes.
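
The crossover rule itself is only a few lines. Here is a minimal sketch (my own illustration, not the AlgoQuant source code linked below) of the signal logic:

// A minimal sketch of the moving average crossover signal: +1 = go long, -1 = go short, 0 = no signal.
public final class SmaCrossover {

    // Simple moving average of the `window` prices ending at index t (inclusive).
    static double sma(double[] prices, int t, int window) {
        double sum = 0;
        for (int i = t - window + 1; i <= t; i++) sum += prices[i];
        return sum / window;
    }

    static int[] signals(double[] prices, int fast, int slow) {
        int[] sig = new int[prices.length];
        for (int t = slow; t < prices.length; t++) {
            double fastPrev = sma(prices, t - 1, fast), slowPrev = sma(prices, t - 1, slow);
            double fastNow = sma(prices, t, fast), slowNow = sma(prices, t, slow);
            if (fastPrev <= slowPrev && fastNow > slowNow) sig[t] = +1;      // cross from below: long
            else if (fastPrev >= slowPrev && fastNow < slowNow) sig[t] = -1; // cross from above: short
        }
        return sig;
    }

    public static void main(String[] args) {
        double[] prices = {10, 10.2, 10.1, 10.4, 10.6, 10.3, 10.0, 9.8, 10.1, 10.5};
        int[] sig = signals(prices, 2, 5); // toy window sizes; the text below uses (50, 200)
        System.out.println(java.util.Arrays.toString(sig));
    }
}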

Most traders choose the fast and slow moving window sizes by 1) guessing, 2) hearsay, or 3) backtesting (with a few more or less randomly chosen parameters) in Bloomberg. Some traders recalibrate periodically by updating the parameters.

Let’s first try the hearsay approach. A popular choice is (50, 200). I simulate this strategy trading the S&P 500 futures, VFINX, from 2000/1/1 to 2011/5/31 with the Yahoo data. The source code is here.

Here is the P&L generated using AlgoQuant.

[Figure: P&L for a simple moving average crossover strategy]

We have: pnl = 50.4; sharpe = 0.080829; omega = 1.285375

There is no reason that (50, 200) is the optimal parameter set, or that it even works at all. Intuitively, if we periodically update the pair (more flexibility), we could potentially generate a better P&L. For example, I trade this simple moving average crossover strategy using optimal parameters dynamically calibrated every 3 months on the data from the last 12 months. For the objective function that determines optimality, I use Omega. The source code is here.

As expected, this dynamically calibrated strategy does generate a (much) better P&L than the static strategy using the guessed parameters.

[Figure: P&L for a dynamically calibrated simple moving average crossover strategy]

We have: pnl = 106.65; sharpe = 0.309824; omega = 2.816514

In terms of performance (computing speed), it took 32.5 minutes (1,954,877 ms) to finish the simulation over 10 and a half years on my workstation (dual E5520 @ 2.27GHz) with 12 GB of memory. (As a side note, the alternative parallel brute force algorithm took only 6.6 minutes, or 395,334 ms.) Our parallel differential evolution algorithm maintains a pool of 16 parameter sets. In each of the 80 iterations, I run 16 simulations in parallel, one on each core. The picture below shows that my computer is working hard, utilizing all its resources to search for the historically optimal parameter sets. Redoing this simulate-test-and-pick procedure in Matlab/R would probably take days, if not forever.

[Figure: CPU usage]

As a disclaimer, I do not claim that this dynamically calibrated SMA crossover strategy generates alpha on the S&P 500 future. In fact, I just happened to pick (f = 3, L = 12) by chance. My point is to compare the performance of AlgoQuant's parallel differential evolution on strategy optimization with that of Matlab/R.