The Right (and Wrong) Way to Run an Algorithmic Trading Group

I would like to share with you the unique vision that Numerical Method Inc. has about running an algorithmic trading group. To get an edge over competing funds, we emphasize 1) the research process and 2) technology rather than hiring more intelligent people.

Currently, the majority of quant funds are run like cheap arcade booths: the traders are given workstations and data, and they do whatever it takes to crank out strategies. The only contributing factor to profit is luck: luck in finding the right people and/or luck in finding the right strategies. I had a conversation with an executive from a large financial organization two years ago, when they started to build an algorithmic trading group. He said, “Haksun, we need to hire some very smart people to be better than Renaissance.” I repeatedly hear something similar from various portfolio managers and hedge fund owners.

Staffing an ad hoc, unmethodical, non-scientific search for profitable trading strategies with “very smart” people is merely a lottery in disguise.

The main reason is that being “very smart” is not a necessary condition for finding “very profitable” trading strategies. If there is any relationship, it is at most a sufficient condition.
It is also very difficult to hire the “very smart” people because
1. They are difficult to identify among the many people who merely pretend to be smart.
2. The competition for them is fierce. The best examples are the fight between Microsoft and Google over Dr. Lee Kai-Fu and the dispute between Renaissance Technologies and Millennium Partners over two former employees.
3. The very best people are driven by passion rather than money, e.g., Gates, Jobs. They cannot be hired.

The key to success in running an algorithmic trading group, as in warfare, lies in process (tactics) and technology (weapons), not in star traders. Analogously, knights, despite being elite warriors, were superseded by cheap infantry; German tanks, despite being the best engineered of WWII, were outnumbered by cheap Soviet tanks. The better traders do not get their profitable strategies from higher beings. Speaking as an AI scientist, they are “better” only because their search covers a bigger space (more knowledge) and is faster (more efficient). We have created a process and a technology that enable a good trader to be just as profitable as the “very smart” traders.

Process

In the algorithmic trading research process, there is usually very little standardization even within one firm, let alone across the industry. Most quant fund houses do not invest in building research technology (with the famous exception of BlackRock).

This is best illustrated by the languages they use for backtesting. When there are 6 traders, they could be using Matlab, R, VBA, C++, Java, and C#. The first consequence of letting everyone use their favorite language is that there is absolutely no sharing of code. The firm would write the same volatility calculation function 6 times. Suppose trader A comes up with a new way of measuring volatility; trader B could not leverage it. Trader C could not quickly prototype a new trading idea by combining the mean-reverting signal from trader D and the trend-following signal from trader E. Productivity is very low.

The management is not able to compare strategies from two traders. The 6 traders all make different “simplifications” in backtesting. For instance, they would clean the same data set in different ways; they would use their “proprietary” bid-ask, execution, and (market vs. limit) order models for computing historical P&L; they would make different assumptions about liquidity and market impact. Mainly to simplify coding, they would make all sorts of guesses about the details of executing their strategies. The management does not have the time to question every single detail in their backtesting, hence the lack of understanding and confidence. They would simply resort to “trusting” the reports.

Worse, while algorithmic traders may be excellent mathematicians, they are usually bad programmers. I have yet to see an algorithmic trader who understands modern programming concepts such as interface vs. inheritance, memory models, thread safety, design patterns, testing, and software engineering. They usually produce code that is very long, unstructured and poorly documented. Such code almost certainly has bugs. Therefore, the management cannot trust their backtesting results and performance reports.

Our solution for making the trading research process systematic has three steps. Firstly, we mandate that all traders do their backtesting in the same language, e.g., Java. Secondly, we mandate that all traders contribute their code to a research library. Thirdly, the firm invests in a common backtesting infrastructure by expanding this research library. The advantages are the following.

The traders can focus on what they are supposedly good at: generating innovative trading strategies. They no longer have to bother with the IT grunt work.
They can quickly prototype a trading idea by putting together components, e.g., signals, filters, modules, from the research library.
They can share code with colleagues and be understood because all conform to the same standard.
They can compare strategies because the performance measures are computed from simulations making the same assumptions.
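
As a rough illustration of the shared research library idea, here is a minimal Java sketch. Every name in it (`ResearchLibrarySketch`, `Signal`, `volatility`, `meanReversion`, `trendFollowing`, `combine`) is hypothetical and made up for this example; it is not the Algo Quant API, and the two signals are deliberately toy rules. It only shows how one volatility function can be written once, and how trader C might combine signals from traders D and E once everyone codes against the same interface.

```java
/** Hypothetical sketch of a shared research library; these names are illustrative only. */
final class ResearchLibrarySketch {

    /** A signal maps a price history to a position score in [-1, 1]. */
    interface Signal {
        double value(double[] prices);
    }

    /** Sample standard deviation of simple returns: written once, reused by every trader. */
    static double volatility(double[] prices) {
        int n = prices.length - 1; // number of returns
        if (n < 2) {
            throw new IllegalArgumentException("need at least 3 prices");
        }
        double[] returns = new double[n];
        double mean = 0.0;
        for (int i = 0; i < n; i++) {
            returns[i] = prices[i + 1] / prices[i] - 1.0;
            mean += returns[i] / n;
        }
        double sumSquares = 0.0;
        for (double r : returns) {
            sumSquares += (r - mean) * (r - mean);
        }
        return Math.sqrt(sumSquares / (n - 1));
    }

    /** Toy mean-reverting signal (trader D): short above the average price, long below it. */
    static Signal meanReversion() {
        return prices -> {
            double avg = 0.0;
            for (double p : prices) {
                avg += p / prices.length;
            }
            return Math.signum(avg - prices[prices.length - 1]);
        };
    }

    /** Toy trend-following signal (trader E): follow the sign of the latest price move. */
    static Signal trendFollowing() {
        return prices -> Math.signum(prices[prices.length - 1] - prices[prices.length - 2]);
    }

    /** Trader C prototypes a new idea as a weighted average of two existing signals. */
    static Signal combine(Signal a, double weightA, Signal b, double weightB) {
        return prices -> (weightA * a.value(prices) + weightB * b.value(prices)) / (weightA + weightB);
    }

    public static void main(String[] args) {
        double[] prices = {100.0, 110.0, 99.0, 108.9};
        Signal blended = combine(meanReversion(), 1.0, trendFollowing(), 3.0);
        System.out.println("volatility = " + volatility(prices));
        System.out.println("blended signal = " + blended.value(prices));
    }
}
```

Because both signals implement the same small interface, combining them is a one-line call rather than a port between languages, and any improvement to `volatility` is immediately available to everyone who uses the library.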

Over time, as the algorithmic trading firm invests in, creates, and expands the research library and backtesting system, this IT infrastructure will become its most valuable asset. Suppose a star trader takes 3 months to test a good idea and make it profitable. With this standardized process and technology, a good trader is able to rapidly prototype 30 strategies in 3 days in parallel on a cluster of computers. The profitability of the firm depends not on hiring Einstein but on good and hardworking people leveraging the infrastructure.
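
To make the “many strategies in parallel” point concrete, here is a small Java sketch under loud assumptions: the threads of one machine stand in for a cluster, the price series is synthetic, and the strategy is a toy momentum rule with no transaction costs. The names `ParallelSweepSketch`, `Result`, `backtest`, `sweep`, and `syntheticPrices` are invented for this illustration. It runs 30 parameterizations of the same strategy concurrently, which is safe here because each run only reads the shared price array and returns an immutable result.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

/** Hypothetical sketch: a parallel parameter sweep on one machine as a stand-in for a cluster. */
final class ParallelSweepSketch {

    /** Immutable result of one backtest run. */
    static final class Result {
        final int lookback;
        final double pnl;

        Result(int lookback, double pnl) {
            this.lookback = lookback;
            this.pnl = pnl;
        }
    }

    /** Toy momentum backtest: hold sign(p[t] - p[t - lookback]) over the next price change. */
    static double backtest(double[] prices, int lookback) {
        double pnl = 0.0;
        for (int t = lookback; t < prices.length - 1; t++) {
            double position = Math.signum(prices[t] - prices[t - lookback]);
            pnl += position * (prices[t + 1] - prices[t]);
        }
        return pnl;
    }

    /** Runs lookbacks 1..maxLookback in parallel; the price array is read but never modified. */
    static List<Result> sweep(double[] prices, int maxLookback) {
        return IntStream.rangeClosed(1, maxLookback)
                .parallel()
                .mapToObj(k -> new Result(k, backtest(prices, k)))
                .collect(Collectors.toList());
    }

    /** Deterministic synthetic prices so that the sketch needs no market data. */
    static double[] syntheticPrices(int n) {
        double[] prices = new double[n];
        for (int i = 0; i < n; i++) {
            prices[i] = 100.0 + 0.05 * i + 5.0 * Math.sin(i / 10.0);
        }
        return prices;
    }

    public static void main(String[] args) {
        double[] prices = syntheticPrices(500);
        List<Result> results = sweep(prices, 30);
        Result best = results.stream()
                .max(Comparator.comparingDouble((Result r) -> r.pnl))
                .get();
        System.out.println("best lookback = " + best.lookback + ", pnl = " + best.pnl);
    }
}
```

In a real research infrastructure the `backtest` call would be dispatched to grid nodes and would use the firm's common data cleaning and execution assumptions; the structure of the sweep, one independent and side-effect-free run per parameter set, stays the same.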

Technology

There are many vendors that sell backtesting tools. However, IMHO, these backtesters are no more than augmented versions of for-loops that loop over historical data. Some better ones come with various features, e.g., importing data, cleaning data, statistics and signal computation, optimization, real-time paper trading. Some even go one step further and provide brokerage services for real trading. The major problem with all these backtesters is that you cannot code, hence simulate, true quantitative trading strategies with them. If you want to replicate Elliott and Malcolm’s pair trading strategy, you will need stochastic differential equations (SDE), expectation-maximization (EM), maximum likelihood estimation (MLE), and Kalman filtering (KF). Most commercial backtesters are built for amateurs and simply cannot do it.

There are a few more professional backtesters that provide a link to Matlab or R for data analytics. I have a lot of complaints about doing data analysis in Matlab/R. Many traders use them only because they cannot code. (It is extremely rare to find someone who is mathematically talented and can code.) Matlab/R is for amateurs; VBA is a joke. The problems are:

They are very slow. These scripting languages are interpreted line by line. They are not built for parallel computing. (OK, I know Matlab/R can do parallel computing, but who uses it in practice? Which traders understand immutability well enough to write good parallel code?)
They do not handle a lot of data well. How do you handle two years' worth of EUR/USD tick-by-tick data in Matlab/R?
There are no modern software engineering tools built for Matlab/R. How do you know your code is correct? Most traders just assume that their code is correct because they are not trained programmers.
The code cannot be debugged easily. OK, Matlab comes with a toy debugger somewhat better than gdb. It does not compare to NetBeans, Eclipse or IntelliJ IDEA.
How do you debug a large Matlab/R application with 50 pages of scripts anyway? I usually just give up.

You can tell that Matlab/R is not fit for financial modelling simply by observing that no serious person/bank/fund does option pricing in Matlab/R. My former employers did not price their portfolios using Matlab/R on my workstation. They deployed a few thousand computers in a grid to run the pricing code for the whole night! Likewise, we need a cluster of computers to run our trading models for many different scenarios with many different parameters before we are confident enough to bet a few hundred million on them.

Algo Quant is a first attempt by Numerical Method Inc. to solve the technology problem. Algo Quant is an embodiment of the systematic algorithmic trading research process that we discussed. Algo Quant is a suite of algorithmic trading research tools (with source code) for trading idea development, quick prototyping, data preparation, in-sample calibration, out-of-sample backtesting, and even automatic trading strategy generation. Algo Quant is backed by SuanShu, a powerful modern library of mathematics, statistics and data mining. For more information, please check out this.

If you are a passionate quant/trader/programmer who shares our vision to revolutionize the algorithmic trading industry, please join us! If you are a hedge fund who wants to hire us to implement this scientific trading research process in house, please contact us.

4 Comments

Jia
October 18, 2011 at 12:08 am
How do you find it in the real world, like getting clients? It looks like it’s extremely difficult to get traders to write Java, and they need to hire coders to do their job
Haksun Li (Moderator)
October 19, 2011 at 12:09 am
Most capable *quant* traders, esp. in the younger generation, that I know can already program in C/C++/C#/Java. I don’t know why some people call themselves *quant* traders if all they can do is Excel…… 🙂

There isn’t really that much *quant* stuff you can do in Excel.

Matlab/R is better but they are *extremely* slow in backtesting, optimization, and sensitivity analysis. Matlab/R is esp. hopeless in high frequency trading research IMO. You can barely load a day's worth of EURUSD tick data in a 2 GB memory space (assuming a 32-bit machine), let alone simulate it.
Markus
March 31, 2012 at 6:25 pm
2. With R, see the indexing/mmap packages from Jeff Ryan. It is extremely fast to load data from disk into memory. IMO, the real world of algo trading development involves trade-offs. If you think you need to load 10 GB of data for analysis, think again. If you think you’ll need the fastest language out there to do your job, think again. At the end of the day, you want performance. Speed and advanced technology are just a way to get to the goal.

While I agree on the need to “uniformize” the R&D process, I don’t see the point of having the same infrastructure for R&D and production (though in an ideal world maybe!).
Haksun Li (Moderator)
April 1, 2012 at 9:42 pm
Hello Markus, my suggestion about having infrastructure for R&D is to speed up and automate the time-consuming and boring tasks in backtesting a strategy. Backtesting a strategy involves more than running the strategy with historical data. It involves calibration, sensitivity analysis, bootstrapping, Monte Carlo simulation, etc. All of this means running many variants and parameterizations of the strategy. A state-of-the-art infrastructure helps speed up all of these, freeing the quants/traders from programming. Quants/traders can thus focus on what they are supposedly best at: generating trading ideas.