How To Combine A Billion Alphas

Zura Kakushadze

Quantigic Solutions LLC; Free University of Tbilisi

Willie Yu

Centre for Computational Biology, Duke-NUS Medical School

February 27, 2016


We give an explicit algorithm and source code for computing optimal weights for combining a large number N of alphas. This algorithm does not cost O(N^3) or even O(N^2) operations; it is much cheaper: the number of required operations scales linearly with N. We discuss how, in the absence of binary or quasi-binary “clustering” of alphas, which is not observed in practice, the optimization problem simplifies when N is large. Our algorithm does not require computing principal components or inverting large matrices, nor does it require iterations. The number of risk factors it employs, which typically is limited by the number of historical observations, can be sizably enlarged by using position data for the underlying tradables.

How To Combine A Billion Alphas – Introduction

Now that machines have taken over alpha mining, the number of available alphas is growing exponentially. On the flip side, these “modern” alphas are ever fainter and more ephemeral. To mitigate this effect, among other things, one combines a large number of alphas and trades the so-combined “mega-alpha”. And this is nontrivial.

Why? It is important to pick the alpha weights optimally, i.e., to optimize the return, Sharpe ratio and/or other performance characteristics of this alpha portfolio. The techniques commonly used in optimizing alphas are conceptually similar to mean-variance portfolio optimization [Markowitz, 1952] or Sharpe ratio maximization [Sharpe, 1994] for stock portfolios. However, there are some evident differences. The most prosaic difference is that the number of alphas can be huge: in the hundreds of thousands, millions or even billions. The available history (lookback), however, is naturally much shorter. This has implications for determining the alpha weights.

Let us look at vanilla Sharpe ratio maximization of the alpha portfolio with weights $w_i$, $i = 1, \dots, N$, where N is the number of alphas. The optimal weights are given by

$$w_i = \eta \sum_{j=1}^{N} C^{-1}_{ij} E_j, \qquad (1)$$

where $E_i$ are the expected returns for our alphas, $C^{-1}$ is the inverse of the alpha return covariance matrix $C_{ij}$, and $\eta$ is the normalization coefficient such that

$$\sum_{i=1}^{N} |w_i| = 1. \qquad (2)$$

If we compute $C_{ij}$ as a sample covariance matrix based on a time series of realized returns (see (3)), it is badly singular as the number of observations is much smaller than N. This also happens in the case of stock portfolios. In that case one either builds a proprietary risk model to replace $C_{ij}$ or opts for a commercially available (multifactor) risk model. In the case of alphas the latter option is simply not there.
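
To make the two points above concrete, here is a minimal sketch, assuming NumPy is available. It first computes Sharpe-ratio-maximizing weights by applying the inverse covariance matrix to the expected returns, and then shows why a sample covariance matrix cannot be used this way when there are far more alphas than observations. The function name `sharpe_optimal_weights`, the toy numbers, and the choice to normalize so that the absolute weights sum to one are assumptions made for illustration; this is not the linear-cost algorithm the paper proposes.

```python
import numpy as np


def sharpe_optimal_weights(cov, expected_returns):
    """Return weights proportional to C^{-1} E, scaled so that sum(|w|) = 1.

    Hypothetical helper for illustration only. It solves a dense N x N
    linear system, so it is only usable for small, non-singular matrices.
    """
    raw = np.linalg.solve(cov, expected_returns)  # C^{-1} E without forming C^{-1}
    return raw / np.abs(raw).sum()                # normalization (assumed: absolute sum)


# Toy case: 3 uncorrelated alphas, so C is diagonal and invertible.
cov_small = np.diag([0.04, 0.01, 0.09])
exp_small = np.array([0.02, 0.01, 0.03])
w_small = sharpe_optimal_weights(cov_small, exp_small)

# Realistic shape: many more alphas (N) than observations (T).
rng = np.random.default_rng(0)
n_alphas, n_obs = 50, 10
returns = rng.normal(size=(n_obs, n_alphas))      # synthetic T x N return history
sample_cov = np.cov(returns, rowvar=False)        # N x N sample covariance matrix
rank = np.linalg.matrix_rank(sample_cov)          # cannot exceed T - 1

print("weights:", w_small)
print("rank of sample covariance:", rank, "of", n_alphas)
```

Because the sample covariance matrix here is 50 x 50 but has rank at most 9, it has no inverse, which is the singularity the text refers to. Any practical approach has to replace it with a better-behaved model of the covariance.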

So, what is one to do? We can try to build a risk model for alphas, drawing on the rich experience with risk models for stocks. In the case of stocks a more popular approach is to combine style risk factors (i.e., those based on measured or estimated properties of stocks, such as size, volatility, value, etc.) and industry risk factors (i.e., those based on stocks’ membership in sectors, industries, sub-industries, etc., depending on the nomenclature used by the particular industry classification employed). The number of style factors is limited: of order 10 for longer-horizon models, and about 4 for shorter-horizon models. In the case of stocks, at least for shorter-horizon models, it is the ubiquitous industry risk factors (numbering a few hundred for a typical liquid trading universe) that add most value. However, there is no analog of the (binary or quasi-binary) industry classification for alphas. In practice, for many alphas it is not even known how they are constructed; only the (historical and desired) positions are known. Even formulaic alphas [Kakushadze, Lauprete and Tulchinsky, 2015] are mostly so convoluted that realistically it is impossible to classify them in any meaningful way, at least not such that the resulting (binary or quasi-binary) “clusters” would be numerous enough to compete with principal components (see below). And there are only a few a priori relevant style factors for alphas [Kakushadze, 2014] to compete with the principal components.
