Euclidean 1Q16 Letter – Deep Learning & Value Investing
Euclidean’s letter for the first quarter ended March 31, 2016, titled “Deep Learning & Value Investing.”
In recent letters, we showed the merits of adhering to simple forms of value investing over time. In particular, we highlighted the potential that value strategies have demonstrated following periods resembling the past few years — when value investing has underperformed the broad market and more speculative forms of investing.
Now we bring our focus back to Euclidean’s approach to systematic value investing, which was formed using machine learning. Machine learning is, perhaps most simply, the discipline of teaching machines (i.e., computers) how to do things through experience, in a manner that resembles how people learn. We have written about machine learning and how it relates to Euclidean in previous letters.
Over the past several years there have been many advances in machine learning, particularly within the field of “deep learning.” Now, more than ever, computers can uncover complex and fruitful patterns that lay hidden in highly complex environments. Deep learning is behind the much talked about achievements of self-driving cars, image recognition technology that performs better than humans, impressive language translation, and voice recognition.
It is also behind the recent achievement of a computer named AlphaGo beating the world’s best players at the ancient game of Go. This development is interesting because, unlike chess, where each move affords about 40 options, Go has up to 200. The permutations of outcomes quickly compound to a bewildering range of choices — more than the total number of atoms in the entire observable universe. Thus, mastering Go cannot be achieved by a computer through brute force as it can for games like checkers and chess; rather, it requires the pattern recognition and intuitive skills many have considered to be exclusively human capabilities. Here’s a comment from a January article in Nature that describes deep learning’s step forward.
“The IBM chess computer Deep Blue, which famously beat grandmaster Garry Kasparov in 1997, was explicitly programmed to win at the game. But AlphaGo was not preprogrammed to play Go: rather, it learned using a general-purpose algorithm that allowed it to interpret the game’s patterns…”
The very same technologies that have made this advance possible have opened potentially fruitful avenues of exploration for us to further develop Euclidean’s investment process.
Euclidean: Flashback To 1994 - From John’s Perspective
Before we started our first company and began to ponder questions regarding business quality and valuation, I (John) got immersed in the world of machine learning. My first job after college was with a crack team of fifteen engineers at the consulting firm Booz, Allen, and Hamilton doing machine learning projects for several government agencies. One project was an early attempt at applying machine learning to computer vision: we sought to train models to classify the types of vehicles found in satellite images. The goal was to distinguish hostile vehicles like tanks from benign vehicles like school buses.
One quality of how we approached machine learning then, which resembles our work at Euclidean since, is that we needed to provide a great deal of context about the information from which we wanted our computers to learn. For example, our computer vision project required us to extract features of images so that they could be fed into the machine learning environment. There were, however, many different features of these images that we could use. Choosing the right ones required a lot of effort, and it also made the process inherently subject to our biases.
Though our efforts were successful, there’s a good chance our assessments could have been better. It was ultimately only summary detail and not the image itself that our algorithms had access to during the learning process. Maybe this summary detail missed important qualities about the images or the context in which they resided. Perhaps a more powerful approach – if it had been possible – would have been simply to feed raw satellite images to our machines. Then, through trial and error and recurrent learning, perhaps they could teach themselves to find what was important.
This ability to work with raw information, instead of information that is heavily pre-processed, is in fact one of the important advantages of deep learning. It is part of the reason deep-learning-powered image recognition programs are now performing better than humans in many contexts.
Euclidean - Machine Learning through Today
Euclidean’s ambition has always been to identify the best methods for distinguishing between companies that are likely to be good investments and those that are likely to disappoint. In our initial efforts to do this using machine learning, we separated companies from the past into two groups. Those that outperformed the market made up group 1 and those that underperformed populated group 2. This step was pretty easy, with the caveat that we needed access to good data that mitigates survivorship bias and lets us “see” companies and market prices from the past as they actually existed.
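The labeling step described above can be sketched in a few lines. The tickers and return figures below are invented for illustration; they are not Euclidean’s data, and a real pipeline would also have to handle survivorship bias and point-in-time prices as noted.

```python
def label_outperformance(company_returns, market_return):
    """Split historical companies into the two groups described above:
    1 if the stock beat the market over the holding period, 0 if not.
    `company_returns` maps a ticker to its total return over the period."""
    labels = {}
    for ticker, total_return in company_returns.items():
        labels[ticker] = 1 if total_return > market_return else 0
    return labels

# Hypothetical total returns over a multi-year holding period
returns = {"AAA": 0.42, "BBB": -0.10, "CCC": 0.08}
labels = label_outperformance(returns, market_return=0.10)
# Group 1 (outperformers): AAA; Group 2 (underperformers): BBB, CCC
```

The binary labels become the “answers” a supervised learner is trained to predict from each company’s characteristics.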
The hard work began when we strove to find the qualities that provide the most information regarding whether a company’s shares are likely to be a winning investment. A good metaphor for describing this step would be to imagine the way that an exceptional investor might evaluate a potential opportunity. After becoming familiar with how a company serves customers, manages expenses, and deploys capital, this investor would be equipped to compare the company with his experiences involving similar investments from the past. To the extent that those analogs, or “comparables,” in the past had done well, his confidence in the new opportunity would be high, and the converse would also be true.
To execute against this metaphor in the context of machine learning, our challenge was to determine the lens that would be most useful in comparing current opportunities with ones from the past. There are, after all, many different qualities that could form the basis for that comparison.
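The “comparables” metaphor maps naturally onto nearest-neighbor retrieval: represent each company as a vector of factor values, then find the historical companies closest to a new opportunity. The sketch below is our illustration of that idea, not Euclidean’s actual model; the company names and factor values are invented.

```python
import math

def nearest_comparables(candidate, history, k=2):
    """Return the k past companies whose factor vectors are closest
    (by Euclidean distance) to the candidate's factor vector.
    `history` is a list of (name, factors, outperformed) tuples."""
    def distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    ranked = sorted(history, key=lambda rec: distance(candidate, rec[1]))
    return ranked[:k]

# Invented factor vectors: (earnings yield, return on capital, debt/equity)
past = [
    ("OldCo", (0.12, 0.18, 0.30), True),
    ("FadCo", (0.02, 0.05, 1.50), False),
    ("SteadyCo", (0.10, 0.15, 0.40), True),
]
neighbors = nearest_comparables((0.11, 0.16, 0.35), past, k=2)
# If most close analogs outperformed, confidence in the new opportunity rises
confidence = sum(1 for rec in neighbors if rec[2]) / len(neighbors)
```

The choice of which factors go into the vector — the “lens” — is exactly the question the next paragraphs take up.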
Would earnings yield prove better than price to earnings when evaluating whether a company is inexpensive? What about price-to-book or price-to-sales? How has a company’s historical rate of growth tended to relate to its intrinsic value? What would prove to be the best way to consider one-time charges when evaluating earnings? What do measures such as debt-to-equity, return-on-capital, and gross profitability tell us about companies’ relative quality? Should we look at just the last twelve months of data on a given company, or should we look at how it has evolved over the last several years, or even since the company’s inception? Which of these measures are best, which are redundant? And so on, and so on.
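Many of the competing measures above are derived from the same handful of raw fundamentals, which is part of why factor selection is hard. A minimal sketch, using invented figures, shows how candidate factors are computed and why some are redundant — earnings yield and price-to-earnings carry the same information in inverted form.

```python
def candidate_factors(f):
    """Compute several competing valuation and quality measures from one
    company's raw fundamentals (figures here are invented). Note that
    earnings yield is just the inverse of price-to-earnings: the kind of
    redundancy that factor selection must resolve."""
    return {
        "earnings_yield": f["earnings"] / f["market_cap"],
        "price_to_earnings": f["market_cap"] / f["earnings"],
        "price_to_book": f["market_cap"] / f["book_value"],
        "price_to_sales": f["market_cap"] / f["sales"],
        "debt_to_equity": f["debt"] / f["book_value"],
        "return_on_capital": f["earnings"] / (f["book_value"] + f["debt"]),
    }

fundamentals = {"market_cap": 1000.0, "earnings": 100.0,
                "book_value": 400.0, "sales": 800.0, "debt": 100.0}
factors = candidate_factors(fundamentals)
# earnings_yield = 0.10, price_to_earnings = 10.0, return_on_capital = 0.20
```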
Traditional machine learning techniques require us to put a great deal of energy into these types of questions. If we don’t get the input factors right, machine learning doesn’t work. Thus, we devoted much of our energy to this area of factor selection, and we believe Euclidean operates today focused on qualities that provide a strong signal regarding when a good company is offered at an attractive price.
Perhaps, though, there are other ways of looking at companies that could be more fruitful? Maybe the tools of deep-learning could open up new areas of analysis that have been previously outside of our grasp?
New opportunities for Euclidean
Over the next few years we plan to invest a good deal of our R&D effort into seeing whether the new developments of deep learning can add to our investment process. While we are excited about what we might achieve, our expectations are tempered by the caveat that deep learning generally requires vastly more data than traditional approaches to machine learning. It may be the case that the amount of relevant financial data available to us is inadequate to harness deep learning’s real value. We won’t know until we try.
The deep learning innovations that are relevant to Euclidean fall into three categories: learning sequences, working with unstructured financial data, and learning from textual data.
Learning Sequences

Traditionally, machine learning has been structured in such a way that the goal is to predict something from a fixed number of inputs. For example, we might want to predict the likelihood that a company’s stock will outperform over the next few years based on a fixed number of financial ratios (like the stock’s return-on-equity, earnings yield, and debt-to-equity). However, in the real world, the input data typically comes in sequences and predictions should be conditioned on the evolution of those sequences.
With respect to stocks, the future value of a company depends on the evolving state of its cash flows as it reports them from quarter to quarter. It is in this area of modeling sequences of data through time that machine learning has recently made huge steps forward. Now, to be fair to Euclidean, we have embedded a sense of time in our models by evaluating many potential input factors that range over different periods of time. For example, our models are very interested in how a company’s results have evolved over the prior ten years. But we had to specifically test many factors over different time periods to zero in on the ones that looked the most promising. With new techniques in deep learning, our models might be able to learn the time dynamics of value investing more directly and find relationships in the data that we haven’t explored.
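The core idea behind sequence models can be shown with a minimal recurrent cell. This is a toy illustration, not Euclidean’s model: the weights are fixed and invented (a real recurrent network would learn them from data), and the input series are made up.

```python
import math

def rnn_score(quarterly_features, w_in=0.8, w_rec=0.5, w_out=1.0):
    """Minimal recurrent cell: fold a sequence of quarterly observations
    into a single hidden state, so the output depends on the whole
    evolution of the series, not just the latest snapshot.
    Weights here are fixed and invented; a trained model would learn them."""
    hidden = 0.0
    for x in quarterly_features:
        hidden = math.tanh(w_in * x + w_rec * hidden)
    return w_out * hidden

# Two companies whose latest quarter is identical but whose paths differ
improving = [-0.2, 0.0, 0.1, 0.3]      # e.g., normalized earnings growth
deteriorating = [0.6, 0.4, 0.2, 0.3]
# Because the hidden state carries history, the two sequences score
# differently even though their final observations are the same.
```

A snapshot model fed only the final quarter would treat the two companies identically; the recurrent state is what lets the trajectory matter.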
Working with Unstructured (or “Raw”) Numerical Data
Related to the above point, deep learning has created the opportunity to do less of the “factor engineering” work we talked about earlier and instead rely on the raw financial data to tell the story of a stock (e.g., analogous to the summary detail vs. the raw images in our satellite example). That is, the “deepness” in deep learning means that successive layers of a model are able to untangle important relationships in a hierarchical way from data as it is found “in the wild,” with much less pre-processing than has been done in the past. So there is potential to find measures that are more meaningful than what Euclidean relies on today, and also to limit further the potential biases that can impact the learning process when data is processed into forms usable by traditional machine learning tools.
While we probably can’t hope to just feed our models raw income statements and balance sheets, it may be that we can use somewhat normalized versions of these statements and let the machine learning process find what is important on its own.
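One common way to “somewhat normalize” a statement is to scale every line item by a size measure, so companies of very different scales map onto comparable inputs while leaving it to the model to decide which items matter. The sketch below is our illustration of that idea with an invented balance-sheet fragment.

```python
def normalize_statement(line_items, scale_key="total_assets"):
    """Scale every line item by a size measure (here, total assets) so
    companies of different sizes become comparable. This is a lighter
    touch than hand-built ratios: the learning process, not the
    researcher, decides which scaled items are informative."""
    scale = line_items[scale_key]
    return {name: value / scale for name, value in line_items.items()}

# Invented balance-sheet fragment (in $ millions)
raw = {"total_assets": 500.0, "cash": 50.0, "debt": 125.0, "inventory": 75.0}
normalized = normalize_statement(raw)
# normalized["debt"] == 0.25 regardless of the company's absolute size
```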
Learning from Textual Data

Lastly, some of the greatest progress in machine learning has been in the area of text processing. Much of this has been motivated by the desire to achieve better web search results, language translation, sentiment analysis, and photo captioning (where the output is text). Why is text processing of interest to Euclidean?
As we approach value investing in public markets with a whole owner’s perspective (meaning that we evaluate potential investments as we would if we were going to buy the entire company), we need to believe that the board and management team are making financial decisions in the interest of every shareholder. If they are in fact making decisions that way, then the economic results from buying a fraction of the company should be little different from buying the whole company, and an analysis based on a whole-owner perspective is justified.
But how do you quantitatively assess management’s integrity with respect to its fiduciary duty? On the one hand, we believe we do this reasonably well today with our models. If a company consistently increases earnings per share and shows good use of capital over the long-term, then management is doing something right for individual shareholders. On the other hand, surely there is more to be examined here.
As an example, even though our investment process relies on quantifiable information, we enjoy digging into the more qualitative aspects of the companies in our portfolio. As we have done so, we have observed that earnings conference calls and their transcripts sometimes open a window into management’s character. Some management teams talk very coherently about how they deploy capital and clearly operate with the long term in mind. Others seem overly focused on near-term developments and provide thin excuses for poor results. Our bias would be toward betting on the former and passing on the latter! And yet, we haven’t had a way to systematically analyze these transcripts, evaluate our inclinations, and augment our investment process with this information. Perhaps deep learning can help us search for patterns in how management teams discuss their business results and future plans that can provide an additional, useful input to how Euclidean invests.
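Even before deep learning enters the picture, the intuition can be made concrete with a crude bag-of-words tone score over transcript language. The word lists and quotes below are entirely invented; a deep learning model would learn such signals from the text itself rather than rely on fixed lists.

```python
LONG_TERM_WORDS = {"decade", "durable", "reinvest", "compounding", "owners"}
SHORT_TERM_WORDS = {"quarter", "guidance", "one-time", "headwinds", "miss"}

def tone_score(transcript):
    """Crude bag-of-words score: positive when management language leans
    long-term, negative when it leans near-term. Word lists are invented
    for illustration; a learned model would discover its own signals."""
    words = transcript.lower().replace(",", " ").replace(".", " ").split()
    long_hits = sum(1 for w in words if w in LONG_TERM_WORDS)
    short_hits = sum(1 for w in words if w in SHORT_TERM_WORDS)
    return long_hits - short_hits

call_a = "We reinvest free cash flow with a decade horizon, as owners."
call_b = "This quarter we faced one-time headwinds and withdrew guidance."
# tone_score(call_a) > 0 > tone_score(call_b)
```

Fixed word lists are brittle, which is precisely the gap a deep learning approach to transcripts might close.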
We have evolved our investment process over time to incorporate new learnings. Even as we stand today, confident in our process and optimistic that the current environment will be supportive of our approach, Euclidean remains devoted to learning. We continue to seek the best methods for uncovering history’s lessons and overseeing a systematic process that reflects what we have found.
John & Mike