Does Online Financial Commentary Have A Predictive Quality?
VW Staff
A study by Cristian Bissattini and Kostis Christodoulou attempts to discover and analyze the predictive power of stock messages, posted on financial commentary boards, for the direction of future stock price movements.
Several events in the world of finance, such as the Great Crash of 1929, the Black Monday crash of October 1987, the internet bubble of the 1990s, and the recent financial turmoil in October 2008, not only caused a dramatic change in stock prices but also challenged the explanation offered by neoclassical finance models. The standard finance model, in which “unemotional investors always force capital market prices to equal the rational present value of expected future cash flows”, does not seem to offer perfect insight into asset pricing anomalies (Baker and Wurgler, 2007).
In this context, it seems timely to define a human sentiment function in a Stochastic Discount Factor (SDF) model. Consequently, researchers have proposed some key behavioral theories to supplement the existing finance model and better predict asset returns in the market. These studies draw on the experimental psychology literature to explain investor behavior.
The notion of investor sentiment goes back at least to Keynes (1936), who argued that markets can fluctuate wildly under the influence of investors' “animal spirits”, which move prices in a way unrelated to fundamentals. The idea is further developed in Black (1986), De Long, Shleifer, Summers and Waldmann (1990), Shleifer and Summers (1990), and Baker and Wurgler (2006 and 2007). De Long et al. (1990) show that sentiment combined with arbitrage risk can create persistent mispricing. Baker and Wurgler (2006 and 2007) and Baker, Wurgler, and Yuan (2009) provide empirical evidence suggesting that sentiment creates mispricing.
Message boards are among the most popular and interesting sources of financial information. Boards such as Yahoo! Finance or Raging Bull provide internet forums where people can hold conversations about stock trading and investing in equities. Recent studies show that many traders make decisions in the financial market based solely on what other people think and recommend.
However, extracting sentiment from message boards is very difficult. Meaningful information is hidden within a large amount of data, and no investor can process all of it manually to implement good trading strategies. In addition, not all posts are equal: the predictions of a few informed experts are mixed with those of thousands of uninformed investors. Finally, opinions on message boards can be rumor, bullish, bearish, spam, or often simply unrelated to the stock. As a result, most online investors who post on stock message boards are not informed, and their recommendations have little informational value.
Our work has three important distinctive features in comparison with related research:
1) We propose a novel way to generate sentiment based on each author's credibility, calculated from the accuracy of his past posts. Every day our algorithms compare each author's prediction that day with the stock's subsequent daily return over the market. If the poster's prediction is correct, the algorithm increases the author's weight; otherwise, the weight decreases. This method gives us two important advantages: (1) it cuts off rumor (the weights of uninformed authors remain close to zero, or in any case very low, so they are irrelevant in forming the weighted average recommendation), and (2) occasional “strokes of luck” in predicting stock performance do not significantly increase the author's weight (exponential smoothing rewards posters with a large number of successful predictive recommendations).
2) The sentiment tag, optionally included in Yahoo! message posts, reflects the author's recommendation on the stock. In this work, we use this self-reported sentiment for our analysis and ignore posts without a sentiment tag. The main advantage of this approach is the accuracy of the extracted information: it avoids the misinterpretations that semantic analysis and machine-learning results can introduce.
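The credibility weighting in point 1 and the tag filter in point 2 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tag-to-score mapping `TAG_SCORE`, the smoothing factor `ALPHA`, the +1/-1 hit signal, the zero floor on weights, and the function names are all assumptions chosen for illustration.

```python
# Hypothetical mapping from self-reported sentiment tags to signed scores.
TAG_SCORE = {
    "Strong Buy": 1.0,
    "Buy": 0.5,
    "Hold": 0.0,
    "Sell": -0.5,
    "Strong Sell": -1.0,
}

ALPHA = 0.1  # assumed smoothing factor; the source does not state a value


def update_weight(weight, score, excess_return, alpha=ALPHA):
    """Exponentially smooth a +1 (correct) / -1 (wrong) signal, floored at zero.

    `score` is the author's signed sentiment for the day and `excess_return`
    is the stock's subsequent daily return over the market.
    """
    if score == 0 or excess_return == 0:
        return weight  # no directional call to judge
    hit = 1.0 if (score > 0) == (excess_return > 0) else -1.0
    return max(0.0, (1 - alpha) * weight + alpha * hit)


def weighted_sentiment(posts, weights):
    """Credibility-weighted average sentiment; posts without a tag are ignored."""
    tagged = [
        (p["author"], TAG_SCORE[p["tag"]])
        for p in posts
        if p.get("tag") in TAG_SCORE
    ]
    total = sum(weights.get(author, 0.0) for author, _ in tagged)
    if total == 0:
        return 0.0  # no credible tagged posts that day
    return sum(weights.get(author, 0.0) * s for author, s in tagged) / total


# Usage: one lucky call barely moves a weight; a long correct streak does.
lucky = update_weight(0.0, 0.5, 0.02)
streak = 0.0
for _ in range(10):
    streak = update_weight(streak, 0.5, 0.02)

posts = [
    {"author": "a", "tag": "Buy"},
    {"author": "b", "tag": "Strong Sell"},
    {"author": "c", "tag": None},  # untagged, so it is skipped
]
signal = weighted_sentiment(posts, {"a": streak, "b": lucky})
```

Under these assumptions an author who guesses at random alternates between small gains and losses and stays near the zero floor, while a single correct call adds only `ALPHA` to the weight. Each day's sentiment should be computed with the weights as they stood before that day's returns were known, so the signal uses no future information.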