Most Positive and Most Negative Company Filings Going into 2020

Company financial filings are typically thought of as boring, dry documents filled with boilerplate language. However, numerous academic studies disagree and have found that the language used in financial documents matters and has predictive power. Indeed, the language used can help predict things as diverse as future performance and even the presence of accounting fraud.

Many studies use robust multifactor models (SVM models are the most popular). But even a simplistic examination of feature sets yields interesting results.

We looked at which S&P 500 companies saw the biggest annual percentage increase in the use of positive words from their CY2018 to their CY2019 Form 10-K. Positive words were those labeled as a positive emotion by the WordNet Affect dictionary. The count of positive words was scaled by the overall number of words in the company's 10-K. We then looked at the companies' subsequent performance in 2020 up until the COVID-related market drop (2/20/20). Performance after the drop is likely much more correlated with each stock's exposure to the pandemic than with language-specific factors from before the pandemic.
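The scaling and year-over-year comparison described above can be sketched roughly as follows. This is only an illustration, not the code used for the analysis: the tiny `POSITIVE_WORDS` set is a hypothetical stand-in for the WordNet Affect positive-emotion list, the tokenizer is a simple regular expression, and the two "filings" are made-up sentences.

```python
import re

# Hypothetical stand-in for the WordNet Affect positive-emotion list;
# the real dictionary is much larger.
POSITIVE_WORDS = {"growth", "strong", "improved", "success", "confident"}


def positive_share(text, positive_words=POSITIVE_WORDS):
    """Return the fraction of words in text that appear in positive_words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in positive_words)
    return hits / len(words)


def pct_change(prior, current):
    """Percentage change in positive-word share from the prior filing to the current one."""
    if prior == 0:
        raise ValueError("prior share is zero; percentage change is undefined")
    return (current - prior) / prior * 100.0


# Made-up text standing in for the CY2018 and CY2019 filings.
filing_2018 = "Results were strong but costs rose and demand was flat this year"
filing_2019 = "Results were strong and growth improved as demand rose this year"

share_2018 = positive_share(filing_2018)
share_2019 = positive_share(filing_2019)
change = pct_change(share_2018, share_2019)
print(f"2018 share: {share_2018:.3f}, 2019 share: {share_2019:.3f}, change: {change:.1f}%")
```

Scaling by total word count keeps a longer filing from looking more positive just because it contains more words overall.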

Note that some companies' reports were excluded due to formatting and parsing issues (we hope to have this fixed in the future), so the dataset is more like the S&P 468 than the S&P 500.

The top 20 stocks with the biggest change in positive language are below.

Ticker | Pos. Emo. Chg. | Ticker | Pos. Emo. Chg.

The average return for this group of stocks, excluding dividends (although NLOK was adjusted for its special dividend), was 3.53%.

The bottom 20 stocks, those with the smallest (in this case negative) change in the use of positive language, are below.

Ticker | Pos. Emo. Chg. | Ticker | Pos. Emo. Chg.

The average return of this group was -0.83%.

For comparison, the return of the S&P 500 during this period was approximately 4.8%. Neither group beat the S&P 500, but the top 20 positive group did beat the less tech-heavy Dow Jones (2.4%).

In summary, there appears to be at least some signal value in the language used in 10-Ks. An increase in the use of positive language doesn't appear to be a screaming short-term buy signal. However, a decrease in the proportion of positive language might be a sign that investors should take a closer look. There's also a high degree of variance in the returns within each group: the standard deviation of returns was 11.8% for the top 20 group and 7.6% for the bottom 20 group. This helps explain why many higher-performing models combine multiple features. Still, for smaller, less sophisticated investors, simply reading a 10-K and comparing it to last year's to see whether it sounds more or less positive can be a useful starting point for further analysis.
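For readers who want to run the same kind of group summary on their own list of stocks, here is a minimal sketch of the average and standard deviation calculation. The returns below are hypothetical and are not the data behind the figures above. The sketch also assumes the sample standard deviation; the figures quoted in this post may have been computed with the population version instead.

```python
import statistics


def summarize_returns(returns_pct):
    """Return (mean, sample standard deviation) of a list of percentage returns."""
    return statistics.mean(returns_pct), statistics.stdev(returns_pct)


# Hypothetical returns (in percent) for a five-stock group; not the article's data.
example_returns = [12.0, -4.0, 3.0, 8.0, -1.0]

mean_ret, std_ret = summarize_returns(example_returns)
print(f"average return: {mean_ret:.2f}%, standard deviation: {std_ret:.2f}%")
```

A standard deviation that is large relative to the average, as in both groups discussed here, means the group average says little about how any single stock in the group performed.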
