Abstract
We use three supervised machine learning methods, namely linear discriminant analysis, quadratic discriminant analysis, and random forest, to build models that predict the financial performance of 63 listed banks from eight emerging markets over the 10 years from 2008 to 2017. Working within the design science research (DSR) framework, we examine whether the textual content of prior-year annual reports contains value-relevant information for anticipating future performance and, if so, whether it improves the accuracy and quality of predictive models. The proposed models combine two groups of variables. The first captures disclosure tone in annual report narratives, measured by sentiment analysis with the Loughran and McDonald (2011) dictionary. The second comprises five quantitative characteristics of banks: firm size, financial leverage, age, market-to-book ratio, and risk. We find that the random forest method provides the best predictive model, and that combining disclosure tone variables with financial variables increases the accuracy and performance of the predictive models. Uncertainty is the most important disclosure tone variable, and firm size is the most important of the banks’ quantitative characteristics. Our study suggests that stakeholders can use tone analysis of corporate narrative disclosures as a complementary or diagnostic approach to decision-making rather than as an alternative.
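To make the idea of a dictionary-based disclosure tone variable concrete, the following is a minimal sketch of how a tone score might be computed as the share of words in a narrative that fall into each category. The word lists here are tiny hypothetical stand-ins, not the actual Loughran and McDonald (2011) lists, and the function name `tone_scores` and the normalisation by total word count are assumptions made for illustration; the study's own preprocessing may differ.

```python
import re
from collections import Counter

# Hypothetical mini word lists standing in for dictionary categories.
# These are illustrative only and are NOT the Loughran and McDonald lists.
TONE_WORDS = {
    "positive": {"improve", "improved", "strong", "gain", "gains"},
    "negative": {"loss", "losses", "decline", "impairment", "adverse"},
    "uncertainty": {"may", "might", "uncertain", "uncertainty", "approximately"},
}


def tone_scores(text):
    """Return, for each tone category, its share of all words in `text`."""
    # Lowercase the text and keep alphabetic tokens only.
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return {category: 0.0 for category in TONE_WORDS}
    counts = Counter(tokens)
    return {
        category: sum(counts[word] for word in words) / len(tokens)
        for category, words in TONE_WORDS.items()
    }


sample = "Loan losses may rise, and the outlook remains uncertain."
scores = tone_scores(sample)
print(scores)
```

Scores of this kind, computed per bank and year, could then sit alongside the five financial variables as inputs to a classifier.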
Original language | English |
---|---|
Journal | International Journal of Disclosure and Governance |
Early online date | 5 Sept 2021 |
Publication status | Early online - 5 Sept 2021 |
Keywords
- Predictive Models
- Financial Performance
- Disclosure Tone
- Machine Learning
- Discriminant Analysis
- Random Forest