SciELO - Scientific Electronic Library Online

vol.21 issue4Learning to Answer Questions by Understanding Using Entity-Based Memory NetworkPost-Processing for the Mask of Computational Auditory Scene Analysis in Monaural Speech Segregation author indexsubject indexsearch form
Home Pagealphabetic serial listing  

Services on Demand




Related links

  • Have no similar articlesSimilars in SciELO


Computación y Sistemas

Print version ISSN 1405-5546


SMAILOVIć, Jasmina et al. Automatic Analysis of Annual Financial Reports: A Case Study. Comp. y Sist. [online]. 2017, vol.21, n.4, pp.809-818. ISSN 1405-5546.

The main goal of reporting in the financial system is to ensure high quality and useful information about the financial position of firms, and to make it available to a wide range of users, including existing and potential investors, financial institutions, employees, the government, etc. Formal reports contain both strictly regulated, financial sections, and unregulated, narrative parts. Our research starts from the hypothesis that there is a relation between business performance and not only content, but also the linguistic properties of unregulated parts of annual reports. In the paper we first present our dataset of financial reports and the techniques we used to extract the unregulated textual parts. Next, we introduce our approaches of differential content analysis and analysis of correlation with financial aspects. The differential content analysis is based on TF-IDF weighting and is aimed at finding the characteristic terms for each year (i.e. the terms which were not prevailing in the previous reports by the same firm). For correlation of linguistic characteristics of reports with financial aspects, an array of linguistic features was considered and selected financial indicators were used. Linguistic features range from measurements, such as personal/impersonal pronouns ratio, to assessments of characteristics like financial sentiment, trust, doubt, and discursive features expressing certainty, modality, etc. While some features show strong correlation with industry (e.g., shorter and more personal reports by IT industry compared to automotive industry), doubt, communication – as well as necessity and cognition words to some extent – are positively correlated with failure.

Keywords : Differential content analysis; linguistic characteristics; 10-K annual reports; financial indicators.

        · text in English     · English ( pdf )