Computación y Sistemas

On-line version ISSN 2007-9737; print version ISSN 1405-5546

Comp. y Sist. vol. 18 no. 3, Ciudad de México, Jul./Sep. 2014

Regular articles


Using Multi-View Learning to Improve Detection of Investor Sentiments on Twitter


Zvi Ben-Ami (1), Ronen Feldman (2), and Binyamin Rosenfeld (2)


(1) The Hebrew University, School of Business Administration, Jerusalem, Israel.

(2) Digital Trowel, New York, USA.


Article received on 07/01/2014.
Accepted on 01/02/2014.



Stock-related messages on social media have several interesting properties with respect to the sentiment analysis (SA) task. On the one hand, the analysis is particularly challenging because of frequent typos, bad grammar, and idiosyncratic expressions specific to the domain and the medium. On the other hand, stock-related messages primarily refer to the state of specific entities, namely companies and their stocks, at specific times (the times at which the messages are sent). This state is an objective property and even has a measurable numeric characteristic: the stock price. Given a large dataset of Twitter messages, we can create two separate "views" of the dataset by analyzing the text of the messages and their external properties separately. With these two views, we can expand the coverage of generic SA tools and learn new sentiment expressions. In this paper, we experiment with this learning method, comparing several types of general SA tools and several sets of external properties. The method is shown to produce a significant improvement in accuracy.

Keywords: Sentiment analysis, sentiment expression mining, unsupervised learning, multi-view learning, investors' sentiments, social media.
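
The two-view idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the seed lexicon, the price-change threshold, the function names, and the counting rule are all hypothetical choices made for illustration. The sketch assumes that each message comes with the relative price change of the mentioned stock, that the text view is a tiny polarity lexicon standing in for a generic SA tool, and that a word is proposed as a new sentiment expression when it occurs mostly in messages that the external view labels the same way.

```python
from collections import Counter

# Hypothetical seed lexicon standing in for a generic SA tool (text view).
SEED_LEXICON = {"great": 1, "strong": 1, "weak": -1, "awful": -1}


def text_view_label(tokens):
    """Text view: sign of the summed seed-lexicon polarities (0 = no opinion)."""
    score = sum(SEED_LEXICON.get(t, 0) for t in tokens)
    return (score > 0) - (score < 0)


def external_view_label(price_change, threshold=0.01):
    """External view: sign of the price change, 0 inside the threshold band.

    The threshold value is an assumption, not taken from the paper.
    """
    if price_change > threshold:
        return 1
    if price_change < -threshold:
        return -1
    return 0


def learn_expressions(messages, min_count=2, min_purity=0.8):
    """Propose new sentiment words from messages the text view cannot label.

    A word is proposed when it occurs in at least `min_count` such messages
    and at least `min_purity` of them share one external-view label.
    """
    pos, neg = Counter(), Counter()
    for text, price_change in messages:
        tokens = text.lower().split()
        if text_view_label(tokens) != 0:
            continue  # already covered by the text view
        label = external_view_label(price_change)
        if label == 0:
            continue  # external view is uninformative for this message
        target = pos if label > 0 else neg
        # Count each unseen word once per message.
        target.update({t for t in tokens if t not in SEED_LEXICON})
    learned = {}
    for word in set(pos) | set(neg):
        total = pos[word] + neg[word]
        if total < min_count:
            continue
        if pos[word] / total >= min_purity:
            learned[word] = 1
        elif neg[word] / total >= min_purity:
            learned[word] = -1
    return learned
```

In practice the learned words would be merged into the lexicon and the procedure repeated, so that each view gradually extends the coverage of the other. A real system would also need far more data and a proper significance test in place of the fixed purity cutoff used here.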





We thank Bing Liu for sharing his Opinion Observer System's output with us.

This work is supported by the Israel Ministry of Science and Technology Center of Knowledge in Machine Learning and Artificial Intelligence and the Israel Ministry of Defense.




Creative Commons License: All content of this journal, except where otherwise noted, is licensed under a Creative Commons License.