International Journal of Advanced and Applied Sciences

EISSN: 2313-3724, Print ISSN: 2313-626X

Frequency: 12

 Volume 7, Issue 12 (December 2020), Pages: 113-126

----------------------------------------------

 Original Research Paper

 Title: Socio monitoring framework (SMF): Efficient sentiment analysis through informal and native terms

 Author(s): Muhammad Javed 1, *, Ziauddin 1, Shahid Kamal 1, Jamal Abdul Nasir 1, Arslan Ali Raza 2, Asad Habib 2

 Affiliation(s):

 1Institute of Computing and Information Technology, Gomal University, Dera Ismail Khan, Pakistan
 2Institute of Computing, Kohat University of Science and Technology, Kohat, Pakistan

 * Corresponding Author. 

  Corresponding author's ORCID profile: https://orcid.org/0000-0001-6884-6641

 Digital Object Identifier: 

 https://doi.org/10.21833/ijaas.2020.12.013

 Abstract:

Prediction and analysis of public expression is a trending topic in current research. Opinion mining (a.k.a. sentiment analysis) is the automated determination of the orientation of public sentiments, views, suggestions, and opinions. It helps estimate the popularity of products, events, services, and even political policies from user-generated content. Supervised and semi-supervised machine learning techniques, as well as unsupervised lexicon-oriented techniques, can be applied to the semantic orientation of public opinions about numerous real-world entities. Social channels carry real-time content that is often complicated by informality, slang, vernacular (native terms), and sarcasm; however, these indicators are highly visible signals of sentiment and opinion orientation. Unfortunately, because such content is hard to interpret, its orientation is often determined poorly. Supervised machine learning systems are unsuitable when training data is inadequate, and in that setting lexicon-based opinion mining methods are preferred over learning-based ones. In this paper, we seek to improve the performance of lexicon-based sentiment analysis by incorporating novel linguistic features, namely vernacular terms, slang, and sarcasm, so that social media content can be monitored at a more realistic level. The core contributions are sarcasm detection and identification of vernacular terms. The performance of the proposed unsupervised lexicon-based framework on native, informal, and sarcastic opinion-bearing terms is assessed through numerous experiments. For this, we used tweets from two key domains: product and politics. The experimental outcomes show that the proposed system outperformed existing supervised and semi-supervised systems, achieving accuracies of 84.24% and 82.35% on informal and sarcastic content for the product and politics domains, respectively. The average accuracy across both domains is 83.29%.
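
The general idea of lexicon-based scoring with slang, vernacular, and sarcasm handling can be illustrated with a minimal sketch. This is not the authors' SMF implementation: the abstract does not specify the lexicons or the sarcasm detection rules, so the word lists, the Roman Urdu entries, the hashtag-based sarcasm markers, and the function names `tokenize` and `classify` below are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical toy resources; the paper's actual lexicons are not given in the abstract.
SLANG_MAP = {"gr8": "great", "luv": "love", "h8": "hate"}  # slang -> standard form
POLARITY = {
    "great": 1, "love": 1, "good": 1,
    "hate": -1, "bad": -1, "slow": -1,
    "acha": 1, "bura": -1,  # illustrative Roman Urdu (vernacular) entries
}
SARCASM_MARKERS = {"#sarcasm", "#not"}  # assumed explicit markers, a deliberate simplification


def tokenize(text):
    """Lowercase the text and split it into words and hashtags."""
    return re.findall(r"#?\w+", text.lower())


def classify(text):
    """Return 'positive', 'negative', or 'neutral' for a short text."""
    tokens = tokenize(text)
    # Step 1: look for an explicit sarcasm marker.
    sarcastic = any(t in SARCASM_MARKERS for t in tokens)
    # Step 2: drop the markers and map slang to standard forms.
    normalized = [SLANG_MAP.get(t, t) for t in tokens if t not in SARCASM_MARKERS]
    # Step 3: sum the lexicon polarities (unknown words count as 0).
    score = sum(POLARITY.get(t, 0) for t in normalized)
    # Step 4: flip the orientation when the text is flagged as sarcastic.
    if sarcastic:
        score = -score
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for tweet in ["gr8 phone, luv it", "Great battery life #not", "yeh phone bura hai"]:
        print(tweet, "->", classify(tweet))
```

Because no classifier is trained, a sketch like this needs no labeled data, which is the setting in which the abstract argues lexicon-based methods are preferable. Real sarcasm detection would need far richer cues than explicit hashtags.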

 © 2020 The Authors. Published by IASE.

 This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

 Keywords: Informal slangs, Vernacular, Microblogging contents, Semantic orientation, Sarcasm detection

 Article History: Received 2 February 2020, Received in revised form 29 July 2020, Accepted 8 August 2020

 Acknowledgment:

No Acknowledgment.

 Compliance with ethical standards

 Conflict of interest: The authors declare that they have no conflict of interest.

 Citation:

  Javed M, Ziauddin, Kamal S et al. (2020). Socio monitoring framework (SMF): Efficient sentiment analysis through informal and native terms. International Journal of Advanced and Applied Sciences, 7(12): 113-126

