Volume 7, Issue 5 (May 2020), Pages: 20-26
----------------------------------------------
Original Research Paper
Title: Semi-supervised method for sensitivity based documents’ classification for online service providers
Author(s): Sharaf J. Malebary *, Shakeel Ahmad
Affiliation(s):
Faculty of Computing and Information Technology in Rabigh (FCITR), King Abdulaziz University, Jeddah, Saudi Arabia
* Corresponding Author.
Corresponding author's ORCID profile: https://orcid.org/0000-0003-4339-3791
Digital Object Identifier:
https://doi.org/10.21833/ijaas.2020.05.004
Abstract:
In today’s digital era, many service-providing companies operate on the web, where a service is the logical product of a company that can be used through the Internet. Such providers offer services such as online counselling, online doctor consultation, cloud services, and web hosting to their customers. When customers face problems, they may send text messages to their providers. One solution is for providers to handle these issues on a first-come, first-served basis, but there should also be a way to detect sensitive issues that may need to be resolved first. How can this sensitivity be determined? A large body of research already uses text to determine polarity as positive or negative. Other classifications have also been investigated, such as aspect versus non-aspect, subjective versus objective, and spam versus non-spam. Classifying text as sensitive or not sensitive, however, has not yet been considered for service providers. This paper presents a strategy for sensitivity-based classification using Latent Semantic Indexing (LSI). The purpose of LSI is to rank documents with respect to a given query. In this study, a mechanism is provided to generate the query automatically from general sensitive words combined with the words from all documents. The approach is semi-supervised because 4782 sensitive words were labeled from various sources and then used in an unsupervised manner to detect the sensitivity of each document. The lists of documents sorted by the LSI scores generated by the sensitive query were checked manually and found to be highly satisfactory. The topmost document in the list was the most sensitive, and the last document in the list was the least sensitive.
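The ranking idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes whitespace tokenization, raw term counts, a query formed from the lexicon words that also occur in the corpus, and cosine similarity in a rank-k latent space. The function name `rank_by_sensitivity`, the parameter `k`, and the toy lexicon and messages are all hypothetical.

```python
import numpy as np


def rank_by_sensitivity(docs, sensitive_words, k=2):
    """Rank documents from most to least sensitive with a small LSI model.

    Returns (order, scores): document indices sorted by descending score,
    and the cosine score of each document against the sensitive query.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    if not vocab:
        return [], np.zeros(len(docs))
    index = {w: i for i, w in enumerate(vocab)}

    # Term-document matrix of raw counts (rows: terms, columns: documents).
    A = np.zeros((len(vocab), len(docs)))
    for j, toks in enumerate(tokenized):
        for w in toks:
            A[index[w], j] += 1.0

    # Assumption: the query is built automatically from the sensitive
    # words that actually occur somewhere in the corpus.
    q = np.zeros(len(vocab))
    for w in sensitive_words:
        i = index.get(w.lower())
        if i is not None:
            q[i] = 1.0

    # Truncated SVD, keeping at most k non-zero singular values.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, int(np.sum(s > 1e-12)))
    Uk, sk, doc_vecs = U[:, :k], s[:k], Vt[:k, :].T

    # Fold the query into the latent space: q_k = q^T U_k S_k^{-1}.
    qk = (q @ Uk) / sk

    # Cosine similarity between the query and each document vector;
    # documents or queries with zero norm get a score of 0.
    denom = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(qk)
    scores = np.divide(doc_vecs @ qk, denom,
                       out=np.zeros(len(docs)), where=denom > 0)
    order = np.argsort(-scores, kind="stable").tolist()
    return order, scores


if __name__ == "__main__":
    # Hypothetical customer messages and a toy sensitive-word lexicon.
    messages = [
        "urgent pain emergency help",
        "server down urgent emergency",
        "nice hotel lovely breakfast",
        "lovely room nice view",
    ]
    lexicon = ["urgent", "emergency", "pain", "suicide"]
    order, scores = rank_by_sensitivity(messages, lexicon, k=2)
    for i in order:
        print(f"{scores[i]:.3f}  {messages[i]}")
```

On this toy corpus, the two messages containing lexicon words score above the two that contain none. A real deployment would need a proper tokenizer, term weighting, and a tuned value of `k`, none of which are specified in the abstract.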
© 2020 The Authors. Published by IASE.
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Keywords: Service, Sentiment analysis, Supervised learning method, Unsupervised learning method, Latent semantic indexing
Article History: Received 4 November 2019, Received in revised form 5 February 2020, Accepted 7 February 2020
Acknowledgment:
No Acknowledgment.
Compliance with ethical standards
Conflict of interest: The authors declare that they have no conflict of interest.
Citation:
Malebary SJ and Ahmad S (2020). Semi-supervised method for sensitivity based documents’ classification for online service providers. International Journal of Advanced and Applied Sciences, 7(5): 20-26
Figures
Fig. 1 Fig. 2
Tables
Table 1 Table 2 Table 3 Table 4 Table 5 Table 6 Table 7