Volume 8, Issue 2 (February 2021), Pages: 1-5
----------------------------------------------
Original Research Paper
Title: Data mining classification algorithms: An overview
Author(s): Saeed Ngmaldin Bardab 1,*, Tarig Mohamed Ahmed 2,3, Tarig Abdalkarim Abdalfadil Mohammed 1
Affiliation(s):
1 Department of Computer Sciences, ALNeelain University, Khartoum, Sudan
2 Department of IT, King Abdul-Aziz University, Jeddah, Saudi Arabia
3 Department of Computer Sciences, University of Khartoum, Khartoum, Sudan
* Corresponding Author.
Corresponding author's ORCID profile: https://orcid.org/0000-0003-4263-168X
Digital Object Identifier:
https://doi.org/10.21833/ijaas.2021.02.001
Abstract:
Data mining is the process of analyzing data, usually in large quantities, to find logical relationships that summarize the data in a new way that is understandable and useful to the data owner. This paper examines the various types of classification algorithms in data mining and their applications, and states the strengths and limitations of each type. The weaknesses found in each algorithm show that tasks cannot be performed well when only one type of algorithm is applied. For this reason, the authors' view is that further research is needed to explore the potential of combining several of these algorithms to solve machine learning problems.
© 2020 The Authors. Published by IASE.
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Keywords: Classification, Machine learning, Supervised learning, Classifiers
Article History: Received 5 May 2020, Received in revised form 20 August 2020, Accepted 10 September 2020
Acknowledgment:
No Acknowledgment.
Compliance with ethical standards
Conflict of interest: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Citation:
Bardab SN, Ahmed TM, and Mohammed TAA (2021). Data mining classification algorithms: An overview. International Journal of Advanced and Applied Sciences, 8(2): 1-5
Figures: Fig. 1, Fig. 2
Tables: None