International Journal of

ADVANCED AND APPLIED SCIENCES

EISSN: 2313-3724, Print ISSN: 2313-626X

Frequency: 12

 Volume 8, Issue 9 (September 2021), Pages: 29-38

----------------------------------------------

 Original Research Paper

 Title: A framework for predicting employee health risks using ensemble model

 Author(s): Nicholas Khin-Whai Chan 1, Angela Siew-Hoong Lee 1, *, Zuraini Zainol 2

 Affiliation(s):

 1Department of Computing and Information Systems, Sunway University, Sunway, Malaysia
 2Department of Computer Science, Universiti Pertahanan Nasional Malaysia, Kuala Lumpur, Malaysia

 * Corresponding Author. 

  Corresponding author's ORCID profile: https://orcid.org/0000-0003-3388-2372

 Digital Object Identifier: 

 https://doi.org/10.21833/ijaas.2021.09.004

 Abstract:

Big data and data analytics have made it possible to collect, store, process, analyze, and visualize immense amounts of information. Healthcare is recognized as one of the most information-intensive sectors, and the rapid growth of data within it has spurred interest in analytics. Most employers in Malaysia provide medical benefits to their employees through a medical insurance plan. Data such as medical claim histories are held by the Human Resources (HR) department, which creates an opportunity to analyze trends in medical claims and to better understand the healthcare utilization and overall health of the employee population. Higher-risk patients generally become high-cost patients. Hence, early intervention for these patients may allow employers to minimize costs and plan preventative steps. Decision Trees and Regression are typical techniques in predictive analysis. The proposed framework incorporates an ensemble technique known as Stacking, on the expectation that an ensemble predictive model yields better performance and accuracy than a single predictive model. The objective of this paper is therefore to review current practices and past research within the healthcare sector and to propose a practical framework for classification ensemble modeling. Preliminary findings indicate that an ensemble model can produce higher predictive accuracy and performance than a single model.
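The Stacking idea named in the abstract can be illustrated with a minimal sketch. It is not the authors' framework. It assumes two hypothetical, normalized claim features and a synthetic "high risk" label, uses one-split decision trees (stumps) as base learners, and uses a logistic regression meta-learner trained on out-of-fold base predictions. All function names and data here are illustrative assumptions.

```python
import math
import random

def fit_stump(X, y, feature):
    """One-split decision tree: pick the threshold with the fewest training errors."""
    best_t, best_pol, best_err = None, None, len(y) + 1
    for t in sorted({row[feature] for row in X}):
        for pol in (1, 0):  # pol is the class predicted when value >= t
            err = sum((pol if row[feature] >= t else 1 - pol) != label
                      for row, label in zip(X, y))
            if err < best_err:
                best_t, best_pol, best_err = t, pol, err
    return lambda row: best_pol if row[feature] >= best_t else 1 - best_pol

def fit_logistic(Z, y, epochs=300, lr=0.5):
    """Logistic regression by batch gradient descent (the meta-learner)."""
    w, b, n = [0.0] * len(Z[0]), 0.0, len(y)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for z, label in zip(Z, y):
            p = 1.0 / (1.0 + math.exp(-(sum(wi * zi for wi, zi in zip(w, z)) + b)))
            for j, zj in enumerate(z):
                grad_w[j] += (p - label) * zj
            grad_b += p - label
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return lambda z: 1 if sum(wi * zi for wi, zi in zip(w, z)) + b >= 0 else 0

def fit_stacking(X, y, k=5):
    """Stacking: the meta-learner is trained on out-of-fold base predictions."""
    n, d = len(X), len(X[0])
    meta = [[0] * d for _ in range(n)]
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        for f in range(d):
            stump = fit_stump(X_tr, y_tr, f)
            for i in range(fold, n, k):  # rows held out of this fold's training set
                meta[i][f] = stump(X[i])
    meta_model = fit_logistic(meta, y)
    base = [fit_stump(X, y, f) for f in range(d)]  # refit base learners on all rows
    return lambda row: meta_model([s(row) for s in base])

def make_data(n, rng):
    """Synthetic stand-in for claims data; both features are hypothetical."""
    X = [[rng.random(), rng.random()] for _ in range(n)]
    y = [1 if a + b > 1.0 else 0 for a, b in X]  # assumed "high risk" rule
    return X, y

rng = random.Random(0)
X_train, y_train = make_data(200, rng)
X_test, y_test = make_data(200, rng)
model = fit_stacking(X_train, y_train)
predictions = [model(row) for row in X_test]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print(f"hold-out accuracy: {accuracy:.2f}")
```

Training the meta-learner on out-of-fold predictions keeps it from seeing base-learner outputs for rows those learners were fitted on. This toy setup only shows the mechanics and is not evidence for the accuracy gain reported in the paper.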

 © 2021 The Authors. Published by IASE.

 This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

 Keywords: Data analytics, Predictive analysis, Ensemble modeling, Stacking, Framework

 Article History: Received 16 December 2020, Received in revised form 20 April 2021, Accepted 10 June 2021

 Acknowledgment 

No Acknowledgment.

 Compliance with ethical standards

 Conflict of interest: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

 Citation:

 Chan NKW, Lee ASH, and Zainol Z (2021). A framework for predicting employee health risks using ensemble model. International Journal of Advanced and Applied Sciences, 8(9): 29-38

 Figures

 Fig. 1 Fig. 2 Fig. 3 Fig. 4 Fig. 5 Fig. 6

 Tables

 No Table  
