Detection of arrhythmia from the analysis of ECG signal using artificial neural networks

Arrhythmia is a heart rhythm disorder that can be a symptom of heart disease, which contributes to rising hospitalization in many developed countries. Patients with heart disease require continuous monitoring and close attention to vital signs such as heart rate. There have been many attempts to automate the detection of arrhythmia from a patient's electrocardiogram (ECG) readings. Nevertheless, the accuracy of some of these methods is unsatisfactory, and they are prone to biased results because of inter-patient variations in the ECG data. This research addresses the problem of classifying arrhythmia from the ECG signal using an Artificial Neural Network (ANN). First, we extract four features from the RR intervals of the ECG data and combine them into a feature vector. We then build sixteen ANN models using four training algorithms, namely Bayesian Regularization (BR), Levenberg-Marquardt (LM), Scaled Conjugate Gradient (SCG), and Resilient Backpropagation (RP), and different numbers of neurons in the hidden layer. The models are simulated in MATLAB and evaluated on the Massachusetts Institute of Technology-Beth Israel Hospital (MIT-BIH) Arrhythmia Database. The simulation results were analyzed and the best model was compared with previous work. Our analysis indicates that the ANN trained with Bayesian Regularization with twenty neurons in the hidden layer is the best of the models, with an overall accuracy of 83.1%. For the Normal class, sensitivity was 97.4%, specificity 66.7%, and positive predictive value 77.1%. For the supraventricular ectopic beat (SVEB) class, sensitivity was 60%, specificity 86.9%, and positive predictive value 42.9%. For the ventricular ectopic beat (VEB) class, sensitivity was 66.7%, specificity 88.7%, and positive predictive value 66.7%. The comparison with other work indicates that our model outperforms the previous work in terms of sensitivity and overall accuracy.


Introduction
Heart disease is one of the major causes of death and the leading chronic disease globally. As cases of hospitalization for heart failure and the associated mortality increase, it has become a global issue that imposes a huge economic burden on health care systems and governments. The prevalence of heart failure varies between 3 and 20 cases per 1000 population, but it can be as high as 100 per 1000 among persons aged 65 and above (Davis et al., 2000). Mortality is also significant and varies from 5% to 52% depending on severity. Heart failure is also an important cause of hospitalization and re-admission. In Malaysia, for example, heart disease accounts for 6% to 10% of all acute admissions (Chong et al., 2003). Regarding re-admission, 25% of patients are re-admitted within 30 days. According to Krumholz et al. (2009), Joynt and Jha (2011), and Chun et al. (2012), about 50% of patients are re-admitted within 6 months after discharge. The global economic cost of heart disease in 2012 was estimated at about $108 billion per annum (Cook et al., 2014). Heart disease therefore contributes significantly to the health care and economic burden.
Given this economic burden, an effective and efficient means of managing heart disease is indispensable. Remote patient monitoring can help in managing patients with heart disease. In particular, for diagnosing heart disease, health care providers use an electrocardiogram (ECG) to understand the structure and activity of the patient's heart.
Typically, diagnosis using the ECG is performed only in a hospital by an expert. Cardiovascular diseases often require continuous monitoring and close attention to the patient's vital signs, so important information about the patient's condition may be missed if diagnosis relies only on hospital visits. A solution is therefore needed to overcome these limitations.
A remote health monitoring system can monitor a patient's information in real time. Several remote health monitoring systems have been proposed that can monitor vital signs and automatically interpret the patient's ECG data. Such technology not only allows real-time remote monitoring but also lets patients see their health condition immediately on their smartphone. In addition, the patient or the expert can set a notification for cases where an emergency or fast response is needed.
The challenge of ECG signal classification in remote health monitoring is to address the variations in ECG waveforms among different patients. Unless this challenge is addressed, classification can be unreliable and its accuracy tends to vary more on larger databases. Performance can also become inconsistent when classifying the ECG signal of a new patient (Kiranyaz et al., 2015). In this regard, a number of research works have addressed the issue and sought to improve classifier performance.
Machine learning techniques have been used widely in the classification of ECG waveforms (Jambukia et al., 2015). For example, a decision tree was used to classify waveforms as 'Normal' or 'Abnormal' (Leutheuser et al., 2014), and Montaño et al. (2014) used an ANN to classify five beat types: Supraventricular Ectopic Beats (S), Ventricular Ectopic Beats (V), Fusion Beats (F), Normal (N), and Unclassified (Q) beats. The problem in most ECG classification work is the variation of the ECG waveform between patients. This makes classifiers unreliable for clinical use because of the high variation in their accuracy and efficiency on larger databases (Kiranyaz et al., 2015).
Several studies have modelled classifiers that address the inter-patient variations of the ECG waveform. Hu et al. (1997) proposed an approach to patient-adaptable ECG classification based on a mixture of experts. De Chazal et al. (2004) proposed automatic classification of heartbeats using ECG morphology and heartbeat interval features. This well-known challenge in ECG classification still requires more research (Luz et al., 2016). Furthermore, a number of system prototypes have been proposed and developed for generic and automatic ECG classification (Hermawan et al., 2011; Montaño et al., 2014; Xue et al., 2015). However, these models do not specifically address the inter-patient variations of the ECG waveform, which would cause the performance of the algorithm to be inconsistent when classifying the ECG waveform of a new patient (De Chazal and Reilly, 2006; Kiranyaz et al., 2015). Although much research has been done in the area of ECG classification, challenges remain in finding the best and most accurate method for classifying heartbeat types, among them the individuality of ECG patterns and the inter-patient variations of the ECG waveform (Jambukia et al., 2015).
This project investigates ECG classification models for the detection of arrhythmia in a patient monitoring system. After thoroughly analyzing the algorithms used in ECG classification and how they work, we built and analyzed several models suited to our problem. We investigate and analyze the prospects of different ANN algorithms and models for ECG waveform classification, and present a model for ECG classification for the prediction of arrhythmia according to the classes specified by the Association for the Advancement of Medical Instrumentation (AAMI).

Methodology
The analysis and implementation of the algorithm were performed using the dataset recommended by the AAMI for designing medical devices, as specified in ANSI/AAMI EC57:1998/(R)2008. In this experiment, the MIT-BIH Arrhythmia Database was used to evaluate the performance of the proposed ANN models, since it is the most representative database for life-threatening arrhythmia. The database contains 48 recordings, each 30 minutes long, excerpted from 24-hour ECG recordings of 47 individuals. The signals were sampled at 360 Hz and band-pass filtered at 0.1-100 Hz. The database includes heartbeat class labels and timing annotations that were made and verified by independent experts. The operational design of this project consists of several steps, which are illustrated in Fig. 1.
The first step in the system design is data processing and feature extraction. Because differences in the timing of successive heartbeats are useful for detecting arrhythmias such as bradycardia, tachycardia, and premature ventricular contraction, the RR interval is very important in analyzing heart rate from an ECG signal (De Lannoy et al., 2010). The RR interval is defined as the interval between two successive heartbeats, each marked by its fiducial point. The duration of a heartbeat cycle can be measured from the RR interval by calculating the time span between adjacent R peaks. In this project, four features are extracted from the RR intervals: the pre-RR interval, the post-RR interval, the local average RR interval, and the average RR interval. Three beat intervals from the ECG record of subject 123 are shown in Fig. 2. The pre-RR interval is the RR interval between a given heartbeat and the previous heartbeat, while the post-RR interval is the RR interval between a given heartbeat and the following heartbeat. The average RR interval is the mean of the RR intervals in a recording, so its value is the same for all heartbeats in that recording. The local average RR interval is the average of the ten RR intervals surrounding a given heartbeat. After the four RR-interval features were calculated, four feature vectors were formed, one per feature, as shown in Table 1.
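The four RR-interval features described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' MATLAB implementation: the function name `rr_features`, the input format (R-peak times in seconds), and the way the ten-interval window is clipped at the start and end of a recording are all assumptions.

```python
from statistics import mean


def rr_features(r_peak_times, window=10):
    """Return one [pre_rr, post_rr, local_avg_rr, avg_rr] vector per interior beat.

    r_peak_times: ascending R-peak times in seconds (assumed input format).
    window: number of RR intervals averaged around each beat (ten in the text).
    """
    # RR interval i spans R peak i to R peak i + 1.
    rr = [t1 - t0 for t0, t1 in zip(r_peak_times, r_peak_times[1:])]
    if len(rr) < 2:
        return []
    avg_rr = mean(rr)  # identical for every beat in the recording
    half = window // 2
    features = []
    # The first and last beats lack a pre-RR or post-RR interval, so skip them.
    for i in range(1, len(r_peak_times) - 1):
        pre_rr = rr[i - 1]
        post_rr = rr[i]
        # Intervals surrounding beat i, clipped at the edges of the recording
        # (the clipping rule is an assumption; the text does not specify it).
        lo = max(0, i - half)
        hi = min(len(rr), i + half)
        local_avg_rr = mean(rr[lo:hi])
        features.append([pre_rr, post_rr, local_avg_rr, avg_rr])
    return features
```

With annotations given as sample indices, the times would be obtained by dividing by the 360 Hz sampling rate before calling the function.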
The concatenation of the vectors $x_1, \ldots, x_4$ gives a single vector $x$ that is used as the input to the classifier:
\[ x = [x_1, x_2, x_3, x_4] \]
The classes that serve as the dependent variables in this paper are based on the five classes of arrhythmia recommended by the Association for the Advancement of Medical Instrumentation (AAMI). In its standard for the design of medical devices, the AAMI recommends that the 15 arrhythmia classes be grouped into 5 main groups, or superclasses: Normal (N), Ventricular Ectopic Beat (VEB), Supraventricular Ectopic Beat (SVEB), Fusion beat (F), and Unclassified beat (Q). The heartbeat types in the MIT-BIH database that were used in this study are listed in Table 2 along with the corresponding five groups of arrhythmia.
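The grouping of heartbeat types into the five AAMI classes can be expressed as a lookup table. The sketch below uses the grouping of MIT-BIH annotation symbols that is commonly applied in the literature; it is an assumption here, since the paper's Table 2 is not reproduced in this text and may differ in detail. The names `AAMI_GROUPS` and `to_aami` are illustrative.

```python
# Commonly used grouping of MIT-BIH beat annotation symbols into AAMI classes.
# Assumed for illustration; the paper's own Table 2 may differ in detail.
AAMI_GROUPS = {
    "N": ["N", "L", "R", "e", "j"],   # normal and bundle branch block beats
    "SVEB": ["A", "a", "J", "S"],     # supraventricular ectopic beats
    "VEB": ["V", "E"],                # ventricular ectopic beats
    "F": ["F"],                       # fusion beats
    "Q": ["/", "f", "Q"],             # paced and unclassified beats
}

# Invert the table so that each annotation symbol maps to its AAMI class.
SYMBOL_TO_AAMI = {
    symbol: group for group, symbols in AAMI_GROUPS.items() for symbol in symbols
}


def to_aami(symbol):
    """Return the AAMI class for a beat symbol, or None for non-beat annotations."""
    return SYMBOL_TO_AAMI.get(symbol)
```

Returning `None` for unknown symbols lets a caller skip rhythm and signal-quality annotations rather than forcing them into a beat class.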
The next step is the classification of the ECG patterns into classes. In this experiment, we compare different models of the ANN classifier. A two-layer ANN architecture was used, consisting of one hidden layer and one output layer fed by an input layer, as shown in Fig. 3. According to Wang (2003), one hidden layer is sufficient for the vast majority of problems. To find the best ANN model, several models with different numbers of neurons in the hidden layer were compared using different training functions. Four training functions available in MATLAB were used to train the network: trainlm, trainbr, trainscg, and trainrp.
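To make the two-layer architecture concrete, the following is a minimal forward pass in plain Python. It is a sketch under stated assumptions rather than the MATLAB network used in the paper: the tanh hidden activation, the softmax output, the weight initialization range, and the function names are all assumptions, and no training is shown.

```python
import math
import random


def init_network(n_inputs, n_hidden, n_outputs, seed=0):
    """Create randomly initialized weights for one hidden and one output layer."""
    rng = random.Random(seed)

    def layer(n_out, n_in):
        return {
            "W": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
            "b": [0.0] * n_out,
        }

    return {"hidden": layer(n_hidden, n_inputs), "output": layer(n_outputs, n_hidden)}


def affine(layer, x):
    """Compute W x + b for one layer."""
    return [
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(layer["W"], layer["b"])
    ]


def forward(net, x):
    """Map a feature vector to class probabilities."""
    hidden = [math.tanh(z) for z in affine(net["hidden"], x)]  # assumed activation
    z = affine(net["output"], hidden)
    # Numerically stable softmax (assumed output function).
    z_max = max(z)
    exp_z = [math.exp(v - z_max) for v in z]
    total = sum(exp_z)
    return [v / total for v in exp_z]
```

For example, `forward(init_network(4, 20, 5), [0.8, 1.0, 0.85, 0.85])` passes one four-feature RR vector through a network with twenty hidden neurons and five output classes.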
trainlm is the training function that updates the weight and bias values according to the Levenberg-Marquardt algorithm. trainbr is Bayesian Regularization backpropagation: it also updates the weight and bias values according to the Levenberg-Marquardt algorithm, but it minimizes a combination of squared errors and weights and determines the correct combination so that generalization performance is improved. The third training function, trainscg, updates the weight and bias values according to the Scaled Conjugate Gradient algorithm. Finally, trainrp is a training function based on the resilient backpropagation algorithm.

Table 1: Features extracted from the RR intervals
x1 (pre-RR interval): the RR interval between a given heartbeat and the previous heartbeat
x2 (post-RR interval): the RR interval between a given heartbeat and the following heartbeat
x3 (local average RR interval): the average of ten RR intervals surrounding a given heartbeat
x4 (average RR interval): the mean of the RR intervals for a recording

Fig. 1: Operational design: preprocessing, feature extraction, classification and evaluation

Sixteen ANN models were trained based on the architecture shown in Fig. 3. For brevity and ease of identification, each model was given a name, as listed in Table 3. For example, a model that uses the Levenberg-Marquardt (LM) training function is named ANNLM followed by the number of hidden neurons in the model.

After classification, the performance of the classification system is evaluated to gain a better understanding of the model. In this project, the classification models are evaluated in terms of accuracy, sensitivity, specificity, and positive predictive value.
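For reference, the Levenberg-Marquardt and Bayesian Regularization training functions described above are usually stated in the following form. The symbols are not taken from the paper and are introduced here only for illustration: $\mathbf{w}$ is the vector of weights and biases, $\mathbf{e}$ the vector of errors between targets $t_i$ and network outputs $a_i$, $J$ the Jacobian of $\mathbf{e}$ with respect to $\mathbf{w}$, $\mu$ a damping factor, and $\alpha$, $\beta$ the regularization parameters that Bayesian Regularization estimates during training.

```latex
% Levenberg-Marquardt weight update
\Delta \mathbf{w} = -\left( J^{\top} J + \mu I \right)^{-1} J^{\top} \mathbf{e}

% Bayesian Regularization objective: weighted sum of squared errors and squared weights
F(\mathbf{w}) = \beta E_D + \alpha E_W, \qquad
E_D = \sum_{i=1}^{N} (t_i - a_i)^2, \qquad
E_W = \sum_{j=1}^{M} w_j^2
```

A larger ratio $\alpha / \beta$ penalizes large weights more strongly, which is the mechanism by which the regularized network is expected to generalize better.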
Because the dataset contains an unbalanced number of beats from the different classes, accuracy alone is not adequate for measuring the performance of the algorithm. A classifier tends to always predict a normal beat, because the normal beats in the dataset far outnumber the beats of the other classes. For this reason, the algorithm used in this experiment is also evaluated in terms of sensitivity, specificity, and positive predictive value. These performance measures are derived from the numbers of positive conditions, negative conditions, true positives (TP), true negatives (TN), false negatives (FN), and false positives (FP), as described in Table 4. Positive predictive value is the proportion of positive results that are true positive results. It measures the probability that a patient with a positive result truly has the disease. The positive predictive value is calculated as follows:
\[ \text{Positive Predictive Value} = \frac{TP}{TP + FP} \]
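The performance measures above can be computed per class from a confusion matrix by treating each class in turn as the positive condition and all other classes as negative. The following is a minimal sketch; the function names and the confusion-matrix layout (rows are true classes, columns are predicted classes) are assumptions, and overall accuracy is taken as the fraction of beats on the diagonal.

```python
def ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def per_class_metrics(confusion, labels):
    """Compute accuracy plus one-vs-rest sensitivity, specificity and PPV.

    confusion[i][j] is the number of beats of true class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(labels)))
    results = {"accuracy": ratio(correct, total)}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                 # class k beats that were missed
        fp = sum(row[k] for row in confusion) - tp  # other beats labeled as class k
        tn = total - tp - fn - fp
        results[label] = {
            "sensitivity": ratio(tp, tp + fn),
            "specificity": ratio(tn, tn + fp),
            "ppv": ratio(tp, tp + fp),
        }
    return results
```

Because each class is scored one-vs-rest, a classifier that always predicts the majority class gets high accuracy but zero sensitivity on every minority class, which is why the text does not rely on accuracy alone.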

Result and discussion
Regression analysis of the results of the testing and validation phases was performed, as shown in Table 5. The regression analysis compares the actual outputs of the algorithm with the desired outputs using the correlation coefficient R. An R value close to 1 indicates a strong linear relationship between the outputs and the targets, whereas an R value close to zero indicates a poor relationship.
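The correlation coefficient used in this regression analysis can be sketched as a Pearson correlation between network outputs and targets. This is an illustration under the assumption that R is the standard Pearson coefficient; the function name is hypothetical.

```python
import math


def correlation_coefficient(outputs, targets):
    """Pearson correlation R between network outputs and desired targets."""
    n = len(outputs)
    if n != len(targets) or n < 2:
        raise ValueError("outputs and targets must have the same length of at least 2")
    mean_o = sum(outputs) / n
    mean_t = sum(targets) / n
    covariance = sum((o - mean_o) * (t - mean_t) for o, t in zip(outputs, targets))
    spread_o = math.sqrt(sum((o - mean_o) ** 2 for o in outputs))
    spread_t = math.sqrt(sum((t - mean_t) ** 2 for t in targets))
    if spread_o == 0.0 or spread_t == 0.0:
        raise ValueError("R is undefined when outputs or targets are constant")
    return covariance / (spread_o * spread_t)
```

Outputs that track the targets exactly give R = 1, while outputs unrelated to the targets give a value near zero.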
Based on these results, the correlation coefficient for ANNBR20 is the highest among the models. Although the correlation is lower when fewer neurons are used, the Bayesian Regularization based models achieve higher correlation as the number of neurons increases. The correlation coefficients of both ANNBR20 and ANNBR30 are higher than those of the other models.
To assess the performance of the classification, the accuracy of the algorithm is measured. Accuracy is the number of correctly classified patterns, either true negatives or true positives, divided by the total number of patterns classified. It measures the degree of correctness (veracity) of a test on a condition. Accuracy is calculated as follows:
\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \]
where TP, TN, FP, and FN are the numbers of true positives, true negatives, false positives, and false negatives.

The graph of the analysis during training in Fig. 4 shows that the line corresponding to ANNBR rises as neurons are added. This indicates that the number of neurons is an important parameter that influences the performance of the ANNBR model.

Fig. 4: Regression analysis on training result vs number of neurons
For a more precise analysis, the graph of the analysis during validation is provided in Fig. 5. On the validation results, the correlation coefficient of ANNBR is highest when the number of neurons is twenty. Adding more neurons, for example thirty, does not give a better result for ANNBR. Comparing the two graphs reveals that, for the ANNBR models, the difference between the training and validation correlation coefficients is much smaller with twenty neurons than with thirty. In contrast, the models that use the other algorithms, Levenberg-Marquardt (LM), Resilient Backpropagation (RP), and Scaled Conjugate Gradient (SCG), show that increasing the number of neurons has no influence on the correlation coefficient. Based on this, we decided that the ANNBR models are better than the other models.

Fig. 5: Regression analysis on validation vs number of neurons
The accuracy of the different ANN models is listed in Table 6 and plotted in Fig. 6. In the graph, ANNBR shows better accuracy than the other models and a smooth line. This again indicates that the number of neurons influences the performance of the ANNBR models. The accuracy starts to increase when the number of neurons is increased to twenty. The other models are not consistent even as neurons are added. For example, the accuracy of ANNRP fluctuates: it is high with twenty neurons but drops drastically when the number of neurons is increased to thirty. Because the ANNRP results fluctuate and its correlation coefficient is low, as discussed previously, it is not considered a good model and probably tends to be overtrained when the number of neurons is increased. ANNBR, on the other hand, shows consistent accuracy. Based on the correlation coefficient results discussed previously, ANNBR20 achieves a better result than ANNBR30. For this reason, we decided that ANNBR20 is a better classification model than ANNBR30. With twenty neurons, the problem of overtraining is reduced compared with thirty neurons. A model with too many neurons becomes too complex and overtrained, which leads to a worse result (Alman and Ningfang, 2002).
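The selection reasoning above (prefer the hidden-layer size with the highest validation correlation, and treat a large gap between training and validation correlation as a sign of overtraining) can be written as a small helper. This is only a sketch of that reasoning: the function name and the tie-breaking rule are assumptions, and it is not claimed to be the procedure the authors followed.

```python
def pick_hidden_size(train_r, val_r):
    """Choose a hidden-layer size from correlation coefficients.

    train_r and val_r map hidden-layer size -> correlation coefficient R.
    The size with the highest validation R wins; ties are broken in favor of
    the smaller training/validation gap (assumed rule).
    """
    sizes = sorted(set(train_r) & set(val_r))
    if not sizes:
        raise ValueError("no hidden-layer size has both training and validation R")
    return max(sizes, key=lambda h: (val_r[h], -abs(train_r[h] - val_r[h])))
```

As a usage example with made-up numbers (not the values in Table 5), `pick_hidden_size({10: 0.60, 20: 0.70, 30: 0.80}, {10: 0.55, 20: 0.68, 30: 0.62})` selects 20: the thirty-neuron model fits the training data best but validates worse.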

Fig. 6: Accuracy on testing vs number of neurons
The accuracy of all the ANN models is shown as a bar chart in Fig. 7. The accuracy rate of each model is based on the testing phase results. Accuracy measures the proportion of correctly classified patterns. The most accurate model is ANNBR20, which is based on the Bayesian Regularization algorithm. A comparison by Kayri (2016) likewise found that Bayesian Regularization achieved higher accuracy than the LM algorithm. This further indicates that the ANNBR model is the best among the models.

Fig. 7: Comparison of the accuracy rate for all the ANN models
Sensitivity is specific to each class rather than an overall measure of the classification. Table 7 lists the sensitivity of the classifiers in detecting the SVEB, VEB, and Normal classes, and the results are plotted in Fig. 8. High sensitivity indicates the ability of the classifier to detect a class correctly. Most of the models achieve high sensitivity in detecting normal beats. This is probably because normal beats dominate the other classes in the ECG data, so the normal class is classified better than the other beat types. The Ventricular Ectopic Beat (VEB) class, on the other hand, is a less dominant class. Here, Resilient Backpropagation models such as ANNRP10 show low sensitivity on the VEB class, whereas the Bayesian Regularization models, especially ANNBR20 and ANNBR30, still show relatively consistent results compared with the other models.

Specificity is the proportion of actual negatives that are correctly classified as negative by a diagnostic test. A diagnostic test with high specificity is good at identifying the other beat conditions, that is, the negative condition. As shown in Fig. 9 and Table 8, most of the models show high specificity on SVEB and VEB, and the specificity curves are consistent, without any extreme pattern. The results indicate that the models can distinguish the SVEB and VEB classes from the other beat types. In other words, they can efficiently detect that a beat of another class does not belong to the SVEB or VEB group. However, the normal class has low specificity, which is probably related to the same issue: the other classes (e.g. SVEB and VEB) are less dominant in the dataset. The result of ANNBR20 remains consistent here, indicating that Bayesian Regularization performs better than the other models in this classification problem.
Precision, also called the positive predictive value, is the proportion of positive results that are true positive results. It measures the probability that a patient with a positive result truly has the disease. As shown in Table 9 and Fig. 10, the Normal beats attain a higher positive predictive value than the SVEB and VEB classes. Because Normal beats dominate the other classes in the dataset, they yield more true positives.
The graph in Fig. 10 further illustrates the comparison, showing the positive predictive value of each classification model for the Normal, SVEB, and VEB beats. As with sensitivity, the positive predictive value results indicate that Resilient Backpropagation (e.g. ANNRP10) does not perform well when classifying the VEB class. This is because the VEB class is not a dominant class in the dataset. The best models for classifying this kind of class are the Bayesian Regularization models (e.g. ANNBR20).

Comparison with the previous work
The results are compared with previous work. Because ANNBR20 showed consistent results and the highest performance among the models, we use this model for comparison with the similar work of Luz et al. (2013), as shown in Table 10. As shown in Fig. 11, both methods achieved higher sensitivity on the Normal class than on the other classes. This could be attributed to the Normal class having a higher proportion of beats than the other, less dominant classes. In terms of performance, the proposed model achieved a better result than the previous work, which was unable to classify the SVEB type. The SVEB and VEB classes can be hard to classify because beats of these classes are less numerous (Huang et al., 2014). Nevertheless, our proposed model still produces a significant result on the SVEB class.

In terms of specificity, both methods achieved high results on the SVEB and VEB classes, as shown in Fig. 12. As mentioned previously, specificity is the correct classification of true negatives; that is, it measures the capability of the model to distinguish other beats from SVEB and VEB. The classifier is probably able to detect the normal class easily as true negatives when the specificity of SVEB and VEB is evaluated. The low specificity of the normal beats is probably because, when specificity is measured on the normal class, the models have difficulty detecting the less dominant classes, such as SVEB and VEB, as true negatives.
We also compare the accuracy of the proposed model and the previous work, as illustrated in Fig. 13. The current work achieves better accuracy than the previous one. It is important to note that there are significant challenges in finding an accurate ECG classification model for inter-patient or patient-specific classification schemes. The models not only have to meet the patient-specific requirement, which is already a challenging task; ECG recordings also often contain highly unbalanced heartbeat classes, which makes the less dominant heartbeat types more difficult to distinguish. In the literature, many attempts to develop a method that solves this problem have led to unsatisfactory results (Huang et al., 2014).

Conclusion
The main focus of this project was the analysis of the ECG signal for the classification and detection of arrhythmia. All relevant information has been documented in this paper as planned. The significance of this study is that we performed a comparative study of different ANN models for ECG signal classification to detect arrhythmia. The best parameter combination among the models was identified, and we compared our model with previous work. The result was that our model outperforms the previous work in terms of sensitivity and overall accuracy. In future work, this project can be further tested and implemented in a real environment or in a mobile application for remote patient monitoring. We hope that this research contributes to the area of ECG signal classification and, in particular, to a solution for remote patient monitoring for the well-being of patients and the community.