An integrated approach on verification of signatures using multiple classifiers (SVM and Decision Tree): A multi-classification approach

Article history: Received 17 November 2020; received in revised form 11 February 2021; accepted 15 November 2021

A signature is a handwritten representation that is commonly used to validate and identify the writer. An automated verification system is needed to verify identity. A signature displays a variety of dynamic and static characteristics that differ with time and place. Many researchers have already proposed algorithms to improve the feature extraction stage of signature verification systems. This paper combines two different methods to address the verification of signatures captured digitally, manually, or by other means. The characteristics of the signature were obtained with two widely used machine learning methods (support vector machine and decision tree), and the characteristics were listed after measuring their effects. Experiments were performed on signature databases in various languages. Higher precision was obtained from the extracted features.


Introduction
A signature is a graphical depiction of the writer's name and provides one way to authenticate a person. A human signature is based on behavioral and physical characteristics and falls into two categories: online signatures and offline signatures. The two differ in the way they are signed and in the sequence of features they contain.

Digital signature vs handwritten signature
Online signatures are captured using a digitizer with a stylus, and dynamic features are recorded while the person writes in the space provided (Kiani et al., 2009). Offline signatures, in comparison, are collected with pen and paper in an open space given to the signer, and the features collected are static in nature. According to previous research, online signature verification is more accurate than offline verification (Swain et al., 2020; Saeed et al., 2020). One major factor that affects overall performance is noise: the higher the noise in the input signatures, the lower the performance of the verification system. The key objective of a signature verification system (SVS) is to assign human signatures to defined classes, either genuine/original or forged. Every time a person signs, certain variations appear due to stress, environmental conditions, and other physical factors. These are called inter-personal and intra-personal variations.

 Forgery classification
Forgeries are further divided into classes such as random, skilled, and unskilled, depending upon the presentation (Kiani et al., 2009):
• Random: where the signer knows the presentation of the signature.
• Skilled: where the signer knows the name and the representation of the signature.
• Unskilled: where the signer does not know the name and the signature.

Signature verification system
In this system (Guru and Prakash, 2008; Vargas et al., 2009; Jarad et al., 2014), the signature image is normalized and checked to see whether it matches the original image. An SVS can be used for security purposes, such as verification when assessing entry applications and as a substitute for passwords. Signature verification has four stages, namely acquisition and preprocessing, feature extraction, classification, and verification. The block diagram of the signature verification system is given in Fig. 1.

[Fig. 1 block labels: Acquisition, Pre-processing, Feature Extraction, Classifier, Verification]

Fig. 1: Signature verification system

Phases and their description in the signature verification system

Step 1: Acquisition and pre-processing: Signature images are collected with pen and paper or with any digital device. The pre-processing phase prepares the signature image for the next phase, i.e., feature extraction. This stage includes binarization, rotation, scaling, thinning, cropping, and other operations. Fig. 2 shows the steps involved in pre-processing and how the signature image becomes clean and clear for the next step.
Step 2: Feature extraction: This is the main part of signature verification and the core of the system. Features are classified into two classes, local and global, and each class has its own set of features that need to be extracted when calculating the forgery factor from the signature image. Local and global features and their sub-divisions are described in Fig. 3.
Step 3: Classification: Classification plays a vital role in the verification process, in which the system is trained on signature images using a single classifier or multiple classifiers. Researchers have proposed different learning approaches to train on the input. Some of the classifiers used for this process are listed in Fig. 4.
Step 4: Verification: In this step, a decision is produced in the form of scores, which are then matched against the score of the original signature. This is accomplished with a distance measure and decision rules, e.g., Euclidean distance, Mahalanobis distance, etc. Previous authors have reported the false acceptance rate (FAR), false rejection rate (FRR), equal error rate (EER), and other measures as results. The distance measures are as follows:
1. Euclidean distance: measures the distance between the points of two signature images I and J.
2. Mahalanobis distance: measures the distance between two feature vectors while taking the covariance matrix of the features into account (Qiao et al., 2011).
M is the mean value, i is the input image, and n is the set of signature images. M is also given as the set of signatures, and Ref S is the reference signature trained using the classifiers. These measures support the process of finding the forgery factor from signatures.
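The formulas for the two distance measures did not survive in the text above, so the following is only a minimal sketch that uses the standard textbook definitions, not necessarily the exact form the authors used. The function names are hypothetical, and the Mahalanobis version is restricted to two-dimensional feature vectors so that the covariance matrix can be inverted by hand.

```python
import math

def euclidean_distance(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def mean_and_covariance_2d(samples):
    """Mean and sample covariance (n - 1 denominator) of 2-D feature vectors."""
    n = len(samples)
    mx = sum(s[0] for s in samples) / n
    my = sum(s[1] for s in samples) / n
    sxx = sum((s[0] - mx) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - my) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mx) * (s[1] - my) for s in samples) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))

def mahalanobis_2d(x, mean, cov):
    """Mahalanobis distance of a 2-D vector x from mean under covariance cov."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # The inverse of [[a, b], [c, d]] is (1 / det) * [[d, -b], [-c, a]].
    quad = (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det
    return math.sqrt(quad)
```

With an identity covariance matrix the Mahalanobis distance reduces to the Euclidean distance, which is a convenient sanity check when comparing a test signature's features against the reference set.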

Research motivation
Our main aim is to develop a secure system that finds the forged signature in a set of signatures, whether offline or online. The main challenge is skilled forgery, where the chance of a forgery being accepted is very high compared with the other types. Signature features are broadly classified into global, local, and transitional features. Global features describe the entire signature, including length, width, height, etc., whereas local features consider small regions of the signature and extract detailed information from them. Verification depends on two steps: extraction and classification. Richer feature extraction and better classification produce higher accuracy and improve performance.

Classifiers used to calculate forgery factor using different datasets of offline and online signature
Based on previous approaches, Table 1 shows the features extracted, the classifiers used to detect forgery, and the accuracy obtained. Experiments were carried out on various signature datasets: most researchers used already available signature data, while some authors created and used their own datasets.

Proposed work
The objective of our research is to design a robust integrated signature verification system. We have already studied the different types of forgeries present in online and offline signatures. For this purpose, a database containing skilled forgeries has been used. The proposed system tries to identify unique features from the signatures of a person, so deliberately altered or fake inputs may lower verification rates. A support vector machine can produce a better classification. One major disadvantage of the SVM classifier is that it slows down as the amount of input grows.
To overcome the complexity of the SVM, adding a decision tree has previously produced better results: a hybrid SVM model was proposed that embeds the C4.5 decision tree algorithm into the SVM, resulting in a more accurate and efficient hybrid classifier. We then introduce the modified DT-SVM algorithm (Maergner et al., 2019; Blankers et al., 2009) by adding a new method, Probability-based Distance as Splitting Criterion, which uses the distances in the frequency distribution of the instances. The modified DT-SVM thus provides better performance than the previous decision tree and SVM (Ferrer et al., 2005) in terms of computational complexity and overall accuracy. We have emphasized reducing the input space using the new splitting criterion. The proposed algorithm is expected to show better performance than previous research.
i. Input signature: The input signatures are collected from persons, and each signature has its own x-y coordinates, which are calculated from the pen-up and pen-down points. Every single point is stored in the database. The signature can be represented in the form of Cartesian coordinates, i.e., x-y coordinates. The coordinates are saved in the system in a .txt file. The databases for online and offline signatures are separate.
The output for the signature image is produced in terms of the forgery factor. The input signature is initially scaled and its length L_s is calculated. Its velocity vector V_new, which is directly comparable with each sample signature λ(i), is then found and rotated in a clockwise direction to produce a new velocity vector V_new λ(i) with initial direction V_new λ(i+j). Large values of the 'forgery index' then indicate that the new signature is forged (Bharadwaja, 2015). The forgery index is normalized (λ_new) to lie in the range 0 to 1, and we assume it to be small for original input signatures. In hypothesis-testing terms, λ_new is a test statistic for the null hypothesis that the new signature is original.
iii. Pre-processing: Pre-processing (Sheng et al., 2005) is another important phase of the verification system. It improves the overall accuracy and reduces the computational needs of the feature extraction phase. The main purpose of this phase is to produce a standardized form of the signature and prepare it for the next phase, i.e., feature extraction. It primarily includes the following steps: noise removal, conversion, resizing, thinning, normalization, transformation, and smoothing. All of these help prevent the developed system from misclassifying the original signature.
Size normalization is done by scaling each character, each node, and every directional point both horizontally and vertically.
Here W and H are the width and height of the normalized signature, respectively. The re-sampling step SΔ is a fraction of the total arc length L. The equations below give the data points in the signature.
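The normalization and re-sampling equations are not reproduced in the text, so the sketch below illustrates one common way to do both steps under stated assumptions: each axis is scaled independently into a W x H box, and the pen trajectory is re-sampled by linear interpolation at equal steps of SΔ = fraction * L. The function names and the independent-axis scaling are assumptions, not the authors' formulas.

```python
import math

def normalize_size(points, width, height):
    """Scale (x, y) points into a width x height box, each axis independently."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, y_min = min(xs), min(ys)
    x_span = (max(xs) - x_min) or 1.0  # avoid division by zero for flat strokes
    y_span = (max(ys) - y_min) or 1.0
    return [((x - x_min) / x_span * width, (y - y_min) / y_span * height)
            for x, y in points]

def arc_length(points):
    """Total length L of the polyline through the points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def resample(points, fraction):
    """Re-sample a polyline at equal arc-length steps of fraction * L."""
    total = arc_length(points)
    if total == 0 or not 0 < fraction <= 1:
        return list(points)
    step = fraction * total
    out = [points[0]]
    travelled = 0.0  # arc length covered before the current segment
    target = step    # arc length at which the next sample is placed
    for p, q in zip(points, points[1:]):
        seg = math.dist(p, q)
        while seg > 0 and travelled + seg >= target - 1e-12:
            t = min((target - travelled) / seg, 1.0)
            out.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
            target += step
        travelled += seg
    return out
```

For example, re-sampling a straight stroke with fraction 0.25 yields five evenly spaced points, so signatures written at different speeds end up with comparable point counts.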
Noise removal: Fig. 6 shows an example of a signature taken from the database for pre-processing. For noise removal, a modified Canny edge algorithm is defined, in which the Sobel operator is used to detect the edges of the image (Mathur et al., 2016; Gao et al., 2010; Sheng et al., 2005; Vincent and Folorunso, 2009). The standard Canny detector uses a Gaussian filter, which loses informative edges so that only isolated edges appear. In our system, both static and dynamic features are available in the signature, so the modified Canny edge algorithm was used to improve edge detection.
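As an illustration of the Sobel step mentioned above, the following sketch computes the Sobel gradient magnitude of a grayscale image and thresholds it into an edge map. It is not the authors' modified Canny algorithm (non-maximum suppression and hysteresis are omitted), and the image representation (a list of rows of pixel intensities) and function names are assumptions made for the example.

```python
import math

SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))

def sobel_magnitude(image):
    """Gradient magnitude of a grayscale image given as a list of rows.

    Border pixels are left at 0 because the 3x3 window does not fit there.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0
            for i in range(3):
                for j in range(3):
                    pixel = image[r + i - 1][c + j - 1]
                    gx += SOBEL_X[i][j] * pixel
                    gy += SOBEL_Y[i][j] * pixel
            out[r][c] = math.hypot(gx, gy)
    return out

def edge_map(image, threshold):
    """Binary edge map: 1 where the Sobel magnitude reaches the threshold."""
    return [[1 if m >= threshold else 0 for m in row]
            for row in sobel_magnitude(image)]
```

A flat region produces zero magnitude everywhere, while a sharp vertical stroke boundary produces a strong response in the columns next to the boundary.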
Step 1: Take an input signature I, where I > 0.
Step 2: Convert image I (grayscale image) into Ib, i.e., a binary image.
Step 3: Apply a morphological operation on Ib. If Ib < 0, go to Step 1; otherwise continue.
Step 4: After the normalization operation, Ib is converted to the normalized image In.
Step 5: Apply Canny edge noise removal using the Sobel filter (Shokhan, 2014).
Step 6: Calculate the error from Ib.
Step 7: Convert image Ib to image Inew.

Fig. 6: Example showing the signature generated from database for the preprocessing

iv. Feature extraction: Feature extraction (Sharif et al., 2018) is the core of the verification process. It extracts different attributes and characteristics from the given image and creates a matrix for further use. Feature extraction is broadly divided into three main categories: global, local, and geometrical. Further, our system has two phases: training and testing. In the training phase, we applied a support vector machine and then classified the image. The output is displayed in the form of a feature matrix. In the testing phase, we applied the modified SVM-DT algorithm, which is the proposed algorithm. For decision function mapping with the support vector machine (Shao et al., 2013; Zuo and Jia, 2016; Boonchuay et al., 2017; Nazari and Kang, 2015), F(I) represents the decision function of the new signature image, and K is the kernel function over the coordinates I, Ix that defines the feature matrix.
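The decision function F(I) itself is not reproduced in the text. A minimal sketch of the standard kernel SVM decision function, F(x) = sum_i alpha_i * y_i * K(x_i, x) + b, is given here for orientation. The RBF kernel, the gamma value, and the toy support vectors, coefficients, and bias are all hypothetical choices for illustration and are not taken from the paper.

```python
import math

def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def decision_function(x, support_vectors, alphas, labels, bias,
                      kernel=rbf_kernel):
    """F(x) = sum_i alpha_i * y_i * K(x_i, x) + b over the support vectors."""
    return bias + sum(a * y * kernel(sv, x)
                      for sv, a, y in zip(support_vectors, alphas, labels))

def classify(x, support_vectors, alphas, labels, bias):
    """Label a feature vector by the sign of the decision function."""
    score = decision_function(x, support_vectors, alphas, labels, bias)
    return "genuine" if score >= 0 else "forged"

# Hypothetical toy model: one genuine (+1) and one forged (-1) support vector.
SUPPORT_VECTORS = [(0.0, 0.0), (4.0, 4.0)]
ALPHAS = [1.0, 1.0]
LABELS = [+1, -1]
BIAS = 0.0
```

In a trained model the support vectors, coefficients, and bias come from the optimizer; only the evaluation step is shown here.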
The split function is used to calculate the feature space, where cur_value denotes the current value of the features extracted from the original and forged signatures:
    if cur_value > SplitCriteria then return √SplitCriteria
Distances between the features of the signatures need to be calculated using the frequency distribution function, which is used to map between unbalanced and balanced datasets of signatures.
When two classifiers are integrated, problems arise in their mapping because we have both balanced and unbalanced data. To map these data, we add the Bhattacharyya coefficient to the proposed algorithm, using Probability-based Distance as Splitting Criterion.
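The Bhattacharyya coefficient measures the overlap between two frequency distributions. The sketch below uses its standard definition for discrete histograms, BC(p, q) = sum_k sqrt(p_k * q_k), together with the associated distance -ln(BC). How the authors plug this into their splitting criterion is not specified in the text, so the function names and the histogram inputs are assumptions.

```python
import math

def normalize(hist):
    """Convert non-negative bin counts into a probability distribution."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must contain at least one count")
    return [h / total for h in hist]

def bhattacharyya_coefficient(p_counts, q_counts):
    """Overlap between two histograms: 1 for identical, 0 for disjoint."""
    if len(p_counts) != len(q_counts):
        raise ValueError("histograms must have the same number of bins")
    p, q = normalize(p_counts), normalize(q_counts)
    return sum(math.sqrt(a * b) for a, b in zip(p, q))

def bhattacharyya_distance(p_counts, q_counts):
    """Distance -ln(BC); infinite when the histograms do not overlap."""
    bc = bhattacharyya_coefficient(p_counts, q_counts)
    return math.inf if bc == 0 else -math.log(bc)
```

Because the histograms are normalized first, the coefficient can compare feature distributions from classes with very different sample counts, which is relevant when genuine and forged sets are unbalanced.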

Implementation and result analysis
In our research, two different experiments were carried out, one for the training phase and the other for the testing phase. The first experiment was conducted on the training phase of the system, where signature features are extracted. The second experiment was applied in the testing phase, using the proposed SVM-DT algorithm on the features of the captured signature data. A performance evaluation was done for each phase. The proposed integrated signature verification system includes a database of original signatures, which contains all the information about the features. The set of features captured at the time of feature extraction is compared with the set of features of the forged signatures to verify the status of the signature. To compare these feature sets, we propose an algorithm in which multiple classifiers, a support vector machine and a decision tree (SVM-DT), are applied. Fig. 7 shows some sample signatures collected from the dataset.

Fig. 7: Sample signature set for the verification purpose

In this system, we took a set of 1036 signatures in total. Out of these, 192 were used for training and 47 for testing with the proposed DT-SVM algorithm. Using the proposed algorithm, the system reported the false acceptance rate and false rejection rate for both global and local features. The results are very promising and reduce the rate of accepted forgeries. The verification results of the SVM and DT-SVM methods are given in Table 2. Fig. 8 shows a graphical representation of the implementation of the verification system, and Fig. 9 shows the confusion matrix of the integrated signature verification system.
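To make the reported error rates concrete, the sketch below computes FAR and FRR from similarity scores at a given threshold, and approximates the equal error rate by scanning candidate thresholds. The score convention (higher means more similar to the reference, accept when score >= threshold) and the function names are assumptions; the paper does not state how its scores are thresholded.

```python
def far_frr(genuine_scores, forgery_scores, threshold):
    """Return (FAR, FRR) when a signature is accepted if score >= threshold.

    FAR: fraction of forgeries that are accepted.
    FRR: fraction of genuine signatures that are rejected.
    """
    false_accepts = sum(1 for s in forgery_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    return (false_accepts / len(forgery_scores),
            false_rejects / len(genuine_scores))

def approximate_eer(genuine_scores, forgery_scores):
    """Scan observed scores as thresholds; return (EER estimate, threshold).

    The EER estimate is the mean of FAR and FRR at the threshold where
    the two rates are closest.
    """
    best = None
    for t in sorted(set(genuine_scores) | set(forgery_scores)):
        far, frr = far_frr(genuine_scores, forgery_scores, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]
```

Raising the threshold lowers FAR at the cost of a higher FRR, which is why both rates are normally reported together.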

Characteristics of performance analysis
In this section, we have calculated the true positive, false negative, and precision values for the signature data in the training and testing phases (Tables 3 and 4).
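The quantities in this section follow from the confusion matrix. A minimal sketch using the standard definitions is shown below; the label strings and function names are hypothetical, and "genuine" is treated as the positive class by assumption.

```python
def confusion_counts(y_true, y_pred, positive="genuine"):
    """Count TP, FP, FN, and TN for a binary labeling."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["TP" if truth == positive else "FP"] += 1
        else:
            counts["FN" if truth == positive else "TN"] += 1
    return counts

def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is 0."""
    return numerator / denominator if denominator else 0.0

def summary_metrics(counts):
    """Precision, recall (true positive rate), and accuracy from the counts."""
    tp, fp, fn, tn = (counts[k] for k in ("TP", "FP", "FN", "TN"))
    return {
        "precision": safe_ratio(tp, tp + fp),
        "recall": safe_ratio(tp, tp + fn),
        "accuracy": safe_ratio(tp + tn, tp + fp + fn + tn),
    }
```

Precision answers how many signatures accepted as genuine really were genuine, while recall answers how many genuine signatures were accepted.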

ROC curve
For the integrated signature verification system, we have plotted the receiver operating characteristic (ROC) curve to illustrate the performance of the proposed system. It is a graphical representation of the results in which the x-axis represents the false positive rate and the y-axis represents the true positive rate. In the given curves, three different signature sets are represented, each as a separate class. Figs. 10, 11, and 12 show the ROC curves generated at different threshold values.

In the end, it is clear that combining two classifiers gives better, more accurate, and more reliable results than the previously used classifiers. The main aim of our research is to show the reduction in the error rate for the signature image and to find the forgery rate from the signature. From section 5 above, certain scores for the signatures have been generated. The experimental analysis was done using a machine learning tool.
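As a sketch of how such a curve is produced, the code below sweeps a threshold over classifier scores and records one (false positive rate, true positive rate) point per threshold, then estimates the area under the curve with the trapezoidal rule. It handles a single binary class; for the three classes mentioned above, one curve per class in a one-vs-rest fashion would be a natural choice, but that is an assumption. The function names are hypothetical.

```python
def roc_points(labels, scores):
    """ROC points as (FPR, TPR) pairs, from the strictest threshold down.

    labels: 1 for the positive class, 0 for the negative class.
    scores: higher means more likely positive.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for l, s in zip(labels, scores) if s >= t and l == 1)
        fp = sum(1 for l, s in zip(labels, scores) if s >= t and l == 0)
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoidal estimate of the area under the ROC points."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))
```

A curve that hugs the top-left corner (area close to 1) indicates that genuine and forged signatures are well separated across thresholds.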

Results
As expected, the accuracy of the integrated system is much better than that of previous systems. The accuracy achieved is 96.6%. With a single classifier, systems are not able to account for the variability within a person's signatures. The system is generally expected to yield better results when more classifiers are applied to the reference signatures.

Conclusion and future scope
This paper presents a brief survey of various features and methods for classifying the set of features extracted from the signature image. These approaches are studied according to their different stages, and a performance evaluation based on FRR and FAR is given. In addition, they can be analyzed for efficiency to obtain better results. In future work, there is a need to develop one general system that classifies every style of signature and enhances performance.

Conflict of interest
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.