Recognition of Handwritten English Numerals Based on Combining Structural and Statistical Features

Pattern recognition is generally considered a major challenge in many information processing research fields. The aim of this paper is to propose a highly accurate model for recognizing handwritten English numerals by efficiently extracting the most valuable features of a handwritten numeral or digit. The Handwritten English Numerals Recognition Model (HENRM) is proposed in this paper. Its feature extraction is based on combining both statistical and structural features of the numeral sample image. The proposed HENRM has four phases: image acquisition, image preprocessing, feature extraction, and classification. Four feature extraction approaches are used: the number of intersection points, the number of open-end points, the density feature, and the chain code of each English numeral. The feature extraction phase produces a feature vector of 27 elements that is fed into a Multi-class Support Vector Machine (MSVM) classifier. The experimental results show that the proposed HENRM achieves an average recognition rate of 97%.


I. INTRODUCTION
Recently, the field of pattern recognition, with its numerous applications, has become one of the most valuable and attractive research fields [1], [2]. Equally important, image processing plays a central role in successfully recognizing the required data pattern.
Handwritten character recognition is one of the topics within the pattern recognition research field. By its nature, handwriting style varies from one person to another, and this variation increases the difficulty of automatically recognizing a handwritten character by machine [3].
The main goal of a basic character recognition model is to extract meaningful information from a captured or scanned image so that it can be used to recognize a character, which may be a symbol, letter, or digit. The useful information obtained from the raw captured image is usually called features, while the process of obtaining these features is called feature extraction [4]. Before feature extraction, the raw image must be preprocessed and enhanced through several steps whose number and types depend on the quality of the raw image. Another essential process normally used in handwritten character recognition is classification [5]. The role of this process is to analyze the extracted features of the character image and assign the image to its corresponding character.
DOI: https://doi.org/10.33103/uot.ijccce.21.1.7
Handwritten numeral recognition is one of the applications in the area of pattern recognition, and a handwritten numeral recognition model follows the same concepts described above. The preprocessing phase cleans and enhances the image by eliminating noise from the scanning process that may later cause a poor recognition rate. The feature extraction phase extracts the most relevant information from the preprocessed image; this information describes the character numerically as a feature vector, which the classification phase uses to identify each character and assign it to the correct character class.
Researchers have addressed handwritten character recognition using different techniques and methods for preprocessing, feature extraction, and classification.
Binod Kumar Prasad et al. presented a system for recognizing rotated English numerals of the same size using two feature extraction techniques, delta distance coding and pixel moment of inertia, with classification by SVM and an accuracy of 99.17% [6]. H. A. Jeiad presented a model for recognizing handwritten Indian numbers in which features are extracted using four parameters; the 16 features of each handwritten number were used to train and test a multi-class SVM classifier with an accuracy of around 97% [7]. Parshuram M. Kamble et al. proposed geometrical feature extraction for Marathi handwritten character recognition. Features were extracted using connected-pixel-based measures such as area, perimeter, eccentricity, orientation, and Euler number, and recognition was performed by KNN and SVM [8]. Ben-Zheng Li et al. presented a spiking neural network (SNN) to emulate a 2D representation of odor information using the MNIST dataset of handwritten digit images. The SNN was trained with spike-timing-dependent plasticity (STDP) in an unsupervised mode. The results showed that the SNN improves the encoding of the 2D neural representation with temporal codes and obtains reasonable accuracy, close to the behavior of animals [9]. Aline A. Peres et al. proposed a handwritten character identification algorithm in which the Histogram of Oriented Gradients (HOG) descriptor is used for feature extraction. Three classification approaches were used, SVM, CNN, and a hybrid CNN+SVM, and the last achieved high performance with an accuracy of 96.5% [10].
Mohamed Elleuch et al. proposed a recognition approach for handwritten Arabic characters using a Deep Belief Neural Network (DBNN), which is distinguished by its ability to handle high-dimensional inputs, making it possible to use raw data as input instead of an extracted feature vector, and to transfer between classes smoothly [11]. Pritam Dhande et al. presented cursive English handwriting recognition in which horizontal and vertical projection methods are used for segmentation, the convex hull algorithm is used for feature extraction, and SVM is used for recognition [12].
In our model, the image is first preprocessed: it is converted to grayscale; its contrast is enhanced by histogram equalization to equalize the brightness level; it is binarized by adaptive thresholding using Otsu's method; it is resized to obtain the region of interest; the Canny method is used to detect the edges of the character by filtering noise and growing the edge regions; and finally thinning is applied to obtain the skeleton image. Secondly, the feature vector is obtained by combining four kinds of features: the number of intersection points, the number of open-end points, nine density features obtained by dividing the skeleton image into 9 zones and calculating the average number of foreground pixels in each zone, and finally the 16-element chain code obtained by the tracking process. The resulting feature vector of 27 elements is used as input to train a Multi-class Support Vector Machine (MSVM) classifier. Recognition is performed by comparing the features of the input image with the stored ones and then choosing the best match among the classes.
The remainder of the paper is organized as follows: an in-depth overview of the proposed model is presented in Section II. Section III presents the experimental results. The conclusion and future work are given in Section IV.

II. THE PROPOSED MODEL
This section describes the handwritten English numerals recognition model, abbreviated as HENRM. The model consists of four main phases: image acquisition, preprocessing, feature extraction, and classification. The block diagram of the proposed model is shown in Fig.1, and its phases are detailed in the following subsections.

A. Image Acquisition
The proposed HENRM works in offline mode, meaning that the recognition process is based on an acquired image. The acquired image is assumed to contain the handwritten English numeral to be recognized. The image is usually captured with a scanner or digital camera and converted into a suitable image file format. The handwritten numerals are then manually cropped so that each one is separated individually and stored in a file.

B. Image Preprocessing
The preprocessing phase is essential for the subsequent phases of the proposed HENRM. The scanned input image must be prepared to make it suitable for the next phase, feature extraction; preprocessing enhances the image quality so that features can be extracted more easily and accurately. The preprocessing phase comprises a number of consecutive stages, shown in the block diagram in Fig.2. First, handwritten samples of English digits are collected, scanned, and stored in digital format. All the scanned images are loaded from a specific file and converted into grayscale images, and then their contrast is enhanced using the histogram equalization (HE) technique, which tends to shift the mean brightness of the image toward the middle of the gray-level range.
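The histogram equalization stage can be illustrated with a minimal Python sketch. It assumes an 8-bit grayscale image stored as a list of rows of integers; the function name `equalize_histogram` is hypothetical, and the code is only an illustration of the general technique, not the Matlab routine used in the experiments.

```python
def equalize_histogram(gray, levels=256):
    """Histogram-equalize a grayscale image given as a list of rows of ints."""
    flat = [p for row in gray for p in row]

    # Histogram of gray levels.
    hist = [0] * levels
    for p in flat:
        hist[p] += 1

    # Cumulative distribution of the histogram.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = min(c for c in cdf if c > 0)
    n = len(flat)
    if n == cdf_min:
        # Constant image: nothing to spread out.
        return [row[:] for row in gray]

    # Look-up table mapping each old level to its equalized level.
    lut = [max(0, round((c - cdf_min) * (levels - 1) / (n - cdf_min)))
           for c in cdf]
    return [[lut[p] for p in row] for row in gray]


# Hypothetical 2x2 image: the darkest level maps to 0, the brightest to 255.
sample = [[50, 50], [100, 200]]
equalized = equalize_histogram(sample)
```

The look-up table stretches the occupied gray levels over the full range, which is what spreads the brightness distribution of a low-contrast scan.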
The grayscale image is then converted to a binary image by applying adaptive thresholding based on Otsu's method. This method separates the pixels into a foreground class and a background class.
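As an illustration, Otsu's method can be sketched in plain Python as a search for the threshold that maximizes the between-class variance. The sketch assumes an 8-bit grayscale image given as a list of rows and dark ink on light paper (so pixels at or below the threshold are treated as foreground); both function names are hypothetical and the code is not the implementation used in the experiments.

```python
def otsu_threshold(gray, levels=256):
    """Return the threshold that maximizes the between-class variance."""
    flat = [p for row in gray for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1

    total = len(flat)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(levels):
        w_bg += hist[t]            # pixels with value <= t
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg        # pixels with value > t
        if w_fg == 0:
            break
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(gray, threshold):
    """Assumption: dark ink is foreground, so values <= threshold become 1."""
    return [[1 if p <= threshold else 0 for p in row] for row in gray]


# Hypothetical 2x3 image with three dark and three bright pixels.
sample = [[10, 10, 200], [12, 210, 205]]
t = otsu_threshold(sample)
binary = binarize(sample, t)
```

If the scans were light strokes on a dark background, the comparison in `binarize` would simply be reversed.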
After binarization, a resizing operation is applied to maintain a uniform size across all the character images: they are resized to a standard dimension so that only the area containing the handwritten English digit, called the region of interest (ROI), is kept. Resizing is followed by an edge detection stage that uses the Canny method to detect the edges of the handwritten digit through noise reduction and region growing along the edges of that digit. The last stage of preprocessing is thinning, which converts the image to a one-pixel-wide image. Thus, after applying all the preprocessing stages, we obtain preprocessed images of the different handwritten English digits in the form of skeleton images.

C. Features Extraction
Feature extraction is an important phase of any recognition model, and the accuracy of the recognition depends on the accuracy of the approaches used in this phase. Fig.3 shows the block diagram of the feature extraction phase of the proposed HENRM.
The purpose of the feature extraction phase of HENRM is to obtain a feature vector that combines four kinds of feature sets, which are detailed in the following subsections.

Finding the Intersection and Open-End Points: An intersection point is any pixel that has more than two neighboring pixels, whereas an open-end point is a pixel that has only one neighboring pixel. Fig.4 shows an example of digit 3, which has one intersection point and three open-end points.
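The neighbor-counting idea can be sketched in Python as follows. The sketch assumes a binary skeleton stored as a list of rows (1 for foreground), uses 8-connectivity, and takes an open-end point to be a foreground pixel with exactly one foreground neighbor; the function names are hypothetical. On real skeletons, pixels adjacent to a junction can also have more than two neighbors, so a practical implementation may need to merge neighboring candidates.

```python
def count_neighbors(skel, r, c):
    """Count foreground pixels among the 8 neighbors of (r, c)."""
    rows, cols = len(skel), len(skel[0])
    n = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols and skel[rr][cc]:
                n += 1
    return n


def find_structural_points(skel):
    """Return (intersection_points, open_end_points) as lists of (row, col)."""
    intersections, open_ends = [], []
    for r, row in enumerate(skel):
        for c, value in enumerate(row):
            if not value:
                continue
            n = count_neighbors(skel, r, c)
            if n == 1:
                open_ends.append((r, c))
            elif n > 2:
                intersections.append((r, c))
    return intersections, open_ends


# Hypothetical Y-shaped skeleton: one junction and three stroke ends.
y_shape = [
    [1, 0, 0, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
intersections, open_ends = find_structural_points(y_shape)
# intersections == [(2, 2)]; open_ends == [(0, 0), (0, 4), (4, 2)]
```

The two structural features are then simply `len(intersections)` and `len(open_ends)`.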
Calculating the Density Feature: A statistical feature extraction method called zoning is used to calculate the density feature. In this method, the resized skeleton image of the handwritten digit, of size 30*30, is divided into 9 equal zones, each of size 10*10 pixels, as shown in Fig.5(A). Fig.5(B) and (C) illustrate how the skeleton images of the handwritten English digits 5 and 4 are divided into 9 equal zones to obtain a 9-element density feature vector.
The density features are extracted by computing, for each of the nine zones, the ratio of the number of foreground pixels to the total number of pixels in that zone, according to (1).

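\[ D_i = \frac{\text{number of foreground pixels in zone } Z_i}{\text{total number of pixels in zone } Z_i}, \quad i = 1, \dots, 9 \qquad (1) \]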
For example, consider the density feature of zone Z1 of digit 5, shown in Fig.5(D). The number of foreground pixels is 6 and the total number of pixels is 100, so applying Eq. (1) yields a density feature D1 for Z1 equal to 0.06. In the same way, the remaining eight density features can be obtained and arranged in a 9-element vector as shown below.

Density Feature Vector = [D1, D2, …, D9]
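The zoning computation can be sketched in Python as below. The sketch assumes a square binary skeleton (1 for foreground) whose side is divisible by 3, and it numbers the zones row by row from the top-left corner, which is an assumption about the zone order; the function name `density_features` is hypothetical.

```python
def density_features(skel, zones_per_side=3):
    """Return one density value per zone: foreground pixels / zone pixels."""
    size = len(skel)
    zone = size // zones_per_side          # 10 for a 30*30 image
    features = []
    for zr in range(zones_per_side):       # zone rows, top to bottom
        for zc in range(zones_per_side):   # zone columns, left to right
            foreground = sum(
                skel[r][c]
                for r in range(zr * zone, (zr + 1) * zone)
                for c in range(zc * zone, (zc + 1) * zone)
            )
            features.append(foreground / (zone * zone))
    return features


# Hypothetical 30*30 skeleton with 6 foreground pixels in the first zone,
# mirroring the worked value D1 = 6 / 100 = 0.06.
image = [[0] * 30 for _ in range(30)]
for k in range(6):
    image[2][k] = 1
d = density_features(image)   # d[0] == 0.06, the other eight values are 0.0
```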
Determining the Chain Code: The chain code is used to track a shape formed by a sequence of connected pixels. The tracking process, based on the 8-connectivity property shown in Fig.6, gives a sequence of 8-directional numbers. A chain code based on 8-directional numbers is called a Freeman chain code.
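A Freeman chain code can be sketched in Python as a mapping from each step between consecutive pixels to a direction number. The numbering below (0 for east, increasing counter-clockwise) is the common textbook convention and is an assumption here; the numbering in Fig.6 may differ. The function name `chain_code` and the input path are hypothetical, and the path is assumed to be already ordered by the tracking process.

```python
# (row step, column step) -> direction number; rows grow downward.
FREEMAN_DIRECTIONS = {
    (0, 1): 0,     # east
    (-1, 1): 1,    # north-east
    (-1, 0): 2,    # north
    (-1, -1): 3,   # north-west
    (0, -1): 4,    # west
    (1, -1): 5,    # south-west
    (1, 0): 6,     # south
    (1, 1): 7,     # south-east
}


def chain_code(path):
    """Convert an ordered list of 8-connected (row, col) pixels to codes."""
    codes = []
    for (r0, c0), (r1, c1) in zip(path, path[1:]):
        step = (r1 - r0, c1 - c0)
        if step not in FREEMAN_DIRECTIONS:
            raise ValueError(f"pixels {(r0, c0)} and {(r1, c1)} are not 8-connected")
        codes.append(FREEMAN_DIRECTIONS[step])
    return codes


# Hypothetical stroke: one step east, one north-east, one north.
stroke = [(2, 0), (2, 1), (1, 2), (0, 2)]
codes = chain_code(stroke)   # [0, 1, 2]
```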

The proposed HENRM uses this chain code approach to obtain a set of features that is strongly specific to each of the ten handwritten English digits. The sequence of connected pixels that represents a digit in its skeleton image is tracked according to the clockwise 8-connectivity property, starting from one of the previously determined open-end points. Tracking continues until the last open-end point is reached. If the digit being processed has an intersection point, tracking then follows the other direction determined by that intersection point until it reaches the terminating open-end point. This process is repeated until the whole shape of the handwritten English digit has been tracked.
The chain codes of different handwritten English digits have different 8-direction numbers and different lengths. To obtain a chain code of fixed size, the chain code values must therefore be normalized. Assume that a chain code C_t is produced from digit 5 by traversing it in a clockwise direction. Computing the frequency of occurrence of each of the 8 direction numbers over the whole chain gives the frequency chain. The chain Cnf is then scaled up by multiplying each of its values by 10 to obtain more reasonable values for classification purposes, which gives the final scaled frequency chain vector, Csf.
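One possible reading of this normalization is sketched below in Python. It assumes that the frequency chain holds the count of each of the 8 direction numbers, that Cnf is obtained by dividing each count by the chain length, and that Csf is Cnf multiplied by 10; the division step and the function name `chain_frequency_features` are assumptions made for illustration only.

```python
def chain_frequency_features(codes, directions=8, scale=10):
    """Return (cf, cnf, csf) for a chain code with values 0..directions-1."""
    # cf: how many times each direction number occurs in the chain.
    cf = [0] * directions
    for d in codes:
        cf[d] += 1

    # cnf: assumed normalization by chain length, so any chain length
    # yields a fixed-size vector with comparable values.
    total = len(codes)
    cnf = [count / total for count in cf] if total else [0.0] * directions

    # csf: scaled frequencies, multiplied by 10 as described above.
    csf = [scale * value for value in cnf]
    return cf, cnf, csf


# Hypothetical short chain code (not taken from any digit sample).
cf, cnf, csf = chain_frequency_features([0, 0, 1, 2])
# cf  == [2, 1, 1, 0, 0, 0, 0, 0]
# csf == [5.0, 2.5, 2.5, 0.0, 0.0, 0.0, 0.0, 0.0]
```

Whatever the exact normalization, the result is a fixed number of values per digit, independent of how long the traced stroke is.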

D. Multi-class SVM Based Classifier
The Support Vector Machine (SVM) is a machine learning method for pattern classification. It was developed and introduced by Vapnik (Vapnik, 1998). The algorithm provides good performance in learning and classification, so it has become a hot topic for researchers in the machine learning field. An SVM classifies data into two categories by separating the data into two classes with a hyperplane.
Later, researchers studied how to extend the SVM classifier from two-class classification to multi-class classification using two popular approaches (Hsu and Lin, 2002). One is based on combining a number of two-class SVMs in a particular way to produce a multi-class classifier, while the other performs the multi-class classification directly in a single training and testing procedure, although this procedure is considered time-consuming.
Classification is the last phase of HENRM. In this step, the images are given exclusive labels depending on the features extracted in the previous phase of the proposed model. Fig.7 shows the classification phase based on the multi-class SVM. This phase consists of two stages, training and testing. The classification process has 10 classes, the handwritten English digits from 0 to 9. The total sample set contains 800 samples, with 80 samples per class; 80% of the samples are used for training and the remaining 20% for testing. The multi-class SVM classifier is trained with the training feature vectors in the training stage, and the results are obtained using the testing feature vectors in the testing stage.
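The first approach, combining two-class classifiers, is commonly realized as one-vs-one voting, which can be sketched in Python as follows. The sketch does not train SVMs: `nearest_centroid_decider` is a hypothetical stand-in for trained two-class SVMs, used only to make the voting logic runnable, and whether HENRM uses one-vs-one or another combination scheme is not specified here.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_predict(x, classes, pairwise_decide):
    """Let every pair of classes vote; return the class with most votes."""
    votes = Counter()
    for a, b in combinations(classes, 2):
        # pairwise_decide stands in for a two-class SVM trained on (a, b).
        votes[pairwise_decide(a, b, x)] += 1
    best = max(votes.values())
    # Assumption: ties are resolved toward the smallest class label.
    return min(c for c, v in votes.items() if v == best)


def nearest_centroid_decider(centroids):
    """Hypothetical stand-in for binary SVMs: pick the nearer 1-D centroid."""
    def decide(a, b, x):
        return a if abs(x - centroids[a]) <= abs(x - centroids[b]) else b
    return decide


# Toy 1-D problem with five classes whose centroids are 0.0 .. 4.0.
classes = [0, 1, 2, 3, 4]
decide = nearest_centroid_decider({c: float(c) for c in classes})
label = one_vs_one_predict(2.2, classes, decide)   # 2
```

For the ten digit classes this scheme needs 45 two-class classifiers, one per pair of digits.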

III. EXPERIMENTAL RESULTS
Experiments were carried out to evaluate the performance and applicability of the proposed model. The proposed HENRM was implemented using Matlab 2015b on a laptop with a Core i7 2 GHz CPU.

A. Data Set Description
The experiments were performed on captured images of the handwritten English digits 0 to 9, collected from 80 students using pencils and pens of different colors to cover as many variations of the digit pattern samples as possible.

B. Evaluation of Data Preprocessing
The preprocessing phase of HENRM, which contains six stages as shown in Fig.2, is carried out to obtain a skeleton image. The image of each handwritten English digit sample was scanned at 300 dpi to obtain reasonable image quality, and a window of 70*70 pixels was applied to obtain a uniform image set of the same size to be preprocessed by the proposed HENRM.
The input image is converted into grayscale, and then its contrast is enhanced by equalizing its brightness level. After that, Otsu's method is applied with an adaptive threshold to convert the image to binary. After binarization, a resizing operation is applied to obtain an ROI of 30*30 pixels, the Canny detector is used for edge detection, and a skeleton image of each digit sample is obtained by applying the thinning operation. The output of all these preprocessing stages for digits 5 and 3 is shown in Fig.8.

C. Evaluation of Feature Extraction
As mentioned in Section II, the feature extraction method used in HENRM combines the structural and statistical features of the image to obtain a feature vector of 27 elements. The structural features give 18 of the 27 elements, while the statistical features, which are the density features, occupy 9 elements. The structural features comprise the number of intersection points and the number of open-end points, which give two features as shown in Table 1, and the combined chain code, which is the frequency chain, Cf, concatenated with the scaled frequency chain code, Csf, and gives sixteen features as shown in Table 2. The plot of the pixels obtained from the tracking process for the ten handwritten English digit samples is illustrated in Fig. 9.
The nine density features were determined by dividing each skeleton image sample into 3*3 equal zones of 10*10 pixels each. The density of each of the nine zones was then computed by applying (1). Table 3 lists the nine density features obtained for one sample of each of the ten handwritten English digits.
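The composition of the 27-element vector (2 + 16 + 9) can be made concrete with a short Python sketch. The ordering of the parts inside the vector is an assumption, the function name `build_feature_vector` is hypothetical, and the input values below are placeholders rather than measured features.

```python
def build_feature_vector(n_intersections, n_open_ends, cf, csf, density):
    """Concatenate structural and statistical features into one vector."""
    if len(cf) != 8 or len(csf) != 8:
        raise ValueError("each chain-code part must have 8 elements")
    if len(density) != 9:
        raise ValueError("density part must have 9 elements")
    # Assumed order: point counts, Cf, Csf, then the nine zone densities.
    vector = [float(n_intersections), float(n_open_ends)]
    vector += [float(v) for v in cf]
    vector += [float(v) for v in csf]
    vector += [float(v) for v in density]
    return vector


# Placeholder values, for illustration only.
vec = build_feature_vector(
    1, 3,
    cf=[2, 1, 1, 0, 0, 0, 0, 0],
    csf=[5.0, 2.5, 2.5, 0.0, 0.0, 0.0, 0.0, 0.0],
    density=[0.06] + [0.0] * 8,
)
# len(vec) == 27
```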

D. Evaluation of Classification Phase
The multi-class SVM is used to evaluate the classification performance. The 800 feature vectors were divided into two sets, training and testing: 600 vectors are used to train the classifier, and the remaining 200 are used for testing. The 27-element feature vector is computed for all the samples in both the training and testing sets. The confusion matrix, with the recognition rate of each of the ten digits, is shown in Table 4; Eq. (2) is applied to calculate the recognition rate of each digit. The confusion matrix in Table 4 shows that the most misclassified digits are 3, 4, 7, and 9, because of their high similarity to other digits arising from the different handwriting skills of the students. Table 5 compares the recognition rates achieved by other researchers who used different feature extraction approaches and different classification algorithms. The proposed model achieves very efficient recognition performance with a smaller number of features, which plays an important role in avoiding redundancy in the data generated for classification.
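Assuming that the recognition rate of a digit is the percentage of its test samples that are classified correctly, the per-digit and average rates can be computed from a confusion matrix as in the Python sketch below. The rows are assumed to be true classes and the columns predicted classes; the function names are hypothetical and the 2-class matrix is a toy example, not data from Table 4.

```python
def recognition_rates(confusion):
    """Per-class recognition rate in percent (rows = true classes)."""
    rates = []
    for i, row in enumerate(confusion):
        total = sum(row)
        rates.append(100.0 * row[i] / total if total else 0.0)
    return rates


def overall_recognition_rate(confusion):
    """Percentage of all test samples that lie on the diagonal."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return 100.0 * correct / total if total else 0.0


# Toy 2-class confusion matrix with 20 test samples per class.
toy = [
    [19, 1],
    [0, 20],
]
per_class = recognition_rates(toy)          # [95.0, 100.0]
overall = overall_recognition_rate(toy)     # 97.5
```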

IV. CONCLUSION AND FUTURE WORK
This paper proposes an efficient feature extraction process that combines structural and statistical approaches as the core of the proposed handwritten English numerals recognition model. The resulting feature vector consists of 27 elements: the number of intersection points (1 element), the number of open-end points (1 element), the frequency chain code (8 elements), the scaled frequency chain code (8 elements), and the density features (9 elements). This feature vector is fed to a multi-class SVM classifier, which was trained and tested with 80 sets of the ten numerals. Experimental results show that the achieved recognition rate was 97%. As future work, the feature extraction phase may be improved by adding other approaches to resolve the misclassification of some digits. The proposed recognition model can also be extended to recognize letters and symbols.