Predicting Severity of Diabetic Retinopathy using Deep Learning Models

This paper presents deep learning models for classifying the grade of Diabetic Retinopathy (DR). The goal of this research is to find and build a deep learning model that assigns a fundus image, with high accuracy, to one of the five stages of DR: no DR, mild, moderate, severe, and proliferative DR. The work proceeds in four steps. First, the fundus images are pre-processed using Ben Graham's pre-processing method. Second, the pre-processed images are fed to the deep learning models for training. Third, deep learning models, namely a deep CNN, DenseNet, and Visual Geometry Group 19 (VGG19), are developed to predict the severity of DR. The APTOS Blindness Detection dataset is used to train the proposed models. Fourth, because the dataset is imbalanced and therefore introduces training bias, a class-weight technique is applied during training to counter this bias. The proposed deep learning models perform well on DR grading, and DenseNet is found to perform better than the other two models.


Introduction
In 2019, the global prevalence of diabetes was estimated at 9.3% (463 million individuals), and it is projected to rise to 10.2% (578 million) by 2030 and 10.9% (700 million) by 2045 [1]. DR is an eye-related condition that occurs in persons with prolonged (i.e. more than 20 years) diabetes, and it is becoming one of the major vision-related problems today. A large population is at one of the stages of DR, namely no DR, mild, moderate, severe, and proliferative DR. It is therefore important to adopt Computer Aided Diagnosis (CAD) methods for frequent pre-screening of DR [2]. The need for a robust, automated DR screening process has long been recognized, and previous attempts have made good progress using image detection, pattern recognition, and machine learning with eye photographs as input. The purpose of this work is to build models that can be used to carry out the DR screening process and that will hopefully lead to practical clinical use.
Manual screening requires qualified and trained experts, and these are scarce. A survey conducted between 2015 and 2019 to count ophthalmologists around the world found just 25 thousand ophthalmologists across 194 countries [3]. CAD-based approaches are therefore becoming widely used. Three major problems remain in the automatic grading of DR using CAD at the time of screening. The first is that most of the CAD programs currently available support the diagnosis of DR in only two grades, i.e. normal and abnormal. In fact, however, DR progresses through five stages [4]: no DR, mild, moderate, severe, and proliferative DR. The second is achieving high overall accuracy in multi-class classification, and the third is the imbalanced dataset.
An imbalanced dataset is one in which the classes are not uniformly represented, i.e. the number of samples belonging to each class differs. This meant that changes had to be made to the training of our networks to ensure that the features of the under-represented images could still be learned. As far as we are aware, very few articles address the five-class classification of DR using a CNN technique. In this paper we present deep learning models, namely DenseNet, a basic CNN, and VGG19. Figure 1 gives a pictorial overview of the proposed model. To predict the severity of DR, the model takes a fundus image as input and applies a trained model to it. The proposed deep learning models adopt the following steps:
1) Image pre-processing techniques are applied as the first step in designing and training the model.
2) Image resizing and image cropping are applied at the second stage of the model creation process.
3) The deep learning models are constructed; this is the third and most important step.
4) A class-weight approach is adopted to deal with the training bias caused by the imbalanced dataset.
5) Finally, the performance of the built models is assessed.
The paper is organized as follows. Section 2 reviews the related work done by researchers in this field and describes the various methods for detection and classification of DR. Section 3 gives detailed information on the dataset used to train and test the deep learning models. Section 4 describes the techniques adopted for pre-processing the fundus images and the construction of the deep learning models. Section 5 presents the results, and Section 6 concludes the paper.

Related Work
In one earlier study, multiple classifiers such as Support Vector Machine (SVM), Random Forest, Random Tree, and J48 were tested, and Random Forest, with an overall accuracy of 99.7 percent, was found to outperform all the other classifiers. In 2020, Gayathri S. et al. [6] suggested a lightweight CNN for the classification of DR from fundus images; the evaluated results show that the proposed feature extraction technique, together with the J48 classifier, performs better. An updated color autocorrelogram function (AutoCC) with low dimensionality was proposed by Raghav Venkatesan et al. [7] in 2012. Jaakko Sahlsten et al. [8] suggested deep learning fundus image processing for the classification of DR and macular edema in 2019, introducing a deep learning framework that recognizes referable DR.
Rubina Sarki et al. [9] proposed automatic detection of mild and multi-class DR using deep learning in 2020.
In 2019, Karthikeyan S. et al. [10] proposed a model for the detection of multi-class retinal diseases using artificial intelligence; the proposed model uses minimal data to train the CNN. Yung-Hui Li et al. [11] suggested a CAD framework for DR based on fundus photographs using a deep CNN in 2020. In 2012, Man Li et al. [12] proposed a common approach to the training bias caused by imbalanced datasets: samples in rare classes are weighted with a high cost, and cost-sensitive learning strategies are then applied to fix the class imbalance.
State-of-the-art methods for detecting and classifying DR in color fundus images using deep learning techniques were reviewed and evaluated in [13] and [14][15][16][17][18][19]. As seen from the literature, different machine learning techniques such as Random Forest and SVM have been applied to improve the performance of DR detection. However, there is still scope to explore more relevant features that can contribute to the identification of a DR image.

Dataset
The "Asia Pacific Tele-Ophthalmology Society (APTOS) Blindness Detection" dataset [15] is used to train and test the deep learning models. The dataset contains 3662 color fundus images, each graded by a clinician for DR on a scale from 0 to 4. The dataset is imbalanced, i.e. the distribution of samples across the classes is not uniform, which is a special case for classification problems. Fig. 2 [16] shows the distribution of the dataset per DR grade.
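The imbalance described above is usually countered by giving each class a weight that is inversely proportional to its frequency. The following is a minimal sketch, assuming the common "balanced" heuristic weight_c = N / (K * n_c), where N is the number of samples, K the number of classes, and n_c the number of samples in class c. The paper does not state which formula was used, and the label list below is a made-up toy example, not the APTOS distribution.

```python
from collections import Counter


def balanced_class_weights(labels, num_classes):
    """Return {class: weight} with weight_c = N / (K * n_c).

    Assumption: the 'balanced' heuristic; the paper does not give its formula.
    """
    counts = Counter(labels)
    total = len(labels)
    weights = {}
    for c in range(num_classes):
        if counts[c] == 0:
            raise ValueError(f"class {c} has no samples")
        weights[c] = total / (num_classes * counts[c])
    return weights


# Hypothetical toy labels for DR grades 0-4 (NOT the real APTOS counts).
toy_labels = [0] * 10 + [1] * 2 + [2] * 5 + [3] * 1 + [4] * 2
weights = balanced_class_weights(toy_labels, num_classes=5)
for grade, w in weights.items():
    print(f"grade {grade}: weight {w:.2f}")
```

With this heuristic the rarest grade receives the largest weight, so a mistake on a rare class costs more during training than a mistake on a frequent one.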

Fig.2. Distribution of the number of samples per DR grade

4. Methods
This section describes the methods used to build deep learning models that classify fundus images into the different grades of DR. This paper proposes three deep learning models: a Base Model (CNN only), a VGG19 Model, and a DenseNet Model.
Fundus images pre-processed using the Ben Graham approach are the input used to train these models. The dataset is a large collection of retina images acquired with fundus photography under a range of imaging conditions. Like any real-world dataset, the images contain noise. In addition, the photographs were obtained over an extended period of time from several clinics using a range of cameras, which introduces further variation. It is therefore important to apply pre-processing techniques to the fundus images.

Fundus image pre-processing
This subsection explains in detail the techniques used for image pre-processing.
1) Gaussian Blur/Smooth: A Gaussian filter is used to blur, or smooth, the input fundus images. It works like the average filter, except that a Gaussian kernel is used instead of a uniform one.
2) Ben Graham approach: The Ben Graham approach [17] is adopted for pre-processing the input fundus images. Graham both rescales the image and applies a circular crop to it. The steps of Graham's approach are as follows: (i) rescale the images so that they have the same radius (300 pixels or 500 pixels); (ii) subtract the local average color, so that the local average is mapped to 50 percent grey; (iii) clip the images to 90% of their size to eliminate boundary effects.
3) Image Cropping: Images in the dataset have a black region around the actual image of the eye. This black region contains no information and affects the model's output, so it is cropped out of the picture.
4) Resizing: The images in the dataset vary in size, so each image is rescaled to a radius of 500 pixels to obtain images of the same size.
Fig. 3(b) shows the pre-processed image after all image pre-processing techniques, such as Gaussian blurring and Graham's proposed methods, have been applied.
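The pre-processing steps listed above can be sketched with NumPy alone. This is an illustrative sketch under stated assumptions, not the authors' code: the function names, the brightness tolerance `tol`, the gain `alpha`, the blur width `sigma`, and the nearest-neighbour resize are all hypothetical choices made for the example, and a synthetic bright disc stands in for a real fundus photograph.

```python
import numpy as np


def crop_black_border(img, tol=7):
    """Crop to the bounding box of pixels whose mean intensity exceeds tol."""
    mask = img.mean(axis=2) > tol
    if not mask.any():                      # fully dark image: nothing to crop
        return img
    rows = np.where(mask.any(axis=1))[0]
    cols = np.where(mask.any(axis=0))[0]
    return img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]


def resize_nearest(img, size):
    """Nearest-neighbour resize to size x size (stand-in for real interpolation)."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]


def gaussian_blur(img, sigma):
    """Separable Gaussian blur along height and width, with edge padding."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()                  # weights sum to 1
    out = img.astype(np.float64)
    for axis in (0, 1):
        pad = [(0, 0)] * out.ndim
        pad[axis] = (radius, radius)
        padded = np.pad(out, pad, mode="edge")
        out = np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="valid"), axis, padded)
    return out


def graham_enhance(img, sigma, alpha=4.0):
    """Subtract the local average colour and map it to 50% grey (128)."""
    local_avg = gaussian_blur(img, sigma)
    out = alpha * (img.astype(np.float64) - local_avg) + 128.0
    return np.clip(out, 0, 255).astype(np.uint8)


def clip_to_circle(img, keep=0.9):
    """Keep a centred disc of `keep` times the half-size; grey out the rest."""
    h, w = img.shape[:2]
    yy, xx = np.ogrid[:h, :w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    radius = keep * min(h, w) / 2.0
    inside = (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2
    out = np.full_like(img, 128)
    out[inside] = img[inside]
    return out


# Synthetic stand-in for a fundus photo: a bright disc on a black frame.
rng = np.random.default_rng(0)
demo = np.zeros((60, 80, 3), dtype=np.uint8)
yy, xx = np.ogrid[:60, :80]
disc = (yy - 30) ** 2 + (xx - 40) ** 2 <= 20 ** 2
demo[disc] = rng.integers(60, 200, size=(int(disc.sum()), 3), dtype=np.uint8)

cropped = crop_black_border(demo)
resized = resize_nearest(cropped, 48)
processed = clip_to_circle(graham_enhance(resized, sigma=2.0), keep=0.9)
print(cropped.shape, processed.shape)
```

Subtracting a blurred copy removes slow variations in illumination and colour between cameras while keeping fine structures such as vessels, which is why the blur and the local-average subtraction appear together. In practice an image library would replace the hand-written blur and resize.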

Building deep learning models
This section presents the deep learning models: a simple basic CNN, VGG19, and DenseNet. The pretrained models are modified and trained with the class-weight technique to overcome the issue of training bias.
A) Simple Basic Deep CNN:- In deep learning, a CNN is a class of deep neural networks most commonly applied to analysing visual imagery. The constructed model is trained for 15 epochs with a batch size of 128.
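The class-weight technique mentioned at the start of this section can be illustrated with a weighted cross-entropy loss, in which each sample's loss is multiplied by the weight of its true class. This is a minimal sketch under assumptions: the paper does not give its loss function, the mean is taken over the number of samples (one of several possible normalisations), and the probabilities and weights below are made-up values.

```python
import math


def weighted_cross_entropy(probs, labels, class_weights):
    """Mean over samples of -w[y] * log(p[y]), with y the true class."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += -class_weights[y] * math.log(p[y])
    return total / len(labels)


# Hypothetical predictions for two samples over the five DR grades.
probs = [
    [0.70, 0.10, 0.10, 0.05, 0.05],   # true grade 0 (frequent class)
    [0.70, 0.10, 0.10, 0.05, 0.05],   # true grade 3 (rare class)
]
labels = [0, 3]

uniform = {c: 1.0 for c in range(5)}
weighted = {0: 0.4, 1: 2.0, 2: 0.8, 3: 4.0, 4: 2.0}   # illustrative weights

loss_uniform = weighted_cross_entropy(probs, labels, uniform)
loss_weighted = weighted_cross_entropy(probs, labels, weighted)
print(f"uniform: {loss_uniform:.4f}  weighted: {loss_weighted:.4f}")
```

Both samples receive the same prediction, but with class weights the misclassified rare-class sample dominates the loss, which pushes the optimiser to pay more attention to the under-represented grades.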

Fig.3. Preprocessing: (a) input fundus image; (b) pre-processed fundus image.

B) VGG19:- VGG19 is a CNN that is 19 layers deep. A version of the network pre-trained on more than a million images from the ImageNet database can be loaded. It was the runner-up in the classification task of the ImageNet Challenge in 2014 [18]. The network is distinguished by its simplicity, using only 3×3 convolution layers stacked on top of each other in increasing depth.
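The benefit of stacking small filters can be checked with a little arithmetic: at stride 1, each additional 3×3 convolution widens the receptive field by 2 pixels, so two stacked layers cover a 5×5 window and three cover 7×7, with fewer weights than a single large filter. The sketch below illustrates only this arithmetic and is not the VGG19 implementation; the channel count `C` is an arbitrary assumption.

```python
def stacked_receptive_field(num_layers, kernel=3):
    """Receptive field of num_layers stride-1 convolutions of size kernel."""
    return 1 + num_layers * (kernel - 1)


def conv_weights(kernel, channels):
    """Weights in one kernel x kernel conv with `channels` in and out (no bias)."""
    return kernel * kernel * channels * channels


C = 64  # hypothetical channel count, chosen only for illustration
for n in (1, 2, 3):
    rf = stacked_receptive_field(n)
    stacked = n * conv_weights(3, C)
    single = conv_weights(rf, C)
    print(f"{n} x 3x3 conv: receptive field {rf}x{rf}, "
          f"{stacked} weights vs {single} for one {rf}x{rf} conv")
```

The stacked layers also insert a non-linearity after every convolution, which a single large filter does not.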
C) DenseNet:- A DenseNet is a type of CNN that uses dense connections between layers, through Dense Blocks, in which all layers are connected directly with each other. To maintain the feed-forward nature, each layer obtains additional inputs from all preceding layers and passes on its own feature maps to all subsequent layers [19]. DenseNet is more efficient on some image classification benchmarks. Table 1 shows that DenseNet performs better than the basic deep CNN and VGG19. DenseNet is therefore suggested for developing a DR severity prediction system.
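The dense connectivity described for DenseNet can be sketched as a toy dense block in NumPy. This is an illustration of the connectivity pattern only, under stated assumptions: each layer is a random 1×1 convolution followed by ReLU standing in for the real batch-norm, ReLU, and convolution unit, and the names `dense_block`, `num_layers`, and `growth_rate` are chosen for the example rather than taken from the paper.

```python
import numpy as np


def dense_block(x, num_layers, growth_rate, rng):
    """Toy dense block on a (channels, H, W) tensor.

    Every layer sees the concatenation of the block input and all earlier
    layer outputs, and contributes growth_rate new feature maps.
    """
    features = [x]
    for _ in range(num_layers):
        concat = np.concatenate(features, axis=0)        # all previous maps
        w = rng.standard_normal((growth_rate, concat.shape[0]))
        new = np.maximum(0.0, np.einsum("oc,chw->ohw", w, concat))
        features.append(new)                             # reused by later layers
    return np.concatenate(features, axis=0)


rng = np.random.default_rng(0)
x = rng.standard_normal((16, 8, 8))                      # 16 input channels
out = dense_block(x, num_layers=4, growth_rate=12, rng=rng)
print(out.shape)   # channels grow to 16 + 4 * 12 = 64
```

Because earlier feature maps are concatenated rather than recomputed, each layer only needs to produce a small number of new maps, which is the source of the parameter efficiency usually attributed to this design.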