Drug Analyser Using Neural Networks with the Use of Transfer Learning Techniques

A ResNet architecture was used to build a user-friendly drug analyser web application. ResNet is a convolutional neural network that was applied here through transfer learning: the pre-trained network is trained further on a collection of images, each labelled with the individual drug it shows. The network uses ReLU (Rectified Linear Unit) and softmax activation functions, with categorical cross-entropy as the loss function. The Adam optimizer, a variant of stochastic gradient descent, was used to update the weights during each epoch. Finally, the trained model was integrated with a web application built on Flask, a Python web framework, and the application was deployed to a cloud platform such as Heroku.


Introduction
When a person is afflicted by a disease or an infection, he or she should seek the advice of a doctor; this is the most reliable way to be properly cured. However, many people self-medicate without understanding how the treatment works, which may result in harmful side effects. To address this problem, we propose a solution based on a convolutional neural network, a deep learning technique, which identifies a drug from its image and provides a full overview of it. This should reduce the risk of illness or side effects caused by self-medication or improper medication [1][2][3][4][5].

1.1. Convolutional Neural Networks (CNN)
Convolutional neural networks have a number of advantages over traditional image processing techniques in tasks such as image segmentation, image recognition, and object detection. Humans classify an object based on the colour and shape they perceive, whereas a computer classifies an object based on the image it reads in the form of numbers. Within a CNN, this numerical representation of an image is passed through multiple operations, which are used first to train the network and then to predict the class of a new image. Convolution, max pooling, flattening, and full connection are the operations performed within a simple convolutional neural network. Each of them is described below.

Convolution / conv2D Operation
Convolution is the first operation carried out within a convolutional neural network. It combines two arrays, the input image and a feature detector, to produce a third array that records where a feature occurs. The feature detector (also called a kernel or filter) is a small numerical array, much smaller than the input image, that responds to one particular pattern. It is moved across the input image, which is itself a numerical array, and at each position the overlapping values are multiplied element by element and summed. For binary images and detectors, this sum is simply the number of matching cells, and it is zero where nothing matches. The resulting array is the convolved image, also called the feature map; it is smaller than the input image while still preserving the essential features. The stride is the number of pixels the feature detector moves between positions; with a stride smaller than the detector, neighbouring positions overlap, and with a stride equal to its size they do not. [6][7][8][9][10].
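The sliding-window computation described above can be sketched in a few lines of NumPy. This is only an illustration under simple assumptions (a single-channel image, no padding, a hand-written 2x2 feature detector); the function name `conv2d` and the example arrays are made up for this sketch and are not taken from the application itself. As in most deep learning libraries, the detector is not flipped before it is applied.

```python
import numpy as np


def conv2d(image, kernel, stride=1):
    """Slide `kernel` over `image` (no padding) and return the feature map."""
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            window = image[r:r + kh, c:c + kw]
            # Multiply element by element, then sum the products.
            feature_map[i, j] = np.sum(window * kernel)
    return feature_map


# Hypothetical binary image and feature detector (illustrative values only).
image = np.array([
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 0],
])
kernel = np.array([
    [1, 0],
    [0, 1],
])

feature_map = conv2d(image, kernel)
print(feature_map)
# [[1. 1. 1.]
#  [2. 2. 1.]
#  [0. 2. 1.]]
```

Each entry counts how many cells of the detector match the image at that position; calling `conv2d(image, kernel, stride=2)` visits non-overlapping positions and yields a 2x2 feature map.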

Max pooling
Pooling is needed to down-sample the features detected in the feature maps. In max pooling, a window is moved across the convolved image with a chosen stride, and the maximum value inside each window is returned, so the size of the output depends on the window and stride sizes chosen. There are also other types of pooling, such as sum pooling and average pooling. Of these strategies, max pooling often achieves the best results in practice. The workings of the max pooling technique are depicted in the diagram.
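As a rough illustration of the pooling variants mentioned above, the following NumPy sketch applies max, average, and sum pooling to a small feature map. The function name `pool2d`, the 2x2 window, the stride of 2, and the example values are assumptions chosen for the illustration.

```python
import numpy as np


def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Down-sample `feature_map` by reducing each window to a single value."""
    reducers = {"max": np.max, "average": np.mean, "sum": np.sum}
    reduce_fn = reducers[mode]
    out_h = (feature_map.shape[0] - size) // stride + 1
    out_w = (feature_map.shape[1] - size) // stride + 1
    pooled = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            pooled[i, j] = reduce_fn(feature_map[r:r + size, c:c + size])
    return pooled


# Hypothetical 4x4 feature map (illustrative values only).
feature_map = np.array([
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 2],
    [2, 2, 3, 4],
])

print(pool2d(feature_map, mode="max"))      # [[4. 2.] [2. 5.]]
print(pool2d(feature_map, mode="average"))  # [[2.5 1.] [1.25 3.5]]
print(pool2d(feature_map, mode="sum"))      # [[10. 4.] [5. 14.]]
```

In every mode the 4x4 input is reduced to a 2x2 output; only the value kept from each window differs.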

Flattening operation
As the name suggests, flattening converts the max-pooled image into a one-dimensional array, which is then passed to the artificial neural network as its input.
The following is a representation of it.
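A minimal NumPy sketch of flattening, assuming a hypothetical 2x2 pooled feature map; the values are illustrative only.

```python
import numpy as np

# Hypothetical pooled feature map (illustrative values only).
pooled = np.array([
    [4, 2],
    [2, 5],
])

# Read the rows one after another to obtain a one-dimensional input vector.
flat = pooled.flatten()
print(flat)        # [4 2 2 5]
print(flat.shape)  # (4,)
```

When several feature maps are produced, they are flattened in the same way and joined into one long vector before being passed to the fully connected layers.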

Full connection
Full connection is the final operation within a convolutional neural network (CNN): the flattened image is connected to an artificial neural network (ANN). The ANN, defined by its number of hidden layers, its activation and loss functions, and its optimizer, is trained on this flattened input. Artificial neural networks are used to predict the outcome of regression and classification problems. Feed forwarding is the mechanism by which the input features are passed through each layer of the network in turn; in every layer they are weighted and then passed through that layer's activation function. Several activation functions are available, and the choice depends on the use case. The ANN's cost function then measures the error between the prediction and the true label. Backpropagation makes the network learn: it propagates this error backwards through the network to compute how much each weight contributed to it, and the gradient descent algorithm adjusts the weights accordingly to reduce the error (cost). This process is repeated until the cost stops decreasing. One forward and backward pass over a batch is called an iteration, and the term "epoch" refers to one full pass over the entire training set.
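The feed-forward, cost, backpropagation, and weight-update cycle can be sketched with a tiny fully connected network in NumPy. Everything in this sketch is an assumption made for illustration: the toy data, the layer sizes, the fixed starting weights, and the learning rate are not taken from the application. Because the whole toy dataset is processed in one batch, each loop iteration here is also one epoch.

```python
import numpy as np

# Toy data: 4 flattened inputs with 3 features each, 2 classes (one-hot labels).
X = np.array([[1.0, 0.0, 0.0],
              [0.9, 0.1, 0.0],
              [0.0, 0.1, 1.0],
              [0.0, 0.0, 0.9]])
Y = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])

# Fixed (hypothetical) starting weights: 3 inputs -> 4 hidden units -> 2 outputs.
W1 = np.array([[0.2, -0.1, 0.1, 0.3],
               [0.1, 0.1, -0.1, 0.0],
               [-0.1, 0.3, 0.2, 0.1]])
b1 = np.zeros(4)
W2 = np.array([[0.1, -0.1], [-0.1, 0.1], [0.1, 0.1], [0.0, 0.0]])
b2 = np.zeros(2)

learning_rate = 0.5
costs = []

for epoch in range(200):
    # Feed forward: weighted sum, then activation, layer by layer.
    hidden = np.maximum(0, X @ W1 + b1)                      # ReLU
    logits = hidden @ W2 + b2
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs = exp / exp.sum(axis=1, keepdims=True)             # softmax

    # Cost: categorical cross-entropy averaged over the batch.
    costs.append(-np.mean(np.sum(Y * np.log(probs + 1e-12), axis=1)))

    # Backpropagation: gradient of the cost with respect to each weight.
    d_logits = (probs - Y) / len(X)
    dW2 = hidden.T @ d_logits
    db2 = d_logits.sum(axis=0)
    d_hidden = (d_logits @ W2.T) * (hidden > 0)
    dW1 = X.T @ d_hidden
    db1 = d_hidden.sum(axis=0)

    # Gradient descent: move every weight against its gradient.
    W1 -= learning_rate * dW1
    b1 -= learning_rate * db1
    W2 -= learning_rate * dW2
    b2 -= learning_rate * db2

print(f"cost at first epoch: {costs[0]:.3f}, at last epoch: {costs[-1]:.3f}")
```

The printed cost falls from the first epoch to the last, which is the behaviour the training loop is meant to produce.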

Rectified Linear Unit (ReLU)
The rectified linear unit, also known as the ReLU activation function, is one of the most widely used activation functions in deep learning. It is typically applied in the hidden layers of a neural network, including the convolutional layers of a CNN. ReLU returns the maximum of zero and its input: if a data point is greater than 0, it is passed unchanged to the next layer of the network; otherwise zero is passed on. It works according to the following formula:
Y = max(0, X)    (1)
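Equation (1) translates directly into one line of NumPy. The function name `relu` and the sample inputs are illustrative.

```python
import numpy as np


def relu(x):
    """Equation (1): Y = max(0, X), applied element-wise."""
    return np.maximum(0, x)


activations = relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0]))
print(activations)  # [0.  0.  0.  1.5 3. ]
```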

Softmax activation function
Softmax activation functions are used to predict an output with more than two categories, which is known as multi-class classification. Softmax converts the raw scores of the final layer into probabilities that sum to one. It is commonly used in the output layer of a neural network; it is generally not used in the hidden layers, where saturating activations of this kind can contribute to vanishing gradient problems.
The softmax activation mechanism is depicted in the diagram below.
The categorical cross-entropy loss function is the standard loss function used to measure the error in a multi-class classification problem. It estimates the loss, which is then reduced with the help of an optimization algorithm. The measured loss is propagated back through the neural network, and the weights are changed to reduce the cost.
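The following NumPy sketch shows how softmax and categorical cross-entropy fit together for a single example with three classes. The function names, the raw scores, and the one-hot label are assumptions chosen for illustration.

```python
import numpy as np


def softmax(logits):
    """Turn raw scores into probabilities that sum to one."""
    # Subtracting the row maximum avoids overflow without changing the result.
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)


def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Average negative log-probability assigned to the true class."""
    return -np.mean(np.sum(y_true * np.log(y_pred + eps), axis=-1))


logits = np.array([[2.0, 1.0, 0.1]])   # hypothetical raw scores for 3 classes
y_true = np.array([[1.0, 0.0, 0.0]])   # one-hot label: the first class is correct

probs = softmax(logits)
loss = categorical_cross_entropy(y_true, probs)

print(probs.round(3))  # approximately [[0.659 0.242 0.099]]
print(round(loss, 3))  # approximately 0.417
```

The loss is small when the probability assigned to the correct class is close to one and grows as that probability shrinks.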

Gradient descent and stochastic gradient descent
The gradient descent algorithm is used to update the weights of the network, and the cost function is reduced as a result of each weight update. In its basic form, the loss function calculates a single cost over the whole batch of input data points, which is then decreased using gradient descent. However, this is expensive for large datasets and can leave the optimisation stuck in a poor local minimum; both issues can be mitigated with a technique known as "stochastic gradient descent". It measures the cost of each row (a single training example) and updates the weights after every row.
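A minimal sketch of stochastic gradient descent, assuming a one-weight linear model and noise-free toy data generated from y = 2x + 1. The data, learning rate, and number of epochs are illustrative choices, not values from the application.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy data generated from y = 2x + 1 (illustrative only).
X = np.linspace(-1.0, 1.0, 20)
y = 2.0 * X + 1.0

w, b = 0.0, 0.0
learning_rate = 0.1

for epoch in range(100):
    # Visit the rows in a new random order on every epoch.
    for i in rng.permutation(len(X)):
        error = (w * X[i] + b) - y[i]      # prediction error for this row
        # Gradient of the squared error 0.5 * error**2 for this single row.
        w -= learning_rate * error * X[i]
        b -= learning_rate * error

print(round(w, 3), round(b, 3))  # close to 2.0 and 1.0
```

The weights are updated once per row rather than once per pass over the data, which is what distinguishes this from batch gradient descent.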

Transfer learning techniques
Transfer learning is the process of taking a pre-trained deep learning model and customising it to our needs by adjusting its input and output layers. Rather than constructing a custom CNN model and training it from scratch, we can use transfer learning to obtain a model that generalises well and makes accurate predictions even with a limited amount of training data. The following are some of the most well-known pre-trained architectures used for transfer learning: ResNet, VGG16, MobileNet, and Inception V3. The weights of these pre-trained models are typically obtained by training on "ImageNet", a large dataset of labelled images. ImageNet was also the basis of an annual competition (the ImageNet Large Scale Visual Recognition Challenge), in which several of these architectures were first presented.
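One common way to apply transfer learning with ResNet is sketched below using the Keras API in TensorFlow. It is an illustration under stated assumptions, not the exact model used for the application: the function name `build_drug_classifier`, the 224x224 input size, the 128-unit hidden layer, and the choice of ResNet50 are all hypothetical. The pre-trained convolutional base is frozen, and only the new ReLU and softmax layers on top are trained.

```python
import tensorflow as tf


def build_drug_classifier(num_classes, weights="imagenet",
                          input_shape=(224, 224, 3)):
    """ResNet50 base (frozen) with a new classification head on top."""
    # include_top=False drops the original 1000-class ImageNet output layer.
    base = tf.keras.applications.ResNet50(
        weights=weights, include_top=False, input_shape=input_shape
    )
    base.trainable = False  # keep the pre-trained weights fixed

    inputs = tf.keras.Input(shape=input_shape)
    x = base(inputs, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dense(128, activation="relu")(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer="adam",
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Calling `build_drug_classifier(num_classes=10)` downloads the ImageNet weights on first use; passing `weights=None` builds the same architecture with random weights, which is useful for a quick check. Input images would normally be passed through `tf.keras.applications.resnet50.preprocess_input` before training or prediction.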

Conclusion
The final step was to convert the trained CNN into a serialized byte stream using Python's "pickle" library. The aim was to create an end-to-end web application that could be easily accessed by anyone. The front end was created using an HTML form, and the serialized CNN model was integrated with it using the Flask framework, which is available in Python. For every prediction, a summary of the drug is given, based on data obtained from a licensed and legally registered medical practitioner. Finally, the end-to-end web application can be deployed to cloud services such as AWS, Azure, Google Cloud Platform, and Heroku.
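The integration described above might look roughly like the following Flask sketch. The route name `/predict`, the form field `image`, the helper names `load_model` and `create_app`, and the `preprocess` callable are all hypothetical; the model is assumed to expose a Keras-style `predict` method that returns one row of class scores per image.

```python
import pickle

from flask import Flask, jsonify, request


def load_model(path):
    """Load a model that was previously saved with pickle.dump()."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def create_app(model, preprocess, drug_info):
    """Build the web application around a trained model.

    model      -- object with a Keras-style predict(batch) method (assumed)
    preprocess -- hypothetical callable: raw image bytes -> model input batch
    drug_info  -- list of (drug name, summary) pairs, indexed by class id
    """
    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])
    def predict():
        upload = request.files.get("image")
        if upload is None:
            return jsonify(error="no image uploaded"), 400
        # Scores for the single uploaded image, one value per drug class.
        scores = model.predict(preprocess(upload.read()))[0]
        class_id = max(range(len(scores)), key=lambda i: scores[i])
        name, summary = drug_info[class_id]
        return jsonify(drug=name, summary=summary)

    return app
```

An HTML form that posts a file field named `image` to `/predict` would receive the predicted drug name and its summary as JSON. Since unpickling can execute arbitrary code, `load_model` should only be used on files produced by the training pipeline itself.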