Multilingual Image caption Generator using Big data and Deep Learning

Authors

  • Naman Grover School of Computer Science and Engineering, Vellore Institute of Technology Chennai, Tamil Nadu, India Author
  • Anchita Singh School of Computer Science and Engineering, Vellore Institute of Technology Chennai, Tamil Nadu, India Author
  • Suganeshwari G Assistant Professor (Sr.), School of Computer Science and Engineering, Vellore Institute of Technology Chennai, Tamil Nadu, India Author

DOI:

https://doi.org/10.47392/irjash.2023.S047

Keywords:

LSTM, CNN, Big data, BLEU Score, Deep learning

Abstract

Automatic image captioning aims to produce a descriptive sentence about a picture. For this task, we build a model that, given an image as input, outputs an English sentence describing the image’s subject. Researchers in cognitive computing have paid considerable attention to this problem in recent years. The task is challenging because it requires merging ideas from two distinct but related disciplines: natural language processing and computer vision. We developed a caption-generation model that combines a Convolutional Neural Network (CNN) with a Long Short-Term Memory (LSTM) network. The CNN serves as the encoder, extracting features from images, while the LSTM serves as the decoder, generating the words that describe the image. A problem arises when the dataset is large: systems with only CPU support can take weeks to train the network. To reduce the training time, big data techniques can be taken into account. After the caption generation phase, we use BLEU scores to assess our model’s performance. With this approach, our system can help users find a fitting description for an uploaded photo in the desired language.
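
To make the evaluation step concrete, the following is a minimal sketch of a sentence-level BLEU score, assuming whitespace-tokenized captions, uniform n-gram weights up to 4-grams, and no smoothing. The function names (ngrams, modified_precision, bleu) are illustrative and are not taken from the paper, which does not state how its BLEU scores were computed.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision: each candidate n-gram is credited at most
    as many times as it occurs in any single reference."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    max_ref_counts = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    clipped = sum(min(count, max_ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU with uniform weights and no smoothing (assumption)."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        # Without smoothing, any zero precision drives the score to zero.
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty uses the reference length closest to the candidate length.
    c = len(candidate)
    r = min((len(ref) for ref in references),
            key=lambda length: (abs(length - c), length))
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(log_mean)


# Hypothetical captions, used only to show the call shape.
generated = "a dog runs across the green grass".split()
reference = "a dog is running across the grass".split()
score = bleu(generated, [reference], max_n=2)
```

Short captions often share no 4-grams with the reference, so an unsmoothed score can drop to zero; a smaller max_n or a smoothed variant may be a more informative choice in that case.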

Published

2023-05-28