Welcome to my GitHub repository! This project generates descriptive captions for images. A Convolutional Neural Network (CNN), specifically a ResNet-50, extracts salient features from each image, and these features are then used to generate a caption describing the image's content. The project offers an end-to-end pipeline for automatic image captioning: feature extraction, data preprocessing, model training, and evaluation.
## Requirements

- Python 3.x
- PyTorch
- Google Colab
- NLTK
- Matplotlib
- Pandas
## Getting Started

This repository features a Google Colab notebook named `Image_Caption_Generation_Using_Neural_Networks.ipynb`, which you can use to train and evaluate the model interactively.
- Clone the repository:

  ```shell
  git clone https://github.com/TanmayWINTR/ImageCaptionGeneration.git
  ```
- Open the Google Colab notebook: `Image_Caption_Generation_Using_Neural_Networks.ipynb`
## Features

- Encoder-Decoder Model: Utilises ResNet-50 for image encoding and a custom RNN for text decoding.
- Data Preprocessing: Offers text cleaning and vocabulary building.
- Batch Training: Uses mini-batch gradient descent for efficient training.
- Model Checkpointing: Allows saving and restoring of model states.
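The encoder-decoder pairing described above can be sketched as follows. Class names, layer sizes, and the choice of GRU cell are illustrative assumptions and may differ from the notebook's implementation:

```python
import torch
import torch.nn as nn

class DecoderRNN(nn.Module):
    """Generates caption logits word-by-word from an image feature vector."""
    def __init__(self, feature_size, embed_size, hidden_size, vocab_size):
        super().__init__()
        self.init_h = nn.Linear(feature_size, hidden_size)  # image -> initial hidden state
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.rnn = nn.GRU(embed_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)

    def forward(self, features, captions):
        # features: (batch, feature_size); captions: (batch, seq_len) token ids
        h0 = self.init_h(features).unsqueeze(0)   # (1, batch, hidden)
        embeddings = self.embed(captions)          # (batch, seq_len, embed)
        out, _ = self.rnn(embeddings, h0)          # (batch, seq_len, hidden)
        return self.fc(out)                        # (batch, seq_len, vocab)

decoder = DecoderRNN(feature_size=2048, embed_size=256, hidden_size=512, vocab_size=5000)
features = torch.randn(4, 2048)                    # e.g. ResNet-50 output
captions = torch.randint(0, 5000, (4, 12))
logits = decoder(features, captions)               # (4, 12, 5000)
```

At training time the logits are compared against the ground-truth caption tokens with a cross-entropy loss.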
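Vocabulary building typically lower-cases and cleans each caption, then assigns an integer id to every word above a frequency threshold. A minimal sketch (the notebook's actual cleaning rules and special tokens may differ):

```python
import re
from collections import Counter

def build_vocab(captions, min_freq=1):
    """Lower-case, strip punctuation, and map frequent words to ids."""
    counter = Counter()
    for caption in captions:
        counter.update(re.findall(r"[a-z]+", caption.lower()))
    # Reserved tokens for padding, caption start/end, and unknown words.
    vocab = {"<pad>": 0, "<start>": 1, "<end>": 2, "<unk>": 3}
    for word, freq in counter.most_common():
        if freq >= min_freq:
            vocab[word] = len(vocab)
    return vocab

vocab = build_vocab(["A dog runs.", "A dog sleeps."])
# 4 unique words plus 4 special tokens -> 8 entries
```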
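The mini-batch training and checkpointing pattern can be sketched with a toy model standing in for the captioning network; the save/restore calls are the standard PyTorch idiom, while the model and data here are placeholders:

```python
import torch
import torch.nn as nn

# Toy model standing in for the encoder-decoder; the training and
# checkpointing pattern is the same for the real network.
model = nn.Linear(10, 5)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One mini-batch gradient-descent step on random data.
inputs = torch.randn(8, 10)
targets = torch.randint(0, 5, (8,))
optimizer.zero_grad()
loss = criterion(model(inputs), targets)
loss.backward()
optimizer.step()

# Save a checkpoint (model + optimizer state), then restore it.
checkpoint = {"model": model.state_dict(), "optimizer": optimizer.state_dict()}
torch.save(checkpoint, "checkpoint.pt")
state = torch.load("checkpoint.pt")
model.load_state_dict(state["model"])
optimizer.load_state_dict(state["optimizer"])
```

Saving the optimizer state alongside the model lets training resume exactly where it left off.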
## Evaluation Metrics

- BLEU Score: Measures the similarity between the generated captions and the reference captions based on n-gram precision.
- Cosine Similarity: Evaluates the semantic similarity between the generated and reference captions.
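BLEU can be computed with NLTK, which the project already lists as a dependency. A minimal sketch; the tokenisation and smoothing choice here are assumptions:

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = [["a", "dog", "runs", "across", "the", "field"]]
candidate = ["a", "dog", "runs", "through", "the", "field"]

# Smoothing avoids zero scores when some higher-order n-grams never match.
smooth = SmoothingFunction().method1
score = sentence_bleu(reference, candidate, smoothing_function=smooth)
```

A perfect match scores 1.0, while the single-word substitution above is penalised across the 2-, 3-, and 4-gram precisions.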
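Cosine similarity can be computed over any vector representation of the captions; the bag-of-words version below is a simple illustration, and the notebook may use a different representation (e.g. learned embeddings):

```python
import math
from collections import Counter

def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between word-count vectors of two captions."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

sim = cosine_similarity("a dog runs in the park", "a dog plays in the park")
```

Identical captions score 1.0; captions sharing no words score 0.0.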
## Usage

- Navigate to the Google Colab notebook in the repository.
- Follow the instructions within the notebook to train and evaluate the model.
For more information, please refer to the Google Colab notebook `Image_Caption_Generation_Using_Neural_Networks.ipynb` included in this repository. Feel free to open issues or submit pull requests. Thank you for visiting!