Image Caption Generator

Generates a textual description of a given image by combining Natural Language Processing (NLP) and Computer Vision. The idea is to replace the encoder (an RNN layer) in an encoder-decoder architecture with a deep Convolutional Neural Network (CNN) trained to classify objects in images, so that the decoder generates the caption from the CNN's image features.
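In an architecture like this, the caption is usually produced one word at a time: the CNN encodes the image into a feature vector, and the decoder repeatedly predicts the next word from that vector and the words generated so far. The sketch below illustrates such a greedy decoding loop in plain Python. The `predict_next` callable, the `startseq`/`endseq` marker tokens, and `max_length` are assumptions made for illustration and are not names confirmed by this repository; in practice `predict_next` would wrap the trained CNN and RNN models.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical signature: (image_features, words_so_far) -> {word: probability}
NextWordFn = Callable[[Sequence[float], List[str]], Dict[str, float]]


def greedy_caption(
    image_features: Sequence[float],
    predict_next: NextWordFn,
    max_length: int = 20,
    start_token: str = "startseq",  # assumed marker tokens, not from the repo
    end_token: str = "endseq",
) -> str:
    """Build a caption by always taking the most probable next word."""
    words = [start_token]
    for _ in range(max_length):
        probabilities = predict_next(image_features, words)
        next_word = max(probabilities, key=probabilities.get)
        if next_word == end_token:
            break  # the decoder signalled the end of the caption
        words.append(next_word)
    return " ".join(words[1:])  # drop the start token


def toy_decoder(image_features: Sequence[float], words: List[str]) -> Dict[str, float]:
    """Stand-in for a trained model: emits a fixed sentence, then the end token."""
    sentence = ["a", "dog", "runs", "endseq"]
    target = sentence[min(len(words) - 1, len(sentence) - 1)]
    return {word: (0.9 if word == target else 0.1 / 3) for word in sentence}
```

Calling `greedy_caption([0.0] * 4, toy_decoder)` with the toy decoder returns `a dog runs`. Greedy selection is only one option; beam search is a common alternative that keeps several candidate captions at each step.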

Learn Concepts

Keras Implementation

https://towardsdatascience.com/image-captioning-with-keras-teaching-computers-to-describe-pictures-c88a46a311b8

Detailed Study and Explanation

https://data-flair.training/blogs/python-based-project-image-caption-generator-cnn/

ResNet Model

https://towardsdatascience.com/review-resnet-winner-of-ilsvrc-2015-image-classification-localization-detection-e39402bfa5d8

InceptionV3 Model

https://medium.com/@sh.tsang/review-inception-v3-1st-runner-up-image-classification-in-ilsvrc-2015-17915421f77c

TensorFlow-Keras Applications API

https://www.tensorflow.org/api_docs/python/tf/keras/

RNN Model Trained

[Image: language_model]

Try Locally

Make sure you have Python installed on your system.
This project uses pipenv to manage the virtual environment and packages.
You can install pipenv using pip by running:

$ python -m pip install pipenv

Now clone this repo and install all the dependencies:

$ git clone https://github.com/Smile040501/image_captioning.git

$ cd image_captioning/

$ pipenv install --dev

Once all the packages are installed, you can run either of the following commands:

# To generate predicted captions for all the dev dataset images
$ pipenv run python generate_caption.py generate

# To generate a prediction for a provided image
$ pipenv run python generate_caption.py image <path_to_image>

Alternatively, you can use the Jupyter Notebooks.
(Make sure to select the Python interpreter installed by pipenv.)

Dataset Resource


Node.js REST API

A Node.js REST API for the project. The frontend sends requests to it, and it predicts a caption for the image uploaded from the frontend.

Prerequisites

This project requires Node.js (version 8 or later) and npm.

To make sure you have them available on your machine, try running the following commands.

$ npm -v
7.20.3

$ node -v
v14.17.4

Installation

BEFORE YOU INSTALL: Please read the Prerequisites.

Start by cloning this repo to your local machine and installing the dependencies.

$ git clone https://github.com/Smile040501/image_captioning.git

$ cd image_captioning/

$ npm install

Useful Scripts for Local Development

In the project directory, you can run:

npm run start:dev

Runs the app in the development mode.
You can make requests to http://localhost:8000.
It bundles the app with webpack into the build folder and serves it using nodemon.
It re-bundles whenever you edit any of the development files.
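With the development server running, an image could be sent to the API from Python along the lines of the sketch below, which uses only the standard library to build a multipart upload. The `/caption` route, the `image` form field name, and the assumption that the server accepts `multipart/form-data` are all hypothetical; check the server's route definitions for the actual values.

```python
import urllib.request
import uuid

# Assumptions: the route and the form field name below are hypothetical.
API_URL = "http://localhost:8000/caption"  # hypothetical route
FIELD_NAME = "image"  # hypothetical form field name


def build_upload_request(
    image_bytes: bytes, filename: str, url: str = API_URL
) -> urllib.request.Request:
    """Build a multipart/form-data POST request carrying one image file."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{FIELD_NAME}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url,
        data=head + image_bytes + tail,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


def request_caption(image_path: str, url: str = API_URL) -> str:
    """Upload a local image and return the raw response body as text."""
    with open(image_path, "rb") as image_file:
        request = build_upload_request(image_file.read(), image_path, url)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling `request_caption("example.jpg")` would then return whatever the server responds with; the response format is not documented here, so the sketch returns it as raw text rather than parsing it.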

npm run start:server

Serves the app using nodemon.
Make sure the app has been built with webpack at least once and that the build folder exists.

npm run build:prod

Builds the app for production to the build folder.
It correctly bundles the app in production mode and optimizes the build for the best performance.

The build is minified.
Your app is ready to be deployed!

License

MIT

Author

Mayank Singla
