Image captioning using deep learning, trained and tested on the Flickr dataset.

akshaypunwatkar/Image-captioning-on-flickerdata

Image Captioning using COCO and Flickr Data

This project was implemented by:

  • Ashwini Marathe
  • Akshay Punwatkar
  • Anshupriya Srivastava
  • Srishti Saha

and was submitted as our final project for the course ECE 590 (Data Analysis at Scale in the Cloud) at Duke University.

Link to the application

Objective of the Application

The application takes an image file as input and uses a machine learning model to generate a caption for the image.

Data Source

The training and test data for the application and the demo are as follows:

  • COCO dataset: link

Model

The model was trained with TensorFlow, following the methodology of the notebook provided by Google Colab. The basic steps in training the model were:

  • Tokenize the vocabulary from the training captions
  • Implement a Bahdanau Attention based recurrent neural network
  • Use a CNN encoder & RNN decoder to train the model for caption prediction
  • Test the model on a test dataset
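The tokenization step above can be sketched in plain Python. This is a minimal illustration, not the project's code: the special tokens, the `top_k` cutoff, and the helper names `tokenize`, `build_vocab`, and `encode` are all assumptions made for this example, and the actual training pipeline may rely on TensorFlow's own text utilities instead.

```python
import re
from collections import Counter

# Special tokens (assumed names; the project may use different ones).
PAD, START, END, UNK = "<pad>", "<start>", "<end>", "<unk>"


def tokenize(caption):
    """Lowercase a caption, split it into words, and wrap it in start/end markers."""
    return [START] + re.findall(r"[a-z0-9']+", caption.lower()) + [END]


def build_vocab(captions, top_k=5000):
    """Map the top_k most frequent words to integer ids, after the special tokens."""
    counts = Counter(w for c in captions for w in tokenize(c)[1:-1])
    word_index = {PAD: 0, START: 1, END: 2, UNK: 3}
    for word, _ in counts.most_common(top_k):
        word_index[word] = len(word_index)
    return word_index


def encode(caption, word_index, max_len):
    """Convert a caption to ids, truncating or right-padding to max_len."""
    ids = [word_index.get(w, word_index[UNK]) for w in tokenize(caption)]
    ids = ids[:max_len]  # note: truncation may drop the end marker
    return ids + [word_index[PAD]] * (max_len - len(ids))


if __name__ == "__main__":
    training_captions = ["A dog runs.", "A dog jumps over a log."]
    vocab = build_vocab(training_captions)
    print(encode("A dog sleeps", vocab, max_len=6))
```

Words that never appeared in the training captions map to the unknown token, so the decoder always sees a fixed-size vocabulary.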

The model was adjusted and trained to fit our requirements for the app.
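To illustrate the Bahdanau attention step listed above, here is a small sketch of additive attention in plain Python. It is not the TensorFlow implementation used in this project. The weight matrices `W1`, `W2` and the vector `v` would be learned parameters in a real model, and the function names are hypothetical. At each decoding step, the score for image feature `f_i` given decoder state `h` is `v · tanh(W1 f_i + W2 h)`; a softmax over the scores gives the attention weights, and the context vector is the weighted sum of the features.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bahdanau_attention(features, hidden, W1, W2, v):
    """Additive attention: returns (context_vector, attention_weights).

    features: encoder outputs, one vector per image region
    hidden:   current decoder hidden state
    W1, W2:   projection matrices; v: scoring vector (learned in practice)
    """
    projected_hidden = matvec(W2, hidden)
    scores = []
    for feature in features:
        projected_feature = matvec(W1, feature)
        activated = [math.tanh(a + b)
                     for a, b in zip(projected_feature, projected_hidden)]
        scores.append(sum(vi * ai for vi, ai in zip(v, activated)))
    weights = softmax(scores)
    dim = len(features[0])
    context = [sum(w * f[d] for w, f in zip(weights, features))
               for d in range(dim)]
    return context, weights


if __name__ == "__main__":
    # Toy example: two image regions with 2-d features, 1-d hidden state.
    ctx, attn = bahdanau_attention(
        features=[[1.0, 0.0], [0.0, 1.0]],
        hidden=[0.0],
        W1=[[1.0, 0.0]],
        W2=[[0.0]],
        v=[1.0],
    )
    print(attn, ctx)
```

The decoder would combine the context vector with the previous word embedding before its recurrent step, so each predicted word can focus on a different part of the image.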

Sample Output

Below is an example of the caption generated by our model for the given input image: Sample Output

The application also provides a REST response for both local files and web URLs:

  1. E.g., for a web URL: curl -X GET -d filepath=https://media.stadiumtalk.com/51/78/5178471c78244562a6fa79e0e14d7a32.jpg 'http://35.243.242.165/predict_api'
  2. E.g., for a local file: curl -X GET -d filepath=sample/image_1.jpg 'http://35.243.242.165/predict_api'
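The same requests can be made from Python. The sketch below mirrors the curl calls above using only the standard library: it sends a GET request to `/predict_api` with `filepath` as form data in the request body. The helper names are hypothetical, and the response format is not documented here, so the body is returned as raw text. Unlike `curl -d`, `urlencode` percent-encodes the value, which assumes the server decodes form data in the usual way.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Host taken from the curl examples above; it may no longer be reachable.
BASE_URL = "http://35.243.242.165"


def build_predict_request(filepath, base_url=BASE_URL):
    """Build a GET request to /predict_api carrying filepath as form data."""
    body = urlencode({"filepath": filepath}).encode("utf-8")
    return Request(f"{base_url}/predict_api", data=body, method="GET")


def fetch_caption(filepath, base_url=BASE_URL, timeout=30):
    """Send the request and return the raw response body as text."""
    request = build_predict_request(filepath, base_url)
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # filepath can be a web URL or a path the server can read locally.
    print(fetch_caption("sample/image_1.jpg"))
```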

App Deployment

The application was deployed using Kubernetes on Google Cloud Platform.
The container image is available on Docker Hub: Link to Container Image

Steps to deploy the app on Google Kubernetes Engine can be found here

After deployment, the app could be accessed at: http://34.71.22.23:8088

Demo app homescreen

Load Testing

We used Locust on Google Kubernetes Engine to load test our app. For this, we followed the step-by-step tutorial given here.

Further details on the performance can be seen in the demo video linked below.

Demo Video

Link to the video: Here
