Indonesian Image Captioning

Built using PyTorch

Image caption generation for the Indonesian language

Paper

Coming soon

Baseline paper: Semantic Compositional Networks for Visual Captioning

Overview

Install prerequisites

Before installing the project, make sure the following prerequisites are met.

Examples

See how captions are generated.

Project Tree

File and directory structure of this project.

Install and Running the project

How to run and develop the project.

How it works

How the model is implemented and how it works.

Possible Future Development

Possible enhancements and development in the future.

Author and Credits

The person behind the project and the other people who contributed to it.

Prerequisites

What you need to install before running this project.

Library

Python 3.4 or later (programming language)
PyTorch (deep neural network framework)
Torchvision (ResNet152 architecture)
nlg-eval (evaluation metrics)

This project uses BLEU, ROUGE, METEOR, and CIDEr-D as evaluation metrics for English captions.

METEOR and CIDEr-D are not used for Indonesian captions because no implementation of either metric is available for the Indonesian language.

Pretrained Model

Download the pretrained models and scn_data from THIS LINK.

Then copy the pretrained and scn_data folders into the project root.

Examples

Project tree

.
├── pretrained # download and save pretrained models here
├── scn_data   # files generated by create_input_files.py
├── datasets   # dataset loaders for the train, eval, and test splits
│   ├── caption.py # caption dataset loader
│   └── tag.py # tag dataset loader
├── models # model implementations
│   ├── decoders # caption decoder implementations
│   │   ├── attention_scn.py
│   │   ├── pure_attention.py
│   │   └── pure_scn.py
│   ├── encoders
│   │   ├── caption.py
│   │   └── tagger.py
│   ├── attention.py
│   └── scn_cell.py
├── trains # training implementations
│   ├── attention_scn.py
│   ├── pure_attention.py
│   ├── pure_scn.py
│   └── tagger.py
├── utils
│   ├── checkpoint.py
│   ├── dataset.py
│   ├── device.py
│   ├── embedding.py
│   ├── loader.py
│   ├── metric.py
│   ├── optimizer.py
│   ├── tensor.py
│   ├── token.py
│   ├── url.py
│   └── vizualize.py
├── corpus_score.py # corpus scoring using perplexity and vocabulary count
├── create_input_files.py  # preprocess input files and split data
├── eval_caption.py # caption model evaluation script
├── eval_tagger.py # image tagger model evaluation script
├── inference.py # caption generator script
├── README.md  # this file
└── train.py # training script

Install and Running the project

git clone https://github.com/rayandrews/semantic-compositional-nets-attention.git
cd semantic-compositional-nets-attention

How It Works

The model combines two architectures: Semantic Compositional Networks (SCN) by zhegan27 and soft attention by kelvinxu.
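The soft attention component can be sketched in PyTorch as follows. This is a minimal illustration of additive soft attention in the style of Show, Attend and Tell, not the code in models/attention.py. The class name SoftAttention, the 2048-dimensional encoder features, and the 196 image regions are assumptions; only the attention and decoder sizes of 512 come from the parameter tables.

```python
import torch
import torch.nn as nn


class SoftAttention(nn.Module):
    """Additive soft attention over image regions (hypothetical sketch)."""

    def __init__(self, encoder_dim=2048, decoder_dim=512, attention_dim=512):
        super().__init__()
        self.encoder_att = nn.Linear(encoder_dim, attention_dim)
        self.decoder_att = nn.Linear(decoder_dim, attention_dim)
        self.full_att = nn.Linear(attention_dim, 1)

    def forward(self, encoder_out, decoder_hidden):
        # encoder_out: (batch, regions, encoder_dim)
        # decoder_hidden: (batch, decoder_dim)
        att1 = self.encoder_att(encoder_out)                  # (batch, regions, attention_dim)
        att2 = self.decoder_att(decoder_hidden).unsqueeze(1)  # (batch, 1, attention_dim)
        scores = self.full_att(torch.tanh(att1 + att2)).squeeze(2)  # (batch, regions)
        alpha = torch.softmax(scores, dim=1)                  # weights sum to 1 per image
        context = (encoder_out * alpha.unsqueeze(2)).sum(dim=1)     # (batch, encoder_dim)
        return context, alpha


attention = SoftAttention()
features = torch.randn(4, 196, 2048)  # assumed encoder output: 196 regions per image
hidden = torch.randn(4, 512)          # decoder hidden state
context, alpha = attention(features, hidden)
```

The returned context vector is a weighted average of the region features, so the decoder can focus on different parts of the image at each time step.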

Architecture

Params

Default Params

Parameters	Value
Semantic Concept	1000
Caption Per Image	5
Min Word Freq	5
Max Caption Length	50
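
To illustrate how the minimum word frequency and maximum caption length usually act during preprocessing, here is a minimal sketch. It is not the logic of create_input_files.py: the function names build_word_map and encode, the special tokens, the inclusive threshold, and the toy captions are all assumptions.

```python
from collections import Counter


def build_word_map(captions, min_word_freq=5, max_len=50):
    """Build a word -> index map from tokenized captions (illustrative only)."""
    # Skip captions that exceed the maximum length.
    kept = [c for c in captions if len(c) <= max_len]
    freq = Counter(word for caption in kept for word in caption)
    # Assumption: the threshold is inclusive (count >= min_word_freq).
    words = sorted(w for w, n in freq.items() if n >= min_word_freq)
    word_map = {w: i + 1 for i, w in enumerate(words)}  # index 0 is reserved for padding
    word_map["<pad>"] = 0
    for token in ("<unk>", "<start>", "<end>"):
        word_map[token] = len(word_map)
    return word_map


def encode(caption, word_map, max_len=50):
    """Encode one caption as a fixed-length list of indices."""
    ids = [word_map["<start>"]]
    ids += [word_map.get(w, word_map["<unk>"]) for w in caption]
    ids.append(word_map["<end>"])
    # Pad to max_len plus the two boundary tokens.
    return ids + [word_map["<pad>"]] * (max_len + 2 - len(ids))


# Toy data: "anjing" appears only once, so it falls below the threshold.
captions = [["seekor", "kucing", "duduk"]] * 5 + [["seekor", "anjing"]]
word_map = build_word_map(captions)
encoded = encode(["seekor", "anjing"], word_map)
```

In this toy run the rare word is replaced by the unknown token, and every encoded caption has the same length of 52 indices.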

Image Tagger

Parameters	Value
Epoch	10
Batch Size	32
Learning Rate	1e-4
Dropout	0.15
Optimizer	Adam
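
The tagger settings above could be wired up in PyTorch roughly as follows. This is a sketch under stated assumptions, not the code in trains/tagger.py: it assumes the tagger is a multi-label classifier over the 1000 semantic concepts, fed with 2048-dimensional pooled ResNet152 features, and trained with a binary cross-entropy loss. The names tagger_head, NUM_TAGS, and FEATURE_DIM are hypothetical.

```python
import torch
import torch.nn as nn

NUM_TAGS = 1000      # "Semantic Concept" in the default parameters
FEATURE_DIM = 2048   # assumption: pooled ResNet152 feature size

# Hypothetical tagger head; the real model lives in models/encoders/tagger.py.
tagger_head = nn.Sequential(
    nn.Dropout(p=0.15),
    nn.Linear(FEATURE_DIM, NUM_TAGS),
)
optimizer = torch.optim.Adam(tagger_head.parameters(), lr=1e-4)
criterion = nn.BCEWithLogitsLoss()  # assumption: multi-label objective

# One training step on a random batch of 32 feature vectors.
features = torch.randn(32, FEATURE_DIM)
targets = torch.randint(0, 2, (32, NUM_TAGS)).float()

tagger_head.train()
optimizer.zero_grad()
loss = criterion(tagger_head(features), targets)
loss.backward()
optimizer.step()

# At inference time, sigmoid turns logits into per-tag probabilities.
tagger_head.eval()
with torch.no_grad():
    tag_probs = torch.sigmoid(tagger_head(features))
```

The resulting per-tag probabilities are the kind of semantic vector that an SCN decoder consumes alongside the image features.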

Caption Model

Parameters	SCN	SCN + Attention
Epoch	12	12
Batch Size	32	32
Learning Rate	4e-4	4e-4
Dropout	0.5	0.5
Optimizer	Adam	Adam
Embedding	512	512
Attention	-	512
Factor	512	512
Decoder	512	512
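
The Factor row presumably refers to the size of the factored weights in SCN. In the SCN paper, each tag-dependent weight matrix is factorized as W(s) = W_a diag(W_b s) W_c, where s is the vector of tag probabilities. The sketch below shows that factorization for a single input transform; it is not the code in models/scn_cell.py, and the class name FactoredLinear and its argument names are assumptions.

```python
import torch
import torch.nn as nn


class FactoredLinear(nn.Module):
    """Tag-conditioned linear map W(s) x = W_a((W_b s) * (W_c x)) (sketch)."""

    def __init__(self, input_dim=512, tag_dim=1000, factor_dim=512, output_dim=512):
        super().__init__()
        self.w_a = nn.Linear(factor_dim, output_dim, bias=False)
        self.w_b = nn.Linear(tag_dim, factor_dim, bias=False)
        self.w_c = nn.Linear(input_dim, factor_dim, bias=False)

    def forward(self, x, tags):
        # x: (batch, input_dim); tags: (batch, tag_dim) tag probabilities
        # The element-wise product plays the role of diag(W_b s).
        return self.w_a(self.w_b(tags) * self.w_c(x))


layer = FactoredLinear()
word_embedding = torch.randn(32, 512)  # embedding size from the table
tag_probs = torch.rand(32, 1000)       # 1000 semantic concepts
out = layer(word_embedding, tag_probs)
```

Factoring the weights this way avoids storing a separate full weight matrix for each of the 1000 tags, while still letting the tags modulate the transform.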

Possible Future Development

Replace soft attention with Transformer attention (Attention Is All You Need)
Change the baseline model
Preprocess and re-evaluate the Indonesian dataset

Author

Ray Andrew - GitHub - LinkedIn - Email

Credits

Supervisors

Others

zhegan27, for the base implementation of the SCN paper
kelvinxu, for the base implementation of attention networks (Show, Attend and Tell)
sgrvinod, whose Show, Attend and Tell implementation is the base of this project


Repository: rayandrew/indonesian-image-captioning
