Skip to content

Latest commit

 

History

History
15 lines (13 loc) · 1.26 KB

README.md

File metadata and controls

15 lines (13 loc) · 1.26 KB

Open In Google Colab      Open In GitHub

deepLearning2022FallFinalProject

Abstract

Motivated by the tendency and inspiration of NLP and object detection techniques, we implemented an atten-tion-based model that can describe the image content. To achieve this, we applied CNN as the encoder and RNN with self-attention as the decoder. We also visualize how our model fixes its gaze on salient objects when creating the caption corresponding to the image. We comparative-ly validated the use of attention with state-of-the-art per-formance on benchmark datasets: MS COCO. Finally, we successfully implemented the improvements in image captioning and minimized the test loss of the training model to 11.24%.

Codebase

check source code in the file

dLFinalProjCodeBase.ipynb

Example

截屏2023-05-04 12 42 41

##Licence Distributed under the MIT License.