A Decomposable Attention Model for Natural Language Inference

[email protected]

reference

A Decomposable Attention Model for Natural Language Inference, P Parikh

data reference

snli - Stanford NLP

1. A Decomposable Attention Model

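This model classifies the relationship between two sentences.
Attention has commonly been used for word-level learning.
The strength of this model is that it improves performance by applying attention between the words of the two sentences.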

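The attention between the two sentences is first computed by multiplying the hidden vectors of the two sentences. The product has the form [batch, words1, state] * [batch, state, words2] => [batch, words1, words2],
which gives, for each word in sentence 1, the attention weights over the words of sentence 2. (Conversely, the attention weights over the words of sentence 1 can also be obtained for each word in sentence 2.)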
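The shape arithmetic above can be sketched in NumPy. This is a minimal illustration, not the repository's code: the function names `softmax` and `soft_align` are hypothetical, the softmax normalization follows the paper rather than anything stated in this README, and padding masks are omitted.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def soft_align(h1, h2):
    # h1: [batch, words1, state], h2: [batch, words2, state]
    # [batch, words1, state] * [batch, state, words2] => [batch, words1, words2]
    scores = np.matmul(h1, np.transpose(h2, (0, 2, 1)))
    # For each word of sentence 1: weights over the words of sentence 2.
    attn_1to2 = softmax(scores, axis=2)
    # For each word of sentence 2: weights over the words of sentence 1.
    attn_2to1 = softmax(scores, axis=1)
    # Attention-weighted summaries of the other sentence.
    beta = np.matmul(attn_1to2, h2)                             # [batch, words1, state]
    alpha = np.matmul(np.transpose(attn_2to1, (0, 2, 1)), h1)   # [batch, words2, state]
    return scores, attn_1to2, attn_2to1, beta, alpha

# Usage with random hidden vectors (assumed sizes, for illustration only).
rng = np.random.default_rng(0)
h1 = rng.normal(size=(2, 3, 4))   # batch=2, words1=3, state=4
h2 = rng.normal(size=(2, 5, 4))   # batch=2, words2=5, state=4
scores, attn_1to2, attn_2to1, beta, alpha = soft_align(h1, h2)
print(scores.shape, beta.shape, alpha.shape)
```

A single score matrix serves both directions: normalizing over the last axis gives sentence 1's view of sentence 2, and normalizing over the middle axis gives the reverse.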

Next, the attention information and the hidden vectors are fed into an FC layer (the function G) to obtain v1 and v2.
Finally, v1 and v2 are combined to obtain the final context vector.
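A minimal NumPy sketch of this compare-and-aggregate step follows. Several details are assumptions, since the README does not specify them: G is taken to be a single ReLU layer shared by both sentences, each word's hidden vector is concatenated with its attention summary before G, per-word outputs are summed over words, and v1 and v2 are combined by concatenation as in the paper. The names `compare_and_aggregate`, `w_g`, and `b_g` are hypothetical.

```python
import numpy as np

def compare_and_aggregate(h1, beta, h2, alpha, w_g, b_g):
    # h1, beta: [batch, words1, state]; h2, alpha: [batch, words2, state]
    # w_g: [2 * state, out], b_g: [out]  (assumed single-layer G with ReLU)
    g1 = np.maximum(np.concatenate([h1, beta], axis=-1) @ w_g + b_g, 0.0)
    g2 = np.maximum(np.concatenate([h2, alpha], axis=-1) @ w_g + b_g, 0.0)
    # Sum over the word axis so the result does not depend on sentence length.
    v1 = g1.sum(axis=1)   # [batch, out]
    v2 = g2.sum(axis=1)   # [batch, out]
    # Assumption: "combined" is implemented as concatenation.
    return np.concatenate([v1, v2], axis=-1)   # [batch, 2 * out]

# Usage with random inputs (assumed sizes, for illustration only).
rng = np.random.default_rng(0)
state, out = 4, 6
h1, beta = rng.normal(size=(2, 3, state)), rng.normal(size=(2, 3, state))
h2, alpha = rng.normal(size=(2, 5, state)), rng.normal(size=(2, 5, state))
w_g = rng.normal(scale=0.1, size=(2 * state, out))
b_g = np.zeros(out)
context = compare_and_aggregate(h1, beta, h2, alpha, w_g, b_g)
print(context.shape)
```

The resulting context vector would then go to a classifier over the SNLI labels, which is not shown here.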

2. Test results

2.1 Model variants

Vanilla : The basic model as described in the paper. The flow is word embedding -> FC layer -> attention.

(snli_dec_vanilla_model.py, snli_dec_vanilla_trainer.py)
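As a rough sketch of the Vanilla encoder path (word embedding -> FC layer), the snippet below produces the hidden vectors that the attention step consumes. It is not taken from `snli_dec_vanilla_model.py`: the function name `vanilla_encode`, the ReLU activation, and all sizes are assumptions for illustration.

```python
import numpy as np

def vanilla_encode(token_ids, embedding, w_f, b_f):
    # token_ids: [batch, words] integer ids
    # embedding: [vocab, emb], w_f: [emb, state], b_f: [state]
    x = embedding[token_ids]                 # [batch, words, emb]
    # Each word is projected independently, so no word-order information is used.
    return np.maximum(x @ w_f + b_f, 0.0)    # [batch, words, state]

# Usage with a toy vocabulary (assumed sizes, for illustration only).
rng = np.random.default_rng(0)
vocab, emb, state = 10, 8, 4
embedding = rng.normal(size=(vocab, emb))
w_f = rng.normal(scale=0.1, size=(emb, state))
b_f = np.zeros(state)
token_ids = np.array([[1, 2, 3], [4, 5, 0]])
hidden = vanilla_encode(token_ids, embedding, w_f, b_f)
print(hidden.shape)
```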

Bi-RNN : A variant I tried on my own to get slightly better performance than the version above. The flow is word embedding -> Bi-RNN(GRU) -> attention.

(snli_dec_rnn_model.py, snli_dec_rnn_trainer.py)
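To illustrate how the Bi-RNN encoder differs from the FC layer, here is a minimal bidirectional GRU in NumPy. It is a sketch under assumptions, not the code in `snli_dec_rnn_model.py`: the gate layout, the update convention `h = (1 - z) * n + z * h`, the zero initial state, and the names `init_gru`, `gru_run`, and `bi_gru_encode` are all hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_gru(rng, emb, state):
    # Gates are packed along the last axis as [update z | reset r | candidate n].
    w = rng.normal(scale=0.1, size=(emb, 3 * state))
    u = rng.normal(scale=0.1, size=(state, 3 * state))
    b = np.zeros(3 * state)
    return w, u, b

def gru_run(x, params, reverse=False):
    # x: [batch, words, emb] -> hidden states [batch, words, state]
    w, u, b = params
    batch, words, _ = x.shape
    state = u.shape[0]
    h = np.zeros((batch, state))
    outputs = [None] * words
    steps = range(words - 1, -1, -1) if reverse else range(words)
    for t in steps:
        xw = x[:, t, :] @ w + b
        hu = h @ u[:, :2 * state]
        z = sigmoid(xw[:, :state] + hu[:, :state])
        r = sigmoid(xw[:, state:2 * state] + hu[:, state:])
        n = np.tanh(xw[:, 2 * state:] + (r * h) @ u[:, 2 * state:])
        h = (1.0 - z) * n + z * h
        outputs[t] = h
    return np.stack(outputs, axis=1)

def bi_gru_encode(x, fwd_params, bwd_params):
    # Concatenate left-to-right and right-to-left states per word.
    fwd = gru_run(x, fwd_params)
    bwd = gru_run(x, bwd_params, reverse=True)
    return np.concatenate([fwd, bwd], axis=-1)   # [batch, words, 2 * state]

# Usage with random embeddings (assumed sizes, for illustration only).
rng = np.random.default_rng(0)
emb, state = 8, 4
fwd_params = init_gru(rng, emb, state)
bwd_params = init_gru(rng, emb, state)
x = rng.normal(size=(2, 4, emb))
hidden = bi_gru_encode(x, fwd_params, bwd_params)
print(hidden.shape)
```

Unlike the per-word FC projection, each hidden vector here depends on the surrounding words, which is one plausible reason for the gap seen in the later epochs of the table below.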

epoch	Vanilla test acc	Bi-RNN test acc
1	0.6806	0.6905
2	0.7114	0.7132
3	0.7305	0.7308
4	0.7442	0.7428
5	0.7485	0.7549
6	0.7449	0.7647
7	0.7566	0.7697
8	0.7556	0.7726
9	0.7631	0.7785
10	0.7685	0.7808


