About

Discourse relation recognition (DRR) with an encoder/decoder model

CoNLL

About the task
About the dataset
Tutorial on the data
2016 task results review

PDTB

Dataset breakdown

The Pitler et al 2009 breakdown:

Set	WSJ sections	Temporal	Contingency	Comparison	Expansion	EntRel
Training	2-20
Development	0-1, optionally can use 23-24
Test	21-22

Followed by, for example, Zhang et al., 2015; Chen et al., 2016; Ji and Eisenstein, 2015.

The CoNLL breakdown, recommended by the original PDTB 2.0 corpus:

Set	WSJ sections
Training	2-21
Development	22
Test	23

Followed by CoNLL, Wang and Lan, 2016
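
The two breakdowns can be written as a lookup from WSJ section number to split. This is a minimal sketch, assuming sections are given as integers; the names `PITLER_SPLIT`, `CONLL_SPLIT`, and `split_for_section` are hypothetical and not part of this repository. Sections outside a breakdown return `None`, and the optional Pitler development sections are passed explicitly.

```python
# Section ranges taken from the two tables above (range end is exclusive).
PITLER_SPLIT = {
    "train": range(2, 21),   # sections 2-20
    "dev": range(0, 2),      # sections 0-1
    "test": range(21, 23),   # sections 21-22
}
CONLL_SPLIT = {
    "train": range(2, 22),   # sections 2-21
    "dev": range(22, 23),    # section 22
    "test": range(23, 24),   # section 23
}


def split_for_section(section, breakdown, extra_dev=()):
    """Return 'train', 'dev', 'test', or None for a WSJ section number.

    extra_dev lists sections to add to the development set, e.g. (23, 24)
    for the optional part of the Pitler breakdown.
    """
    for name, sections in breakdown.items():
        if section in sections:
            return name
    if section in extra_dev:
        return "dev"
    return None
```

For example, section 21 is training data under the CoNLL breakdown but test data under the Pitler breakdown, so results from the two setups are not directly comparable.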

Types

According to the official PDTB summary:

PDTB Relations	No. of tokens
Explicit	18459
Implicit	16224
AltLex	624
EntRel	5210
NoRel	254
Total	40600

Relations

The CoNLL version classifies 16 lower-level senses and includes EntRel.

Top-level breakdown:

Top Level	Explicit (18459)	Implicit (16224)	AltLex (624)	Total
TEMPORAL	3612	950	88	4650
CONTINGENCY	3581	4185	276	8042
COMPARISON	5516	2832	46	8394
EXPANSION	6424	8861	221	15506
Total	19133	16828	634	36592
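
The implicit column of the top-level table is skewed toward Expansion, which is worth keeping in mind when reading per-class scores. A small sketch that computes each class's share of the implicit sense annotations from the counts in the table; the names `IMPLICIT_COUNTS` and `class_shares` are illustrative only.

```python
# Implicit counts per top-level class, copied from the table above.
IMPLICIT_COUNTS = {
    "TEMPORAL": 950,
    "CONTINGENCY": 4185,
    "COMPARISON": 2832,
    "EXPANSION": 8861,
}


def class_shares(counts):
    """Return each label's fraction of the summed counts."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


shares = class_shares(IMPLICIT_COUNTS)
# Print classes from most to least frequent.
for label, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{label:<12} {share:.1%}")
```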

1st level, one-v-all

For first-level (top-level) classification, Chen et al., 2016 experiment with one-v-all classifiers, with negative samples drawn from sections 2-20. They use the Pitler breakdown and merge EntRel into Expansion.
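
One way to build a one-v-all training set is to treat every instance of the target class as a positive and draw negatives at random from the remaining training instances. The sketch below assumes one negative per positive by default; that ratio, the `(text, label)` tuple layout, and the function name `one_vs_all` are assumptions for illustration, not details taken from Chen et al., 2016. EntRel is relabeled as Expansion first, matching the merge described above.

```python
import random


def one_vs_all(examples, target, neg_ratio=1.0, seed=0):
    """Build a binary dataset for one target class.

    examples: list of (text, label) pairs from the training sections.
    target: the class treated as positive (label 1); all others are 0.
    neg_ratio: negatives sampled per positive (assumed, not from the paper).
    """
    rng = random.Random(seed)
    # Merge EntRel into Expansion before splitting into positives/negatives.
    merged = [(x, "Expansion" if y == "EntRel" else y) for x, y in examples]
    positives = [(x, 1) for x, y in merged if y == target]
    negative_pool = [(x, 0) for x, y in merged if y != target]
    # Never ask for more negatives than the pool contains.
    k = min(len(negative_pool), int(len(positives) * neg_ratio))
    negatives = rng.sample(negative_pool, k)
    data = positives + negatives
    rng.shuffle(data)
    return data
```

Calling `one_vs_all(train_examples, "Temporal")` would give a balanced Temporal-vs-rest set; repeating the call for each of the four classes yields the four binary tasks.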

GRN

Gated Relevance Network. Summary:

BiLSTM + GRN + Pooling + MLP
Embedding: 50D, by Turian et al. (2010) (not available online)
Embeddings are fixed during training
Only the 10k most frequent words are used
All texts are set to a fixed length of 50 words
Parameters are initialized in [-0.1, 0.1]
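
The preprocessing choices in the summary above (10k-word vocabulary, fixed length of 50, initialization in [-0.1, 0.1]) can be sketched in plain Python. Several details here are assumptions rather than facts from the paper or this repository: the `<pad>` and `<unk>` tokens, right-padding and truncation as the way to reach 50 words, sampling from a uniform distribution, and the function names.

```python
from collections import Counter
import random

PAD, UNK = "<pad>", "<unk>"  # hypothetical special tokens


def build_vocab(tokenized_texts, max_words=10_000):
    """Map the max_words most frequent tokens to integer ids."""
    counts = Counter(tok for text in tokenized_texts for tok in text)
    words = [w for w, _ in counts.most_common(max_words)]
    return {w: i for i, w in enumerate([PAD, UNK] + words)}


def encode(tokens, vocab, length=50):
    """Truncate or right-pad a token list to exactly `length` ids."""
    ids = [vocab.get(t, vocab[UNK]) for t in tokens[:length]]
    return ids + [vocab[PAD]] * (length - len(ids))


def init_uniform(rows, cols, low=-0.1, high=0.1, seed=0):
    """Return a rows x cols matrix sampled uniformly from [low, high]."""
    rng = random.Random(seed)
    return [[rng.uniform(low, high) for _ in range(cols)] for _ in range(rows)]
```

Words outside the vocabulary map to the `<unk>` id, so the fixed 50D embedding table only needs one row per vocabulary entry.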

Results:

PDTB, top-level, Implicit, EntRel as Expansion, F1 score

Type	Author	Comparison	Contingency	Expansion	Temporal
	Pitler et al., 2009	21.96%	47.13%	76.42%	16.76%
	Zhou et al., 2010	31.79%	47.16%	70.11%	20.30%
	Park and Cardie, 2012	31.32%	49.82%	79.22%	26.57%
	Rutherford and Xue, 2014	39.70%	54.42%	80.44%	28.69%
	Ji and Eisenstein, 2015	35.93%	52.78%	80.02%	27.63%
LSTM	Chen et al., 2016	31.78%	45.39%	75.10%	19.65%
Bi-LSTM + GRN	Chen et al., 2016	40.17%	54.76%	80.62%	31.32%

PDTB, top-level, Implicit, no EntRel, F1 score

Type	Author	Comparison	Contingency	Expansion	Temporal
Shallow CNN	Zhang et al., 2015	33.22%	52.04%	69.59%	30.54%

CoNLL English dataset (PDTB), low-level, Implicit F1 score

ID	Blind	Test	Dev
aarjay	9.95	15.6	36.85
BIT	19.3	16.5	17.36
clac	27.7	28.1	37.12
ecnucs	34.1	40.9	46.42
goethe	31.8	37.6	45.42
gtnlp	36.7	34.9	40.72
gw0	33.0	30.2	34.58
gw0	21.2	18.5	35.11
nguyenlab	31.4	28.8	34.31
oslopots	33.8	33.7	43.12
PurdueNLP	29.1	34.4	38.05
steven	23.5	20.5	26.68
tao0920	35.3	38.2	46.33
tbmihaylov	34.5	39.1	40.32
ykido	32.3	22.6	29.11
ttr	37.6	36.1	40.32
