https://wiki.cimec.unitn.it/tiki-index.php?page=CLIC
https://zenodo.org/record/2787612#.Xpc-Sy85T1A
The idea is to run sentences A and B from the SICK.txt corpus
through FreeLing and use its tokenization to feed SyntaxNet, which
produces the UD dependencies. We generate a CoNLL file with the SyntaxNet
output, augmented with the lemmas and senses obtained from FreeLing.
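As a rough illustration of the first step, the sketch below pulls the two
sentences of each pair out of a tab-separated SICK file. It is not the code
used in this repository. The column names pair_ID, sentence_A, and sentence_B
are assumed from the SICK.txt header, and the sample row is made up for the
example.

```python
import csv
import io


def read_sick_sentences(handle):
    """Yield (pair_ID, sentence_A, sentence_B) from a tab-separated SICK file.

    Assumption: the header row names the columns pair_ID, sentence_A,
    and sentence_B.
    """
    # QUOTE_NONE keeps any quote characters in the sentences as literal text.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        yield row["pair_ID"], row["sentence_A"], row["sentence_B"]


# Made-up sample standing in for SICK.txt, so the sketch runs on its own.
sample = (
    "pair_ID\tsentence_A\tsentence_B\n"
    "1\tA dog is running\tA dog runs\n"
)
pairs = list(read_sick_sentences(io.StringIO(sample)))
```

With the real corpus, the same function could be given a file opened with
open("SICK.txt", newline="", encoding="utf-8") instead of the in-memory sample.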
We needed to disable FreeLing's MWE module.
We also add the original POS tag from FreeLing to the MISC field so that we
can compare it with SyntaxNet's. The FreeLing POS tag is relevant for
understanding which sense was selected.
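The merging described above can be sketched as follows. This is only an
illustration under stated assumptions, not the logic of process-freeling.py:
it assumes a 10-column tab-separated CoNLL line, writes the FreeLing lemma
into the LEMMA column, and stores the FreeLing POS tag in MISC. Placing the
sense in MISC as well, the key names FreelingPOS and Sense, and the token
values are all hypothetical choices made for the example.

```python
def augment_conll_line(line, lemma, sense, freeling_pos):
    """Return a 10-column CoNLL line with FreeLing data merged in.

    Assumptions (not taken from the repository): the sense goes into MISC,
    and the MISC keys are named FreelingPOS and Sense.
    """
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 10:
        raise ValueError("expected 10 tab-separated columns, got %d" % len(cols))

    cols[2] = lemma  # third column is LEMMA

    extras = ["FreelingPOS=" + freeling_pos]
    if sense:
        extras.append("Sense=" + sense)

    # MISC is the last column; "_" marks it as empty.
    existing = [] if cols[9] == "_" else cols[9].split("|")
    cols[9] = "|".join(existing + extras)
    return "\t".join(cols)


# Illustrative parser output for one token, and illustrative FreeLing values.
parser_line = "2\tdogs\t_\tNOUN\tNNS\tNumber=Plur\t3\tnsubj\t_\t_"
merged = augment_conll_line(parser_line, "dog", "02084071-n", "NNS")
```

Keeping the FreeLing tag next to the parser's own tag on the same line lets
the two be compared token by token without realigning the two outputs.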
To reproduce:
1. Download SICK.txt from here:
http://clic.cimec.unitn.it/composes/sick.html
2. Install FreeLing 4.0 from:
http://nlp.cs.upc.edu/freeling/
3. Install SyntaxNet from:
https://github.com/tensorflow/models/tree/master/syntaxnet
4. Download the English UD pre-trained model from:
https://github.com/tensorflow/models/blob/master/syntaxnet/universal.md
5. Download the SUMO knowledge base from:
https://github.com/ontologyportal/sumo
6. Edit the paths in env.sh, process-freeling.py, and parse.sh so that
they point to your local installations.
7. Run the stats.sh script.
See the README files in pairs/ and derivated/ for more information.