
SSAN

Introduction

This is the PaddlePaddle implementation of the SSAN model (see our AAAI 2021 paper: Entity Structure Within and Throughout: Modeling Mention Dependencies for Document-Level Relation Extraction).
SSAN (Structured Self-Attention Network) is a novel extension of the Transformer that effectively incorporates structural dependencies between input elements. In the scenario of document-level relation extraction, the structure we consider is that of entities. Specifically, we propose a transformation module that produces attentive biases based on the structure prior, so as to adaptively regularize the attention flow within and throughout the encoding stage. SSAN achieves state-of-the-art results on several document-level relation extraction tasks.
This implementation is adapted from the ERNIE repo; the main revisions for the SSAN model are in ./model/SSAN_encoder.py#L123-L150.
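The core idea of regularizing attention with structure can be sketched in a few lines of NumPy. This is a simplified, single-head illustration of additive structural biasing (not the repo's actual PaddlePaddle code); the bias values and the "two mentions of one entity" scenario below are hypothetical.

```python
import numpy as np

def structured_attention(q, k, v, bias):
    """Scaled dot-product attention with an additive structural bias
    (a minimal sketch of the idea behind SSAN, single head)."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)      # (seq, seq) raw attention scores
    scores = scores + bias             # structure-aware additive bias
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax
    return weights @ v

rng = np.random.default_rng(0)
seq, d = 4, 8
q, k, v = (rng.standard_normal((seq, d)) for _ in range(3))

# Hypothetical bias: strengthen attention between tokens 0 and 2
# (e.g. two mentions of the same entity); leave other pairs neutral.
bias = np.zeros((seq, seq))
bias[0, 2] = bias[2, 0] = 2.0

out = structured_attention(q, k, v, bias)
```

In SSAN the biases are not fixed constants like this: they are produced by a learned transformation module conditioned on the dependency type of each token pair, and applied in every layer.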

Tagging Strategy

Requirements

  • python3.7, paddlepaddle-gpu==1.6.3.post107, dataclasses
  • This implementation was tested on a single 32 GB V100 GPU with CUDA 10.2 and driver version 440.33.01.

Prepare Model and Dataset

Download pretrained ERNIE model.

cd ./pretrained_lm/
wget https://ernie.bj.bcebos.com/ERNIE_Base_en_stable-2.0.0.tar.gz
mkdir -p ./ernie2_base_en && tar -zxvf ERNIE_Base_en_stable-2.0.0.tar.gz -C ./ernie2_base_en
wget https://ernie.bj.bcebos.com/ERNIE_Large_en_stable-2.0.0.tar.gz
mkdir -p ./ernie2_large_en && tar -zxvf ERNIE_Large_en_stable-2.0.0.tar.gz -C ./ernie2_large_en

Download the DocRED dataset into ./data, including train_annotated.json, dev.json, and test.json.
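For orientation, the entity structure SSAN uses is derived from DocRED's `vertexSet` field (one list of mentions per entity, each mention carrying a sentence id). The sketch below classifies mention pairs by the paper's coreference/co-occurrence taxonomy; it is an illustration with a toy example, not the repo's actual preprocessing code.

```python
# Dependency labels from the paper's taxonomy: do two mentions corefer
# (same entity) and do they co-occur (same sentence)?
INTRA_COREF, INTER_COREF, INTRA_RELATE, INTER_RELATE = range(4)

def mention_dependency(ent_a, sent_a, ent_b, sent_b):
    """Classify the structural dependency between two mentions."""
    coref = ent_a == ent_b
    intra = sent_a == sent_b
    if coref:
        return INTRA_COREF if intra else INTER_COREF
    return INTRA_RELATE if intra else INTER_RELATE

# Toy vertexSet: entity 0 is mentioned in sentences 0 and 2,
# entity 1 in sentence 0 (real DocRED mentions also carry pos/name/type).
vertex_set = [
    [{"sent_id": 0}, {"sent_id": 2}],   # entity 0
    [{"sent_id": 0}],                   # entity 1
]
mentions = [(e, m["sent_id"]) for e, ms in enumerate(vertex_set) for m in ms]
deps = {(i, j): mention_dependency(*mentions[i], *mentions[j])
        for i in range(len(mentions)) for j in range(len(mentions)) if i != j}
print(deps[(0, 1)])  # two mentions of entity 0 across sentences -> INTER_COREF (1)
```

These pairwise labels are what the transformation module maps to attentive biases during encoding.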

Train

sh train.sh

Train and evaluate SSAN. Models are saved in ./checkpoints, and the best threshold for relation prediction is searched on the dev set during evaluation.
By default this trains SSAN on top of ERNIE Base; set --with_ent_structure to false and the model falls back to the ERNIE Base baseline. To train ERNIE Large models, set the model path to ./pretrained_lm/ernie2_large_en.
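The dev-set threshold search mentioned above amounts to picking the confidence cutoff that maximizes F1 and reusing it at prediction time. A minimal pure-Python sketch of that procedure (the scores and labels here are made up for illustration):

```python
def best_threshold(scores, labels):
    """Return the score cutoff with the highest F1 on (scores, labels).

    scores: predicted confidences per candidate relation.
    labels: 1/0 gold flags for the same candidates.
    """
    best_t, best_f1 = 0.0, -1.0
    for t in sorted(set(scores)):
        pred = [s >= t for s in scores]
        tp = sum(p and l for p, l in zip(pred, labels))
        fp = sum(p and not l for p, l in zip(pred, labels))
        fn = sum((not p) and l for p, l in zip(pred, labels))
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1

# Toy example: four candidate relations with gold labels.
scores = [0.9, 0.8, 0.4, 0.2]
labels = [1, 1, 0, 1]
t, f1 = best_threshold(scores, labels)
```

The threshold found on dev is then passed to predict.sh as described below.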

Predict

sh predict.sh

Set your checkpoint directory and threshold for prediction. The result is saved as ./data/result.json.
You can compress the file and upload it to the official competition leaderboard on CodaLab.

cd ./data/
zip result.zip result.json

Results

Results on the DocRED dataset:

Model                  | Dev F1         | Test Ign F1    | Test F1
ERNIE Base Baseline    | 58.54          | 55.58          | 57.71
SSAN (Biaffine)        | 59.12 (+0.58)  | 57.07 (+1.49)  | 59.05 (+1.34)
ERNIE Large Baseline   | 60.25          | 57.87          | 60.11
SSAN (Biaffine)        | 61.58 (+1.33)  | 58.96 (+1.09)  | 61.17 (+1.06)

We set the learning rate to 3e-5 and the batch size to 4, and search for the best number of epochs among (40, 60, 80, 100) on the development set.

Citation

If you use any source code included in this project in your work, please cite the following paper:

@article{xu2021entity,
  title={Entity Structure Within and Throughout: Modeling Mention Dependencies for Document-Level Relation Extraction},
  author={Xu, Benfeng and Wang, Quan and Lyu, Yajuan and Zhu, Yong and Mao, Zhendong},
  journal={arXiv preprint arXiv:2102.10249},
  year={2021}
}

Copyright and License

Copyright 2021 Baidu.com, Inc. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.