TensorFlow solution to the NER task, using a BiLSTM-CRF model with Google BERT fine-tuning
First, download the pretrained Chinese BERT model:

wget https://storage.googleapis.com/bert_models/2018_11_03/chinese_L-12_H-768_A-12.zip
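The flags below expect the model files under a checkpoint/ directory, so unzip the archive and rename it (a suggested step; adjust to your own layout):

unzip chinese_L-12_H-768_A-12.zip
mv chinese_L-12_H-768_A-12 checkpoint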
Create the output directory in the project path:
mkdir output
python3 bert_lstm_ner.py \
--task_name="NER" \
--do_train=True \
--do_eval=True \
--do_predict=True \
--data_dir=NERdata \
--vocab_file=checkpoint/vocab.txt \
--bert_config_file=checkpoint/bert_config.json \
--init_checkpoint=checkpoint/bert_model.ckpt \
--max_seq_length=128 \
--train_batch_size=32 \
--learning_rate=2e-5 \
--num_train_epochs=3.0 \
--output_dir=./output/result_dir/
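For reference, the files under NERdata follow a CoNLL-style one-character-per-line layout; the snippet below is an illustration of the common BIO format (verify against the data bundled with the project). Each line holds a character and its tag, and a blank line separates sentences:

海 O
钓 O
比 O
赛 O
地 O
点 O
在 O
厦 B-LOC
门 I-LOC
。 O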
Alternatively, set the BERT model path and the project path at the top of bert_lstm_ner.py (import os is needed for the OS check):

import os

if os.name == 'nt':  # Windows path config
    bert_path = '{your BERT model path}'
    root_path = '{project path}'
else:  # Linux path config
    bert_path = '{your BERT model path}'
    root_path = '{project path}'
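As a hypothetical illustration (variable names are assumptions, not the file's exact code), the file paths passed as flags above can then be derived from bert_path:

import os

bert_path = '{your BERT model path}'  # set as above
# Derive the BERT file paths used by the command-line flags.
vocab_file = os.path.join(bert_path, 'vocab.txt')
bert_config_file = os.path.join(bert_path, 'bert_config.json')
init_checkpoint = os.path.join(bert_path, 'bert_model.ckpt')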
Then run:
python3 bert_lstm_ner.py
Just edit line 450 of bert_lstm_ner.py and set the crf_only parameter of the add_blstm_crf_layer call to True or False:
CRF output layer only:
blstm_crf = BLSTM_CRF(embedded_chars=embedding, hidden_unit=FLAGS.lstm_size,
                      cell_type=FLAGS.cell, num_layers=FLAGS.num_layers,
                      dropout_rate=FLAGS.droupout_rate, initializers=initializers,
                      num_labels=num_labels, seq_length=max_seq_length,
                      labels=labels, lengths=lengths, is_training=is_training)
rst = blstm_crf.add_blstm_crf_layer(crf_only=True)
BiLSTM with CRF output layer:
blstm_crf = BLSTM_CRF(embedded_chars=embedding, hidden_unit=FLAGS.lstm_size,
                      cell_type=FLAGS.cell, num_layers=FLAGS.num_layers,
                      dropout_rate=FLAGS.droupout_rate, initializers=initializers,
                      num_labels=num_labels, seq_length=max_seq_length,
                      labels=labels, lengths=lengths, is_training=is_training)
rst = blstm_crf.add_blstm_crf_layer(crf_only=False)
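For reference, a minimal sketch of what a CRF output layer computes in TensorFlow 1.x; this is an independent illustration built on tf.contrib.crf, not the repository's BLSTM_CRF implementation:

import tensorflow as tf

def crf_layer(logits, labels, lengths):
    # logits:  [batch, seq_length, num_labels] emission scores
    # labels:  [batch, seq_length] gold label ids
    # lengths: [batch] true (unpadded) sequence lengths
    log_likelihood, trans = tf.contrib.crf.crf_log_likelihood(
        inputs=logits, tag_indices=labels, sequence_lengths=lengths)
    loss = tf.reduce_mean(-log_likelihood)  # negative log-likelihood loss
    # Viterbi-decode the best tag sequence for prediction.
    pred_ids, _ = tf.contrib.crf.crf_decode(logits, trans, lengths)
    return loss, pred_ids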
All other parameters use their default values.
The last two results are label-level results; the entity-level result is computed at lines 796-798 of the code and is output during the predict process.
Once the model has finished training, just run:
python3 terminal_predict.py
If you want to train the NER model on your own data, you only need to modify the get_labels function:
def get_labels(self):
    return ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC", "X", "[CLS]", "[SEP]"]
NOTE: "X", "[CLS]", and "[SEP]" are required; keep them, and replace the remaining labels in the returned list with the labels from your own data.
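For example, if your data tagged a hypothetical TIME entity instead of organizations and locations, the function would become:

def get_labels(self):
    # "X", "[CLS]" and "[SEP]" must stay; the rest are your own labels.
    return ["O", "B-PER", "I-PER", "B-TIME", "I-TIME", "X", "[CLS]", "[SEP]"]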
Alternatively, you can use the following code to let the program collect the labels from the training data automatically:
def get_labels(self):
    # Note: deriving the label set by reading the train file carries a certain risk.
    if os.path.exists(os.path.join(FLAGS.output_dir, 'label_list.pkl')):
        with codecs.open(os.path.join(FLAGS.output_dir, 'label_list.pkl'), 'rb') as rf:
            self.labels = pickle.load(rf)
    else:
        if len(self.labels) > 0:
            self.labels = self.labels.union(set(["X", "[CLS]", "[SEP]"]))
            with codecs.open(os.path.join(FLAGS.output_dir, 'label_list.pkl'), 'wb') as rf:
                pickle.dump(self.labels, rf)
        else:
            self.labels = ["O", 'B-TIM', 'I-TIM', "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC", "X", "[CLS]", "[SEP]"]
    return self.labels
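This assumes self.labels was already filled while reading the training file. A minimal sketch of such collection, assuming the one-character-per-line format shown earlier (collect_labels is a hypothetical helper, not the repository's code):

import codecs

def collect_labels(train_file):
    # Gather the label set from a whitespace-separated "token label" file.
    labels = set()
    with codecs.open(train_file, 'r', encoding='utf-8') as f:
        for line in f:
            line = line.strip()
            if line:  # skip the blank lines between sentences
                labels.add(line.split()[-1])
    return labels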
2019.1.9: Added code to remove the Adam-related parameters from the model, reducing the model file size from 1.3 GB to 400 MB.
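A minimal sketch of that idea, assuming a TF 1.x checkpoint (an illustration, not the repository's exact code): copy every variable except the Adam slot variables (adam_m / adam_v) into a new, smaller checkpoint.

import tensorflow as tf

def strip_adam(in_ckpt, out_ckpt):
    # Rebuild the graph without the Adam optimizer slots and re-save.
    tf.reset_default_graph()
    reader = tf.train.NewCheckpointReader(in_ckpt)
    for name in reader.get_variable_to_shape_map():
        if 'adam_m' in name or 'adam_v' in name:
            continue  # drop optimizer state
        tf.Variable(reader.get_tensor(name), name=name)
    saver = tf.train.Saver()
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        saver.save(sess, out_ckpt)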
2019.1.3: Added online prediction code.