Add files via upload

Tian14267 · Mar 10, 2024 · b20ff7f
1 parent 172b233
commit b20ff7f
Showing 13 changed files with 5,653 additions and 1 deletion.
diff --git a/README.md b/README.md
@@ -1 +1,82 @@
-# LS_mBert
+# LS-mBERT
+
+
+
+### Environment Configuration
+
+
+- Install Conda:  https://docs.conda.io/en/latest/miniconda.html
+- Create your Conda environment:
+
+``` sh
+conda create -n LsMbert python=3.8
+conda activate LsMbert
+pip install -r requirements.txt
+```
+
+
+### Data Preparation
+The next step is to download the data. To this end, first create a `download` folder with `mkdir -p download` in the root 
+of this project. You then need to manually download `panx_dataset` (for NER) from [here](https://www.amazon.com/clouddrive/share/d3KGCRCIYwhKJF0H3eWA26hjg2ZCRhjpEQtDL70FSBN)
+(note that it will download as `AmazonPhotos.zip`) to the download directory. Finally, run the following command to 
+download the remaining datasets:
+
+```bash
+bash scripts/download_data.sh
+```
+
+To get the POS-tags and dependency parse of input sentences, we use UDPipe. Go to the 
+[udpipe](https://github.com/wasiahmad/Syntax-MBERT/tree/main/udpipe) directory and run the task-specific scripts -
+`[xnli.sh|pawsx.sh|panx.sh|mtop.sh]`.
+
+Note: the `Data Preparation` steps are the same as in [Syntax-MBERT](https://github.com/wasiahmad/Syntax-MBERT).
+
+
+### Training 
+#### Language Map
+You can download the language map files from [here](https://pan.baidu.com/s/16xabmTo9_bU0HES20o6v4w?pwd=oor6)
+and put them in the file directory.\
+Alternatively, you can create them yourself from your own data.
+
+#### Text Classification
+```bash
+# for PAWS-X
+sh run_pawsx.sh
+
+# for XNLI
+sh run_xnli.sh 
+```
+
+#### Named Entity Recognition
+```bash
+sh run_ner.sh
+```
+
+
+#### Task-oriented Semantic Parsing
+
+```bash
+sh mtop.sh
+```
+
+
+### Evaluation 
+Take `mtop` as an example.\
+**First**: Download the relevant data from the Baidu link [here](https://pan.baidu.com/s/1Jwf34v42yQaazN68VoHd6Q?pwd=1mow).\
+**Second**: Download the model from the Baidu link [here](https://pan.baidu.com/s/1uiJRy_qn57F0SNIhw6-jZA?pwd=aare).
+
+```bash
+export CUDA_VISIBLE_DEVICES=0
+Output_dir="./outputs/mtop_paper"
+python mtop_paper.py \
+    --data_dir "./download/mtop_udpipe_processed" \
+    --model_name_or_path "./outputs/mtop_model" \
+    --intent_labels "./download/mtop_udpipe_processed/intent_label.txt" \
+    --slot_labels "./download/mtop_udpipe_processed/slot_label.txt" \
+    --do_test \
+    --train_langs "en" \
+    --output_dir "$Output_dir"
+```
+
+You can test the other tasks, or try out any other ideas you have, in the same way.
+
diff --git a/__init__.py b/__init__.py
@@ -0,0 +1,14 @@
+# coding=utf-8
+# Copyright 2020 Google and DeepMind.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.