This repo contains the attacker part of the papers below.
⬆️ commit id: 67fdf3500ec3ccd363cfefc884997d7e40169b82
⬆️ commit id: a17c605d44d53b222b0127f77643519ae33aefd9
We developed several Tibetan adversarial attack methods based on OpenAttack (OpenAttack: An Open-source Textual Adversarial Attack Toolkit (Zeng et al., ACL 2021)).
⬆️ commit id: 4df712e0a5aebc03daa9b1ef353da4b7ea0a1b23
- You need to put the fine-tuned LMs into the dirs (data/Victim.XLMROBERTA.CINO-SMALL-V2_TNCC-TITLE, data/Victim.XLMROBERTA.CINO-SMALL-V2_TUSA, data/Victim.XLMROBERTA.CINO-BASE-V2_TNCC-DOCUMENT, data/Victim.XLMROBERTA.CINO-BASE-V2_TNCC-TITLE, data/Victim.XLMROBERTA.CINO-BASE-V2_TUSA, data/Victim.XLMROBERTA.CINO-LARGE-V2_TNCC-DOCUMENT, data/Victim.XLMROBERTA.CINO-LARGE-V2_TNCC-TITLE, data/Victim.XLMROBERTA.CINO-LARGE-V2_TUSA, data/Victim.XLMROBERTA.TIBETAN-BERT_TNCC-TITLE, data/Victim.XLMROBERTA.TIBETAN-BERT_TUSA, etc.).
- You need to download and unzip the Tibetan word vectors (Learning Word Vectors for 157 Languages (Grave et al., LREC 2018)) into the dir (data/AttackAssist.TibetanWord2Vec).
- You need to put the pre-trained LMs: Tibetan-BERT (Research and Application of Tibetan Pre-training Language Model Based on BERT (Zhang et al., ICCIR 2022)), TiBERT (TiBERT: Tibetan Pre-trained Language Model (Liu et al., SMC 2022)), etc. into the dirs (data/AttackAssist.Tibetan_BERT, data/AttackAssist.TiBERT, etc.).
- You need to put the trained model: segbase.cpkt (link: https://pan.baidu.com/s/1j_60cDWVlfryikaP-1Nvbw password: 19pe) of TibetSegEYE (https://github.com/yjspho/TibetSegEYE) into the dir (data/AttackAssist.TibetSegEYE).
- You need to follow the OpenAttack README (OpenAttack: An Open-source Textual Adversarial Attack Toolkit (Zeng et al., ACL 2021)) to install the development environment.
- You can run the attack scripts in the dir (demo_tibetan).
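Before running the attack scripts, the `data/` layout described in the steps above can be sanity-checked with a small stdlib script. This is only a sketch, not part of the repo: `check_data_dirs` is a hypothetical helper, and `REQUIRED_DIRS` lists a subset of the directory names taken verbatim from the steps above.

```python
import os

# A subset of the dirs required by the setup steps above; extend as needed.
REQUIRED_DIRS = [
    "data/AttackAssist.TibetanWord2Vec",
    "data/AttackAssist.Tibetan_BERT",
    "data/AttackAssist.TiBERT",
    "data/AttackAssist.TibetSegEYE",
    "data/Victim.XLMROBERTA.TIBETAN-BERT_TUSA",
]

def check_data_dirs(root=".", required=REQUIRED_DIRS):
    """Return the required dirs that are missing under `root`."""
    return [d for d in required if not os.path.isdir(os.path.join(root, d))]

if __name__ == "__main__":
    for d in check_data_dirs():
        print(f"missing: {d}")
```

Running it from the repo root prints any directory that still needs to be populated before the scripts in `demo_tibetan` are launched.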
If you find our work useful, please kindly cite our papers.
@inproceedings{10.1145/3589335.3652503,
  author = {Cao, Xi and Qun, Nuo and Gesang, Quzong and Zhu, Yulei and Nyima, Trashi},
  title = {Multi-Granularity Tibetan Textual Adversarial Attack Method Based on Masked Language Model},
  year = {2024},
  isbn = {9798400701726},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3589335.3652503},
  doi = {10.1145/3589335.3652503},
  booktitle = {Companion Proceedings of the ACM on Web Conference 2024},
  pages = {1672--1680},
  numpages = {9},
  keywords = {language model, robustness, textual adversarial attack, tibetan},
  location = {Singapore, Singapore},
  series = {WWW '24}
}
@inproceedings{cao-etal-2023-pay-attention,
  title = "Pay Attention to the Robustness of {C}hinese Minority Language Models! Syllable-level Textual Adversarial Attack on {T}ibetan Script",
  author = "Cao, Xi and
    Dawa, Dolma and
    Qun, Nuo and
    Nyima, Trashi",
  booktitle = "Proceedings of the 3rd Workshop on Trustworthy Natural Language Processing (TrustNLP 2023)",
  month = jul,
  year = "2023",
  address = "Toronto, Canada",
  publisher = "Association for Computational Linguistics",
  url = "https://aclanthology.org/2023.trustnlp-1.4",
  pages = "35--46"
}