

Measuring the Isotropy of Language Models


Catalogue:

• 1. Data Preparation
• 2. Measure Isotropy
• 3. Language Code and Model Card

1. Data Preparation: [Back to Top]

Before running the experiments, please make sure you have downloaded the WIT dataset as instructed [here].


2. Measure Isotropy: [Back to Top]

To measure the isotropy of a specific language model, please run the following commands:

cd ./scripts/
chmod +x ./inference.sh
./inference.sh

The arguments are as follows:

  • --test_path: The file path of the test data.
  • --max_len: The maximum length of a test sequence.
  • --language_code: The language code of the specific language model. See Section 3 for more details.
  • --model_name: The name of the huggingface model. See Section 3 for more details.
  • --save_path_prefix: The directory in which to save the inference results.

[Note] After the inference completes, the results are saved in the directory save_path_prefix + r'/{}/'.format(language_code).
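This README does not restate the isotropy measure itself, so as a point of reference the sketch below assumes a common definition: isotropy as one minus the average pairwise cosine similarity between the token representations of a sequence, so a higher score means a more isotropic representation space. The function is an illustrative assumption, not the repository's exact implementation:

import torch

def isotropy(hidden_states: torch.Tensor) -> float:
    # hidden_states: [seq_len, hidden_dim] token representations of one sequence.
    # Assumption (not defined in this README): isotropy = 1 - mean pairwise
    # cosine similarity over all distinct token pairs.
    # Normalise each token vector so the Gram matrix holds cosine similarities.
    h = torch.nn.functional.normalize(hidden_states, dim=-1)
    sim = h @ h.t()  # [seq_len, seq_len]
    n = sim.size(0)
    # Average over the off-diagonal entries, i.e. exclude self-similarity.
    avg_sim = (sim.sum() - sim.diagonal().sum()) / (n * (n - 1))
    return 1.0 - avg_sim.item()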


3. Language Code and Model Card: [Back to Top]

The following table lists the models used in our experiments.

| Language | Model | Language Code | Model Name | Model Size | Model Card | Isotropy |
| --- | --- | --- | --- | --- | --- | --- |
| English | GPT | en | gpt2 | 117M | [link] | 0.10 |
| English | GPT | en | gpt2-medium | 345M | [link] | 0.25 |
| English | GPT | en | gpt2-large | 774M | [link] | 0.70 |
| English | GPT | en | gpt2-xl | 1.6B | [link] | 0.72 |
| English | GPT-Neo | en | EleutherAI/gpt-neo-125M | 125M | [link] | 0.68 |
| English | GPT-Neo | en | EleutherAI/gpt-neo-1.3B | 1.3B | [link] | 0.55 |
| English | GPT-Neo | en | EleutherAI/gpt-neo-2.7B | 2.7B | [link] | 0.60 |
| English | OPT | en | facebook/opt-125m | 125M | [link] | 0.75 |
| English | OPT | en | facebook/opt-350m | 350M | [link] | 0.69 |
| English | OPT | en | facebook/opt-1.3b | 1.3B | [link] | 0.75 |
| English | OPT | en | facebook/opt-2.7b | 2.7B | [link] | 0.74 |
| English | OPT | en | facebook/opt-6.7b | 6.7B | [link] | 0.70 |
| English | OPT | en | facebook/opt-13b | 13B | [link] | 0.66 |
| English | OPT | en | facebook/opt-30b | 30B | [link] | 0.68 |
| Spanish | GPT | es | datificate/gpt2-small-spanish | 117M | [link] | 0.77 |
| Spanish | GPT | es | DeepESP/gpt2-spanish-medium | 345M | [link] | 0.76 |
| French | GPT | fr | asi/gpt-fr-cased-small | 117M | [link] | 0.76 |
| Portuguese | GPT | pt | pierreguillou/gpt2-small-portuguese | 117M | [link] | 0.77 |
| Thai | GPT | th | flax-community/gpt2-base-thai | 117M | [link] | 0.74 |
| Japanese | GPT | ja | colorfulscoop/gpt2-small-ja | 117M | [link] | 0.72 |
| Korean | GPT | ko | skt/kogpt2-base-v2 | 117M | [link] | 0.58 |
| Korean | GPT | ko | skt/ko-gpt-trinity-1.2B-v0.5 | 1.2B | [link] | 0.68 |
| Chinese | GPT | zh | uer/gpt2-chinese-cluecorpussmall | 117M | [link] | 0.66 |
| Indonesian | GPT | id | cahya/gpt2-small-indonesian-522M | 117M | [link] | 0.66 |
| Indonesian | GPT | id | flax-community/gpt2-medium-indonesian | 345M | [link] | 0.67 |
| Indonesian | GPT | id | cahya/gpt2-large-indonesian-522M | 774M | [link] | 0.81 |
| Bengali | GPT | bn | flax-community/gpt2-bengali | 117M | [link] | 0.62 |
| Hindi | GPT | hi | surajp/gpt2-hindi | 117M | [link] | 0.62 |
| Arabic | GPT | ar | akhooli/gpt2-small-arabic | 117M | [link] | 0.53 |
| Arabic | GPT | ar | aubmindlab/aragpt2-medium | 345M | [link] | 0.64 |
| German | GPT | de | ml6team/gpt2-small-german-finetune-oscar | 117M | [link] | 0.83 |
| German | GPT | de | ml6team/gpt2-medium-german-finetune-oscar | 345M | [link] | 0.81 |
| Dutch | GPT | nl | ml6team/gpt2-small-dutch-finetune-oscar | 117M | [link] | 0.80 |
| Dutch | GPT | nl | ml6team/gpt2-medium-dutch-finetune-oscar | 345M | [link] | 0.79 |
| Russian | GPT | ru | sberbank-ai/rugpt3small_based_on_gpt2 | 117M | [link] | 0.67 |
| Russian | GPT | ru | sberbank-ai/rugpt3medium_based_on_gpt2 | 345M | [link] | 0.72 |
| Russian | GPT | ru | sberbank-ai/rugpt3large_based_on_gpt2 | 774M | [link] | 0.77 |
| Italian | GPT | it | LorenzoDeMattei/GePpeTto | 117M | [link] | 0.69 |
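
Every entry in the Model Name column can be loaded directly with huggingface transformers. The snippet below is a minimal usage sketch (the model choice and input sentence are illustrative) showing how to obtain the last-layer token representations that an isotropy measure such as the one sketched in Section 2 would consume:

import torch
from transformers import AutoModel, AutoTokenizer

model_name = "gpt2"  # any Model Name from the table above
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name).eval()

inputs = tokenizer("Measuring the isotropy of language models.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)
# Last-layer token representations: [seq_len, hidden_dim]
hidden = outputs.hidden_states[-1].squeeze(0)
print(isotropy(hidden))  # isotropy() as sketched in Section 2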

Acknowledgements

We thank the research community for open-sourcing these wonderful language models!