1. Data Preparation: [Back to Top]
Before running the experiments, please make sure you have downloaded the WIT dataset as instructed [here].
2. Measure Isotropy: [Back to Top]
To measure the isotropy of a specific language model, please run the following commands:
```shell
cd ./scripts/
chmod +x ./inference.sh
./inference.sh
```
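As a point of reference, isotropy is commonly quantified as one minus the average pairwise cosine similarity between token representations: a score near 0 means all representations point in roughly the same direction (anisotropic), while a score near 1 means they spread evenly across directions. The sketch below illustrates this definition with numpy; it is a minimal illustration of the metric, not the repo's exact implementation.

```python
import numpy as np

def isotropy(reps):
    """Isotropy score: 1 minus the average pairwise cosine similarity
    between distinct token representations (rows of `reps`).
    A common definition; the repo's exact metric may differ."""
    reps = reps / np.linalg.norm(reps, axis=1, keepdims=True)  # L2-normalize rows
    sim = reps @ reps.T                                        # pairwise cosine similarities
    n = sim.shape[0]
    off_diag = (sim.sum() - n) / (n * (n - 1))                 # mean over i != j pairs
    return 1.0 - off_diag

# Anisotropic case: four identical vectors -> score 0
aniso = np.tile(np.array([1.0, 0.0, 0.0, 0.0]), (4, 1))
# Isotropic case: four orthogonal vectors -> score 1
iso = np.eye(4)
print(isotropy(aniso), isotropy(iso))  # prints: 0.0 1.0
```

Intuitively, the table in Section 3 reports this kind of score per model: higher values indicate a more isotropic representation space.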
The arguments are as follows:
* `--test_path`: The file path of the test data.
* `--max_len`: The maximum length of a test sequence.
* `--language_code`: The language code of the specific language model. See Section 3 for more details.
* `--model_name`: The model name of the huggingface model. See Section 3 for more details.
* `--save_path_prefix`: The directory used to save the inference result.

[Note] After the inference completes, the result is saved in the directory `save_path_prefix + r'/{}/'.format(language_code)`.
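For example, the output directory resolves as follows (the values below are hypothetical placeholders, not defaults of the script):

```python
# Hypothetical argument values for illustration only
save_path_prefix = './inference_results'
language_code = 'en'

# The save-path rule quoted in the note above
out_dir = save_path_prefix + r'/{}/'.format(language_code)
print(out_dir)  # → ./inference_results/en/
```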
3. Language Code and Model Card: [Back to Top]
In the following table, we list the models used in our experiments.
| Language | Model | Language Code | Model Name | Model Size | Model Card | Isotropy |
|---|---|---|---|---|---|---|
| English | GPT | en | gpt2 | 117M | [link] | 0.10 |
| English | GPT | en | gpt2-medium | 345M | [link] | 0.25 |
| English | GPT | en | gpt2-large | 774M | [link] | 0.70 |
| English | GPT | en | gpt2-xl | 1.6B | [link] | 0.72 |
| English | GPT-Neo | en | EleutherAI/gpt-neo-125M | 125M | [link] | 0.68 |
| English | GPT-Neo | en | EleutherAI/gpt-neo-1.3B | 1.3B | [link] | 0.55 |
| English | GPT-Neo | en | EleutherAI/gpt-neo-2.7B | 2.7B | [link] | 0.60 |
| English | OPT | en | facebook/opt-125m | 125M | [link] | 0.75 |
| English | OPT | en | facebook/opt-350m | 350M | [link] | 0.69 |
| English | OPT | en | facebook/opt-1.3b | 1.3B | [link] | 0.75 |
| English | OPT | en | facebook/opt-2.7b | 2.7B | [link] | 0.74 |
| English | OPT | en | facebook/opt-6.7b | 6.7B | [link] | 0.70 |
| English | OPT | en | facebook/opt-13b | 13B | [link] | 0.66 |
| English | OPT | en | facebook/opt-30b | 30B | [link] | 0.68 |
| Spanish | GPT | es | datificate/gpt2-small-spanish | 117M | [link] | 0.77 |
| Spanish | GPT | es | DeepESP/gpt2-spanish-medium | 345M | [link] | 0.76 |
| French | GPT | fr | asi/gpt-fr-cased-small | 117M | [link] | 0.76 |
| Portuguese | GPT | pt | pierreguillou/gpt2-small-portuguese | 117M | [link] | 0.77 |
| Thai | GPT | th | flax-community/gpt2-base-thai | 117M | [link] | 0.74 |
| Japanese | GPT | ja | colorfulscoop/gpt2-small-ja | 117M | [link] | 0.72 |
| Korean | GPT | ko | skt/kogpt2-base-v2 | 117M | [link] | 0.58 |
| Korean | GPT | ko | skt/ko-gpt-trinity-1.2B-v0.5 | 1.6B | [link] | 0.68 |
| Chinese | GPT | zh | uer/gpt2-chinese-cluecorpussmall | 117M | [link] | 0.66 |
| Indonesian | GPT | id | cahya/gpt2-small-indonesian-522M | 117M | [link] | 0.66 |
| Indonesian | GPT | id | flax-community/gpt2-medium-indonesian | 345M | [link] | 0.67 |
| Indonesian | GPT | id | cahya/gpt2-large-indonesian-522M | 774M | [link] | 0.81 |
| Bengali | GPT | bn | flax-community/gpt2-bengali | 117M | [link] | 0.62 |
| Hindi | GPT | hi | surajp/gpt2-hindi | 117M | [link] | 0.62 |
| Arabic | GPT | ar | akhooli/gpt2-small-arabic | 117M | [link] | 0.53 |
| Arabic | GPT | ar | aubmindlab/aragpt2-medium | 345M | [link] | 0.64 |
| German | GPT | de | ml6team/gpt2-small-german-finetune-oscar | 117M | [link] | 0.83 |
| German | GPT | de | ml6team/gpt2-medium-german-finetune-oscar | 345M | [link] | 0.81 |
| Dutch | GPT | nl | ml6team/gpt2-small-dutch-finetune-oscar | 117M | [link] | 0.80 |
| Dutch | GPT | nl | ml6team/gpt2-medium-dutch-finetune-oscar | 345M | [link] | 0.79 |
| Russian | GPT | ru | sberbank-ai/rugpt3small_based_on_gpt2 | 117M | [link] | 0.67 |
| Russian | GPT | ru | sberbank-ai/rugpt3medium_based_on_gpt2 | 345M | [link] | 0.72 |
| Russian | GPT | ru | sberbank-ai/rugpt3large_based_on_gpt2 | 774M | [link] | 0.77 |
| Italian | GPT | it | LorenzoDeMattei/GePpeTto | 117M | [link] | 0.69 |
We thank the research community for open-sourcing these wonderful language models!