Creating Arabic Datasets with respect to Arabic Dialects #103

AliAlsalkhadi · 2024-11-15T04:49:50Z

@Mahmoud-s-programs and I went through the articles recommended by @Sepideh-Ahmadian. After a long discussion, we concluded that the best way to gather Arabic datasets with respect to dialects is to create a separate dataset for each region (Gulf, Levantine, Egyptian, Maghrebi). This grouping is meant to cover all Arabic dialects so that the model can recognize them.

We have already added more reviews to the SemEval-2016 dataset, as it uses the Gulf dialect exclusively.

Sepideh-Ahmadian · 2024-11-15T07:52:06Z

Thank you @AliAlsalkhadi and @Mahmoud-s-programs! We can discuss it in today's meeting.

Sepideh-Ahmadian · 2024-11-18T16:07:46Z

@AliAlsalkhadi and @Mahmoud-s-programs
The article we discussed in the LADy meeting is "Datasheets for Datasets". Please share your Gmail addresses so I can send you our English draft. In addition to the questions mentioned in the article, feel free to suggest any others that you think are important for our work.
Also, the full SemEval 2016 dataset is available through this link; it contains a total of 4,802 sentences (train set only).

AliAlsalkhadi · 2024-11-18T18:45:43Z

@Sepideh-Ahmadian Thanks for sharing this. Here is my Gmail: [email protected]
I will check out the full dataset.

Sepideh-Ahmadian assigned AliAlsalkhadi Nov 15, 2024
