---
name: Launch of V1.0 AI Luminate, and how Tattle is involved
excerpt: Tattle's work with MLCommons on AI Safety
author: Tattle
project: ""
date: 2024-12-05
tags: responsible-ai
---

## Launch of V1.0 AI Luminate, and how Tattle is involved

Earlier this year, MLCommons, a global organisation that works to improve AI systems, issued an expression of interest for creating prompts in non-English languages.
Tattle was selected as a pilot project to contribute to the benchmark in Hindi, using the participatory approach we followed with Uli [^1], and we began working on this project.
We created 2,000 prompts in Hindi across two hazard categories [^2]: hate and sex-related crimes.
These prompts were created by an expert group with expertise in journalism, social work, feminist advocacy, gender studies, fact-checking, political campaigning, education, psychology, and research. All of the experts were native or fluent Hindi speakers.

The project took place over the course of two months, during which we conducted online sessions with the experts organised into groups.
They were encouraged to discuss and write prompts in Hindi related to the hazards. The prompts were then collated by hazard, and we annotated them further to gather more granular insights from the exercise.
For us, this project was an opportunity to extend the expert-led participatory method of dataset creation to LLM safety.

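As a rough illustration of how collated prompt records might be organised, here is a minimal sketch in Python; the record fields, hazard labels, and grouping helper are assumptions made for illustration, not the actual schema used in the project or by MLCommons.

```python
# Illustrative sketch only: field names, hazard labels, and annotation keys
# are hypothetical, not the actual schema used in the project.
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class PromptRecord:
    text: str                 # prompt written in Hindi by an expert
    hazard: str               # e.g. "hate" or "sex-related-crimes"
    annotations: dict = field(default_factory=dict)  # finer-grained labels added during annotation

def collate_by_hazard(records):
    """Group prompt records by their hazard category."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record.hazard].append(record)
    return dict(grouped)

# Placeholder usage; real prompt text is omitted here.
records = [
    PromptRecord("<Hindi prompt 1>", "hate", {"target": "<group>"}),
    PromptRecord("<Hindi prompt 2>", "sex-related-crimes", {"severity": "<level>"}),
]
for hazard, prompts in collate_by_hazard(records).items():
    print(hazard, len(prompts))
```
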
MLCommons is now releasing the v1 Safety Benchmark dataset, AI Luminate. It is an important step in assessing the safety of LLMs.
Our project provided interesting insights into the universality of the framework proposed in v0.5.
We concluded our report to MLCommons, available [here](https://mlcommons.org/ailuminate/methodology/), with some recommendations for extending this work to low-resource languages.
In addition to contributing to AI Luminate, we also engaged in an extensive landscape analysis of large language models and their coverage of Indian languages.
In the study, we looked at existing evaluation datasets and methodologies used to assess the performance of LLMs across various language tasks.
For a set of models that support Indian languages, we also analysed attributes such as the training data, the distribution of Indian languages within it, access, licensing, and the type of LLM.

Take a look at AI Luminate [here](https://mlcommons.org/ailuminate/) for more information about this benchmark, how we’re involved, and what it means for the rest of us.

[^1]: https://aclanthology.org/2024.woah-1.16/
[^2]: https://arxiv.org/html/2404.12241v1