chore: update uli-for-ts.mdx (#610)
kaustubhavarma committed Aug 8, 2024
1 parent 65a45be commit 80cb087
Showing 1 changed file with 32 additions and 6 deletions.
38 changes: 32 additions & 6 deletions uli-website/src/pages/uli-for-ts.mdx
@@ -1,9 +1,35 @@
import ContentPageShell from "../../components/molecules/ContentPageShell.jsx"

<ContentPageShell>

## Uli for Trust and Safety Teams

Uli relies on a number of resources, including:
1. A list of slurs in Indian languages, crowdsourced from activists and researchers
2. A dataset of abuse annotated by people at the receiving end of abuse
3. Translations of community guidelines
4. A machine learning model to detect abuse

### What is the Uli slur list?

The Uli slur list is a dataset of slurs and coded language (dog-whistle terms) in Indian languages. Although we call it a slur list, it is more accurately a dataset of words and phrases together with metadata about them, such as what makes a term problematic, whether it has been reclaimed, and which identity groups it targets.
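
For teams planning an integration, it may help to picture one entry of such a dataset as a typed record. The sketch below is illustrative only: the `SlurEntry` interface, its field names, and the `findEntries` helper are assumptions made for this example, not the actual Uli schema or API, and the sample data uses a neutral placeholder term.

```typescript
// Hypothetical shape of a single slur-list entry.
// Field names are assumptions for illustration, not the real Uli schema.
interface SlurEntry {
  term: string;             // the word or phrase
  language: string;         // e.g. "Hindi", "Tamil"
  whyProblematic: string;   // what makes the term problematic
  reclaimed: boolean;       // whether the term has been reclaimed
  targetedGroups: string[]; // identity groups targeted
}

// Hypothetical helper: return the entries whose term occurs in the text.
// Uses a simple case-insensitive substring match; a real system would
// likely also need to handle spelling variations and word boundaries.
function findEntries(text: string, entries: SlurEntry[]): SlurEntry[] {
  const haystack = text.toLowerCase();
  return entries.filter((entry) => haystack.includes(entry.term.toLowerCase()));
}

// Placeholder data only; no real terms are shown here.
const sampleEntries: SlurEntry[] = [
  {
    term: "placeholder-term",
    language: "Indian English",
    whyProblematic: "placeholder explanation",
    reclaimed: false,
    targetedGroups: ["placeholder group"],
  },
];

const matches = findEntries("A post containing Placeholder-Term.", sampleEntries);
```

Keeping the metadata next to each term lets a moderation tool treat a reclaimed term differently from one that has not been reclaimed, rather than handling every match the same way.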


### Why was the Uli slur list created?

The Uli slur list was crowdsourced with researchers and activists in the process of building a robust dataset for the Uli plugin feature. It has now become a stand-alone resource that supports Trust and Safety teams and researchers.



### How is it created?

The slur list is crowdsourced with the assistance of researchers and activists in the gender and feminist rights sector who have so far contributed slurs in Indian English, Hindi, Tamil and Malayalam. This takes place through online annotation sessions, conducted in line with our [annotation guidelines](https://docs.google.com/document/d/18H4TlLFB2GXK054oMj1uXVJ2OCFW08Gi/edit).



### What are the future plans?

We are continuing to conduct crowdsourcing sessions to expand the slur list into more languages and to improve our understanding of the slurs through metadata. We are also iterating on the plugin to enable social features that will allow people to contribute to the slur list directly through it.

Separately, we are working on a framework to guide how we compensate the annotators who contribute to Uli's datasets. As part of Mozilla's Data Futures Lab cohort, we wrote a white paper that seeks to understand the different ways in which projects value the contribution of expert annotators. If you'd like a copy, please reach out to us.


### I want to use this list. How should I do that?

Some versions of the slur list are open access and available in our GitHub [repo](https://github.com/tattle-made/Uli/tree/1760d01660dc5e7c20453edbe580e9315382c691/browser-extension/plugin/scripts).
However, the list is continuously updated: we detect spelling variations, add metadata, and expand to more languages. Please reach out to [email protected] and [email protected] if you would like access to the most up-to-date version.

Many of these resources are open, but some are not. If you are working in South Asia or with South Asian diasporic communities and are interested in any of the backend resources for ensuring the safety of your community online, please send an email to [email protected].
</ContentPageShell>
