HaRe (Harassment Recognizer) is a command-line tool and Python library that uses machine learning to detect harassment automatically as it happens (in real time).
The easiest way to use HaRe is to load one of the pretrained models included with this repo in the models folder, such as the one named 'moba':
from hare import load_pretrained
moba_hare = load_pretrained('moba')
You can then use this object to monitor conversations in progress. Let's start a conversation and ask HaRe to monitor it:
from hare import Conversation
convo = Conversation()
moba_hare.add_conversation(convo)
At any point in time, you can then request the current status of the conversation according to this HaRe model:
convo.add_utterance(speaker='a',content='hello')
convo.add_utterance(speaker='b',content='hi everyone')
moba_hare.get_status()
You can also add multiple utterances at once, for example a whole conversation that has already finished:
from hare import Utterance
convo.add_utterances([Utterance(speaker='a',content='good luck'),
                      Utterance(speaker='c',content='ur all n00bs')])
moba_hare.get_status()
If you add multiple conversations for HaRe to monitor, you will need to specify the conversation index when asking for the status:
second_convo = Conversation()
convo_index = moba_hare.add_conversation(second_convo)
second_convo.add_utterance(speaker='a',content='hello')
second_convo.add_utterance(speaker='b',content='hi everyone')
moba_hare.get_status(id=convo_index)
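To make the index bookkeeping concrete, here is a minimal, self-contained sketch of how a monitor could hand out conversation indices. ToyConversation, ToyMonitor and the returned status dictionary are hypothetical stand-ins invented for this illustration; they are not HaRe's classes, and the output of HaRe's own get_status may well look different.

```python
class ToyConversation:
    """Hypothetical stand-in for a conversation: an ordered list of utterances."""

    def __init__(self):
        self.utterances = []

    def add_utterance(self, speaker, content):
        self.utterances.append((speaker, content))


class ToyMonitor:
    """Hypothetical stand-in for an object that tracks several conversations."""

    def __init__(self):
        self.conversations = []

    def add_conversation(self, conversation):
        # The index is simply the position in the internal list.
        self.conversations.append(conversation)
        return len(self.conversations) - 1

    def get_status(self, id=0):
        # Assumption for this sketch: without an id, report on the first conversation.
        convo = self.conversations[id]
        speakers = sorted({speaker for speaker, _ in convo.utterances})
        return {'utterances': len(convo.utterances), 'speakers': speakers}


monitor = ToyMonitor()
first = ToyConversation()
second = ToyConversation()
first_index = monitor.add_conversation(first)
second_index = monitor.add_conversation(second)

second.add_utterance(speaker='a', content='hello')
second.add_utterance(speaker='b', content='hi everyone')

# Only the second conversation has utterances, so the index matters here.
status = monitor.get_status(id=second_index)
```

The sketch keeps a reference to each conversation rather than a copy, which is why utterances added after add_conversation still show up in the status.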
If you have a labeled dataset (that is, an indication for each conversation of which participants are considered toxic), HaRe can calculate to what extent its judgments match the labels. A label can range from the default 0 (not toxic at all) to 1 (maximally toxic). Let's label speaker c:
convo.label_speaker('c',0.9)
There are several evaluation metrics, depending on what is important to you (detecting ALL harassment, detecting harassment quickly, avoiding false positives, etc.):
moba_hare.calculate_accuracy()
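As a rough illustration of why different goals call for different metrics, the sketch below computes recall (was all harassment caught?) and precision (how many flags were false positives?) from per-speaker labels and scores. The function name, the dictionaries and the 0.5 threshold are assumptions made for this example; it says nothing about how calculate_accuracy works internally.

```python
def precision_recall(labels, scores, threshold=0.5):
    """Compare gold labels with predicted scores, both in the range 0..1.

    A speaker counts as toxic when the value is at or above `threshold`
    (an assumption made for this sketch).
    """
    true_pos = false_pos = false_neg = 0
    for speaker, label in labels.items():
        is_toxic = label >= threshold
        flagged = scores.get(speaker, 0.0) >= threshold
        if flagged and is_toxic:
            true_pos += 1
        elif flagged and not is_toxic:
            false_pos += 1
        elif is_toxic and not flagged:
            false_neg += 1

    flagged_total = true_pos + false_pos
    toxic_total = true_pos + false_neg
    # With nothing flagged (or nothing toxic) the metric is defined as 1.0 here.
    precision = true_pos / flagged_total if flagged_total else 1.0
    recall = true_pos / toxic_total if toxic_total else 1.0
    return precision, recall


# Made-up labels and scores for four speakers.
labels = {'a': 0.0, 'b': 0.0, 'c': 0.9, 'd': 1.0}
scores = {'a': 0.1, 'b': 0.7, 'c': 0.8, 'd': 0.2}
precision, recall = precision_recall(labels, scores)
```

In this made-up data speaker b is a false positive and speaker d is missed, so both precision and recall come out at one half. A system tuned to avoid false positives would trade recall for precision, and the other way around.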
These metrics are calculated on the basis of all conversations the HaRe object is aware of that have at least one labeled participant. If you want to exclude a conversation from evaluation, simply add it to the conversations_excluded_for_evaluation list:
moba_hare.conversations_excluded_for_evaluation = [0]
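The selection rule described above (at least one labeled participant, and not on the exclusion list) can be sketched in a few lines. The function and the data layout, one dictionary of speaker labels per conversation, are hypothetical and chosen only for illustration.

```python
def conversations_for_evaluation(speaker_labels, excluded):
    """Return the indices of conversations that take part in evaluation.

    `speaker_labels` holds one dict per conversation; an empty dict means
    that no participant was labeled (layout assumed for this sketch).
    """
    return [index for index, labels in enumerate(speaker_labels)
            if labels and index not in excluded]


speaker_labels = [{'c': 0.9}, {}, {'b': 1.0}]
selected = conversations_for_evaluation(speaker_labels, excluded=[0])
```

Here conversation 0 is excluded explicitly and conversation 1 has no labels, so only conversation 2 remains.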
At some point, you might want to do some training yourself. For example, you may be applying HaRe in a different domain than the pretrained models, where harassment looks slightly different, or you may even want to detect something other than harassment.
Whatever your goals are, it is probably most effective to repurpose the existing HaRe models ('transfer learning'). To achieve this, simply load the pretrained model that best matches your goal, add some conversations and label them, as we did above. If you want to exclude conversations from training, add them to the conversations_excluded_for_training list:
from hare import load_pretrained, Utterance
moba_hare = load_pretrained('moba')
moba_hare.start_conversation(conversation_id='convo_001')
moba_hare.label_speaker('b',1)
moba_hare.add_utterances([Utterance(speaker='a',content='good luck'),
                          Utterance(speaker='b',content='ur all n00bs')])
moba_hare.start_conversation(conversation_id='convo_002')
moba_hare.label_speaker('b',1)
moba_hare.add_utterances([Utterance(speaker='a',content='hi'),
                          Utterance(speaker='b',content='SHUT UP!')])
moba_hare.conversations_excluded_for_training = ['convo_002']
Then, use the retrain command to take the old model and refit it to your new dataset. You can then use save to store it as a pretrained model in the models folder. You can later access this model with load_pretrained, like the other pretrained models in that same folder.
moba_hare.retrain()
moba_hare.save(name='moba_extended')
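To illustrate the general idea behind retraining, here is a small sketch of warm-starting: a model fitted on one dataset serves as the starting point for fitting on another. It uses a tiny logistic regression written in plain Python with made-up features. HaRe's actual model and its retrain implementation are not described here, so everything below (function names, features, learning rate) is an assumption for illustration only.

```python
import math


def predict(weights, features):
    """Logistic model: estimated probability that an utterance is toxic."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def fit(examples, weights=None, epochs=200, rate=0.5):
    """Stochastic gradient descent on the log loss.

    Passing `weights` continues from an existing model (retraining);
    leaving it out starts from zeros (training from scratch).
    """
    n_features = len(examples[0][0])
    weights = list(weights) if weights is not None else [0.0] * n_features
    for _ in range(epochs):
        for features, label in examples:
            error = predict(weights, features) - label
            weights = [w - rate * error * x for w, x in zip(weights, features)]
    return weights


def log_loss(weights, examples):
    """Average log loss of the model on a set of labeled examples."""
    eps = 1e-12
    total = 0.0
    for features, label in examples:
        p = min(max(predict(weights, features), eps), 1 - eps)
        total -= label * math.log(p) + (1 - label) * math.log(1 - p)
    return total / len(examples)


# Made-up features: [bias, contains an insult, written in capitals].
old_domain = [([1, 1, 0], 1), ([1, 0, 0], 0), ([1, 1, 0], 1), ([1, 0, 0], 0)]
new_domain = [([1, 0, 1], 1), ([1, 0, 0], 0), ([1, 1, 1], 1)]

pretrained = fit(old_domain)                     # fit on the original domain
retrained = fit(new_domain, weights=pretrained)  # continue on the new domain
```

The pretrained weights never saw the "capitals" feature, so they misjudge shouting in the new domain; continuing training from them picks up that signal while keeping what was already learned about insults.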
If you don't want to use transfer learning with an existing model, you can also start from scratch. The procedure is largely the same, except that you don't use the load_pretrained function, and use train instead of retrain:
from hare import Hare, Utterance
new_hare = Hare()
new_hare.start_conversation(conversation_id='convo_001')
new_hare.label_speaker('b',1)
new_hare.add_utterances([Utterance(speaker='a',content='good luck'),
                         Utterance(speaker='b',content='ur all n00bs')])
new_hare.start_conversation(conversation_id='convo_002')
new_hare.label_speaker('b',1)
new_hare.add_utterances([Utterance(speaker='a',content='hi'),
                         Utterance(speaker='b',content='SHUT UP!')])
new_hare.train()
It will probably be highly effective to use so-called word embeddings during training. You can see these embeddings as a dictionary that translates from a word's characters to an estimate of its meaning. Research has shown that classifying these 'meanings' is much more successful than classifying the raw words [reference]. To use them, simply point your HaRe object to the embedding file in the embeddings folder before training:
new_hare.embedding_file = 'english_large'
new_hare.train()
Of course, these embeddings should be in the same language as the rest of your dataset.
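The "dictionary from words to meanings" view of embeddings can be sketched with a few made-up vectors. The three-dimensional values, the toy_embeddings name and the averaging in embed are assumptions for illustration; real embedding files contain many more words and dimensions, and how HaRe combines word vectors is not specified here.

```python
import math

# Made-up 3-dimensional vectors; spelling variants of the same word
# are given similar vectors on purpose.
toy_embeddings = {
    'noob':  [0.9, 0.1, 0.0],
    'n00b':  [0.8, 0.2, 0.0],
    'hello': [0.0, 0.1, 0.9],
}


def cosine_similarity(u, v):
    """Similarity of two vectors: close to 1 means a similar 'meaning'."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def embed(utterance, embeddings):
    """Average the vectors of the known words in an utterance."""
    words = utterance.lower().split()
    vectors = [embeddings[word] for word in words if word in embeddings]
    if not vectors:
        return None  # none of the words are in the dictionary
    return [sum(column) / len(vectors) for column in zip(*vectors)]


similar = cosine_similarity(toy_embeddings['noob'], toy_embeddings['n00b'])
different = cosine_similarity(toy_embeddings['noob'], toy_embeddings['hello'])
```

With these toy values, 'noob' and 'n00b' end up far more similar to each other than either is to 'hello', even though the raw strings differ. This is the property that lets a classifier generalise across spellings and synonyms. The None returned for an utterance with no known words also shows why embeddings in the wrong language are of little use.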