Machine Learning for Sequential Data: an HMM to generate and classify reviews, Word2Vec to obtain word embeddings, an autoregressive NTPP to predict event occurrences, and hate speech detection.

msskzx/ml-sequential

Machine Learning for Sequential Data

Outline

  • Hidden Markov Model
  • Word2Vec
  • Neural Temporal Point Processes
  • Hate Speech Detection
  • Toxic Speech Classification

Hidden Markov Model

Implemented a Hidden Markov Model (HMM) to generate and classify reviews.
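The README does not say how the classification is done. One common approach, assumed here, is to fit one HMM per class and assign a review to the class whose model gives it the highest likelihood. The sketch below computes that likelihood with the scaled forward algorithm; the function names and the toy parameters are hypothetical and not taken from the repository.

```python
import math

def forward_log_likelihood(obs, pi, A, B):
    """Return log P(obs) under an HMM using the scaled forward algorithm.

    obs: list of observation (word) indices
    pi:  initial state probabilities, pi[i]
    A:   transition probabilities, A[i][j] = P(next state j | state i)
    B:   emission probabilities, B[i][o] = P(observation o | state i)
    """
    n_states = len(pi)
    # Initialisation: start in state i and emit the first observation.
    alpha = [pi[i] * B[i][obs[0]] for i in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Recursion: move from every state i to state j, then emit obs[t].
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j][obs[t]]
                for j in range(n_states)
            ]
        # Rescale to avoid underflow; the log of the scales sums to log P(obs).
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik

def classify(obs, models):
    """Pick the label whose HMM assigns the highest likelihood to obs."""
    return max(models, key=lambda label: forward_log_likelihood(obs, *models[label]))

# Toy (hypothetical) models over a two-word vocabulary: word 0 and word 1.
models = {
    "positive": ([0.6, 0.4], [[0.7, 0.3], [0.4, 0.6]], [[0.8, 0.2], [0.6, 0.4]]),
    "negative": ([0.6, 0.4], [[0.7, 0.3], [0.4, 0.6]], [[0.2, 0.8], [0.4, 0.6]]),
}
predicted = classify([0, 0, 0], models)
print(predicted)
```

In practice the parameters would be estimated from the reviews of each class (for example with Baum-Welch) rather than written by hand.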

Word2Vec

Implemented Word2Vec to obtain word embeddings.
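The README does not state which Word2Vec variant is used. As a minimal sketch, assuming the skip-gram formulation, the training data consists of (center, context) word pairs drawn from a sliding window, and the learned vectors are typically compared with cosine similarity. The function names and the window size are illustrative assumptions.

```python
import math

def skipgram_pairs(tokens, window=2):
    """Build (center, context) training pairs from a list of tokens."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

pairs = skipgram_pairs(["the", "food", "was", "great"], window=1)
print(pairs)
```

A skip-gram model is then trained to predict the context word from the center word, and the rows of its input weight matrix serve as the embeddings.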

Neural Temporal Point Processes

Implemented Neural Temporal Point Processes (NTPP) to model the occurrence times of events.
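A temporal point process is usually trained by maximising the log-likelihood of the observed event times. In an autoregressive NTPP, a network encodes the event history and outputs the parameters of the distribution of the next inter-event time. The sketch below assumes the simplest such distribution, an exponential with one rate per interval. In a real model the rates would come from the network; here they are passed in directly, and the function name is hypothetical.

```python
import math

def tpp_log_likelihood(arrival_times, rates, t_end):
    """Log-likelihood of events on [0, t_end] with exponential inter-event times.

    arrival_times: increasing event times t_1 < ... < t_n
    rates:         n + 1 positive rates; rates[i] governs the interval that
                   starts at event i (or at time 0 for i = 0)
    t_end:         end of the observation window
    """
    prev = 0.0
    log_lik = 0.0
    for t, lam in zip(arrival_times, rates):
        tau = t - prev
        # Log-density of an exponential inter-event time: log(lam) - lam * tau.
        log_lik += math.log(lam) - lam * tau
        prev = t
    # No event occurred between the last event and t_end (survival term).
    log_lik -= rates[len(arrival_times)] * (t_end - prev)
    return log_lik

ll = tpp_log_likelihood([1.0, 2.5], rates=[2.0, 2.0, 2.0], t_end=4.0)
print(ll)
```

With a constant rate this reduces to the homogeneous Poisson process log-likelihood, n log(λ) − λT, which is a convenient sanity check.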

Hate Speech Detection

Implemented a Fully Connected Neural Network to detect hate speech in tweets.

  • Tweet labels: RACIST, SEXIST, NEITHER

  • Data Preprocessing

    • LabelEncoder for labels
    • Universal Sentence Encoder to get text embeddings
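The pipeline above can be sketched in plain Python. The label encoding mirrors what scikit-learn's LabelEncoder does (sorted class names mapped to integers), and the network is a forward pass through one hidden layer. The hidden size, the random weights, and the stand-in embedding are assumptions made for illustration; only the 512-dimensional output of the Universal Sentence Encoder is a known property of that model.

```python
import math
import random

def fit_label_encoder(labels):
    """Map each distinct label to an integer, in sorted order."""
    classes = sorted(set(labels))
    return classes, {c: i for i, c in enumerate(classes)}

def dense(x, weights, bias):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(v):
    return [max(0.0, a) for a in v]

def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]

def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (untrained, for illustration)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

rng = random.Random(0)
EMBED_DIM = 512   # Universal Sentence Encoder embedding size
HIDDEN_DIM = 16   # hypothetical hidden layer width

classes, to_id = fit_label_encoder(["RACIST", "SEXIST", "NEITHER", "SEXIST"])
W1, b1 = init_layer(HIDDEN_DIM, EMBED_DIM, rng)
W2, b2 = init_layer(len(classes), HIDDEN_DIM, rng)

# Stand-in for a real sentence embedding of one tweet.
embedding = [rng.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]
probs = softmax(dense(relu(dense(embedding, W1, b1)), W2, b2))
print(dict(zip(classes, probs)))
```

Training would minimise the cross-entropy between `probs` and the encoded label; a deep learning framework would normally handle that step.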

Toxic Speech Classification

Employed DistilBERT to classify tweets and detect toxic speech.

  • Tweet labels: none, racism, sexism

  • Data Preprocessing

    • LabelEncoder for labels
    • BertTokenizer to get text tokens
    • Padding of token sequences to a common length
    • CustomDataset to wrap the encoded tweets and labels
    • Train/validation/test split: 60/20/20
  • Model: DistilBERT

  • Explanation using SHAP

[Figure: SHAP explanation for the sexism class]
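Two of the preprocessing steps listed above, padding and the 60/20/20 split, can be sketched without any library. The function names, the padding ID of 0, the fixed seed, and the token IDs in the example are assumptions for illustration; in the actual project the tokenizer and dataset classes would normally take care of this.

```python
import random

def pad_sequences(seqs, max_len, pad_id=0):
    """Truncate or right-pad token ID lists to max_len and build attention masks."""
    padded, masks = [], []
    for ids in seqs:
        ids = ids[:max_len]
        n_pad = max_len - len(ids)
        padded.append(ids + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks padding the model should ignore.
        masks.append([1] * len(ids) + [0] * n_pad)
    return padded, masks

def split_60_20_20(items, seed=0):
    """Shuffle, then split into train/validation/test at 60/20/20."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = len(items) * 6 // 10
    n_val = len(items) * 2 // 10
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test

# Made-up token IDs standing in for tokenizer output.
input_ids, attention_mask = pad_sequences([[11, 12, 13], [21]], max_len=4)
train, val, test = split_60_20_20(range(10))
print(input_ids, attention_mask, len(train), len(val), len(test))
```

Using integer arithmetic for the split sizes avoids floating-point rounding surprises, and any remainder falls into the test set.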
