Update README.md

UsamaI000 · Jul 9, 2020 · b6eb268
1 parent 58edb3a
commit b6eb268
Showing 1 changed file with 84 additions and 2 deletions.
diff --git a/README.md b/README.md
@@ -35,10 +35,92 @@ In this project, sentiment analysis is done on Covid19 related tweets from diffe
   <b> Link: https://drive.google.com/drive/folders/1dVr4yYlptJefiooO_lyvGzyPSha44QNF?usp=sharing </b>
 
 ## Proposed Solution
-The first step is to clean up the raw text data.  In tweets,several stop words needs to be removed e.g.  prepositions,mentions,  hashtags,  URLs,  etc.   After a cleanup,  the dataneeds  to  be  converted  in  vector  form  to  feed  to  a  DeepNeural Network.  For the word2vec conversion, skip grammodel  is  used. This  model  learns  the  vector  representa-tion from the raw data using the similarity between severalwords  based  in  their  context. The  objective of the Skip-gram model is to learn word representations that are usefulfor predicting the nearby words in a document. Formally, given a sequence of training words/sentence, the objective of the Skip-gram model is to maximize the average log probability.
+The first step is to clean up the raw text data. Tweets contain stop words (e.g. prepositions) as well as mentions, hashtags, URLs, etc., which need to be removed. After cleanup, the data needs to be converted to vector form so that it can be fed to a deep neural network. For the word2vec conversion, the skip-gram model is used. This model learns vector representations from the raw data using the similarity between words based on their context. The objective of the skip-gram model is to learn word representations that are useful for predicting the nearby words in a document. Formally, given a sequence of training words (a sentence), the objective of the skip-gram model is to maximize the average log probability of the context words given each center word.
 
 <p align="center">
   <img src="https://github.com/UsamaI000/G2H_Project_DLSpring2020/blob/master/images/w2v.png">
 </p>
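
For reference, the skip-gram objective described above is usually written as follows in the word2vec literature. The notation is an assumption chosen here and may differ from the figure: $T$ is the number of training words, $c$ is the context window size, $w_t$ is the word at position $t$, $v_w$ and $v'_w$ are the input and output vectors of word $w$, and $W$ is the vocabulary size.

```latex
\frac{1}{T} \sum_{t=1}^{T} \sum_{\substack{-c \le j \le c \\ j \ne 0}} \log p(w_{t+j} \mid w_t),
\qquad
p(w_O \mid w_I) = \frac{\exp\left( {v'_{w_O}}^{\top} v_{w_I} \right)}{\sum_{w=1}^{W} \exp\left( {v'_{w}}^{\top} v_{w_I} \right)}
```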
 
-After the conversion of words to usable representation,the next step is to feed it to a classifier. RNN and LSTM are commonly used to extract the global information fromthe data.  RCNN on the other hand, maintains the local in-formation which signifies the prominent features within thelimited  context  of  the  document. In this way, an overallresponse  can  be  pooled  at  the  end  which  can  better  helpduring the classifications. In this model, we use a recurrentarchitecture, which is a bidirectional recurrent network, tocapture the contexts. The recurrent structure can obtain allclin a forward scan of the text andcrin a backward scan of the text. After we obtain the representation of the word, 
+After the words are converted to a usable representation, the next step is to feed them to a classifier. RNN and LSTM are commonly used to extract global information from the data. RCNN, on the other hand, also maintains local information, which captures the prominent features within the limited context of the document. In this way, an overall response can be pooled at the end, which helps during classification. In this model, we use a recurrent architecture, a bidirectional recurrent network, to capture the contexts. The recurrent structure obtains the left context of each word in a forward scan of the text and the right context in a backward scan of the text. After we obtain the representation of each word, we pass it to the max-pool layer, which keeps the most dominant features; these are then passed to the FC layer for classification.
+
+<p align="center">
+  <img src="https://github.com/UsamaI000/G2H_Project_DLSpring2020/blob/master/images/architecture.jpeg">
+</p>
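
The max-pool step described above can be illustrated with a minimal sketch in plain Python. It is not the project's implementation: the function name `max_pool_over_time` and the numbers are made up, and a real model would do this on tensors inside a deep learning framework. The point it shows is that an element-wise maximum over per-word feature vectors turns a tweet of any length into one fixed-size vector for the FC layer.

```python
def max_pool_over_time(word_features):
    """Element-wise maximum over a sequence of per-word feature vectors.

    word_features: non-empty list of equal-length lists of floats,
    one inner list per word. Returns one list with the same length
    as each word vector.
    """
    if not word_features:
        raise ValueError("need at least one word vector")
    # zip(*...) regroups the values by feature dimension, so each
    # "column" holds one feature across all words in the tweet.
    return [max(column) for column in zip(*word_features)]


# Three words, each with a 4-dimensional feature vector (made-up numbers).
sentence = [
    [0.1, -0.3, 0.7, 0.0],
    [0.5, 0.2, -0.1, 0.4],
    [-0.2, 0.9, 0.3, 0.1],
]
pooled = max_pool_over_time(sentence)
```

Each position of `pooled` holds the strongest response to one feature anywhere in the tweet, regardless of which word produced it.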
+
+## Training Setup
+We trained this RCNN model on the Covid-19 UCD dataset, which has five emotion classes: anger, fear, sadness, confident and analytical. We performed a total of 5 experiments. The initial two experiments compared LSTM and RCNN on a Twitter sentiment dataset, Sentiment140. The other experiments were done on the Covid-19 UCD data with two loss functions, Cross Entropy and Focal Loss. The last experiment used Weighted Cross Entropy to handle dataset imbalance.
+
+The following configuration was used for final model training:
+  - Batch Size: 64
+  - Embedding Dimension: 300
+  - Embedding Layers: 3
+  - Learning rate: 0.005
+  - Optimizer: SGD
+  - Loss: Weighted Cross Entropy
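
The two loss functions named above can be sketched for a single example in plain Python. This is an illustration under assumptions, not the project's training code: the function names, the default `gamma=2.0`, and the inverse-frequency weighting heuristic are all chosen here, since the README does not state which weights or focusing parameter were used.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_cross_entropy(logits, target, class_weights):
    """Cross entropy for one example, scaled by the weight of the true class."""
    p_true = softmax(logits)[target]
    return -class_weights[target] * math.log(p_true)


def focal_loss(logits, target, gamma=2.0):
    """Focal loss for one example: -(1 - p_t)^gamma * log(p_t).

    gamma = 0 reduces to plain cross entropy; larger gamma down-weights
    examples the model already classifies confidently.
    """
    p_true = softmax(logits)[target]
    return -((1.0 - p_true) ** gamma) * math.log(p_true)


def inverse_frequency_weights(class_counts):
    """One common heuristic (an assumption here): total / (num_classes * count)."""
    total = sum(class_counts)
    n = len(class_counts)
    return [total / (n * c) for c in class_counts]
```

With this weighting heuristic, rare classes get weights above 1 and frequent classes get weights below 1, so mistakes on under-represented emotions cost more during training.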
+
+  ### Experiments
+   - Experiment 1: Sentiment140 Dataset Performed on LSTM and RCNN
+      - With stop words
+      - Without stop words
+   - Experiment 2: Covid UCD Challenge Performed on LSTM and RCNN
+      - With stop words
+      - Without stop words
+   - Experiment 3: Training model on Covid-19 UCD data using Focal Loss
+   - Experiment 4: Training model on Covid-19 UCD data using Weighted Cross Entropy and Focal Loss to handle imbalanced data.
+     - With stop words
+     - Without stop words
+   - Experiment 5: Training on Best performing model.
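
The "with stop words" and "without stop words" variants above both depend on the cleanup step from the Proposed Solution section. A minimal sketch of such a cleanup, assuming regex-based removal of URLs, mentions and hashtags, could look like this. The function `clean_tweet`, the regular expressions, and the tiny `STOP_WORDS` set are hypothetical; the project's actual rules and stop-word list are not given in this README.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"a", "an", "the", "in", "on", "at", "of", "to", "for", "is", "are"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
NON_ALPHA_RE = re.compile(r"[^a-z\s]")


def clean_tweet(text, remove_stop_words=True):
    """Lowercase a tweet, strip URLs, mentions, hashtags and punctuation,
    and optionally drop stop words. Returns a list of tokens."""
    text = text.lower()
    # Remove URLs first so their punctuation does not leave stray fragments.
    for pattern in (URL_RE, MENTION_RE, HASHTAG_RE):
        text = pattern.sub(" ", text)
    text = NON_ALPHA_RE.sub(" ", text)
    tokens = text.split()
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return tokens
```

Calling `clean_tweet(text, remove_stop_words=False)` gives the "with stop words" variant, so both experiment settings can share one preprocessing function.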
+
+## Results
+<p align="center">
+  <img src="https://github.com/UsamaI000/G2H_Project_DLSpring2020/blob/master/images/Capture.PNG">
+</p>
+
+## Analysis
+
+  ### Date wise trend
+  We analyzed the model's predictions on the tweets to see how people felt (anger, sadness, fear, etc.) in different countries during Covid-19. The tweets were gathered from February to July. The figures below show the trend.
+
+  <p align="center"> <img src="https://github.com/UsamaI000/G2H_Project_DLSpring2020/blob/master/images/datewise_country_emotion_Pakistan.png"> </p>
+
+
+  <p align="center"> <img src="https://github.com/UsamaI000/G2H_Project_DLSpring2020/blob/master/images/datewise_country_emotion_Canada.png"> </p>
+
+
+  <p align="center"> <img src="https://github.com/UsamaI000/G2H_Project_DLSpring2020/blob/master/images/datewise_country_emotion_India.png"> </p>
+
+
+  <p align="center"> <img src="https://github.com/UsamaI000/G2H_Project_DLSpring2020/blob/master/images/datewise_country_emotion_Nigeria.png"> </p>
+
+
+  <p align="center"> <img src="https://github.com/UsamaI000/G2H_Project_DLSpring2020/blob/master/images/datewise_country_emotion_United Kingdom.png"> </p>
+
+
+  <p align="center"> <img src="https://github.com/UsamaI000/G2H_Project_DLSpring2020/blob/master/images/datewise_country_emotion_United States.png"> </p>
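
Counts of the kind plotted in the date-wise figures above could be computed along these lines. This is a sketch only: the function `daily_emotion_counts`, the tuple layout, and the sample rows are assumptions made here, not the project's analysis code or data.

```python
from collections import Counter, defaultdict


def daily_emotion_counts(predictions):
    """Group predicted emotions by country and date.

    predictions: iterable of (country, date_string, emotion) tuples.
    Returns {country: {date_string: Counter({emotion: count})}}.
    """
    trend = defaultdict(lambda: defaultdict(Counter))
    for country, date, emotion in predictions:
        trend[country][date][emotion] += 1
    return trend


# Made-up rows for illustration; not taken from the project's data.
rows = [
    ("Pakistan", "2020-03-01", "fear"),
    ("Pakistan", "2020-03-01", "confident"),
    ("Pakistan", "2020-03-01", "fear"),
    ("Canada", "2020-03-02", "sadness"),
]
trend = daily_emotion_counts(rows)
```

Each inner `Counter` can then be plotted as one point per emotion on that country's timeline.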
+
+
+  ### Emotion in different countries
+  The plot below shows the emotions of people in different countries towards Covid-19. It shows that most people in these countries were confident during the global pandemic. Fear was also present, and its level kept changing over the timeline.
+
+  <p align="center"> <img src="https://github.com/UsamaI000/G2H_Project_DLSpring2020/blob/master/images/country_emotion.png"> </p>
+
+
+  ### Deaths in countries
+  We also analyzed deaths per day due to Covid-19 in different countries.
+
+  <p align="center"> <img src="https://github.com/UsamaI000/G2H_Project_DLSpring2020/blob/master/images/pakistan.PNG"> </p>
+
+
+  <p align="center"> <img src="https://github.com/UsamaI000/G2H_Project_DLSpring2020/blob/master/images/canada.PNG"> </p>
+
+
+  <p align="center"> <img src="https://github.com/UsamaI000/G2H_Project_DLSpring2020/blob/master/images/india.PNG"> </p>
+
+
+  <p align="center"> <img src="https://github.com/UsamaI000/G2H_Project_DLSpring2020/blob/master/images/nigeria.PNG"> </p>
+
+
+  <p align="center"> <img src="https://github.com/UsamaI000/G2H_Project_DLSpring2020/blob/master/images/uk.PNG"> </p>
+
+
+  <p align="center"> <img src="https://github.com/UsamaI000/G2H_Project_DLSpring2020/blob/master/images/us.PNG"> </p>