This repository collects the assignments implemented for the Big Data Analytics lecture at the University of Trier.
Assignment 1 Data Quality: Reading, analyzing, and plotting data using Python
Assignment 2 Evaluation of Blocking Strategies with Python
Assignment 3 Hadoop: MapReduce, implementing WordCount, Grep and Inverted Index
Assignment 4 Hadoop: Blocking for entity resolution and Aggregation
Assignment 5 Hadoop: Joins and N-grams
Assignment 6 Hadoop: Shingling and Text Similarity in MapReduce
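
As a rough illustration of the idea behind Assignment 6, the sketch below computes word-level shingles and the Jaccard similarity of two texts in plain Python. It is not the assignment's Hadoop code: the function names `shingles` and `jaccard`, the choice of word-level (rather than character-level) shingles, and the default shingle size `k=3` are all assumptions made for this example.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles of a text (hypothetical helper)."""
    words = text.lower().split()
    # Slide a window of k words over the text; short texts yield an empty set.
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two shingle sets."""
    if not a and not b:
        return 0.0  # assumption: two empty sets count as dissimilar
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    doc1 = "the quick brown fox jumps over the lazy dog"
    doc2 = "the quick brown fox leaps over the lazy dog"
    print(jaccard(shingles(doc1), shingles(doc2)))
```

In a MapReduce setting, the shingle extraction would typically run in the map phase and the set comparison in a later phase; how the assignment splits the work is not specified here.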