2IMW10 - Data Engineering

Project for a course at Eindhoven University of Technology.

Course description

We study models of contemporary data-intensive systems and their practical use. The models covered include graph databases, data warehousing and online analytical processing (ROLAP, MOLAP, etc.), document databases (NoSQL, JSON stores, etc.), parallel and distributed data processing (MapReduce, etc.), and deductive databases (Datalog). We discuss why these models were introduced, their relative advantages and disadvantages, how to use them in practice, and, at a high level, how they are implemented. Unlike the course Database Technology (2ID35), which focuses primarily on system internals and their efficient implementation at a lower level, the primary goal of this course is to develop the practical ability to engineer non-trivial data-intensive applications based on a solid understanding of the underlying engineering principles. Towards this goal, hands-on practical assignment(s) using contemporary frameworks and technologies are a central component of the course.

Project description

Our goal is to visualize the evolution (e.g. over time) of communities in the co-authorship dataset. To do this, we use edge metadata that records when each edge was added to the graph.
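
The idea of using edge timestamps to look at communities at different points in time can be illustrated with a small, dependency-free sketch. This is not the project's implementation: it uses plain Java rather than Flink or Gelly, and it treats connected components as a simple stand-in for communities. The names `SnapshotCommunities`, `TimedEdge`, `addedAt` and `countCommunities` are hypothetical and chosen only for this example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SnapshotCommunities {

    /** Hypothetical edge record: two author ids plus the time the edge was added. */
    static final class TimedEdge {
        final int a;
        final int b;
        final long addedAt;

        TimedEdge(int a, int b, long addedAt) {
            this.a = a;
            this.b = b;
            this.addedAt = addedAt;
        }
    }

    /** Union-find lookup with path compression; registers x on first use. */
    static int find(Map<Integer, Integer> parent, int x) {
        parent.putIfAbsent(x, x);
        int root = x;
        while (parent.get(root) != root) {
            root = parent.get(root);
        }
        // Point every node on the path directly at the root.
        while (parent.get(x) != root) {
            int next = parent.get(x);
            parent.put(x, root);
            x = next;
        }
        return root;
    }

    /**
     * Counts connected components in the snapshot that contains only
     * edges added at or before the given time. Assumption: connected
     * components approximate communities for illustration purposes.
     */
    static int countCommunities(List<TimedEdge> edges, long time) {
        Map<Integer, Integer> parent = new HashMap<>();
        for (TimedEdge e : edges) {
            if (e.addedAt > time) {
                continue; // edge does not exist yet in this snapshot
            }
            int ra = find(parent, e.a);
            int rb = find(parent, e.b);
            if (ra != rb) {
                parent.put(ra, rb);
            }
        }
        // A root is a vertex that is its own parent; one root per component.
        int count = 0;
        for (Map.Entry<Integer, Integer> entry : parent.entrySet()) {
            if (entry.getKey().equals(entry.getValue())) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Made-up toy data, not taken from the co-authorship dataset.
        List<TimedEdge> edges = List.of(
                new TimedEdge(1, 2, 2001L),
                new TimedEdge(3, 4, 2002L),
                new TimedEdge(2, 3, 2005L));

        for (long year : new long[] {2001L, 2003L, 2005L}) {
            System.out.println(year + ": " + countCommunities(edges, year));
        }
    }
}
```

With the toy data above, the snapshot at 2003 contains two separate groups, and the edge added in 2005 merges them into one. Plotting such per-snapshot values over time is one simple way to show how communities evolve.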

Libraries

  • Apache Flink
  • Gelly
  • GraphStream
  • XChart

Authors

S. Luijten, G. Mak & B. Lyu