This project aims to predict the delays on the Yellow taxi dataset, by implementing an application based on Apache Flink.

angeligareta/flink-overview

Flink Overview

Final project for the Cloud Computing and Big Data Ecosystems Design subject of the EIT Digital Data Science master's programme at UPM.

Aim

This project aims to predict delays on the Yellow taxi dataset by implementing an application based on Apache Flink. The goal is to report, for each hour and each vendorID, the trips that end at JFK airport with two or more passengers.

The output format is: vendorID, tpep_pickup_datetime, tpep_dropoff_datetime, passenger_count.
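
The selection described above can be sketched in plain Java 8, without the Flink runtime, to make the filtering and grouping logic concrete. This is only an illustration under stated assumptions, not the project's actual pipeline: the class name `JfkTripSketch`, the CSV column positions, the use of `RatecodeID == 2` to recognise a JFK trip, grouping by the drop-off hour, and the sample rows are all hypothetical choices that the README does not confirm.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class JfkTripSketch {
    // Assumed column positions in each CSV row (not stated in the README).
    static final int VENDOR_ID = 0, PICKUP = 1, DROPOFF = 2, PASSENGERS = 3, RATE_CODE = 5;
    // Assumption: a RatecodeID of 2 marks a trip ending at JFK.
    static final String JFK_RATE_CODE = "2";
    static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Keeps rows that look like JFK trips with two or more passengers.
    // Header or blank lines are rejected before any number is parsed.
    static boolean isJfkGroupTrip(String[] f) {
        return f.length > RATE_CODE
                && JFK_RATE_CODE.equals(f[RATE_CODE].trim())
                && Integer.parseInt(f[PASSENGERS].trim()) >= 2;
    }

    // Builds a "vendorID@hour" key; the hour is taken from the drop-off time (assumption).
    static String hourKey(String[] f) {
        LocalDateTime hour = LocalDateTime.parse(f[DROPOFF].trim(), FMT)
                .truncatedTo(ChronoUnit.HOURS);
        return f[VENDOR_ID].trim() + "@" + hour;
    }

    // Renders a row as: vendorID, tpep_pickup_datetime, tpep_dropoff_datetime, passenger_count.
    static String format(String[] f) {
        return String.join(",", f[VENDOR_ID].trim(), f[PICKUP].trim(),
                f[DROPOFF].trim(), f[PASSENGERS].trim());
    }

    // Filters the rows and groups the formatted matches per vendor and hour.
    static Map<String, List<String>> report(List<String> csvLines) {
        return csvLines.stream()
                .map(line -> line.split(","))
                .filter(JfkTripSketch::isJfkGroupTrip)
                .collect(Collectors.groupingBy(JfkTripSketch::hourKey, TreeMap::new,
                        Collectors.mapping(JfkTripSketch::format, Collectors.toList())));
    }

    public static void main(String[] args) {
        // Made-up sample rows, for illustration only.
        List<String> sample = Arrays.asList(
                "1,2017-01-01 10:05:00,2017-01-01 10:45:00,2,17.5,2",
                "1,2017-01-01 10:10:00,2017-01-01 10:50:00,1,17.0,2",
                "2,2017-01-01 11:00:00,2017-01-01 11:40:00,3,18.0,2");
        report(sample).forEach((key, trips) -> System.out.println(key + " -> " + trips));
    }
}
```

In a Flink job the same predicate and key function could plausibly be passed as lambdas to the stream's filter and keying steps, with an hourly window replacing the in-memory grouping; how the original project wires this up is not shown here.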

Tools

The project is written entirely in Java 8, and the Apache Flink pipeline is expressed with lambda expressions.

Authors
