shpsi/Real-Time-Log-Analytics-using-Cloud-computing

Cloud-Computing

As part of a Cloud Computing course, we completed a series of bi-weekly projects covering Hadoop and MapReduce, Apache Spark, and other topics.

We developed Spark programs to perform data analytics on the 'hetrec2011-lastfm-2k' dataset, which contains social networking, tagging, and music artist listening information for a set of 2K users of the Last.fm online music system. We also developed Spark programs that perform real-time log analysis and report the execution time of data processing with and without a cached RDD (resilient distributed dataset).
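The cached versus uncached comparison can be sketched roughly as follows. This is a minimal illustration, not the code from this repository: it assumes the logs are in Apache Common Log Format, and the names `parse_log_line`, `timed`, `run_actions`, and `main` are hypothetical. The idea is to run the same two actions over an RDD once without caching (so the file is re-read and re-parsed for each action) and once with `cache()` (so the second action can reuse the in-memory partitions).

```python
import re
import time

# Assumption: Apache Common Log Format lines, e.g.
# 127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326
LOG_PATTERN = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) (\S+)$'
)


def parse_log_line(line):
    """Return (host, status) for a well-formed line, or None otherwise."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    return match.group(1), int(match.group(5))


def timed(action):
    """Run a zero-argument callable and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = action()
    return result, time.perf_counter() - start


def run_actions(rdd):
    """Run two actions over the same RDD and return the total elapsed seconds."""
    _, t_total = timed(rdd.count)
    _, t_errors = timed(lambda: rdd.filter(lambda rec: rec[1] >= 500).count())
    return t_total + t_errors


def main(log_path):
    # Imported here so the helpers above work without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("log-cache-timing").getOrCreate()
    sc = spark.sparkContext

    def build():
        # Each call creates a fresh lineage: read, parse, drop malformed lines.
        return (sc.textFile(log_path)
                  .map(parse_log_line)
                  .filter(lambda rec: rec is not None))

    # Without cache(): every action recomputes the RDD from the input file.
    uncached_seconds = run_actions(build())

    # With cache(): the first action populates the cache, the second reuses it.
    cached = build().cache()
    cached_seconds = run_actions(cached)
    cached.unpersist()

    print(f"without cache: {uncached_seconds:.3f}s")
    print(f"with cache:    {cached_seconds:.3f}s")
    spark.stop()
```

Calling `main` with the path to a log file (local or on HDFS) prints both timings. The gap generally widens as more actions reuse the same RDD, since the cost of populating the cache is paid only once.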

We configured a Spark distribution on top of a two-node Hadoop cluster and used YARN to schedule and run Spark applications on it.
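Submitting an application to such a setup might look like the sketch below, which assembles a `spark-submit` command line targeting YARN. The helper `build_submit_command`, the script name `log_analysis.py`, the input path, and the resource values are all illustrative assumptions, not settings taken from this project. Only the `spark-submit` options themselves (`--master yarn`, `--deploy-mode`, `--num-executors`, `--executor-memory`, `--executor-cores`) are standard.

```python
import shlex


def build_submit_command(app_path, app_args=(), deploy_mode="cluster",
                         num_executors=2, executor_memory="1g",
                         executor_cores=1):
    """Build a spark-submit argument list for running an app on YARN."""
    if deploy_mode not in ("client", "cluster"):
        raise ValueError("deploy_mode must be 'client' or 'cluster'")
    command = [
        "spark-submit",
        "--master", "yarn",              # let YARN schedule the application
        "--deploy-mode", deploy_mode,    # where the driver runs
        "--num-executors", str(num_executors),
        "--executor-memory", executor_memory,
        "--executor-cores", str(executor_cores),
        app_path,
    ]
    command.extend(app_args)
    return command


# Hypothetical script and input path, for illustration only.
command = build_submit_command("log_analysis.py", ["hdfs:///logs/access.log"])
print(shlex.join(command))
```

The printed command can be run from a node where `HADOOP_CONF_DIR` or `YARN_CONF_DIR` points at the cluster's configuration directory, which is how `spark-submit` locates the YARN ResourceManager.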
