Synthetic data generators for simulating real-time data and work loads

pradeeppanga/generator

 
 


Cloudwick Labs Synthetic Data Generators

These data generators mock several real-life situations. They are purpose-built for research and development of big data use cases.

| Generator | Description | Schema | Scope |
| --- | --- | --- | --- |
| log | Mocks the logs generated by Apache httpd | Log Schema | 1. Real-time data ingestion (using Flume & MapReduce)<br>2. Real-time analytics (using Storm and Kafka)<br>3. Clickstream analytics<br>4. NoSQL |
| odvs | Mocks the data generated by a real-life on-demand video service provider such as Netflix, Hulu, or Amazon Prime | ODVS Schema | 1. NoSQL<br>2. Analytics using MapReduce/Hive |
| osge | Mocks the data generated by an online social gaming entertainment provider such as Second Life, IMVU, Onverse... | OSGE Schema | 1. NoSQL<br>2. Analytics using MapReduce/Hive |
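
To make the kind of data the `log` generator mocks more concrete, here is a minimal Python sketch that emits one synthetic line in the Apache Combined Log Format. It is an illustration only and not this project's implementation: the function name `fake_log_line` and the sample value pools are hypothetical, and the real generator's fields and distributions may differ.

```python
import random
from datetime import datetime, timezone

# Hypothetical sample pools, chosen only for illustration.
PATHS = ["/", "/index.html", "/about", "/products/42"]
METHODS = ["GET", "POST"]
STATUSES = [200, 200, 200, 301, 404, 500]  # 200 repeated so it is drawn more often
AGENTS = ["Mozilla/5.0", "curl/8.0"]


def fake_log_line(rng, when):
    """Return one synthetic access-log line in Apache Combined Log Format."""
    ip = ".".join(str(rng.randint(1, 254)) for _ in range(4))
    timestamp = when.strftime("%d/%b/%Y:%H:%M:%S %z")
    request = f"{rng.choice(METHODS)} {rng.choice(PATHS)} HTTP/1.1"
    status = rng.choice(STATUSES)
    size = rng.randint(200, 5000)
    agent = rng.choice(AGENTS)
    # Layout: host ident user [time] "request" status bytes "referer" "user-agent"
    return f'{ip} - - [{timestamp}] "{request}" {status} {size} "-" "{agent}"'


if __name__ == "__main__":
    rng = random.Random(42)  # seeded so repeated runs give the same lines
    now = datetime(2013, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    for _ in range(3):
        print(fake_log_line(rng, now))
```

Passing the random source and the timestamp in as arguments keeps the sketch deterministic, which makes generated data reproducible across test runs.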

### Building the project

This project requires sbt; see the sbt installation instructions.

Once sbt is installed, run the `assembly` task from the project directory to build the jar with dependencies:

    sbt assembly

### Running the generator

Use the built-in shell wrapper to start the generator:

    bin/generator --help

### Using individual generators

Each data generator has its own driver command-line interface; the following links show basic examples.

### Feedback

To request more generators, discuss your specific use cases, or leave feedback, contact support.

### License and Authors

Authors: Ashrith

Apache 2.0. Please see LICENSE.txt. All contents copyright (c) 2013, Cloudwick Labs.
