These data generators mock several real-life situations. They are purpose-built for research and development of several big data use cases.
Generator | Description | Schema | Scope |
---|---|---|---|
log | mocks the logs generated by Apache httpd | Log Schema | 1. Real-time data ingestion (using Flume & MapReduce) 2. Real-time analytics (using Storm and Kafka) 3. Click stream analytics 4. NoSQL |
odvs | mocks the data generated by a real-life on-demand video service provider like Netflix, Hulu, and Amazon Prime | ODVS Schema | 1. NoSQL 2. Analytics using MapReduce/Hive |
osge | mocks the data generated by an online social gaming entertainment provider like Second Life, IMVU, Onverse... | OSGE Schema | 1. NoSQL 2. Analytics using MapReduce/Hive |
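
To give a feel for the kind of record the `log` generator mocks, here is a minimal Scala sketch that prints a few lines in Apache httpd's Common Log Format. It is not this project's implementation: the object name `MockLogSketch`, the `mockLine` helper, and the sample paths and status codes are all hypothetical and chosen only for illustration.

```scala
import java.time.{ZoneOffset, ZonedDateTime}
import java.time.format.DateTimeFormatter
import java.util.Locale
import scala.util.Random

// Hypothetical sketch, not the project's generator.
object MockLogSketch {
  // Timestamp layout of httpd's %t field, e.g. 10/Oct/2000:13:55:36 -0700
  private val timestampFormat =
    DateTimeFormatter.ofPattern("dd/MMM/yyyy:HH:mm:ss Z", Locale.ENGLISH)

  // Illustrative sample values (assumptions, not taken from the project)
  private val paths = Vector("/index.html", "/about.html", "/products/1")
  private val statuses = Vector(200, 200, 200, 404, 500)

  // Builds one Common Log Format line: host ident user [time] "request" status bytes
  def mockLine(rng: Random, at: ZonedDateTime): String = {
    val ip = Seq.fill(4)(rng.nextInt(256)).mkString(".")
    val path = paths(rng.nextInt(paths.size))
    val status = statuses(rng.nextInt(statuses.size))
    val bytes = rng.nextInt(5000)
    s"""$ip - - [${at.format(timestampFormat)}] "GET $path HTTP/1.1" $status $bytes"""
  }

  def main(args: Array[String]): Unit = {
    val rng = new Random(42L) // fixed seed so repeated runs produce the same lines
    val now = ZonedDateTime.now(ZoneOffset.UTC)
    (1 to 5).foreach(_ => println(mockLine(rng, now)))
  }
}
```

Running `MockLogSketch.main` prints five lines that a log parser expecting Common Log Format should accept. The real generator's fields and options may differ; see the Log Schema for what it actually emits.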
###Building the project

This project requires sbt; installation instructions can be found here.
Once sbt is installed, use the `assembly` sbt task from the project path to build the jar with dependencies:

    sbt assembly
###Running the generator

Use the built-in shell wrapper to fire up the generator:

    bin/generator --help
###Using individual generators
All the data generators have a driver command-line interface; the following links show basic examples.
###Feedback

For more generators, to discuss your specific use cases, or to leave feedback, contact support.
###License and Authors
Authors: Ashrith
Apache 2.0. Please see LICENSE.txt.

All contents copyright (c) 2013, Cloudwick Labs.