A distributed, fault-tolerant event stream processing system with high-availability, rapid data retrieval, built on Apache Kafka, Spark, and Ignite.

swap-10/vortex

Vortex

Vortex is a real-time, distributed, fault-tolerant, highly scalable system for fast data stream processing.

It consumes data from Apache Kafka topics on the fly, processes it with Apache Spark (basic transformations as well as statistical ML workloads), and streams the results to an Apache Ignite cluster, which stores them as an in-memory data grid that can be persisted to disk as required.
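The Kafka-to-Spark leg of this pipeline can be sketched as below. This is a minimal illustration, not the code in `spark_main.py`: the function names, the `startingOffsets` choice, and the broker address and topic shown in the usage note are assumptions. The option keys and the `kafka` format name come from Spark's Structured Streaming Kafka integration. The Ignite sink is left out of the sketch.

```python
from typing import Any, Dict


def kafka_source_options(bootstrap_servers: str, topic: str) -> Dict[str, str]:
    """Build the option map for Spark's Kafka streaming source.

    The helper itself is hypothetical; the keys are the standard
    Spark Kafka source options.
    """
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        # Assumption: only read events produced after the query starts.
        "startingOffsets": "latest",
    }


def read_events(spark: Any, bootstrap_servers: str, topic: str) -> Any:
    """Return a streaming DataFrame with the Kafka key and value as strings.

    `spark` is expected to be a pyspark.sql.SparkSession; it is passed in
    so this module can be imported without pyspark installed.
    """
    raw = (
        spark.readStream.format("kafka")
        .options(**kafka_source_options(bootstrap_servers, topic))
        .load()
    )
    # Kafka delivers key and value as binary columns; cast them for processing.
    return raw.selectExpr(
        "CAST(key AS STRING) AS key",
        "CAST(value AS STRING) AS value",
    )
```

With a running session, `read_events(spark, "localhost:9092", "events")` would return a streaming DataFrame that the processing stage can transform before writing to Ignite. The broker address and topic name here are placeholders.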

Follow these steps to set up a minimal example with sample e-commerce data:

  • First, start the Ignite server.

  • Then start Kafka's ZooKeeper and the Kafka server.

  • Then run the Kafka producer to generate the event stream.

  • Finally, run the main Spark app using:

vortex-venv/bin/spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.5.0 spark-app/spark_main.py
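
The last step can also be scripted. The sketch below only assembles and runs the `spark-submit` command shown above; the function names are hypothetical, and it assumes the Ignite server, ZooKeeper, Kafka, and the producer are already running. The `_2.12` and `3.5.0` parts of the package coordinate should match the Scala and Spark versions of the installed Spark distribution.

```python
import subprocess
from typing import List

# Kafka connector coordinate taken from the command in this README.
KAFKA_PACKAGE = "org.apache.spark:spark-sql-kafka-0-10_2.12:3.5.0"


def spark_submit_command(
    spark_submit: str = "vortex-venv/bin/spark-submit",
    app: str = "spark-app/spark_main.py",
    package: str = KAFKA_PACKAGE,
) -> List[str]:
    """Return the argv list for launching the main Spark app."""
    return [spark_submit, "--packages", package, app]


def run_spark_app() -> int:
    """Run spark-submit from the repository root and return its exit code."""
    return subprocess.run(spark_submit_command(), check=False).returncode
```

Calling `run_spark_app()` from the repository root is equivalent to typing the command by hand. Passing the arguments as a list avoids shell quoting issues.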
