Update README.md
kwangjong authored Oct 4, 2023
1 parent 457552e commit 8b99658
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions README.md
@@ -1,7 +1,7 @@
# coinbase-real-time-data-pipeline

## Project Overview
-The Coinbase Real-time Data Pipeline is engineered to acquire real-time cryptocurrency price data from the Coinbase API. It employs Apache Spark for data processing, Cassandra for data storage, and Grafana for data visualization. This robust pipeline is containerized using Docker for seamless deployment and utilizes Kafka as a message broker to ensure low latency, scalability, and high availability.
+The Coinbase Real-time Data Pipeline is designed to retrieve real-time cryptocurrency price data from the Coinbase API. It leverages a combination of technologies, including Kafka for data queuing, Apache Spark for data processing, Cassandra for data storage, and Grafana for data visualization. This comprehensive pipeline is containerized using Docker for easy deployment and is currently set up for Minikube using Kubernetes. With minimal adjustments, it can be seamlessly transitioned to run on actual Kubernetes clusters, such as GKE or EKS, for scalability and production use.

## Screenshot

@@ -16,9 +16,10 @@ The Coinbase Real-time Data Pipeline is engineered to acquire real-time cryptocu
* [Structured Stream Processing with Scala Spark](https://kwangjong.github.io/blog/2023-09-25-Structured-Stream-Processing-with-Scala-Spark)
* [Storing processed data in Cassandra](https://kwangjong.github.io/blog/2023-09-26-Storing-processed-data-in-Cassandra)
* [Visualizing Stream Data in Cassandra Using Grafana](https://kwangjong.github.io/blog/2023-10-02-Visualizing-Stream-Data-in-Cassandra-Using-Grafana)
+* [Deploying a Real-time Data Pipeline in Minikube with Kubernetes](https://kwangjong.github.io/blog/2023-10-04-Deploying-a-Real-time-Data-Pipeline-in-Minikube-with-Kubernetes)

## Todo
-* ~~Implement container orchestration using Kubernetes~~ -> write a blog post on this
+* ~~Implement container orchestration using Kubernetes -> write a blog post on this~~
* ~~Set up a CI pipeline using GitHub Actions~~ -> write a blog post on this
* Perform code cleanup and integration testing
* Explore interesting data analysis opportunities on the collected data
