[FSTORE-1107] Pyspark streaming Tutorial (#227)
* Draft pyspark streaming

* adding `__init__.py` for module

* adding code to download libraries in Hopsworks Jupyter

* adding topic for testing

* adding notebook for creating simulated data stream

* correcting imports

* working base pyspark streaming

* adding all pipeline pyspark streaming

* updated tutorial

* fixed pyspark streaming

* clearing all outputs from notebooks

* renaming filenames in tutorial

* adding online inference pipeline instead of batch inference

* pyspark streaming online inference with deployments

* moving polars and pyspark streaming tutorial to integrations folder

* Updating readme based on moved polars tutorials and adding Pyspark streaming into readme
manu-sj authored Apr 8, 2024
1 parent 2fd7e4d commit 113e592
Showing 9 changed files with 2,383 additions and 1 deletion.
3 changes: 2 additions & 1 deletion README.md
@@ -63,7 +63,8 @@ In order to understand the tutorials you need to be familiar with general concep
- [WandB](https://github.com/logicalclocks/hopsworks-tutorials/tree/master/integrations/wandb): Build a machine learning model with Weights & Biases.
- [Great Expectations](https://github.com/logicalclocks/hopsworks-tutorials/tree/master/integrations/great_expectations): Introduction to Great Expectations concepts and classes which are relevant for integration with the Hopsworks MLOps platform.
- [Neo4j](integrations/neo4j): Perform Anti-money laundering (AML) predictions using Neo4j Graph representation of transactions.
-- [Polars](https://github.com/logicalclocks/hopsworks-tutorials/tree/master/advanced_tutorials/polars/quickstart.ipynb) : Introductory tutorial on using Polars.
+- [Polars](https://github.com/logicalclocks/hopsworks-tutorials/tree/master/integrations/polars/quickstart.ipynb) : Introductory tutorial on using Polars.
+- [PySpark Streaming](https://github.com/logicalclocks/hopsworks-tutorials/tree/master/integrations/pyspark_streaming) : Real time feature computation from streaming data using PySpark and HopsWorks Feature Store.
- [Monitoring](https://github.com/logicalclocks/hopsworks-tutorials/tree/master/integrations/monitoring): How to implement feature monitoring in your production pipeline.
- [Bytewax](https://github.com/logicalclocks/hopsworks-tutorials/tree/master/integrations/bytewax): Real time feature computation using Bytewax.
- [Apache Beam](https://github.com/logicalclocks/hopsworks-tutorials/tree/master/integrations/java/beam): Real time feature computation using Apache Beam, Google Cloud Dataflow and Hopsworks Feature Store.
File renamed without changes.