From 90ef9e785f72b881235efdd2deca0c15ff74d207 Mon Sep 17 00:00:00 2001 From: Victor Jouffrey Date: Wed, 5 Jun 2024 15:16:24 +0200 Subject: [PATCH 1/5] Add reference to Python REST client --- .../fs/feature_view/feature-server.md | 9 +++++--- .../fs/feature_view/feature-vectors.md | 23 +++++++++++++++++-- 2 files changed, 27 insertions(+), 5 deletions(-) diff --git a/docs/user_guides/fs/feature_view/feature-server.md b/docs/user_guides/fs/feature_view/feature-server.md index 0c777b769..40e97b5ae 100644 --- a/docs/user_guides/fs/feature_view/feature-server.md +++ b/docs/user_guides/fs/feature_view/feature-server.md @@ -6,10 +6,13 @@ description: Using Feature Store REST API Server for retrieving feature vectors This API server allows users to retrieve single/batch feature vectors from a feature view. -## How to use -Hopsworks 3.3 includes a preview of the feature store REST API, version 0.1.0. By default, the server listens on the `0.0.0.0:4406`. Please refer to `/srv/hops/mysql-cluster/rdrs_config.json` config file located on machines running the REST Server for additional configuration parameters. +## How to use -## Single feature vector +From Hopsworks 3.3, you can connect to the Feature Vector Server via any REST client which supports POST requests. Set the `X-API-KEY` header to your Hopsworks API Key and send the request with a JSON body, [single](#request) or [batch](#request-1). By default, the server listens on `0.0.0.0:4406` and the API version is set to `0.1.0`. Please refer to the `/srv/hops/mysql-cluster/rdrs_config.json` config file located on machines running the REST Server for additional configuration parameters. + +In Hopsworks 3.7, we introduced a Python client for the Online Store REST API Server. The Python client is available in the `hsfs` module and can be installed using `pip install hsfs`. This client can be used instead of the Online Store SQL client in the `FeatureView.get_feature_vector(s)` methods. 
Check the corresponding [documentation](./feature-vectors.md) for these methods. + +## Single feature vector ### Request diff --git a/docs/user_guides/fs/feature_view/feature-vectors.md b/docs/user_guides/fs/feature_view/feature-vectors.md index d9edf9eda..ec1acaf9f 100644 --- a/docs/user_guides/fs/feature_view/feature-vectors.md +++ b/docs/user_guides/fs/feature_view/feature-vectors.md @@ -1,7 +1,26 @@ # Feature Vectors -Once you have trained a model, it is time to deploy it. You can get back all the features required to feed into an ML model with a single method call. A feature view provides great flexibility for you to retrieve a vector (or row) of features from any environment, whether you are either inside the Hopsworks platform, a model serving platform, or in an external environment, such as your application server. Harnessing the powerful [RonDB](https://www.rondb.com/), feature vectors are served at in-memory latency. +The Hopsworks Platform integrates real-time capabilities with its Online Store. Based on [RonDB](https://www.rondb.com/), your feature vectors are served at scale at in-memory latency (~1-10ms). Check out the benchmark results [here](https://www.hopsworks.ai/post/feature-store-benchmark-comparison-hopsworks-and-feast#images-2) and the code [here](https://github.com/featurestoreorg/featurestore-benchmarks). The same Feature View which was used to create training datasets can be used to retrieve feature vectors for real-time predictions. This allows you to serve the same features to your model in training and serving, ensuring consistency and reducing boilerplate, whether you are inside the Hopsworks platform, a model serving platform, or an external environment such as your application server. -If you want to understand more about the concept of feature vectors, you can refer to [here](../../../concepts/fs/feature_view/online_api.md). +Below is a practical guide on how to use the Online Store Python and Java Client. 
The aim is to get you started quickly by providing code snippets which illustrate various use cases and functionalities of the clients. If you need to get more familiar with the concept of feature vectors, you can read this [short introduction](../../../concepts/fs/feature_view/online_api.md) first. + +## Choose your Client +The Online Store can be accessed via the **Python** or **Java** client allowing you to use your language of choice to connect to the Online Store. Additionally, the Python client provides two different implementations to fetch data: **SQL** or **REST**. The SQL client is the default implementation. It requires a direct SQL connection to your RonDB cluster and uses python asyncio to offer high performance even when your Feature View rows involve querying multiple different tables. The REST client is an alternative implementation connecting to [RonDB Feature Vector Server](./feature-server.md). Perfect if you want to avoid exposing ports of your database cluster directly to clients. This implementation is available as of Hopsworks 3.7. + +Initialise the client by calling the `init_serving` method on the Feature View object before starting to fetch feature vectors. Thiswill initialise the chose client, test the connection, as well as initialise the transformation functions if they are defined in the Feature View. + +=== "Python" +```python +# initialize the SQL client to fetch feature vectors from the Online Store +my_feature_view.init_serving() + +# or use the REST client +my_feature_view.init_serving( + init_rest_client=True, + config_rest_client={ + ... + } +) +``` ## Retrieval You can get back feature vectors from either python or java client by providing the primary key value(s) for the feature view. Note that filters defined in feature view and training data will not be applied when feature vectors are returned. 
If you need to retrieve a complete value of feature vectors without missing values, the required `entry` are [feature_view.primary_keys](https://docs.hopsworks.ai/feature-store-api/3.7/generated/api/feature_view_api/#primary_keys). Alternative, you can provide the primary key of the feature groups as the key of the entry. It is also possible to provide a subset of the entry, which will be discussed [below](#partial-feature-retrieval). From acb3c8c866bc1464f0c4e1e858709c2c2df524c0 Mon Sep 17 00:00:00 2001 From: Victor Jouffrey Date: Wed, 5 Jun 2024 15:22:28 +0200 Subject: [PATCH 2/5] Add warning about needed API key --- docs/user_guides/fs/feature_view/feature-vectors.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/docs/user_guides/fs/feature_view/feature-vectors.md b/docs/user_guides/fs/feature_view/feature-vectors.md index ec1acaf9f..f3cdb2aa7 100644 --- a/docs/user_guides/fs/feature_view/feature-vectors.md +++ b/docs/user_guides/fs/feature_view/feature-vectors.md @@ -17,11 +17,13 @@ my_feature_view.init_serving() my_feature_view.init_serving( init_rest_client=True, config_rest_client={ - ... + "api_key": "your_api_key", } ) ``` +Note that when using the REST client in the Hopsworks Cluster python environment you will need to provide an API key explicitly as JWT authentication is not supported. + ## Retrieval You can get back feature vectors from either python or java client by providing the primary key value(s) for the feature view. Note that filters defined in feature view and training data will not be applied when feature vectors are returned. If you need to retrieve a complete value of feature vectors without missing values, the required `entry` are [feature_view.primary_keys](https://docs.hopsworks.ai/feature-store-api/3.7/generated/api/feature_view_api/#primary_keys). Alternative, you can provide the primary key of the feature groups as the key of the entry. 
It is also possible to provide a subset of the entry, which will be discussed [below](#partial-feature-retrieval). From f3b7c6707e6b61dfc3e1c75dffdbcb47743aee3f Mon Sep 17 00:00:00 2001 From: Victor Jouffrey Date: Wed, 5 Jun 2024 15:39:41 +0200 Subject: [PATCH 3/5] Reorganise, example using both clients --- .../fs/feature_view/feature-vectors.md | 70 +++++++++++++------ 1 file changed, 48 insertions(+), 22 deletions(-) diff --git a/docs/user_guides/fs/feature_view/feature-vectors.md b/docs/user_guides/fs/feature_view/feature-vectors.md index f3cdb2aa7..c20627626 100644 --- a/docs/user_guides/fs/feature_view/feature-vectors.md +++ b/docs/user_guides/fs/feature_view/feature-vectors.md @@ -3,27 +3,6 @@ The Hopsworks Platform integrates real-time capabilities with its Online Store. Below is a practical guide on how to use the Online Store Python and Java Client. The aim is to get you started quickly by providing code snippets which illustrate various use cases and functionalities of the clients. If you need to get more familiar with the concept of feature vectors, you can read this [short introduction](../../../concepts/fs/feature_view/online_api.md) first. -## Choose your Client -The Online Store can be accessed via the **Python** or **Java** client allowing you to use your language of choice to connect to the Online Store. Additionally, the Python client provides two different implementations to fetch data: **SQL** or **REST**. The SQL client is the default implementation. It requires a direct SQL connection to your RonDB cluster and uses python asyncio to offer high performance even when your Feature View rows involve querying multiple different tables. The REST client is an alternative implementation connecting to [RonDB Feature Vector Server](./feature-server.md). Perfect if you want to avoid exposing ports of your database cluster directly to clients. This implementation is available as of Hopsworks 3.7. 
- -Initialise the client by calling the `init_serving` method on the Feature View object before starting to fetch feature vectors. Thiswill initialise the chose client, test the connection, as well as initialise the transformation functions if they are defined in the Feature View. - -=== "Python" -```python -# initialize the SQL client to fetch feature vectors from the Online Store -my_feature_view.init_serving() - -# or use the REST client -my_feature_view.init_serving( - init_rest_client=True, - config_rest_client={ - "api_key": "your_api_key", - } -) -``` - -Note that when using the REST client in the Hopsworks Cluster python environment you will need to provide an API key explicitly as JWT authentication is not supported. - ## Retrieval You can get back feature vectors from either python or java client by providing the primary key value(s) for the feature view. Note that filters defined in feature view and training data will not be applied when feature vectors are returned. If you need to retrieve a complete value of feature vectors without missing values, the required `entry` are [feature_view.primary_keys](https://docs.hopsworks.ai/feature-store-api/3.7/generated/api/feature_view_api/#primary_keys). Alternative, you can provide the primary key of the feature groups as the key of the entry. It is also possible to provide a subset of the entry, which will be discussed [below](#partial-feature-retrieval). @@ -212,5 +191,52 @@ You can also use the parameter to provide values for all the features which are ) ``` +## Choose the right Client + +The Online Store can be accessed via the **Python** or **Java** client allowing you to use your language of choice to connect to the Online Store. Additionally, the Python client provides two different implementations to fetch data: **SQL** or **REST**. The SQL client is the default implementation. 
It requires a direct SQL connection to your RonDB cluster and uses python asyncio to offer high performance even when your Feature View rows involve querying multiple different tables. The REST client is an alternative implementation connecting to [RonDB Feature Vector Server](./feature-server.md). Perfect if you want to avoid exposing ports of your database cluster directly to clients. This implementation is available as of Hopsworks 3.7. + +Initialise the client by calling the `init_serving` method on the Feature View object before starting to fetch feature vectors. This will initialise the chosen client, test the connection, and initialise the transformation functions registered with the Feature View. Note to use the REST client in the Hopsworks Cluster python environment you will need to provide an API key explicitly as JWT authentication is not yet supported. More configuration options can be found in the [API documentation](https://docs.hopsworks.ai/feature-store-api/3.7/generated/api/feature_view_api/#init_serving). + +=== "Python" +```python +# initialize the SQL client to fetch feature vectors from the Online Store +my_feature_view.init_serving() + +# or use the REST client +my_feature_view.init_serving( + init_rest_client=True, + config_rest_client={ + "api_key": "your_api_key", + } +) +``` +Once the client is initialised, you can start fetching feature vector(s) via the Feature View methods: `get_feature_vector(s)`. You can initialise both clients for a given Feature View and switch between them by using the force flags in the get_feature_vector(s) methods. 
+ +=== "Python" +```python +# initialize both clients and set the default to REST +my_feature_view.init_serving( + init_rest_client=True, + init_sql_client=True, + config_rest_client={ + "api_key": "your_api_key", + }, + default_client="rest" +) + +# this will fetch a feature vector via REST +try: + my_feature_view.get_feature_vector( + entry={"pk1": 1, "pk2": 2}, + ) +except TimeoutException: + # `TimeoutException` is a placeholder for the timeout error raised in your setup; on timeout, the SQL client will be used + my_feature_view.get_feature_vector( + entry={"pk1": 1, "pk2": 2}, + force_sql=True + ) +``` + ## Feature Server -In addition to Python/Java clients, from Hopsworks 3.3, a new [feature server](./feature-server.md) implemented in Go is introduced. With this new API, single or batch feature vectors can be retrieved in any programming language. +In addition to Python/Java clients, from Hopsworks 3.3, a new [feature server](./feature-server.md) implemented in Go is introduced. With this new API, single or batch feature vectors can be retrieved in any programming language. Note that you can connect to the Feature Vector Server via any REST client. However, registered transformation functions will not be applied to values in the JSON response, and values stored in Feature Groups which contain embeddings will be missing. + From a8cda944888cc9c321d062b48423717d7cbd20ef Mon Sep 17 00:00:00 2001 From: Victor Jouffrey Date: Fri, 7 Jun 2024 11:25:26 +0200 Subject: [PATCH 4/5] Added version as parameter for redirection --- docs/user_guides/fs/feature_view/feature-vectors.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/user_guides/fs/feature_view/feature-vectors.md b/docs/user_guides/fs/feature_view/feature-vectors.md index c20627626..e022548d1 100644 --- a/docs/user_guides/fs/feature_view/feature-vectors.md +++ b/docs/user_guides/fs/feature_view/feature-vectors.md @@ -4,7 +4,7 @@ The Hopsworks Platform integrates real-time capabilities with its Online Store. 
Below is a practical guide on how to use the Online Store Python and Java Client. The aim is to get you started quickly by providing code snippets which illustrate various use cases and functionalities of the clients. If you need to get more familiar with the concept of feature vectors, you can read this [short introduction](../../../concepts/fs/feature_view/online_api.md) first. ## Retrieval -You can get back feature vectors from either python or java client by providing the primary key value(s) for the feature view. Note that filters defined in feature view and training data will not be applied when feature vectors are returned. If you need to retrieve a complete value of feature vectors without missing values, the required `entry` are [feature_view.primary_keys](https://docs.hopsworks.ai/feature-store-api/3.7/generated/api/feature_view_api/#primary_keys). Alternative, you can provide the primary key of the feature groups as the key of the entry. It is also possible to provide a subset of the entry, which will be discussed [below](#partial-feature-retrieval). +You can get back feature vectors from either python or java client by providing the primary key value(s) for the feature view. Note that filters defined in feature view and training data will not be applied when feature vectors are returned. If you need to retrieve a complete value of feature vectors without missing values, the required `entry` are [feature_view.primary_keys](https://docs.hopsworks.ai/feature-store-api/{{version}}/generated/api/feature_view_api/#primary_keys). Alternative, you can provide the primary key of the feature groups as the key of the entry. It is also possible to provide a subset of the entry, which will be discussed [below](#partial-feature-retrieval). 
=== "Python" ```python @@ -38,7 +38,7 @@ You can get back feature vectors from either python or java client by providing ``` ### Required entry -Starting from python client v3.4, you can specify different values for the primary key of the same name which exists in multiple feature groups but are not joint by the same name. The table below summarises the value of `primary_keys` in different settings. Considering that you are joining 2 feature groups, namely, `left_fg` and `right_fg`, the feature groups have different primary keys, and features (`feature_*`) in each setting. Also, the 2 feature groups are [joint](https://docs.hopsworks.ai/feature-store-api/3.7/generated/api/query_api/#join) on different *join conditions* and *prefix* as `left_fg.join(right_fg, , prefix=)`. +Starting from python client v3.4, you can specify different values for the primary key of the same name which exists in multiple feature groups but are not joint by the same name. The table below summarises the value of `primary_keys` in different settings. Considering that you are joining 2 feature groups, namely, `left_fg` and `right_fg`, the feature groups have different primary keys, and features (`feature_*`) in each setting. Also, the 2 feature groups are [joint](https://docs.hopsworks.ai/feature-store-api/{{{ hopsworks_version }}}/generated/api/query_api/#join) on different *join conditions* and *prefix* as `left_fg.join(right_fg, , prefix=)`. For java client, and python client before v3.4, the `primary_keys` are the set of primary key of all the feature groups in the query. Python client is backward compatible. It means that the `primary_keys` used before v3.4 can be applied to python client of later versions as well. @@ -195,7 +195,7 @@ You can also use the parameter to provide values for all the features which are The Online Store can be accessed via the **Python** or **Java** client allowing you to use your language of choice to connect to the Online Store. 
Additionally, the Python client provides two different implementations to fetch data: **SQL** or **REST**. The SQL client is the default implementation. It requires a direct SQL connection to your RonDB cluster and uses python asyncio to offer high performance even when your Feature View rows involve querying multiple different tables. The REST client is an alternative implementation connecting to [RonDB Feature Vector Server](./feature-server.md). Perfect if you want to avoid exposing ports of your database cluster directly to clients. This implementation is available as of Hopsworks 3.7. -Initialise the client by calling the `init_serving` method on the Feature View object before starting to fetch feature vectors. This will initialise the chosen client, test the connection, and initialise the transformation functions registered with the Feature View. Note to use the REST client in the Hopsworks Cluster python environment you will need to provide an API key explicitly as JWT authentication is not yet supported. More configuration options can be found in the [API documentation](https://docs.hopsworks.ai/feature-store-api/3.7/generated/api/feature_view_api/#init_serving). +Initialise the client by calling the `init_serving` method on the Feature View object before starting to fetch feature vectors. This will initialise the chosen client, test the connection, and initialise the transformation functions registered with the Feature View. Note to use the REST client in the Hopsworks Cluster python environment you will need to provide an API key explicitly as JWT authentication is not yet supported. More configuration options can be found in the [API documentation](https://docs.hopsworks.ai/feature-store-api/{{{ hopsworks_version }}}/generated/api/feature_view_api/#init_serving). 
=== "Python" ```python From ed186c1c017fbd4804006759d274e5f26891d7f0 Mon Sep 17 00:00:00 2001 From: Victor Jouffrey <37411285+vatj@users.noreply.github.com> Date: Fri, 7 Jun 2024 11:35:01 +0200 Subject: [PATCH 5/5] Update docs/user_guides/fs/feature_view/feature-vectors.md Co-authored-by: kennethmhc --- docs/user_guides/fs/feature_view/feature-vectors.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/user_guides/fs/feature_view/feature-vectors.md b/docs/user_guides/fs/feature_view/feature-vectors.md index e022548d1..d2829cb91 100644 --- a/docs/user_guides/fs/feature_view/feature-vectors.md +++ b/docs/user_guides/fs/feature_view/feature-vectors.md @@ -4,7 +4,7 @@ The Hopsworks Platform integrates real-time capabilities with its Online Store. Below is a practical guide on how to use the Online Store Python and Java Client. The aim is to get you started quickly by providing code snippets which illustrate various use cases and functionalities of the clients. If you need to get more familiar with the concept of feature vectors, you can read this [short introduction](../../../concepts/fs/feature_view/online_api.md) first. ## Retrieval -You can get back feature vectors from either python or java client by providing the primary key value(s) for the feature view. Note that filters defined in feature view and training data will not be applied when feature vectors are returned. If you need to retrieve a complete value of feature vectors without missing values, the required `entry` are [feature_view.primary_keys](https://docs.hopsworks.ai/feature-store-api/{{version}}/generated/api/feature_view_api/#primary_keys). Alternative, you can provide the primary key of the feature groups as the key of the entry. It is also possible to provide a subset of the entry, which will be discussed [below](#partial-feature-retrieval). +You can get back feature vectors from either python or java client by providing the primary key value(s) for the feature view. 
Note that filters defined in feature view and training data will not be applied when feature vectors are returned. If you need to retrieve a complete value of feature vectors without missing values, the required `entry` are [feature_view.primary_keys](https://docs.hopsworks.ai/feature-store-api/{{{ hopsworks_version }}}/generated/api/feature_view_api/#primary_keys). Alternative, you can provide the primary key of the feature groups as the key of the entry. It is also possible to provide a subset of the entry, which will be discussed [below](#partial-feature-retrieval). === "Python" ```python