Skip to content

Commit

Permalink
adding docs for the alias function
Browse files Browse the repository at this point in the history
  • Loading branch information
manu-sj committed Dec 11, 2024
1 parent 468b809 commit bbabd3c
Show file tree
Hide file tree
Showing 3 changed files with 23 additions and 2 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,9 @@
## On Demand Transformation Function Creation


An on-demand transformation function can be created by attaching a [transformation function](../transformation_functions.md) to a feature group. Each on-demand transformation function creates one on-demand feature having the same name as the transformation function. For instance, in the example below, the on-demand transformation function `transaction_age` will generate one on-demand feature called `transaction_age`. Hence, only one-to-one or many-to-one transformation functions can be used to create an on-demand transformation functions.
An on-demand transformation function may be created by associating a [transformation function](../transformation_functions.md) with a feature group. Each on-demand transformation function generates a single on-demand feature, which, by default, is assigned the same name as the associated transformation function. For instance, in the example below, the on-demand transformation function `transaction_age` produces an on-demand feature named transaction_age. Alternatively, the name of the resulting on-demand feature can be explicitly defined using the [`alias`](../transformation_functions.md#specifying-output-features–names-for-transformation-functions) function.

It is important to note that only one-to-one or many-to-one transformation functions are compatible with the creation of on-demand transformation functions.

!!! warning "On-demand transformation"
All on-demand transformation functions attached to a feature group must have unique names and, in contrast to model-dependent transformations, they do not have access to training dataset statistics.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@ Hopsworks allows you to create a model-dependent transformation function by atta

Each model-dependent transformation function can map specific features to its arguments by explicitly providing their names as arguments to the transformation function. If no feature names are provided, the transformation function will default to using features from the feature view that match the name of the transformation function's argument.

The output columns generated by a model-dependent transformation function follows a naming convention structured as `functionName_features_outputColumnNumber` if the transformation function outputs multiple columns and `functionName_features` if the transformation function outputs one column. For instance, for the function named `add_one_multiple` that outputs multiple columns in the example given below, produces output columns that would be labeled as  `add_one_multiple_feature1_feature2_feature3_0``add_one_multiple_feature1_feature2_feature3_1` and  `add_one_multiple_feature1_feature2_feature3_2`. The function named `add_two` that outputs a single column in the example given below, produces a single output column names as `add_two_feature`.
Hopsworks by default generates default names of transformed features output by a model-dependent transformation function. The generated names follows a naming convention structured as `functionName_features_outputColumnNumber` if the transformation function outputs multiple columns and `functionName_features` if the transformation function outputs one column. For instance, for the function named `add_one_multiple` that outputs multiple columns in the example given below, produces output columns that would be labeled as  `add_one_multiple_feature1_feature2_feature3_0`,  `add_one_multiple_feature1_feature2_feature3_1` and  `add_one_multiple_feature1_feature2_feature3_2`. The function named `add_two` that outputs a single column in the example given below, produces a single output column names as `add_two_feature`. Additionally, Hopsworks also allows users to specify custom names for transformed feature using the [`alias`](../transformation_functions.md#specifying-output-features–names-for-transformation-functions) function.


=== "Python"
Expand Down
19 changes: 19 additions & 0 deletions docs/user_guides/fs/transformation_functions.md
Original file line number Diff line number Diff line change
Expand Up @@ -185,6 +185,25 @@ The `drop` parameter of the `@udf` decorator is used to drop specific column
return feature1 + 1, feature2 + 1, feature3 + 1
```

### Specifying output features names for transformation functions

The [`alias`](http://docs.hopsworks.ai/hopsworks-api/{{{hopsworks_version}}}/generated/api/transformation_functions_api/#alias) function of a transformation function allows the specification of names of transformed features generated by the transformation function. Each name must be uniques and should be at-most 63 characters long. If no name is provided via the `alias` function, Hopsworks generates default output feature names when [on-demand](./feature_group/on_demand_transformations.md) or [model-dependent](./feature_view/model-dependent-transformations.md) transformation functions are created.


=== "Python"
!!! example "Specifying output column names for transformation functions."
```python
from hopsworks import udf
import pandas as pd

@udf(return_type=[int, int, int], drop=["feature1", "feature3"])
def add_one_multiple(feature1, feature2, feature3):
return feature1 + 1, feature2 + 1, feature3 + 1

# Specifying output feature names of the transformation function.
add_one_multiple.alias("transformed_feature1", "transformed_feature2", "transformed_feature3")
```

### Training dataset statistics

A keyword argument `statistics` can be defined in the transformation function if it requires training dataset statistics for any of its arguments. The `statistics` argument must be assigned an instance of the class [`TransformationStatistics`](http://docs.hopsworks.ai/hopsworks-api/{{{hopsworks_version}}}/generated/api/transformation_statistics/) as the default value. The `TransformationStatistics` instance must be initialized using the names of the arguments requiring statistics.
Expand Down

0 comments on commit bbabd3c

Please sign in to comment.