[BUG]: Feast FeatureStore integration cannot work with a FeatureService instance #3180

Open
1 task done
aiakide opened this issue Nov 8, 2024 · 4 comments
Labels
bug Something isn't working

Comments

Contributor

aiakide commented Nov 8, 2024

Contact Details [Optional]

No response

System Information

ZENML_LOCAL_VERSION: 0.68.1
ZENML_SERVER_VERSION: 0.68.1
ZENML_SERVER_DATABASE: mysql
ZENML_SERVER_DEPLOYMENT_TYPE: other
ZENML_CONFIG_DIR: /Users/nils/Library/Application Support/zenml
ZENML_LOCAL_STORE_DIR: /Users/nils/Library/Application Support/zenml/local_stores
ZENML_SERVER_URL: http://localhost:8080
ZENML_ACTIVE_REPOSITORY_ROOT: None
PYTHON_VERSION: 3.12.0
ENVIRONMENT: native
SYSTEM_INFO: {'os': 'mac', 'mac_version': '15.0.1'}
ACTIVE_WORKSPACE: default
ACTIVE_STACK: docker-compose
ACTIVE_USER: nils
TELEMETRY_STATUS: enabled
ANALYTICS_CLIENT_ID: 5e1c16f0-8d4d-49d6-99e3-d36d764283ad
ANALYTICS_USER_ID: 2fbd0e70-bfc7-4e96-afe5-bc39d0e49a6c
ANALYTICS_SERVER_ID: 10265ea2-4472-47fd-bb52-a379ec184213
INTEGRATIONS: ['airflow', 'bitbucket', 'numpy', 'pandas', 'feast', 'github', 'kaniko', 'label_studio', 'pigeon', 'pillow', 'scipy', 'xgboost']
PACKAGES: {'deprecated': '1.2.14', 'gitpython': '3.1.43', 'mako': '1.3.6', 'markdown': '3.7', 'markupsafe': '3.0.2', 'pygithub': '2.4.0', 'pyjwt': '2.7.0', 'pymysql': '1.1.1', 'pynacl': '1.5.0', 'pyyaml': '6.0.2', 'sqlalchemy': '2.0.36', 'sqlalchemy-utils': '0.41.2', 'aiobotocore': '2.15.2', 
'aiohappyeyeballs': '2.4.3', 'aiohttp': '3.10.10', 'aioitertools': '0.12.0', 'aiosignal': '1.3.1', 'alembic': '1.8.1', 'annotated-types': '0.7.0', 'anyio': '4.6.2.post1', 'appdirs': '1.4.4', 'asttokens': '2.4.1', 'attrs': '24.2.0', 'bcrypt': '4.0.1', 'bigtree': '0.21.3', 'blinker': '1.8.2', 'boto3': 
'1.35.36', 'botocore': '1.35.36', 'cachetools': '5.5.0', 'certifi': '2024.8.30', 'cffi': '1.17.1', 'cfgv': '3.4.0', 'charset-normalizer': '3.4.0', 'click': '8.1.3', 'cloudpickle': '2.2.1', 'colorama': '0.4.6', 'comm': '0.2.2', 'contourpy': '1.3.0', 'cramjam': '2.9.0', 'cryptography': '43.0.3', 'cycler': 
'0.12.1', 'dask': '2024.8.0', 'dask-expr': '1.1.10', 'databricks-sdk': '0.36.0', 'decorator': '5.1.1', 'dill': '0.3.9', 'distlib': '0.3.9', 'distro': '1.9.0', 'docker': '7.1.0', 'executing': '2.1.0', 'fastapi': '0.110.0', 'fastparquet': '2024.5.0', 'feast': '0.41.3', 'filelock': '3.16.1', 'flask': '3.0.3',
'fonttools': '4.54.1', 'from-jupyter-to-production-ml-platform': '0.1.0', 'frozenlist': '1.5.0', 'fsspec': '2024.10.0', 'gitdb': '4.0.11', 'google-auth': '2.35.0', 'graphene': '3.4.1', 'graphql-core': '3.2.5', 'graphql-relay': '3.2.0', 'greenlet': '3.1.1', 'gunicorn': '23.0.0', 'h11': '0.14.0', 'httpcore':
'1.0.6', 'httptools': '0.6.4', 'httpx': '0.27.2', 'identify': '2.6.1', 'idna': '3.10', 'ijson': '3.3.0', 'importlib-metadata': '8.5.0', 'ipython': '8.29.0', 'ipywidgets': '8.1.5', 'itsdangerous': '2.2.0', 'jedi': '0.19.1', 'jinja2': '3.1.4', 'jmespath': '1.0.1', 'joblib': '1.4.2', 'jsonschema': '4.23.0', 
'jsonschema-specifications': '2024.10.1', 'jupyterlab-widgets': '3.0.13', 'kiwisolver': '1.4.7', 'label-studio-sdk': '1.0.7', 'locket': '1.0.0', 'lxml': '5.3.0', 'markdown-it-py': '3.0.0', 'matplotlib': '3.9.2', 'matplotlib-inline': '0.1.7', 'mdurl': '0.1.2', 'mlflow': '2.17.1', 'mlflow-skinny': '2.17.1', 
'mmh3': '5.0.1', 'multidict': '6.1.0', 'mypy': '1.13.0', 'mypy-extensions': '1.0.0', 'nltk': '3.9.1', 'nodeenv': '1.9.1', 'numpy': '1.26.4', 'opentelemetry-api': '1.16.0', 'opentelemetry-sdk': '1.16.0', 'opentelemetry-semantic-conventions': '0.37b0', 'packaging': '24.1', 'pandas': '2.2.3', 'parso': 
'0.8.4', 'partd': '1.4.2', 'passlib': '1.7.4', 'pexpect': '4.9.0', 'pillow': '11.0.0', 'pip': '24.3.1', 'platformdirs': '4.3.6', 'pre-commit': '4.0.1', 'prometheus-client': '0.21.0', 'prompt-toolkit': '3.0.48', 'propcache': '0.2.0', 'protobuf': '4.25.5', 'psutil': '6.1.0', 'psycopg': '3.2.3', 'psycopg2': 
'2.9.10', 'psycopg-binary': '3.2.3', 'psycopg-pool': '3.2.3', 'ptyprocess': '0.7.0', 'pure-eval': '0.2.3', 'pyarrow': '17.0.0', 'pyasn1': '0.6.1', 'pyasn1-modules': '0.4.1', 'pycparser': '2.22', 'pydantic': '2.8.2', 'pydantic-core': '2.20.1', 'pydantic-settings': '2.6.1', 'pygit': '0.1', 'pygments': 
'2.18.0', 'pyparsing': '3.2.0', 'python-dateutil': '2.9.0.post0', 'python-dotenv': '1.0.1', 'pytz': '2024.2', 'redis': '5.2.0', 'referencing': '0.35.1', 'regex': '2024.9.11', 'requests': '2.32.3', 'requests-mock': '1.12.1', 'rich': '13.9.4', 'rpds-py': '0.20.0', 'rsa': '4.9', 'ruff': '0.7.2', 's3fs': 
'2024.10.0', 's3transfer': '0.10.3', 'scikit-learn': '1.5.2', 'scipy': '1.14.1', 'setuptools': '75.3.0', 'six': '1.16.0', 'smmap': '5.0.1', 'sniffio': '1.3.1', 'sqlmodel': '0.0.18', 'sqlparse': '0.5.1', 'stack-data': '0.6.3', 'starlette': '0.36.3', 'tabulate': '0.9.0', 'tenacity': '8.5.0', 'threadpoolctl':
'3.5.0', 'toml': '0.10.2', 'toolz': '1.0.0', 'tqdm': '4.66.6', 'traitlets': '5.14.3', 'typeguard': '4.4.0', 'typing-extensions': '4.12.2', 'tzdata': '2024.2', 'ujson': '5.10.0', 'urllib3': '2.2.3', 'uvicorn': '0.32.0', 'uvicorn-worker': '0.2.0', 'uvloop': '0.21.0', 'virtualenv': '20.27.0', 'watchfiles': 
'0.24.0', 'wcwidth': '0.2.13', 'websockets': '13.1', 'werkzeug': '3.0.6', 'widgetsnbextension': '4.0.13', 'wrapt': '1.16.0', 'xgboost': '2.1.2', 'xmljson': '0.2.1', 'yarl': '1.17.1', 'zenml': '0.68.1', 'zipp': '3.20.2', 'autocommand': '2.2.2', 'backports.tarfile': '1.2.0', 'importlib-resources': '6.4.0', 
'inflect': '7.3.1', 'jaraco.collections': '5.1.0', 'jaraco.context': '5.3.0', 'jaraco.functools': '4.0.1', 'jaraco.text': '3.12.1', 'more-itertools': '10.3.0', 'tomli': '2.0.1', 'wheel': '0.43.0'}

CURRENT STACK

Name: docker-compose
ID: 21e4c021-f6ed-4db4-9d81-7c14955e6577
User: nils / 2fbd0e70-bfc7-4e96-afe5-bc39d0e49a6c
Workspace: default / 5eaea2fc-1b17-42f3-89d1-2110928be520

ORCHESTRATOR: default

Name: default
ID: f275fa0b-003e-4f42-a98e-1ca6124cb870
Type: orchestrator
Flavor: local
Configuration: {}
Workspace: default / 5eaea2fc-1b17-42f3-89d1-2110928be520

ARTIFACT_STORE: minio_store

Name: minio_store
ID: d3a6a2de-7000-4d85-894d-321ce22436a4
Type: artifact_store
Flavor: s3
Configuration: {'authentication_secret': 'minio_secret', 'path': 's3://zenml', 'key': '********', 'secret': '********', 'token': '********', 'client_kwargs': {'endpoint_url': 'http://localhost:9000', 'region_name': 'eu-east-1'}, 'config_kwargs': None, 's3_additional_kwargs': None}
User: nils / 2fbd0e70-bfc7-4e96-afe5-bc39d0e49a6c
Workspace: default / 5eaea2fc-1b17-42f3-89d1-2110928be520

FEATURE_STORE: feast_store

Name: feast_store
ID: be019758-0194-45d2-861a-ebad60300e68
Type: feature_store
Flavor: feast
Configuration: {'online_host': 'localhost', 'online_port': 6379, 'feast_repo': './src/.../feature_repo'}
User: nils / 2fbd0e70-bfc7-4e96-afe5-bc39d0e49a6c
Workspace: default / 5eaea2fc-1b17-42f3-89d1-2110928be520

EXPERIMENT_TRACKER: mlflow

Name: mlflow
ID: b1c524cd-24b1-49aa-8cc4-0e24e517e4eb
Type: experiment_tracker
Flavor: mlflow
Configuration: {'experiment_name': None, 'nested': False, 'tags': {}, 'tracking_uri': 'http://localhost:5001', 'tracking_username': '********', 'tracking_password': '********', 'tracking_token': '********', 'tracking_insecure_tls': False, 'databricks_host': None, 'enable_unity_catalog': False}
User: nils / 2fbd0e70-bfc7-4e96-afe5-bc39d0e49a6c
Workspace: default / 5eaea2fc-1b17-42f3-89d1-2110928be520

ANNOTATOR: label_studio

Name: label_studio
ID: cd3ea58f-8771-4397-994b-8cec88037fde
Type: annotator
Flavor: label_studio
Configuration: {'authentication_secret': 'label_studio_secrets', 'instance_url': 'http://localhost', 'port': 8081, 'api_key': '********'}
User: nils / 2fbd0e70-bfc7-4e96-afe5-bc39d0e49a6c
Workspace: default / 5eaea2fc-1b17-42f3-89d1-2110928be520

MODEL_REGISTRY: mlflow_model_registry

Name: mlflow_model_registry
ID: 6eb91cfd-70ac-402b-8659-fb6d28e471bb
Type: model_registry
Flavor: mlflow
Configuration: {}
User: nils / 2fbd0e70-bfc7-4e96-afe5-bc39d0e49a6c
Workspace: default / 5eaea2fc-1b17-42f3-89d1-2110928be520

What happened?

I use Feast as a feature store and work with FeatureServices, among other things. Retrieving historical features from this feature store with a FeatureService is not possible with the current implementation of the integration, because it only accepts a list of strings (features, feature views). The underlying Feast implementation supports a FeatureService instance in addition to a list of strings.

# zenml.integrations.feast.feature_stores.feast_feature_store.FeastFeatureStore.get_historical_features


    def get_historical_features(
        self,
        entity_df: Union[pd.DataFrame, str],
        features: List[str],
        full_feature_names: bool = False,
    ) -> pd.DataFrame:
        """Returns the historical features for training or batch scoring.

        Args:
            entity_df: The entity DataFrame or entity name.
            features: The features to retrieve.
            full_feature_names: Whether to return the full feature names.

        Raise:
            ConnectionError: If the online component (Redis) is not available.

        Returns:
            The historical features as a Pandas DataFrame.
        """
        fs = FeatureStore(repo_path=self.config.feast_repo)

        return fs.get_historical_features(
            entity_df=entity_df,
            features=features,
            full_feature_names=full_feature_names,
        ).to_df()

# feast.feature_store.FeatureStore.get_historical_features

    def get_historical_features(
        self,
        entity_df: Union[pd.DataFrame, str],
        features: Union[List[str], FeatureService],
        full_feature_names: bool = False,
    ) -> RetrievalJob:
        [...]

I noticed that the integration avoids Feast's own object types anyway and always works with strings.
For example, the function zenml.integrations.feast.feature_stores.feast_feature_store.FeastFeatureStore.get_feature_services also returns a list of FeatureService names rather than the concrete FeatureService instances.

Is there a good reason for this?
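
For illustration, a minimal sketch of what a pass-through could look like if the wrapper's `features` parameter were widened to the same union Feast accepts. The helper name `get_historical_features_passthrough` is hypothetical and not part of ZenML; `store` is assumed to behave like `feast.FeatureStore`, and the types are left as `Any` so the snippet does not depend on Feast or pandas being installed.

```python
from typing import Any, List, Union


def get_historical_features_passthrough(
    store: Any,
    entity_df: Any,
    features: Union[List[str], Any],
    full_feature_names: bool = False,
) -> Any:
    """Forward `features` to the underlying store unchanged.

    Assumptions: `store` exposes get_historical_features() like
    feast.FeatureStore, and `features` is either a list of feature
    references or a FeatureService instance.
    """
    job = store.get_historical_features(
        entity_df=entity_df,
        features=features,  # no conversion to strings
        full_feature_names=full_feature_names,
    )
    # Feast returns a RetrievalJob; to_df() materializes it as a DataFrame.
    return job.to_df()
```

Because the argument is handed over as-is, both call styles that Feast supports would keep working, and existing callers that pass a list of strings would see no change.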

Reproduction steps

  1. Set up a Feast feature store
  2. Install integrations
  3. Add Feast stack component
  4. Try to run the following code:
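    import pandas as pd
    from zenml.client import Client
    from zenml.integrations.feast.feature_stores.feast_feature_store import (
        FeastFeatureStore,
    )

    feature_store = Client().active_stack.feature_store
    if not isinstance(feature_store, FeastFeatureStore):
        raise ValueError("The active stack has no Feast feature store.")
    # get_feature_services() returns names (strings), not FeatureService objects.
    feature_service = feature_store.get_feature_services()[0]
    entity_df = pd.DataFrame.from_dict(...)

    # The name is looked up as a feature view, which triggers the error below.
    data = feature_store.get_historical_features(
        entity_df=entity_df,
        features=[feature_service],
        full_feature_names=full_feature_names,
    )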

I get the following error message because the name is resolved as a feature view, whereas I want to retrieve a FeatureService.

feast.errors.FeatureViewNotFoundException: Feature view <feature_service_name> does not exist in project <project_name>
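
Until the wrapper accepts a FeatureService, one possible workaround is to resolve the name into an instance and call Feast directly. This is only a sketch: the helper name `historical_features_for_service` is hypothetical, and `feast_store` is assumed to behave like `feast.FeatureStore`, which, as far as I know, offers `get_feature_service(name)` for the lookup.

```python
from typing import Any


def historical_features_for_service(
    feast_store: Any,
    service_name: str,
    entity_df: Any,
    full_feature_names: bool = False,
) -> Any:
    """Fetch historical features for a FeatureService given by name.

    Assumption: `feast_store` behaves like feast.FeatureStore.
    """
    # Resolve the name to the FeatureService object instead of passing
    # the bare string, which Feast would treat as a feature reference.
    service = feast_store.get_feature_service(service_name)
    job = feast_store.get_historical_features(
        entity_df=entity_df,
        features=service,
        full_feature_names=full_feature_names,
    )
    return job.to_df()
```

With a real setup, `feast_store` could be created the same way the integration does it, via `FeatureStore(repo_path=feature_store.config.feast_repo)`, and `service_name` would be one of the names returned by `get_feature_services()`.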

Relevant log output

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct

aiakide added the bug (Something isn't working) label Nov 8, 2024

Contributor Author

aiakide commented Nov 8, 2024

FYI, I have customized the Feast integration locally to work with both a list of features / feature views and a FeatureService instance. It looks like everything is working fine.

Contributor

htahir1 commented Nov 10, 2024

@aiakide thank you for the report! Would you consider making an open source contribution to make this work for Feast?

Contributor Author

aiakide commented Nov 11, 2024

I'm happy to do that. 🚀
I just want to make sure that there is no explicit reason to work only with the feature service names and not with the concrete instances (FeatureService) at this point.

Contributor

htahir1 commented Nov 11, 2024

@aiakide no, I don't think so! Happy to hear this <3
