Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: handling for context vars. add support for disabling prophet option. #1209

Merged
merged 14 commits into from
Feb 5, 2025

Conversation

alxlyj
Copy link
Contributor

@alxlyj alxlyj commented Jan 28, 2025

Project Robyn

Code Changes

add option for disabling prophet

  • The holidays_data parameter is now optional in init with a default of None
  • check for Prophet enablement via
prophet_enabled = (
    self.holidays_data is not None
    and self.holidays_data.prophet_vars is not None
    and len(self.holidays_data.prophet_vars) > 0
)
  • The code conditionally includes Prophet variables in all_ind_vars only if Prophet is enabled

The implementations between R and Python should match in that:

  • Both skip Prophet decomposition when disabled
  • Both exclude Prophet variables from feature set
  • Both preserve the same core variables (date, dependent var, media vars, context vars)
  • Both handle the feature engineering pipeline similarly with/without Prophet

NOTE: We will validate prophet disabled outputs in another PR.


handling for context vars

In Python (feature_engineering.py), the _prepare_data method is part of the feature engineering process. It creates a copy of input data and standardizes column names. Right now, it has a problematic hardcoded type conversion. Specifically:

        dt_transform["competitor_sales_B"] = dt_transform["competitor_sales_B"].astype(
            "int64"
        )

In this PR:

  • Handle factor variables first
  • Only attempt numeric conversion for context variables that:
    • Are not in factor_vars
    • Are currently object dtype (meaning they need conversion)
  • More focused logging

This aligns better with R's implementation where:

  • Factor variables (like "events") remain categorical
  • Numeric variables (like "competitor_sales_B") are treated as numbers
  • We don't unnecessarily convert variables that are already in the correct format

Notebook demo changes

  • Fix default iterations to 2000.
  • Added path guidances for local import references.

Test

  • Ran tutorial1.ipynb without issues.
  • pytest Robyn/python/tests/integration/test_feature_engineering.py -v -s
  • For more logs: pytest Robyn/python/tests/integration/test_feature_engineering.py -v -s --log-cli-level=DEBUG
collected 9 items                                                                                                                                                                                                                                                                                                

python/tests/integration/test_feature_engineering.py::test_feature_engineering_initialization PASSED
python/tests/integration/test_feature_engineering.py::test_prepare_data_with_custom_context PASSED
python/tests/integration/test_feature_engineering.py::test_feature_engineering_with_different_context_vars 18:55:13 - cmdstanpy - INFO - Chain [1] start processing
18:55:13 - cmdstanpy - INFO - Chain [1] done processing
PASSED
python/tests/integration/test_feature_engineering.py::test_feature_engineering_no_context_vars 18:55:13 - cmdstanpy - INFO - Chain [1] start processing
18:55:13 - cmdstanpy - INFO - Chain [1] done processing
PASSED
python/tests/integration/test_feature_engineering.py::test_invalid_context_var PASSED
python/tests/integration/test_feature_engineering.py::test_feature_engineering_with_prophet_disabled PASSED
python/tests/integration/test_feature_engineering.py::test_feature_engineering_with_empty_prophet_vars PASSED
python/tests/integration/test_feature_engineering.py::test_prophet_enable_disable_comparison 18:55:13 - cmdstanpy - INFO - Chain [1] start processing
18:55:13 - cmdstanpy - INFO - Chain [1] done processing
PASSED
python/tests/integration/test_feature_engineering.py::test_feature_engineering_with_prophet_enabled 18:55:13 - cmdstanpy - INFO - Chain [1] start processing
18:55:13 - cmdstanpy - INFO - Chain [1] done processing
PASSED
========================================================================================================================================= 9 passed, 28 warnings in 2.17s =========================================================================================================================================
(

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jan 28, 2025
@alxlyj alxlyj requested review from sumane81 and gufengzhou January 28, 2025 20:06
@alxlyj alxlyj marked this pull request as ready for review January 28, 2025 23:15
@alxlyj alxlyj marked this pull request as draft January 29, 2025 01:59
@alxlyj alxlyj marked this pull request as ready for review January 29, 2025 02:23
@alxlyj alxlyj changed the title added proper handling of context variables fix handling for context vars. add support for disabling prophet option. Jan 29, 2025
@alxlyj alxlyj self-assigned this Feb 4, 2025
@alxlyj alxlyj requested a review from sumalreddy17 February 4, 2025 23:20
@alxlyj alxlyj changed the title fix handling for context vars. add support for disabling prophet option. Fix: handling for context vars. add support for disabling prophet option. Feb 5, 2025
@alxlyj
Copy link
Contributor Author

alxlyj commented Feb 5, 2025

We will address further data testing and validation of holiday_data and prophet_vars disabled in a separate PR for #1208

Copy link
Contributor

@sumane81 sumane81 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@alxlyj alxlyj merged commit 025eded into main Feb 5, 2025
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants