feature(FileManager): adding FileManager to make feasible work with the library in other environment #1573

scaliseraoul · 2025-01-31T16:28:40Z

Important

Introduce FileManager abstraction for file operations, replacing direct file system calls across modules, and update tests accordingly.

File Management:
- Introduce FileManager and DefaultFileManager in helpers/filemanager.py for file operations.
- Replace os file operations with FileManager methods in pandasai/__init__.py, loader.py, and base.py.
Functionality:
- Update create function in pandasai/__init__.py to use FileManager for directory and file operations.
- Modify DatasetLoader in loader.py to use FileManager for schema file operations.
- Update DataFrame in base.py to use FileManager for push and pull operations.
Removals:
- Remove FileBasedPrompt class from core/prompts/file_based_prompt.py.
Testing:
- Update tests in test_loader.py, test_sql_loader.py, and test_pandasai_init.py to mock FileManager methods.
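
A minimal sketch of the idea behind this change, for readers who have not opened the diff. The method names (`exists`, `mkdir`, `write`, `load`), the `InMemoryFileManager` class, the `create_dataset` helper, and the `schema.yaml` file name are illustrative assumptions, not the actual contents of helpers/filemanager.py. It shows how routing file operations through one interface lets a non-disk backend or a test double replace direct `os` calls.

```python
import os
from abc import ABC, abstractmethod


class FileManager(ABC):
    """Hypothetical interface; method names are assumptions, not the PR's API."""

    @abstractmethod
    def exists(self, path: str) -> bool: ...

    @abstractmethod
    def mkdir(self, path: str) -> None: ...

    @abstractmethod
    def write(self, path: str, content: str) -> None: ...

    @abstractmethod
    def load(self, path: str) -> str: ...


class DefaultFileManager(FileManager):
    """Local-disk implementation; every path is resolved under base_path."""

    def __init__(self, base_path: str):
        self.base_path = base_path

    def _abs(self, path: str) -> str:
        return os.path.join(self.base_path, path)

    def exists(self, path: str) -> bool:
        return os.path.exists(self._abs(path))

    def mkdir(self, path: str) -> None:
        os.makedirs(self._abs(path), exist_ok=True)

    def write(self, path: str, content: str) -> None:
        with open(self._abs(path), "w", encoding="utf-8") as f:
            f.write(content)

    def load(self, path: str) -> str:
        with open(self._abs(path), encoding="utf-8") as f:
            return f.read()


class InMemoryFileManager(FileManager):
    """Hypothetical stand-in for an environment without a writable disk."""

    def __init__(self):
        self.files = {}
        self.dirs = set()

    def exists(self, path: str) -> bool:
        return path in self.files or path in self.dirs

    def mkdir(self, path: str) -> None:
        self.dirs.add(path)

    def write(self, path: str, content: str) -> None:
        self.files[path] = content

    def load(self, path: str) -> str:
        # Mirror the on-disk behaviour for a missing file.
        if path not in self.files:
            raise FileNotFoundError(path)
        return self.files[path]


def create_dataset(fm: FileManager, path: str, schema_yaml: str) -> None:
    """Illustrative caller: touches storage only through the FileManager."""
    if fm.exists(path):
        raise ValueError(f"Dataset already exists at path: {path}")
    fm.mkdir(path)
    fm.write(f"{path}/schema.yaml", schema_yaml)
```

Because the caller depends only on the interface, a test can pass `InMemoryFileManager()` (or a mock) where production code would pass `DefaultFileManager(...)`. That is the same substitution the updated tests rely on when they mock FileManager methods.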

This description was created by Ellipsis for bcc04a9. It will automatically update as commits are pushed.

ellipsis-dev

👍 Looks good to me! Reviewed everything up to bcc04a9 in 1 minute and 29 seconds

1. pandasai/__init__.py:106

Draft comment:
Consider using ConfigManager.get() instead of config.get() for consistency with other parts of the codebase.
Reason this comment was not posted:
Confidence changes required: 50%
The PR introduces a new FileManager abstraction to handle file operations, which is a good design choice for flexibility and testing. However, there are some areas where the code can be improved for clarity and consistency.

2. pandasai/data_loader/loader.py:58

Draft comment:
Consider using ConfigManager.get() instead of config.get() for consistency with other parts of the codebase.
Reason this comment was not posted:
Confidence changes required: 50%
The PR introduces a new FileManager abstraction to handle file operations, which is a good design choice for flexibility and testing. However, there are some areas where the code can be improved for clarity and consistency.

3. pandasai/dataframe/base.py:167

Draft comment:
Consider using ConfigManager.get() instead of config.get() for consistency with other parts of the codebase.
Reason this comment was not posted:
Confidence changes required: 50%
The PR introduces a new FileManager abstraction to handle file operations, which is a good design choice for flexibility and testing. However, there are some areas where the code can be improved for clarity and consistency.

4. pandasai/__init__.py:109

Draft comment:
The error message "Dataset already exists at path: {path}" could be improved for clarity. Consider rephrasing it to provide more context, such as "A dataset with the specified path already exists. Please choose a different path or remove the existing dataset."
Reason this comment was not posted:
Confidence changes required: 50%
The error message in the create function when a dataset already exists is not very descriptive. It should provide more context to the user.

5. pandasai/data_loader/loader.py:61

Draft comment:
The error message "Schema file not found: {schema_path}" could be improved for clarity. Consider rephrasing it to provide more context, such as "The schema file could not be located at the specified path. Please ensure the path is correct and the file exists."
Reason this comment was not posted:
Confidence changes required: 50%
The error message in the create_loader_from_path method when the schema file is not found is not very descriptive. It should provide more context to the user.

6. pandasai/data_loader/sql_loader.py:46

Draft comment:
The error message "Failed to execute query for '{source_type}' with: {formatted_query}" could be improved for clarity. Consider rephrasing it to provide more context, such as "The query execution failed for the specified source type. Please check the query syntax and source configuration."
Reason this comment was not posted:
Confidence changes required: 50%
The error message in the execute_query method when a query fails is not very descriptive. It should provide more context to the user.

Workflow ID: wflow_N6uHl6egM1T7U7xZ

scaliseraoul-sinaptik added 3 commits January 31, 2025 10:58

feature(SqlLoader): transformations in SqlLoader

314ba22

feature(FileManager): adding FileManager to make feasible work with t…

25f4478

…he library in other environment

feature(FileManager): filemanager full implementation

bcc04a9

ellipsis-dev bot reviewed Jan 31, 2025

scaliseraoul and others added 3 commits January 31, 2025 17:31

Merge branch 'main' into fix/SIN-340

124d4c3

feature(FileManager): format issues

cb1668b

feature(FileManager): updated tests

459991a

gventuri merged commit 0c6738b into sinaptik-ai:main Jan 31, 2025
12 checks passed