Merge `release/v0.24.1` #4458

benjaminpkane · 2024-06-06T14:36:38Z

Summary by CodeRabbit

New Features
- Added a new method for downloading 3D scene assets in the fiftyone package.
- Introduced parameters for controlling dataset persistence and generation behavior in various dataset creation functions.
Bug Fixes
- Corrected a comment related to sample_id association with frames in clips datasets.
Documentation
- Updated documentation to include new methods and parameters.
- Adjusted header levels in the anomaly detection tutorial for better content presentation.
Refactor
- Improved handling of file paths and asset paths across multiple modules for consistency and robustness.
- Renamed and simplified functions in the storage module.
Tests
- Added new test methods to verify the functionality of patch generation and asset path updates.
- Modified existing tests to ensure accurate path handling and dataset persistence.

Cherry picking PRs into v0.24.1

coderabbitai · 2024-06-06T14:36:56Z

Walkthrough

The changes encompass various enhancements and fixes across multiple files. Key updates include adding Git commands to a GitHub Actions workflow, introducing new methods and parameters in the fiftyone package for better dataset management, refining path handling, and improving test coverage. These modifications aim to enhance functionality, improve efficiency, and ensure robustness in handling datasets, particularly 3D scene assets, video frames, and patches.

Changes

Files & Paths	Change Summaries
`.github/workflows/push-release.yml`	Added Git commands to handle branch merging and pushing.
`docs/source/teams/cloud_media.rst`	Added `download_scenes` method and `include_assets` parameter in `fo.Dataset` class.
`docs/source/tutorials/anomaly_detection.ipynb`	Updated header level in a markdown cell from "##" to "#".
`fiftyone/core/clips.py`	Added `persistent` and `_generated` parameters to `make_clips_dataset`.
`fiftyone/core/dataset.py`	Corrected a comment in `_get_frames_pipeline` function.
`fiftyone/core/metadata.py`	Refactored `SceneMetadata` class and updated import statements.
`fiftyone/core/patches.py`	Added `persistent` and `_generated` parameters to `make_patches_dataset` and `make_evaluation_patches_dataset`.
`fiftyone/core/stages.py`	Added `_generated=True` parameter to various function calls in `load_view` method.
`fiftyone/core/storage.py`	Renamed `resolve` function to `realpath` and updated related docstrings.
`fiftyone/core/threed/object_3d.py`	Modified `Object3D` class declaration and updated `_asset_path_fields` attribute.
`fiftyone/core/threed/scene_3d.py`	Adjusted asset path handling in `Scene` class.
`fiftyone/core/video.py`	Added `persistent` and `_generated` parameters to `make_frames_dataset`.
`fiftyone/utils/cvat.py`	Refactored `CVATBackendConfig` and improved job retrieval logic in `upload_data` function.
`fiftyone/utils/data/exporters.py`	Refactored logic related to dataset exporters.
`fiftyone/utils/utils3d.py`	Refactored file path handling within functions for consistent path resolution.
`tests/intensive/cvat_tests.py`	Added `segment_size=1` parameter to `test_detection_labelling` method.
`tests/unittests/metadata_tests.py`	Modified logic for counting `size_bytes` and `asset_counts` in `test_build_for` method.
`tests/unittests/patches_tests.py`	Added test method `test_make_patches_dataset` and imported `fiftyone.core.patches`.
`tests/unittests/threed/scene_3d_tests.py`	Added `test_update_asset_paths` method and modified path resolution in `test_write_resolve_relative`.
`tests/unittests/utils3d_tests.py`	Removed `os.path.realpath` calls in `test_get_scene_asset_paths` function.
`tests/unittests/video_tests.py`	Added imports for `fiftyone.core.clips`.

Sequence Diagram(s) (Beta)

sequenceDiagram
    participant GitHub Actions
    participant Branch
    participant Git
    participant Develop Branch

    GitHub Actions->>Git: Checkout merge branch
    Git->>Branch: Create or switch to merge branch
    GitHub Actions->>Git: Push changes to merge branch
    Git->>Branch: Push changes
    GitHub Actions->>Git: Pull changes from merge branch into develop branch
    Git->>Develop Branch: Merge changes

Poem

In the land of code, where datasets grow,
New methods sprout, and functions flow.
With paths resolved and assets clear,
Efficiency blooms, there's nothing to fear.
From patches to scenes, improvements abound,
In FiftyOne's realm, progress is found.
🐇✨

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share

Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai generate interesting stats about this repository and render them as a table.
- @coderabbitai show all the console.log statements in this repository.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 10

Outside diff range and nitpick comments (35)

fiftyone/core/storage.py (3)
Line range hint 8-8: Remove unused import contextlib.contextmanager.
- from contextlib import contextmanager
Line range hint 284-311: Ensure consistent documentation and functionality for realpath and abspath.

The documentation for realpath and abspath is almost identical, and their functionality is very similar. Consider merging these functions if their usage throughout the codebase allows, or at least align their documentation to clearly distinguish their use cases.

Line range hint 882-882: Avoid using bare except.

Replace the bare except with more specific exception handling to improve error diagnostics and maintainability.
- except Exception as e:
+ except (FileNotFoundError, IOError) as e:
docs/source/teams/cloud_media.rst (3)
Line range hint 394-461: Ensure consistency in method documentation for download_scenes.

The method download_scenes shares a lot of similarities with other download methods. Ensure that the documentation and parameters are consistent across all similar methods to maintain a uniform interface.

Line range hint 470-509: Clarify the usage of download_context in batch processing.

The documentation for download_context could be enhanced by providing examples or more detailed explanations of how batch_size and target_size_bytes affect the download process. This would help users better understand how to use this method effectively.

[!TIP]
Codebase Verification

The include_assets parameter is consistently present in the function signatures and docstrings, but there is no evidence from the search results that it is actually used within the function bodies. This suggests that the parameter might be redundant.

Consider reviewing the implementation of these functions to determine if the include_assets parameter is necessary. If it is not used, it may be beneficial to remove it to simplify the interface.

Analysis chain

Line range hint 520-551: Review the necessity of the include_assets parameter in get_local_paths.

Check if the include_assets parameter is used consistently across functions that deal with local path retrieval and asset management. If it's not uniformly needed, consider simplifying the interface by removing it where unnecessary.

The include_assets parameter is consistently used across multiple functions in the docs/source/teams/cloud_media.rst file. This suggests that it is a standard part of the interface for handling media assets. However, to ensure its necessity, we should verify if this parameter is actually utilized in the implementation of these functions.

To ensure thorough verification, we need to search for the usage of the include_assets parameter within the function bodies.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Verify if the `include_assets` parameter is consistently used or needed across similar functions.
grep -rnw 'include_assets' -A 5 -B 5
Length of output: 8945

Script:
#!/bin/bash
# Search for the implementation of functions that use the `include_assets` parameter to verify its necessity.
ast-grep --lang python --pattern $'def $_($_, include_assets=$_, $$$):\n    $$$' --json
Length of output: 90

Script:
#!/bin/bash
# Search for the usage of the `include_assets` parameter within the function bodies.
rg 'include_assets' -A 10 -B 10
Length of output: 12341
fiftyone/core/patches.py (4)
Line range hint 561-592: Consider documenting the new parameters in the function signature.

It would enhance clarity and maintainability if the newly added parameters persistent and _generated in the make_patches_dataset function were documented in the docstring. This helps other developers understand the purpose and usage of these parameters without digging into the code.

Line range hint 654-654: Use direct truthiness checks for boolean expressions.
- if other_fields == True:
-     other_fields = [f for f in src_schema if f not in curr_schema]
+ if other_fields:
+     other_fields = [f for f in src_schema if f not in curr_schema]
Instead of comparing other_fields to True, you can use the truthiness of the value directly. This is more Pythonic and reduces unnecessary complexity in the condition.

Line range hint 683-738: Document the new parameters in the function signature.

The function make_evaluation_patches_dataset has new parameters persistent and _generated that are not documented in the docstring. Adding a brief explanation about these parameters would help maintain the documentation's usefulness and completeness.

Line range hint 805-805: Use direct truthiness checks for boolean expressions.
- if other_fields == True:
-     other_fields = [f for f in src_schema if f not in curr_schema]
+ if other_fields:
+     other_fields = [f for f in src_schema if f not in curr_schema]
Instead of comparing other_fields to True, you can use the truthiness of the value directly. This is more Pythonic and reduces unnecessary complexity in the condition.
fiftyone/core/video.py (4)
Line range hint 617-617: Use Pythonic conditional checks.
- if sample_frames != True:
+ if not sample_frames:
Replace if sample_frames != True: with if not sample_frames: for a more Pythonic and readable conditional check.

Line range hint 618-618: Clarify variable naming.
- l = locals()
+ local_vars = locals()
The variable name l is ambiguous and can be confused with the number 1. Consider renaming it to local_vars for better readability.

Line range hint 707-707: Use Pythonic conditional checks.
- if sample_frames == False:
+ if not sample_frames:
- if sample_frames == True:
+ if sample_frames:
Replace equality comparisons to True or False with the more Pythonic if sample_frames: or if not sample_frames: respectively.

Also applies to: 741-741, 748-748, 776-776, 810-810, 813-813, 858-858

Line range hint 789-789: Use is not None for None checks.
- if src_dataset.has_frame_field("filepath") != None:
+ if src_dataset.has_frame_field("filepath") is not None:
For checking against None, use is not None instead of != None for better readability and conformance to Python best practices.
fiftyone/core/clips.py (4)
Line range hint 119-119: Replace the bare except with a specific exception type.
- except:
+ except ExceptionType:  # Replace ExceptionType with the specific exception you are catching
This change will make the error handling more precise and avoid catching unintended exceptions.

Line range hint 542-542: Use direct truthiness checks for boolean conditions.
- if other_fields == True:
- if other_fields == True:
+ if other_fields:
+ if other_fields:
This simplifies the condition checks and adheres to Pythonic best practices.

Also applies to: 723-723

Line range hint 1055-1055: Use is not None for None checks.
- if not frame_numbers:
+ if frame_numbers is not None:
This change ensures that the check is explicit and clear, improving code readability and correctness.

Line range hint 1125-1125: Clarify ambiguous variable name l.
- for s, l in zip(frame_numbers, bools):
+ for start, last in zip(frame_numbers, bools):
Renaming l to last improves code readability and avoids confusion with the number 1.
tests/intensive/cvat_tests.py (6)
Line range hint 13-14: Remove unused imports to clean up the code.
- from collections import defaultdict
- import numpy as np
Line range hint 380-381: Local variables api and task_id are assigned but never used in the test_multiple_fields method.
- api = results.connect_to_api()
- task_id = results.task_ids[0]
Line range hint 694-694: Local variable person_labels is assigned but never used in the test_example_restricting_label_edits method.
- person_labels = view.filter_labels(
-     "ground_truth", F("label") == "person"
- ).values("ground_truth.detections", unwind=True)
Line range hint 816-816: Local variable status is assigned but never used in the test_deleted_tasks method.
- status = results.get_status()
Line range hint 892-892: Avoid using single-letter variable names like l to improve code readability.
- label_map = {l: l.upper() for l in labels}
+ label_map = {label: label.upper() for label in labels}
Line range hint 1327-1327: Local variable results is assigned but never used in the test_frame_start_stop_step method.
- results = dataset.annotate(
-     anno_key,
-     backend="cvat",
-     label_field="frames.detections_new_2",
-     label_type="detections",
-     classes=["test"],
-     frame_start=[start],
-     frame_stop=[100],
-     frame_step=[step],
- )
tests/unittests/video_tests.py (2)
Line range hint 1040-1040: Replace lambda expression with a function definition for clarity and maintainability.
# Replace lambda with function definition
def filepath_fcn(sample):
    return sample.filepath
Line range hint 2689-2689: Ambiguous variable name 'l' used, consider renaming for clarity.
# Example of renaming variable 'l' to 'label'
for _id, label, i, s in zip(
    *trajectories.values(
        [
            "sample_id",
            "detections.label",
            "detections.index",
            "support",
        ]
    )
):
    expected_trajs_map = {
        (sample1.id, "cat", 1): [1, 3],
        (sample1.id, "dog", 1): [3, 4],
        (sample2.id, "cat", 1): [2, 3],
        (sample2.id, "dog", 2): [2, 2],
        (sample2.id, "dog", 3): [3, 5],
    }
Also applies to: 2954-2954, 3256-3256
fiftyone/utils/data/exporters.py (9)
Line range hint 543-543: Avoid using bare except statements.
- except:
+ except Exception as e:
Line range hint 573-573: Avoid using bare except statements.
- except:
+ except Exception as e:
Line range hint 616-616: Avoid using bare except statements.
- except:
+ except Exception as e:
Line range hint 651-651: Avoid using bare except statements.
- except:
+ except Exception as e:
Line range hint 906-906: Avoid using bare except statements.
- except:
+ except Exception as e:
Line range hint 3818-3818: Use direct truth checks instead of comparing to False.
- if include_confidence == False:
+ if not include_confidence:
Line range hint 3818-3818: Use direct truth checks instead of comparing to False.
- if include_attributes == False:
+ if not include_attributes:
Line range hint 3944-3944: Use direct truth checks instead of comparing to True.
- if include_confidence == True:
+ if include_confidence:
Line range hint 3949-3949: Use direct truth checks instead of comparing to True.
- if include_attributes == True:
+ if include_attributes:

Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

Commits

Files that changed from the base of the PR and between f633834 and e17741c.

Files ignored due to path filters (1)

fiftyone/zoo/models/manifest-torch.json is excluded by !**/*.json

Files selected for processing (21)

.github/workflows/push-release.yml (1 hunks)
docs/source/teams/cloud_media.rst (12 hunks)
docs/source/tutorials/anomaly_detection.ipynb (1 hunks)
fiftyone/core/clips.py (4 hunks)
fiftyone/core/dataset.py (1 hunks)
fiftyone/core/metadata.py (7 hunks)
fiftyone/core/patches.py (6 hunks)
fiftyone/core/stages.py (5 hunks)
fiftyone/core/storage.py (3 hunks)
fiftyone/core/threed/object_3d.py (3 hunks)
fiftyone/core/threed/scene_3d.py (3 hunks)
fiftyone/core/video.py (4 hunks)
fiftyone/utils/cvat.py (7 hunks)
fiftyone/utils/data/exporters.py (6 hunks)
fiftyone/utils/utils3d.py (3 hunks)
tests/intensive/cvat_tests.py (1 hunks)
tests/unittests/metadata_tests.py (1 hunks)
tests/unittests/patches_tests.py (2 hunks)
tests/unittests/threed/scene_3d_tests.py (2 hunks)
tests/unittests/utils3d_tests.py (1 hunks)
tests/unittests/video_tests.py (3 hunks)

Files skipped from review due to trivial changes (1)

docs/source/tutorials/anomaly_detection.ipynb

Additional context used

Ruff

tests/unittests/metadata_tests.py

8-8: json imported but unused (F401)

fiftyone/core/metadata.py

561-561: Do not use bare except (E722)

tests/unittests/patches_tests.py

171-171: Ambiguous variable name: l (E741)

475-475: Ambiguous variable name: l (E741)

537-537: Avoid equality comparisons to True; use if F("crowd"): for truth checks (E712)

579-579: Avoid equality comparisons to True; use if F("crowd"): for truth checks (E712)

fiftyone/core/storage.py

8-8: contextlib.contextmanager imported but unused (F401)

882-882: Do not use bare except (E722)

fiftyone/core/patches.py

285-285: Ambiguous variable name: l (E741)

654-654: Avoid equality comparisons to True; use if other_fields: for truth checks (E712)

805-805: Avoid equality comparisons to True; use if other_fields: for truth checks (E712)

fiftyone/core/video.py

617-617: Avoid inequality comparisons to True; use if not sample_frames: for false checks (E712)

618-618: Ambiguous variable name: l (E741)

707-707: Avoid equality comparisons to False; use if not sample_frames: for false checks (E712)

741-741: Avoid inequality comparisons to False; use if sample_frames: for truth checks (E712)

748-748: Avoid equality comparisons to True; use if sample_frames: for truth checks (E712)

776-776: Avoid equality comparisons to True; use if sample_frames: for truth checks (E712)

789-789: Comparison to None should be cond is not None (E711)

810-810: Avoid inequality comparisons to False; use if sample_frames: for truth checks (E712)

813-813: Avoid equality comparisons to True; use if sample_frames: for truth checks (E712)

858-858: Avoid equality comparisons to True; use if sample_frames: for truth checks (E712)

1032-1032: Avoid equality comparisons to False; use if not sample_frames: for false checks (E712)

1037-1037: Avoid inequality comparisons to True; use if not sample_frames: for false checks (E712)

fiftyone/core/clips.py

119-119: Do not use bare except (E722)

542-542: Avoid equality comparisons to True; use if other_fields: for truth checks (E712)

723-723: Avoid equality comparisons to True; use if other_fields: for truth checks (E712)

1055-1055: Comparison to None should be cond is not None (E711)

1125-1125: Ambiguous variable name: l (E741)

tests/intensive/cvat_tests.py

13-13: collections.defaultdict imported but unused (F401)

14-14: numpy imported but unused (F401)

380-380: Local variable api is assigned to but never used (F841)

381-381: Local variable task_id is assigned to but never used (F841)

694-694: Local variable person_labels is assigned to but never used (F841)

816-816: Local variable status is assigned to but never used (F841)

892-892: Ambiguous variable name: l (E741)

1327-1327: Local variable results is assigned to but never used (F841)

tests/unittests/video_tests.py

1040-1040: Do not assign a lambda expression, use a def (E731)

2689-2689: Ambiguous variable name: l (E741)

2954-2954: Ambiguous variable name: l (E741)

3256-3256: Ambiguous variable name: l (E741)

fiftyone/utils/data/exporters.py

543-543: Do not use bare except (E722)

573-573: Do not use bare except (E722)

616-616: Do not use bare except (E722)

651-651: Do not use bare except (E722)

906-906: Do not use bare except (E722)

3818-3818: Avoid inequality comparisons to False; use if include_confidence: for truth checks (E712)

3818-3818: Avoid inequality comparisons to False; use if include_attributes: for truth checks (E712)

3944-3944: Avoid equality comparisons to True; use if include_confidence: for truth checks (E712)

3949-3949: Avoid equality comparisons to True; use if include_attributes: for truth checks (E712)

fiftyone/utils/cvat.py

3440-3440: Do not use bare except (E722)

3475-3475: Do not use bare except (E722)

3989-3989: Local variable response is assigned to but never used (F841)

4152-4152: Local variable response is assigned to but never used (F841)

6078-6078: Local variable classes is assigned to but never used (F841)

6723-6723: Local variable formatted_track is assigned to but never used (F841)

6859-6859: Do not use bare except (E722)

6936-6936: Do not use bare except (E722)

7412-7412: Do not use bare except (E722)

7421-7421: Do not use bare except (E722)

7426-7426: Do not use bare except (E722)

7832-7832: Do not use bare except (E722)

fiftyone/core/stages.py

946-954: Do not assign a lambda expression, use a def (E731)

956-970: Do not assign a lambda expression, use a def (E731)

1540-1540: Comparison to None should be cond is not None (E711)

2504-2504: Do not assign a lambda expression, use a def (E731)

2509-2509: Do not assign a lambda expression, use a def (E731)

2568-2568: Comparison to None should be cond is not None (E711)

2570-2570: Comparison to None should be cond is not None (E711)

2577-2577: Comparison to None should be cond is not None (E711)

2588-2588: Comparison to None should be cond is not None (E711)

2752-2752: Comparison to None should be cond is not None (E711)

3920-3920: Comparison to None should be cond is not None (E711)

5178-5178: Comparison to None should be cond is not None (E711)

6620-6620: Comparison to None should be cond is not None (E711)

7270-7270: Do not use bare except (E722)

7282-7282: Do not use bare except (E722)

7290-7290: Do not use bare except (E722)

8284-8284: Do not use bare except (E722)

fiftyone/core/dataset.py

40-40: fiftyone.core.odm.dataset.SampleFieldDocument imported but unused (F401)

236-236: Do not use bare except (E722)

342-342: Do not use bare except (E722)

3325-3325: Ambiguous variable name: l (E741)

3341-3341: Ambiguous variable name: l (E741)

3384-3384: Ambiguous variable name: l (E741)

6733-6733: Avoid equality comparisons to False; use if not attach_frames: for false checks (E712)

7112-7112: Do not use bare except (E722)

7122-7122: Do not use bare except (E722)

7282-7282: Do not assign a lambda expression, use a def (E731)

7282-7282: Ambiguous variable name: l (E741)

7576-7576: Do not use bare except (E722)

8396-8396: Do not assign a lambda expression, use a def (E731)

Additional comments not posted (28)

.github/workflows/push-release.yml (1)
24-28: Ensure the new git commands in the workflow are secure and efficient.

Verification successful

The git commands in the workflow file appear to be standard operations for managing branches and merging changes. They are used correctly and follow best practices for git operations.

git checkout merge/${{ inputs.ref_name || github.ref_name }} \ || git checkout -b merge/${{ inputs.ref_name || github.ref_name }}: This command checks out the specified branch or creates a new one if it doesn't exist.

git push -u origin merge/${{ inputs.ref_name || github.ref_name }}: This command pushes the branch to the remote repository.

git checkout develop: This command checks out the develop branch.

git pull origin merge/${{ inputs.ref_name || github.ref_name }} --no-rebase: This command pulls changes from the specified branch without rebasing.

git pull origin ${{ inputs.ref_name || github.ref_name }} --no-rebase: This command pulls changes from the specified branch without rebasing.

These commands are used correctly and do not pose any security risks.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the security settings for git operations in the workflow.

# Test: Check for any potential security advisories related to the used git commands.
gh security advisories git
Length of output: 321

Script:
#!/bin/bash
# Description: Verify the correctness and security of git commands in the workflow file.

# Search for the usage of git commands in the workflow file.
rg 'git checkout|git push|git pull' .github/workflows/push-release.yml
Length of output: 496
tests/unittests/metadata_tests.py (1)

33-49: Ensure that the logic for counting size_bytes and asset_counts correctly handles duplicates.
tests/unittests/threed/scene_3d_tests.py (2)
119-132: Ensure that relative paths are correctly resolved in the test_write_resolve_relative method.

74-102: Verify that the updated asset paths are correctly handled in various scenarios.

Verification successful

The update_asset_paths and get_asset_paths methods in fiftyone/core/threed/scene_3d.py appear to be correctly implemented to handle asset path updates and retrievals. The update_asset_paths method updates asset paths based on a provided dictionary, and the get_asset_paths method returns a list of all asset paths in the scene.

update_asset_paths method:

Iterates through nodes and updates asset paths if they exist in the provided dictionary.

Also updates background image and cube paths if they exist in the dictionary.

get_asset_paths method:

Collects asset paths from nodes and background images/cubes without resolving relative paths to absolute paths.

These implementations align with the test case in tests/unittests/threed/scene_3d_tests.py, ensuring that asset paths are correctly updated and retrieved.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify that asset paths are correctly updated across different file systems and configurations.

# Test: Check the behavior of asset path updates on different operating systems.
echo "Run tests on Windows, Linux, and macOS to ensure consistent behavior."
Length of output: 148

Script:
#!/bin/bash
# Locate the implementation of the `update_asset_paths` method in the `scene` class.
ast-grep --lang python --pattern $'class $_ {
  $$$
  def update_asset_paths(self, $_):
    $$$
  $$$
}'

# Locate the implementation of the `get_asset_paths` method in the `scene` class.
ast-grep --lang python --pattern $'class $_ {
  $$$
  def get_asset_paths(self):
    $$$
  $$$
}'
Length of output: 1179

Script:
#!/bin/bash
# Locate the implementation of the `update_asset_paths` method in the `scene` class.
ast-grep --lang python --pattern 'class $_ {
  def update_asset_paths(self, $_):
    $$$
}'

# Locate the implementation of the `get_asset_paths` method in the `scene` class.
ast-grep --lang python --pattern 'class $_ {
  def get_asset_paths(self):
    $$$
}'
Length of output: 1131

Script:
#!/bin/bash
# Locate the implementation of the `update_asset_paths` method in the `scene` class.
ast-grep --lang python --pattern 'def update_asset_paths(self, $_):'

# Locate the implementation of the `get_asset_paths` method in the `scene` class.
ast-grep --lang python --pattern 'def get_asset_paths(self):'
Length of output: 134

Script:
#!/bin/bash
# Search for the definition of the `update_asset_paths` method in the codebase.
rg 'def update_asset_paths'

# Search for the definition of the `get_asset_paths` method in the codebase.
rg 'def get_asset_paths'
Length of output: 214

Script:
#!/bin/bash
# Extract the implementation of the `update_asset_paths` method from `fiftyone/core/threed/scene_3d.py`.
sed -n '/def update_asset_paths(self, asset_rewrite_paths: dict):/,/^$/p' fiftyone/core/threed/scene_3d.py

# Extract the implementation of the `get_asset_paths` method from `fiftyone/core/threed/scene_3d.py`.
sed -n '/def get_asset_paths(self):/,/^$/p' fiftyone/core/threed/scene_3d.py
Length of output: 420

Script:
#!/bin/bash
# Extract the full implementation of the `update_asset_paths` method from `fiftyone/core/threed/scene_3d.py`.
sed -n '/def update_asset_paths(self, asset_rewrite_paths: dict):/,/^    def /p' fiftyone/core/threed/scene_3d.py

# Extract the full implementation of the `get_asset_paths` method from `fiftyone/core/threed/scene_3d.py`.
sed -n '/def get_asset_paths(self):/,/^    def /p' fiftyone/core/threed/scene_3d.py
Length of output: 2674
fiftyone/core/threed/object_3d.py (1)

Line range hint 29-230: Ensure the simplified list comprehension logic in _get_asset_paths method is efficient and error-free.

fiftyone/core/threed/scene_3d.py (1)

279-279: Ensure consistent handling of asset path updates.

Also applies to: 287-290, 296-296

tests/unittests/utils3d_tests.py (1)

378-380: Ensure absolute path conversion is tested for edge cases.

Also applies to: 386-387

tests/unittests/patches_tests.py (1)

662-720: LGTM! The new test method test_make_patches_dataset correctly sets up the test environment, performs the operation, and asserts expected properties.

docs/source/teams/cloud_media.rst (1)

Line range hint 571-589: Validate the implementation of clear_media for efficiency.

Ensure that the clear_media method efficiently handles large datasets and does not lead to performance bottlenecks, especially when dealing with cloud-backed media.

fiftyone/core/video.py (1)

634-634: Ensure new parameters are used correctly.

The new parameters persistent and _generated are correctly passed to the Dataset constructor. This aligns with the intended functionality enhancements described in the PR.

fiftyone/core/clips.py (1)

589-590: Review the implementation of make_clips_dataset for the new parameters persistent and _generated.

The addition of persistent and _generated parameters allows for better control over dataset persistence and generation behavior. Ensure that these parameters are documented properly in the function's docstring.

Also applies to: 655-656, 682-690, 692-695, 770-776
fiftyone/utils/utils3d.py (2)
835-835: Ensure proper path resolution for PCD files.

Verification successful

The search results show multiple instances where fos.abspath is used in conjunction with fos.join and os.path.dirname to resolve paths. This indicates that the code is consistently using these functions to handle path resolution, which should ensure proper handling of various path formats.

fiftyone/utils/utils3d.py:

Line 835: pcd_path = fos.abspath(fos.join(os.path.dirname(scene_path), pcd_path))

Other instances in the same file show similar usage patterns.

fiftyone/core/threed/scene_3d.py:

Line 5: path = fos.abspath(fos.join(root, path))

fiftyone/core/metadata.py:

Line 5: asset_path = fos.abspath(fos.join(scene_dir, asset_path))

These consistent patterns suggest that the path resolution logic is correctly implemented across the codebase.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify that the path resolution for PCD files correctly handles various path formats.

# Test: Check if `fos.abspath` correctly resolves different path formats.
fd --type f --exec bash -c 'readlink -f {}' | rg 'fiftyone/utils/utils3d.py'
Length of output: 99952

Script:
#!/bin/bash
# Description: Verify the usage of `fos.abspath` in the codebase to ensure it handles various path formats correctly.

# Search for all instances of `fos.abspath` in the codebase and display the surrounding lines for context.
rg 'fos\.abspath' -A 5 -B 5
Length of output: 2401
473-473: Ensure absolute path conversion handles edge cases.

Verification successful

The previous script failed because the readlink command was not found. To verify the absolute path conversion, we need to inspect the implementation of the fos.abspath function directly.

The previous script failed due to incorrect usage of the fd command. Let's correct the command and locate the implementation of the fos.abspath function.

The script successfully located the definition of the abspath function. Now, let's retrieve the full implementation to verify how it handles edge cases.

The implementation of the abspath function simply calls os.path.abspath, which converts a given path to an absolute path and resolves relative path indicators such as . and ... This should handle most edge cases, including symbolic links and relative paths.

os.path.abspath is a standard Python function that resolves relative paths and symbolic links correctly.

Given this, the absolute path conversion in the code should handle edge cases appropriately.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify that the absolute path conversion handles edge cases such as symbolic links and relative paths correctly.

# Test: Check if `fos.abspath` correctly resolves symbolic links and relative paths.
fd --type f --exec bash -c 'readlink -f {}' | rg 'fiftyone/utils/utils3d.py'
Length of output: 99952

Script:
#!/bin/bash
# Description: Verify the implementation of the `fos.abspath` function to ensure it handles edge cases such as symbolic links and relative paths.

# Locate the implementation of `fos.abspath`
fd --type f --name '*.py' --exec rg --pcre2 --multiline --multiline-dotall 'def abspath\(' {}
Length of output: 348

Script:
#!/bin/bash
# Description: Verify the implementation of the `fos.abspath` function to ensure it handles edge cases such as symbolic links and relative paths.

# Locate the implementation of `fos.abspath`
fd --type f -e py --exec rg --pcre2 --multiline --multiline-dotall 'def abspath\(' {}
Length of output: 108

Script:
#!/bin/bash
# Description: Retrieve the full implementation of the `abspath` function to ensure it handles edge cases such as symbolic links and relative paths.

# Extract the full implementation of `abspath`
fd --type f -e py --exec rg --pcre2 --multiline --multiline-dotall 'def abspath\(' {} -A 10
Length of output: 350
tests/intensive/cvat_tests.py (1)

309-309: Ensure the segment_size parameter is documented in the function's docstring.
tests/unittests/video_tests.py (2)
16-18: Imports are well-organized and relevant to the file's functionality.

3286-3354: The function test_make_clips_dataset effectively tests the creation of a clips dataset from event data. The checks for collection names and generated flags are appropriate. Similar to the previous function, consider enhancing the test by verifying the properties of the clips and frames within the dataset.
[REFACTOR_SUGGESTion]
# Additional assertions could be added to verify the clips' and frames' properties.
self.assertTrue(all("events" in clip for clip in clips_dataset))
fiftyone/utils/cvat.py (7)

3055-3056: Documentation for image_quality and use_cache parameters is clear and well-defined.

3077-3078: Documentation for task_name and occluded_attr parameters is clear and well-defined.

3675-3676: The implementation of job_annotation_url is straightforward and correct.

4270-4270: Polling logic for job IDs is implemented correctly.

4315-4316: Handling of filename ordering for older CVAT versions ensures backward compatibility.

4906-4917: Conditional handling in _get_job_ids for different server versions ensures compatibility and is implemented correctly.

4641-4798: Extensive changes related to job annotation processing and frame number remapping are well-structured. Consider reviewing performance implications due to the complexity.

fiftyone/core/stages.py (5)

7482-7485: The addition of _generated=True aligns with the enhanced dataset management features. Good integration with existing parameters.

7629-7632: Consistent use of _generated=True for evaluation patches dataset, enhancing control over dataset generation.

7789-7792: Proper implementation of _generated=True for clips dataset, aligning with the PR's enhancements in dataset management.

7926-7930: Inclusion of trajectories=True alongside _generated=True provides enhanced control and feature specificity in clips dataset creation.

8114-8116: Consistent application of _generated=True in frames dataset creation, aligning with the PR's focus on enhanced dataset management.

coderabbitai · 2024-06-06T14:46:08Z

fiftyone/core/video.py

+    persistent=False,
+    _generated=False,


Add documentation for new parameters.

The parameters persistent and _generated have been added to the function signature but are not documented in the docstring. Please update the docstring to include descriptions for these parameters to maintain clarity and completeness of the documentation.

coderabbitai · 2024-06-06T14:46:08Z

fiftyone/core/dataset.py

@@ -7806,7 +7806,7 @@ def _get_frames_pipeline(sample_collection):
        view = None

    if dataset._is_clips:
-        # Clips datasets use `sample_id` to associated with frames, but now as
+        # Clips datasets use `sample_id` to associate with frames, but now as


Complete the explanatory comment for clarity.

The comment "Clips datasets use sample_id to associate with frames, but now as" seems to be incomplete. Consider completing it to fully explain the change, for example:

- # Clips datasets use `sample_id` to associate with frames, but now as + # Clips datasets use `sample_id` to associate with frames, but now as a standalone collection, they must use `_id` for unique identification.

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change

# Clips datasets use `sample_id` to associate with frames, but now as

# Clips datasets use `sample_id` to associate with frames, but now as a standalone collection, they must use `_id` for unique identification.

coderabbitai · 2024-06-06T14:46:08Z

tests/unittests/video_tests.py

+    def test_make_frames_dataset(self):
+        dataset = fo.Dataset()
+
+        sample1 = fo.Sample(
+            filepath="video1.mp4",
+            metadata=fo.VideoMetadata(total_frame_count=4),
+            tags=["test"],
+            weather="sunny",
+        )
+        sample1.frames[1] = fo.Frame(filepath="frame11.jpg", hello="world")
+        sample1.frames[2] = fo.Frame(
+            filepath="frame12.jpg",
+            ground_truth=fo.Detections(
+                detections=[
+                    fo.Detection(label="cat"),
+                    fo.Detection(label="dog"),
+                ]
+            ),
+        )
+        sample1.frames[3] = fo.Frame(filepath="frame13.jpg", hello="goodbye")
+
+        sample2 = fo.Sample(
+            filepath="video2.mp4",
+            metadata=fo.VideoMetadata(total_frame_count=5),
+            tags=["test"],
+            weather="cloudy",
+        )
+        sample2.frames[1] = fo.Frame(
+            filepath="frame21.jpg",
+            hello="goodbye",
+            ground_truth=fo.Detections(
+                detections=[
+                    fo.Detection(label="dog"),
+                    fo.Detection(label="rabbit"),
+                ]
+            ),
+        )
+        sample2.frames[3] = fo.Frame(filepath="frame23.jpg")
+        sample2.frames[5] = fo.Frame(filepath="frame25.jpg", hello="there")
+
+        dataset.add_samples([sample1, sample2])
+
+        frames_view = dataset.to_frames()
+        frames_dataset = fov.make_frames_dataset(dataset)
+
+        self.assertNotEqual(
+            frames_dataset._sample_collection_name,
+            dataset._sample_collection_name,
+        )
+        self.assertIsNone(frames_dataset._frame_collection_name)
+        self.assertTrue(frames_view._is_generated)
+        self.assertFalse(frames_dataset._is_generated)
+        self.assertEqual(len(frames_dataset), dataset.count("frames"))
+        self.assertEqual(len(frames_dataset), len(frames_view))
+


The function test_make_frames_dataset correctly initializes a dataset and adds samples with video metadata and frames. The logic to create a frames view and a frames dataset is sound. However, consider adding assertions to verify the properties of the frames within the dataset to ensure they meet expected conditions.

# It might be beneficial to add assertions to check specific frame properties. self.assertTrue(all(isinstance(frame.hello, str) for frame in frames_dataset))

coderabbitai · 2024-06-06T14:46:08Z

fiftyone/core/patches.py

+    dataset = fod.Dataset(
+        name=name,
+        persistent=persistent,
+        _patches=_generated,
+        _frames=is_frame_patches and _generated,
+    )


Ensure consistent naming conventions for parameters.

- _patches=_generated, + generated=_generated,

The parameter _patches in the Dataset constructor call should likely be renamed to generated to match the naming convention used elsewhere for similar functionality. This change would improve code readability and maintain consistency.

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change

dataset = fod.Dataset(

name=name,

persistent=persistent,

_patches=_generated,

_frames=is_frame_patches and _generated,

)

dataset = fod.Dataset(

name=name,

persistent=persistent,

generated=_generated,

_frames=is_frame_patches and _generated,

)

coderabbitai · 2024-06-06T14:46:08Z

fiftyone/core/patches.py

+    dataset = fod.Dataset(
+        name=name,
+        persistent=persistent,
+        _patches=_generated,
+        _frames=is_frame_patches and _generated,
+    )


Ensure consistent naming conventions for parameters.

- _patches=_generated, + generated=_generated,

The parameter _patches in the Dataset constructor call should likely be renamed to generated to match the naming convention used elsewhere for similar functionality. This change would improve code readability and maintain consistency.

Committable suggestion was skipped due to low confidence.

coderabbitai · 2024-06-06T14:46:08Z

fiftyone/core/threed/scene_3d.py

@@ -186,7 +186,7 @@ def copy(self):
        return Scene._from_fo3d_dict(self.as_dict())

    def _resolve_node_asset_paths(self, node, fo3d_path_dir):
-        for asset_path_field in node._asset_path_fields or []:
+        for asset_path_field in node._asset_path_fields:


Refactor asset path resolution to a separate method for clarity.

- for asset_path_field in node._asset_path_fields: - asset_path = getattr(node, asset_path_field, None) - if asset_path: - resolved_asset_path = self._resolve_asset_path( - fo3d_path_dir, asset_path - ) - setattr(node, asset_path_field, resolved_asset_path) + self._resolve_asset_paths_for_node(node, fo3d_path_dir)

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change

for asset_path_field in node._asset_path_fields:

self._resolve_asset_paths_for_node(node, fo3d_path_dir)

coderabbitai · 2024-06-06T14:46:08Z

fiftyone/core/threed/scene_3d.py

@@ -320,28 +323,28 @@
        Returns:
            a list of asset paths
        """
-        asset_paths = list(
+        asset_paths = set(


Optimize asset path collection by using a more efficient data structure.

- asset_paths = set( - itertools.chain.from_iterable( - node._get_asset_paths() for node in self.traverse() - ) - ) - if self.background is not None: - if self.background.image is not None: - asset_paths.add(self.background.image) - if self.background.cube is not None: - asset_paths.update(self.background.cube) + asset_paths = self.collect_asset_paths()

Also applies to: 334-334, 336-336

Committable suggestion was skipped due to low confidence.

coderabbitai · 2024-06-06T14:46:08Z

fiftyone/core/threed/scene_3d.py


    def _resolve_asset_path(self, root: str, path: str):
        if path is None:
            return None

        if not fos.isabs(path):
-            path = fos.join(root, path)
+            path = fos.abspath(fos.join(root, path))


Tip
Codebase Verification

Ensure path resolution handles edge cases correctly.

The codebase does not appear to handle symbolic links (os.path.islink).

Relative paths are handled in various files using os.path.relpath.

Consider adding symbolic link handling to ensure robust path resolution.

Analysis chain

Ensure path resolution handles edge cases correctly.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash # Description: Verify that path resolution handles edge cases like symbolic links and relative paths correctly. # Test: Check for symbolic link handling. rg --type python $'os.path.islink'

Length of output: 64

Script:

#!/bin/bash # Description: Verify that path resolution handles edge cases like symbolic links and relative paths correctly. # Test: Check for symbolic link handling. rg --glob '*.py' 'os.path.islink' # Test: Check for relative path handling. rg --glob '*.py' 'os.path.relpath'

Length of output: 1876

coderabbitai · 2024-06-06T14:46:09Z

fiftyone/core/metadata.py

@@ -73,76 +71,12 @@ def _build_for_url(cls, url, mime_type=None):
            mime_type = etau.guess_mime_type(url)

        with requests.get(url, stream=True) as r:
+            r.raise_for_status()


Ensure proper error handling in URL metadata fetching.

- r.raise_for_status() + if r.status_code != 200: + raise ValueError(f"Failed to fetch URL with status code {r.status_code}")

Also applies to: 139-139

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change

r.raise_for_status()

if r.status_code != 200:

raise ValueError(f"Failed to fetch URL with status code {r.status_code}")

coderabbitai · 2024-06-06T14:46:09Z

fiftyone/core/metadata.py


    try:
-        return _get_metadata(filepath, media_type)
+        return _get_metadata(filepath, media_type, cache=cache)
    except:


Avoid using bare except to handle exceptions more gracefully.

- except: + except Exception as e: + logger.error(f"Failed to compute metadata: {str(e)}") + return None

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change

except:

except Exception as e:

logger.error(f"Failed to compute metadata: {str(e)}")

return None

Tools

Ruff

561-561: Do not use bare except (E722)

benjaminpkane · 2024-06-06T14:58:21Z

Closing in favor of #4459

brimoor and others added 17 commits June 5, 2024 11:04

allow generated datasets to be directly created

cbb7c00

load individual cvat job annotations instead of entire task

6cd7b0c

update test to create multiple jobs

b726dcb

lint

88627a4

add ultralytics yolov8 open images v7 pretrained models into zoo

584f6b6

handle partial maps when updating Scene asset paths

8883b5a

adding unit test for updating asset paths

56fc86b

optimize 3D operations

3914ced

consistent structure

1640696

cleanup

6e663c3

abspath vs realpath

4d74e08

consistency

1dd6b80

abspath

94c52ba

updating Teams docs

2350e74

fixing anomaly detection tutorial title level

74cc85d

use existing merge branch (#4453)

1d76724

Merge pull request #4455 from voxel51/cherry-pick-develop

e17741c

Cherry picking PRs into v0.24.1

coderabbitai bot reviewed Jun 6, 2024

View reviewed changes

benjaminpkane closed this Jun 6, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Merge `release/v0.24.1` #4458

Merge `release/v0.24.1` #4458

benjaminpkane commented Jun 6, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jun 6, 2024 •

edited

Loading

Chat

CodeRabbit Commands (invoked as PR comments)

CodeRabbit Configration File (`.coderabbit.yaml`)

Documentation and Community

coderabbitai bot left a comment

coderabbitai bot Jun 6, 2024

coderabbitai bot Jun 6, 2024

coderabbitai bot Jun 6, 2024

coderabbitai bot Jun 6, 2024

coderabbitai bot Jun 6, 2024

coderabbitai bot Jun 6, 2024

coderabbitai bot Jun 6, 2024

coderabbitai bot Jun 6, 2024

coderabbitai bot Jun 6, 2024

coderabbitai bot Jun 6, 2024

benjaminpkane commented Jun 6, 2024

	# Clips datasets use `sample_id` to associate with frames, but now as
	# Clips datasets use `sample_id` to associate with frames, but now as a standalone collection, they must use `_id` for unique identification.

	for asset_path_field in node._asset_path_fields:
	self._resolve_asset_paths_for_node(node, fo3d_path_dir)

	r.raise_for_status()
	if r.status_code != 200:
	raise ValueError(f"Failed to fetch URL with status code {r.status_code}")

-    except:
+    except Exception as e:
+        logger.error(f"Failed to compute metadata: {str(e)}")
+        return None

Merge release/v0.24.1 #4458

Merge release/v0.24.1 #4458

Conversation

benjaminpkane commented Jun 6, 2024 • edited by coderabbitai bot Loading

Summary by CodeRabbit

coderabbitai bot commented Jun 6, 2024 • edited Loading

Walkthrough

Changes

Sequence Diagram(s) (Beta)

Poem

Chat

CodeRabbit Commands (invoked as PR comments)

CodeRabbit Configration File (.coderabbit.yaml)

Documentation and Community

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Jun 6, 2024

Choose a reason for hiding this comment

coderabbitai bot Jun 6, 2024

Choose a reason for hiding this comment

coderabbitai bot Jun 6, 2024

Choose a reason for hiding this comment

coderabbitai bot Jun 6, 2024

Choose a reason for hiding this comment

coderabbitai bot Jun 6, 2024

Choose a reason for hiding this comment

coderabbitai bot Jun 6, 2024

Choose a reason for hiding this comment

coderabbitai bot Jun 6, 2024

Choose a reason for hiding this comment

coderabbitai bot Jun 6, 2024

Choose a reason for hiding this comment

coderabbitai bot Jun 6, 2024

Choose a reason for hiding this comment

coderabbitai bot Jun 6, 2024

Choose a reason for hiding this comment

benjaminpkane commented Jun 6, 2024

Merge `release/v0.24.1` #4458

Merge `release/v0.24.1` #4458

benjaminpkane commented Jun 6, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jun 6, 2024 •

edited

Loading

CodeRabbit Configration File (`.coderabbit.yaml`)