Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge release/v0.24.1 #4458

Closed
wants to merge 17 commits into from
Closed

Merge release/v0.24.1 #4458

wants to merge 17 commits into from

Conversation

benjaminpkane
Copy link
Contributor

@benjaminpkane benjaminpkane commented Jun 6, 2024

Summary by CodeRabbit

  • New Features

    • Added a new method for downloading 3D scene assets in the fiftyone package.
    • Introduced parameters for controlling dataset persistence and generation behavior in various dataset creation functions.
  • Bug Fixes

    • Corrected a comment related to sample_id association with frames in clips datasets.
  • Documentation

    • Updated documentation to include new methods and parameters.
    • Adjusted header levels in the anomaly detection tutorial for better content presentation.
  • Refactor

    • Improved handling of file paths and asset paths across multiple modules for consistency and robustness.
    • Renamed and simplified functions in the storage module.
  • Tests

    • Added new test methods to verify the functionality of patch generation and asset path updates.
    • Modified existing tests to ensure accurate path handling and dataset persistence.

Copy link
Contributor

coderabbitai bot commented Jun 6, 2024

Walkthrough

The changes encompass various enhancements and fixes across multiple files. Key updates include adding Git commands to a GitHub Actions workflow, introducing new methods and parameters in the fiftyone package for better dataset management, refining path handling, and improving test coverage. These modifications aim to enhance functionality, improve efficiency, and ensure robustness in handling datasets, particularly 3D scene assets, video frames, and patches.

Changes

Files & Paths Change Summaries
.github/workflows/push-release.yml Added Git commands to handle branch merging and pushing.
docs/source/teams/cloud_media.rst Added download_scenes method and include_assets parameter in fo.Dataset class.
docs/source/tutorials/anomaly_detection.ipynb Updated header level in a markdown cell from "##" to "#".
fiftyone/core/clips.py Added persistent and _generated parameters to make_clips_dataset.
fiftyone/core/dataset.py Corrected a comment in _get_frames_pipeline function.
fiftyone/core/metadata.py Refactored SceneMetadata class and updated import statements.
fiftyone/core/patches.py Added persistent and _generated parameters to make_patches_dataset and make_evaluation_patches_dataset.
fiftyone/core/stages.py Added _generated=True parameter to various function calls in load_view method.
fiftyone/core/storage.py Renamed resolve function to realpath and updated related docstrings.
fiftyone/core/threed/object_3d.py Modified Object3D class declaration and updated _asset_path_fields attribute.
fiftyone/core/threed/scene_3d.py Adjusted asset path handling in Scene class.
fiftyone/core/video.py Added persistent and _generated parameters to make_frames_dataset.
fiftyone/utils/cvat.py Refactored CVATBackendConfig and improved job retrieval logic in upload_data function.
fiftyone/utils/data/exporters.py Refactored logic related to dataset exporters.
fiftyone/utils/utils3d.py Refactored file path handling within functions for consistent path resolution.
tests/intensive/cvat_tests.py Added segment_size=1 parameter to test_detection_labelling method.
tests/unittests/metadata_tests.py Modified logic for counting size_bytes and asset_counts in test_build_for method.
tests/unittests/patches_tests.py Added test method test_make_patches_dataset and imported fiftyone.core.patches.
tests/unittests/threed/scene_3d_tests.py Added test_update_asset_paths method and modified path resolution in test_write_resolve_relative.
tests/unittests/utils3d_tests.py Removed os.path.realpath calls in test_get_scene_asset_paths function.
tests/unittests/video_tests.py Added imports for fiftyone.core.clips.

Sequence Diagram(s) (Beta)

sequenceDiagram
    participant GitHub Actions
    participant Branch
    participant Git
    participant Develop Branch

    GitHub Actions->>Git: Checkout merge branch
    Git->>Branch: Create or switch to merge branch
    GitHub Actions->>Git: Push changes to merge branch
    Git->>Branch: Push changes
    GitHub Actions->>Git: Pull changes from merge branch into develop branch
    Git->>Develop Branch: Merge changes
Loading

Poem

In the land of code, where datasets grow,
New methods sprout, and functions flow.
With paths resolved and assets clear,
Efficiency blooms, there's nothing to fear.
From patches to scenes, improvements abound,
In FiftyOne's realm, progress is found.
🐇✨


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

Outside diff range and nitpick comments (35)
fiftyone/core/storage.py (3)

Line range hint 8-8: Remove unused import contextlib.contextmanager.

- from contextlib import contextmanager

Line range hint 284-311: Ensure consistent documentation and functionality for realpath and abspath.

The documentation for realpath and abspath is almost identical, and their functionality is very similar. Consider merging these functions if their usage throughout the codebase allows, or at least align their documentation to clearly distinguish their use cases.


Line range hint 882-882: Avoid using bare except.

Replace the bare except with more specific exception handling to improve error diagnostics and maintainability.

- except Exception as e:
+ except (FileNotFoundError, IOError) as e:
docs/source/teams/cloud_media.rst (3)

Line range hint 394-461: Ensure consistency in method documentation for download_scenes.

The method download_scenes shares a lot of similarities with other download methods. Ensure that the documentation and parameters are consistent across all similar methods to maintain a uniform interface.


Line range hint 470-509: Clarify the usage of download_context in batch processing.

The documentation for download_context could be enhanced by providing examples or more detailed explanations of how batch_size and target_size_bytes affect the download process. This would help users better understand how to use this method effectively.


[!TIP]
Codebase Verification

The include_assets parameter is consistently present in the function signatures and docstrings, but there is no evidence from the search results that it is actually used within the function bodies. This suggests that the parameter might be redundant.

Consider reviewing the implementation of these functions to determine if the include_assets parameter is necessary. If it is not used, it may be beneficial to remove it to simplify the interface.

Analysis chain

Line range hint 520-551: Review the necessity of the include_assets parameter in get_local_paths.

Check if the include_assets parameter is used consistently across functions that deal with local path retrieval and asset management. If it's not uniformly needed, consider simplifying the interface by removing it where unnecessary.


The include_assets parameter is consistently used across multiple functions in the docs/source/teams/cloud_media.rst file. This suggests that it is a standard part of the interface for handling media assets. However, to ensure its necessity, we should verify if this parameter is actually utilized in the implementation of these functions.


To ensure thorough verification, we need to search for the usage of the include_assets parameter within the function bodies.

Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Verify if the `include_assets` parameter is consistently used or needed across similar functions.
grep -rnw 'include_assets' -A 5 -B 5

Length of output: 8945



Script:

#!/bin/bash
# Search for the implementation of functions that use the `include_assets` parameter to verify its necessity.
ast-grep --lang python --pattern $'def $_($_, include_assets=$_, $$$):\n    $$$' --json

Length of output: 90



Script:

#!/bin/bash
# Search for the usage of the `include_assets` parameter within the function bodies.
rg 'include_assets' -A 10 -B 10

Length of output: 12341

fiftyone/core/patches.py (4)

Line range hint 561-592: Consider documenting the new parameters in the function signature.

It would enhance clarity and maintainability if the newly added parameters persistent and _generated in the make_patches_dataset function were documented in the docstring. This helps other developers understand the purpose and usage of these parameters without digging into the code.


Line range hint 654-654: Use direct truthiness checks for boolean expressions.

- if other_fields == True:
-     other_fields = [f for f in src_schema if f not in curr_schema]
+ if other_fields:
+     other_fields = [f for f in src_schema if f not in curr_schema]

Instead of comparing other_fields to True, you can use the truthiness of the value directly. This is more Pythonic and reduces unnecessary complexity in the condition.


Line range hint 683-738: Document the new parameters in the function signature.

The function make_evaluation_patches_dataset has new parameters persistent and _generated that are not documented in the docstring. Adding a brief explanation about these parameters would help maintain the documentation's usefulness and completeness.


Line range hint 805-805: Use direct truthiness checks for boolean expressions.

- if other_fields == True:
-     other_fields = [f for f in src_schema if f not in curr_schema]
+ if other_fields:
+     other_fields = [f for f in src_schema if f not in curr_schema]

Instead of comparing other_fields to True, you can use the truthiness of the value directly. This is more Pythonic and reduces unnecessary complexity in the condition.

fiftyone/core/video.py (4)

Line range hint 617-617: Use Pythonic conditional checks.

- if sample_frames != True:
+ if not sample_frames:

Replace if sample_frames != True: with if not sample_frames: for a more Pythonic and readable conditional check.


Line range hint 618-618: Clarify variable naming.

- l = locals()
+ local_vars = locals()

The variable name l is ambiguous and can be confused with the number 1. Consider renaming it to local_vars for better readability.


Line range hint 707-707: Use Pythonic conditional checks.

- if sample_frames == False:
+ if not sample_frames:
- if sample_frames == True:
+ if sample_frames:

Replace equality comparisons to True or False with the more Pythonic if sample_frames: or if not sample_frames: respectively.

Also applies to: 741-741, 748-748, 776-776, 810-810, 813-813, 858-858


Line range hint 789-789: Use is not None for None checks.

- if src_dataset.has_frame_field("filepath") != None:
+ if src_dataset.has_frame_field("filepath") is not None:

For checking against None, use is not None instead of != None for better readability and conformance to Python best practices.

fiftyone/core/clips.py (4)

Line range hint 119-119: Replace the bare except with a specific exception type.

- except:
+ except ExceptionType:  # Replace ExceptionType with the specific exception you are catching

This change will make the error handling more precise and avoid catching unintended exceptions.


Line range hint 542-542: Use direct truthiness checks for boolean conditions.

- if other_fields == True:
- if other_fields == True:
+ if other_fields:
+ if other_fields:

This simplifies the condition checks and adheres to Pythonic best practices.

Also applies to: 723-723


Line range hint 1055-1055: Use is not None for None checks.

- if not frame_numbers:
+ if frame_numbers is not None:

This change ensures that the check is explicit and clear, improving code readability and correctness.


Line range hint 1125-1125: Clarify ambiguous variable name l.

- for s, l in zip(frame_numbers, bools):
+ for start, last in zip(frame_numbers, bools):

Renaming l to last improves code readability and avoids confusion with the number 1.

tests/intensive/cvat_tests.py (6)

Line range hint 13-14: Remove unused imports to clean up the code.

- from collections import defaultdict
- import numpy as np

Line range hint 380-381: Local variables api and task_id are assigned but never used in the test_multiple_fields method.

- api = results.connect_to_api()
- task_id = results.task_ids[0]

Line range hint 694-694: Local variable person_labels is assigned but never used in the test_example_restricting_label_edits method.

- person_labels = view.filter_labels(
-     "ground_truth", F("label") == "person"
- ).values("ground_truth.detections", unwind=True)

Line range hint 816-816: Local variable status is assigned but never used in the test_deleted_tasks method.

- status = results.get_status()

Line range hint 892-892: Avoid using single-letter variable names like l to improve code readability.

- label_map = {l: l.upper() for l in labels}
+ label_map = {label: label.upper() for label in labels}

Line range hint 1327-1327: Local variable results is assigned but never used in the test_frame_start_stop_step method.

- results = dataset.annotate(
-     anno_key,
-     backend="cvat",
-     label_field="frames.detections_new_2",
-     label_type="detections",
-     classes=["test"],
-     frame_start=[start],
-     frame_stop=[100],
-     frame_step=[step],
- )
tests/unittests/video_tests.py (2)

Line range hint 1040-1040: Replace lambda expression with a function definition for clarity and maintainability.

# Replace lambda with function definition
def filepath_fcn(sample):
    return sample.filepath

Line range hint 2689-2689: Ambiguous variable name 'l' used, consider renaming for clarity.

# Example of renaming variable 'l' to 'label'
for _id, label, i, s in zip(
    *trajectories.values(
        [
            "sample_id",
            "detections.label",
            "detections.index",
            "support",
        ]
    )
):
    expected_trajs_map = {
        (sample1.id, "cat", 1): [1, 3],
        (sample1.id, "dog", 1): [3, 4],
        (sample2.id, "cat", 1): [2, 3],
        (sample2.id, "dog", 2): [2, 2],
        (sample2.id, "dog", 3): [3, 5],
    }

Also applies to: 2954-2954, 3256-3256

fiftyone/utils/data/exporters.py (9)

Line range hint 543-543: Avoid using bare except statements.

- except:
+ except Exception as e:

Line range hint 573-573: Avoid using bare except statements.

- except:
+ except Exception as e:

Line range hint 616-616: Avoid using bare except statements.

- except:
+ except Exception as e:

Line range hint 651-651: Avoid using bare except statements.

- except:
+ except Exception as e:

Line range hint 906-906: Avoid using bare except statements.

- except:
+ except Exception as e:

Line range hint 3818-3818: Use direct truth checks instead of comparing to False.

- if include_confidence == False:
+ if not include_confidence:

Line range hint 3818-3818: Use direct truth checks instead of comparing to False.

- if include_attributes == False:
+ if not include_attributes:

Line range hint 3944-3944: Use direct truth checks instead of comparing to True.

- if include_confidence == True:
+ if include_confidence:

Line range hint 3949-3949: Use direct truth checks instead of comparing to True.

- if include_attributes == True:
+ if include_attributes:
Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

Commits

Files that changed from the base of the PR and between f633834 and e17741c.

Files ignored due to path filters (1)
  • fiftyone/zoo/models/manifest-torch.json is excluded by !**/*.json
Files selected for processing (21)
  • .github/workflows/push-release.yml (1 hunks)
  • docs/source/teams/cloud_media.rst (12 hunks)
  • docs/source/tutorials/anomaly_detection.ipynb (1 hunks)
  • fiftyone/core/clips.py (4 hunks)
  • fiftyone/core/dataset.py (1 hunks)
  • fiftyone/core/metadata.py (7 hunks)
  • fiftyone/core/patches.py (6 hunks)
  • fiftyone/core/stages.py (5 hunks)
  • fiftyone/core/storage.py (3 hunks)
  • fiftyone/core/threed/object_3d.py (3 hunks)
  • fiftyone/core/threed/scene_3d.py (3 hunks)
  • fiftyone/core/video.py (4 hunks)
  • fiftyone/utils/cvat.py (7 hunks)
  • fiftyone/utils/data/exporters.py (6 hunks)
  • fiftyone/utils/utils3d.py (3 hunks)
  • tests/intensive/cvat_tests.py (1 hunks)
  • tests/unittests/metadata_tests.py (1 hunks)
  • tests/unittests/patches_tests.py (2 hunks)
  • tests/unittests/threed/scene_3d_tests.py (2 hunks)
  • tests/unittests/utils3d_tests.py (1 hunks)
  • tests/unittests/video_tests.py (3 hunks)
Files skipped from review due to trivial changes (1)
  • docs/source/tutorials/anomaly_detection.ipynb
Additional context used
Ruff
tests/unittests/metadata_tests.py

8-8: json imported but unused (F401)

fiftyone/core/metadata.py

561-561: Do not use bare except (E722)

tests/unittests/patches_tests.py

171-171: Ambiguous variable name: l (E741)


475-475: Ambiguous variable name: l (E741)


537-537: Avoid equality comparisons to True; use if F("crowd"): for truth checks (E712)


579-579: Avoid equality comparisons to True; use if F("crowd"): for truth checks (E712)

fiftyone/core/storage.py

8-8: contextlib.contextmanager imported but unused (F401)


882-882: Do not use bare except (E722)

fiftyone/core/patches.py

285-285: Ambiguous variable name: l (E741)


654-654: Avoid equality comparisons to True; use if other_fields: for truth checks (E712)


805-805: Avoid equality comparisons to True; use if other_fields: for truth checks (E712)

fiftyone/core/video.py

617-617: Avoid inequality comparisons to True; use if not sample_frames: for false checks (E712)


618-618: Ambiguous variable name: l (E741)


707-707: Avoid equality comparisons to False; use if not sample_frames: for false checks (E712)


741-741: Avoid inequality comparisons to False; use if sample_frames: for truth checks (E712)


748-748: Avoid equality comparisons to True; use if sample_frames: for truth checks (E712)


776-776: Avoid equality comparisons to True; use if sample_frames: for truth checks (E712)


789-789: Comparison to None should be cond is not None (E711)


810-810: Avoid inequality comparisons to False; use if sample_frames: for truth checks (E712)


813-813: Avoid equality comparisons to True; use if sample_frames: for truth checks (E712)


858-858: Avoid equality comparisons to True; use if sample_frames: for truth checks (E712)


1032-1032: Avoid equality comparisons to False; use if not sample_frames: for false checks (E712)


1037-1037: Avoid inequality comparisons to True; use if not sample_frames: for false checks (E712)

fiftyone/core/clips.py

119-119: Do not use bare except (E722)


542-542: Avoid equality comparisons to True; use if other_fields: for truth checks (E712)


723-723: Avoid equality comparisons to True; use if other_fields: for truth checks (E712)


1055-1055: Comparison to None should be cond is not None (E711)


1125-1125: Ambiguous variable name: l (E741)

tests/intensive/cvat_tests.py

13-13: collections.defaultdict imported but unused (F401)


14-14: numpy imported but unused (F401)


380-380: Local variable api is assigned to but never used (F841)


381-381: Local variable task_id is assigned to but never used (F841)


694-694: Local variable person_labels is assigned to but never used (F841)


816-816: Local variable status is assigned to but never used (F841)


892-892: Ambiguous variable name: l (E741)


1327-1327: Local variable results is assigned to but never used (F841)

tests/unittests/video_tests.py

1040-1040: Do not assign a lambda expression, use a def (E731)


2689-2689: Ambiguous variable name: l (E741)


2954-2954: Ambiguous variable name: l (E741)


3256-3256: Ambiguous variable name: l (E741)

fiftyone/utils/data/exporters.py

543-543: Do not use bare except (E722)


573-573: Do not use bare except (E722)


616-616: Do not use bare except (E722)


651-651: Do not use bare except (E722)


906-906: Do not use bare except (E722)


3818-3818: Avoid inequality comparisons to False; use if include_confidence: for truth checks (E712)


3818-3818: Avoid inequality comparisons to False; use if include_attributes: for truth checks (E712)


3944-3944: Avoid equality comparisons to True; use if include_confidence: for truth checks (E712)


3949-3949: Avoid equality comparisons to True; use if include_attributes: for truth checks (E712)

fiftyone/utils/cvat.py

3440-3440: Do not use bare except (E722)


3475-3475: Do not use bare except (E722)


3989-3989: Local variable response is assigned to but never used (F841)


4152-4152: Local variable response is assigned to but never used (F841)


6078-6078: Local variable classes is assigned to but never used (F841)


6723-6723: Local variable formatted_track is assigned to but never used (F841)


6859-6859: Do not use bare except (E722)


6936-6936: Do not use bare except (E722)


7412-7412: Do not use bare except (E722)


7421-7421: Do not use bare except (E722)


7426-7426: Do not use bare except (E722)


7832-7832: Do not use bare except (E722)

fiftyone/core/stages.py

946-954: Do not assign a lambda expression, use a def (E731)


956-970: Do not assign a lambda expression, use a def (E731)


1540-1540: Comparison to None should be cond is not None (E711)


2504-2504: Do not assign a lambda expression, use a def (E731)


2509-2509: Do not assign a lambda expression, use a def (E731)


2568-2568: Comparison to None should be cond is not None (E711)


2570-2570: Comparison to None should be cond is not None (E711)


2577-2577: Comparison to None should be cond is not None (E711)


2588-2588: Comparison to None should be cond is not None (E711)


2752-2752: Comparison to None should be cond is not None (E711)


3920-3920: Comparison to None should be cond is not None (E711)


5178-5178: Comparison to None should be cond is not None (E711)


6620-6620: Comparison to None should be cond is not None (E711)


7270-7270: Do not use bare except (E722)


7282-7282: Do not use bare except (E722)


7290-7290: Do not use bare except (E722)


8284-8284: Do not use bare except (E722)

fiftyone/core/dataset.py

40-40: fiftyone.core.odm.dataset.SampleFieldDocument imported but unused (F401)


236-236: Do not use bare except (E722)


342-342: Do not use bare except (E722)


3325-3325: Ambiguous variable name: l (E741)


3341-3341: Ambiguous variable name: l (E741)


3384-3384: Ambiguous variable name: l (E741)


6733-6733: Avoid equality comparisons to False; use if not attach_frames: for false checks (E712)


7112-7112: Do not use bare except (E722)


7122-7122: Do not use bare except (E722)


7282-7282: Do not assign a lambda expression, use a def (E731)


7282-7282: Ambiguous variable name: l (E741)


7576-7576: Do not use bare except (E722)


8396-8396: Do not assign a lambda expression, use a def (E731)

Additional comments not posted (28)
.github/workflows/push-release.yml (1)

24-28: Ensure the new git commands in the workflow are secure and efficient.

Verification successful

The git commands in the workflow file appear to be standard operations for managing branches and merging changes. They are used correctly and follow best practices for git operations.

  • git checkout merge/${{ inputs.ref_name || github.ref_name }} \ || git checkout -b merge/${{ inputs.ref_name || github.ref_name }}: This command checks out the specified branch or creates a new one if it doesn't exist.
  • git push -u origin merge/${{ inputs.ref_name || github.ref_name }}: This command pushes the branch to the remote repository.
  • git checkout develop: This command checks out the develop branch.
  • git pull origin merge/${{ inputs.ref_name || github.ref_name }} --no-rebase: This command pulls changes from the specified branch without rebasing.
  • git pull origin ${{ inputs.ref_name || github.ref_name }} --no-rebase: This command pulls changes from the specified branch without rebasing.

These commands are used correctly and do not pose any security risks.

Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify the security settings for git operations in the workflow.

# Test: Check for any potential security advisories related to the used git commands.
gh security advisories git

Length of output: 321



Script:

#!/bin/bash
# Description: Verify the correctness and security of git commands in the workflow file.

# Search for the usage of git commands in the workflow file.
rg 'git checkout|git push|git pull' .github/workflows/push-release.yml

Length of output: 496

tests/unittests/metadata_tests.py (1)

33-49: Ensure that the logic for counting size_bytes and asset_counts correctly handles duplicates.

tests/unittests/threed/scene_3d_tests.py (2)

119-132: Ensure that relative paths are correctly resolved in the test_write_resolve_relative method.


74-102: Verify that the updated asset paths are correctly handled in various scenarios.

Verification successful

The update_asset_paths and get_asset_paths methods in fiftyone/core/threed/scene_3d.py appear to be correctly implemented to handle asset path updates and retrievals. The update_asset_paths method updates asset paths based on a provided dictionary, and the get_asset_paths method returns a list of all asset paths in the scene.

  • update_asset_paths method:

    • Iterates through nodes and updates asset paths if they exist in the provided dictionary.
    • Also updates background image and cube paths if they exist in the dictionary.
  • get_asset_paths method:

    • Collects asset paths from nodes and background images/cubes without resolving relative paths to absolute paths.

These implementations align with the test case in tests/unittests/threed/scene_3d_tests.py, ensuring that asset paths are correctly updated and retrieved.

Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify that asset paths are correctly updated across different file systems and configurations.

# Test: Check the behavior of asset path updates on different operating systems.
echo "Run tests on Windows, Linux, and macOS to ensure consistent behavior."

Length of output: 148



Script:

#!/bin/bash
# Locate the implementation of the `update_asset_paths` method in the `scene` class.
ast-grep --lang python --pattern $'class $_ {
  $$$
  def update_asset_paths(self, $_):
    $$$
  $$$
}'

# Locate the implementation of the `get_asset_paths` method in the `scene` class.
ast-grep --lang python --pattern $'class $_ {
  $$$
  def get_asset_paths(self):
    $$$
  $$$
}'

Length of output: 1179



Script:

#!/bin/bash
# Locate the implementation of the `update_asset_paths` method in the `scene` class.
ast-grep --lang python --pattern 'class $_ {
  def update_asset_paths(self, $_):
    $$$
}'

# Locate the implementation of the `get_asset_paths` method in the `scene` class.
ast-grep --lang python --pattern 'class $_ {
  def get_asset_paths(self):
    $$$
}'

Length of output: 1131



Script:

#!/bin/bash
# Locate the implementation of the `update_asset_paths` method in the `scene` class.
ast-grep --lang python --pattern 'def update_asset_paths(self, $_):'

# Locate the implementation of the `get_asset_paths` method in the `scene` class.
ast-grep --lang python --pattern 'def get_asset_paths(self):'

Length of output: 134



Script:

#!/bin/bash
# Search for the definition of the `update_asset_paths` method in the codebase.
rg 'def update_asset_paths'

# Search for the definition of the `get_asset_paths` method in the codebase.
rg 'def get_asset_paths'

Length of output: 214



Script:

#!/bin/bash
# Extract the implementation of the `update_asset_paths` method from `fiftyone/core/threed/scene_3d.py`.
sed -n '/def update_asset_paths(self, asset_rewrite_paths: dict):/,/^$/p' fiftyone/core/threed/scene_3d.py

# Extract the implementation of the `get_asset_paths` method from `fiftyone/core/threed/scene_3d.py`.
sed -n '/def get_asset_paths(self):/,/^$/p' fiftyone/core/threed/scene_3d.py

Length of output: 420



Script:

#!/bin/bash
# Extract the full implementation of the `update_asset_paths` method from `fiftyone/core/threed/scene_3d.py`.
sed -n '/def update_asset_paths(self, asset_rewrite_paths: dict):/,/^    def /p' fiftyone/core/threed/scene_3d.py

# Extract the full implementation of the `get_asset_paths` method from `fiftyone/core/threed/scene_3d.py`.
sed -n '/def get_asset_paths(self):/,/^    def /p' fiftyone/core/threed/scene_3d.py

Length of output: 2674

fiftyone/core/threed/object_3d.py (1)

Line range hint 29-230: Ensure the simplified list comprehension logic in _get_asset_paths method is efficient and error-free.

fiftyone/core/threed/scene_3d.py (1)

279-279: Ensure consistent handling of asset path updates.

Also applies to: 287-290, 296-296

tests/unittests/utils3d_tests.py (1)

378-380: Ensure absolute path conversion is tested for edge cases.

Also applies to: 386-387

tests/unittests/patches_tests.py (1)

662-720: LGTM! The new test method test_make_patches_dataset correctly sets up the test environment, performs the operation, and asserts expected properties.

docs/source/teams/cloud_media.rst (1)

Line range hint 571-589: Validate the implementation of clear_media for efficiency.

Ensure that the clear_media method efficiently handles large datasets and does not lead to performance bottlenecks, especially when dealing with cloud-backed media.

fiftyone/core/video.py (1)

634-634: Ensure new parameters are used correctly.

The new parameters persistent and _generated are correctly passed to the Dataset constructor. This aligns with the intended functionality enhancements described in the PR.

fiftyone/core/clips.py (1)

589-590: Review the implementation of make_clips_dataset for the new parameters persistent and _generated.

The addition of persistent and _generated parameters allows for better control over dataset persistence and generation behavior. Ensure that these parameters are documented properly in the function's docstring.

Also applies to: 655-656, 682-690, 692-695, 770-776

fiftyone/utils/utils3d.py (2)

835-835: Ensure proper path resolution for PCD files.

Verification successful

The search results show multiple instances where fos.abspath is used in conjunction with fos.join and os.path.dirname to resolve paths. This indicates that the code is consistently using these functions to handle path resolution, which should ensure proper handling of various path formats.

  • fiftyone/utils/utils3d.py:

    • Line 835: pcd_path = fos.abspath(fos.join(os.path.dirname(scene_path), pcd_path))
    • Other instances in the same file show similar usage patterns.
  • fiftyone/core/threed/scene_3d.py:

    • Line 5: path = fos.abspath(fos.join(root, path))
  • fiftyone/core/metadata.py:

    • Line 5: asset_path = fos.abspath(fos.join(scene_dir, asset_path))

These consistent patterns suggest that the path resolution logic is correctly implemented across the codebase.

Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify that the path resolution for PCD files correctly handles various path formats.

# Test: Check if `fos.abspath` correctly resolves different path formats.
fd --type f --exec bash -c 'readlink -f {}' | rg 'fiftyone/utils/utils3d.py'

Length of output: 99952



Script:

#!/bin/bash
# Description: Verify the usage of `fos.abspath` in the codebase to ensure it handles various path formats correctly.

# Search for all instances of `fos.abspath` in the codebase and display the surrounding lines for context.
rg 'fos\.abspath' -A 5 -B 5

Length of output: 2401


473-473: Ensure absolute path conversion handles edge cases.

Verification successful

The previous script failed because the readlink command was not found. To verify the absolute path conversion, we need to inspect the implementation of the fos.abspath function directly.


The previous script failed due to incorrect usage of the fd command. Let's correct the command and locate the implementation of the fos.abspath function.


The script successfully located the definition of the abspath function. Now, let's retrieve the full implementation to verify how it handles edge cases.


The implementation of the abspath function simply calls os.path.abspath, which converts a given path to an absolute path and resolves relative path indicators such as . and ... This should handle most edge cases, including symbolic links and relative paths.

  • os.path.abspath is a standard Python function that resolves relative paths and symbolic links correctly.

Given this, the absolute path conversion in the code should handle edge cases appropriately.

Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify that the absolute path conversion handles edge cases such as symbolic links and relative paths correctly.

# Test: Check if `fos.abspath` correctly resolves symbolic links and relative paths.
fd --type f --exec bash -c 'readlink -f {}' | rg 'fiftyone/utils/utils3d.py'

Length of output: 99952



Script:

#!/bin/bash
# Description: Verify the implementation of the `fos.abspath` function to ensure it handles edge cases such as symbolic links and relative paths.

# Locate the implementation of `fos.abspath`
fd --type f --name '*.py' --exec rg --pcre2 --multiline --multiline-dotall 'def abspath\(' {}

Length of output: 348



Script:

#!/bin/bash
# Description: Verify the implementation of the `fos.abspath` function to ensure it handles edge cases such as symbolic links and relative paths.

# Locate the implementation of `fos.abspath`
fd --type f -e py --exec rg --pcre2 --multiline --multiline-dotall 'def abspath\(' {}

Length of output: 108



Script:

#!/bin/bash
# Description: Retrieve the full implementation of the `abspath` function to ensure it handles edge cases such as symbolic links and relative paths.

# Extract the full implementation of `abspath`
fd --type f -e py --exec rg --pcre2 --multiline --multiline-dotall 'def abspath\(' {} -A 10

Length of output: 350

tests/intensive/cvat_tests.py (1)

309-309: Ensure the segment_size parameter is documented in the function's docstring.

tests/unittests/video_tests.py (2)

16-18: Imports are well-organized and relevant to the file's functionality.


3286-3354: The function test_make_clips_dataset effectively tests the creation of a clips dataset from event data. The checks for collection names and generated flags are appropriate. Similar to the previous function, consider enhancing the test by verifying the properties of the clips and frames within the dataset.
[REFACTOR_SUGGESTion]

# Additional assertions could be added to verify the clips' and frames' properties.
self.assertTrue(all("events" in clip for clip in clips_dataset))
fiftyone/utils/cvat.py (7)

3055-3056: Documentation for image_quality and use_cache parameters is clear and well-defined.


3077-3078: Documentation for task_name and occluded_attr parameters is clear and well-defined.


3675-3676: The implementation of job_annotation_url is straightforward and correct.


4270-4270: Polling logic for job IDs is implemented correctly.


4315-4316: Handling of filename ordering for older CVAT versions ensures backward compatibility.


4906-4917: Conditional handling in _get_job_ids for different server versions ensures compatibility and is implemented correctly.


4641-4798: Extensive changes related to job annotation processing and frame number remapping are well-structured. Consider reviewing performance implications due to the complexity.

fiftyone/core/stages.py (5)

7482-7485: The addition of _generated=True aligns with the enhanced dataset management features. Good integration with existing parameters.


7629-7632: Consistent use of _generated=True for evaluation patches dataset, enhancing control over dataset generation.


7789-7792: Proper implementation of _generated=True for clips dataset, aligning with the PR's enhancements in dataset management.


7926-7930: Inclusion of trajectories=True alongside _generated=True provides enhanced control and feature specificity in clips dataset creation.


8114-8116: Consistent application of _generated=True in frames dataset creation, aligning with the PR's focus on enhanced dataset management.

Comment on lines +487 to +488
persistent=False,
_generated=False,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add documentation for new parameters.

The parameters persistent and _generated have been added to the function signature but are not documented in the docstring. Please update the docstring to include descriptions for these parameters to maintain clarity and completeness of the documentation.

@@ -7806,7 +7806,7 @@ def _get_frames_pipeline(sample_collection):
view = None

if dataset._is_clips:
# Clips datasets use `sample_id` to associated with frames, but now as
# Clips datasets use `sample_id` to associate with frames, but now as
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Complete the explanatory comment for clarity.

The comment "Clips datasets use sample_id to associate with frames, but now as" seems to be incomplete. Consider completing it to fully explain the change, for example:

- # Clips datasets use `sample_id` to associate with frames, but now as
+ # Clips datasets use `sample_id` to associate with frames, but now as a standalone collection, they must use `_id` for unique identification.
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
# Clips datasets use `sample_id` to associate with frames, but now as
# Clips datasets use `sample_id` to associate with frames, but now as a standalone collection, they must use `_id` for unique identification.

Comment on lines +2203 to +2257
def test_make_frames_dataset(self):
dataset = fo.Dataset()

sample1 = fo.Sample(
filepath="video1.mp4",
metadata=fo.VideoMetadata(total_frame_count=4),
tags=["test"],
weather="sunny",
)
sample1.frames[1] = fo.Frame(filepath="frame11.jpg", hello="world")
sample1.frames[2] = fo.Frame(
filepath="frame12.jpg",
ground_truth=fo.Detections(
detections=[
fo.Detection(label="cat"),
fo.Detection(label="dog"),
]
),
)
sample1.frames[3] = fo.Frame(filepath="frame13.jpg", hello="goodbye")

sample2 = fo.Sample(
filepath="video2.mp4",
metadata=fo.VideoMetadata(total_frame_count=5),
tags=["test"],
weather="cloudy",
)
sample2.frames[1] = fo.Frame(
filepath="frame21.jpg",
hello="goodbye",
ground_truth=fo.Detections(
detections=[
fo.Detection(label="dog"),
fo.Detection(label="rabbit"),
]
),
)
sample2.frames[3] = fo.Frame(filepath="frame23.jpg")
sample2.frames[5] = fo.Frame(filepath="frame25.jpg", hello="there")

dataset.add_samples([sample1, sample2])

frames_view = dataset.to_frames()
frames_dataset = fov.make_frames_dataset(dataset)

self.assertNotEqual(
frames_dataset._sample_collection_name,
dataset._sample_collection_name,
)
self.assertIsNone(frames_dataset._frame_collection_name)
self.assertTrue(frames_view._is_generated)
self.assertFalse(frames_dataset._is_generated)
self.assertEqual(len(frames_dataset), dataset.count("frames"))
self.assertEqual(len(frames_dataset), len(frames_view))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function test_make_frames_dataset correctly initializes a dataset and adds samples with video metadata and frames. The logic to create a frames view and a frames dataset is sound. However, consider adding assertions to verify the properties of the frames within the dataset to ensure they meet expected conditions.

# It might be beneficial to add assertions to check specific frame properties.
self.assertTrue(all(isinstance(frame.hello, str) for frame in frames_dataset))

Comment on lines +776 to +781
dataset = fod.Dataset(
name=name,
persistent=persistent,
_patches=_generated,
_frames=is_frame_patches and _generated,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ensure consistent naming conventions for parameters.

- _patches=_generated,
+ generated=_generated,

The parameter _patches in the Dataset constructor call should likely be renamed to generated to match the naming convention used elsewhere for similar functionality. This change would improve code readability and maintain consistency.

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
dataset = fod.Dataset(
name=name,
persistent=persistent,
_patches=_generated,
_frames=is_frame_patches and _generated,
)
dataset = fod.Dataset(
name=name,
persistent=persistent,
generated=_generated,
_frames=is_frame_patches and _generated,
)

Comment on lines +613 to +618
dataset = fod.Dataset(
name=name,
persistent=persistent,
_patches=_generated,
_frames=is_frame_patches and _generated,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ensure consistent naming conventions for parameters.

- _patches=_generated,
+ generated=_generated,

The parameter _patches in the Dataset constructor call should likely be renamed to generated to match the naming convention used elsewhere for similar functionality. This change would improve code readability and maintain consistency.

Committable suggestion was skipped due to low confidence.

@@ -186,7 +186,7 @@ def copy(self):
return Scene._from_fo3d_dict(self.as_dict())

def _resolve_node_asset_paths(self, node, fo3d_path_dir):
for asset_path_field in node._asset_path_fields or []:
for asset_path_field in node._asset_path_fields:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactor asset path resolution to a separate method for clarity.

- for asset_path_field in node._asset_path_fields:
-     asset_path = getattr(node, asset_path_field, None)
-     if asset_path:
-         resolved_asset_path = self._resolve_asset_path(
-             fo3d_path_dir, asset_path
-         )
-         setattr(node, asset_path_field, resolved_asset_path)
+ self._resolve_asset_paths_for_node(node, fo3d_path_dir)
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
for asset_path_field in node._asset_path_fields:
self._resolve_asset_paths_for_node(node, fo3d_path_dir)

@@ -320,28 +323,28 @@
Returns:
a list of asset paths
"""
asset_paths = list(
asset_paths = set(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optimize asset path collection by using a more efficient data structure.

- asset_paths = set(
-     itertools.chain.from_iterable(
-         node._get_asset_paths() for node in self.traverse()
-     )
- )
- if self.background is not None:
-     if self.background.image is not None:
-         asset_paths.add(self.background.image)
-     if self.background.cube is not None:
-         asset_paths.update(self.background.cube)
+ asset_paths = self.collect_asset_paths()

Also applies to: 334-334, 336-336

Committable suggestion was skipped due to low confidence.


def _resolve_asset_path(self, root: str, path: str):
if path is None:
return None

if not fos.isabs(path):
path = fos.join(root, path)
path = fos.abspath(fos.join(root, path))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tip

Codebase Verification

Ensure path resolution handles edge cases correctly.

  • The codebase does not appear to handle symbolic links (os.path.islink).
  • Relative paths are handled in various files using os.path.relpath.

Consider adding symbolic link handling to ensure robust path resolution.

Analysis chain

Ensure path resolution handles edge cases correctly.

Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify that path resolution handles edge cases like symbolic links and relative paths correctly.

# Test: Check for symbolic link handling.
rg --type python $'os.path.islink'

Length of output: 64



Script:

#!/bin/bash
# Description: Verify that path resolution handles edge cases like symbolic links and relative paths correctly.

# Test: Check for symbolic link handling.
rg --glob '*.py' 'os.path.islink'

# Test: Check for relative path handling.
rg --glob '*.py' 'os.path.relpath'

Length of output: 1876

@@ -73,76 +71,12 @@ def _build_for_url(cls, url, mime_type=None):
mime_type = etau.guess_mime_type(url)

with requests.get(url, stream=True) as r:
r.raise_for_status()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ensure proper error handling in URL metadata fetching.

- r.raise_for_status()
+ if r.status_code != 200:
+     raise ValueError(f"Failed to fetch URL with status code {r.status_code}")

Also applies to: 139-139

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
r.raise_for_status()
if r.status_code != 200:
raise ValueError(f"Failed to fetch URL with status code {r.status_code}")


try:
return _get_metadata(filepath, media_type)
return _get_metadata(filepath, media_type, cache=cache)
except:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid using bare except to handle exceptions more gracefully.

- except:
+ except Exception as e:
+     logger.error(f"Failed to compute metadata: {str(e)}")
+     return None
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
except:
except Exception as e:
logger.error(f"Failed to compute metadata: {str(e)}")
return None
Tools
Ruff

561-561: Do not use bare except (E722)

@benjaminpkane
Copy link
Contributor Author

Closing in favor of #4459

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants