Commit
* Fix typos and update docs + code for inter object segmentation
* Add object volume to annotated outputs
* Update camera documentation and fix camera intrinsics output
* Add Keypoint Annotation configuration documentation
* Update virtual environment activation instructions and add metadata output
* Add post-processing documentation for bounding boxes
* Update inter class segmentation configuration
* Update keypoint_annotation.md with instructions for using the Blender script
* Update Camera Plugin documentation with frustum visualization support
* Add camera frustum visualization settings to camera schema
* Refactor keypoint script and remove unused code
Showing 19 changed files with 327 additions and 40 deletions.
docs/docs/usage/job_description/config_descriptions/bounding_box.md (48 additions, 0 deletions)
# Postprocessing

Postprocessing operations are applied to the generated data after the scene rendering is complete. One common postprocessing task is generating bounding box annotations from the instance and semantic segmentation outputs.

## Bounding Box Generation

The `syclops_postprocessing_bounding_boxes` plugin is used to generate bounding box annotations in the YOLO format from the instance and semantic segmentation images.

```yaml
postprocessing:
  syclops_postprocessing_bounding_boxes:
    - type: "YOLO"
      classes_to_skip: [0, 1] # List of class ids to exclude from bounding boxes
      id: yolo_bound_boxes
      sources: ["main_cam_instance", "main_cam_semantic"] # Names of instance and semantic outputs
```

The key parameters are:

- `type`: The output format, in this case "YOLO" for the YOLO bounding box format.
- `classes_to_skip`: A list of class ids to exclude from the bounding box generation.
- `id`: A unique identifier for this postprocessing output.
- `sources`: The names of the instance and semantic segmentation outputs to use as sources.

### Algorithm

The bounding box generation algorithm works as follows:

1. Load the instance and semantic segmentation images for the current frame.
2. Create a mask of pixels to skip based on the `classes_to_skip` list.
3. Find all unique remaining instance ids after applying the skip mask.
4. For each instance id:
    - Find the class ids associated with that instance, excluding low pixel count classes.
    - If `multiple_bb_per_instance` is enabled, generate one bounding box per class id.
    - Otherwise, use the main class id and generate one bounding box.
5. Write the bounding boxes in YOLO format to an output file.

The bounding box coordinates are calculated from the pixel extents of each instance mask for the given class id(s).
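The steps above can be sketched with NumPy. This is a minimal, illustrative sketch, not the plugin's actual code: the function name, parameters, and the pixel-count threshold are assumptions introduced here.

```python
import numpy as np

def yolo_bounding_boxes(instance_img, semantic_img, classes_to_skip,
                        min_pixels=10, multiple_bb_per_instance=False):
    """Illustrative sketch of the algorithm described above.

    instance_img / semantic_img: 2D integer arrays of equal shape.
    Returns (class_id, x_center, y_center, width, height) tuples
    with coordinates normalized to the 0-1 range.
    """
    h, w = instance_img.shape
    # Step 2: mask pixels that belong to skipped classes
    skip_mask = np.isin(semantic_img, classes_to_skip)
    # Step 3: unique remaining instance ids
    instance_ids = np.unique(instance_img[~skip_mask])
    boxes = []
    for inst_id in instance_ids:  # Step 4
        inst_mask = (instance_img == inst_id) & ~skip_mask
        class_ids, counts = np.unique(semantic_img[inst_mask], return_counts=True)
        keep = counts >= min_pixels  # drop low pixel count classes
        class_ids, counts = class_ids[keep], counts[keep]
        if class_ids.size == 0:
            continue
        if not multiple_bb_per_instance:
            # keep only the main (most frequent) class id
            class_ids = [class_ids[np.argmax(counts)]]
        for class_id in class_ids:
            ys, xs = np.nonzero(inst_mask & (semantic_img == class_id))
            x_min, x_max = xs.min(), xs.max() + 1  # exclusive max
            y_min, y_max = ys.min(), ys.max() + 1
            boxes.append((int(class_id),
                          (x_min + x_max) / 2 / w, (y_min + y_max) / 2 / h,
                          (x_max - x_min) / w, (y_max - y_min) / h))
    return boxes
```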
### Output

The bounding box output is generated as a text file in the YOLO format for each frame, located in the `<sensor_name>_annotations/bounding_box/` folder. Each line represents one bounding box:

```
<class_id> <x_center> <y_center> <width> <height>
```

The coordinates are normalized between 0 and 1 based on the image width and height.
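For illustration, converting a pixel-space box into one such normalized line might look like this (a sketch; `to_yolo_line` is a hypothetical helper introduced here, not part of Syclops):

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space box (exclusive max) to a normalized YOLO line."""
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# A 100x50 pixel box at the top-left of a 640x480 image:
print(to_yolo_line(3, 0, 0, 100, 50, 640, 480))
# → 3 0.078125 0.052083 0.156250 0.104167
```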
docs/docs/usage/job_description/config_descriptions/keypoint_annotation.md (86 additions, 0 deletions)
# Keypoint Output Documentation

The Keypoint Output provides the 2D pixel coordinates of predefined keypoints on 3D objects in the camera space. This output is particularly useful for tasks such as pose estimation, tracking, and analysis of object articulation.

## Keypoint Definition

Keypoints are defined using a Blender [script](https://github.com/DFKI-NI/syclops/blob/main/syclops/utility/keypoint_script.py) that allows users to easily add keypoint information to 3D objects. To use it, open Blender with the model and paste the script into the Blender text editor. The script can be used in two ways:

1. With empty objects and a mesh object selected: Create empty objects at the desired keypoint locations relative to the mesh object. Then select all the empty objects and the mesh object, with the mesh object being the active object. Running the script adds the keypoint information to the mesh object based on the positions of the empty objects. The empty objects are sorted alphabetically, and their index is used as the keypoint number.

2. With a single mesh object selected that already has a keypoints attribute: The script creates empty objects at the keypoint positions defined in the mesh object to visualize the keypoint locations.

Here's an example of how the keypoints are stored in the mesh object:

```python
obj["keypoints"] = {
    "0": {"x": -0.5, "y": 1.0, "z": 0.0},
    "1": {"x": 0.5, "y": 1.0, "z": 0.0},
    "2": {"x": 0.0, "y": 1.5, "z": 0.0},
    ...
}
```

Each keypoint is represented by a unique index (based on the alphabetical order of the empty objects) and its 3D coordinates relative to the object's local space.
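The core of the first mode (sorting empties alphabetically and using the sorted index as the keypoint number) can be sketched without Blender. `build_keypoints` and its input format are illustrative assumptions, not the actual script's API; see the linked script for the real `bpy` implementation.

```python
def build_keypoints(empty_positions):
    """Build a keypoints attribute from named empty objects.

    empty_positions: dict mapping empty-object names to (x, y, z)
    positions relative to the mesh object's local space. Empties are
    sorted alphabetically by name; the sorted index becomes the
    keypoint number.
    """
    keypoints = {}
    for index, name in enumerate(sorted(empty_positions)):
        x, y, z = empty_positions[name]
        keypoints[str(index)] = {"x": x, "y": y, "z": z}
    return keypoints

# Empties named out of order still receive alphabetical indices:
kp = build_keypoints({
    "kp_b": (0.5, 1.0, 0.0),
    "kp_a": (-0.5, 1.0, 0.0),
    "kp_c": (0.0, 1.5, 0.0),
})
# kp["0"] comes from "kp_a", kp["1"] from "kp_b", kp["2"] from "kp_c"
```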
## Output Format

The keypoint output is saved as a JSON file for each frame, with the following structure:

```json
{
  "instance_id_1": {
    "class_id": 1,
    "0": {"x": 100, "y": 200},
    "1": {"x": 150, "y": 220}
  },
  "instance_id_2": {
    "class_id": 2,
    "0": {"x": 300, "y": 400},
    "1": {"x": 350, "y": 420}
  }
}
```

Each object instance is identified by a unique `instance_id`, which is calculated based on the object's 3D location. The `class_id` represents the semantic class of the object. Each keypoint is then listed with its index and 2D pixel coordinates (`x`, `y`) in the camera space.
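Conceptually, each 2D coordinate comes from projecting a keypoint through the camera. A minimal pinhole-projection sketch follows; the function name, the object-to-camera transform, and the intrinsics matrix are illustrative assumptions, not Syclops internals.

```python
import numpy as np

def project_keypoint(kp_local, object_to_camera, intrinsics):
    """Project a 3D keypoint (object-local coords) to pixel coordinates.

    object_to_camera: 4x4 transform from object space to camera space
                      (camera looking down +Z here, for simplicity).
    intrinsics: 3x3 camera matrix K.
    """
    p = object_to_camera @ np.array([*kp_local, 1.0])  # to camera space
    uvw = intrinsics @ p[:3]                           # perspective projection
    return uvw[0] / uvw[2], uvw[1] / uvw[2]            # pixel coordinates

K = np.array([[500.0,   0.0, 320.0],   # fx,  0, cx
              [  0.0, 500.0, 240.0],   #  0, fy, cy
              [  0.0,   0.0,   1.0]])
# Object sitting 2 units in front of the camera, no rotation:
T = np.eye(4)
T[2, 3] = 2.0
x, y = project_keypoint((0.0, 0.0, 0.0), T, K)  # keypoint at object origin
# → projects to the principal point (320.0, 240.0)
```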
## Configuration Parameters

The keypoint output does not require any additional configuration parameters beyond the standard `id` field for uniquely identifying the output.

| Parameter | Type   | Description                               | Requirement  |
|-----------|--------|-------------------------------------------|--------------|
| `id`      | string | Unique identifier of the keypoint output. | **Required** |

## Example Configuration

```yaml
syclops_output_keypoint:
  - id: "keypoints1"
```

In this example, a keypoint output is configured with the identifier `"keypoints1"`.

## Metadata Output

Along with the keypoint JSON files, a `metadata.yaml` file is generated in the output folder. This file contains metadata about the keypoint output, including the output type, format, description, expected steps, sensor name, and output ID.
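Such a file might look roughly like the following. This is a hypothetical sketch based only on the fields listed above; the exact keys and values are not specified here, so treat them as assumptions.

```yaml
# Hypothetical example; the actual keys and values may differ.
type: "KEYPOINTS"
format: "JSON"
description: "2D pixel coordinates of object keypoints"
expected_steps: 10
sensor: "main_cam"
id: "keypoints1"
```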
## Limitations and Considerations

- Keypoints are only generated if they are visible in the rendered image.
- The accuracy of keypoint locations depends on the precision of their definition in the 3D object space.
- Keypoint outputs are generated per frame, so the number of output files depends on the total number of frames in the animation.

By leveraging the keypoint output, users can obtain precise 2D locations of predefined keypoints on 3D objects, enabling various downstream tasks that require spatial understanding of object parts and their relationships.