[WIP][DiffusionDet] Diffusion models for object detection #10238

HichTala · 2024-12-16T10:20:01Z

This pull request introduces a new model, DiffusionDet, and its associated components. The changes add configuration, head, loss, and model files, as well as a temporary debug script to test the model.
This is a first draft and nothing is ready to use yet. I am opening this PR to get feedback, as this is my first time doing so.

Key changes:

New Model Addition:

src/diffusers/models/diffusiondet/modeling_diffusiondet.py: Added the DiffusionDet class to implement the DiffusionDet model. This includes the model's initialization and forward pass.

Configuration:

src/diffusers/models/diffusiondet/configuration_diffusiondet.py: Added the DiffusionDetConfig class to handle the configuration settings for the DiffusionDet model.

Head Implementation:

src/diffusers/models/diffusiondet/head.py: Added the DiffusionDetHead class and supporting functions for the model's head, including the ROIPooler class for region of interest pooling.

Loss Computation:

src/diffusers/models/diffusiondet/loss.py: Added the DynamicCriterion class to compute the loss for the DiffusionDet model, including the process for Hungarian assignment and supervision of matched pairs.
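
For readers unfamiliar with Hungarian assignment, the following is a minimal sketch of generic one-to-one matching between predictions and ground-truth boxes. It is not the code in loss.py: the function name `match_predictions_to_targets`, the cost terms, and the weights are assumptions chosen for illustration, and the actual DynamicCriterion may build its cost matrix differently.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def match_predictions_to_targets(pred_boxes, pred_logits, gt_boxes, gt_labels,
                                 class_weight=1.0, box_weight=1.0):
    """Hypothetical helper: minimum-cost one-to-one matching.

    pred_boxes:  (N, 4) array of predicted boxes
    pred_logits: (N, C) array of class logits
    gt_boxes:    (M, 4) array of ground-truth boxes
    gt_labels:   (M,) integer array of ground-truth class indices
    Returns (pred_idx, gt_idx), two arrays of equal length min(N, M).
    """
    # Softmax over the class dimension (shifted for numerical stability).
    shifted = pred_logits - pred_logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)

    # Classification cost: negative probability of each ground-truth class, shape (N, M).
    cost_class = -probs[:, gt_labels]

    # Box cost: L1 distance between every prediction and every ground truth, shape (N, M).
    cost_box = np.abs(pred_boxes[:, None, :] - gt_boxes[None, :, :]).sum(axis=-1)

    cost = class_weight * cost_class + box_weight * cost_box

    # SciPy solves the linear sum assignment problem on the cost matrix.
    pred_idx, gt_idx = linear_sum_assignment(cost)
    return pred_idx, gt_idx


# Toy example: three predictions, two ground-truth boxes.
pred_boxes = np.array([[0, 0, 10, 10], [50, 50, 60, 60], [20, 20, 30, 30]], dtype=float)
pred_logits = np.array([[2.0, 0.0], [0.0, 0.0], [0.0, 2.0]])
gt_boxes = np.array([[21, 19, 30, 31], [0, 1, 10, 10]], dtype=float)
gt_labels = np.array([1, 0])

pred_idx, gt_idx = match_predictions_to_targets(pred_boxes, pred_logits, gt_boxes, gt_labels)
```

Only the matched pairs returned here would be supervised with classification and box losses; unmatched predictions are typically treated as background.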

Debug Script:

debug_diffusiondet.py: Added a script to create and test the DiffusionDet model using the configuration and model classes.

# What does this PR do?

Fixes # (issue)

DiffusionDet: Diffusion models for object detection #1350

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline?
Did you read our philosophy doc (important for complex PRs)?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@patrickvonplaten and @ShoufaChen participated in the previous implementation trial.


HichTala · 2024-12-16T10:42:14Z

The original model uses lists of detectron Bbox objects. I am adapting the code to use plain PyTorch tensors of shape (B, N, 4), where B is the batch size and N is the number of proposals.
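
As an illustration of the (B, N, 4) layout, here is a minimal sketch. It assumes each image carries the same number of proposals N and that boxes are stored as (x1, y1, x2, y2); the helper names `stack_boxes` and `box_areas` are hypothetical and do not come from the PR.

```python
import torch


def stack_boxes(boxes_per_image):
    """Stack a list of B tensors of shape (N, 4) into one (B, N, 4) tensor.

    Assumes every image has the same number of boxes N.
    """
    return torch.stack(boxes_per_image, dim=0)


def box_areas(boxes):
    """Areas of (x1, y1, x2, y2) boxes: input (B, N, 4), output (B, N)."""
    widths = boxes[..., 2] - boxes[..., 0]
    heights = boxes[..., 3] - boxes[..., 1]
    return widths * heights


# Two images with three proposals each.
boxes_per_image = [
    torch.tensor([[0.0, 0.0, 10.0, 10.0],
                  [5.0, 5.0, 15.0, 25.0],
                  [0.0, 0.0, 1.0, 1.0]]),
    torch.zeros(3, 4),
]
batched = stack_boxes(boxes_per_image)  # shape (2, 3, 4)
areas = box_areas(batched)              # shape (2, 3)
```

With a single batched tensor, per-box operations such as the area computation run over the whole batch at once instead of looping over per-image box objects.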


yiyixuxu · 2024-12-16T22:02:44Z

Hi @HichTala, thanks for the PR. I think the model is a bit outdated at this point.
Do you want to see if we can make it work with remote_code? https://huggingface.co/docs/diffusers/using-diffusers/custom_pipeline_overview#community-components

HichTala · 2024-12-17T11:43:15Z

Hi @yiyixuxu,
Thank you for the suggestion! I’m happy to explore making it work with remote_code. My main goal is to contribute to the community, and Hugging Face is a great platform to share this work. I’m open to following whatever approach the main contributors believe is the best way to share it. Let me know how we can proceed!

ShoufaChen · 2024-12-23T02:52:44Z

Hi @HichTala, Thanks very much for your contribution.

Hi @yiyixuxu, could you please provide instructions on how to proceed with remote_code? Thank you in advance.

github-actions · 2025-01-16T15:03:10Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

HichTala added 5 commits December 13, 2024 08:39

Initial commit

0b9b70d

Add ROIPooler class

4c7508c

(inspired from frcnn modelling of transformers and ROIPooler class of detectron2)

Move utils functions up and change bboxes format

ca66a53

Add configuration class

f874694

Add debug script for debugging and fix some config

35cac1c

HichTala marked this pull request as draft December 16, 2024 10:21

HichTala added 2 commits December 16, 2024 11:33

Fix missing parameters min_level and max_level in the ROI pooler

8e596b2

Remove useless None parameter

8a6e65c

HichTala added 4 commits December 16, 2024 14:18

Adapt detectron function for bbox in detectron format to bbox in torch format

5ce5884

Adapt detectron function (assign_boxes_to_levels) for bbox in detectron format to bbox in torch format

2c43cb4

Add DynamicConv

508aa41

WIP in DynamicHead class

921d432

Add SinusoidalPositionEmbeddings class, complete DiffusionDetHead

github-actions bot added the stale label (Issues that haven't received updates) on Jan 16, 2025
