[Feature] support pooling model dummy_run #4345

lizexu123 · 2025-10-10T07:21:41Z

Adds dummy_pooler_run for pooling models, and refactors the existing warm-up stage for generative models into dummy_sampler_run.

paddle-bot · 2025-10-10T07:21:48Z

Thanks for your contribution!

gongshaotian · 2025-10-10T09:58:28Z

fastdeploy/worker/gpu_model_runner.py

+from fastdeploy.engine.pooling_params import PoolingParams
+from fastdeploy.engine.tasks import PoolingTask


Is it reasonable for a low-level module to import from engine?

This follows vLLM's approach, which uses vllm/tasks; I put it under engine instead.

gongshaotian · 2025-10-11T03:07:06Z

fastdeploy/model_executor/models/interfaces_base.py

+class FdModel(Protocol[T_co]):
+    """The interface required for all models in FastDeploy."""


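Which classes inherit from FdModel, and how does it relate to ModelForCasualLM?

Only FdModelForPooling inherits from it; it is unrelated to ModelForCasualLM. ModelForCasualLM has compute_logits, which pooling models do not compute.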
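The relationship described in this reply can be sketched with `typing.Protocol`. This is only an illustration under stated assumptions: the protocol names mirror the diff, but their members (`forward`, `pooler`) and the `Toy*` classes are hypothetical and not taken from the FastDeploy source.

```python
from typing import Protocol, TypeVar, runtime_checkable

T_co = TypeVar("T_co", covariant=True)


@runtime_checkable
class FdModel(Protocol[T_co]):
    """Hypothetical base interface: anything exposing a forward pass."""

    def forward(self, input_ids: list) -> T_co: ...


@runtime_checkable
class FdModelForPooling(FdModel[T_co], Protocol[T_co]):
    """Hypothetical pooling interface: adds a pooler, has no compute_logits."""

    pooler: object


class ToyCausalLM:
    """Stand-in for a generative model: forward plus compute_logits."""

    def forward(self, input_ids):
        return [float(i) for i in input_ids]

    def compute_logits(self, hidden):
        return hidden


class ToyEmbedder:
    """Stand-in for a pooling model: forward plus a pooler attribute."""

    def __init__(self):
        # Mean pooling over the hidden values, purely for illustration.
        self.pooler = lambda hidden: sum(hidden) / len(hidden)

    def forward(self, input_ids):
        return [float(i) for i in input_ids]


causal_is_pooling = isinstance(ToyCausalLM(), FdModelForPooling)
embedder_is_pooling = isinstance(ToyEmbedder(), FdModelForPooling)
```

Protocols are structural, so neither toy class inherits from them explicitly; the generative stand-in simply fails the pooling check because it has no `pooler`.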

gongshaotian · 2025-10-11T03:14:39Z

fastdeploy/worker/gpu_model_runner.py

+            [num_reqs, req_num_tokens],
+            dtype="int32",
+        )
+        model = cast(FdModelForPooling, self.get_model())


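Same as above: how does FdModelForPooling relate to ModelForCasualLM, and is the cast really necessary?

This sets some default pooling_type values (when the user does not set one), so the cast is needed.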
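For readers weighing this question, it may help to recall that `typing.cast` only informs the static type checker and does nothing at runtime. A minimal sketch, where `ToyPoolingModel` and `get_model` are hypothetical stand-ins rather than FastDeploy code:

```python
from typing import cast


class ToyPoolingModel:
    """Hypothetical model that exposes a pooler attribute."""

    def __init__(self):
        self.pooler = "mean"


def get_model() -> object:
    # Declared as returning a generic object, so a type checker
    # would not know that `.pooler` exists on the result.
    return ToyPoolingModel()


# cast() returns its argument unchanged; it only tells the type
# checker to treat the value as ToyPoolingModel.
model = cast(ToyPoolingModel, get_model())
pooler_name = model.pooler

# No conversion or validation happens at runtime.
unchanged = cast(int, "not an int")
```

Under this reading, the cast is there so that accessing `model.pooler` type-checks, not to change behavior.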

gongshaotian · 2025-10-11T03:16:31Z

fastdeploy/worker/gpu_model_runner.py

+        to_update = model.pooler.get_pooling_updates(task)
+        to_update.apply(dummy_pooling_params)


Is the name to_update semantically accurate?

It follows the vLLM convention.
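The two diff lines above follow an "ask the pooler what to change, then apply it to the params" pattern. The sketch below shows that shape with hypothetical `Toy*` classes; the `requires_token_ids` field and the task names are assumptions chosen for illustration, not the fields FastDeploy actually uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyPoolingParams:
    """Hypothetical per-request pooling parameters."""

    task: Optional[str] = None
    requires_token_ids: bool = False


@dataclass(frozen=True)
class ToyPoolingUpdate:
    """Hypothetical set of changes a pooler wants applied to the params."""

    requires_token_ids: bool = False

    def apply(self, params: ToyPoolingParams) -> None:
        # Mutates the params in place with the pooler's requirements.
        params.requires_token_ids = self.requires_token_ids


class ToyPooler:
    def get_pooling_updates(self, task: str) -> ToyPoolingUpdate:
        # Assumption for illustration: only the "encode" task needs token ids.
        return ToyPoolingUpdate(requires_token_ids=(task == "encode"))


dummy_pooling_params = ToyPoolingParams(task="encode")
to_update = ToyPooler().get_pooling_updates("encode")
to_update.apply(dummy_pooling_params)
```

Seen this way, `to_update` names the pending changes rather than the object being changed, which is what the reviewer's naming question is about.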

yuanlehome · 2025-10-13T02:56:27Z

fastdeploy/model_executor/layers/pool/metadata.py

+    cumsum = paddle.zeros([n_seq + 1], dtype="int64")
+    if cumsum.place.is_gpu_place():
+        cumsum = cumsum.cpu()


Why not create the zeros tensor directly on the CPU?

yuanlehome · 2025-10-13T02:59:53Z

fastdeploy/worker/gpu_model_runner.py


        self.attn_backends.append(attn_backend)

+    def _dummy_pooler_run_task(


Why not implement this directly in _dummy_pooler_run instead of extracting a separate _dummy_pooler_run_task?

Written following the vLLM convention.
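One plausible motivation for the split, offered here as an assumption rather than as the PR's stated rationale, is that the outer function can try every supported task and warm up with the one that produces the largest output. The sketch below uses plain lists in place of tensors; the function names, task names, and size metric are all hypothetical.

```python
def dummy_pooler_run_task(hidden_states, task):
    """Run one pooling task on dummy hidden states (requests x tokens x dims)."""
    if task == "embed":
        # One mean-pooled vector per request.
        return [[sum(col) / len(req) for col in zip(*req)] for req in hidden_states]
    if task == "encode":
        # One vector per token, i.e. the hidden states themselves.
        return [[list(tok) for tok in req] for req in hidden_states]
    raise ValueError(f"unsupported task: {task}")


def output_size(output):
    """Count scalar entries in a nested list, as a stand-in for byte size."""
    if isinstance(output, list):
        return sum(output_size(item) for item in output)
    return 1


def dummy_pooler_run(hidden_states, supported_tasks):
    """Try every task, then return the output of the most expensive one."""
    sizes = {
        task: output_size(dummy_pooler_run_task(hidden_states, task))
        for task in supported_tasks
    }
    largest = max(sizes, key=sizes.get)
    return largest, dummy_pooler_run_task(hidden_states, largest)


# Two dummy requests, three tokens each, hidden size two.
hidden = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    [[0.0, 0.0], [2.0, 2.0], [4.0, 4.0]],
]
largest_task, largest_output = dummy_pooler_run(hidden, ["embed", "encode"])
```

Keeping the per-task body separate lets the outer loop stay short and makes each task testable on its own.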

yuanlehome · 2025-10-13T03:05:25Z

fastdeploy/worker/gpu_model_runner.py

        self.speculative_decoding = self.speculative_method is not None
        self.enable_logprob = fd_config.model_config.enable_logprob
        self.enable_early_stop = self.fd_config.early_stop_config.enable_early_stop
+        self.is_pooling_model = self.fd_config.model_config.runner_type == "pooling"


Can one of self.is_pooling_model and is_pooling_model be removed? Is it necessary to keep both?

Removed is_pooling_model and kept self.is_pooling_model.

lizexu123 added 8 commits September 22, 2025 20:33

support qwen3-embedding

7716866

fix ci bug

a1e505c

Merge branch 'develop' of https://github.com/PaddlePaddle/FastDeploy …

815d592

…into pooling_emb_3

support pooling dummy_run

d6d8c15

merge develop

5fde033

Merge branch 'develop' of https://github.com/PaddlePaddle/FastDeploy …

001f23d

…into develop

merge develop

31a4a6b

fix

d8cce66

delete print

b4f9d9c

gongshaotian reviewed Oct 11, 2025

merge develop

43fe17f

gongshaotian previously approved these changes Oct 11, 2025

lizexu123 dismissed gongshaotian’s stale review via 43fe17f October 11, 2025 06:23

lizexu123 added 2 commits October 11, 2025 15:00

Merge branch 'develop' of https://github.com/PaddlePaddle/FastDeploy …

d6df785

…into pooling_emb_4

parallel_config.max_model_len

fe5de8b

yuanlehome reviewed Oct 13, 2025

lizexu123 and others added 3 commits October 13, 2025 11:25

delete is_pooling_model in dummy_run

af9a48f

fix

73e5f07

Merge branch 'develop' into pooling_emb_4

7d53ef8
