LFX-Proposal: Cloud-edge collaborative speculative decoding for LLM based on KubeEdge-Ianvs #156

Open
wants to merge 1 commit into
base: main

Conversation

FuryMartin
Contributor

What type of PR is this?
/kind design

What this PR does / why we need it:

Proposal for LFX Project CNCF - KubeEdge: Cloud-Edge Speculative Decoding for LLM via KubeEdge-Ianvs

Which issue(s) this PR fixes:

Fixes #126

@kubeedge-bot kubeedge-bot added the kind/design Categorizes issue or PR as related to design. label Oct 14, 2024
@kubeedge-bot
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign moorezheng after the PR has been reviewed.
You can assign the PR to them by writing /assign @moorezheng in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubeedge-bot kubeedge-bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Oct 14, 2024
@MooreZheng MooreZheng requested review from hsj576 and removed request for jaypume October 15, 2024 03:08
Collaborator

@MooreZheng MooreZheng left a comment

The Pylint (3.9) CI errors below need to be fixed before any further action; see the CI logs.

Run pylint '/home/runner/work/ianvs/ianvs/core'
core/testenvmanager/dataset/dataset.py:119:4: R0917: Too many positional arguments (8/5) (too-many-positional-arguments)
core/testenvmanager/dataset/dataset.py:206:4: R0917: Too many positional arguments (6/5) (too-many-positional-arguments)
core/testenvmanager/dataset/dataset.py:213:4: R0917: Too many positional arguments (7/5) (too-many-positional-arguments)
core/testenvmanager/dataset/dataset.py:246:4: R0917: Too many positional arguments (7/5) (too-many-positional-arguments)
core/testenvmanager/dataset/dataset.py:285:4: R0917: Too many positional arguments (7/5) (too-many-positional-arguments)
core/testenvmanager/dataset/dataset.py:329:4: R0917: Too many positional arguments (7/5) (too-many-positional-arguments)
core/testenvmanager/dataset/dataset.py:368:4: R0917: Too many positional arguments (6/5) (too-many-positional-arguments)
************* Module core.testcasecontroller.algorithm.paradigm.singletask_learning.singletask_learning_active_boost
core/testcasecontroller/algorithm/paradigm/singletask_learning/singletask_learning_active_boost.py:66:4: R0917: Too many positional arguments (7/5) (too-many-positional-arguments)

-----------------------------------
Your code has been rated at 9.95/10

Error: Process completed with exit code 8.
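
For context, R0917 (too-many-positional-arguments) fires when a function accepts more positional parameters than the configured limit (5 in this log). One common way to satisfy the check without changing behavior is to make the trailing parameters keyword-only. The sketch below uses a hypothetical `load_split` function, not the actual signatures in `dataset.py`, and is not necessarily how the errors were resolved in this repository.

```python
# Hypothetical example; names and defaults are illustrative only.

# Before: seven parameters can all be passed positionally,
# which exceeds a limit of 5 and triggers R0917.
def load_split_before(path, fmt, ratio, method, types, label, shuffle):
    return {"path": path, "fmt": fmt, "ratio": ratio, "method": method,
            "types": types, "label": label, "shuffle": shuffle}


# After: the bare `*` makes everything after `fmt` keyword-only,
# so only two parameters remain positional.
def load_split(path, fmt, *, ratio=0.8, method="default",
               types=("train", "eval"), label=None, shuffle=False):
    return {"path": path, "fmt": fmt, "ratio": ratio, "method": method,
            "types": types, "label": label, "shuffle": shuffle}


if __name__ == "__main__":
    print(load_split("data.jsonl", "jsonl", ratio=0.9, shuffle=True))
```

Alternatives are to raise the limit through the `max-positional-arguments` option or to disable the message with `# pylint: disable=too-many-positional-arguments` on the affected definitions.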

@FuryMartin
Contributor Author

> The Pylint (3.9) CI errors below need to be fixed before any further action; see the CI logs.

This is fixed by #158

Member

@hsj576 hsj576 left a comment

The implementation of cloud-edge collaborative speculative decoding in the proposal needs further refinement. As discussed in the regular community meeting, speculative decoding can be implemented in the cloud while the edge side still adopts the hard example mining paradigm.
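
A minimal sketch of that split, under stated assumptions: the edge uses a hard-example check to decide whether a query stays local, and only hard queries go to the cloud, where a small draft model proposes tokens that the large target model verifies. All names (`speculative_decode`, `route`, `is_hard`, the toy models) are hypothetical and are not Ianvs APIs; the greedy accept-or-correct rule is a simplified variant of speculative decoding, not the proposal's actual design.

```python
from typing import Callable, List, Tuple

Token = int
# Hypothetical model interface: maps a token context to the next token (greedy).
Model = Callable[[List[Token]], Token]


def greedy_decode(model: Model, prompt: List[Token], max_new_tokens: int) -> List[Token]:
    out = list(prompt)
    for _ in range(max_new_tokens):
        out.append(model(out))
    return out


def speculative_decode(draft: Model, target: Model, prompt: List[Token],
                       max_new_tokens: int, gamma: int = 4) -> List[Token]:
    """Greedy speculative decoding; the result equals greedy decoding with `target`."""
    out = list(prompt)
    limit = len(prompt) + max_new_tokens
    while len(out) < limit:
        # 1. The cheap draft model proposes up to `gamma` tokens.
        ctx, proposal = list(out), []
        for _ in range(min(gamma, limit - len(out))):
            tok = draft(ctx)
            proposal.append(tok)
            ctx.append(tok)
        # 2. The target model checks each proposed position. A real system
        #    would score all positions in one batched forward pass.
        for tok in proposal:
            expected = target(out)
            out.append(expected)  # accepted token, or the target's correction
            if expected != tok:
                break  # discard the remaining draft tokens
        else:
            # Every draft token was accepted: the target adds one bonus token.
            if len(out) < limit:
                out.append(target(out))
    return out


def route(query: List[Token], is_hard: Callable[[List[Token]], bool],
          edge_model: Model, cloud_draft: Model, cloud_target: Model,
          max_new_tokens: int) -> Tuple[str, List[Token]]:
    """Hard example mining at the edge; speculative decoding only in the cloud."""
    if is_hard(query):
        return "cloud", speculative_decode(cloud_draft, cloud_target, query, max_new_tokens)
    return "edge", greedy_decode(edge_model, query, max_new_tokens)


# Toy stand-ins for real LLMs (assumptions for illustration only).
def toy_target(ctx: List[Token]) -> Token:
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[Token]) -> Token:
    # Agrees with the target except when the last token is 3 or 7.
    return 0 if ctx[-1] % 4 == 3 else (ctx[-1] + 1) % 10


if __name__ == "__main__":
    print(route([0], lambda q: True, toy_draft, toy_draft, toy_target, 8))
```

Because rejected draft tokens are replaced by the target model's own choice, the cloud output in this greedy variant matches what the target model alone would produce; the draft model only affects how many target passes are needed.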

@kubeedge-bot kubeedge-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Nov 7, 2024
@FuryMartin FuryMartin changed the title Cloud-edge collaborative speculative decoding for LLM based on KubeEdge-Ianvs LFX-Proposal: Cloud-edge collaborative speculative decoding for LLM based on KubeEdge-Ianvs Nov 7, 2024
Collaborator

@MooreZheng MooreZheng left a comment

Overall it looks fine to me. The proposal might need to highlight how it differs from the OSPP proposal.

Signed-off-by: Yu Fan <[email protected]>

doc: modify proposal

Signed-off-by: Yu Fan <[email protected]>
@hsj576
Member

hsj576 commented Nov 28, 2024

The motivation section should explain why speculative decoding is used to accelerate LLM cloud-edge collaborative inference. The differences between this proposal and the OSPP proposal should be further highlighted in the methods section.

Collaborator

@MooreZheng MooreZheng left a comment

Overall it looks fine. As discussed at the routine meeting, a few points are yet to be addressed.

  1. The motivation for using the technique is not quite clear. I believe it makes sense to improve the inference time, but the reason for and potential of solving this problem could be further explored, e.g., by adding examples.
  2. The differences from the existing design need to be highlighted.
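
As one illustration of the kind of example point 1 asks for, the standard speculative decoding analysis gives the expected number of tokens produced per target-model pass as (1 - alpha^(gamma + 1)) / (1 - alpha), where alpha is the probability that a draft token is accepted and gamma is the number of draft tokens per round. This assumes independent, identically distributed acceptances, and the numbers below are illustrative rather than measurements from this proposal.

```python
def expected_tokens_per_target_pass(alpha: float, gamma: int) -> float:
    """Expected tokens emitted per target-model pass.

    alpha: assumed per-token acceptance probability of the draft model (0..1).
    gamma: number of draft tokens proposed per round (>= 0).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if gamma < 0:
        raise ValueError("gamma must be non-negative")
    if alpha == 1.0:
        # Every draft token is accepted, plus one bonus token from the target.
        return float(gamma + 1)
    return (1.0 - alpha ** (gamma + 1)) / (1.0 - alpha)


if __name__ == "__main__":
    for alpha in (0.5, 0.8, 0.9):
        print(alpha, round(expected_tokens_per_target_pass(alpha, gamma=4), 4))
```

With alpha = 0.8 and gamma = 4, the formula gives about 3.36 tokens per target pass instead of 1, which is the source of the potential latency gain; the actual speedup also depends on the cost of running the draft model.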

Labels
kind/design Categorizes issue or PR as related to design. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Cloud-edge collaborative speculative decoding for LLM based on KubeEdge-Ianvs
4 participants