LFX-Proposal: Cloud-edge collaborative speculative decoding for LLM based on KubeEdge-Ianvs #156
[APPROVALNOTIFIER] This PR is NOT APPROVED
The Pylint (3.9) CI errors below need to be fixed before any further action; see the CI logs.
Run pylint '/home/runner/work/ianvs/ianvs/core'
core/testenvmanager/dataset/dataset.py:119:4: R0917: Too many positional arguments (8/5) (too-many-positional-arguments)
core/testenvmanager/dataset/dataset.py:206:4: R0917: Too many positional arguments (6/5) (too-many-positional-arguments)
core/testenvmanager/dataset/dataset.py:213:4: R0917: Too many positional arguments (7/5) (too-many-positional-arguments)
core/testenvmanager/dataset/dataset.py:246:4: R0917: Too many positional arguments (7/5) (too-many-positional-arguments)
core/testenvmanager/dataset/dataset.py:285:4: R0917: Too many positional arguments (7/5) (too-many-positional-arguments)
core/testenvmanager/dataset/dataset.py:329:4: R0917: Too many positional arguments (7/5) (too-many-positional-arguments)
core/testenvmanager/dataset/dataset.py:368:4: R0917: Too many positional arguments (6/5) (too-many-positional-arguments)
************* Module core.testcasecontroller.algorithm.paradigm.singletask_learning.singletask_learning_active_boost
core/testcasecontroller/algorithm/paradigm/singletask_learning/singletask_learning_active_boost.py:66:4: R0917: Too many positional arguments (7/5) (too-many-positional-arguments)
-----------------------------------
Your code has been rated at 9.95/10
Error: Process completed with exit code 8.
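One common way to clear R0917 is to make the less important parameters keyword-only by placing a bare `*` in the signature, since keyword-only parameters are not counted as positional. The sketch below is only an illustration: `DatasetSketch`, `split_dataset`, and all parameter names are hypothetical and are not taken from `core/testenvmanager/dataset/dataset.py`.

```python
import inspect


class DatasetSketch:
    """Hypothetical stand-in; not the real Ianvs dataset class."""

    # Everything after the bare `*` is keyword-only, so it no longer
    # counts toward the positional-argument limit.
    def split_dataset(self, dataset_url, dataset_format, ratio, *,
                      method="default", dataset_types=None,
                      output_dir=None, times=1):
        return {
            "url": dataset_url, "format": dataset_format, "ratio": ratio,
            "method": method, "types": dataset_types,
            "output_dir": output_dir, "times": times,
        }


def count_positional(func):
    """Count parameters that can be passed positionally (including self)."""
    kinds = (inspect.Parameter.POSITIONAL_ONLY,
             inspect.Parameter.POSITIONAL_OR_KEYWORD)
    params = inspect.signature(func).parameters.values()
    return sum(p.kind in kinds for p in params)


# Callers must now name the optional values explicitly.
result = DatasetSketch().split_dataset("train.csv", "csv", 0.8, times=2)
print(count_positional(DatasetSketch.split_dataset), result["times"])
```

The trade-off is that any existing caller passing these values positionally has to be updated. If that is too invasive, raising Pylint's `max-positional-arguments` setting or disabling the check for the affected functions are alternatives worth discussing with the maintainers.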
27ab314 to d019f65
The implementation of cloud-edge collaborative speculative decoding in the proposal needs further refinement. As discussed in the regular community meeting, speculative decoding can be implemented in the cloud while the edge side continues to use the hard example mining paradigm.
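To make the suggested split concrete, here is a toy sketch, not the proposal's design: the edge keeps a hard example mining gate, and only hard queries reach the cloud, where a greedy form of speculative decoding runs. Everything in it is an assumption for illustration: the two "models" are deterministic toy functions, `edge_confidence` and `threshold` are hypothetical, and the edge is assumed to reuse the small draft model.

```python
from typing import List, Tuple


def target_next(seq: List[int]) -> int:
    """Toy stand-in for the large cloud model's greedy next token."""
    return (sum(seq) * 7 + len(seq)) % 50


def draft_next(seq: List[int]) -> int:
    """Toy small model: agrees with the target except at every 4th length."""
    tok = target_next(seq)
    return tok if len(seq) % 4 else (tok + 1) % 50


def greedy_decode(model, prompt: List[int], max_new: int) -> List[int]:
    seq = list(prompt)
    for _ in range(max_new):
        seq.append(model(seq))
    return seq[len(prompt):]


def speculative_decode(target, draft, prompt: List[int],
                       max_new: int, k: int = 4) -> List[int]:
    """Greedy speculative decoding: same output as greedy_decode(target)."""
    seq = list(prompt)
    limit = len(prompt) + max_new
    while len(seq) < limit:
        # 1. The draft model proposes up to k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(min(k, limit - len(seq))):
            proposal.append(draft(seq + proposal))
        # 2. The target checks each position (a real system would score
        #    all positions in one parallel forward pass).
        accepted: List[int] = []
        for tok in proposal:
            expected = target(seq + accepted)
            accepted.append(expected)  # on mismatch, keep the target's token
            if expected != tok:
                break
        else:
            # Every draft token matched: take one bonus target token.
            if len(seq) + len(accepted) < limit:
                accepted.append(target(seq + accepted))
        seq.extend(accepted)
    return seq[len(prompt):]


def handle_query(prompt: List[int], edge_confidence: float,
                 threshold: float = 0.8, max_new: int = 8) -> Tuple[str, List[int]]:
    """Hypothetical hard example gate: easy queries stay on the edge."""
    if edge_confidence >= threshold:
        return "edge", greedy_decode(draft_next, prompt, max_new)
    return "cloud", speculative_decode(target_next, draft_next, prompt, max_new)
```

Because the target model verifies every drafted token, the cloud output matches what plain greedy decoding with the target would produce; any speed-up would come from accepting several drafted tokens per verification step.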
d019f65 to 7c0e001
7c0e001 to 6e86e1a
Overall it looks fine to me. It might need to highlight the differences from the OSPP proposal.
6e86e1a to 25fcccb
Signed-off-by: Yu Fan <[email protected]>
doc: modify proposal
Signed-off-by: Yu Fan <[email protected]>
25fcccb to 746c70d
The motivation section should explain why speculative decoding is used to accelerate LLM cloud-edge collaborative inference. The methods section should further highlight the differences between this proposal and the OSPP proposal.
Overall it looks fine. As discussed at the routine meeting, a few points are still outstanding.
- The motivation for using the technique is not quite clear. I believe it makes sense as a way to improve inference time, but the reason for and potential of solving this problem could be explored further, e.g., by adding examples.
- The differences from the existing design need to be highlighted.
What type of PR is this?
/kind design
What this PR does / why we need it:
Proposal for LFX Project CNCF - KubeEdge: Cloud-Edge Speculative Decoding for LLM via KubeEdge-Ianvs
Which issue(s) this PR fixes:
Fixes #126