examples: Add test run for lv2v #283

Merged Jan 14, 2025 (31 commits)
Changes from 20 commits
Commits (31):
1a7c557 examples: Add possible example for lv2v (victorges, Nov 20, 2024)
2225784 Merge branch 'feat/test-live-video-to-video' of https://github.com/li… (varshith15, Nov 21, 2024)
1b3ed77 go.mod: Update pkg/errors to fix build (victorges, Nov 21, 2024)
5835437 feat: noop webcam lv2v example (varshith15, Nov 21, 2024)
347722e Merge branch 'feat/test-live-video-to-video' of https://github.com/li… (varshith15, Nov 21, 2024)
3c6489d temp: dummy trickle client (varshith15, Nov 22, 2024)
47feacb temp: docker network host (varshith15, Nov 23, 2024)
3c2def5 feat: example for lv2v noop with zmq (varshith15, Nov 27, 2024)
4810532 fix: stream protocol as param (varshith15, Nov 28, 2024)
fc26de1 fix: zmq bind change (varshith15, Nov 28, 2024)
9d47065 feat: fps monitor init (varshith15, Nov 29, 2024)
b7e4a39 fix: monitor to async, revert stream_protocol param (varshith15, Dec 2, 2024)
6cfaf4f fix: kafka revert, ci added for noop (varshith15, Dec 3, 2024)
aaf8338 Merge branch 'main' into feat/test-live-video-to-video (varshith15, Dec 4, 2024)
0d23630 Merge branch 'main' of https://github.com/livepeer/ai-worker into fea… (varshith15, Dec 5, 2024)
81da469 fix: working ci test (varshith15, Dec 5, 2024)
6c51f8e fix: possible caching issue (varshith15, Dec 5, 2024)
0ffc438 fix: clean up (varshith15, Dec 6, 2024)
5b087ef fix: remove client resizing (varshith15, Dec 6, 2024)
d89d4e6 feat: comfyui test (varshith15, Dec 9, 2024)
a3217a3 workflow: Switch over to self-hosted gpu runner (#345) (hjpotter92, Dec 11, 2024)
bc50ddb Merge branch 'main' into feat/test-live-video-to-video (varshith15, Dec 11, 2024)
eb14c69 fix: review fixes (varshith15, Dec 11, 2024)
e7fc022 fix: run on the same node (varshith15, Dec 11, 2024)
e5123b6 fix: workflows (varshith15, Dec 12, 2024)
88bd756 fix: add symlink (varshith15, Dec 12, 2024)
dfd732e revert: workflow interlink (varshith15, Dec 12, 2024)
a5c973e Merge branch 'main' of https://github.com/livepeer/ai-worker into fea… (varshith15, Dec 12, 2024)
c2f76e0 Merge branch 'main' of https://github.com/livepeer/ai-worker into fea… (varshith15, Dec 12, 2024)
afe82dc fix: revert fps test (varshith15, Dec 12, 2024)
38a5e4a fix: revert ci test (varshith15, Jan 14, 2025)
92 changes: 92 additions & 0 deletions .github/workflows/ai-runner-pipelines-fps-test.yaml
@@ -0,0 +1,92 @@
name: FPS check for live-video-to-video pipelines

on:
  pull_request:
    paths:
      - "runner/**"
      - "!runner/.devcontainer/**"
  push:
    branches:
      - main
    tags:
      - '*'
    paths:
      - "runner/**"
      - "!runner/.devcontainer/**"
  workflow_dispatch:

jobs:
  run-fps-test:
    runs-on: ubuntu-latest
    strategy:
      max-parallel: 1
      matrix:
        model_config:
          - { id: noop, input_fps: 30, exp_output_fps: 28 }
          - { id: liveportrait, input_fps: 30, exp_output_fps: 28 }
          - { id: streamdiffusion, input_fps: 30, exp_output_fps: 28 }
          - { id: comfyui, input_fps: 30, exp_output_fps: 28 } # only depth anything for now

Review comment on the matrix (Member, Author):
> We should use "pipeline" instead of "model" everywhere in this file. It is only called "model" in the legacy code, since we reused the existing model_id param to choose which live pipeline to run, but ideally we keep that misnaming in as few places as possible (and always with a comment disclaimer).
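A minimal sketch of the rename this comment suggests; the names here are illustrative, not taken from the PR, and the legacy model_id name survives only at the env boundary, with a disclaimer:

```yaml
strategy:
  max-parallel: 1
  matrix:
    pipeline:
      - { id: noop, input_fps: 30, exp_output_fps: 28 }
steps:
  - name: Run FPS test
    env:
      # Legacy name: the runner still takes model_id, but for live
      # pipelines the value actually selects which pipeline to run.
      MODEL_ID: ${{ matrix.pipeline.id }}
```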
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Set up Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.23.3'

      - name: Build Docker images
        env:
          MODEL_ID: ${{ matrix.model_config.id }}
        run: |
          cd runner
          docker build -t livepeer/ai-runner:live-base -f docker/Dockerfile.live-base .
          if [ "${MODEL_ID}" = "noop" ]; then
            docker build -t livepeer/ai-runner:live-app-noop -f docker/Dockerfile.live-app-noop .
          else
            docker build -t livepeer/ai-runner:live-base-${MODEL_ID} -f docker/Dockerfile.live-base-${MODEL_ID} .
            docker build -t livepeer/ai-runner:live-app-${MODEL_ID} -f docker/Dockerfile.live-app__PIPELINE__ --build-arg PIPELINE=${MODEL_ID} .
          fi
          cd ..
Review comment on the build step (victorges, Member, Author):
> Hmm, I think this will make the test take waaaay longer, like 30 minutes.
>
> Some ideas:
>
> - Somehow run this test after the separate docker build workflow, which also has optimizations to build only what's necessary (e.g. it skips the base images when nothing changed). Ideally we wouldn't have it in the same workflow file, though, and I'm not sure if it's possible to make cross-workflow dependencies (a workflow_run sketch follows this thread).
> - Simplify the build here by always skipping the base: only copy the app code into the specific pipeline base image already published to Docker Hub (docker would pull the base automatically). This is less perfect in that it won't pick up changes to the base images, but at least each run won't take 30 minutes.
> - A little more sophisticated: start by pulling the app image from Docker Hub, so all these builds can be optimized by the docker layer cache. Not sure this would work, though; I've seen the cache miss even after the base was pulled :(

Review comment (varshith15, Collaborator):
> - If we use the same node each time, which is the case, the layers are cached anyway, right? But yeah, if a docker build workflow already exists, we can just try to use that.
> - I added the base build because we have to account for changes in required packages; maybe we can run the base docker build only when the Dockerfiles or requirements files are updated.
> - I think it's important that there is caching; otherwise no amount of optimization will cut it.

Review comment (victorges, Member, Author):
> > maybe we can run the base docker build only when the Dockerfiles or requirements files are updated
>
> Yeah, we do that in the docker build workflow. But it's kind of a pain TBH, pretty complex, and it hurts even more to repeat all of it here 💀 Would be great if we could reuse it somehow.

Review comment (hjpotter92, Member):
> With the self-hosted runners, we clean up disk frequently so that we don't run out of disk space for other jobs. This means the intermediate layers etc. are also gone. As victor mentioned above (and I mentioned in the Discord thread), the best approach would be to trigger this workflow after the docker build has finished and pull those images for testing.

Review comment (varshith15, Collaborator, Dec 11, 2024):
> @hjpotter92 can we avoid wiping the models dir, if possible? That would save a lot of time; we can just link the persistent path to the required path. @victorges thoughts?
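For what it's worth, the cross-workflow dependency the thread is unsure about does exist in GitHub Actions: a workflow_run trigger fires when another workflow completes. And the layer-cache misses described above match known BuildKit behavior, where a pulled image is not used as a cache source unless it carries inline cache metadata. A minimal sketch of both, assuming an upstream workflow named "Docker build" that publishes the live-base image; the workflow name and tags here are assumptions, not taken from this repo:

```yaml
name: FPS check (triggered after docker build)

on:
  workflow_run:
    # Fires when the (assumed) upstream "Docker build" workflow completes.
    workflows: ["Docker build"]
    types: [completed]

jobs:
  run-fps-test:
    # Skip entirely if the upstream build did not succeed.
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Pull the published base image and reuse its layers
        run: |
          cd runner
          docker pull livepeer/ai-runner:live-base
          # With BuildKit, a pulled image only serves as a cache source if it
          # was built with inline cache metadata (BUILDKIT_INLINE_CACHE=1);
          # otherwise --cache-from silently misses, which may explain the
          # cache misses mentioned in the thread.
          docker build \
            --cache-from livepeer/ai-runner:live-base \
            -t livepeer/ai-runner:live-base \
            -f docker/Dockerfile.live-base .
```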


      - name: Install packages
        run: |
          sudo sh -c "apt-get update && apt-get install -y libzmq3-dev && rm -rf /var/lib/apt/lists/*"

      - name: Clean up runner dockers
        run: |
          CONTAINERS=$(docker ps -aq --filter "name=^live-video-to-video*")
          if [ -n "$CONTAINERS" ]; then
            echo "Removing containers: $CONTAINERS"
            echo "$CONTAINERS" | xargs -r docker rm -f
          else
            echo "No containers to remove for model ${MODEL_ID}"
          fi

      - name: Run FPS test
        env:
          MODEL_ID: ${{ matrix.model_config.id }}
          INPUT_FPS: ${{ matrix.model_config.input_fps }}
          EXP_OUTPUT_FPS: ${{ matrix.model_config.exp_output_fps }}
        run: |
          # Redirect both stdout and stderr to a file and tee to the console
          go run cmd/examples/live-video-to-video/main.go \
            -modelid=${MODEL_ID} \
            -inputfps=${INPUT_FPS} \
            -expoutputfps=${EXP_OUTPUT_FPS} 2>&1 | tee output.log

          # Grep the log for the TEST PASSED/FAILED marker emitted by the Go program
          if grep -q "INFO TEST PASSED!" output.log; then
            echo "Test for ${MODEL_ID} pipeline PASSED"
            exit 0
          elif grep -q "INFO TEST FAILED!" output.log; then
            echo "Test for ${MODEL_ID} pipeline FAILED"
            exit 1
          else
            echo "Test for ${MODEL_ID} pipeline had an unknown result"
            echo "Full log output:"
            cat output.log
            exit 1
          fi
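One possible follow-up, not part of this PR: since the pass/fail signal is scraped from output.log, persisting that log as a workflow artifact would let failures be inspected without re-running the job. A minimal sketch using the stock actions/upload-artifact action; the artifact name and log path are assumptions:

```yaml
      - name: Upload FPS test log
        # Keep the log around even (especially) when the test fails.
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: fps-test-log-${{ matrix.model_config.id }}
          path: output.log
```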
1 change: 1 addition & 0 deletions .gitignore
@@ -8,6 +8,7 @@ output
aiModels.json
models
checkpoints
runner/run-lv2v.log

# IDE
.vscode