YOLOv8 add-on on HD liveview #12

5 changes: 5 additions & 0 deletions examples/livestream-yolov8/.gitignore
@@ -0,0 +1,5 @@
.venv
*.pt
*.bin
model.json
metadata.yaml
94 changes: 94 additions & 0 deletions examples/livestream-yolov8/README.md
@@ -0,0 +1,94 @@
# High definition (WebRTC) live streaming

Hub provides two types of live view: a low-resolution live view and a high-resolution (on-demand) live view. Depending on the application, you might use one, the other, or both. Below we explain the differences and how to open and negotiate a high-resolution live view with the Agent using WebRTC. It's important to understand that live view is an on-demand action, which requires a negotiation between the requesting client (Hub or this example application) and the remote Agent. This negotiation sets up a short-lived session between the client and the Agent. Once the client closes the connection or the web page, the Agent stops forwarding the live view.

## Architecture

Hub and Agent provide a high-resolution live view that delivers the stream at its full frame rate (FPS). To enable this functionality, the WebRTC protocol is used for negotiating the channels (ICE candidates) and forwarding the RTP packets. We describe the communication flow in detail below.

![Livestreaming HD](./livestream-hd.svg)

The negotiation of a WebRTC stream is a multi-step process; each step is detailed below. If you need more background on WebRTC, we recommend reading the [WebRTC for the curious book](https://github.com/webrtc-for-the-curious/webrtc-for-the-curious).

## 1. Play WebRTC

A user opens the application and either activates the WebRTC stream manually, or the application loads the WebRTC stream automatically without any user interaction (for example, video wall behaviour).

## 2. Create and send offer

The WebRTC flow starts with the creation of the `offer`. This so-called `offer` includes all the information about the client application (codecs and much more) that the receiving peer (our Agent) needs to successfully set up a real-time connection in the next steps. If you want more information about the technicalities of the `offer`, we recommend going through the WebRTC for the curious book.

    return this.peerConnection.createOffer({
      offerToReceiveAudio: true,
      offerToReceiveVideo: true,
      iceRestart: true,
    }).then(offer => {
      return this.peerConnection.setLocalDescription(offer);
    }).then(() => {
      this.sendOffer();
    }).catch(error => console.log(error));
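
To make the "codecs and much more" part more concrete, the sketch below lists the codecs advertised in an SDP blob by reading its `a=rtpmap` lines. The helper name `listCodecs` and the sample SDP fragment are assumptions made for illustration; they are not part of the example application, and a real `offer` is considerably longer.

```javascript
// List the codecs advertised in an SDP blob by reading its `a=rtpmap` lines.
// Line format: a=rtpmap:<payload type> <encoding name>/<clock rate>[/<channels>]
// `listCodecs` is a hypothetical helper, not part of the example application.
function listCodecs(sdp) {
  const codecs = [];
  for (const line of sdp.split(/\r?\n/)) {
    const match = line.match(/^a=rtpmap:(\d+) ([^/]+)\/(\d+)(?:\/(\d+))?$/);
    if (match) {
      codecs.push({
        payloadType: Number(match[1]),
        name: match[2],
        clockRate: Number(match[3]),
      });
    }
  }
  return codecs;
}

// Hand-written SDP fragment for illustration only.
const sampleSdp = [
  "v=0",
  "m=audio 9 UDP/TLS/RTP/SAVPF 111",
  "a=rtpmap:111 opus/48000/2",
  "m=video 9 UDP/TLS/RTP/SAVPF 96 102",
  "a=rtpmap:96 VP8/90000",
  "a=rtpmap:102 H264/90000",
].join("\r\n");

console.log(listCodecs(sampleSdp).map((codec) => codec.name));
```

In the browser, the same helper could be called as `listCodecs(offer.sdp)` right after `createOffer` resolves, which is a quick way to check whether the client advertises a codec the Agent can send.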

Once the `offer` is successfully created, it is encrypted using the `Hub private key` and sent to the Agent through MQTT.

## 3. Create and send answer

The `offer` is received by the Agent. In response, the Agent generates an `answer`, which is similar to the `offer`: it includes media codec information and more.

Exchanging an `offer` and an `answer` is much like sharing phone numbers when you meet someone for the first time. You'll find the corresponding code in the [`kerberos-io/agent` repository](https://github.com/kerberos-io/agent/blob/master/machinery/src/webrtc/main.go#L210-L215).

    answer, err := peerConnection.CreateAnswer(nil)
    if err != nil {
    	log.Log.Error("webrtc.main.InitializeWebRTCConnection(): something went wrong while creating answer: " + err.Error())
    } else if err = peerConnection.SetLocalDescription(answer); err != nil {
    	log.Log.Error("webrtc.main.InitializeWebRTCConnection(): something went wrong while setting local description: " + err.Error())
    }

## 4. Share ICE candidates

Once the `offer` and `answer` are shared, both peers (Agent and client) start gathering one or more ICE candidates. An ICE candidate contains specific information about a communication channel or route.
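
To illustrate what a candidate carries, here is a minimal sketch that splits a candidate string into its fields (foundation, component, transport, priority, address, port and type). The helper `parseCandidate` and the sample string with documentation-range addresses are assumptions for illustration; in the browser the same values are also exposed as properties on `RTCIceCandidate`.

```javascript
// Split an ICE candidate string into its fields. Layout of the string:
// candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> [raddr <addr>] [rport <port>] ...
// `parseCandidate` is a hypothetical helper, not part of the example application.
function parseCandidate(candidate) {
  const parts = candidate
    .replace(/^a=/, "")
    .replace(/^candidate:/, "")
    .trim()
    .split(/\s+/);
  if (parts.length < 8 || parts[6] !== "typ") {
    return null; // not a well-formed candidate line
  }
  const parsed = {
    foundation: parts[0],
    component: Number(parts[1]),
    protocol: parts[2].toLowerCase(),
    priority: Number(parts[3]),
    address: parts[4],
    port: Number(parts[5]),
    type: parts[7], // host, srflx, prflx or relay
  };
  // The remaining tokens are key/value pairs; only the related address is kept here.
  for (let i = 8; i + 1 < parts.length; i += 2) {
    if (parts[i] === "raddr") parsed.relatedAddress = parts[i + 1];
    if (parts[i] === "rport") parsed.relatedPort = Number(parts[i + 1]);
  }
  return parsed;
}

// Made-up candidate using documentation-range addresses.
const sampleCandidate =
  "candidate:842163049 1 udp 1694498815 203.0.113.7 46154 typ srflx raddr 192.168.1.20 rport 46154 generation 0";

console.log(parseCandidate(sampleCandidate));
```

A `srflx` (server reflexive) candidate like the one above describes the public address discovered through a STUN server, while `raddr`/`rport` point back to the local address it was derived from.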

Each ICE candidate is shared with the other peer (Agent -> client and client -> Agent), so that each peer knows how to reach the other. Once all ICE candidates are shared, a candidate pair is chosen for setting up the final communication; this decision is made by the WebRTC protocol.

    handleICECandidateEvent(event) {
      if (event.candidate) {
        // Handle ICE candidate event
        const { candidate } = event.candidate;
        const payload = {
          action: "receive-hd-candidates",
          device_id: this.name,
          value: {
            timestamp: Math.floor(Date.now() / 1000),
            session_id: this.sessionId,
            candidate: candidate,
          }
        };
        this.mqtt.publish(payload);
      }
    }
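
The pair selection itself happens inside the WebRTC stack, so the application never has to implement it. As background only, the sketch below shows how ICE (RFC 8445, section 6.1.2.3) computes the priority used to order candidate pairs before connectivity checks run. The function names and the two candidate lists are assumptions for illustration; the pair that is finally used must also pass its connectivity check, which this sketch does not model.

```javascript
// Pair priority as defined by ICE (RFC 8445, section 6.1.2.3):
//   priority = 2^32 * MIN(G, D) + 2 * MAX(G, D) + (G > D ? 1 : 0)
// G is the priority of the controlling agent's candidate, D that of the
// controlled agent's candidate. BigInt avoids precision loss above 2^53.
function pairPriority(controllingPriority, controlledPriority) {
  const g = BigInt(controllingPriority);
  const d = BigInt(controlledPriority);
  const min = g < d ? g : d;
  const max = g > d ? g : d;
  return (1n << 32n) * min + 2n * max + (g > d ? 1n : 0n);
}

// Build every combination of candidates and sort by descending pair priority.
// `rankPairs` is a hypothetical helper, not part of the example application.
function rankPairs(controllingCandidates, controlledCandidates) {
  const pairs = [];
  for (const local of controllingCandidates) {
    for (const remote of controlledCandidates) {
      pairs.push({
        local: local.type,
        remote: remote.type,
        priority: pairPriority(local.priority, remote.priority),
      });
    }
  }
  return pairs.sort((a, b) =>
    a.priority < b.priority ? 1 : a.priority > b.priority ? -1 : 0
  );
}

// Illustrative candidate priorities (typical host and srflx values).
const clientCandidates = [
  { type: "host", priority: 2130706431 },
  { type: "srflx", priority: 1694498815 },
];
const agentCandidates = [
  { type: "host", priority: 2130706431 },
  { type: "srflx", priority: 1694498815 },
];

const rankedPairs = rankPairs(clientCandidates, agentCandidates);
console.log(rankedPairs[0].local, rankedPairs[0].remote); // host host
```

With these values the host/host pair is checked first, which matches the intuition that a direct route on the same network is preferred over a route through a NAT mapping.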

## 5. Forward the stream

The Agent starts forwarding the RTP track to the client, using the chosen ICE candidate pair. The client attaches the RTP track to a `<video>` HTML element.

    handleTrackEvent(event) {
      const videoElement = this.videoRef.current;
      if (videoElement) {
        videoElement.srcObject = event.streams[0];
      }
    }
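
The handler above relies on `event.streams[0]` being present. A track event can arrive with an empty `streams` list when the sender did not associate the track with a stream, so a slightly more defensive variant could look like the sketch below. The helper `resolveStream` and its `createStream` parameter are assumptions for illustration; whether the Agent ever sends a track without a stream is not stated here.

```javascript
// Return the stream to attach to the <video> element. Falls back to building
// a new stream from the bare track when the event carries no stream.
// `resolveStream` is a hypothetical helper; `createStream` is injected so the
// function can also be exercised outside the browser.
function resolveStream(event, createStream) {
  if (event.streams && event.streams.length > 0) {
    return event.streams[0];
  }
  return createStream([event.track]);
}

// Browser usage inside handleTrackEvent (MediaStream is a browser API):
//   videoElement.srcObject = resolveStream(event, (tracks) => new MediaStream(tracks));

// Stand-in objects so the sketch can run outside the browser.
const fakeStream = { id: "stream-1" };
const withStream = resolveStream({ streams: [fakeStream], track: {} }, () => null);
const withoutStream = resolveStream(
  { streams: [], track: { kind: "video" } },
  (tracks) => ({ tracks })
);
console.log(withStream === fakeStream, withoutStream.tracks.length);
```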

The `<video>` HTML element is rendered in the browser and displays the RTP stream.

    render() {
      return (
        <video style={{width: "100%"}} ref={this.videoRef} muted controls></video>
      );
    }

## Example

In the `ui` folder you will find a React application that implements the above feature and provides a working example using our [`demo environment`](https://app-demo.kerberos.io). To run the project, install the dependencies with `npm install` and start it with `npm start`.

    cd ui/
    npm install
    npm start
Empty file.
7 changes: 7 additions & 0 deletions examples/livestream-yolov8/export/main.py
@@ -0,0 +1,7 @@
from ultralytics import YOLO

# Load a model
model = YOLO("helmet.pt")

# Export the model
model.export(format="tfjs")
76 changes: 76 additions & 0 deletions examples/livestream-yolov8/export/requirements.txt
@@ -0,0 +1,76 @@
absl-py==2.1.0
astunparse==1.6.3
certifi==2025.1.31
charset-normalizer==3.4.1
coloredlogs==15.0.1
contourpy==1.3.1
cycler==0.12.1
filelock==3.17.0
flatbuffers==25.2.10
fonttools==4.56.0
fsspec==2025.2.0
gast==0.6.0
google-pasta==0.2.0
grpcio==1.70.0
h5py==3.13.0
humanfriendly==10.0
idna==3.10
Jinja2==3.1.5
keras==3.8.0
kiwisolver==1.4.8
libclang==18.1.1
Markdown==3.7
markdown-it-py==3.0.0
MarkupSafe==3.0.2
matplotlib==3.10.0
mdurl==0.1.2
ml-dtypes==0.4.1
mpmath==1.3.0
namex==0.0.8
networkx==3.4.2
numpy==1.26.4
onnx==1.17.0
onnx2tf==1.26.3
onnx_graphsurgeon==0.5.5
onnxruntime==1.20.1
onnxslim==0.1.48
opencv-python==4.11.0.86
opt_einsum==3.4.0
optree==0.14.0
packaging==24.2
pandas==2.2.3
pillow==11.1.0
protobuf==5.29.3
psutil==7.0.0
py-cpuinfo==9.0.0
pybind11==2.13.6
Pygments==2.19.1
pyparsing==3.2.1
python-dateutil==2.9.0.post0
pytz==2025.1
PyYAML==6.0.2
requests==2.32.3
rich==13.9.4
scipy==1.15.2
seaborn==0.13.2
six==1.17.0
sng4onnx==1.0.4
sympy==1.13.1
tensorboard==2.18.0
tensorboard-data-server==0.7.2
tensorflow==2.18.0
tensorflow-io-gcs-filesystem==0.37.1
tensorflow-macos==2.16.2
termcolor==2.5.0
tf_keras==2.18.0
tflite-support==0.1.0a1
torch==2.6.0
torchvision==0.21.0
tqdm==4.67.1
typing_extensions==4.12.2
tzdata==2025.1
ultralytics==8.3.78
ultralytics-thop==2.0.14
urllib3==2.3.0
Werkzeug==3.1.3
wrapt==1.17.2