Add background segmentation mask #142

Open
wants to merge 5 commits into
base: main
99 changes: 99 additions & 0 deletions index.html
@@ -1688,5 +1688,104 @@ <h2>MediaStream in workers</h2>
};</pre>
</div>
</section>
<section>
<h2>Background segmentation mask</h2>
<p>Some platforms or User Agents may provide built-in support for background segmentation of video frames, in particular for camera video streams.
Web applications may want to control whether background segmentation is computed at the source level and to get access to the computed segmentation masks.
This allows a web application, for instance,
to do custom framing or background blurring or replacement
while leveraging the platform-computed background segmentation.
It also allows the web application
to access the original unmodified frame and
to fine-tune frame modifications to its liking.
For that reason, we extend {{MediaStreamTrack}} with the following constrainable properties and {{VideoFrameMetadata}} with the following member.
</p>
<pre class="idl">
partial dictionary MediaTrackSupportedConstraints {
boolean backgroundSegmentationMask = true;
};

partial dictionary MediaTrackConstraintSet {
ConstrainBoolean backgroundSegmentationMask;
Member

Would it ever be interesting and feasible to tweak the parameters by which segmentation is done?

@riju riju May 15, 2024

At least on Windows, the platform model does not allow tweaking segmentation parameters today. Using tensorflow.js with the BodyPix model for blur, I see there's at least a segmentationThreshold parameter. Maybe it's the same as foregroundThresholdProbability with the MediaPipeSelfieSegmentation model?

Did you have some other parameters in mind?

[Image: mediapipe_parameters]

Member

> Did you have some other parameters in mind?

I am not knowledgeable enough on what parameters would be best to include. I was mostly wondering if this is something we foresee extending from a boolean to a set of parameters, and if so, whether there was a viable path for such future extensions given the current API shape.

Contributor Author

In the Media Capture API, the parameter space is flat, not hierarchical.

As an example, there is a constrainable property called whiteBalanceMode which can be constrained to manual. If one then wants to manually change the white balance, there is a constrainable property called colorTemperature which can be constrained separately in order to do that.
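
A minimal sketch of that flat shape, assuming the two property names mentioned above. The helper name `manualWhiteBalance` and the 5000 K value are hypothetical and only there for illustration:

```javascript
// Sketch only: both properties sit side by side in one constraint set.
// Nothing is nested under whiteBalanceMode; colorTemperature is a sibling.
function manualWhiteBalance(kelvin) {
  return {
    whiteBalanceMode: 'manual',   // switch the mode
    colorTemperature: kelvin,     // then constrain the value it unlocks
  };
}

// Possible usage in a page, given a camera MediaStreamTrack `track`:
//   await track.applyConstraints({advanced: [manualWhiteBalance(5000)]});
```

Because the set is flat, a later addition only means one more sibling key, not a change to the shape of an existing one.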

So if we later wanted to add a numeric constrainable property called backgroundSegmentationThreshold (which could pre-process the segmentation mask into a black and white mask according to the threshold, without shades of grey) or a string constrainable property called backgroundSegmentationModel (to select a particular AI model), we could certainly do that.
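
To make the threshold idea concrete, here is a rough sketch of the kind of pre-processing such a property might imply. Nothing here is part of the proposal: `binarizeMask` and its `threshold` parameter are hypothetical, and the input is assumed to be the RGBA pixel data of a greyscale mask (for example `ImageData.data` read back from a canvas):

```javascript
// Sketch only: turns a greyscale mask into a pure black and white mask.
// `rgba` holds 4 bytes per pixel; `threshold` is assumed to lie in [0, 1].
function binarizeMask(rgba, threshold) {
  const out = new Uint8ClampedArray(rgba.length);
  const cutoff = threshold * 255;
  for (let i = 0; i < rgba.length; i += 4) {
    // The mask is grey, so the red channel alone carries the likelihood.
    const value = rgba[i] >= cutoff ? 255 : 0;
    out[i] = value;       // red
    out[i + 1] = value;   // green
    out[i + 2] = value;   // blue
    out[i + 3] = 255;     // fully opaque
  }
  return out;
}
```

An application can already do this itself on the greyscale mask, which is one reason a boolean is enough as a starting point.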

};

partial dictionary MediaTrackSettings {
boolean backgroundSegmentationMask;
};

partial dictionary MediaTrackCapabilities {
sequence&lt;boolean&gt; backgroundSegmentationMask;
};</pre>
<section>
<h3>VideoFrameMetadata dictionary extensions</h3>
<pre class="idl">
partial dictionary VideoFrameMetadata {
ImageBitmap backgroundSegmentationMask;
};</pre>
<section>
<h4>Dictionary {{VideoFrameMetadata}} Members</h4>
<dl data-link-for="VideoFrameMetadata" data-dfn-for="VideoFrameMetadata" class="dictionary-members">
<dt><dfn><code>backgroundSegmentationMask</code></dfn>
of type
<span class="idlMemberType">{{ImageBitmap}}</span></dt>
<dd>
<p>A background segmentation mask with
white denoting certainly foreground,
black denoting certainly background and
grey denoting uncertainty or ambiguity with
light shades of grey denoting likely foreground and
dark shades of grey denoting likely background.</p>
</dd>
</dl>
</section>
</section>
<section>
<h3>Example</h3>
<pre class="example">
// Open camera.
const stream = await navigator.mediaDevices.getUserMedia({video: true});
const [videoTrack] = stream.getVideoTracks();

// Try to enable background segmentation mask.
const videoCapabilities = videoTrack.getCapabilities();
if ((videoCapabilities.backgroundSegmentationMask || []).includes(true)) {
  await videoTrack.applyConstraints({backgroundSegmentationMask: {exact: true}});
} else {
  // Background segmentation mask is not supported by the platform or
  // by the camera. Consider falling back to some other method.
}

const canvasContext = document.querySelector('canvas').getContext('2d');
const videoProcessor = new MediaStreamTrackProcessor({track: videoTrack});
const videoProcessorReader = videoProcessor.readable.getReader();

for (;;) {
  const {done, value: videoFrame} = await videoProcessorReader.read();
  if (done)
    break;
  const {backgroundSegmentationMask} = videoFrame.metadata();
  if (backgroundSegmentationMask) {
    // Draw the video frame.
    canvasContext.globalCompositeOperation = 'copy';
    canvasContext.drawImage(videoFrame, 0, 0);
    // Draw (or multiply with) the mask.
    // The result is the foreground on black.
    canvasContext.globalCompositeOperation = 'multiply';
    canvasContext.drawImage(backgroundSegmentationMask, 0, 0);
  } else {
    // Everything is background. Fill with black.
    canvasContext.globalCompositeOperation = 'copy';
    canvasContext.fillStyle = 'black';
    canvasContext.fillRect(
        0,
        0,
        canvasContext.canvas.width,
        canvasContext.canvas.height);
  }
  // Release the frame's underlying resources once it has been drawn.
  videoFrame.close();
}
</pre>
</section>
</section>
</body>
</html>