Add face detection constraints and VideoFrame attributes #48

Closed · wants to merge 1 commit

Conversation

@eehakkin (Contributor) commented Jan 11, 2022

This spec update is a follow-up to w3c/mediacapture-image#292 and allows face detection as described in #44.

The changes add new face detection constrainable properties which are used to control face detection.

Face detection results are exposed on VideoFrames through a new readonly detectedFaces sequence attribute.

This allows the following kind of code to be used for face detection:

// main.js:
// Check if face detection is supported by the browser
const supports = navigator.mediaDevices.getSupportedConstraints();
if (supports.faceDetectionMode) {
  // The browser supports the face detection constraint.
} else {
  throw new Error('Face detection is not supported');
}

// Open camera with face detection enabled
const stream = await navigator.mediaDevices.getUserMedia({
  video: {faceDetectionMode: ['bounding-box', 'contour']}
});
const [videoTrack] = stream.getVideoTracks();

// Use a video worker and show the result to the user.
const videoElement = document.querySelector("video");
const videoGenerator = new MediaStreamTrackGenerator({kind: 'video'});
const videoProcessor = new MediaStreamTrackProcessor({track: videoTrack});
const videoSettings = videoTrack.getSettings();
const videoWorker = new Worker('video-worker.js');
videoWorker.postMessage({
  videoReadable: videoProcessor.readable,
  videoWritable: videoGenerator.writable
}, [videoProcessor.readable, videoGenerator.writable]);
videoElement.srcObject = new MediaStream([videoGenerator]);

// video-worker.js:
self.onmessage = async function(e) {
  const videoTransformer = new TransformStream({
    async transform(videoFrame, controller) {
      for (const face of videoFrame.detectedFaces) {
        console.log(
          `Face @ (${face.contour[0].x}, ${face.contour[0].y}), ` +
                 `(${face.contour[1].x}, ${face.contour[1].y}), ` +
                 `(${face.contour[2].x}, ${face.contour[2].y}), ` +
                 `(${face.contour[3].x}, ${face.contour[3].y})`);
      }
      controller.enqueue(videoFrame);
    }
  });
  e.data.videoReadable
  .pipeThrough(videoTransformer)
  .pipeTo(e.data.videoWritable);
}

@eehakkin (Contributor, Author):

Sorry for closing and reopening. This one should be open and w3c/mediacapture-image#292 should be closed.

@riju commented Jan 17, 2022

@alvestrand, @youennf: We tried to incorporate the review comments from our last discussions. Could you please take a look?

@riju commented Jan 27, 2022

Friendly ping @alvestrand, @youennf, @jan-ivar

@dontcallmedom (Member):

@riju I think it would help if you could document exactly which comments from the last discussion you incorporated, and how - for instance, I still see a FaceExpression enum, with fewer values, but still some.

@eehakkin (Contributor, Author):

@dontcallmedom I removed face expressions completely.

@eehakkin (Contributor, Author) commented Feb 5, 2022

The following example from #57 shows how to use face detection, background concealment (see #45) and eye gaze correction (see #56) with MediaStreamTrack Insertable Media Processing using Streams:

// main.js:
// Open camera.
const stream = await navigator.mediaDevices.getUserMedia({video: true});
const [videoTrack] = stream.getVideoTracks();

// Use a video worker and show the result to the user.
const videoElement = document.querySelector('video');
const videoWorker = new Worker('video-worker.js');
videoWorker.postMessage({track: videoTrack}, [videoTrack]);
const {data} = await new Promise(r => videoWorker.onmessage = r);
videoElement.srcObject = new MediaStream([data.videoTrack]);

// video-worker.js:
self.onmessage = async ({data: {track}}) => {
  // Apply constraints.
  let customBackgroundBlur = true;
  let customEyeGazeCorrection = true;
  let customFaceDetection = false;
  let faceDetectionMode;
  const capabilities = track.getCapabilities();
  if (capabilities.backgroundBlur && capabilities.backgroundBlur.max > 0) {
    // The platform supports background blurring.
    // Let's use platform background blurring and skip the custom one.
    await track.applyConstraints({
      advanced: [{backgroundBlur: capabilities.backgroundBlur.max}]
    });
    customBackgroundBlur = false;
  } else if ((capabilities.faceDetectionMode || []).includes('contour')) {
    // The platform supports face contour detection but not background
    // blurring. Let's use platform face contour detection to aid custom
    // background blurring.
    faceDetectionMode ||= 'contour';
    await track.applyConstraints({
      advanced: [{faceDetectionMode}]
    });
  } else {
    // The platform does not support background blurring nor face contour
    // detection. Let's use custom face contour detection to aid custom
    // background blurring.
    customFaceDetection = true;
  }
  if ((capabilities.eyeGazeCorrection || []).includes(true)) {
    // The platform supports eye gaze correction.
    // Let's use platform eye gaze correction and skip the custom one.
    await track.applyConstraints({
      advanced: [{eyeGazeCorrection: true}]
    });
    customEyeGazeCorrection = false;
  } else if ((capabilities.faceDetectionLandmarks || []).includes(true)) {
    // The platform supports face landmark detection but not eye gaze
    // correction. Let's use platform face landmark detection to aid custom eye
    // gaze correction.
    faceDetectionMode ||= 'presence';
    await track.applyConstraints({
      advanced: [{
        faceDetectionLandmarks: true,
        faceDetectionMode
      }]
    });
  } else {
    // The platform does not support eye gaze correction nor face landmark
    // detection. Let's use custom face landmark detection to aid custom eye
    // gaze correction.
    customFaceDetection = true;
  }

  // Load custom libraries which may utilize TensorFlow and/or WASM.
  const requiredScripts = [].concat(
    customBackgroundBlur    ? 'background.js' : [],
    customEyeGazeCorrection ? 'eye-gaze.js'   : [],
    customFaceDetection     ? 'face.js'       : []
  );
  importScripts(...requiredScripts);

  const generator = new VideoTrackGenerator();
  self.postMessage({videoTrack: generator.track}, [generator.track]);
  const {readable} = new MediaStreamTrackProcessor({track});
  const transformer = new TransformStream({
    async transform(frame, controller) {
      // Detect faces or retrieve detected faces.
      const detectedFaces =
        customFaceDetection
          ? await detectFaces(frame)
          : frame.detectedFaces;
      // Blur the background if needed.
      if (customBackgroundBlur) {
        const newFrame = await blurBackground(frame, detectedFaces);
        frame.close();
        frame = newFrame;
      }
      // Correct the eye gaze if needed.
      if (customEyeGazeCorrection && (detectedFaces || []).length > 0) {
        const newFrame = await correctEyeGaze(frame, detectedFaces);
        frame.close();
        frame = newFrame;
      }
      controller.enqueue(frame);
    }
  });
  await readable.pipeThrough(transformer).pipeTo(generator.writable);
};

@alvestrand (Contributor):

Waiting for an explainer, or a possible move to WebCodecs (since it does frame modifications).

@youennf (Contributor) commented May 19, 2022

We should probably work on the abstract attach-metadata-to-video-frame mechanism first; we could then reuse that mechanism here.
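
As a hedged sketch: if such a generic mechanism existed, custom processing code could attach its own face detection results to frames instead of this PR defining a dedicated attribute. The metadata() accessor and the metadata member of VideoFrameInit below follow the direction being discussed in WebCodecs and are assumptions, not part of this PR; detectFaces() is the illustrative custom detector from the earlier example.

// video-worker.js (hypothetical): attach custom face detection results to
// each frame via a generic VideoFrame metadata mechanism.
// frame.metadata() and the metadata member of VideoFrameInit are assumed
// per the WebCodecs discussion; 'detectedFaces' is an illustrative entry name.
const annotator = new TransformStream({
  async transform(frame, controller) {
    const detectedFaces = await detectFaces(frame);
    const annotated = new VideoFrame(frame, {
      metadata: {...frame.metadata(), detectedFaces}
    });
    frame.close();
    controller.enqueue(annotated);
  }
});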

@riju commented May 24, 2022

@alvestrand @youennf: Here's an explainer we have been working on.

@youennf (Contributor) commented Jun 14, 2022

The explainer is pretty clear to me.
I am not sure what we do with explainers, but I guess it should be reviewed by the WG, and we can discuss at that point whether to merge it.
Some comments on the explainer:

  1. I would be tempted to make the API surface as minimal as possible (what is the MVP?) and leave the rest to a dedicated 'future steps' section. For instance, maybe the MVP only needs the faceDetectionMode constraint (not the landmarks/numfaces/contourpoints constraints) with a reduced set of values ("none" and "presence"); a rough sketch of such an MVP follows this list. I am not sure about the difference between presence and contour, for instance, which is somewhat distracting. Are FaceLandmarks part of the MVP as well?
  2. The proposal is based on the VideoFrameMetadata construct, which is fine. We should try to finalise this discussion in WebCodecs.
  3. DetectedFace has a required id and a required probability. I can see 'id' being useful; maybe probability should be optional.
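
As a rough illustration of the reduced MVP mentioned in item 1 (the constraint name and the 'none'/'presence' values come from the explainer; everything else here is illustrative):

// Hypothetical MVP usage: only a faceDetectionMode constraint with
// 'none' and 'presence' values, no landmark/contour/numfaces controls.
const stream = await navigator.mediaDevices.getUserMedia({video: true});
const [track] = stream.getVideoTracks();
const capabilities = track.getCapabilities();
if ((capabilities.faceDetectionMode || []).includes('presence')) {
  // Enable face presence detection; frames then carry face detection results.
  await track.applyConstraints({advanced: [{faceDetectionMode: 'presence'}]});
}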

<h3>{{VideoFrame}}</h3>
<pre class="idl">
partial interface VideoFrame {
  readonly attribute FrozenArray&lt;DetectedFace&gt;? detectedFaces;
Review comment (Contributor):

Based on discussions in w3c/webcodecs#559, the direction might be to move to
partial dictionary VideoFrameMetadata
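
If that direction were taken, only the read side of the earlier examples would change; roughly (the metadata() accessor and the detectedFaces entry name follow the w3c/webcodecs#559 direction and are not defined by this PR):

// With the readonly attribute proposed in this PR:
for (const face of videoFrame.detectedFaces ?? []) { /* ... */ }

// With a partial dictionary VideoFrameMetadata (w3c/webcodecs#559 direction):
for (const face of videoFrame.metadata().detectedFaces ?? []) { /* ... */ }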

@alvestrand (Contributor):

Assumed to be superseded by #78

@alvestrand closed this on Nov 29, 2022