Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix a bug where ISVCDecoder::DecodeFrameNoDelay() could fail when decoding an H.264 stream encoded by the Apple HWA encoder. #3787

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

sile
Copy link

@sile sile commented Sep 18, 2024

Let me report a decoding issue and propose a patch to fix it.

I recorded H.264 streams published by Safari and transmitted through a WebRTC SFU server.
When I attempted to decode these streams using openh264 (commit: 3668daf), the method ISVCDecoder::DecodeFrameNoDelay() failed with the error code -4 for some of the streams, whereas popular players like ffplay were able to decode the streams without any errors.

The following file contains an H.264 elementary stream of a failed attempt.
input.h264.gz

While I cannot provide the exact steps to reproduce the issue, it has been confirmed that such a stream can be generated by Safari on macOS 15.0 and iPad 17.6.1.

After inserting additional debug prints and examining the openh264 source code, it was discovered that the error originated from the following code:

if (iMaxLongTermFrameIdx > int32_t (pSps->uiLog2MaxFrameNum)) {
//ISO/IEC 14496-10:2009(E) 7.4.3.3 Decoded reference picture marking semantics page 96
return GENERATE_ERROR_NO (ERR_LEVEL_SLICE_HEADER, ERR_INFO_INVALID_REF_MARKING);
}

In the attached .h264 file, the iMaxLongTermFrameIdx and pSps->uiLog2MaxFrameNum had values of 13 and 12, respectively. Therefore, the above condition was met, which resulted in the error being returned.

However, the H.264 specification says as follows:

max_long_term_frame_idx_plus1 minus 1 specifies the maximum value of long-term frame index allowed for long-term reference pictures (until receipt of another value of max_long_term_frame_idx_plus1). The value of max_long_term_frame_idx_plus1 shall be in the range of 0 to max_num_ref_frames, inclusive.

From: [Rec. ITU-T H.264 (08/2021)] 7.4.3.3 Decoded reference picture marking semantics (p.97)

Although I am not well-versed in the H.264 specification, it seems that pSps->iNumRefFrames should be used in the code above instead of pSps->uiLog2MaxFrameNum.
In the attached file, the value of pSps->iNumRefFrames was 15. After modifying the code to use pSps->iNumRefFrames (i.e., the code in this PR branch), the decoding process succeeded without any errors.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant