This PR removes the CLOSED-CAPTIONS attribute (and a few others) from
the EXT-X-I-FRAME-STREAM-INF tag in the HLS master playlist.
According to the HLS specification (draft-pantos-hls-rfc8216bis-16,
Section 4.4.6.3), the CLOSED-CAPTIONS attribute is not defined for
EXT-X-I-FRAME-STREAM-INF tags. Including it makes the playlist
non-compliant and may cause playback issues on strictly compliant
players.
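As a rough illustration of the fix, the sketch below drops attributes that should not appear on an I-frame variant. The function names and the dict-based attribute representation are hypothetical and are not the packager's actual code. The excluded set reflects a reading of the draft (FRAME-RATE, AUDIO, SUBTITLES, CLOSED-CAPTIONS) and should be checked against the referenced section.

```python
# Attributes assumed to be undefined for EXT-X-I-FRAME-STREAM-INF
# (assumption based on a reading of the draft; verify against the spec).
IFRAME_EXCLUDED_ATTRIBUTES = {"FRAME-RATE", "AUDIO", "SUBTITLES", "CLOSED-CAPTIONS"}


def iframe_stream_inf_attributes(stream_inf_attributes):
    """Return a copy of the attributes with the excluded ones removed."""
    return {
        name: value
        for name, value in stream_inf_attributes.items()
        if name not in IFRAME_EXCLUDED_ATTRIBUTES
    }


def format_tag(attributes, uri):
    """Serialize attributes into an EXT-X-I-FRAME-STREAM-INF line."""
    pairs = [f"{name}={value}" for name, value in attributes.items()]
    pairs.append(f'URI="{uri}"')
    return "#EXT-X-I-FRAME-STREAM-INF:" + ",".join(pairs)
```

Attribute values are passed through verbatim, so quoted values such as CODECS need to carry their own quotes.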
References
[draft-pantos-hls-rfc8216bis-16, Section
4.4.6.3](https://datatracker.ietf.org/doc/html/draft-pantos-hls-rfc8216bis-16#section-4.4.6.3)
Affected versions: >= 2.5
---------
Co-authored-by: Xavier Laffargue <xavier.laffargue@radio-canada.ca>
Current deps require an old CMake version, support for which has been
removed from latest CMake in Ubuntu and Arch.
```
CMake Error at packager/third_party/c-ares/source/CMakeLists.txt:3 (CMAKE_MINIMUM_REQUIRED):
Compatibility with CMake < 3.5 has been removed from CMake.
```
This upgrades:
- abseil-cpp, 20240116.0 => 20260107.1
- c-ares, v1.20.1 => v1.34.5
- json, v3.10.1-115-g954b10ad => v3.12.0
- libwebm, 1.0.0.28 => 1.0.0.32
With these changes, the build has been verified with CMake 3.31.11
(latest v3) and CMake 4.2.3 (latest v4).
---------
Co-authored-by: Joey Parrish <joeyparrish@google.com>
**Description**
This PR introduces support for subsample encryption for Dolby AC-4 audio
streams, including muxing and packaging, in Shaka Packager.
**Motivation**
Subsample encryption is a mode of Common Encryption (CENC) that allows
selective encryption of media samples, leaving specific portions (such
as headers) in the clear. In most cases, subsample encryption is applied
to video streams (e.g., H.265/HEVC or AV1). It can also be used for
audio streams, even though many pipelines default to full-sample
encryption for audio. For Dolby AC-4, this is particularly useful
because the AC-4 frame header (ac4_toc) contains configuration data that
must remain accessible to parsers and playback systems even when the
rest of the frame is encrypted.
By implementing subsample encryption for AC-4:
- Parsers can validate and process encrypted streams without full
decryption.
- Playback systems can perform certain bitstream operations (e.g.,
stream identification, sync validation) efficiently.
- This aligns with best practices for encrypted media delivery and
improves performance with streaming platforms.
**Implementation Details**
- The implementation ensures that the ac4_toc() portion of each AC-4
frame remains unencrypted.
- The remaining payload is encrypted using the selected CENC protection
scheme (e.g., cbcs or cenc).
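The split described above can be sketched as follows. This is a minimal illustration, not the packager's implementation: `ac4_subsample` is a hypothetical helper, and it assumes the parser has already determined the byte length of ac4_toc() for the frame.

```python
def ac4_subsample(frame_size, toc_size):
    """Split one AC-4 frame into (clear_bytes, encrypted_bytes).

    toc_size is the assumed byte length of ac4_toc(), which stays in the
    clear; everything after it is handed to the protection scheme.
    """
    if not 0 <= toc_size <= frame_size:
        raise ValueError("toc_size must lie within the frame")
    return toc_size, frame_size - toc_size
```

For example, a 512-byte frame with a 12-byte table of contents yields 12 clear bytes followed by 500 protected bytes.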
**Specification Reference**
This feature is not yet covered by any published specification. However,
it is described in detail in the AC-4 ISOBMFF specification, which has
not yet been publicly released but is expected to be published in the
near future.
Once available, it will provide formal guidance on subsample encryption
for AC-4, including the handling of the ac4_toc() structure and
encryption boundaries.
**Notes**
- This implementation is compatible with the upcoming public release of
the AC-4 ISOBMFF specification.
- The feature has been tested on encryption/decryption, DASH, HLS, and
DRM. The generated content conforms to the expected encryption behavior.
- Additional documentation will be provided once the specification is
officially published.
---------
Co-authored-by: Xingzhao Yun <xyun@dolby.com>
Before, we would add clear ranges to the start of the subsample,
therefore moving the encrypted start further into the video frame; now
we add clear ranges to the end by adding a new subsample (if needed).
This seems more correct. Both are allowed by the spec, but it would be
better to encrypt the beginning of the frame to give attackers less info
about the start of the frame. Having the clear data at the end of the
frame doesn't give attackers much info since most data depends on the
encrypted state.
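A small sketch of the two layouts, for comparison. It assumes the extra clear bytes come from trimming the protected range to a multiple of the 16-byte AES block size, which is an assumption made for illustration; the function names are hypothetical.

```python
BLOCK = 16  # AES block size in bytes (assumed reason for the clear remainder)


def leading_clear(clear, cipher):
    """Old layout: the remainder grows the clear bytes at the start."""
    rem = cipher % BLOCK
    return [(clear + rem, cipher - rem)]


def trailing_clear(clear, cipher):
    """New layout: the remainder becomes a trailing clear-only subsample."""
    rem = cipher % BLOCK
    subsamples = [(clear, cipher - rem)]
    if rem:
        # A (clear, 0) entry places the clear bytes after the protected range.
        subsamples.append((rem, 0))
    return subsamples
```

With 10 clear and 100 protected bytes, the old layout gives `[(14, 96)]` and the new one gives `[(10, 96), (4, 0)]`, so encryption starts at the same offset as the original subsample.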
The `SimpleHlsNotifier::NotifyEncryptionUpdate` method was modified:
When this method is called with the Common System ID, it now checks if
the stream's encryption method is CENC. If it is CENC, the notifier
skips adding the EXT-X-KEY tag with KEYFORMAT="identity", as CENC
content should be handled by the specific DRM system's key format (e.g.,
Widevine's urn:uuid:edef8ba9-79d6-4ace-a3c8-27dcd51d21ed).
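The decision can be sketched roughly as below. This is not the notifier's code: `should_write_identity_key` is a hypothetical helper, and representing the encryption method as a lowercase string is an assumption for illustration.

```python
# Common System ID, written here in hyphenated UUID form.
COMMON_SYSTEM_ID = "1077efec-c0b2-4d02-ace3-3c1e52e2fb4b"


def should_write_identity_key(system_id, encryption_method):
    """Decide whether to emit EXT-X-KEY with KEYFORMAT="identity"."""
    if system_id.lower() != COMMON_SYSTEM_ID:
        # Other system IDs are written with their own key format.
        return False
    # For CENC content the identity entry is skipped.
    return encryption_method != "cenc"
```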
Closes #1439
---------
Co-authored-by: Joey Parrish <joeyparrish@google.com>
Adding MV-HEVC support limited to stereo video, issue #1483.
Stereo video coded in MV-HEVC is becoming more widely available: Apple
Vision Pro supports stereo video playback in MV-HEVC, and both the
headset and iPhone support capturing stereo video using the format.
FFmpeg has also added support for MV-HEVC.
Note that this PR is only focusing on adding MV-HEVC support to .mp4 for
encryption/decryption support. Proper HLS and DASH support will still
need to be added.
Support Dolby Vision profile 8.1, 8.2, 8.4, 10.1, 10.4 signaling in HLS
and DASH.
Adds new option `--use_dovi_supplemental_codecs` (off by default) to use
SUPPLEMENTAL-CODECS in HLS and `scte214:supplementalCodecs` and
`scte214:supplementalProfiles` for DASH.
To maintain compatibility with existing players, the current behavior of
using two entries in the manifest remains the default. This will change
in a future version, where `use_dovi_supplemental_codecs` will be
enabled by default.
Adds Dolby Vision compatible brands, 'db1p', 'db2g', 'db4g', 'db4h',
'dby1' based on https://mp4ra.org/#/brands
---------
Co-authored-by: Xingzhao Yun <xyun@dolby.com>
After the change that added forced command-line ordering, adaptation
sets were in places referenced by their sort index (the minimum
representation index they contained) rather than by their ID.
Instead always refer to adaptation sets by their own ID, and use the
index only as an optional sort key.
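A minimal sketch of the distinction, using hypothetical names (`AdaptationSet`, `ordered`, `find_by_id`). Placing sets without a sort index after the others is an assumption made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AdaptationSet:
    id: int
    # Minimum representation index; only set when forced ordering is used.
    sort_index: Optional[int] = None


def ordered(sets):
    """Sort by the optional index; sets without one keep their order at the end."""
    return sorted(sets, key=lambda s: (s.sort_index is None, s.sort_index or 0))


def find_by_id(sets, set_id):
    """References always go through the adaptation set's own ID."""
    return next(s for s in sets if s.id == set_id)
```

Sorting changes only the output order; lookups by ID return the same set regardless of where it ends up.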
Fixes #1393
Set the start number in representation to the segment index that is sent by muxer.
With this enhancement, you can now specify the initial sequence number
to be used on the generated segments when calling the packager.
With the old implementation, it was always starting with "1".
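To illustrate the effect on segment naming, here is a small sketch. `segment_names` is a hypothetical helper, and it only handles a plain `$Number$` placeholder without format specifiers.

```python
def segment_names(template, start_number, count):
    """Expand a $Number$ template for `count` segments from `start_number`."""
    return [
        template.replace("$Number$", str(start_number + i))
        for i in range(count)
    ]
```

With a start number of 1 this reproduces the old numbering; any other value shifts the whole sequence.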
---------
Co-authored-by: Cosmin Stejerean <cstejerean@meta.com>
This PR adds parsing of teletext styling, and rendering of the styling
in output TTML and WebVTT subtitle tracks.
Beyond unit tests, I've used the sample
https://drive.google.com/file/d/19ZYsoeUfH85gEilQkaAdLbPhC4CxhDEh/view?usp=sharing
which has rather advanced subtitling with two separate rows at the same
time, where one is left aligned and another is right aligned. This
necessitates two parallel cues to be rendered. It also has some colored
text.
Solves #1335.
## parse teletext styling and formatting
Extend the teletext parser to parse the teletext styling and formatting.
This includes translating rows into regions, calculating alignment
from start and stop position of the text, and extracting text and
background colors.
The colors are limited to full lines.
Both lines and regions are propagated in the TextSample structures.
This is because the number of lines may differ from different sources.
For teletext, there are 24 rows, but they are essentially always
used with double height, so the number of output lines is 12
from 0 to 11.
The corresponding regions are denoted "ttx_R",
where R is an integer row number. A renderer can use either
the line number or the region ID to render the text.
## ttml generation for teletext to EBU-TT-D
Add support to render teletext input in EBU-TT-D (IMSC-1) format.
This includes appropriate regions ttx_0 to ttx_11 signalled
in the TextSamples, alignment and text and background colors.
The general TTML output has been changed to always include
metadata, layout, and styling nodes, even if they are empty.
EBU-TT-D is detected by the presence of "ttx_?" regions in the
samples. If detected, extra TTML elements will be added and
the EBU-TT-D linePadding used as well.
Appropriate styles for background and text colors are generated
depending on the color and backgroundColor attributes in the
text fragments.
## adapt WebVTT output to teletext TextSample.
Teletext input generates both a region with prefix ttx_
and a floating point line number (e.g. 9.5) in the
range 0 to 11.5 (due to input 0-23 as double lines).
The output is adapted to drop such regions
and convert the line number to an integer
since the standard only uses floats for percent
values but not for plain line numbers.
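The adaptation can be sketched as below. `adapt_for_webvtt` is a hypothetical helper, and truncating the line number with `int()` is an assumption; the actual rounding behavior is not stated here.

```python
def adapt_for_webvtt(region_id, line):
    """Drop teletext regions and make the line number an integer."""
    if region_id.startswith("ttx_"):
        region_id = ""  # teletext regions are not written to WebVTT
    if line is not None:
        line = int(line)  # assumed truncation, e.g. 9.5 becomes 9
    return region_id, line
```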
Integration tests can still be skipped by passing `-DSKIP_INTEGRATION_TESTS=ON` in
the build configuration. Fix integration tests so they run correctly when building out of tree.
Use FindPython3 in CMake to fix build and integration tests on Windows.
Use the second sample in mp4 and webm formats. #835 had issues with
merging due to golden file conflicts. Because we cannot make dependent
pull requests, this is a replica of #835.
---------
Signed-off-by: Cosmin Stejerean <cstejerean@meta.com>
Co-authored-by: Cosmin Stejerean <cstejerean@meta.com>
Add startWithSAP/subsegmentStartsWithSAP for aac, ac3, ec3 and ac4 audio tracks according to the LIVE or VOD profile.
Replaces #1055
Partial solution for #364
---------
Co-authored-by: Xingzhao Yun <xyun@dolby.com>
Co-authored-by: Joey Parrish <joeyparrish@google.com>
This is based on comments at
https://github.com/google/shaka-packager/pull/891. The muxer is deciding
whether to write to a single file or a segment file based on the
configuration.
Example:
```
../packager 'in=TOS.ts,stream=video,output=tos_video.ts,playlist_name=tos_video.m3u8' \
'in=TOS.ts,stream=audio,output=tos_audio.ts,playlist_name=tos_audio.m3u8' \
--hls_master_playlist_output tos.m3u8
```
Tested the content using ExoPlayer.
---------
Co-authored-by: Cosmin Stejerean <cstejerean@meta.com>
feat: Added audio specific configuration udts box to AudioSampleEntry
for MP4 input/output. DASH tags for DTS audio as specified in ETSI TS
103 491 and ETSI TS 102 114.
Closes #1301
---------
Co-authored-by: Cosmin Stejerean <cstejerean@meta.com>
As part of the CMake port we updated the duration formatting to contain
a maximum of 6 decimal places, without trailing zeros. There was a bug,
however, where it used 6 significant digits rather than 6 decimal places
(`%g` rather than `%f`).
This fixes the bug and also updates the MPD sample files for the
integration tests to contain a maximum of 6 decimal places.
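The difference is easy to reproduce with printf-style formatting. The sketch below uses Python's `%` operator, which follows the same `%g`/`%f` conventions; the function names are illustrative only.

```python
def format_with_g(seconds):
    """Buggy variant: %g keeps 6 significant digits."""
    return "%g" % seconds


def format_duration(seconds):
    """Fixed variant: 6 decimal places, then trailing zeros removed."""
    text = "%.6f" % seconds
    # "%.6f" always contains a decimal point, so only fractional zeros
    # (and then a dangling ".") are stripped.
    return text.rstrip("0").rstrip(".")
```

For 1234.56789 the `%g` variant yields `1234.57`, while the fixed variant yields `1234.56789`.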
The current libwebm integration test samples contain `libwebm-0.2.1`
however we have updated to a newer version of libwebm so we need to
update the samples.
As of `libwebm-0.3.0` this signature has been frozen so we won't have to
do this again.
It appears that not all Apple implementations follow the HLS guidelines.
Although DEFAULT is optional for an audio track and should default to NO
when absent, in practice the native HLS players in Safari and on iOS
devices treat a missing DEFAULT as a MAYBE.
Fixes #1169
Legacy players, e.g. older versions of ExoPlayer, do not handle the default WebVTT text alignment correctly. For backwards compatibility, we need to specify `align:center` explicitly for cues without text alignment.
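A minimal sketch of the idea, operating on a cue's settings string. `with_explicit_alignment` is a hypothetical helper and does not mirror the packager's WebVTT writer.

```python
def with_explicit_alignment(settings):
    """Append align:center when the cue settings carry no alignment."""
    parts = settings.split()
    if any(part.startswith("align:") for part in parts):
        return settings  # an explicit alignment is left untouched
    return " ".join(parts + ["align:center"])
```

Cues that already set an alignment, such as `align:left`, are not modified.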
Fixes #925.
This changes the default MP4 output to use TTML and adds a way to
choose which one is used. This is done with 'format=ttml+mp4' or
'format=vtt+mp4'.
This also fixes the boxes output in WebVTT in MP4.
Change-Id: Ieaa7fc44fbf4dc020a5bb70cfa3578ec10e088ce
This only supports TTML output; meaning the user can convert WebVTT
into TTML, but not the other way around. This will be useful for
DVB-sub subtitles that would be better supported within TTML.
This only adds text-based output; a follow-up will add MP4 support.
Change-Id: I0944b7df95d7765e55f203fc5e9a644f5c455dd8
We currently have a bug about non-deterministic output in the MPD
generator. This works around that bug by optionally doing everything
in a single thread. This allows us to run manifest comparisons without
making the major changes needed to add that feature.
Issue #177
Change-Id: I10e1084dac77841220161fbd2575cdcb5c13c00e
Now text-based WebVTT also uses the generic media pipeline. This
converts the WebVttTextOutputHandler to a WebVttMuxer to be more
consistent with the other muxer types.
This also allows choosing between single-segment text and multi-segment.
Before, we would generate both and use single-segment for DASH and
multi-segment for HLS; now you can choose either one, and both are
supported in DASH and HLS.
Change-Id: I6f7edda09e01b5f40e819290d3fe6e88677018d9
Now the same pipeline for handling the audio/videos streams will handle
the segmented text streams too. This doesn't apply to the text output,
only to the MP4 variants. This also fixes a bug where we added the
X-TIMESTAMP-MAP tag even when there weren't any TS streams; this doesn't
otherwise change the behavior around that tag.
Change-Id: I03f7cea56efa42e96311c00841330629a14aa053
The test added in the previous CL was broken due to a rebase on another
change. This subtly changed some of the byte offsets that broke the
test. This wasn't caught since I didn't rebase and re-run the tests
before merging.
Change-Id: Id7e4c7688278eae37da1a14f1648263b4dda98cd
This changes it from an OriginHandler to a MediaParser and moves the
handling of it to the Demuxer. This will allow more generic handling
of text by giving it the same abstractions as video/audio handling.
Change-Id: Ibbde3c84d228ec8e83af1ed266ea97dbc9589c24