mirror of
https://github.com/shaka-project/shaka-packager.git
synced 2026-04-13 00:32:56 +00:00
I've finally generated test material and tests to make a PR to fix text segment generation for MPEG-2 TS streams with sparse teletext data, making a big improvement compared to the original teletext support in #1344.

## Problem

When packaging MPEG-TS input containing DVB-Teletext subtitles for DASH/HLS output, two fundamental issues arise:

1. **Sparse input handling** - Teletext streams only contain data when subtitles are displayed. During gaps (which can span multiple segments), no PES packets arrive, leaving the text chunker with no timing information to drive segment generation. This results in missing segments or segments with incorrect timing.
2. **Misaligned segment boundaries** - Even when segments are generated, the text segment timestamps and boundaries differ from video/audio segments. This causes `<SegmentTimeline>` mismatches in the MPD, playback issues on some players, and sometimes fewer text segments than video segments.

## Solution

This PR introduces two complementary mechanisms:

### 1. Heartbeat mechanism (sparse input handling)

The `Mp2tMediaParser` now sends periodic "heartbeat" signals to text streams:

- Video PTS timestamps are forwarded to all text PIDs as `MediaHeartBeat` samples
- `EsParserTeletext` emits `TextHeartBeat` samples when PES packets arrive without displayable content
- `TextChunker` uses these heartbeats to drive segment generation even during gaps in subtitle content
- Ongoing cues that span segment boundaries are properly split and continued

A new `heartbeat_shift` stream descriptor parameter (default: 2 seconds at 90kHz) controls the timing offset between video PTS and text segment generation, compensating for pipeline processing delays.

### 2. SegmentCoordinator (segment boundary alignment)

A new N-to-N media handler (`SegmentCoordinator`) ensures text segments align precisely with video:

- Passes all streams through unchanged
- Replicates `SegmentInfo` from video/audio `ChunkingHandler` to registered teletext streams
- `TextChunker` in "coordinator mode" uses received `SegmentInfo` events to determine segment boundaries instead of calculating from text timestamps

This guarantees identical segment timelines across all adaptation sets.

## Testing

- **Integration tests** in `packager_test.cc`:
  - `TeletextSegmentAlignmentTest.VideoAndTextSegmentsAligned` - Verifies segment count, start times, and durations match between video and text
  - `TeletextSegmentAlignmentTest.VideoAndTextSegmentsAlignedWithWrapAround` - Same verification with PTS timestamps near the 33-bit wrap-around point (~26.5 hours)
- **Test files** (synthetic teletext with known cue timings at 1.0s, 3.5s, 13.0s):
  - `test_teletext_live.ts` - Normal PTS range
  - `test_teletext_live_wrap.ts` - PTS near wrap-around boundary
- **Unit tests** for `SegmentCoordinator` and updated `TextChunker` tests

## Documentation

- Extended `docs/source/tutorials/text.rst` with DVB-Teletext section covering:
  - Page numbering (3-digit cc_index format)
  - Heartbeat mechanism explanation
  - Segment alignment behavior
  - `--ts_ttx_heartbeat_shift` parameter tuning
  - Troubleshooting guide
- Added teletext processing pipeline diagram to `docs/source/design.rst`

## Future work

The heartbeat and `SegmentCoordinator` mechanisms would likely benefit **DVB-SUB (bitmap subtitles)** as well (Issue #1477), which faces similar challenges with sparse subtitle data in MPEG-TS input and segment alignment. The infrastructure is now in place to extend this support.

## Example usage

```bash
packager \
  --segment_duration 6 \
  --mpd_output manifest.mpd \
  'in=input.ts,stream=video,init_segment=video/init.mp4,segment_template=video/$Number$.m4s' \
  'in=input.ts,stream=audio,init_segment=audio/init.mp4,segment_template=audio/$Number$.m4s' \
  'in=input.ts,stream=text,cc_index=888,lang=en,init_segment=text/init.mp4,segment_template=text/$Number$.m4s,dash_only=1'
```

Fixes #1428
Fixes #1401
Fixes #1355
Fixes #1430
170 lines
6.4 KiB
ReStructuredText
Text output formats
===================

Shaka Packager supports several text/subtitle formats for both input and output.
Only certain formats are supported for output; other input formats are converted
to the specified output format. With the exception of TTML pass-through, there
are no restrictions on which input format can be combined with which output
format.

Examples
--------

* TTML pass-through::

    $ packager in=input.ttml,stream=text,output=output.ttml

* Convert WebVTT to TTML::

    $ packager in=input.vtt,stream=text,output=output.ttml

* Embed WebVTT in MP4 (single-file)::

    $ packager in=input.vtt,stream=text,output=output.mp4

* Embed WebVTT in MP4 (segmented)::

    $ packager 'in=input.vtt,stream=text,init_segment=init.mp4,segment_template=text_$Number$.mp4'

* Convert WebVTT to TTML in MP4::

    $ packager in=input.vtt,stream=text,format=ttml+mp4,output=output.mp4

* Convert DVB-SUB to TTML in MP4::

    $ packager in=input.ts,stream=text,format=ttml+mp4,output=output.mp4
    $ packager 'in=input.ts,stream=text,format=ttml+mp4,init_segment=init.mp4,segment_template=text_$Number$.mp4'

* Get a single page from DVB-SUB and set language::

    $ packager in=input.ts,stream=text,cc_index=3,lang=en,format=ttml+mp4,output=output.mp4

* Multiple languages::

    $ packager \
      in=in_en.vtt,stream=text,language=en,output=out_en.mp4 \
      in=in_es.vtt,stream=text,language=es,output=out_es.mp4 \
      in=in_fr.vtt,stream=text,language=fr,output=out_fr.mp4

* Get a single 3-digit page from DVB-Teletext and set language for output formats stpp (TTML in mp4), wvtt (WebVTT in mp4) and HLS WebVTT::

    $ packager in=input.ts,stream=text,cc_index=888,lang=en,format=ttml+mp4,output=output.mp4
    $ packager in=input.ts,stream=text,cc_index=888,lang=en,output=output.mp4
    $ packager 'in=input.ts,stream=text,cc_index=888,segment_template=text/$Number$.vtt,playlist_name=text/main.m3u8,hls_group_id=text,hls_name=ENGLISH'

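
When many languages are packaged at once, the stream descriptors from the
multiple-languages example can be generated instead of typed out. The following
is a minimal shell sketch: the helper name ``text_descriptors`` is hypothetical,
and the ``in_<lang>.vtt`` / ``out_<lang>.mp4`` file naming is an assumption that
simply mirrors that example.

```shell
# Print one text stream descriptor per language code given as an argument.
# Assumes input files named in_<lang>.vtt and outputs named out_<lang>.mp4.
text_descriptors() {
  for lang in "$@"; do
    printf 'in=in_%s.vtt,stream=text,language=%s,output=out_%s.mp4\n' \
      "$lang" "$lang" "$lang"
  done
}
```

It could then be used as ``packager $(text_descriptors en fr)``; the unquoted
substitution relies on word splitting, which is safe here only because the
generated descriptors contain no spaces.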
DVB-Teletext
------------

DVB-Teletext subtitles are commonly used in European broadcast systems. They are
embedded in MPEG-2 Transport Streams and identified by a 3-digit page number
(e.g., 888 for subtitles in many countries).

Page numbering
^^^^^^^^^^^^^^

Teletext pages are identified by a magazine number (1-8) and a two-digit page
number (00-99). The ``cc_index`` parameter uses a 3-digit format where the first
digit is the magazine and the last two digits are the page number:

* ``cc_index=888`` - Magazine 8, page 88 (common for subtitles)
* ``cc_index=100`` - Magazine 1, page 00
* ``cc_index=777`` - Magazine 7, page 77

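
The digit layout described above can be checked with a few lines of shell before
a value is passed to ``cc_index``. This is only an illustrative sketch: the
helper name ``ttx_split_cc_index`` is hypothetical and is not part of the
packager, and the validation uses the magazine range 1-8 and page range 00-99
stated in this section.

```shell
# Split a 3-digit cc_index into its magazine digit and two-digit page number.
# Rejects anything that is not [1-8] followed by two decimal digits.
ttx_split_cc_index() {
  case "$1" in
    [1-8][0-9][0-9]) ;;
    *) echo "invalid cc_index: $1" >&2; return 1 ;;
  esac
  magazine=${1%??}   # strip the last two characters, leaving the first digit
  page=${1#?}        # strip the first character, leaving the last two digits
  printf 'magazine=%s page=%s\n' "$magazine" "$page"
}

# ttx_split_cc_index 888   prints "magazine=8 page=88"
```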
Heartbeat mechanism
^^^^^^^^^^^^^^^^^^^

Teletext subtitles are "sparse" - the stream only contains data when subtitles
are displayed. This creates a problem for segmented output (DASH/HLS): if no
subtitle appears during a segment's time window, that segment might be missing
or have incorrect timing.

To solve this, Shaka Packager uses a "heartbeat" mechanism when processing
teletext from MPEG-TS input. The video stream's PTS timestamps are used to
generate periodic timing signals that ensure:

1. Text segments are generated continuously, even during gaps in subtitles
2. Text segment boundaries align with video segment boundaries
3. Ongoing subtitles that span multiple segments are split at the segment
   boundary and continued in the next segment

This mechanism is automatic when processing MPEG-TS files with both video and
teletext streams.

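
To see why sparse cues lead to missing segments, consider which segments would
contain a cue if segmentation were driven by cue timestamps alone. The sketch
below is plain arithmetic and not the packager's implementation; the helper name
``segments_with_cues`` and the sample cue times (1.0 s, 3.5 s and 13.0 s with
6-second segments) are assumptions chosen for illustration.

```shell
# Print the distinct zero-based segment indices that contain a cue start.
# Usage: segments_with_cues SEGMENT_MS CUE_START_MS...
segments_with_cues() (
  seg_ms=$1
  shift
  for cue_ms in "$@"; do
    # Integer division maps a start time to the segment it falls in.
    echo $(( cue_ms / seg_ms ))
  done | sort -n -u | paste -s -d ' ' -
)

# segments_with_cues 6000 1000 3500 13000   prints "0 2"
```

With these inputs only segments 0 and 2 receive a cue start. Segment 1 (6 s to
12 s) has no text data of its own, which is the kind of gap that the
video-driven heartbeat is meant to cover.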
Heartbeat shift parameter
^^^^^^^^^^^^^^^^^^^^^^^^^

The ``--ts_ttx_heartbeat_shift`` parameter controls the timing offset between
when video timestamps arrive and when they trigger text segment generation.
This is needed because video is typically processed slightly ahead of teletext
in the pipeline. The value is given in ticks of the 90 kHz timescale.

The default value (90000 ticks, i.e. 1 second at 90 kHz) works for most cases.
You may need to adjust this if:

* Text segments are generated later than video segments (value too large)
* Some text cues are missing from the output (value too small)

Example with a custom heartbeat shift of 270000 ticks (3 seconds)::

    $ packager \
      --ts_ttx_heartbeat_shift 270000 \
      'in=input.ts,stream=video,init_segment=v/init.mp4,segment_template=v/$Number$.m4s' \
      'in=input.ts,stream=audio,init_segment=a/init.mp4,segment_template=a/$Number$.m4s' \
      'in=input.ts,stream=text,cc_index=888,lang=en,init_segment=t/init.mp4,segment_template=t/$Number$.m4s'

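
Because the shift is expressed on the 90 kHz timescale, one millisecond
corresponds to 90 ticks, so 1 second is 90000 and 3 seconds is 270000. A small
conversion sketch follows; the helper name ``ttx_shift_ticks_from_ms`` is
hypothetical, and it accepts whole milliseconds only.

```shell
# Convert a shift given in whole milliseconds to 90 kHz ticks.
# 90000 ticks per second means 90 ticks per millisecond.
ttx_shift_ticks_from_ms() {
  echo $(( $1 * 90 ))
}

# ttx_shift_ticks_from_ms 3000   prints "270000"
```

The result could be passed directly on the command line, for example
``--ts_ttx_heartbeat_shift "$(ttx_shift_ticks_from_ms 1500)"`` for a
1.5-second shift.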
Segment alignment
^^^^^^^^^^^^^^^^^

When generating DASH or HLS output with teletext, the text segments are
automatically aligned with video segment boundaries. This ensures that:

* All adaptation sets have the same segment timeline
* Seeking works correctly across all media types
* There are no gaps or overlaps between segments

For best results, always include video and teletext streams from the same
MPEG-TS source in the same packager invocation::

    $ packager \
      --segment_duration 6 \
      --mpd_output manifest.mpd \
      'in=input.ts,stream=video,init_segment=video/init.mp4,segment_template=video/$Number$.m4s' \
      'in=input.ts,stream=audio,init_segment=audio/init.mp4,segment_template=audio/$Number$.m4s' \
      'in=input.ts,stream=text,cc_index=888,lang=en,init_segment=text/init.mp4,segment_template=text/$Number$.m4s,dash_only=1'

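
A quick sanity check after packaging is to compare the number of media segments
written for video and for text. The sketch below assumes the ``video/`` and
``text/`` output directories and the ``.m4s`` extension used in the example
above; the helper names are hypothetical. Equal counts are necessary but not
sufficient for aligned timelines, so this does not replace inspecting the
manifest.

```shell
# Count the .m4s media segments directly inside a directory.
count_segments() (
  n=0
  for f in "$1"/*.m4s; do
    # An unmatched glob stays a literal pattern, so test that the file exists.
    if [ -e "$f" ]; then
      n=$(( n + 1 ))
    fi
  done
  echo "$n"
)

# Succeed only if both directories hold the same number of media segments.
same_segment_count() {
  [ "$(count_segments "$1")" -eq "$(count_segments "$2")" ]
}
```

For the example above this could be run as
``same_segment_count video text || echo "segment count mismatch"``.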
VoD output with non-zero start times
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

When the input MPEG-TS has high PTS values (e.g., from a live recording that
started hours into a broadcast), use the ``--generate_static_live_mpd`` flag
to ensure proper ``presentationTimeOffset`` values in the DASH manifest::

    $ packager \
      --generate_static_live_mpd \
      --segment_duration 4 \
      --mpd_output manifest.mpd \
      'in=recording.ts,stream=video,init_segment=video/init.mp4,segment_template=video/$Number$.m4s' \
      'in=recording.ts,stream=text,cc_index=888,lang=en,init_segment=text/init.mp4,segment_template=text/$Number$.m4s'

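
To judge whether a recording has "high" PTS values, a raw PTS can be converted
to hours, minutes and seconds. This sketch assumes the standard 90 kHz MPEG-TS
presentation clock; the helper name ``pts_to_hms`` is hypothetical, and how the
first PTS of the recording is obtained is left open.

```shell
# Convert a 90 kHz PTS value to HH:MM:SS (fractions of a second are dropped).
pts_to_hms() (
  total=$(( $1 / 90000 ))
  printf '%02d:%02d:%02d\n' \
    $(( total / 3600 )) $(( total % 3600 / 60 )) $(( total % 60 ))
)

# pts_to_hms 8589934592   prints "26:30:43"
```

The value 8589934592 is 2^33, the point at which a 33-bit PTS wraps around, so
recordings whose timestamps approach roughly 26.5 hours are the ones most
likely to need attention.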
Troubleshooting
^^^^^^^^^^^^^^^

**No subtitles in output**

* Verify the correct ``cc_index`` value. Use a tool like ``ccextractor`` or
  ``dvbsnoop`` to identify available teletext pages in the input.
* Ensure the teletext stream contains actual subtitle data, not just page
  structure information.

**Subtitle cues are missing**

* Check that video and teletext come from the same MPEG-TS source.
* Try adjusting ``--ts_ttx_heartbeat_shift``: increase it if subtitle cues are
  missing (value too small), or decrease it if text segments are generated too
  late (value too large).

**Missing segments or gaps**

* Ensure the video stream is included in the same packager run - the heartbeat
  mechanism requires video PTS to drive text segmentation.
* Check that the segment duration is appropriate for the subtitle density.