shaka-packager/docs/source/tutorials/text.rst
Torbjörn Einarsson 19dbd203b0 fix: DVB-Teletext: heartbeat mechanism and segment alignment with video/audio (#1535)
I've finally generated test material and tests to make a PR to fix text
segment generation for MPEG-2 TS streams with sparse teletext data,
making a big improvement compared to the original teletext support in
#1344.


## Problem

When packaging MPEG-TS input containing DVB-Teletext subtitles for
DASH/HLS output, two fundamental issues arise:

1. **Sparse input handling** - Teletext streams only contain data when
subtitles are displayed. During gaps (which can span multiple segments),
no PES packets arrive, leaving the text chunker with no timing
information to drive segment generation. This results in missing
segments or segments with incorrect timing.

2. **Misaligned segment boundaries** - Even when segments are generated,
the text segment timestamps and boundaries differ from video/audio
segments. This causes `<SegmentTimeline>` mismatches in the MPD,
playback issues on some players, and sometimes fewer text segments than
video segments.

## Solution

This PR introduces two complementary mechanisms:

### 1. Heartbeat mechanism (sparse input handling)

The `Mp2tMediaParser` now sends periodic "heartbeat" signals to text
streams:

- Video PTS timestamps are forwarded to all text PIDs as
`MediaHeartBeat` samples
- `EsParserTeletext` emits `TextHeartBeat` samples when PES packets
arrive without displayable content
- `TextChunker` uses these heartbeats to drive segment generation even
during gaps in subtitle content
- Ongoing cues that span segment boundaries are properly split and
continued

A new `heartbeat_shift` stream descriptor parameter (default: 2 seconds
at 90kHz) controls the timing offset between video PTS and text segment
generation, compensating for pipeline processing delays.

### 2. SegmentCoordinator (segment boundary alignment)

A new N-to-N media handler (`SegmentCoordinator`) ensures text segments
align precisely with video:

- Passes all streams through unchanged
- Replicates `SegmentInfo` from video/audio `ChunkingHandler` to
registered teletext streams
- `TextChunker` in "coordinator mode" uses received `SegmentInfo` events
to determine segment boundaries instead of calculating from text
timestamps

This guarantees identical segment timelines across all adaptation sets.

## Testing

- **Integration tests** in `packager_test.cc`:
  - `TeletextSegmentAlignmentTest.VideoAndTextSegmentsAligned` - Verifies
    segment count, start times, and durations match between video and text
  - `TeletextSegmentAlignmentTest.VideoAndTextSegmentsAlignedWithWrapAround` -
    Same verification with PTS timestamps near the 33-bit wrap-around
    point (~26.5 hours)

- **Test files** (synthetic teletext with known cue timings at 1.0s,
3.5s, 13.0s):
  - `test_teletext_live.ts` - Normal PTS range
  - `test_teletext_live_wrap.ts` - PTS near wrap-around boundary

- **Unit tests** for `SegmentCoordinator` and updated `TextChunker`
tests

## Documentation

- Extended `docs/source/tutorials/text.rst` with DVB-Teletext section
covering:
  - Page numbering (3-digit cc_index format)
  - Heartbeat mechanism explanation
  - Segment alignment behavior
  - `--ts_ttx_heartbeat_shift` parameter tuning
  - Troubleshooting guide

- Added teletext processing pipeline diagram to `docs/source/design.rst`

## Future work

The heartbeat and `SegmentCoordinator` mechanisms would likely benefit
**DVB-SUB (bitmap subtitles)** as well (Issue #1477), which faces
similar challenges with sparse subtitle data in MPEG-TS input and
segment alignment. The infrastructure is now in place to extend this
support.

## Example usage

```bash
packager \
  --segment_duration 6 \
  --mpd_output manifest.mpd \
  'in=input.ts,stream=video,init_segment=video/init.mp4,segment_template=video/$Number$.m4s' \
  'in=input.ts,stream=audio,init_segment=audio/init.mp4,segment_template=audio/$Number$.m4s' \
  'in=input.ts,stream=text,cc_index=888,lang=en,init_segment=text/init.mp4,segment_template=text/$Number$.m4s,dash_only=1'
```


Fixes #1428
Fixes #1401
Fixes #1355
Fixes #1430
2026-03-11 16:06:30 -07:00


Text output formats
===================

Shaka Packager supports several text/subtitle formats for both input and output.
Only certain formats are supported for output; other input formats are converted
to the specified output format. With the exception of TTML pass-through, there
are no restrictions on how input and output formats can be combined.


Examples
--------

* TTML pass-through::

    $ packager in=input.ttml,stream=text,output=output.ttml

* Convert WebVTT to TTML::

    $ packager in=input.vtt,stream=text,output=output.ttml

* Embed WebVTT in MP4 (single-file)::

    $ packager in=input.vtt,stream=text,output=output.mp4

* Embed WebVTT in MP4 (segmented)::

    $ packager 'in=input.vtt,stream=text,init_segment=init.mp4,segment_template=text_$Number$.mp4'

* Convert WebVTT to TTML in MP4::

    $ packager in=input.vtt,stream=text,format=ttml+mp4,output=output.mp4

* Convert DVB-SUB to TTML in MP4::

    $ packager in=input.ts,stream=text,format=ttml+mp4,output=output.mp4
    $ packager 'in=input.ts,stream=text,format=ttml+mp4,init_segment=init.mp4,segment_template=text_$Number$.mp4'

* Get a single page from DVB-SUB and set language::

    $ packager in=input.ts,stream=text,cc_index=3,lang=en,format=ttml+mp4,output=output.mp4

* Multiple languages::

    $ packager \
      in=in_en.vtt,stream=text,language=en,output=out_en.mp4 \
      in=in_sp.vtt,stream=text,language=sp,output=out_sp.mp4 \
      in=in_fr.vtt,stream=text,language=fr,output=out_fr.mp4

* Get a single 3-digit page from DVB-Teletext and set language for output
  formats stpp (TTML in mp4), wvtt (WebVTT in mp4) and HLS WebVTT::

    $ packager in=input.ts,stream=text,cc_index=888,lang=en,format=ttml+mp4,output=output.mp4
    $ packager in=input.ts,stream=text,cc_index=888,lang=en,output=output.mp4
    $ packager 'in=input.ts,stream=text,cc_index=888,segment_template=text/$Number$.vtt,playlist_name=text/main.m3u8,hls_group_id=text,hls_name=ENGLISH'


DVB-Teletext
------------

DVB-Teletext subtitles are commonly used in European broadcast systems. They are
embedded in MPEG-2 Transport Streams and identified by a 3-digit page number
(e.g., 888 for subtitles in many countries).


Page numbering
^^^^^^^^^^^^^^

Teletext pages are identified by a magazine number (1-8) and a two-digit page
number (00-99). The ``cc_index`` parameter uses a 3-digit format where the first
digit is the magazine and the last two digits are the page number:

* ``cc_index=888`` - Magazine 8, page 88 (common for subtitles)
* ``cc_index=100`` - Magazine 1, page 00
* ``cc_index=777`` - Magazine 7, page 77

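
The numbering scheme above can be illustrated with a short Python sketch. The
helper name ``split_cc_index`` is hypothetical and not part of Shaka Packager;
it only restates the 3-digit convention described here, assuming the magazine
and page ranges given above.

.. code-block:: python

    def split_cc_index(cc_index: int) -> tuple[int, int]:
        """Split a 3-digit cc_index into (magazine, page).

        Illustrative only: assumes magazine 1-8 and page 00-99,
        as described in the text above.
        """
        if not 100 <= cc_index <= 899:
            raise ValueError(f"cc_index {cc_index} is outside 100-899")
        # First digit is the magazine, last two digits are the page.
        magazine, page = divmod(cc_index, 100)
        return magazine, page


    magazine, page = split_cc_index(888)
    print(f"Magazine {magazine}, page {page:02d}")
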

Heartbeat mechanism
^^^^^^^^^^^^^^^^^^^

Teletext subtitles are "sparse" - they only contain data when subtitles are
displayed. This creates a problem for segmented output (DASH/HLS): if no
subtitle appears during a segment's time window, that segment might be missing
or have incorrect timing.

To solve this, Shaka Packager uses a "heartbeat" mechanism when processing
teletext from MPEG-TS input. The video stream's PTS timestamps are used to
generate periodic timing signals that ensure:

1. Text segments are generated continuously, even during gaps in subtitles
2. Text segment boundaries align with video segment boundaries
3. Ongoing subtitles that span multiple segments are properly handled

This mechanism is automatic when processing MPEG-TS files with both video and
teletext streams.

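
The idea can be sketched with a toy model in Python. This is not Shaka
Packager's implementation: the function ``segments_from_heartbeats`` and its
data layout are hypothetical, and it assumes all cues are known up front,
whereas the real pipeline is streaming. It only shows how periodic timestamps
can close segments during subtitle gaps and how a cue that crosses a boundary
is split.

.. code-block:: python

    TICKS_PER_SECOND = 90_000  # MPEG-TS PTS clock rate


    def segments_from_heartbeats(heartbeats, cues, segment_duration):
        """Toy model: close text segments whenever a heartbeat passes a boundary.

        heartbeats: ascending timestamps in 90 kHz ticks.
        cues: list of (start, end, text) tuples in ticks.
        Returns a list of (seg_start, seg_end, clipped_cues).
        """
        segments = []
        seg_start = None
        for t in heartbeats:
            if seg_start is None:
                seg_start = t  # the first heartbeat anchors the first segment
            # Emit every segment that this heartbeat has fully passed,
            # even if no cue falls inside it.
            while t >= seg_start + segment_duration:
                seg_end = seg_start + segment_duration
                clipped = [
                    (max(start, seg_start), min(end, seg_end), text)
                    for start, end, text in cues
                    if start < seg_end and end > seg_start
                ]
                segments.append((seg_start, seg_end, clipped))
                seg_start = seg_end
        return segments


    # One heartbeat per second for 18 seconds, one cue from 5 s to 7 s.
    heartbeats = [n * TICKS_PER_SECOND for n in range(19)]
    cues = [(5 * TICKS_PER_SECOND, 7 * TICKS_PER_SECOND, "Hello")]
    for segment in segments_from_heartbeats(heartbeats, cues, 6 * TICKS_PER_SECOND):
        print(segment)

With 6-second segments, the cue is split at the 6-second boundary, and the
third segment is still emitted although it contains no cue.
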

Heartbeat shift parameter
^^^^^^^^^^^^^^^^^^^^^^^^^

The ``--ts_ttx_heartbeat_shift`` parameter controls the timing offset between
when video timestamps arrive and when they trigger text segment generation.
This is needed because video is typically processed slightly ahead of teletext
in the pipeline.

The default value (90000 at 90kHz timescale = 1 second) works for most cases.
You may need to adjust this if:

* Text segments are generated later than video segments (value too large)
* Some text cues are missing from the output (value too small)

Example with custom heartbeat shift (3 seconds)::

    $ packager \
      --ts_ttx_heartbeat_shift 270000 \
      'in=input.ts,stream=video,init_segment=v/init.mp4,segment_template=v/$Number$.m4s' \
      'in=input.ts,stream=audio,init_segment=a/init.mp4,segment_template=a/$Number$.m4s' \
      'in=input.ts,stream=text,cc_index=888,lang=en,init_segment=t/init.mp4,segment_template=t/$Number$.m4s'

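
The value passed to ``--ts_ttx_heartbeat_shift`` is expressed in 90kHz ticks, so
a shift in seconds has to be multiplied by 90000. A minimal Python sketch of the
conversion follows; the helper names are hypothetical and only illustrate the
arithmetic.

.. code-block:: python

    TS_TIMESCALE = 90_000  # ticks per second of the MPEG-TS PTS clock


    def seconds_to_ticks(seconds: float) -> int:
        """Convert a shift in seconds to 90 kHz ticks."""
        return round(seconds * TS_TIMESCALE)


    def ticks_to_seconds(ticks: int) -> float:
        """Convert 90 kHz ticks back to seconds."""
        return ticks / TS_TIMESCALE


    # The 3-second shift used in the example above.
    print(seconds_to_ticks(3))
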

Segment alignment
^^^^^^^^^^^^^^^^^

When generating DASH or HLS output with teletext, the text segments are
automatically aligned with video segment boundaries. This ensures that:

* All adaptation sets have the same segment timeline
* Seeking works correctly across all media types
* There are no gaps or overlaps between segments

For best results, always include video and teletext streams from the same
MPEG-TS source in the same packager invocation::

    $ packager \
      --segment_duration 6 \
      --mpd_output manifest.mpd \
      'in=input.ts,stream=video,init_segment=video/init.mp4,segment_template=video/$Number$.m4s' \
      'in=input.ts,stream=audio,init_segment=audio/init.mp4,segment_template=audio/$Number$.m4s' \
      'in=input.ts,stream=text,cc_index=888,lang=en,init_segment=text/init.mp4,segment_template=text/$Number$.m4s,dash_only=1'


VoD output with non-zero start times
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

When the input MPEG-TS has high PTS values (e.g., from a live recording that
started hours into a broadcast), use the ``--generate_static_live_mpd`` flag
to ensure proper ``presentationTimeOffset`` values in the DASH manifest::

    $ packager \
      --generate_static_live_mpd \
      --segment_duration 4 \
      --mpd_output manifest.mpd \
      'in=recording.ts,stream=video,init_segment=video/init.mp4,segment_template=video/$Number$.m4s' \
      'in=recording.ts,stream=text,cc_index=888,lang=en,init_segment=text/init.mp4,segment_template=text/$Number$.m4s'


Troubleshooting
^^^^^^^^^^^^^^^

**No subtitles in output**

* Verify the correct ``cc_index`` value. Use a tool like ``ccextractor`` or
  ``dvbsnoop`` to identify available teletext pages in the input.
* Ensure the teletext stream contains actual subtitle data, not just page
  structure information.

**Subtitle cues are missing**

* Check that video and teletext come from the same MPEG-TS source.
* Try adjusting ``--ts_ttx_heartbeat_shift`` if subtitle cues are missing or
  text segments are generated too late.

**Missing segments or gaps**

* Ensure the video stream is included in the same packager run; the heartbeat
  mechanism requires video PTS to drive text segmentation.
* Check that the segment duration is appropriate for the subtitle density.