Francesco Altomare
Hasenheide 9
10967 Berlin (Germany)
Mobile: +49 151 65623284


Streaming: The HLS Version 4 hype, and the whole truth about it #9

Welcome back my Broadcasting Reader,

We’ve decided to add a first Addendum to our Streaming Mini-Series, specifically on HLS as an HTTP chunked Delivery technology. This Article partly supersedes our previous ones on HLS and on the difference between HLS and HDS, as well as what we mentioned about MPEG-DASH’s intrinsic advantages over HLS.

The reasons why we chose to do this are:

  • All the Articles above were written with HLS Version 3 in mind;
  • There’s a lot of hype in the industry, with HLS Pioneers now starting to adopt the so-called HLS 4, or HLS Version 4;
  • Unless you spend your time reading through the relevant IETF drafts (in which case you can skip this Article), there’s no easy way to tell what the main differences are, and especially how many Versions of HLS exist and what each one contains;
  • Thus we decided to condense all the information into an easy to print, easy to bookmark, easy to forget about Article. We’ve done so based on the condensing work provided by Romain Bouqueau: all credit goes to him and his commenters for providing such a clear breakdown of HLS Versions and diffs.

HLS comes in different versions. At the time of writing, version numbers run from 1 to 7, and we’re short of one revision.

Each version is covered by one or more revisions. The HLS revisions covered in this analysis range from 0 to 14. Most of the modifications between revisions are clarifications.

Note: if you look closely at the dates in the table below, you’ll notice that no revision is separated by more than 6 months from the previous one. That’s because the HLS specification has only been available as an IETF draft, and IETF drafts automatically expire after 6 months. Apple HLS has thus been a draft since May 1, 2009, i.e. for more than five years. At the time of writing, we have no information about Apple planning to finalize it.

Revision  Version  Date                New Features
0         1        May 1, 2009         Initial release
1         1        June 8, 2009
2         1        October 5, 2009
3         2        April 2, 2010       • Specifying the resolution for video variant streams.
                                       • Improving encryption (initialization vector).
                                       • Introducing version compatibility.
4         2        June 5, 2010
5         3        November 19, 2010   Introducing the playlist type (VOD, Event)
6         3        March 31, 2011
7         4        September 30, 2011  • Audio and Video can be specified separately (i.e. not muxed together), introducing rendition groups.
                                       • Introducing byte-ranges to access the content from a single file.
                                       • Allowing special playlists containing only I-frames (i.e. access points).
8         4        March 23, 2012
9         5        September 22, 2012  • Subtitles (WebVTT).
                                       • Adding a new per-sample encryption scheme.
10        5        October 15, 2012
11        5        April 16, 2013
12        6        October 14, 2013    • Introducing Closed-Captions (in addition to subtitles).
                                       • Error resilience: discontinuity and independence of each segment can be signalled in the playlist.
13        6        April 16, 2014
14        7        October 14, 2014    • Adding alternate renditions signalling.
                                       • Adding session data.
                                       • Closed-Captions: support for CEA-708.

Looking at this table, and with regard to the famous “HLS Version 4 hype”, we plan to write an Article on “HLS Version 4 vs HLS Version 3” in the near future; the differences between HLS 4 and HLS 3 are pretty significant, in a good sense of course. What we wanted to clarify today, albeit very superficially, is how OLD HLS Version 4 is, and that the adoption of recent HLS versions has everything to do with Client Players, Browsers and Operating Systems. We’ll provide deeper insight on this in our Next Article.

To keep today’s Article more technical than most of our other ones, and to clarify what each HLS Version introduces on top of the previous ones, please find below an Architectural Overview of HLS, from its birth in Version 1 up to Version 7. Enjoy!

Masters, Variants and Playlists

A complex presentation can be described by a Master Playlist.
The Master Playlist provides a set of Variant Streams, each of which describes a different version of the same content.
A Variant Stream can also specify a set of Renditions. Renditions are alternate versions of the content, such as audio produced in different languages or video recorded from different camera angles.

Version 1
The initial version defines nine tags; a tenth, EXT-X-DISCONTINUITY, was added in revision 2:

EXTM3U
An Extended M3U file is distinguished from a basic M3U file by its first line which MUST be #EXTM3U.

EXTINF
An Extended Info Marker describing the media file that follows it: #EXTINF:<duration>,<title>

EXT-X-TARGETDURATION
Approximate duration of the next media file. The EXTINF duration of each media file must be less than or equal to the EXT-X-TARGETDURATION value: #EXT-X-TARGETDURATION:<seconds>

EXT-X-MEDIA-SEQUENCE
Sequence number: #EXT-X-MEDIA-SEQUENCE:<number>
Starting at version 5, a client shall not assume that segments with the same media sequence number in different masters, variants or renditions contain matching content.

EXT-X-KEY
Optional tag. The encryption algorithm is AES-128 CBC with PKCS7 padding: #EXT-X-KEY:METHOD=<method>[,URI="<URI>"]

EXT-X-PROGRAM-DATE-TIME
Associates the beginning of the next media file with an absolute time: #EXT-X-PROGRAM-DATE-TIME:<YYYY-MM-DDThh:mm:ssZ>

EXT-X-ALLOW-CACHE
Specifies whether the client may cache downloaded media files for later replay: #EXT-X-ALLOW-CACHE:<YES|NO>

EXT-X-STREAM-INF
The next URI in the Playlist is another Playlist file: #EXT-X-STREAM-INF:[BANDWIDTH=<n>],[PROGRAM-ID=<i>],[CODECS="[format][,format]*"] <URI>

EXT-X-ENDLIST
Signals the end of the playlist, meaning that no more media files will be added to it: #EXT-X-ENDLIST

EXT-X-DISCONTINUITY
Added in revision 2.
Indicates that the next media file has different characteristics than the previous one: #EXT-X-DISCONTINUITY
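
To make the Version 1 tags a bit more concrete, here is a minimal sketch of how a client could read them. The function names (parse_media_playlist, durations_within_target), the sample playlist and the segment file names are all our own assumptions for illustration; this is not a complete or official parser, and it only looks at a handful of the tags above.

```python
def parse_media_playlist(text):
    """Parse a few of the Version 1 tags from a media playlist (sketch only)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or lines[0] != "#EXTM3U":
        raise ValueError("first line must be #EXTM3U")

    playlist = {"target_duration": None, "media_sequence": 0,
                "ended": False, "segments": []}
    pending_duration = None  # duration from the most recent #EXTINF

    for line in lines[1:]:
        if line.startswith("#EXT-X-TARGETDURATION:"):
            playlist["target_duration"] = int(line.split(":", 1)[1])
        elif line.startswith("#EXT-X-MEDIA-SEQUENCE:"):
            playlist["media_sequence"] = int(line.split(":", 1)[1])
        elif line.startswith("#EXTINF:"):
            duration, _, _title = line.split(":", 1)[1].partition(",")
            pending_duration = float(duration)
        elif line == "#EXT-X-ENDLIST":
            playlist["ended"] = True
        elif not line.startswith("#"):
            # A non-tag line is the URI of the media file described by #EXTINF.
            seq = playlist["media_sequence"] + len(playlist["segments"])
            playlist["segments"].append(
                {"uri": line, "duration": pending_duration, "sequence": seq})
            pending_duration = None
    return playlist


def durations_within_target(playlist):
    """True if every EXTINF duration is <= EXT-X-TARGETDURATION."""
    target = playlist["target_duration"]
    return all(seg["duration"] <= target for seg in playlist["segments"])


# Hypothetical sample playlist, made up for this example.
SAMPLE = """#EXTM3U
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:7
#EXTINF:10,
segment7.ts
#EXTINF:8,
segment8.ts
#EXT-X-ENDLIST
"""

playlist = parse_media_playlist(SAMPLE)
print(playlist["segments"], durations_within_target(playlist))
```

Each segment gets its sequence number by counting up from the EXT-X-MEDIA-SEQUENCE value, which is why that tag only needs to appear once.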

Version 2
EXT-X-STREAM-INF (additions)
The RESOLUTION=<N>x<M> attribute is added.

EXT-X-VERSION
Indicates the compatibility version of the Playlist file:

#EXT-X-VERSION:<n>

EXT-X-KEY (additions)
Added the optional IV attribute.
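
As far as we understand the drafts, when the IV attribute is absent the client falls back to the media sequence number of the segment, written as a big-endian number in a 16-octet buffer padded with zeros on the left. The sketch below shows both cases; the function names are ours, and the actual AES-128 decryption is deliberately left out.

```python
def default_iv(media_sequence_number):
    """IV derived from the sequence number: 16 octets, big-endian, zero padded."""
    return media_sequence_number.to_bytes(16, byteorder="big")


def parse_iv_attribute(value):
    """Turn an IV attribute such as 0x0102... into 16 raw octets."""
    if not value.lower().startswith("0x"):
        raise ValueError("IV attribute must be a hexadecimal integer (0x...)")
    return int(value[2:], 16).to_bytes(16, byteorder="big")


def segment_iv(media_sequence_number, iv_attribute=None):
    """Pick the explicit IV when present, else fall back to the sequence number."""
    if iv_attribute is not None:
        return parse_iv_attribute(iv_attribute)
    return default_iv(media_sequence_number)


print(segment_iv(7).hex())
print(segment_iv(7, "0x000102").hex())
```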

Version 3
EXTINF (additions)
The duration can be expressed with a floating point argument.

EXT-X-PLAYLIST-TYPE
Optional:

#EXT-X-PLAYLIST-TYPE:<EVENT|VOD>

If the tag is present and has a value of VOD, the playlist shall not change. If the tag is present and has a value of EVENT, the server may only append lines to the playlist.
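
The two rules above can be sketched as a simple check a client (or a test suite) could run between two reloads of the same playlist. The function name update_allowed and the way we pass the playlist as a list of lines are our own assumptions.

```python
def update_allowed(playlist_type, old_lines, new_lines):
    """Check a playlist reload against the EXT-X-PLAYLIST-TYPE rules."""
    if playlist_type == "VOD":
        # A VOD playlist shall not change at all.
        return new_lines == old_lines
    if playlist_type == "EVENT":
        # An EVENT playlist may only grow at the end.
        return new_lines[:len(old_lines)] == old_lines
    # Tag absent: EXT-X-PLAYLIST-TYPE itself imposes no constraint.
    return True


old = ["#EXTINF:10.0,", "a.ts"]
print(update_allowed("EVENT", old, old + ["#EXTINF:10.0,", "b.ts"]))
print(update_allowed("EVENT", old, ["#EXTINF:10.0,", "b.ts"]))
print(update_allowed("VOD", old, old + ["#EXT-X-ENDLIST"]))
```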

Version 4
Rendition groups
A set of EXT-X-MEDIA tags with the same GROUP-ID value forms a group of renditions.

EXT-X-STREAM-INF (additions)
New AUDIO and VIDEO attributes.

AUDIO (resp. VIDEO): the value is a quoted-string. It MUST match the value of the GROUP-ID attribute of an EXT-X-MEDIA tag elsewhere in the Playlist whose TYPE attribute is AUDIO (resp. VIDEO). It indicates the set of audio (resp. video) renditions that MAY be used when playing the presentation.

EXT-X-BYTERANGE
Indicates that a media file is a sub-range of the resource identified by its media URI: #EXT-X-BYTERANGE:<n>[@o]
With:
n: length of the sub-range in bytes
o (optional): offset of the sub-range from the beginning of the resource. If omitted, the sub-range starts right after the sub-range of the previous media segment.
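
A minimal sketch of how the offset inference could work, assuming all the listed sub-ranges belong to the same resource. The function names are ours; http_range_header shows how a resolved sub-range could map onto a standard HTTP Range request header, whose end position is inclusive.

```python
def parse_byterange(value, previous_end=None):
    """Return (offset, length) for an EXT-X-BYTERANGE value "<n>[@<o>]"."""
    length_text, at, offset_text = value.partition("@")
    length = int(length_text)
    if at:
        offset = int(offset_text)
    elif previous_end is not None:
        # No explicit offset: continue right after the previous sub-range.
        offset = previous_end
    else:
        raise ValueError("no offset given and no previous sub-range to infer it from")
    return offset, length


def resolve_byteranges(values):
    """Resolve consecutive EXT-X-BYTERANGE values of one resource."""
    ranges = []
    previous_end = None
    for value in values:
        offset, length = parse_byterange(value, previous_end)
        ranges.append((offset, length))
        previous_end = offset + length
    return ranges


def http_range_header(offset, length):
    """HTTP Range header value for a sub-range (end position is inclusive)."""
    return f"bytes={offset}-{offset + length - 1}"


for offset, length in resolve_byteranges(["1000@0", "500", "250@4000"]):
    print(http_range_header(offset, length))
```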

EXT-X-MEDIA
Indicates that playlists contain alternate renditions of the same content. For example two audio languages, or two video camera angles: #EXT-X-MEDIA:[TYPE={AUDIO,VIDEO}],[URI],[GROUP-ID],[LANGUAGE],[NAME],[DEFAULT={YES,NO}],[AUTOSELECT={YES,NO}]
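
Here is a small sketch of how a client could collect rendition groups and resolve the AUDIO attribute of a variant. The master playlist, file names and function names below are made up for illustration, and the attribute parser is a simplification (it only distinguishes quoted strings from everything else).

```python
import re

# An attribute is NAME=value, where value is either a quoted string
# (which may contain commas) or a run of characters with no comma.
_ATTR = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')


def parse_attributes(text):
    """Parse an attribute list into a dict, dropping surrounding quotes."""
    return {name: value.strip('"') for name, value in _ATTR.findall(text)}


def rendition_groups(master_text):
    """Group EXT-X-MEDIA tags by (TYPE, GROUP-ID)."""
    groups = {}
    for line in master_text.splitlines():
        if line.startswith("#EXT-X-MEDIA:"):
            attrs = parse_attributes(line.split(":", 1)[1])
            groups.setdefault((attrs["TYPE"], attrs["GROUP-ID"]), []).append(attrs)
    return groups


def audio_renditions_for(stream_inf_line, groups):
    """Audio renditions referenced by the AUDIO attribute of a variant."""
    attrs = parse_attributes(stream_inf_line.split(":", 1)[1])
    return groups.get(("AUDIO", attrs.get("AUDIO")), [])


# Hypothetical master playlist, made up for this example.
MASTER = """#EXTM3U
#EXT-X-VERSION:4
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="aac",LANGUAGE="en",NAME="English",DEFAULT=YES,AUTOSELECT=YES,URI="audio_en.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="aac",LANGUAGE="de",NAME="Deutsch",DEFAULT=NO,AUTOSELECT=YES,URI="audio_de.m3u8"
#EXT-X-STREAM-INF:BANDWIDTH=1280000,CODECS="avc1.4d401e,mp4a.40.2",AUDIO="aac"
video_720p.m3u8
"""

groups = rendition_groups(MASTER)
stream_line = MASTER.splitlines()[4]
names = [r["NAME"] for r in audio_renditions_for(stream_line, groups)]
print(names)
```

Keying the groups on both TYPE and GROUP-ID mirrors the rule that the AUDIO attribute must match an EXT-X-MEDIA tag whose TYPE is AUDIO.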

EXT-X-I-FRAME-STREAM-INF
Identifies a playlist containing the I-frames of a multimedia presentation. It stands alone, in that it does not apply to a particular URI in the playlist: #EXT-X-I-FRAME-STREAM-INF:<attribute-list>
Same attributes as EXT-X-STREAM-INF, minus the AUDIO attributes, plus a URI attribute to identify the I-frame playlist file.

EXT-X-I-FRAMES-ONLY
Indicates that each media segment in the Playlist describes a single I-frame: #EXT-X-I-FRAMES-ONLY

Version 5
Subtitle segments
They must use WebVTT. Each WebVTT segment MUST have an X-TIMESTAMP-MAP metadata header.

EXT-X-KEY (additions)
New encryption method SAMPLE-AES. The possible attributes are URI, IV, KEYFORMAT, KEYFORMATVERSIONS.

EXT-X-MEDIA (additions)
The TYPE attribute can have the value SUBTITLES. New attributes [FORCED={YES,NO}], [CHARACTERISTICS=UTI].

EXT-X-MAP
The EXT-X-MAP tag specifies how to obtain the Transport Stream PAT/PMT for the applicable media segment.
It applies until the next EXT-X-DISCONTINUITY tag: #EXT-X-MAP:<attribute-list>
With the attributes URI and BYTERANGE.

Version 6

EXT-X-MEDIA (additions)
Added the TYPE attribute value CLOSED-CAPTIONS. The media segments for the video renditions can include closed captions. The attributes URI, ASSOC-LANGUAGE, CHARACTERISTICS and INSTREAM-ID={CC1,CC2,CC3,CC4} are added.

EXT-X-STREAM-INF (additions and removals)
The CLOSED-CAPTIONS attribute is added.
The PROGRAM-ID attribute is removed.

EXT-X-I-FRAME-STREAM-INF (removals)
The PROGRAM-ID attribute is removed.

EXT-X-DISCONTINUITY-SEQUENCE
Allows synchronization between different renditions of the same variant stream, or between different variant streams, that have EXT-X-DISCONTINUITY tags in their playlists:

#EXT-X-DISCONTINUITY-SEQUENCE:<number>

A playlist that contains an EXT-X-PLAYLIST-TYPE tag with a value of EVENT or VOD must not contain an EXT-X-DISCONTINUITY-SEQUENCE tag.
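
As we read the drafts, every segment gets a discontinuity sequence number: the first one takes the EXT-X-DISCONTINUITY-SEQUENCE value (0 if the tag is absent), and each EXT-X-DISCONTINUITY tag increments it by one. Segments with the same number in two playlists can then be lined up. The sketch below computes these numbers; the function name, the sample playlist and its file names are our own assumptions.

```python
def discontinuity_sequence_numbers(playlist_text):
    """Return (uri, discontinuity sequence number) for each segment."""
    current = 0  # default when EXT-X-DISCONTINUITY-SEQUENCE is absent
    result = []
    for line in (ln.strip() for ln in playlist_text.splitlines()):
        if line.startswith("#EXT-X-DISCONTINUITY-SEQUENCE:"):
            current = int(line.split(":", 1)[1])
        elif line == "#EXT-X-DISCONTINUITY":
            current += 1
        elif line and not line.startswith("#"):
            result.append((line, current))
    return result


# Hypothetical live playlist with an ad break, made up for this example.
LIVE = """#EXTM3U
#EXT-X-VERSION:6
#EXT-X-TARGETDURATION:10
#EXT-X-DISCONTINUITY-SEQUENCE:3
#EXTINF:10.0,
show_41.ts
#EXT-X-DISCONTINUITY
#EXTINF:8.0,
ad_0.ts
#EXT-X-DISCONTINUITY
#EXTINF:10.0,
show_42.ts
"""

numbers = discontinuity_sequence_numbers(LIVE)
print(numbers)
```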

EXT-X-START
Indicates a preferred point at which to start playing a playlist:

#EXT-X-START:[TIME-OFFSET],[PRECISE={YES,NO}]

Media Segments (audio only)

Each Elementary Audio Stream segment MUST signal the timestamp of its first sample with an ID3 PRIV tag [ID3] at the beginning of the segment. The ID3 PRIV owner identifier MUST be "com.apple.streaming.transportStreamTimestamp". The ID3 payload MUST be a 33-bit MPEG-2 Program Elementary Stream timestamp expressed as a big-endian eight-octet number, with the upper 31 bits set to zero.
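
The payload layout described above can be sketched in a few lines. Only the eight-octet payload is handled here, not the surrounding ID3 framing; the function names are ours, and the conversion to seconds assumes the usual 90 kHz clock of MPEG-2 PES timestamps.

```python
def encode_timestamp_payload(pts):
    """Encode a 33-bit timestamp as a big-endian eight-octet number."""
    if not 0 <= pts < 2 ** 33:
        raise ValueError("timestamp must fit in 33 bits")
    return pts.to_bytes(8, byteorder="big")


def decode_timestamp_payload(payload):
    """Decode the eight-octet payload, checking the upper 31 bits are zero."""
    if len(payload) != 8:
        raise ValueError("payload must be exactly eight octets")
    value = int.from_bytes(payload, byteorder="big")
    if value >> 33:
        raise ValueError("upper 31 bits must be zero")
    return value


def pts_to_seconds(pts):
    """Convert a timestamp to seconds, assuming a 90 kHz clock."""
    return pts / 90000


payload = encode_timestamp_payload(900000)
print(payload.hex(), pts_to_seconds(decode_timestamp_payload(payload)))
```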

EXT-X-INDEPENDENT-SEGMENTS
Added at revision 13.
Indicates that all media samples in a segment can be decoded without information from other segments:

#EXT-X-INDEPENDENT-SEGMENTS

Version 7
The specification was re-written in a more readable way.

Alternative Renditions
An EXT-X-STREAM-INF tag containing an AUDIO, VIDEO, SUBTITLES, or CLOSED-CAPTIONS attribute indicates that alternative Renditions are available for playback of that Variant Stream.

EXT-X-SESSION-DATA
Allows arbitrary session data to be carried in a Master Playlist:

#EXT-X-SESSION-DATA:<attribute list>

The attributes DATA-ID, VALUE, LANGUAGE and URI are defined.

EXT-X-STREAM-INF (additions)
The AVERAGE-BANDWIDTH attribute was added.

EXT-X-ALLOW-CACHE (removal)
The tag was removed.

EXT-X-MEDIA (additions)
An INSTREAM-ID value of the form "SERVICEn" (where n is an integer between 1 and 63) identifies a CEA-708 Digital Television Closed Captioning service.
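
To wrap up the breakdown, here is a rough sketch that guesses which HLS Version a playlist belongs to by looking for the features listed in this Article. Please note that the mapping follows our per-version grouping above, not the normative compatibility rules of the specification, which are narrower; the FEATURES table and the function name are our own assumptions.

```python
import re

# (version, pattern) pairs, highest version first. Each pattern looks for
# a tag or attribute that this Article lists under that version.
FEATURES = [
    (7, r'^#EXT-X-SESSION-DATA:|AVERAGE-BANDWIDTH=|INSTREAM-ID="SERVICE'),
    (6, r'^#EXT-X-(DISCONTINUITY-SEQUENCE|START|INDEPENDENT-SEGMENTS)(:|$)'
        r'|TYPE=CLOSED-CAPTIONS'),
    (5, r'^#EXT-X-MAP:|METHOD=SAMPLE-AES|TYPE=SUBTITLES'),
    (4, r'^#EXT-X-(BYTERANGE|MEDIA|I-FRAME-STREAM-INF|I-FRAMES-ONLY)(:|$)'),
    (3, r'^#EXTINF:\d+\.\d+|^#EXT-X-PLAYLIST-TYPE:'),
    (2, r'^#EXT-X-KEY:.*\bIV=|^#EXT-X-STREAM-INF:.*\bRESOLUTION='),
]


def minimum_version(playlist_text):
    """Highest version whose features (per this Article) appear in the playlist."""
    lines = [ln.strip() for ln in playlist_text.splitlines()]
    for version, pattern in FEATURES:
        matcher = re.compile(pattern)
        if any(matcher.search(line) for line in lines):
            return version
    return 1


print(minimum_version("#EXTM3U\n#EXT-X-TARGETDURATION:10\n#EXTINF:9.5,\na.ts\n"))
```

A real client would rather trust the EXT-X-VERSION tag; a check like this is mostly useful to spot playlists that declare a lower version than the features they actually use.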

We hope you found this Article useful and as always we’d like to hear your comments. We look forward to hearing from you.
