MEDIA CONTENT DISTRIBUTION AND PLAYBACK
20220329903 · 2022-10-13
Inventors
- Steven TIELEMANS (Leuven, BE)
- Maarten Tielemans (Aarschot, BE)
- Pieter-Jan SPEELMANS (Diest, BE)
- Lieven PAULISSEN (Wilsele, BE)
- Egon Okerman (Sint-Genesius-Rode, BE)
Cpc classification
H04N21/4424
ELECTRICITY
H04N21/8456
ELECTRICITY
H04N21/437
ELECTRICITY
H04N21/44209
ELECTRICITY
International classification
H04N21/442
ELECTRICITY
H04N21/2343
ELECTRICITY
Abstract
A client for streaming a video from a remote media service is disclosed. The video is available upon request in at least a first representation having a first group of pictures, GOP, size and qualified by a first quality score, and a second representation having a second GOP size larger than the first GOP size and qualified by a second quality score superior to the first quality score. The client is configured to perform the following steps: monitoring operational metrics of the video during playback of the first representation; and, when a requirement of the second representation is met, requesting the remote media service to receive a stream of the video according to the second representation and switching the playback of the video to the second representation, thereby increasing the quality score of the representation that is being played.
Claims
1.-15. (cancelled).
16. A client for streaming a video from a remote media service, wherein the video is available from the remote media service to the client upon request in at least: a first representation having a first group of pictures, GOP, size, and wherein the first representation is qualified by a first quality score; and a second representation having a second GOP size larger than the first GOP size, and wherein the second representation is qualified by a second quality score, the second quality score superior to the first quality score; wherein the client is configured to perform the following steps: monitoring operational metrics of the video during playback of the first representation; and verifying, based on the operational metrics, if a playback requirement for playback of another representation having a superior quality score on the client is met; upon detecting that the playback requirement for playback of the second representation is met: requesting the remote media service to receive a stream of the video according to the second representation; and switching the playback of the video to the second representation thereby increasing the quality score of the representation that is being played.
17. The client according to claim 16, wherein the video is further available in a third representation having a third GOP size smaller than the second GOP size, and wherein the third representation is qualified by a third quality score, the third quality score inferior to the second quality score and superior to the first quality score; and wherein the client is further configured to: upon detecting that a playback requirement for playback of the third representation is met, and that the playback requirement for playback of the second representation is not or no longer met: requesting the remote media service to receive a stream of the video according to the third representation; and switching the playback of the video to the third representation thereby maximizing the quality score of the representation that is being played and that meets a requirement.
18. The client according to claim 17, wherein the client is further configured to: upon detecting that the playback requirement for playback of the second and third representation is met: requesting the remote media service to receive a stream of the video according to the second or third representation, depending on which has a first available independent frame; and switching the playback to the representation having the first available independent frame.
19. The client according to claim 18, wherein the client is further configured to: upon the detecting that the playback requirement for playback of the second and third representation is met and when the playback was switched to the third representation: requesting and switching the playback to the second representation thereby further maximizing the quality score.
20. The client according to claim 16, wherein the switching to a requested representation comprises: waiting for an independent frame within the requested representation; and performing the switching at the independent frame.
21. The client according to claim 16, wherein the client is further configured to determine a quality score based on one of the group of: an average bitrate, a maximum bitrate, a resolution, a sample rate, a codec and/or a GOP size.
22. The client according to claim 16, wherein a quality score is received from the remote media service.
23. The client according to claim 16, wherein a playback requirement is determined by at least one of: a resolution of a client's screen; an orientation of the client's screen; a type of network the client is connected to; a current client's CPU usage; a priority assigned to an application running the representation in a multitasking environment; a client's battery status; a client's available memory; a client's supported codecs; a client's estimated available bandwidth; a number of frames dropped during playback of a representation; a number of samples dropped during playback of a representation; an occurrence of errors during decode and playback; and an occurrence of a client's player buffer running empty.
24. The client according to claim 16, wherein the requesting further comprises: requesting the remote media service to receive a stream of the video according to any representation meeting the playback requirement from the next independent frame onwards; and switching to the representation firstly received from the remote media service thereby switching to the representation with a first independent frame.
25. A remote media service configured to provide a video to the client according to claim 16 upon request in at least: a first representation having a first group of pictures, GOP, size, and wherein the first representation is qualified by a first quality score; and a second representation having a second GOP size larger than the first GOP size, and wherein the second representation is qualified by a second quality score, the second quality score superior to the first quality score.
26. The remote media service according to claim 25, further configured to determine a quality score based on one of the group of an average bitrate, a maximum bitrate, a resolution, a GOP size, a sample rate, a quality metric determined by a framework such as Video Multimethod Assessment Fusion, VMAF, Peak Signal-to-Noise Ratio, PSNR, or Structural Similarity, SSIM.
27. A computer-implemented method for streaming a video to a client from a remote media service according to claim 25, wherein the video is available from the remote media service to the client upon request in at least: a first representation having a first group of pictures, GOP, size, and wherein the first representation is qualified by a first quality score; and a second representation having a second GOP size larger than the first GOP size, and wherein the second representation is qualified by a second quality score, the second quality score superior to the first quality score.
28. A computer program product comprising computer-executable instructions for causing a client to perform at least the steps according to claim 16.
29. A computer program product comprising computer-executable instructions for causing a remote media service according to claim 25 to provide a video to a remote client upon request in at least: a first representation having a first group of pictures, GOP, size, and wherein the first representation is qualified by a first quality score; and a second representation having a second GOP size larger than the first GOP size, and wherein the second representation is qualified by a second quality score, the second quality score superior to the first quality score.
30. A computer readable storage medium comprising the computer program product according to claim 29.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0051] The invention will now further be described with references to the drawings wherein:
[0052]
[0053]
[0054]
[0055]
DETAILED DESCRIPTION OF EMBODIMENT(S)
[0056] The present invention relates to the streaming of a video from a remote media service to a client. A video received by a client is a combination of ordered still pictures or frames that are decoded or decompressed and played one after the other within a video application. In this respect, a client may be any device capable of receiving a digital representation of a video over a communication network and capable of decoding the representation into a sequence of frames that can be displayed on a screen to a user. Examples of devices that are suitable as a client are desktop and laptop computers, smartphones, tablets, set-top boxes and TVs. A client may also refer to a video player application running on any of such devices. Streaming a video refers to the concept that the client can request a video from a server and start the playback of the video upon receiving the first frames, without having received all the frames of the video. A streaming server is then a server that can provide such streaming of videos to a client, upon request of that client, over a communication network, for example over the Internet, over a Wide Area Network (WAN) or a Local Area Network (LAN).
[0057] Video received from a streaming server is compressed according to a video compression specification or standard such as H.265/MPEG-H HEVC, H.264/MPEG-4 AVC, H.263/MPEG-4 Part 2, H.262/MPEG-2, SMPTE 421M (VC-1), AOMedia Video 1 (AV1) and VP9. According to those standards, the video frames are compressed in size by using spatial image compression and temporal motion compensation. Frames on which only spatial image compression is applied or no compression is applied are referred to as temporal independent frames, key frames, independent frames or I-frames. A key frame is thus a frame that is decodable independently from other frames in the video. Frames to which temporal motion compensation is applied, whether or not in combination with spatial image compression, are referred to as temporal dependent frames or, in short, dependent frames. Dependent frames are thus frames for which information of other frames is needed to decompress them. Dependent frames are sometimes further categorized into P frames and B frames. P frames can use data from previous frames to decode and are thus more compressible than I frames. B frames can use both previous and forward frames to decode and may therefore achieve the highest amount of data compression.
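The role of independent frames as entry points into a stream can be illustrated with a minimal Python sketch. It assumes, purely for illustration, a fixed GOP size in which every GOP starts with one independent frame followed by P frames only; the names `Frame`, `make_gop_sequence` and `next_switch_point` are hypothetical and do not appear in the application.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Frame:
    index: int
    kind: str  # "I" (independent), "P" or "B" (dependent)


def make_gop_sequence(gop_size: int, total: int) -> List[Frame]:
    """Build an illustrative stream: each GOP opens with an I frame,
    all other frames are modelled as P frames (an assumption)."""
    return [Frame(i, "I" if i % gop_size == 0 else "P") for i in range(total)]


def next_switch_point(frames: List[Frame], position: int) -> Optional[int]:
    """Return the index of the first independent frame at or after
    `position`, or None when no such frame remains in the list."""
    for frame in frames[position:]:
        if frame.kind == "I":
            return frame.index
    return None


# A decoder joining at frame 3 has to wait longer in a stream with a
# larger GOP size before it reaches a frame it can decode on its own.
small_gop = make_gop_sequence(gop_size=2, total=16)
large_gop = make_gop_sequence(gop_size=8, total=16)
print(next_switch_point(small_gop, 3), next_switch_point(large_gop, 3))
```

Under these assumptions, a larger GOP size means fewer independent frames and therefore a longer wait before decoding can start at an arbitrary position.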
[0058]
[0059] The three representations 130-132 are further characterized by a respective quality score. The quality score represents the quality of the respective representation 130-132, for example when played by the client 183 on a screen.
[0060] The quality score is, for example, based on one or more parameters of the group of an average bitrate, a maximum bitrate, a resolution, a sample rate, a codec and/or a GOP size. For example, when the resolution of the second representation 131 is higher than that of the first representation 130, the quality score of the second representation 131 is higher than that of the first representation 130.
[0061] In this illustrative embodiment, the quality score of the first representation 130 is inferior to that of the second representation 131. Further, the quality score of the third representation 132 is likewise inferior to that of the second representation 131. Further, the quality score of the first representation 130 is inferior to that of the third representation 132.
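The ordering of the three representations by quality score can be sketched as follows. The scoring rule (pixel count first, average bitrate as tie-breaker) and all resolution and bitrate values are assumptions chosen only to reproduce the ordering described above; the application does not prescribe a formula.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Representation:
    ref: int               # reference sign used in the description
    width: int             # illustrative value, not from the application
    height: int            # illustrative value, not from the application
    avg_bitrate_kbps: int  # illustrative value, not from the application


def quality_score(rep: Representation) -> Tuple[int, int]:
    """Hypothetical score: resolution dominates, bitrate breaks ties."""
    return (rep.width * rep.height, rep.avg_bitrate_kbps)


representations = [
    Representation(ref=130, width=640, height=360, avg_bitrate_kbps=800),
    Representation(ref=131, width=1920, height=1080, avg_bitrate_kbps=5000),
    Representation(ref=132, width=1280, height=720, avg_bitrate_kbps=2500),
]

# Sort from lowest to highest quality score.
ranked = sorted(representations, key=quality_score)
print([rep.ref for rep in ranked])
```

With these example values the first representation 130 ranks lowest, the third representation 132 sits in the middle, and the second representation 131 ranks highest.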
[0062] In
[0063] During playback 202 monitoring 203 is performed by monitoring operational metrics of the video during playback of the first representation 130 by the client 183. These operational metrics may comprise one of the group of a resolution of the client's 183 screen, the orientation of the client's 183 screen, a type of network the client 183 is connected to, the current client's 183 CPU usage, a priority assigned to an application running the representation 130 in a multitasking environment, the client's 183 battery status, the client's 183 available memory, the client's 183 supported codecs, the client's 183 estimated available bandwidth, a number of frames dropped during playback of the representation 130, a number of media samples dropped during playback of the representation 130, an occurrence of errors during decode and playback, and/or an occurrence of the client's 183 player buffer running empty.
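A snapshot of such operational metrics and a check against a playback requirement might be modelled as in the sketch below. Only a subset of the listed metrics is included, and the class names, field names and thresholds are hypothetical; the application does not define how a requirement is evaluated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationalMetrics:
    estimated_bandwidth_kbps: float
    dropped_frames: int
    buffer_underruns: int   # times the player buffer ran empty
    screen_height_px: int


@dataclass(frozen=True)
class PlaybackRequirement:
    min_bandwidth_kbps: float
    max_dropped_frames: int
    min_screen_height_px: int


def requirement_met(metrics: OperationalMetrics,
                    requirement: PlaybackRequirement) -> bool:
    """Assumed rule: every individual condition must hold, and the
    player buffer must not have run empty during the monitored period."""
    return (
        metrics.estimated_bandwidth_kbps >= requirement.min_bandwidth_kbps
        and metrics.dropped_frames <= requirement.max_dropped_frames
        and metrics.buffer_underruns == 0
        and metrics.screen_height_px >= requirement.min_screen_height_px
    )


# Hypothetical requirement for a high-quality representation.
high_quality = PlaybackRequirement(
    min_bandwidth_kbps=5000, max_dropped_frames=2, min_screen_height_px=1080
)
good = OperationalMetrics(8000, 0, 0, 1080)
congested = OperationalMetrics(3000, 0, 1, 1080)
print(requirement_met(good, high_quality), requirement_met(congested, high_quality))
```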
[0064] Through the monitoring 203 it is determined if a requirement of the second representation 131 is met 204. If so, the client 183 requests the second representation 131 from the server 182 and, once it is received, plays the video according to the second representation 131. If the requirement of the second representation 131 continues to be met 204, the device 183 continues to play the video according to the second representation 131. In other words, the device 183 does not switch to another representation 130, 132 having an inferior quality.
[0065] If through the monitoring 203 it is determined that the requirement of the second representation 131 is either not met 204, or no longer met 204 after playback 205 of the second representation 131, the device 183 determines if a requirement of the third representation 132 is met 207. If this requirement is met 207, the device or client 183 requests the third representation 132 from the server 182 and continues to play back 206 the video according to the third representation 132. During playback 206 of the third representation 132, the monitoring 203 continues. This way, the device 183 aims to play back the video according to the second representation 131, which has the highest quality.
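The decision flow of the two preceding paragraphs can be summarized in a short sketch: among the representations whose requirement is currently met, pick the one with the highest quality score, and otherwise stay on the first representation 130. The bandwidth-only requirements, the numeric scores and the names `Candidate`, `select_representation` and `min_bandwidth` are assumptions made for this example.

```python
from typing import Callable, Mapping, NamedTuple, Sequence

Metrics = Mapping[str, float]


class Candidate(NamedTuple):
    ref: int                               # reference sign of the representation
    quality_score: float                   # illustrative score
    requirement: Callable[[Metrics], bool]


def min_bandwidth(kbps: float) -> Callable[[Metrics], bool]:
    """Hypothetical requirement: estimated bandwidth of at least `kbps`."""
    return lambda metrics: metrics["bandwidth_kbps"] >= kbps


def select_representation(candidates: Sequence[Candidate],
                          metrics: Metrics,
                          fallback_ref: int) -> int:
    """Return the highest-scoring candidate whose requirement is met,
    or the fallback representation when none qualifies."""
    eligible = [c for c in candidates if c.requirement(metrics)]
    if not eligible:
        return fallback_ref
    return max(eligible, key=lambda c: c.quality_score).ref


CANDIDATES = [
    Candidate(ref=131, quality_score=3.0, requirement=min_bandwidth(5000)),
    Candidate(ref=132, quality_score=2.0, requirement=min_bandwidth(2500)),
]

for bandwidth in (6000, 3000, 1000):
    print(bandwidth, select_representation(CANDIDATES, {"bandwidth_kbps": bandwidth}, 130))
```

Re-running the selection on every monitoring cycle lets the client return to the second representation 131 as soon as its requirement is met again.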
[0066] In the illustrative example of
[0067] Embodiments of the invention have been described by solely referring to video frames that are exchanged between server and client. It should be understood that the media comprises video frames, but that the media may further comprise other data, such as audio samples or subtitles. Thus, the media may for example comprise one or more audio tracks or subtitles. Other media may also comprise additional frames of other video streams, for example in the case of panoramic video or video with multiple viewing angles.
[0068] Each frame may also be encapsulated by the server in a frame packet with an additional header. The header may then comprise further information about the content of the packet. Header information may comprise the following fields:
[0069] Decode Time Stamp: a number which parameterizes the frame in time. It describes the timestamp of this frame on the decoding timeline, which does not necessarily equal the presentation timeline used to present the media. The timestamp may further be expressed in timescale units (see below).
[0070] Presentation Time Stamp: a number which describes the position of the frame on the presentation timeline. The timestamp may further be expressed in timescale units (see below).
[0071] Timescale: the number of time units that pass in one second. This applies to the timestamps and the durations given within the frame. For example, a timescale of 50 would mean that each time unit measures 20 milliseconds. A frame duration of 7 would signify 140 milliseconds.
[0072] Frame Duration: an integer describing the duration of the frame in timescale units.
Type: a field describing the type of frame, e.g. a video independent frame, a video non-independent frame, an audio independent frame, an audio dependent frame.
[0073] Media Data Size: the actual length of the frame itself.
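The timescale arithmetic in the header description can be checked with a small sketch. The `FrameHeader` class mirrors the listed fields, but its name, field names, types and example values are assumptions; the application does not specify a wire format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameHeader:
    decode_timestamp: int        # in timescale units
    presentation_timestamp: int  # in timescale units
    timescale: int               # time units per second
    frame_duration: int          # in timescale units
    frame_type: str              # e.g. "video-independent" (hypothetical label)
    media_data_size: int         # length of the frame payload


def units_to_ms(units: int, timescale: int) -> float:
    """Convert a value in timescale units to milliseconds."""
    if timescale <= 0:
        raise ValueError("timescale must be positive")
    return units * 1000 / timescale


header = FrameHeader(
    decode_timestamp=100,
    presentation_timestamp=107,
    timescale=50,
    frame_duration=7,
    frame_type="video-independent",
    media_data_size=4096,
)

# With a timescale of 50, one unit lasts 20 ms and a duration of 7 lasts 140 ms.
print(units_to_ms(1, header.timescale))
print(units_to_ms(header.frame_duration, header.timescale))
```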
[0074] Independent frames may further comprise the following fields in the header:
[0075] Width: the width of the independent frame and all subsequent dependent frames.
[0076] Height: the height of the independent frame and all subsequent dependent frames.
[0077] Total Duration: the total duration of the track this independent frame belongs to, e.g. expressed in timescale units.
[0078] Decoder configuration and codec information.
[0079]
[0080] As used in this application, the term “circuitry” may refer to one or more or all of the following:
[0081] (a) hardware-only circuit implementations such as implementations in only analog and/or digital circuitry and
[0082] (b) combinations of hardware circuits and software, such as (as applicable):
[0083] (i) a combination of analog and/or digital hardware circuit(s) with software/firmware and
[0084] (ii) any portions of hardware processor(s) with software (including digital signal processor(s)), software, and memory(ies) that work together to cause an apparatus, such as a mobile phone or server, to perform various functions) and
[0085] (c) hardware circuit(s) and/or processor(s), such as microprocessor(s) or a portion of a microprocessor(s), that requires software (e.g. firmware) for operation, but the software may not be present when it is not needed for operation.
[0086] This definition of circuitry applies to all uses of this term in this application, including in any claims. As a further example, as used in this application, the term circuitry also covers an implementation of merely a hardware circuit or processor (or multiple processors) or portion of a hardware circuit or processor and its (or their) accompanying software and/or firmware. The term circuitry also covers, for example and if applicable to the particular claim element, a baseband integrated circuit or processor integrated circuit for a mobile device or a similar integrated circuit in a server, a cellular network device, or other computing or network device.
[0087] Although the present invention has been illustrated by reference to specific embodiments, it will be apparent to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied with various changes and modifications without departing from the scope thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the scope of the claims are therefore intended to be embraced therein.
[0088] It will furthermore be understood by the reader of this patent application that the words “comprising” or “comprise” do not exclude other elements or steps, that the words “a” or “an” do not exclude a plurality, and that a single element, such as a computer system, a processor, or another integrated unit may fulfil the functions of several means recited in the claims. Any reference signs in the claims shall not be construed as limiting the respective claims concerned. The terms “first”, “second”, “third”, “a”, “b”, “c”, and the like, when used in the description or in the claims are introduced to distinguish between similar elements or steps and are not necessarily describing a sequential or chronological order. Similarly, the terms “top”, “bottom”, “over”, “under”, and the like are introduced for descriptive purposes and not necessarily to denote relative positions. It is to be understood that the terms so used are interchangeable under appropriate circumstances and embodiments of the invention are capable of operating according to the present invention in other sequences, or in orientations different from the one(s) described or illustrated above.