MEDIA CONTENT DISTRIBUTION AND PLAYBACK

20220329903 · 2022-10-13

    Abstract

    A client streams a video from a remote media service, the video being available upon request in at least a first representation having a first group of pictures (GOP) size and qualified by a first quality score, and a second representation having a second GOP size larger than the first GOP size and qualified by a second quality score superior to the first quality score. The client is configured to perform the following steps: monitoring operational metrics of the video during playback of the first representation; and, when a requirement of the second representation is met, requesting the remote media service to receive a stream of the video according to the second representation and switching the playback of the video to the second representation, thereby increasing the quality score of the representation.

    Claims

    1-15. (cancelled).

    16. A client for streaming a video from a remote media service, wherein the video is available from the remote media service to the client upon request in at least: a first representation having a first group of pictures, GOP, size, and wherein the first representation is qualified by a first quality score; and a second representation having a second GOP size larger than the first GOP size, and wherein the second representation is qualified by a second quality score, the second quality score superior to the first quality score; wherein the client is configured to perform the following steps: monitoring operational metrics of the video during playback of the first representation; and verifying, based on the operational metrics, if a playback requirement for playback of another representation having a superior quality score on the client is met; upon detecting that the playback requirement for playback of the second representation is met: requesting the remote media service to receive a stream of the video according to the second representation; and switching the playback of the video to the second representation thereby increasing the quality score of the representation that is being played.

    17. The client according to claim 16, wherein the video is further available in a third representation having a third GOP size smaller than the second GOP size, and wherein the third representation is qualified by a third quality score, the third quality score inferior to the second quality score and superior to the first quality score; and wherein the client is further configured to: upon detecting that a playback requirement for playback of the third representation is met, and that the playback requirement for playback of the second representation is not or no longer met: requesting the remote media service to receive a stream of the video according to the third representation; and switching the playback of the video to the third representation thereby maximizing the quality score of the representation that is being played and that meets a requirement.

    18. The client according to claim 17, wherein the client is further configured to: upon detecting that the playback requirement for playback of the second and third representation is met: requesting the remote media service to receive a stream of the video according to the second or third representation, depending on which has a first available independent frame; and switching the playback to the representation having the first available independent frame.

    19. The client according to claim 18, wherein the client is further configured to: upon the detecting that the playback requirement for playback of the second and third representation is met and when the playback was switched to the third representation: requesting and switching the playback to the second representation thereby further maximizing the quality score.

    20. The client according to claim 16, wherein the switching to a requested representation comprises: waiting for an independent frame within the requested representation; and performing the switching at the independent frame.

    21. The client according to claim 16, wherein the client is further configured to determine a quality score based on one of the group of: an average bitrate, a maximum bitrate, a resolution, a sample rate, a codec and/or a GOP size.

    22. The client according to claim 16, wherein a quality score is received from the remote media service.

    23. The client according to claim 16, wherein a playback requirement is determined by at least one of: a resolution of a client's screen; an orientation of the client's screen; a type of network the client is connected to; a current client's CPU usage; a priority assigned to an application running the representation in a multitasking environment; a client's battery status; a client's available memory; a client's supported codecs; a client's estimated available bandwidth; a number of frames dropped during playback of a representation; a number of samples dropped during playback of a representation; an occurrence of errors during decode and playback; and an occurrence of a client's player buffer running empty.

    24. The client according to claim 16, wherein the requesting further comprises: requesting the remote media service to receive a stream of the video according to any representation meeting the playback requirement from the next independent frame onwards; and switching to the representation firstly received from the remote media service thereby switching to the representation with a first independent frame.

    25. A remote media service configured to provide a video to the client according to claim 16 upon request in at least: a first representation having a first group of pictures, GOP, size, and wherein the first representation is qualified by a first quality score; and a second representation having a second GOP size larger than the first GOP size, and wherein the second representation is qualified by a second quality score, the second quality score superior to the first quality score.

    26. The remote media service according to claim 25, further configured to determine a quality score based on one of the group of an average bitrate, a maximum bitrate, a resolution, a GOP size, a sample rate, a quality metric determined by a framework such as Video Multimethod Assessment Fusion, VMAF, Peak Signal-to-Noise Ratio, PSNR, or Structural Similarity, SSIM.

    27. A computer-implemented method for streaming a video to a client from a remote media service according to claim 25, wherein the video is available from the remote media service to the client upon request in at least: a first representation having a first group of pictures, GOP, size, and wherein the first representation is qualified by a first quality score; and a second representation having a second GOP size larger than the first GOP size, and wherein the second representation is qualified by a second quality score, the second quality score superior to the first quality score.

    28. A computer program product comprising computer-executable instructions for causing a client to perform at least the steps according to claim 16.

    29. A computer program product comprising computer-executable instructions for causing a remote media service according to claim 25 to provide a video to a remote client upon request in at least: a first representation having a first group of pictures, GOP, size, and wherein the first representation is qualified by a first quality score; and a second representation having a second GOP size larger than the first GOP size, and wherein the second representation is qualified by a second quality score, the second quality score superior to the first quality score.

    30. A computer readable storage medium comprising the computer program product according to claim 29.

    Description

    BRIEF DESCRIPTION OF THE DRAWINGS

    [0051] The invention will now further be described with references to the drawings wherein:

    [0052] FIG. 1 illustrates a server and a client according to an embodiment;

    [0053] FIG. 2 illustrates steps performed by a client for receiving a representation from a server;

    [0054] FIG. 3 illustrates a set-up of a remote media service for communication with a server and a client; and

    [0055] FIG. 4 illustrates a suitable computing system for performing methods according to various embodiments of the invention.

    DETAILED DESCRIPTION OF EMBODIMENT(S)

    [0056] The present invention relates to the streaming of a video from a remote media service to a client. A video received by a client is a combination of ordered still pictures or frames that are decoded or decompressed and played one after the other within a video application. In this respect, a client may be any device capable of receiving a digital representation of a video over a communication network and capable of decoding the representation into a sequence of frames that can be displayed on a screen to a user. Examples of devices that are suitable as a client are desktop and laptop computers, smartphones, tablets, set-top boxes and TVs. A client may also refer to a video player application running on any such device. Streaming a video refers to the concept that the client can request a video from a server and start the playback of the video upon receiving the first frames, without having received all the frames of the video. A streaming server is then a server that can provide such streaming of videos to a client, upon request of that client, over a communication network, for example over the Internet, over a Wide Area Network (WAN) or a Local Area Network (LAN).

    [0057] Video received from a streaming server is compressed according to a video compression specification or standard such as H.265/MPEG-H HEVC, H.264/MPEG-4 AVC, H.263/MPEG-4 Part 2, H.262/MPEG-2, SMPTE 421M (VC-1), AOMedia Video 1 (AV1) and VP9. According to those standards, the video frames are compressed in size by using spatial image compression and temporal motion compensation. Frames on which only spatial image compression is applied, or no compression is applied, are referred to as temporal independent frames, key frames, independent frames or I-frames. A key frame is thus a frame that is decodable independently from other frames in the video. Frames to which temporal motion compensation is applied, either alone or in combination with image compression, are referred to as temporal dependent frames or, in short, dependent frames. Dependent frames are thus frames for which information from other frames is needed to decompress them. Dependent frames are sometimes further categorized into P-frames and B-frames. P-frames can use data from previous frames to decode and are thus more compressible than I-frames. B-frames can use both previous and subsequent frames to decode and may therefore achieve the highest amount of data compression.

    [0058] FIG. 1 illustrates a streaming server 182 for providing video streams to a client 183 according to an embodiment of the invention. The server 182 comprises a video 180 which is available through different representations. A first representation 130 has a number of I-frames 101-106 separated from each other by a predefined group of pictures, GOP, size 190. The video 180 is further available in a second representation 131 which comprises fewer I-frames 111-114 than the first representation 130, and thus has a larger GOP size 191 than the first representation. The video 180 is further available in a third representation 132 with fewer I-frames 121-123 than the second representation 131 and thus a larger GOP size 192.
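
    As a rough illustration of the layout of FIG. 1, the sketch below models a representation by its GOP size and lists the frame indices at which I-frames occur. The class name `Representation`, its fields, the 24-frame clip length and the GOP sizes of 4, 6 and 8 frames are assumptions chosen for illustration only; the description gives no numeric values. The helper `next_i_frame` shows one possible way to find the next independent frame at which a switch could be performed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    """One encoding of the video; names and fields are illustrative only."""
    name: str
    gop_size: int  # number of frames from one I-frame to the next

    def i_frame_indices(self, total_frames: int) -> list[int]:
        """Frame indices that start a new group of pictures."""
        return list(range(0, total_frames, self.gop_size))

    def next_i_frame(self, frame_index: int) -> int:
        """First I-frame index at or after frame_index (ceiling to a GOP boundary)."""
        return -(-frame_index // self.gop_size) * self.gop_size


# Hypothetical GOP sizes: a 24-frame clip then yields 6, 4 and 3 I-frames,
# matching the I-frame counts drawn for representations 130, 131 and 132.
first = Representation("130", gop_size=4)
second = Representation("131", gop_size=6)
third = Representation("132", gop_size=8)

for rep in (first, second, third):
    print(rep.name, rep.i_frame_indices(24))
```

    Under these assumed values, a larger GOP size means fewer I-frames and therefore fewer points at which playback can start or switch.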

    [0059] The three representations 130-132 are further characterized by a respective quality score. The quality score represents the quality of the respective representation 130-132, for example when played by the client 183 on a screen.

    [0060] The quality score is, for example, based on one or more parameters of the group of an average bitrate, a maximum bitrate, a resolution, a sample rate, a codec and/or a GOP size. For example, when the resolution of the second representation 131 is higher than that of the first representation 130, the quality score of the second representation 131 is higher than that of the first representation 130.
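
    The description does not prescribe a formula for the quality score. Purely as a sketch, the following toy function combines two of the listed parameters with arbitrary weights; the names `EncodingParams` and `quality_score`, the chosen parameters and the weights are all assumptions, not part of the described embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncodingParams:
    """Hypothetical subset of the parameters a quality score may be based on."""
    average_bitrate_kbps: int
    height_px: int


def quality_score(p: EncodingParams) -> float:
    """Toy score that grows with bitrate and resolution.

    The weights are arbitrary assumptions, not values from the description.
    """
    return 0.001 * p.average_bitrate_kbps + 0.01 * p.height_px


low = EncodingParams(average_bitrate_kbps=1500, height_px=480)
high = EncodingParams(average_bitrate_kbps=1500, height_px=1080)

# With the bitrate equal, the higher resolution yields the higher score.
print(quality_score(low) < quality_score(high))  # True
```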

    [0061] In this illustrative embodiment, the quality score of the first representation 130 is inferior to that of the second representation 131. Further, the quality score of the third representation 132 is likewise inferior to that of the second representation 131. Further, the quality score of the first representation 130 is inferior to that of the third representation 132.

    [0062] FIG. 2 illustrates steps performed by the client 183 for receiving the video 180 from the server 182 and subsequently displaying it on a screen of the client. In a first step, the client 183 determines the start time 201. Preferably, the video 180 is played back on the screen from the first I-frame 101 of the first representation 130 onwards.

    [0063] During playback 202, the client 183 performs monitoring 203 of operational metrics of the video while the first representation 130 is played. These operational metrics may comprise one or more of: a resolution of the client's 183 screen, the orientation of the client's 183 screen, a type of network the client 183 is connected to, the client's 183 current CPU usage, a priority assigned to an application running the representation 130 in a multitasking environment, the client's 183 battery status, the client's 183 available memory, the client's 183 supported codecs, the client's 183 estimated available bandwidth, a number of frames dropped during playback of the representation 130, a number of media samples dropped during playback of the representation 130, an occurrence of errors during decode and playback, and an occurrence of the client's 183 player buffer running empty.

    [0064] Through the monitoring 203 it is determined if a requirement of the second representation 131 is met 204. If so, the client 183 requests the server 182 to send the second representation 131 and, once it is received, plays the video using the second representation 131. If the requirement of the second representation 131 continues to be met 204, the device 183 continues to play the video using the second representation 131. In other words, the device 183 does not switch to another representation 130, 132 having an inferior quality.

    [0065] If through the monitoring 203 it is determined that the requirement of the second representation 131 is either not met 204, or no longer met 204 after playback 205 of the second representation 131, the device 183 determines if a requirement of the third representation 132 is met 207. If this requirement is met 207, the device or client 183 requests the server 182 to send the third representation 132 and continues to play back 206 the video using the third representation 132. During playback 206 of the third representation 132, the monitoring 203 continues. This way, the device 183 aims to play back the video using the second representation 131, which has the highest quality.
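
    The decision flow of FIG. 2 can be sketched as choosing the highest-scoring representation whose requirement is currently met, and otherwise staying on the first representation. In the sketch below, the names `Candidate`, `Metrics`, `requirement_met` and `select`, the quality scores, and the use of a minimum bandwidth plus a buffer check as the requirement are hypothetical; the description lists many other possible metrics and does not fix how a requirement is evaluated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A representation plus a hypothetical playback requirement."""
    name: str
    quality_score: float
    min_bandwidth_kbps: int  # assumed requirement, for illustration only


@dataclass(frozen=True)
class Metrics:
    """Hypothetical snapshot of two monitored operational metrics."""
    estimated_bandwidth_kbps: int
    buffer_ran_empty: bool


def requirement_met(c: Candidate, m: Metrics) -> bool:
    """Assumed rule: enough bandwidth and no recent buffer underrun."""
    return m.estimated_bandwidth_kbps >= c.min_bandwidth_kbps and not m.buffer_ran_empty


def select(candidates: list[Candidate], fallback: Candidate, m: Metrics) -> Candidate:
    """Return the highest-scoring candidate whose requirement is met, else the fallback."""
    for c in sorted(candidates, key=lambda c: c.quality_score, reverse=True):
        if requirement_met(c, m):
            return c
    return fallback


first = Candidate("130", quality_score=1.0, min_bandwidth_kbps=0)
second = Candidate("131", quality_score=3.0, min_bandwidth_kbps=5000)
third = Candidate("132", quality_score=2.0, min_bandwidth_kbps=2500)

print(select([second, third], first, Metrics(6000, False)).name)  # 131
print(select([second, third], first, Metrics(3000, False)).name)  # 132
print(select([second, third], first, Metrics(1000, False)).name)  # 130
```

    Re-running `select` each time the monitored metrics change mirrors the loop of FIG. 2: the client moves to the second representation whenever its requirement is met, and falls back to the third or first representation otherwise.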

    [0066] In the illustrative example of FIGS. 1 and 2 the server 182 is represented as being located in a single location. In most cases, however, the client 183 will communicate with a remote media service as illustrated in FIG. 3. The remote media service 301 will then act as an agent between the client 183 and one or more servers 182, 302. The first representation 130 may then, for example, be stored at server 182, while the second 131 and third 132 representations are stored at server 302.

    [0067] Embodiments of the invention have been described by solely referring to video frames that are exchanged between server and client. It should be understood that the media comprises video frames, but that the media may further comprise other data, such as audio samples or subtitles. Thus, the media may for example comprise one or more audio tracks or subtitles. Other media may also comprise additional frames of other video streams, for example in the case of panoramic video or video with multiple viewing angles.

    [0068] Each frame may also be encapsulated by the server in a frame packet with an additional header. The header may then comprise further information about the content of the packet. Header information may comprise the following fields:

    [0069] Decode Time Stamp: a number which parameterizes the frame in time. It describes the timestamp of this frame on the decoding timeline, which does not necessarily equal the presentation timeline used to present the media. The timestamp may further be expressed in timescale units (see below).

    [0070] Presentation Time Stamp: a number which describes the position of the frame on the presentation timeline. The timestamp may further be expressed in timescale units (see below).

    [0071] Timescale: the number of time units that pass in one second. This applies to the timestamps and the durations given within the frame. For example, a timescale of 50 would mean that each time unit measures 20 milliseconds. A frame duration of 7 would signify 140 milliseconds.

    [0072] Frame Duration: an integer describing the duration of the frame in timescale units.

    Type: a field describing the type of frame, e.g. a video independent frame, a video non-independent frame, an audio independent frame, an audio dependent frame.

    [0073] Media Data Size: the actual length of the frame itself.
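
    The timescale arithmetic in the example above can be checked with a few lines of code. The function name `units_to_ms` is an assumption introduced for this sketch; exact fractions are used so that timescales which do not divide 1000 evenly are not rounded.

```python
from fractions import Fraction


def units_to_ms(units: int, timescale: int) -> Fraction:
    """Convert a value expressed in timescale units to milliseconds.

    `timescale` is the number of time units per second, so one unit
    lasts 1000 / timescale milliseconds.
    """
    return Fraction(units * 1000, timescale)


print(units_to_ms(1, 50))  # 20  (one time unit at timescale 50)
print(units_to_ms(7, 50))  # 140 (a frame duration of 7 units)
```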

    [0074] Independent frames may further comprise the following fields in the header:

    [0075] Width: the width of the independent frame and all subsequent dependent frames.

    [0076] Height: the height of the independent frame and all subsequent dependent frames.

    [0077] Total Duration: the total duration of the track this independent frame belongs to, e.g. expressed in timescale units.

    [0078] Decoder configuration and codec information.
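
    For illustration, the header fields listed above could be held in memory as follows. This is a sketch only: the description does not define a wire format, field widths or byte order, and the names `FrameType`, `IndependentFrameInfo`, `FrameHeader` and `is_video_switch_point` are assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FrameType(Enum):
    """The frame types named in the description."""
    VIDEO_INDEPENDENT = "video independent"
    VIDEO_DEPENDENT = "video dependent"
    AUDIO_INDEPENDENT = "audio independent"
    AUDIO_DEPENDENT = "audio dependent"


@dataclass(frozen=True)
class IndependentFrameInfo:
    """Extra header fields carried only by independent frames."""
    width: int
    height: int
    total_duration: int    # e.g. in timescale units
    decoder_config: bytes  # opaque decoder configuration and codec information


@dataclass(frozen=True)
class FrameHeader:
    """In-memory view of a frame packet header (layout is hypothetical)."""
    decode_time_stamp: int        # in timescale units
    presentation_time_stamp: int  # in timescale units
    timescale: int                # time units per second
    frame_duration: int           # in timescale units
    frame_type: FrameType
    media_data_size: int          # length of the frame payload
    independent_info: Optional[IndependentFrameInfo] = None

    def is_video_switch_point(self) -> bool:
        """A video independent frame is where playback could switch."""
        return self.frame_type is FrameType.VIDEO_INDEPENDENT


key = FrameHeader(0, 0, 50, 7, FrameType.VIDEO_INDEPENDENT, 1024,
                  IndependentFrameInfo(1920, 1080, 3000, b""))
delta = FrameHeader(7, 7, 50, 7, FrameType.VIDEO_DEPENDENT, 256)
print(key.is_video_switch_point(), delta.is_video_switch_point())  # True False
```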

    [0079] FIG. 4 shows a computing system 400 suitable for implementing embodiments of the method for streaming a video from a remote media service 301 according to the invention. Computing system 400 may in general be formed as a suitable general-purpose computer and comprise a bus 410, a processor 402, a local memory 404, one or more optional input interfaces 414, one or more optional output interfaces 416, a communication interface 412, a storage element interface 406, and one or more storage elements 408. Bus 410 may comprise one or more conductors that permit communication among the components of the computing system 400. Processor 402 may include any type of conventional processor or microprocessor that interprets and executes programming instructions. Local memory 404 may include a random-access memory (RAM) or another type of dynamic storage device that stores information and instructions for execution by processor 402 and/or a read only memory (ROM) or another type of static storage device that stores static information and instructions for use by processor 402. Input interface 414 may comprise one or more conventional mechanisms that permit an operator or user to input information to the computing system 400, such as a keyboard 420, a mouse 430, a pen, voice recognition and/or biometric mechanisms, a camera, etc. Output interface 416 may comprise one or more conventional mechanisms that output information to the operator or user, such as a display 440, etc. Communication interface 412 may comprise any transceiver-like mechanism, such as for example one or more Ethernet interfaces, that enables computing system 400 to communicate with other devices and/or systems, for example with other computing devices 182, 183, 301. The communication interface 412 of computing system 400 may be connected to such another computing system by means of a local area network (LAN) or a wide area network (WAN) such as for example the internet. Storage element interface 406 may comprise a storage interface such as for example a Serial Advanced Technology Attachment (SATA) interface or a Small Computer System Interface (SCSI) for connecting bus 410 to one or more storage elements 408, such as one or more local disks, for example SATA disk drives, and control the reading and writing of data to and/or from these storage elements 408. Although the storage element(s) 408 above is/are described as a local disk, in general any other suitable computer-readable media such as a removable magnetic disk, optical storage media such as a CD-ROM or DVD-ROM disk, solid state drives, flash memory cards, and the like could be used. Computing system 400 could thus correspond to the device 183 or the server 182 in the embodiments illustrated by FIG. 1 or FIG. 2.

    [0080] As used in this application, the term “circuitry” may refer to one or more or all of the following:

    [0081] (a) hardware-only circuit implementations such as implementations in only analog and/or digital circuitry and

    [0082] (b) combinations of hardware circuits and software, such as (as applicable):

    [0083] (i) a combination of analog and/or digital hardware circuit(s) with software/firmware and

    [0084] (ii) any portions of hardware processor(s) with software (including digital signal processor(s)), software, and memory(ies) that work together to cause an apparatus, such as a mobile phone or server, to perform various functions and

    [0085] (c) hardware circuit(s) and/or processor(s), such as microprocessor(s) or a portion of a microprocessor(s), that requires software (e.g. firmware) for operation, but the software may not be present when it is not needed for operation.

    [0086] This definition of circuitry applies to all uses of this term in this application, including in any claims. As a further example, as used in this application, the term circuitry also covers an implementation of merely a hardware circuit or processor (or multiple processors) or portion of a hardware circuit or processor and its (or their) accompanying software and/or firmware. The term circuitry also covers, for example and if applicable to the particular claim element, a baseband integrated circuit or processor integrated circuit for a mobile device or a similar integrated circuit in a server, a cellular network device, or other computing or network device.

    [0087] Although the present invention has been illustrated by reference to specific embodiments, it will be apparent to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied with various changes and modifications without departing from the scope thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the scope of the claims are therefore intended to be embraced therein.

    [0088] It will furthermore be understood by the reader of this patent application that the words “comprising” or “comprise” do not exclude other elements or steps, that the words “a” or “an” do not exclude a plurality, and that a single element, such as a computer system, a processor, or another integrated unit may fulfil the functions of several means recited in the claims. Any reference signs in the claims shall not be construed as limiting the respective claims concerned. The terms “first”, “second”, “third”, “a”, “b”, “c”, and the like, when used in the description or in the claims are introduced to distinguish between similar elements or steps and are not necessarily describing a sequential or chronological order. Similarly, the terms “top”, “bottom”, “over”, “under”, and the like are introduced for descriptive purposes and not necessarily to denote relative positions. It is to be understood that the terms so used are interchangeable under appropriate circumstances and embodiments of the invention are capable of operating according to the present invention in other sequences, or in orientations different from the one(s) described or illustrated above.