H04L65/762

Systems and methods for player input motion compensation by anticipating motion vectors and/or caching repetitive motion vectors
11695951 · 2023-07-04

Systems and methods for reducing latency through motion estimation and compensation techniques are disclosed. The systems and methods include a client device that uses lookup tables transmitted from a remote server to match user input to motion vectors, and to tag and sum those motion vectors. When the remote server transmits encoded video frames to the client, the client decodes those video frames and applies the summed motion vectors to the decoded frames to estimate motion in those frames. In certain embodiments, the systems and methods generate motion vectors at a server based on predetermined criteria and transmit the generated motion vectors and one or more invalidators to a client, which caches those motion vectors and invalidators. The server instructs the client to receive input from a user and to match that input against the cached motion vectors or invalidators. Based on that comparison, the client then applies the matched motion vectors or invalidators to effect motion compensation in a graphic interface. In other embodiments, the systems and methods cache repetitive motion vectors at a server, which transmits a previously generated motion vector library to a client. The client stores the motion vector library and monitors for user input data. The server instructs the client to calculate a motion estimate from the input data and to update the stored motion vector library based on the input data, so that the client applies the stored motion vector library to initiate motion in a graphic interface prior to receiving actual motion vector data from the server. In this manner, latency in video data streams is reduced.
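
As an illustration of the lookup-table step in the abstract above, the following minimal Python sketch matches tagged user inputs to motion vectors and sums them per block. The table contents, the input names, the block-coordinate keys and the function name `sum_tagged_vectors` are all hypothetical; the abstract does not specify any data layout.

```python
from collections import defaultdict

# Hypothetical lookup table sent by the server: input name -> per-block
# motion vectors. Keys are block coordinates, values are (dx, dy) in pixels.
LOOKUP_TABLE = {
    "turn_left": {(0, 0): (4, 0), (1, 0): (4, 0)},
    "look_up": {(0, 0): (0, -2), (1, 0): (0, -3)},
}


def sum_tagged_vectors(inputs, table=LOOKUP_TABLE):
    """Match each (tag, input name) pair to its vectors and sum them per block."""
    totals = defaultdict(lambda: (0, 0))
    applied_tags = []
    for tag, name in inputs:
        vectors = table.get(name)
        if vectors is None:
            continue  # no table entry for this input, so nothing to apply
        applied_tags.append(tag)
        for block, (dx, dy) in vectors.items():
            sx, sy = totals[block]
            totals[block] = (sx + dx, sy + dy)
    return dict(totals), applied_tags


# Three tagged inputs; "jump" has no entry in the table and is skipped.
summed, tags = sum_tagged_vectors([(1, "turn_left"), (2, "look_up"), (3, "jump")])
```

In this sketch the returned tags record which inputs contributed to the sum, so a client could later tell which contributions a given server frame already reflects; that use of the tags is an assumption, not something the abstract states.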

Computer vision on broadcast video

Disclosed are systems and methods for improving interactions with and between computers in content searching, hosting and/or providing systems supported by or configured with devices, servers and/or platforms. The disclosed systems and methods provide an image processing framework that sub-divides computer vision techniques into three computationally efficient steps: detection, classification and matching. These steps provide an improved image processing framework that can analyze live stream data of a media file, in real time, in order to identify and track specific digital objects depicted therein. This enables not only image processing detection results, but also the capability to augment the video stream with additional data related to the detected object.

Securing image data from unintended disclosure at a videoconferencing endpoint

A system for preventing private image data captured at an endpoint from being shared during a videoconference is provided. A user can select three-dimensional regions which will not be seen during a videoconference while areas in front of the designated regions remain viewable.

Method and system for redundant media presentation generation

A system, apparatus and method for distributed adaptive streaming packaging can include a plurality of distributed adaptive streaming packagers having one or more processors configured to perform the functions of identifying one or more media segments in one or more input signals, identifying one or more latest media segment presentation times in the one or more media segments, identifying one or more latest media segment presentation durations in the one or more media segments, and adding each of the one or more latest media segment presentation times to each of the one or more latest media segment presentation durations in the input signal to compute one or more calculated publish times. The system or method can further include choosing one of the one or more publish times as the media presentation publish time and generating a media presentation based on the media presentation publish time. In some embodiments, the method can set MPD@publishTime to the media presentation publish time or set the #EXT-X-PROGRAM-DATE-TIME tag to the media presentation publish time.
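
The publish-time arithmetic described above can be sketched in Python as follows. Each tuple stands for the latest media segment seen in one input signal. The function names are hypothetical, and choosing the earliest candidate is an assumption of this sketch, since the abstract says only that one of the calculated publish times is chosen.

```python
from datetime import datetime, timedelta, timezone


def calculated_publish_times(latest_segments):
    """Add each latest segment presentation time to its presentation duration."""
    return [start + duration for start, duration in latest_segments]


def choose_publish_time(candidates):
    """Pick one candidate; taking the earliest is an assumption of this sketch."""
    return min(candidates)


t0 = datetime(2023, 1, 3, 12, 0, 0, tzinfo=timezone.utc)
latest_segments = [
    (t0, timedelta(seconds=2)),                         # input signal A
    (t0 + timedelta(seconds=2), timedelta(seconds=2)),  # input signal B
]
publish_time = choose_publish_time(calculated_publish_times(latest_segments))

# Formatted as an xs:dateTime string, as used for MPD@publishTime.
mpd_publish_time = publish_time.strftime("%Y-%m-%dT%H:%M:%SZ")
```

Because the value is derived only from segment times and durations, packagers that see the same input and apply the same selection rule would compute the same publish time without coordinating with each other.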

Packetized data communication over multiple unreliable channels
11546615 · 2023-01-03

A method comprising: receiving a plurality of duplicates of a serial bit stream, wherein said serial bit stream comprises a sequence of data packets; continuously dividing each of said duplicates of said serial bit stream based on sequential time windows; with respect to each of said time windows, aligning said data packets associated with each of said duplicates of said serial bit stream, received within said time window, based, at least in part, on data packet similarity; and recreating in real time said serial bit stream by selecting at least one of said aligned data packets as representing a next data packet in said sequence of data packets.
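
A simplified Python sketch of the windowing and selection steps in the claim above is given here. It is not the claimed method: alignment by position within each time window stands in for the similarity-based alignment, and a majority vote is one assumed way of selecting the representative packet. The function names and the `(arrival_time, payload)` packet layout are hypothetical.

```python
from collections import Counter
from itertools import zip_longest


def split_into_windows(packets, window):
    """Group (arrival_time, payload) pairs by sequential time window index."""
    windows = {}
    for arrival, payload in packets:
        windows.setdefault(int(arrival // window), []).append(payload)
    return windows


def recreate_stream(channels, window=1.0):
    """Rebuild the packet sequence from several duplicate channels."""
    per_channel = [split_into_windows(c, window) for c in channels]
    indices = sorted({i for w in per_channel for i in w})
    output = []
    for i in indices:
        # Simplified alignment: the position inside the window replaces
        # the similarity-based alignment described in the claim.
        for column in zip_longest(*(w.get(i, []) for w in per_channel)):
            votes = Counter(p for p in column if p is not None)
            output.append(votes.most_common(1)[0][0])
    return output


channel_a = [(0.1, b"A"), (0.5, b"B"), (1.2, b"C")]
channel_b = [(0.2, b"A"), (0.6, b"X"), (1.3, b"C")]  # second packet corrupted
channel_c = [(0.1, b"A"), (0.4, b"B"), (1.1, b"C")]
rebuilt = recreate_stream([channel_a, channel_b, channel_c])
```

Positional alignment breaks down when a channel drops a packet, which is the case that alignment by packet similarity would be expected to handle.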

Method and apparatus for controlling devices to present content and storage medium

A method for controlling devices to present content includes: determining content to be presented in response to a content request event occurring; determining a content presentation form supported by a presentation device; and providing corresponding presentation content to the presentation device in accordance with a determined presentation form. As such, the current content presentation form can be determined based on the content presentation forms supported by the presentation device, and the content in different presentation forms can be continuously presented, thereby enhancing the user experience.

Method and system for providing personalized content to a user

Disclosed herein is a method and system for providing personalized content to a user. The method comprises categorizing original content to be provided to the user into a plurality of data packets. The data packets include data of a similar domain. The user is categorized into one of a plurality of classes, and a vocabulary of words suitable for the class is identified. The class is associated with a domain. The system identifies relevant content for the class. Thereafter, the system modifies the original content by either inserting a new data packet or deleting a data packet. A target content is generated for the class based on the vocabulary of words associated with the class and the modified original content. Thereafter, the target content is provided to the class by incorporating one or more features of a presenter for presenting the target content. The present disclosure enhances user experience by personalizing content for the user.

Flexible interoperability and capability signaling using initialization hierarchy
11546402 · 2023-01-03

A method and apparatus include placing, in a Moving Picture Experts Group (MPEG) dynamic adaptive streaming over hypertext transfer protocol (DASH) media presentation description (MPD) file, an initialization presentation element that identifies an initialization presentation and one or more initialization groups included in the initialization presentation. An initialization group element that identifies an initialization group and one or more initialization sets included in the initialization group is included in the MPD file. An initialization set element that identifies an initialization set is included in the MPD file. The MPD file is transmitted to a client device.

Method and apparatus for functional improvements to moving picture experts group network based media processing
11544108 · 2023-01-03

A method of processing media content in Moving Picture Experts Group (MPEG) Network Based Media Processing (NBMP) includes obtaining, from a function repository storing one or more functions for processing the media content, at least one among the one or more functions, each of the at least one among the one or more functions having a function descriptor indicating, for a respective one among the one or more functions, a maximum throughput, a minimum buffer size, a maximum size of metadata and a maximum frequency between two instances of the metadata, and obtaining a task for processing the media content, based on the obtained at least one among the one or more functions, the task having a task descriptor indicating, for the task, a maximum throughput, a minimum buffer size, a maximum size of metadata and a maximum frequency between two instances of the metadata.

Fast multi-rate encoding for adaptive HTTP streaming

According to embodiments of the disclosure, information from higher and lower quality encoded video segments is used to limit Rate-Distortion Optimization (RDO) for each Coding Tree Unit (CTU). A method first encodes the highest bit-rate segment and subsequently uses it to encode the lowest bit-rate video segment. The block structure and selected reference frame of both the highest and lowest bit-rate video segments are used to predict and shorten the RDO process for each CTU at the middle bit-rates. The method delays just one frame using parallel processing. This approach reduces time complexity compared to the reference software for the middle bit-rates, while degradation is negligible.
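
One way to picture how the RDO search could be shortened is the Python sketch below, which restricts the split depths evaluated for a CTU at a middle bit-rate to the range spanned by the depths chosen at the highest and lowest bit-rates. This bounding rule, the function name `candidate_depths` and the maximum depth of 3 are assumptions made for illustration; the abstract does not state the exact prediction rule.

```python
def candidate_depths(depth_highest, depth_lowest, max_depth=3):
    """Return the CTU split depths to evaluate at a middle bit-rate.

    Assumes the depth chosen at a middle bit-rate lies between the depths
    chosen for the co-located CTU at the lowest and highest bit-rates.
    """
    lo, hi = sorted((depth_lowest, depth_highest))
    return list(range(max(lo, 0), min(hi, max_depth) + 1))


# An unrestricted search evaluates every depth from 0 to max_depth.
full_search = list(range(0, 4))

# The co-located CTU used depth 3 at the highest and depth 1 at the lowest bit-rate.
limited = candidate_depths(depth_highest=3, depth_lowest=1)
```

In this example the encoder would skip depth 0 entirely; when both reference encodings agree on a depth, only that single depth remains to be evaluated.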