H04N21/43072

Distribution of multiple signals of video content independently over a network

A stereoscopic production solution, e.g., for live events, that provides 3D video asset distribution to multiple devices and networks is described. In some embodiments, live or recorded 3D video content may be accessible by different service providers with different subscribers/users and protocols across a network of the content provider. A first video signal corresponding to a first video feed for one eye of a viewer may be received and a second video signal corresponding to a second video feed for the second eye of the viewer may be received. The first video signal and the second video signal may be encoded. The encoded first video signal and the encoded second video signal may be transmitted independently over a network. The two video signals may be received and frame-synced at an off-site location for eventual rendering to a display device.
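The frame-sync step at the receiving site can be illustrated by pairing frames from the two independently transmitted streams by capture timestamp. This is a minimal sketch, not the patent's implementation; the `Frame` model, function name, and `tolerance` parameter are all illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    timestamp: float  # capture time in seconds
    data: bytes

def pair_stereo_frames(left, right, tolerance=0.001):
    """Match frames from the two independently received eye streams whose
    capture timestamps agree within `tolerance` seconds."""
    pairs = []
    j = 0
    for lf in left:
        # advance through the right-eye stream until we reach or pass lf
        while j + 1 < len(right) and right[j + 1].timestamp <= lf.timestamp:
            j += 1
        if abs(right[j].timestamp - lf.timestamp) <= tolerance:
            pairs.append((lf, right[j]))
    return pairs
```

Because the two signals travel independently, frames may arrive skewed; pairing by capture timestamp rather than arrival order is what allows the off-site location to reconstruct a coherent stereoscopic pair.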

Media content identification on mobile devices

A mobile device responds in real time to media content presented on a media device, such as a television. The mobile device captures temporal fragments of audio-video content on its microphone, camera, or both and generates corresponding audio-video query fingerprints. The query fingerprints are transmitted to a search server located remotely or used with a search function on the mobile device for content search and identification. Audio features are extracted and audio signal global onset detection is used for input audio frame alignment. Additional audio feature signatures are generated from local audio frame onsets, audio frame frequency domain entropy, and maximum change in the spectral coefficients. Video frames are analyzed to find a television screen in the frames, and a detected active television quadrilateral is used to generate video fingerprints to be combined with audio fingerprints for more reliable content identification.
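One of the audio features named above, audio frame frequency-domain entropy, can be sketched as follows. This is an illustrative computation only, assuming a plain list of samples and a naive DFT; the function name and frame model are not taken from the source.

```python
import math

def frame_spectral_entropy(frame):
    """Entropy of the normalized magnitude spectrum of one audio frame.
    A pure tone concentrates energy in one bin (low entropy); spectrally
    spread content yields higher entropy, making this usable as a
    fingerprint feature."""
    n = len(frame)
    # magnitude spectrum via a naive DFT (real input, first half of the bins)
    mags = []
    for k in range(n // 2):
        re = sum(frame[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(frame[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        mags.append(math.hypot(re, im))
    total = sum(mags) or 1.0
    probs = [m / total for m in mags]
    return -sum(p * math.log2(p) for p in probs if p > 0)
```

In a real fingerprinting pipeline this value would be computed per onset-aligned frame and quantized into signature bits alongside the other features (local onsets, maximum spectral change).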

Method and apparatus for an interactive user interface

A method, apparatus and computer program product are provided to facilitate user interaction with, such as modification of, respective audio objects. An example method may include causing a multimedia file to be presented that includes at least two images. The images are configured to provide animation associated with respective audio objects and representative of a direction of the respective audio objects. The method may also include receiving user input in relation to an animation associated with an audio object or the direction of the audio object represented by an animation. The method may further include causing replay of the audio object for which the user input was received to be modified.

Low latency wireless virtual reality systems and methods

Virtual Reality (VR) processing devices and methods are provided for transmitting user feedback information comprising at least one of user position information and user orientation information, receiving encoded audio-video (A/V) data, which is generated based on the transmitted user feedback information, separating the A/V data into video data and audio data corresponding to a portion of a next frame of a sequence of frames of the video data to be displayed, decoding the portion of a next frame of the video data and the corresponding audio data, providing the audio data for aural presentation and controlling the portion of the next frame of the video data to be displayed in synchronization with the corresponding audio data.

SYSTEMS, METHODS, AND APPARATUSES FOR BUFFER MANAGEMENT
20230217060 · 2023-07-06

Methods, systems, and apparatuses for content delivery, buffer management, and synchronization are described herein. Content being played back on a playback platform is analyzed and, based on the analysis, asynchronous playback of the content may be detected. A source buffer flush may be performed to correct the asynchronous playback. After the source buffer flush is performed, content segments may be written to the source buffer so that playback of the content on the playback platform resumes with synchronized audio and video.
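The flush-and-refill idea can be sketched as below, modeling the source buffer as a list of segments and treating audio/video skew beyond a threshold as the trigger. The buffer model, `resync` name, drift threshold, and refill count are illustrative assumptions, not the patent's API.

```python
DRIFT_THRESHOLD = 0.1  # seconds of audio/video skew tolerated

def resync(audio_pos, video_pos, source_buffer, fetch_segment, next_index):
    """If playback has drifted beyond the threshold, flush the source buffer
    and refill it with segments starting from the current playback position,
    so playback resumes with audio and video in sync."""
    drift = abs(audio_pos - video_pos)
    if drift <= DRIFT_THRESHOLD:
        return False  # playback is synchronous; nothing to do
    source_buffer.clear()  # the source buffer flush
    for i in range(next_index, next_index + 3):
        # rewrite segments into the flushed buffer
        source_buffer.append(fetch_segment(i))
    return True
```

On web playback platforms the flush roughly corresponds to clearing a Media Source Extensions `SourceBuffer` before re-appending segments, though the sketch above abstracts that away.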

Systems and methods for matching audio to video punchout
11553238 · 2023-01-10

An image capture device may capture multiple audio content during capture of visual content. A viewing window for the visual content and rotational position of the image capture device during capture of the visual content may be used to generate modified audio content from the multiple audio content. The modified audio content may provide sound for playback of a punchout of the visual content using the viewing window.

Video stream synchronization

A system clock reference is transmitted to user devices over a short range communication channel and used to calculate a clock offset. The clock offset is stored together with a user identifier in a server memory, from which the clock offset can be retrieved when receiving the user identifier together with a video stream of video frames tagged with respective timestamps. The timestamps are converted into capture times according to a system clock using the retrieved clock offsets. Video streams from multiple user devices thereby have video frames timestamped using a same clock reference and can be time aligned.
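The offset bookkeeping described above reduces to two small operations, sketched here under the assumption that all clocks are expressed in seconds. The dictionary store and function names are illustrative, not the patented system's interfaces.

```python
clock_offsets = {}  # user_id -> (system clock - device clock), in seconds

def register_offset(user_id, system_ref_time, device_time):
    """Store the offset computed when the system clock reference is
    received on the user device over the short-range channel."""
    clock_offsets[user_id] = system_ref_time - device_time

def to_capture_time(user_id, frame_timestamp):
    """Convert a device-clock frame timestamp into a capture time on the
    shared system clock, using the stored offset for that user."""
    return frame_timestamp + clock_offsets[user_id]
```

Once every device's timestamps are mapped onto the same system clock this way, frames from different devices can be time-aligned by simple comparison of capture times.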

Systems and methods for efficient media editing
11545186 · 2023-01-03

In the field of media editing, in one embodiment, a computer-implemented method may include steps for receiving a video at a user device, generating a reversed video portion based on a selected portion of the video, and generating a media file by combining at least a first portion of the video and a second portion of the reversed video portion. In some embodiments, the method may further include receiving a user input at the user device, the user input indicative of playback of the selected portion of the video in a forward direction and in a reverse direction; updating the reversed video portion based on the user input to yield an updated reversed video portion; and combining the selected portion and the updated reversed video portion to produce the media file.
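The combine step can be illustrated by modeling the video as a list of frames: play up to the end of the selection, then append the selection reversed. This frame-list model and the function name are assumptions for illustration, not the patented method.

```python
def make_boomerang(frames, start, end):
    """Return the video frames up to `end`, followed by the selected
    portion [start, end) played in reverse."""
    selected = frames[start:end]
    reversed_portion = selected[::-1]       # generate the reversed video portion
    return frames[:end] + reversed_portion  # combine into one media file
```

Keeping the reversed portion as a separately generated segment is what makes the update step cheap: when the user scrubs the selection forward and backward, only `reversed_portion` needs regenerating before re-combining.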

SYSTEMS AND METHODS FOR PROVIDING OPTIMIZED TIME SCALES AND ACCURATE PRESENTATION TIME STAMPS

The disclosed computer-implemented method includes determining, for multiple different media items, a current time scale at which the media items are encoded for distribution, where at least two of the media items are encoded at different frame rates. The method then includes identifying, for the media items, a unified time scale that provides a constant frame interval for each of the media items. The method also includes changing at least one of the media items from the current time scale to the identified unified time scale to provide a constant frame interval for the changed media item(s). Various other methods, systems, and computer-readable media are also disclosed.
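One natural way to pick a unified time scale with a constant integer frame interval for every item is the least common multiple of the frame rates. The sketch below assumes integer frame rates (fractional rates such as 29.97 would need rational handling) and uses illustrative function names, not the disclosed method's terminology.

```python
from math import gcd
from functools import reduce

def unified_time_scale(frame_rates):
    """Smallest time scale (ticks per second) divisible by every frame
    rate, so each item's per-frame interval is a constant integer."""
    return reduce(lambda a, b: a * b // gcd(a, b), frame_rates)

def frame_interval(time_scale, frame_rate):
    """Constant number of ticks between presentation time stamps for an
    item encoded at `frame_rate` under the unified time scale."""
    return time_scale // frame_rate
```

For example, items at 24, 30, and 60 fps share the time scale 120, giving constant frame intervals of 5, 4, and 2 ticks respectively, so presentation time stamps stay exact for all three.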

Real-time incorporation of user-generated content into third-party streams

Systems and methods for real-time incorporation of user-produced content into a broadcast media stream are provided. A media title may be streamed to a producer computing device over a communication network. The producer computing device is associated with a channel for distributing the user-produced content in conjunction with the media title. Produced content may be captured from the producer computing device as the streamed media title is played on the producer computing device. Such captured produced content may be designated for the channel. The media title and the produced content may then be broadcast in real-time over the communication network to one or more subscriber devices subscribed to the channel. The media title and the produced content may be synchronized within the broadcast to reflect when the produced content was captured in relation to the media title as the media title was played on the producer computing device.
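The synchronization described above amounts to stamping each captured clip with the media title's playback position at capture time, then replaying it at the same offset in the broadcast. This is a minimal sketch under that assumption; the dictionary shape and function names are illustrative.

```python
def tag_capture(title_start_time, capture_time, clip):
    """Stamp a produced-content clip with its offset into the media
    title, i.e. how far into the title playback had progressed on the
    producer device when the clip was captured."""
    return {"offset": capture_time - title_start_time, "clip": clip}

def schedule_in_broadcast(title_broadcast_start, tagged_clip):
    """When the clip should play in the synchronized broadcast so it
    lands at the same point in the media title for subscribers."""
    return title_broadcast_start + tagged_clip["offset"]
```

Recording offsets relative to the title, rather than wall-clock capture times, is what lets the channel broadcast to subscribers whose playback of the title starts at a different absolute time than the producer's.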