H04N9/8715

System and method for creating immersive interactive application

The present disclosure provides a development system that permits a developer to generate mixed reality (MR) streaming content for display on a VR headset worn by a viewer. The system allows the content stream to be developed and generated by non-technical personnel, who are not required to possess computer skills or engineering knowledge. The generated streaming content includes embedded pre-recorded video files originally recorded in a 360-degree format, which significantly reduces computer processing time and memory requirements and shortens the development time required to produce the final executable streaming content.

Image processing apparatus, image processing method and medium

An object of one embodiment of the present disclosure is to provide a product with a high added value to a user by preventing an erroneous character string from being combined in a case where a voice before and after an image selected from within a moving image is a mixed voice. One embodiment of the present disclosure is an image processing apparatus including: a selection unit configured to select, from a moving image including a plurality of frames, a part of the moving image; an extraction unit configured to extract a voice during a predetermined time corresponding to the selected part in the moving image; and a combination unit configured to combine a character string based on a voice extracted by the extraction unit, with the part of the moving image selected by the selection unit or a frame among frames corresponding to the part.
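
The extraction step above can be illustrated with a minimal sketch that computes the span of audio surrounding a selected frame. The function name, the `before_s`/`after_s` defaults, and the clamping behavior are assumptions made for illustration and are not taken from the disclosure; the mixed-voice handling is not modeled.

```python
from dataclasses import dataclass


@dataclass
class AudioWindow:
    start_s: float  # window start, in seconds from the beginning of the clip
    end_s: float    # window end, in seconds from the beginning of the clip


def audio_window_for_frame(frame_index: int, fps: float, duration_s: float,
                           before_s: float = 2.0, after_s: float = 2.0) -> AudioWindow:
    """Return the audio span around a selected frame, clamped to the clip.

    before_s and after_s stand in for the "predetermined time"; their
    values here are hypothetical.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    t = frame_index / fps                  # timestamp of the selected frame
    start = max(0.0, t - before_s)         # do not run past the clip start
    end = min(duration_s, t + after_s)     # do not run past the clip end
    return AudioWindow(start, end)
```

A speech recognizer would then be run on the returned window only, and the resulting character string combined with the selected frame.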

Video tagging by correlating visual features to sound tags

Automatically recommending sound effects (SFX) based on visual scenes assists sound engineers during video production of computer simulations, such as movies and video games. Such a recommendation engine may be implemented by classifying SFX and using a machine learning engine to output a first of the classified SFX for a first computer simulation based on learned correlations between video attributes of the first computer simulation and the classified SFX.

USER-GENERATED TEMPLATES FOR SEGMENTED MULTIMEDIA PERFORMANCE

Disclosed herein are computer-implemented method, system, and computer-readable storage-medium embodiments for implementing user-generated templates for segmented multimedia performances. An embodiment includes at least one computer processor configured to transmit a first version of a content instance and corresponding metadata. The first version of the content instance may include a plurality of structural elements, with at least one structural element corresponding to at least part of the metadata. The first content instance may be transformed by a rendering engine triggered by the at least one computer processor.

MODIFYING PLAYBACK OF REPLACEMENT CONTENT RESPONSIVE TO DETECTION OF REMOTE CONTROL SIGNALS THAT MODIFY OPERATION OF THE PLAYBACK DEVICE

In one aspect, an example method includes (i) providing, by a playback device, replacement media content for display; (ii) determining, by the playback device, that a remote control transmitted to the playback device an instruction configured to cause a modification to operation of the playback device while the playback device displays the replacement media content; (iii) determining, by the playback device based on the instruction, an overlay that the playback device is configured to provide for display in conjunction with the modification; (iv) determining, by the playback device, a region within a display of the playback device corresponding to the overlay; and (v) modifying, by the playback device, a transparency of the region such that the overlay is visible through the replacement media content when the playback device provides the overlay for display.
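
Steps (iv) and (v) can be sketched as a per-pixel alpha mask: the region corresponding to the overlay is made fully transparent in the replacement content so that whatever the playback device draws underneath shows through. The frame representation (nested lists of RGB tuples), the `Region` layout, and the function names are assumptions for illustration only.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Frame = List[List[Pixel]]
Region = Tuple[int, int, int, int]  # x, y, width, height (hypothetical layout)


def make_alpha_mask(width: int, height: int, region: Region) -> List[List[float]]:
    """Alpha is 1.0 (opaque) everywhere except inside the overlay region."""
    x0, y0, w, h = region
    return [
        [0.0 if (x0 <= x < x0 + w and y0 <= y < y0 + h) else 1.0
         for x in range(width)]
        for y in range(height)
    ]


def composite(replacement: Frame, underlying: Frame,
              alpha: List[List[float]]) -> Frame:
    """Blend replacement content over the underlying output using the mask."""
    out: Frame = []
    for row_r, row_u, row_a in zip(replacement, underlying, alpha):
        out.append([
            tuple(round(a * r + (1.0 - a) * u) for r, u in zip(pr, pu))
            for pr, pu, a in zip(row_r, row_u, row_a)
        ])
    return out
```

In this sketch the underlying frame is assumed to already contain the overlay (for example a volume bar), so zeroing the alpha in that region is what makes it visible through the replacement content.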

Modifying playback of replacement content responsive to detection of remote control signals that modify operation of the playback device

In one aspect, an example method includes (i) providing, by a playback device, replacement media content for display; (ii) determining, by the playback device, that a remote control transmitted to the playback device an instruction configured to cause a modification to operation of the playback device while the playback device displays the replacement media content; (iii) determining, by the playback device based on the instruction, an overlay that the playback device is configured to provide for display in conjunction with the modification; (iv) determining, by the playback device, a region within a display of the playback device corresponding to the overlay; and (v) modifying, by the playback device, a transparency of the region such that the overlay is visible through the replacement media content when the playback device provides the overlay for display.

User-generated templates for segmented multimedia performance

Disclosed herein are computer-implemented method, system, and computer-readable storage-medium embodiments for implementing user-generated templates for segmented multimedia performances. An embodiment includes at least one computer processor configured to transmit a first version of a content instance and corresponding metadata. The first version of the content instance may include a plurality of structural elements, with at least one structural element corresponding to at least part of the metadata. The first content instance may be transformed by a rendering engine triggered by the at least one computer processor.

Editing text in video captions

This disclosure describes techniques that include modifying text associated with a sequence of images or a video sequence to thereby generate new text and overlaying the new text as captions in the video sequence. In one example, this disclosure describes a method that includes receiving a sequence of images associated with a scene occurring over a time period; receiving audio data of speech uttered during the time period; transcribing into text the audio data of the speech, wherein the text includes a sequence of original words; associating a timestamp with each of the original words during the time period; generating, responsive to input, a sequence of new words; and generating a new sequence of images by overlaying each of the new words on one or more of the images.
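
The timestamp bookkeeping in the method above can be sketched as follows. The `TimedWord` type, the index-based `edits` mapping, and the rule that a new word inherits the timing of the word it replaces are illustrative assumptions; transcription and the actual drawing of text onto images are left out.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TimedWord:
    text: str
    start_s: float  # time at which the word begins
    end_s: float    # time at which the word ends


def replace_words(original: List[TimedWord], edits: Dict[int, str]) -> List[TimedWord]:
    """Build the new word sequence; each new word keeps the original timing."""
    return [TimedWord(edits.get(i, w.text), w.start_s, w.end_s)
            for i, w in enumerate(original)]


def captions_per_frame(words: List[TimedWord], fps: float,
                       frame_count: int) -> List[str]:
    """Decide which word, if any, is overlaid on each frame."""
    captions = [""] * frame_count
    for w in words:
        first = max(0, int(w.start_s * fps))
        last = min(frame_count, int(w.end_s * fps))
        for f in range(first, last):
            captions[f] = w.text
    return captions
```

For instance, correcting a mis-transcribed second word with `replace_words(words, {1: "world"})` leaves the caption timing untouched, so the new text appears on the same frames as the word it replaces.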

Systems for optimized presentation capture

Systems herein allow a user to record a presentation with a slides file. The system can record action events generated by a viewer application that displays slides of the slides file. The system can also record an audio segment for each displayed slide. An action information file can be created that links action events and audio segments to slides, and provides timing information for the action events. This can allow for playback of a narrated presentation where actions are recreated in synchronization with the narration while reducing the reliance on large video files.
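
One possible shape for such an action information file is sketched below as JSON. The field names (`slide_index`, `audio_file`, `offset_ms`, `kind`) and the event labels are hypothetical; the disclosure does not specify a file format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Tuple


@dataclass
class ActionEvent:
    kind: str        # e.g. "click" or "highlight" (hypothetical labels)
    offset_ms: int   # time since the slide was first displayed


@dataclass
class SlideRecord:
    slide_index: int
    audio_file: str  # narration segment recorded for this slide
    actions: List[ActionEvent] = field(default_factory=list)


def build_action_file(records: List[SlideRecord]) -> str:
    """Serialize the slide/audio/action links to a JSON document."""
    return json.dumps({"slides": [asdict(r) for r in records]}, indent=2)


def playback_schedule(action_file: str) -> List[Tuple[int, int, str]]:
    """Return (slide_index, offset_ms, kind) in the order events should replay."""
    data = json.loads(action_file)
    schedule = []
    for slide in data["slides"]:
        for event in sorted(slide["actions"], key=lambda e: e["offset_ms"]):
            schedule.append((slide["slide_index"], event["offset_ms"], event["kind"]))
    return schedule
```

Because only events, timings, and per-slide audio are stored, a player can recreate the narrated presentation from the original slides file rather than from a recorded video.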

AUTOMATIC VERSIONING OF VIDEO PRESENTATIONS
20210312184 · 2021-10-07

A system and method are presented to create custom versions for users of recorded sessions of individuals. Individuals are recorded at a booth responding to prompts. Audio and visual data recorded at the booth are divided into time segments according to the timing of the prompts. Depth sensors at the booth are used to assign score values to time segments. Prompts are related to criteria that were selected as being relevant to an objective. Users are associated with subsets of criteria in order to identify subsets of prompts whose responses are relevant to the users. Time segments of audio and visual data created by the identified subset of prompts are selected. The selected time segments are ordered according to herd behavior analysis. Lesser weighted time segments may be redacted. The remaining portions of ordered time segments are presented to the user as a custom version.
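
The selection pipeline described above can be sketched in a few lines. This is a simplification under stated assumptions: the `Segment` type, the `prompt_criteria` mapping, and the `min_score` threshold are hypothetical, and ordering by descending score is swapped in for the herd behavior analysis, which the abstract does not detail.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Segment:
    prompt_id: str   # prompt whose timing delimits this segment
    start_s: float
    end_s: float
    score: float     # score value assigned from sensor data


def custom_version(segments: List[Segment],
                   prompt_criteria: Dict[str, Set[str]],
                   user_criteria: Set[str],
                   min_score: float) -> List[Segment]:
    """Pick the segments relevant to one user and order them for playback."""
    # Keep segments whose prompt relates to at least one of the user's criteria.
    relevant = [s for s in segments
                if prompt_criteria.get(s.prompt_id, set()) & user_criteria]
    # Redact lesser-weighted segments (assumption: a simple score threshold).
    kept = [s for s in relevant if s.score >= min_score]
    # Assumption: highest score first, standing in for the real ordering step.
    return sorted(kept, key=lambda s: s.score, reverse=True)
```

Two users with different criteria subsets would therefore receive different custom versions cut from the same recorded session.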