H04N5/9305

SYSTEM AND METHOD FOR PRESENTING VIRTUAL REALITY CONTENT TO A USER
20210136341 · 2021-05-06

This disclosure describes a system configured to present primary and secondary, tertiary, etc., virtual reality content to a user. Primary virtual reality content may be displayed to the user. Responsive to the user turning their view away from the primary virtual reality content, a sensory cue indicates to the user that their view is no longer directed toward the primary content, and secondary, tertiary, etc., virtual reality content may be displayed. The primary virtual reality content may resume when the user returns their view to it. Primary virtual reality content may be adjusted based on the user's interaction with the secondary, tertiary, etc., virtual reality content, and secondary, tertiary, etc., virtual reality content may be adjusted based on the user's progression through, or interaction with, the primary virtual reality content.
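The view-dependent switching described above can be sketched as a small selection function. The threshold value and the names used here are illustrative assumptions for demonstration, not details taken from the patent:

```python
def select_content(view_angle_deg, threshold_deg=30.0):
    """Pick which content stream to present and whether to fire a sensory cue.

    view_angle_deg: angular offset of the user's gaze from the center of the
    primary content. Within the threshold, primary content plays; outside it,
    a cue signals that the view has left the primary content and a secondary
    stream is shown instead.
    """
    if abs(view_angle_deg) <= threshold_deg:
        return {"stream": "primary", "cue": False}
    return {"stream": "secondary", "cue": True}
```

Because the function is stateless, primary content naturally "resumes" as soon as the gaze angle falls back inside the threshold.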

METHOD AND DEVICE FOR FOCUSING SOUND SOURCE

Disclosed are a sound source focus method and device in which, in a 5G communication environment, the sound source focus device amplifies and outputs a sound source signal of a user's object of interest, extracted from an acoustic signal included in video content, by executing a loaded artificial intelligence (AI) algorithm and/or machine learning algorithm. The sound source focus method includes playing video content that includes a video signal containing at least one moving object and an acoustic signal in which sound sources output by the objects are mixed, determining the user's object of interest from the video signal, acquiring unique sound source information about the object of interest, extracting from the acoustic signal an actual sound source that corresponds to the unique sound source information, and outputting the extracted actual sound source.
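As a rough illustration of extracting one object's sound source from a mixed acoustic signal, the sketch below isolates a frequency band standing in for the object's unique sound source information and amplifies it. The band-mask approach, sample rate, and gain are assumptions chosen for demonstration, not the patented AI/ML method:

```python
import numpy as np

def focus_source(mixture, band_hz, fs=8000, gain=2.0):
    """Isolate the frequency band associated with the object of interest
    and amplify it, as a stand-in for signature-based source extraction."""
    spectrum = np.fft.rfft(mixture)
    freqs = np.fft.rfftfreq(len(mixture), d=1.0 / fs)
    mask = (freqs >= band_hz[0]) & (freqs <= band_hz[1])
    focused = np.fft.irfft(spectrum * mask, n=len(mixture))
    return gain * focused  # amplified output for the extracted source

# Mix a 440 Hz "object of interest" tone with a 2 kHz background tone,
# then extract only the object's band.
t = np.arange(8000) / 8000.0
mix = np.sin(2 * np.pi * 440 * t) + np.sin(2 * np.pi * 2000 * t)
out = focus_source(mix, (300, 600))
```

A real system would derive the separation mask from the learned signature of the object rather than a fixed band, but the extract-then-amplify structure is the same.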

SYSTEMS AND METHODS FOR AUTOMATICALLY GENERATING SOUND EVENT SUBTITLES
20230412760 · 2023-12-21

This disclosure describes systems and methods for automatically generating sound event subtitles for digital videos. For example, the systems and methods described herein can automatically generate subtitles for sound events within a digital video soundtrack that includes sounds other than speech. Additionally, the systems and methods described herein can generate sound event subtitles as part of an automatic, comprehensive approach that produces subtitles for all sounds within the soundtrack of a digital video, thereby avoiding the need for any manual input as part of the subtitling process.
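A minimal sketch of the final formatting step: turning classifier output into subtitle cues, bracketing non-speech events and merging consecutive identical events. The tuple layout and merging rule are assumptions for illustration, not the disclosed system:

```python
def sound_event_subtitles(events, min_gap=0.5):
    """Convert classifier output (start, end, label, is_speech) into cues.

    Non-speech events are bracketed, e.g. "[dog barking]"; consecutive
    cues with the same text separated by at most min_gap seconds are
    merged into one cue.
    """
    cues = []
    for start, end, label, is_speech in events:
        text = label if is_speech else f"[{label}]"
        if cues and cues[-1][2] == text and start - cues[-1][1] <= min_gap:
            prev = cues[-1]
            cues[-1] = (prev[0], end, text)  # extend the previous cue
        else:
            cues.append((start, end, text))
    return cues
```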

Editing text in video captions

This disclosure describes techniques for modifying text associated with a sequence of images or a video sequence, thereby generating new text, and for overlaying the new text as captions in the video sequence. In one example, this disclosure describes a method that includes receiving a sequence of images associated with a scene occurring over a time period; receiving audio data of speech uttered during the time period; transcribing the audio data of the speech into text, wherein the text includes a sequence of original words; associating a timestamp with each of the original words during the time period; generating, responsive to input, a sequence of new words; and generating a new sequence of images by overlaying each of the new words on one or more of the images.
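The retiming step, mapping an edited word sequence back onto the original utterance's time span so each new word can be overlaid as a caption, might look like the following sketch. Proportional redistribution of the span is an assumption chosen for illustration:

```python
def retime_words(original, new_words):
    """original: list of (word, start, end) timestamped from transcription.
    Distribute the original utterance's span evenly over the edited word
    sequence, returning (word, start, end) for each new word."""
    span_start = original[0][1]
    span_end = original[-1][2]
    step = (span_end - span_start) / len(new_words)
    return [
        (w, round(span_start + i * step, 3), round(span_start + (i + 1) * step, 3))
        for i, w in enumerate(new_words)
    ]
```

A production system might instead align unchanged words to their original timestamps and only redistribute time across edited regions.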

Selection of a prerecorded media file for superimposing into a video
11057667 · 2021-07-06

In a method for selecting a prerecorded media file for superimposing into a video, a video of a scene is displayed on a display device of a mobile electronic device. A location of the scene is determined. A prerecorded video file is selected based at least in part on the location. The prerecorded video file is superimposed over the video, such that the video is partially obscured by the prerecorded video file. The prerecorded video file is played while displaying the video, such that the prerecorded video file and a non-obscured portion of the video are rendered simultaneously.
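The location-based selection step could be sketched as a nearest-neighbor lookup over a catalog of location-tagged clips. The flat Euclidean distance and catalog layout are simplifying assumptions for illustration:

```python
import math

def select_overlay(lat, lon, catalog):
    """catalog: list of (lat, lon, filename) entries. Return the filename
    of the prerecorded clip whose tagged location is nearest the scene."""
    def dist(entry):
        return math.hypot(entry[0] - lat, entry[1] - lon)
    return min(catalog, key=dist)[2]
```

A real implementation would likely use great-circle distance and additional selection signals alongside location.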

System and method for presenting virtual reality content to a user
10897606 · 2021-01-19

This disclosure describes a system configured to present primary and secondary, tertiary, etc., virtual reality content to a user. Primary virtual reality content may be displayed to the user. Responsive to the user turning their view away from the primary virtual reality content, a sensory cue indicates to the user that their view is no longer directed toward the primary content, and secondary, tertiary, etc., virtual reality content may be displayed. The primary virtual reality content may resume when the user returns their view to it. Primary virtual reality content may be adjusted based on the user's interaction with the secondary, tertiary, etc., virtual reality content, and secondary, tertiary, etc., virtual reality content may be adjusted based on the user's progression through, or interaction with, the primary virtual reality content.

IMAGE PROCESSING APPARATUS, IMAGE PROCESSING METHOD AND MEDIUM

An object of one embodiment of the present disclosure is to provide a product with high added value to a user by preventing an erroneous character string from being combined in a case where the voice before and after an image selected from within a moving image is a mixed voice. One embodiment of the present disclosure is an image processing apparatus including: a selection unit configured to select a part of a moving image that includes a plurality of frames; an extraction unit configured to extract a voice during a predetermined time corresponding to the selected part of the moving image; and a combination unit configured to combine a character string based on the voice extracted by the extraction unit with the part of the moving image selected by the selection unit, or with a frame among the frames corresponding to that part.
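A sketch of the mixed-voice guard: combine a transcript string with the selected frame only when the surrounding time window contains a single speaker. The segment format and speaker labels are illustrative assumptions, not the disclosed apparatus:

```python
def caption_for_frame(frame_time, window, segments):
    """segments: list of (start, end, speaker, text) transcript entries.
    Return the text to combine with the selected frame, or None when the
    window around the frame contains overlapping speakers (a mixed voice),
    to avoid attaching an erroneous character string."""
    in_window = [
        s for s in segments
        if s[0] < frame_time + window and s[1] > frame_time - window
    ]
    speakers = {s[2] for s in in_window}
    if len(speakers) != 1:
        return None  # mixed or absent voice: do not combine a string
    return " ".join(s[3] for s in in_window)
```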

SYNCHRONOUSLY PLAYING METHOD AND DEVICE OF MEDIA FILE, AND STORAGE MEDIUM
20200388304 · 2020-12-10

The disclosure relates to a method and device for synchronously playing a media file, and a storage medium. The method includes: creating, through a player embedded in a webpage, a media source object corresponding to a playing window in the webpage; adding different tracks of the fragmented media file to the same source buffer object in the media source object; transmitting a virtual address that takes the media source object as a data source to a media element of the webpage; and calling the media element to parse the media source object associated with the virtual address, read the tracks in the source buffer object of that media source object, and decode and play the tracks.
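In a browser this corresponds to the Media Source Extensions pattern (a `MediaSource` with a `SourceBuffer`, attached to a media element via an object URL). The core synchronization idea, interleaving fragments from different tracks into one buffer by timestamp so they decode together, can be sketched abstractly; the data layout here is an assumption for illustration:

```python
import heapq

def interleave_tracks(tracks):
    """tracks: dict of track name -> list of (timestamp, payload) fragments,
    each list sorted by timestamp. Merge every track into a single buffer
    ordered by timestamp, mirroring how audio and video tracks appended to
    one source buffer are decoded and played in sync."""
    merged = heapq.merge(
        *([(ts, name, payload) for ts, payload in frags]
          for name, frags in tracks.items())
    )
    return list(merged)
```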