H04N21/854

Audio file processing method, electronic device, and storage medium

An audio file processing method is provided for an electronic device. The method includes extracting at least one audio segment from a first audio file, recognizing at least one to-be-replaced audio segment representing a target role from the at least one audio segment, and determining time frame information of each to-be-replaced audio segment in the first audio file. The method also includes obtaining to-be-dubbed audio data for each to-be-replaced audio segment, and replacing data in the to-be-replaced audio segment with the to-be-dubbed audio data according to the time frame information, to obtain a second audio file. The at least one to-be-replaced audio segment is divided from the at least one audio segment based on a structure and a word count in a sentence corresponding to each to-be-replaced audio segment.
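
The replacement step can be illustrated with a minimal sketch. It assumes the audio is a flat list of PCM samples and that the time frame information is a (start, end) pair in seconds. The function name `replace_segments` and the choice to trim or zero-pad the dubbed audio to the slot length are illustrative assumptions, not details from the abstract.

```python
def replace_segments(samples, sample_rate, replacements):
    """Overwrite time ranges of `samples` with dubbed audio.

    replacements: list of (start_sec, end_sec, dubbed_samples) tuples
    (hypothetical layout for the "time frame information").
    """
    out = list(samples)
    for start_s, end_s, dubbed in replacements:
        start = round(start_s * sample_rate)
        end = round(end_s * sample_rate)
        slot = end - start
        # Assumption: keep the timeline unchanged by trimming long dubs
        # and padding short ones with silence (zeros).
        fitted = list(dubbed[:slot])
        fitted += [0] * (slot - len(fitted))
        out[start:end] = fitted
    return out


first_audio = [1] * 10  # one second of audio at 10 samples per second
second_audio = replace_segments(first_audio, 10, [(0.2, 0.5, [9, 9])])
print(second_audio)
```

Because every dubbed segment is fitted to its slot, the second audio file has the same duration as the first and the segments can be replaced in any order.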

METHOD AND DEVICE FOR GENERATING SPEECH MOVING IMAGE
20220398793 · 2022-12-15

A device for generating a speech moving image according to an embodiment includes a first encoder that receives a person background image, which is the video part of a speech moving image of a person and in which the portion related to the person's speech is covered with a mask, extracts an image feature vector from the person background image, and compresses the extracted image feature vector; a second encoder that receives a speech audio signal that is the audio part of the speech moving image, extracts a voice feature vector from the speech audio signal, and compresses the extracted voice feature vector; a combination unit that generates a combination vector of the compressed image feature vector and the compressed voice feature vector; and an image reconstruction unit that reconstructs the speech moving image of the person with the combination vector as an input.

METHOD AND APPARATUS FOR DETERMINING OBJECT ADDING MODE, ELECTRONIC DEVICE AND MEDIUM
20220394326 · 2022-12-08

Embodiments of the present application disclose a method and apparatus for determining an object adding mode, an electronic device and a medium. A specific implementation of the method includes: obtaining a target video and a to-be-added object set to be added to the target video; determining time periods corresponding to storyboards of the target video; determining a presentation period of a to-be-added object in the to-be-added object set in the target video based on the time periods corresponding to the storyboards; and generating object adding indication information based on the determined presentation period, where the object adding indication information includes time prompt information used to indicate the presentation period of the to-be-added object in the to-be-added object set in the target video. According to this implementation, the time for adding the to-be-added object to the target video is selected automatically.

Blockchained media stored in a material exchange format file
11522710 · 2022-12-06

Digital media that has been blockchained into a blockchain file format may be stored into a secondary file format like a Material eXchange Format (MXF) digital file by deconstructing the blockchain file and storing its subcomponent blockchain data and blockchain hash digests for each block within separate structures of the MXF digital file by generating a table for the blockchain hash digests that links to the blockchain data through data pointers. These separate structures of the MXF digital file are the generic container for a media file and a SDTI-CP (Serial Data Transport Interface—Content Package) compatible system item.
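
The idea of a hash digest table that points into separately stored block data can be sketched as follows. This is a simplified illustration only: it does not read or write real MXF structures, and the SHA-256 chaining scheme, the dictionary layout, and the names `build_tables` and `verify` are assumptions made for the example.

```python
import hashlib


def build_tables(blocks):
    """Split chained blocks into a data area and a digest table.

    Each table entry holds a digest plus an (offset, length) pointer
    into the data area. Hypothetical layout, not the MXF format.
    """
    data_area = bytearray()
    digest_table = []
    prev = b""
    for payload in blocks:
        # Assumption: each digest covers the previous digest and the payload.
        digest = hashlib.sha256(prev + payload).hexdigest()
        digest_table.append(
            {"digest": digest, "offset": len(data_area), "length": len(payload)}
        )
        data_area += payload
        prev = bytes.fromhex(digest)
    return bytes(data_area), digest_table


def verify(data_area, digest_table):
    """Follow the pointers and recompute the chain of digests."""
    prev = b""
    for entry in digest_table:
        start = entry["offset"]
        payload = data_area[start:start + entry["length"]]
        digest = hashlib.sha256(prev + payload).hexdigest()
        if digest != entry["digest"]:
            return False
        prev = bytes.fromhex(digest)
    return True


data, table = build_tables([b"frame-0", b"frame-1"])
print(verify(data, table))
```

Keeping the digests in their own table means the media payload can sit in one container structure while the integrity information sits in another, which is the separation the abstract describes.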

Systems and methods for autodirecting a real-time transmission

In some aspects, the described systems and methods provide for a system for processing a stream for real-time transmission. The system comprises a processor in communication with memory. The processor is configured to execute instructions for an autodirection component stored in memory that cause the processor to receive a real-time stream for an artistic performance, detect one or more human persons in the real-time stream, rank the detected one or more human persons in the real-time stream, select, based on the ranking, a subject from the detected one or more human persons, determine a subject framing for the real-time stream based on the selected subject, process the real-time stream to select a portion of each frame in the real-time stream according to the subject framing, wherein the portion of each frame includes at least the subject, and transmit the processed stream in real-time.
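
The rank, select, and frame steps might look like the sketch below. The person detector itself is out of scope, so detections are given as plain bounding boxes. The ranking rule (confidence times box area), the `Detection` type, and the fixed crop size are assumptions chosen for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    x: int        # left edge of the bounding box, in pixels
    y: int        # top edge of the bounding box, in pixels
    w: int
    h: int
    score: float  # detector confidence (hypothetical field)


def select_subject(detections):
    """Rank detections and return the top one, or None if there are none."""
    # Assumption: larger, more confident detections rank higher.
    ranked = sorted(detections, key=lambda d: d.score * d.w * d.h, reverse=True)
    return ranked[0] if ranked else None


def subject_framing(subject, frame_w, frame_h, crop_w, crop_h):
    """Return (left, top, width, height) of a crop centred on the subject."""
    cx = subject.x + subject.w // 2
    cy = subject.y + subject.h // 2
    # Clamp so the crop window stays inside the frame.
    left = min(max(cx - crop_w // 2, 0), frame_w - crop_w)
    top = min(max(cy - crop_h // 2, 0), frame_h - crop_h)
    return left, top, crop_w, crop_h


people = [Detection(100, 100, 50, 100, 0.9), Detection(1500, 200, 200, 400, 0.8)]
subject = select_subject(people)
print(subject_framing(subject, 1920, 1080, 960, 540))
```

The crop contains the whole subject only when the subject's box is no larger than the crop window; a fuller version would widen the crop in that case.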

IMAGE DISPLAY METHOD AND APPARATUS, DEVICE, AND STORAGE MEDIUM
20220383836 · 2022-12-01

Embodiments of the present application provide an image display method and apparatus, a device, and a storage medium. The method comprises: displaying a preceding image in a first display period, the preceding image comprising a video image sequence or a single image; superimposing a foreground target area of a succeeding image on an upper layer of the preceding image in a second display period for display, the succeeding image comprising a video image sequence or a single image; and displaying the succeeding image in a third display period. According to the method, a good playback effect can be achieved in scenarios where time variations are desired.

Power aware video decoding and streaming

Methods and systems are disclosed for a mobile device to decode video based on available power and/or energy. For example, the mobile device may receive a media description file (MDF) for a video stream from a video server. The MDF may include complexity information associated with a plurality of video segments. The complexity information may be related to the amount of processing power to be utilized for decoding the segment at the mobile device. The mobile device may determine at least one power metric for the mobile device. The mobile device may determine a first complexity level to be requested for a first video segment based on the complexity information from the MDF and the power metric. The mobile device may dynamically alter the decoding process to save energy based on the detected power/energy level.
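
One way to picture the complexity selection is the sketch below. It assumes the MDF has already been parsed into a mapping from complexity level to an estimated decoding energy per segment, and that the power metric is the remaining energy budget in joules. Both the data layout and the even-split budgeting rule are assumptions for illustration, not the method claimed in the abstract.

```python
def choose_complexity(segment_levels, remaining_energy_j, remaining_segments):
    """Pick the most demanding complexity level that fits the energy budget.

    segment_levels: dict of level name -> estimated decode energy in joules
    (hypothetical representation of the MDF complexity information).
    """
    # Assumption: spread the remaining energy evenly over remaining segments.
    budget = remaining_energy_j / remaining_segments
    affordable = {
        level: cost for level, cost in segment_levels.items() if cost <= budget
    }
    if affordable:
        return max(affordable, key=affordable.get)
    # Nothing fits the budget: fall back to the cheapest level.
    return min(segment_levels, key=segment_levels.get)


levels = {"low": 1.0, "medium": 2.0, "high": 4.0}
print(choose_complexity(levels, remaining_energy_j=30.0, remaining_segments=10))
```

Re-running the selection before each segment request lets the requested level fall as the battery drains.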

Generating composite video stream for display in VR

A processor system and computer-implemented method may be provided for generating a composite video stream which may combine a background video and a foreground video stream into one stream. For that purpose, a spatially segmented encoding of the background video may be obtained, for example in the form of a tiled stream. The foreground video stream may be received, for example, from a(nother) client device. The foreground video stream may be a real-time stream, e.g., when being used in real-time communication. The image data of the foreground video stream may be inserted into the background video by decoding select segments of the background video, inserting the foreground image data into the decoded background image data of these segments, and by encoding the resulting composite image data to obtain composite segments which, together with the non-processed segments of the background video, form a spatially segmented encoding of a composite video.
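
Selecting which background segments to decode can be sketched as a tile lookup. The example assumes a uniform tile grid and an axis-aligned foreground rectangle that lies at least partly inside the frame; the function name `tiles_to_decode` and its parameters are hypothetical.

```python
def tiles_to_decode(fg_x, fg_y, fg_w, fg_h, tile_w, tile_h, cols, rows):
    """Return (row, col) indices of tiles overlapped by the foreground rectangle.

    Only these tiles would need to be decoded, composited, and re-encoded;
    all other tiles of the background can be passed through unchanged.
    """
    first_col = max(fg_x // tile_w, 0)
    last_col = min((fg_x + fg_w - 1) // tile_w, cols - 1)
    first_row = max(fg_y // tile_h, 0)
    last_row = min((fg_y + fg_h - 1) // tile_h, rows - 1)
    return [
        (row, col)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]


# 1920x1080 background split into a 4x4 grid of 480x270 tiles.
print(tiles_to_decode(400, 200, 200, 100, 480, 270, cols=4, rows=4))
```

In this example the foreground touches 4 of the 16 tiles, so three quarters of the background never has to be decoded.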

CONTENT GENERATION DEVICE, CONTENT DISTRIBUTION SERVER, CONTENT GENERATION METHOD, AND CONTENT GENERATION PROGRAM
20220377435 · 2022-11-24

To provide a content generating device, content streaming server, content generating method, and content generating program for generating highly interesting video content. A content generating device according to an embodiment comprises: an acquiring portion (25) for requesting, from an accumulating portion that accumulates a plurality of video data, video data that satisfies a prescribed condition, to acquire a plurality of video data that satisfies the condition; a starting frame identifying portion (26) for identifying starting frames wherein prescribed actions start, in the video playback of the plurality of video data acquired by the acquiring portion (25); and a generating portion (27) for generating multi-video data by combining a plurality of video data with the timings of the identified starting frames synchronized.
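
Synchronizing the starting frames can be reduced to computing a per-video delay, as in this minimal sketch. It assumes the starting frame of each video has already been identified and that all videos share one frame rate; the function name `sync_offsets` and the dictionary input are illustrative assumptions.

```python
def sync_offsets(start_frames):
    """Return how many frames each video must be delayed so that the
    identified starting frames coincide in the combined multi-video.

    start_frames: dict of video name -> index of the frame where the
    prescribed action starts (hypothetical input format).
    """
    latest = max(start_frames.values())
    return {name: latest - start for name, start in start_frames.items()}


print(sync_offsets({"a": 30, "b": 45, "c": 10}))
```

After the delays are applied, the prescribed action begins in every video at the same frame of the combined output, here frame 45.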