G10L21/0356

Audio file envelope based on RMS power in sequences of sub-windows
11450339 · 2022-09-20

A method comprising determining an envelope of an audio file based on a double-windowing analysis of the audio file.
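The abstract does not say how the two windowing levels are combined, so the following is only a minimal sketch under stated assumptions: each outer window is split into non-overlapping sub-windows, the RMS power of each sub-window is computed, and the envelope value for the outer window is taken to be the largest sub-window RMS. The function names, the window sizes, and the use of the maximum are all illustrative choices, not details from the patent.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of samples (0.0 if empty)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def double_window_envelope(samples, window_size, sub_window_size):
    """Return one envelope value per outer window.

    Assumption (not from the abstract): each outer window is split into
    non-overlapping sub-windows, and the envelope value is the maximum
    sub-window RMS found inside that outer window.
    """
    if window_size <= 0 or sub_window_size <= 0:
        raise ValueError("window sizes must be positive")
    envelope = []
    for start in range(0, len(samples), window_size):
        window = samples[start:start + window_size]
        sub_rms = [
            rms(window[i:i + sub_window_size])
            for i in range(0, len(window), sub_window_size)
        ]
        envelope.append(max(sub_rms))
    return envelope


# Hypothetical signal: silence, a loud burst, then a quieter steady tone.
samples = [0.0] * 4 + [1.0] * 4 + [0.5] * 8
envelope = double_window_envelope(samples, window_size=8, sub_window_size=4)
```

Taking the maximum keeps a short burst visible in the envelope even when the rest of the outer window is quiet; a mean or percentile over the sub-window values would be an equally plausible reading of "double-windowing".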

Methods and systems for altering video clip objects

The present disclosure relates generally to content delivery techniques in audio-visual streaming systems. The techniques include altering video or audio portions of media content based on user input or interaction. The techniques further include altering text or messaging distributed to multiple users based on user input.

Audio metadata smoothing

The disclosed computer-implemented method for smoothing audio gaps using adaptive metadata identifies an initial audio segment and a subsequent audio segment that follows the initial audio segment. The method accesses a first set of metadata that corresponds to a last audio frame of the initial audio segment and accesses a second set of metadata that corresponds to the first audio frame of the subsequent audio segment. The first and second sets of metadata include audio characteristic information for the two audio segments. The method then generates a new set of metadata that is based on both sets of audio characteristics. The method further inserts a new audio frame between the last audio frame of the initial audio segment and the first audio frame of the subsequent audio segment and applies the new set of metadata to the new audio frame. Various other methods, systems, and computer-readable media are also disclosed.
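The steps in this abstract can be sketched roughly as follows. This is an illustration under assumptions, not the disclosed implementation: frames are modelled as plain dictionaries, the metadata fields (`gain_db`, `loudness`) are hypothetical, the new metadata is a simple weighted average of the two sets, and the inserted frame is filled with silence.

```python
def blend_metadata(last_meta, first_meta, weight=0.5):
    """Blend the numeric fields that both metadata sets share.

    Assumption: a linear blend is used; weight=0.5 is the midpoint.
    """
    return {
        key: (1 - weight) * last_meta[key] + weight * first_meta[key]
        for key in last_meta
        if key in first_meta
    }


def insert_smoothing_frame(initial_segment, subsequent_segment, frame_length):
    """Join two segments with one new frame carrying blended metadata.

    Each segment is a list of frames; each frame is a dict with
    "samples" and "metadata" keys (a hypothetical layout).
    """
    last_frame = initial_segment[-1]
    first_frame = subsequent_segment[0]
    new_frame = {
        # Assumption: the inserted frame is silent.
        "samples": [0.0] * frame_length,
        "metadata": blend_metadata(
            last_frame["metadata"], first_frame["metadata"]
        ),
    }
    return initial_segment + [new_frame] + subsequent_segment


initial = [{"samples": [0.1], "metadata": {"gain_db": -6.0, "loudness": -20.0}}]
subsequent = [{"samples": [0.2], "metadata": {"gain_db": -2.0, "loudness": -24.0}}]
joined = insert_smoothing_frame(initial, subsequent, frame_length=4)
```

The inserted frame sits between the last frame of the initial segment and the first frame of the subsequent segment, so a decoder that reads per-frame metadata would see intermediate values rather than an abrupt jump.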

Methods and systems for augmenting audio content

The audio content (e.g., an audio track, an audio file, an audio signal, etc.) of a content item (e.g., multimedia content, a movie, streaming content, etc.) may be modified to augment and/or include one or more auditory events, such as a sound, a plurality of sounds, a sound effect(s), a voice(s), and/or music.

Method, system and storage medium for signal separation

Methods, systems and storage medium for separating a target signal from noise are disclosed. A method comprises providing a plurality of input signals, each of the plurality of input signals comprising the target signal; synchronizing the plurality of input signals; and separating the plurality of synchronized input signals into the target signal and the noise.
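As a rough illustration of the synchronize-then-separate flow, the sketch below aligns each input to the first one by searching for the lag with the highest cross-correlation, then averages the aligned inputs to estimate the target and treats each input's residual as noise. The abstract names neither technique; cross-correlation alignment and averaging are stand-ins chosen for brevity, and all function names are hypothetical.

```python
def best_lag(reference, signal, max_lag):
    """Lag (in samples) at which `signal` best matches `reference`."""
    def score(lag):
        return sum(
            reference[i] * signal[i + lag]
            for i in range(len(reference))
            if 0 <= i + lag < len(signal)
        )
    return max(range(-max_lag, max_lag + 1), key=score)


def shift(signal, lag, length):
    """Advance `signal` by `lag` samples, zero-padding out-of-range values."""
    return [
        signal[i + lag] if 0 <= i + lag < len(signal) else 0.0
        for i in range(length)
    ]


def separate(inputs, max_lag):
    """Synchronize inputs to the first one, then split target from noise.

    Assumption: the target is common to all inputs and the noise is not,
    so the mean of the aligned inputs approximates the target.
    """
    reference = inputs[0]
    n = len(reference)
    aligned = [shift(s, best_lag(reference, s, max_lag), n) for s in inputs]
    target = [sum(column) / len(aligned) for column in zip(*aligned)]
    noise = [[a - t for a, t in zip(sig, target)] for sig in aligned]
    return target, noise


# Two hypothetical recordings of the same pulse, the second delayed by 2 samples.
pulse = [0, 0, 1, 2, 1, 0, 0, 0]
delayed = [0, 0, 0, 0, 1, 2, 1, 0]
target, noise = separate([pulse, delayed], max_lag=3)
```

With noiseless inputs the residuals are all zero; with independent noise on each input, averaging reduces the noise in the target estimate as more inputs are added.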

Method, system and storage medium for signal separation

Methods, systems and storage medium for separating a target signal from noise are disclosed. A method comprises providing a plurality of input signals, each of the plurality of input signals comprising the target signal; synchronizing the plurality of input signals; and separating the plurality of synchronized input signals into the target signal and the noise.

Electronic device with audio zoom and operating method thereof

An electronic device is provided. The electronic device includes a camera, a plurality of microphones, and at least one processor electrically coupled with the camera and the plurality of microphones. The at least one processor may acquire a video signal, based on a designated zoom level via the camera, acquire a plurality of audio signals respectively via the plurality of microphones while acquiring the video signal, identify a first signal characteristic of a first audio signal acquired via a first microphone and a second signal characteristic of a second audio signal acquired via a second microphone among the plurality of microphones, derive a control parameter for signal processing for the first audio signal and the second audio signal, based on the designated zoom level, the first signal characteristic, and the second signal characteristic, and perform audio signal processing including beamforming using the first audio signal and the second audio signal, based on the derived control parameter. Various other embodiments are also possible.
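One way to picture the processing chain in this abstract is sketched below. Everything specific here is an assumption: the signal characteristic is taken to be each microphone's RMS level, the control parameter is a hypothetical beam strength in [0, 1] that grows with zoom and shrinks when the two levels are mismatched, and the beamformer is a basic two-microphone delay-and-sum. The abstract does not specify any of these choices.

```python
import math


def rms(samples):
    """RMS level, used here as the per-microphone signal characteristic."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def control_parameter(zoom_level, max_zoom, level_1, level_2):
    """Hypothetical beam strength in [0, 1].

    Grows with the zoom level and is scaled down when the two
    microphone levels differ (e.g. one microphone is blocked).
    """
    zoom_fraction = min(max(zoom_level / max_zoom, 0.0), 1.0)
    louder = max(level_1, level_2)
    balance = min(level_1, level_2) / louder if louder > 0 else 0.0
    return zoom_fraction * balance


def delay_and_sum(sig_1, sig_2, delay_samples):
    """Average sig_1 with sig_2 delayed by a non-negative integer delay."""
    delayed = [0.0] * delay_samples + list(sig_2[:len(sig_2) - delay_samples])
    return [(a + b) / 2 for a, b in zip(sig_1, delayed)]


def audio_zoom(sig_1, sig_2, zoom_level, max_zoom, delay_samples=0):
    """Blend the first microphone with the beamformed signal."""
    strength = control_parameter(zoom_level, max_zoom, rms(sig_1), rms(sig_2))
    beam = delay_and_sum(sig_1, sig_2, delay_samples)
    return [(1 - strength) * a + strength * b for a, b in zip(sig_1, beam)]
```

At zero zoom the output equals the first microphone's signal; at full zoom with balanced levels the output is the beamformed signal alone.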

Mixed reality complementary systems

Multiple sound systems are used to provide a realistic MR audio experience for one or more users. In one example, an MR space sound system has one or more speakers distributed within an MR space. MR device sound systems for users provide sound directly to the users wearing the MR devices. Audio signals representative of sound in the MR experience are mixed by each sound system to provide sounds that complement each other. Both sound systems provide sound to the users based on events occurring in the MR experience.

Mixed reality complementary systems

Multiple sound systems are used to provide a realistic MR audio experience for one or more users. In one example, an MR space sound system has one or more speakers distributed within an MR space. MR device sound systems for users provide sound directly to the users wearing the MR devices. Audio signals representative of sound in the MR experience are mixed by each sound system to provide sounds that complement each other. Both sound systems provide sound to the users based on events occurring in the MR experience.

Methods and systems for image and voice processing

Systems and methods are disclosed configured to train an autoencoder using images that include faces, wherein the autoencoder comprises an input layer, an encoder configured to output a latent image from a corresponding input image, and a decoder configured to attempt to reconstruct the input image from the latent image. An image sequence of a face exhibiting a plurality of facial expressions and transitions between facial expressions is generated and accessed. Images of the plurality of facial expressions and transitions between facial expressions are captured from a plurality of different angles and using different lighting. An autoencoder is trained using source images that include the face with different facial expressions captured at different angles with different lighting, and using destination images that include a destination face. The trained autoencoder is used to generate an output where the likeness of the face in the destination images is swapped with the likeness of the source face, while preserving expressions of the destination face.