G10L25/18

Audio Generation Methods and System

A method of generating audio assets, comprising the steps of: receiving an input multi-layered audio asset comprising a plurality of audio layers, generating an input multi-channel image, wherein each channel of the input multi-channel image comprises an input image representative of one of the audio layers, training a generative model on the input multi-channel image and implementing the trained generative model to generate an output multi-channel image, wherein each channel of the output multi-channel image comprises an output image representative of an output audio layer, and generating an output multi-layered audio asset based on a combination of output audio layers derived from the output images.
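The image-building step can be sketched as follows. This is a minimal illustration, assuming each audio layer is represented by a magnitude spectrogram and that the channels are stacked along a leading axis; the function names, frame length, and hop size are hypothetical, and the generative model itself is not shown.

```python
import numpy as np

def magnitude_spectrogram(signal, frame_len=256, hop=128):
    """Magnitude STFT of a 1-D signal, shape (frame_len // 2 + 1, n_frames)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T

def layers_to_multichannel_image(layers, frame_len=256, hop=128):
    """Stack one spectrogram per audio layer: shape (n_layers, n_bins, n_frames)."""
    return np.stack([magnitude_spectrogram(layer, frame_len, hop)
                     for layer in layers])

# Three toy layers standing in for the layers of a multi-layered audio asset.
sr = 8000
t = np.arange(sr) / sr
layers = [
    np.sin(2 * np.pi * 220 * t),
    np.sin(2 * np.pi * 880 * t),
    0.1 * np.random.default_rng(0).standard_normal(sr),
]
image = layers_to_multichannel_image(layers)
print(image.shape)  # (channels, frequency bins, time frames)
```

Keeping the layers in separate channels of a single array means a model sees them time-aligned, which is presumably why the abstract describes a multi-channel image rather than one image per layer.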

Processing Apparatus, Processing Method, and Storage Medium
20230016242 · 2023-01-19

A processing apparatus includes one or more processors and one or more memories operatively coupled to the one or more processors. The one or more processors are configured to acquire a spectrogram of a sound signal. The one or more processors are also configured to perform a first convolution on the spectrogram at every predetermined width on one of a frequency axis or a time axis. The one or more processors are also configured to combine results of the first convolution to obtain one-dimensional first feature data. The one or more processors are also configured to perform at least one second convolution on the one-dimensional first feature data to obtain one-dimensional second feature data indicating a feature of the spectrogram.
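One possible reading of the two-stage convolution is sketched below. It assumes the "predetermined width" is a band of frequency bins, that each band has its own kernel spanning the full band height, and that the per-band outputs are combined as channels of a one-dimensional sequence over time. The kernel shapes, random weights, and function names are hypothetical; as in most deep-learning frameworks, the "convolution" is implemented as cross-correlation.

```python
import numpy as np

def bandwise_conv(spec, kernels, band_width):
    """First convolution: one kernel per frequency band, sliding over time.

    spec:    (n_bins, n_frames)
    kernels: (n_bands, band_width, k)
    returns: (n_bands, n_frames - k + 1), one 1-D sequence per band
    """
    n_bands = spec.shape[0] // band_width
    k = kernels.shape[2]
    out_len = spec.shape[1] - k + 1
    out = np.empty((n_bands, out_len))
    for b in range(n_bands):
        band = spec[b * band_width:(b + 1) * band_width]
        for t in range(out_len):
            out[b, t] = np.sum(band[:, t:t + k] * kernels[b])
    return out

def conv1d(features, kernel):
    """Second convolution: 1-D over time, summed across channels.

    features: (n_channels, length); kernel: (n_channels, k)
    returns:  (length - k + 1,)
    """
    k = kernel.shape[1]
    out_len = features.shape[1] - k + 1
    return np.array([np.sum(features[:, t:t + k] * kernel)
                     for t in range(out_len)])

rng = np.random.default_rng(0)
spec = rng.random((64, 100))                   # toy spectrogram
first_kernels = rng.standard_normal((4, 16, 3))
second_kernel = rng.standard_normal((4, 5))

first = bandwise_conv(spec, first_kernels, band_width=16)
second = conv1d(first, second_kernel)
print(first.shape, second.shape)
```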

IMAGE PROCESSING APPARATUS AND IMAGE PROCESSING METHOD
20230222711 · 2023-07-13

An image processing apparatus includes a controller. The controller calculates a fundamental frequency component included in sound data and a harmonic component corresponding to the fundamental frequency component, converts the fundamental frequency component and the harmonic component into image data, and generates a sound image in which the fundamental frequency component and the harmonic component converted into the image data are arranged adjacent to each other.
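A rough sketch of such a sound image follows. It assumes the fundamental is the strongest spectral peak (a simplification, not a general pitch estimator), that the harmonics sit at integer multiples of the fundamental's bin, and that each component becomes one row of an 8-bit grayscale image so that adjacent rows hold adjacent harmonics. All names and parameters are hypothetical.

```python
import numpy as np

def harmonic_image(signal, frame_len=400, hop=200, n_harmonics=3):
    """Return (f0_bin, image) where image has one row per harmonic component."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    mag = np.abs(np.fft.rfft(frames, axis=1))       # (n_frames, n_bins)

    # Assumption: the fundamental is the strongest non-DC bin on average.
    f0_bin = int(np.argmax(mag.mean(axis=0)[1:])) + 1

    # Row 0 is the fundamental, row 1 the second harmonic, and so on.
    rows = [mag[:, h * f0_bin] for h in range(1, n_harmonics + 1)]
    image = np.stack(rows)
    image = np.round(255 * image / image.max()).astype(np.uint8)
    return f0_bin, image

sr = 8000
t = np.arange(sr) / sr
signal = (np.sin(2 * np.pi * 200 * t)
          + 0.5 * np.sin(2 * np.pi * 400 * t)
          + 0.25 * np.sin(2 * np.pi * 600 * t))
f0_bin, image = harmonic_image(signal)
print(f0_bin * sr / 400, image.shape)  # estimated fundamental in Hz, image shape
```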

Methods for phase ECU F0 interpolation split and related controller
11705136 · 2023-07-18

Controlling a concealment method for a lost audio frame associated with a received audio signal is provided. At least one bin vector of a spectral representation for at least one tone is obtained, wherein the at least one bin vector includes three consecutive bin values for the at least one tone. Whether each of the three consecutive bin values has a complex value or a real value is determined. Responsive to the determination, the three consecutive bin values are processed to estimate a frequency of the at least one tone based on whether each bin value has a complex value or a real value.
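For illustration, the sketch below estimates a tone frequency from three consecutive bins using standard parabolic interpolation on log magnitudes. This is a swapped-in, textbook technique: the abstract does not state how the complex-valued and real-valued cases are processed, so the code only flags which of the three bins are structurally real (the DC bin and, for an even transform length, the Nyquist bin of a real-input DFT). Function names are hypothetical.

```python
import numpy as np

def is_real_bin(k, n_fft):
    """DC and (for even n_fft) Nyquist bins of a real-input DFT are real-valued."""
    return k == 0 or (n_fft % 2 == 0 and k == n_fft // 2)

def estimate_tone_frequency(frame, sample_rate):
    """Return (frequency in Hz, real-bin flags for bins k-1, k, k+1)."""
    n = len(frame)
    mag = np.abs(np.fft.rfft(frame * np.hanning(n)))

    # Restrict the peak search to interior bins so that k-1 and k+1 exist.
    k = int(np.argmax(mag[1:-1])) + 1

    # Parabolic interpolation through the log magnitudes of the three bins.
    a, b, c = np.log(mag[k - 1:k + 2] + 1e-12)
    offset = 0.5 * (a - c) / (a - 2 * b + c)

    flags = [is_real_bin(j, n) for j in (k - 1, k, k + 1)]
    return (k + offset) * sample_rate / n, flags

sr, n = 8000, 512
t = np.arange(n) / sr
freq, flags = estimate_tone_frequency(np.sin(2 * np.pi * 1010 * t), sr)
print(freq, flags)
```

A tone whose peak lies next to bin 0 or bin n/2 would produce a `True` flag, which is the situation in which the abstract's complex-versus-real distinction becomes relevant.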

SYSTEMS AND METHODS FOR VISUALLY GUIDED AUDIO SEPARATION
20230223035 · 2023-07-13

A system for separating audio based on sound producing objects includes a processor configured to receive video data and audio data. The processor is also configured to perform object detection using the video data to identify a number of sound producing objects in the video data and predict a separation for each sound producing object detected in the video data. The processor is also configured to generate separated audio data for each sound producing object using the separation and the audio data.
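The final step, generating separated audio from a predicted separation, might look like the sketch below. It assumes the "separation" for each detected object takes the form of a spectral mask, which the abstract does not specify; object detection and mask prediction are not shown, and the hand-built masks here simply stand in for a model's output.

```python
import numpy as np

def separate(mixture, masks):
    """Apply one spectral mask per object to the mixture spectrum.

    mixture: (n_samples,) real signal
    masks:   (n_objects, n_samples // 2 + 1) values in [0, 1]
    returns: (n_objects, n_samples) separated signals
    """
    spectrum = np.fft.rfft(mixture)
    return np.stack([np.fft.irfft(mask * spectrum, n=len(mixture))
                     for mask in masks])

# Two toy "sound producing objects" with non-overlapping spectra.
sr = 8000
t = np.arange(sr) / sr
source_a = np.sin(2 * np.pi * 300 * t)
source_b = 0.5 * np.sin(2 * np.pi * 1200 * t)
mixture = source_a + source_b

# Hypothetical masks: one per detected object (here a low/high split at 750 Hz).
freqs = np.fft.rfftfreq(len(mixture), d=1 / sr)
masks = np.stack([freqs < 750, freqs >= 750]).astype(float)

separated = separate(mixture, masks)
print(separated.shape)  # one separated signal per object
```

The number of masks equals the number of detected objects, which mirrors the abstract's dependence of the audio output on the object-detection result.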
