Patent classifications
G10L21/14
System and a method for sound recognition
A method for automatic sound recognition, comprising: a) raw spectrogram generation from a sound signal spectrum; b) wide-band spectrum determination; c) wide-band continuous spectrum determination; d) tonal and time-transient spectrum determination; e) wide-band continuous spectrogram and tonal and time-transient spectrogram determination; and f) spectrogram image generation.
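Step a) of the abstract above, raw spectrogram generation, can be sketched with a plain short-time FFT. This is a minimal illustration, not the patent's method; the window length, hop size, and Hann window are illustrative choices.

```python
import numpy as np

def raw_spectrogram(signal, win_len=256, hop=128):
    """Return a magnitude spectrogram (frames x frequency bins)."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(signal) - win_len) // hop
    # Slice the signal into overlapping windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + win_len] * window
        for i in range(n_frames)
    ])
    # Magnitude of the real FFT of each frame.
    return np.abs(np.fft.rfft(frames, axis=1))
```

The later steps (wide-band, continuous, tonal, and time-transient decomposition) would then operate on this raw spectrogram.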
Audio Generation Methods and Systems
A method of generating audio assets, comprising the steps of: receiving a plurality of input audio assets, converting each input audio asset into an input graphical representation, generating an input multi-channel image by stacking each input graphical representation in a separate channel of the image, feeding the input multi-channel image into a generative model to train the generative model and generate one or more output multi-channel images, each output multi-channel image comprising an output graphical representation, extracting the output graphical representations from each output multi-channel image and converting each output graphical representation into an output audio asset.
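The conversion-and-stacking steps claimed above can be sketched as follows. The abstract leaves the graphical representation open; a magnitude spectrogram is assumed here, and all parameter values are illustrative. The generative model itself is out of scope.

```python
import numpy as np

def asset_to_image(asset, n_fft=128, hop=64):
    """Convert one audio asset into a 2-D graphical representation
    (assumed here to be a magnitude spectrogram)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(asset) - n_fft) // hop
    frames = np.stack([asset[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T  # freq x time

def stack_assets(assets):
    """Stack each asset's representation into a separate channel of a
    single multi-channel image, ready to feed a generative model."""
    return np.stack([asset_to_image(a) for a in assets], axis=-1)
```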
Audio Generation Methods and System
A method of generating audio assets, comprising the steps of: receiving an input multi-layered audio asset comprising a plurality of audio layers, generating an input multi-channel image, wherein each channel of the input multi-channel image comprises an input image representative of one of the audio layers, training a generative model on the input multi-channel image and implementing the trained generative model to generate an output multi-channel image, wherein each channel of the output multi-channel image comprises an output image representative of an output audio layer, and generating an output multi-layered audio asset based on a combination of output audio layers derived from the output images.
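The output side of the claim above — splitting an output multi-channel image into per-layer images and combining the derived audio layers — can be sketched like this. The inversion from image to audio is a loud placeholder: a real system would invert each layer's spectrogram (e.g. via Griffin-Lim), which the abstract does not specify.

```python
import numpy as np

def image_to_layers(image):
    """Split an H x W x C output image into C per-layer 2-D images."""
    return [image[..., c] for c in range(image.shape[-1])]

def layers_to_audio(layer_images):
    """Placeholder inversion: collapse each layer image to a 1-D
    envelope, then mix the layers by summation, mirroring the claimed
    combination of output audio layers."""
    layers = [img.sum(axis=0) for img in layer_images]
    return np.sum(layers, axis=0)
```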
IMAGE PROCESSING APPARATUS AND IMAGE PROCESSING METHOD
An image processing apparatus includes a controller. The controller calculates a fundamental frequency component included in sound data and a harmonic component corresponding to the fundamental frequency component, converts the fundamental frequency component and the harmonic component into image data, and generates a sound image where the fundamental frequency component and the harmonic component converted into the image data are arranged adjacent to each other.
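A toy version of the controller's pipeline might look as follows: estimate the fundamental, sample the magnitudes of the fundamental and its harmonics, and place them in adjacent rows of a small image. The FFT peak-picking f0 estimate and the row layout are assumptions, not the patent's method.

```python
import numpy as np

def harmonic_image(signal, sr, n_harmonics=4):
    """Arrange fundamental and harmonic magnitudes in adjacent rows."""
    spec = np.abs(np.fft.rfft(signal))
    f0_bin = int(np.argmax(spec[1:])) + 1   # skip the DC bin
    rows = [spec[min(f0_bin * k, len(spec) - 1)]
            for k in range(1, n_harmonics + 1)]
    return np.array(rows).reshape(-1, 1)    # one row per component
```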
EFFICIENT BLIND SOURCE SEPARATION USING TOPOLOGICAL APPROACH
Aspects disclosed herein generally relate to a method and system for efficient blind source separation using a topological approach. The method and system locate and separate audio streams by constructing and simplifying a contour tree in a smoothed, weighted time-frequency histogram built by the included subsystems. Thus, in one example, the audio streams can be separated and reproduced in a faster, more reliable, higher-quality, and more robust way.
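The intuition behind the histogram step can be sketched without a full contour-tree implementation: build a smoothed 2-D time-frequency histogram, then locate its modes, each of which loosely corresponds to one source in the contour-tree view. The bin counts, kernel, and strict local-maximum criterion are all illustrative simplifications of the actual topological simplification.

```python
import numpy as np

def smoothed_histogram(points, bins=8, value_range=((0.0, 1.0), (0.0, 1.0))):
    """Build a 2-D time-frequency histogram and smooth it with a small
    Gaussian-like kernel."""
    H, _, _ = np.histogram2d(points[:, 0], points[:, 1],
                             bins=bins, range=value_range)
    kernel = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 16.0
    P = np.pad(H, 1)
    # 'same' 2-D convolution via shifted views of the padded array.
    return sum(kernel[i, j] * P[i:i + H.shape[0], j:j + H.shape[1]]
               for i in range(3) for j in range(3))

def local_maxima(H):
    """Strict 8-neighbour local maxima of a 2-D field: a crude stand-in
    for the peaks surviving contour-tree simplification."""
    P = np.pad(H, 1, constant_values=-np.inf)
    neigh = np.stack([P[i:i + H.shape[0], j:j + H.shape[1]]
                      for i in range(3) for j in range(3) if (i, j) != (1, 1)])
    return np.argwhere(H > neigh.max(axis=0))
```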
SOUND PROCESSING METHOD USING DJ TRANSFORM
Provided is a sound processing method performed by a computer, the method comprising generating a DJ transform spectrogram indicating estimated pure-tone amplitudes for respective frequencies corresponding to natural frequencies of a plurality of springs and a plurality of time points by modeling an oscillation motion of the plurality of springs having different natural frequencies, with respect to an input sound, and calculating the estimated pure-tone amplitudes for the respective natural frequencies; calculating degrees of fundamental frequency suitability based on a moving average of the estimated pure-tone amplitudes or a moving standard deviation of the estimated pure-tone amplitudes with respect to each natural frequency of the DJ transform spectrogram; and extracting the fundamental frequency based on local maximum values of the degrees of fundamental frequency suitability for the respective natural frequencies at each of the plurality of time points.
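The spring-bank idea in the abstract can be sketched by driving a set of damped oscillators with the input sound and recording each oscillator's instantaneous envelope over time: the oscillator whose natural frequency matches a pure tone in the input resonates most strongly. The integration scheme, damping value, and envelope formula are assumptions for illustration, not the patented DJ transform.

```python
import numpy as np

def spring_bank_amplitudes(signal, sr, freqs, damping=10.0):
    """Estimated pure-tone amplitudes per natural frequency and time
    point, via semi-implicit Euler integration of driven, damped
    oscillators."""
    dt = 1.0 / sr
    amps = np.zeros((len(freqs), len(signal)))
    for i, f in enumerate(freqs):
        w = 2 * np.pi * f
        x, v = 0.0, 0.0
        for n, s in enumerate(signal):
            a = s - 2 * damping * v - w * w * x  # driven damped spring
            v += a * dt                          # update velocity first
            x += v * dt                          # then position
            amps[i, n] = np.hypot(x * w, v)      # instantaneous envelope
    return amps
```

Fundamental-frequency extraction would then run a moving average or moving standard deviation over these envelopes and pick local maxima across natural frequencies, as the abstract describes.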