G10L21/028

METHOD AND SYSTEM TO IMPROVE VOICE SEPARATION BY ELIMINATING OVERLAP

Aspects disclosed herein generally relate to a method and a system for improving voice separation by eliminating overlapping time-frequency points. The time-frequency points from two recorded mixtures are separated using the Degenerate Unmixing Estimation Technique (DUET) algorithm. The method or system further eliminates the overlapping time-frequency points that belong to neither of the original sound sources.
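
The DUET step and the overlap elimination described above can be illustrated with a minimal sketch. It assumes the standard DUET features (symmetric attenuation and relative delay computed from the ratio of the two mixtures at each time-frequency point) and assumes that overlapping points are detected by a simple distance threshold around known source peaks. The function names, the threshold rule, and the `max_dist` parameter are hypothetical and are not taken from the abstract.

```python
import cmath
import math


def duet_features(x1, x2, omega, eps=1e-12):
    """Return (symmetric attenuation, relative delay) for one time-frequency point.

    x1, x2 are the complex STFT values of the two mixtures at the same point,
    omega is the (non-zero) angular frequency of the bin in radians per sample.
    Returns None when either mixture has negligible energy at this point.
    """
    if abs(x1) < eps or abs(x2) < eps:
        return None
    ratio = x2 / x1
    a = abs(ratio)
    alpha = a - 1.0 / a                    # symmetric attenuation
    delta = -cmath.phase(ratio) / omega    # relative delay in samples
    return alpha, delta


def assign_points(points, peaks, max_dist):
    """Label each point with the index of the nearest source peak.

    A point whose features lie farther than max_dist from every peak is
    labelled -1, i.e. treated as an overlapping point and discarded
    (hypothetical rejection rule, assumed for illustration).
    """
    labels = []
    for x1, x2, omega in points:
        feats = duet_features(x1, x2, omega)
        if feats is None:
            labels.append(-1)
            continue
        alpha, delta = feats
        dists = [math.hypot(alpha - pa, delta - pd) for pa, pd in peaks]
        best = min(range(len(peaks)), key=dists.__getitem__)
        labels.append(best if dists[best] <= max_dist else -1)
    return labels


# Two hypothetical sources: (attenuation 1, delay 0) and (attenuation 2, delay 1 sample).
omega = 0.5
mix2 = 2 * cmath.exp(-1j * omega * 1.0)    # transfer of source 2 to the second mixture
points = [
    (1 + 0j, 1 + 0j, omega),               # only source 1 active
    (1 + 0j, mix2, omega),                 # only source 2 active
    (2 + 0j, 1 + mix2, omega),             # both sources active: an overlapping point
]
peaks = [(0.0, 0.0), (1.5, 1.0)]           # (alpha, delta) of each source
labels = assign_points(points, peaks, max_dist=0.3)
```

In this sketch the third point falls between the two peaks and is rejected, which is the behaviour the abstract describes for points that belong to neither source.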

SIGNAL PROCESSING APPARATUS, SIGNAL PROCESSING METHOD, AND PROGRAM

A signal processing device applies a convolutional separation filter to a mixed acoustic signal string that includes a mixed acoustic signal and a delayed signal of that mixed acoustic signal. The mixed acoustic signal is obtained by converting, into the time-frequency domain, an observed mixed acoustic signal obtained by observing source signals. The convolutional separation filter combines a rear reverberation removal filter, which suppresses the rear reverberation component of the mixed acoustic signal, with a sound source separation filter, which emphasizes the components corresponding to the source signals. The device also estimates model parameters of a model for obtaining information corresponding to signals in which the rear reverberation component is suppressed and the target signals emitted from target sound sources in the source signal are emphasized.
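
As a rough illustration of applying one combined filter to a signal string, the sketch below stacks the current multichannel frame of a single frequency bin with delayed frames and applies one filter row per output signal. The function names, the zero padding before the first frame, and the choice of delays are assumptions made for illustration. The abstract does not specify them, and estimating the filter coefficients is outside this sketch.

```python
def stack_frames(frames, t, delays):
    """Concatenate the frame at time t with the frames delayed by each value in delays.

    frames is a list of per-frame channel vectors (lists of complex values)
    for one frequency bin. Frames before the start are replaced by zeros.
    """
    n_ch = len(frames[0])
    stacked = list(frames[t])
    for d in delays:
        stacked.extend(frames[t - d] if t - d >= 0 else [0j] * n_ch)
    return stacked


def apply_filter(filter_rows, stacked):
    """Apply one filter row per output: y_n = sum_i conj(p_n[i]) * x[i]."""
    return [
        sum(p.conjugate() * x for p, x in zip(row, stacked))
        for row in filter_rows
    ]


# Two-channel toy signal with three frames in one frequency bin.
frames = [[1 + 0j, 2 + 0j], [3 + 0j, 4 + 0j], [5 + 0j, 6 + 0j]]
stacked = stack_frames(frames, t=2, delays=[1])
# Hypothetical filter row: channel 1 of the current frame minus channel 1 of the delayed frame.
output = apply_filter([[1 + 0j, 0j, -1 + 0j, 0j]], stacked)
```

Because dereverberation and separation both act linearly on the stacked vector, they can be folded into a single set of filter rows, which is the sense in which the filter is "combined".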

Systems and methods for visually guided audio separation

A system for separating audio based on sound producing objects includes a processor configured to receive video data and audio data. The processor is also configured to perform object detection using the video data to identify a number of sound producing objects in the video data and predict a separation for each sound producing object detected in the video data. The processor is also configured to generate separated audio data for each sound producing object using the separation and the audio data.

SOUND SOURCE SEPARATION APPARATUS, SOUND SOURCE SEPARATION METHOD, AND PROGRAM

A sound source separation device (10) acquires, from a mixed signal including sounds arriving from a plurality of sound sources, a separated signal including an emphasized sound for every sound source. A signal conversion unit (1) converts the mixed signal into the frequency domain. A separated signal estimation unit (2) acquires the separated signals from the mixed signal using an optimized filter. A gradient calculation unit (3) calculates the gradient of a cost function using the mixed signal and the separated signals. A filter update unit (4) optimizes the filter so that it both separates, for every sound source, the sound emitted from that sound source and has, for every sound source, stronger directivity in the direction of that sound source than in other directions. A signal inverse conversion unit (5) converts the separated signals into the time domain.

SOUND SOURCE SEPARATION PROGRAM, SOUND SOURCE SEPARATION METHOD, AND SOUND SOURCE SEPARATION DEVICE

A sound source separation program causes a computer to acquire an acoustic signal, convert the acquired acoustic signal from the time domain to the frequency domain, and perform sound source separation on the converted acoustic signal by updating a demixing matrix through elementary row operations so as to iteratively minimize an objective function that includes a quadratic form of a separation vector and a determinant of the demixing matrix.
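
An objective of the kind described above can be written, under assumptions, in the form commonly used in independent vector analysis. The notation is assumed and not taken from the abstract: $w_k^{\mathsf{H}}$ is the $k$-th row of the demixing matrix $W$ (the separation vector of source $k$) and $V_k$ is a weighted covariance matrix of the observed signal. The exact objective of the disclosed program may differ.

```latex
% Assumed objective: quadratic forms of the separation vectors
% plus a log-determinant term of the demixing matrix.
J(W) = \sum_{k=1}^{K} w_k^{\mathsf{H}} V_k w_k - 2 \log \lvert \det W \rvert

% Elementary row operation: add c times row j to row i (i \neq j).
W' = E_{ij}(c)\, W, \qquad
E_{ij}(c) = I + c\, e_i e_j^{\mathsf{T}}, \qquad
\det E_{ij}(c) = 1
\;\Longrightarrow\; \lvert \det W' \rvert = \lvert \det W \rvert
```

Under these assumptions, adding a multiple of one row to another leaves the determinant term unchanged, so such an update changes only the quadratic form of the modified row.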

Joint source localization and separation method for acoustic sources

A method is provided for acoustic source direction-of-arrival estimation and acoustic source separation via spatial weighting of a dictionary-based representation of the steered response function, which is calculated for a certain number of directions from spherical harmonic decomposition coefficients obtained from microphone array recordings of the sound field. The algorithm uses spatially band-limited functions of plane waves to represent more complex directional maps of the sound field. These functions are calculated for predefined directions on an analysis surface (such as a sphere). The directions of arrival of the sound sources are calculated with the same method, and source estimates are grouped in order to localize the sound sources. Directions of arrival can thereby be obtained from recordings of the sound sources captured by a microphone array, and the sound sources can then be separated using this direction information or predetermined source arrival directions.