G10L21/028

AI BASED REMIXING OF MUSIC: TIMBRE TRANSFORMATION AND MATCHING OF MIXED AUDIO DATA
20230120140 · 2023-04-20

The present invention provides a method for processing audio data, comprising the steps of providing input audio data containing a mixture of audio data including first audio data of a first musical timbre and second audio data of a second musical timbre different from said first musical timbre, decomposing the input audio data to provide decomposed data representative of the first audio data, and transforming the decomposed data to obtain third audio data.

METHOD FOR PROVIDING VIDEO AND ELECTRONIC DEVICE SUPPORTING THE SAME
20230124111 · 2023-04-20

An electronic device is provided. The electronic device includes a memory, and at least one processor electrically connected to the memory, wherein the at least one processor is configured to obtain a video including an image and an audio, obtain information on at least one object included in the image from the image, obtain a visual feature of the at least one object, based on the image and the information on the at least one object, obtain a spectrogram of the audio, obtain an audio feature of the at least one object from the spectrogram of the audio, combine the visual feature and the audio feature, obtain, based on the combined visual feature and audio feature, information on a position of the at least one object, the information indicating the position of the at least one object in the image, obtain an audio part corresponding to the at least one object in the audio, based on the combined visual feature and audio feature, and store, in the memory, the information on the position of the at least one object and the audio part corresponding to the at least one object.
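
The "obtain a spectrogram of the audio" step can be illustrated with a short-time Fourier transform. The following is a minimal sketch and not the method claimed in the application: the abstract does not say how the spectrogram is computed, so the Hann window, the frame length, the hop size, and the function names `hann` and `magnitude_spectrogram` are all assumptions made for illustration. It uses only the Python standard library.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n (an assumed window choice)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def magnitude_spectrogram(samples, frame_len=64, hop=32):
    """Return a list of frames, each holding frame_len // 2 + 1 DFT magnitudes.

    frame_len and hop are hypothetical defaults, not values from the abstract.
    """
    window = hann(frame_len)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        # Window one frame of the signal.
        frame = [samples[start + i] * window[i] for i in range(frame_len)]
        mags = []
        # Direct DFT over the non-negative frequency bins.
        for k in range(frame_len // 2 + 1):
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Usage: a 1 kHz tone sampled at 8 kHz falls exactly on bin 1000 * 64 / 8000 = 8.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]
spec = magnitude_spectrogram(tone)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

Each row of `spec` is one time frame and each column one frequency bin, so an audio feature extractor of the kind the abstract mentions would take this two-dimensional array as input. A practical implementation would use an FFT rather than the direct DFT shown here.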

Methods and apparatus to assist listeners in distinguishing between electronically generated binaural sound and physical environment sound
11632470 · 2023-04-18

Methods and apparatus assist listeners in distinguishing between electronically generated binaural sound and physical environment sound while the listener wears a wearable electronic device that provides the binaural sound to the listener. The wearable electronic device generates a visual alert or audio alert when the electronically generated binaural sound occurs.

System and method for multi-microphone automated clinical documentation

A method, computer program product, and computing system for receiving information associated with an acoustic environment. Acoustic metadata associated with audio encounter information received by a first microphone system may be received. One or more speaker representations may be defined based upon, at least in part, the acoustic metadata associated with the audio encounter information and the information associated with the acoustic environment. One or more portions of the audio encounter information may be labeled with the one or more speaker representations and a speaker location within the acoustic environment.

Processor and method for processing an audio signal using truncated analysis or synthesis window overlap portions

A processor for processing an audio signal has: an analyzer for deriving a window control signal from the audio signal indicating a change from a first asymmetric window to a second window, or indicating a change from a third window to a fourth asymmetric window, wherein the second window is shorter than the first window, or wherein the third window is shorter than the fourth window; a window constructor for constructing the second window using a first overlap portion of the first asymmetric window, wherein the window constructor is configured to determine a first overlap portion of the second window using a truncated first overlap portion of the first asymmetric window, or wherein the window constructor is configured to calculate a second overlap portion of the third window using a truncated second overlap portion of the fourth asymmetric window; and a windower for applying the first and second windows or the third and fourth windows to obtain windowed audio signal portions.

Voice signal enhancement for head-worn audio devices

A head-worn audio device is provided with a circuit for voice signal enhancement. The circuit comprises at least a plurality of microphones, arranged at predefined positions, where each microphone provides a microphone signal. The circuit further comprises a directivity pre-processor and a blind source separation processor. The directivity pre-processor is connected with the plurality of microphones to receive the microphone signals and is configured to provide at least a voice signal and a noise signal. Directivity pre-processing increases the mutual independence of the signals provided to the blind source separation processor and thus improves processing by blind source separation. The blind source separation processor receives at least the voice signal and the noise signal, and is configured to conduct blind source separation on at least the voice signal and the noise signal to provide at least an enhanced voice signal with reduced noise components.
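
One simple way to picture a directivity pre-processor that outputs a voice signal and a noise signal is a two-microphone sum and difference beamformer. The sketch below is an illustration under stated assumptions, not the circuit described in the patent: it assumes exactly two microphones, assumes the wearer's voice reaches both with equal delay and gain, and uses the hypothetical function name `directivity_preprocess`. The blind source separation stage is not shown.

```python
def directivity_preprocess(mic_a, mic_b):
    """Sum and difference pre-processing for two time-aligned microphone signals.

    Assumption: the voice component is identical in both signals, so the
    sum reinforces it and the difference cancels it.
    """
    # Sum beam: keeps the voice, averages the noise.
    voice = [0.5 * (a + b) for a, b in zip(mic_a, mic_b)]
    # Difference beam: cancels the voice, leaving a noise reference.
    noise = [0.5 * (a - b) for a, b in zip(mic_a, mic_b)]
    return voice, noise


# Usage with made-up sample values: the same voice s at both microphones,
# plus different noise n1 and n2 at each.
s = [1.0, -2.0, 3.0]
n1 = [0.5, 0.25, -0.5]
n2 = [-0.5, 0.75, 0.5]
mic_a = [v + n for v, n in zip(s, n1)]
mic_b = [v + n for v, n in zip(s, n2)]
voice, noise = directivity_preprocess(mic_a, mic_b)
```

Under these assumptions the `noise` output contains no voice component, which is the kind of increased independence between the two signals that the abstract says benefits the subsequent blind source separation.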