G10L25/15

Speaker recognition with assessment of audio frame contribution

This application describes methods and apparatus for speaker recognition. An apparatus according to an embodiment has an analyzer for analyzing each frame of a sequence of frames of audio data corresponding to speech sounds uttered by a user, to determine at least one characteristic of the speech sound of that frame. An assessment module determines, for each frame of audio data, a contribution indicator of the extent to which that frame should be used for speaker recognition processing, based on the determined characteristic of the speech sound. The contribution indicator comprises a weighting to be applied to each frame in the speaker recognition processing. In this way, frames corresponding to speech sounds that are most useful for speaker discrimination may be emphasized, and/or frames corresponding to speech sounds that are least useful for speaker discrimination may be de-emphasized.
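
The patent does not disclose code; as a rough illustrative sketch only, a per-frame contribution weighting applied to frame-level speaker-similarity scores might look like this (the function name and the weighted-average formulation are assumptions, not the patented method):

```python
def weighted_speaker_score(frame_scores, frame_weights):
    """Combine per-frame speaker-similarity scores using contribution weights.

    Frames whose speech sounds discriminate speakers well get higher
    weights; uninformative frames (e.g. silence) are de-emphasized.
    """
    if len(frame_scores) != len(frame_weights):
        raise ValueError("one weight per frame required")
    total_weight = sum(frame_weights)
    if total_weight == 0:
        return 0.0
    return sum(s * w for s, w in zip(frame_scores, frame_weights)) / total_weight

# Example: three frames; the middle frame (e.g. a vowel) is most discriminative.
score = weighted_speaker_score([0.2, 0.9, 0.4], [0.1, 1.0, 0.5])
```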

ELECTRONIC APPARATUS AND CONTROL METHOD THEREOF

An electronic apparatus is provided. The electronic apparatus includes an interface configured to receive a first audio signal from a first microphone set and receive a second audio signal from a second microphone set provided at a position different from that of the first microphone set; a processor configured to: obtain a plurality of first sound-source components based on the first audio signal and a plurality of second sound-source components based on the second audio signal; identify a first sound-source component, from among the plurality of first sound-source components, and a second sound-source component, from among the plurality of second sound-source components, that correspond to each other; identify a user command based on the first sound-source component and the second sound-source component; and control an operation corresponding to the user command.
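
As a hedged sketch of the component-matching step only (the abstract does not specify a similarity measure; normalized correlation and the list-of-samples representation below are assumptions), identifying corresponding sound-source components from the two microphone sets could be approximated as:

```python
def match_components(first_components, second_components):
    """Pair each first-set sound-source component with the most similar
    second-set component, using normalized correlation as the similarity.

    Components are plain sample lists here; a real system would first
    separate sources from each microphone set's audio signal.
    """
    def similarity(a, b):
        n = min(len(a), len(b))
        num = sum(x * y for x, y in zip(a[:n], b[:n]))
        da = sum(x * x for x in a[:n]) ** 0.5
        db = sum(y * y for y in b[:n]) ** 0.5
        return num / (da * db) if da and db else 0.0

    pairs = []
    for i, a in enumerate(first_components):
        j = max(range(len(second_components)),
                key=lambda k: similarity(a, second_components[k]))
        pairs.append((i, j))
    return pairs
```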

Communication system for processing audio input with visual display
11315588 · 2022-04-26

A reference acoustic input is processed into a quantization representation such that the quantization representation comprises acoustic components determined from the reference acoustic input, wherein the acoustic components comprise amplitude, rhythm, and pitch frequency of the reference acoustic input. A visual representation is generated that simultaneously depicts the acoustic components comprising amplitude, rhythm, and pitch frequency of the reference acoustic input. A user spoken input may be received and similarly processed and displayed.
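
The patent names three acoustic components (amplitude, rhythm, pitch frequency) but gives no algorithm; as a minimal sketch of the amplitude part alone, assuming samples normalized to [-1, 1] and a fixed frame length, a coarse quantization suitable for a bar-style visual display might be:

```python
import math

def quantize_amplitude(samples, frame_len, levels=8):
    """Reduce an acoustic input to coarse per-frame amplitude values.

    Only amplitude is sketched here; a full system would also estimate
    rhythm (onset timing) and pitch frequency. Each frame's RMS level is
    quantized to one of `levels` discrete steps.
    """
    quantized = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        quantized.append(min(levels - 1, int(rms * levels)))
    return quantized
```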

Cognitive function evaluation device, cognitive function evaluation system, and cognitive function evaluation method

A cognitive function evaluation device includes: an obtainment unit configured to obtain speech data indicating speech uttered by a subject; a calculation unit configured to extract a plurality of vowels from the speech data obtained by the obtainment unit, and calculate, for each of the plurality of vowels, a feature value based on a frequency and an amplitude of at least one formant obtained from a spectrum of the vowel; an evaluation unit configured to evaluate a cognitive function of the subject from the feature value calculated by the calculation unit; and an output unit configured to output an evaluation result of the evaluation unit.
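
The abstract specifies only that the feature value is based on a formant's frequency and amplitude, obtained from the vowel's spectrum; the peak-picking rule and the particular feature combination below are assumptions for illustration, not the patented calculation:

```python
import math

def formant_feature(spectrum, freq_step):
    """Find the first spectral peak (a crude formant estimate) and return
    a feature combining its frequency and amplitude.

    `spectrum` is a list of magnitudes sampled every `freq_step` Hz.
    The feature (amplitude times log-frequency) is illustrative only.
    """
    for i in range(1, len(spectrum) - 1):
        if spectrum[i] > spectrum[i - 1] and spectrum[i] >= spectrum[i + 1]:
            freq = i * freq_step
            amp = spectrum[i]
            return amp * math.log(freq)
    return 0.0
```

In the patented system, such a value would be computed for each extracted vowel and passed to the evaluation unit.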

Training apparatus, method of the same and program

A training device changes feedback formant frequencies (the formant frequencies of a picked-up speech signal), applies a low-pass filter, converts the picked-up speech signal, adds high-pass noise to the converted signal, and feeds the converted signal with the added noise back to a subject. The device calculates a compensatory response vector from two sets of pickup formant frequencies: those of a speech signal picked up while the subject hears feedback converted with the change of feedback formant frequencies, and those of a speech signal picked up while the subject hears feedback converted without that change. It then determines an evaluation based on the compensatory response vector and a correct compensatory response vector.
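
As a hedged sketch of the two final steps only (the abstract does not define how the vectors are compared; the difference-of-conditions vector and cosine-similarity evaluation below are assumptions), the compensatory-response calculation might be approximated as:

```python
def compensatory_response(shifted_formants, unshifted_formants):
    """Compensatory response vector: how the subject's formants moved in
    the shifted-feedback condition relative to the unshifted condition."""
    return [s - u for s, u in zip(shifted_formants, unshifted_formants)]

def evaluate_response(response, correct_response):
    """Score the response against a correct compensatory response vector
    via cosine similarity; 1.0 means perfectly aligned compensation."""
    dot = sum(r * c for r, c in zip(response, correct_response))
    nr = sum(r * r for r in response) ** 0.5
    nc = sum(c * c for c in correct_response) ** 0.5
    return dot / (nr * nc) if nr and nc else 0.0

# Example: the subject raised F1 and lowered F2 under shifted feedback.
resp = compensatory_response([520.0, 1480.0], [500.0, 1500.0])
```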

Automated music composition and generation system employing virtual musical instrument libraries for producing notes contained in the digital pieces of automatically composed music
11776518 · 2023-10-03

An automated music composition and generation system including a system user interface for enabling system users to review and select one or more musical experience descriptors, as well as time and/or space parameters; and an automated music composition and generation engine, operably connected to the system user interface, for receiving, storing and processing musical experience descriptors and time and/or space parameters selected by the system user, so as to automatically compose and generate one or more digital pieces of music in response to the musical experience descriptors and time and/or space parameters selected by the system user. Each digital piece of composed and generated music contains a set of musical notes arranged and performed in the digital piece of music. The engine includes: a digital piece creation subsystem and a digital audio sample producing subsystem supported by virtual musical instrument libraries.

Audio Encoder for Encoding an Audio Signal, Method for Encoding an Audio Signal and Computer Program under Consideration of a Detected Peak Spectral Region in an Upper Frequency Band

An audio encoder for encoding an audio signal having a lower frequency band and an upper frequency band includes: a detector for detecting a peak spectral region in the upper frequency band of the audio signal; a shaper for shaping the lower frequency band using shaping information for the lower band and for shaping the upper frequency band using at least a portion of the shaping information for the lower band, wherein the shaper is configured to additionally attenuate spectral values in the detected peak spectral region in the upper frequency band; and a quantizer and coder stage for quantizing a shaped lower frequency band and a shaped upper frequency band and for entropy coding quantized spectral values from the shaped lower frequency band and the shaped upper frequency band.
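
The patent's detector and shaper are not specified at this level of detail; as a crude illustrative sketch (the per-bin threshold detector, the reuse of the last lower-band gain for the upper band, and the fixed attenuation factor are all assumptions), the shaping-plus-attenuation step could look like:

```python
def shape_and_attenuate(spectrum, split_bin, shaping, peak_threshold,
                        attenuation=0.5):
    """Shape a spectrum using lower-band shaping gains, reuse the last
    lower-band gain for the upper band, and additionally attenuate
    upper-band bins detected as part of a spectral peak region."""
    shaped = []
    for i, value in enumerate(spectrum):
        gain = shaping[i] if i < split_bin else shaping[split_bin - 1]
        v = value * gain
        if i >= split_bin and abs(v) > peak_threshold:
            v *= attenuation  # extra attenuation in the detected peak region
        shaped.append(v)
    return shaped
```

The shaped values would then go to the quantizer and entropy-coder stage described in the abstract.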