G10L21/0316

WEARABLE HEARING ASSIST DEVICE WITH ARTIFACT REMEDIATION
20220369047 · 2022-11-17

Various implementations include systems for processing audio signals to remove artifacts introduced by a machine learning system in challenging environments. In particular implementations, a method includes generating a processed audio signal for a hearing assistance device in which the processed audio signal is intended to perceptually dominate a user auditory experience, including: processing an unprocessed audio signal received by the hearing assistance device, wherein the processing includes utilizing a machine learning (ML) system to generate an ML enhanced audio signal; determining a mixing coefficient from an environmental noise assessment; mixing the ML enhanced audio signal with the unprocessed audio signal using the mixing coefficient to generate the processed audio signal; and outputting the processed audio signal.
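The mixing step described above can be sketched as a simple crossfade. This is a minimal illustration, not the claimed implementation: the function names, the dB thresholds, and the linear mapping from noise level to coefficient are all assumptions, as is the choice that a higher assessed noise level gives more weight to the unprocessed signal.

```python
def mixing_coefficient(noise_db, quiet_db=50.0, loud_db=80.0):
    """Map an environmental noise estimate (dB) to a weight in [0, 1].

    Assumption: the weight of the ML enhanced signal falls linearly from
    1.0 at quiet_db to 0.0 at loud_db. The thresholds are hypothetical.
    """
    if loud_db <= quiet_db:
        raise ValueError("loud_db must exceed quiet_db")
    t = (noise_db - quiet_db) / (loud_db - quiet_db)
    t = min(1.0, max(0.0, t))  # clamp to [0, 1]
    return 1.0 - t


def mix(ml_enhanced, unprocessed, alpha):
    """Blend two equal-length sample sequences.

    alpha weights the ML enhanced signal; (1 - alpha) weights the
    unprocessed signal.
    """
    if len(ml_enhanced) != len(unprocessed):
        raise ValueError("signals must have the same length")
    return [alpha * e + (1.0 - alpha) * u
            for e, u in zip(ml_enhanced, unprocessed)]


if __name__ == "__main__":
    alpha = mixing_coefficient(65.0)
    processed = mix([1.0, 0.0], [0.0, 1.0], alpha)
    print(alpha, processed)
```

With this mapping, a quiet environment passes the ML enhanced signal through unchanged, and a very noisy one falls back entirely to the unprocessed signal.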

Voice signal control device, voice signal control system, and voice signal control program

A voice signal control device includes a processor having hardware configured to process a voice signal generated by a voice signal generation device configured to generate the voice signal according to setting information of a voice output in a voice processing device configured to output the voice signal, and make the voice processing device output voice according to the voice signal after the processing.

SIGNAL PROCESSING DEVICE, METHOD, AND PROGRAM
20220360930 · 2022-11-10

The present technology relates to a signal processing device, a method, and a program that make it possible for a user to obtain a higher realistic feeling. The signal processing device includes: an audio generation unit that generates a sound source signal according to a type of a sound source on the basis of a recorded signal obtained by sound collection by a microphone attached to a moving object; a correction information generation unit that generates position correction information indicating a distance between the microphone and the sound source; and a position information generation unit that generates sound source position information indicating a position of the sound source in a target space on the basis of microphone position information indicating a position of the microphone in the target space and the position correction information. The present technology can be applied to a recording/transmission/reproduction system.
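The position computation described above can be illustrated with a short sketch. The abstract says only that the correction information indicates a distance between the microphone and the sound source; the direction vector used here is a hypothetical extra input (for example, one derived from the orientation of the moving object), and the function name is illustrative.

```python
import math


def source_position(mic_position, direction, distance):
    """Estimate a sound source position in the target space.

    mic_position: (x, y, z) of the microphone in the target space.
    direction:    hypothetical vector pointing from the microphone toward
                  the source; it is normalised here (an assumption, since
                  the abstract specifies only a distance).
    distance:     microphone-to-source distance (the position correction).
    """
    if distance < 0:
        raise ValueError("distance must be non-negative")
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0:
        raise ValueError("direction must be a non-zero vector")
    return tuple(m + distance * d / norm
                 for m, d in zip(mic_position, direction))


if __name__ == "__main__":
    # Microphone at (1, 2, 0); source assumed 0.5 units along +z.
    print(source_position((1.0, 2.0, 0.0), (0.0, 0.0, 2.0), 0.5))
```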

VOICE COMMAND RECOGNITION SYSTEM
20220358915 · 2022-11-10

Disclosed herein are system, apparatus, article of manufacture, method and/or computer program product embodiments, and/or combinations and sub-combinations thereof, for a voice command recognition (VCR) system. An example embodiment operates by receiving a voice command directed to controlling a device, the voice command including a wake command and an action command. An amplitude of the wake command is determined. A gain adjustment for the voice command is calculated based on a comparison of the amplitude of the wake command to a target amplitude. An amplitude of the action command is adjusted based on the calculated gain adjustment. A device command for controlling the device is identified based on the action command comprising the adjusted amplitude. The device command is provided to the device.
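The gain adjustment described above can be sketched as follows. This is an illustration under stated assumptions: amplitude is measured as RMS, the comparison to the target amplitude is taken as a ratio, and all function names are hypothetical rather than taken from the application.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a sample sequence."""
    if not samples:
        raise ValueError("samples must not be empty")
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def gain_from_wake(wake_samples, target_amplitude):
    """Gain that would bring the wake command to the target amplitude.

    Assumption: the comparison is expressed as target / measured.
    """
    measured = rms(wake_samples)
    if measured == 0:
        raise ValueError("wake command is silent")
    return target_amplitude / measured


def apply_gain(action_samples, gain):
    """Scale the action command by the gain computed from the wake command."""
    return [gain * s for s in action_samples]


if __name__ == "__main__":
    wake = [0.1, -0.1, 0.1, -0.1]      # quiet wake command
    action = [0.05, -0.02, 0.04]       # action command that follows it
    gain = gain_from_wake(wake, target_amplitude=0.5)
    print(gain, apply_gain(action, gain))
```

The point of the design is that the wake command, whose content is known in advance, serves as the level reference for the less predictable action command.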

CONTROLLING PLAYBACK OF AUDIO DATA
20230096846 · 2023-03-30

Playback of audio data is controlled by receiving a speech signal to be conveyed to a user simultaneously with playback of the audio data, modifying the volume and/or spectral appearance of selected elements of the audio data to obtain adjusted audio data, and playing back the adjusted audio data. The received speech signal may then be played back simultaneously with the adjusted audio data.
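A volume-only version of this adjustment can be sketched as below. The representation of the audio data as named elements, the fixed attenuation factor, and the function names are assumptions for illustration; spectral modification is not shown.

```python
def adjust_audio(elements, selected, duck_gain=0.5):
    """Attenuate the selected elements and sum everything to one signal.

    elements:  dict mapping a hypothetical element name to its samples.
    selected:  names of the elements whose volume is reduced.
    duck_gain: assumed attenuation factor applied to selected elements.
    """
    lengths = {len(samples) for samples in elements.values()}
    if len(lengths) != 1:
        raise ValueError("all elements must have the same length")
    n = lengths.pop()
    adjusted = [0.0] * n
    for name, samples in elements.items():
        gain = duck_gain if name in selected else 1.0
        for i, s in enumerate(samples):
            adjusted[i] += gain * s
    return adjusted


def play_with_speech(adjusted, speech):
    """Overlay the received speech signal on the adjusted audio data."""
    if len(adjusted) != len(speech):
        raise ValueError("signals must have the same length")
    return [a + s for a, s in zip(adjusted, speech)]


if __name__ == "__main__":
    elements = {"music": [1.0, 1.0], "effects": [0.5, 0.5]}
    adjusted = adjust_audio(elements, selected={"music"})
    print(play_with_speech(adjusted, [0.2, 0.2]))
```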

SPEECH ENHANCEMENT METHOD AND APPARATUS, DEVICE, AND STORAGE MEDIUM
20230097520 · 2023-03-30

A speech enhancement method includes: performing pre-enhancement on a target speech frame according to a complex spectrum corresponding to the target speech frame, to obtain a first complex spectrum; performing speech decomposition on the target speech frame according to the first complex spectrum, to obtain a glottal parameter, a gain, and an excitation signal that correspond to the target speech frame; and performing synthesis according to the glottal parameter, the gain, and the excitation signal, to obtain an enhanced speech signal corresponding to the target speech frame.
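The final synthesis step can be illustrated with a source-filter sketch. It assumes, beyond what the abstract states, that the glottal parameter can be expressed as coefficients of an all-pole filter and that the gain scales the excitation; the sign convention and the function name are likewise assumptions.

```python
def synthesize(excitation, filter_coeffs, gain):
    """All-pole synthesis: s[n] = gain * e[n] + sum_k a[k] * s[n - 1 - k].

    filter_coeffs (a) stands in for the glottal parameter; this mapping
    is an assumption made for illustration.
    """
    output = []
    for n, e in enumerate(excitation):
        sample = gain * e
        for k, a in enumerate(filter_coeffs):
            idx = n - 1 - k
            if idx >= 0:
                sample += a * output[idx]  # feedback from earlier outputs
        output.append(sample)
    return output


if __name__ == "__main__":
    # Impulse excitation through a one-pole filter.
    print(synthesize([1.0, 0.0, 0.0], [0.5], gain=2.0))
```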

INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND NON-TRANSITORY COMPUTER READABLE MEDIUM
20230100767 · 2023-03-30

An information processing device includes a processor configured to output, in a case where a service is being used in which at least speech is exchanged among multiple users such that a conversation takes place among all of the multiple users, a speech of a separate conversation distinctly from a speech of the conversation taking place among all of the multiple users to a device of a user who is engaged in the separate conversation with a specific user from among the multiple users, and output the speech of the conversation taking place among all of the multiple users without outputting the speech of the separate conversation to a device of a user who is not engaged in the separate conversation.
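The output rule described above can be sketched as a routing table. The stream labels and the function name are hypothetical, and "distinctly" is modelled simply as a separately labelled stream; how the two streams are made perceptually distinct is not specified here.

```python
def route_speech(users, separate_participants):
    """Decide which speech streams each user's device receives.

    Every device receives the conversation among all users ("main").
    Only devices of users engaged in the separate conversation also
    receive it, as its own labelled stream ("separate").
    """
    routes = {}
    for user in users:
        streams = ["main"]
        if user in separate_participants:
            streams.append("separate")
        routes[user] = streams
    return routes


if __name__ == "__main__":
    print(route_speech(["ann", "bob", "cho"], {"ann", "bob"}))
```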

System and method of enhancing intelligibility of audio playback

A personal listening system and a method of using the personal listening system to enhance speech intelligibility of audio playback are described. The method includes determining a speech intelligibility metric, such as a speech reception threshold, of a user. Based on the speech intelligibility metric, a tuning parameter is applied to an audio input signal. The speech reception threshold is compared to an environmental signal-to-noise ratio to determine whether enhancement of the audio input signal is warranted. Application of the tuning parameter to the audio input signal generates an audio output signal having reduced noise, making playback of the audio output signal more intelligible to the user. Other aspects are also described and claimed.
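The comparison described above can be sketched as a simple decision. The direction of the test (enhance when the environmental SNR falls below the user's speech reception threshold), the optional margin, and the way a tuning parameter is derived from the shortfall are all assumptions for illustration.

```python
def enhancement_warranted(srt_db, env_snr_db, margin_db=0.0):
    """Return True when the environment is too noisy for this user.

    srt_db:     user's speech reception threshold, expressed as an SNR in dB.
    env_snr_db: measured environmental signal-to-noise ratio in dB.
    margin_db:  hypothetical safety margin added to the threshold.
    """
    return env_snr_db < srt_db + margin_db


def tuning_parameter_db(srt_db, env_snr_db):
    """Hypothetical tuning parameter: the SNR shortfall in dB, never negative."""
    return max(0.0, srt_db - env_snr_db)


if __name__ == "__main__":
    srt, snr = -2.0, -5.0
    if enhancement_warranted(srt, snr):
        print("enhance by", tuning_parameter_db(srt, snr), "dB")
```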