G10L21/003

Post filter for audio signals

In some embodiments, a pitch filter for filtering a preliminary audio signal generated from an audio bitstream is disclosed. The pitch filter has an operating mode selected from either: (i) an active mode, in which the preliminary audio signal is filtered using filtering information to obtain a filtered audio signal, or (ii) an inactive mode, in which the pitch filter is disabled. The preliminary audio signal is generated in an audio encoder or audio decoder having a coding mode selected from at least two distinct coding modes, and the pitch filter can be selectively operated in either the active or the inactive mode while operating in that coding mode, based on control information.
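The mode-selection behavior described above can be sketched as follows. This is an illustrative sketch only: the names (`ControlInfo`, `pitch_postfilter`) and the particular long-term filter form are assumptions, not taken from the patent.

```python
from dataclasses import dataclass

@dataclass
class ControlInfo:
    filter_enabled: bool   # control information selecting active/inactive mode
    pitch_lag: int         # filtering information: pitch period in samples
    gain: float            # filtering information: filter strength

def pitch_postfilter(signal, ctrl):
    """Return the filtered signal in active mode, or the input unchanged
    when the pitch filter is in inactive mode."""
    if not ctrl.filter_enabled:              # inactive mode: filter disabled
        return list(signal)
    out = []
    for n, x in enumerate(signal):           # active mode: long-term (pitch) filtering
        past = signal[n - ctrl.pitch_lag] if n >= ctrl.pitch_lag else 0.0
        out.append(x + ctrl.gain * past)     # emphasize pitch harmonics
    return out
```

The same decoder code path serves both modes; the control information alone decides whether the preliminary signal passes through unchanged.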

Natural human-computer interaction for virtual personal assistant systems
11609631 · 2023-03-21

Technologies for natural language interactions with virtual personal assistant systems include a computing device configured to capture audio input, distort the audio input to produce a number of distorted audio variants, and perform speech recognition on the audio input and the distorted variants. The computing device selects a result from a large number of potential speech recognition results based on contextual information. The computing device may measure the user's engagement level by using an eye-tracking sensor to determine whether the user is visually focused on an avatar rendered by the virtual personal assistant. The avatar may be rendered in a disengaged state, a ready state, or an engaged state based on the user engagement level. The avatar may be rendered as semitransparent in the disengaged state, and the transparency may be reduced in the ready state or the engaged state. Other embodiments are described and claimed.
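The engagement-to-rendering mapping could look like the sketch below. The thresholds and alpha (opacity) values are assumptions for illustration; the patent does not specify them.

```python
def avatar_state(gaze_on_avatar: bool, gaze_duration_s: float):
    """Map eye-tracking input to an avatar state and an opacity value.

    Returns (state, opacity) where opacity 1.0 is fully opaque.
    Threshold of 1.0 s between 'ready' and 'engaged' is an assumption.
    """
    if not gaze_on_avatar:
        return "disengaged", 0.4   # rendered semitransparent
    if gaze_duration_s < 1.0:
        return "ready", 0.8        # transparency reduced
    return "engaged", 1.0          # fully visible while user is focused
```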

Methods and systems for providing changes to a live voice stream

Methods and systems for providing a change to a voice interacting with a user are described. Information indicating a change that can be made to the voice can be received, and the voice can be changed based on that information.
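As a minimal sketch, the received change information might be a small parameter set applied sample-by-sample to the live stream. The change format here (a simple gain) is a placeholder assumption, standing in for richer voice transformations such as pitch or timbre changes.

```python
def apply_voice_change(samples, change):
    """Apply received change information to a block of voice samples.

    `change` is assumed to be a dict of transformation parameters;
    only a gain is modeled in this sketch.
    """
    gain = change.get("gain", 1.0)
    return [s * gain for s in samples]
```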

HEARING AID WITH VOICE RECOGNITION
20230080418 · 2023-03-16

A system for selectively amplifying audio signals may include a microphone configured to capture sounds from an environment of a user. The system may also include a processor programmed to: receive audio signals representative of the sounds captured by the microphone; cause selective conditioning of at least one audio signal received by the microphone from a region associated with a recognized individual; and cause transmission of the at least one conditioned audio signal to a hearing interface device configured to provide sound to an ear of the user.
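The selective-conditioning step might be sketched as boosting the signal attributed to the recognized individual's region while attenuating the others. The region labels and the gain values here are illustrative assumptions, not from the patent.

```python
def condition(signals_by_region, target_region, boost=2.0, cut=0.5):
    """Amplify the signal from the target region; attenuate the rest.

    `signals_by_region` maps a region label to its sample list.
    """
    out = {}
    for region, samples in signals_by_region.items():
        gain = boost if region == target_region else cut
        out[region] = [s * gain for s in samples]
    return out
```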

Characterizing, selecting and adapting audio and acoustic training data for automatic speech recognition systems

A system for and method of characterizing a target application acoustic domain analyzes one or more speech data samples from the target application acoustic domain to determine one or more target acoustic characteristics, including a CODEC type and bit rate associated with the speech data samples. The determined target acoustic characteristics may also include other aspects of the target speech data samples, such as sampling frequency, active bandwidth, noise level, reverberation level, clipping level, and speaking rate. The determined target acoustic characteristics are stored in a memory as a target acoustic data profile. The data profile may be used to select and/or modify one or more out-of-domain speech samples based on the one or more target acoustic characteristics.

Modulation of packetized audio signals
11482216 · 2022-10-25

Modulating packetized audio signals in a voice-activated, data-packet-based computer network environment is provided. A system can receive audio signals detected by a microphone of a device. The system can parse the audio signal to identify a trigger keyword and a request, and generate a first action data structure. The system can identify a content item object based on the trigger keyword and generate an output signal comprising a first portion corresponding to the first action data structure and a second portion corresponding to the content item object. The system can apply a modulation to the first or second portion of the output signal and transmit the modulated output signal to the device.
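Composing the two-portion output signal can be sketched as below. A simple gain stands in for the modulation (which in practice could be a pitch or frequency shift); the function name and parameters are illustrative assumptions.

```python
def build_output(action_audio, content_audio, modulate_second=True, factor=0.7):
    """Concatenate the action-data portion and the content-item portion,
    applying a modulation (modeled as a gain) to the second portion."""
    second = [s * factor for s in content_audio] if modulate_second else list(content_audio)
    return list(action_audio) + second
```

Modulating only one portion lets a listener distinguish the assistant's direct response from the appended content item.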