G10L15/12

Automatic speech recognition system addressing perceptual-based adversarial audio attacks
11222651 · 2022-01-11

A computer-implemented method for creating a combined audio signal in a speech recognition system includes sampling an audio input signal to generate a time-domain sampled input signal, converting the time-domain sampled input signal to a frequency-domain input signal, generating perceptual weights in response to frequency components of critical bands of the frequency-domain input signal, creating a time-domain adversary signal in response to the perceptual weights, and combining the time-domain adversary signal with the audio input signal to create a combined audio signal, wherein speech processing of the combined audio signal outputs a different result from speech processing of the audio input signal.
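As a rough illustration of the pipeline summarized in this abstract, the sketch below (Python, NumPy only) takes one frame of an input signal, converts it to the frequency domain, derives per-critical-band perceptual weights from band energies, and shapes a perturbation with those weights before adding it back in the time domain. The Bark-band approximation, the energy-based weighting, and all function names are assumptions made for illustration, not the patented method.

```python
# Hypothetical sketch of a perceptual-weighting pipeline; names are illustrative.
import numpy as np

def critical_band_edges(sr, n_fft, n_bands=24):
    """Approximate Bark-scale critical-band edges mapped to FFT bin indices."""
    max_bark = 13.0 * np.arctan(0.00076 * sr / 2) + 3.5 * np.arctan(((sr / 2) / 7500.0) ** 2)
    barks = np.linspace(0.0, max_bark, n_bands + 1)
    freqs = np.fft.rfftfreq(n_fft, 1.0 / sr)
    bin_barks = 13.0 * np.arctan(0.00076 * freqs) + 3.5 * np.arctan((freqs / 7500.0) ** 2)
    return np.searchsorted(bin_barks, barks)

def perceptual_weights(spectrum, band_edges, floor=1e-10):
    """One weight per critical band: a crude proxy for a per-band masking threshold."""
    weights = np.ones(spectrum.shape, dtype=float)
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        if hi > lo:
            weights[lo:hi] = np.mean(np.abs(spectrum[lo:hi]) ** 2) + floor
    return weights / weights.max()

def combine_with_adversary(x, sr, n_fft=2048, epsilon=0.002, rng=None):
    """Shape a random perturbation by perceptual weights and add it to the input frame."""
    rng = np.random.default_rng() if rng is None else rng
    frame = x[:n_fft]                              # single frame for brevity
    spectrum = np.fft.rfft(frame)                  # time -> frequency domain
    w = perceptual_weights(spectrum, critical_band_edges(sr, n_fft))
    noise = rng.standard_normal(spectrum.shape) + 1j * rng.standard_normal(spectrum.shape)
    adversary = np.fft.irfft(w * noise, n=n_fft)   # frequency -> time domain
    adversary *= epsilon / (np.max(np.abs(adversary)) + 1e-12)
    return frame + adversary, adversary

# Example: perturb one frame of a synthetic 16 kHz tone.
sr = 16000
x = 0.1 * np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
combined, adv = combine_with_adversary(x, sr)
```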

User voice activity detection using dynamic classifier

A device includes a memory configured to store instructions and one or more processors configured to execute the instructions. The one or more processors are configured to execute the instructions to receive audio data including first audio data corresponding to a first output of a first microphone and second audio data corresponding to a second output of a second microphone. The one or more processors are also configured to execute the instructions to provide the audio data to a dynamic classifier. The dynamic classifier is configured to generate a classification output corresponding to the audio data. The one or more processors are further configured to execute the instructions to determine, at least partially based on the classification output, whether the audio data corresponds to user voice activity.
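The abstract does not define the dynamic classifier, so the sketch below assumes a simple online-updating two-cluster classifier over per-frame features derived from the two microphone outputs (log energies and their ratio). The class, feature choice, and thresholds are hypothetical stand-ins, not the claimed design.

```python
# Illustrative only; the "dynamic classifier" here is an assumed online two-cluster model.
import numpy as np

class DynamicClassifier:
    """Tiny online two-cluster classifier over per-frame feature vectors."""
    def __init__(self, n_features=3, lr=0.05):
        self.centroids = np.zeros((2, n_features))   # index 0: non-voice, index 1: voice
        self.initialized = False
        self.lr = lr

    def classify_and_update(self, feat):
        if not self.initialized:
            self.centroids[:] = feat
            self.centroids[1] += 1.0                 # nudge the "voice" centroid apart
            self.initialized = True
        label = int(np.argmin(np.linalg.norm(self.centroids - feat, axis=1)) == 1)
        # Online update: move the winning centroid toward the new observation.
        self.centroids[label] += self.lr * (feat - self.centroids[label])
        return label

def frame_features(mic1_frame, mic2_frame, eps=1e-9):
    e1 = float(np.mean(mic1_frame ** 2))
    e2 = float(np.mean(mic2_frame ** 2))
    return np.array([np.log(e1 + eps), np.log(e2 + eps), np.log((e1 + eps) / (e2 + eps))])

def detect_user_voice_activity(mic1, mic2, frame=256):
    clf = DynamicClassifier()
    decisions = []
    for start in range(0, min(len(mic1), len(mic2)) - frame, frame):
        feat = frame_features(mic1[start:start + frame], mic2[start:start + frame])
        decisions.append(clf.classify_and_update(feat))
    return decisions   # 1 = frame classified as likely user voice activity

# Example with synthetic signals: the first microphone carries a louder voice-like burst.
sr = 16000
rng = np.random.default_rng(1)
mic1 = 0.01 * rng.standard_normal(sr)
mic2 = 0.01 * rng.standard_normal(sr)
mic1[4000:8000] += 0.2 * np.sin(2 * np.pi * 200 * np.arange(4000) / sr)
flags = detect_user_voice_activity(mic1, mic2)
```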

Augmentation of Audiographic Images for Improved Machine Learning

Generally, the present disclosure is directed to systems and methods that generate augmented training data for machine-learned models via application of one or more augmentation techniques to audiographic images that visually represent audio signals. In particular, the present disclosure provides a number of novel augmentation operations which can be performed directly upon the audiographic image (e.g., as opposed to the raw audio data) to generate augmented training data that results in improved model performance. As an example, the audiographic images can be or include one or more spectrograms or filter bank sequences.
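As an illustration of augmentation applied directly to the audiographic image rather than the raw audio, the sketch below masks random blocks of time steps and frequency channels in a spectrogram-like array. The specific operations, mask widths, and fill value are assumptions for demonstration, not necessarily the operations disclosed here.

```python
# Minimal sketch of image-level augmentation on a spectrogram or filter bank "image".
import numpy as np

def time_mask(spec, max_width=20, rng=None):
    """Replace a random block of consecutive time steps (columns) with the mean value."""
    rng = np.random.default_rng() if rng is None else rng
    t = spec.shape[1]
    width = int(rng.integers(0, max_width + 1))
    start = int(rng.integers(0, max(1, t - width)))
    out = spec.copy()
    out[:, start:start + width] = spec.mean()
    return out

def frequency_mask(spec, max_width=8, rng=None):
    """Replace a random block of consecutive frequency channels (rows) with the mean value."""
    rng = np.random.default_rng() if rng is None else rng
    f = spec.shape[0]
    width = int(rng.integers(0, max_width + 1))
    start = int(rng.integers(0, max(1, f - width)))
    out = spec.copy()
    out[start:start + width, :] = spec.mean()
    return out

# Example: augment a fake 80-channel filter bank image with 300 frames.
spec = np.random.rand(80, 300)
augmented = frequency_mask(time_mask(spec))
```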

Unsupervised keyword spotting and word discovery for fraud analytics
11810559 · 2023-11-07

Embodiments described herein provide for a computer that detects one or more keywords of interest using acoustic features, to detect or query commonalities across multiple fraud calls. Embodiments described herein may implement unsupervised keyword spotting (UKWS) or unsupervised word discovery (UWD) in order to identify commonalities across a set of calls, where both UKWS and UWD employ Gaussian Mixture Models (GMM) and one or more dynamic time-warping algorithms. A user may indicate a training exemplar or occurrence of call-specific information, referred to herein as “a named entity,” such as a person's name, an account number, account balance, or order number. The computer may perform a redaction process that computationally nullifies the import of the named entity in the modeling processes described herein.
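A hedged sketch of the general GMM-plus-DTW idea follows: frames of acoustic features are mapped to GMM component posteriors, and dynamic time warping scores a user-marked exemplar region against a candidate region from another call. The use of scikit-learn's GaussianMixture and the random stand-in features are assumptions; feature extraction, segmental search across whole calls, and the named-entity redaction step are omitted.

```python
# Sketch only: GMM posteriorgrams compared with dynamic time warping.
import numpy as np
from sklearn.mixture import GaussianMixture

def posteriorgram(features, gmm):
    """Frame-by-frame GMM component posteriors (one probability vector per frame)."""
    return gmm.predict_proba(features)

def dtw_distance(a, b):
    """Classic dynamic time warping over frame-to-frame Euclidean distances."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m] / (n + m)

# Example with random "acoustic features" standing in for MFCCs of two calls.
rng = np.random.default_rng(0)
call_a = rng.standard_normal((200, 13))
call_b = rng.standard_normal((180, 13))
gmm = GaussianMixture(n_components=8, random_state=0).fit(np.vstack([call_a, call_b]))
exemplar = posteriorgram(call_a[40:80], gmm)    # user-marked keyword region (assumed)
candidate = posteriorgram(call_b[50:95], gmm)   # candidate region from another call
score = dtw_distance(exemplar, candidate)       # lower score = more similar
```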

Unsupervised keyword spotting and word discovery for fraud analytics
20220301554 · 2022-09-22

Embodiments described herein provide for a computer that detects one or more keywords of interest using acoustic features, to detect or query commonalities across multiple fraud calls. Embodiments described herein may implement unsupervised keyword spotting (UKWS) or unsupervised word discovery (UWD) in order to identify commonalities across a set of calls, where both UKWS and UWD employ Gaussian Mixture Models (GMM) and one or more dynamic time-warping algorithms. A user may indicate a training exemplar or occurrence of call-specific information, referred to herein as “a named entity,” such as a person's name, an account number, account balance, or order number. The computer may perform a redaction process that computationally nullifies the import of the named entity in the modeling processes described herein.