Patent classifications
G10L25/30
Emitting word timings with end-to-end models
A method includes receiving a training example that includes audio data representing a spoken utterance and a ground truth transcription. For each word in the spoken utterance, the method also includes inserting a placeholder symbol before the respective word, identifying a respective ground truth alignment for the beginning and end of the respective word, determining a beginning word piece and an ending word piece, and generating a first constrained alignment for the beginning word piece and a second constrained alignment for the ending word piece. The first constrained alignment is aligned with the ground truth alignment for the beginning of the respective word, and the second constrained alignment is aligned with the ground truth alignment for the end of the respective word. The method also includes constraining an attention head of a second pass decoder by applying the first and second constrained alignments.
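As a rough illustration of the constrained alignments described above, the following Python sketch inserts a placeholder symbol before each word, picks beginning and ending word pieces with a stand-in tokenizer, and builds per-piece frame windows plus a boolean attention mask for a hypothetical second-pass decoder. The tokenizer, the tolerance, and the mask format are illustrative assumptions, not the patent's specification.

```python
PLACEHOLDER = "<w>"  # placeholder symbol inserted before each word

def word_pieces(word):
    # Stand-in tokenizer: the first piece is the "beginning" word piece and the
    # last piece is the "ending" word piece of the word.
    return [word[:2]] + ["##" + ch for ch in word[2:]]

def build_constrained_alignments(words, tolerance=2):
    """words: list of (word, begin_frame, end_frame) ground-truth alignments.
    Returns the token sequence (with placeholders) and, for the beginning and
    ending word piece of each word, the frame window it is constrained to."""
    tokens, constraints = [], {}
    for word, begin, end in words:
        tokens.append(PLACEHOLDER)
        start_idx = len(tokens)
        tokens.extend(word_pieces(word))
        # First constrained alignment: beginning word piece near the word start.
        constraints[start_idx] = range(max(0, begin - tolerance), begin + tolerance + 1)
        # Second constrained alignment: ending word piece near the word end.
        constraints[len(tokens) - 1] = range(max(0, end - tolerance), end + tolerance + 1)
    return tokens, constraints

def attention_mask(num_frames, tokens, constraints):
    """Boolean mask [num_tokens][num_frames]; True = the token may attend to the
    frame. Constrained tokens may only attend within their frame window."""
    mask = [[True] * num_frames for _ in tokens]
    for tok_idx, frames in constraints.items():
        mask[tok_idx] = [t in frames for t in range(num_frames)]
    return mask

words = [("hello", 3, 12), ("world", 14, 25)]     # dummy ground-truth alignments
tokens, constraints = build_constrained_alignments(words)
mask = attention_mask(30, tokens, constraints)
print(tokens)
print(sum(mask[1]))   # frames the beginning piece of "hello" may attend to
```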
Apparatus and method for encoding/decoding audio signal using information of previous frame
Disclosed is an apparatus and method for encoding/decoding an audio signal using information of a previous frame. An audio signal encoding method includes: generating a current latent vector by reducing the dimensionality of a current frame of an audio signal; generating a concatenation vector by concatenating a previous latent vector, generated by reducing the dimensionality of a previous frame of the audio signal, with the current latent vector; and encoding and quantizing the concatenation vector.
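A minimal PyTorch sketch of this encoding step follows: each frame is reduced to a latent vector, the previous and current latent vectors are concatenated, and the concatenation vector is encoded and quantized. The frame size, latent width, linear layers, and uniform scalar quantizer are toy assumptions; the patent does not fix these details.

```python
import torch
import torch.nn as nn

class PrevFrameCodecEncoder(nn.Module):
    def __init__(self, frame_size=320, latent_dim=64, code_dim=32):
        super().__init__()
        # Dimensionality reduction of a single frame into a latent vector.
        self.reduce = nn.Linear(frame_size, latent_dim)
        # Encoding of the concatenated previous + current latent vectors.
        self.encode = nn.Linear(2 * latent_dim, code_dim)

    def quantize(self, x, levels=256):
        # Placeholder uniform scalar quantizer over [-1, 1].
        x = torch.tanh(x)
        return torch.round((x + 1) / 2 * (levels - 1)).to(torch.int64)

    def forward(self, prev_frame, cur_frame):
        prev_latent = self.reduce(prev_frame)                  # previous latent vector
        cur_latent = self.reduce(cur_frame)                    # current latent vector
        concat = torch.cat([prev_latent, cur_latent], dim=-1)  # concatenation vector
        return self.quantize(self.encode(concat))              # encoded, quantized codes

enc = PrevFrameCodecEncoder()
prev, cur = torch.randn(1, 320), torch.randn(1, 320)
print(enc(prev, cur).shape)   # torch.Size([1, 32])
```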
METHOD AND APPARATUS FOR AUTOMATIC COUGH DETECTION
A method for identifying cough sounds in an audio recording of a subject, including: operating at least one electronic processor to identify potential cough sounds in the audio recording; operating the at least one electronic processor to transform one or more of the potential cough sounds into corresponding one or more image representations; operating the at least one electronic processor to apply the one or more image representations to a representation pattern classifier trained to confirm whether a potential cough sound is or is not a cough sound; and operating the at least one electronic processor to flag one or more of the potential cough sounds as confirmed cough sounds based on an output of the representation pattern classifier.
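A minimal sketch of this pipeline follows, assuming a log-magnitude spectrogram as the image representation and a small CNN as the representation pattern classifier; neither choice is fixed by the patent, and the classifier here is untrained.

```python
import torch
import torch.nn as nn

def to_image_representation(segment, n_fft=256, hop=64):
    """Transform a potential cough sound (1-D waveform) into a 2-D image
    (log-magnitude spectrogram)."""
    spec = torch.stft(segment, n_fft=n_fft, hop_length=hop,
                      window=torch.hann_window(n_fft), return_complex=True)
    return torch.log1p(spec.abs()).unsqueeze(0)   # shape: [1, freq, time]

class CoughPatternClassifier(nn.Module):
    """Representation pattern classifier: cough vs. not-cough."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(8),
            nn.Flatten(), nn.Linear(8 * 8 * 8, 2))

    def forward(self, x):
        return self.net(x)

def flag_confirmed_coughs(segments, classifier, threshold=0.5):
    """Flag potential cough sounds whose classifier output confirms them."""
    confirmed = []
    for i, seg in enumerate(segments):
        image = to_image_representation(seg).unsqueeze(0)   # add batch dimension
        prob_cough = torch.softmax(classifier(image), dim=-1)[0, 1]
        if prob_cough >= threshold:
            confirmed.append(i)
    return confirmed

segments = [torch.randn(4000) for _ in range(3)]   # dummy candidate segments
print(flag_confirmed_coughs(segments, CoughPatternClassifier()))
```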
Systems and Methods for Assisted Translation and Lip Matching for Voice Dubbing
Systems and methods for generating candidate translations for use in creating synthetic or human-acted voice dubbings, aiding human translators in generating translations that match the corresponding video, automatically grading how well a candidate translation matches the corresponding video, suggesting modifications to the speed and/or timing of the translated text to improve the grading of a candidate translation, and suggesting modifications to the voice dubbing and/or video to improve the grading of a candidate translation. In that regard, the present technology may be used to fully automate the process of generating lip-matched translations and associated voice dubbings, or as an aid for human-in-the-loop processes that may reduce or eliminate the time and effort required from translators, adapters, voice actors, and/or audio editors to generate voice dubbings.
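One way to ground the "grading" idea is a duration-matching score: estimate how long a candidate translation would take to speak, compare it with the time the original line occupies on screen, and suggest a speed adjustment. The syllable-based duration model and the scoring formula below are illustrative assumptions, not the patent's method.

```python
def estimated_duration_sec(text, syllables_per_sec=4.0):
    """Rough spoken-duration estimate from a crude syllable count (vowel groups)."""
    vowels, syllables, prev_vowel = "aeiouy", 0, False
    for ch in text.lower():
        is_vowel = ch in vowels
        if is_vowel and not prev_vowel:
            syllables += 1
        prev_vowel = is_vowel
    return max(syllables, 1) / syllables_per_sec

def grade_candidate(candidate_text, source_duration_sec):
    """Grade how well a candidate translation matches the video timing (0..1)
    and suggest a playback-speed factor that would close the gap."""
    duration = estimated_duration_sec(candidate_text)
    ratio = duration / source_duration_sec
    score = max(0.0, 1.0 - abs(ratio - 1.0))   # 1.0 = perfect duration match
    return {"score": round(score, 2), "suggested_speed": round(ratio, 2)}

# Example: the source line occupies 2.0 seconds of video.
for cand in ["Where are you going?", "Where exactly do you think you are going?"]:
    print(cand, "->", grade_candidate(cand, source_duration_sec=2.0))
```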
DIFFICULT AIRWAY EVALUATION METHOD AND DEVICE BASED ON MACHINE LEARNING VOICE TECHNOLOGY
The present disclosure relates to a difficult airway evaluation method and device based on machine learning voice technology. The method includes the following steps: acquiring voice data of a patient; extracting features from the voice data, obtaining the pitch period of the pronunciations, and acquiring voiced and unvoiced sound features based on that pitch period; and constructing a difficult airway evaluation classifier based on the machine learning voice technology, analyzing the received voiced and unvoiced sound features with the trained difficult airway evaluation classifier, and scoring the severity of the difficult airway to obtain an evaluation result.
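A minimal sketch of the feature side follows, assuming autocorrelation-based pitch-period estimation, an energy-based voiced/unvoiced decision, and a generic scikit-learn classifier trained on placeholder data; the disclosure's actual features, model, and severity scale are not reproduced here.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def pitch_period(frame, fs=16000, fmin=75, fmax=400):
    """Estimate the pitch period (in samples) of one frame via autocorrelation."""
    frame = frame - frame.mean()
    corr = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(fs / fmax), int(fs / fmin)
    return lo + int(np.argmax(corr[lo:hi]))

def voiced_unvoiced_features(signal, fs=16000, frame_len=400):
    """Voiced-sound feature (mean pitch period of voiced frames) and unvoiced
    features (unvoiced-frame ratio, mean zero-crossing rate)."""
    periods, unvoiced, zcrs = [], 0, []
    for start in range(0, len(signal) - frame_len, frame_len):
        frame = signal[start:start + frame_len]
        zcrs.append(float(np.mean(np.abs(np.diff(np.sign(frame)))) / 2))
        if float(np.mean(frame ** 2)) > 1e-4:      # crude voiced/unvoiced decision
            periods.append(pitch_period(frame, fs))
        else:
            unvoiced += 1
    n_frames = max(len(periods) + unvoiced, 1)
    return [float(np.mean(periods)) if periods else 0.0,
            unvoiced / n_frames,
            float(np.mean(zcrs))]

# Evaluation classifier scoring difficult-airway severity from the features.
X = np.random.rand(40, 3)                      # placeholder training features
y = np.random.randint(0, 3, size=40)           # placeholder severity scores 0-2
clf = RandomForestClassifier().fit(X, y)

voice = np.random.randn(16000)                 # one second of dummy voice data
print("severity score:", clf.predict([voiced_unvoiced_features(voice)])[0])
```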
Method and System for Dereverberation of Speech Signals
A system and method for reverberation reduction is disclosed. A first Deep Neural Network (DNN) produces a first estimate of a target direct-path signal from a mixture of acoustic signals that include the target direct-path signal and a reverberation of the target direct-path signal. A filter modeling a room impulse response (RIR) for the first estimate is estimated. The filter, when applied to the first estimate of the target direct-path signal, generates a result closest to a residual between the mixture of the acoustic signals and the first estimate of the target direct-path signal according to a distance function. A mixture with reduced reverberation of the target direct-path signal is obtained by removing the result of applying the filter to the first estimate of the target direct-path signal from the received mixture. A second DNN produces a second estimate of the target direct-path signal from the mixture with reduced reverberation.
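The two-stage structure can be sketched compactly: estimate an FIR filter by least squares so that filtering the first estimate best matches the residual, subtract that filtered signal from the mixture, and refine with a second network. The NumPy code below uses identity functions as stand-ins for the two DNNs and a squared-error distance; the actual network architectures and distance function are not specified here.

```python
import numpy as np

def estimate_rir_filter(first_estimate, residual, taps=64):
    """Find the FIR filter that, applied to the first direct-path estimate,
    best matches the residual (mixture minus first estimate) in least squares."""
    n = len(residual)
    A = np.zeros((n, taps))                     # convolution matrix of the estimate
    for k in range(taps):
        A[k:, k] = first_estimate[:n - k]
    g, *_ = np.linalg.lstsq(A, residual, rcond=None)
    return g

def dereverberate(mixture, dnn1, dnn2, taps=64):
    s1 = dnn1(mixture)                            # first direct-path estimate
    residual = mixture - s1                       # mostly reverberation
    g = estimate_rir_filter(s1, residual, taps)   # filter modeling the RIR
    reverb_estimate = np.convolve(s1, g)[:len(mixture)]
    reduced = mixture - reverb_estimate           # mixture with reduced reverberation
    return dnn2(reduced)                          # second direct-path estimate

# Toy stand-ins for the two DNNs and a synthetic reverberant mixture.
dnn1 = dnn2 = lambda x: x
direct = np.random.randn(8000)
rir_tail = 0.3 * np.random.randn(64)
mixture = direct + np.convolve(direct, rir_tail)[:len(direct)]
print(dereverberate(mixture, dnn1, dnn2).shape)
```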