IPIQ

G10L19/125

Audio decoder and method for providing a decoded audio information using an error concealment modifying a time domain excitation signal

10964334 · 2021-03-30

Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.

Jérémie Lecomte

An audio decoder for providing a decoded audio information on the basis of an encoded audio information. The audio decoder has an error concealment configured to provide an error concealment audio information for concealing a loss of an audio frame, wherein the error concealment is configured to modify a time domain excitation signal obtained for one or more audio frames preceding a lost audio frame, in order to obtain the error concealment audio information.
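A minimal sketch of the general idea in this abstract, not the patented method: the excitation of the last good frame is reused for the lost frame and modified along the way. The specific modification (repeating the last pitch cycle with a linear fade) and all names (`conceal_excitation`, `pitch_lag`, `end_gain`) are assumptions made for illustration.

```python
def conceal_excitation(prev_excitation, pitch_lag, frame_len, end_gain=0.5):
    """Build a replacement excitation for one lost frame.

    Assumption: the last `pitch_lag` samples of the previous excitation
    form one pitch cycle, which is repeated and linearly faded from a
    gain of 1.0 down to `end_gain` across the frame.
    """
    cycle = prev_excitation[-pitch_lag:]
    out = []
    for n in range(frame_len):
        # Linear fade; max() guards against division by zero for 1-sample frames.
        gain = 1.0 - (1.0 - end_gain) * n / max(frame_len - 1, 1)
        out.append(gain * cycle[n % pitch_lag])
    return out


if __name__ == "__main__":
    print(conceal_excitation([0.0, 1.0, 2.0, 3.0], pitch_lag=2, frame_len=5, end_gain=0.0))
```

In a real decoder the concealed excitation would then pass through the synthesis filter; that stage is omitted here.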

Phase reconstruction in a speech decoder

10957331 · 2021-03-23

Microsoft Technology Licensing, LLC

Innovations in phase quantization during speech encoding and phase reconstruction during speech decoding are described. For example, to encode a set of phase values, a speech encoder omits higher-frequency phase values and/or represents at least some of the phase values as a weighted sum of basis functions. Or, as another example, to decode a set of phase values, a speech decoder reconstructs at least some of the phase values using a weighted sum of basis functions and/or reconstructs lower-frequency phase values then uses at least some of the lower-frequency phase values to synthesize higher-frequency phase values. In many cases, the innovations improve the performance of a speech codec in low bitrate scenarios, even when encoded data is delivered over a network that suffers from insufficient bandwidth or transmission quality problems.
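As a rough illustration of "a weighted sum of basis functions", the sketch below rebuilds a set of phase values from a few weights. The abstract does not say which basis functions are used; the cosine basis and the names `cosine_basis` and `reconstruct_phases` are assumptions chosen for the example.

```python
import math


def cosine_basis(num_basis, num_bins):
    """Return `num_basis` cosine functions sampled at `num_bins` frequency bins.

    Assumption: a DCT-style cosine basis; the abstract does not name one.
    """
    return [
        [math.cos(math.pi * k * (n + 0.5) / num_bins) for n in range(num_bins)]
        for k in range(num_basis)
    ]


def reconstruct_phases(weights, num_bins):
    """Reconstruct phase values as a weighted sum of the basis functions."""
    basis = cosine_basis(len(weights), num_bins)
    return [
        sum(w * b[n] for w, b in zip(weights, basis))
        for n in range(num_bins)
    ]


if __name__ == "__main__":
    # Three transmitted weights stand in for eight phase values.
    print(reconstruct_phases([0.4, -0.2, 0.1], num_bins=8))
```

The point of such a representation is that only the weights need to be transmitted, which is fewer values than the phases themselves.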

Watermarking of Synthetic Speech

20210050024 · 2021-02-18

An audio watermark is embedded in synthetic speech, such as synthetic speech created using text-to-speech (TTS) synthesis. Such audio watermarks can, for example, be used to increase the accuracy of voice biometric (VB) and other systems in distinguishing synthetic speech from human speech. In addition to its use in voice biometrics, such audio watermarking can prevent misuse of human quality TTS, or other synthetic speech, in a variety of other contexts, such as incriminating recordings, spam messages, contact center denial of service, and protection of personal information in contact centers not utilizing VB.

System and Method for Voice Morphing

20210089626 · 2021-03-25

Soundhound, Inc.

Dylan H. Ross

A system and method for masking an identity of a speaker of natural language speech, such as speech clips to be labeled by humans in a system generating voice transcriptions for training an automatic speech recognition model. The natural language speech is morphed prior to being presented to the human for labeling. In one embodiment, morphing comprises pitch shifting the speech randomly either up or down, then frequency shifting the speech, then pitch shifting the speech in a direction opposite the first pitch shift.
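The three-step morph in this abstract can be illustrated on a toy model in which a voice is just a list of partial frequencies in Hz. This is a sketch under that assumption, not the patented implementation: the function names, the 3-semitone shift and the 40 Hz offset are all hypothetical. It shows why the second pitch shift does not undo the morph: each partial `f` ends up at `f + offset / ratio`, so the partials stay evenly spaced but are no longer exact multiples of the fundamental.

```python
import random


def pitch_shift(partials_hz, semitones):
    """Pitch shifting scales every partial by the same ratio."""
    ratio = 2.0 ** (semitones / 12.0)
    return [f * ratio for f in partials_hz]


def frequency_shift(partials_hz, offset_hz):
    """Frequency shifting adds the same offset to every partial."""
    return [f + offset_hz for f in partials_hz]


def morph(partials_hz, semitones=3.0, offset_hz=40.0, rng=random):
    """Pitch shift in a random direction, frequency shift, then pitch shift back."""
    direction = rng.choice([-1, 1])  # randomly up or down
    shifted = pitch_shift(partials_hz, direction * semitones)
    shifted = frequency_shift(shifted, offset_hz)
    return pitch_shift(shifted, -direction * semitones)


if __name__ == "__main__":
    print(morph([100.0, 200.0, 300.0], rng=random.Random(0)))
```

Real audio would need an actual pitch-shifting and frequency-shifting signal processor in place of these list operations.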

POST FILTER FOR AUDIO SIGNALS

20210035592 · 2021-02-04

Dolby International AB

In some embodiments, a pitch filter for filtering a preliminary audio signal generated from an audio bitstream is disclosed. The pitch filter has an operating mode selected from one of either: (i) an active mode where the preliminary audio signal is filtered using filtering information to obtain a filtered audio signal, and (ii) an inactive mode where the pitch filter is disabled. The preliminary audio signal is generated in an audio encoder or audio decoder having a coding mode selected from at least two distinct coding modes, and the pitch filter is capable of being selectively operated in either the active mode or the inactive mode while operating in the coding mode based on control information.
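The active/inactive switching described here can be sketched as follows. The abstract does not define the filter itself or the contents of the "filtering information", so the single-tap long-term filter and the `FilterInfo` fields (`lag`, `gain`) are assumptions for illustration only; the boolean `active` stands in for the control information.

```python
from dataclasses import dataclass


@dataclass
class FilterInfo:
    """Hypothetical filtering information: a pitch lag and a gain."""
    lag: int      # pitch lag in samples
    gain: float   # strength of the pitch emphasis


def pitch_filter(signal, info, active):
    """Filter the preliminary signal when active; pass it through when inactive."""
    if not active:
        # Inactive mode: the filter is disabled, so the signal is unchanged.
        return list(signal)
    out = []
    for n, x in enumerate(signal):
        # Assumed filter: add a scaled copy of the sample one pitch lag earlier.
        past = signal[n - info.lag] if n >= info.lag else 0.0
        out.append(x + info.gain * past)
    return out


if __name__ == "__main__":
    info = FilterInfo(lag=2, gain=0.5)
    print(pitch_filter([1.0, 0.0, 0.0, 0.0], info, active=True))
    print(pitch_filter([1.0, 0.0, 0.0, 0.0], info, active=False))
```

The mode is passed in per call, mirroring the abstract's point that the filter can be switched on or off without changing the coding mode.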

ENCODER, DECODER AND METHOD FOR ENCODING AND DECODING AUDIO CONTENT USING PARAMETERS FOR ENHANCING A CONCEALMENT

20240005935 · 2024-01-04

Described are an encoder for coding speech-like content and/or general audio content, wherein the encoder is configured to embed, at least in some frames, parameters in a bitstream, which parameters enhance a concealment in case an original frame is lost, corrupted or delayed, and a decoder for decoding speech-like content and/or general audio content, wherein the decoder is configured to use parameters which are sent later in time to enhance a concealment in case an original frame is lost, corrupted or delayed, as well as a method for encoding and a method for decoding.

Encoder, decoder and method for encoding and decoding audio content using parameters for enhancing a concealment

10878830 · 2020-12-29

Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
