Patent classifications
G10L19/00
Method and system for streaming a multichannel audio signal to a binaural hearing system
There is provided a method for streaming a multichannel audio signal comprising a first channel (L) and a second channel (R) from an audio source device to a binaural hearing system comprising a first hearing device worn at a first ear of a user and a second hearing device worn at a second ear of the user.
METHOD AND SYSTEM OF AUDIO PROCESSING USING COCHLEAR-SIMULATING SPIKE DATA
A method and system of audio processing encodes cochlear-simulating spike data into spectrogram data.
METHOD AND DEVICE FOR GENERATING SPEECH MOVING IMAGE
A device for generating a speech moving image according to an embodiment includes: a first encoder that receives a person background image, which is the video part of the speech moving image of a person with the portion related to the person's speech covered by a mask, extracts an image feature vector from the person background image, and compresses the extracted image feature vector; a second encoder that receives a speech audio signal, which is the audio part of the speech moving image, extracts a voice feature vector from the speech audio signal, and compresses the extracted voice feature vector; a combination unit that generates a combination vector from the compressed image feature vector and the compressed voice feature vector; and an image reconstruction unit that reconstructs the speech moving image of the person with the combination vector as an input.
PERFORMING GLOBAL IMAGE EDITING USING EDITING OPERATIONS DETERMINED FROM NATURAL LANGUAGE REQUESTS
The present disclosure relates to systems, methods, and non-transitory computer-readable media that utilize a neural network having a long short-term memory encoder-decoder architecture to progressively modify a digital image in accordance with a natural language request. For example, in one or more embodiments, the disclosed systems utilize a language-to-operation decoding cell of a language-to-operation neural network to sequentially determine one or more image-modification operations to perform to modify a digital image in accordance with a natural language request. In some cases, the decoding cell determines an image-modification operation to perform partly based on the previously used image-modification operations. The disclosed systems further utilize the decoding cell to determine one or more operation parameters for each selected image-modification operation. The disclosed systems utilize the image-modification operation(s) and operation parameter(s) to modify the digital image (e.g., by generating one or more modified digital images) via the decoding cell.
COMPUTERIZED MONITORING OF DIGITAL AUDIO SIGNALS
A digital audio quality monitoring device uses a deep neural network (DNN) to provide accurate estimates of signal-to-noise ratio (SNR) from a limited set of features extracted from incoming audio. Some embodiments improve the SNR estimate accuracy by selecting a DNN model from a plurality of available models based on a codec used to compress/decompress the incoming audio. Each model has been trained on audio compressed/decompressed by a codec associated with the model, and the monitoring device selects the model associated with the codec used to compress/decompress the incoming audio. Other embodiments are also provided.
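The codec-based model selection described in this abstract can be pictured as a simple lookup followed by inference. The sketch below is illustrative only and is not taken from the patent: the codec names, the `MODELS` registry, and the toy `make_toy_model` stand-in for a trained DNN are all assumptions.

```python
from typing import Callable, Dict, Sequence

# A "model" here is any callable mapping a feature vector to an SNR estimate in dB.
# In a real system each entry would be a DNN trained on audio that passed through
# the corresponding codec; these toy models are hypothetical placeholders.
SnrModel = Callable[[Sequence[float]], float]


def make_toy_model(bias_db: float) -> SnrModel:
    """Build a placeholder model: the mean of the features plus a codec-specific bias."""
    def model(features: Sequence[float]) -> float:
        if not features:
            raise ValueError("at least one feature is required")
        return bias_db + sum(features) / len(features)
    return model


# Hypothetical registry: one model per codec (keys are assumed codec labels).
MODELS: Dict[str, SnrModel] = {
    "opus": make_toy_model(1.0),
    "g711": make_toy_model(-2.0),
}


def estimate_snr(codec: str, features: Sequence[float]) -> float:
    """Select the model associated with the codec, then estimate SNR from the features."""
    model = MODELS.get(codec.lower())
    if model is None:
        raise ValueError(f"no model available for codec {codec!r}")
    return model(features)
```

Keeping one model per codec in a registry means supporting a new codec only requires training and registering one more model, without touching the estimation path.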
METHODS AND SYSTEM FOR WAVEFORM CODING OF AUDIO SIGNALS WITH A GENERATIVE MODEL
Described herein is a method of waveform decoding, the method including the steps of: (a) receiving, by a waveform decoder, a bitstream including a finite bitrate representation of a source signal; (b) waveform decoding the finite bitrate representation of the source signal to obtain a waveform approximation of the source signal; (c) providing the waveform approximation of the source signal to a generative model that implements a probability density function, to obtain a probability distribution for a reconstructed signal of the source signal; and (d) generating the reconstructed signal of the source signal based on the probability distribution. Further described are a method and system for waveform coding and a method of training a generative model.
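Steps (a) to (d) can be traced in a toy pipeline. The sketch below is a loose illustration under strong assumptions, not the patented method: the bitstream is taken to be a list of uniform-quantizer indices, the step size `STEP` is hypothetical, and the "generative model" is replaced by a fixed per-sample Gaussian centered on the waveform approximation, where a real system would use a trained network.

```python
import random
from typing import List, Tuple

STEP = 0.25  # hypothetical uniform quantizer step size


def waveform_decode(bitstream: List[int]) -> List[float]:
    """Step (b): map quantizer indices back to a coarse waveform approximation."""
    return [index * STEP for index in bitstream]


def generative_model(approx: List[float]) -> List[Tuple[float, float]]:
    """Step (c): return a per-sample Gaussian (mean, std) conditioned on the approximation.

    Assumption: the standard deviation is that of uniform quantization error,
    STEP / sqrt(12). A trained model would predict the distribution instead.
    """
    sigma = STEP / 12 ** 0.5
    return [(sample, sigma) for sample in approx]


def reconstruct(bitstream: List[int], rng: random.Random) -> List[float]:
    """Steps (a) to (d): decode the bitstream, obtain a distribution, and sample from it."""
    approx = waveform_decode(bitstream)        # (a) + (b)
    distribution = generative_model(approx)    # (c)
    return [rng.gauss(mean, std) for mean, std in distribution]  # (d)
```

Passing an explicit `random.Random` instance makes the sampling step reproducible, which is convenient when comparing reconstructions of the same bitstream.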
COMPUTER SYSTEM FOR TRANSMITTING AUDIO CONTENT TO REALIZE CUSTOMIZED BEING-THERE AND METHOD THEREOF
Provided are a computer system for transmitting audio content to realize a user-customized being-there and a method thereof. The computer system may be configured to detect audio files that are generated for a plurality of objects at a venue, respectively, and metadata including spatial features that are set for the objects at the venue, respectively, and to transmit the audio files and the metadata to a user. An electronic device of the user may realize a being-there at the venue by rendering the audio files based on the spatial features in the metadata. That is, the user may experience a user-customized being-there as if the user were listening directly to audio signals generated from the corresponding objects at the venue in which the objects are provided.
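Rendering per-object audio from spatial metadata can be illustrated with a minimal stereo mix. This is a sketch under stated assumptions rather than the system in the abstract: each object is assumed to carry mono `samples` and a single `azimuth_deg` spatial feature (both hypothetical field names), and constant-power panning stands in for whatever renderer the user's device actually applies.

```python
import math
from typing import Dict, List, Tuple


def pan_gains(azimuth_deg: float) -> Tuple[float, float]:
    """Constant-power (left, right) gains for an azimuth from -90 (left) to +90 (right)."""
    clamped = max(-90.0, min(90.0, azimuth_deg))
    angle = (clamped + 90.0) / 180.0 * (math.pi / 2)  # maps to 0 .. pi/2
    return math.cos(angle), math.sin(angle)


def render(objects: List[Dict]) -> Tuple[List[float], List[float]]:
    """Mix mono object signals into stereo using each object's azimuth metadata."""
    if not objects:
        return [], []
    length = max(len(obj["samples"]) for obj in objects)
    left = [0.0] * length
    right = [0.0] * length
    for obj in objects:
        gain_left, gain_right = pan_gains(obj["azimuth_deg"])
        for i, sample in enumerate(obj["samples"]):
            left[i] += gain_left * sample
            right[i] += gain_right * sample
    return left, right
```

Because the objects arrive as separate files with their own metadata, the receiving device could change an object's azimuth or gain before mixing, which is what makes the result user-customizable.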