Patent classifications
G10L19/0216
LOW LATENCY AUDIO STREAM ACCELERATION BY SELECTIVELY DROPPING AND BLENDING AUDIO BLOCKS
A method and device for accelerated audio processing in a streaming environment. The method comprises receiving a streaming audio asset, locating a position at which processing of an audio block of the streaming audio asset can be skipped, skipping that audio block, compensating for the skipped audio block, and playing the compensated audio on an audio device.
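The abstract does not say how the position is chosen or how the compensation is done. As a minimal sketch only, assuming the compensation is a linear crossfade across the cut (suggested by "blending" in the title, not confirmed by the abstract), dropping one block could look like this. The function name and the `start`, `length`, and `fade` parameters are hypothetical.

```python
def drop_and_blend(samples, start, length, fade):
    """Drop `length` samples beginning at `start`, then crossfade over the cut.

    Assumption: a linear crossfade stands in for the unspecified
    "compensating" step. The output is `length` samples shorter.
    """
    samples = list(samples)
    if min(start, length, fade) < 0 or start + length + fade > len(samples):
        raise ValueError("block and fade region must lie inside the signal")

    # Audio that was about to play, and audio that follows the dropped block.
    outgoing = samples[start:start + fade]
    incoming = samples[start + length:start + length + fade]

    blended = []
    for i in range(fade):
        w = (i + 1) / (fade + 1)  # weight moves from outgoing to incoming
        blended.append((1.0 - w) * outgoing[i] + w * incoming[i])

    return samples[:start] + blended + samples[start + length + fade:]


ramp = [float(i) for i in range(100)]
shorter = drop_and_blend(ramp, start=40, length=20, fade=5)
print(len(ramp), "->", len(shorter))
```

The crossfade avoids a hard discontinuity at the cut; a real system would also need a rule for where dropping is least audible, which this sketch leaves out.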
Deep learning segmentation of audio using magnitude spectrogram
A method, system, and computer readable medium for decomposing an audio signal into different isolated sources. The techniques and mechanisms convert an audio signal into K input spectrogram fragments. The fragments are passed through a deep neural network that isolates the different sources. The isolated fragments are then combined to form full isolated source audio signals.
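The fragment-and-recombine flow can be sketched as follows. This is an illustration under stated assumptions, not the patented method: the spectrogram is a plain list of frames, fragments are fixed-length and non-overlapping with zero padding, and `placeholder_model` is a hypothetical stand-in for the deep neural network, which the abstract does not describe.

```python
def split_fragments(spec, frag_len):
    """Cut a magnitude spectrogram (list of frames) into fixed-length fragments."""
    if frag_len <= 0:
        raise ValueError("frag_len must be positive")
    n_bins = len(spec[0]) if spec else 0
    fragments = []
    for i in range(0, len(spec), frag_len):
        frag = [list(frame) for frame in spec[i:i + frag_len]]
        while len(frag) < frag_len:      # zero-pad the final fragment
            frag.append([0.0] * n_bins)
        fragments.append(frag)
    return fragments


def join_fragments(fragments, n_frames):
    """Concatenate fragments and trim the padding added by split_fragments."""
    frames = [frame for frag in fragments for frame in frag]
    return frames[:n_frames]


def placeholder_model(fragment):
    """Hypothetical stand-in for the network: splits bins into two 'sources'."""
    low, high = [], []
    for frame in fragment:
        half = len(frame) // 2
        low.append(frame[:half] + [0.0] * (len(frame) - half))
        high.append([0.0] * half + frame[half:])
    return {"low": low, "high": high}


def separate(spec, frag_len, model=placeholder_model):
    """Run the model on every fragment and rebuild one spectrogram per source."""
    outputs = {}
    for frag in split_fragments(spec, frag_len):
        for name, isolated in model(frag).items():
            outputs.setdefault(name, []).append(isolated)
    return {name: join_fragments(frags, len(spec))
            for name, frags in outputs.items()}
```

Turning the isolated magnitude spectrograms back into audio would additionally need phase information and an inverse transform, which are outside this sketch.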
Method, apparatus, and device for transient noise detection
Disclosed are a method, an apparatus, and a device for transient noise detection. The method includes: obtaining an audio frame signal having a preset duration; performing wavelet decomposition on a first audio frame signal to obtain a first wavelet decomposition signal corresponding to the first audio frame signal; determining a first reference audio intensity value of a first sub-wavelet decomposition signal according to reference audio intensity values of all samples in the first sub-wavelet decomposition signal; determining energy distribution information of the first wavelet decomposition signal according to first reference audio intensity values of all sub-wavelet decomposition signals in the first wavelet decomposition signal; and determining a probability that the first audio frame signal is transient noise according to the energy distribution information of the first wavelet decomposition signal.
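The decomposition and energy-distribution steps can be illustrated with a small sketch. Several choices here are assumptions, since the abstract names none of them: a Haar wavelet, the mean absolute coefficient as the per-sub-band "reference audio intensity", and a distribution normalized to sum to 1. The final mapping from distribution to a transient-noise probability is not specified in the abstract and is deliberately left out.

```python
import math


def haar_step(x):
    """One Haar analysis level; a trailing odd sample is ignored."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def haar_decompose(frame, levels):
    """Return sub-bands: detail bands finest first, then the last approximation."""
    bands = []
    approx = list(frame)
    for _ in range(levels):
        if len(approx) < 2:
            break
        approx, detail = haar_step(approx)
        bands.append(detail)
    bands.append(approx)
    return bands


def band_intensity(band):
    """Assumed intensity measure: mean absolute coefficient value."""
    return sum(abs(v) for v in band) / len(band) if band else 0.0


def energy_distribution(frame, levels=3):
    """Share of total intensity held by each sub-band."""
    intensities = [band_intensity(b) for b in haar_decompose(frame, levels)]
    total = sum(intensities)
    if total == 0:
        return [0.0] * len(intensities)
    return [v / total for v in intensities]


steady = [1.0] * 64
click = [0.0] * 64
click[32] = 1.0
print(energy_distribution(steady))  # all intensity in the approximation band
print(energy_distribution(click))   # intensity spread into the detail bands
```

A steady frame concentrates its intensity in the approximation band, while a click pushes a visible share into the detail bands; a detector could compare such distributions, though how the patent does so is not stated here.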
METHOD AND SYSTEM FOR MULTI-TALKER BABBLE NOISE REDUCTION
A system and method for improving the intelligibility of speech are provided. The method may include obtaining an input audio signal frame; classifying the input audio signal frame into a first category, in which the noise is stronger than the speech signal, or a second category, in which the speech signal is stronger than the noise; decomposing the input audio signal frame into a plurality of sub-band components; and de-noising each sub-band component in parallel by applying a first wavelet de-noising method, which includes a first wavelet transform and a predetermined threshold for the sub-band component, and a second wavelet de-noising method, which includes a second wavelet transform and the same predetermined threshold. The predetermined threshold for each sub-band component is based on at least one previous noise-dominant signal frame received by the receiving arrangement.
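To make the thresholding idea concrete, here is a minimal sketch of one wavelet de-noising pass whose threshold comes from an earlier noise-dominant frame. It is heavily simplified and assumption-laden: a single-level Haar transform, soft thresholding of the detail coefficients, and a threshold equal to the largest detail magnitude in the noise frame are all illustrative choices, not details from the abstract. The sub-band split, the second wavelet method, and the frame classifier are omitted.

```python
import math

_S = math.sqrt(2.0)


def haar_forward(x):
    """Single-level Haar analysis of an even-length frame."""
    approx = [(x[i] + x[i + 1]) / _S for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / _S for i in range(0, len(x) - 1, 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Single-level Haar synthesis, the exact inverse of haar_forward."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / _S)
        out.append((a - d) / _S)
    return out


def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; values below t become zero."""
    return [math.copysign(max(abs(v) - t, 0.0), v) for v in coeffs]


def threshold_from_noise(noise_frame):
    """Assumed rule: largest detail magnitude seen in a noise-dominant frame."""
    _, detail = haar_forward(noise_frame)
    return max((abs(v) for v in detail), default=0.0)


def denoise(frame, threshold):
    """Threshold the detail coefficients and reconstruct the frame."""
    approx, detail = haar_forward(frame)
    return haar_inverse(approx, soft_threshold(detail, threshold))


noise = [0.1, -0.1] * 8                 # earlier noise-dominant frame
noisy = [1.0 + n for n in noise]        # later frame: steady signal plus noise
cleaned = denoise(noisy, threshold_from_noise(noise))
print(max(abs(v - 1.0) for v in cleaned))
```

Deriving the threshold from frames already classified as noise-dominant lets it track the current noise floor instead of relying on a fixed constant.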
ADVANCED MAXIMAL ENTROPY MEDIA COMPRESSION PROCESSING
A system and method for compression analyze incoming audio or video data and select a manifold based on that analysis. A deep learning model is then trained for the manifold. The data is broken down into components, and entropy maximization algorithms are applied to each component before compression commences. Finally, the system translates the compressed data into a standard file format.
Information exchange on mobile devices using audio
In some implementations, a user device may receive input that triggers transmission of information via sound. The user device may select an audio clip based on a setting associated with the device, and may modify a digital representation of the selected audio clip using an encoding algorithm and based on data associated with a user of the device. The user device may transmit, to a remote server, an indication of the selected audio clip, an indication of the encoding algorithm, and the data associated with the user. The user device may use a speaker to play audio, based on the modified digital representation, for recording by other devices. Accordingly, the user device may receive, from the remote server and based on the speaker playing the audio, a confirmation that users associated with the other devices have performed an action based on the data associated with the user of the device.