G10H2250/225

Methods for audio signal transient detection and decorrelation control

Some audio processing methods may involve receiving audio data corresponding to a plurality of audio channels and determining audio characteristics of the audio data, which may include transient information. An amount of decorrelation for the audio data may be based, at least in part, on the audio characteristics. If a definite transient event is determined, a decorrelation process may be temporarily halted or slowed. Determining transient information may involve evaluating the likelihood and/or the severity of a transient event. In some implementations, determining transient information may involve evaluating a temporal power variation in the audio data. Explicit transient information may or may not be received with the audio data, depending on the implementation. Explicit transient information may include a transient control value corresponding to a definite transient event, a definite non-transient event or an intermediate transient control value.
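
As a rough illustration of the idea in this abstract, the sketch below derives a transient control value from the block-to-block power variation and uses it to scale back decorrelation. It is a minimal sketch under stated assumptions, not the claimed method: the function names, the block size, the 3 dB and 12 dB thresholds, and the linear mapping are all hypothetical choices made for the example.

```python
import math

def block_powers(samples, block_size):
    """Mean power of each consecutive, non-overlapping block."""
    return [
        sum(x * x for x in samples[i:i + block_size]) / block_size
        for i in range(0, len(samples) - block_size + 1, block_size)
    ]

def transient_control(powers, low_db=3.0, high_db=12.0, floor=1e-12):
    """Map the block-to-block power rise (in dB) to a value in [0, 1].

    0.0 stands for a definite non-transient event, 1.0 for a definite
    transient event, and values in between are intermediate.
    The thresholds are illustrative assumptions.
    """
    controls = [0.0]  # the first block has no predecessor to compare with
    for prev, cur in zip(powers, powers[1:]):
        rise_db = 10.0 * math.log10((cur + floor) / (prev + floor))
        t = (rise_db - low_db) / (high_db - low_db)
        controls.append(min(1.0, max(0.0, t)))
    return controls

def decorrelation_amount(base_amount, control):
    """Reduce decorrelation as the transient control value rises."""
    return base_amount * (1.0 - control)

# A quiet passage followed by a sudden loud burst.
signal = [0.01] * 8 + [1.0] * 4
powers = block_powers(signal, 4)
controls = transient_control(powers)
amounts = [decorrelation_amount(0.8, c) for c in controls]
```

In this example the third block is flagged as a definite transient, so its decorrelation amount drops to zero, which corresponds to temporarily halting the decorrelation process.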

METHODS AND SYSTEMS FOR DETERMINING COMPACT SEMANTIC REPRESENTATIONS OF DIGITAL AUDIO SIGNALS

A method and system for determining a compact semantic representation of a digital audio signal using a computer-based system by calculating at least one low-level feature matrix from the digital audio signal; processing the low-level feature matrix or matrices using pre-trained machine learning engines including an ensemble of modules, wherein each module in the ensemble is trained to predict one of a plurality of high-level feature values; and concatenating the obtained plurality of high-level feature values into a descriptor vector. The calculated descriptor vectors can be used alone, or in an arbitrary or temporally ordered combination with further descriptor vectors calculated from different audio signals extracted from the same music track, as a compact semantic representation of the respective music track.

COMPUTER-IMPLEMENTED METHOD AND DEVICE FOR GENERATING FREQUENCY COMPONENT VECTOR OF TIME-SERIES DATA
20210166128 · 2021-06-03

A computer-implemented method generates a frequency component vector of time series data, by executing a first process and a second process in each unit step. The first process includes: receiving first data; and processing the first data using a first neural network to generate intermediate data. The second process includes: receiving the generated intermediate data; and generating a plurality of component values corresponding to a plurality of frequency bands based on the generated intermediate data such that: a first component value corresponding to a first frequency band is generated using a second neural network based on the generated intermediate data; and a second component value corresponding to a second frequency band different from the first frequency band is generated using the second neural network based on the generated intermediate data and the generated first component value corresponding to the first frequency band.
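
The data flow described in this abstract can be sketched as follows. This is only an illustration of the conditioning order: the two "networks" are hypothetical toy callables standing in for trained neural networks, and the function and parameter names are assumptions made for the example.

```python
def generate_frequency_vector(first_data, first_network, second_network, num_bands):
    """Run one unit step: first process, then band-by-band second process."""
    # First process: turn the received data into intermediate data.
    intermediate = first_network(first_data)

    # Second process: generate one component value per frequency band.
    components = []
    for _ in range(num_bands):
        # Each band is conditioned on the intermediate data and on the
        # component values already generated for the earlier bands.
        components.append(second_network(intermediate, tuple(components)))
    return components

# Toy stand-ins for the two neural networks (assumptions, not real models).
def toy_first_network(data):
    return sum(data) / len(data)

def toy_second_network(intermediate, previous):
    return intermediate + 0.5 * sum(previous)

vector = generate_frequency_vector(
    [1.0, 3.0], toy_first_network, toy_second_network, num_bands=3
)
```

The first band depends on the intermediate data alone, while every later band also sees the values generated before it, matching the first and second component values in the abstract.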

Parameter determination device, method, program and recording medium for determining a parameter indicating a characteristic of sound signal

A parameter determination device includes: a spectral envelope estimating portion performing estimation of a spectral envelope using a parameter η₀ specified in a predetermined method, regarding the η₀-th power of absolute values of a frequency domain sample sequence corresponding to a time-series signal as a power spectrum on the assumption that the parameter η₀ and a parameter η are positive numbers; a whitened spectral sequence generating portion obtaining a whitened spectral sequence which is a sequence obtained by dividing the frequency domain sample sequence by the spectral envelope; and a parameter acquiring portion determining such a parameter η that generalized Gaussian distribution with the parameter η as a shape parameter approximates a histogram of the whitened spectral sequence.
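
The whitening and shape-parameter steps can be illustrated with a short sketch. Note the substitution: instead of fitting a histogram as the abstract describes, this example uses moment matching, a common alternative that compares the ratio of the second moment to the squared first absolute moment with its closed form for a generalized Gaussian. The function names, the search interval, and the sample values are assumptions made for the example.

```python
import math

def whiten(spectrum, envelope):
    """Divide each frequency-domain sample by the spectral envelope."""
    return [x / h for x, h in zip(spectrum, envelope)]

def moment_ratio(shape):
    """E[x^2] / E[|x|]^2 for a generalized Gaussian with the given shape."""
    return math.exp(
        math.lgamma(1.0 / shape)
        + math.lgamma(3.0 / shape)
        - 2.0 * math.lgamma(2.0 / shape)
    )

def estimate_shape(values, lo=0.1, hi=10.0, iterations=60):
    """Estimate the shape parameter by moment matching (bisection search).

    Results are clamped to the assumed search interval [lo, hi].
    """
    m1 = sum(abs(v) for v in values) / len(values)
    m2 = sum(v * v for v in values) / len(values)
    target = m2 / (m1 * m1)
    # moment_ratio decreases as the shape grows, so bisection applies.
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if moment_ratio(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical spectrum and envelope, chosen so the result is easy to check.
spectrum = [2.0, -4.0, 0.0, 0.0]
envelope = [2.0, 4.0, 1.0, 1.0]
whitened = whiten(spectrum, envelope)
shape = estimate_shape(whitened)
```

A shape of 1 corresponds to a Laplacian distribution and a shape of 2 to a Gaussian, so the estimate summarizes how peaked the whitened spectrum is.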

Multi-structural, multi-level information formalization and structuring method, and associated apparatus

Systems and methods for structuring information include determining information quantity (IQ) and information value (IV) in an original digital information file (ODIF). An initial manipulation process applied to the ODIF forms a first resulting DIF (FRDIF), and a subsequent manipulation process applied to the FRDIF forms a second resulting DIF, wherein each manipulation process removes at least one element of the processed DIF and/or represents an element combination with a representative element and a first indicia of an interrelationship between the representative element and one or more elements in the combination, to reduce the IQ of the processed DIF, while retaining the IV thereof within a threshold. Manipulation processes are successively applied to the previously resulting DIF until successive applications do not achieve a threshold reduction in IQ. The last resulting DIF has a primary structure with a reduced IQ and an IV within the threshold of the original IV.

PARAMETER DETERMINATION DEVICE, METHOD, PROGRAM AND RECORDING MEDIUM

A parameter determination device includes: a spectral envelope estimating portion performing estimation of a spectral envelope using a parameter η₀ specified in a predetermined method, regarding the η₀-th power of absolute values of a frequency domain sample sequence corresponding to a time-series signal as a power spectrum on the assumption that the parameter η₀ and a parameter η are positive numbers; a whitened spectral sequence generating portion obtaining a whitened spectral sequence which is a sequence obtained by dividing the frequency domain sample sequence by the spectral envelope; and a parameter acquiring portion determining such a parameter η that generalized Gaussian distribution with the parameter η as a shape parameter approximates a histogram of the whitened spectral sequence.

Enhanced chroma extraction from an audio codec

The present document relates to methods and systems for music information retrieval (MIR). In particular, the present document relates to methods and systems for extracting a chroma vector from an audio signal. A method (900) for determining a chroma vector (100) for a block of samples of an audio signal (301) is described. The method (900) comprises receiving (901) a corresponding block of frequency coefficients derived from the block of samples of the audio signal (301) from a core encoder (412) of a spectral band replication based audio encoder (410) adapted to generate an encoded bitstream (305) of the audio signal (301) from the block of frequency coefficients; and determining (904) the chroma vector (100) for the block of samples of the audio signal (301) based on the received block of frequency coefficients.
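
To make the final step concrete, the sketch below folds a block of frequency coefficients into a 12-bin chroma vector. It is a simplified illustration, not the method of the abstract: it assumes MDCT-style bin centers at (k + 0.5) · fs / (2N), equal-tempered tuning with A4 = 440 Hz, and a hypothetical frequency range of 55 Hz to 4000 Hz, and it does not model the core encoder or spectral band replication at all.

```python
import math

def chroma_vector(coeffs, sample_rate, f_min=55.0, f_max=4000.0):
    """Accumulate coefficient energy per pitch class (index 0 = C, 9 = A)."""
    n = len(coeffs)
    chroma = [0.0] * 12
    for k, c in enumerate(coeffs):
        # Assumed center frequency of bin k for a block of n coefficients.
        freq = (k + 0.5) * sample_rate / (2.0 * n)
        if not f_min <= freq <= f_max:
            continue
        # Nearest equal-tempered semitone, expressed as a MIDI note number.
        midi = 69.0 + 12.0 * math.log2(freq / 440.0)
        chroma[int(round(midi)) % 12] += c * c
    total = sum(chroma)
    # Normalize so the vector sums to 1 when any energy is present.
    return [v / total for v in chroma] if total > 0 else chroma

# A block with energy only in the bin closest to 440 Hz.
block = [0.0] * 1024
block[20] = 1.0
chroma = chroma_vector(block, sample_rate=44100)
```

With a 44.1 kHz sample rate and 1024 coefficients, bin 20 is centered near 441 Hz, so all of the energy lands in the pitch class A.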

Audio signal processing methods and systems
09570057 · 2017-02-14

Described are methods and systems of identifying one or more fundamental frequency component(s) of an audio signal. The methods and systems may include any one or more of an audio event receiving step, a signal discretization step, a masking step, and/or a transcription step.

MULTI-STRUCTURAL, MULTI-LEVEL INFORMATION FORMALIZATION AND STRUCTURING METHOD, AND ASSOCIATED APPARATUS

Systems and methods for structuring information include determining information quantity (IQ) and information value (IV) in an original digital information file (ODIF). An initial manipulation process applied to the ODIF forms a first resulting DIF (FRDIF), and a subsequent manipulation process applied to the FRDIF forms a second resulting DIF, wherein each manipulation process removes at least one element of the processed DIF and/or represents an element combination with a representative element and a first indicia of an interrelationship between the representative element and one or more elements in the combination, to reduce the IQ of the processed DIF, while retaining the IV thereof within a threshold. Manipulation processes are successively applied to the previously resulting DIF until successive applications do not achieve a threshold reduction in IQ. The last resulting DIF has a primary structure with a reduced IQ and an IV within the threshold of the original IV.