Patent classifications
G10L19/022
GROUPING AND TRANSPORT OF AUDIO OBJECTS
An apparatus for audio signal processing of audio objects within at least one audio scene, the apparatus comprising at least one processor configured to: define, for at least one time period, at least one contextual grouping comprising at least two of a plurality of audio objects, with at least one further audio object of the plurality of audio objects outside of the at least one contextual grouping, the plurality of audio objects being within the at least one audio scene; and define, with respect to the at least one contextual grouping, at least one first parameter and/or parameter rule type which is configured to be applied with respect to a common element associated with the at least two of the plurality of audio objects, wherein the at least one first parameter and/or parameter rule type is configured to be applied with respect to an individual element associated with the at least one further audio object outside of the at least one contextual grouping, the at least one first parameter and/or parameter rule type being applied in audio rendering of both the at least two of the plurality of audio objects and the at least one further audio object.
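As a rough illustration of the distinction drawn above, the following sketch applies one parameter rule to a common element for grouped objects and to an individual element for an object outside the grouping. Everything specific here is an assumption made for illustration and is not taken from the abstract: the rule is a simple distance-based gain, the "common element" is the mean position of the group members, and the "individual element" is the object's own position. The names `distance_gain` and `render_gains` are hypothetical.

```python
import math
from typing import Dict, Iterable, Tuple

Position = Tuple[float, float, float]


def distance_gain(source: Position, listener: Position) -> float:
    """Hypothetical parameter rule: inverse-distance gain, clamped at 1.0."""
    return 1.0 / max(math.dist(source, listener), 1.0)


def render_gains(
    positions: Dict[str, Position],
    group: Iterable[str],
    listener: Position,
) -> Dict[str, float]:
    """Apply the same rule to a common element (grouped objects) or to an
    individual element (objects outside the grouping)."""
    members = set(group)
    if len(members) < 2 or not members <= positions.keys():
        raise ValueError("a grouping needs at least two known audio objects")

    # Assumed common element: the mean position of the grouped objects.
    common = tuple(
        sum(positions[name][axis] for name in members) / len(members)
        for axis in range(3)
    )
    group_gain = distance_gain(common, listener)

    return {
        name: group_gain if name in members else distance_gain(pos, listener)
        for name, pos in positions.items()
    }
```

With this sketch, all members of the grouping receive one shared gain, while the object outside the grouping receives a gain computed from its own position.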
Audio object clustering by utilizing temporal variations of audio objects
Embodiments of the present invention relate to audio object clustering by utilizing temporal variation of audio objects. There is provided a method of estimating temporal variation of an audio object for use in audio object clustering. The method comprises: obtaining at least one segment of an audio track associated with the audio object, the at least one segment containing the audio object; estimating variation of the audio object over a time duration of the at least one segment based on at least one property of the audio object; and adjusting, at least partially based on the estimated variation of the audio object, a contribution of the audio object to the determination of a centroid in the audio object clustering. A corresponding system and computer program product are also disclosed.
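The steps of this method can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the patent: the property used is the object's position, variation is measured as the mean frame-to-frame displacement, and a higher variation reduces the object's contribution to the centroid (the abstract does not state the direction of the adjustment). The function names and the `alpha` parameter are hypothetical.

```python
from typing import Sequence, Tuple

Position = Tuple[float, float, float]


def temporal_variation(positions: Sequence[Position]) -> float:
    """Mean frame-to-frame displacement over a segment (assumed measure)."""
    if len(positions) < 2:
        return 0.0
    total = 0.0
    for prev, cur in zip(positions, positions[1:]):
        total += sum((c - p) ** 2 for p, c in zip(prev, cur)) ** 0.5
    return total / (len(positions) - 1)


def contribution_weight(loudness: float, variation: float, alpha: float = 1.0) -> float:
    """Assumed adjustment: objects that vary more contribute less."""
    return loudness / (1.0 + alpha * variation)


def weighted_centroid(positions: Sequence[Position], weights: Sequence[float]) -> Position:
    """Centroid of object positions, weighted by adjusted contributions."""
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("weights must sum to a positive value")
    return tuple(
        sum(w * p[axis] for p, w in zip(positions, weights)) / total
        for axis in range(3)
    )
```

A static object and a moving object of equal loudness would then pull the centroid unequally, with the static object dominating.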
Subband Block Based Harmonic Transposition
The present document relates to audio source coding systems which make use of a harmonic transposition method for high frequency reconstruction (HFR), as well as to digital effect processors, e.g. exciters, where the generation of harmonic distortion adds brightness to the processed signal, and to time stretchers, where a signal's duration is prolonged while its spectral content is maintained. A system and method configured to generate a time stretched and/or frequency transposed signal from an input signal are described. The system comprises an analysis filterbank configured to provide an analysis subband signal from the input signal, wherein the analysis subband signal comprises a plurality of complex valued analysis samples, each having a phase and a magnitude. Furthermore, the system comprises a subband processing unit configured to determine a synthesis subband signal from the analysis subband signal using a subband transposition factor Q and a subband stretch factor S. The subband processing unit performs block based nonlinear processing, wherein the magnitudes of samples of the synthesis subband signal are determined from the magnitudes of corresponding samples of the analysis subband signal and the magnitude of a predetermined sample of the analysis subband signal. In addition, the system comprises a synthesis filterbank configured to generate the time stretched and/or frequency transposed signal from the synthesis subband signal.
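The block based nonlinear processing described above might be sketched as below for a single block of complex analysis samples. The abstract only states that synthesis magnitudes depend on the corresponding analysis sample and on a predetermined sample; the rest is assumed for illustration: the predetermined sample is taken to be the block's center sample, the magnitudes are combined as a weighted geometric mean with a hypothetical weight `rho`, and the phase is offset by `(Q * S - 1)` times the phase of the center sample. The function name `process_block` is also hypothetical.

```python
import cmath
from typing import List


def process_block(block: List[complex], q: float, s: float, rho: float = 0.5) -> List[complex]:
    """Nonlinear processing of one block of complex analysis subband samples.

    Assumptions (not from the abstract): the predetermined sample is the
    center of the block, and the phase/magnitude rules are as coded below.
    """
    if not block:
        raise ValueError("block must contain at least one sample")
    center = block[len(block) // 2]
    ref_mag = abs(center)
    ref_phase = cmath.phase(center)

    out = []
    for x in block:
        # Magnitude from the corresponding sample and the predetermined sample.
        mag = (ref_mag ** rho) * (abs(x) ** (1.0 - rho))
        # Assumed phase rule: offset by (Q*S - 1) times the reference phase.
        phase = cmath.phase(x) + (q * s - 1.0) * ref_phase
        out.append(cmath.rect(mag, phase))
    return out
```

In a full system this function would sit between the analysis and synthesis filterbanks and would be applied to overlapping blocks of each subband, followed by windowing and overlap-add; those stages are omitted here.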
SOUND SIGNAL ENCODING METHOD, SOUND SIGNAL ENCODER, PROGRAM, AND RECORDING MEDIUM
Embedded encoding is provided in which the algorithmic delay of stereo coding/decoding is no larger than that of monaural coding/decoding. An encoding device (100) encodes a sound signal having a plurality of channels. A stereo encoding unit (110) obtains and outputs a stereo code representing a characteristic of the difference between channels of the sound signal. A downmix unit (150) obtains a downmix signal by mixing the channels of the sound signal. A monaural encoding unit (120) encodes the downmix signal by an encoding scheme that includes applying a window having overlap between frames, to obtain and output a monaural code. An additional encoding unit (130) encodes the part of the downmix signal in the section corresponding to the overlap between the current frame and the immediately following frame, to obtain and output an additional code.
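The signal flow into the monaural and additional encoding units can be illustrated with a small sketch. The details are assumptions for illustration only: the downmix is taken to be the per-sample average of two channels, and the overlap section is taken to be the last `overlap` samples of the current frame. The actual coding of each part is not shown, and both function names are hypothetical.

```python
from typing import List, Sequence


def downmix(left: Sequence[float], right: Sequence[float]) -> List[float]:
    """Assumed downmix: per-sample average of two channels."""
    if len(left) != len(right):
        raise ValueError("channels must have the same length")
    return [0.5 * (l + r) for l, r in zip(left, right)]


def overlap_section(frame: Sequence[float], overlap: int) -> List[float]:
    """Part of the current frame that overlaps the following frame's window
    (assumed here to be the tail of the frame)."""
    if not 0 < overlap <= len(frame):
        raise ValueError("overlap must be between 1 and the frame length")
    return list(frame[-overlap:])
```

Under these assumptions, the whole downmixed frame would go to the monaural encoder, while only the output of `overlap_section` would go to the additional encoder.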
Data carriage in encoded and pre-encoded audio bitstreams
A method for a machine or group of machines to carry watermark data in an encoded audio data frame of an audio signal includes receiving the encoded audio data frame having encoded therein a portion of the audio signal. The encoded audio data frame includes a plurality of data blocks, including a synchronization information block, at least one encoded data block, and an error check block. The method further includes receiving watermark data that has been modified based on a masking threshold analysis of the audio signal, and transforming the encoded audio data frame into a modified encoded audio data frame.
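The frame layout described above (synchronization block, encoded data blocks, error check block) can be modeled with a short sketch. The abstract does not say where the watermark data is placed or which error check is used, so the following is assumed purely for illustration: the watermark is appended as an extra data block, and the error check is a CRC-32 recomputed over the modified frame. `EncodedFrame`, `make_frame` and `carry_watermark` are hypothetical names.

```python
import zlib
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class EncodedFrame:
    sync: bytes                     # synchronization information block
    data_blocks: Tuple[bytes, ...]  # encoded data blocks
    error_check: int                # error check block (CRC-32 stand-in)


def checksum(sync: bytes, data_blocks: Iterable[bytes]) -> int:
    """Stand-in error check: CRC-32 over sync plus all data blocks."""
    return zlib.crc32(sync + b"".join(data_blocks))


def make_frame(sync: bytes, data_blocks: Iterable[bytes]) -> EncodedFrame:
    blocks = tuple(data_blocks)
    return EncodedFrame(sync, blocks, checksum(sync, blocks))


def carry_watermark(frame: EncodedFrame, watermark: bytes) -> EncodedFrame:
    """Return a modified frame carrying the watermark (assumed placement:
    appended as an extra block), with the error check recomputed."""
    blocks = frame.data_blocks + (watermark,)
    return EncodedFrame(frame.sync, blocks, checksum(frame.sync, blocks))
```

The original frame is left untouched, so a caller can compare the frame before and after the transformation.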
Systems and methods for capturing noise for pattern recognition processing
Systems and methods provide a first sample of audio data and detect speech onset in the first sample of the audio data. Responsive to detecting the speech onset, systems and methods switch from capturing second samples of the audio data at first intervals, to capturing the second samples of the audio data at second intervals. Systems and methods provide contiguous audio data using the second samples of the audio data captured at the first intervals and at least one captured portion of the second samples of the audio data captured at the second intervals.
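The switch between capture intervals can be sketched as a simple schedule. This is a simplified illustration under assumptions not stated in the abstract: speech onset is detected by comparing a per-chunk energy value against a threshold, the first intervals correspond to capturing every `idle_stride`-th chunk, and the second intervals correspond to capturing every chunk. The function name `capture_schedule` is hypothetical.

```python
from typing import List, Sequence


def capture_schedule(
    energies: Sequence[float],
    threshold: float,
    idle_stride: int = 4,
) -> List[int]:
    """Return the indices of chunks that would be captured.

    Before onset, chunks are captured sparsely (every idle_stride-th chunk).
    Once a chunk's energy reaches the threshold (assumed onset detector),
    every subsequent chunk is captured.
    """
    if idle_stride < 1:
        raise ValueError("idle_stride must be at least 1")
    captured = []
    onset_detected = False
    for index, energy in enumerate(energies):
        if not onset_detected and energy >= threshold:
            onset_detected = True
        if onset_detected or index % idle_stride == 0:
            captured.append(index)
    return captured
```

The sparse chunks captured before onset stand in for the background noise samples, and the dense chunks after onset stand in for the speech that is passed on for pattern recognition.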