G10H2250/015

CHARACTERIZING AUDIO USING TRANSCHROMAGRAMS
20190096371 · 2019-03-28

Methods, systems and apparatus to characterize audio using transchromagrams are disclosed. An example apparatus includes a transchromagram generator to generate a data structure based on a set of transition matrices corresponding to a plurality of time frames of audio data, the data structure indicative of probabilities that first musical notes will transition to second musical notes, a database controller to prompt a database to store the data structure within the audio data, and a notification manager to generate, based on a comparison between query audio data and the stored data structure of the audio data, a notification identifying at least one characteristic of the query audio data.

Intuitive Music Visualization Using Efficient Structural Segmentation
20180374459 · 2018-12-27

Embodiments of the present invention relate to automatically identifying structures of a music stream. A segment structure may be generated that visually indicates repeating segments of a music stream. To generate a segment structure, a feature that corresponds to a music attribute is extracted from a waveform corresponding to the music stream, such as an input signal. Utilizing a signal segmentation algorithm, such as a Variable Markov Oracle (VMO) algorithm, a symbolized signal, such as a VMO structure, is generated. From the symbolized signal, a matrix is generated. The matrix may be, for instance, a VMO-SSM. A segment structure is then generated from the matrix. The segment structure illustrates a segmentation of the music stream and the segments that are repetitive.
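The step from a symbolized signal to a matrix and then to repeating segments can be illustrated with a minimal Python sketch. It does not implement the VMO algorithm: the symbol sequence is assumed to be given, the matrix is a plain binary self-similarity matrix, and the names `self_similarity` and `repeated_segments` are hypothetical.

```python
def self_similarity(symbols):
    """Binary self-similarity matrix: 1 where two frames share a symbol."""
    return [[1 if a == b else 0 for b in symbols] for a in symbols]


def repeated_segments(ssm, min_len=2):
    """Find diagonal runs above the main diagonal of the matrix.

    Each run is reported as (first_start, second_start, length), meaning
    the frames starting at first_start repeat at second_start.
    """
    n = len(ssm)
    found = []
    for lag in range(1, n):
        i = 0
        while i + lag < n:
            if ssm[i][i + lag]:
                start = i
                # Extend the run while the diagonal stays set.
                while i + lag < n and ssm[i][i + lag]:
                    i += 1
                if i - start >= min_len:
                    found.append((start, start + lag, i - start))
            else:
                i += 1
    return found


# Assumed toy input: one symbol per analysis frame.
symbols = list("ABCABC")
segments = repeated_segments(self_similarity(symbols))
```

For the toy sequence above, the only run lies on the diagonal with lag 3, so the sketch reports that frames 0 to 2 repeat at frames 3 to 5.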

Characterizing audio using transchromagrams
10147407 · 2018-12-04

Methods, systems and apparatus to characterize audio using transchromagrams are disclosed. An example method includes generating, by executing one or more instructions on a processor, a set of transition matrices based on a plurality of time frames of the audio data, each of the plurality of transition matrices generated based on a different pair of time frames in the plurality of time frames, and indicating probabilities that anterior musical notes in an anterior time frame of the pair transition to posterior musical notes in a posterior time frame of the pair, generating, by executing one or more instructions on a processor, a data structure representing how the audio data changes statistically between the plurality of time frames based on the set of transition matrices, and causing, by executing one or more instructions on a processor, a database to store the data structure within metadata that describes the audio data.
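A minimal Python sketch of one way such a data structure could be built is shown here. The abstract does not give the formulas, so the following are assumptions made for illustration: each time frame is a vector of note energies, the transition matrix for a pair of consecutive frames is the outer product of the two normalized frames, and the final data structure is the mean of those matrices. The function names are hypothetical.

```python
def normalize(frame):
    """Scale a frame of note energies so that it sums to 1."""
    total = sum(frame)
    if total <= 0:
        return [0.0] * len(frame)
    return [value / total for value in frame]


def transition_matrix(anterior, posterior):
    """Assumed form: entry [i][j] pairs note i (anterior) with note j (posterior)."""
    a, p = normalize(anterior), normalize(posterior)
    return [[ai * pj for pj in p] for ai in a]


def transchromagram(frames):
    """Average the transition matrices of all consecutive frame pairs."""
    if len(frames) < 2:
        raise ValueError("need at least two time frames")
    pairs = list(zip(frames, frames[1:]))
    n = len(frames[0])
    result = [[0.0] * n for _ in range(n)]
    for anterior, posterior in pairs:
        matrix = transition_matrix(anterior, posterior)
        for i in range(n):
            for j in range(n):
                result[i][j] += matrix[i][j] / len(pairs)
    return result


# Toy input with three note bins instead of twelve, for readability.
frames = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
summary = transchromagram(frames)
```

In the toy input, note 0 moves to note 1 and note 1 moves to note 2, so `summary[0][1]` and `summary[1][2]` each hold 0.5 and every other entry is zero.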

Intuitive music visualization using efficient structural segmentation

Embodiments of the present invention relate to automatically identifying structures of a music stream. A segment structure may be generated that visually indicates repeating segments of a music stream. To generate a segment structure, a feature that corresponds to a music attribute is extracted from a waveform corresponding to the music stream, such as an input signal. Utilizing a signal segmentation algorithm, such as a Variable Markov Oracle (VMO) algorithm, a symbolized signal, such as a VMO structure, is generated. From the symbolized signal, a matrix is generated. The matrix may be, for instance, a VMO-SSM. A segment structure is then generated from the matrix. The segment structure illustrates a segmentation of the music stream and the segments that are repetitive.

TRACKING BEATS AND DOWNBEATS OF VOICES IN REAL TIME
20240395231 · 2024-11-28

The present disclosure describes techniques for tracking beats and downbeats of audio, such as human voices, in real time. Audio may be received in real time. The audio may be split into a sequence of segments. A sequence of audio features representing the sequence of segments of the audio may be extracted. A continuous sequence of activations indicative of probabilities of beats or downbeats occurring in the sequence of segments of the audio may be generated using a machine learning model with causal mechanisms. Timings of the beats or the downbeats occurring in the sequence of segments of the audio may be determined based on the continuous sequence of activations by fusing local rhythmic information with respect to each instant segment with information indicative of beats or downbeats in previous segments among the sequence of segments.
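The final step, turning a sequence of activations into beat timings without looking ahead, can be illustrated with a deliberately simple Python sketch. It is a stand-in, not the disclosed method: the machine learning model is omitted and its activations are assumed to be given, and the fusion of local and past information is reduced to a threshold on the current activation combined with a minimum gap since the previous detected beat. The name `track_beats` and its parameters are hypothetical.

```python
def track_beats(activations, threshold=0.5, min_gap=3):
    """Causal beat picking: each decision uses only current and past frames.

    A frame is marked as a beat when its activation reaches the threshold
    (local information) and at least min_gap frames have passed since the
    last detected beat (information from previous segments).
    """
    beats = []
    for t, activation in enumerate(activations):
        far_enough = not beats or t - beats[-1] >= min_gap
        if activation >= threshold and far_enough:
            beats.append(t)
    return beats


# Assumed toy activations, one value per audio segment.
activations = [0.1, 0.9, 0.8, 0.2, 0.1, 0.7, 0.95, 0.1]
beat_frames = track_beats(activations)
```

Because the loop never reads a future frame, the sketch may mark the rising edge of a peak rather than its maximum, which is the usual price of a causal decision.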

Music modeling

A computer implemented method is provided for generating a prediction of a next musical note by a computer having at least a processor and a memory. A computer processor system is also provided for generating a prediction of a next musical note. The method includes storing sequential musical notes in the memory. The method further includes dividing, by the processor, the sequential musical notes into sections of a given length based on a Generative Theory of Tonal Music. The method also includes generating, by the processor, the prediction of the next musical note based upon a music model, the sections, and the sequential musical notes stored in the memory. The given length is determined based on one or more conditions.
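The flow of storing notes, dividing them into sections, and predicting the next note can be sketched in Python under loud assumptions. The Generative Theory of Tonal Music grouping is not implemented; fixed-length sections stand in for it. The music model is assumed to be a first-order transition count table built within sections. All function names are hypothetical.

```python
from collections import Counter, defaultdict


def split_sections(notes, length):
    """Stand-in for GTTM-based grouping: cut into fixed-length sections."""
    return [notes[i:i + length] for i in range(0, len(notes), length)]


def build_model(sections):
    """Count note-to-note transitions, never across a section boundary."""
    model = defaultdict(Counter)
    for section in sections:
        for current, following in zip(section, section[1:]):
            model[current][following] += 1
    return model


def predict_next(notes, length=4):
    """Return the most frequent successor of the last stored note, or None."""
    model = build_model(split_sections(notes, length))
    followers = model.get(notes[-1])
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Assumed toy melody stored as note names.
melody = ["C", "D", "E", "C", "C", "D", "E", "C", "C", "D"]
prediction = predict_next(melody)
```

With the toy melody, the note D is followed by E in both complete sections, so the prediction for the note after the final D is E.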

CHARACTERIZING AUDIO USING TRANSCHROMAGRAMS
20180061382 · 2018-03-01

Methods, systems and apparatus to characterize audio using transchromagrams are disclosed. An example method includes generating, by executing one or more instructions on a processor, a set of transition matrices based on a plurality of time frames of the audio data, each of the plurality of transition matrices generated based on a different pair of time frames in the plurality of time frames, and indicating probabilities that anterior musical notes in an anterior time frame of the pair transition to posterior musical notes in a posterior time frame of the pair, generating, by executing one or more instructions on a processor, a data structure representing how the audio data changes statistically between the plurality of time frames based on the set of transition matrices, and causing, by executing one or more instructions on a processor, a database to store the data structure within metadata that describes the audio data.

CROWD-SOURCED TECHNIQUE FOR PITCH TRACK GENERATION
20180018949 · 2018-01-18

Digital signal processing and machine learning techniques can be employed in a vocal capture and performance social network to computationally generate vocal pitch tracks from a collection of vocal performances captured against a common temporal baseline such as a backing track or an original performance by a popularizing artist. In this way, crowd-sourced pitch tracks may be generated and distributed for use in subsequent karaoke-style vocal audio captures or other applications. Large numbers of performances of a song can be used to generate a pitch track. Computationally determined pitch trackings from individual audio signal encodings of the crowd-sourced vocal performance set are aggregated and processed as an observation sequence of a trained Hidden Markov Model (HMM) or other statistical model to produce an output pitch track.
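The aggregation idea can be illustrated with a short Python sketch. It swaps the trained HMM for a much simpler technique, a per-frame median across performances, and assumes that every performance is already aligned to the common temporal baseline and encoded as one pitch value per frame, with `None` for unvoiced frames. The name `aggregate_pitch_tracks` is hypothetical.

```python
from statistics import median


def aggregate_pitch_tracks(tracks):
    """Per-frame median over aligned pitch tracks (not an HMM).

    Unvoiced frames (None) are ignored; a frame with no voiced
    observation at all stays None in the output.
    """
    output = []
    for frame_values in zip(*tracks):
        voiced = [value for value in frame_values if value is not None]
        output.append(median(voiced) if voiced else None)
    return output


# Assumed toy data: three performances, MIDI note numbers per frame.
performances = [
    [60, 62, None],
    [60, 63, 64],
    [61, 62, 64],
]
pitch_track = aggregate_pitch_tracks(performances)
```

The median discards isolated outliers such as a single off-pitch singer, but unlike an HMM it does not model how pitch evolves from one frame to the next.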

Method for following a musical score and associated modeling method
09865241 · 2018-01-09

A method for following a musical score in real time. At least one sound emitted by a performer is recorded. At least one chromatic vector is estimated. The chromatic vector is compared with theoretical chromatic vectors of the musical score. A transition between the chromatic vector and a previous chromatic vector is compared with theoretical transitions of the musical score. A work position of the performer, depending on a previous work position, is estimated from the comparison of the chromatic vector and the comparison of the transition. The recording is carried out for a suitable period depending on the ratio between a period of the transition and a reference period.
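The two comparisons and the position estimate can be sketched in Python as follows. The abstract does not specify the similarity measure or how the comparisons are combined, so the sketch assumes cosine similarity, a plain sum of the two scores, and a search limited to a few positions ahead of the previous one. The adaptive recording period is not modelled. The names `cosine`, `difference`, and `follow` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def difference(u, v):
    return [a - b for a, b in zip(u, v)]


def follow(score, prev_pos, prev_obs, obs, max_jump=2):
    """Estimate the new score position from the previous one.

    score: theoretical chromatic vectors, one per score position.
    prev_obs, obs: previous and current estimated chromatic vectors.
    """
    best_pos, best_value = prev_pos, float("-inf")
    last = min(prev_pos + max_jump, len(score) - 1)
    for k in range(prev_pos, last + 1):
        # Comparison of the chromatic vector with the theoretical one.
        chroma_sim = cosine(obs, score[k])
        # Comparison of the observed transition with the theoretical one.
        transition_sim = cosine(difference(obs, prev_obs),
                                difference(score[k], score[prev_pos]))
        value = chroma_sim + transition_sim
        if value > best_value:
            best_pos, best_value = k, value
    return best_pos


# Toy score with three chroma bins instead of twelve.
score = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
position = follow(score, prev_pos=0, prev_obs=[1, 0, 0], obs=[0, 0.9, 0.1])
```

In the toy case the observation is closest to the second theoretical vector, and the observed change also matches the theoretical transition from position 0 to position 1, so the estimated position advances by one.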

Comparison training for music generator

Techniques are disclosed relating to automatically generating new music content based on image representations of audio files. A music generation system includes a music generation subsystem and a music classification subsystem. The music generation subsystem may generate output music content according to music parameters that define policy for generating music. The classification subsystem may be used to classify whether music is generated by the music generation subsystem or is professionally produced music content. The music generation subsystem may implement an algorithm that is reinforced by prediction output from the music classification subsystem. Reinforcement may include tuning the music parameters to generate more human-like music content.