G10H2250/015

Audio processing method and audio processing apparatus, and training method

An audio processing method, an audio processing apparatus, and a training method are described. According to embodiments of the application, an accent identifier is used to identify accent frames from a plurality of audio frames, resulting in an accent sequence comprising probability scores of accent and/or non-accent decisions with respect to the plurality of audio frames. A tempo estimator is then used to estimate a tempo sequence of the plurality of audio frames based on the accent sequence. The embodiments adapt well to changes in tempo and can further be used to track beats properly.
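The step from an accent sequence to a tempo sequence can be illustrated with a minimal sketch. It assumes a plain autocorrelation over per-frame accent probabilities, evaluated in sliding windows; the abstract does not say which estimator the embodiments use, so the function names, the BPM range, and the windowing below are hypothetical.

```python
def estimate_tempo_bpm(accent, frame_rate, min_bpm=60.0, max_bpm=180.0):
    """Estimate one tempo (BPM) from accent probabilities via autocorrelation.

    Assumption: the strongest autocorrelation lag inside the allowed tempo
    range is taken as the beat period. This is an illustrative choice only.
    """
    n = len(accent)
    mean = sum(accent) / n
    centered = [a - mean for a in accent]
    # Lags (in frames) that correspond to the allowed tempo range.
    min_lag = max(1, int(round(frame_rate * 60.0 / max_bpm)))
    max_lag = min(n - 1, int(round(frame_rate * 60.0 / min_bpm)))
    if max_lag < min_lag:
        raise ValueError("accent sequence too short for the tempo range")
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation; it slightly favors shorter lags,
        # which helps avoid picking a multiple of the true beat period.
        score = sum(centered[i] * centered[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return 60.0 * frame_rate / best_lag


def tempo_sequence(accent, frame_rate, window, hop):
    """Estimate a tempo per sliding window so that tempo changes show up."""
    return [
        estimate_tempo_bpm(accent[start:start + window], frame_rate)
        for start in range(0, len(accent) - window + 1, hop)
    ]
```

For example, an accent sequence with a peak every 50 frames at 100 frames per second yields 120 BPM, and switching to a peak every 40 frames moves later windows to 150 BPM.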

APPARATUS AND METHOD FOR GENERATING VISUAL CONTENT FROM AN AUDIO SIGNAL
20170337913 · 2017-11-23

An apparatus and method for generating visual content from an audio signal are described. The method includes receiving (310) audio content, processing (320) the audio content to separate it into a first portion and a second portion, converting (330) the second portion into visual content, delaying (340) the first portion based on a time relationship between the audio content and the visual content, the delaying accounting for the time to process the first portion and convert the second portion, and providing (350) the visual content and audio content for reproduction. The apparatus includes a source separation module (210) that processes the received audio content to separate it into a first portion and a second portion, a converter module (220) that converts the second portion into visual content, and a synchronization module (230) that delays the first portion based on a time relationship between the audio content and the visual content.

Music modeling

A computer implemented method is provided for generating a prediction of a next musical note by a computer having at least a processor and a memory. A computer processor system is also provided for generating a prediction of a next musical note. The method includes storing sequential musical notes in the memory. The method further includes generating, by the processor, the prediction of the next musical note based upon a music model and the sequential musical notes stored in the memory. The method also includes updating, by the processor, the music model based upon the prediction of the next musical note and an actual one of the next musical note. The method additionally includes resetting, by the processor, the memory at fixed time intervals.
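The store, predict, update, and reset cycle described above can be sketched as follows. This is only an illustration under stated assumptions: the music model here is a first-order transition count table, and the reset interval is counted in observed notes rather than wall-clock time. The class and method names are hypothetical and not taken from the patent.

```python
from collections import Counter, defaultdict


class NextNotePredictor:
    """Toy next-note predictor with a memory that is reset periodically."""

    def __init__(self, reset_interval):
        self.model = defaultdict(Counter)  # previous note -> counts of next notes
        self.memory = []                   # sequential notes since the last reset
        self.reset_interval = reset_interval
        self.steps = 0
        self.hits = 0                      # number of correct predictions so far

    def predict(self):
        """Predict the next note from the model and the stored notes."""
        if not self.memory:
            return None
        followers = self.model.get(self.memory[-1])
        return followers.most_common(1)[0][0] if followers else None

    def observe(self, actual):
        """Compare the prediction with the actual note, then update the model."""
        predicted = self.predict()
        if predicted == actual:
            self.hits += 1
        if self.memory:
            self.model[self.memory[-1]][actual] += 1
        self.memory.append(actual)
        self.steps += 1
        # Reset the memory (not the model) at a fixed interval.
        if self.steps % self.reset_interval == 0:
            self.memory.clear()
        return predicted
```

Feeding a repeating pattern such as 60, 62, 64 lets the predictor start returning the correct next note once each transition has been seen once.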

Crowd-sourced technique for pitch track generation

Digital signal processing and machine learning techniques can be employed in a vocal capture and performance social network to computationally generate vocal pitch tracks from a collection of vocal performances captured against a common temporal baseline such as a backing track or an original performance by a popularizing artist. In this way, crowd-sourced pitch tracks may be generated and distributed for use in subsequent karaoke-style vocal audio captures or other applications. Large numbers of performances of a song can be used to generate a pitch track. Computationally determined pitch trackings from individual audio signal encodings of the crowd-sourced vocal performance set are aggregated and processed as an observation sequence of a trained Hidden Markov Model (HMM) or other statistical model to produce an output pitch track.
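To make the aggregation step concrete, the sketch below treats the per-frame pitch estimates from several performances as a joint observation and decodes a single pitch track with the Viterbi algorithm. It is a simplified stand-in: the emission and transition scores are hand-set absolute-difference penalties, not the parameters of a trained HMM, and the function name and `jump_penalty` parameter are hypothetical.

```python
def viterbi_pitch_track(observations, states, jump_penalty=1.0):
    """Decode one pitch per frame from crowd-sourced pitch estimates.

    observations[t] is a list of pitch estimates (one per performance) at
    frame t; states is the list of candidate output pitches.
    Assumption: log-scores are negative absolute differences (hand-set).
    """
    def emission(state, frame):
        return -sum(abs(p - state) for p in frame)

    def transition(prev_state, state):
        return -jump_penalty * abs(prev_state - state)

    scores = [emission(s, observations[0]) for s in states]
    backpointers = []
    for frame in observations[1:]:
        new_scores, pointers = [], []
        for s in states:
            # Best predecessor for state s at this frame.
            candidates = [scores[i] + transition(states[i], s)
                          for i in range(len(states))]
            best_i = max(range(len(states)), key=candidates.__getitem__)
            pointers.append(best_i)
            new_scores.append(candidates[best_i] + emission(s, frame))
        scores = new_scores
        backpointers.append(pointers)

    # Trace the best path backwards from the best final state.
    idx = max(range(len(states)), key=scores.__getitem__)
    path = [idx]
    for pointers in reversed(backpointers):
        idx = pointers[idx]
        path.append(idx)
    path.reverse()
    return [states[i] for i in path]
```

In this setup a single off-pitch performance in one frame is outvoted by the other performances and by the penalty on large pitch jumps.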

INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND INFORMATION PROCESSING PROGRAM
20220230104 · 2022-07-21

An information processing apparatus according to the present disclosure includes a generation unit that generates a model regarding generation of content by using data provided by a user subject of a service regarding creation of the content, the user subject having one authority level among a plurality of authority levels of the service, and a determination unit that determines a usage mode of the model generated by the generation unit according to the one authority level of the user subject.

Chord identification method and chord identification apparatus
11322124 · 2022-05-03

A chord identification method selects from among a plurality of chord identifiers a chord identifier that corresponds to an attribute of a piece of music represented by an audio signal, where the plurality of chord identifiers corresponds to respective ones of a plurality of attributes relating to pieces of music; and identifies a chord for the audio signal by applying a feature amount of the audio signal to the selected chord identifier.
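The two steps, selecting an identifier by attribute and then applying it to a feature amount, might look like the following sketch. It assumes the attribute is a genre label, the feature amount is a 12-bin chroma vector, and each identifier is a simple template matcher with its own chord vocabulary. None of these choices come from the patent, and all names are hypothetical.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def make_template_identifier(chord_types):
    """Build a chord identifier for one chord vocabulary.

    chord_types maps a chord suffix to its intervals in semitones,
    e.g. {"": (0, 4, 7), "m": (0, 3, 7)}.
    """
    templates = {}
    for root in range(12):
        for suffix, intervals in chord_types.items():
            templates[NOTE_NAMES[root] + suffix] = {(root + i) % 12 for i in intervals}

    def identify(chroma):
        def score(pitch_classes):
            # Mean energy on chord tones minus mean energy on all other tones.
            inside = sum(chroma[p] for p in pitch_classes) / len(pitch_classes)
            outside = sum(chroma[p] for p in range(12) if p not in pitch_classes)
            return inside - outside / (12 - len(pitch_classes))
        return max(templates, key=lambda name: score(templates[name]))

    return identify


# One identifier per attribute (here: hypothetical genre labels).
IDENTIFIERS = {
    "pop": make_template_identifier({"": (0, 4, 7), "m": (0, 3, 7)}),
    "jazz": make_template_identifier(
        {"maj7": (0, 4, 7, 11), "m7": (0, 3, 7, 10), "7": (0, 4, 7, 10)}
    ),
}


def identify_chord(chroma, attribute):
    """Select the identifier that matches the attribute, then apply it."""
    return IDENTIFIERS[attribute](chroma)
```

A learned classifier per attribute would slot into `IDENTIFIERS` in the same way; the template matcher is used here only to keep the example self-contained.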

Generative composition with texture groups
20230146658 · 2023-05-11

A computer-implemented method of generating a musical composition containing a plurality of musical texture groups is disclosed. The method includes assembling musical texture groups from musical instrument components and associating therewith a tag expressing emotional textural connotation. The instrument components have musical textural classifiers selected from a set of pre-defined textural classifiers such that different instrument components may have a different subset of pre-defined textural classifiers. The textural classifiers within a texture group possess either no musical feature attribute or a single musical feature attribute and any number of musical accompaniment attributes. The method then generates at least one chord scheme to a narrative brief, to provide an emotional connotation to a series of events, the chord scheme generated by selecting and assembling Form Atoms. The final step includes applying a texture to the chord scheme to generate the musical composition reflecting the narrative brief.

FORM ATOM HEURISTICS AND GENERATIVE COMPOSITION
20230197041 · 2023-06-22

A Form Atom defined by self-contained constructional properties representing a historical corpus of music and contained within metadata of the Form Atom is disclosed. The Form Atom has a generative set of heuristics to support generation of a set of chords in a chord scheme or many different sets of chords. The generated chords are spaced out within a defined window of musical time by chord spacer heuristics. The Form Atom has a tag describing its compositional heuristics. A chord list of the Form Atom is provided in local tonic and defines branching structures that may be used for the generation of different chords from the local tonic. A progression descriptor is combined with a form function such that the Form Atom expresses musically a question, an answer and a statement. A meta-map of a chord scheme for a musical section is created from the metadata.

GENERATIVE COMPOSITION USING FORM ATOM HEURISTICS
20230206890 · 2023-06-29

A processor-based method of producing a generative musical composition is disclosed herein. The method includes the step of receiving a briefing narrative which describes a musical journey by referencing a plurality of emotional descriptions related to a plurality of musical sections. The generative musical composition is assembled with regard to the briefing narrative through the selection and concatenation of Form Atoms with tags that align with the emotional descriptions related to the musical sections. The Form Atoms, which have a compositional nature aligned with the emotional descriptions and self-contained constructional properties representative of the historical corpus of music, are then selected and substituted into the generative composition. The method further involves the step of generating the musical composition by mapping musical transitions between selectively chosen Form Atoms to reflect pre-established transitions between Form Atoms and groups of Form Atoms that have been identified to have similar tags but different constructional properties.

SYSTEM AND METHODS FOR AUTOMATICALLY GENERATING A MUSICAL COMPOSITION HAVING AUDIBLY CORRECT FORM
20220319478 · 2022-10-06

A generative composition system reduces existing musical artefacts to constituent elements termed “Form Atoms”. These Form Atoms may each be of varying length and have musical properties and associations that link together through Markov chains. To provide myriad new compositions, a set of heuristics ensures that musical textures between concatenated musical sections follow a supplied and defined briefing narrative for the new composition, whilst contiguous concatenated Form Atoms are also automatically selected so that similarities in respective identified attributes of musical textures for those musical sections are maintained to support maintenance of musical form. Independent aspects of the disclosure further ensure that, within the composition work, such as a media product or a real-time audio stream, chord spacing determination and control are practiced to maintain musical sense in the new composition. Further, a structuring of primitive heuristics operates to maintain pitch and permit key transformation. The system and its functionality provide signal analysis and music generation by allowing emotional connotations to be specified and reproduced from cross-referenced Form Atoms.
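As a rough illustration of Form Atoms linked through Markov chains and steered by a briefing narrative, the sketch below picks one atom per emotional tag in the brief, sampling each successor from the weighted transitions of the previous atom. The data layout (a tag per atom, a weight per transition) and every name are assumptions made for this example; the disclosure's actual heuristics are considerably richer.

```python
import random


def compose(atoms, transitions, brief, rng):
    """Pick one Form Atom per brief tag by walking a Markov chain.

    atoms:        atom name -> emotional tag (hypothetical layout)
    transitions:  atom name -> {next atom name: weight}
    brief:        list of emotional tags, one per musical section
    rng:          a random.Random instance, passed in for reproducibility
    """
    sequence = []
    for tag in brief:
        if sequence:
            # Only successors of the previous atom that carry the wanted tag.
            options = {name: weight
                       for name, weight in transitions.get(sequence[-1], {}).items()
                       if atoms[name] == tag}
        else:
            # First section: any atom with the wanted tag, equally weighted.
            options = {name: 1.0 for name, t in atoms.items() if t == tag}
        if not options:
            raise ValueError(f"no Form Atom tagged {tag!r} is reachable")
        names = list(options)
        weights = [options[name] for name in names]
        sequence.append(rng.choices(names, weights=weights, k=1)[0])
    return sequence
```

Calling `compose(atoms, transitions, ["calm", "tense", "triumphant"], random.Random(0))` returns a list of atom names whose tags follow the brief in order, provided the transition table allows such a path.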