Patent classifications
G10H2210/056
Audio stem identification systems and methods
Methods, systems, and computer program products are provided for identifying an audio stem. Audio stems (t_1, …, t_N) are stored in a stem database, and songs (S_1, …, S_P) made from at least a subset of the audio stems (t_1, …, t_N) are stored in a song database. An at least partially composed song (S*) having a predetermined number (k) of pre-selected stems is received. In turn, a probability vector (or relevance value, or ranking) is produced for each stem (t_1, …, t_N), indicating how complementary that stem is to the at least partially composed song (S*).
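As a rough illustration of the final step only, a probability vector over candidate stems can be produced by scoring each stem against the partial song and normalizing. Here, cosine similarity over hypothetical feature vectors stands in for the patent's (unspecified) scoring model:

```python
import math

def stem_relevance(partial_song_features, stem_features):
    """Score each candidate stem against a partially composed song S*.

    Returns a probability vector over stems: softmax of cosine
    similarities. Feature vectors are assumed inputs, not part of
    the patent's actual method.
    """
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb) if na and nb else 0.0

    sims = [cosine(partial_song_features, f) for f in stem_features]
    exps = [math.exp(s) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]
```

A stem whose features align with the partial song receives the largest probability; the vector sums to one by construction.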
IMAGE CONTROL SYSTEM AND METHOD FOR CONTROLLING IMAGE
An image control system includes an estimation circuit, an obtaining circuit, and an image control circuit, each of which may be implemented by at least one processor. The estimation circuit receives an output from a trained model in response to an input of an acoustic signal indicating a performance of a musical piece, and estimates musical performance information based on that output. The musical performance information relates the performance indicated by the acoustic signal to the musical piece. The obtaining circuit obtains manipulation information associated with a playback of an image. The image control circuit controls the playback of the image based on the musical performance information and, upon obtaining the manipulation information, controls the playback based on the manipulation information.
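A minimal sketch of the two control paths (estimated performance position drives playback by default; manipulation information takes precedence once obtained). The beat-to-frame mapping and all names are illustrative assumptions, not the patent's circuits:

```python
class ImageController:
    """Toy image playback controller with two control paths."""

    def __init__(self, frames_per_beat=4):
        self.frames_per_beat = frames_per_beat
        self.beat_position = 0.0
        self.manipulation_frame = None

    def on_performance_info(self, beat_position):
        # Estimated score position from the (hypothetical) trained model.
        self.beat_position = beat_position

    def on_manipulation(self, frame_index):
        # User manipulation, e.g. a scrub position.
        self.manipulation_frame = frame_index

    def current_frame(self):
        # Manipulation information, once obtained, overrides the
        # performance-driven position.
        if self.manipulation_frame is not None:
            return self.manipulation_frame
        return int(self.beat_position * self.frames_per_beat)
```

Playback follows the performance until the user intervenes, matching the abstract's two "controls the playback" clauses.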
COMPUTING ORDERS OF MODELED EXPECTATION ACROSS FEATURES OF MEDIA
A method implemented by a determination engine is provided. The determination engine receives a media dataset comprising target piece music information, target piece audience information, corpus music information, corpus audience information, and corpus preference data. The determination engine determines a subset of the corpus music and preference information and determines at least one surprise factor of that subset across features at one of a plurality of orders. The determination engine learns a model that estimates the likelihood that time-varying surprise trends across the features achieve a preference level. The determination engine then determines at least one surprise factor of the target piece music information across the features at the same order and predicts, using the model, preference information from the time-varying surprise trends for the target piece music information across the features.
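The abstract leaves "surprise at an order" abstract. One simple reading, sketched here purely as an assumption, treats order-n surprise of a feature time series as the magnitude of its n-th difference (order 1 = change, order 2 = change of change, and so on):

```python
def surprise_factors(series, order=1):
    """Hypothetical surprise factors at a given order: absolute
    n-th differences of a feature time series."""
    diffs = list(series)
    for _ in range(order):
        # Take successive differences once per order.
        diffs = [b - a for a, b in zip(diffs, diffs[1:])]
    return [abs(d) for d in diffs]
```

For the series `[1, 2, 4, 7]`, order-1 surprise grows (`[1, 2, 3]`) while order-2 surprise is flat (`[1, 1]`), illustrating how different orders expose different trends.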
METHOD AND DEVICE FOR DETERMINING MIXING PARAMETERS BASED ON DECOMPOSED AUDIO DATA
The present invention provides a method for processing audio data comprising the following steps: providing a first audio track of mixed input data, the mixed input data representing an audio signal containing a plurality of different timbres; decomposing the mixed input data to obtain decomposed data representing an audio signal containing at least one, but not all, of the plurality of different timbres; providing a second audio track; analyzing audio data, including at least the decomposed data, to determine at least one mixing parameter; and generating an output track based on the at least one mixing parameter, the output track comprising first output data obtained from the first audio track and second output data obtained from the second audio track.
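One concrete, assumed example of such a mixing parameter: a gain that levels the second track against the loudness of the decomposed timbre, using RMS as the loudness proxy (the patent does not specify which parameter is determined):

```python
import math

def rms(samples):
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def matching_gain(decomposed, second_track):
    """Illustrative mixing parameter: the gain that matches the
    second track's RMS to the decomposed timbre's RMS."""
    target, current = rms(decomposed), rms(second_track)
    return target / current if current else 1.0
```

Applying the returned gain to the second track before summing yields an output track whose two components sit at comparable levels.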
Methods and systems for vocalist part mapping
Systems and methods for mapping parts in a digital sheet music file for a harmony. The method may include receiving a selection of a music segment for part mapping, receiving a digital sheet music representation of the selected music segment, and determining a plurality of plausible part mappings for the digital sheet music representation. A part mapping identifies one or more distinct musical parts in the digital sheet music representation, each of said one or more distinct musical parts corresponding to a performer of the harmony. The method may also include analyzing one or more features of the plurality of plausible part mappings to identify the highest-probability part mapping based on previously stored information, and outputting that part mapping.
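A toy version of scoring plausible part mappings: candidate assignments of SATB parts to the notes of a single chord are scored by how well each note fits a hypothetical comfortable range, and the best-scoring mapping wins — a stand-in for the patent's feature analysis over a whole score:

```python
from itertools import permutations

RANGES = {  # hypothetical comfortable MIDI ranges per part
    "soprano": (60, 81),
    "alto": (55, 74),
    "tenor": (48, 69),
    "bass": (40, 62),
}

def score_mapping(mapping, notes):
    """Fraction of notes that lie inside their assigned part's range."""
    ok = sum(1 for part, pitch in zip(mapping, notes)
             if RANGES[part][0] <= pitch <= RANGES[part][1])
    return ok / len(notes)

def best_mapping(notes):
    """Highest-scoring assignment of the four parts to a four-note chord."""
    return max(permutations(RANGES), key=lambda m: score_mapping(m, notes))
```

For the chord (E5, A4, E4, D3) given as MIDI pitches (76, 69, 64, 50), the top-down soprano/alto/tenor/bass assignment fits every range and is selected.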
Audio contribution identification system and method
A system for identifying the contribution of a given sound source to a composite audio track. The system comprises: an audio input unit operable to receive an input composite audio track comprising two or more sound sources, including the given sound source; an audio generation unit operable to generate, using a model of the sound source, an approximation of its contribution to the composite audio track; an audio comparison unit operable to compare the generated audio to at least a portion of the composite audio track and determine whether the approximation meets a threshold degree of similarity; and an audio identification unit operable to identify, when the threshold is met, the generated audio as a suitable representation of the contribution of the sound source to the composite audio track.
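The generate-compare-threshold loop can be sketched as follows; the source model, parameter grid, and similarity measure (one minus normalized absolute error) are all assumptions for illustration:

```python
def similarity(a, b):
    """Crude similarity: 1 minus absolute error normalized by b's energy."""
    err = sum(abs(x - y) for x, y in zip(a, b))
    denom = sum(abs(y) for y in b) or 1.0
    return max(0.0, 1.0 - err / denom)

def identify_contribution(mix, source_model, params_grid, threshold=0.9):
    """Render candidates from the source model; accept the first one
    whose similarity to (a portion of) the mix meets the threshold."""
    for p in params_grid:
        candidate = source_model(p)
        if similarity(candidate, mix) >= threshold:
            return candidate
    return None
```

Returning `None` when no candidate passes corresponds to the system finding no suitable representation at the given threshold.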
METHOD AND SYSTEM FOR INTERACTIVE SONG GENERATION
A method and system may provide for interactive song generation. In one aspect, a computer system may present options for selecting a background track and may generate suggested lyrics based on parameters entered by the user. User interface elements allow the computer system to receive input of lyrics; as the user inputs lyrics, the computer system may update its suggestions based on the previously input lyrics. In addition, the computer system may generate proposed melodies to go with the lyrics and the background track, and the user may select from among the melodies created for each portion of lyrics. The computer system may optionally generate computer-synthesized vocals or capture a vocal track of a human voice singing the song. The background track, lyrics, melodies, and vocals may be combined to produce a complete song without requiring musical training or experience on the part of the user.
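The "update suggestions based on previously input lyrics" step might look like the following toy ranker, which prefers candidate lines that rhyme (crudely, by matching the final two letters) with the last line entered. The candidate pool and the rhyme test are assumptions, not the patent's lyric model:

```python
def suggest_lyrics(previous_lines, candidates):
    """Rank candidate next lines, preferring ones whose final word
    crudely rhymes with the last previously entered line."""
    if not previous_lines:
        return list(candidates)
    tail = previous_lines[-1].split()[-1][-2:].lower()
    # Stable sort: rhyming candidates (key False) come first.
    return sorted(candidates,
                  key=lambda c: c.split()[-1][-2:].lower() != tail)
```

Each time the user commits a line, re-ranking the pool against it gives the updating behavior the abstract describes.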
AUTOMATIC ISOLATION OF MULTIPLE INSTRUMENTS FROM MUSICAL MIXTURES
A system, method, and computer program product for training a neural network system. The method comprises inputting an audio signal to the system to generate plural outputs f(X, Θ). The audio signal includes vocal content and/or musical instrument content, and each output f(X, Θ) corresponds to a respective one of the different content types. The method also comprises comparing individual outputs f(X, Θ) of the neural network system to corresponding target signals. For each compared output f(X, Θ), at least one parameter of the system is adjusted to reduce the result of the comparison for that output, thereby training the system to estimate the different content types. In one example embodiment, the system comprises a U-Net architecture. After training, the system can estimate various types of vocal and/or instrument components of an audio signal, depending on which type of component(s) it is trained to estimate.
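The compare-and-adjust loop can be illustrated with a toy in place of the U-Net: each source is estimated as a scalar weight times the mix, and one gradient step on the squared error per source adjusts the parameters, mirroring the training procedure described. The model and learning rate are assumptions:

```python
def train_step(params, mix, targets, lr=0.1):
    """One parameter update: source i is estimated as params[i] * mix
    (a scalar stand-in for the U-Net outputs f(X, Theta)); each weight
    moves down the mean squared-error gradient toward its target."""
    updated = []
    for w, target in zip(params, targets):
        grad = sum(2 * (w * x - t) * x for x, t in zip(mix, target)) / len(mix)
        updated.append(w - lr * grad)
    return updated
```

Iterating the step drives each weight toward the value that best reconstructs its target signal from the mix, the toy analogue of the network learning to estimate each content type.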
METHOD AND DEVICE FOR PROCESSING, PLAYING AND/OR VISUALIZING AUDIO DATA, PREFERABLY BASED ON AI, IN PARTICULAR DECOMPOSING AND RECOMBINING OF AUDIO DATA IN REAL-TIME
The present invention relates to a method for processing and playing audio data comprising the steps of receiving mixed input data and playing recombined output data. Furthermore, the invention relates to a device for processing and playing audio data, preferably DJ equipment, comprising an audio input unit for receiving a mixed input signal, a recombination unit, and a playing unit for playing recombined output data. In addition, the present invention relates to a method and a device for representing audio data, for example on a display.
Audio matching with semantic audio recognition and report generation
Example articles of manufacture and apparatus for producing supplemental information for audio signature data are disclosed herein. An example apparatus includes memory including computer readable instructions and a processor to execute the instructions to at least: obtain first audio signature data associated with a first time period of media; obtain first semantic signature data associated with the first time period and second semantic signature data associated with a second time period of the media; and, when second audio signature data associated with the second time period is unavailable, identify the media based on the first audio signature data, provided the second semantic signature data associated with the second time period matches the first semantic signature data associated with the first time period.
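The fallback logic in the last clause can be sketched as a small lookup; the signature values and the database are hypothetical:

```python
def identify_media(db, audio_sig_1, sem_sig_1, audio_sig_2, sem_sig_2):
    """Prefer the second period's audio signature; when it is
    unavailable, reuse the first period's audio signature only if
    the two periods' semantic signatures match."""
    if audio_sig_2 is not None:
        return db.get(audio_sig_2)
    if sem_sig_2 == sem_sig_1:
        return db.get(audio_sig_1)
    return None
```

The semantic-signature match acts as a guard: it gives confidence that the second period belongs to the same media before extrapolating the first period's identification.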