Patent classifications
G10H2210/061
Method and system for learning and using latent-space representations of audio signals for audio content-based retrieval
A method and system are provided for extracting features from digital audio signals which exhibit variations in pitch, timbre, decay, reverberation, and other psychoacoustic attributes, and for learning, from the extracted features, an artificial neural network model that generates contextual latent-space representations of digital audio signals. A method and system are also provided for learning an artificial neural network model that generates consistent latent-space representations of digital audio signals, such that the generated representations are comparable for the purpose of determining psychoacoustic similarity between digital audio signals. A method and system are also provided for extracting features from digital audio signals and learning, from the extracted features, an artificial neural network model that generates latent-space representations which select salient attributes of the signals representing psychoacoustic differences between them.
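Once latent-space representations have been learned, content-based retrieval reduces to comparing vectors. The sketch below uses hypothetical pre-computed embeddings and cosine similarity as the comparison measure; the abstract does not specify either the embedding model or the similarity function, so both are illustrative assumptions:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Psychoacoustic-similarity proxy: cosine of the angle between
    two latent-space vectors (higher = more similar)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query: np.ndarray, catalog: dict, k: int = 3) -> list:
    """Rank catalog entries (name -> latent vector) by similarity to
    the query embedding and return the top-k names."""
    scored = sorted(catalog.items(),
                    key=lambda kv: cosine_similarity(query, kv[1]),
                    reverse=True)
    return [name for name, _ in scored[:k]]
```

In practice the latent vectors would be produced by the trained neural network from extracted audio features; here they are stand-in arrays.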
System and Method For Reusable Digital Video Templates Incorporating Cumulative Sequential Iteration Technique In Music Education
Disclosed is a method of presenting a media file enabling a user to emulate musical content therein. The media file comprises a plurality of segments, each segment representing a demonstration of at least part of a musical phrase comprising one or more musical notes of a piece of music. The method comprises presenting a first segment of the plurality of segments for emulation by the user. The method further comprises subsequently presenting the first segment followed by a second segment of the plurality of segments for emulation by the user. The method further comprises subsequently presenting the previously presented segments followed by additional segments until all of the plurality of segments have been presented for emulation by the user.
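The cumulative sequential iteration described above (present segment 1, then segments 1–2, then 1–2–3, and so on until all segments have been presented) can be sketched as a simple generator; the segment values are placeholders:

```python
def cumulative_presentation_order(segments):
    """Yield the cumulative presentation sequence: [s1], [s1, s2],
    [s1, s2, s3], ... until all segments are included, so the user
    emulates each new segment together with everything already learned."""
    for i in range(1, len(segments) + 1):
        yield segments[:i]
```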
Method, device and software for controlling transport of audio data
A method for processing music audio data, including providing input audio data representing a first piece of music comprising a mixture of musical timbres. The method also includes decomposing the input audio data to generate at least first-timbre decomposed data representing a first timbre selected from the musical timbres of the first piece of music, and second-timbre decomposed data representing a second timbre selected from the musical timbres of the first piece of music. The method also includes applying a transport control to obtain transport controlled first-timbre decomposed data. The method also includes recombining audio data obtained from the transport controlled first-timbre decomposed data with audio data obtained from the second-timbre decomposed data to obtain recombined audio data.
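As a rough illustration of the decompose / transport-control / recombine pipeline, the sketch below loops a region of one decomposed stem (looping being one possible transport control) and then sums it with the other stem. The timbre decomposition itself (e.g. by source separation) is assumed to have already produced the per-timbre sample arrays:

```python
import numpy as np

def apply_transport_loop(stem: np.ndarray, start: int, end: int,
                         repeats: int) -> np.ndarray:
    """Transport-control sketch: repeat the sample region [start, end)
    of one decomposed stem a given number of times."""
    region = stem[start:end]
    return np.concatenate([stem[:start], np.tile(region, repeats), stem[end:]])

def recombine(*stems: np.ndarray) -> np.ndarray:
    """Recombine decomposed stems by zero-padding to a common length
    and summing the sample arrays."""
    n = max(len(s) for s in stems)
    out = np.zeros(n)
    for s in stems:
        out[:len(s)] += s
    return out
```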
Music modeling
A computer implemented method is provided for generating a prediction of a next musical note by a computer having at least a processor and a memory. A computer processor system is also provided for generating a prediction of a next musical note. The method includes storing sequential musical notes in the memory. The method further includes generating, by the processor, the prediction of the next musical note based upon a music model and the sequential musical notes stored in the memory. The method also includes updating, by the processor, the music model based upon the prediction of the next musical note and an actual one of the next musical note. The method additionally includes resetting, by the processor, the memory at fixed time intervals.
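A minimal sketch of this predict / update / reset loop, assuming a simple bigram count model as the music model (the abstract does not specify the model, so the bigram choice is purely illustrative):

```python
from collections import defaultdict, deque

class NoteModel:
    """Sketch of online next-note prediction: a bigram count model
    updated from each (prediction, actual) pair, with the note memory
    reset at fixed intervals as in the abstract."""

    def __init__(self, reset_interval: int = 16):
        self.counts = defaultdict(lambda: defaultdict(int))
        self.memory = deque()          # sequential musical notes
        self.reset_interval = reset_interval
        self.steps = 0

    def predict(self):
        """Predict the next note from the most recent note in memory."""
        if not self.memory:
            return None
        followers = self.counts[self.memory[-1]]
        return max(followers, key=followers.get) if followers else None

    def observe(self, actual):
        """Update the model with the actual next note, then apply the
        fixed-interval memory reset."""
        if self.memory:
            self.counts[self.memory[-1]][actual] += 1
        self.memory.append(actual)
        self.steps += 1
        if self.steps % self.reset_interval == 0:
            self.memory.clear()  # reset the memory at fixed intervals
```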
DJ stem systems and methods
Systems and methods selectively mix a first and a second song together during a live performance. The first song has a plurality of first stems, each having stereo audio, that combine to form the audio of the first song. The second song has a plurality of second stems, each having stereo audio, that combine to form the audio of the second song. A computer, with memory and a processor, executes machine readable instructions of a multiple channel audio mixing application stored within the memory. The multiple channel audio mixing application plays and mixes audio of at least one of the first stems with audio of at least one of the second stems. The multiple channel audio mixing application is controlled in real time during the performance to select the at least one first stem and the at least one second stem for the mixing.
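A sketch of the stem-selection and mixing step, representing each stereo stem as an (n_samples, 2) NumPy array; the stem names, selections, and per-stem gains are hypothetical stand-ins for the performer's real-time control:

```python
import numpy as np

def mix_stems(first_stems: dict, second_stems: dict,
              first_selection: list, second_selection: list,
              gains: dict = None) -> np.ndarray:
    """Sum the selected stereo stems from two songs into one stereo mix.
    Each stem is an (n_samples, 2) array; optional per-stem gains stand
    in for live mixer levels."""
    gains = gains or {}
    selected = [(name, first_stems[name]) for name in first_selection] + \
               [(name, second_stems[name]) for name in second_selection]
    n = max(stem.shape[0] for _, stem in selected)
    mix = np.zeros((n, 2))
    for name, stem in selected:
        mix[:stem.shape[0]] += gains.get(name, 1.0) * stem
    return mix
```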
MUSICAL PERFORMANCE ASSISTANCE APPARATUS AND METHOD
A musical score acquisition section acquires musical score information representing a musical score of a music piece to be practiced by a user. A target portion setting section sets a plurality of portions of the music piece as target portions for training. A target musical score acquisition section acquires target musical score information indicative of partial musical scores of the respective target portions set by the target portion setting section. On the basis of the acquired target musical score information, a display control section controls a display device to display two or more of the target musical scores in a side-by-side arrangement. Thus, the user can easily grasp the plurality of target portions for training.
Performance training apparatus and method
For each sound of a model performance, performance information designating a sound generation timing and a sound is supplied, and for each of a plurality of phrases into which the model performance is divided, intensity information indicative of an intensity of sound for the phrase is supplied. In accordance with the progression of the performance time, and for each phrase of the model performance, the intensity information is acquired ahead of the start timing at which performance of the phrase is to begin, and the intensity of sound common to the sounds in the phrase is presented, in a visual or audible manner, based on the acquired intensity information. In this way, a human player can know, through a visual display and/or an audible sound, the intensity of the key depression operation for each phrase of the model performance before starting the phrase, and can thus practice the performance while being aware of the intensity of sound for each phrase.
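The look-ahead presentation of intensity information can be sketched as computing, for each phrase, the cue time at which its intensity should be presented; the lookahead interval is an assumed parameter, since the abstract only states that the cue precedes the phrase start:

```python
def intensity_cues(phrases, lookahead):
    """Given (start_time, intensity) pairs for each phrase of the model
    performance, return (cue_time, intensity) pairs so that each phrase's
    intensity is presented `lookahead` seconds before the phrase starts
    (clamped to time zero for the first phrase)."""
    return [(max(0.0, start - lookahead), intensity)
            for start, intensity in phrases]
```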
INFORMATION PROCESSING METHOD, IMAGE PROCESSING APPARATUS, AND PROGRAM
[Object] To propose an information processing method, information processing apparatus, and program capable of stirring the emotions of a viewer more effectively. [Solution] An information processing method including: analyzing a beat of input music; extracting a plurality of unit images from an input image; and generating, by a processor, editing information for switching the extracted unit images in time with the analyzed beat.
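A sketch of generating the editing information, assuming the beat times have already been produced by the beat analysis and that the extracted unit images are simply cycled at each beat (the cycling policy is an illustrative assumption; the abstract does not specify how images are assigned to beats):

```python
def edit_list(beat_times, unit_images):
    """Build editing information as (switch_time, image) pairs: at each
    analyzed beat, switch to the next unit image, cycling through the
    extracted images."""
    return [(t, unit_images[i % len(unit_images)])
            for i, t in enumerate(beat_times)]
```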
Audiovisual collaboration system and method with seed/join mechanic
User interface techniques provide user vocalists with mechanisms for seeding subsequent performances by other users (e.g., joiners). A seed may be full-length, spanning much or all of a pre-existing audio (or audiovisual) work and mixing in a user's captured media content for at least some portions of the work to seed further contributions of one or more joiners. A short seed may span less than all (and in some cases much less than all) of the audio (or audiovisual) work; for example, a verse, chorus, refrain, hook, or other limited "chunk" of the work may constitute a seed. A seeding user's call invites other users to join the full-length or short-form seed by singing along, singing a particular vocal part or musical section, singing harmony or another duet part, rapping, talking, clapping, recording video, adding a video clip from the camera roll, etc. The resulting group performance, whether full-length or just a chunk, may be posted, livestreamed, or otherwise disseminated in a social network.