G10H2240/141

Systems, methods, smart musical instruments, computer readable media for music score matching

The present disclosure relates to a system and method for matching a performance with a score. The method may include acquiring performance information in a preset time period, wherein the performance information is related to a musical device. The method may also include analyzing the performance information and obtaining a played music score in the preset time period, wherein the played music score contains the performance information. The method may further include comparing the played music score with one or more standard music scores. The method may still further include identifying a standard music score from the one or more standard music scores based on the comparison of the played music score with the one or more standard music scores, wherein a matching degree between the played music score and the identified standard music score reaches a preset value.
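The comparison step can be illustrated with a minimal sketch. It assumes that a played score and each standard score are plain sequences of MIDI pitch numbers and that the "matching degree" is a sequence similarity ratio; neither assumption comes from the abstract. The names `matching_degree`, `identify_score`, and `STANDARD_SCORES`, and the preset value of 0.8, are hypothetical.

```python
from difflib import SequenceMatcher


def matching_degree(played, standard):
    """Return a similarity ratio in [0, 1] between two note sequences."""
    return SequenceMatcher(None, played, standard, autojunk=False).ratio()


def identify_score(played, standard_scores, preset_value=0.8):
    """Return (name, degree) for the best-matching standard score if its
    matching degree reaches preset_value; otherwise return None."""
    best_name, best_degree = None, 0.0
    for name, notes in standard_scores.items():
        degree = matching_degree(played, notes)
        if degree > best_degree:
            best_name, best_degree = name, degree
    if best_name is not None and best_degree >= preset_value:
        return best_name, best_degree
    return None


# Hypothetical library of standard scores (assumption, not from the source).
STANDARD_SCORES = {
    "ode_to_joy": [64, 64, 65, 67, 67, 65, 64, 62],
    "twinkle": [60, 60, 67, 67, 69, 69, 67],
}

# Played score captured in the preset time period, with a wrong final note.
played = [64, 64, 65, 67, 67, 65, 64, 60]
result = identify_score(played, STANDARD_SCORES)
```

A real system would likely also compare rhythm and tolerate tempo drift; this sketch only compares pitch order.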

Enhanced graphical user interface for voice communications
11574633 · 2023-02-07

Enhanced graphical user interfaces for transcription of audio and video messages are disclosed. Audio data may be transcribed, and the transcription may include emphasized words and/or punctuation corresponding to emphasis of user speech. Additionally, the transcription may be translated into a second language. A message spoken by a user depicted in one or more images of video data may also be transcribed and provided to one or more devices.

INFORMATION RECEPTION SYSTEM, RECORDING MEDIUM, AND INFORMATION INPUT METHOD

An information reception device is provided with: an operation surface which is adjusted to produce a characteristic index vibration upon contact by an object; a storage device which stores candidate information, related to the index vibration, that serves as a candidate for input information; a microphone which acquires observation information by observing the actual vibration arising in the surrounding environment; and a CPU. The CPU (selecting part) judges whether or not the index vibration exists in the acquired observation information. When the CPU judges that the index vibration exists, the CPU selects the candidate information related to the index vibration as the input information.

Audio stem identification systems and methods

Methods, systems and computer program products are provided for determining acoustic feature vectors of query and target items in a first vector space, and mapping the acoustic feature vectors to a second vector space having a lower dimension. The distribution of vectors in the second vector space can then be used to identify items from the same songs, and/or items that are complementary. A mapping function is trained using a machine learning algorithm, such that complementary audio items are closer in the second vector space than the first, according to a given distance metric.
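The effect of the mapping can be sketched as follows. This is an illustration only: the fixed 2x4 matrix `MAPPING` stands in for a mapping function that the abstract says is trained by machine learning, and the feature vectors, item names, and the choice of Euclidean distance are all assumptions.

```python
import math


def project(vector, matrix):
    """Map a vector into the lower-dimensional space with a linear map."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closest_target(query, targets, matrix):
    """Return the name of the target nearest to the query after projection."""
    q = project(query, matrix)
    return min(targets, key=lambda name: euclidean(q, project(targets[name], matrix)))


# Hypothetical stand-in for a trained mapping from 4 to 2 dimensions.
MAPPING = [
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
]

# Hypothetical acoustic feature vectors in the first (4-dimensional) space.
TARGETS = {
    "bass_stem_song_a": [0.9, 0.1, 0.2, 0.8],
    "drum_stem_song_b": [0.1, 0.1, 0.9, 0.9],
}
query = [0.2, 0.8, 0.7, 0.3]  # e.g. a vocal stem used as the query item

match = closest_target(query, TARGETS, MAPPING)
```

With these made-up numbers the drum stem is the nearer neighbour in the first space, while the bass stem becomes the nearer neighbour after projection, which is the kind of reordering the trained mapping is meant to produce for complementary items.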

Media content identification on mobile devices

A mobile device responds in real time to media content presented on a media device, such as a television. The mobile device captures temporal fragments of audio-video content on its microphone, camera, or both and generates corresponding audio-video query fingerprints. The query fingerprints are transmitted to a search server located remotely or used with a search function on the mobile device for content search and identification. Audio features are extracted and audio signal global onset detection is used for input audio frame alignment. Additional audio feature signatures are generated from local audio frame onsets, audio frame frequency domain entropy, and maximum change in the spectral coefficients. Video frames are analyzed to find a television screen in the frames, and a detected active television quadrilateral is used to generate video fingerprints to be combined with audio fingerprints for more reliable content identification.
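One of the audio features named above, frame frequency domain entropy, can be sketched as the Shannon entropy of a frame's normalized power spectrum. This is one common definition and an assumption here, since the abstract does not define the feature. The function names and the 64-sample frame length are hypothetical, and a naive DFT is used only to keep the sketch dependency-free.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum over the non-negative frequency bins."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)
        )
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def spectral_entropy(frame):
    """Shannon entropy, in bits, of the normalized power spectrum."""
    power = power_spectrum(frame)
    total = sum(power)
    if total == 0:
        return 0.0  # silent frame: no spectral distribution to measure
    probs = [p / total for p in power]
    return -sum(p * math.log2(p) for p in probs if p > 0)


N = 64  # hypothetical frame length in samples
tone = [math.sin(2 * math.pi * 4 * t / N) for t in range(N)]  # single-bin sinusoid
impulse = [1.0] + [0.0] * (N - 1)  # click with a flat spectrum

tone_entropy = spectral_entropy(tone)
impulse_entropy = spectral_entropy(impulse)
```

A pure tone concentrates its energy in one bin and yields entropy near zero, while an impulse spreads energy evenly across bins and yields the maximum for that frame length, so the value distinguishes tonal from noisy or percussive frames.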

AUDIO STEM IDENTIFICATION SYSTEMS AND METHODS

Methods, systems and computer program products are provided for determining acoustic feature vectors of query and target items in a first vector space, and mapping the acoustic feature vectors to a second vector space having a lower dimension. The distribution of vectors in the second vector space can then be used to identify items from the same songs, and/or items that are complementary. A mapping function is trained using a machine learning algorithm, such that complementary audio items are closer in the second vector space than the first, according to a given distance metric.

METHODS, SYSTEMS, AND MEDIA FOR RIGHTS MANAGEMENT OF EMBEDDED SOUND RECORDINGS USING COMPOSITION CLUSTERING

Methods, systems, and media for determining and presenting information related to embedded sound recordings are provided. In some embodiments, the method comprises: receiving a content item; extracting a sound recording from the content item; generating a melody fingerprint of the extracted sound recording; determining whether the melody fingerprint of the extracted sound recording matches one of a plurality of clusters of similar sounding sound recordings in a reference database, wherein each cluster in the plurality of clusters of similar sounding sound recordings is associated with ownership information based on a plurality of ownership information associated with each of the sound recordings in the cluster; in response to determining that the melody fingerprint of the extracted sound recording matches a cluster of similar sounding sound recordings, retrieving ownership information associated with the cluster; mapping the ownership information to the sound recording extracted from the content item; and causing an action to be performed on the content item based on the mapped ownership information.
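The fingerprint-to-cluster lookup can be sketched under strong assumptions. Here a "melody fingerprint" is taken to be the sequence of successive pitch intervals, which is a hypothetical choice that makes the fingerprint insensitive to transposition; the abstract does not say how the fingerprint is computed. The cluster record layout, the 0.8 threshold, the owner names, and the `notify_owners` action are likewise illustrative.

```python
from difflib import SequenceMatcher


def melody_fingerprint(pitches):
    """Hypothetical fingerprint: successive pitch intervals of the melody."""
    return tuple(b - a for a, b in zip(pitches, pitches[1:]))


def match_cluster(fingerprint, clusters, threshold=0.8):
    """Return the id of the best-matching cluster, or None if the best
    similarity falls below the threshold."""
    best_id, best_score = None, 0.0
    for cluster_id, cluster in clusters.items():
        score = SequenceMatcher(
            None, fingerprint, cluster["fingerprint"], autojunk=False
        ).ratio()
        if score > best_score:
            best_id, best_score = cluster_id, score
    return best_id if best_score >= threshold else None


def handle_content_item(pitches, clusters):
    """Map cluster ownership onto an extracted recording and pick an action."""
    cluster_id = match_cluster(melody_fingerprint(pitches), clusters)
    if cluster_id is None:
        return {"action": "none", "owners": []}
    return {"action": "notify_owners", "owners": clusters[cluster_id]["owners"]}


# Hypothetical reference database with one cluster of similar recordings.
CLUSTERS = {
    "composition_1": {
        "fingerprint": melody_fingerprint([60, 62, 64, 65, 67, 65, 64, 62]),
        "owners": ["Publisher A", "Publisher B"],
    },
}

cover = [65, 67, 69, 70, 72, 70, 69, 67]  # same melody, transposed up a fourth
result = handle_content_item(cover, CLUSTERS)
```

Matching against clusters rather than individual recordings means a cover or transposed version can inherit the ownership information already aggregated for the composition.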

DEEP LEARNING SYSTEM FOR DETERMINING AUDIO RECOMMENDATIONS BASED ON VIDEO CONTENT
20220414381 · 2022-12-29

Embodiments are disclosed for determining an answer to a query associated with a graphical representation of data. In particular, in one or more embodiments, the disclosed systems and methods comprise receiving an input including an unprocessed audio sequence and a request to perform an audio signal processing effect on the unprocessed audio sequence. The one or more embodiments further include analyzing, by a deep encoder, the unprocessed audio sequence to determine parameters for processing the unprocessed audio sequence. The one or more embodiments further include sending the unprocessed audio sequence and the parameters to one or more audio signal processing effects plugins to perform the requested audio signal processing effect using the parameters and outputting a processed audio sequence after processing of the unprocessed audio sequence using the parameters of the one or more audio signal processing effects plugins.

INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND INFORMATION PROCESSING PROGRAM
20220406283 · 2022-12-22

An information processing apparatus according to the present disclosure includes: an acquisition unit that acquires music information; an extraction unit that extracts a plurality of types of feature amounts from the music information acquired by the acquisition unit; and a generation unit that generates information in which the plurality of types of feature amounts extracted by the extraction unit is associated with predetermined identification information as music feature information to be used as learning data in composition processing using machine learning.

AUDIO GENERATION METHOD, RELATED APPARATUS, AND STORAGE MEDIUM
20230054740 · 2023-02-23

Embodiments of this application provide an audio generation method, a related apparatus, and a storage medium, to provide a better audio generation solution for a user. In embodiments of this application, a text is obtained, a song clip corresponding to the text is obtained through matching, and the song clip is used as audio corresponding to the text. In this way, the text can be expressed in a manner of the song clip.