G10H2240/036

Methods, computer server systems and media devices for media streaming
11330348 · 2022-05-10

In general, this disclosure concerns media streaming. Among other things, it presents a first media item for streaming from a computer server system to a media device. The first media item has an audio format and comprises a number of media segments, each identifiable by a media segment identifier. One or more of the media segments are associated with a respective second media item corresponding to the respective media segment identifier. The second media items typically have a media format other than audio.
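The segment/identifier structure described in this abstract can be sketched as a small data model; the class and field names below are illustrative assumptions, not the patent's terminology.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class MediaSegment:
    segment_id: str                     # media segment identifier
    # optional associated second media item (e.g. an image or video URI)
    second_item: Optional[str] = None

@dataclass
class MediaItem:
    media_format: str                   # "audio" for the first media item
    segments: list = field(default_factory=list)

    def second_item_for(self, segment_id: str) -> Optional[str]:
        # resolve a segment identifier to its associated second media item
        for seg in self.segments:
            if seg.segment_id == segment_id:
                return seg.second_item
        return None

track = MediaItem("audio", [
    MediaSegment("seg-001", second_item="cover_art.jpg"),
    MediaSegment("seg-002"),            # no associated second media item
])
```

A streaming client could call `second_item_for` with the identifier of the segment currently playing to fetch the non-audio item to display alongside it.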

MUSIC COVER IDENTIFICATION WITH LYRICS FOR SEARCH, COMPLIANCE, AND LICENSING
20210357451 · 2021-11-18

Embodiments cover identifying an unidentified media content item as a cover of a known media content item using lyrical content. In an example, a processing device receives an unidentified media content item and determines lyrical content associated with it. The processing device then determines a lyrical similarity between that lyrical content and additional lyrical content associated with a known media content item from a plurality of known media content items. Based at least in part on the lyrical similarity, the processing device identifies the unidentified media content item as a cover of the known media content item, resulting in an identified cover media content item.
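The lyrical-similarity matching described here can be sketched with a simple text-similarity measure; the patent does not specify the measure, so the edit-based `SequenceMatcher` ratio and the 0.8 threshold below are assumptions for illustration only.

```python
from difflib import SequenceMatcher

def lyrical_similarity(lyrics_a: str, lyrics_b: str) -> float:
    # Normalized word-sequence similarity in [0, 1]; a stand-in for the
    # patent's unspecified lyrical-similarity measure.
    return SequenceMatcher(None, lyrics_a.lower().split(),
                           lyrics_b.lower().split()).ratio()

def identify_cover(unknown_lyrics, known_items, threshold=0.8):
    # known_items: {title: lyrics}. Returns the best-matching known item
    # above the threshold, or None if no candidate is similar enough.
    best_title, best_score = None, 0.0
    for title, lyrics in known_items.items():
        score = lyrical_similarity(unknown_lyrics, lyrics)
        if score > best_score:
            best_title, best_score = title, score
    return best_title if best_score >= threshold else None
```

In a production system the similarity would likely come from transcribed lyrics (speech-to-text on the audio) compared against a licensed lyrics database, but the thresholded best-match logic is the same.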

Music cover identification with lyrics for search, compliance, and licensing
11816151 · 2023-11-14

Embodiments cover identifying an unidentified media content item as a cover of a known media content item using lyrical content. In an example, a processing device receives an unidentified media content item and determines lyrical content associated with it. The processing device then determines a lyrical similarity between that lyrical content and additional lyrical content associated with a known media content item from a plurality of known media content items. Based at least in part on the lyrical similarity, the processing device identifies the unidentified media content item as a cover of the known media content item, resulting in an identified cover media content item.

METHODS, COMPUTER SERVER SYSTEMS AND MEDIA DEVICES FOR MEDIA STREAMING
20220329921 · 2022-10-13

A computer server system associates one or more media items with a first segment of a first media item, the one or more media items selected based on current location information of a media device. The computer server system receives, from the media device, a request for a media item associated with the first media item, wherein the request includes a media segment identifier for the first segment of the first media item. In response to the request, the computer server system identifies the one or more media items associated with the first segment and provides them to the media device.
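The association-and-lookup flow in this abstract can be sketched as a small in-memory service; the class, the `region` field, and the sample item IDs are hypothetical, and a real system would select items by richer location logic than the exact-match filter shown.

```python
class MediaAssociationServer:
    """Toy sketch: associate media items with a (media item, segment) pair,
    selected by the device's reported location, then serve them on request."""

    def __init__(self):
        # (first_media_item_id, segment_id) -> list of associated item ids
        self._associations = {}

    def associate(self, item_id, segment_id, candidate_items, location):
        # Select candidates relevant to the device's current location
        # (illustrative exact-match filter on a hypothetical "region" field).
        selected = [m["id"] for m in candidate_items if m["region"] == location]
        self._associations[(item_id, segment_id)] = selected

    def handle_request(self, item_id, segment_id):
        # Respond to a device request carrying a media segment identifier.
        return self._associations.get((item_id, segment_id), [])

server = MediaAssociationServer()
server.associate("track-1", "seg-1",
                 [{"id": "item-se", "region": "SE"},
                  {"id": "item-us", "region": "US"}],
                 location="SE")
```

A request such as `server.handle_request("track-1", "seg-1")` then returns only the location-matched items.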

IDENTIFYING LANGUAGE IN MUSIC
20220277729 · 2022-09-01

The present disclosure describes techniques for identifying languages associated with music. Training data may be received, comprising audio data representative of a plurality of music samples, metadata associated with those samples, and information indicating the language corresponding to each sample. A machine learning model may be trained to identify the language associated with a piece of music by applying the training data to the model until it reaches a predetermined recognition accuracy. The language associated with a piece of music may then be determined using the trained model.
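The train-until-accuracy loop this abstract describes can be sketched independently of any particular model; the toy majority-label model below is a placeholder assumption standing in for whatever audio/metadata model the patent contemplates.

```python
def train_until_accuracy(model, samples, labels, target_accuracy, max_epochs=100):
    """Train in steps until the model reaches a predetermined recognition
    accuracy on the training data, or the epoch budget runs out."""
    for _ in range(max_epochs):
        model.fit_step(samples, labels)                 # one training pass
        predictions = [model.predict(s) for s in samples]
        accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
        if accuracy >= target_accuracy:                 # predetermined threshold
            break
    return model, accuracy

class MajorityLanguageModel:
    # Toy stand-in: always predicts the most common language label seen.
    def __init__(self):
        self.language = None

    def fit_step(self, samples, labels):
        self.language = max(set(labels), key=labels.count)

    def predict(self, sample):
        return self.language
```

Any real language-ID model (e.g. a classifier over audio embeddings plus metadata features) could be dropped in behind the same `fit_step`/`predict` interface.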

System and method for music and effects sound mix creation in audio soundtrack versioning

Implementations of the disclosure describe systems and methods that leverage machine learning to automate the process of creating music and effects mixes from original sound mixes including domestic dialogue. In some implementations, a method includes: receiving a sound mix including human dialogue; extracting metadata from the sound mix, where the extracted metadata categorizes the sound mix; extracting content feature data from the sound mix, the extracted content feature data including an identification of the human dialogue and instances or times the human dialogue occurs within the sound mix; automatically calculating, with a trained model, content feature data of a music and effects (M&E) sound mix using at least the extracted metadata and the extracted content feature data of the sound mix; and deriving the M&E sound mix using at least the calculated content feature data.
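One piece of this pipeline, using the identified dialogue times to derive an M&E timeline, can be sketched very simply; the per-second timeline representation and the zero-out operation below are illustrative assumptions, not the trained-model calculation the disclosure describes.

```python
def derive_me_timeline(mix_timeline, dialogue_intervals):
    """Toy sketch: suppress dialogue regions of a full sound mix to leave
    music and effects.

    mix_timeline: per-second level values of the original sound mix
    dialogue_intervals: [(start_s, end_s)] where dialogue was identified
    by the extracted content feature data
    """
    me = list(mix_timeline)
    for start, end in dialogue_intervals:
        for t in range(start, min(end, len(me))):
            me[t] = 0.0        # remove dialogue; keep music and effects
    return me
```

The disclosure's trained model instead *predicts* the M&E content feature data from the mix's metadata and features; this sketch only shows how the identified dialogue instances and times feed the derivation step.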

ARTIFICIALLY GENERATING AUDIO DATA FROM TEXTUAL INFORMATION AND RHYTHM INFORMATION
20210224319 · 2021-07-22

Methods and systems for artificially generating media streams are provided. Textual information, rhythm information, and voice characteristics may be received. It may be determined that a first portion of the textual information corresponds to a first portion of the rhythm information, and that a second portion of the textual information corresponds to a second portion of the rhythm information. An audio stream may be generated based on the textual information, the rhythm information, and the voice characteristics. A first portion of the audio stream may include a vocal expression of the first portion of the textual information, in a voice corresponding to the voice characteristics and according to the first portion of the rhythm information; a second portion may likewise include a vocal expression of the second portion of the textual information, in that same voice and according to the second portion of the rhythm information.
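The portion-to-portion correspondence step can be sketched as building an alignment plan before synthesis; the function name and the beat-list rhythm representation are assumptions for illustration, and the synthesizer itself is out of scope here.

```python
def align_text_to_rhythm(text_portions, rhythm_portions):
    """Pair each textual portion with its corresponding rhythm portion,
    producing the plan a downstream vocal synthesizer would consume."""
    if len(text_portions) != len(rhythm_portions):
        raise ValueError("each textual portion needs a rhythm portion")
    return list(zip(text_portions, rhythm_portions))

plan = align_text_to_rhythm(
    ["verse one words", "verse two words"],
    [[1, 0, 1, 0], [1, 1, 0, 0]],   # illustrative beat patterns
)
```

Each `(text, rhythm)` pair in the plan would then be rendered as a vocal expression in the voice described by the received voice characteristics.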

METHODS, COMPUTER SERVER SYSTEMS AND MEDIA DEVICES FOR MEDIA STREAMING
20210120318 · 2021-04-22

In general, this disclosure concerns media streaming. Among other things, it presents a first media item for streaming from a computer server system to a media device. The first media item has an audio format and comprises a number of media segments, each identifiable by a media segment identifier. One or more of the media segments are associated with a respective second media item corresponding to the respective media segment identifier. The second media items typically have a media format other than audio.

Methods, computer server systems and media devices for media streaming
10887671 · 2021-01-05

In general, this disclosure concerns media streaming. Among other things, it presents a first media item for streaming from a computer server system to a media device. The first media item has an audio format and comprises a number of media segments, each identifiable by a media segment identifier. One or more of the media segments are associated with a respective second media item corresponding to the respective media segment identifier. The second media items typically have a media format other than audio.

SYSTEM AND METHOD FOR MUSIC AND EFFECTS SOUND MIX CREATION IN AUDIO SOUNDTRACK VERSIONING

Implementations of the disclosure describe systems and methods that leverage machine learning to automate the process of creating music and effects mixes from original sound mixes including domestic dialogue. In some implementations, a method includes: receiving a sound mix including human dialogue; extracting metadata from the sound mix, where the extracted metadata categorizes the sound mix; extracting content feature data from the sound mix, the extracted content feature data including an identification of the human dialogue and instances or times the human dialogue occurs within the sound mix; automatically calculating, with a trained model, content feature data of a music and effects (M&E) sound mix using at least the extracted metadata and the extracted content feature data of the sound mix; and deriving the M&E sound mix using at least the calculated content feature data.