G10H2240/125

Systems and methods for automatic mixing of media

A first device includes one or more processors and memory storing one or more programs configured to be executed by the one or more processors. The one or more programs include instructions for receiving, from a second device, audio mix information for a first audio item and receiving, from the second device, an indication that the first audio item is to be mixed with a second audio item distinct from the first audio item. In response to the indication, the one or more programs include instructions for transmitting to the second device an audio stream including the first audio item and the second audio item mixed in accordance with the audio mix information.

SERVER SIDE CROSSFADING FOR PROGRESSIVE DOWNLOAD MEDIA
20180005667 · 2018-01-04 ·

Systems and methods are provided to implement and facilitate cross-fading, interstitials and other effects/processing of two or more media elements in a personalized media delivery service. Effects or crossfade processing can occur on the broadcast, publisher or server-side, but can still be personalized to a specific user, in a manner that minimizes processing on the downstream side or client device. The cross-fade can be implemented after decoding, processing, re-encoding, and rechunking the relevant chunks of each component clip. Alternatively, the cross-fade or other effect can be implemented on the relevant chunks in the compressed domain, thus obviating any loss of quality by re-encoding. A large scale personalized content delivery service can limit the processing to essentially the first and last chunks of any file, there being no need to process the full clip.
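
The decode-and-reprocess variant described above, in which only the boundary chunks are touched, can be sketched roughly as follows. This is a minimal illustration, not the method claimed in the application: it assumes the chunks are already decoded to lists of floating-point PCM samples, uses a plain linear fade, and the names `crossfade_chunks`, `join_with_crossfade`, and `chunk_size` are hypothetical.

```python
def crossfade_chunks(tail, head):
    """Linearly crossfade two equal-length chunks of decoded PCM samples."""
    if len(tail) != len(head):
        raise ValueError("chunks must have the same length")
    n = len(tail)
    mixed = []
    for i, (a, b) in enumerate(zip(tail, head)):
        # Fade-in gain rises from 0 to 1 across the chunk; fade-out is its complement.
        gain_in = i / (n - 1) if n > 1 else 1.0
        mixed.append(a * (1.0 - gain_in) + b * gain_in)
    return mixed


def join_with_crossfade(clip_a, clip_b, chunk_size):
    """Chunk both clips and process only the last chunk of A and the first chunk of B."""
    if chunk_size <= 0 or not clip_a or not clip_b:
        raise ValueError("clips must be non-empty and chunk_size positive")
    if len(clip_a) % chunk_size or len(clip_b) % chunk_size:
        # Assumption for this sketch: clip lengths are whole multiples of chunk_size.
        raise ValueError("clip lengths must be multiples of chunk_size")
    chunks_a = [clip_a[i:i + chunk_size] for i in range(0, len(clip_a), chunk_size)]
    chunks_b = [clip_b[i:i + chunk_size] for i in range(0, len(clip_b), chunk_size)]
    boundary = crossfade_chunks(chunks_a[-1], chunks_b[0])
    # All other chunks pass through untouched, so the cost does not grow with clip length.
    return chunks_a[:-1] + [boundary] + chunks_b[1:]
```

Because every chunk other than the boundary pair is returned unchanged, the per-user processing cost in this sketch depends on the chunk size rather than on the length of either clip.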

CROWD-SOURCED TECHNIQUE FOR PITCH TRACK GENERATION
20230005463 · 2023-01-05 ·

Digital signal processing and machine learning techniques can be employed in a vocal capture and performance social network to computationally generate vocal pitch tracks from a collection of vocal performances captured against a common temporal baseline such as a backing track or an original performance by a popularizing artist. In this way, crowd-sourced pitch tracks may be generated and distributed for use in subsequent karaoke-style vocal audio captures or other applications. Large numbers of performances of a song can be used to generate a pitch track. Computationally determined pitch trackings from individual audio signal encodings of the crowd-sourced vocal performance set are aggregated and processed as an observation sequence of a trained Hidden Markov Model (HMM) or other statistical model to produce an output pitch track.
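
The aggregate-then-decode idea can be illustrated with a small sketch. It is an assumption-laden toy, not the model from the application: the per-frame aggregation is a simple median, the hidden states are candidate MIDI note numbers, and the HMM parameters are hand-set costs (standing in for negative log-probabilities) rather than trained values. The names `aggregate_frames`, `viterbi_pitch_track`, and `switch_cost` are hypothetical.

```python
from statistics import median


def aggregate_frames(performances):
    """Per-frame median of pitch estimates (MIDI note numbers) across performances."""
    return [median(frame) for frame in zip(*performances)]


def viterbi_pitch_track(observations, states, switch_cost=2.0):
    """Decode the lowest-cost note sequence with the Viterbi algorithm.

    Emission cost is the distance between the observed pitch and the state;
    transition cost is 0 for staying on a note and switch_cost for changing.
    """
    # cost[s] is the best total cost of any path ending in state s at the current frame.
    cost = {s: abs(observations[0] - s) for s in states}
    back = []
    for obs in observations[1:]:
        new_cost, pointers = {}, {}
        for s in states:
            prev = min(states, key=lambda p: cost[p] + (0.0 if p == s else switch_cost))
            step = 0.0 if prev == s else switch_cost
            new_cost[s] = cost[prev] + step + abs(obs - s)
            pointers[s] = prev
        cost = new_cost
        back.append(pointers)
    # Backtrack from the cheapest final state.
    state = min(states, key=lambda s: cost[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

In this sketch the median suppresses a single singer's outlier frame before decoding, and the switching penalty discourages the output track from jumping between notes on the strength of one noisy frame.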

Audiovisual content rendering with display animation suggestive of geolocation at which content was previously rendered

Techniques have been developed to facilitate (1) the capture and pitch correction of vocal performances on handheld or other portable computing devices and (2) the mixing of such pitch-corrected vocal performances with backing tracks for audible rendering on targets that include such portable computing devices as well as desktops, workstations, gaming stations, and even telephony targets. Implementations of the described techniques employ signal processing techniques and allocations of system functionality that are suitable given the generally limited capabilities of such handheld or portable computing devices and that facilitate efficient encoding and communication of the pitch-corrected vocal performances (or precursors or derivatives thereof) via wireless and/or wired bandwidth-limited networks for rendering on portable computing devices or other targets.

AUTOMATED GENERATION OF AUDIO TRACKS
20230197042 · 2023-06-22 ·

Conventionally, significant time and effort are required to construct audio tracks. Disclosed embodiments enable automated generation of audio tracks using templates that associate sound generator(s) with template section(s). Each template enables a model to automatically generate unique audio tracks in which the sections and/or sounds are probabilistically determined. Certain embodiments introduce additional variability into the automated generation of audio tracks. In addition, the model may generate the audio tracks, note by note, to ensure that no copyrights are infringed.
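
A template that ties sound generators to sections, with probabilistic selection, might look like the sketch below. Everything in it is hypothetical and chosen only for illustration: the `TEMPLATE` layout, the generator names and weights, the `SCALE` of MIDI notes, and the `generate_track` function are assumptions, not details from the application.

```python
import random

# Hypothetical template: each section lists candidate sound generators with weights.
TEMPLATE = [
    {"section": "intro", "generators": {"pad": 0.7, "piano": 0.3}, "notes": 4},
    {"section": "verse", "generators": {"piano": 0.5, "guitar": 0.5}, "notes": 8},
    {"section": "outro", "generators": {"pad": 1.0}, "notes": 4},
]

# Hypothetical pitch set (C major, one octave, as MIDI note numbers).
SCALE = [60, 62, 64, 65, 67, 69, 71, 72]


def generate_track(template, rng):
    """Build a track section by section, choosing sounds and notes at random."""
    track = []
    for section in template:
        names = list(section["generators"])
        weights = list(section["generators"].values())
        # Weighted draw of the sound generator for this section.
        generator = rng.choices(names, weights=weights, k=1)[0]
        # Notes are produced one at a time, each drawn independently from the scale.
        notes = [rng.choice(SCALE) for _ in range(section["notes"])]
        track.append({"section": section["section"], "generator": generator, "notes": notes})
    return track
```

Passing a seeded `random.Random` instance makes a given track reproducible, while different seeds yield different tracks from the same template.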

GENERATING MUSIC OUT OF A DATABASE OF SETS OF NOTES

A method of generating music content from input music content includes developing models of music composition generation on the basis of business rules and composition rules. In parallel, sounds are prepared, which may be saved in the sound repository. Then, models in the form of source code are sent to a melody generator. First, the generator is set with specific parameters using a controller conforming to MIDI standards and supplemented with composition characteristics read from the user preference database. Next, the content is passed to automatic generation based on artificial intelligence algorithms, and the digital score of the composition with the desired characteristics is generated. Sound tracks of individual instruments are rendered, and the rendered tracks are mixed into the final music record. Finally, the composition and its record are verified by the critic module using algorithms based on neural networks.

METHODS AND SYSTEMS FOR SYNCHRONIZING AN AUDIO CLIP EXTRACTED FROM AN ORIGINAL RECORDING WITH CORRESPONDING LYRICS
20170316768 · 2017-11-02 ·

Methods, systems, and devices for determining an audio portion based on a request received from a consumer or user of an operating device, where the request comprises a set of lyrics, and then effecting the streaming of the determined audio portion.

Music Generator Generation of Continuous Personalized Music
20220059063 · 2022-02-24 ·

Techniques are disclosed relating to automatically generating new music content. In some embodiments, a computing system receives user input specifying a user-defined music control element. The computing system may train a machine learning model to change both composition and performance parameters based on user adjustments to the user-defined music control element. In embodiments in which composition and performance subsystems are on different devices, one device may transmit configuration information to another device, where the configuration information specifies how to adjust parameters based on user input to the user-defined music control element. Disclosed techniques may facilitate centralized learning for human-like music production while allowing customization for individual users. Further, disclosed techniques may allow artists to define their own abstract music controls and make those controls available to end-users.

Server side crossfading for progressive download media
11257524 · 2022-02-22 ·

In exemplary embodiments of the present invention, systems and methods are provided to implement and facilitate cross-fading, interstitials and other effects/processing of two or more media elements in a personalized media delivery service so that each client or user has a consistent high quality experience. The effects or crossfade processing can occur on the broadcast, publisher or server-side, but can still be personalized to a specific user, thus still allowing a personalized experience for each individual user, in a manner where the processing burden is minimized on the downstream side or client device. This approach enables a consistent user experience, independent of client device capabilities, both static and dynamic. The cross-fade can be implemented after decoding the relevant chunks of each component clip, processing, recoding and rechunking, or, in a preferred embodiment, the cross-fade or other effect can be implemented in the compressed domain on the chunks relevant to the effect, thus obviating any loss of quality by re-encoding. A large scale personalized content delivery service can be implemented by limiting the processing to essentially the first and last chunks of any file, since there is no need to process the full clip. In exemplary embodiments of the present invention, this type of processing can easily be accommodated in cloud computing technology, where the first and last files may be conveniently extracted and processed within the cloud to meet the required load. Processing may also be done locally, for example, by the broadcaster, with sufficient processing power to manage peak load.

Crowd-sourced technique for pitch track generation

Digital signal processing and machine learning techniques can be employed in a vocal capture and performance social network to computationally generate vocal pitch tracks from a collection of vocal performances captured against a common temporal baseline such as a backing track or an original performance by a popularizing artist. In this way, crowd-sourced pitch tracks may be generated and distributed for use in subsequent karaoke-style vocal audio captures or other applications. Large numbers of performances of a song can be used to generate a pitch track. Computationally determined pitch trackings from individual audio signal encodings of the crowd-sourced vocal performance set are aggregated and processed as an observation sequence of a trained Hidden Markov Model (HMM) or other statistical model to produce an output pitch track.