
MUSIC CUSTOMIZATION USER INTERFACE
20230042616 · 2023-02-09

Computing devices and methods for providing a music customization graphical user interface (GUI) to a user computing device are disclosed. The music customization GUI comprises song name selectors that each correspond to a different song and a music player region that includes a song customization selector. A customization window comprises music stem indicators that each correspond to at least one music stem of a selected song. The customization window also comprises mixing buttons that include a first mixing button configured to add a corresponding music stem to a user song mix, and a second mixing button configured to remove the music stem from the user song mix. A download button is configured to download a file comprising the user song mix to a user computing device.
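The mix state behind such a customization window can be sketched as a small data structure: stems are added or removed by the two mixing buttons, and the download button serializes the current selection. The class and method names below are illustrative, not taken from the patent.

```python
# Minimal sketch of a user song mix: a set of selected stems per song.
class UserSongMix:
    def __init__(self, song_name):
        self.song_name = song_name
        self.stems = set()            # e.g. {"vocals", "drums", "bass"}

    def add_stem(self, stem):         # first mixing button: add a stem
        self.stems.add(stem)

    def remove_stem(self, stem):      # second mixing button: remove a stem
        self.stems.discard(stem)

    def to_download(self):            # payload behind the download button
        return {"song": self.song_name, "stems": sorted(self.stems)}

mix = UserSongMix("Example Song")
mix.add_stem("drums")
mix.add_stem("vocals")
mix.remove_stem("vocals")
print(mix.to_download())  # → {'song': 'Example Song', 'stems': ['drums']}
```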

Systems and methods for automatic mixing of media

A first device includes one or more processors and memory storing one or more programs configured to be executed by the one or more processors. The one or more programs include instructions for receiving, from a second device, audio mix information for a first audio item and receiving, from the second device, an indication that the first audio item is to be mixed with a second audio item distinct from the first audio item. In response to the indication, the one or more programs include instructions for transmitting to the second device an audio stream including the first audio item and the second audio item mixed in accordance with the audio mix information.
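A minimal sketch of the mixing step, assuming the audio mix information reduces to per-item gains and that audio items are plain sample buffers; the patent does not specify this representation.

```python
# Mix two audio items in accordance with received mix information,
# here simplified to a gain per item. Shorter buffers are zero-padded.
def mix_items(first, second, mix_info):
    g1 = mix_info.get("first_gain", 1.0)
    g2 = mix_info.get("second_gain", 1.0)
    n = max(len(first), len(second))
    out = []
    for i in range(n):
        a = first[i] if i < len(first) else 0.0
        b = second[i] if i < len(second) else 0.0
        out.append(g1 * a + g2 * b)
    return out

stream = mix_items([1.0, 0.5], [0.2, 0.2, 0.2],
                   {"first_gain": 0.5, "second_gain": 1.0})
# stream ≈ [0.7, 0.45, 0.2] (within floating-point tolerance)
```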

CROWD-SOURCED TECHNIQUE FOR PITCH TRACK GENERATION
20230005463 · 2023-01-05

Digital signal processing and machine learning techniques can be employed in a vocal capture and performance social network to computationally generate vocal pitch tracks from a collection of vocal performances captured against a common temporal baseline such as a backing track or an original performance by a popularizing artist. In this way, crowd-sourced pitch tracks may be generated and distributed for use in subsequent karaoke-style vocal audio captures or other applications. Large numbers of performances of a song can be used to generate a pitch track. Computationally determined pitch trackings from individual audio signal encodings of the crowd-sourced vocal performance set are aggregated and processed as an observation sequence of a trained Hidden Markov Model (HMM) or other statistical model to produce an output pitch track.
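The aggregation idea can be illustrated with a much simpler stand-in than the patent's trained HMM: a per-frame median vote over time-aligned pitch trackings, with 0.0 marking an unvoiced frame. This only sketches the crowd-sourcing step, not the statistical smoothing the patent describes.

```python
from statistics import median

# Aggregate pitch trackings from many performances captured against a
# common temporal baseline. For each frame, take the median of the
# voiced estimates (pitch > 0.0); unvoiced frames stay 0.0. The patent
# instead treats these observations as input to a trained HMM.
def aggregate_pitch_tracks(tracks):
    n_frames = min(len(t) for t in tracks)
    out = []
    for i in range(n_frames):
        voiced = [t[i] for t in tracks if t[i] > 0.0]
        out.append(median(voiced) if voiced else 0.0)
    return out

tracks = [
    [220.0, 221.0, 0.0],    # one vocal performance's pitch track (Hz)
    [219.0, 247.0, 0.0],    # an off-pitch outlier in frame 1
    [220.5, 220.0, 196.0],
]
result = aggregate_pitch_tracks(tracks)
# → [220.0, 221.0, 196.0] — the median rejects the 247 Hz outlier
```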

AUDIO TRANSPOSITION

An electronic device comprising circuitry configured to separate, by audio source separation, a first audio input signal into a first vocal signal and an accompaniment, and to transpose an audio output signal by a transposition value based on a pitch ratio, wherein the pitch ratio is based on comparing a first pitch range of the first vocal signal and a second pitch range of a second vocal signal.
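One plausible reading of the pitch-ratio comparison, sketched below: compare the geometric centers of the two vocal pitch ranges and convert the resulting frequency ratio to semitones with the standard 12·log2 formula. The choice of range center is an assumption; the abstract does not specify how the ranges are compared.

```python
import math

# Derive a transposition value (in semitones) from the ratio between
# two vocal pitch ranges, each given as (low_hz, high_hz).
def transposition_semitones(range_a, range_b):
    lo_a, hi_a = range_a
    lo_b, hi_b = range_b
    center_a = math.sqrt(lo_a * hi_a)      # geometric mean of range a
    center_b = math.sqrt(lo_b * hi_b)      # geometric mean of range b
    ratio = center_b / center_a            # the pitch ratio
    return round(12 * math.log2(ratio))    # nearest whole semitone

# A second vocalist an octave below the first: ratio 0.5 → -12 semitones.
print(transposition_semitones((220.0, 440.0), (110.0, 220.0)))  # → -12
```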

Short segment generation for user engagement in vocal capture applications

User interface techniques provide user vocalists with mechanisms for solo audiovisual capture and for seeding subsequent performances by other users (e.g., joiners). Audiovisual capture may be against a full-length work or seed spanning much or all of a pre-existing audio (or audiovisual) work and in some cases may mix, to seed further contributions of one or more joiners, a user's captured media content for at least some portions of the audio (or audiovisual) work. A short seed or short segment may span less than all (and in some cases, much less than all) of the audio (or audiovisual) work. For example, a verse, chorus, refrain, hook or other limited “chunk” of an audio (or audiovisual) work may constitute a short seed or short segment. Computational techniques are described that allow a system to automatically identify suitable short seeds or short segments. After audiovisual capture against the short seed or short segment, a resulting, solo or group, full-length or short-form performance may be posted, livestreamed, or otherwise disseminated in a social network.

Audiovisual content rendering with display animation suggestive of geolocation at which content was previously rendered

Techniques have been developed to facilitate (1) the capture and pitch correction of vocal performances on handheld or other portable computing devices and (2) the mixing of such pitch-corrected vocal performances with backing tracks for audible rendering on targets that include such portable computing devices as well as desktops, workstations, gaming stations, and even telephony targets. Implementations of the described techniques employ signal processing techniques and allocations of system functionality that are suitable given the generally limited capabilities of such handheld or portable computing devices and that facilitate efficient encoding and communication of the pitch-corrected vocal performances (or precursors or derivatives thereof) via wireless and/or wired bandwidth-limited networks for rendering on portable computing devices or other targets.

Template-Based Excerpting and Rendering of Multimedia Performance

Disclosed herein are computer-implemented method, system, and computer-readable storage-medium embodiments for implementing template-based excerpting and rendering of multimedia performances. An embodiment includes at least one computer processor configured to retrieve a first content instance and corresponding first metadata. The first content instance may include a first plurality of structural elements, with at least one structural element corresponding to at least part of the first metadata. The first content instance may be transformed by a rendering engine running on the at least one computer processor and/or transmitted to a content-playback device.

Audio-visual effects system for augmentation of captured performance based on content thereof

Visual effects schedules are applied to audiovisual performances with differing visual effects applied in correspondence with differing elements of musical structure. Segmentation techniques applied to one or more audio tracks (e.g., vocal or backing tracks) are used to compute some of the components of the musical structure. In some cases, applied visual effects schedules are mood-denominated and may be selected by a performer as a component of his or her visual expression or determined from an audiovisual performance using machine learning techniques.
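The correspondence between musical structure and visual effects can be sketched as a lookup: a mood-denominated schedule maps segment labels to effects. The mood names, segment labels, and effect names below are invented for illustration; they are not from the patent.

```python
# Mood-denominated visual effects schedules: for each mood, a mapping
# from elements of musical structure (as produced by segmentation of
# an audio track) to a visual effect.
EFFECT_SCHEDULES = {
    "dreamy":    {"verse": "soft_blur", "chorus": "bloom",     "bridge": "slow_fade"},
    "energetic": {"verse": "strobe",    "chorus": "particles", "bridge": "color_shift"},
}

def effects_for(mood, segment_labels):
    schedule = EFFECT_SCHEDULES[mood]
    return [(label, schedule.get(label, "none")) for label in segment_labels]

plan = effects_for("dreamy", ["verse", "chorus", "verse"])
# → [('verse', 'soft_blur'), ('chorus', 'bloom'), ('verse', 'soft_blur')]
```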

INFORMATION PROCESSING DEVICE, METHOD AND RECORDING MEDIA
20230090773 · 2023-03-23

An information processing device includes: an input interface; and at least one processor configured to perform the following: selecting an instrument, a musical tone of which is to be digitally synthesized based on corresponding musical tone data, via the input interface; acquiring a parameter value that has been set for the selected instrument; generating a random number based on a random function; and changing a pitch of the musical tone of the selected instrument based on the generated random number and the acquired parameter value.
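A hedged sketch of the pitch-changing step: the per-instrument parameter value is taken here as a depth that scales a random offset in cents. The parameter's meaning and the cents unit are assumptions; the abstract only states that the pitch is changed based on the random number and the parameter value.

```python
import random

# Randomize a synthesized tone's pitch: the acquired parameter value
# (depth_cents) scales a uniform random number into an offset in
# [-depth, +depth] cents, applied as a frequency ratio.
def randomized_pitch(base_pitch_hz, depth_cents, rng):
    offset_cents = (2.0 * rng.random() - 1.0) * depth_cents
    return base_pitch_hz * 2.0 ** (offset_cents / 1200.0)

rng = random.Random(42)                   # seeded for reproducibility
p = randomized_pitch(440.0, 10.0, rng)    # within ±10 cents of 440 Hz
```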

ARPEGGIATOR, RECORDING MEDIUM AND METHOD OF MAKING ARPEGGIO
20220343884 · 2022-10-27

[Problem] To provide an arpeggiator capable of suppressing muddiness caused by the simultaneous production of sounds in a multi-arpeggiator that can automatically play arpeggios across a plurality of performance parts, and a program providing the function of the arpeggiator. [Solution] A synthesizer 1 decreases the velocity stored in velocity memory 12c in accordance with a duck rate when the sound-production time of a sound-production part coincides with that of a duck part and, at that time, the note number of the duck part is the same as the note number of a duck note. This suppresses muddiness in the output sound, since the increase in output level is restrained even when a plurality of parts produce sounds at the same time.
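The ducking rule can be sketched as a single check at note-on time: when a part's sound-production time coincides with the duck part's and the note numbers match, the stored velocity is scaled down by the duck rate. The function shape and names below are illustrative, not taken from the synthesizer's implementation.

```python
# Reduce a part's note-on velocity when it would sound simultaneously
# with the duck part on the same note number, suppressing the
# output-level increase that causes muddiness.
def ducked_velocity(velocity, note, time, duck_note, duck_time, duck_rate):
    if time == duck_time and note == duck_note:
        return round(velocity * (1.0 - duck_rate))
    return velocity

print(ducked_velocity(100, 60, 0, 60, 0, 0.4))   # coincident same note → 60
print(ducked_velocity(100, 62, 0, 60, 0, 0.4))   # different note → 100
print(ducked_velocity(100, 60, 4, 60, 0, 0.4))   # different time → 100
```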