Patent classifications
G10H2220/011
Audiovisual collaboration method with latency management for wide-area broadcast
Techniques have been developed to facilitate the livestreaming of group audiovisual performances. Audiovisual performances including vocal music are captured and coordinated with performances of other users in ways that can create compelling user and listener experiences. For example, in some cases or embodiments, duets with a host performer may be supported in a sing-with-the-artist style audiovisual livestream in which aspiring vocalists request or queue particular songs for a live radio show entertainment format. The developed techniques provide a communications latency-tolerant mechanism for synchronizing vocal performances captured at geographically-separated devices (e.g., at globally-distributed, but network-connected mobile phones or tablets or at audiovisual capture devices geographically separated from a live studio).
Method and apparatus for presenting media information, storage medium, and electronic apparatus
The present disclosure describes embodiments of a method, a device, and a non-transitory computer readable storage medium for presenting media information. The method includes displaying, by a device, an interaction interface. The device includes a memory storing instructions and a processor in communication with the memory. The method includes obtaining, by the device, an image set through the interaction interface, the image set comprising at least one image. The method includes obtaining, by the device, target media based on the image set through the interaction interface, the target media comprising a first audio generated according to an image feature of the image set. The method includes presenting, by the device, the target media.
AUDIOVISUAL CONTENT RENDERING WITH DISPLAY ANIMATION SUGGESTIVE OF GEOLOCATION AT WHICH CONTENT WAS PREVIOUSLY RENDERED
Techniques have been developed to facilitate (1) the capture and pitch correction of vocal performances on handheld or other portable computing devices and (2) the mixing of such pitch-corrected vocal performances with backing tracks for audible rendering on targets that include such portable computing devices as well as desktops, workstations, gaming stations, and even telephony targets. Implementations of the described techniques employ signal processing techniques and allocations of system functionality that are suitable given the generally limited capabilities of such handheld or portable computing devices and that facilitate efficient encoding and communication of the pitch-corrected vocal performances (or precursors or derivatives thereof) via wireless and/or wired bandwidth-limited networks for rendering on portable computing devices or other targets.
Method and apparatus for generating digital score file of song, and storage medium
A method and an information processing apparatus to generate a digital score file of a song are described. The information processing apparatus includes processing circuitry. The processing circuitry is configured to obtain a candidate audio file satisfying a first condition from audio files of unaccompanied singing of the song without instrumental accompaniment. The processing circuitry is configured to divide the candidate audio file into valid audio segments based on timing information of the song, and extract pieces of music note information from the valid audio segments. Each of the pieces of music note information includes at least one data set of a music note in the song. The data set includes an onset time, a duration, and a music note value of the music note. The processing circuitry is configured to generate the digital score file based on the pieces of music note information.
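The note data set described above (onset time, duration, music note value) can be pictured with a minimal sketch. Everything beyond those three fields is an assumption made for illustration: the names `NoteData` and `build_score`, the millisecond units, the MIDI-style integer note value, and the JSON output are hypothetical, since the abstract does not specify the score file format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class NoteData:
    """One data set of a music note, as listed in the abstract."""
    onset_ms: int     # onset time (milliseconds is an assumed unit)
    duration_ms: int  # duration (milliseconds is an assumed unit)
    note_value: int   # music note value (MIDI-style number is an assumption)


def build_score(segments):
    """Merge the notes extracted from each valid audio segment into one
    onset-ordered list and serialize it (JSON is an assumed format)."""
    notes = [note for segment in segments for note in segment]
    notes.sort(key=lambda note: note.onset_ms)
    return json.dumps([asdict(note) for note in notes])


# Two hypothetical segments, given out of order on purpose.
segments = [
    [NoteData(onset_ms=500, duration_ms=250, note_value=62)],
    [NoteData(onset_ms=0, duration_ms=400, note_value=60)],
]
score = build_score(segments)
```

Sorting by onset keeps the merged score in playback order regardless of the order in which segments were processed.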
System and method for providing a video with lyrics overlay for use in a social messaging environment
In accordance with an embodiment, described herein is a system and method for providing a live lyrics overlay in a social messaging environment. The system can utilize advances in three-dimensional mapping technology that allow social messaging services to offer real-time video lenses or overlays to their users, and extends this three-dimensional mapping technology to support lyrics. During creation of a video with a lyrics lens overlay, the lyrics corresponding to a selected song are retrieved from a lyrics source and are displayed within the video. For example, with the lyrics lens, a user can record themselves on live video, singing along to a song clip, with the lyrics of the song displayed as if they were coming from the user's mouth. The created live lyrics content can also be shared with other users of a social messaging environment.
Non-linear media segment capture and edit platform
User interface techniques provide user vocalists with mechanisms for forward and backward traversal of audiovisual content, including pitch cues, waveform- or envelope-type performance timelines, lyrics and/or other temporally-synchronized content at record-time, during edits, and/or in playback. Recapture of selected performance portions, coordination of group parts, and overdubbing may all be facilitated. Direct scrolling to arbitrary points in the performance timeline, lyrics, pitch cues and other temporally-synchronized content allows a user to conveniently move through a capture or audiovisual edit session. In some cases, a user vocalist may be guided through the performance timeline, lyrics, pitch cues and other temporally-synchronized content in correspondence with group part information such as in a guided short-form capture for a duet. A scrubber allows user vocalists to conveniently move forward and backward through the temporally-synchronized content.
Lyrics analyzer
A lyrics analyzer generates tags and explicitness indicators for a set of tracks. These tags may indicate the genre, mood, occasion, or other features of each track. The lyrics analyzer does so by generating an n-dimensional vector relating to a set of topics extracted from the lyrics and then using those vectors to train a classifier to determine whether each tag applies to each track. The lyrics analyzer may also generate playlists for a user based on a single seed song by comparing the lyrics vector or the lyrics and acoustics vectors of the seed song to other songs to select songs that closely match the seed song. Such a playlist generator may also take into account the tags generated for each track.
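The seed-song playlist step can be sketched as a nearest-neighbor search over per-track topic vectors. This is only an illustration under stated assumptions: the abstract does not say which comparison is used, so cosine similarity is a swapped-in choice here, and the function names, track names, and three-dimensional vectors are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def playlist_from_seed(seed, vectors, size=2):
    """Return the `size` tracks whose vectors are closest to the seed's."""
    candidates = [track for track in vectors if track != seed]
    candidates.sort(key=lambda t: cosine(vectors[seed], vectors[t]), reverse=True)
    return candidates[:size]


# Hypothetical topic vectors (one weight per extracted topic).
topic_vectors = {
    "seed_song": [1.0, 0.0, 0.0],
    "track_b": [0.9, 0.1, 0.0],
    "track_c": [0.0, 1.0, 0.0],
    "track_d": [0.5, 0.5, 0.0],
}
picks = playlist_from_seed("seed_song", topic_vectors)
```

A combined lyrics-and-acoustics vector, which the abstract also mentions, would pass through the same functions unchanged as long as every track uses the same dimensions.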
SYSTEMS AND METHODS FOR TRANSPOSING SPOKEN OR TEXTUAL INPUT TO MUSIC
Described herein are real-time musical translation devices (RETM) and methods of use thereof. Exemplary uses of RETMs include optimizing the understanding and/or recall of an input message for a user and improving a cognitive process in a user.
Music Notation Using a Disproportionate Correlated Scale
Methods and systems of music notation for visually representing music that provide a visual scale representing a range of an auditory scale of a portion of a musical composition spanning at least four and a half steps. The visual scale may comprise a plurality of whole-step segments each representing one whole step in the auditory scale. Each whole-step segment may be approximately a first height. The visual scale may also comprise one or more half-step segments each representing one half step in the auditory scale. Each half-step segment may be approximately a second height. A first ratio representing the first height divided by the second height may be significantly greater than a second ratio representing the whole step divided by the half step.
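The ratio condition can be made concrete with a small worked example. The heights below are hypothetical values chosen for illustration, and the abstract does not quantify "significantly greater", so the sketch only compares the two ratios. A whole step is two half steps, so the second (auditory) ratio is 2.

```python
WHOLE_STEP_HEIGHT = 10.0  # first height (hypothetical value)
HALF_STEP_HEIGHT = 2.0    # second height (hypothetical value)
AUDITORY_RATIO = 2.0      # whole step divided by half step


def segment_boundaries(pattern):
    """Vertical positions of segment edges for a step pattern such as
    'WWHWW' (W = whole step, H = half step), starting from 0."""
    position = 0.0
    edges = [position]
    for step in pattern:
        position += WHOLE_STEP_HEIGHT if step == "W" else HALF_STEP_HEIGHT
        edges.append(position)
    return edges


# Four whole steps plus one half step spans four and a half steps.
edges = segment_boundaries("WWHWW")
visual_ratio = WHOLE_STEP_HEIGHT / HALF_STEP_HEIGHT
exceeds_auditory_ratio = visual_ratio > AUDITORY_RATIO
```

With these values the visual ratio is 5 against an auditory ratio of 2, so half-step segments are drawn much shorter than a proportional scale would make them.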
SHORT SEGMENT GENERATION FOR USER ENGAGEMENT IN VOCAL CAPTURE APPLICATIONS
User interface techniques provide user vocalists with mechanisms for solo audiovisual capture and for seeding subsequent performances by other users (e.g., joiners). Audiovisual capture may be against a full-length work or seed spanning much or all of a pre-existing audio (or audiovisual) work and in some cases may mix, to seed further contributions of one or more joiners, a user's captured media content for at least some portions of the audio (or audiovisual) work. A short seed or short segment may span less than all (and in some cases, much less than all) of the audio (or audiovisual) work. For example, a verse, chorus, refrain, hook or other limited chunk of an audio (or audiovisual) work may constitute a short seed or short segment. Computational techniques are described that allow a system to automatically identify suitable short seeds or short segments. After audiovisual capture against the short seed or short segment, a resulting, solo or group, full-length or short-form performance may be posted, livestreamed, or otherwise disseminated in a social network.