Electronic musical instrument, electronic musical instrument control method, and storage medium
An electronic musical instrument includes: an operation unit; a memory that stores lyric data including lyrics for a plurality of timings, pitch data including pitches for said plurality of timings, and a trained model that has learned singing voice features of a singer; and at least one processor. At each of said plurality of timings, the at least one processor: if the operation unit is not operated, obtains, from the trained model, a singing voice feature associated with a lyric indicated by the lyric data and a pitch indicated by the pitch data; if the operation unit is operated, obtains, from the trained model, a singing voice feature associated with the lyric indicated by the lyric data and a pitch indicated by the operation of the operation unit; and synthesizes and outputs singing voice data based on the obtained singing voice feature of the singer.
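The per-timing control flow in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the trained acoustic model is mocked with a stand-in function, and all names (`select_pitch`, `synthesize_step`, `mock_model`) are hypothetical.

```python
def select_pitch(stored_pitch, key_pitch):
    """Use the pitch from the operation unit (keyboard) when a key is
    pressed; otherwise fall back to the pitch stored in the song data."""
    return key_pitch if key_pitch is not None else stored_pitch

def synthesize_step(model, lyric, stored_pitch, key_pitch):
    """One timing step: pick the pitch source, then query the trained
    model for the singing voice feature of that lyric/pitch pair."""
    pitch = select_pitch(stored_pitch, key_pitch)
    return model(lyric, pitch)

# Stand-in for the trained model: just echoes its inputs as a "feature".
mock_model = lambda lyric, pitch: (lyric, pitch)

print(synthesize_step(mock_model, "la", 60, None))  # ('la', 60): key not pressed
print(synthesize_step(mock_model, "la", 60, 64))    # ('la', 64): key pressed
```

The point of the branch is that the stored melody keeps the synthesized voice singing even when the player does nothing, while a key press overrides only the pitch, never the lyric.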
SONG PROCESSING METHOD AND APPARATUS, ELECTRONIC DEVICE, AND READABLE STORAGE MEDIUM
This application provides a song processing method performed by a computer device. The method includes: presenting a song recording interface in response to a singing instruction triggered in a session interface of a group chat session; recording a song in response to a song recording instruction triggered in the song recording interface, and determining a reverberation effect corresponding to the recorded song; and transmitting, in response to a song transmitting instruction, a target song obtained by processing the song based on the reverberation effect to members of the group chat session, presenting a session message corresponding to the target song in the session interface, and presenting a pick-up singing function item corresponding to the target song in the session interface, the pick-up singing function item being used by members of the group chat session to continue singing the target song.
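The abstract's "processing the song based on the reverberation effect" step is, at its simplest, a convolution of the dry recording with an impulse response. A minimal sketch, assuming the reverb is modeled as a short impulse response (the function name and toy signals are illustrative, not from the patent):

```python
def apply_reverb(signal, impulse_response):
    """Toy reverb: direct convolution of the dry signal with an
    impulse response describing the chosen reverberation effect."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

# A unit impulse followed by silence, through a two-tap "room":
print(apply_reverb([1.0, 0.0, 0.0], [1.0, 0.5]))  # [1.0, 0.5, 0.0, 0.0]
```

Real systems would use FFT-based convolution or parametric reverb; the direct loop is only meant to show where the selected effect enters the pipeline before the target song is transmitted.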
AUTOENCODER-BASED LYRIC GENERATION
Some embodiments of the present disclosure relate to generating novel lyric lines conditioned on music audio. A bimodal neural network model may learn to generate lyric lines conditioned on a given short audio clip. The bimodal neural network model includes a spectrogram variational autoencoder and a text variational autoencoder. Output from the spectrogram variational autoencoder is used to influence output from the text variational autoencoder.
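The conditioning described here (audio latent influencing the text decoder) is commonly realized by feeding the spectrogram VAE's latent vector into the text decoder, e.g. by concatenation. A minimal NumPy sketch, assuming concatenation-based conditioning and linear stand-in encoders; none of the names come from the disclosure:

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, dim=4):
    # Stand-in encoder: a random linear projection to a latent mean
    # (a real VAE would also sample from a learned variance).
    w = rng.standard_normal((x.size, dim))
    return x @ w

def decode_text(text_latent, audio_latent):
    # The audio latent conditions the text decoder by concatenation;
    # a real decoder would map z to token logits for a lyric line.
    return np.concatenate([text_latent, audio_latent])

spectrogram = rng.standard_normal(16)  # toy features of a short audio clip
lyric_line = rng.standard_normal(8)    # toy features of a lyric line

audio_z = encode(spectrogram)
text_z = encode(lyric_line)
conditioned = decode_text(text_z, audio_z)
print(conditioned.shape)  # (8,): 4 text dims + 4 audio dims
```

The design choice is that the text VAE alone models fluent lyric lines, while the injected audio latent steers which lines are plausible for the given clip.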
System and method for association of a song, music, or other media content with a user's video content
In accordance with an embodiment, described herein is a system and method for association of a song, music, or other media content with a user's video content. The system enables a user to associate a song, music, or other media content, together with its audio clip and song metadata, with a video they are about to create or have created, to create a video moment. A recipient of the video moment can hear the audio clip in combination with the video content, and also view the song metadata overlay, to determine the name of the song and artist used in the video, or optionally access the song at a media server for further listening.
SYSTEMS AND METHODS FOR TRANSPOSING SPOKEN OR TEXTUAL INPUT TO MUSIC
Described herein are real-time musical translation devices (RETMs) and methods of use thereof. Exemplary uses of RETMs include optimizing the understanding and/or recall of an input message for a user and improving a cognitive process in a user.
REFERENCE DISPLAY DEVICE, REFERENCE DISPLAY METHOD, AND PROGRAM
Provided are a display device and a program that allow a user to intuitively recognize the connection between notes and the breathing timing between them. A CPU (11) generates a guide image based on information about the sound-producing timing and sound length of each note, which are included in a guide melody track. The CPU (11) smoothly connects the notes, then disconnects them at the breathing timings indicated in a breath position track.
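The connect-then-disconnect logic above amounts to splitting a note sequence into phrases at breath positions. A minimal sketch, assuming notes and breaths are given as sorted onset times (the function name is hypothetical):

```python
def split_at_breaths(notes, breaths):
    """Split a smoothly connected sequence of note onsets into phrases,
    breaking wherever a breath timing falls between two notes."""
    phrases, current = [], []
    breaths = sorted(breaths)
    bi = 0
    for t in notes:
        # Flush the current phrase at every breath that precedes this note.
        while bi < len(breaths) and breaths[bi] <= t:
            if current:
                phrases.append(current)
                current = []
            bi += 1
        current.append(t)
    if current:
        phrases.append(current)
    return phrases

print(split_at_breaths([0, 1, 2, 4, 5], [3]))  # [[0, 1, 2], [4, 5]]
```

Each returned phrase would be drawn as one smoothly connected stroke in the guide image, with a visible gap at each breath.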
ELECTRONIC DEVICE, METHOD AND COMPUTER PROGRAM
An electronic device having circuitry configured to perform audio source separation on an audio input signal to obtain a vocals signal and an accompaniment signal, and to perform a confidence analysis on a user's voice signal based on the vocals signal to provide guidance to the user.
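One plausible form of the "confidence analysis" is a similarity score between the user's voice and the separated vocal stem. A minimal sketch, assuming normalized cross-correlation as the score (the metric and names are assumptions, not from the abstract):

```python
import numpy as np

def confidence(user, reference):
    """Toy confidence score: normalized cross-correlation between the
    user's voice and the separated vocal stem (1.0 = identical up to scale)."""
    u = user - user.mean()
    r = reference - reference.mean()
    denom = np.linalg.norm(u) * np.linalg.norm(r)
    return float(u @ r) / denom if denom else 0.0

voice = np.array([0.0, 1.0, 0.0, -1.0])
print(confidence(voice, voice))   # 1.0  (perfect match)
print(confidence(voice, -voice))  # -1.0 (inverted)
```

A practical device would compute this on pitch or spectral features rather than raw samples, then map the score to guidance such as "sing higher" or "on pitch".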
User interfaces for content applications
In some embodiments, an electronic device displays time-synced lyrics of content items playing on an electronic device. In some embodiments, an electronic device displays representations of content items in a playback sequence on an electronic device. In some embodiments, an electronic device shares an item of content with another user account of another electronic device.
APPARATUS AND METHOD FOR GENERATING VISUAL CONTENT FROM AN AUDIO SIGNAL
An apparatus and method for generating visual content from an audio signal are described. The method includes receiving (310) audio content, processing (320) the audio content to separate into a first and second portion of the audio content, converting (330) the second portion into visual content, delaying (340) the first portion based on a time relationship between the audio content and the visual content, the delaying accounting for time to process the first portion and convert the second portion, and providing (350) the visual content and audio content for reproduction. The apparatus includes a source separation module (210) processing the received audio content to separate into a first and second portion of the audio content, a converter module (220) converting the second portion into visual content, and a synchronization module (230) delaying the first portion based on a time relationship between the audio content and the visual content.
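The delaying step (340) can be sketched as padding the pass-through audio by the latency of the visual conversion, so both streams reach reproduction in sync. A minimal frame-based illustration under that assumption (the function name and `None` padding are illustrative):

```python
def align(audio_frames, processing_delay_frames):
    """Delay the first (audio) portion by the number of frames the
    visual conversion of the second portion takes, so that audio and
    generated visuals are reproduced in sync."""
    return [None] * processing_delay_frames + audio_frames

delayed = align([1, 2, 3], 2)
print(delayed)  # [None, None, 1, 2, 3]
```

In a streaming implementation this would be a fixed-size ring buffer rather than list padding, but the time relationship is the same: audio output lags its input by exactly the conversion latency.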
VIDEO GENERATION METHOD, APPARATUS AND TERMINAL
Embodiments of the present disclosure provide a video generation method, device and terminal. Embodiments of the present disclosure may apply effect processing, using a selected video effect template, to image frames obtained from a video either during shooting or after shooting is finished, and compose the processed image frames into a composite video. This solves the problem that, during the recording of a song, only content shot by the camera can be presented and no customized functions satisfying users' requirements can be provided. The effect achieved is that effect processing is applied to all or some image frames in the shot video, using the video effect template selected by the user, to obtain a composite video satisfying the users' requirements.