Patent classifications
G10H2250/455
SINGING VOICE CONVERSION
A method, computer program, and computer system are provided for converting a first singing voice associated with a first speaker to a second singing voice associated with a second speaker. A context associated with one or more phonemes corresponding to the first singing voice is encoded, and the one or more phonemes are aligned to one or more target acoustic frames based on the encoded context. One or more mel-spectrogram features are recursively generated from the aligned phonemes and target acoustic frames, and a sample corresponding to the first singing voice is converted to a sample corresponding to the second singing voice using the generated mel-spectrogram features.
ELECTRONIC MUSICAL INSTRUMENT AND CONTROL METHOD FOR ELECTRONIC MUSICAL INSTRUMENT
An electronic musical instrument outputs synthesized lyrics of a song based on lyric data in accordance with operations by a user. One or more processors in the electronic musical instrument generate voice synthesis data for a lyric of the song based on the lyric data for the song at a timing at which said lyric is supposed to be outputted, regardless of whether or not a user operation of the operating unit is detected at said timing; when the user operation of the operating unit is detected at said timing, cause voice sound synthesized based on the generated voice synthesis data to be outputted; and when the user operation of the operating unit is not detected at said timing, cause the voice sound synthesized based on the generated voice synthesis data not to be outputted.
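The control flow in this abstract can be illustrated with a minimal sketch: synthesis runs at every scheduled lyric timing, and only the output is gated by the user operation. The names `LyricEvent`, `synthesize`, and `run_song`, the tick-based timing, and the string placeholder for audio are all assumptions made for illustration; none of them come from the patent.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class LyricEvent:
    """A lyric syllable and the tick at which it is scheduled (hypothetical format)."""
    tick: int
    text: str


def synthesize(text: str) -> str:
    """Stand-in for a real voice synthesizer; returns a labeled placeholder."""
    return f"<voice:{text}>"


def run_song(events: List[LyricEvent], key_down_ticks: Set[int]) -> List[Optional[str]]:
    """Return what is emitted at each lyric timing (None means nothing is output)."""
    emitted: List[Optional[str]] = []
    for event in events:
        # Synthesis data is generated at every scheduled timing,
        # whether or not a user operation is detected.
        voice = synthesize(event.text)
        # Only the output is gated by the user operation at that timing.
        emitted.append(voice if event.tick in key_down_ticks else None)
    return emitted


song = [LyricEvent(0, "twin"), LyricEvent(1, "kle"), LyricEvent(2, "twin")]
result = run_song(song, key_down_ticks={0, 2})
```

Here the syllable scheduled at tick 1 is still synthesized but is not output, because no key operation is detected at that tick.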
Voice synthesis method, voice synthesis apparatus, and recording medium
A voice synthesis method designates a target feature of a voice to be synthesized; specifies harmonic frequencies for a plurality of respective harmonic components of the voice and an amplitude spectrum envelope of the voice; specifies a harmonic amplitude distribution of each of the plurality of respective harmonic components based on (i) the target feature, (ii) the amplitude spectrum envelope, and (iii) the harmonic frequency specified for the respective harmonic component, the harmonic amplitude distribution representing a distribution of amplitudes in a unit band with a peak amplitude corresponding to the respective harmonic component; and generates a frequency spectrum of the voice with the target feature based on harmonic amplitude distributions specified for each of the plurality of respective harmonic components and the amplitude spectrum envelope.
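One way to picture the spectrum construction described above is the following sketch. It assumes harmonics at integer multiples of a fundamental `f0`, a unit band of width `f0` centered on each harmonic, and a Gaussian-shaped amplitude distribution whose width `width_hz` stands in for the "target feature". All of these choices, and the function name `harmonic_spectrum`, are illustrative assumptions rather than the method claimed in the patent.

```python
import math
from typing import Callable, List


def harmonic_spectrum(
    f0: float,
    n_harmonics: int,
    envelope: Callable[[float], float],
    width_hz: float,
    bin_hz: float,
    n_bins: int,
) -> List[float]:
    """Build an amplitude spectrum from per-harmonic amplitude distributions."""
    spectrum = [0.0] * n_bins
    for k in range(1, n_harmonics + 1):
        fk = k * f0                      # harmonic frequency (assumed k * f0)
        peak = envelope(fk)              # peak amplitude read off the envelope
        lo, hi = fk - f0 / 2, fk + f0 / 2  # unit band around this harmonic
        for b in range(n_bins):
            f = b * bin_hz
            if lo <= f < hi:
                # Gaussian shape is an assumption; width_hz plays the role
                # of the target feature that controls the distribution.
                spectrum[b] = peak * math.exp(-0.5 * ((f - fk) / width_hz) ** 2)
    return spectrum


spec = harmonic_spectrum(
    f0=100.0,
    n_harmonics=4,
    envelope=lambda f: 1.0 / (1.0 + f / 1000.0),
    width_hz=20.0,
    bin_hz=10.0,
    n_bins=50,
)
```

Changing `width_hz` while keeping the envelope fixed alters how sharply each harmonic peaks, which is the sense in which the distribution depends on the target feature in this sketch.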
SYSTEMS AND METHODS FOR TRANSPOSING SPOKEN OR TEXTUAL INPUT TO MUSIC
Described herein are real-time musical translation devices (RETM) and methods of use thereof. Exemplary uses of RETMs include optimizing the understanding and/or recall of an input message for a user and improving a cognitive process in a user.
Systems and methods for transposing spoken or textual input to music
Described herein are real-time musical translation devices (RETM) and methods of use thereof. Exemplary uses of RETMs include optimizing the understanding and/or recall of an input message for a user and improving a cognitive process in a user.
UNSUPERVISED SINGING VOICE CONVERSION WITH PITCH ADVERSARIAL NETWORK
A method, a computer readable medium, and a computer system are provided for singing voice conversion. Data corresponding to a singing voice is received. One or more features and pitch data are extracted from the received data using one or more adversarial neural networks. One or more audio samples are generated based on the extracted pitch data and the one or more features.
ELECTRONIC MUSICAL INSTRUMENTS, METHOD AND STORAGE MEDIA
In an electronic musical instrument that can output stored lyrics of a song in accordance with keyboard operations by a user, a processor determines, using prescribed criteria, whether a melody should be advanced while multiple keys of a keyboard are pressed by the user. If the processor determines that the melody should be advanced, the processor advances the lyric in response to the user's multiple key operation; if the processor determines that the melody should not be advanced, the processor does not advance the lyric in response to the user's multiple key operation.
ELECTRONIC MUSICAL INSTRUMENTS, METHOD AND STORAGE MEDIA
In an electronic musical instrument that can output stored lyrics of a song in accordance with operations by a user, a processor determines whether a pedal is on or off, and if the pedal is off, the lyric is advanced in accordance with a user operation of a keyboard, and if the pedal is on, the lyric is not advanced in accordance with a user operation of a keyboard.
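The pedal rule in this abstract can be sketched as a small state machine. The class name `LyricPlayer`, the choice to advance before sounding, the rule that the first key press always starts on the first syllable, and the clamping at the last syllable are assumptions added for illustration; the abstract only specifies that the lyric advances on a key operation when the pedal is off and does not advance when it is on.

```python
from typing import List


class LyricPlayer:
    """Tracks the lyric position; a hypothetical model, not the patented design."""

    def __init__(self, lyrics: List[str]) -> None:
        self.lyrics = lyrics
        self.index = -1        # no syllable has been sung yet
        self.pedal_on = False

    def set_pedal(self, on: bool) -> None:
        self.pedal_on = on

    def key_press(self) -> str:
        # Pedal off: the key operation advances the lyric.
        # Pedal on: the lyric position is held and the syllable repeats.
        # Assumption: the very first press always starts on syllable 0.
        if not self.pedal_on or self.index < 0:
            self.index = min(self.index + 1, len(self.lyrics) - 1)
        return self.lyrics[self.index]


player = LyricPlayer(["la", "li", "lu"])
sung = [player.key_press()]          # pedal off: advances to "la"
player.set_pedal(True)
sung += [player.key_press(), player.key_press()]   # pedal on: stays on "la"
player.set_pedal(False)
sung += [player.key_press(), player.key_press()]   # pedal off: "li", then "lu"
```

With the pedal held, repeated key presses re-sing the current syllable, which lets a player spread one syllable over several notes.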
COMPUTER-IMPLEMENTED METHOD AND DEVICE FOR GENERATING FREQUENCY COMPONENT VECTOR OF TIME-SERIES DATA
A computer-implemented method generates a frequency component vector of time-series data by executing a first process and a second process in each unit step. The first process includes: receiving first data; and processing the first data using a first neural network to generate intermediate data. The second process includes: receiving the generated intermediate data; and generating a plurality of component values corresponding to a plurality of frequency bands based on the generated intermediate data such that: a first component value corresponding to a first frequency band is generated using a second neural network based on the generated intermediate data; and a second component value corresponding to a second frequency band different from the first frequency band is generated using the second neural network based on the generated intermediate data and the generated first component value corresponding to the first frequency band.
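The two-process structure can be sketched as follows: one function produces intermediate data once per unit step, and a second function is called once per frequency band, conditioned on the intermediate data and on the component values already generated. The functions `first_network` and `second_network` are toy arithmetic stand-ins for the neural networks, and conditioning on all previous bands (rather than only the first) is an assumption made to keep the example short.

```python
from typing import List, Sequence


def first_network(first_data: Sequence[float]) -> float:
    """Toy stand-in for the first neural network: yields one intermediate value."""
    return sum(first_data) / len(first_data)


def second_network(intermediate: float, previous: Sequence[float]) -> float:
    """Toy stand-in for the second neural network.

    It sees the intermediate data plus the component values generated so far.
    """
    return 0.5 * intermediate + 0.25 * sum(previous)


def unit_step(first_data: Sequence[float], n_bands: int) -> List[float]:
    """Generate one frequency component vector for a single unit step."""
    intermediate = first_network(first_data)      # first process
    components: List[float] = []
    for _ in range(n_bands):                      # second process, band by band
        # The first band depends only on the intermediate data; later bands
        # also depend on the component values generated before them.
        components.append(second_network(intermediate, components))
    return components


vector = unit_step([1.0, 3.0], n_bands=3)
```

The same second function is reused for every band, mirroring the abstract's use of a single second neural network across frequency bands.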
DISPLAY CONTROL METHOD, DISPLAY CONTROL DEVICE, AND PROGRAM
A display control method includes causing a display device to display a processing image in which a first image representing a note corresponding to a synthesized sound and a second image representing a sound effect are arranged in an area, in which a pitch axis and a time axis are set, in accordance with synthesis data that specify the synthesized sound generated by sound synthesis and the sound effect added to the synthesized sound.