G10H2250/455

Method and apparatus for generating music

A terminal for generating music may identify, based on execution of scenario recognition, scenarios for images previously received by the terminal. The terminal may generate respective description texts for the scenarios. The terminal may execute keyword-based rhyme matching based on the respective description texts. The terminal may generate respective rhyming lyrics corresponding to the images. The terminal may convert the respective rhyming lyrics corresponding to the images into speech. The terminal may synthesize the speech with preset background music to obtain image music.

Voice synthesis method, voice synthesis apparatus, and recording medium
11295723 · 2022-04-05 ·

A voice synthesis method includes: supplying a first trained model with control data including phonetic identifier data to generate a series of frequency spectra of harmonic components; supplying a second trained model with the control data to generate a waveform signal representative of non-harmonic components; and generating a voice signal including the harmonic components and the non-harmonic components based on the series of frequency spectra of the harmonic components generated by the first trained model and the waveform signal representative of the non-harmonic components generated by the second trained model.
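The final combining step can be illustrated with a minimal sketch. It makes several assumptions that are not in the abstract: the output of the first trained model is simplified to per-frame lists of harmonic amplitudes at a fixed fundamental frequency, the conversion to a waveform uses plain additive synthesis, and the function names `harmonics_to_waveform` and `combine` are hypothetical. Neither trained model is implemented here.

```python
import math


def harmonics_to_waveform(frames, f0, sample_rate, frame_len):
    """Render per-frame harmonic amplitudes as a waveform (assumed simplification).

    frames[i][k] is the amplitude of harmonic k+1 during frame i.
    """
    out = []
    for i, amps in enumerate(frames):
        for n in range(frame_len):
            t = (i * frame_len + n) / sample_rate
            out.append(sum(a * math.sin(2 * math.pi * f0 * (k + 1) * t)
                           for k, a in enumerate(amps)))
    return out


def combine(harmonic, non_harmonic):
    """Sum the harmonic and non-harmonic waveforms sample by sample."""
    if len(harmonic) != len(non_harmonic):
        raise ValueError("waveforms must have the same length")
    return [h + n for h, n in zip(harmonic, non_harmonic)]


# Stand-ins for the outputs of the two trained models (hypothetical values).
frames = [[1.0, 0.5], [0.0, 0.0]]
harmonic = harmonics_to_waveform(frames, f0=100.0, sample_rate=8000, frame_len=4)
voice = combine(harmonic, [0.1] * len(harmonic))
```

Generating the two components separately lets each model specialize; the sketch only shows that the results are merged into one signal afterwards.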

ELECTRONIC MUSICAL INSTRUMENT, METHOD, AND STORAGE MEDIUM

An electronic musical instrument includes: a plurality of keys that include at least first keys corresponding to a first pitch range and second keys corresponding to a second pitch range; and at least one processor, configured to perform the following: in accordance with a key operation in the first pitch range, determining a syllable position contained in a phrase; and in accordance with a key operation in the second pitch range, instructing a sound production of a digitally synthesized sound corresponding to the determined syllable position.
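The division of roles between the two pitch ranges can be sketched as follows. The mapping from a first-range key to a syllable index (offset from the lowest key, wrapped to the phrase length), the MIDI-style key numbers, and the name `make_instrument` are assumptions for illustration; the abstract does not specify them.

```python
def make_instrument(phrase, first_low, second_low):
    """Return a key handler for a phrase given as a list of syllables.

    Keys in [first_low, second_low) select a syllable position.
    Keys at or above second_low request sound for the selected syllable.
    """
    state = {"position": 0}

    def key_down(key):
        if first_low <= key < second_low:
            # Assumed mapping: key offset within the first range picks the syllable.
            state["position"] = (key - first_low) % len(phrase)
            return None
        if key >= second_low:
            # Sound the selected syllable at the pitch of the pressed key.
            return (phrase[state["position"]], key)
        return None

    return key_down


key_down = make_instrument(["la", "li", "lu"], first_low=36, second_low=60)
key_down(37)            # selects the second syllable
note = key_down(64)     # requests ("li", 64)
```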

ELECTRONIC MUSICAL INSTRUMENT, METHOD, AND STORAGE MEDIUM

An electronic musical instrument includes: a plurality of keys that include at least first keys corresponding to a first pitch range and second keys corresponding to a second pitch range; and at least one processor, configured to perform the following: preventing a syllable position in a phrase that is digitally synthesized for output from advancing, no matter how the second keys in the second pitch range are operated, while a key operation in the first pitch range is being continuously maintained; and causing the syllable position to advance every time a key operation in the second pitch range is performed while none of the first keys in the first pitch range are being operated.
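The gating rule can be modeled as a small state machine. This is a sketch under assumptions: the class name `SyllableSequencer`, the key ranges, sounding the current syllable before advancing, and wrapping to the start of the phrase after the last syllable are all illustrative choices that the abstract does not state.

```python
from dataclasses import dataclass, field


@dataclass
class SyllableSequencer:
    """Toy model: a held first-range key freezes the syllable position."""
    syllables: list
    first_range: range
    second_range: range
    position: int = 0
    held_first: set = field(default_factory=set)

    def key_down(self, key):
        if key in self.first_range:
            self.held_first.add(key)
            return None
        if key in self.second_range:
            syllable = self.syllables[self.position]
            # Advance only when no first-range key is being held.
            if not self.held_first:
                self.position = (self.position + 1) % len(self.syllables)
            return syllable
        return None

    def key_up(self, key):
        self.held_first.discard(key)


seq = SyllableSequencer(["twin", "kle", "lit", "tle"], range(48, 60), range(60, 84))
sung = [seq.key_down(60), seq.key_down(62)]   # position advances each time
seq.key_down(48)                              # hold a first-range key
sung += [seq.key_down(64), seq.key_down(65)]  # same syllable, new pitches
```

Holding a first-range key therefore lets the player sing one syllable over several notes, while releasing it returns to one syllable per note.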

Audio Information Playback Method, Audio Information Playback Device, Audio Information Generation Method and Audio Information Generation Device
20220044662 · 2022-02-10 ·

An audio information playback method includes reading audio information, reading separator information, acquiring note-on information and note-off information, moving a playback position, and starting playback. The starting of the playback is from the loop end position to the playback end position of an utterance unit subject to playback in response to acquisition of the note-off information corresponding to the note-on information.
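The movement of the playback position can be sketched with sample indices. The abstract names a loop end position and a playback end position; the loop start position, the `UtteranceUnit` fields, and measuring the note length in samples are assumptions added for this illustration.

```python
from dataclasses import dataclass


@dataclass
class UtteranceUnit:
    start: int       # playback start position (sample index)
    loop_start: int  # assumed: where the sustained section begins
    loop_end: int    # loop end position
    end: int         # playback end position (exclusive)


def playback_positions(unit, held_samples):
    """List the sample indices played for a note held for held_samples samples."""
    positions = []
    pos = unit.start
    # Between note-on and note-off: play forward, looping the sustained section.
    for _ in range(held_samples):
        positions.append(pos)
        pos += 1
        if pos >= unit.loop_end:
            pos = unit.loop_start
    # On note-off: move to the loop end and play through to the playback end.
    positions.extend(range(unit.loop_end, unit.end))
    return positions


unit = UtteranceUnit(start=0, loop_start=2, loop_end=4, end=6)
print(playback_positions(unit, held_samples=5))  # [0, 1, 2, 3, 2, 4, 5]
```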

METHOD AND SYSTEM FOR INTERACTIVE SONG GENERATION
20210312897 · 2021-10-07 ·

A method and system may provide for interactive song generation. In one aspect, a computer system may present options for selecting a background track. The computer system may generate suggested lyrics based on parameters entered by the user. User interface elements allow the computer system to receive input of lyrics. As the user inputs lyrics, the computer system may update its suggestions of lyrics based on the previously input lyrics. In addition, the computer system may generate proposed melodies to go with the lyrics and the background track. The user may select from among the melodies created for each portion of lyrics. The computer system may optionally generate a computer-synthesized vocal(s) or capture a vocal track of a human voice singing the song. The background track, lyrics, melodies, and vocals may be combined to produce a complete song without requiring musical training or experience by the user.

AUDIO PROCESSING METHOD AND AUDIO PROCESSING SYSTEM
20210256959 · 2021-08-19 ·

An audio processing system includes a memory and a processor. The processor implements instructions to: establish a re-trained synthesis model by additionally training a pre-trained synthesis model for generating feature data representative of acoustic features of an audio signal according to condition data representative of sounding conditions, using: first condition data representative of sounding conditions identified from a first audio signal of a first sound source; and first feature data representative of acoustic features of the first audio signal; receive an instruction to modify at least one of the sounding conditions of the first audio signal; generate second feature data by inputting second condition data representative of the modified at least one sounding condition into the re-trained synthesis model established by the additional training; and generate a modified audio signal in accordance with the generated second feature data.

INFORMATION PROCESSING METHOD AND INFORMATION PROCESSING SYSTEM
20210256960 · 2021-08-19 ·

An information processing system includes at least one memory storing a program and at least one processor. The at least one processor implements the program to input a piece of sound source data representative of a sound source, a piece of style data representative of a performance style, and synthesis data representative of sounding conditions into a synthesis model generated by machine learning, and to generate, using the synthesis model, feature data representative of acoustic features of a target sound of the sound source to be generated in the performance style and according to the sounding conditions, and to generate an audio signal corresponding to the target sound using the generated feature data.

ARTIFICIALLY GENERATING AUDIO DATA FROM TEXTUAL INFORMATION AND RHYTHM INFORMATION
20210224319 · 2021-07-22 ·

Methods and systems for artificially generating media streams are provided. Textual information, rhythm information and voice characteristics may be received. It may be determined that a first portion of the textual information corresponds to a first portion of the rhythm information and that a second portion of the textual information corresponds to a second portion of the rhythm information. An audio stream may be generated based on the textual information, the rhythm information and the voice characteristics. A first portion of the audio stream may include a vocal expression of the first portion of the textual information in a voice corresponding to the voice characteristics and according to the first portion of the rhythm information, and a second portion may include a vocal expression of the second portion of the textual information in the voice corresponding to the voice characteristics and according to the second portion of the rhythm information.
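The correspondence between portions of text and portions of rhythm can be illustrated with a minimal sketch. It assumes that the text is already split into portions, that the rhythm information is a list of durations in beats with one entry per text portion, and that a tempo in beats per minute is given; the function name `schedule` is hypothetical, and voice characteristics and audio rendering are left out.

```python
def schedule(text_portions, beats, bpm):
    """Pair each text portion with an onset time and duration in seconds."""
    if len(text_portions) != len(beats):
        raise ValueError("need one rhythm entry per text portion")
    seconds_per_beat = 60.0 / bpm
    onset = 0.0
    plan = []
    for portion, beat_count in zip(text_portions, beats):
        duration = beat_count * seconds_per_beat
        plan.append((portion, onset, duration))
        onset += duration
    return plan


plan = schedule(["hel", "lo", "world"], beats=[1, 1, 2], bpm=120)
# Each tuple is (text portion, onset in seconds, duration in seconds).
```

A synthesizer could then voice each portion for its scheduled duration, which is the sense in which each portion of the audio stream follows its portion of the rhythm information.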

LEARNING SINGING FROM SPEECH
20210248997 · 2021-08-12 ·

A method, computer program, and computer system are provided for converting a singing voice of a first person associated with a first speaker to a singing voice of a second person using a speaking voice of the second person associated with a second speaker. A context associated with one or more phonemes corresponding to the singing voice of the first person is encoded, and the one or more phonemes are aligned to one or more target acoustic frames based on the encoded context. One or more mel-spectrogram features are recursively generated from the aligned phonemes, the target acoustic frames, and a sample of the speaking voice of the second person. A sample corresponding to the singing voice of the first person is converted to a sample corresponding to the singing voice of the second person using the generated mel-spectrogram features.