G10H2250/455

VOICE SYNTHESIS METHOD, VOICE SYNTHESIS APPARATUS, AND RECORDING MEDIUM
20200342848 · 2020-10-29

A voice synthesis method designates a target feature of a voice to be synthesized; specifies harmonic frequencies for a plurality of respective harmonic components of the voice and an amplitude spectrum envelope of the voice; specifies a harmonic amplitude distribution of each of the plurality of respective harmonic components based on (i) the target feature, (ii) the amplitude spectrum envelope, and (iii) the harmonic frequency specified for the respective harmonic component, the harmonic amplitude distribution representing a distribution of amplitudes in a unit band with a peak amplitude corresponding to the respective harmonic component; and generates a frequency spectrum of the voice with the target feature based on harmonic amplitude distributions specified for each of the plurality of respective harmonic components and the amplitude spectrum envelope.
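To make the abstract above concrete, here is a minimal sketch of building an amplitude spectrum from per-harmonic distributions. It is not the patented method. The Gaussian lobe shape, the `envelope` function, and the use of a single `target_width_hz` number to stand in for the "target feature" are assumptions made purely for illustration, and all function names are hypothetical.

```python
import math

def envelope(freq_hz):
    """Hypothetical smooth amplitude spectrum envelope that decays with frequency."""
    return 1.0 / (1.0 + freq_hz / 1000.0)

def harmonic_distribution(freqs, center_hz, peak_amp, width_hz):
    """Amplitudes within one unit band, peaking at the harmonic frequency.

    A Gaussian lobe is assumed here; the abstract does not specify a shape.
    """
    return [peak_amp * math.exp(-0.5 * ((f - center_hz) / width_hz) ** 2)
            for f in freqs]

def synthesize_spectrum(f0_hz, n_harmonics, target_width_hz, bin_hz=10.0):
    """Place one amplitude distribution per unit band to form a spectrum."""
    n_bins = int((n_harmonics + 0.5) * f0_hz / bin_hz) + 1
    freqs = [i * bin_hz for i in range(n_bins)]
    spectrum = [0.0] * n_bins
    for k in range(1, n_harmonics + 1):
        center = k * f0_hz
        lo, hi = center - f0_hz / 2, center + f0_hz / 2  # unit band of harmonic k
        band = [i for i, f in enumerate(freqs) if lo <= f < hi]
        # The peak amplitude is read from the envelope at the harmonic frequency.
        dist = harmonic_distribution([freqs[i] for i in band], center,
                                     envelope(center), target_width_hz)
        for i, amp in zip(band, dist):
            spectrum[i] = amp
    return freqs, spectrum

freqs, spectrum = synthesize_spectrum(f0_hz=200.0, n_harmonics=3, target_width_hz=20.0)
```

In this sketch a larger `target_width_hz` spreads energy away from each harmonic peak, which is one plausible way a target feature could reshape the distributions.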

Speech characteristic recognition and conversion
10818308 · 2020-10-27

Systems, devices, media, and methods are presented for converting sounds in an audio stream. The systems and methods receive an audio conversion request initiating conversion of one or more sound characteristics of an audio stream from a first state to a second state. The systems and methods access an audio conversion model associated with an audio signature for the second state. The audio stream is converted based on the audio conversion model and an audio construct is compiled from the converted audio stream and a base audio segment. The compiled audio construct is presented at a client device.

Electronic musical instrument, electronic musical instrument control method, and storage medium

An electronic musical instrument includes: a memory that stores a trained acoustic model obtained by performing machine learning on training musical score data and training singing voice data of a singer; and at least one processor, wherein the at least one processor: in accordance with a user operation on an operation element in a plurality of operation elements, inputs prescribed lyric data and pitch data corresponding to the user operation of the operation element to the trained acoustic model, and digitally synthesizes and outputs inferred singing voice data that infers a singing voice of the singer on the basis of at least a portion of acoustic feature data output by the trained acoustic model, and on the basis of instrument sound waveform data that are synthesized in accordance with the pitch data corresponding to the user operation of the operation element.

Electronic musical instrument, electronic musical instrument control method, and storage medium

An electronic musical instrument in one aspect of the disclosure includes: a plurality of operation elements to be performed by a user for respectively specifying different pitches; a memory that stores musical piece data that includes data of a vocal part, the vocal part including at least a first note with a first pitch and an associated first lyric part that are to be played at a first timing; and at least one processor, wherein if the user does not operate any of the plurality of operation elements in accordance with the first timing, the at least one processor digitally synthesizes a default first singing voice that includes the first lyric part and that has the first pitch in accordance with data of the first note stored in the memory, and causes the digitally synthesized default first singing voice to be audibly output at the first timing.

SINGING EXPRESSION TRANSPLANTATION SYSTEM
20200302903 · 2020-09-24

Disclosed are a system and a method for singing expression transplantation. A singing expression transplantation method performed by a singing expression transplantation system according to an embodiment may comprise the steps of: synchronizing each of a first sound source and a second sound source, which include different pieces of voice information with regard to an identical song; modifying the pitch of the first sound source on the basis of pitch information extracted from each of the first sound source and the second sound source, which have been synchronized; and extracting volume information from each of the first sound source and the second sound source and adjusting the magnitude of the volume regarding the first sound source, the pitch of which has been modified, according to each piece of extracted volume information.
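The final volume step in the abstract above can be sketched as a per-frame gain adjustment. This is a minimal illustration under stated assumptions, not the disclosed system: the two sources are assumed to be already synchronized, pitch-modified, and of equal length, the adjustment is assumed to make the first source follow the second source's frame RMS, and the function names and `frame_len` value are hypothetical.

```python
import math

def frame_rms(samples, frame_len):
    """Root-mean-square level of each consecutive frame."""
    levels = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        levels.append(math.sqrt(sum(s * s for s in frame) / len(frame)))
    return levels

def transplant_volume(first, second, frame_len=4, eps=1e-9):
    """Scale each frame of `first` so its RMS follows that of `second`.

    Assumes both sources are synchronized and equally long.
    """
    rms_first = frame_rms(first, frame_len)
    rms_second = frame_rms(second, frame_len)
    adjusted = []
    for i, sample in enumerate(first):
        frame_idx = i // frame_len
        gain = rms_second[frame_idx] / (rms_first[frame_idx] + eps)
        adjusted.append(sample * gain)
    return adjusted

first = [0.5, -0.5, 0.5, -0.5, 0.2, -0.2, 0.2, -0.2]
second = [1.0, -1.0, 1.0, -1.0, 0.1, -0.1, 0.1, -0.1]
adjusted = transplant_volume(first, second)
```

A real system would likely smooth the gain across frame boundaries to avoid audible steps; the sketch omits that for brevity.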

VOICE SYNTHESIS METHOD, VOICE SYNTHESIS APPARATUS, AND RECORDING MEDIUM
20200294484 · 2020-09-17

Voice synthesis method and apparatus generate second control data using an intermediate trained model with first input data including first control data designating phonetic identifiers, change the second control data in accordance with a first user instruction provided by a user, generate synthesis data representing frequency characteristics of a voice to be synthesized using a final trained model with final input data including the first control data and the changed second control data, and generate a voice signal based on the generated synthesis data.

VOICE SYNTHESIS METHOD, VOICE SYNTHESIS APPARATUS, AND RECORDING MEDIUM
20200294486 · 2020-09-17

A voice synthesis method includes: supplying a first trained model with control data including phonetic identifier data to generate a series of frequency spectra of harmonic components; supplying a second trained model with the control data to generate a waveform signal representative of non-harmonic components; and generating a voice signal including the harmonic components and the non-harmonic components based on the series of frequency spectra of the harmonic components generated by the first trained model and the waveform signal representative of the non-harmonic components generated by the second trained model.
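The last step of the abstract above, combining harmonic and non-harmonic components into one voice signal, might look like the following sketch. The two trained models are not reproduced here; their outputs are replaced by hand-written stand-ins (a list of harmonic frequency and amplitude pairs, and a constant waveform), and the sinusoidal rendering and function names are assumptions for illustration only.

```python
import math

def harmonic_frame_to_waveform(harmonics, n_samples, sample_rate):
    """Render one harmonic spectrum frame as a sum of sinusoids.

    `harmonics` is a list of (frequency_hz, amplitude) pairs, standing in
    for the output of the first trained model.
    """
    return [
        sum(amp * math.sin(2 * math.pi * freq * n / sample_rate)
            for freq, amp in harmonics)
        for n in range(n_samples)
    ]

def mix_components(harmonic_wave, nonharmonic_wave):
    """Add the harmonic and non-harmonic waveforms sample by sample."""
    return [h + u for h, u in zip(harmonic_wave, nonharmonic_wave)]

harmonic_wave = harmonic_frame_to_waveform([(100.0, 1.0)], n_samples=4, sample_rate=400)
nonharmonic_wave = [0.1] * 4  # stand-in for the second trained model's output
voice = mix_components(harmonic_wave, nonharmonic_wave)
```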

KEYBOARD INSTRUMENT AND METHOD PERFORMED BY COMPUTER OF KEYBOARD INSTRUMENT
20200294485 · 2020-09-17

A keyboard instrument includes at least one processor that determines a first pattern of intonation to be applied to a first time segment of voice data on the basis of a first user operation on a first operation element, causes a first singing voice for the first time segment to be digitally synthesized from the first segment data in accordance with the determined first pattern of intonation, determines a second pattern of intonation to be applied to a second time segment of the voice data on the basis of a second user operation on a second operation element, and causes a second singing voice for the second time segment to be digitally synthesized from the second segment data in accordance with the determined second pattern of intonation.

Content control device and storage medium
10714066 · 2020-07-14

A content control device includes: a plurality of controls to which a plurality of parameters for controlling properties of content containing at least one of sound and video are respectively assigned, each of the plurality of controls outputting a first indicated value in accordance with an operation amount of the control; and a processor configured to previously create setting information used to determine respective values of the plurality of parameters in accordance with a second indicated value; determine the values of the plurality of parameters corresponding to the second indicated value respectively in accordance with the second indicated value and the setting information; and revise each of the values of the parameters to be determined in accordance with the first indicated value outputted for the control assigned to the parameter.
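The two-stage parameter logic in the abstract above can be sketched as follows. This is an illustrative reading, not the claimed device: the setting information is assumed to be a per-parameter (low, high) range interpolated linearly by the second indicated value, the revision is assumed to be an additive offset from each control's first indicated value clamped to [0, 1], and the parameter names are hypothetical.

```python
def determine_parameters(setting_info, second_value):
    """Map one second indicated value to every parameter via the setting information.

    `setting_info` maps a parameter name to an assumed (low, high) range.
    """
    return {name: low + (high - low) * second_value
            for name, (low, high) in setting_info.items()}

def revise_parameters(params, first_values):
    """Offset each parameter by its own control's first indicated value.

    Parameters whose control was not moved keep their determined value.
    """
    return {name: min(1.0, max(0.0, value + first_values.get(name, 0.0)))
            for name, value in params.items()}

setting_info = {"reverb": (0.0, 0.5), "brightness": (0.2, 1.0)}
params = determine_parameters(setting_info, second_value=0.5)
revised = revise_parameters(params, first_values={"reverb": 0.1})
```

Here the second indicated value moves all parameters together, while each individual control only fine-tunes the parameter assigned to it.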

ELECTRONIC MUSICAL INSTRUMENT, ELECTRONIC MUSICAL INSTRUMENT CONTROL METHOD, AND PROGRAM
20240021180 · 2024-01-18

An electronic musical instrument includes a pitch designation unit configured to output performance time pitch data designated at a time of a performance, a performance style output unit configured to output performance time performance style data indicating a performance style at the time of the performance, and a sound generation model unit configured, based on an acoustic model parameter inferred by inputting the performance time pitch data and the performance time performance style data to a trained acoustic model, to synthesize and output musical sound data corresponding to the performance time pitch data and the performance time performance style data, at the time of the performance.