IPIQ

G10L2013/021

Synthesis of speech from text in a voice of a target speaker using neural networks

11488575 · 2022-11-01

Google Llc

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech synthesis. The methods, systems, and apparatus include actions of obtaining an audio representation of speech of a target speaker, obtaining input text for which speech is to be synthesized in a voice of the target speaker, generating a speaker vector by providing the audio representation to a speaker encoder engine that is trained to distinguish speakers from one another, generating an audio representation of the input text spoken in the voice of the target speaker by providing the input text and the speaker vector to a spectrogram generation engine that is trained using voices of reference speakers to generate audio representations, and providing the audio representation of the input text spoken in the voice of the target speaker for output.
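The data flow in this abstract (reference audio to speaker vector, then text plus speaker vector to an audio representation) can be sketched as below. This is only an illustration of how the two engines compose. `synthesize`, `toy_encoder`, and `toy_generator` are hypothetical names, and the toy functions are arithmetic stand-ins, not the trained neural networks the patent describes.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Spectrogram = List[List[float]]


def synthesize(
    reference_audio: Sequence[float],
    text: str,
    speaker_encoder: Callable[[Sequence[float]], Vector],
    spectrogram_generator: Callable[[str, Vector], Spectrogram],
) -> Spectrogram:
    """Compose the two stages named in the abstract."""
    # Stage 1: summarize the target speaker's audio as a fixed-size vector.
    speaker_vector = speaker_encoder(reference_audio)
    # Stage 2: condition the text-to-spectrogram step on that vector.
    return spectrogram_generator(text, speaker_vector)


# Assumption: toy stand-ins so the sketch runs without any trained model.
def toy_encoder(audio: Sequence[float]) -> Vector:
    mean = sum(audio) / len(audio)
    return [mean, max(audio) - min(audio)]


def toy_generator(text: str, speaker_vector: Vector) -> Spectrogram:
    # One "frame" per character, with the speaker vector appended to each frame.
    return [[float(ord(ch))] + list(speaker_vector) for ch in text]


result = synthesize([0.0, 0.5, 1.0], "hi", toy_encoder, toy_generator)
```

Passing the engines in as callables keeps the sketch independent of any particular model or framework; a real system would substitute trained networks and follow the second stage with a vocoder to produce a waveform.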

Multifunctional audio signal generation apparatus

09792916 · 2017-10-17

Yamaha Corporation

Taro Shirahama

A sample counter in each channel performs a counting operation at a given rate. Independently for each channel, the rate and an initial value for the counter are set, and the start and stop of the counting operation are controlled, so that a partial portion of an original waveform, corresponding to the count range from the set initial value to a count stop point, is reproduced in the channel. A control section sets the initial values in individual channels of a set selected from among the channels, such that sample values at different sample positions of the original waveform are retrieved simultaneously in the individual channels of the set, and controls an overlap adder to add up the retrieved sample values, so that the output is an audio waveform signal in which a plurality of partial portions of the original waveform partially overlap each other.
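The per-channel counter and overlap adder described here can be illustrated with a minimal software sketch. The `Channel` class and `overlap_add` function are hypothetical names, the counter reads the nearest lower sample without interpolation, and no windowing is applied; the patent describes hardware and may differ on all of these points.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Channel:
    position: float      # counter value: current sample position in the waveform
    rate: float          # counter increment per output sample
    stop: float          # count stop point (exclusive)
    active: bool = True

    def next_sample(self, waveform: Sequence[float]) -> float:
        """Read the sample at the counter position, then advance the counter."""
        if (not self.active or self.position >= self.stop
                or self.position >= len(waveform)):
            self.active = False
            return 0.0
        value = waveform[int(self.position)]  # assumption: no interpolation
        self.position += self.rate
        return value


def overlap_add(waveform: Sequence[float], channels: List[Channel],
                num_samples: int) -> List[float]:
    """Sum the samples that all channels retrieve at each output step."""
    out = []
    for _ in range(num_samples):
        out.append(sum(ch.next_sample(waveform) for ch in channels))
    return out


waveform = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
# Two channels start at different positions, so two portions play at once.
channels = [
    Channel(position=0, rate=1.0, stop=4),
    Channel(position=4, rate=1.0, stop=8),
]
mixed = overlap_add(waveform, channels, 5)
```

Each output sample is the sum of what the active channels read at that step, so portions starting at positions 0 and 4 are heard simultaneously until both counters reach their stop points.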

TAILORED VOICE NAVIGATION ANALYSIS SYSTEM

20170292853 · 2017-10-12

An approach for tailoring voice navigation instruction output. A navigation audio tailor receives songs that include instrumental segments and associated vocal segments. The navigation audio tailor identifies instrumental-only segments, which mark time durations in which the instrumental segments have no associated vocal segments. The navigation audio tailor receives navigation instructions, which are text-based and are used to create voice navigation instructions. The navigation audio tailor determines navigation instruction output timing, associating the timing with one of the instrumental-only segments to create tailored navigation instructions, and outputs the tailored navigation instructions to an audio playback device, where the output is combined with at least one of the songs.
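The timing step, placing a spoken instruction inside an instrumental-only segment, might look like the sketch below. It assumes the instrumental-only segments are already known as `(start, end)` pairs in seconds and that each instruction has an earliest useful time, a deadline, and a spoken duration; `schedule_instruction` and its parameters are hypothetical and are not taken from the application.

```python
from typing import List, Optional, Tuple


def schedule_instruction(
    segments: List[Tuple[float, float]],
    earliest: float,
    deadline: float,
    duration: float,
) -> Optional[float]:
    """Return a start time that keeps the instruction inside one
    instrumental-only segment and before the deadline, or None."""
    for seg_start, seg_end in sorted(segments):
        start = max(seg_start, earliest)
        # The whole utterance must fit before the segment ends and the deadline.
        if start + duration <= min(seg_end, deadline):
            return start
    return None


instrumental_only = [(0.0, 3.0), (40.0, 52.0), (90.0, 100.0)]
start_time = schedule_instruction(instrumental_only, earliest=10.0,
                                  deadline=60.0, duration=4.0)
no_fit = schedule_instruction(instrumental_only, earliest=10.0,
                              deadline=30.0, duration=4.0)
```

When no segment fits, the function returns `None`; what a real system does in that case (for example, speaking over the vocals anyway) is outside what the abstract states.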

Synthesis of Speech from Text in a Voice of a Target Speaker Using Neural Networks

20220351713 · 2022-11-03

Google Llc

Synthesis of speech from text in a voice of a target speaker using neural networks

11848002 · 2023-12-19

Google Llc

METHOD FOR PROVIDING GROUP CALL SERVICE, AND ELECTRONIC DEVICE SUPPORTING SAME

20230410788 · 2023-12-21

An electronic device includes a communication module and a processor operatively connected to the communication module. The processor is configured to: receive and store a first speech voice related to at least a first external device, and a second speech voice related to a second external device; if individual speech is detected, transmit the first speech voice or the second speech voice having a first playback speed to at least a first external device and a second external device; and, if simultaneous speech is detected, convert, into a second playback speed different from the first playback speed, at least a part of a synthesized voice in which at least first overlap speech of the first speech voice and at least second overlap speech of the second speech voice are successively connected, and transmit the synthesized voice to the at least first external device and the second external device.

Synthesis of Speech from Text in a Voice of a Target Speaker Using Neural Networks

20210217404 · 2021-07-15

Google Llc

SYNTHESIS OF SPEECH FROM TEXT IN A VOICE OF A TARGET SPEAKER USING NEURAL NETWORKS

20240112667 · 2024-04-04

Google Llc

Multifunctional audio signal generation apparatus

10388290 · 2019-08-20

Yamaha Corporation

Taro Shirahama

Method and device for optimizing speech synthesis system

10242660 · 2019-03-26

Baidu Online Network Technology (Beijing) Co., Ltd.

The present invention provides a method and a device for optimizing a speech synthesis system. The method comprises: receiving speech synthesis requests containing text messages; determining the load level of the speech synthesis system when the speech synthesis requests are received; and selecting speech synthesis paths corresponding to the load level and synthesizing the text into speech according to the selected paths.
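Load-dependent path selection of this kind can be sketched as a lookup from a coarse load level to a synthesis path. The thresholds, the level names, and the path names in `SYNTHESIS_PATHS` are all assumptions made for illustration; the abstract does not say how load is measured or which paths exist.

```python
def load_level(pending_requests: int, capacity: int) -> str:
    """Map queue occupancy to a coarse load level (thresholds are assumed)."""
    ratio = pending_requests / capacity
    if ratio < 0.5:
        return "low"
    if ratio < 0.9:
        return "medium"
    return "high"


# Hypothetical paths: cheaper synthesis is chosen as load rises.
SYNTHESIS_PATHS = {
    "low": "full_quality",
    "medium": "reduced_model",
    "high": "cached_or_concatenative",
}


def select_path(pending_requests: int, capacity: int) -> str:
    """Pick the synthesis path that corresponds to the current load level."""
    return SYNTHESIS_PATHS[load_level(pending_requests, capacity)]
```

For example, `select_path(10, 100)` picks the highest-quality path, while `select_path(95, 100)` falls back to the cheapest one, trading quality for response time under heavy load.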
