Patent classifications
G10L21/013
Unsupervised singing voice conversion with pitch adversarial network
A method, a computer readable medium, and a computer system are provided for singing voice conversion. Data corresponding to a singing voice is received. One or more features and pitch data are extracted from the received data using one or more adversarial neural networks. One or more audio samples are generated based on the extracted pitch data and the one or more features.
METHOD FOR TRANSFORMING AUDIO SIGNAL, DEVICE, AND STORAGE MEDIUM
A method for transforming an audio signal comprises obtaining a plurality of segmental original frequency-domain signal segments and a plurality of segmental target frequency-domain signal segments by segmenting and performing a Fourier transform on an original audio signal and an initial target audio signal obtained by pitch shifting on the original audio signal; obtaining a plurality of original formant envelopes by respectively filtering the plurality of segmental original frequency-domain signal segments according to a plurality of original segment window functions, and obtaining a plurality of target formant envelopes by respectively filtering the plurality of segmental target frequency-domain signal segments according to a plurality of target segment window functions; and determining a pitch-shifted audio signal based on the plurality of segmental target frequency-domain signal segments, the plurality of original formant envelopes, and the plurality of target formant envelopes.
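As a rough illustration of the last step in the abstract above, the sketch below smooths a magnitude spectrum with a window to estimate a formant envelope, then rescales one pitch-shifted segment so that it carries the original envelope. It is not taken from the patent. The function names `envelope` and `restore_formants`, the moving-window smoothing, and the per-bin ratio are all assumptions chosen for illustration, and the Fourier transform and segmentation steps are omitted.

```python
def envelope(mag, window):
    """Estimate a spectral envelope by smoothing magnitudes with a window.

    Assumption: the 'segment window function' is applied as a normalized
    weighted moving average over frequency bins; edge bins are clamped.
    """
    half = len(window) // 2
    total = sum(window)
    n = len(mag)
    out = []
    for k in range(n):
        acc = 0.0
        for j, w in enumerate(window):
            idx = min(max(k + j - half, 0), n - 1)  # clamp at the edges
            acc += w * mag[idx]
        out.append(acc / total)
    return out


def restore_formants(target_mag, orig_env, target_env, eps=1e-12):
    """Rescale each pitch-shifted bin by (original envelope / target envelope)."""
    return [t * o / max(e, eps)
            for t, o, e in zip(target_mag, orig_env, target_env)]


if __name__ == "__main__":
    window = [1.0, 2.0, 1.0]                 # hypothetical smoothing window
    original = [0.1, 0.8, 0.3, 0.2, 0.1]     # magnitudes of one original segment
    shifted = [0.1, 0.2, 0.8, 0.3, 0.2]      # same segment after naive pitch shift
    corrected = restore_formants(shifted,
                                 envelope(original, window),
                                 envelope(shifted, window))
    print(corrected)
```

Dividing by the target envelope flattens the formants that the naive pitch shift moved, and multiplying by the original envelope puts them back at their original frequencies.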
METHOD AND APPARATUS FOR EXEMPLARY MORPHING COMPUTER SYSTEM BACKGROUND
Method and apparatus for reducing a size of databases required for recorded speech data.
Dynamically adapted pitch correction based on audio input
Systems and methods for adjusting pitch of an audio signal include detecting input notes in the audio signal, mapping the input notes to corresponding output notes, each output note having an associated upper note boundary and lower note boundary, and modifying at least one of the upper note boundary and the lower note boundary of at least one output note in response to previously received input notes. Pitch of the input notes may be shifted to match an associated pitch of corresponding output notes. Delay of the pitch shifting process may be dynamically adjusted based on detected stability of the input notes.
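The note-boundary idea in the abstract above can be sketched as a simple hysteresis: the note most recently output gets wider upper and lower boundaries, so small pitch wobble does not flip the output to a neighboring note. This is a minimal sketch under stated assumptions, not the patented method. The class `NoteMapper`, the `widen` parameter, the equal-tempered semitone grid with A4 = 440 Hz, and the default boundaries of half a semitone are all hypothetical choices, and the dynamic delay adjustment is not modeled.

```python
import math

A4_HZ = 440.0   # assumed reference tuning
A4_MIDI = 69    # MIDI note number of A4


def hz_to_midi(freq_hz):
    """Convert a frequency to a fractional MIDI note number."""
    return A4_MIDI + 12.0 * math.log2(freq_hz / A4_HZ)


def midi_to_hz(note):
    """Convert a MIDI note number to a frequency in Hz."""
    return A4_HZ * 2.0 ** ((note - A4_MIDI) / 12.0)


class NoteMapper:
    """Map input pitches to output notes with history-dependent boundaries."""

    def __init__(self, widen=0.2):
        self.widen = widen      # extra boundary width in semitones (hypothetical)
        self.previous = None    # output note chosen for the previous input

    def boundaries(self, note):
        """Return (lower, upper) boundaries; wider for the previous output note."""
        half = 0.5 + (self.widen if note == self.previous else 0.0)
        return note - half, note + half

    def map(self, freq_hz):
        """Return the output MIDI note for an input frequency."""
        pitch = hz_to_midi(freq_hz)
        if self.previous is not None:
            lower, upper = self.boundaries(self.previous)
            if lower <= pitch <= upper:
                return self.previous    # still inside the widened boundaries
        self.previous = round(pitch)    # otherwise snap to the nearest note
        return self.previous

    def correct(self, freq_hz):
        """Return the corrected frequency in Hz for an input frequency."""
        return midi_to_hz(self.map(freq_hz))


if __name__ == "__main__":
    mapper = NoteMapper()
    for f in (440.0, 455.0, 470.0):
        print(f, "->", mapper.map(f))
```

With the default `widen=0.2`, an input 0.6 semitones above A4 stays mapped to A4 if A4 was the previous output, whereas a fresh mapper would snap the same input up to the next semitone.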
MULTIFUNCTIONAL MICROPHONE
A multifunctional microphone includes a controlling mainboard; a sound collector electrically connected with the controlling mainboard; a speaker electrically connected with the controlling mainboard; and a sound adjusting module arranged on the controlling mainboard and configured to adjust sound collected by the sound collector.
EVALUATION APPARATUS, TRAINING APPARATUS, METHODS AND PROGRAMS FOR THE SAME
An evaluation device converts a picked-up speech signal by applying a lowpass filter whose cutoff frequency is either a first predetermined value or a second predetermined value greater than the first, with or without changing the feedback formant frequencies (the formant frequencies of the picked-up speech signal), and feeds the converted speech signal back to a subject. The device includes an evaluation unit that calculates a compensatory response vector from two sets of pickup formant frequencies: those of a speech signal picked up from an utterance the subject makes while receiving feedback converted with the change of the feedback formant frequencies, and those of a speech signal picked up from an utterance the subject makes while receiving feedback converted without that change. The evaluation unit determines an evaluation based on the compensatory response vector for each cutoff frequency.