Patent classifications
G10L17/04
Gathering user's speech samples
Disclosed is a method of gathering a user's speech samples. According to an embodiment of the disclosure, the method of gathering learning samples may collect a speaker's speech data obtained while talking on a mobile terminal, together with text data generated from that speech data, as training data for generating a speech synthesis model. According to the disclosure, the method of gathering learning samples may be related to artificial intelligence (AI) modules, unmanned aerial vehicles (UAVs), robots, augmented reality (AR) devices, virtual reality (VR) devices, and 5G service-related devices.
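The pairing of call audio with generated text described in this abstract can be sketched as follows. This is a minimal illustration, not the disclosed implementation; the function name, segment fields, and the minimum-duration quality check are all assumptions introduced for the example.

```python
# Hypothetical sketch: pair each speech segment captured during a call
# with its ASR-generated transcript to build (audio, text) training
# samples for a speech synthesis model. Field names and the duration
# filter are illustrative assumptions.

def gather_training_pairs(segments, min_duration_s=1.0):
    """segments: list of dicts with 'audio_s' (duration in seconds)
    and 'transcript' (text generated from the speech data)."""
    pairs = []
    for seg in segments:
        text = seg.get("transcript", "").strip()
        # Keep only segments long enough and with usable text.
        if seg.get("audio_s", 0.0) >= min_duration_s and text:
            pairs.append((seg["audio_s"], text))
    return pairs

calls = [
    {"audio_s": 2.4, "transcript": "hello, how are you"},
    {"audio_s": 0.3, "transcript": "uh"},   # too short, dropped
    {"audio_s": 3.1, "transcript": ""},     # no text, dropped
]
print(gather_training_pairs(calls))  # -> [(2.4, 'hello, how are you')]
```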
SYSTEM AND METHOD FOR EXTRACTING AND DISPLAYING SPEAKER INFORMATION IN AN ATC TRANSCRIPTION
A system for extracting speaker information in an ATC transcription and displaying the speaker information on a graphical display unit is provided. The system is configured to: segment a stream of audio received from an ATC and other aircraft into a plurality of chunks; determine, for each chunk, if the speaker is enrolled in an enrolled speaker database; when the speaker is enrolled in the enrolled speaker database, decode the chunk using a speaker-dependent automatic speech recognition (ASR) model and tag the chunk with a permanent name for the speaker; when the speaker is not enrolled in the enrolled speaker database, assign a temporary name for the speaker, tag the chunk with the temporary name, and decode the chunk using a speaker-independent speech recognition model; format the decoded chunk as text; and signal the graphical display unit to display the formatted text along with an identity for the speaker.
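The per-chunk dispatch this abstract describes can be sketched in a few lines. The enrolled-speaker database and both ASR models are stubbed here; `ENROLLED`, `transcribe_chunk`, and the tag format are assumptions for illustration, not the patented system.

```python
# Sketch of the enrollment check and model dispatch: enrolled speakers
# get the speaker-dependent (SD) model and their permanent name; unknown
# speakers get a temporary name and the speaker-independent (SI) model.

import itertools

ENROLLED = {"voice_a": "Tower Controller Smith"}  # speaker-id -> permanent name
_temp_ids = itertools.count(1)                    # generator for temporary names

def transcribe_chunk(speaker_id, chunk_audio):
    if speaker_id in ENROLLED:
        name = ENROLLED[speaker_id]
        text = f"[SD-ASR] {chunk_audio}"      # speaker-dependent model (stub)
    else:
        name = f"Speaker-{next(_temp_ids)}"   # temporary name for unknown speaker
        text = f"[SI-ASR] {chunk_audio}"      # speaker-independent model (stub)
    # Formatted result sent to the graphical display unit.
    return {"speaker": name, "text": text}
```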
Voice synthesis for virtual agents
Techniques are described for generating a custom voice for a virtual agent. In one implementation, a method includes receiving information identifying a customer contacting a call center. The method includes selecting a voice for a virtual agent based on information about the customer. The method also includes assigning the voice to the virtual agent during communications with the customer.
IDENTIFICATION SYSTEM DEVICE
An identification system device includes an identification element processing unit that generates a plurality of identification elements based on sound information, including a frequency of a sound source or a frequency of a sound; an ID conversion processing unit that generates an ID based on the sound information; an information generation processing unit that generates identification information by associating the ID with the identification elements; a memory unit that stores the identification information; and a judgment unit that compares the stored identification information with a plurality of newly generated identification elements to determine whether both represent sound information from the same sound source. When the judgment unit determines that both represent sound information from the same sound source, the ID conversion processing unit generates a new ID related to the stored ID, and the information generation processing unit generates new identification information by associating the new ID with the newly generated identification elements.
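The ID-association flow above can be sketched as a toy example. The identification elements (simple statistics over observed frequencies), the same-source tolerance, and the related-ID naming scheme are all illustrative assumptions, not the disclosed design.

```python
# Toy sketch of the identification flow: derive identification elements
# from sound frequencies; if a new observation matches a stored entry,
# issue a new ID linked to the matched ID, otherwise issue a fresh ID.

def make_elements(freqs):
    # Identification elements: min, max, and mean frequency (assumption).
    return (min(freqs), max(freqs), sum(freqs) / len(freqs))

def same_source(elems_a, elems_b, tol=5.0):
    # Judgment unit: element-wise comparison within a tolerance.
    return all(abs(a - b) <= tol for a, b in zip(elems_a, elems_b))

store = {}      # memory unit: ID -> identification elements
_counter = [0]  # fresh-ID counter

def register(freqs):
    elems = make_elements(freqs)
    for known_id, known_elems in store.items():
        if same_source(known_elems, elems):
            new_id = f"{known_id}.1"  # new ID related to the matched ID
            store[new_id] = elems
            return new_id
    _counter[0] += 1
    new_id = f"ID{_counter[0]}"
    store[new_id] = elems
    return new_id
```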
MACHINE LEARNING FOR IMPROVING QUALITY OF VOICE BIOMETRICS
Methods and systems are disclosed herein for improving the quality of audio for use in a biometric. A biometric system may use machine learning to determine whether audio or a portion of the audio should be used as a biometric for a user. A sample of the user's voice may be used to generate a voice signature of the user. Portions of the audio that do not meet a similarity threshold when compared with the voice signature may be removed from the audio. Additionally or alternatively, interfering noises may be detected and removed from the audio to improve the quality of a voice biometric generated from the audio.
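The similarity-filtering step this abstract describes can be sketched with toy embedding vectors. Cosine similarity over small vectors stands in for the real machine-learning model, and the threshold value is an assumption.

```python
# Sketch of similarity filtering: drop audio portions whose similarity
# to the user's voice signature falls below a threshold before building
# the biometric. Vectors here stand in for learned embeddings.

import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def filter_portions(portions, signature, threshold=0.8):
    # Keep only portions sufficiently close to the voice signature.
    return [p for p in portions if cosine(p, signature) >= threshold]
```

With signature `[1, 0]`, a portion like `[1, 0.1]` passes the threshold while an orthogonal portion like `[0, 1]` (e.g. an interfering speaker) is removed.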
SYSTEM AND METHOD FOR VOICE BIOMETRICS AUTHENTICATION
A system and method for authenticating an identity may include generating a first generic representation representing stored audio content, generating a second generic representation representing input audio content, and providing the first and second generic representations to a voice biometrics unit adapted to authenticate an identity based on the first and second generic representations.
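The idea of comparing two "generic representations" can be sketched minimally: map both the stored and input audio into a common fixed-size vector and compare them. The embedding here (a normalized byte histogram) and the distance threshold are stand-in assumptions, not the disclosed representation.

```python
# Sketch of representation-based authentication: both audio contents are
# mapped to the same fixed-size vector space, and the voice biometrics
# unit compares them against a distance threshold.

def generic_representation(audio_bytes, bins=8):
    # Toy embedding: normalized histogram of byte values (assumption).
    hist = [0] * bins
    for b in audio_bytes:
        hist[b * bins // 256] += 1
    total = len(audio_bytes) or 1
    return [h / total for h in hist]

def authenticate(stored_repr, input_repr, max_dist=0.2):
    # Accept when the L1 distance between representations is small.
    dist = sum(abs(a - b) for a, b in zip(stored_repr, input_repr))
    return dist <= max_dist
```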