Patent classifications
G10L17/14
SMART SPEAKER, MULTI-VOICE ASSISTANT CONTROL METHOD, AND SMART HOME SYSTEM
The invention discloses a smart speaker that includes a voice input module, a language recognition module, and at least two voice assistants. The language recognition module receives voice information from the voice input module, determines the language category based on the voice information, and activates the voice assistant corresponding to that language category.
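The routing idea described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the language detector here is a trivial script-based heuristic, and all names (`detect_language`, `Assistant`, `route`) are hypothetical.

```python
def detect_language(text: str) -> str:
    # Stand-in for the language recognition module: classify the
    # utterance as Chinese if it contains CJK ideographs, else English.
    if any('\u4e00' <= ch <= '\u9fff' for ch in text):
        return "zh"
    return "en"

class Assistant:
    """A voice assistant bound to one language category."""
    def __init__(self, name: str):
        self.name = name

    def handle(self, text: str) -> str:
        return f"[{self.name}] handling: {text}"

# At least two assistants, one per language category.
ASSISTANTS = {
    "en": Assistant("English assistant"),
    "zh": Assistant("Chinese assistant"),
}

def route(utterance: str) -> str:
    # Determine the language category, then activate the matching assistant.
    lang = detect_language(utterance)
    return ASSISTANTS[lang].handle(utterance)
```

A production device would replace the heuristic with a trained language-identification model, but the dispatch structure would be the same.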
Speech recognition
A method includes receiving acoustic features of a first utterance spoken by a first user who speaks with typical speech and processing the acoustic features of the first utterance using a general speech recognizer to generate a first transcription of the first utterance. The method also includes analyzing the first transcription of the first utterance to identify one or more bias terms in the first transcription and biasing an alternative speech recognizer on the one or more bias terms identified in the first transcription. The method further includes receiving acoustic features of a second utterance spoken by a second user who speaks with atypical speech and processing, using the alternative speech recognizer biased on the one or more terms identified in the first transcription, the acoustic features of the second utterance to generate a second transcription of the second utterance.
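The biasing step above can be illustrated with a simple rescoring sketch. This is an assumption-laden toy, not the patented recognizer: bias terms are approximated as rare words from the first transcription, and the "alternative recognizer" is modeled as a rescorer over candidate hypotheses.

```python
def extract_bias_terms(transcription: str, common_words: set) -> set:
    # Treat uncommon words in the typical speaker's transcription
    # (e.g. proper names) as bias terms for the second decoding pass.
    return {w for w in transcription.lower().split() if w not in common_words}

def biased_decode(hypotheses, bias_terms, boost=2.0):
    # hypotheses: list of (text, score) pairs from the alternative
    # recognizer. Boost any hypothesis containing a bias term, then
    # pick the best-scoring one.
    def rescore(item):
        text, score = item
        hits = sum(1 for w in text.lower().split() if w in bias_terms)
        return score + boost * hits
    return max(hypotheses, key=rescore)[0]
```

For example, if the typical speaker said "call Zofia please", the rare name "zofia" becomes a bias term, and a low-scoring but name-matching hypothesis for the atypical speaker's utterance wins after boosting.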
Speaker verification
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, facilitate language-independent speaker verification. In one aspect, a method includes actions of receiving, by a user device, audio data representing an utterance of a user. Other actions may include providing, to a neural network stored on the user device, input data derived from the audio data and a language identifier. The neural network may be trained using speech data representing speech in different languages or dialects. The method may include the additional actions of generating, based on output of the neural network, a speaker representation and determining, based on the speaker representation and a second representation, that the utterance is an utterance of the user. The method may provide the user with access to the user device based on determining that the utterance is an utterance of the user.
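The final comparison step, matching the generated speaker representation against an enrolled one, is commonly done with cosine similarity. The sketch below assumes the representations are plain embedding vectors; the threshold value and function names are illustrative, not from the patent.

```python
import math

def cosine(a, b) -> float:
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def verify(speaker_rep, enrolled_rep, threshold=0.8) -> bool:
    # Grant access only when the utterance's speaker representation
    # is sufficiently close to the enrolled (second) representation.
    return cosine(speaker_rep, enrolled_rep) >= threshold
```

Because the neural network is trained across languages and takes a language identifier as input, the same comparison works regardless of which language the user spoke.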
ACCOUNT ADDING METHOD, TERMINAL, SERVER, AND COMPUTER STORAGE MEDIUM
An account adding method is performed by a social networking application running on a mobile terminal while it communicates with a second terminal (e.g., during a chat session). The method includes: recording voice information from the second terminal using the social networking application; extracting character string information and voiceprint information from the voice information; sending the character string information and the voiceprint information to a server; receiving an account, sent by the server, that matches the character string information and the voiceprint information; and adding the account to a contact list of the social networking application. For example, the social networking application is started before a telephone call with the second terminal begins, and the voice information is recorded during the telephone call.
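The server-side matching step can be sketched as a lookup that requires both the character string and the voiceprint to match. All names and the distance tolerance here are hypothetical; the voiceprint is modeled as a simple feature vector.

```python
def match_account(accounts, char_string, voiceprint, tol=0.15):
    # accounts: list of dicts with "id", "char_string", and a
    # "voiceprint" feature vector. An account matches only when the
    # spoken character string equals the stored one AND the voiceprint
    # is within a distance tolerance of the stored voiceprint.
    def dist(a, b):
        return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

    for acct in accounts:
        if (acct["char_string"] == char_string
                and dist(acct["voiceprint"], voiceprint) <= tol):
            return acct["id"]
    return None  # no account satisfies both conditions
```

Requiring both factors means a caller who merely speaks someone else's account number aloud still cannot be matched unless the voiceprint also agrees.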
Intelligent test cases generation based on voice conversation
Aspects of the disclosure relate to generating test cases based on voice conversation. In some embodiments, a computing platform may receive voice data associated with an agile development meeting. Subsequently, the computing platform may identify, using a natural language processing engine, context of one or more requirements being discussed during the agile development meeting. Based on identifying the context of the one or more requirements being discussed during the agile development meeting, the computing platform may store context data into a database. Next, the computing platform may map the context data to a corresponding task item of a software development project. Thereafter, the computing platform may identify one or more test cases to be generated. Then, the computing platform may cause the identified test cases to be executed.
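The pipeline above, from meeting transcript to mapped task to generated test case, can be sketched with a keyword-spotting stand-in for the natural language processing engine. The keyword table, task identifiers, and function names are all invented for illustration.

```python
# Hypothetical mapping from requirement context keywords to task items
# in a software development project.
REQUIREMENT_KEYWORDS = {"login": "AUTH-101", "payment": "PAY-202"}

def extract_contexts(transcript: str):
    # Stand-in for the NLP engine: spot requirement keywords in the
    # meeting transcript.
    return [k for k in REQUIREMENT_KEYWORDS if k in transcript.lower()]

def generate_test_cases(transcript: str):
    # Map each identified context to its task item, then emit a test
    # case description for it.
    cases = []
    for ctx in extract_contexts(transcript):
        task = REQUIREMENT_KEYWORDS[ctx]
        cases.append(f"Verify {ctx} requirement for task {task}")
    return cases
```

A real system would replace the keyword table with an NLP model and persist the context data to a database before mapping, as the abstract describes.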
Reducing bandwidth requirements of virtual collaboration sessions
A computer-implemented method, a computer system and a computer program product reduce bandwidth requirements of a virtual collaboration session. The method includes capturing session data from a virtual collaboration session. The session data is selected from a group consisting of video data, audio data, an image of a screen of a connected device and text data. The method also includes connecting to a live blog platform. The method further includes transmitting a text transcription of the virtual collaboration session to the live blog platform. The text transcription is generated by scanning the audio data using a speech-to-text algorithm. In addition, the method includes classifying a topic in the virtual collaboration session based on importance. Lastly, the method includes transmitting a multimedia file related to the topic to the live blog platform in response to the topic being classified as important. The multimedia file is extracted from the session data.
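The selective-transmission logic can be sketched as follows: the text transcription is always sent to the live blog platform, while multimedia is attached only for topics classified as important. The importance classifier here is a keyword heuristic; all names and keywords are illustrative assumptions.

```python
# Hypothetical keywords that mark a topic as important.
IMPORTANT_KEYWORDS = {"deadline", "decision", "blocker"}

def classify_topic(text: str) -> str:
    # Stand-in importance classifier over a transcript segment.
    if any(k in text.lower() for k in IMPORTANT_KEYWORDS):
        return "important"
    return "routine"

def plan_transmission(segments):
    # segments: list of dicts with "transcript" (from speech-to-text)
    # and "media_ref" (multimedia extracted from the session data).
    # The transcript is always transmitted; multimedia only when the
    # topic is important, which reduces bandwidth.
    plan = []
    for seg in segments:
        item = {"text": seg["transcript"]}
        if classify_topic(seg["transcript"]) == "important":
            item["multimedia"] = seg["media_ref"]
        plan.append(item)
    return plan
```

The bandwidth saving comes from the asymmetry: text transcriptions are tiny compared with video or screen-share imagery, so shipping multimedia only for important topics keeps the live blog complete while transmitting far less data.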