Patent classifications
G10L15/34
VOICE CONTROL METHOD, CLOUD SERVER AND TERMINAL DEVICE
A voice control method that includes: a terminal device receiving voice information; the terminal device querying a control instruction corresponding to the voice information from a local voice library; when no control instruction corresponding to the voice information is found in the local voice library, the terminal device uploading the voice information onto a cloud server; the cloud server querying the control instruction corresponding to the voice information from a cloud voice library; when the control instruction corresponding to the voice information is found in the cloud voice library, the cloud server sending the control instruction to the terminal device; the terminal device receiving the control instruction, and executing a corresponding operation on the basis of the control instruction. The present disclosure improves the response speed of a terminal device, and improves user experience.
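The local-first lookup with cloud fallback described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the voice information is assumed to be already reduced to a text key, both libraries are plain dictionaries, and the names `resolve_instruction`, `local_library`, and `query_cloud` are hypothetical.

```python
from typing import Callable, Dict, Optional


def resolve_instruction(
    voice_info: str,
    local_library: Dict[str, str],
    query_cloud: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return a control instruction, trying the local library first."""
    instruction = local_library.get(voice_info)
    if instruction is not None:
        # Found locally: no network round trip is needed.
        return instruction
    # Not found locally: hand the voice information to the cloud lookup.
    return query_cloud(voice_info)


# Hypothetical example data; a dict's .get stands in for the cloud server query.
local_library = {"volume up": "VOLUME_INC"}
cloud_library = {"play jazz": "MEDIA_PLAY:jazz"}

local_hit = resolve_instruction("volume up", local_library, cloud_library.get)
cloud_hit = resolve_instruction("play jazz", local_library, cloud_library.get)
miss = resolve_instruction("open pod bay doors", local_library, cloud_library.get)
```

Under these assumptions, a command present in the local library is answered without contacting the cloud, which is consistent with the response-speed benefit the abstract claims.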
Systems and methods for cloud computing data processing
Systems and methods allow users to leverage multiple disparate cloud solutions, offered by disparate service providers, in a unified and cohesive manner. A system includes an image database configured to store a virtual machine image in a stored image format and an engine configured to allocate a task among two or more disparate cloud services. The engine is further configured to convert the virtual machine image from the stored image format to a deployed image format, wherein the deployed image format conforms to formatting for one of the two or more disparate cloud services, and deploy the virtual machine image in the deployed image format to a virtual machine instance of the one of the two or more disparate cloud services.
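As a rough sketch of the convert-then-deploy step, the following models an image record whose format is switched to whatever the target service expects before deployment. The provider names, the format table, and the functions `convert_for` and `deploy` are assumptions made for illustration; a real conversion would rewrite the disk image itself, which is not shown here.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VMImage:
    name: str
    fmt: str  # stored image format, e.g. "qcow2"


# Hypothetical mapping from cloud service to the image format it accepts.
PROVIDER_FORMATS = {"provider_a": "vhd", "provider_b": "raw"}


def convert_for(image: VMImage, provider: str) -> VMImage:
    """Return a copy of the image in the format the provider expects."""
    target = PROVIDER_FORMATS[provider]
    if image.fmt == target:
        return image  # already in the deployed format
    # Placeholder for a real conversion; only the metadata changes here.
    return replace(image, fmt=target)


def deploy(image: VMImage, provider: str) -> str:
    """Convert the image if needed and describe the resulting deployment."""
    deployed = convert_for(image, provider)
    return f"{deployed.name}.{deployed.fmt} -> {provider}"


stored = VMImage(name="web", fmt="qcow2")
result = deploy(stored, "provider_a")
```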
SPEECH RECOGNITION METHOD IN EDGE COMPUTING DEVICE
Disclosed herein is a speech recognition method in a distributed network environment. A method of performing a speech recognition operation in an edge computing device includes receiving a natural language understanding (NLU) model from the cloud server, storing the received NLU model, receiving voice data spoken by a user from the client device, performing a natural language processing operation on the received voice data using the NLU model, performing speech recognition according to the natural language processing operation, and transmitting a result of the speech recognition to the client device.
At least one of the edge computing device, a voice recognition device, and a server may be associated with an artificial intelligence module, a drone (an unmanned aerial vehicle (UAV)), a robot, an augmented reality (AR) device, a virtual reality (VR) device, a device related to a 5G service, and the like.
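The edge-side flow (receive and store a model from the cloud, then process client voice data locally) might look like the sketch below. It is an assumption-laden toy: the NLU model is modeled as a plain callable, the voice data is assumed to be text already, and `EdgeDevice`, `receive_model`, `handle_voice`, and `toy_nlu` are hypothetical names.

```python
from typing import Callable, Optional

# Assumption: an NLU model is any callable that maps an utterance to a result dict.
NLUModel = Callable[[str], dict]


class EdgeDevice:
    def __init__(self) -> None:
        self._model: Optional[NLUModel] = None

    def receive_model(self, model: NLUModel) -> None:
        """Store the NLU model pushed down from the cloud server."""
        self._model = model

    def handle_voice(self, voice_data: str) -> dict:
        """Run the stored model on client voice data and return the result."""
        if self._model is None:
            raise RuntimeError("no NLU model received from the cloud server yet")
        return self._model(voice_data)


def toy_nlu(text: str) -> dict:
    """Toy stand-in model: first word is the intent, the rest are slots."""
    words = text.lower().split()
    return {"intent": words[0] if words else None, "slots": words[1:]}


edge = EdgeDevice()
edge.receive_model(toy_nlu)
recognized = edge.handle_voice("Play jazz")
```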
INVOKING FUNCTIONS OF AGENTS VIA DIGITAL ASSISTANT APPLICATIONS USING ADDRESS TEMPLATES
Systems and methods of invoking functions of agents via digital assistant applications are provided. Each action-inventory can have an address template for an action by an agent. The address template can include a portion having an input variable used to execute the action. A data processing system can parse an input audio signal from a client device to identify a request and a parameter to be executed by the agent. The data processing system can select an action-inventory for the action corresponding to the request. The data processing system can generate, using the address template, an address. The address can include a substring having the parameter used to control execution of the action. The data processing system can direct an action data structure including the address to the agent to cause the agent to execute the action and to provide output for presentation.
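To make the address-template idea concrete, the sketch below fills an input variable in a template with a parsed parameter to produce the address sent to the agent. The inventory contents, the `agent.example` URL, the `{param}` placeholder syntax, and the `build_address` helper are all hypothetical; the abstract does not specify a template syntax.

```python
from urllib.parse import quote

# Hypothetical action-inventory: request name -> address template.
ACTION_INVENTORY = {
    "order_ride": "https://agent.example/ride?destination={param}",
}


def build_address(request: str, parameter: str) -> str:
    """Select the template for the request and substitute the parameter."""
    template = ACTION_INVENTORY[request]
    # Percent-encode the parameter so it is safe inside the address substring.
    return template.format(param=quote(parameter, safe=""))


address = build_address("order_ride", "Main Street 5")
```

Here the parsed request selects the template and the parsed parameter ends up as a substring of the generated address, mirroring the two roles the abstract assigns to them.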
METHOD FOR DETECTING AN AUDIO ADVERSARIAL ATTACK WITH RESPECT TO A VOICE COMMAND PROCESSED BY AN AUTOMATIC SPEECH RECOGNITION SYSTEM, CORRESPONDING DEVICE, COMPUTER PROGRAM PRODUCT AND COMPUTER-READABLE CARRIER MEDIUM
A method and device for detecting an audio adversarial attack with respect to a voice command processed by an automatic speech recognition system is described. The method is implemented by a detection device connected to the automatic speech recognition system and includes obtaining an audio signal associated with the voice command, performing a phonetic transcription of the audio signal, according to a phonetic transcription scheme, delivering a first character string; obtaining a transcript resulting from the processing, by the automatic speech recognition system, of the audio signal, performing a phonetic transcription of the transcript, according to the phonetic transcription scheme, delivering a second character string, computing a similarity score between the first character string and the second character string, and delivering a piece of data representative of a detection of an audio adversarial attack, as a function of a result of a comparison between the similarity score and a predetermined threshold.
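The comparison step can be illustrated with a small sketch. Several things here are assumptions rather than details from the abstract: the phonetic transcription of the audio is taken as already computed, `toy_phonetic` is a crude stand-in for a real phonetic transcription scheme, the similarity score uses `difflib.SequenceMatcher` from the Python standard library, and a score below the threshold is treated as indicating an attack.

```python
import difflib


def toy_phonetic(text: str) -> str:
    """Stand-in for a phonetic scheme: keep lowercase letters only."""
    return "".join(ch for ch in text.lower() if ch.isalpha())


def similarity(first: str, second: str) -> float:
    """Similarity score in [0, 1] between the two character strings."""
    return difflib.SequenceMatcher(None, first, second).ratio()


def is_adversarial(audio_phonetic: str, asr_transcript: str, threshold: float = 0.8) -> bool:
    """Flag an attack when the audio and the ASR transcript disagree phonetically.

    audio_phonetic: first character string, assumed produced upstream from the audio.
    asr_transcript: text returned by the speech recognition system.
    """
    transcript_phonetic = toy_phonetic(asr_transcript)  # second character string
    return similarity(audio_phonetic, transcript_phonetic) < threshold


heard = "turnonthelights"  # hypothetical phonetic transcription of the audio
benign = is_adversarial(heard, "Turn on the lights")
suspicious = is_adversarial(heard, "Open the front door")
```

The intuition under these assumptions is that an adversarial perturbation makes the recognizer output text that does not sound like the audio, so the two strings diverge.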
SYSTEM AND METHOD USING CLOUD STRUCTURES IN REAL TIME SPEECH AND TRANSLATION INVOLVING MULTIPLE LANGUAGES, CONTEXT SETTING, AND TRANSCRIPTING FEATURES
A system for using cloud structures in real time speech and translation involving multiple languages is provided. The system comprises a processor, a memory, and an application stored in the memory that when executed on the processor receives audio content in a first spoken language from a first speaking device. The system also receives a first language preference from a first client device, the first language preference differing from the spoken language. The system also receives a second language preference from a second client device, the second language preference differing from the spoken language. The system also transmits the audio content and the language preferences to at least one translation engine and receives the audio content from the engine translated into the first and second languages. The system also sends the audio content to the client devices translated into their respective preferred languages.
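The fan-out described here (one spoken language in, each client receiving its preferred language) could be sketched as below. The `fan_out` function, the client identifiers, and the `stub_translate` engine are hypothetical, and text strings stand in for audio content.

```python
from typing import Callable, Dict

# Assumption: a translation engine is a callable (content, source, target) -> content.
Translator = Callable[[str, str, str], str]


def fan_out(
    content: str,
    spoken_language: str,
    preferences: Dict[str, str],
    translate: Translator,
) -> Dict[str, str]:
    """Translate once per distinct preferred language, then route per client."""
    targets = set(preferences.values())
    translated = {lang: translate(content, spoken_language, lang) for lang in targets}
    return {client: translated[lang] for client, lang in preferences.items()}


def stub_translate(content: str, source: str, target: str) -> str:
    """Stub engine that only tags the content with the language pair."""
    return f"[{source}->{target}] {content}"


preferences = {"client_1": "es", "client_2": "fr"}
delivered = fan_out("hello", "en", preferences, stub_translate)
```

Translating once per distinct language rather than once per client is a design choice of this sketch, not something the abstract states.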