Patent classifications
G10L15/34
Voice recognition system and voice recognition device
There are provided a recognition result candidate comparator 205, which compares a plurality of server-side voice recognition result candidates received by a receiver 204 to detect texts that differ, and a recognition result integrator 206, which integrates a client-side voice recognition result candidate with a server-side voice recognition result candidate, on the basis of the client-side candidate, the server-side candidate, and the detection result provided by the recognition result candidate comparator 205, to decide a voice recognition result.
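A minimal sketch of the comparator/integrator idea, with all function names and the token-level comparison strategy assumed for illustration: where the server-side candidates disagree on a token, the integrator falls back to the client-side candidate, which may better reflect local vocabulary such as contact names.

```python
def compare_candidates(server_candidates):
    """Return token positions where the server-side candidates differ."""
    token_lists = [c.split() for c in server_candidates]
    length = min(len(tokens) for tokens in token_lists)
    return [i for i in range(length)
            if len({tokens[i] for tokens in token_lists}) > 1]

def integrate(client_candidate, server_candidates):
    """Start from the first server candidate and substitute the client-side
    token at every disputed position to decide the final result."""
    disputed = compare_candidates(server_candidates)
    client_tokens = client_candidate.split()
    result = server_candidates[0].split()
    for i in disputed:
        if i < len(client_tokens):
            result[i] = client_tokens[i]
    return " ".join(result)
```

This is only one plausible reading of "detect texts having a difference"; the patent itself does not fix the granularity of the comparison.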
Speech recognition method in edge computing device
Disclosed herein is a speech recognition method in a distributed network environment. A method of performing a speech recognition operation in an edge computing device includes receiving a natural language understanding (NLU) model from a cloud server, storing the received NLU model, receiving voice data spoken by a user from a client device, performing a natural language processing operation on the received voice data using the NLU model, performing speech recognition according to the natural language processing operation, and transmitting a result of the speech recognition to the client device. At least one of the edge computing device, a voice recognition device, and a server may be associated with an artificial intelligence module, a drone (unmanned aerial vehicle (UAV)), a robot, an augmented reality (AR) device, a virtual reality (VR) device, a device related to a 5G service, and the like.
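The enumerated steps can be sketched as a small class; the class name, the dict-based stand-in for an NLU model, and the intent lookup are all assumptions, since the abstract does not specify the model's interface.

```python
class EdgeDevice:
    """Toy edge node: caches an NLU model from the cloud, serves clients."""

    def __init__(self):
        self.nlu_model = None

    def receive_nlu_model(self, model):
        # Steps 1-2: receive the NLU model from the cloud server and store it.
        self.nlu_model = model

    def handle_voice_data(self, voice_data):
        # Step 3: receive voice data spoken by a user from the client device.
        if self.nlu_model is None:
            raise RuntimeError("NLU model not yet received from the cloud")
        # Steps 4-5: natural language processing, then speech recognition.
        intent = self.nlu_model.get(voice_data, "unknown")
        # Step 6: transmit the result back to the client device.
        return {"transcript": voice_data, "intent": intent}
```

The point of the sketch is the division of labor: the cloud trains and distributes the NLU model once, while each request is handled entirely at the edge.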
SPEECH RECOGNITION SYSTEMS AND METHODS
A computer-implemented method for adapting a first speech recognition machine-learning model to utterances having one or more attributes, including: receiving an unlabelled utterance having the one or more attributes; generating a first transcription of the unlabelled utterance; generating a second transcription of the unlabelled utterance, wherein the second transcription is different from the first transcription; processing, by the first speech recognition machine-learning model, the unlabelled utterance to derive posterior probabilities for the first transcription and the second transcription; and updating parameters of the first speech recognition machine-learning model in accordance with a loss function based on the derived posterior probabilities for the first transcription and the second transcription.
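A toy illustration of one possible loss of this shape (not the patented method): treat each candidate transcription as a class of a softmax model and take one gradient step on L = -log(p_first + p_second), so the update raises the combined posterior mass the model assigns to the two hypothesised transcriptions. The single-softmax model is a deliberate simplification of a real ASR posterior.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def adaptation_step(logits, hyp_indices, lr=0.5):
    """One gradient-descent step on L = -log(sum of p[i] for i in hyp_indices).

    For this loss, dL/dz_j = p_j - p_j / q when j is a hypothesis index
    (q being the combined hypothesis mass), and p_j otherwise.
    """
    p = softmax(logits)
    q = sum(p[i] for i in hyp_indices)
    grads = [p[j] - (p[j] / q if j in hyp_indices else 0.0)
             for j in range(len(logits))]
    return [z - lr * g for z, g in zip(logits, grads)]
```

After the step, the posterior mass on the two hypothesised transcriptions increases, which is the qualitative behavior the abstract describes for the parameter update.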
METHOD AND SYSTEM FOR SMART INTERACTION IN A MULTI VOICE CAPABLE DEVICE ENVIRONMENT
A system and method for providing a custom response to a voice command of a specific user. The method encompasses receiving, at a transceiver unit [102] from a user device, a custom voice response preference setting associated with the specific user. The method further encompasses receiving, at the transceiver unit [102] from a first target device, a voice command of the specific user. The method thereafter encompasses generating, by a processing unit [104], a custom response to the voice command of the specific user based at least on the custom voice response preference setting. Further, the method encompasses identifying, by an identification unit [106], a second target device from one or more devices present in the vicinity of the specific user. Thereafter, the method comprises providing, by the processing unit [104], the generated custom response to the voice command of the specific user via the second target device.
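A minimal sketch of the flow, with the class, the preference format, and the "nearest other device" selection rule all assumed: store a per-user preference, generate a custom response from it, and route the response to a second nearby device rather than the one that heard the command.

```python
class SmartInteractionSystem:
    def __init__(self):
        self.preferences = {}     # user -> custom voice response preference
        self.nearby_devices = {}  # user -> devices in the user's vicinity

    def set_preference(self, user, preference):
        # Receiving the custom voice response preference setting (unit 102).
        self.preferences[user] = preference

    def handle_command(self, user, command, first_target_device):
        # Generate a custom response from the preference (unit 104).
        pref = self.preferences.get(user, {})
        response = f"[{pref.get('voice', 'default')}] {command} acknowledged"
        # Identify a second target device in the user's vicinity (unit 106).
        candidates = [d for d in self.nearby_devices.get(user, [])
                      if d != first_target_device]
        second_target = candidates[0] if candidates else first_target_device
        # Provide the response via the second target device.
        return second_target, response
```

The fallback to the first target device when no other device is nearby is an assumption; the abstract does not address that case.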
Detecting system-directed speech
A speech-processing system capable of receiving and processing audio data to determine if the audio data includes speech that was intended for the system. Non-system-directed speech may be filtered out, while system-directed speech may be selected for further processing. A system-directed speech detector may use a trained machine learning model (such as a deep neural network or the like) to process a feature vector representing a variety of characteristics of the incoming audio data, including the results of automatic speech recognition and/or other data. Using the feature vector, the model may output an indicator as to whether the speech is system-directed. The system may also incorporate other filters, such as voice activity detection prior to speech recognition, or the like.
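A sketch of the detector idea with a logistic model standing in for the deep neural network of the abstract; the feature names (ASR confidence, voice-activity ratio, wake-word score) and the trained weights are illustrative assumptions.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def is_system_directed(features, weights, bias, threshold=0.5):
    """Score a feature vector of audio characteristics and return
    (score, decision): whether the speech appears system-directed."""
    score = sigmoid(sum(f * w for f, w in zip(features, weights)) + bias)
    return score, score >= threshold
```

Utterances scoring below the threshold would be filtered out before further processing, mirroring the filtering stage described above; in a deployed system the weights come from training, and earlier filters such as voice activity detection would run before this stage.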
Systems and methods for cloud computing data processing
Systems and methods allow users to leverage multiple disparate cloud solutions, offered by disparate service providers, in a unified and cohesive manner. A system includes an engine configured to receive performance metrics from two or more disparate cloud services, select target resources among the two or more disparate cloud services to run tasks based on the performance metrics, a multiservice load balancing scheme, and task parameters. Resources can be scaled up or down in the two or more disparate cloud services based on task loads.
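A sketch of the two decisions the engine makes, with the metric names, the weighted-score balancing rule, and the load thresholds all assumed: select a target service from the reported performance metrics, and decide whether to scale resources up or down from the task load.

```python
def select_target(metrics, weights):
    """Pick the service with the lowest weighted score across its metrics.

    metrics: {service: {metric_name: value}}; lower values are better here
    (e.g. latency, load), so the minimum-score service wins.
    """
    def score(service_metrics):
        return sum(weights[name] * service_metrics[name] for name in weights)
    return min(metrics, key=lambda service: score(metrics[service]))

def scale_decision(load, high=0.8, low=0.2):
    """Scale up under heavy task load, down when mostly idle."""
    if load > high:
        return "scale_up"
    if load < low:
        return "scale_down"
    return "hold"
```

A real multiservice load balancer would also weigh cost and task parameters, and the thresholds would be tuned per workload; the point here is only the shape of the metric-driven selection.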