G10L15/14

Dynamic speech recognition methods and systems with user-configurable performance

Methods and systems are provided for assisting operation of a vehicle using speech recognition. One method involves identifying a user-configured speech recognition performance setting value selected from among a plurality of speech recognition performance setting values, selecting a speech recognition model configuration corresponding to the user-configured speech recognition performance setting value from among a plurality of speech recognition model configurations, where each speech recognition model configuration of the plurality of speech recognition model configurations corresponds to a respective one of the plurality of speech recognition performance setting values, and recognizing an audio input as an input state using the speech recognition model configuration corresponding to the user-configured speech recognition performance setting value.
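The selection step in this abstract amounts to a one-to-one lookup from a performance setting value to a model configuration. A minimal sketch follows, assuming three hypothetical setting values (`fast`, `balanced`, `accurate`) and illustrative configuration fields (`beam_width`, `vocabulary_size`); none of these names or numbers come from the patent itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical recognizer settings; field names are illustrative only."""
    name: str
    beam_width: int
    vocabulary_size: int


# One configuration per performance setting value (all values are assumptions).
CONFIGS = {
    "fast": ModelConfig("small", beam_width=4, vocabulary_size=500),
    "balanced": ModelConfig("medium", beam_width=8, vocabulary_size=2000),
    "accurate": ModelConfig("large", beam_width=16, vocabulary_size=10000),
}


def select_config(setting: str) -> ModelConfig:
    """Return the configuration that corresponds to the user-selected setting."""
    if setting not in CONFIGS:
        raise ValueError(f"unknown performance setting: {setting!r}")
    return CONFIGS[setting]
```

A recognizer would then be invoked with the returned configuration, for example `select_config("fast")` when the user prefers quicker responses over a larger vocabulary.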

PERIPHERAL EQUIPMENT FOR CONTROLLING CAMERA ARRANGED IN A TERMINAL, SYSTEM AND METHOD THEREOF
20170366746 · 2017-12-21

The invention discloses a camera controlling apparatus, comprising: a telescopic support rod, the telescopic support rod having a hollow body and an electrical signal generator disposed inside; a terminal clamp, for holding a mobile device designed for a lightning earphone, mounted on the telescopic support rod; a trigger device, mounted on the telescopic support rod and connected to the electrical signal generator; and a lightning cable disposed inside and extending through the hollow body of the telescopic support rod, the cable having a first end for connecting to the mobile device held by the terminal clamp and a second end connected to the telescopic support rod, and for transmitting a digital control signal to the mobile device through the lightning cable.

ROBUST AUDIO IDENTIFICATION WITH INTERFERENCE CANCELLATION

Audio distortion compensation methods to improve the accuracy and efficiency of audio content identification are described. The methods are also applicable to speech recognition. Methods to detect interference from speakers and other sources, and distortion to audio from the environment and devices, are discussed. Additional methods to detect distortion to the content after performing search and correlation are illustrated. The causes of actual distortion at each client are measured, registered, and learnt to generate rules for determining likely distortion and interference sources. The learnt rules are applied at the client, and detected likely distortions are compensated, or heavily distorted sections are ignored, at the audio level or at the signature and feature level, depending on the compute resources available. Further methods to subtract the likely distortions in the query, both at the audio level and after processing at the signature and feature level, are described.

Artificial intelligence apparatus for recognizing speech including multiple languages, and method for the same
11682388 · 2023-06-20

An AI apparatus includes a microphone to acquire speech data including multiple languages, and a processor to acquire text data corresponding to the speech data, determine a main language from the languages included in the text data, acquire translated text data obtained by translating any text data portion in a language other than the main language into the main language, acquire a morpheme analysis result for the translated text data, extract a keyword for intention analysis from the morpheme analysis result, acquire an intention pattern matched to the keyword, and perform an operation corresponding to the intention pattern.

Statistical voice dialog system and method

A method for processing a voice command using a statistical dialog model determines a belief state as a probability distribution over states organized in a hierarchy with a parent-child relationship of nodes representing the states. The belief state includes the hierarchy of state variables defining probabilities of each state to correspond to the voice command and a probability of a state of a child node in the hierarchy is conditioned on a probability of a state of a corresponding parent node. A system action is selected based on the belief state.
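The parent-child conditioning described in this abstract can be illustrated with a two-level hierarchy, where the belief in a child state is the parent's probability multiplied by the child's conditional probability. The sketch below is only an illustration under stated assumptions: the state names, the probabilities, and the threshold-based action rule in `select_action` are hypothetical and are not taken from the patent.

```python
def joint_belief(parent_probs, child_given_parent):
    """Compute P(parent, child) = P(parent) * P(child | parent) for every pair."""
    belief = {}
    for parent, p_parent in parent_probs.items():
        for child, p_child in child_given_parent[parent].items():
            belief[(parent, child)] = p_parent * p_child
    return belief


def select_action(belief, threshold=0.6):
    """Hypothetical policy: act on the most likely state if confident, else confirm."""
    state, prob = max(belief.items(), key=lambda item: item[1])
    if prob >= threshold:
        return ("execute", state)
    return ("confirm", state)


# Illustrative numbers only: a top-level domain and the commands beneath it.
parent_probs = {"music": 0.7, "navigation": 0.3}
child_given_parent = {
    "music": {"play": 0.9, "pause": 0.1},
    "navigation": {"route": 1.0},
}

belief = joint_belief(parent_probs, child_given_parent)
action = select_action(belief)
```

With these numbers the most likely state is `("music", "play")` with a joint probability of about 0.63, which clears the assumed 0.6 threshold, so the sketch chooses to execute rather than ask for confirmation.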

Internet of Things (IoT) Human Interface Apparatus, System, and Method
20170345420 · 2017-11-30

Novel tools and techniques are provided for implementing Internet of Things (“IoT”) functionality. In some embodiments, microphones of an IoT human interface device might receive user voice input. The IoT human interface device and/or a computing system might identify explicit commands in the voice input, identify first IoT-capable devices to which the explicit commands are applicable, receive sensor data from IoT sensors, and analyze the voice input in view of previous user voice inputs and in view of the sensor data to determine whether the voice input contains any implicit commands. If so, second IoT-capable devices to which an implicit command is additionally applicable might be identified, and instructions based on a combination of the explicit and implicit commands may be generated and sent to the second IoT-capable devices. Instructions based only on the explicit commands are generated and sent to first IoT-capable devices to which implicit commands are not applicable.

Organizational-based language model generation

Systems and methods are provided for acquiring training data and building an organizational-based language model based on the training data. Organizational data is generated via one or more applications associated with an organization. The collected organizational data is aggregated and filtered into training data, which is used to train an organizational-based language model for speech processing.
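The aggregate, filter, and train sequence in this abstract can be sketched with a toy bigram model. Everything here is an assumption made for illustration: the source names (`email`, `calendar`), the filtering rule (drop very short entries and duplicates), and the choice of a count-based bigram model are not specified by the patent.

```python
from collections import Counter, defaultdict


def build_training_data(sources, min_words=2):
    """Aggregate text from several application sources, then filter it.

    Filtering here (an assumed rule) lowercases, collapses whitespace,
    drops entries shorter than min_words, and removes duplicates.
    """
    seen = set()
    training = []
    for documents in sources.values():
        for text in documents:
            normalized = " ".join(text.lower().split())
            if len(normalized.split()) < min_words or normalized in seen:
                continue
            seen.add(normalized)
            training.append(normalized)
    return training


def train_bigram_model(sentences):
    """Count word bigrams, with <s> and </s> marking sentence boundaries."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def bigram_probability(model, prev, nxt):
    """Maximum-likelihood estimate of P(nxt | prev); 0.0 for unseen contexts."""
    following = model.get(prev, Counter())
    total = sum(following.values())
    return following[nxt] / total if total else 0.0


# Hypothetical organizational data keyed by the application that produced it.
sources = {
    "email": ["Schedule the Quarterly review", "ok"],
    "calendar": ["schedule the quarterly review", "schedule the budget meeting"],
}
data = build_training_data(sources)
model = train_bigram_model(data)
```

A production language model would need smoothing and far more data; the point of the sketch is only that organization-specific phrases such as "quarterly review" end up with non-zero probability because they appear in the filtered training set.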
