G10L13/00

SYSTEM FOR TRANSCRIBING AND PERFORMING ANALYSIS ON PATIENT DATA

Methods, apparatuses, and systems for transcribing and performing analysis on patient data are disclosed. Data is collected from one or more medical professionals as well as sensors and imaging devices positioned on or oriented towards a patient. An analysis is performed on the patient data and the data is presented to a medical professional via a verbal interface in a conversational manner, allowing the medical professional to provide additional data such as observations or instructions which may be used for further analysis or to perform actions related to the patient's care.

USER-SYSTEM DIALOG EXPANSION

Techniques for recommending a skill experience to a user after a user-system dialog session has ended are described. Upon a dialog session ending, the system uses a first machine learning model to determine potential intents to recommend to a user. The system then uses a second machine learning model to determine a particular skill and intent to recommend. The system then prompts the user to accept the recommended skill and intent. If the user accepts, the system calls the recommended skill to execute. As part of calling the skill, the system sends to the skill at least one entity provided in a natural language user input of the ended dialog session. This enables the skill to skip welcome prompts, and initiate processing to output a response based on the intent and the at least one entity of the ended dialog session.
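A minimal sketch of the two-stage flow described above. The "models" here are stand-in heuristics rather than the trained models the abstract refers to, and all names (Session, rank_intents, select_skill, the skill registry) are illustrative assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class Session:
    utterances: list
    entities: dict = field(default_factory=dict)

def rank_intents(session):
    """Stage 1: score potential intents from the ended session (stub model)."""
    scores = {}
    for text in session.utterances:
        if "pizza" in text:
            scores["OrderFood"] = scores.get("OrderFood", 0) + 1
        if "movie" in text:
            scores["FindShowtimes"] = scores.get("FindShowtimes", 0) + 1
    return sorted(scores, key=scores.get, reverse=True)

def select_skill(intents):
    """Stage 2: map the top-ranked intent to a concrete skill (stub model)."""
    registry = {"OrderFood": "FoodDeliverySkill", "FindShowtimes": "CinemaSkill"}
    for intent in intents:
        if intent in registry:
            return registry[intent], intent
    return None, None

def recommend(session, user_accepts=True):
    """Prompt the user; on acceptance, call the skill with the entities from
    the ended session so it can skip welcome prompts and respond directly."""
    skill, intent = select_skill(rank_intents(session))
    if skill is None or not user_accepts:
        return None
    return {"skill": skill, "intent": intent, "entities": session.entities}
```

Passing the ended session's entities along with the intent is what lets the invoked skill begin processing immediately instead of re-eliciting them.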

METHOD AND APPARATUS FOR CONSTRUCTING DOMAIN-SPECIFIC SPEECH RECOGNITION MODEL AND END-TO-END SPEECH RECOGNIZER USING THE SAME

Provided is an end-to-end speech recognition technology capable of improving speech recognition performance in a desired specific domain. The technology includes collecting text data of the domain to be specialized, comparing the collected data with a basic transcript text DB to identify domain text that is not included in the basic transcript text DB and requires additional training, and constructing a specialization target domain text DB from that text. The technology then generates speech signals from the domain text of the specialization target domain text DB and trains a speech recognition neural network with the generated speech signals to produce an end-to-end speech recognition model specialized for the target domain. The specialized speech recognition model may be applied to the end-to-end speech recognizer to perform domain-specific end-to-end speech recognition.
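The selection-then-synthesis pipeline can be sketched as follows; the function names, the stub TTS, and the stub trainer are assumptions for illustration, not the patent's implementation.

```python
def build_target_domain_db(domain_texts, basic_transcript_db):
    """Keep only domain sentences absent from the basic transcript DB,
    i.e. the text that requires additional training."""
    known = set(basic_transcript_db)
    return [t for t in domain_texts if t not in known]

def specialize(domain_texts, basic_transcript_db, tts, train):
    """End-to-end flow: select unseen domain text, synthesize speech for it,
    and train the recognizer on the synthetic (audio, text) pairs."""
    target_db = build_target_domain_db(domain_texts, basic_transcript_db)
    pairs = [(tts(text), text) for text in target_db]
    return train(pairs)
```

The key design point is that no recorded domain speech is needed: the specialization DB holds text only, and the audio side of each training pair is generated.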

Information processing apparatus, information processing system, and information processing method

Provided is an information processing apparatus that includes a voice recognition section that executes a voice recognition process on a user speech, and a learning processing section that updates a degree of confidence on the basis of the interaction between the user and the apparatus after the user speech. The degree of confidence is an evaluation value indicating the reliability of a voice recognition result of the user speech. The voice recognition section generates confidence data in which plural user speech candidates, derived from the voice recognition result, are each associated with a degree of confidence indicating the reliability of that candidate.
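A toy sketch of the confidence table and its post-interaction update. The update rule (double the confirmed candidate, then renormalize) is a simple illustration of the idea, not the patent's learning process; the candidate strings are invented.

```python
def recognize(audio_stub):
    """Stub recognizer: candidate transcriptions with confidence values."""
    return {"play some jazz": 0.6, "play sam's chess": 0.4}

def update_confidence(candidates, confirmed):
    """Boost the candidate consistent with the user's follow-up interaction,
    then renormalize so the values remain comparable evaluation scores."""
    updated = {c: (conf * 2.0 if c == confirmed else conf)
               for c, conf in candidates.items()}
    total = sum(updated.values())
    return {c: conf / total for c, conf in updated.items()}
```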

Auto-completion for gesture-input in assistant systems

In one embodiment, a method includes receiving an initial input in a first modality from a first user via a client system associated with the first user; determining, by an intent-understanding module, one or more intents corresponding to the initial input; generating one or more candidate continuation-inputs based on the one or more intents, wherein the candidate continuation-inputs are in one or more candidate modalities, respectively, each different from the first modality; and sending, to the client system, instructions for presenting one or more suggested inputs corresponding to one or more of the candidate continuation-inputs.
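The cross-modality constraint can be sketched as below; the intent names, the suggestion table, and the stub intent-understanding module are all invented for illustration.

```python
SUGGESTIONS = {
    # intent -> list of (candidate modality, suggested continuation-input)
    "adjust_volume": [("gesture", "raise or lower a hand to set the level")],
    "select_photo": [("gaze", "look at a thumbnail to select it")],
}

def detect_intents(text):
    """Stub intent-understanding module."""
    return ["adjust_volume"] if "volume" in text else []

def suggest_continuations(initial_input, first_modality="voice"):
    """Generate continuation suggestions whose modality differs from the
    modality of the initial input."""
    out = []
    for intent in detect_intents(initial_input):
        for modality, hint in SUGGESTIONS.get(intent, []):
            if modality != first_modality:
                out.append({"intent": intent, "modality": modality, "hint": hint})
    return out
```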

Personalized conversational recommendations by assistant systems

In one embodiment, a method includes receiving a user request from a client system associated with a user, generating a response to the user request which references one or more entities, generating a personalized recommendation based on the user request and the response, wherein the personalized recommendation references one or more of the entities of the response, and sending instructions for presenting the response and the personalized recommendation to the client system.
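The grounding of the recommendation in the response's entities can be sketched as follows; the stub response generator, the restaurant names, and the preference store are illustrative assumptions.

```python
def respond(request):
    """Stub response generator: returns response text plus the entities
    the response references."""
    if "restaurants" in request:
        return "Bella Roma and Taco Sol are nearby.", ["Bella Roma", "Taco Sol"]
    return "Sorry, nothing found.", []

def personalize(request, response_text, entities, user_prefs):
    """Build a recommendation that references an entity from the response,
    chosen according to the user's preferences."""
    for e in entities:
        if user_prefs.get(e):
            return "You liked " + e + " before - want a table there?"
    return None
```

Because the recommendation may only reference entities already present in the response, it stays anchored to what the assistant just told the user.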

System and method for automated communication with Air Traffic Control

Systems and methods for automated communication with Air Traffic Control. The system comprises a processor and memory. The memory stores instructions to execute a method. The method includes receiving audio communication input from an air traffic controller (ATC). The audio communication input is then converted into text input. Next, an aircraft keyword is detected in the text input. The text input is then parsed and one or more data structures are generated from the parsed input. In some examples, the one or more data structures includes command data for controlling the aircraft. Next, the command data in the one or more data structures is verified. The one or more data structures are then transmitted to an onboard flight computer of the aircraft. Last, the one or more data structures are stored in a conversation memory.
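The keyword-detect, parse, verify, and store steps can be sketched as below. The command grammar, the altitude envelope, and the flight-level convention (number x 100 ft) are simplifying assumptions, not the patent's parser.

```python
import re

conversation_memory = []

def parse_atc(text, callsign):
    """Return a command data structure if the text addresses this aircraft."""
    if callsign.lower() not in text.lower():
        return None  # aircraft keyword (callsign) not detected
    m = re.search(r"(climb|descend)\s+(?:and maintain\s+)?(\d+)", text.lower())
    if not m:
        return None
    return {"callsign": callsign, "command": m.group(1),
            "altitude_ft": int(m.group(2)) * 100}

def verify(cmd, max_altitude_ft=45000):
    """Reject malformed or out-of-envelope command data before it is
    transmitted to the onboard flight computer."""
    return cmd is not None and 0 < cmd["altitude_ft"] <= max_altitude_ft

def handle(text, callsign):
    cmd = parse_atc(text, callsign)
    if verify(cmd):
        conversation_memory.append(cmd)  # stored in conversation memory
        return cmd
    return None
```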

Generation of computing functionality using devices

Techniques for generating a skill using skill portion devices are described. A user generates a skill by connecting skill portion devices in a particular manner. As devices are connected, a speech controllable device or a distributed system may maintain a data structure representing a skill configuration corresponding to the presently connected devices.

Text conversion and representation system

Disclosed is a method of phonetically encoding a text document. The method comprises providing, for a current word in the text document, a phonetically equivalent encoded word comprising one or more syllables, each syllable comprising a sequence of phonemes from a predetermined phoneme set that is phonetically equivalent to the corresponding syllable in the current word, and adding the phonetically equivalent encoded word or the current word at a current position in the phonetically encoded document. Each phoneme in the phoneme set is associated with a base grapheme that is pronounced as the phoneme in one or more English words.
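A toy sketch of the encoding rule. The tiny lexicon and the phoneme-to-base-grapheme table are invented for illustration; a real system would use a full pronouncing dictionary.

```python
# Each phoneme maps to a base grapheme pronounced that way in some English
# word (e.g. "f" as in "fish", "ow" as the "o" in "go").
PHONEME_TO_GRAPHEME = {"f": "f", "ow": "o", "n": "n", "eh": "e", "t": "t"}

LEXICON = {  # word -> syllables, each syllable a sequence of phonemes
    "phone": [["f", "ow", "n"]],
    "net": [["n", "eh", "t"]],
}

def encode_word(word):
    """Return the phonetically equivalent encoded word, or the current word
    itself when no pronunciation is available (the method adds one or the
    other at the current position)."""
    syllables = LEXICON.get(word.lower())
    if syllables is None:
        return word
    return "-".join("".join(PHONEME_TO_GRAPHEME[p] for p in syl)
                    for syl in syllables)

def encode_document(text):
    return " ".join(encode_word(w) for w in text.split())
```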