G10L2015/221

CONDITIONAL RESPONSE FULFILLMENT CACHE FOR LOCALLY RESPONDING TO AUTOMATED ASSISTANT INPUTS

Implementations set forth herein relate to conditionally caching responses to automated assistant queries according to certain contextual data that may be associated with each automated assistant query. Each query can be identified based on historical interactions between a user and an automated assistant, and, depending on the query, fulfillment data can be cached according to certain contextual data that influences the query response. Depending on how the contextual data changes, a cached response stored at a client device can be discarded and/or replaced with an updated cached response. For example, a query that users commonly ask prior to leaving for work can have a corresponding assistant response that depends on features of an environment of the users. This unique assistant response can be cached, before the users provide the query, to minimize latency that can occur when network or processing bandwidth is unpredictable.
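The caching behavior described above can be sketched as a small store that pairs each response with a snapshot of the contextual data that influenced it, discarding the entry when that context changes. The query, context keys, and responses below are illustrative assumptions, not drawn from the patent.

```python
class ConditionalResponseCache:
    """Sketch of a client-side cache keyed on query plus context snapshot."""

    def __init__(self):
        # query -> (context snapshot at cache time, cached response)
        self._entries = {}

    def store(self, query, context, response):
        # Cache a pre-computed response together with the contextual
        # data that influenced it.
        self._entries[query] = (dict(context), response)

    def lookup(self, query, current_context):
        # Serve the cached response only while the relevant contextual
        # data is unchanged; otherwise discard the stale entry.
        entry = self._entries.get(query)
        if entry is None:
            return None
        cached_context, response = entry
        if cached_context != {k: current_context.get(k) for k in cached_context}:
            del self._entries[query]  # context changed: invalidate
            return None
        return response
```

A hit requires only the context keys captured at cache time to match, so unrelated context fields can change without invalidating the entry.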

ELECTRONIC APPARATUS AND CONTROLLING METHOD THEREOF

Disclosed is a method of controlling an electronic apparatus. The method includes: displaying a screen including an input area configured to receive text; receiving speech and obtaining a text corresponding to the speech; performing a service operation corresponding to the input area by inputting the obtained text to the input area; and, based on a result of performing the service operation, obtaining a plurality of similar texts each having a pronunciation similar to that of the obtained text, and repeatedly performing the service operation by sequentially inputting the obtained similar texts to the input area.
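The retry loop described above can be sketched as follows. The homophone table and the directory-lookup "service operation" are illustrative stand-ins, not part of the patent.

```python
# Toy table of texts with similar pronunciations (assumed for illustration).
SIMILAR = {"meet": ["meat", "mete"], "flour": ["flower"]}


def perform_service(text, directory):
    # Stand-in for the service operation: look the input text up in a directory.
    return directory.get(text)


def recognize_and_retry(recognized_text, directory):
    # Try the recognized text first; on failure, sequentially input
    # similar-sounding candidates until one succeeds.
    result = perform_service(recognized_text, directory)
    if result is not None:
        return recognized_text, result
    for candidate in SIMILAR.get(recognized_text, []):
        result = perform_service(candidate, directory)
        if result is not None:
            return candidate, result
    return None, None
```

A misrecognition like "meet" for "meat" is thus repaired without asking the user to repeat the utterance.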

METHODS AND SYSTEMS FOR CORRECTING, BASED ON SPEECH, INPUT GENERATED USING AUTOMATIC SPEECH RECOGNITION
20230138030 · 2023-05-04 ·

Methods and systems for correcting, based on subsequent second speech, an error in an input generated from first speech using automatic speech recognition, without an explicit indication in the second speech that a user intended to correct the input with the second speech, include determining that a time difference between when search results in response to the input were displayed and when the second speech was received is less than a threshold time, and based on the determination, correcting the input based on the second speech. The methods and systems also include determining that a difference in acceleration of a user input device, used to input the first speech and second speech, between when the search results in response to the input were displayed and when the second speech was received is less than a threshold acceleration, and based on the determination, correcting the input based on the second speech.
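The two gating checks above can be sketched as a single predicate. The threshold values are illustrative assumptions; the patent does not specify them.

```python
TIME_THRESHOLD_S = 5.0   # assumed max gap between results display and second speech
ACCEL_THRESHOLD = 0.5    # assumed max change in device acceleration (m/s^2)


def should_treat_as_correction(results_shown_at, second_speech_at,
                               accel_at_results, accel_at_speech):
    # Correct the input only when the second speech arrives quickly and
    # the input device has not been moved much in the meantime.
    time_ok = (second_speech_at - results_shown_at) < TIME_THRESHOLD_S
    accel_ok = abs(accel_at_speech - accel_at_results) < ACCEL_THRESHOLD
    return time_ok and accel_ok
```

Both signals proxy for user intent: a quick follow-up from a stationary device suggests the user is fixing the transcription rather than starting a new query.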

INFORMATION PROVIDING SYSTEM

When the number of characters displayable in a character display area of a display is restricted, the information providing system generates a first recognition target word from the information to be provided. When the number of characters in the first recognition target word exceeds a specified character count, the system also generates a second recognition target word using the whole of the character string obtained by shortening the first recognition target word to the specified character count. The system then recognizes a user's spoken input using both the first recognition target word and the second recognition target word.
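The derivation of the two recognition target words can be sketched as below; the character limit and example string are assumptions for illustration.

```python
def recognition_target_words(provided_info, max_chars):
    """Return the recognition vocabulary entries for one item of information."""
    first = provided_info  # first recognition target word: the full string
    words = [first]
    if len(first) > max_chars:
        # Second target word: the whole of the display-shortened string,
        # so the user can speak either the full or the truncated form.
        words.append(first[:max_chars])
    return words
```

The recognizer can then match a user who reads the truncated on-screen text aloud as well as one who says the full name.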

Method and system for speech emotion recognition

Systems and methods enrich speech-to-text communications between users in speech chat sessions by using a speech emotion recognition model to annotate text with visual emotion content derived from emotions observed in speech samples. The method may include generating a data set of speech samples labeled with a plurality of emotion classes, selecting a set of acoustic features for each of the emotion classes, generating a machine learning (ML) model based on the acoustic features and the data set, applying a set of rules based on the selected acoustic features and data set, computing the number of rules that have been satisfied, and presenting the enriched text in speech-to-text communications between users in the chat session to give visual notice of an emotion observed in the speech sample.
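The rule-counting step can be sketched as follows. The feature names, thresholds, and text annotation are illustrative assumptions layered on the idea of acoustic-feature rules, not values from the patent.

```python
# Hypothetical per-emotion rules over acoustic features.
RULES = {
    "excited": [
        lambda f: f["pitch_mean"] > 220.0,   # higher average pitch
        lambda f: f["energy"] > 0.6,         # louder speech
        lambda f: f["speech_rate"] > 4.0,    # faster delivery (syllables/s)
    ],
}


def enrich_text(text, features, emotion="excited", min_rules=2):
    # Count how many rules the sample satisfies; annotate the text with
    # visual emotion content only if enough rules pass.
    satisfied = sum(1 for rule in RULES[emotion] if rule(features))
    if satisfied >= min_rules:
        return f"{text} [{emotion}!]"
    return text
```

In practice the rule check would complement the ML model's prediction rather than replace it.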

UTTERANCE PRESENTATION DEVICE, UTTERANCE PRESENTATION METHOD, AND COMPUTER PROGRAM PRODUCT
20170365258 · 2017-12-21 ·

According to an embodiment, an utterance presentation device includes an utterance recording unit, a voice recognition unit, an association degree calculation unit, and a UI control unit. The utterance recording unit is configured to record vocal utterances. The voice recognition unit is configured to recognize the recorded utterances by voice recognition. The association degree calculation unit is configured to calculate degrees of association of the recognized utterances with a character string specified from among character strings displayed in a second display region of a user interface (UI) screen having a first display region and the second display region. The UI control unit is configured to display, in the first display region of the UI screen, voice recognition results of utterances selected based on the degrees of association.
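One way the association degree might be computed is word overlap between an utterance and the specified character string. The patent does not specify the metric, so the Jaccard measure and threshold below are assumptions.

```python
def association_degree(utterance_text, specified_string):
    # Jaccard overlap between word sets as a stand-in association score.
    a = set(utterance_text.lower().split())
    b = set(specified_string.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def select_utterances(recognized, specified_string, threshold=0.2):
    # Keep only recognition results sufficiently associated with the
    # character string the user selected in the second display region.
    return [u for u in recognized
            if association_degree(u, specified_string) >= threshold]
```

The selected results would then be rendered in the first display region.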

PROCESSING METHOD, PROCESSING SYSTEM, AND STORAGE MEDIUM
20170364310 · 2017-12-21 ·

A processing method executed by a processor that receives an order from a user at a restaurant through interaction includes: analyzing information indicating the order; extracting a phrase other than a standard element from the information; with reference to a first database, when the extracted phrase is a first phrase in the first database, outputting a first confirmation item corresponding to the first phrase to the user and receiving a first user response corresponding to the first confirmation item; when the extracted phrase is not a first phrase in the first database, referring to a second database; and, when the extracted phrase is a second phrase in the second database, selecting a third phrase in the first database from phrases related to the second phrase, outputting a second confirmation item corresponding to the third phrase to the user with reference to the first database, and receiving a second user response corresponding to the second confirmation item.
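The two-database lookup can be sketched as below. The phrases, confirmation items, and database contents are hypothetical.

```python
# First database: known non-standard phrases -> confirmation items.
FIRST_DB = {"no onions": "Confirm: leave out the onions?"}

# Second database: other phrases -> related phrases that appear in FIRST_DB.
SECOND_DB = {"hold the onions": ["no onions"]}


def confirmation_for(extracted_phrase):
    # Direct hit in the first database.
    if extracted_phrase in FIRST_DB:
        return FIRST_DB[extracted_phrase]
    # Otherwise map via the second database to a related first-DB phrase.
    for related in SECOND_DB.get(extracted_phrase, []):
        if related in FIRST_DB:
            return FIRST_DB[related]
    return None
```

The second database thus acts as a synonym bridge, letting unfamiliar wordings reuse confirmation items already defined for known phrases.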

Image Query Analysis

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for analyzing images for generating query responses. One of the methods includes determining, using a textual query, an image category for images responsive to the textual query, and an output type that identifies a type of requested content; selecting, using data that associates a plurality of images with a corresponding category, a subset of the images that each belong to the image category, each image in the plurality of images belonging to one of two or more categories; analyzing, using the textual query, data for the images in the subset of the images to determine images responsive to the textual query; determining a response to the textual query using the images responsive to the textual query; and providing, using the output type, the response to the textual query for presentation.
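The category-then-match pipeline can be sketched as below. The image index, categories, and output types are illustrative assumptions.

```python
# Hypothetical index associating images with a category and extracted text.
IMAGE_INDEX = [
    {"id": "img1", "category": "receipts", "text": "coffee 3.50"},
    {"id": "img2", "category": "screenshots", "text": "error 404"},
    {"id": "img3", "category": "receipts", "text": "taxi 12.00"},
]


def answer_image_query(query_category, query_term, output_type="list"):
    # Step 1: select the subset of images in the determined category.
    subset = [img for img in IMAGE_INDEX if img["category"] == query_category]
    # Step 2: analyze the subset's data to find responsive images.
    responsive = [img for img in subset if query_term in img["text"]]
    # Step 3: shape the response according to the requested output type.
    if output_type == "count":
        return len(responsive)
    return [img["id"] for img in responsive]
```

Filtering by category before matching keeps the per-query analysis confined to a small, relevant subset of the image collection.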

Wearable communication enhancement device
09848260 · 2017-12-19 ·

Embodiments disclosed herein may include a wearable apparatus including a frame having a memory and processor associated therewith. The apparatus may include a camera associated with the frame and in communication with the processor, the camera configured to track an eye of a wearer. The apparatus may also include at least one microphone associated with the frame. The at least one microphone may be configured to receive a directional instruction from the processor. The directional instruction may be based upon an adaptive beamforming analysis performed in response to an eye movement detected by the camera. The apparatus may also include a speaker associated with the frame configured to provide an audio signal received at the at least one microphone to the wearer.
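One simple way a gaze direction could be turned into a directional instruction is a delay-and-sum steering delay for a two-microphone array. The microphone spacing and two-mic geometry below are assumptions, not details from the patent.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s at room temperature


def steering_delay(gaze_azimuth_deg, mic_spacing_m=0.15):
    # Delay (seconds) applied to one microphone of a two-mic array so
    # the beam points toward the wearer's gaze direction.
    return mic_spacing_m * math.sin(math.radians(gaze_azimuth_deg)) / SPEED_OF_SOUND


def directional_instruction(gaze_azimuth_deg):
    # Package the steering parameters the processor would send the mics.
    return {"azimuth_deg": gaze_azimuth_deg,
            "mic_delay_s": steering_delay(gaze_azimuth_deg)}
```

Looking straight ahead (0 degrees) needs no inter-microphone delay; larger gaze angles steer the beam progressively farther off-axis.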

SYSTEMS AND METHODS FOR DYNAMICALLY UPDATING MACHINE LEARNING MODELS THAT PROVIDE CONVERSATIONAL RESPONSES

Methods and systems for dynamically updating machine learning models that provide conversational responses through the use of a configuration file that defines modifications and changes to the machine learning model are disclosed. For example, the configuration file may be used to define an expected behavior and required attributes for instituting modifications and changes (e.g., via a mutation algorithm) to the machine learning model.
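A configuration file of the kind described might look like the sketch below; the keys, the mutation contents, and the attribute check are hypothetical illustrations of "expected behavior" and "required attributes."

```python
import json

# Hypothetical configuration file defining a model modification.
CONFIG = json.loads("""{
  "expected_behavior": {"intent": "greeting", "response_contains": "hello"},
  "required_attributes": ["temperature", "max_tokens"],
  "mutation": {"temperature": 0.2}
}""")


def apply_mutation(model_params, config):
    # Institute the change only when the required attributes are present.
    missing = [a for a in config["required_attributes"] if a not in model_params]
    if missing:
        raise ValueError(f"missing required attributes: {missing}")
    updated = dict(model_params)
    updated.update(config["mutation"])
    return updated
```

Guarding the mutation behind the required-attribute check keeps a dynamically applied change from silently producing a model configuration the expected behavior was never validated against.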