Patent classifications
G10L2015/225
INTELLIGENT VOICE INTERACTION METHOD AND APPARATUS, DEVICE AND COMPUTER STORAGE MEDIUM
The present disclosure provides an intelligent voice interaction method and apparatus, a device, and a computer storage medium, relating to voice, big data, and deep learning technologies in the field of artificial intelligence. A specific implementation involves: acquiring a first conversational voice entered by a user; and inputting the first conversational voice into a voice interaction model to acquire a second conversational voice, generated by the model in response to the first conversational voice, for return to the user. The voice interaction model includes: a voice encoding submodel configured to encode the first conversational voice and the historical conversational voice of the current session to obtain a voice state embedding; a state memory network configured to obtain an embedding of at least one preset attribute from the voice state embedding; and a voice generation submodel configured to generate the second conversational voice from the voice state embedding and the embedding of the at least one preset attribute. The at least one preset attribute is preset according to information of a verified object, so that intelligent data verification is realized.
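The three submodels in the abstract above (encoder, state memory network, generator) can be sketched as a pipeline. All class names, the length-based "embedding", and the hash-based attribute lookup are purely illustrative stand-ins, not the patented model:

```python
from dataclasses import dataclass

@dataclass
class VoiceState:
    """Hypothetical voice-state embedding produced by the encoder."""
    vector: list

class VoiceEncoder:
    def encode(self, current_utterance, history):
        # Toy stand-in: derive features from the current utterance and
        # the historical conversational voice of the session.
        return VoiceState(vector=[len(current_utterance), len(history)])

class StateMemoryNetwork:
    def __init__(self, attributes):
        self.attributes = attributes  # preset per the verified object

    def attribute_embeddings(self, state):
        # One embedding per preset attribute, derived from the voice state.
        return {a: [hash((a, tuple(state.vector))) % 97] for a in self.attributes}

class VoiceGenerator:
    def generate(self, state, attr_embeddings):
        # Real model would synthesize audio; here we return a tagged string.
        return f"response({state.vector}, attrs={sorted(attr_embeddings)})"

def interact(utterance, history, attributes):
    state = VoiceEncoder().encode(utterance, history)
    attrs = StateMemoryNetwork(attributes).attribute_embeddings(state)
    return VoiceGenerator().generate(state, attrs)
```

The point of the structure is that the generator conditions on both the conversation state and the per-attribute embeddings, which is what lets the second conversational voice probe the attributes of the verified object.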
HOLOGRAPHIC INTERFACE FOR VOICE COMMANDS
A computer implemented method, computer system, and computer program product for executing a voice command. A number of processor units displays a view of a location with voice command devices in response to detecting the voice command from a user. The number of processor units displays a voice command direction for the voice command in the view of the location. The number of processor units changes the voice command direction in response to a user input. The number of processor units identifies a voice command device from the voice command devices in the location based on the voice command direction to form a selected voice command device. The number of processor units executes the voice command using the selected voice command device.
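The device-selection step above (matching a user-adjusted voice command direction against devices shown in the view) can be illustrated with a simple bearing comparison; the device registry, bearings, and tolerance are assumptions for the sketch:

```python
# Hypothetical registry: device name -> bearing (degrees) from the user
# within the displayed view of the location.
DEVICES = {"speaker": 30.0, "thermostat": 120.0, "lamp": 300.0}

def select_device(command_direction_deg, devices=DEVICES, tolerance=45.0):
    """Pick the device whose bearing is closest to the (possibly
    user-adjusted) voice command direction; None if nothing is close."""
    def angular_diff(a, b):
        d = abs(a - b) % 360.0
        return min(d, 360.0 - d)
    name, bearing = min(devices.items(),
                        key=lambda kv: angular_diff(command_direction_deg, kv[1]))
    return name if angular_diff(command_direction_deg, bearing) <= tolerance else None
```

Wrapping the angular difference around 360° matters: a direction of 350° should match a device at 10°, not fail the tolerance check.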
Active Listening for Assistant Systems
In one embodiment, a method includes receiving a first user input comprising a wake word associated with an assistant xbot from a first client system, setting the assistant xbot into a listening mode, wherein a continuous non-visual feedback is provided via the first client system while the assistant xbot is in the listening mode, receiving a second user input comprising a user utterance from the first client system while the assistant xbot is in the listening mode, determining the second user input has ended based on a completion of the user utterance, and setting the assistant xbot into an inactive mode, wherein the non-visual feedback is discontinued via the first client system while the assistant xbot is in the inactive mode.
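The listening-mode lifecycle in this abstract is essentially a two-state machine with feedback tied to the listening state. A minimal sketch, assuming illustrative class and attribute names (this is not Meta's actual assistant API):

```python
class AssistantXbot:
    """Toy model of the wake-word flow: inactive -> listening on the
    wake word, listening -> inactive when the utterance completes."""
    WAKE_WORD = "hey assistant"

    def __init__(self):
        self.mode = "inactive"
        self.feedback_on = False  # continuous non-visual feedback (e.g. a tone)

    def receive(self, user_input):
        if self.mode == "inactive" and self.WAKE_WORD in user_input.lower():
            self.mode = "listening"
            self.feedback_on = True   # feedback runs while listening
            return None
        if self.mode == "listening":
            utterance = user_input    # second user input: the actual request
            self.mode = "inactive"    # utterance complete -> inactive
            self.feedback_on = False  # feedback discontinued
            return utterance
        return None
```

The invariant worth noting is that `feedback_on` is true exactly when `mode == "listening"`, which is the claimed behavior of the continuous non-visual feedback.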
Voice interaction method, device, apparatus and server
A voice interaction method is provided. The method is applied to a wearable device and includes: collecting voice information through at least two microphones; processing the voice information and determining that the voice information comprises an effective voice instruction, wherein the effective voice instruction is issued by a user for a mobile terminal; and transmitting the effective voice instruction to the mobile terminal. In an embodiment, the processing of the voice information is assigned to an external device, which reduces the power consumption of the mobile terminal, and voice information is collected by at least two microphones to improve the efficiency and quality of voice collection.
Intelligent Interactive Voice Recognition System
Systems for performing intelligent interactive voice recognition functions are provided. In some aspects, natural language data may be received from a plurality of users. The natural language data may be used to train a machine learning model. After training the machine learning model, additional or subsequent natural language input data may be received. The natural language data may include a user query, such as a request to obtain information from the system, to process a transaction, or the like. The natural language data may be processed to remove noise associated with the audio data. The data may then be further processed using the machine learning model to interpret the query of the user and generate an output. The output may be transmitted to the user and feedback data may be received from the user. The user-specific machine learning dataset may then be validated and/or updated based on the feedback data.
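The stages described above (denoise, interpret with the trained model, return an output, collect feedback for dataset validation) can be stitched together as a small pipeline. All function and parameter names here are illustrative assumptions:

```python
def handle_query(audio, model, denoise, feedback_store):
    """Toy pipeline: denoise the raw audio, interpret the query with
    the trained model, return the output, and hand back a callback
    that records user feedback for later dataset validation."""
    clean = denoise(audio)
    output = model(clean)

    def record_feedback(fb):
        # Pair the processed input and output with the user's feedback
        # so the user-specific dataset can be validated/updated later.
        feedback_store.append((clean, output, fb))

    return output, record_feedback
```

Returning a feedback callback keeps the query path and the feedback path loosely coupled, matching the abstract's ordering: output first, feedback afterwards.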
Communication system, server system, and communication apparatus, relating to permission to execute a newly added function
Provided is an acquisition unit configured such that, in a case where a predetermined function, of which execution is instructed with a voice input to a voice control device, is newly added as a function related to a communication apparatus, the acquisition unit acquires information after an input that permits the execution of the function is performed by the user. The acquired information is based on a predetermined voice input that indicates whether to permit the execution of the predetermined function and that has been received by the voice control device from the user.
Method and system for adaptive language learning
Methods and systems provide an adaptive method of language learning using automatic speech recognition that allows a user to learn a new language using only their voice—and without using their hands or eyes. The system may be implemented in an application for a smartphone. Each lesson comprises a series of questions that adapt to the user's knowledge. The questions ask for the translation of a word or phrase by playing an audio prompt in the origin language, recording the user speaking the translation in the target language, indicating whether the utterance was correct or incorrect, and providing feedback related to the user's utterance. Each user response is evaluated in real time, and the application provides individualized feedback to the user based on their response. Subsequent questions in the lesson and future lessons are dynamically ordered to adapt to the user's knowledge.
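The dynamic ordering of questions described above can be illustrated with a miss-rate heuristic: words the learner has answered incorrectly most often are asked again first. The scoring scheme (and the 0.5 prior for unseen words) is an assumption of this sketch, not the patented method:

```python
def next_questions(questions, results, k=1):
    """Order remaining questions so that words the learner has missed
    most often come first. `results` maps word -> list of past attempt
    outcomes (1 = correct, 0 = incorrect)."""
    def miss_rate(q):
        attempts = results.get(q["word"], [])
        if not attempts:
            return 0.5  # unseen words sit between mastered and struggling
        return 1.0 - sum(attempts) / len(attempts)
    return sorted(questions, key=miss_rate, reverse=True)[:k]
```

Because each response updates `results`, re-running the selection after every answer is what makes the lesson adapt in real time.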
Conversational artificial intelligence driven methods and system for delivering personalized therapy and training sessions
A user directed verbal interactive method and system for requesting an evaluation and obtaining a customized verbal therapy routine based on the evaluation obtained. The method and system allow users to interact with an artificial intelligence agent by answering a series of system directed questions that guide the users through evaluation and treatment of physical pain using a customized verbal interaction and delivery regimen. Users verbally engage with the artificial intelligence agent to create respective profiles. The system develops therapies based on each user's current physiological state and profile. The users are then delivered verbal therapy prompts through the system to implement the developed therapy routines.
Ontology-based organization of conversational agent
According to a first aspect of the present invention, a computer implemented method, a computer system and a computer program product are provided for creating an ontological conversational agent, the method including creating an ontological specification of a domain of discourse of the ontological conversational agent, and creating a description of one or more goals of the ontological conversational agent. In an embodiment, the ontological description includes classes of entities, their associated attributes and relationships between the classes of entities. In an embodiment, the ontological description includes language-related descriptions. In an embodiment, the method, computer system and computer program product further include creating a description of services of the ontological conversational agent. An embodiment includes receiving a first utterance from a user during a conversation, identifying a first intent based on the first utterance, and recognizing a first goal of the one or more goals based on the first intent.
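An ontological specification of the kind described (classes of entities, attributes, relationships, goals) and the utterance-to-intent-to-goal flow can be sketched with a toy domain. The pizza-ordering schema and the keyword intent matcher are assumptions chosen purely for illustration:

```python
# Tiny illustrative ontology: classes with attributes, relationships
# between classes, and the agent's goals.
ONTOLOGY = {
    "classes": {
        "Pizza": {"attributes": ["size", "toppings"]},
        "Order": {"attributes": ["status"]},
    },
    "relationships": [("Order", "contains", "Pizza")],
    "goals": ["place_order"],
}

def recognize_goal(utterance, ontology=ONTOLOGY):
    """Identify an intent from the utterance, then map it to one of
    the goals declared in the ontology (trivial keyword matching here)."""
    intent = "order_intent" if "order" in utterance.lower() else "unknown"
    if intent == "order_intent" and "place_order" in ontology["goals"]:
        return "place_order"
    return None
```

Keeping goals and entity classes in one declarative structure is the design point: the conversation logic reads the ontology rather than hard-coding the domain.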
Automatically executing operations sequences
Method, system and product for automatic execution of operations sequences. An operations sequence, which includes a first operation immediately followed by a second operation, is obtained. The operations sequence, or a portion thereof, is automatically executed, at least by performing: in response to a determination that a first element required for performing the first operation is available for user interaction in a first state of the computing device, mimicking a user interaction with the first element to perform the first operation, thereby causing the current state of the computing device to change from the first state to a second state; and in response to a determination that a second element required for performing the second operation is available for user interaction in the second state, mimicking a user interaction with the second element to perform the second operation.
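The check-availability-then-mimic loop above can be modeled as walking a state machine, where an operation only executes if its required element is interactable in the current state. The `transitions` table and operation dicts are hypothetical representations for this sketch:

```python
def execute_sequence(operations, start_state, transitions):
    """Run the operations in order: if the element an operation needs
    is available in the current state, mimic the user interaction and
    follow the resulting state change; otherwise stop.
    `transitions` maps (state, element) -> next state."""
    performed = []
    state = start_state
    for op in operations:
        element = op["element"]
        if (state, element) not in transitions:
            break  # required element not available for interaction
        performed.append(op["name"])        # mimic user interaction
        state = transitions[(state, element)]
    return performed, state
```

Note that the second operation's precondition is evaluated in the *second* state, i.e. after the first interaction has changed the device state, exactly as the abstract specifies.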