Patent classifications
G10L2015/226
Efficient collaboration using a virtual assistant
In an approach to assisting users of a collaborative meeting platform, one or more computer processors detect a start of a collaborative meeting. One or more computer processors monitor one or more activities of the collaborative meeting. Based on the one or more activities of the collaborative meeting, one or more computer processors detect a trigger for assistance with a user interface of the collaborative meeting. One or more computer processors retrieve one or more correlated actions associated with the trigger. One or more computer processors perform at least one of the one or more retrieved correlated actions within the user interface of the collaborative meeting.
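The core loop the abstract describes — monitor activities, detect a trigger, look up correlated actions, perform them — can be sketched as a simple lookup. This is a minimal illustration, not the patented method; the trigger names and action strings are invented for the example.

```python
# Hypothetical mapping from detected meeting-activity triggers to UI actions.
TRIGGER_ACTIONS = {
    "mute_while_talking": ["show_unmute_prompt"],
    "screen_share_fumble": ["open_share_dialog", "highlight_share_button"],
}

def detect_trigger(activity_log):
    """Return the first known trigger found among the monitored activities."""
    for event in activity_log:
        if event in TRIGGER_ACTIONS:
            return event
    return None

def assist(activity_log):
    """Retrieve the actions correlated with a detected trigger (if any)."""
    trigger = detect_trigger(activity_log)
    if trigger is None:
        return []
    return TRIGGER_ACTIONS[trigger]
```

In practice the trigger detector would be a model over meeting audio/video rather than an exact string match.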
INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND NON-TRANSITORY COMPUTER-READABLE RECORDING MEDIUM STORING INFORMATION PROCESSING PROGRAM FOR SELECTING SET VALUE USED TO EXECUTE FUNCTION
The processor of an information processing apparatus serves, by executing an information processing program, as: a function determiner; a morpheme analyzer configured to analyze a message input by a user in morphemes; a word detector configured to detect a predetermined time-representing word indicating temporal nearness or farness and a predetermined keyword which is modified by the time-representing word and which indicates settings associated with the function from the message analyzed in morphemes by the morpheme analyzer; a setting selector configured to select a newest set value when the word detector has detected the time-representing word indicating temporal nearness and to select a set value used when the user used the function in the past when the word detector has detected the time-representing word indicating temporal farness; and a function executor configured to execute the function determined by the function determiner using the set value selected by the setting selector.
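The selection rule — newest set value for "near" time words, an older one for "far" time words — can be sketched as below. This is a toy illustration only: the word lists are invented, and whitespace tokenization stands in for real morpheme analysis.

```python
NEAR_WORDS = {"latest", "recent", "now"}        # temporal nearness (assumed)
FAR_WORDS = {"before", "previously", "past"}    # temporal farness (assumed)

def select_set_value(message, history):
    """history: list of previously used set values, oldest first.
    Return the newest value for a 'near' word, the oldest for a 'far' word."""
    tokens = message.lower().split()  # crude stand-in for morpheme analysis
    if any(t in NEAR_WORDS for t in tokens):
        return history[-1]
    if any(t in FAR_WORDS for t in tokens):
        return history[0]
    return None
```

A real implementation would also require the detected time word to modify a settings keyword, as the abstract specifies, rather than matching anywhere in the message.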
System and method for personalizing dialogue based on user's appearances
The present teaching relates to method, system, medium, and implementations for enabling communication with a user. Information representing the surroundings of a user engaged in an on-going dialogue is received via the communication platform, wherein the information includes a current response from the user in the on-going dialogue and is acquired from a current scene in which the user is present and captures characteristics of the user and the current scene. Relevant features are extracted from the information. A state of the user is estimated based on the relevant features and a dialogue context surrounding the current scene is determined based on the relevant features. A feedback directed to the current response of the user is generated based on the state of the user and the dialogue context.
Robot capable of conversation with another robot and method of controlling the same
A robot capable of conversation with another robot and a method of controlling the same are disclosed. The robot includes a main body having a first region corresponding to a human face and rotatable in left-right directions, a signal generator generating a first data signal to be transmitted to a listener robot and a first robot voice signal corresponding to the first data signal, a communication unit transmitting the first data signal to an external server, a speaker outputting the first robot voice signal, and a controller controlling a rotation direction of the main body such that the first region is directed toward the listener robot at a time point adjacent to a transmission time of the first data signal and controlling the speaker to output the first robot voice signal after the rotation direction of the robot is controlled, wherein the listener robot receives the first data signal transmitted from the external server and is controlled to operate based on the first data signal.
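The controller's task of turning the face region toward the listener before speaking reduces to computing the shortest left/right rotation between two headings. A minimal sketch, with a sign convention assumed for illustration:

```python
def rotation_toward(current_deg, listener_deg):
    """Shortest rotation (degrees) from the robot's current heading to the
    listener's bearing; positive = rotate left, negative = rotate right
    (assumed convention). Result is always in (-180, 180]."""
    return (listener_deg - current_deg + 180) % 360 - 180
```

The controller would command this rotation first, then trigger the speaker output once the turn completes.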
Co-reference understanding electronic apparatus and controlling method thereof
Disclosed is an electronic apparatus providing a reply to a query of a user. The electronic apparatus includes a microphone, a camera, a memory configured to store at least one instruction, and at least one processor, and the processor is configured to execute the at least one instruction to control the electronic apparatus to: identify a region of interest corresponding to a co-reference in an image acquired through the camera based on a co-reference being included in the query, identify an object referred to by the co-reference among at least one object included in the identified region of interest based on a dialogue content that includes the query, and provide information on the identified object as the reply.
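The resolution step — matching a co-reference in the query against objects detected in the region of interest, using the dialogue content — can be sketched as a recency search over the dialogue. The word lists and matching rule here are invented simplifications of what the abstract describes.

```python
COREFS = {"this", "that", "it", "those"}  # assumed co-reference cues

def resolve_coreference(query, dialogue_history, objects_in_roi):
    """If the query contains a co-reference, return the object in the region
    of interest most recently mentioned in the dialogue; else None."""
    words = {w.strip(".,?!") for w in query.lower().split()}
    if not words & COREFS:
        return None
    for turn in reversed(dialogue_history):
        for obj in objects_in_roi:
            if obj in turn.lower():
                return obj
    # Fall back to the first detected object if the dialogue gives no hint.
    return objects_in_roi[0] if objects_in_roi else None
```

A production system would use visual grounding and learned co-reference models rather than substring matches.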
Bias detection in speech recognition models
Systems and methods for detecting demographic bias in automatic speech recognition (ASR) systems. Corpora of transcriptions from different demographic groups are analyzed, where one of the groups is known to be susceptible to bias and another group is known not to be susceptible to bias. ASR accuracy for each group is measured and compared to each other using both statistics-based and practicality-based methodologies to determine whether a given ASR system or model exhibits a meaningful level of bias.
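The accuracy comparison the abstract describes typically rests on word error rate (WER) measured per group. A self-contained sketch of that comparison (a plain edit-distance WER and a mean-WER gap between two groups; the statistical-significance step is omitted):

```python
def wer(reference, hypothesis):
    """Word error rate: token-level edit distance divided by reference length."""
    r, h = reference.split(), hypothesis.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[-1][-1] / max(len(r), 1)

def bias_gap(group_a_pairs, group_b_pairs):
    """Difference in mean WER between two groups of (reference, hypothesis) pairs.
    A large positive gap suggests the system performs worse on group A."""
    mean = lambda pairs: sum(wer(r, h) for r, h in pairs) / len(pairs)
    return mean(group_a_pairs) - mean(group_b_pairs)
```

The "statistics-based" methodology in the abstract would then test whether such a gap is significant rather than noise; the "practicality-based" one would ask whether it is large enough to matter in use.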
Language Agnostic Multilingual End-To-End Streaming On-Device ASR System
A method includes receiving a sequence of acoustic frames characterizing one or more utterances as input to a multilingual automated speech recognition (ASR) model. The method also includes generating a higher order feature representation for a corresponding acoustic frame. The method also includes generating a hidden representation based on a sequence of non-blank symbols output by a final softmax layer. The method also includes generating a probability distribution over possible speech recognition hypotheses based on the hidden representation generated by the prediction network at each of the plurality of output steps and the higher order feature representation generated by the encoder at each of the plurality of output steps. The method also includes predicting an end of utterance (EOU) token at an end of each utterance. The method also includes classifying each acoustic frame as either speech, initial silence, intermediate silence, or final silence.
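The final step — classifying frames as speech or silence and predicting an end-of-utterance (EOU) token — can be illustrated with a toy rule: declare EOU once a run of final-silence frames is long enough. The labels and threshold here are assumptions for illustration; the patent's model predicts these classes with a neural network.

```python
def detect_eou(frame_labels, final_silence_run=3):
    """Return the index of the frame at which end-of-utterance is declared,
    i.e. where `final_silence_run` consecutive 'final_silence' labels end,
    or None if no such run occurs."""
    run = 0
    for i, label in enumerate(frame_labels):
        run = run + 1 if label == "final_silence" else 0
        if run >= final_silence_run:
            return i
    return None
```

Intermediate silence (a pause mid-utterance) deliberately does not trigger EOU, which is why the abstract distinguishes it from final silence.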
CONVERSATIONAL ARTIFICIAL INTELLIGENCE SYSTEM IN A VIRTUAL REALITY SPACE
A system for speech interpretation from a user's speech, while in a virtual environment, aided by user data and virtual world data. This system includes a virtual reality device comprising one or more user input devices, one or more user output devices, and a communication module. The output devices output a virtual environment to the user. A database stores information about elements in the virtual environment. An artificial intelligence module performs speech interpretation. The artificial intelligence module comprises a speech-to-text module that interprets user speech into a plurality of textual interpretations and, based on a ranking of the textual interpretations, selects a top interpretation. An augmentation module adds context to the user speech to aid in interpreting the speech. The context is derived from user data regarding the user's interaction with the virtual environment, and virtual environment data defining an element in the virtual environment with which the user is interacting.
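The ranking step — scoring multiple speech-to-text candidates against context drawn from the virtual environment — can be sketched as an overlap score. The scoring rule is an invented simplification of whatever ranking the system actually uses.

```python
def rank_interpretations(candidates, context_terms):
    """Pick the speech-to-text candidate with the most word overlap with
    context terms derived from the user's current virtual-environment state."""
    def score(text):
        return len(set(text.lower().split()) & context_terms)
    return max(candidates, key=score)
```

For example, context from an in-game shop can disambiguate homophones: with shop-related context terms, "buy the sword" outranks "by the sword".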
Digital Media Environment for Conversational Image Editing and Enhancement
Conversational image editing and enhancement techniques are described. For example, an indication of a digital image is received from a user. Aesthetic attribute scores for multiple aesthetic attributes of the image are generated. A computing device then conducts a natural language conversation with the user to edit the digital image. The computing device receives inputs from the user to refine the digital image as the natural language conversation progresses. The computing device generates natural language suggestions to edit the digital image based on the aesthetic attribute scores as part of the natural language conversation. The computing device provides feedback to the user that includes edits to the digital image based on the series of inputs. The computing device also includes as feedback natural language outputs indicating options for additional edits to the digital image based on the series of inputs and the previous edits to the digital image.
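The step of turning aesthetic attribute scores into natural language suggestions can be sketched as a threshold over per-attribute scores. The attribute names, threshold, and phrasing templates below are all invented for illustration.

```python
AESTHETIC_THRESHOLD = 0.5  # assumed cutoff below which an attribute needs work

def suggest_edits(scores):
    """Map low aesthetic-attribute scores (0..1) to natural language
    suggestions for the conversational loop."""
    templates = {
        "brightness": "Would you like to brighten the image?",
        "color_harmony": "Shall I adjust the color balance?",
        "sharpness": "Want me to sharpen the details?",
    }
    return [templates[attr]
            for attr, score in sorted(scores.items())
            if score < AESTHETIC_THRESHOLD and attr in templates]
```

In the described system these suggestions feed back into the ongoing conversation, interleaved with the user's own editing requests.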
Electronic apparatus, controlling method of electronic apparatus and server
An electronic apparatus which registers a device to a server by using a voice, and a method therefor are provided. The electronic apparatus includes a communication circuit, a microphone, a memory for storing computer executable instructions, and at least one processor configured to execute the computer executable instructions to acquire, from a voice received through the microphone, information on an external device which a user wishes to register, based on an external device corresponding to the acquired information being found through a search via the communication circuit, control the communication circuit to transmit information on an access point to the external device to enable the external device to communicate with a server, and control the communication circuit to transmit a registration request with respect to the external device to the server.
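The registration flow — parse a device from the voice command, find it, send it access-point information, then request registration with the server — can be sketched end to end. The server interface and matching rule here are hypothetical stand-ins for the apparatus's actual protocol.

```python
class StubServer:
    """Hypothetical stand-in for the server side of the registration flow."""
    def __init__(self):
        self.log = []
    def send_access_point_info(self, device):
        self.log.append(("ap_info", device))
    def request_registration(self, device):
        self.log.append(("register", device))

def register_device(utterance, known_devices, server):
    """Extract a device name from the voice command, then run the two-step
    flow: share access-point info, then request registration."""
    target = next((d for d in known_devices if d in utterance.lower()), None)
    if target is None:
        return "device not found"
    server.send_access_point_info(target)
    server.request_registration(target)
    return f"registered {target}"
```

The real apparatus would discover the external device over the communication circuit (e.g. a wireless scan) rather than matching against a fixed list.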