
Processing Multimodal User Input for Assistant Systems
20220012076 · 2022-01-13

In one embodiment, a method includes receiving a user input based on a plurality of modalities at the client system, wherein at least one of the modalities of the user input is a visual modality, determining one or more subjects and one or more attributes associated with the one or more subjects, respectively, based on the visual modality of the user input, resolving one or more entities corresponding to the one or more subjects based on the determined one or more attributes, and presenting a communication content at the client system responsive to the user input, wherein the communication content comprises information associated with execution results of one or more tasks corresponding to the one or more resolved entities.
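The attribute-based entity-resolution step can be illustrated with a minimal sketch. All names here (`ENTITY_CATALOG`, `resolve_entities`, the attribute keys) are hypothetical toy structures, not from the patent; a real system would match against a knowledge graph rather than a list:

```python
# Toy entity catalog; a production system would query a knowledge graph.
ENTITY_CATALOG = [
    {"id": "e1", "name": "Golden Gate Bridge",
     "attributes": {"color": "red", "kind": "bridge"}},
    {"id": "e2", "name": "Coit Tower",
     "attributes": {"color": "white", "kind": "tower"}},
]

def resolve_entities(subjects):
    """For each visually detected subject, pick the catalog entity whose
    attributes overlap most with the detected attributes."""
    resolved = []
    for subject in subjects:
        best = max(
            ENTITY_CATALOG,
            key=lambda e: sum(
                1 for k, v in subject["attributes"].items()
                if e["attributes"].get(k) == v
            ),
        )
        resolved.append(best["id"])
    return resolved
```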

Auto-completion for Multi-modal User Input in Assistant Systems
20210343286 · 2021-11-04

In one embodiment, a method includes receiving an initial input in a first modality from a first user at a client system, determining intents and slots corresponding to the initial input, wherein the slots are conditioned on the intents, generating one or more candidate continuation-inputs based on the intents and slots, where the one or more candidate continuation-inputs are in one or more candidate modalities, respectively, wherein the candidate modalities are different from the first modality, and wherein each of the candidate continuation-inputs references entities represented by the slots, and presenting one or more suggested inputs corresponding to one or more of the candidate continuation-inputs at the client system.
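As a sketch of the continuation-generation step, assuming a fixed toy set of modalities and already-resolved intents and slots (`suggest_continuations` and the modality names are illustrative, not the patent's implementation):

```python
def suggest_continuations(initial_modality, intent, slots):
    """Propose candidate continuation-inputs in every modality other than
    the one the user started with, each referencing the slot entities."""
    all_modalities = ["voice", "text", "gesture"]
    return [
        {"modality": m, "intent": intent, "references": list(slots.values())}
        for m in all_modalities
        if m != initial_modality
    ]

# e.g. a spoken request continued by text or gesture suggestions
suggestions = suggest_continuations("voice", "play_music", {"artist": "Adele"})
```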

Auto-completion for Gesture-input in Assistant Systems
20230154175 · 2023-05-18

In one embodiment, a method includes detecting a user input comprising an incomplete three-dimensional (3D) gesture performed by one or more hands of a first user by a virtual-reality (VR) headset, selecting candidate 3D gestures from pre-defined 3D gestures based on a personalized gesture-recognition model, wherein each of the candidate 3D gestures is associated with a confidence score representing a likelihood the first user intended to input the respective candidate 3D gesture, and presenting one or more suggested inputs corresponding to one or more of the candidate 3D gestures at the VR headset.
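The candidate-selection step with per-candidate confidence scores can be sketched as follows, reducing gestures to toy 1-D trajectories; the prefix-distance scoring stands in for the personalized gesture-recognition model, and all names are hypothetical:

```python
import math

def rank_candidate_gestures(partial_trajectory, gesture_templates, top_k=3):
    """Score each pre-defined gesture by how closely its prefix tracks the
    observed partial trajectory, then return the top candidates with
    normalized confidence scores."""
    scored = []
    for name, template in gesture_templates.items():
        prefix = template[: len(partial_trajectory)]
        dist = sum((a - b) ** 2 for a, b in zip(partial_trajectory, prefix))
        scored.append((name, math.exp(-dist)))  # closer prefix -> higher score
    total = sum(score for _, score in scored)
    ranked = sorted(((n, s / total) for n, s in scored), key=lambda p: -p[1])
    return ranked[:top_k]

candidates = rank_candidate_gestures(
    [0.0, 1.0],
    {"swipe_right": [0.0, 1.0, 2.0, 3.0], "circle": [0.0, 2.0, 1.0, 0.0]},
    top_k=2,
)
```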

Predictive injection of conversation fillers for assistant systems

In one embodiment, a method includes, by a client system, receiving, at the client system, a first user input, processing, by the client system, the first user input to provide an initial response by identifying one or more entities referenced by the first user input and providing, by the client system, the initial response, where the initial response includes a conversational filler referencing at least one of the one or more identified entities, processing the first user input to provide a complete response by identifying, by the client system, one or more intents and one or more slots associated with the first user input based on a semantic analysis by a natural-language understanding module, and providing, by the client system, the complete response subsequent to the initial response, where the complete response is based on the one or more intents and the one or more slots.
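The two-stage pattern — a fast entity-referencing filler followed by the full NLU-based response — can be sketched roughly as below. The `respond` function, the stub NLU callable, and the output formats are all hypothetical:

```python
def respond(user_input, entity_lookup, nlu):
    """Emit a quick conversational filler naming a recognized entity,
    then the complete response once full NLU has run."""
    entities = [w for w in user_input.split() if w in entity_lookup]
    filler = f"Let me check on {entities[0]}..." if entities else "One moment..."
    intent, slots = nlu(user_input)  # the slow path: full semantic analysis
    complete = f"[{intent}] " + ", ".join(f"{k}={v}" for k, v in slots.items())
    return filler, complete

# demo with a stub NLU module
filler, complete = respond(
    "weather in Paris",
    {"Paris": "entity:paris"},
    lambda text: ("weather_query", {"city": text.split()[-1]}),
)
```

In a real system the filler would be spoken immediately while NLU runs asynchronously; the sketch runs both sequentially for simplicity.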

Content summarization for assistant systems

In one embodiment, a method includes, by a client system, receiving, by an assistant xbot of the client system, a request from a first user for a summary of user content from a first content source, retrieving, from the first content source, a plurality of content items corresponding to the request, generating a personalized summary of the retrieved content items, wherein the personalization of the summary is based on a user profile of the first user, and presenting, by the assistant xbot, the personalized summary responsive to the request within a separate communication interface between the assistant xbot and the first user, wherein the personalized summary is interactable by the first user to react to one or more of the plurality of content items.
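A minimal sketch of the personalization step, assuming content items tagged with topics and a user profile listing interests (all field names are illustrative):

```python
def personalized_summary(content_items, user_profile, max_items=2):
    """Rank retrieved content items by overlap with the user's interest
    topics and join the headlines of the top items into a short summary."""
    def score(item):
        return len(set(item["topics"]) & set(user_profile["interests"]))
    top = sorted(content_items, key=score, reverse=True)[:max_items]
    return " ".join(item["headline"] for item in top)

items = [
    {"headline": "Team wins final.", "topics": ["sports"]},
    {"headline": "New phone released.", "topics": ["tech"]},
    {"headline": "Stocks rally.", "topics": ["finance"]},
]
summary = personalized_summary(items, {"interests": ["tech"]}, max_items=1)
```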

Engaging users by personalized composing-content recommendation

In one embodiment, a method includes receiving an indication of a trigger action by a first user at a client system, wherein the trigger action is associated with a priming content object, identifying related content objects associated with the priming content object, selecting recommended content objects based on the priming content object, the related content objects, and profile information of the first user, wherein each of the selected recommended content objects comprises entity information of entities associated with the priming content object, and presenting content suggestions at the client system, wherein each content suggestion comprises one of the selected recommended content objects.
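The selection step — combining the priming object, its related objects, and profile information — can be sketched with a toy relatedness index and topic affinities (the index, the affinity weights, and `recommend` itself are assumptions for illustration):

```python
def recommend(priming, related_index, profile, top_k=2):
    """Select recommended content objects: candidates related to the
    priming object, ranked by the user's topic affinities."""
    candidates = related_index.get(priming["id"], [])
    def score(obj):
        return sum(profile["affinity"].get(t, 0) for t in obj["topics"])
    return sorted(candidates, key=score, reverse=True)[:top_k]

related = {"post1": [
    {"id": "rec_a", "topics": ["travel"]},
    {"id": "rec_b", "topics": ["food", "travel"]},
]}
picks = recommend({"id": "post1"}, related,
                  {"affinity": {"food": 2, "travel": 1}}, top_k=1)
```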

Personalized gesture recognition for user interaction with assistant systems

In one embodiment, a method includes receiving a user request from a client system associated with a first user, wherein the user request comprises a gesture-input from the first user and a speech-input from the first user, determining an intent corresponding to the user request based on the gesture-input by a personalized gesture-classification model associated with the first user, executing one or more tasks based on the determined intent and the speech-input, and sending instructions for presenting execution results of the one or more tasks to the client system responsive to the user request.
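Combining the two modalities can be sketched as below, with a dict standing in for the personalized gesture-classification model and last-word extraction standing in for speech slot-filling (both stand-ins are toy assumptions):

```python
def classify_request(gesture, speech, gesture_model):
    """Derive the intent from the per-user gesture classifier and pull the
    task argument from the accompanying speech input."""
    intent = gesture_model.get(gesture, "unknown")
    slot = speech.split()[-1] if speech else None  # toy slot extraction
    return {"intent": intent, "slot": slot}
```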

Generating multi-perspective responses by assistant systems

In one embodiment, a method includes receiving a user query associated with dialog-intents at a client system, executing tasks corresponding to the dialog-intents, generating a multi-perspective response by a stitching model based on two or more of the execution results of the tasks, wherein the multi-perspective response comprises a natural-language response combining the two or more execution results, and presenting the multi-perspective response at the client system.
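The stitching step can be sketched with simple string composition; a real stitching model would be learned, so this rule-based joiner is only an illustrative stand-in:

```python
def stitch_response(results):
    """Stitch two or more task execution results into one natural-language,
    multi-perspective response."""
    if len(results) == 1:
        return results[0]
    # join all but the last result, then contrast with the final one
    return "; ".join(results[:-1]) + ", while " + results[-1]
```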

Resolving entities from multiple data sources for assistant systems
11704899 · 2023-07-18

In one embodiment, a method includes receiving a request to access a first record in a plurality of records, where the first record describes a first set of attributes of a first entity, determining the first record is linked to a globally unique entity identifier, identifying one or more second records linked to the unique entity identifier, where the one or more second records describe one or more second sets of attributes of the first entity, generating a fused record comprising descriptions of attributes of the first entity from the first set and second sets of attributes, where the fused record is generated by deduping the plurality of records to associate the first record and the one or more second records with the unique entity identifier and compiling the first set and one or more second sets of attributes, and sending, responsive to the request to access the first record, instructions for presenting the fused record.
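The dedupe-and-compile step can be sketched as gathering every record linked to the same globally unique entity identifier and merging their attribute sets (field names and the first-record-wins conflict rule are assumptions of this sketch):

```python
def fuse_records(records, entity_id):
    """Compile one fused record from all records linked to the same
    globally unique entity identifier."""
    fused = {"entity_id": entity_id, "attributes": {}}
    for rec in records:
        if rec["entity_id"] == entity_id:
            for key, value in rec["attributes"].items():
                fused["attributes"].setdefault(key, value)  # first record wins
    return fused

records = [
    {"entity_id": "ent_42", "attributes": {"name": "Acme Cafe", "city": "Austin"}},
    {"entity_id": "ent_42", "attributes": {"phone": "555-0100", "city": "Austin, TX"}},
    {"entity_id": "ent_99", "attributes": {"name": "Other Place"}},
]
fused = fuse_records(records, "ent_42")
```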

Intent identification for agent matching by assistant systems

In one embodiment, a method includes receiving a user request from a client system associated with a first user, wherein the user request is associated with a semantic-intent, identifying one or more dialog-intents associated with the user request based on the semantic-intent and context information associated with the user request, wherein each dialog-intent is a sub-intent of the semantic-intent, determining one or more agents for executing one or more tasks associated with the one or more dialog-intents, and sending instructions for presenting information returned from the one or more agents responsive to executing the one or more tasks to the client system.
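The expansion from a semantic-intent into context-dependent dialog-intents, and the lookup of an agent per dialog-intent, can be sketched with plain dict lookups (the map keyed on intent-plus-context and the registry are hypothetical stand-ins for learned components):

```python
def match_agents(semantic_intent, context, dialog_intent_map, agent_registry):
    """Expand a semantic-intent into context-dependent dialog-intents
    (sub-intents), then look up the agent registered for each."""
    dialog_intents = dialog_intent_map.get((semantic_intent, context), [])
    return {di: agent_registry[di] for di in dialog_intents if di in agent_registry}

agents = match_agents(
    "plan_outing", "weekend",
    {("plan_outing", "weekend"): ["find_restaurant", "check_weather"]},
    {"find_restaurant": "dining_agent", "check_weather": "weather_agent"},
)
```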