G10L15/18

Conditional camera control via automated assistant commands

Implementations set forth herein relate to an automated assistant that can control a camera according to one or more conditions specified by a user. A condition can be satisfied when, for example, the automated assistant detects that a particular environment feature is apparent. In this way, the user can rely on the automated assistant to identify and capture certain moments without necessarily requiring the user to constantly monitor a viewing window of the camera. In some implementations, a condition for the automated assistant to capture media data can be based on application data and/or other contextual data that is associated with the automated assistant. For instance, a relationship between content in a camera viewing window and other content of an application interface can be a condition upon which the automated assistant captures certain media data using a camera.
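
The condition-then-capture idea in this abstract can be illustrated with a minimal sketch. It is not the claimed implementation: `Frame`, `ConditionalCapture`, and `feature_present` are hypothetical names, and each camera frame is assumed to arrive already annotated with the feature labels that some upstream detector reported.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class Frame:
    """Hypothetical stand-in for a camera frame plus its detected feature labels."""
    index: int
    features: Set[str]


@dataclass
class ConditionalCapture:
    """Captures a frame only when the user-specified condition holds."""
    condition: Callable[[Frame], bool]
    captured: List[int] = field(default_factory=list)

    def observe(self, frame: Frame) -> bool:
        # Record the frame index if the condition is satisfied for this frame.
        if self.condition(frame):
            self.captured.append(frame.index)
            return True
        return False


def feature_present(label: str) -> Callable[[Frame], bool]:
    """Build a condition that is satisfied when `label` appears in the frame."""
    return lambda frame: label in frame.features


# Assumed example data: three frames with made-up feature labels.
frames = [Frame(0, {"tree"}), Frame(1, {"tree", "bird"}), Frame(2, {"bird"})]
capture = ConditionalCapture(condition=feature_present("bird"))
for frame in frames:
    capture.observe(frame)
```

Here the user never inspects the frames directly; only the frames in which the assumed "bird" label appears are recorded in `capture.captured`.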

Virtual personal agent leveraging natural language processing and machine learning

Inter-virtual agent communication between communication devices owned by different users is provided. A first communication channel and a second communication channel are established with a remote data processing system. A virtual agent-to-virtual agent handshake is performed during establishment of the first communication channel. Virtual agent commands are exchanged with a remote virtual agent located on the remote data processing system via the first communication channel. An action corresponding to a virtual agent command received from the remote virtual agent located on the remote data processing system is performed while a human conversation is conducted via the second communication channel.

INFORMATION PROCESSING DEVICE AND INFORMATION PROCESSING METHOD
20230012053 · 2023-01-12

An information processing device according to the present disclosure includes: an acquisition unit that acquires outline information indicating an outline of a user who makes a body motion; and a specification unit that specifies, among body parts, a main part corresponding to the body motion and a related part, which is to be a target of correction processing of motion information corresponding to the body motion, on the basis of the outline information acquired by the acquisition unit.

DETECTING AN IN-FIELD EVENT
20230010941 · 2023-01-12

Examples are disclosed that relate to methods, computing devices, and systems for detecting an in-field event. One example provides a method comprising, during a training phase, receiving one or more training data streams. The training data stream(s) include an audio input comprising a semantic indicator. The audio input is processed to recognize the semantic indicator. A subset of data is selected and used to train a machine learning model to detect the in-field event, and the method further comprises outputting the trained machine learning model. During a run-time phase, the method comprises receiving one or more run-time input data streams. The trained machine learning model is used to detect a second instance of the in-field event in the one or more run-time input data streams. The method further comprises outputting an indication of the second instance of the in-field event.

Natural language processing routing

Devices and techniques are generally described for a speech processing routing architecture. In various examples, first data comprising a first feature definition is received. The first feature definition may include a first indication of first source data and first instructions for generating feature data using the first source data. In various examples, the feature data may be generated according to the first feature definition. In some examples, a speech processing system may receive a first request to process a first utterance. The feature data may be retrieved from a non-transitory computer-readable memory. The speech processing system may determine a first skill for processing the first utterance based at least in part on the feature data.
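
One way to picture the flow in this abstract (feature definition, feature generation, storage, then skill selection) is the sketch below. It assumes a plain dictionary as the feature store and a hand-written score function per skill; `FeatureDefinition`, `generate_features`, `route`, and the sample data are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class FeatureDefinition:
    """Names the source data and gives instructions for computing a feature."""
    name: str
    source: str                      # indication of the source data
    compute: Callable[[Any], float]  # instructions for generating the feature


def generate_features(
    definitions: List[FeatureDefinition], sources: Dict[str, Any]
) -> Dict[str, float]:
    """Apply each definition to its source data and collect the results."""
    return {d.name: d.compute(sources[d.source]) for d in definitions}


def route(
    utterance: str,
    features: Dict[str, float],
    skills: Dict[str, Callable[[str, Dict[str, float]], float]],
) -> str:
    """Pick the skill whose (assumed) score function rates the request highest."""
    return max(skills, key=lambda name: skills[name](utterance, features))


# Assumed source data: which skills handled the user's recent requests.
sources = {"skill_history": ["music", "music", "weather", "music"]}
definitions = [
    FeatureDefinition(
        name="music_usage_rate",
        source="skill_history",
        compute=lambda history: history.count("music") / len(history),
    )
]

# The dictionary stands in for the memory the features are later retrieved from.
feature_store = generate_features(definitions, sources)

skills = {
    "music": lambda u, f: f["music_usage_rate"] + (1.0 if "play" in u else 0.0),
    "weather": lambda u, f: 1.0 if "weather" in u else 0.0,
}
chosen = route("play something", feature_store, skills)
```

Keeping the definition (source plus instructions) separate from the routing step means a new feature can be added without touching the router, which appears to be the point of the architecture described.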

Natural language processing

Example embodiments provide techniques for configuring a natural-language processing system to perform a new function given at least one sample invocation of the function. The training data consisting of the sample invocation may be augmented by determining which subset of available training data most closely resembles the sample invocation and/or function. The effect of re-training a component with this augmented training data may be determined, and an annotator may review any annotations corresponding to the invocation if the effect is large.

Method and apparatus for predicting customer satisfaction from a conversation
11553085 · 2023-01-10

A method and an apparatus for predicting satisfaction of a customer pursuant to a call between the customer and an agent, in which the method comprises receiving a transcribed text of the call, dividing the transcribed text into a plurality of phases of a conversation, extracting at least one call feature for each of the plurality of phases, receiving call metadata, extracting metadata features from the call metadata, combining the call features and the metadata features, and generating an output, using a trained machine learning (ML) model, based on the combined features, indicating whether the customer is satisfied or not. The ML model is trained to generate an output indicating whether the customer is satisfied or not, based on an input of the combined features.
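
The pipeline in this abstract (phases, per-phase call features, metadata features, combined vector, model output) can be sketched as follows. Several pieces are assumptions made only for illustration: phases are formed by splitting the turns into equal chunks, the call feature is a positive-minus-negative word count, and a fixed linear scorer with hand-picked `WEIGHTS` stands in for the trained ML model, which the abstract does not specify.

```python
from typing import Dict, List

# Assumed toy lexicons; a real system would learn or source these differently.
POSITIVE = {"thanks", "great", "perfect"}
NEGATIVE = {"frustrated", "cancel", "unacceptable"}


def split_into_phases(turns: List[str], n_phases: int = 3) -> List[List[str]]:
    """Divide the transcript turns into n_phases consecutive chunks."""
    size = -(-len(turns) // n_phases)  # ceiling division
    return [turns[i * size:(i + 1) * size] for i in range(n_phases)]


def phase_feature(phase: List[str]) -> float:
    """One call feature per phase: positive word count minus negative word count."""
    words = [w.strip(".,!?").lower() for turn in phase for w in turn.split()]
    return float(sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words))


def metadata_features(metadata: Dict[str, float]) -> List[float]:
    """Assumed metadata features: call length in minutes and number of holds."""
    return [metadata["duration_s"] / 60.0, float(metadata["holds"])]


def combine(turns: List[str], metadata: Dict[str, float]) -> List[float]:
    """Concatenate per-phase call features with the metadata features."""
    return [phase_feature(p) for p in split_into_phases(turns)] + metadata_features(metadata)


# Hand-picked weights standing in for a trained model (an assumption, not learned).
WEIGHTS = [0.5, 0.5, 1.0, -0.1, -0.5]


def predict_satisfied(features: List[float]) -> bool:
    """Return True when the stand-in linear score is positive."""
    return sum(w * x for w, x in zip(WEIGHTS, features)) > 0


turns = [
    "I am frustrated with my bill.",
    "Let me check that for you.",
    "I see the error and fixed it.",
    "Okay.",
    "Is there anything else?",
    "No, thanks, that is great!",
]
features = combine(turns, {"duration_s": 300, "holds": 1})
satisfied = predict_satisfied(features)
```

Computing features per phase rather than over the whole call lets the closing phase be weighted more heavily than the opening, which is one plausible reason for the phase split described in the abstract.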

Systems and methods for parsing and correlating solicitation video content
11574629 · 2023-02-07

Aspects relate to systems and methods for parsing and correlating solicitation video content. An exemplary system includes a computing device configured to receive a solicitation video related to a subject, where the solicitation video includes at least an image component and at least an audio component, where the audio component includes audible verbal content related to at least an attribute of the subject, transcribe at least a keyword as a function of the audio component, and associate the subject with at least a job description as a function of the at least a keyword.
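
The keyword-to-job-description association can be illustrated with a small sketch. It assumes the audio component has already been transcribed to text (speech recognition is out of scope here) and uses simple keyword overlap as the association function; `extract_keywords`, `best_job`, and the sample job data are hypothetical.

```python
from typing import Dict, Set


def extract_keywords(transcript: str, vocabulary: Set[str]) -> Set[str]:
    """Keep only transcript tokens that appear in the known keyword vocabulary."""
    tokens = {t.strip(".,!?").lower() for t in transcript.split()}
    return tokens & vocabulary


def best_job(keywords: Set[str], jobs: Dict[str, Set[str]]) -> str:
    """Associate the subject with the job description sharing the most keywords."""
    return max(jobs, key=lambda title: len(keywords & jobs[title]))


# Assumed job descriptions, each reduced to a set of keywords.
jobs = {
    "Backend Engineer": {"python", "sql", "apis"},
    "Graphic Designer": {"illustration", "typography", "branding"},
}
vocabulary = set().union(*jobs.values())

# Assumed transcript of the video's audio component.
transcript = "I build APIs in Python and tune SQL queries."
keywords = extract_keywords(transcript, vocabulary)
match = best_job(keywords, jobs)
```

Overlap counting is only one possible association function; the abstract leaves the function open.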
