G10L13/027

EARLY INVOCATION FOR CONTEXTUAL DATA PROCESSING

A speech processing system uses contextual data to determine the specific domains, subdomains, and applications appropriate for taking action in response to spoken commands and other utterances. The system can use signals and other contextual data associated with an utterance, such as location signals, content catalog data, data regarding historical usage patterns, data regarding content visually presented on a display screen of a computing device when an utterance was made, other data, or some combination thereof.

MULTI-TIER SPEECH PROCESSING AND CONTENT OPERATIONS

A multi-tier architecture is provided for processing user voice queries and making routing decisions for generating responses, including responses to book browsing requests and other content requests. When an utterance is associated with multiple applications in a given domain, the applications may be organized into a subdomain and a tier of routing decisions may be added to the inter-domain and intra-domain routing decision system. The system uses contextual signals to make subdomain routing decisions, including signals regarding content items that are already in a user's content catalog, consumption status of individual content items in the user's catalog, and the like.
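The subdomain routing tier described above can be sketched as a small lookup over catalog signals. This is only an illustration under stated assumptions: the domain, subdomain, and application names, the `Context` type, and the status values are all hypothetical and do not come from the abstract.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Context:
    # Hypothetical contextual signal: title -> consumption status
    # ("unread", "in_progress", or "finished") for items the user owns.
    catalog: Dict[str, str] = field(default_factory=dict)


def route_book_request(title: str, ctx: Context) -> Tuple[str, str, str]:
    """Return an assumed (domain, subdomain, application) route for a book request."""
    status = ctx.catalog.get(title)
    if status is None:
        # Not in the user's catalog: route to a (hypothetical) shopping subdomain.
        return ("books", "shopping", "store_search")
    if status == "in_progress":
        # Owned and partly consumed: resume where the user left off.
        return ("books", "reading", "resume_reader")
    # Owned but unread or finished: open from the start.
    return ("books", "reading", "open_reader")


ctx = Context(catalog={"Dune": "in_progress", "Emma": "unread"})
print(route_book_request("Dune", ctx))
print(route_book_request("Ulysses", ctx))
```

A real system would presumably combine many more signals; the point here is only that catalog membership and consumption status select the subdomain before an application is chosen.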

MULTI-DOMAIN INTENT HANDLING WITH CROSS-DOMAIN CONTEXTUAL SIGNALS

A multi-tier domain is provided for processing user voice queries and making routing decisions for generating responses, including for user voice queries that include multi-domain trigger words or phrases. When an utterance is recognized as different intents in different domains, a routing system for a domain may consider contextual signals, including those associated with other domains, to determine whether the domain is the proper one to handle the request. This determination can be performed with a statistical model specifically trained to make such determinations using the available contextual data.
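One way to picture the per-domain determination is a logistic scorer over contextual signals. The sketch below is an assumption-laden illustration: the signal names, weights, and bias are hand-set and hypothetical, whereas the abstract refers to a statistical model trained on the available contextual data.

```python
import math
from typing import Dict, Tuple

# Hypothetical, hand-set parameters; a trained model would learn these.
WEIGHTS: Dict[str, float] = {
    "own_domain_intent_score": 2.0,      # confidence of this domain's intent
    "other_domain_intent_score": -2.0,   # confidence of a competing domain's intent
    "screen_shows_domain_content": 1.5,  # 1.0 if on-screen content belongs to this domain
}
BIAS = -0.5


def domain_should_handle(signals: Dict[str, float],
                         threshold: float = 0.5) -> Tuple[bool, float]:
    """Return (decision, probability) that this domain should handle the request."""
    z = BIAS + sum(w * signals.get(name, 0.0) for name, w in WEIGHTS.items())
    probability = 1.0 / (1.0 + math.exp(-z))
    return probability >= threshold, probability


decision, p = domain_should_handle({
    "own_domain_intent_score": 0.9,
    "other_domain_intent_score": 0.2,
    "screen_shows_domain_content": 1.0,
})
print(decision, round(p, 3))
```

Note that a signal from another domain enters the score directly, which mirrors the cross-domain aspect of the abstract.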

Agent apparatus, agent system, and server device

An agent device includes an acquirer configured to acquire an utterance of a user of a first vehicle, and a first agent controller configured to perform processing for providing a service including causing an output device to output a voice response to an utterance of the user of the first vehicle acquired by the acquirer. When there is a difference between a service which is utilized in the first vehicle and is available from one or more agent controllers including at least the first agent controller and a service which is utilized in a second vehicle and is available from one or more agent controllers, the first agent controller provides information on the difference.

PRONOUN-BASED NATURAL LANGUAGE PROCESSING
20220399015 · 2022-12-15

Disclosed herein are various embodiments for pronoun-based natural language processing. An embodiment operates by receiving a plurality of text-based sentences each comprising a plurality of words, and each text-based sentence including a pronoun. A plurality of candidate nouns are identified amongst the plurality of words. A trigger word is identified from the plurality of words, wherein the trigger word is associated with both the pronoun and one of the plurality of candidate nouns. A score for each of the candidate nouns is received based on a relationship with the trigger word. The candidate noun with a highest score is selected as being associated with the pronoun.
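The final scoring and selection steps can be sketched as follows. This is a minimal illustration, assuming the candidate nouns and trigger word have already been identified; the `RELATION_SCORES` table and its values are hypothetical stand-ins for whatever component supplies the relationship scores.

```python
from typing import Dict, List, Tuple

# Hypothetical relationship scores for (trigger word, candidate noun) pairs.
RELATION_SCORES: Dict[Tuple[str, str], float] = {
    ("barked", "dog"): 0.9,
    ("barked", "mailman"): 0.1,
}


def resolve_pronoun(candidates: List[str], trigger: str,
                    scores: Dict[Tuple[str, str], float] = RELATION_SCORES
                    ) -> Tuple[str, Dict[str, float]]:
    """Score each candidate noun against the trigger word and pick the highest."""
    scored = {noun: scores.get((trigger, noun), 0.0) for noun in candidates}
    best = max(scored, key=scored.get)
    return best, scored


# "The mailman saw the dog. It barked." -> which noun does "It" refer to?
best, scored = resolve_pronoun(["mailman", "dog"], trigger="barked")
print(best, scored)
```

Unknown pairs default to a score of 0.0 here, which is a design choice of the sketch rather than something the abstract specifies.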

Systems and methods for providing search query responses having contextually relevant voice output
11520821 · 2022-12-06

Systems and methods are described for responding to a search query with a contextually relevant voice output. An illustrative method receives a search query, determines an answer to the search query, identifies a media content reference included in the search query, determines, based on the media content reference, a personality associated with the media content reference, identifies a voice profile of the personality, and generates audio output using the voice profile of the personality, the audio output including the answer to the search query.

Agent device, agent system, method for controlling agent device, and storage medium

An agent device includes a display controller configured to cause a first display to display an agent image when an agent providing a service including causing an output device to output a voice response to an utterance of a user is activated, and a controller configured to execute particular control for causing a second display to display the agent image according to loudness of a voice received by an external terminal receiving a vocal input.

LANGUAGE PROCESSOR, LANGUAGE PROCESSING METHOD AND LANGUAGE PROCESSING PROGRAM

The present disclosure is directed to enabling acquisition of information of an argument corresponding to a case. The present disclosure is a language processing apparatus which refers to an argument emergence history database 14 which stores argument emergence patterns associated with cases and arguments of verbs for each meaning of a word or usage of a verb, acquires an argument emergence pattern which matches a verb and a case of the verb included in a request from a user from the argument emergence history database 14, and generates a response to the user using an argument included in the argument emergence pattern acquired from the argument emergence history database 14.

SIMULATING CROWD NOISE FOR LIVE EVENTS THROUGH EMOTIONAL ANALYSIS OF DISTRIBUTED INPUTS
20220383849 · 2022-12-01

Methods and systems are provided for generating crowd noise related to a media event being presented using a cloud service. The method includes receiving audio data captured from a viewer of the media event. The method includes processing the audio data to identify utterances of the viewer. In one embodiment, features of the utterances are classified to build a reaction model for identifying reaction states of the viewer. The method includes producing a soundscape for the crowd noise, where the soundscape blends together audio of generic crowd noise related to the media event and audio corresponding to one or more of said reaction states of the viewer. In one embodiment, the soundscape is output to a speaker associated with presentation of the media event to the viewer.
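The blending step can be illustrated with a simple weighted mix of two sample streams. This is a sketch under assumptions, not the method of the abstract: audio is represented as lists of floats in the range [-1.0, 1.0], and the `reaction_gain` parameter and linear mixing rule are hypothetical.

```python
from typing import List


def blend_soundscape(generic: List[float], reaction: List[float],
                     reaction_gain: float = 0.5) -> List[float]:
    """Mix generic crowd noise with reaction audio, sample by sample.

    The shorter stream is padded with silence, and each mixed sample
    is clipped to the range [-1.0, 1.0].
    """
    length = max(len(generic), len(reaction))
    mixed = []
    for i in range(length):
        g = generic[i] if i < len(generic) else 0.0
        r = reaction[i] if i < len(reaction) else 0.0
        sample = (1.0 - reaction_gain) * g + reaction_gain * r
        mixed.append(max(-1.0, min(1.0, sample)))
    return mixed


print(blend_soundscape([1.0, 0.0], [0.0, 1.0, 1.0], reaction_gain=0.5))
```

In practice the gain might depend on the identified reaction state, for example raising it when many viewers are classified as cheering.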