G10L15/1807

System and method for conversational agent via adaptive caching of dialogue tree
11468885 · 2022-10-11

The present teaching relates to a method, system, medium, and implementations for managing a user-machine dialogue. A server receives a request from a device for a response to be directed to a user engaged in a dialogue with the device. The request includes information related to a current state of the dialogue. The response is determined based on a dialogue tree and the information related to the current state of the dialogue. A sub-dialogue tree, which corresponds to a portion of the dialogue tree, is then created based on the response and the dialogue tree, and is used to generate a local dialogue manager for the device. The response, the sub-dialogue tree, and the local dialogue manager are then sent to the device. Once deployed on the device, the local dialogue manager can drive the dialogue with the user based on the sub-dialogue tree.
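The caching idea in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `Node` structure, the depth-limited cut used to pick the "portion" of the tree, and the names `find_node`, `extract_subtree`, `LocalDialogueManager`, and `handle_request` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """One dialogue state: the agent's response plus next states keyed by user input."""
    response: str
    children: Dict[str, "Node"] = field(default_factory=dict)


def find_node(root: Node, path: List[str]) -> Node:
    """Follow the user inputs seen so far to locate the current dialogue state."""
    node = root
    for user_input in path:
        node = node.children[user_input]
    return node


def extract_subtree(node: Node, depth: int) -> Node:
    """Copy the part of the tree reachable within `depth` turns (assumed caching policy)."""
    if depth <= 0:
        return Node(node.response)
    return Node(
        node.response,
        {key: extract_subtree(child, depth - 1) for key, child in node.children.items()},
    )


class LocalDialogueManager:
    """Hypothetical on-device manager that drives the dialogue from the cached sub-tree."""

    def __init__(self, subtree: Node) -> None:
        self.current = subtree

    def step(self, user_input: str) -> Optional[str]:
        """Return the cached response, or None when the device must ask the server."""
        nxt = self.current.children.get(user_input)
        if nxt is None:
            return None
        self.current = nxt
        return nxt.response


def handle_request(
    root: Node, path: List[str], depth: int = 2
) -> Tuple[str, Node, LocalDialogueManager]:
    """Server side: pick the response, cut a sub-tree, and build a local manager for it."""
    current = find_node(root, path)
    subtree = extract_subtree(current, depth)
    return current.response, subtree, LocalDialogueManager(subtree)
```

In this sketch a `None` result from `step` marks a cache miss, at which point the device would send a new request to the server with the updated dialogue state.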

TRAINING NEURAL NETWORKS TO PREDICT ACOUSTIC SEQUENCES USING OBSERVED PROSODY INFO
20220328041 · 2022-10-13

An example system includes a processor to receive training targets. The training targets include an observed prosody info vector. The processor can train a neural network to predict acoustic sequences based on the training targets. The processor can train a prosody info generator to predict combined prosody info.

Systems and methods for correcting a voice query based on a subsequent voice query with a lower pronunciation rate
11386134 · 2022-07-12

Systems and methods are provided for correcting a voice query based on a subsequent voice query with a lower pronunciation rate. In some aspects, the systems and methods calculate first and second pronunciation rates of first and second voice queries. The systems and methods determine that the second pronunciation rate is lower than the first pronunciation rate and determine a first candidate pronunciation time for a first candidate word from the first voice query. The systems and methods determine a second candidate pronunciation time, adjusted to the first pronunciation rate, for a second candidate word from the second voice query. The systems and methods determine that the first candidate pronunciation time matches the second candidate pronunciation time and generate a third voice query based on the first voice query by replacing the first candidate word with the second candidate word.
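The steps in this abstract can be sketched as follows. The abstract does not define the pronunciation rate or what counts as a match, so this sketch assumes words per second for the rate, a fixed tolerance in seconds for matching, and candidate words chosen as those appearing in only one of the two queries. The function names and the `(word, duration)` input format are hypothetical.

```python
from typing import List, Optional, Tuple

Word = Tuple[str, float]  # (recognized text, spoken duration in seconds)


def pronunciation_rate(query: List[Word]) -> float:
    """Assumed definition: words per second over the whole query."""
    return len(query) / sum(duration for _, duration in query)


def correct_query(
    first: List[Word], second: List[Word], tolerance: float = 0.05
) -> Optional[List[str]]:
    """Return a third query built from `first`, or None if no correction applies."""
    rate1, rate2 = pronunciation_rate(first), pronunciation_rate(second)
    if rate2 >= rate1:
        return None  # the repeat was not spoken more slowly

    first_words = {word for word, _ in first}
    second_words = {word for word, _ in second}
    for i, (word1, time1) in enumerate(first):
        if word1 in second_words:
            continue  # recognized identically both times, so not a candidate
        for word2, time2 in second:
            if word2 in first_words:
                continue
            # Rescale the slow word's duration to the first query's speaking rate.
            adjusted = time2 * rate2 / rate1
            if abs(time1 - adjusted) <= tolerance:
                corrected = [word for word, _ in first]
                corrected[i] = word2
                return corrected
    return None
```

For example, if the first query is recognized as "show me born movies" and the user repeats it at half speed as "show me bourne movies", the rescaled duration of "bourne" lines up with that of "born", and the sketch returns the first query with "born" replaced.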

Authentication circle management

An approach for establishing and managing authentication circles is disclosed. The circles may be used to facilitate management of accounts, goals, or resources of one or more entities, or to provide an integrated view of the circumstances of, for example, family members or other interrelated persons. A person receiving assistance with the management of one or more accounts need not disclose authentication credentials to persons helping manage the accounts, enhancing security. Members may view members and access accounts administered by separate computing systems without needing credentials for each member, account, and/or computing system. The multiple accounts (which may be held at multiple institutions) need not be accessed individually by each member of the authentication circle, saving time and computing resources of users.

SYSTEM AND METHOD FOR CROSS-SPEAKER STYLE TRANSFER IN TEXT-TO-SPEECH AND TRAINING DATA GENERATION
20220293091 · 2022-09-15

Systems are configured for generating spectrogram data characterized by a voice timbre of a target speaker and a prosody style of a source speaker by converting a waveform of source speaker data to phonetic posteriorgram (PPG) data, extracting additional prosody features from the source speaker data, and generating a spectrogram based on the PPG data and the extracted prosody features. The systems are configured to use or train a machine learning model for generating spectrogram data and for training a neural text-to-speech model with the generated spectrogram data.

Authentication circle shared expenses with extended family and friends

Systems and methods for providing authentication circles to pursue financial goals and/or share expenses with others are provided. One or more provider computing systems are communicatively coupled to one or more user devices. Users may join a circle and make contributions via electronic messages that may allow for acceptance in a one-click fashion. Members may, for example, plan for and share expenses for a trip and compare the expenses with budgets.

CONTINUOUS DIALOG WITH A DIGITAL ASSISTANT
20220293124 · 2022-09-15

Systems and processes for operating an intelligent automated assistant are provided. For example, a first speech input directed to a digital assistant is received from a user. A first response is provided based on the first speech input. A session window is initiated, wherein the session window is associated with a variable speech threshold. A second speech input is received during the session window. In accordance with a determination that the second speech input includes speech directed to the digital assistant, a duration associated with the session window is increased. In accordance with a determination that the variable speech threshold does not exceed a predetermined speech threshold, the session window is ended.
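A minimal sketch of the session-window logic described above follows. The abstract does not say how the variable speech threshold changes over time, so the decay applied when speech is not directed at the assistant is purely an assumption, as are the class name, field names, and default values.

```python
from dataclasses import dataclass


@dataclass
class SessionWindow:
    duration: float          # seconds the window stays open
    speech_threshold: float  # the variable speech threshold
    min_threshold: float     # the predetermined speech threshold
    active: bool = True

    def on_speech(
        self, directed_at_assistant: bool, extend_by: float = 5.0, decay: float = 0.5
    ) -> None:
        """Handle one speech input received while the window is open."""
        if not self.active:
            return
        if directed_at_assistant:
            # Speech directed to the assistant lengthens the window.
            self.duration += extend_by
        else:
            # Assumption: undirected speech lowers the variable threshold.
            self.speech_threshold *= decay
        # End the window once the variable threshold no longer exceeds the fixed one.
        if self.speech_threshold <= self.min_threshold:
            self.active = False
```

How an input is classified as directed at the assistant is outside this sketch; the caller is assumed to supply that decision as a boolean.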

Conversation print system and method

A method, computer program product, and computing system for receiving voice-based content from a third party. The voice-based content is processed to define a text-based transcript for the voice-based content. The voice-based content is processed to define speech-pattern indicia for the voice-based content. A conversation print for the voice-based content is generated based, at least in part, upon the text-based transcript and the speech-pattern indicia.

Conversation print system and method
11275854 · 2022-03-15

A method, computer program product, and computing system for defining a conversation print for each of a plurality of known entities, thus defining a plurality of conversation prints. Voice-based content is received from a third party. The voice-based content is compared to at least one of the plurality of conversation prints to identify the third party.

Conversation print system and method
11275855 · 2022-03-15

A method, computer program product, and computing system for defining a conversation print for each of a plurality of known fraudsters, thus defining a plurality of fraudster conversation prints. The plurality of fraudster conversation prints is processed to identify one or more fraudster commonalities. A fraudster conversation template is generated based, at least in part, upon the one or more fraudster commonalities.