G10L25/87

Dialog device, dialog method, and dialog computer program

The dialog device according to the present invention includes a prediction unit 254 configured to predict an utterance length attribute of a user utterance in response to a machine utterance, a selection unit 256 configured to use the utterance length attribute to select, as a feature model for use in an end determination of the user utterance, at least one of an acoustic feature model or a lexical feature model, and an estimation unit 258 configured to estimate an end point in the user utterance using the selected model. By using this dialog device, it is possible to shorten the waiting time until a machine outputs a response to a user utterance, and to realize a more natural conversation between a user and a machine.
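The predict, select, estimate flow described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the function names, the "short"/"long" attribute values, the rule mapping a length attribute to a model, the silence threshold, and the closing-word list are all assumptions introduced here, since the abstract does not specify them.

```python
from typing import List, Optional, Tuple

# Hypothetical list of question openers that tend to draw short replies.
CLOSED_OPENERS = ("is", "are", "do", "does", "did", "can", "will", "would")


def predict_length_attribute(machine_utterance: str) -> str:
    """Prediction step (assumed heuristic): closed questions -> 'short'."""
    words = machine_utterance.lower().split()
    first_word = words[0] if words else ""
    return "short" if first_word in CLOSED_OPENERS else "long"


def select_model(length_attribute: str) -> str:
    """Selection step. The mapping below is an assumption for illustration."""
    return "acoustic" if length_attribute == "short" else "lexical"


def acoustic_end_point(frame_energies: List[float],
                       silence_threshold: float = 0.1,
                       min_silent_frames: int = 3) -> Optional[int]:
    """Return the index of the first frame of a long-enough silent run."""
    run = 0
    for i, energy in enumerate(frame_energies):
        run = run + 1 if energy < silence_threshold else 0
        if run == min_silent_frames:
            return i - min_silent_frames + 1
    return None


def lexical_end_point(words: List[str],
                      closing_words: Tuple[str, ...] = ("please", "thanks")
                      ) -> Optional[int]:
    """Return the index of the first word that looks like a closing word."""
    for i, word in enumerate(words):
        if word.lower() in closing_words:
            return i
    return None


def estimate_end_point(machine_utterance: str,
                       frame_energies: List[float],
                       words: List[str]) -> Tuple[str, Optional[int]]:
    """Estimation step: run only the model chosen by the selection step."""
    model = select_model(predict_length_attribute(machine_utterance))
    if model == "acoustic":
        return model, acoustic_end_point(frame_energies)
    return model, lexical_end_point(words)
```

For example, `estimate_end_point("Do you want a receipt?", energies, words)` would take the acoustic path under these assumptions, while an open prompt such as "What would you like to order?" would take the lexical path.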

SOUND DETECTION METHOD
20230074906 · 2023-03-09

The present disclosure discloses a sound detection method. The method includes: obtaining an initial sound signal and a spatial distribution spectrum of the initial sound signal; segmenting the initial sound signal to obtain a target sound segment, and obtaining a timestamp corresponding to the target sound segment, the target sound segment including speech of at least one object, and the timestamp indicating a start time and an end time of the target sound segment; segmenting the spatial distribution spectrum by using the timestamp to obtain a spatial distribution spectrum segment corresponding to the target sound segment; and inputting the target sound segment and the spatial distribution spectrum segment into a sound detection model to obtain a first sound detection result, the first sound detection result describing whether sound of multiple objects exists in the initial sound signal.
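The step of cutting the spatial distribution spectrum with the segment's timestamp can be illustrated with a small sketch. It assumes the spectrum is stored as a list of frames at a fixed frame rate and that the timestamp is a (start, end) pair in seconds; the function name and both representations are hypothetical, and the sound detection model itself is not shown.

```python
from typing import List, Sequence, Tuple


def slice_by_timestamp(frames: Sequence, frame_rate_hz: float,
                       timestamp: Tuple[float, float]) -> List:
    """Return the frames that fall inside [start, end) given in seconds."""
    start_s, end_s = timestamp
    if end_s < start_s:
        raise ValueError("end time must not precede start time")
    # Convert times to frame indices and clamp them to the valid range.
    start_idx = max(0, int(round(start_s * frame_rate_hz)))
    end_idx = min(len(frames), int(round(end_s * frame_rate_hz)))
    return list(frames[start_idx:end_idx])
```

Applying the same timestamp to both the sound signal and the spectrum keeps the two inputs to the detection model aligned in time.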

Feedback controller for data transmissions
11475886 · 2022-10-18

A feedback control system for data transmissions in a voice-activated, data-packet-based computer network environment is provided. A system can receive audio signals detected by a microphone of a device. The system can parse the audio signals to identify a trigger keyword and a request. The system can select a content item using the trigger keyword or the request. The content item can be configured to establish a communication session between the device and a third-party device. The system can monitor the communication session to measure a characteristic of the communication session. The system can generate a quality signal based on the measured characteristic.

SYSTEMS AND METHODS TO ANALYZE AUDIO DATA TO IDENTIFY DIFFERENT SPEAKERS
20230129467 · 2023-04-27

A computing system may receive data representing dialog between persons, the data representing words spoken by at least first and second speakers, determine an intent of a speaker for a first portion of the data, the intent being indicative of an identity of the first or second speaker for the first portion of the data or another portion of the data different than the first portion, determine a name of the first or second speaker represented in the first portion of the data based at least in part on the determined intent, and output an indication of the determined name so that the indication identifies the first portion of the data or the another portion of the data with the first or second speaker.

Speech endpointing based on word comparisons

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech endpointing based on word comparisons are described. In one aspect, a method includes the actions of obtaining a transcription of an utterance. The actions further include determining, as a first value, a quantity of text samples in a collection of text samples that (i) include terms that match the transcription, and (ii) do not include any additional terms. The actions further include determining, as a second value, a quantity of text samples in the collection of text samples that (i) include terms that match the transcription, and (ii) include one or more additional terms. The actions further include classifying the utterance as a likely incomplete utterance or not a likely incomplete utterance based at least on comparing the first value and the second value.
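The two counts and the comparison described above can be sketched as follows. This is an illustrative reading only: it assumes that a text sample "matches" when it starts with the transcription's terms (a prefix match on lowercased, whitespace-split words), and that the utterance is classed as likely incomplete when the second value exceeds the first. Neither the matching rule nor the comparison rule is stated in the abstract, and the function names are hypothetical.

```python
from typing import Iterable, Tuple


def count_matches(transcription: str,
                  text_samples: Iterable[str]) -> Tuple[int, int]:
    """Return (first_value, second_value) for a transcription.

    first_value:  samples whose terms match the transcription exactly.
    second_value: samples that match and contain additional terms.
    """
    terms = transcription.lower().split()
    first_value = 0
    second_value = 0
    for sample in text_samples:
        sample_terms = sample.lower().split()
        if sample_terms == terms:
            first_value += 1
        elif sample_terms[:len(terms)] == terms:
            # Assumed matching rule: the sample begins with the transcription.
            second_value += 1
    return first_value, second_value


def is_likely_incomplete(transcription: str,
                         text_samples: Iterable[str]) -> bool:
    """Assumed decision rule: more continuations than exact matches."""
    first_value, second_value = count_matches(transcription, text_samples)
    return second_value > first_value
```

With a collection such as `["what is", "what is the weather", "what is the weather today"]`, the transcription "what is" has one exact match and two longer matches, so this sketch would class it as likely incomplete and an endpointer could wait longer before closing the utterance.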

INFORMATION PROCESSING APPARATUS AND INFORMATION PROCESSING METHOD

An information processing apparatus according to the present disclosure includes an acquisition unit that acquires inspiration information indicating inspiration of a user, and a prediction unit that predicts whether or not the user utters after the inspiration of the user on the basis of the inspiration information acquired by the acquisition unit.