G10L25/63

DIALOGUE APPARATUS, METHOD AND PROGRAM

A dialogue apparatus includes a speech recognition unit (1) configured to perform speech recognition on an input utterance to generate a text corresponding to the utterance, a speech waveform corresponding to the utterance, and information regarding a length of sound of the utterance; a language understanding unit (2) configured to grasp the content of the utterance by using the text corresponding to the utterance; a dialogue management unit (3) configured to determine the content of a response corresponding to the utterance by using the content of the utterance; an utterance state extraction unit (4) configured to extract a state of the utterance by using the text corresponding to the utterance, the speech waveform corresponding to the utterance, and the information regarding the length of the sound of the utterance; a response state determination unit (5) configured to determine a state of the response according to the state of the utterance; a response sentence generation unit (6) configured to generate a response sentence by using the content of the response; and a speech synthesis unit (7) configured to synthesize speech corresponding to the response sentence with the state of the response taken into account.
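The claimed units (1) through (7) form a linear pipeline. A minimal sketch of how they might compose is below; every function name, the intent labels, and the speaking-rate heuristic are illustrative assumptions, not taken from the patent.

```python
# Hypothetical sketch of the claimed processing pipeline; all names and
# heuristics are illustrative stand-ins, not from the patent.

def recognize(audio):
    # Speech recognition unit (1): stub returning text, waveform, duration.
    return {"text": "hello there", "waveform": audio, "duration_s": 1.2}

def understand(text):
    # Language understanding unit (2): trivially classify the intent.
    return "greeting" if "hello" in text else "unknown"

def manage_dialogue(intent):
    # Dialogue management unit (3): map utterance content to response content.
    return {"greeting": "greet_back"}.get(intent, "clarify")

def extract_state(text, waveform, duration_s):
    # Utterance state extraction unit (4): e.g. fast speech -> "hurried".
    rate = len(text.split()) / duration_s
    return "hurried" if rate > 3.0 else "calm"

def respond_state(utterance_state):
    # Response state determination unit (5): match the speaker's state.
    return {"hurried": "brief", "calm": "relaxed"}[utterance_state]

def generate_and_synthesize(content, state):
    # Units (6) and (7): response sentence plus a state-aware synthesis tag.
    sentence = {"greet_back": "Hello!", "clarify": "Could you repeat that?"}[content]
    return f"[{state}] {sentence}"

def dialogue(audio):
    rec = recognize(audio)
    content = manage_dialogue(understand(rec["text"]))
    state = respond_state(extract_state(rec["text"], rec["waveform"], rec["duration_s"]))
    return generate_and_synthesize(content, state)
```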

INFORMATION OUTPUT APPARATUS, INFORMATION OUTPUT METHOD, AND NON-TRANSITORY RECORDING MEDIUM
20230005473 · 2023-01-05 ·

An information output apparatus capable of providing notifications based on the content of an input voice is realized. The information output apparatus according to the present embodiment includes an input unit to which data indicating a sound are input from the outside; a voice extraction unit that analyzes the data indicating a sound input from the input unit and extracts data indicating a voice uttered by a person; a vibration generation unit that generates vibration data, associated in advance with preset data indicating a sound, based on a result of a comparison between the voice data extracted by the voice extraction unit and the preset data indicating the sound; and a vibrator that vibrates based on the vibration data generated by the vibration generation unit.
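The core of the claim is the comparison between extracted voice data and preset sound data, which selects a pre-associated vibration pattern. A rough sketch under assumed names follows; the keyword-matching comparison and the vibration timing values are purely illustrative.

```python
# Illustrative sketch of the preset-sound matching step; the names, the
# keyword comparison, and the timings are assumptions, not from the patent.

PRESETS = {
    "help": [200, 50, 200],   # vibration pattern (ms on/off) for the word "help"
    "fire": [500, 100, 500],
}

def extract_voice(sound_data):
    # Stand-in for the voice extraction unit: keep only alphabetic tokens.
    return [w for w in sound_data.lower().split() if w.isalpha()]

def generate_vibration(sound_data):
    # Vibration generation unit: compare the extracted voice against the
    # presets and return the vibration data associated with the first match.
    for word in extract_voice(sound_data):
        if word in PRESETS:
            return PRESETS[word]
    return []
```

The returned pattern would then drive the vibrator, e.g. as alternating on/off durations in milliseconds.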

Utilizing machine learning models to provide cognitive speaker fractionalization with empathy recognition

A device may receive audio data identifying a plurality of speakers and may process the audio data, with a plurality of clustering models, to identify a plurality of speaker segments. The device may determine a plurality of diarization error rates for the plurality of speaker segments and may identify a plurality of errors in the plurality of speaker segments. The device may select rectification models to rectify the plurality of errors and may segment and/or re-segment the audio data with the rectification models to generate re-segmented audio data. The device may determine a plurality of modified diarization error rates for the plurality of speaker segments based on the re-segmented audio data and may select one of the plurality of speaker segments based on the plurality of modified diarization error rates. The device may calculate an empathy score based on the selected speaker segment and may perform actions based on the empathy score.
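The selection step above picks whichever (re-)segmentation minimizes the modified diarization error rate, then scores the winner. A simplified sketch is given below; the frame-level DER (no collars, no overlap handling) and the empathy score are toy stand-ins for the patented models, and all names are invented.

```python
# Rough sketch of the segment-selection logic; the DER computation and
# empathy scoring are simplified stand-ins, not the patented models.

def diarization_error_rate(ref, hyp):
    # Fraction of frames whose hypothesized speaker label disagrees with
    # the reference labels (a simplified DER without collars or overlap).
    wrong = sum(r != h for r, h in zip(ref, hyp))
    return wrong / len(ref)

def select_best_segments(reference, candidate_segmentations):
    # Pick the re-segmented candidate with the lowest (modified) DER.
    return min(candidate_segmentations,
               key=lambda hyp: diarization_error_rate(reference, hyp))

def empathy_score(segments, empathic_labels):
    # Toy score: share of frames attributed to speakers tagged empathic.
    return sum(s in empathic_labels for s in segments) / len(segments)
```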

System and methods for dynamically routing and rating customer service communications

A system may receive an indication that a user is accessing an ATM, receive, from the ATM, average session duration data over a predetermined period, generate, using a machine learning model, a busyness score for the ATM based on the average session duration data over the predetermined period, and determine whether the busyness score for the ATM exceeds a busyness score threshold. When the busyness score for the ATM does not exceed the busyness score threshold, the system may cause the ATM to present, via a first graphical user interface, a default ATM experience. When the busyness score for the ATM exceeds the busyness score threshold, the system may cause the ATM to present, via a second graphical user interface, a busy ATM experience.
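The branching behavior reduces to a score-against-threshold comparison selecting one of two interfaces. A hedged sketch follows, with the machine-learning model replaced by a simple normalized average purely for illustration; the threshold value and names are assumptions.

```python
# Hedged sketch of the thresholding behavior; the ML scoring model is
# replaced by a normalized mean duration, purely for illustration.

BUSYNESS_THRESHOLD = 0.7

def busyness_score(avg_session_durations_s, max_expected_s=300.0):
    # Stand-in for the ML model: mean session duration normalized to [0, 1].
    mean = sum(avg_session_durations_s) / len(avg_session_durations_s)
    return min(mean / max_expected_s, 1.0)

def choose_experience(avg_session_durations_s):
    # Exceeding the threshold selects the "busy ATM" interface;
    # otherwise the default ATM experience is presented.
    score = busyness_score(avg_session_durations_s)
    return "busy" if score > BUSYNESS_THRESHOLD else "default"
```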

INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING SYSTEM, INFORMATION PROCESSING METHOD, AND NON-TRANSITORY RECORDING MEDIUM
20230238018 · 2023-07-27 ·

An information processing apparatus includes circuitry to acquire behavior information of a plurality of users having a conversation, generate sound data based on the behavior information, and cause an output device to output an ambient sound based on the sound data.
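A minimal sketch of the behavior-to-sound mapping is below; the choice of per-user speaking ratios as the behavior information and the particular ambient sounds are illustrative assumptions, not from the patent.

```python
# Minimal sketch of mapping conversation behavior to an ambient sound;
# the behavior metric and the sound choices are illustrative assumptions.

def ambient_sound(behavior):
    # behavior: per-user dicts, e.g. {"speaking_ratio": 0.4}
    activity = sum(b["speaking_ratio"] for b in behavior) / len(behavior)
    if activity > 0.6:
        return "cafe_murmur"      # lively conversation -> lively ambience
    if activity > 0.2:
        return "soft_rain"
    return "quiet_room"
```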

APPLIED BEHAVIORAL THERAPY APPARATUS AND METHOD
20230238114 · 2023-07-27 ·

An apparatus for providing automated analysis and monitoring of an ABT session is presented herein. The apparatus may include a display configured to present material for the ABT session to a patient, at least one video capture device configured to capture video data for the ABT session related to at least one of first facial features of the patient, second facial features of a therapist, or a response to the material presented on the display, at least one audio capture device configured to capture audio data for the ABT session related to at least one of a first voice of the patient or a second voice of the therapist, and at least one processor configured to analyze, for the ABT session, data regarding the material presented on the display, the captured video data, and the captured audio data to produce an analysis of the ABT session.
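The processor's role is to fuse the three streams (displayed material, video, audio) into one session analysis. A hypothetical sketch of that fusion step is below; the input representations, field names, and the engagement threshold are all invented for illustration.

```python
# Hypothetical sketch of fusing display, video, and audio data into one
# ABT session analysis; all field names and thresholds are invented.

def analyze_session(material_events, video_engagement_frames, audio_turns):
    # material_events: ids of materials shown on the display
    # video_engagement_frames: per-frame engagement estimates in [0, 1]
    # audio_turns: speaker labels, "patient" or "therapist"
    engagement = sum(video_engagement_frames) / len(video_engagement_frames)
    patient_turns = audio_turns.count("patient")
    return {
        "materials_shown": len(material_events),
        "mean_engagement": round(engagement, 2),
        "patient_turns": patient_turns,
        "responded": patient_turns > 0 and engagement > 0.3,
    }
```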