IPIQ

G10L15/18

Voice controlled assistant with coaxial speaker and microphone arrangement

11521624 · 2022-12-06

Amazon Technologies, Inc.

Timothy Theodore List

A voice controlled assistant has a housing to hold one or more microphones, one or more speakers, and various computing components. The housing has an elongated cylindrical body extending along a center axis between a base end and a top end. The microphone(s) are mounted in the top end and the speaker(s) are mounted proximal to the base end. The microphone(s) and speaker(s) are coaxially aligned along the center axis. The speaker(s) are oriented to output sound directionally toward the base end and opposite to the microphone(s) in the top end. The sound may then be redirected in a radial outward direction from the center axis at the base end so that the sound is output symmetric to, and equidistant from, the microphone(s).

Pronunciation error detection apparatus, pronunciation error detection method and program

11568761 · 2023-01-31

NIPPON TELEGRAPH AND TELEPHONE CORPORATION

The present invention provides a pronunciation error detection apparatus capable of following a text without the need for a correct sentence even when erroneous recognition such as a reading error occurs. The pronunciation error detection apparatus comprises: a speech recognition part that recognizes the speech in speech data based on a speech recognition model for a non-native speaker, and outputs speech recognition results, reliability and time information; a reliability determination part that outputs the speech recognition results with higher reliability than a predetermined threshold and the corresponding time information as the determined speech recognition results and the determined time information; and a pronunciation error detection part that outputs a phoneme as a pronunciation error when reliability for each phoneme in the speech recognition results using the native speaker speech recognition model under a weakly constraining grammar is greater than the reliability of the corresponding phoneme in the speech recognition results using the native speaker acoustic model under a constraining grammar in which the determined speech recognition results are correct for the speech data in a segment specified by the determined time information.
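The two decision rules in this abstract (keep recognition results whose reliability exceeds a threshold, then flag a phoneme when its reliability under the weakly constraining grammar exceeds its reliability under the constraining grammar) can be illustrated with a minimal sketch. The data layout, the names `PhonemeScore`, `filter_reliable` and `detect_errors`, and all numeric values are assumptions made for illustration; they are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class PhonemeScore:
    phoneme: str
    weak_reliability: float         # score under the weakly constraining grammar
    constrained_reliability: float  # score under the constraining grammar


def filter_reliable(results, threshold):
    """Keep (text, reliability, start, end) tuples whose reliability exceeds the threshold."""
    return [r for r in results if r[1] > threshold]


def detect_errors(scores):
    """Flag a phoneme when the weak-grammar score beats the constrained-grammar score."""
    return [s.phoneme for s in scores if s.weak_reliability > s.constrained_reliability]


# Hypothetical recognizer output: (text, reliability, start time, end time).
results = [("hello", 0.92, 0.0, 0.4), ("word", 0.35, 0.4, 0.9)]
determined = filter_reliable(results, threshold=0.5)

# Hypothetical per-phoneme scores for the segment kept above.
scores = [PhonemeScore("HH", 0.40, 0.80), PhonemeScore("EH", 0.90, 0.30)]
errors = detect_errors(scores)
```

Here only the first result survives the threshold, and only the second phoneme is flagged, because its weak-grammar score is the higher of the two.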

Waypoint detection for a contact center analysis system

11568231 · 2023-01-31

Raytheon BBN Technologies Corp.

A contact center analysis system can receive various types of communications from customers, such as audio from telephone calls, voicemails, or video conferences; text from speech-to-text translations, emails, live chat transcripts, text messages, and the like; and other media or multimedia. The system can segment the communication data using temporal, lexical, semantic, syntactic, prosodic, user, and/or other features of the segments. The system can cluster the segments according to one or more similarity measures of the segments. The system can use the clusters to train a machine learning classifier to identify one or more of the clusters as waypoints (e.g., portions of the communications of particular relevance to a user training the classifier). The system can automatically classify new communications using the classifier and facilitate various analyses of the communications using the waypoints.
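The clustering step can be sketched as follows, under the assumption of a purely lexical similarity measure (Jaccard overlap of word sets) and a greedy single-pass grouping. The abstract does not specify either choice; `jaccard`, `cluster_segments` and the threshold value are hypothetical.

```python
def jaccard(a, b):
    """Lexical similarity: shared words divided by all distinct words."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def cluster_segments(segments, threshold=0.5):
    """Greedily add each segment to the first cluster whose seed segment is similar enough."""
    clusters = []
    for segment in segments:
        for cluster in clusters:
            if jaccard(segment, cluster[0]) >= threshold:
                cluster.append(segment)
                break
        else:
            clusters.append([segment])  # no similar cluster found: start a new one
    return clusters


segments = [
    "thank you for calling",
    "thank you for calling us",
    "please hold the line",
]
clusters = cluster_segments(segments)
```

In the described system, a user would then label some of the resulting clusters as waypoints, and those labels would serve as training data for a classifier.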

Artificial intelligence server and method for providing information to user

11568239 · 2023-01-31

LG Electronics Inc.

An artificial intelligence server for providing information to a user includes a communication unit configured to communicate with a plurality of artificial intelligence apparatuses deployed in a service area, and a processor. The processor is configured to receive at least one of speech data of the user or terminal usage information of the user from at least one of the plurality of artificial intelligence apparatuses, generate intention information of the user based on at least one of the received speech data or the received terminal usage information, generate status information of the user using the plurality of artificial intelligence apparatuses, determine an information providing device among the plurality of artificial intelligence apparatuses based on the generated status information of the user, generate output information to be output from the determined information providing device, and transmit a control signal for outputting the generated output information to the determined information providing device.

Call processing method, electronic device and storage medium

11570306 · 2023-01-31

Beijing Baidu Netcom Science And Technology Co., Ltd.

The present disclosure provides a call processing method, apparatus, electronic device and storage medium and relates to the field of cloud computing. The method may comprise: obtaining a calling subscriber's status information in real time while an intelligent dialogue robot is used to make a call with the calling subscriber; when it is determined that a call form of the intelligent dialogue robot needs to be adjusted, correspondingly adjusting the call form of the intelligent dialogue robot according to current status information of the calling subscriber. The solution of the present disclosure may be employed to improve the call performance of the intelligent dialogue robot.

Intent authoring using weak supervision and co-training for automated response systems

11568856 · 2023-01-31

International Business Machines Corporation

A combination of propagation operations and learning algorithms is applied, using a selected set of labeled conversational logs retrieved from a subset of a plurality of conversational logs, to a remaining corpus of the plurality of conversational logs to train an automated response system according to an intent associated with each of the conversational logs. The combination of propagation operations and learning algorithms may include defining the labels by a user for the selected set of the subset of the plurality of conversational logs; training a probabilistic classifier using the defined labels of features of the selected set, wherein the probabilistic classifier produces labeling decisions for the subset of conversational logs; weighting the features of the selected set in a model optimization process; and/or training an additional classifier using the weighted features of the selected set and applying the additional classifier to the remaining corpus.

METHOD AND APPARATUS FOR INTELLIGENT VOICE QUERY

20230237056 · 2023-07-27

Soundhound, Inc.

Chong Wang

A method and an apparatus for processing an intelligent voice query. A voice query input is received from a user. Automatic speech recognition and natural language understanding generate structured query data. The structured query data is modified based on an input adaptation rule to obtain modified structured query data appropriate for a content providing server, which provides a query result output corresponding to the modified structured query data. Input adaptation rules may comprise rule sets based on behavior patterns of the user and/or business recommendations. The query result output can be used for natural language generation, which may have similar adaptation rules for output.
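One way to picture an input adaptation rule is as a condition and update pair applied to the structured query data before it is sent to the content providing server. The sketch below assumes the structured query is a flat dictionary; the rule format, the field names (`intent`, `genre`) and the function `apply_adaptation_rules` are hypothetical and not drawn from the application.

```python
def apply_adaptation_rules(query, rules):
    """Return a copy of the query with every matching rule's update applied, in order."""
    modified = dict(query)  # leave the original structured query untouched
    for condition, update in rules:
        if all(modified.get(key) == value for key, value in condition.items()):
            modified.update(update)
    return modified


# Hypothetical behavior-pattern rule: a music request with no genre defaults to jazz.
rules = [({"intent": "play_music", "genre": None}, {"genre": "jazz"})]

query = {"intent": "play_music", "genre": None}
modified = apply_adaptation_rules(query, rules)
```

Rules are applied in list order, so a later rule sees the changes made by earlier ones. That ordering is a design choice of this sketch only.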

Voice recognition method using artificial intelligence and apparatus thereof

11568853 · 2023-01-31

LG Electronics Inc.

Jonghoon Chae

Disclosed is a voice recognition method and apparatus using artificial intelligence. A voice recognition method using artificial intelligence may include: generating an utterance by receiving a voice command of a user; obtaining a user's intention by analyzing the generated utterance; deriving an urgency level of the user on the basis of the generated utterance and prestored user information; generating a first response in association with the user's intention; obtaining main vocabularies included in the first response; generating a second response by using the main vocabularies and the urgency level of the user; determining a speech rate of the second response on the basis of the urgency level of the user; and outputting the second response according to the speech rate by synthesizing the second response into a voice signal.
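The last steps of the method (building a second response from the main vocabularies and the urgency level, then choosing a speech rate from that level) might look like the sketch below. It assumes urgency is a number between 0 and 1, a linear mapping from urgency to rate, and a fixed cutoff for shortening the response. None of these choices, nor the names `choose_speech_rate` and `build_second_response`, come from the patent.

```python
def choose_speech_rate(urgency, base_rate=1.0, max_rate=1.5):
    """Map urgency (clamped to 0..1) linearly onto a playback-rate multiplier."""
    level = min(max(urgency, 0.0), 1.0)
    return base_rate + (max_rate - base_rate) * level


def build_second_response(first_response, main_vocabularies, urgency, cutoff=0.5):
    """Use only the main vocabularies when urgency is high; otherwise keep the full response."""
    if urgency >= cutoff:
        return " ".join(main_vocabularies)
    return first_response


# Hypothetical first response and the main vocabularies extracted from it.
first = "The next bus to the airport leaves at 9:15 from stop B."
keywords = ["Airport bus", "9:15", "stop B"]

second = build_second_response(first, keywords, urgency=0.8)
rate = choose_speech_rate(0.8)
```

A speech synthesizer would then render `second` at the chosen `rate`; that step is omitted here because it depends on the synthesis engine in use.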
