G10L15/075

Electronic apparatus and control method for controlling a device in an Internet of Things

An electronic apparatus and method of controlling the electronic apparatus are provided. The electronic apparatus includes a communicator, a storage storing information on places wherein Internet of Things (IoT) devices are located, and a processor configured to, based on receiving a control signal for controlling an IoT device located in a specific place through the communicator, control the IoT device located in the specific place based on information on the place stored in the storage. The processor is further configured to receive motion information generated based on a motion of a wearable device from the wearable device, identify a place corresponding to the motion information, and store the identified place as information on a place of an IoT device located within a predetermined distance from the wearable device, in the storage.
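The place-learning flow in this abstract can be sketched roughly as follows. This is a minimal illustration, not the claimed implementation: the `Hub` class, the `MOTION_TO_PLACE` table, the 2D coordinates, and the command strings are all hypothetical names and data introduced here.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical mapping from a recognized wearable motion to a place.
MOTION_TO_PLACE = {"stirring": "kitchen", "typing": "study"}

Position = Tuple[float, float]


class Hub:
    """Toy stand-in for the electronic apparatus (storage plus processor)."""

    def __init__(self, max_distance: float) -> None:
        self.max_distance = max_distance   # the "predetermined distance"
        self.places: Dict[str, str] = {}   # device id -> stored place

    def on_motion(self, motion: str, wearable_pos: Position,
                  device_positions: Dict[str, Position]) -> None:
        """Identify a place from the motion and store it for nearby devices."""
        place = MOTION_TO_PLACE.get(motion)
        if place is None:
            return  # motion does not correspond to any known place
        for device, pos in device_positions.items():
            if math.dist(wearable_pos, pos) <= self.max_distance:
                self.places[device] = place

    def control(self, place: str, command: str) -> List[Tuple[str, str]]:
        """Return (device, command) pairs for every device stored at `place`."""
        return [(device, command)
                for device, stored in sorted(self.places.items())
                if stored == place]


hub = Hub(max_distance=2.0)
hub.on_motion("stirring", (0.0, 0.0), {"lamp": (1.0, 1.0), "tv": (5.0, 5.0)})
commands = hub.control("kitchen", "on")
```

Only the lamp lies within the assumed distance of the wearable, so only the lamp is tagged with the identified place and addressed by the later control signal.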

ASSESSMENT OF THE QUALITY OF A COMMUNICATION SESSION OVER A TELECOMMUNICATION NETWORK

Apparatus (5) for assessing a quality of a communication session (3) between at least one first party (1) and at least one second party (2a, 2b . . . 2N), over a telecommunication network (4), comprising means for: monitoring (S1) said communication session by continuously receiving an audio stream associated with said communication session; converting (S2) language of said audio stream into text data; determining (S3), from said text data, at least first understandability quality features (UQF_A, UQF_G) and an information quality feature (IQF), said first understandability quality feature being representative of at least word articulation and grammar correctness within said language, and said information quality feature being representative of a comparison of the semantic content of said audio stream with a set of contents related to said audio stream; assessing (S4) said quality from said quality features.
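Steps S3 and S4 might be sketched as below, taking the text data from S2 as given. The scoring rules are assumptions made for illustration only: articulation is approximated by the share of in-vocabulary words, the grammar score is supplied by a hypothetical external checker, the information feature is a Jaccard word overlap with related contents, and the final quality is a plain average.

```python
from typing import Iterable, Set


def articulation_score(text: str, vocabulary: Set[str]) -> float:
    """Assumed proxy: fraction of transcribed words found in a vocabulary."""
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(word in vocabulary for word in words) / len(words)


def information_score(text: str, related_contents: Iterable[str]) -> float:
    """Assumed proxy: best Jaccard word overlap with any related content."""
    words = set(text.lower().split())
    best = 0.0
    for content in related_contents:
        other = set(content.lower().split())
        union = words | other
        if union:
            best = max(best, len(words & other) / len(union))
    return best


def assess_quality(text: str, vocabulary: Set[str],
                   related_contents: Iterable[str],
                   grammar_score: float) -> float:
    """Combine the three features with an unweighted mean (an assumption)."""
    uqf_a = articulation_score(text, vocabulary)
    uqf_g = grammar_score  # from a hypothetical grammar checker, in [0, 1]
    iqf = information_score(text, related_contents)
    return (uqf_a + uqf_g + iqf) / 3


quality = assess_quality(
    "the router needs a restart",
    {"the", "router", "needs", "a", "restart"},
    ["restart the router"],
    grammar_score=0.8,
)
```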

Speaker awareness using speaker dependent speech model(s)

Techniques disclosed herein enable training and/or utilizing speaker dependent (SD) speech models which are personalizable to any user of a client device. Various implementations include personalizing a SD speech model for a target user by processing, using the SD speech model, a speaker embedding corresponding to the target user along with an instance of audio data. The SD speech model can be personalized for an additional target user by processing, using the SD speech model, an additional speaker embedding, corresponding to the additional target user, along with another instance of audio data. Additional or alternative implementations include training the SD speech model based on a speaker independent speech model using teacher student learning.

Method to learn personalized intents

A method includes retrieving, at an electronic device, a first natural language (NL) input. An intent of the first NL input is undetermined by both a generic parser and a personal parser. A paraphrase of the first NL input is retrieved at the electronic device. An intent of the paraphrase of the first NL input is determined using at least one of: the generic parser, the personal parser, or a combination thereof. A new personal intent for the first NL input is generated based on the determined intent. The personal parser is trained using existing personal intents and the new personal intent.
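The paraphrase fallback described above can be illustrated with a small sketch. The lookup-table parsers, the `PersonalParser` class, and the intent name `PLAY_MUSIC` are hypothetical stand-ins; a real parser would be a trained model, and "training" here simply merges the new example into a table.

```python
from typing import Callable, Dict, Optional

Parser = Callable[[str], Optional[str]]


def make_lookup_parser(table: Dict[str, str]) -> Parser:
    """Toy generic parser: exact-match lookup, None when the intent is unknown."""
    def parse(text: str) -> Optional[str]:
        return table.get(text.lower().strip())
    return parse


class PersonalParser:
    """Toy personal parser whose training merges new intents with existing ones."""

    def __init__(self) -> None:
        self.intents: Dict[str, str] = {}

    def parse(self, text: str) -> Optional[str]:
        return self.intents.get(text.lower().strip())

    def train(self, new_intents: Dict[str, str]) -> None:
        merged = dict(self.intents)  # existing personal intents
        merged.update({k.lower().strip(): v for k, v in new_intents.items()})
        self.intents = merged


def learn_personal_intent(nl_input: str, paraphrase: str,
                          generic: Parser, personal: PersonalParser) -> Optional[str]:
    """Learn a personal intent for an input that neither parser understands."""
    if generic(nl_input) is not None or personal.parse(nl_input) is not None:
        return None  # intent already determined, nothing to learn
    intent = generic(paraphrase) or personal.parse(paraphrase)
    if intent is None:
        return None  # the paraphrase is not understood either
    personal.train({nl_input: intent})
    return intent


generic = make_lookup_parser({"play some jazz music": "PLAY_MUSIC"})
personal = PersonalParser()
learned = learn_personal_intent("hit me with the smooth stuff",
                                "play some jazz music", generic, personal)
```

After the call, the personal parser resolves the original phrasing directly, so the paraphrase is not needed a second time.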

SPEAKER AWARENESS USING SPEAKER DEPENDENT SPEECH MODEL(S)

Techniques disclosed herein enable training and/or utilizing speaker dependent (SD) speech models which are personalizable to any user of a client device. Various implementations include personalizing a SD speech model for a target user by processing, using the SD speech model, a speaker embedding corresponding to the target user along with an instance of audio data. The SD speech model can be personalized for an additional target user by processing, using the SD speech model, an additional speaker embedding, corresponding to the additional target user, along with another instance of audio data. Additional or alternative implementations include training the SD speech model based on a speaker independent speech model using teacher student learning.

METHOD AND APPARATUS WITH A PERSONALIZED SPEECH RECOGNITION MODEL

A method and apparatus for personalizing a speech recognition model is disclosed. The apparatus may obtain feedback data that is a result of recognizing a first speech input of a user using a trained speech recognition model, determine whether to update the speech recognition model based on the obtained feedback data, and selectively update, dependent on the determining, the speech recognition model based on the feedback data.
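A minimal sketch of the obtain, decide, and selectively update loop follows. The update criterion (the user supplied a correction that differs from the recognized text) and the override-table "model" are assumptions chosen to keep the example small; the abstract does not specify either.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Feedback:
    recognized: str                    # result of recognizing the speech input
    correction: Optional[str] = None   # user-provided fix, if any (assumed)


@dataclass
class ToyRecognizer:
    """Stand-in for a trained model: personalization is an override table."""
    overrides: Dict[str, str] = field(default_factory=dict)

    def recognize(self, raw_hypothesis: str) -> str:
        return self.overrides.get(raw_hypothesis, raw_hypothesis)

    def update(self, feedback: Feedback) -> None:
        self.overrides[feedback.recognized] = feedback.correction


def should_update(feedback: Feedback) -> bool:
    """Assumed criterion: update only when the user corrected the result."""
    return (feedback.correction is not None
            and feedback.correction != feedback.recognized)


def personalize(model: ToyRecognizer, feedback: Feedback) -> bool:
    """Selectively update the model; returns True when an update happened."""
    if not should_update(feedback):
        return False
    model.update(feedback)
    return True


model = ToyRecognizer()
skipped = personalize(model, Feedback("call mum"))
applied = personalize(model, Feedback("call tom", "call Thom"))
```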

Fully Supervised Speaker Diarization
20210280197 · 2021-09-09

A method includes receiving an utterance of speech and segmenting the utterance of speech into a plurality of segments. For each segment of the utterance of speech, the method also includes extracting a speaker-discriminative embedding from the segment and predicting a probability distribution over possible speakers for the segment using a probabilistic generative model configured to receive the extracted speaker-discriminative embedding as a feature input. The probabilistic generative model is trained on a corpus of training speech utterances, each segmented into a plurality of training segments. Each training segment includes a corresponding speaker-discriminative embedding and a corresponding speaker label. The method also includes assigning a speaker label to each segment of the utterance of speech based on the probability distribution over possible speakers for the corresponding segment.
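The segment, embed, predict, and assign pipeline can be sketched with a deliberately simple stand-in model. Everything specific here is an assumption: the two-number `embed` function is a placeholder for a learned speaker embedding, and the generative model is one equal-variance Gaussian per speaker with equal priors, scored independently per segment.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Embedding = Tuple[float, ...]


def segment(samples: Sequence[float], size: int) -> List[Sequence[float]]:
    """Split the utterance into consecutive fixed-size segments."""
    return [samples[i:i + size] for i in range(0, len(samples), size)]


def embed(seg: Sequence[float]) -> Embedding:
    """Hypothetical placeholder embedding: (mean, mean absolute value)."""
    n = len(seg)
    return (sum(seg) / n, sum(abs(x) for x in seg) / n)


def fit_gaussians(train: List[Tuple[Embedding, str]]) -> Dict[str, Embedding]:
    """Fit one mean per speaker from labelled training-segment embeddings."""
    groups: Dict[str, List[Embedding]] = defaultdict(list)
    for emb, label in train:
        groups[label].append(emb)
    return {label: tuple(sum(dim) / len(embs) for dim in zip(*embs))
            for label, embs in groups.items()}


def speaker_posterior(emb: Embedding, means: Dict[str, Embedding],
                      var: float = 1.0) -> Dict[str, float]:
    """Probability distribution over speakers under the Gaussian model."""
    logp = {s: -sum((a - b) ** 2 for a, b in zip(emb, mu)) / (2 * var)
            for s, mu in means.items()}
    top = max(logp.values())
    weights = {s: math.exp(v - top) for s, v in logp.items()}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}


def diarize(samples: Sequence[float], size: int,
            means: Dict[str, Embedding]) -> List[str]:
    """Assign each segment the most probable speaker label."""
    labels = []
    for seg in segment(samples, size):
        posterior = speaker_posterior(embed(seg), means)
        labels.append(max(posterior, key=posterior.get))
    return labels


train = [((0.0, 1.0), "A"), ((0.2, 1.2), "A"),
         ((5.0, 5.0), "B"), ((5.2, 5.2), "B")]
means = fit_gaussians(train)
labels = diarize([1, -1, 1, -1, 5, 5, 5, 5], size=4, means=means)
```

A model that conditions each segment on the labels of earlier segments would replace `speaker_posterior`; the surrounding pipeline would stay the same.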

ELECTRONIC DEVICE FOR GENERATING PERSONALIZED ASR MODEL AND METHOD FOR OPERATING SAME
20210264916 · 2021-08-26

An electronic device according to various embodiments of the present invention comprises: a processor; and a memory electrically connected to the processor, wherein the memory can store instructions that allow, when executed, the processor to: store text data generated by recognizing user's voice data using a given automatic speech recognition (ASR) model, as one piece of utterance data together with the voice data, in an utterance data storage functionally connected to the processor; obtain a candidate for replacing an ASR error portion from a plurality of pieces of utterance data stored in the utterance data storage; generate a personalized ASR model by performing deep learning on the ASR model on the basis of the candidate and user's voice data corresponding to the candidate; receive a user's response to the candidate through an input device functionally connected to the processor; and update the ASR model to the personalized ASR model on the basis of the user's response. Various other embodiments are also possible.

Haptic and visual communication system for the hearing impaired
11100814 · 2021-08-24

A communication method includes providing a speech training device configured to teach a hearing impaired user how to understand spoken language. The method further includes providing a haptic output device to a hearing impaired user, the haptic output device coupled to the hearing impaired user. The haptic output device receives a speech input from a non-hearing impaired person, the speech input directed to the hearing impaired user, and the haptic output device provides a haptic sensation to the hearing impaired user. The communication method also teaches a hearing-impaired person how to speak, and can teach a non-hearing impaired user to speak a foreign language. The communication method includes comparing speech of a user to a model speech, providing feedback to the user regarding the comparison, and repeating until the speech of the user matches the model speech.

Systems and methods for learning for domain adaptation

A method for training parameters of a first domain adaptation model includes evaluating a cycle consistency objective using a first task specific model associated with a first domain and a second task specific model associated with a second domain. The evaluation of the cycle consistency objective is based on one or more first training representations adapted from the first domain to the second domain by a first domain adaptation model and from the second domain to the first domain by a second domain adaptation model, and one or more second training representations adapted from the second domain to the first domain by the second domain adaptation model and from the first domain to the second domain by the first domain adaptation model. The method further includes evaluating a learning objective based on the cycle consistency objective, and updating parameters of the first domain adaptation model based on the learning objective.
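A toy version of this training step is sketched below under strong simplifying assumptions: each domain adaptation model is a single scalar weight, the task specific models are fixed functions, the learning objective is taken to equal the cycle consistency objective, and the gradient is estimated by central finite differences instead of backpropagation.

```python
from typing import Callable, Sequence

Task = Callable[[float], float]


def cycle_consistency(w_ab: float, w_ba: float,
                      xs_a: Sequence[float], xs_b: Sequence[float],
                      task_a: Task, task_b: Task) -> float:
    """Compare task outputs on original and round-trip representations."""
    adapt_ab = lambda x: w_ab * x   # first model: domain A -> domain B
    adapt_ba = lambda x: w_ba * x   # second model: domain B -> domain A
    # A -> B -> A cycle, scored by the task model of domain A.
    loss_a = sum((task_a(x) - task_a(adapt_ba(adapt_ab(x)))) ** 2
                 for x in xs_a) / len(xs_a)
    # B -> A -> B cycle, scored by the task model of domain B.
    loss_b = sum((task_b(x) - task_b(adapt_ab(adapt_ba(x)))) ** 2
                 for x in xs_b) / len(xs_b)
    return loss_a + loss_b


def update_first_model(w_ab: float, w_ba: float,
                       xs_a: Sequence[float], xs_b: Sequence[float],
                       task_a: Task, task_b: Task,
                       lr: float = 0.01, eps: float = 1e-6) -> float:
    """One gradient-descent step on the first adaptation model's parameter."""
    up = cycle_consistency(w_ab + eps, w_ba, xs_a, xs_b, task_a, task_b)
    down = cycle_consistency(w_ab - eps, w_ba, xs_a, xs_b, task_a, task_b)
    grad = (up - down) / (2 * eps)
    return w_ab - lr * grad


task_a = lambda x: 2 * x
task_b = lambda x: x + 1
xs_a, xs_b = [1.0, 2.0], [1.0, 3.0]

before = cycle_consistency(0.5, 1.0, xs_a, xs_b, task_a, task_b)
w_ab = update_first_model(0.5, 1.0, xs_a, xs_b, task_a, task_b)
after = cycle_consistency(w_ab, 1.0, xs_a, xs_b, task_a, task_b)
```

In this toy setting the objective reaches zero when the two weights multiply to one, that is, when a round trip through both adaptation models returns the original representation.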