G10L17/26

AI CONTROL DEVICE, SERVER DEVICE CONNECTED TO AI CONTROL DEVICE, AND AI CONTROL METHOD
20230095124 · 2023-03-30 ·

An AI control device, which identifies individual users from a plurality of users to receive input data, and is connectable to a server device that generates a trained model based on input data for each user, includes a control unit and a communication unit connected to the server device. The control unit acquires input data, associates the acquired input data with identifying information used to identify the user of the AI control device, and sends the data and information to the server device via the communication unit. The control unit uses the sent input data to execute a trained model that the server device generates separately from the trained models of other users, and that learns characteristics of the acquired input data and detects input data having the same characteristics in unknown input data.
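The device side of this flow can be sketched as follows. This is a minimal illustration with hypothetical names (`AIControlDevice`, `send_fn`); the abstract does not specify a payload format, so JSON is assumed here.

```python
import json

class AIControlDevice:
    """Sketch of the device side: tag acquired input data with the user's
    identifying information before sending it to the server that generates
    a separate trained model per user."""

    def __init__(self, user_id, send_fn):
        self.user_id = user_id  # identifying information for this user
        self.send_fn = send_fn  # communication unit (e.g. an HTTP client)

    def acquire_and_send(self, input_data):
        # Associate the acquired data with the identifying information,
        # then send both to the server via the communication unit.
        payload = {"user_id": self.user_id, "data": input_data}
        self.send_fn(json.dumps(payload))
        return payload

# Usage: a list stands in for the communication channel.
sent = []
device = AIControlDevice("user-42", sent.append)
device.acquire_and_send([0.1, 0.5, 0.9])
```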

Determination of Content Services
20230030212 · 2023-02-02 ·

According to some aspects, disclosed methods and systems may include having a user input one or more speech commands into an input device of a user device. The user device may communicate with one or more components or devices at a local office or headend. The local office or the user device may transcribe the speech commands into language transcriptions. The local office or the user device may determine a mood for the user based on whether any of the speech commands were repeated. The local office or the user device may determine, based on the mood of the user, which content asset or content service to make available to the user device.
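The repeated-command heuristic could look like the following sketch. The mood labels and the mood-to-content mapping are assumptions for illustration; the abstract only states that repetition informs the mood and the mood informs the content choice.

```python
from collections import Counter

def infer_mood(transcriptions, repeat_threshold=2):
    """Hypothetical heuristic: if any transcription repeats (the user had
    to say the command again), treat the user as frustrated; otherwise
    neutral. Case and surrounding whitespace are normalized first."""
    counts = Counter(t.strip().lower() for t in transcriptions)
    repeated = any(n >= repeat_threshold for n in counts.values())
    return "frustrated" if repeated else "neutral"

def select_content(mood):
    # Hypothetical mapping from mood to a content asset or service.
    return {"frustrated": "help_tutorial", "neutral": "recommended_feed"}[mood]
```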

Learning method, learning program, learning device, and learning system
11488060 · 2022-11-01 ·

Provided are a learning method, a learning program, a learning device, and a learning system for training a classification model so as to further raise the correct-answer rate of classification by the model. The learning method is executed by a computer provided with at least one hardware processor and at least one memory, and includes: generating one piece of composite data by compositing, at a predetermined ratio, a plurality of pieces of training data whose classifications have each been set, or a plurality of pieces of converted data obtained by converting that training data; inputting one or more pieces of the composite data into the classification model; and updating a parameter of the classification model so that the output of the classification model replicates, at the predetermined ratio, the classifications of the pieces of training data included in the composite data.
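The compositing step resembles mixup-style data augmentation, which can be sketched as below: two examples and their labels are blended at the same predetermined ratio, so that a model trained on the composite is pushed to reproduce both classifications proportionally. The function name and NumPy representation are illustrative, not from the patent.

```python
import numpy as np

def composite(x1, y1, x2, y2, ratio):
    """Composite two training examples at the predetermined ratio.
    The label is mixed at the same ratio, so the classification model
    is updated to replicate both source classifications proportionally."""
    x = ratio * x1 + (1.0 - ratio) * x2
    y = ratio * y1 + (1.0 - ratio) * y2
    return x, y

# Usage: composite a class-0 and a class-1 example at ratio 0.7.
x, y = composite(np.array([1.0, 0.0]), np.array([1.0, 0.0]),
                 np.array([0.0, 1.0]), np.array([0.0, 1.0]), 0.7)
```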

Method and system to estimate speaker characteristics on-the-fly for unknown speaker with high accuracy and low latency

A computer-implemented technique is presented for profiling an unknown speaker. A DNN-based frame selection allows the system to select the relevant frames necessary to provide a reliable speaker characteristic estimation. A frame selection module selects those frames that contain relevant information for estimating a given speaker characteristic, thereby contributing to both the accuracy and the low latency of the system. Real-time estimation allows the system to estimate speaker characteristics from a speech segment of accumulated selected frames at any given time. The frame-level processing contributes to the low latency: it is not necessary to wait for the whole speech utterance, as a speaker characteristic is estimated from only a few reliable frames. Different stopping criteria also contribute to the accuracy and the low latency of the system.
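The select-accumulate-estimate loop with an early-stopping criterion could be sketched like this. The relevance selector, estimator, and stopping thresholds are stand-ins (the patent uses a DNN-based selector and its own criteria); here they are plain callables for illustration.

```python
def estimate_on_the_fly(frames, is_relevant, estimator,
                        min_frames=5, conf_threshold=0.9):
    """Sketch: keep only frames the selector deems relevant, re-estimate
    the speaker characteristic after each accumulated frame, and stop
    early once enough reliable frames give a confident estimate."""
    selected = []
    for frame in frames:
        if is_relevant(frame):
            selected.append(frame)
            estimate, confidence = estimator(selected)
            if len(selected) >= min_frames and confidence >= conf_threshold:
                return estimate, len(selected)  # early stop: low latency
    # Fall back to whatever was accumulated if the criterion never fired.
    return (estimator(selected)[0] if selected else None), len(selected)

# Usage with toy stand-ins: positive frames are "relevant", the estimator
# returns the running mean plus a confidence that grows with frame count.
est, n = estimate_on_the_fly(
    [1, -1, 2, -2, 3, 4, 5],
    is_relevant=lambda f: f > 0,
    estimator=lambda s: (sum(s) / len(s), min(1.0, len(s) / 5)),
)
```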

INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND PROGRAM
20230093165 · 2023-03-23 ·

The present technology relates to an information processing apparatus, an information processing method, and a program that enable voice operation using natural expressions. An information processing apparatus according to the present technology includes a command processing unit that, in a case where a voice command input by a user to control a device includes a predetermined word whose degree of control is determined to be ambiguous, executes processing matching the voice command by using a parameter that matches the user's way of speaking at the time the voice command is input. The present technology is applicable to, for example, an imaging apparatus that can be operated by voice.
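One way to picture this is the sketch below: when the command contains an ambiguous degree word, a control parameter is derived from how the user spoke. The word list, the loudness-based scaling, and the 60 dB baseline are all assumptions for illustration; the patent does not specify which speech feature or mapping is used.

```python
AMBIGUOUS_WORDS = {"a bit", "a little", "more", "slightly"}  # assumed list

def resolve_ambiguous_degree(degree_word, loudness_db):
    """Hypothetical sketch: if the command's degree word is ambiguous,
    derive the control parameter from the user's way of speaking,
    here mapping louder speech to a larger adjustment."""
    if degree_word not in AMBIGUOUS_WORDS:
        return None  # not ambiguous; an explicit value would be used
    # Scale the adjustment with loudness relative to a 60 dB baseline,
    # clamped so the command always has some effect.
    return max(0.1, (loudness_db - 60.0) / 10.0 + 0.5)
```

A quietly spoken "zoom in a bit" then yields a small zoom step, while the same words spoken loudly yield a larger one.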

Methods and systems for providing changes to a live voice stream

Methods and systems for providing a change to a voice interacting with a user are described. Information indicating a change that can be made to the voice can be received, and the voice can be changed based on that information.
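A minimal sketch of applying such change information to a chunk of a live stream follows. The `gain`/`rate` fields of the change object are assumed for illustration; the abstract does not define the information's format or which voice properties are changed.

```python
import numpy as np

def apply_voice_change(samples, change):
    """Sketch (assumed change format): 'gain' scales amplitude and 'rate'
    resamples the chunk, shifting its apparent pitch and speed."""
    out = np.asarray(samples, dtype=float) * change.get("gain", 1.0)
    rate = change.get("rate", 1.0)
    if rate != 1.0:
        # Linear-interpolation resampling of the stream chunk.
        idx = np.arange(0, len(out), rate)
        out = np.interp(idx, np.arange(len(out)), out)
    return out
```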

METHOD FOR COUNTING COUGHS BY ANALYZING SOUND SIGNAL, SERVER PERFORMING SAME, AND NON-TRANSITORY COMPUTER-READABLE RECORDING MEDIUM

A method for counting coughs is provided. The method includes acquiring a plurality of onset signals from the sound signal, wherein each onset signal has a predetermined time length; acquiring a plurality of spectrograms corresponding to each of the plurality of onset signals; determining whether each of the acquired plurality of spectrograms represents a cough using a cough determination model; and calculating a number of coughs included in the sound signal based on a time point of a cough signal. The cough signal is an onset signal corresponding to one spectrogram determined to represent the cough. When a time interval between a first time point of a first cough signal and a second time point of a second cough signal is within a reference time interval, the first cough signal and the second cough signal are regarded as one cough signal at the first time point.
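The final counting step can be sketched as follows: cough signals whose time points fall within the reference interval of the last kept cough are merged into that cough at its earlier time point. The function name and the 0.5 s default interval are illustrative assumptions.

```python
def count_coughs(cough_times, reference_interval=0.5):
    """Merge cough time points that fall within the reference interval
    of the last kept cough, counting them as one cough at the first
    (earlier) time point; return the count and the merged time points."""
    merged = []
    last = None
    for t in sorted(cough_times):
        if last is None or t - last > reference_interval:
            merged.append(t)  # a new, distinct cough
            last = t
        # else: within the interval of the last kept cough; merged into it
    return len(merged), merged
```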