G10L25/00

DIGITAL AUDIO PROCESSING DEVICE, DIGITAL AUDIO PROCESSING METHOD, AND DIGITAL AUDIO PROCESSING PROGRAM
20200265861 · 2020-08-20

A local extremum calculator detects a local maximum sample and a local minimum sample of a digital audio signal. A number-of-sample detector detects a sample interval between the local maximum sample and the local minimum sample. A difference value calculator calculates difference values between adjacent samples. A correction value calculator calculates a first correction value by multiplying the difference value between the local maximum sample and a first adjacent sample by a coefficient and calculates a second correction value by multiplying the difference value between the local minimum sample and a second adjacent sample by the coefficient. When a periodic signal detector detects that the digital audio signal is a single sine wave, an adder/subtractor does not add the first correction value to the first adjacent sample, and does not subtract the second correction value from the second adjacent sample.
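
The correction step described in this abstract can be illustrated with a minimal sketch. The abstract does not say which neighbor counts as the "adjacent sample", how the coefficient is chosen, or how the sample interval is used, so the code below assumes both immediate neighbors of each extremum are corrected, uses a fixed hypothetical coefficient, and takes the single-sine-wave decision as a flag supplied by the caller. The function name and parameters are illustrative only.

```python
from typing import List


def correct_near_extrema(x: List[float],
                         coeff: float = 0.25,
                         is_single_sine: bool = False) -> List[float]:
    """Adjust the samples next to each local extremum (illustrative sketch).

    Assumptions not stated in the abstract: both immediate neighbors are
    treated as "adjacent samples", and coeff is a fixed constant.
    """
    # When the signal is flagged as a single sine wave, no correction is applied.
    if is_single_sine:
        return list(x)

    y = list(x)
    for i in range(1, len(x) - 1):
        is_max = x[i] > x[i - 1] and x[i] > x[i + 1]
        is_min = x[i] < x[i - 1] and x[i] < x[i + 1]
        if not (is_max or is_min):
            continue
        for j in (i - 1, i + 1):
            # Correction value: difference to the extremum times the coefficient.
            correction = coeff * abs(x[i] - x[j])
            if is_max:
                y[j] += correction   # add near a local maximum
            else:
                y[j] -= correction   # subtract near a local minimum
    return y
```

For example, calling `correct_near_extrema([0.0, 1.0, 4.0, 1.0, 0.0], coeff=0.5)` raises the two samples beside the peak, while passing `is_single_sine=True` returns the input unchanged.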

Information processing apparatus, method and non-transitory computer-readable storage medium
10741198 · 2020-08-11

An information processing apparatus includes a memory, and a processor coupled to the memory and configured to specify a first signal level of a first voice signal, specify a second signal level of a second voice signal, and execute evaluation of at least one of the first voice signal and the second voice signal based on at least one of a sum of the first signal level and the second signal level and an average of the first signal level and the second signal level.

Name-sensitive listening device

One embodiment of the present invention sets forth a technique for providing audio enhancement to a user of a listening device. The technique includes reproducing a first audio stream, such as an audio stream associated with a media player. The technique further includes detecting a voice trigger. The voice trigger may be associated with a name of a user of the listening device. The technique further includes pausing or attenuating the first audio stream and reproducing a second audio stream associated with ambient sound in response to detecting the voice trigger.

Speech dialogue device and speech dialogue method

A correspondence relationship between keywords for instructing the start of a speech dialogue and modes of a response is defined in a response-mode correspondence table. A response-mode selecting unit selects a mode of a response corresponding to a keyword included in the recognition result of a speech recognition unit using the response-mode correspondence table. A dialogue controlling unit starts the speech dialogue when the keyword is included in the recognition result of the speech recognition unit, determines a response in accordance with the subsequent recognition result from the speech recognition unit, and controls a mode of the response in such a manner as to match the mode selected by the response-mode selecting unit. A speech output controlling unit generates speech data on the basis of the response and mode controlled by the dialogue controlling unit and outputs the speech data to a speaker.
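
The response-mode correspondence table can be pictured as a simple lookup from start keywords to response modes. The sketch below is only an illustration under stated assumptions: the keywords, the mode names, and the function `select_response_mode` are hypothetical, and matching is done on whole words of the recognition result.

```python
import re
from typing import Dict, Optional

# Hypothetical table: start keyword -> response mode.
RESPONSE_MODES: Dict[str, str] = {
    "hello assistant": "detailed",
    "hey": "brief",
}


def _normalize(text: str) -> str:
    """Lowercase the text and collapse it to space-separated word tokens."""
    return " ".join(re.findall(r"[a-z0-9']+", text.lower()))


def select_response_mode(recognized_text: str,
                         table: Dict[str, str]) -> Optional[str]:
    """Return the mode for the first keyword found in the recognition result.

    Returns None when no keyword is present, in which case the dialogue
    would not be started.
    """
    padded = f" {_normalize(recognized_text)} "
    for keyword, mode in table.items():
        # Padding with spaces makes this a whole-word (or whole-phrase) match.
        if f" {_normalize(keyword)} " in padded:
            return mode
    return None
```

A dialogue controller could call `select_response_mode` on each recognition result, start the dialogue when the result is not `None`, and keep the returned mode for the responses that follow.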

EXPANDABLE DIALOGUE SYSTEM

A system that allows non-engineer administrators, without programming, machine language, or artificial intelligence system knowledge, to expand the capabilities of a dialogue system. The dialogue system may have a knowledge system, user interface, and learning model. A user interface allows non-engineers to utilize the knowledge system, defined by a small set of primitives and a simple language, to annotate a user utterance. The annotation may include selecting actions to take based on the utterance and subsequent actions and configuring associations. A dialogue state is continuously updated and provided to the user as the actions and associations take place. Rules are generated based on the actions, associations and dialogue state that allow for computing a wide range of results.

System for controlling electronic devices by means of voice commands, more specifically a remote control to control a plurality of electronic devices by means of voice commands

A remote control for generating output signals suitable for controlling one or more electronic devices, characterized in that said remote control includes a sound transducer, a speech recognition unit for recognizing voice commands, a memory for storing information relating to the available content of said one or more electronic devices, and a control signal generating and receiving unit for generating control signals corresponding to said voice commands, for controlling said one or more electronic devices.

Switching between text data and audio data based on a mapping

Techniques are provided for creating a mapping that maps locations in audio data (e.g., an audio book) to corresponding locations in text data (e.g., an e-book). Techniques are provided for using a mapping between audio data and text data, whether the mapping is created automatically or manually. A mapping may be used for bookmark switching where a bookmark established in one version of a digital work (e.g., e-book) is used to identify a corresponding location with another version of the digital work (e.g., an audio book). Alternatively, the mapping may be used to play audio that corresponds to text selected by a user. Alternatively, the mapping may be used to automatically highlight text in response to audio that corresponds to the text being played. Alternatively, the mapping may be used to determine where an annotation created in one media context (e.g., audio) will be consumed in another media context.
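
Bookmark switching with such a mapping can be sketched as a lookup in a sorted list of anchor pairs. The representation below, a list of `(audio_seconds, text_offset)` tuples, and the function names are assumptions made for illustration; the abstract does not specify how the mapping is stored.

```python
import bisect
from typing import List, Tuple

# Hypothetical mapping: (audio position in seconds, character offset in text),
# sorted in ascending order on both fields.
Mapping = List[Tuple[float, int]]


def audio_to_text(mapping: Mapping, seconds: float) -> int:
    """Return the text offset of the last anchor at or before `seconds`."""
    times = [t for t, _ in mapping]
    # Clamp to the first anchor when the position precedes every anchor.
    i = max(bisect.bisect_right(times, seconds) - 1, 0)
    return mapping[i][1]


def text_to_audio(mapping: Mapping, offset: int) -> float:
    """Return the audio position of the last anchor at or before `offset`."""
    offsets = [o for _, o in mapping]
    i = max(bisect.bisect_right(offsets, offset) - 1, 0)
    return mapping[i][0]
```

With anchors at, say, sentence boundaries, a bookmark set in the e-book is converted by `text_to_audio` to the point where the audio book should resume, and `audio_to_text` supports the reverse direction as well as highlighting text during playback.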

Information processing device, information processing method, and program
10657959 · 2020-05-19

There is provided an information processing device, an information processing method, and a program that can allow a user to intuitively recognize other information corresponding to a speech output, the information processing device including: a control unit configured to control an output of other information different from a speech output related to a predetermined function on the basis of timing information on timing at which the speech output of an expression related to the function among a set of expressions is made, the set of expressions including the expression related to the function.

Determining a target device for voice command interaction

Systems, methods, and devices for determining a target device for a voice command are provided. A voice command is detected at a plurality of devices. A weight is determined for the detected voice command at each device of the plurality of devices. The determined weight is exchanged among the plurality of devices. A highest weight among the exchanged weights is determined. The device associated with the highest weight is determined as the target device for the voice command.
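
The selection step can be sketched as follows, assuming each device already holds the full set of exchanged weights as a dictionary keyed by device identifier. How a weight is computed is not stated in the abstract, so the values are treated as given; the tie-breaking rule (smallest identifier wins) is an added assumption so that every device reaches the same answer.

```python
from typing import Dict


def select_target_device(weights: Dict[str, float]) -> str:
    """Return the identifier of the device with the highest weight.

    Ties are broken by the lexicographically smallest identifier
    (an assumption, not part of the abstract).
    """
    if not weights:
        raise ValueError("no device reported a weight")
    # Sort key: highest weight first, then smallest identifier.
    return min(weights, key=lambda device: (-weights[device], device))
```

Because every device evaluates the same function on the same exchanged weights, each can decide locally whether it is the target, for example with `select_target_device(weights) == my_device_id`, where `my_device_id` is a hypothetical local identifier.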