G10L21/06

Voice monitoring system and voice monitoring method

A recording device records a video of a monitoring area, its imaging time, and a voice. Based on the voice, a sound parameter calculator calculates a sound parameter specifying the magnitude of the voice in the monitoring area at the imaging time, for each pixel and for each time interval. A sound parameter storage unit stores the sound parameters. A sound parameter display controller superimposes a voice heat map on a captured image of the monitoring area and displays the superimposed image on a monitor. In doing so, the sound parameter display controller displays the voice heat map based on a cumulative value of the magnitude of the voice over a designated time range.
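The per-pixel accumulation and overlay described above can be sketched as follows. This is an illustrative reading, not the patent's implementation: the array shapes, the normalisation, and the red-channel colouring are all assumptions.

```python
import numpy as np

def cumulative_heatmap(sound_params, t_start, t_end):
    """Sum per-pixel sound magnitudes over a designated time range.

    sound_params: array of shape (T, H, W), one magnitude map per time slot.
    Returns an (H, W) cumulative map, normalised to [0, 1] for display.
    """
    window = sound_params[t_start:t_end]      # select the designated time range
    cumulative = window.sum(axis=0)           # accumulate magnitude per pixel
    peak = cumulative.max()
    return cumulative / peak if peak > 0 else cumulative

def overlay(frame, heatmap, alpha=0.4):
    """Alpha-blend a single-channel heat map onto an RGB frame (H, W, 3)."""
    colored = np.zeros_like(frame, dtype=float)
    colored[..., 0] = heatmap * 255           # map magnitude onto the red channel
    return ((1 - alpha) * frame + alpha * colored).astype(np.uint8)
```

A caller would compute one cumulative map per user-selected time range and re-blend it onto the current camera frame whenever the range changes.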

Speech recognition services

A speech recognition platform configured to receive an audio signal that includes speech from a user and perform automatic speech recognition (ASR) on the audio signal to identify ASR results. The platform may identify: (i) a domain of a voice command within the speech based on the ASR results and based on context information associated with the speech or the user, and (ii) an intent of the voice command. In response to identifying the intent, the platform may perform a corresponding action, such as streaming audio to the device, setting a reminder for the user, purchasing an item on behalf of the user, making a reservation for the user or launching an application for the user. The speech recognition platform, in combination with the device, may therefore facilitate efficient interactions between the user and a voice-controlled device.
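The domain/intent identification and dispatch described above can be sketched roughly as below. The keyword heuristics, handler table, and slot names are all hypothetical stand-ins; a real platform would use trained NLU models rather than substring matching.

```python
# Illustrative handler table: (domain, intent) -> action. Names are assumptions.
HANDLERS = {
    ("music", "play"): lambda slots: f"streaming {slots['item']}",
    ("reminders", "set"): lambda slots: f"reminder set for {slots['item']}",
    ("shopping", "buy"): lambda slots: f"purchased {slots['item']}",
}

def identify(asr_text, context):
    """Very rough domain/intent classification from ASR text plus context."""
    text = asr_text.lower()
    if "play" in text:
        return ("music", "play"), {"item": text.split("play", 1)[1].strip()}
    if "remind" in text:
        return ("reminders", "set"), {"item": text.split("remind", 1)[1].strip()}
    if "buy" in text:
        return ("shopping", "buy"), {"item": text.split("buy", 1)[1].strip()}
    # Fall back to context (e.g. the user's last-used domain) when text is ambiguous.
    return (context.get("last_domain", "unknown"), "unknown"), {}

def handle(asr_text, context):
    """Identify the voice command and perform the corresponding action."""
    key, slots = identify(asr_text, context)
    handler = HANDLERS.get(key)
    return handler(slots) if handler else "no action"
```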

Contingent device actions during loss of network connectivity

A speech-based system includes a local device in a user premises and a network-based control service that directs the local device to perform actions for a user. The control service may specify a first action that is to be performed upon detection by the local device of a stimulus. In some cases, performing the first action may rely on the availability of network communications with the control service or with another service. In these cases, the control service also specifies a second, fallback action that does not rely upon network communications. Upon detecting the stimulus, the local device performs the first action if network communications are available. If network communications are not available, the local device performs the second, fallback action.
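The primary/fallback behaviour described above can be sketched as a small closure. The callable interfaces and the use of `OSError` to signal connectivity loss are illustrative assumptions.

```python
def make_contingent_action(primary, fallback, network_up):
    """Build a stimulus handler that runs `primary` when the network service is
    reachable and `fallback` otherwise. `primary`, `fallback`, and `network_up`
    are plain callables here, standing in for the device's real actions."""
    def on_stimulus():
        try:
            if network_up():
                return primary()
        except OSError:
            pass  # treat a connection error the same as loss of connectivity
        return fallback()
    return on_stimulus
```

The control service would provision both actions ahead of time, so the local device can still react to the stimulus with the fallback when it cannot reach the service.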

Authoring an immersive haptic data file using an authoring tool

Methods and systems of authoring audio signal(s) into haptic data file(s) are disclosed. An audio analysis module analyses the audio signal(s) using filterbank(s) or a spectrogram analysis. Transients are detected in the audio signal. If present, the transients are processed to determine a transient score and a transient binary. A database stores device-specific information and actuator-specific information. A haptic perceptual bandwidth of an electronic computing device having an embedded actuator is determined using information from the database. A user interface allows modification of time-amplitude values and transient values based on the determined haptic perceptual bandwidth. Authored time-amplitude values are aggregated into authored audio descriptor data, which is passed to a transformation module that fits the data into the haptic perceptual bandwidth and implements algorithms to produce transformed audio descriptor data. Finally, the transformed audio descriptor data is converted to the haptic data file.
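Two of the steps above, transient scoring and fitting values into the haptic perceptual bandwidth, can be sketched minimally as below. The energy-jump heuristic, the threshold, and the clamping are illustrative assumptions; the patent's transformation module is more elaborate.

```python
def detect_transients(samples, threshold=0.5):
    """Mark transients where the sample-to-sample jump exceeds a threshold.
    Returns (scores, binaries): the jump magnitude per step, and a 0/1 flag."""
    scores, binaries = [], []
    prev = samples[0]
    for s in samples[1:]:
        jump = abs(s - prev)              # crude transient score
        scores.append(jump)
        binaries.append(1 if jump > threshold else 0)
        prev = s
    return scores, binaries

def fit_to_bandwidth(amplitudes, band_lo, band_hi):
    """Clamp authored amplitude values into the device's haptic perceptual
    bandwidth (band_lo..band_hi), a stand-in for the transformation module."""
    return [min(max(a, band_lo), band_hi) for a in amplitudes]
```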

Somatosensory vibration generating device and method for forming somatosensory vibration
11606636 · 2023-03-14 ·

The invention provides a somatosensory vibration generating device comprising: an audio signal receiving module for receiving sound waves of external environmental sounds and converting the sound waves into a first audio frequency signal; a digital-to-analog conversion module for performing digital-to-analog conversion on the first audio frequency signal to generate and output a second audio frequency signal after digital-to-analog conversion; a digital signal processing module for converting the second audio frequency signal output by the digital-to-analog conversion module into a first vibration signal; an operational amplifier for performing gain processing on the first vibration signal and outputting a second vibration signal after gain processing; and at least one tactile unit comprising at least a vibration element and a tactile transducer; and a frequency of the second audio frequency signal is less than 200 Hz.
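The signal chain above (audio in, conversion to a low-frequency vibration signal, then gain) can be sketched as a toy one-pole low-pass filter followed by a gain stage. The filter coefficient stands in for the DSP module that keeps the signal below 200 Hz, and the gain factor stands in for the operational amplifier; both values are assumptions.

```python
def to_vibration(samples, gain=2.0, alpha=0.1):
    """Toy vibration chain: a one-pole low-pass filter (a stand-in for the DSP
    module that restricts the signal to below ~200 Hz) followed by a gain
    stage (a stand-in for the operational amplifier)."""
    out, y = [], 0.0
    for x in samples:
        y = y + alpha * (x - y)   # one-pole low-pass: smooths out fast changes
        out.append(gain * y)      # apply gain before driving the transducer
    return out
```

With a sustained input the output settles at `gain` times the input level, while brief high-frequency content is attenuated before it reaches the tactile transducer.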

Apparatus and Method for Processing an Audio Input Recording to obtain a Processed Audio Recording to address Privacy Issues

An apparatus for processing an audio input recording to obtain a processed audio recording according to an embodiment is provided. The apparatus comprises an input interface (110) for receiving a plurality of audio input portions of the audio input recording. Moreover, the apparatus comprises a processor (120) for processing the plurality of audio input portions of the audio input recording to obtain a processed audio recording. The processor (120) is configured to determine whether or not an audio input portion of the plurality of audio input portions comprises speech. If the processor (120) has detected that the audio input portion comprises speech, the processor (120) is configured to generate the processed audio recording by modifying the audio input portion to obtain a modified audio portion, and by generating the processed audio recording such that the processed audio recording comprises the modified audio portion instead of the audio input portion. Alternatively, if the processor (120) has detected that the audio input portion comprises speech, the processor (120) is configured to generate the processed audio recording such that the processed audio recording does not comprise the audio input portion.
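The two alternatives above (replace a speech portion with a modified version, or omit it entirely) can be sketched per portion as follows. Muting the portion is an illustrative stand-in for the patent's modification step, and `is_speech` stands in for a real voice-activity detector.

```python
def process_recording(portions, is_speech, mode="modify"):
    """Process audio portions for privacy: when speech is detected in a
    portion, either append a modified (here: muted) version in its place
    ("modify") or omit the portion entirely ("drop"). Non-speech portions
    pass through unchanged. `is_speech` is a stand-in voice-activity detector."""
    out = []
    for portion in portions:
        if is_speech(portion):
            if mode == "modify":
                out.append([0.0] * len(portion))   # e.g. mute / anonymise speech
            # mode == "drop": skip the speech portion entirely
        else:
            out.append(portion)
    return out
```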