G10L2025/932

SYSTEM AND METHOD FOR THREAT DETECTION, CLASSIFICATION, WARNING AND ALERTING OF MOBILE USERS
20200027475 · 2020-01-23

Embodiments of the invention are designed to detect a threat (such as an imminent automobile collision) and generate an audible warning to override the media distraction of a disengaged user. The user may take evasive action and acknowledge the warning, at which point the system resets and retreats to its vigilance mode. On the other hand, a system-initiated alert is generated as a compensatory follow-up step for a user who is non-responsive (does not acknowledge a system-identified warning); in such cases the system transmits a user safety alert comprising the warning data and geolocation parameters to first responders and interested parties, so that they may engage the user to determine their current situation or preemptively act to remedy a life-threatening situation.
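
The warn, acknowledge, and alert cycle described above can be read as a small state machine. The following is a minimal sketch of that reading, not the patented implementation: the class and method names, the `Alert` fields, and the acknowledgement timeout are all hypothetical, since the abstract does not say how long the system waits before treating the user as non-responsive.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class State(Enum):
    VIGILANCE = auto()  # monitoring for threats
    WARNING = auto()    # audible warning issued, waiting for acknowledgement
    ALERTED = auto()    # safety alert handed off to responders


@dataclass
class Alert:
    warning: str        # warning data carried in the alert
    latitude: float     # geolocation parameters
    longitude: float


class VigilanceMonitor:
    """Hypothetical model of the warn / acknowledge / alert cycle."""

    def __init__(self, ack_timeout_s: float = 10.0):
        # The timeout value is an assumption; the abstract gives none.
        self.ack_timeout_s = ack_timeout_s
        self.state = State.VIGILANCE
        self._warning: Optional[str] = None
        self._warned_at = 0.0

    def threat_detected(self, description: str, now: float) -> None:
        # Only a system in vigilance mode raises a new warning.
        if self.state is State.VIGILANCE:
            self.state = State.WARNING
            self._warning = description
            self._warned_at = now

    def acknowledge(self) -> None:
        # The user responded: reset and return to vigilance mode.
        if self.state is State.WARNING:
            self.state = State.VIGILANCE
            self._warning = None

    def tick(self, now: float, latitude: float, longitude: float) -> Optional[Alert]:
        # No acknowledgement within the timeout: build the safety alert.
        if self.state is State.WARNING and now - self._warned_at >= self.ack_timeout_s:
            self.state = State.ALERTED
            return Alert(self._warning, latitude, longitude)
        return None
```

A caller would invoke `tick` periodically with the current time and position, and forward any returned `Alert` to whatever transport reaches first responders.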

In the case of a user-initiated alert, a user who senses a threat of physical violence triggers the alert via a customizable distress phrase, causing the vigilance system to assemble and transmit an alert in real time with the user's geolocation and live audio and video data in the alert payload.

Voice processing method and device

A voice processing method and device, the method comprising: detecting a current voice application scenario in a network (S1); determining the voice quality requirement and the network requirement of the current voice application scenario (S2); based on the voice quality requirement and the network requirement, configuring voice processing parameters corresponding to the voice application scenario (S3); and according to the voice processing parameters, conducting voice processing on the voice signals collected in the voice application scenario (S4).
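
Steps S2 to S4 can be illustrated with a small lookup-and-configure sketch. Everything concrete here is an assumption made for illustration: the scenario names, the requirement table, the parameter fields, and the crude noise gate that stands in for real voice processing. Scenario detection (S1) is assumed to have already produced a scenario label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirements:
    quality: str         # voice quality requirement: "high" or "normal"
    bandwidth_kbps: int  # network requirement: available bandwidth


@dataclass(frozen=True)
class VoiceParams:
    sample_rate_hz: int
    bitrate_kbps: int
    noise_suppression: bool


# Hypothetical requirement table; the abstract lists no scenarios or values.
SCENARIO_REQUIREMENTS = {
    "conference": Requirements(quality="high", bandwidth_kbps=64),
    "game_chat": Requirements(quality="normal", bandwidth_kbps=16),
}


def determine_requirements(scenario: str) -> Requirements:
    """S2: look up the quality and network requirements of a scenario."""
    return SCENARIO_REQUIREMENTS[scenario]


def configure_params(req: Requirements) -> VoiceParams:
    """S3: derive processing parameters from the requirements."""
    high = req.quality == "high"
    target_bitrate = 32 if high else 12
    return VoiceParams(
        sample_rate_hz=16000 if high else 8000,
        bitrate_kbps=min(req.bandwidth_kbps, target_bitrate),
        noise_suppression=high,
    )


def process_voice(samples, params: VoiceParams, gate=0.05):
    """S4: apply the configured processing (here only a simple noise gate)."""
    if not params.noise_suppression:
        return list(samples)
    return [s if abs(s) >= gate else 0.0 for s in samples]
```

For example, `configure_params(determine_requirements("conference"))` yields wideband parameters with noise suppression enabled, while `"game_chat"` yields a lower bitrate with suppression off.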

Voice Activity Detector for Audio Signals

According to one aspect, a method for determining voice activity is disclosed, the method including receiving a frame of an input audio signal, the input audio signal having a sample rate, and splitting the audio signal into a plurality of subbands, the plurality of subbands including at least a lowest subband and a highest subband. The method further comprises filtering the lowest subband to reduce an energy of the lowest subband, estimating a noise level for at least some of the plurality of subbands, and computing a signal-to-noise ratio for at least some of the plurality of subbands. The method also includes determining a speech activity level based at least in part on the computed signal-to-noise ratios and an average of an energy of at least some of the plurality of subbands.

Frame loss correction with voice information
10431226 · 2019-10-01

A method for processing a digital audio signal, including a series of samples distributed in consecutive frames, is implemented when decoding the signal in order to replace at least one signal frame lost during decoding. The method includes the following steps: a) searching, in a valid signal segment available when decoding, for at least one period in the signal, determined in accordance with the valid signal; b) analyzing the signal in the period, in order to determine spectral components of the signal in the period; c) synthesizing at least one frame for replacing the lost frame, by construction of a synthesis signal from: an addition of components selected among the predetermined spectral components, and a noise added to the addition of components. In particular, the amount of noise added to the addition of components is weighted in accordance with voice information of the valid signal, obtained when decoding.
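
Steps a) to c) can be sketched in plain Python as follows. This is an illustrative reading under stated assumptions, not the claimed method: the period search uses normalized autocorrelation, the spectral analysis is a direct DFT of the last period, the `keep` strongest bins are the "selected" components, and the voice information is reduced to a hypothetical `voicing` value in [0, 1] so that the noise gain scales with `1 - voicing`.

```python
import cmath
import math
import random


def find_period(signal, min_lag, max_lag):
    """Step a): lag with the highest normalized autocorrelation."""
    best_lag, best_score = min_lag, -1.0
    for lag in range(min_lag, max_lag + 1):
        a, b = signal[lag:], signal[:-lag]
        num = sum(x * y for x, y in zip(a, b))
        den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b)) or 1.0
        if num / den > best_score:
            best_lag, best_score = lag, num / den
    return best_lag


def dft(x):
    """Step b): direct DFT of one period (O(n^2), fine for a sketch)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def conceal_frame(valid, frame_len, voicing, keep=4,
                  min_lag=20, max_lag=80, rng=None):
    """Step c): sum of selected components plus voicing-weighted noise."""
    rng = rng or random.Random(0)
    period = find_period(valid, min_lag, max_lag)
    last = valid[-period:]
    spectrum = dft(last)
    # Select the strongest non-DC bins up to the Nyquist bin.
    bins = sorted(range(1, period // 2 + 1),
                  key=lambda k: abs(spectrum[k]), reverse=True)[:keep]
    rms = math.sqrt(sum(v * v for v in last) / period)
    noise_gain = (1.0 - voicing) * rms  # assumed weighting: less noise when voiced
    out = []
    for t in range(frame_len):
        n = period + t  # continue the phase past the end of the last period
        s = spectrum[0].real / period
        for k in bins:
            scale = 1.0 if 2 * k == period else 2.0  # Nyquist bin counts once
            s += scale * (spectrum[k] * cmath.exp(2j * math.pi * k * n / period)).real / period
        out.append(s + noise_gain * rng.uniform(-1.0, 1.0))
    return out
```

With `voicing=1.0` the replacement frame is a purely harmonic continuation of the last period; lowering `voicing` mixes in proportionally more noise.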

Voice activity detector for audio signals

According to one aspect, a method for detecting voice activity is disclosed, the method including receiving a frame of an input audio signal, the input audio signal having a sample rate; dividing the frame into a plurality of subbands based on the sample rate, the plurality of subbands including at least a lowest subband and a highest subband; filtering the lowest subband with a moving average filter to reduce an energy of the lowest subband; estimating a noise level for each of the plurality of subbands; calculating a signal-to-noise ratio value for each of the plurality of subbands; and determining a speech activity level of the frame based on an average of the calculated signal-to-noise ratio values and a weighted average of an energy of each of the plurality of subbands. Other aspects include audio decoders that decode audio that was encoded using the methods described herein.
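
The per-frame steps listed above can be sketched as below. Several details are assumptions, because the abstract does not give them: the frame is taken as already split into subband sample lists (lowest band first), the noise estimate drops immediately and rises slowly, the first frame is treated as noise, and the final activity level is the mean SNR in dB gated by an energy floor. The class name `SubbandVad` and its parameters are hypothetical.

```python
import math


def moving_average(x, width=3):
    """Causal moving average over the last `width` samples."""
    out, acc = [], 0.0
    for i, v in enumerate(x):
        acc += v
        if i >= width:
            acc -= x[i - width]
        out.append(acc / min(i + 1, width))
    return out


def mean_energy(x):
    return sum(v * v for v in x) / len(x)


class SubbandVad:
    def __init__(self, adapt=0.05, energy_floor=1e-6, eps=1e-12):
        self.adapt = adapt                # upward noise adaptation rate (assumed)
        self.energy_floor = energy_floor  # gate on weighted energy (assumed)
        self.eps = eps
        self.noise = None                 # per-subband noise estimate

    def process(self, subbands, weights):
        """Return a speech activity level (mean SNR in dB, or 0.0 if gated)."""
        # Filter only the lowest subband with the moving average.
        bands = [moving_average(subbands[0])] + [list(b) for b in subbands[1:]]
        energies = [mean_energy(b) for b in bands]
        if self.noise is None:
            self.noise = list(energies)  # assume the first frame is noise only
        # SNR per subband against the noise estimate from earlier frames.
        snrs_db = [10.0 * math.log10(max(e, self.eps) / max(n, self.eps))
                   for e, n in zip(energies, self.noise)]
        # Noise update: follow drops at once, follow rises slowly.
        self.noise = [e if e < n else n + self.adapt * (e - n)
                      for e, n in zip(energies, self.noise)]
        mean_snr = sum(snrs_db) / len(snrs_db)
        weighted = sum(w * e for w, e in zip(weights, energies)) / sum(weights)
        return max(0.0, mean_snr) if weighted > self.energy_floor else 0.0
```

A frame would then be classified as speech when the returned level exceeds a threshold chosen for the target application.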

Method And Apparatus For Recovering Lost Frames
20190251980 · 2019-08-15

The present disclosure relates to methods and apparatus for recovering a lost frame in a received audio signal. One example method includes obtaining an initial high-frequency band signal of a current lost frame in the received audio signal, calculating a ratio R, where the ratio R is a ratio of a high frequency excitation energy of a previous frame of the current lost frame to a high frequency excitation energy of the current lost frame, obtaining a global gain of the current lost frame according to the ratio R and a global gain of the previous frame of the current lost frame, and recovering a high-frequency band signal of the current lost frame according to the initial high-frequency band signal of the current lost frame and the global gain of the current lost frame.
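
The abstract does not state how the global gain is obtained from R and the previous gain. One plausible mapping, shown here purely as an assumption, keeps the gain-scaled excitation energy continuous across frames, which gives gain_cur = gain_prev * sqrt(R). The function names and the optional `attenuation` factor are hypothetical.

```python
import math


def excitation_energy(x):
    """Sum of squared samples of a high-frequency excitation signal."""
    return sum(v * v for v in x)


def recover_high_band(initial_hf, prev_exc, cur_exc, prev_gain, attenuation=1.0):
    """Scale the initial high-band signal of the lost frame by a derived gain.

    Assumed rule (not from the abstract): gain = attenuation * prev_gain * sqrt(R).
    """
    e_prev = excitation_energy(prev_exc)
    e_cur = excitation_energy(cur_exc)
    # R: previous-frame energy over current-lost-frame energy.
    ratio = e_prev / e_cur if e_cur > 0 else 1.0
    gain = attenuation * prev_gain * math.sqrt(ratio)
    return gain, [gain * v for v in initial_hf]
```

Under this rule a lost frame whose estimated excitation is weaker than the previous frame's receives a proportionally larger gain, so the recovered high band does not drop abruptly in level.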

CONCEPT FOR ENCODING AN AUDIO SIGNAL AND DECODING AN AUDIO SIGNAL USING DETERMINISTIC AND NOISE LIKE INFORMATION

An encoder for encoding an audio signal has: an analyzer configured for deriving prediction coefficients and a residual signal from an unvoiced frame of the audio signal; a gain parameter calculator configured for calculating a first gain parameter information for defining a first excitation signal related to a deterministic codebook and for calculating a second gain parameter information for defining a second excitation signal related to a noise-like signal for the unvoiced frame; and a bitstream former configured for forming an output signal based on an information related to a voiced signal frame, the first gain parameter information and the second gain parameter information.

Voice Activity Detection Feature Based on Modulation-Phase Differences
20190139567 · 2019-05-09

Speech processing methods may rely on voice activity detection (VAD) that separates speech from noise. Example embodiments of a computationally low complex VAD feature that is robust against various types of noise are introduced. By considering an alternating excitation structure of low and high frequencies, speech is detected with a high confidence. The computationally low complex VAD feature can cope even with the limited spectral resolution that may be typical for a communication system, such as an in-car-communication (ICC) system. Simulation results confirm the robustness of the computationally low complex VAD feature and show an increase in performance relative to established VAD features.

Methods and systems for classifying audio segments of an audio signal

The disclosed embodiments illustrate a method for classifying one or more audio segments of an audio signal. The method includes determining one or more first features of a first audio segment of the one or more audio segments. The method further includes determining one or more second features based on the one or more first features. The method includes determining one or more third features of the first audio segment, wherein each of the one or more third features is determined based on a second feature of the one or more second features of the first audio segment and at least one second feature associated with a second audio segment. Additionally, the method includes classifying the first audio segment either in an interrogative category or a non-interrogative category based on one or more of the one or more second features and the one or more third features.

Speech signal processing circuit

A speech-signal-processing-circuit configured to receive a time-frequency-domain-reference-speech-signal and a time-frequency-domain-degraded-speech-signal. The time-frequency-domain-reference-speech-signal comprises: an upper-band-reference-component with frequencies that are greater than a frequency-threshold-value; and a lower-band-reference-component with frequencies that are less than the frequency-threshold-value. The time-frequency-domain-degraded-speech-signal comprises: an upper-band-degraded-component with frequencies that are greater than the frequency-threshold-value; and a lower-band-degraded-component with frequencies that are less than the frequency-threshold-value. The speech-signal-processing-circuit comprises: a disturbance calculator configured to determine one or more SBR-features based on the time-frequency-domain-reference-speech-signal and the time-frequency-domain-degraded-speech-signal by: (i) for each of a plurality of frames: determining a reference-ratio based on the ratio of (i) the upper-band-reference-component to (ii) the lower-band-reference-component; determining a degraded-ratio based on the ratio of (i) the upper-band-degraded-component to (ii) the lower-band-degraded-component; and determining a spectral-balance-ratio based on the ratio of the reference-ratio to the degraded-ratio; and (ii) determining the one or more SBR-features based on the spectral-balance-ratio for the plurality of frames.
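
The per-frame ratio computation can be sketched as follows. The sketch assumes each frame is given as a list of power values with a matching list of bin centre frequencies, and that the band components are compared by summed power. The choice of SBR-features (mean and variance of the ratio in dB) is also an assumption, since the abstract does not name the features.

```python
import math


def band_ratio(power, freqs, threshold_hz, eps=1e-12):
    """Upper-band power over lower-band power for one frame."""
    upper = sum(p for p, f in zip(power, freqs) if f > threshold_hz)
    lower = sum(p for p, f in zip(power, freqs) if f < threshold_hz)
    return (upper + eps) / (lower + eps)  # eps guards against empty bands


def spectral_balance_ratios(ref_frames, deg_frames, freqs, threshold_hz):
    """Reference-ratio divided by degraded-ratio, one value per frame."""
    return [band_ratio(r, freqs, threshold_hz) / band_ratio(d, freqs, threshold_hz)
            for r, d in zip(ref_frames, deg_frames)]


def sbr_features(sbrs):
    """Hypothetical features: mean and variance of the ratio in dB."""
    logs = [10.0 * math.log10(s) for s in sbrs]
    mean = sum(logs) / len(logs)
    var = sum((v - mean) ** 2 for v in logs) / len(logs)
    return {"mean_db": mean, "var_db": var}
```

A spectral-balance-ratio above 1 (positive in dB) indicates that the degraded signal has lost relatively more upper-band energy than the reference, for instance after band-limiting.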