Patent classifications
G10L15/22
VOICE CALL CONTROL METHOD AND APPARATUS, COMPUTER-READABLE MEDIUM, AND ELECTRONIC DEVICE
Embodiments of this application provide a real-time voice call control method performed by an electronic device. The method includes: obtaining a mixed call voice in real time during a cloud conference call, where the mixed call voice includes at least one branch voice; determining energy information corresponding to each frequency point of the call voice in a frequency domain; determining an energy proportion of each branch voice at each frequency point in total energy of the frequency point based on the energy information at the frequency point; determining a quantity of branch voices comprised in the call voice based on the energy proportion of each branch voice at each frequency point; and controlling the voice call by setting a call voice control manner based on the quantity of branch voices.
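The counting step in this abstract can be illustrated with a minimal sketch. It assumes the per-branch energy at each frequency point has already been estimated by some upstream model; the abstract does not say how. The function names, the `min_share` threshold, and the mode labels are hypothetical and are not taken from the application.

```python
from typing import List


def energy_proportions(branch_energy: List[List[float]]) -> List[List[float]]:
    """branch_energy[b][k] is the (assumed, pre-estimated) energy of branch b
    at frequency point k. Returns each branch's share of the total energy
    at every frequency point."""
    num_points = len(branch_energy[0])
    totals = [sum(branch[k] for branch in branch_energy) for k in range(num_points)]
    return [
        [branch[k] / totals[k] if totals[k] > 0 else 0.0 for k in range(num_points)]
        for branch in branch_energy
    ]


def count_branch_voices(proportions: List[List[float]], min_share: float = 0.1) -> int:
    """Hypothetical rule: a branch counts as an active voice when its mean
    energy share across all frequency points is at least min_share."""
    return sum(1 for branch in proportions if sum(branch) / len(branch) >= min_share)


def choose_control_mode(num_voices: int) -> str:
    """Map the voice count to an illustrative call control mode label."""
    return "single_speaker" if num_voices <= 1 else "multi_speaker"


if __name__ == "__main__":
    energies = [[4.0, 1.0], [4.0, 3.0], [0.0, 0.0]]  # three candidate branches, two frequency points
    shares = energy_proportions(energies)
    voices = count_branch_voices(shares)
    print(shares, voices, choose_control_mode(voices))
```

A fixed threshold on the mean share is only one possible decision rule; the application may use a different criterion.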
DEVICES AND METHODS FOR AUDITORY REHABILITATION FOR INTERAURAL ASYMMETRY
A device, system and related methods to provide assessment and treatment of amblyaudia through standardized methods that do not require advanced training or a booth with loudspeakers for the operator to administer. The ARIA stimuli protocols for both assessment and treatment, encoded in or to be used by a software program or application, are transferred to a stand-alone set of specialized noise-cancelling headphones that is attached or connected, by wire or wirelessly, to a software platform on an electronic computing device, or that has the platform integrated into the headphones. The program administers assessment tests to individuals through the noise-cancelling headphones. The device enables someone with minimal instruction to administer, automatically or semi-automatically, both assessment and treatment protocols, generate results, make interpretations, store data, and produce reports. The device or system may be loaded with standard protocols for English-speaking individuals, as well as dichotic speech material in any language.
Audio Output Method and Terminal Device
This application provides an audio output method and a terminal device. The method helps to reduce the operation complexity for a user, improve the degree of intelligence of the terminal device, and ultimately improve the user experience. In the method, a second terminal device may send an audio output request to a first terminal device, and a user may operate the first terminal device, or an audio output device connected to the first terminal device, to trigger the second terminal device to obtain audio data corresponding to the audio output request. When the audio output device is a wireless audio output device, the second terminal device may establish a communication link with the wireless audio output device and control it to output a first audio signal corresponding to the audio data. When the audio output device is a wired audio output device, the first terminal device may obtain the audio data from the second terminal device and control the audio output device connected to the first terminal device to output a first audio signal corresponding to the audio data.
In-Vehicle Speech Interaction Method and Device
An in-vehicle speech interaction method and a device are provided. The method includes: obtaining user speech information; determining a user instruction based on the user speech information; determining, based on the user instruction, whether response content to the user instruction is privacy-related; and determining, based on whether the response content is privacy-related, whether to output the response content in a privacy protection mode, to protect privacy from being leaked.
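The decision flow in this abstract (speech, then instruction, then privacy check, then output mode) can be sketched as follows. The keyword matching, the intent names, and the `PRIVATE_INTENTS` set are all hypothetical stand-ins; the abstract does not describe how an instruction is recognized or how privacy relevance is judged.

```python
# Hypothetical set of instructions whose responses are treated as private.
PRIVATE_INTENTS = {"read_messages", "show_call_log", "read_calendar"}


def parse_instruction(speech_text: str) -> str:
    """Toy keyword-based mapping from recognized speech to an instruction name."""
    text = speech_text.lower()
    if "message" in text:
        return "read_messages"
    if "call log" in text or "recent calls" in text:
        return "show_call_log"
    if "calendar" in text:
        return "read_calendar"
    if "weather" in text:
        return "get_weather"
    return "unknown"


def is_privacy_related(instruction: str) -> bool:
    """Decide whether the response to this instruction is privacy-related."""
    return instruction in PRIVATE_INTENTS


def choose_output_mode(speech_text: str) -> str:
    """Return 'privacy_protection' for private responses, otherwise 'normal'."""
    instruction = parse_instruction(speech_text)
    return "privacy_protection" if is_privacy_related(instruction) else "normal"


if __name__ == "__main__":
    print(choose_output_mode("Read my new messages"))
    print(choose_output_mode("What is the weather today"))
```

In a real system the privacy protection mode might route the response to a private display or headset; that behaviour is not specified here and is left out of the sketch.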
SPEECH RECOGNITION APPARATUS, METHOD AND PROGRAM
A score integration unit 7 obtains a new score $\mathrm{Score}(l^{b}_{1:n}, c)$ that integrates a score $\mathrm{Score}(l^{b}_{1:n}, c)$ and a score $\mathrm{Score}(w^{b}_{1:o}, c)$. This new score $\mathrm{Score}(l^{b}_{1:n}, c)$ becomes a score $\mathrm{Score}(l^{b}_{1:n})$ in a hypothesis selection unit 8. Thus, the score $\mathrm{Score}(l^{b}_{1:n})$ can be said to take into account the score $\mathrm{Score}(w^{b}_{1:o}, c)$. In a speech recognition apparatus, first information is extracted on the basis of the score $\mathrm{Score}(l^{b}_{1:n})$ taking into account the score $\mathrm{Score}(w^{b}_{1:o}, c)$. Thus, speech recognition with higher performance than that in the related art can be achieved.
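A minimal sketch of score integration followed by hypothesis selection is given below. The abstract does not state how the two scores are combined; the weighted sum of log-domain scores used here is an assumption (a common choice in beam-search rescoring), and the function names, the `weight` parameter, and the example values are hypothetical.

```python
from typing import List, Tuple


def integrate_scores(score_l: float, score_w: float, weight: float = 0.5) -> float:
    """Assumed integration rule: add the second score to the first,
    scaled by an interpolation weight (both treated as log-domain scores)."""
    return score_l + weight * score_w


def select_hypotheses(
    hypotheses: List[Tuple[str, float, float]],
    beam_size: int,
    weight: float = 0.5,
) -> List[Tuple[str, float]]:
    """Each hypothesis is (label, score_l, score_w). Rescore every hypothesis
    with the integrated score and keep the beam_size best ones."""
    rescored = [
        (label, integrate_scores(score_l, score_w, weight))
        for label, score_l, score_w in hypotheses
    ]
    rescored.sort(key=lambda item: item[1], reverse=True)
    return rescored[:beam_size]


if __name__ == "__main__":
    beam = [("a", -1.0, -4.0), ("b", -1.5, -1.0), ("c", -3.0, -0.5)]
    print(select_hypotheses(beam, beam_size=2))
```

In this example hypothesis "a" has the best first score on its own, but "b" is ranked first once the second score is taken into account, which is the effect the abstract attributes to the integrated score.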
MULTIMODAL SPEECH RECOGNITION METHOD AND SYSTEM, AND COMPUTER-READABLE STORAGE MEDIUM
The disclosure provides a multimodal speech recognition method and system, and a computer-readable storage medium. The method includes calculating a first logarithmic mel-frequency spectral coefficient and a second logarithmic mel-frequency spectral coefficient when a target millimeter-wave signal and a target audio signal both contain speech information corresponding to a target user; inputting the first and the second logarithmic mel-frequency spectral coefficient into a fusion network to determine a target fusion feature, where the fusion network includes at least a calibration module and a mapping module, the calibration module is configured to perform mutual feature calibration on the target audio/millimeter-wave signals, and the mapping module is configured to fuse a calibrated millimeter-wave feature and a calibrated audio feature; and inputting the target fusion feature into a semantic feature network to determine a speech recognition result corresponding to the target user. The disclosure can implement high-accuracy speech recognition.
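The feature extraction step named in this abstract can be sketched as below: a log-mel computation applied to one frame's power spectrum, which would be run once for the audio signal and once for the millimeter-wave signal. The fusion and semantic networks are not sketched. The mel formula is the widely used 2595·log10(1 + f/700) variant; the filter count, bin count, sample rate, and function names are assumptions, not values from the disclosure.

```python
import math
from typing import List


def hz_to_mel(hz: float) -> float:
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel: float) -> float:
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(num_filters: int, num_bins: int, sample_rate: float) -> List[List[float]]:
    """Triangular filters over num_bins frequency bins spaced linearly
    from 0 Hz to sample_rate / 2, with filter edges equally spaced in mel."""
    max_mel = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(max_mel * i / (num_filters + 1)) for i in range(num_filters + 2)]
    bin_hz = [(sample_rate / 2.0) * k / (num_bins - 1) for k in range(num_bins)]
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def log_mel(power_spectrum: List[float], bank: List[List[float]], floor: float = 1e-10) -> List[float]:
    """Apply each mel filter to the power spectrum and take the natural log,
    flooring the energy to avoid log(0)."""
    return [
        math.log(max(sum(w * p for w, p in zip(row, power_spectrum)), floor))
        for row in bank
    ]


if __name__ == "__main__":
    bank = mel_filterbank(num_filters=4, num_bins=33, sample_rate=16000.0)
    audio_frame_power = [1.0] * 33    # placeholder power spectrum for an audio frame
    mmwave_frame_power = [0.5] * 33   # placeholder power spectrum for a millimeter-wave frame
    print(log_mel(audio_frame_power, bank))
    print(log_mel(mmwave_frame_power, bank))
```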