Patent classifications
G10L15/22
MULTIMODAL SPEECH RECOGNITION METHOD AND SYSTEM, AND COMPUTER-READABLE STORAGE MEDIUM
The disclosure provides a multimodal speech recognition method and system, and a computer-readable storage medium. The method includes: calculating a first logarithmic mel-frequency spectral coefficient and a second logarithmic mel-frequency spectral coefficient when a target millimeter-wave signal and a target audio signal both contain speech information corresponding to a target user; inputting the first and second logarithmic mel-frequency spectral coefficients into a fusion network to determine a target fusion feature, where the fusion network includes at least a calibration module and a mapping module, the calibration module is configured to perform mutual feature calibration on the target audio signal and the target millimeter-wave signal, and the mapping module is configured to fuse a calibrated millimeter-wave feature and a calibrated audio feature; and inputting the target fusion feature into a semantic feature network to determine a speech recognition result corresponding to the target user. The disclosure can implement high-accuracy speech recognition.
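The calibrate-then-fuse structure described above can be illustrated with a toy sketch. The abstract does not say how the calibration or mapping modules are built, so everything here is an assumption made for illustration: the function names `mutual_calibrate` and `fuse` are hypothetical, the "calibration" is a simple sigmoid gate driven by the other modality's mean, and the "mapping" is plain concatenation. A real fusion network would learn these operations.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def mutual_calibrate(audio_feat, mmwave_feat):
    """Hypothetical mutual calibration: each modality is scaled by a gate
    computed from the mean of the OTHER modality's feature vector."""
    audio_gate = sigmoid(sum(mmwave_feat) / len(mmwave_feat))
    mmwave_gate = sigmoid(sum(audio_feat) / len(audio_feat))
    calibrated_audio = [a * audio_gate for a in audio_feat]
    calibrated_mmwave = [m * mmwave_gate for m in mmwave_feat]
    return calibrated_audio, calibrated_mmwave


def fuse(calibrated_audio, calibrated_mmwave):
    """Hypothetical mapping step: concatenate the two calibrated vectors."""
    return calibrated_audio + calibrated_mmwave


# Stand-ins for log-mel coefficients of the two signals (made-up numbers).
audio_log_mel = [2.0, 4.0]
mmwave_log_mel = [0.0, 0.0]
cal_audio, cal_mmwave = mutual_calibrate(audio_log_mel, mmwave_log_mel)
fused = fuse(cal_audio, cal_mmwave)
```

In this sketch the fused vector would be what a downstream semantic feature network consumes; the gate design and the concatenation are placeholders, not the disclosed architecture.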
SPEECH RECOGNITION IN A VEHICLE
An audio sample including speech and ambient sounds is transmitted to a vehicle computer. Recorded audio and speech recognized from the recorded audio are received from the vehicle computer, the recorded audio including the audio sample as broadcast and recorded by the vehicle computer. The recognized speech and text of the speech are input to a machine learning program that outputs whether the recognized speech matches the text. When the output from the machine learning program indicates that the recognized speech does not match the text, the recognized speech and the text are included in a training dataset for the machine learning program.
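The final step, collecting mismatched pairs into a training dataset, can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract uses a machine learning program to decide whether recognized speech matches the text, whereas the hypothetical `matches` function below substitutes a plain normalized string comparison, and `collect_mismatches` is likewise an invented name.

```python
def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def matches(recognized: str, reference: str) -> bool:
    """Stand-in for the machine learning program's match decision
    (assumption: a simple normalized string comparison)."""
    return normalize(recognized) == normalize(reference)


def collect_mismatches(pairs):
    """Return the (recognized, reference) pairs that do not match,
    i.e. the candidates for the training dataset."""
    return [(rec, ref) for rec, ref in pairs if not matches(rec, ref)]


# Made-up example pairs of (recognized speech, reference text).
pairs = [
    ("Turn on the radio", "turn on  the radio"),
    ("call tom", "call mom"),
]
training_dataset = collect_mismatches(pairs)
```

Only the second pair ends up in `training_dataset`, since the first differs from its reference only in case and spacing.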
SYSTEMS AND METHODS FOR FACILITATING STREAMING IN A LOCAL NETWORK WITH MULTIPLE SUBNETS
Systems, methods, and non-transitory, machine-readable media to facilitate streaming in a local network are disclosed. A primary media device may be configured to: operate as a server in a local network, receive audio/video (A/V) content, and provide the A/V content to a first display. A secondary media device may be communicatively connected to the primary media device and may be configured to: operate as a client with respect to the primary media device in the local network, receive the A/V content from the primary media device, and provide the A/V content to a second display. The primary media device and the secondary media device may use multiple subnets in the local network. The primary media device and/or the secondary media device may select a first subnet of the multiple subnets to use based at least in part on a type of content to communicate via the first subnet.
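Selecting a subnet by content type could look roughly like the sketch below. The subnet ranges, the content-type labels, and the names `SUBNET_BY_CONTENT`, `DEFAULT_SUBNET`, `select_subnet`, and `address_in_selected_subnet` are all assumptions made for illustration; the abstract does not specify any of them. Only Python's standard `ipaddress` module is used.

```python
import ipaddress

# Hypothetical mapping from content type to a subnet of the local network.
SUBNET_BY_CONTENT = {
    "video": ipaddress.ip_network("192.168.10.0/24"),
    "control": ipaddress.ip_network("192.168.20.0/24"),
}
# Hypothetical fallback for content types without a dedicated subnet.
DEFAULT_SUBNET = ipaddress.ip_network("192.168.1.0/24")


def select_subnet(content_type: str):
    """Pick the subnet to use for a given type of content."""
    return SUBNET_BY_CONTENT.get(content_type, DEFAULT_SUBNET)


def address_in_selected_subnet(address: str, content_type: str) -> bool:
    """Check whether an address belongs to the subnet chosen for the content."""
    return ipaddress.ip_address(address) in select_subnet(content_type)
```

For example, `select_subnet("video")` returns the `192.168.10.0/24` network, and an unknown content type falls back to `DEFAULT_SUBNET`.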
APPENDING ASSISTANT SERVER REQUESTS, FROM A CLIENT ASSISTANT, WITH PROACTIVELY-AGGREGATED PERIPHERAL DEVICE DATA
Implementations relate to proactively aggregating client device data to append to client assistant data that is communicated to a server device in response to a user request to a client automated assistant. When a user request that is associated with, for example, a peripheral client device, is received at a client device, the client device can communicate, to a server device, data that embodies the user request (e.g., audio data and/or local speech recognition data), along with peripheral device data that was received before the client device received the user request. In this way, the client automated assistant can bypass expressly soliciting peripheral device data each time a user request is received at another client device. Instead, a peripheral device can proactively communicate device data to a client device so that the device data can be appended to request data communicated to the server device from a particular client device.
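The proactive-aggregation idea can be sketched as a small cache on the client. This is an illustrative sketch only: the class `ClientAssistant`, its methods, and the payload field names are hypothetical and do not come from the abstract.

```python
class ClientAssistant:
    """Hypothetical client assistant that caches peripheral device data
    as it arrives, before any user request is received."""

    def __init__(self):
        self._peripheral_data = {}

    def on_peripheral_update(self, device_id: str, data: dict) -> None:
        """Called when a peripheral proactively pushes its latest data."""
        self._peripheral_data[device_id] = data

    def build_server_request(self, request_data: dict) -> dict:
        """Append the already-aggregated peripheral data to the request,
        so nothing has to be solicited at request time."""
        return {
            "request": request_data,
            "peripheral_data": dict(self._peripheral_data),
        }


assistant = ClientAssistant()
# The peripheral reports first, before the user says anything.
assistant.on_peripheral_update("thermostat", {"temp_c": 21})
# Later, a user request arrives and the cached data is attached to it.
payload = assistant.build_server_request({"transcript": "make it warmer"})
```

The point of the design is ordering: the update happens before the request, so building the server payload needs no extra round trip to the peripheral.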
PROVIDING RELEVANT INFORMATION DURING ONLINE MEETINGS
One disclosed method involves: determining, by at least one computing system and based at least in part on input provided to a meeting application, at least a first topic of interest for a first user accessing the meeting application via a first client device; in response to determining the first topic of interest, querying, by the at least one computing system, at least one data source, external to the meeting application, for information corresponding to the first topic of interest; and causing, by the at least one computing system, the first client device to display a representation of the information.
Voice Control System for Recreational Vehicles
A voice control system for recreational vehicles controls safe operation of deployable components of the RV, such as an antenna, awning or room extension. A voice recognition system is employed to interpret voice commands and control operation of the deployable components in response. A control system monitors the status of the RV and its components to detect unsafe conditions relating to operation of the deployable components. For example, this can be done via a wired or wireless network of sensors detecting the state of the RV and its deployable components. The control system can also monitor the status of the RV via the data bus built into the RV. If an unsafe condition is detected, the safety manager of the control system triggers a warning or activates a predetermined interlock to prevent unsafe operation of the RV and its deployable components.
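The safety check that sits between a recognized voice command and the deployable component can be sketched as below. All specifics are assumptions for illustration: the two-word command grammar, the status keys `vehicle_moving` and `wind_speed_kph`, the 30 km/h threshold, and the function names are hypothetical and not taken from the abstract.

```python
# Hypothetical set of deployable components and a made-up wind limit.
DEPLOYABLE = {"antenna", "awning", "room_extension"}
MAX_AWNING_WIND_KPH = 30.0


def check_deploy(component: str, status: dict):
    """Return (allowed, reason) for extending a component given RV status."""
    if status.get("vehicle_moving", False):
        return (False, "vehicle is moving")
    if component == "awning" and status.get("wind_speed_kph", 0.0) > MAX_AWNING_WIND_KPH:
        return (False, "wind too strong for awning")
    return (True, "ok")


def handle_voice_command(command_text: str, status: dict) -> str:
    """Interpret an '<action> <component>' command and apply the interlock."""
    words = command_text.lower().split()
    if len(words) != 2 or words[0] not in ("extend", "retract") or words[1] not in DEPLOYABLE:
        return "unrecognized command"
    action, component = words
    if action == "extend":
        allowed, reason = check_deploy(component, status)
        if not allowed:
            return f"blocked: {reason}"
    return f"{action} {component}"
```

In this sketch only extension is gated; retraction is always allowed, which is one possible policy rather than something the abstract states.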