G10L15/28

HYBRID LIVE CAPTIONING SYSTEMS AND METHODS

A computer system configured to generate captions is provided. The computer system includes a memory and a processor coupled to the memory. The processor is configured to access a first buffer configured to store text generated by an automated speech recognition (ASR) process; access a second buffer configured to store text generated by a captioning client process; identify either the first buffer or the second buffer as a source buffer of caption text; generate caption text from the source buffer; and communicate the caption text to a target process.
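
The source-buffer selection described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the claimed implementation: the abstract does not say how the source buffer is identified, so the freshness rule used here (prefer the captioning client's text if it was updated recently, otherwise fall back to ASR), the `max_staleness` threshold, and all class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TextBuffer:
    """Holds (timestamp_seconds, text) entries from one caption source."""
    entries: List[Tuple[float, str]] = field(default_factory=list)

    def append(self, timestamp: float, text: str) -> None:
        self.entries.append((timestamp, text))

    def latest_timestamp(self) -> float:
        # An empty buffer is treated as infinitely stale.
        return self.entries[-1][0] if self.entries else float("-inf")

    def drain(self) -> str:
        """Return all buffered text joined by spaces and clear the buffer."""
        text = " ".join(t for _, t in self.entries)
        self.entries.clear()
        return text


def select_source(asr: TextBuffer, human: TextBuffer, now: float,
                  max_staleness: float = 3.0) -> TextBuffer:
    """Assumed rule: use the captioner's buffer if it was updated recently."""
    if now - human.latest_timestamp() <= max_staleness:
        return human
    return asr


def generate_caption(asr: TextBuffer, human: TextBuffer, now: float,
                     max_staleness: float = 3.0) -> str:
    """Identify the source buffer, then produce caption text from it."""
    source = select_source(asr, human, now, max_staleness)
    return source.drain()
```

In this sketch the returned string stands in for the caption text that would be communicated to the target process.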

Joint endpointing and automatic speech recognition

A method includes receiving audio data of an utterance and processing the audio data to obtain, as output from a speech recognition model configured to jointly perform speech decoding and endpointing of utterances: partial speech recognition results for the utterance; and an endpoint indication indicating when the utterance has ended. While processing the audio data, the method also includes detecting, based on the endpoint indication, the end of the utterance. In response to detecting the end of the utterance, the method also includes terminating the processing of any subsequent audio data received after the end of the utterance was detected.
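
The control flow of the method can be sketched as a streaming loop that stops as soon as the model reports an endpoint. This is a minimal sketch under assumptions: the per-frame `step` callable and the stub model are hypothetical stand-ins, since the abstract does not describe the model's interface.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical joint model step: takes one audio frame and returns
# (partial transcript so far, True if the utterance has ended).
ModelStep = Callable[[bytes], Tuple[str, bool]]


def recognize_until_endpoint(frames: Iterable[bytes],
                             step: ModelStep) -> Tuple[List[str], int]:
    """Feed frames to the model until it signals an endpoint.

    Returns the partial results and the number of frames processed.
    """
    partials: List[str] = []
    processed = 0
    for frame in frames:
        partial, ended = step(frame)
        processed += 1
        partials.append(partial)
        if ended:
            break  # frames received after the endpoint are never processed
    return partials, processed


def make_stub_model(script: List[Tuple[str, bool]]) -> ModelStep:
    """Stand-in for a real model: replays a fixed list of (partial, ended) outputs."""
    outputs = iter(script)
    return lambda _frame: next(outputs)
```

Because decoding and endpointing come from the same model output, the loop needs no separate endpointer; the single `ended` flag decides when to stop.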

Pulse density modulation systems and methods
11637546 · 2023-04-25

Systems and methods for programmable pulse density modulation (PDM) components enable backwards compatibility while maintaining reasonable tolerances. A system includes a programmable PDM device, a PDM master device, and a bus communicably coupling the programmable PDM device to the PDM receiver. The PDM device may include an audio sensor, audio input circuitry, a delta-sigma converter, and a PDM transmitter and receiver. The PDM transmitter and receiver may send out PDM data from the PDM device and receive programming data from the PDM master device. The PDM device may further include register space controlled by the PDM master device, a buffer that stores audio data for wakeup word systems while the PDM receiver is powered down, a bus holder to hold the previous value on the bus if no device is driving it, and/or a clock multiplier to multiply the incoming clock by a factor.
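
For readers unfamiliar with how a delta-sigma converter produces PDM data, the following is a textbook first-order modulator written in software. It is offered only as background: the abstract does not specify the converter's order or design, and the function names here are assumptions.

```python
from typing import Iterable, List


def delta_sigma_pdm(samples: Iterable[float]) -> List[int]:
    """First-order delta-sigma modulation of samples in [-1.0, 1.0] into PDM bits."""
    bits: List[int] = []
    integrator = 0.0
    feedback = 0.0  # previous output bit mapped to -1.0 or +1.0 (0.0 before the first bit)
    for x in samples:
        # Accumulate the error between the input and the fed-back output.
        integrator += x - feedback
        # 1-bit quantizer: the sign of the integrator decides the output bit.
        bit = 1 if integrator >= 0.0 else 0
        feedback = 1.0 if bit else -1.0
        bits.append(bit)
    return bits


def pulse_density(bits: List[int]) -> float:
    """Fraction of ones in the bitstream."""
    return sum(bits) / len(bits)
```

The density of ones tracks the input level: a constant input of 0.5 gives a density close to 0.75, and an input of 0.0 gives a density close to 0.5.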

Generating and updating voice-based software applications using application templates
11599336 · 2023-03-07

Systems and methods of generating voice-based software applications are provided. A system can receive, from an application developer computing device, a request to build a voice-based software application. The system can select an application template from a plurality of application templates. The selected application template can include a module that corresponds to a function of the voice-based software application. The system can provide the selected application template to the application developer computing device. The system can receive, from the application developer computing device, an input for a field of the at least one module of the selected application template. The system can generate the voice-based software application based on the selected application template and the input for the at least one field of the at least one module of the selected application template.
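
The template flow above (select a template, collect field inputs for its modules, generate the application) can be sketched as follows. All names, the data layout, and the "first matching template" selection rule are assumptions made for illustration; the abstract does not define them.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple


@dataclass(frozen=True)
class Module:
    function: str              # function of the voice app this module provides
    fields: Tuple[str, ...]    # fields the developer must fill in


@dataclass(frozen=True)
class AppTemplate:
    name: str
    modules: Tuple[Module, ...]


def select_template(templates: Sequence[AppTemplate], function: str) -> AppTemplate:
    """Pick the first template that has a module for the requested function."""
    for template in templates:
        if any(module.function == function for module in template.modules):
            return template
    raise LookupError(f"no template provides {function!r}")


def generate_app(template: AppTemplate, inputs: Dict[str, Dict[str, str]]) -> dict:
    """Combine a template with developer inputs (module function -> field values)."""
    app = {"template": template.name, "modules": {}}
    for module in template.modules:
        supplied = inputs.get(module.function, {})
        missing = [name for name in module.fields if name not in supplied]
        if missing:
            raise ValueError(f"{module.function}: missing fields {missing}")
        app["modules"][module.function] = {name: supplied[name] for name in module.fields}
    return app
```

Here the generated "application" is just a dictionary; a real system would presumably emit deployable code or configuration instead.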

Electronic device and control method thereof
11600275 · 2023-03-07

An electronic device performs voice recognition on a user utterance based on a first voice assistance. The electronic device may receive, from an external device, information on a recognition characteristic of a second voice assistance for the user utterance, and adjust a recognition characteristic of the first voice assistance based on that information.

VOCALLY ACTUATED SURGICAL CONTROL SYSTEM

A method and system for controlling at least one robotically controlled surgical tool via vocal activation include detecting, via a voice sensor, a vocal command generated by at least one surgeon in a surgical setting. The vocal command is converted to operative instructions for the robotically controlled surgical tool according to a predetermined set of rules including at least one of a no-fly zone rule and a collision prevention rule.
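
How a recognized command might be turned into an instruction and checked against the two named rules can be sketched as below. This is a toy illustration only, not the claimed method and not suitable for any real device: the command vocabulary, the box-shaped no-fly zones, the `min_clearance` distance, and all names are assumptions.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float, float]

# Hypothetical vocabulary: spoken phrase -> displacement in millimetres.
COMMANDS = {
    "move left": (-5.0, 0.0, 0.0),
    "move right": (5.0, 0.0, 0.0),
    "move up": (0.0, 0.0, 5.0),
    "move down": (0.0, 0.0, -5.0),
}


@dataclass(frozen=True)
class Box:
    """Axis-aligned no-fly zone given by its low and high corners."""
    lo: Point
    hi: Point

    def contains(self, p: Point) -> bool:
        return all(l <= c <= h for l, c, h in zip(self.lo, p, self.hi))


def plan_move(command: str, tool_pos: Point, no_fly_zones: Sequence[Box],
              other_tools: Sequence[Point],
              min_clearance: float = 10.0) -> Optional[Point]:
    """Return the target position, or None if the command is unknown or a rule forbids it."""
    delta = COMMANDS.get(command.strip().lower())
    if delta is None:
        return None
    target = tuple(p + d for p, d in zip(tool_pos, delta))
    # No-fly zone rule: the target must lie outside every forbidden region.
    if any(zone.contains(target) for zone in no_fly_zones):
        return None
    # Collision prevention rule: keep a minimum distance from other tools.
    if any(math.dist(target, other) < min_clearance for other in other_tools):
        return None
    return target
```

Checking the rules before any instruction is issued means a forbidden command produces no motion at all rather than a partial one.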

SPEAKER ASSEMBLY IN A DISPLAY ASSISTANT DEVICE

In a display assistant device, a speaker is mounted in a waveguide structure which is at least partially disposed beneath a display screen. The waveguide structure is mounted in an exterior housing which includes speaker grills distributed on a plurality of surfaces of the exterior housing, permitting sound waves from the speaker to be projected outside the exterior housing. A cover structure is disposed on top of the waveguide structure to conceal the waveguide structure and speaker within the exterior housing. The cover structure has a tilted bottom surface configured to be suspended above the waveguide structure and to be separated by a first space. Sound waves projected from an upper portion of the speaker are reflected by the tilted bottom surface and are guided through the first space to exit the device from a speaker grill portion located on a rear side of the exterior housing.