G10L15/12

System and method of automated evaluation of transcription quality
10147418 · 2018-12-04

Systems and methods automatically evaluate transcription quality. Audio data is obtained. The audio data is segmented into a plurality of utterances with a voice activity detector operating on a computer processor. The plurality of utterances are transcribed into at least one word lattice with a large vocabulary continuous speech recognition system operating on the processor. A minimum Bayes risk decoder is applied to the at least one word lattice to create at least one confusion network. At least one conformity ratio is calculated from the at least one confusion network.
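
The segmentation step can be illustrated with a minimal sketch. The abstract does not say how the voice activity detector decides what counts as speech, so the frame-energy rule below, along with the names `segment_utterances`, `frame_len`, and `threshold`, is an assumption made only for illustration.

```python
def segment_utterances(samples, frame_len=4, threshold=0.01):
    """Return (start, end) sample-index pairs for runs of high-energy frames.

    Assumption: a frame counts as speech when its mean squared amplitude
    is at least `threshold`. Real detectors are considerably more elaborate.
    """
    segments = []
    start = None
    n_frames = len(samples) // frame_len
    for f in range(n_frames):
        frame = samples[f * frame_len:(f + 1) * frame_len]
        energy = sum(x * x for x in frame) / frame_len
        if energy >= threshold and start is None:
            start = f * frame_len                    # an utterance begins
        elif energy < threshold and start is not None:
            segments.append((start, f * frame_len))  # the utterance ends
            start = None
    if start is not None:
        # Audio ended while still inside an utterance.
        segments.append((start, n_frames * frame_len))
    return segments
```

Each returned pair marks one utterance that would then be passed to the recognizer; the lattice, confusion network, and conformity ratio steps are not covered by this sketch.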

System and method for call progress detection

A contact center includes an outbound server to make a call to a callee and a media device. The media device receives an audio signal based on the call, determines a Mel-frequency cepstral coefficient for the received audio signal, and matches that coefficient to a Mel-frequency cepstral coefficient for a pre-recorded carrier message. The media device can determine a content of the audio signal based on the match.
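
The matching step might look roughly like the sketch below, which assumes the coefficients have already been computed and are available as plain lists of floats. The nearest-template rule, the Euclidean distance, and the names `match_carrier_message`, `templates`, and `max_distance` are illustrative assumptions; the abstract does not specify how the match is scored.

```python
import math

def match_carrier_message(mfcc, templates, max_distance=1.0):
    """Return the label of the closest stored carrier-message template.

    `templates` maps a label to a reference MFCC vector (hypothetical data).
    Returns None when no template is within `max_distance` (assumed cutoff).
    """
    best_label, best_dist = None, math.inf
    for label, ref in templates.items():
        dist = math.dist(mfcc, ref)  # Euclidean distance between the vectors
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= max_distance else None
```

A returned label such as a busy or disconnected notice would then stand for the detected content of the audio signal.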

SPEECH RECOGNITION BY SELECTING AND REFINING HOT WORDS

Speech recognition is performed by receiving a speech signal that includes spoken phones. A dynamic time warping procedure is applied to the received speech signal to generate a time-warped signal. The time-warped signal is compared to a plurality of stored reference patterns to identify a set of stored reference patterns that are most similar to the time-warped signal. A candidate hot word is selected from a list using the identified set of stored reference patterns. The selection of the candidate hot word is then refined.
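
For readers unfamiliar with dynamic time warping, the following is a sketch of the textbook formulation, not necessarily the procedure claimed here: it aligns two sequences of scalar features and ranks stored reference patterns by alignment cost. The scalar features and the names `dtw_distance`, `most_similar`, and `k` are assumptions.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping cost between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] also matches an earlier b
                                 cost[i][j - 1],      # b[j-1] also matches an earlier a
                                 cost[i - 1][j - 1])  # both sequences advance together
    return cost[n][m]

def most_similar(signal, references, k=2):
    """Return the names of the k reference patterns with the lowest DTW cost."""
    ranked = sorted(references, key=lambda name: dtw_distance(signal, references[name]))
    return ranked[:k]
```

The returned names play the role of the identified set of reference patterns from which a candidate hot word would be selected; the refinement step is outside this sketch.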

AUTOMATED ASSISTANT DATA FLOW

A system transforms queries for each dialogue domain into constraint graphs, including both constraints explicitly provided by the user and implicit constraints that are inherent to the domain. Once all the domain-specific constraints have been collected into a graph, general-purpose domain-independent algorithms can be used to draw inferences for both intent disambiguation and constraint propagation. Given a candidate interpretation of a user utterance as the posting, modification, or retraction of a constraint, constraint inference techniques such as arc consistency and satisfiability checking can be used to answer questions. The underlying engine can also handle soft constraints, in cases where the constraint may be violated for some cost or in cases where there are different degrees of violations.
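
Arc consistency can be illustrated with the standard AC-3 algorithm. The sketch below is a generic version over hard binary constraints only; the data layout (`domains` as sets, `constraints` keyed by directed variable pairs) and the travel example are assumptions, and soft constraints are not modelled.

```python
from collections import deque

def ac3(domains, constraints):
    """Prune `domains` in place until every arc is consistent.

    domains: variable -> set of candidate values
    constraints: (x, y) -> predicate(value_of_x, value_of_y)
    Returns False if some domain becomes empty (no consistent assignment).
    """
    queue = deque(constraints)
    while queue:
        x, y = queue.popleft()
        allowed = constraints[(x, y)]
        # Values of x with no supporting value left in y's domain.
        unsupported = {vx for vx in domains[x]
                       if not any(allowed(vx, vy) for vy in domains[y])}
        if unsupported:
            domains[x] -= unsupported
            if not domains[x]:
                return False
            # x changed, so arcs pointing at x must be rechecked.
            queue.extend(arc for arc in constraints
                         if arc[1] == x and arc[0] != y)
    return True

# Hypothetical domain: departure must be earlier than arrival.
domains = {"depart": {8, 10, 12}, "arrive": {9, 11}}
constraints = {
    ("depart", "arrive"): lambda d, a: d < a,
    ("arrive", "depart"): lambda a, d: d < a,
}
consistent = ac3(domains, constraints)
```

In this example the departure value 12 is pruned because no arrival value can follow it, which is the kind of inference a dialogue engine could use to rule out an interpretation of the user's request.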

ANALOG-DIGITAL CONVERTER AND ANALOG-TO-DIGITAL CONVERSION METHOD
20180254781 · 2018-09-06

The present invention discloses an ADC and an analog-to-digital conversion method. The ADC includes: a clock generator, including M transmission gates, where the M transmission gates are configured to receive a first clock signal that is periodically sent and separately perform gating control on the first clock signal, so as to generate M second clock signals, where M is an integer greater than or equal to 2; M ADC channels that are configured in a time-interleaved manner and configured to receive one analog signal and separately perform, under the control of the M second clock signals, sampling and analog-to-digital conversion on the analog signal, so as to obtain M digital signals, where each ADC channel corresponds to one clock signal of the M second clock signals; and an adder, configured to add the M digital signals together in the digital domain, so as to obtain a digital output signal.
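
A small behavioural simulation may help make the data flow concrete. It is only a sketch under several assumptions not stated in the abstract: the gates pass the first clock in round-robin order, a channel outputs zero on ticks where its gated clock is low, and each channel is an ideal uniform quantizer. The names `quantize`, `interleaved_adc`, `bits`, and `v_ref` are hypothetical.

```python
def quantize(v, bits=8, v_ref=1.0):
    """Ideal uniform quantizer mapping [0, v_ref] to integer codes (assumed model)."""
    levels = (1 << bits) - 1
    clipped = min(max(v, 0.0), v_ref)
    return round(clipped / v_ref * levels)

def interleaved_adc(analog, m=2, bits=8):
    """Simulate M time-interleaved channels whose outputs are summed by an adder.

    `analog` holds one input value per tick of the first clock.
    """
    ticks = range(len(analog))
    # Gating: second clock k is high only on ticks where tick % m == k.
    clocks = [[tick % m == k for tick in ticks] for k in range(m)]
    # Each channel converts the input only while its own clock is high.
    channels = [[quantize(v, bits) if gate else 0
                 for v, gate in zip(analog, clk)]
                for clk in clocks]
    # Adder: combine the M digital signals tick by tick.
    return [sum(codes) for codes in zip(*channels)]
```

Under these assumptions, each tick has exactly one active channel, so the summed output equals a full-rate conversion while each channel runs at 1/M of that rate.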

Keyword spotting system for achieving low-latency keyword recognition by using multiple dynamic programming tables reset at different frames of acoustic data input and related keyword spotting method
10032449 · 2018-07-24

A keyword spotting system includes a decoder having a storage device and a decoding circuit. The storage device is used to store a log-likelihood table and a plurality of dynamic programming (DP) tables generated for recognition of a designated keyword. The decoding circuit is used to refer to features in one frame of an acoustic data input to calculate the log-likelihood table and refer to at least the log-likelihood table to adjust each of the DP tables when recognition of the designated keyword is not accepted yet, where the DP tables are reset by the decoding circuit at different frames of the acoustic data input, respectively.
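
The idea of several DP tables reset at different frames can be sketched as follows. This is an illustrative reading, not the patented decoder: it assumes a left-to-right keyword model, a periodic reset schedule with evenly staggered offsets, and acceptance when the final-state score reaches a fixed threshold. The names `spot_keyword`, `num_tables`, `reset_period`, and `threshold` are hypothetical.

```python
NEG_INF = float("-inf")

def spot_keyword(frames_loglik, num_tables=2, reset_period=4, threshold=-2.0):
    """Return the first frame index at which the keyword is accepted, else None.

    frames_loglik: per frame, a list of log-likelihoods, one per keyword state
    (this plays the role of the log-likelihood table).
    """
    num_states = len(frames_loglik[0])
    tables = [None] * num_tables  # each table: best path score ending in each state
    # Stagger the reset frames so the tables start at different times.
    offsets = [k * reset_period // num_tables for k in range(num_tables)]
    for t, loglik in enumerate(frames_loglik):
        for k in range(num_tables):
            if t % reset_period == offsets[k]:
                # Reset: a fresh path may only begin in the first state.
                tables[k] = [loglik[0]] + [NEG_INF] * (num_states - 1)
            elif tables[k] is None:
                continue  # this table has not reached its first reset yet
            else:
                prev = tables[k]
                # Either stay in the same state or advance from the previous one.
                tables[k] = [loglik[0] + prev[0]] + [
                    loglik[s] + max(prev[s], prev[s - 1])
                    for s in range(1, num_states)
                ]
            if tables[k][-1] >= threshold:
                return t
    return None
```

With a single table, a keyword that begins between resets is scored together with the preceding noise; a second table reset at a different frame can pick it up sooner, which is one way to read the low-latency motivation in the title.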

Processing acoustic sequences using long short-term memory (LSTM) neural networks that include recurrent projection layers
10026397 · 2018-07-17

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating phoneme representations of acoustic sequences using projection sequences. One of the methods includes receiving an acoustic sequence, the acoustic sequence representing an utterance, and the acoustic sequence comprising a respective acoustic feature representation at each of a plurality of time steps; for each of the plurality of time steps, processing the acoustic feature representation through each of one or more long short-term memory (LSTM) layers; and for each of the plurality of time steps, processing the recurrent projected output generated by the highest LSTM layer for the time step using an output layer to generate a set of scores for the time step.
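
A minimal forward pass through stacked LSTM layers with recurrent projection might look like the sketch below. It is an illustration under assumptions: weights are random, biases and peephole connections are omitted, the output layer is a softmax, and the names `LSTMPLayer`, `score_sequence`, and `w_out` are hypothetical. The key point it shows is that the projected output, not the full cell output, is fed back at the next time step and passed up to the next layer.

```python
import math
import random

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(v):
    top = max(v)
    exps = [math.exp(a - top) for a in v]
    total = sum(exps)
    return [e / total for e in exps]

class LSTMPLayer:
    """LSTM layer with a recurrent projection (no biases or peepholes)."""

    def __init__(self, n_in, n_cell, n_proj, rng):
        def mat(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
        # One matrix per gate over the concatenation [input, previous projection].
        self.w = {gate: mat(n_cell, n_in + n_proj) for gate in "ifoc"}
        self.w_proj = mat(n_proj, n_cell)
        self.n_cell, self.n_proj = n_cell, n_proj

    def run(self, sequence):
        c = [0.0] * self.n_cell   # cell state
        r = [0.0] * self.n_proj   # recurrent projected output
        outputs = []
        for x in sequence:
            z = list(x) + r
            i = [sigmoid(a) for a in matvec(self.w["i"], z)]
            f = [sigmoid(a) for a in matvec(self.w["f"], z)]
            o = [sigmoid(a) for a in matvec(self.w["o"], z)]
            g = [math.tanh(a) for a in matvec(self.w["c"], z)]
            c = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
            h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
            r = matvec(self.w_proj, h)  # project down before recurrence
            outputs.append(r)
        return outputs

def score_sequence(sequence, layers, w_out):
    """Run the stack, then score the top layer's projection at each time step."""
    for layer in layers:
        sequence = layer.run(sequence)
    return [softmax(matvec(w_out, r)) for r in sequence]
```

Because the projection is smaller than the cell state, the recurrent and inter-layer weight matrices shrink accordingly, which is the usual motivation for this design.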