IPIQ

G06F16/3343

System and method for phonetic search over speech recordings

10019514 · 2018-07-10 ·

Nice Ltd.

A system and method for searching for an element in speech related documents may include transcribing a set of speech recordings to a set of phoneme strings and including the phoneme strings in a set of phonetic transcriptions. A system and method may reverse-index the phonetic transcriptions according to one or more phonemes such that the one or more phonemes can be used as a search key for searching the phoneme in the phonetic transcriptions. A system and method may transcribe a textual search term into a set of search phoneme strings and use the set of search phoneme strings to search for an element in the set of phonetic transcriptions.
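The reverse-indexing idea in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the toy lexicon, the trigram key size, and the names `to_phonemes`, `build_index`, and `search` are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical toy lexicon standing in for a real text-to-phoneme step.
TOY_LEXICON = {
    "cancel": ["K", "AE", "N", "S", "AH", "L"],
    "account": ["AH", "K", "AW", "N", "T"],
    "my": ["M", "AY"],
}


def to_phonemes(text):
    """Transcribe text into a flat phoneme list using the toy lexicon."""
    phonemes = []
    for word in text.lower().split():
        phonemes.extend(TOY_LEXICON[word])
    return phonemes


def build_index(transcriptions, n=3):
    """Map each phoneme n-gram to the (recording id, offset) pairs where it occurs."""
    index = defaultdict(list)
    for rec_id, phonemes in transcriptions.items():
        for i in range(len(phonemes) - n + 1):
            index[tuple(phonemes[i:i + n])].append((rec_id, i))
    return index


def search(index, transcriptions, term, n=3):
    """Return (recording id, offset) pairs where the term's phonemes occur exactly."""
    query = to_phonemes(term)
    if len(query) < n:
        raise ValueError("search term is shorter than the index key size")
    hits = []
    # The first n-gram of the query is the search key; each candidate is then verified.
    for rec_id, offset in index.get(tuple(query[:n]), []):
        if transcriptions[rec_id][offset:offset + len(query)] == query:
            hits.append((rec_id, offset))
    return hits


# Stand-in for phonetic transcriptions of two speech recordings.
transcriptions = {
    "call_1": to_phonemes("cancel my account"),
    "call_2": to_phonemes("my account"),
}
index = build_index(transcriptions)
print(search(index, transcriptions, "account"))
```

The index lookup narrows the search to candidate offsets, so only those positions are compared against the full search phoneme string.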

SYSTEM AND METHOD FOR DETECTING PHONETICALLY SIMILAR IMPOSTER PHRASES

20180182378 · 2018-06-28 ·

A system and method for detecting phonetically similar imposter phrases may include using automatic speech recognition (ASR) to search for a first phrase in a set of objects; producing a list of references by searching for the first phrase in the set of objects using phonetic search; using output produced by the ASR to determine whether or not a reference in the list points to a phrase that is the same as the first phrase; and if it is determined that the reference points to a second phrase that is different from the first phrase then marking the second phrase as a potential cause for a phrase search false positive.
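The comparison step described above might look something like the sketch below. It assumes the phonetic search has already produced a list of references and that the ASR output can be looked up by reference; the function name `find_imposters` and the sample data are hypothetical.

```python
def find_imposters(target_phrase, phonetic_hits, asr_text_at):
    """Return ASR phrases that a phonetic search matched but that differ from the target."""
    target = target_phrase.strip().lower()
    imposters = set()
    for ref in phonetic_hits:
        recognized = asr_text_at[ref].strip().lower()
        if recognized != target:
            # The phonetic search hit this location, but ASR heard something else:
            # mark it as a potential cause of phrase-search false positives.
            imposters.add(recognized)
    return sorted(imposters)


# Hypothetical references (recording id, time in seconds) from a phonetic search.
phonetic_hits = [("call_1", 12.4), ("call_2", 3.0), ("call_3", 47.9)]

# Hypothetical ASR output at each referenced location.
asr_text_at = {
    ("call_1", 12.4): "recognize speech",
    ("call_2", 3.0): "wreck a nice beach",
    ("call_3", 47.9): "recognize speech",
}

print(find_imposters("recognize speech", phonetic_hits, asr_text_at))
```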

Increasing user engagement through query suggestion

12170085 · 2024-12-17 ·

ADEIA GUIDES INC.

Systems and methods are presented herein for increasing user engagement with an interface by suggesting commands or queries for the user. A plurality of content items available for consumption are identified and metadata for each of the plurality of content items is retrieved. One or more candidate voice commands are generated from a plurality of voice command templates, based on a target verb and a subset of the metadata corresponding to the plurality of content items available for consumption. A recall score is generated for each candidate voice command based at least in part on a detection of phonetic features that match between clauses of each candidate voice command. At least the candidate voice command with the highest recall score is selected and output using a suggestion system.
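As a rough illustration of scoring candidates by matching phonetic features, the sketch below awards a point when the verb clause and the title clause share a first phoneme and another when they share a last phoneme. The abstract does not say which phonetic features are used, so this scoring rule, the phoneme tables, and the names `recall_score` and `best_command` are assumptions.

```python
# Hypothetical phoneme lookups; a real system would use a pronunciation lexicon
# and take titles from content metadata.
VERB_PHONEMES = {
    "watch": ["W", "AA", "CH"],
}
TITLE_PHONEMES = {
    "Paddington": ["P", "AE", "D", "IH", "NG", "T", "AH", "N"],
    "Westworld": ["W", "EH", "S", "T", "W", "ER", "L", "D"],
}


def recall_score(verb, title):
    """Assumed scoring: 1 point for a shared first phoneme, 1 for a shared last phoneme."""
    v, t = VERB_PHONEMES[verb], TITLE_PHONEMES[title]
    return int(v[0] == t[0]) + int(v[-1] == t[-1])


def best_command(verb, titles, template="{verb} {title}"):
    """Fill the template for each title and return the highest-scoring command."""
    best_title = max(titles, key=lambda title: recall_score(verb, title))
    return template.format(verb=verb, title=best_title)


print(best_command("watch", ["Paddington", "Westworld"]))
```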

Phoneme-based text transcription searching

12165647 · 2024-12-10 ·

Microsoft Technology Licensing, LLC

Yuchen LI

A computer-implemented method is disclosed. A search query of a text transcription is received. The search query includes a word or words having a specified spelling. A sequence of search phonemes corresponding to the specified spelling is generated. A sequence of transcript phonemes corresponding to the text transcription is generated from the text transcription. A search alignment is generated in which the sequence of search phonemes is aligned to a transcript phoneme fragment. Based at least on the search alignment having a quality score exceeding a quality score threshold, the transcript phoneme fragment and an associated portion of the text transcription are determined to result from an utterance of the specified spelling in an audio session corresponding to the text transcription. A search result is output indicating that the transcript phoneme fragment and the associated portion of the text transcription are determined to have resulted from the utterance.
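One way to picture the alignment and threshold step is an approximate substring match over phoneme sequences. The sketch below uses edit distance with a free starting point in the transcript; the abstract does not specify the alignment algorithm or the score, so the distance-based `score`, the `THRESHOLD` value, and the function name `best_alignment` are assumptions.

```python
def best_alignment(search, transcript):
    """Align `search` to the best-matching fragment of `transcript`.

    Returns (start, end, score), where transcript[start:end] is the fragment
    and score = 1 - edit_distance / len(search).
    """
    m = len(search)
    # prev[j] = (cost, start) for aligning search[:j] to a fragment that ends
    # at the current transcript position and begins at index `start`.
    prev = [(j, 0) for j in range(m + 1)]
    best = (m, 0, 0)  # (cost, start, end) for the empty fragment
    for i, phoneme in enumerate(transcript, start=1):
        cur = [(0, i)]  # a fragment may begin anywhere at zero cost
        for j in range(1, m + 1):
            candidates = [
                # match or substitution
                (prev[j - 1][0] + (search[j - 1] != phoneme), prev[j - 1][1]),
                # extra phoneme in the transcript fragment
                (prev[j][0] + 1, prev[j][1]),
                # search phoneme missing from the transcript fragment
                (cur[j - 1][0] + 1, cur[j - 1][1]),
            ]
            cur.append(min(candidates))
        if cur[m][0] < best[0]:
            best = (cur[m][0], cur[m][1], i)
        prev = cur
    cost, start, end = best
    return start, end, 1 - cost / m


THRESHOLD = 0.8  # hypothetical quality score threshold

# "cancel" as searched, and a transcript in which one vowel was transcribed differently.
search_phonemes = ["K", "AE", "N", "S", "AH", "L"]
transcript_phonemes = ["M", "AY", "K", "AE", "N", "S", "EH", "L",
                       "AH", "K", "AW", "N", "T"]

start, end, score = best_alignment(search_phonemes, transcript_phonemes)
is_match = score > THRESHOLD
print(start, end, round(score, 3), is_match)
```

Tracking the fragment start alongside the cost lets the result be mapped back to the associated portion of the text transcription.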

SYSTEM FOR MENDING THROUGH AUTOMATED PROCESSES

20170206272 · 2017-07-20 ·

Systems and methods are provided for transforming historical data collected in response to one or more triggering events, in order to classify textual values. Embodiments access a plurality of textual values from historical transaction data; identify one or more distinct patterns within the plurality of textual values; group the textual values based on the one or more distinct patterns, thereby forming one or more clusters; apply a similarity gauge to the textual values of each of the clusters to determine similarity or dissimilarity among the textual values of each cluster; and filter the textual values of each cluster to determine which textual values belong in each cluster, wherein the textual values that belong are cluster values. Some embodiments also remove undesired characters from the textual values, and in some cases identifying the distinct patterns includes comparing pronunciations and/or phonetics of the textual values.
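The group-then-filter flow described above could be sketched as follows. It assumes a simplified Soundex-style key as the phonetic pattern and `difflib.SequenceMatcher` as the similarity gauge; neither is named in the abstract, and the helper names, sample values, and `min_ratio` threshold are hypothetical.

```python
import re
from collections import defaultdict
from difflib import SequenceMatcher

# Consonant classes for a simplified Soundex-style code (an assumption of this sketch).
CODES = {}
for letters, digit in [("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")]:
    for ch in letters:
        CODES[ch] = digit


def normalize(value):
    """Remove undesired characters and uppercase what is left."""
    return re.sub(r"[^A-Za-z]", "", value).upper()


def phonetic_key(value):
    """First letter plus up to three consonant codes, padded with zeros."""
    letters = normalize(value)
    if not letters:
        return ""
    digits = []
    last = CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        if code and code != last:
            digits.append(code)
        last = code
    return (letters[0] + "".join(digits) + "000")[:4]


def cluster_values(values, min_ratio=0.8):
    """Group values by phonetic key, then keep only members similar to the first one."""
    groups = defaultdict(list)
    for value in values:
        groups[phonetic_key(value)].append(value)
    clusters = {}
    for key, members in groups.items():
        anchor = normalize(members[0])
        clusters[key] = [
            m for m in members
            if SequenceMatcher(None, anchor, normalize(m)).ratio() >= min_ratio
        ]
    return clusters


values = ["Wal-Mart #123", "WALMART", "Wallmart", "Wilmerding", "Target", "Targett"]
clusters = cluster_values(values)
print(clusters)
```

Here "Wilmerding" shares a phonetic key with the "Walmart" variants but fails the similarity check, which is the role the filtering step plays after grouping.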

INCREASING USER ENGAGEMENT THROUGH QUERY SUGGESTION

20250061895 · 2025-02-20 ·

METHOD AND SYSTEM FOR SEARCHING WORDS IN DOCUMENTS WRITTEN IN A SOURCE LANGUAGE AS TRANSCRIPT OF WORDS IN AN ORIGIN LANGUAGE

20170116175 · 2017-04-27 ·

The invention relates to a computer-implemented method for searching documents written in a source language for words that are not in the vocabulary of said source language but are transcripts of meaningful words in an origin language. The method comprises a preparation process and a search process. During the preparation process, a database of unrecognized words in the source language is maintained, which contains, among other data, a normalized phonetic conversion of each unrecognized word, as well as a corpus of all words of the documents in the search domain and indexes for efficient search. During search, a phonetic conversion and normalization is performed on the search word, and the distance to phonetically similar words in the corpus is calculated. The found words in the corpus are arranged in ascending order, and the relevant

Systems, methods, and apparatus for providing dynamic auto-responses at a mediating assistant application

12254333 · 2025-03-18 ·

Google LLC

Methods, apparatus, systems, and computer-readable media are provided for providing context specific schema files that allow an automated assistant to broker human-to-computer dialogs between a user and an application that is separate from the automated assistant. The context specific schema file can provide the automated assistant with sufficient data to be responsive to user queries without necessarily communicating with a remote device, such as a server. Multiple different context specific schema files can be made available to the automated assistant according to a context in which a user is interacting with the automated assistant. In this way, latency otherwise exhibited by the automated assistant can be mitigated by providing the automated assistant with the information needed to respond to a user without continually retrieving the information over a network.

Method for human-machine dialogue, computing device and computer-readable storage medium

12282747 · 2025-04-22 ·

Ubtech Robotics Corp Ltd

Li Ma

A method includes: acquiring an input sentence in a first language in a current round of conversation; translating the input sentence in the first language to obtain an input sentence in a second language, according to dialogue contents in the first language and dialogue contents in the second language that have a mutual translation relationship with the dialogue contents in the first language in historical rounds of conversation; invoking a multi-round conversation generation model to parse the input sentence in the second language in the current round of conversation to generate an output sentence in the second language in the current round of conversation; translating the output sentence in the second language in the current round of conversation to obtain at least one candidate result in the first language; and determining an output sentence in the first language from the at least one candidate result in the first language.

Information processing method, terminal device, and distributed network

12451117 · 2025-10-21 ·

Huawei Technologies Co., Ltd.

Bo Lu
Dexiang JIA

This application provides information processing methods, terminal devices, and distributed networks. In an implementation, after determining target information including at least one piece of text information and at least one piece of non-text information, a terminal device may determine, based on a predetermined playing speed and a predetermined time, at least one first location associated with at least one piece of non-text information. Text-to-speech is sequentially performed on the at least one piece of text information, to obtain and sequentially play speech information respectively corresponding to the at least one piece of text information. In response to speech information corresponding to first text information being played, target non-text information is sent to a second terminal device, so that the second terminal device displays the target non-text information.
