Patent classifications
G10L25/54
METHOD OF AND SYSTEM FOR REAL TIME FEEDBACK IN AN INCREMENTAL SPEECH INPUT INTERFACE
The present disclosure provides systems and methods for selecting and presenting content items based on user input. The method includes receiving first input intended to identify a desired content item among content items associated with metadata, determining that an input portion has an importance measure exceeding a threshold, and providing feedback identifying the input portion. The method further includes receiving second input, and inferring user intent to alter or supplement the first input with the second input. The method further includes, upon inferring intent to alter the first input, determining an alternative query by modifying the first input based on the second input, and, upon inferring intent to supplement the first input, determining an alternative query by combining the first input and the second input. The method further includes selecting and presenting a subset of content items based on comparing the alternative query and metadata associated with the subset.
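The alter-versus-supplement step in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `ContentItem`, `important_portions`, `alternative_query`, and `select_items` are hypothetical, the importance scores are supplied by the caller, the intent is passed in rather than inferred, and "comparing the alternative query and metadata" is reduced to requiring every query term to appear in an item's metadata.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContentItem:
    title: str
    metadata: frozenset  # descriptive terms attached to the item


def important_portions(tokens, importance, threshold):
    """Return the input portions whose importance measure exceeds the threshold."""
    return [t for t in tokens if importance.get(t, 0.0) > threshold]


def alternative_query(first, second, intent, replaced: Optional[str] = None):
    """Build the alternative query from the first and second inputs.

    intent == "alter":      drop the `replaced` portion of the first input
                            and substitute the second input for it.
    intent == "supplement": keep the first input and append the second.
    """
    if intent == "alter":
        return [t for t in first if t != replaced] + list(second)
    if intent == "supplement":
        return list(first) + [t for t in second if t not in first]
    raise ValueError(f"unknown intent: {intent!r}")


def select_items(query, items):
    """Select items whose metadata contains every term of the query."""
    wanted = set(query)
    return [item for item in items if wanted <= item.metadata]
```

In this sketch, a first input of `["comedy", "tom hanks"]` followed by a supplementing second input `["1988"]` yields the query `["comedy", "tom hanks", "1988"]`, while an altering second input replaces one named portion of the first input and leaves the rest in place.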
AUDIO AUGMENTED REALITY SYSTEM
Techniques are described for online information search and retrieval for a query that includes a digital audio waveform. In an aspect, an audio waveform is received and digitized by at least one of a plurality of audio input devices. Each digitized audio waveform is transmitted to a central processing unit, which formulates and submits a query to an online engine. The formulated query may include at least one of the digitized audio waveforms. The online engine retrieves one or more online results relevant to the formulated query. The online results may include one or more relevant visual results and/or one or more relevant audio results. The retrieved results are served in real time back to a user via a device having audio output capability and/or a device having visual data output capability.
Systems and methods to utilize text representations of conversations
A method for electronically utilizing content in a communication between a customer and a customer representative is provided. An audible conversation between a customer and a service representative is captured. At least a portion of the audible conversation is converted into computer searchable data. The computer searchable data is analyzed during the audible conversation to identify relevant meta tags previously stored in a data repository or generated during the audible conversation. Each meta tag is associated with the customer. Each meta tag provides a contextual item determined from at least a portion of one of a current or previous conversation with the customer. A meta tag determined to be relevant to the current conversation between the service representative and the customer is displayed in real time to the service representative currently conversing with the customer.
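The meta tag lookup described above might be sketched as follows. This is an illustration under stated assumptions, not the claimed method: `MetaTag`, its `keywords` field, and `relevant_tags` are hypothetical names, and relevance is approximated by simple keyword overlap with the text transcribed so far.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class MetaTag:
    customer_id: str
    label: str            # contextual item shown to the representative
    keywords: frozenset   # hypothetical trigger words for this tag


def relevant_tags(transcript_so_far, customer_id, repository):
    """Return the customer's tags whose keywords occur in the transcript so far."""
    words = set(re.findall(r"[a-z0-9']+", transcript_so_far.lower()))
    return [
        tag for tag in repository
        if tag.customer_id == customer_id and words & tag.keywords
    ]
```

Calling `relevant_tags` again each time new text arrives from the speech-to-text step would approximate the "during the audible conversation" behavior described in the abstract.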
System and Method for Building Contextual Highlights for Conferencing Systems
This disclosure relates to a method of highlighting at least a part of communication segments between a plurality of participants in a communication network. The method includes extracting, by a highlighting device, semantic information and a plurality of vocal cues from multimedia communication data exchanged between the plurality of participants; identifying, by the highlighting device, communication segments within the multimedia communication data by aggregating the semantic information and the plurality of vocal cues; associating, by the highlighting device, meta-data with each of the communication segments based on communication segment parameters; and highlighting, by the highlighting device, contextually, at least a part of the communication segments based on highlighting parameters received from a user.
ELECTRONIC DEVICE THAT PRESENTS CUSTOMIZED, AGE-APPROPRIATE AUGMENTED REALITY CONTENT
An electronic device, computer program product, and method enable presentation of age-appropriate augmented reality (AR) content by determining the age of a wearer of an AR display device. The electronic device communicatively connects to the AR display device. In response to determining that the electronic device is communicatively coupled to the AR display device, a controller of the electronic device monitors or checks for information that indicates, or correlates to, the age of the person wearing the AR display device being below a threshold age. In response to receiving information indicating that the person is below the threshold age, the controller selects AR content customized for the person based on the person's age and any identified personal preferences. The controller presents the customized AR content at the AR display device.
Indexing based on time-variant transforms of an audio signal's spectrogram
An audio identification system generates audio fingerprints and indexes associated with the audio fingerprints based on discrete and overlapping frames within a sample of an audio signal. The system applies a time-to-frequency domain transform to a time-sequence of frames, which may be filtered. The audio identification system then applies a time-variant transformation (e.g., a Discrete Cosine Transform) to the transformed frames and generates an audio fingerprint and index by selecting sets of coefficients of the time-variant transformation. The system selects coefficients that are less sensitive to possible noise and/or distortions in the underlying signal, such as low-frequency coefficients. The time-variant transformation provides sufficient sampling among the indexes by incorporating the phase information of the frames into the indexes. The system stores the audio fingerprint and other identifying information by index for efficient retrieval and matching of the retrieved fingerprints.
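The pipeline in the abstract above (overlapping frames, a time-to-frequency transform, a DCT across the sequence of frames, then selection of low-frequency coefficients to form a fingerprint and index) can be sketched in plain Python. This is a simplified illustration, not the patented system: the function names are hypothetical, a naive DFT stands in for whatever transform the system uses, only magnitudes are kept (so the phase information mentioned in the abstract is dropped), the DC coefficient is skipped, and the index is formed from coefficient signs, which is an assumption of this sketch. The signal must produce more frames than `coeffs`.

```python
import cmath
import math


def split_frames(signal, size, hop):
    """Cut the signal into overlapping frames of `size` samples, `hop` apart."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]


def magnitude_spectrum(frame, bands):
    """Naive DFT magnitudes for the first `bands` frequency bins of one frame."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(bands)
    ]


def dct2(values):
    """Unnormalized DCT-II of a sequence."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * (t + 0.5) * k / n) for t, v in enumerate(values))
        for k in range(n)
    ]


def fingerprint(signal, size=16, hop=8, bands=4, coeffs=3):
    """Return (coefficients, index) for a sample of an audio signal."""
    spectrogram = [magnitude_spectrum(f, bands) for f in split_frames(signal, size, hop)]
    kept = []
    for b in range(bands):
        # Follow one frequency band across time and transform that track.
        track = [row[b] for row in spectrogram]
        # Keep low-order coefficients, skipping the DC term (sketch assumption).
        kept.extend(dct2(track)[1:coeffs + 1])
    # Assumed quantization: one sign bit per kept coefficient forms the index.
    index = tuple(int(c > 0) for c in kept)
    return kept, index


def store(index_table, signal, label):
    """File a fingerprint under its index so it can be retrieved by index later."""
    coefficients, index = fingerprint(signal)
    index_table.setdefault(index, []).append((label, coefficients))
    return index
```

Matching a query sample would then amount to computing its index, looking up the candidates stored under that index, and comparing the stored coefficients with those of the query.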
VOICE IDENTIFICATION FOR OPTIMIZING VOICE SEARCH RESULTS
Systems and methods are provided for processing a voice input stream with interruptions and/or supplemental comments. Generally, a virtual voice assistant may receive an input stream with a first input comprising a voice query from a first voice and a second input comprising a secondary query from a second voice (e.g., an interruption or a supplement). The virtual assistant may determine that the second voice does not match the first voice, and then process the voice query to produce first results. Some embodiments may determine whether the secondary query is a supplement or an interruption and, e.g., choose to ignore an interruption or to set aside a supplement for later use if it may help refine the search query. In some embodiments, results for the first query alone may be compared with results for the first query combined with a portion of the supplement.
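A minimal sketch of the flow described above follows. The abstract does not say how voices are matched or how a supplement is told apart from an interruption, so both parts are assumptions of this sketch: voices are represented as hypothetical embedding vectors compared by cosine similarity, and the second input counts as a supplement only when the combined query still returns at least one of the first query's results. The `search` callable and all function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def same_speaker(voice_a, voice_b, threshold=0.8):
    """Assumed voice match: embeddings at least `threshold` similar."""
    return cosine_similarity(voice_a, voice_b) >= threshold


def process_stream(first, second, search, threshold=0.8):
    """Handle a (voice, text) query followed by a (voice, text) secondary input."""
    (first_voice, query), (second_voice, secondary) = first, second
    first_results = search(query)
    if same_speaker(first_voice, second_voice, threshold):
        # Same voice: outside the scope of this sketch, return the first results.
        return {"speaker_changed": False, "results": first_results}
    # Compare results for the query alone with results for query plus supplement.
    combined = search(f"{query} {secondary}")
    is_supplement = any(r in first_results for r in combined)
    return {
        "speaker_changed": True,
        "kind": "supplement" if is_supplement else "interruption",
        # An interruption is ignored; a supplement refines the results.
        "results": combined if is_supplement else first_results,
    }
```

With a toy `search` that requires every query word to match an item's keywords, a second voice saying "pixar" after "animated movie" narrows the results, while an unrelated remark from the second voice leaves the first results unchanged.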