Patent classifications
G10L13/00
SECURING PERSONALLY IDENTIFIABLE AND PRIVATE INFORMATION IN CONVERSATIONAL AI-BASED COMMUNICATION
Method and system of securing personally identifiable and sensitive information in conversational AI-based communication. The method comprises enabling a first service provider device as a communication channel provider of an incoming communication mode and enabling a second service provider device as a communication channel provider of an outgoing communication mode, at least one of the incoming and outgoing communication modes comprising an audio communication; storing content of a conversation in the incoming communication mode in a first storage medium accessible to the first service provider device but not the second service provider device; storing content of the conversation in the outgoing communication mode in a second storage medium accessible to the second service provider device but not the first service provider device; and anonymizing the audio communication, wherein personally identifiable audio characteristics of the user are obfuscated from the service provider devices.
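The split-storage and voice-obfuscation scheme described in this abstract could be sketched as follows; all class, method, and provider names are illustrative assumptions, not taken from the patent, and the pitch shift stands in for whatever anonymization the actual system applies.

```python
# Hypothetical sketch: each provider can read only its own channel's content.
class SplitConversationStore:
    def __init__(self):
        self._stores = {"incoming": {}, "outgoing": {}}
        # provider_1 handles the incoming mode, provider_2 the outgoing mode
        self._access = {"provider_1": "incoming", "provider_2": "outgoing"}

    def store(self, channel, turn_id, content):
        """Record one conversation turn under its channel."""
        self._stores[channel][turn_id] = content

    def read(self, provider, turn_id):
        """A provider sees only the storage medium for its own channel."""
        channel = self._access[provider]
        return self._stores[channel].get(turn_id)


def anonymize_audio(samples, shift=1.25):
    """Crude voice obfuscation by resampling, which alters pitch so
    personally identifiable audio characteristics are masked
    (illustrative only; a real system would use a proper voice converter)."""
    n = int(len(samples) / shift)
    return [samples[int(i * shift)] for i in range(n)]
```

The key design point mirrored here is that neither provider object ever holds a reference to the other channel's storage medium, so the isolation is structural rather than policy-based.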
CANCELLATION MANAGEMENT OF PATIENT REQUESTS FOR ASSISTANCE IN A HEALTHCARE FACILITY
Systems and methods are provided for managing patient assistance requests in a healthcare facility, documenting items (e.g., minor, routine, and/or frequently-performed items) in association with a patient's records in an Electronic Healthcare Information System, and cancelling patient requests for assistance. Indications that requests for assistance have been received and/or are being addressed by an appropriate healthcare team member may be audibly output from a speaker associated with a personal assistant device. Healthcare team members may verbally provide items for documentation in association with a patient's medical records, the items for documentation being received by a listening component of a personal assistant device and transmitted to an EHIS for documentation. Healthcare team members may verbally cancel patient requests for assistance upon the healthcare team member addressing the request and, in some instances, verification of the healthcare team member as an approved source for documenting the item(s) in association with the patient.
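The verbal-cancellation flow with source verification described above might be sketched as below; the role names, request schema, and EHIS-side verification rule are assumptions for illustration.

```python
# Roles assumed to be approved sources for documenting against a patient record.
APPROVED_ROLES = {"nurse", "physician"}


def cancel_request(requests, request_id, team_member):
    """Cancel a pending patient request only if the speaking team member
    is verified as an approved documentation source (sketch)."""
    if team_member.get("role") not in APPROVED_ROLES:
        return False  # verification failed; request stays pending
    request = requests.get(request_id)
    if request is None or request["status"] != "pending":
        return False  # unknown or already-handled request
    request["status"] = "cancelled"
    request["cancelled_by"] = team_member["id"]
    return True
```

In the described system the utterance would arrive via the personal assistant device's listening component and be relayed to the EHIS; this sketch covers only the verification-and-cancel step.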
Systems And Methods For Presenting Social Network Communications In Audible Form Based On User Engagement With A User Device
Methods and systems are described herein for generating an audible presentation of a communication received from a remote server. A presentation of a media asset on a user equipment device is generated for a first user. A textual-based communication is received, at the user equipment device from the remote server. The textual-based communication is transmitted to the remote server by a second user and the remote server transmits the textual-based communication to the user equipment device responsive to determining that the second user is on a list of users associated with the first user. An engagement level of the first user with the user equipment device is determined. Responsive to determining that the engagement level does not exceed a threshold value, a presentation of the textual-based communication is generated in audible form.
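The engagement-threshold routing described above reduces to a small decision rule; the function signature, threshold value, and `tts` callable here are assumptions standing in for the device's real text-to-speech path.

```python
def present_communication(message, engagement_level, threshold=0.5, tts=None):
    """Route a textual communication to audible form when the first user's
    engagement with the device does not exceed the threshold (sketch).

    tts: optional callable standing in for a text-to-speech engine.
    """
    if engagement_level <= threshold:
        # Low engagement: the user may not be looking at the screen,
        # so render the communication audibly.
        rendered = tts(message) if tts else message
        return ("audible", rendered)
    # High engagement: show the communication as text over the media asset.
    return ("textual", message)
```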
Robust Direct Speech-to-Speech Translation
A direct speech-to-speech translation (S2ST) model includes an encoder configured to receive an input speech representation that corresponds to an utterance spoken by a source speaker in a first language and encode the input speech representation into a hidden feature representation. The S2ST model also includes an attention module configured to generate a context vector that attends to the hidden feature representation encoded by the encoder. The S2ST model also includes a decoder configured to receive the context vector generated by the attention module and predict a phoneme representation that corresponds to a translation of the utterance in a second, different language. The S2ST model also includes a synthesizer configured to receive the context vector and the phoneme representation and generate a translated synthesized speech representation that corresponds to a translation of the utterance spoken in the second, different language.
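The encoder-attention stage of the pipeline named above can be sketched with toy stand-ins; the tanh encoder and dot-product attention here are minimal illustrations of the component roles, not the model's actual architecture, and the decoder and synthesizer stages are omitted.

```python
import math


def encode(speech_frames):
    """Encoder (toy): map each input speech frame to a hidden feature vector."""
    return [[math.tanh(x) for x in frame] for frame in speech_frames]


def attend(query, hidden):
    """Attention (toy): context vector as a softmax-weighted sum of the
    hidden states, weighted by dot-product similarity with the query."""
    scores = [sum(q * h for q, h in zip(query, state)) for state in hidden]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    weights = [w / z for w in weights]
    dim = len(hidden[0])
    return [sum(w * state[d] for w, state in zip(weights, hidden))
            for d in range(dim)]
```

In the described model, the decoder would consume this context vector to predict phonemes in the target language, and the synthesizer would consume both the context vector and the phonemes to produce translated speech.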
SYSTEMS AND METHODS FOR ADDRESSING A CORRUPTED SEGMENT IN A MEDIA ASSET
Systems and methods for addressing a corrupted segment in a media asset. The media guidance application determines that a segment of a media asset is corrupted. The media guidance application determines whether a retrieval period to retrieve an uncorrupted copy of the segment exceeds a threshold period. If the retrieval period does not exceed the threshold period, the media guidance application retrieves and generates for display the uncorrupted copy of the segment. If the retrieval period exceeds the threshold period, the media guidance application determines whether an importance level of the corrupted segment exceeds a threshold level. If the importance level exceeds the threshold level, the media guidance application generates for display a summary for the corrupted segment. If the importance level does not exceed the threshold level, the media guidance application generates for display the subsequent segment and the summary for the corrupted segment in an overlay.
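The abstract above describes a two-stage decision, which can be paraphrased as the following sketch; the function and outcome names are assumed for illustration.

```python
def handle_corrupted_segment(retrieval_period, threshold_period,
                             importance_level, threshold_level):
    """Decision logic paraphrased from the abstract (names assumed)."""
    if retrieval_period <= threshold_period:
        # Fast enough to fetch: show the uncorrupted copy of the segment.
        return "display_uncorrupted_copy"
    if importance_level > threshold_level:
        # Too slow to fetch, but the segment matters: show a summary.
        return "display_summary"
    # Too slow and unimportant: skip ahead, with the summary as an overlay.
    return "display_next_segment_with_summary_overlay"
```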
PROCESSING SPEECH SIGNALS OF A USER TO GENERATE A VISUAL REPRESENTATION OF THE USER
A computing system for generating image data representing a speaker's face includes a detection device configured to route data representing a voice signal to one or more processors and a data processing device comprising the one or more processors configured to generate a representation of a speaker that generated the voice signal in response to receiving the voice signal. The data processing device executes a voice embedding function to generate a feature vector from the voice signal representing one or more signal features of the voice signal, maps a signal feature of the feature vector to a visual feature of the speaker by a modality transfer function specifying a relationship between the visual feature of the speaker and the signal feature of the feature vector; and generates a visual representation of at least a portion of the speaker based on the mapping, the visual representation comprising the visual feature.
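The voice-embedding and modality-transfer stages named above could be sketched as below; both functions are toy placeholders (a hand-picked feature vector and a fixed linear map) standing in for the learned models the abstract implies.

```python
def voice_embedding(signal):
    """Toy embedding: a feature vector summarizing two signal features
    (mean level and dynamic range) of the voice signal."""
    return [sum(signal) / len(signal), max(signal) - min(signal)]


def modality_transfer(feature_vector, weights):
    """Toy transfer function: map signal features to a visual feature via
    a fixed linear relationship (learned in the actual system)."""
    return sum(w * f for w, f in zip(weights, feature_vector))
```

The generated visual feature would then drive rendering of at least a portion of the speaker's face, per the abstract.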