Patent classifications
G10L25/48
Electronic device and speech recognition method therefor
An electronic device for recognizing a user's speech and a speech recognition method therefor are provided. The electronic device includes a microphone configured to receive a user's speech, a memory for storing speech recognition models, and at least one processor configured to select a speech recognition model from among the speech recognition models stored in the memory based on an operation state of the electronic device, and recognize the user's speech received by the microphone based on the selected speech recognition model.
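The core of this claim is selecting a stored recognition model keyed on the device's operation state. A minimal Python sketch of that selection step follows; the state names and model labels are illustrative assumptions, not taken from the patent.

```python
# Hypothetical mapping from device operation state to a stored model.
# The states and model names are illustrative only.
SPEECH_MODELS = {
    "idle": "far-field-model",
    "in_call": "narrowband-model",
    "media_playback": "noise-robust-model",
}

def select_model(operation_state: str) -> str:
    """Pick a speech recognition model for the current operation state,
    falling back to a default model for unknown states."""
    return SPEECH_MODELS.get(operation_state, "default-model")

print(select_model("in_call"))        # narrowband-model
print(select_model("unknown_state"))  # default-model
```

The recognized speech would then be decoded against whichever model this lookup returns.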
IDENTIFICATION AND CLASSIFICATION OF TALK-OVER SEGMENTS DURING VOICE COMMUNICATIONS USING MACHINE LEARNING MODELS
A system and methods are provided to analyze audio signals from an incoming voice call. The system includes a processor and a computer readable medium operably coupled thereto, to perform voice analysis operations which include receiving a first audio signal comprising a first audio waveform of a first speech between at least two users during the incoming voice call, accessing speech segment parameters for analyzing the audio signals, determining one or more talk-over segments in the first audio waveform using the speech segment parameters, extracting audio features from each of the one or more talk-over segments, determining, using a machine learning (ML) model trained for interruption analysis of the audio signals, whether each of the one or more talk-over segments is a negative interruption or a non-negative interruption based on the audio features, and determining whether to output a first notification for the negative interruption or the non-negative interruption.
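The two central steps here are locating talk-over segments (where both speakers are active at once) and classifying each one. A minimal Python sketch of both steps follows; the interval-overlap detection and the threshold-based classifier are illustrative stand-ins for the patent's speech segment parameters and trained ML model.

```python
def find_talk_over_segments(speaker_a, speaker_b):
    """Return intervals where both speakers are active simultaneously.
    speaker_a / speaker_b are lists of (start, end) activity intervals
    in seconds -- a simplified stand-in for voice activity detection."""
    overlaps = []
    for a_start, a_end in speaker_a:
        for b_start, b_end in speaker_b:
            start, end = max(a_start, b_start), min(a_end, b_end)
            if start < end:
                overlaps.append((start, end))
    return overlaps

def classify_interruption(duration, energy_ratio, threshold=0.5):
    """Illustrative stand-in for the trained ML model: label a talk-over
    segment as a negative interruption when it is long and loud relative
    to the interrupted speaker. The features and threshold are assumptions."""
    score = min(duration / 2.0, 1.0) * energy_ratio
    return "negative" if score > threshold else "non-negative"

segments = find_talk_over_segments([(0.0, 5.0)], [(4.0, 6.0), (7.0, 8.0)])
print(segments)  # [(4.0, 5.0)]
```

A real system would extract richer audio features (pitch, energy, spectral features) from each detected segment and feed them to the trained model in place of the threshold rule.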
System and method for filtering user requested information
A method for controlling secure access to user requested data includes retrieving information related to potential unauthorized access to user requested data. The information is collected by a plurality of sensors of the user's mobile device. A trained statistical model representing an environment surrounding a user is generated based on the retrieved information. A first data security value is determined using the generated trained statistical model. The first data security value indicates a degree of information security based on the user's environment. A second data security value is determined using the generated trained statistical model. The second data security value indicates a degree of confidentiality of the user requested data. The user requested data is filtered based on a ratio of the determined first data security value to the second data security value.
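The filtering decision reduces to comparing a ratio of environment security to data confidentiality against a threshold. A minimal Python sketch, with illustrative field names and threshold (the patent does not specify these):

```python
def filter_requested_data(records, env_security, threshold=1.0):
    """Filter user-requested records by comparing environment security
    (the first data security value) against each record's confidentiality
    (the second). A record is released only when the ratio
    env_security / confidentiality meets the threshold.
    Field names and the threshold are illustrative assumptions."""
    released = []
    for record in records:
        confidentiality = record["confidentiality"]
        if confidentiality == 0 or env_security / confidentiality >= threshold:
            released.append(record)
    return released

records = [
    {"name": "calendar", "confidentiality": 0.2},
    {"name": "bank balance", "confidentiality": 0.9},
]
print(filter_requested_data(records, env_security=0.5))
# only "calendar" is released in a moderately secure environment
```

In the patented method, `env_security` would come from the trained statistical model over the device's sensor readings rather than being passed in directly.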
METHOD AND SYSTEM FOR AUTOMATIC SCORING OF READING FLUENCY
A method and system for automatic scoring of reading fluency. In some embodiments, the present disclosure provides a method and system for automatic scoring of reading fluency by recording a user's voice as the user reads the passage in a reading fluency question and calculating a reading fluency score indicating how quickly and accurately the user reads texts.
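The abstract only states that the score reflects speed and accuracy. One common way to combine the two is words-correct-per-minute alongside word-level accuracy; the sketch below uses that metric as an illustrative assumption, comparing recognized words to the reference passage.

```python
def reading_fluency_score(reference_words, recognized_words, duration_seconds):
    """Illustrative fluency scoring: word-level accuracy plus
    words-correct-per-minute (WCPM), a standard classroom metric.
    The exact formula is an assumption; the patent only states that the
    score reflects how quickly and accurately the user reads."""
    correct = sum(
        1 for ref, rec in zip(reference_words, recognized_words) if ref == rec
    )
    accuracy = correct / len(reference_words)
    wcpm = correct / (duration_seconds / 60.0)
    return {"accuracy": accuracy, "wcpm": wcpm}

score = reading_fluency_score(
    ["the", "cat", "sat"], ["the", "cat", "sat"], duration_seconds=3.0
)
print(score)  # {'accuracy': 1.0, 'wcpm': 60.0}
```

A production system would first align the speech recognizer's output to the reference text (e.g. by edit distance) rather than relying on positional matching.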
CONTEXTUAL TRANSCRIPTION AUGMENTATION METHODS AND SYSTEMS
Methods and systems are provided for assisting operation of a vehicle using speech recognition and transcription. One method involves identifying an operational objective for an audio communication with respect to the vehicle, determining an expected clearance communication for the vehicle based at least in part on the operational objective, identifying a discrepancy between a transcription of the audio communication and the expected clearance communication, augmenting the transcription of the audio communication using a current operational context to reduce the discrepancy, and providing a graphical indication influenced by the augmented transcription.
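The augmentation step resolves discrepancies between what was transcribed and what clearance communication was expected, using the current operational context. A minimal Python sketch follows; the token-by-token comparison, the placeholder convention, and the runway value are all illustrative assumptions, not the patent's actual representation.

```python
def augment_transcription(transcript_tokens, expected_tokens, context):
    """Where the transcript disagrees with the expected clearance,
    substitute a value from the current operational context when one is
    available. Token-by-token comparison is a simplification of the
    patent's discrepancy identification."""
    augmented = []
    for heard, expected in zip(transcript_tokens, expected_tokens):
        if heard == expected:
            augmented.append(heard)
        else:
            # Fall back to the heard token when context has no substitute.
            augmented.append(context.get(expected, heard))
    return augmented

expected = ["cleared", "runway", "<RUNWAY>"]   # expected clearance template
heard    = ["cleared", "runway", "too-left"]   # garbled runway identifier
context  = {"<RUNWAY>": "22L"}                 # current operational context
print(augment_transcription(heard, expected, context))
# ['cleared', 'runway', '22L']
```

The augmented token sequence would then drive the graphical indication shown to the flight crew.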
RENDERING VIRTUAL ARTICLES OF CLOTHING BASED ON AUDIO CHARACTERISTICS
Systems and methods for generating a virtual article of clothing at a display are described. Some examples may include: obtaining video data and audio data, analyzing the video data to determine one or more body joints of a target object appearing in the video data. A mesh based on the determined one or more body joints may be generated. The audio data may be analyzed to determine audio characteristics associated with the audio data. Texture rendering information associated with a virtual article of clothing may be determined based on the audio characteristics. A rendered video may be generated by rendering the virtual article of clothing to the generated mesh using the texture rendering information.
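The distinctive step here is deriving texture rendering information from audio characteristics. A minimal Python sketch of that mapping follows; the specific characteristics (RMS loudness, tempo) and the parameter mapping are illustrative assumptions, since the patent does not define them.

```python
def texture_from_audio(rms_loudness, tempo_bpm):
    """Map simple audio characteristics to texture rendering parameters
    for a virtual garment. The mapping (loudness -> brightness,
    tempo -> pattern scale) is an illustrative assumption, not the
    patent's actual mapping."""
    brightness = max(0.0, min(1.0, rms_loudness))  # clamp to [0, 1]
    pattern_scale = tempo_bpm / 120.0              # 120 BPM as unit scale
    return {"brightness": brightness, "pattern_scale": pattern_scale}

print(texture_from_audio(rms_loudness=1.5, tempo_bpm=60))
# {'brightness': 1.0, 'pattern_scale': 0.5}
```

In the full pipeline these parameters would feed the renderer that applies the virtual clothing texture to the body-joint mesh extracted from the video.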
ANNOTATION OF MEDIA FILES WITH CONVENIENT PAUSE POINTS
A computer-implemented method, a computer system and a computer program product annotate media files with convenient pause points. The method includes acquiring a text file version of an audio narration file. The text file version includes a pause point history of a plurality of prior users. The method also includes generating a list of pause points based on the pause point history. In addition, the method includes determining a tone of voice being used by a speaker at each pause point using natural language processing algorithms. The method further includes determining a set of convenient pause points based on the list of pause points and the determined tone of voice. Lastly, the method includes inserting the determined set of convenient pause points into the audio narration file.
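Selecting convenient pause points combines two signals: how many prior users paused at a point, and the speaker's tone of voice there. A minimal Python sketch, with illustrative tone labels and popularity threshold (the patent specifies neither):

```python
from collections import Counter

def convenient_pause_points(pause_history, tones, min_count=2):
    """Select pause points that enough prior users chose and where the
    speaker's tone suggests a natural break. The tone labels and the
    minimum popularity count are illustrative assumptions; a real system
    would derive tones via natural language processing."""
    counts = Counter(pause_history)
    return sorted(
        t for t, n in counts.items()
        if n >= min_count and tones.get(t) in ("neutral", "concluding")
    )

history = [12.5, 12.5, 40.0, 87.2, 87.2, 87.2]  # timestamps (s) of prior pauses
tones = {12.5: "concluding", 40.0: "tense", 87.2: "neutral"}
print(convenient_pause_points(history, tones))
# [12.5, 87.2]
```

The selected timestamps would then be inserted into the audio narration file as annotations.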
Search query generation based on audio processing
Among other things, embodiments of the present disclosure relate to generating search queries based on audio processing. Other embodiments may be described and/or claimed.