G10L15/1807

Diagnosis and treatment of speech and language pathologies by speech to text and natural language processing

There is provided herein a method for assessing a speech/lingual quality of a subject, the method comprising: providing a content-containing stimulus to a user; recording the user's vocal response to the stimulus and/or to instructions related thereto; processing the user's recorded vocal response to measure/extract/compute at least one linguistic (prosodic) parameter and at least one acoustic parameter; transforming the user's vocal response into a transformed text section, which is based on a processing unit's interpretation of the user's verbal response; comparing the transformed text section to a predetermined text section, which represents the user's expected response; and computing an output signal indicative of at least one speech/lingual quality of the user, based at least on data resulting from the text comparison, the at least one measured/extracted/computed linguistic parameter and the at least one acoustic parameter.
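
The text-comparison step can be illustrated with a word error rate. This is only a sketch under an assumption: the abstract does not say how the transformed text is compared to the predetermined text, so word-level edit distance is used here as one common choice, and the function names are hypothetical.

```python
def word_edit_distance(reference, hypothesis):
    """Levenshtein distance between two lists of words."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from hypothesis
                            curr[j - 1] + 1,      # extra word in hypothesis
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_error_rate(expected_text, transcribed_text):
    """Edit distance normalized by the length of the expected text.

    Assumption: case-insensitive comparison on whitespace-separated words.
    """
    reference = expected_text.lower().split()
    hypothesis = transcribed_text.lower().split()
    if not reference:
        raise ValueError("expected text must contain at least one word")
    return word_edit_distance(reference, hypothesis) / len(reference)
```

For example, comparing the expected text "the cat sat on the mat" with the transcription "the cat sat on a mat" yields one substitution over six words. In the method described above, such a value would be only one input to the output signal, alongside the linguistic and acoustic parameters.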

Authentication circle shared expenses with extended family and friends

Systems and methods for providing authentication circles to pursue financial goals and/or share expenses with others are provided. One or more provider computing systems are communicatively coupled to one or more user devices. Users may join a circle and make contributions via electronic messages that may allow for acceptance in a one-click fashion. Members may, for example, plan for and share expenses for a trip and compare the expenses with budgets.

EMOTIONAL INTERACTION APPARATUS
20170365277 · 2017-12-21

A system and method for emotional interaction. The system includes a robot that uses behavioral analysis automation to provide treatment and assessment of emotional communication and social skills for children with autism. The system generates a dataset including speech signals of one or more speakers, and assigns at least one of a set of labels to each of the speech signals for the one or more speakers. The set of labels includes at least three levels of emotional dimensions, the emotional dimensions include at least activation, valence, and dominance, and the at least three levels of emotional dimensions include a high state, a neutral state, and a low state.
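
The labeling scheme can be sketched as a small data structure. The three dimensions and three levels come from the abstract; the numeric rating scale and the thresholds `low_max` and `high_min` are assumptions made purely for illustration, as are the function names.

```python
from enum import Enum


class Level(Enum):
    LOW = "low"
    NEUTRAL = "neutral"
    HIGH = "high"


# Emotional dimensions named in the abstract.
DIMENSIONS = ("activation", "valence", "dominance")


def to_level(rating, low_max=2.0, high_min=4.0):
    """Map a numeric rating to one of three levels.

    Assumption: ratings lie on a 1-5 scale; the thresholds are hypothetical.
    """
    if rating <= low_max:
        return Level.LOW
    if rating >= high_min:
        return Level.HIGH
    return Level.NEUTRAL


def label_speech_signal(ratings):
    """Assign a level to every dimension for one speech signal."""
    missing = [d for d in DIMENSIONS if d not in ratings]
    if missing:
        raise KeyError(f"missing ratings for: {missing}")
    return {d: to_level(ratings[d]) for d in DIMENSIONS}
```

A signal rated high on activation, mid-scale on valence, and low on dominance would receive the labels high, neutral, and low for those three dimensions respectively.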

AUTOMATED CALL REQUESTS WITH STATUS UPDATES
20170359464 · 2017-12-14

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, relating to synthetic call status updates. In some implementations, a method includes determining, by a task manager module, that a triggering event has occurred to provide a current status of a user call request. The method may then determine, by the task manager module, the current status of the user call request. A representation of the current status of the user call request is generated. Then, the generated representation of the current status of the user call request is provided to the user.

NEURAL PITCH-SHIFTING AND TIME-STRETCHING

Methods for modifying audio data include operations for accessing audio data having a first prosody, receiving a target prosody differing from the first prosody, and computing acoustic features representing samples. Computing respective acoustic features for a sample includes computing a pitch feature as a quantized pitch value of the sample by assigning a pitch value, of the target prosody or the audio data, to at least one of a set of pitch bins having equal widths in cents. Computing the respective acoustic features further includes computing a periodicity feature from the audio data. The respective acoustic features for the sample include the pitch feature, the periodicity feature, and other acoustic features. A neural vocoder is applied to the acoustic features to pitch-shift and time-stretch the audio data from the first prosody toward the target prosody.
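
The pitch feature described above can be sketched as follows. Bins of equal width in cents follow from the abstract; the reference frequency, the bin width, and the number of bins are not given there and are assumptions chosen for illustration.

```python
import math


def hz_to_cents(freq_hz, ref_hz=55.0):
    """Distance of freq_hz from ref_hz in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(freq_hz / ref_hz)


def quantize_pitch(freq_hz, ref_hz=55.0, bin_width_cents=25.0, num_bins=256):
    """Assign a pitch value to one of num_bins bins of equal width in cents.

    Assumptions: ref_hz, bin_width_cents and num_bins are hypothetical
    defaults; out-of-range pitches are clamped to the first or last bin.
    """
    if freq_hz <= 0:
        raise ValueError("pitch must be a positive frequency in Hz")
    index = math.floor(hz_to_cents(freq_hz, ref_hz) / bin_width_cents)
    return max(0, min(num_bins - 1, index))


def shift_pitch(freq_hz, shift_cents):
    """Target pitch after shifting by a number of cents."""
    return freq_hz * 2.0 ** (shift_cents / 1200.0)
```

With these assumed defaults, 110 Hz lies 1200 cents above the 55 Hz reference and falls in bin 48. Because the bins are equal in cents rather than in Hz, a pitch shift of a fixed musical interval moves every pitch by the same number of bins, wherever it starts.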

Systems and method for vocabulary management in a natural learning framework

An agent automation system implements a virtual agent that is capable of learning new words, or new meanings for known words, based on exchanges between the virtual agent and a user in order to customize the vocabulary of the virtual agent to the needs of the user or users. The agent automation framework has access to a corpus of previous exchanges between the virtual agent and the user, such as one or more chat logs. New words and/or new meanings for known words are identified within the corpus and new word vectors are generated for these new words and/or new meanings for known words and added to refine a word vector distribution model. The refined word vector distribution model is then utilized by the agent automation system to interact with the user.
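
Identifying candidate new words in a corpus of chat logs might look like the sketch below. It covers only out-of-vocabulary words, not new meanings for known words, and it stops before the word-vector step. The tokenization rule, the `min_count` threshold, and the function name are assumptions.

```python
import re
from collections import Counter


def find_new_words(chat_logs, known_vocabulary, min_count=2):
    """Return words seen at least min_count times that are not yet known.

    Assumption: words are lowercase runs of letters, digits, or apostrophes.
    """
    counts = Counter()
    for message in chat_logs:
        counts.update(re.findall(r"[a-z0-9']+", message.lower()))
    return sorted(
        word for word, count in counts.items()
        if count >= min_count and word not in known_vocabulary
    )
```

Requiring a minimum count is one simple way to keep one-off typos out of the vocabulary; the words returned would then be candidates for new word vectors.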

SYSTEM AND METHOD FOR TONE RECOGNITION IN SPOKEN LANGUAGES
20230186905 · 2023-06-15

There is provided a system and method for recognizing tone patterns in spoken languages using sequence-to-sequence neural networks in an electronic device. The recognized tone patterns can be used to improve the accuracy of a speech recognition system for tonal languages.

SELECTING BETWEEN MULTIPLE AUTOMATED ASSISTANTS BASED ON INVOCATION PROPERTIES
20230186909 · 2023-06-15

Systems and methods for determining, based on invocation input that is common to multiple automated assistants, which automated assistant to invoke in lieu of invoking other automated assistants. The invocation input is processed to determine one or more invocation features that may be utilized to determine which, of a plurality of candidate automated assistants, to invoke. Further, additional features are processed that can indicate which, of the plurality of invocable automated assistants, to invoke. Once an automated assistant has been invoked, additional audio data and/or features of additional audio data are provided to the invoked automated assistant for further processing.

Method and apparatus for evaluating trigger phrase enrollment

An electronic device includes a microphone that receives an audio signal that includes a spoken trigger phrase, and a processor that is electrically coupled to the microphone. The processor measures characteristics of the audio signal, and determines, based on the measured characteristics, whether the spoken trigger phrase is acceptable for trigger phrase model training. If the spoken trigger phrase is determined not to be acceptable for trigger phrase model training, the processor rejects the trigger phrase for trigger phrase model training.
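
The accept/reject decision can be sketched as a check on a few measured characteristics. The abstract does not name the characteristics or their limits, so the duration, RMS level, and clipping measures below, along with every threshold, are assumptions for illustration only.

```python
import math


def measure_characteristics(samples, sample_rate):
    """Measure simple characteristics of a mono signal scaled to [-1.0, 1.0]."""
    n = len(samples)
    if n == 0:
        raise ValueError("audio signal is empty")
    rms = math.sqrt(sum(s * s for s in samples) / n)
    clipped = sum(1 for s in samples if abs(s) >= 0.999) / n
    return {
        "duration_s": n / sample_rate,
        "rms": rms,
        "clipped_fraction": clipped,
    }


def is_acceptable_for_training(samples, sample_rate,
                               min_duration_s=0.5, max_duration_s=3.0,
                               min_rms=0.01, max_clipped_fraction=0.01):
    """Decide whether a spoken trigger phrase may be used for model training.

    All thresholds are hypothetical defaults, not values from the source.
    """
    m = measure_characteristics(samples, sample_rate)
    return (min_duration_s <= m["duration_s"] <= max_duration_s
            and m["rms"] >= min_rms
            and m["clipped_fraction"] <= max_clipped_fraction)
```

Under these assumed limits, a recording that is too short, too quiet, or heavily clipped is rejected, which corresponds to the processor rejecting the trigger phrase for model training.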

Systems and Methods for Automating Validation and Quantification of Interview Question Responses

In an illustrative embodiment, systems and methods for automating candidate video assessments include receiving a submission from a candidate for an available position including baseline response video segments and question response video segments. The system can determine, from detected nonverbal features within the baseline response video segments, nonverbal baseline scores. For each of the interview questions, candidate response attributes can be detected including a response direction, a response speed, and nonverbal features. A nonverbal reaction score is calculated from the detected nonverbal features and the baseline scores. A response score can be calculated from the response direction and response speed, and a trustworthiness score is determined based on a correspondence between the response score and the nonverbal reaction score. A next interview question can be determined in real-time from a benchmarked version of the response score. Overall scores reflecting candidate trustworthiness can be presented within a user interface screen.