Patent classifications
G10L17/26
Systems, methods, and apparatus for language acquisition using socio-neurocognitive techniques
Provided are various mechanisms and processes for language acquisition using socio-neurocognitive techniques.
Computer apparatus and method implementing sound detection with an image capture system
A computing device comprising a processor, the processor configured to: receive, from an image capture system, an image captured in an environment and image metadata associated with the image, the image metadata comprising an image capture time; receive a sound recognition message from a sound recognition module, the sound recognition message comprising (i) a sound recognition identifier indicating a target sound or scene that has been recognised based on audio data captured in the environment, and (ii) time information associated with the sound recognition identifier; detect that the target sound or scene occurred at a time that the image was captured, based on the image metadata and the time information in the sound recognition message; and output a camera control command to said image capture system based on said detection.
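The detection step in this claim can be sketched minimally as a timestamp-overlap check. The message fields, the `burst_capture` command name, and the ±0.5 s tolerance window below are illustrative assumptions, not taken from the patent text:

```python
from dataclasses import dataclass

@dataclass
class SoundRecognitionMessage:
    sound_id: str      # identifier of the recognised target sound, e.g. "dog_bark" (assumed)
    start_time: float  # seconds since epoch
    end_time: float

@dataclass
class ImageMetadata:
    capture_time: float  # seconds since epoch

def detect_cooccurrence(image_meta: ImageMetadata,
                        msg: SoundRecognitionMessage,
                        tolerance: float = 0.5) -> bool:
    """True if the recognised sound overlapped the image capture time,
    within a small tolerance window (tolerance value is an assumption)."""
    return (msg.start_time - tolerance
            <= image_meta.capture_time
            <= msg.end_time + tolerance)

def camera_control_command(image_meta: ImageMetadata,
                           msg: SoundRecognitionMessage):
    """Emit a camera control command only when sound and image co-occur."""
    if detect_cooccurrence(image_meta, msg):
        return {"command": "burst_capture", "trigger": msg.sound_id}
    return None
```

The tolerance window accounts for clock skew between the sound recognition module and the image capture system; the abstract does not specify how tightly the two timestamps must align.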
METHOD OF PROCESSING SPEECH, ELECTRONIC DEVICE, AND STORAGE MEDIUM
A method of processing speech, an electronic device, and a storage medium, relating to the field of artificial intelligence technology, in particular to the fields of speech and cloud computing. The method includes: acquiring a wake-up voiceprint feature of a wake-up speech used to wake up a speech interaction function, in response to the speech interaction function being woken up; extracting at least one interactive voiceprint feature from a received interactive speech, which includes at least one single-sound-source interactive speech corresponding one-to-one with the at least one interactive voiceprint feature; determining, from the at least one interactive voiceprint feature, a target interactive voiceprint feature matched with the wake-up voiceprint feature; extracting a target speech feature from the target single-sound-source interactive speech corresponding to the target interactive voiceprint feature; and transmitting the target speech feature so that speech recognition is performed based on the target speech feature.
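The matching step (selecting the interactive voiceprint that matches the wake-up voiceprint) can be sketched with cosine similarity over embedding vectors. The abstract does not name a similarity measure or threshold; both are assumptions here:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two voiceprint embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def match_target_voiceprint(wake_up_vp: np.ndarray,
                            interactive_vps: list,
                            threshold: float = 0.7):
    """Return the index of the interactive voiceprint that best matches the
    wake-up voiceprint, or None if no candidate clears the (assumed) threshold."""
    sims = [cosine_similarity(wake_up_vp, vp) for vp in interactive_vps]
    best = int(np.argmax(sims))
    return best if sims[best] >= threshold else None
```

Returning None when no candidate clears the threshold models the case where none of the single-sound-source speeches comes from the speaker who issued the wake-up speech.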
SPEECH AND SENTENCE STRUCTURE ANALYTICS FOR IDENTITY AND SITUATIONAL APPROPRIATENESS
A security platform architecture is described herein. A user identity platform architecture uses a multitude of biometric analytics to create an identity token unique to an individual human. This token is derived from biometric factors such as human behaviors, motion analytics, physical characteristics such as facial patterns, voice recognition prints, device usage patterns, user location actions, and other human behaviors, which can derive a token or be used as a dynamic password identifying the unique individual with high calculated confidence. Because of its dynamic nature and the many different factors involved, this method is extremely difficult for malicious actors or malware to spoof or hack.
AGE ESTIMATION FROM SPEECH
Disclosed are systems and methods including computing-processes executing machine-learning architectures implementing label distribution loss functions to improve age estimation performance and generalization. The machine-learning architecture includes a front-end neural network architecture defining a speaker embedding extraction engine of the machine-learning architecture, and a backend neural network architecture defining an age estimation engine of the machine-learning architecture. The embedding extractor is trained to extract low-level acoustic features of a speaker's speech, such as mel-frequency cepstral coefficients (MFCCs), from audio signals, and then extract a feature vector or speaker embedding vector that mathematically represents the low-level features of the speaker. The age estimator is trained to generate an estimated age for the speaker and a Gaussian probability distribution around the estimated age, by applying the various types of layers of the age estimator on the speaker embedding.
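The label-distribution target described here — a Gaussian probability distribution centred on the estimated age — can be sketched as follows. The candidate age range (1–100) and the standard deviation are assumptions; the abstract specifies neither:

```python
import numpy as np

def gaussian_age_distribution(estimated_age: float,
                              candidate_ages: np.ndarray = np.arange(1, 101),
                              sigma: float = 2.0) -> np.ndarray:
    """Probability distribution over candidate ages, centred on the estimated
    age, of the kind used as a label-distribution training target."""
    logits = -0.5 * ((candidate_ages - estimated_age) / sigma) ** 2
    p = np.exp(logits)
    return p / p.sum()  # normalise so the distribution sums to 1
```

Training against a soft distribution rather than a single age label penalises near-miss predictions less than distant ones, which is the usual motivation for label-distribution losses in age estimation.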
SMART HEARING DEVICE FOR DISTINGUISHING NATURAL LANGUAGE OR NON-NATURAL LANGUAGE, ARTIFICIAL INTELLIGENCE HEARING SYSTEM, AND METHOD THEREOF
The inventive concept relates to a smart hearing device that provides control parameters and feedback for natural language or non-natural language determined by analyzing sound data. It includes a receiving unit that receives sound data of a voice signal and a noise signal from a first microphone and a second microphone formed at one side; a determination unit that compares the digital flow of the sound data with a previously stored graph pattern to classify the sound data as natural language or non-natural language; a processing unit that matches similar data for the determined natural or non-natural language, based on a database including a natural language area and a non-natural language area; and a providing unit that provides the user with a one-sided sound converted by setting a control parameter for the natural or non-natural language specified according to the matched similar data.
VOICE-BASED CONTROL OF SEXUAL STIMULATION DEVICES
A system and method for voice-based control of sexual stimulation devices. In some configurations, the system and method involve receiving voice data, analyzing the voice data to detect spoken commands, and generating control signals based on the commands. In some configurations, the system and method involve receiving voice data, analyzing the voice data for non-speech vocalizations, detecting voice stress patterns, and generating control signals based on the detected patterns. In some configurations, the analyses of the voice data are performed by machine learning algorithms which may be trained on associations between speech and non-speech vocalizations of a user while the user engages in one or more voice-based training tasks, associating speech and non-speech vocalizations with controls of the sexual stimulation device. In some configurations, machine learning algorithms are used to make the associations. In some configurations, data from other biometric sensors is included in the associations.
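The voice-stress detection path can be sketched in its simplest form as mapping short-term vocal energy, relative to a calibrated baseline, onto a bounded control intensity. This is a hand-written stand-in for the learned mapping the abstract describes; the RMS feature, the `gain` parameter, and the [0, 1] intensity range are all assumptions:

```python
import numpy as np

def rms_energy(frame: np.ndarray) -> float:
    """Root-mean-square energy of one short audio frame."""
    return float(np.sqrt(np.mean(np.square(frame))))

def stress_to_intensity(frame: np.ndarray,
                        baseline_rms: float,
                        gain: float = 1.5) -> float:
    """Map vocal energy above a calibrated baseline to a control intensity
    in [0, 1]; energy at or below baseline yields zero."""
    level = gain * (rms_energy(frame) / baseline_rms - 1.0)
    return min(max(level, 0.0), 1.0)
```

In the described system this mapping would be replaced by a model trained on a user's own vocalizations during the voice-based training tasks, rather than a fixed energy threshold.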