Patent classifications
G10L2015/221
VOICE-BASED SOCIAL NETWORK
This invention presents a novel voice-based social network in which users can compose, explore, and share voice posts. Each voice post is composed of audio, text obtained by dictation or transcription of the speech, and optional elements such as a picture, video, or contact. During composition, the user speaks into the microphone, and the system generates text using a speech-to-text method. Users can optionally attach a picture or video and a category. Each voice post is displayed as text overlaid on the picture, and the text is highlighted in sync with the corresponding part of the speech. Users can explore posts through search interfaces using keywords and categories, and can also comment using voice posts. The system also provides advanced interfaces: a recommendation interface where users can see related posts, a connection interface where users can connect with each other, and a message interface where users can communicate with each other via voice messages.
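The synced text highlighting described in this abstract could be sketched as follows. This is a minimal illustration only, assuming the transcription step yields a start time in seconds for each word; the function name `highlighted_word_index` and the timing format are hypothetical and do not come from the patent.

```python
from bisect import bisect_right
from typing import Optional, Sequence


def highlighted_word_index(word_start_times: Sequence[float],
                           playback_time: float) -> Optional[int]:
    """Return the index of the word to highlight at playback_time.

    word_start_times must be sorted in ascending order (assumption:
    one start time per transcribed word, in seconds).
    Returns None if playback has not yet reached the first word.
    """
    # bisect_right counts how many words have started at or before playback_time.
    index = bisect_right(word_start_times, playback_time) - 1
    return index if index >= 0 else None


# Hypothetical timings for the words of a short voice post.
words = ["hello", "from", "the", "beach"]
starts = [0.0, 0.4, 0.7, 0.9]
current = highlighted_word_index(starts, 0.75)
```

Here `current` indexes the word "the", which a player could render in a highlighted style on top of the picture overlay.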
INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND STORAGE MEDIUM
An information processing device of an embodiment includes a determiner and a notifier. The determiner is configured to determine priority of metadata on the basis of an importance level, which indicates the degree of importance to a user of each of a plurality of pieces of content, and an amount of information of the metadata that is attached to each of the plurality of pieces of content. The notifier is configured to notify the user of the metadata on the basis of the priority determined by the determiner.
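As a rough illustration of the determiner and notifier, the sketch below ranks metadata by a priority score. The abstract does not say how importance and amount of information are combined, so both the word-count measure and the multiplication used here are assumptions, and all names (`Content`, `metadata_priority`, `notification_order`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Content:
    title: str
    importance: float  # assumed per-user importance level in [0, 1]
    metadata: str      # metadata attached to the content


def metadata_priority(item: Content) -> float:
    # Assumption: the amount of information is approximated by the word
    # count of the metadata and multiplied by the importance level.
    return item.importance * len(item.metadata.split())


def notification_order(items: List[Content]) -> List[Content]:
    """Order contents so the highest-priority metadata is notified first."""
    return sorted(items, key=metadata_priority, reverse=True)


items = [
    Content("weather", 0.2, "sunny with light wind"),
    Content("traffic", 0.9, "accident ahead"),
]
ordered_titles = [c.title for c in notification_order(items)]
```

With these made-up values the traffic metadata is notified before the weather metadata, because its higher importance outweighs its smaller amount of information.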
INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND PROGRAM
It is possible to perform a user speech operation on a voice agent satisfactorily. User speech data and user's shared information are accepted. An analysis result including a speech intention is obtained by analyzing the user speech data in consideration of the user's shared information. The analysis result is output. For example, the user's shared information is a combination of text information and tag information for identifying an information type indicated by the text information. For example, the user's shared information is information indicating a status of a predetermined number of status types. In a speech operation on a voice agent by a user, the user can talk with appropriate omission as in the case of people-to-people conversation, and thus can satisfactorily perform the speech operation.
DISPLAY APPARATUS AND THE CONTROL METHOD THEREOF
An electronic apparatus and a controlling method thereof are provided. The controlling method includes, based on an audio signal being received through a microphone, determining whether a user is on a public transport; detecting whether the audio signal includes a voice signal output through an acoustic device of the public transport; determining whether the voice signal from the acoustic device includes a voice signal for guiding at least one stop from among a plurality of stops; and outputting information on the at least one stop.
Communication apparatuses
In one example of the disclosure, a communication apparatus includes a first microphone. The communication apparatus is to be wirelessly and contemporaneously connected to a set of microphones including the first microphone. The communication apparatus is to receive microphone data from each microphone of the set of microphones, wherein the microphone data is indicative of a user spoken phrase captured by the set of microphones. The communication apparatus is to establish based on the received microphone data a selected microphone from among the set of microphones.
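One possible way to establish the selected microphone from the received microphone data is sketched below. The abstract does not state the selection criterion; choosing the microphone with the highest root-mean-square (RMS) signal level is an assumption made for illustration, and the names `rms` and `select_microphone` are hypothetical.

```python
import math
from typing import Dict, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of a block of audio samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def select_microphone(microphone_data: Dict[str, Sequence[float]]) -> str:
    """Return the id of the microphone whose capture has the highest RMS level.

    Assumption: a louder capture of the spoken phrase indicates the
    microphone best placed to hear the user.
    """
    if not microphone_data:
        raise ValueError("no microphone data received")
    return max(microphone_data, key=lambda mic_id: rms(microphone_data[mic_id]))


# Hypothetical captures of the same spoken phrase by two microphones.
captures = {
    "first_microphone": [0.01, -0.02, 0.01],
    "desk_microphone": [0.5, -0.4, 0.6],
}
selected = select_microphone(captures)
```

In this example `selected` is `"desk_microphone"`. A real apparatus might instead compare signal-to-noise ratio or recognition confidence.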
Artificial intelligence apparatus for recognizing speech of user and method for the same
An embodiment of the present invention provides an artificial intelligence (AI) apparatus for recognizing a speech of a user. The artificial intelligence apparatus includes a memory to store a speech recognition model and a processor to obtain a speech signal for a user speech, to convert the speech signal into a text using the speech recognition model, to measure a confidence level for the conversion, to perform a control operation corresponding to the converted text if the measured confidence level is greater than or equal to a reference value, and to provide feedback for the conversion if the measured confidence level is less than the reference value.
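The confidence-gated behavior described in this abstract can be sketched as a simple branch. This is a minimal illustration, not the patented implementation: the `Recognition` type, the `handle_recognition` function, the returned strings, and the default reference value of 0.8 are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Recognition:
    text: str          # text converted from the speech signal
    confidence: float  # measured confidence level, assumed to lie in [0, 1]


def handle_recognition(result: Recognition, reference_value: float = 0.8) -> str:
    """Describe the action taken for a recognition result."""
    if result.confidence >= reference_value:
        # Confidence meets the reference value: perform the control operation.
        return f"execute: {result.text}"
    # Confidence is below the reference value: give feedback instead.
    return f"feedback: low confidence ({result.confidence:.2f}) for '{result.text}'"


action = handle_recognition(Recognition("turn on the light", 0.93))
```

Here `action` is `"execute: turn on the light"`. A result whose confidence is exactly equal to the reference value is also executed, matching the "greater than or equal to" wording of the abstract.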
Electronic device and method for providing artificial intelligence services based on pre-gathered conversations
An electronic device capable of providing a voice-based intelligent assistance service may include: a housing; a microphone; at least one speaker; a communication circuit; a processor disposed inside the housing and operatively connected with the microphone, the speaker, and the communication circuit; and a memory operatively connected to the processor, and configured to store a plurality of application programs. The electronic device may be controlled to: collect voice data of a user based on a specified condition prior to receiving a wake-up utterance invoking a voice-based intelligent assistant service; transmit the collected voice data to an external server and request the external server to construct a prediction database configured to predict an intention of the user; and output, after receiving the wake-up utterance, a recommendation service related to the intention of the user based on at least one piece of information included in the prediction database.
VIRTUAL ASSISTANT IDENTIFICATION OF NEARBY COMPUTING DEVICES
In one example, a method includes: receiving audio data generated by a microphone of a current computing device; identifying, based on the audio data, one or more computing devices that each emitted a respective audio signal in response to speech reception being activated at the current computing device; and selecting either the current computing device or a particular computing device from the identified one or more computing devices to satisfy a spoken utterance determined based on the audio data.
Systems and methods for replaying content dialogue in an alternate language
Systems and methods are described herein for replaying content dialogue in an alternate language in response to a user command. While the content is playing on a media device, a first language in which the content dialogue is spoken is identified. Upon receiving a voice command to repeat a portion of the dialogue, a second language in which the command was spoken is identified. The portion of the content dialogue to repeat is identified and translated from the first language to the second language. The translated portion of the content dialogue is then output. In this way, the user can simply ask in their native language for the dialogue to be repeated, and the repeated portion of the dialogue is presented in the user's native language.
SYSTEM AND METHOD FOR VOICE RECOGNITION USING A PERIPHERAL DEVICE
A system and method for dictation using a peripheral device includes a voice recognition mouse. The voice recognition mouse includes a microphone, a first button, a processor coupled to the microphone and the first button, and a memory coupled to the processor. The memory stores instructions that, when executed by the processor, cause the processor to detect actuation of the first button and in response to detecting actuation of the first button, invoke the microphone for capturing audio speech from a user. The captured audio speech is streamed to a first module. The first module is configured to invoke a second module for converting the captured audio speech into text and forward the text to the first module for providing to an application expecting the text, the application being configured to display the text on a display device.