G10L13/027

Interactive method and device of robot, and device

Embodiments of the present disclosure provide an interactive method of a robot, an interactive device of a robot and a device. The method includes: obtaining voice information input by an interactive object, and performing semantic recognition on the voice information to obtain a conversation intention; obtaining feedback information corresponding to the conversation intention based on a conversation scenario knowledge base pre-configured by a simulated user; and converting the feedback information into a voice of the simulated user, and playing the voice to the interactive object.
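
The three steps of the method can be sketched at the text level as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: `ScenarioKnowledgeBase`, `INTENT_KEYWORDS`, `recognize_intention`, and `respond` are hypothetical names, keyword matching stands in for semantic recognition, and the speech recognition and voice synthesis stages are left out.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ScenarioKnowledgeBase:
    """Hypothetical knowledge base mapping conversation intentions to feedback text."""
    responses: Dict[str, str] = field(default_factory=dict)
    fallback: str = "Sorry, I did not understand that."

    def feedback_for(self, intention: str) -> str:
        # Unknown intentions fall back to a default reply (an assumption).
        return self.responses.get(intention, self.fallback)


# Hypothetical keyword table standing in for a semantic recognition model.
INTENT_KEYWORDS = {
    "greeting": ("hello", "hi"),
    "ask_weather": ("weather", "rain"),
}


def recognize_intention(transcript: str) -> str:
    """Return the first intention whose keyword appears in the transcript."""
    words = [word.strip(".,!?") for word in transcript.lower().split()]
    for intention, keywords in INTENT_KEYWORDS.items():
        if any(word in keywords for word in words):
            return intention
    return "unknown"


def respond(transcript: str, kb: ScenarioKnowledgeBase) -> str:
    """Map recognized text to feedback text; audio input and output are omitted."""
    return kb.feedback_for(recognize_intention(transcript))
```

In a full system, the returned text would then be converted into the simulated user's voice before playback.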

USING TOKEN LEVEL CONTEXT TO GENERATE SSML TAGS

This disclosure describes a system that analyzes a corpus of text (e.g., a financial article, an audio book, etc.) so that the context surrounding the text is fully understood. For instance, the context may be an environment described by the text, or an environment in which the text occurs. Based on the analysis, the system can determine sentiment, part of speech, entities, and/or human characters at the token level of the text, and automatically generate Speech Synthesis Markup Language (SSML) tags based on this information. The SSML tags can be used by applications, services, and/or features that implement text-to-speech (TTS) conversion to improve the audio experience for end-users. Consequently, via the techniques described herein, more realistic and human-like speech synthesis can be efficiently implemented at larger scale (e.g., for audio books, for all the articles published to a news site, etc.).
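
As a rough illustration of token-level tag generation, the sketch below wraps each token in SSML `<emphasis>` or `<prosody>` elements according to annotations that are assumed to come from an earlier analysis step. The function names, the sentiment scale of -1 to 1, and the 0.5 thresholds are assumptions made for this example, and the `<speak>` root omits the version and namespace attributes that a complete SSML document carries.

```python
from typing import Iterable, Tuple
from xml.sax.saxutils import escape

# (token text, sentiment in [-1, 1], whether the token is a named entity)
AnnotatedToken = Tuple[str, float, bool]


def token_to_ssml(token: str, sentiment: float, is_entity: bool) -> str:
    """Wrap one token in SSML tags chosen from its annotations."""
    text = escape(token)  # keep characters such as & and < valid in XML
    if is_entity:
        text = f'<emphasis level="moderate">{text}</emphasis>'
    if sentiment <= -0.5:
        text = f'<prosody rate="slow" pitch="low">{text}</prosody>'
    elif sentiment >= 0.5:
        text = f'<prosody pitch="high">{text}</prosody>'
    return text


def build_ssml(annotated_tokens: Iterable[AnnotatedToken]) -> str:
    """Join the tagged tokens into a single simplified SSML fragment."""
    body = " ".join(token_to_ssml(*token) for token in annotated_tokens)
    return f"<speak>{body}</speak>"
```

The resulting string could be passed to any TTS engine that accepts SSML input.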

Providing information cards using semantic graph data

Methods, systems, and apparatus, including computer programs encoded on computer-readable storage media, for providing information cards using semantic graph data. In some implementations, semantic graph data for a semantic graph is stored, where the semantic graph data indicates objects and relationships among the objects, and the objects include a card object that represents characteristics of an information card. A request is received from a client device, and the request is processed using the semantic graph data. Data for the information card is provided to the client device based on the card object indicated by the semantic graph data.
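
One way to picture a card object inside a semantic graph is sketched below. The `SemanticGraph` class, the `has_card` relation, and the `title` and `fields` attributes are hypothetical choices made for illustration; the abstract does not specify a schema.

```python
from typing import Any, Dict, List, Optional, Tuple


class SemanticGraph:
    """Toy in-memory graph: objects keyed by id, plus labeled edges."""

    def __init__(self) -> None:
        self.objects: Dict[str, Dict[str, Any]] = {}
        self.edges: List[Tuple[str, str, str]] = []

    def add_object(self, object_id: str, **attributes: Any) -> None:
        self.objects[object_id] = attributes

    def relate(self, source: str, relation: str, target: str) -> None:
        self.edges.append((source, relation, target))

    def targets(self, source: str, relation: str) -> List[str]:
        return [t for s, r, t in self.edges if s == source and r == relation]


def card_data(graph: SemanticGraph, entity_id: str) -> Optional[Dict[str, Any]]:
    """Build card data for an entity from the card object linked to it."""
    entity = graph.objects.get(entity_id)
    if entity is None:
        return None
    for target in graph.targets(entity_id, "has_card"):
        card = graph.objects.get(target, {})
        if card.get("type") == "card":
            # The card object decides which entity attributes are shown.
            return {
                "title": card["title"],
                "fields": {name: entity.get(name) for name in card["fields"]},
            }
    return None
```

Because the card's characteristics live in the graph, changing which fields a card shows would only require updating the card object, not the request-handling code.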

SPEECH TRANSCRIPTION FROM FACIAL SKIN MOVEMENTS
20230215437 · 2023-07-06

Systems and methods are disclosed for determining textual transcription from minute facial skin movements. In one implementation, a system may include at least one coherent light source, at least one sensor configured to receive light reflections from the at least one coherent light source; and a processor configured to control the at least one coherent light source to illuminate a region of a face of a user. The processor may receive from the at least one sensor, reflection signals indicative of coherent light reflected from the face in a time interval. The reflection signals may be analyzed to determine minute facial skin movements in the time interval. Then, based on the determined minute facial skin movements in the time interval, the processor may determine a sequence of words associated with the minute facial skin movements, and output a textual transcription corresponding with the determined sequence of words.

Agent device, agent device control method, and storage medium

An agent device includes an agent functional controller configured to provide a service including causing an output device to output a response of voice in response to an utterance of an occupant of a vehicle, and a controller configured to permit an operation of a power window of the vehicle when a speed of the vehicle is less than a first threshold value and limit the operation of the power window of the vehicle when the speed of the vehicle is equal to or greater than the first threshold value when the agent functional controller is activated.
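
The threshold rule in this abstract can be written as a small decision function. The sketch below is illustrative only: the names are hypothetical, and returning `None` when the agent is inactive is an assumption, since the abstract does not describe that case.

```python
from enum import Enum
from typing import Optional


class WindowOperation(Enum):
    PERMITTED = "permitted"
    LIMITED = "limited"


def window_operation(
    speed: float, first_threshold: float, agent_active: bool
) -> Optional[WindowOperation]:
    """Decide whether power window operation is permitted or limited."""
    if not agent_active:
        # Behavior without an active agent is not described in the abstract.
        return None
    if speed < first_threshold:
        return WindowOperation.PERMITTED
    # Speeds equal to or greater than the threshold limit the operation.
    return WindowOperation.LIMITED
```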

Method for assessing facility risks with natural language processing
11544464 · 2023-01-03

The present technology pertains to a method and system for assessing risks associated with facilities, based on using natural language processing. For example, a method can include receiving a natural language input comprising at least one raw text document associated with a facility and generating a plurality of segmented sentences from the raw text documents. The plurality of segmented sentences can be provided as inputs to a machine learning model trained to classify an input segmented sentence over a pre-defined lexicon of pharmaceutical terminology. Each segmented sentence can be classified into one or more classes given by the pre-defined lexicon of pharmaceutical terminology. A secondary classification can be performed for each classified segmented sentence to generate a production issue label based on an analysis of the classified segmented sentence. From the secondary classifications for the classified segmented sentences, at least one production category score for the facility can be generated.
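
The pipeline described here (segment, classify over a lexicon, apply a secondary classification, score) can be sketched with simple stand-ins. In the example below, keyword matching replaces the trained machine learning model, and the `LEXICON` and `ISSUE_CUES` tables, the function names, and the scoring formula (the fraction of a category's sentences flagged as issues) are all assumptions made for illustration.

```python
import re
from collections import Counter
from typing import Dict, List

# Hypothetical lexicon: category -> terms that assign a sentence to it.
LEXICON = {
    "contamination": ("contamination", "contaminated", "microbial"),
    "equipment": ("equipment", "calibration"),
}
# Hypothetical cue words standing in for the secondary classifier.
ISSUE_CUES = ("failed", "failure", "not", "lack", "inadequate")


def segment_sentences(raw_text: str) -> List[str]:
    """Split raw text after sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", raw_text) if s.strip()]


def classify(sentence: str) -> List[str]:
    """Return every lexicon category with a term present in the sentence."""
    lowered = sentence.lower()
    return [c for c, terms in LEXICON.items() if any(t in lowered for t in terms)]


def has_production_issue(sentence: str) -> bool:
    """Secondary classification: flag sentences containing an issue cue word."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return any(word in ISSUE_CUES for word in words)


def category_scores(raw_text: str) -> Dict[str, float]:
    """Score each category as the share of its sentences flagged as issues."""
    totals: Counter = Counter()
    issues: Counter = Counter()
    for sentence in segment_sentences(raw_text):
        for category in classify(sentence):
            totals[category] += 1
            if has_production_issue(sentence):
                issues[category] += 1
    return {category: issues[category] / totals[category] for category in totals}
```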
