Patent classifications
G10L15/30
Methods and systems for pushing audiovisual playlist based on text-attentional convolutional neural network
In some embodiments, methods and systems for pushing audiovisual playlists based on a text-attentional convolutional neural network include a local voice interactive terminal, a dialog system server and a playlist recommendation engine, where the dialog system server and the playlist recommendation engine are respectively connected to the local voice interactive terminal. In some embodiments, the local voice interactive terminal includes a microphone array, a host computer connected to the microphone array, and a voice synthesis chip board connected to the microphone array. In some embodiments, the playlist recommendation engine obtains rating data based on a rating predictor constructed by the neural network; the host computer parses the data into recommended playlist information; and the voice terminal synthesizes the results and pushes them to a user in the form of voice.
Preventing audio delay-induced miscommunication in audio/video conferences
Embodiments for delay-induced miscommunication reduction are provided. An embodiment may include capturing data streams transmitted between participants in an A/V exchange; translating, on a sender device prior to transmission to a recipient device, an audio stream within the data streams to text; timestamping, on the sender device prior to transmission to the recipient device, each word in the translated audio stream; transmitting the audio stream and the sender-side translated and timestamped audio stream to the recipient device; translating, on the recipient device, the transmitted audio stream to text; timestamping, on the recipient device, each word in the translated audio stream; determining that a lag exists in the A/V exchange based on a comparison of the timestamps for corresponding words in the sender-side and recipient-side translated and timestamped audio streams; and generating a true transcript of the intended exchange between the participants based on the comparison.
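The timestamp comparison described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `TimedWord` type, the positional word pairing, the 0.5-second threshold, and the assumption that both devices share a synchronized clock are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimedWord:
    word: str
    timestamp: float  # seconds on a clock assumed to be shared by both devices


def word_lags(sender, recipient):
    """Return recipient-minus-sender delay for words that match by position."""
    lags = []
    for s, r in zip(sender, recipient):
        if s.word == r.word:  # naive alignment; real transcripts may differ
            lags.append(r.timestamp - s.timestamp)
    return lags


def lag_exists(sender, recipient, threshold=0.5):
    """Flag a lag when any matched word arrives later than the threshold."""
    lags = word_lags(sender, recipient)
    return bool(lags) and max(lags) > threshold


# Hypothetical sample data: the third word arrives one second late.
sender = [TimedWord("can", 0.0), TimedWord("you", 0.25), TimedWord("hear", 0.5)]
recipient = [TimedWord("can", 0.125), TimedWord("you", 0.375), TimedWord("hear", 1.5)]

print(word_lags(sender, recipient))
print(lag_exists(sender, recipient))
```

A production system would likely need a more robust alignment than position-by-position matching, since the two speech-to-text passes may not produce identical word sequences.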
SERVER-SIDE PROCESSING METHOD AND SERVER FOR ACTIVELY INITIATING DIALOGUE, AND VOICE INTERACTION SYSTEM CAPABLE OF INITIATING DIALOGUE
A server-side processing method for actively initiating a dialogue is disclosed, comprising: establishing a communication connection with a voice client in response to a connection request received from the voice client; receiving an information stream sent by the voice client through the communication connection; performing a dialogue decision-making process according to the information stream; and, upon determining that the scenario is an active dialogue scenario, obtaining adapted dialogue content and outputting it to the voice client. A server and a system for actively initiating a dialogue are also provided. The disclosed solutions realize intelligent decision-making for voice interaction and can actively initiate a dialogue based on server-side decision-making, improving the interaction experience.
METHOD AND SYSTEM FOR IMPARTING VOICE COMMANDS TO A MOTOR VEHICLE
A method for imparting commands to a motor vehicle (1) is provided, the motor vehicle including a central control unit (23), at least one steered wheel (13), and a steering member (11) for acting on the steered wheel (13). The method includes a step of activating, by an interface device (21) on the steering member (11), a voice recognition function on a smartphone (5) interfaced with the central control unit (23). After the smartphone (5) has received a voice command imparted by the driver and recognized it through the activated voice recognition function, the smartphone (5) selects, by its processing unit, an instruction corresponding to the voice command received. The instruction is then executed.
SKILL DISPATCHING METHOD AND APPARATUS FOR SPEECH DIALOGUE PLATFORM
A skill dispatching method for a speech dialogue platform includes: receiving, by a central control dispatching service, a semantic result of recognizing a user's voice, sent by a data distribution service; dispatching, by the central control dispatching service, a plurality of skill services related to the semantic result in parallel, and obtaining skill parsing results from the plurality of skill services; sorting the skill parsing results based on priorities of the skill services, and exporting the result with the highest priority to a skill realization discrimination service; when realization fails, selecting the result with the highest priority among the remaining skill parsing results and exporting it to the skill realization discrimination service; and when realization succeeds, sending the result with the highest priority to the data distribution service for feedback to the user. The method improves skill dispatching efficiency, reduces delay, and improves user experience.
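The dispatch-then-fall-back flow described above can be sketched in Python as follows. This is an illustrative approximation only: the `dispatch` function, the tuple layout of `skills`, the convention that a lower number means higher priority, and the `can_realize` callback standing in for the skill realization discrimination service are all assumptions, not details taken from the patent.

```python
from concurrent.futures import ThreadPoolExecutor


def dispatch(semantic_result, skills, can_realize):
    """Parse with all skills in parallel, then try results in priority order.

    skills: list of (priority, name, parse_fn); lower priority number wins
    (an assumed convention). Returns (name, parsed) or None if nothing realizes.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(skills))) as pool:
        # Submit every skill first so the parses run concurrently.
        futures = [(prio, name, pool.submit(parse, semantic_result))
                   for prio, name, parse in skills]
        results = [(prio, name, fut.result()) for prio, name, fut in futures]

    # Walk the results from highest to lowest priority until one realizes.
    for _prio, name, parsed in sorted(results, key=lambda item: item[0]):
        if can_realize(name, parsed):
            return name, parsed
    return None


# Hypothetical skills and a realization check that rejects the music skill.
def parse_music(text):
    return {"intent": "play_music", "text": text}


def parse_weather(text):
    return {"intent": "get_weather", "text": text}


def reject_music(name, parsed):
    return name != "music"


skills = [(2, "weather", parse_weather), (1, "music", parse_music)]
print(dispatch("play jazz", skills, reject_music))
```

Because all parsing results are already available when the top candidate fails, the fallback step costs no extra round trip to the skill services, which is consistent with the reduced delay the abstract claims.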
SYSTEM AND METHOD FOR CONTROLLING A PLURALITY OF DEVICES
Provided are a system and a method for controlling a plurality of devices. The method includes: generating a command script by processing a text string with at least one model, the text string including a natural language input by a user; modifying the command script based on contextual data, the command script including a configuration for at least one device; generating at least one command signal based on the command script; and controlling the at least one device based on the at least one command signal.
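The four steps in this abstract (generate a script, modify it with context, generate signals, control devices) can be illustrated with a small Python sketch. Everything here is hypothetical: simple keyword matching stands in for the "at least one model", and the dictionary script format, the `hour` context field, the late-evening dimming rule, and the `device:key=value` signal strings are invented for illustration.

```python
def generate_script(text):
    """Stand-in for the model: keyword matching, not a trained model."""
    words = text.lower().split()
    script = {"device": None, "config": {}}
    if "light" in words or "lights" in words:
        script["device"] = "light"
    if "on" in words:
        script["config"]["power"] = "on"
    elif "off" in words:
        script["config"]["power"] = "off"
    return script


def apply_context(script, context):
    """Return a modified copy of the script; dim lights late in the evening."""
    updated = {"device": script["device"], "config": dict(script["config"])}
    is_light_on = (updated["device"] == "light"
                   and updated["config"].get("power") == "on")
    if is_light_on and context.get("hour", 12) >= 22:
        updated["config"]["brightness"] = 30  # assumed percentage scale
    return updated


def to_signals(script):
    """Flatten the device configuration into one signal string per setting."""
    return [f'{script["device"]}:{key}={value}'
            for key, value in sorted(script["config"].items())]


script = generate_script("Turn the lights on")
signals = to_signals(apply_context(script, {"hour": 23}))
print(signals)
```

Keeping the context step separate from script generation means the same user request can yield different device configurations without changing the language-processing stage.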