Data Processing Device and Method for Performing Speech-Based Human Machine Interaction

20200211560 · 2020-07-02

    Abstract

    A method for performing speech-based human machine interaction (HMI) includes obtaining a speech of a user and determining whether a response to the speech of the user can be generated. If no response can be generated, information corresponding to the speech of the user is sent to a call center. A data processing device for performing speech-based human machine interaction (HMI) includes an obtaining module to obtain a speech of a user, a determining module to determine whether a response to the speech of the user can be generated, and a sending module to send information corresponding to the speech of the user to the call center.

    Claims

    1. A method for performing speech-based human machine interaction (HMI), comprising: obtaining a speech of a user; determining whether a response to the speech of the user can be generated; and if no response can be generated, sending information corresponding to the speech of the user to a call center.

    2. The method according to claim 1, further comprising: establishing a phone call between the user and the call center.

    3. The method according to claim 1, wherein the step of determining whether a response to the speech of the user can be generated further comprises: recognizing, by using natural language understanding, NLU, an intention of the user according to the speech.

    4. The method according to claim 3, wherein the step of determining whether a response to the speech of the user can be generated further comprises: deciding whether the response can be generated; and generating the response according to the recognized intention of the user.

    5. The method according to claim 1, wherein the step of sending information corresponding to the speech of the user to the call center further comprises: storing the speech of the user; and sending the speech to the call center.

    6. The method according to claim 1, wherein the step of sending information corresponding to the speech of the user to the call center further comprises: generating text information according to the speech; and sending the text information to the call center.

    7. A data processing device for performing speech-based human machine interaction (HMI), comprising: an obtaining module to obtain a speech of a user; a determining module to determine whether a response to the speech of the user can be generated; and a sending module to send information corresponding to the speech of the user to a call center.

    8. The data processing device according to claim 7, further comprising: an establishing module to establish a phone call between the user and the call center.

    9. The data processing device according to claim 7, wherein the determining module comprises: a recognizing module to recognize, by using natural language understanding, NLU, an intention of the user according to the speech.

    10. The data processing device according to claim 9, wherein the determining module further comprises: a response generating module to generate the response according to the recognized intention of the user; and a deciding module to decide whether the response can be generated by the response generating module.

    11. The data processing device according to claim 7, wherein the sending module comprises: a storing module to store the speech of the user; and a speech sending module to send the speech to the call center.

    12. The data processing device according to claim 7, wherein the sending module comprises: a generating module to generate text information according to the speech; and a text sending module to send the text information to the call center.

    13. The data processing device according to claim 7, wherein the data processing device is installed within a vehicle.

    14. A data processing device for performing speech-based human machine interaction (HMI), comprising: a processor; a memory in communication with the processor, the memory storing a plurality of instructions executable by the processor to cause the data processing device to: obtain a speech of a user; determine whether a response to the speech of the user can be generated; and if no response can be generated, send information corresponding to the speech of the user to a call center.

    15. The data processing device according to claim 14, wherein the memory further comprises instructions to cause the data processing device to: establish a phone call between the user and the call center.

    16. The data processing device according to claim 14, wherein the memory further comprises instructions to cause the data processing device to: recognize, by using natural language understanding, NLU, an intention of the user according to the speech.

    17. The data processing device according to claim 16, wherein the memory further comprises instructions to cause the data processing device to: decide whether the response can be generated; and generate the response according to the recognized intention of the user.

    18. The data processing device according to claim 14, wherein the memory further comprises instructions to cause the data processing device to: store the speech of the user; and send the speech to the call center.

    19. The data processing device according to claim 14, wherein the memory further comprises instructions to cause the data processing device to: generate text information according to the speech; and send the text information to the call center.

    Description

    BRIEF DESCRIPTION OF THE DRAWINGS

    [0031] To describe the technical solutions in the embodiments of the present subject matter more clearly, the following briefly introduces the accompanying drawings required for describing the embodiments. Apparently, the accompanying drawings in the following description show merely some embodiments of the present subject matter, and a person of ordinary skill in the art may still derive other drawings from these accompanying drawings without creative efforts.

    [0032] FIG. 1 is a schematic diagram of a further embodiment of the method according to the present subject matter; and

    [0033] FIG. 2 shows a schematic diagram of an embodiment of the data processing device according to the present subject matter.

    DETAILED DESCRIPTION OF THE DRAWINGS

    [0034] The following clearly and completely describes the technical solutions in the embodiments of the present subject matter with reference to the accompanying drawings in the embodiments of the present subject matter. Apparently, the described embodiments are some but not all of the embodiments of the present subject matter. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present subject matter without creative efforts shall fall within the protection scope of the present subject matter.

    [0035] FIG. 1 shows a schematic flow chart of an embodiment of the method 10 for performing speech-based human machine interaction for an in-car navigation or infotainment system, especially for answering questions from the driver or conducting operations ordered by the driver. The method can be implemented by a data processing device as shown in FIG. 2, e.g., a processor with a corresponding computer program.

    [0036] In the first step S11 according to FIG. 1, an in-car interface, e.g., a microphone, receives the speech of the driver. In order to find a response for the driver, the speech is then transferred to the AI assistant system, which could be an onboard system, an off-board system or a hybrid system.

    [0037] In step S12, an intention of the user is recognized based on the speech of the driver by using natural language understanding (NLU) technology. Then, the in-car assistant system tries to generate the response according to the recognized intention of the user, for example by using an artificial intelligence assistant module, which is configured to find a suitable response to the driver's requirement as well as to conduct the operation corresponding to the user's intention.
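    The intent recognition and response generation of step S12 can be sketched as follows. This is a minimal, purely illustrative stand-in for an NLU module: the intents, keyword patterns and canned responses are invented placeholders, not part of the disclosure.

```python
from typing import Optional

# Sketch of step S12: recognize the user's intention from the speech
# (here already transcribed to text) and try to generate a response.
# Returning None models the case where no suitable response exists.

INTENT_PATTERNS = {
    "navigation": ["navigate", "route", "directions"],
    "weather": ["weather", "rain", "temperature"],
}

RESPONSES = {
    "navigation": "Starting route guidance.",
    "weather": "Here is the weather forecast.",
}

def recognize_intent(utterance: str) -> Optional[str]:
    """Return the recognized intention, or None if not understood."""
    text = utterance.lower()
    for intent, keywords in INTENT_PATTERNS.items():
        if any(kw in text for kw in keywords):
            return intent
    return None

def generate_response(utterance: str) -> Optional[str]:
    """Return a response, or None if no suitable response can be generated."""
    intent = recognize_intent(utterance)
    if intent is None:
        return None
    return RESPONSES.get(intent)
```

    A `None` return value corresponds to the decision, described below, that the artificial intelligence assistant module cannot generate a suitable response.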

    [0038] As mentioned before, in some cases the artificial intelligence assistant system cannot understand the user or cannot find a suitable answer to the question of the user. The in-car assistant system according to the present subject matter decides whether a suitable response can be generated by the artificial intelligence assistant module.

    [0039] If it is determined in step S12 that the in-car AI assistant module can understand and answer the driver correctly, according to step S15 the response corresponding to the question/voice of the driver will be sent to the driver through, e.g., a speaker and a display.

    [0040] According to step S13, if the AI assistant is not able to understand the driver's speech or cannot find a suitable answer, information corresponding to the speech of the user will be sent to the call center.

    [0041] In particular, the user's questions/speech will be sent to a speech recognition module or NLU module, which can translate the voice into text and extract its semantics. The text message will be sent to the call center in order to initiate the human assistant service. Then, before picking up the call, the human assistant can check the text message from the car and understand the meaning and intention of the driver. Additionally, important parts/words in the text message can be highlighted according to the analysis of the speech recognition module.
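    The speech-to-text and highlighting step can be sketched as follows. `transcribe` is a hypothetical placeholder standing in for a real speech recognition module, and the set of important words is an invented example; a real system would derive it from the semantic analysis.

```python
# Sketch of paragraph [0041]: turn the driver's speech into text and
# highlight important words for the call center agent.

IMPORTANT_WORDS = {"nearest", "charging", "station"}

def transcribe(audio_bytes: bytes) -> str:
    """Placeholder: a real ASR/NLU module would decode the audio here."""
    return "where is the nearest charging station"

def highlight(text: str, important: set) -> str:
    """Mark important words, e.g. with asterisks, for quick reading."""
    marked = [f"*{w}*" if w in important else w for w in text.split()]
    return " ".join(marked)
```

    The highlighted transcript lets the agent grasp the driver's request at a glance before picking up the call.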

    [0042] Alternatively, a voice message comprising the speech of the driver can be sent to the call center, instead of the text message.

    [0043] In particular, the driver/user need not repeat his/her request to the agent. The dialog design will let him/her know that the AI service has failed for some reason, but that a human agent will contact him/her immediately. In addition, the semantic analysis and the highlighted text will greatly help the agent to catch the user's intention, because call center agents normally do not have much time to read the whole text or listen to the audio recording; the latency of the call from the driver is a critical criterion for evaluating the service quality.

    [0044] In step S14, the in-car assistant system can also establish the concierge call between the driver and the call center.

    [0045] Accordingly, the AI-based assistant service is still the first choice for the driver, as it can answer most questions quickly without a long wait. The human assistant service (the so-called concierge service) can be automatically triggered when the AI-based assistant system fails to give a suitable answer.
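    The overall fallback flow of steps S12 to S15 can be sketched as follows. The function names and the queue-based hand-off to the call center are assumptions for illustration, not the claimed implementation.

```python
# Sketch of the fallback flow: try the AI assistant first; only when it
# cannot generate a response is the request forwarded to the call center
# and a concierge call triggered.

def handle_speech(utterance, ai_answer, call_center_queue):
    """ai_answer is a callable returning a response string or None."""
    response = ai_answer(utterance)
    if response is not None:
        return ("ai", response)             # step S15: answer the driver
    call_center_queue.append(utterance)     # step S13: forward to call center
    return ("concierge", None)              # step S14: concierge call follows
```

    The AI path carries no call center cost; the concierge path is entered only on failure, which keeps the human assistants available for the hard cases.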

    [0046] Before answering the call, the call center agents are able to know the general information and intention of the driver; therefore, the request and/or question need not be repeated to the call center assistant. When the call is connected, the agent can ask the driver to confirm his/her intention or directly provide the driver with suitable solutions. The user experience is thus improved.

    [0047] FIG. 2 shows a schematic diagram of the data processing device 100 according to the present subject matter. The data processing device 100 can be implemented in a vehicle.

    [0048] The data processing device 100 can implement the above-mentioned method for performing speech-based human machine interaction. The data processing device 100 comprises a receiving module 111 adapted to receive a speech of a user; a determining module 112 adapted to determine whether a response to the speech of the user can be generated; a sending module 113 adapted to send information corresponding to the speech of the user to the call center; an establishing module 114 adapted to establish a phone call between the user and a call center; and an artificial intelligence assistant module 115, which is configured to find a suitable response to the driver's requirement as well as to conduct the operation corresponding to the user's intention.
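    One possible way to sketch this modular structure is shown below, with each module reduced to a plain callable. The wiring is an assumption for illustration only; the module names mirror the description of FIG. 2.

```python
# Sketch of the data processing device 100 of FIG. 2 as a composition
# of callables: receiving module 111, determining module 112, sending
# module 113 and establishing module 114.

class DataProcessingDevice:
    def __init__(self, receive, determine, send, establish_call):
        self.receive = receive                # receiving module 111
        self.determine = determine            # determining module 112
        self.send = send                      # sending module 113
        self.establish_call = establish_call  # establishing module 114

    def run_once(self):
        speech = self.receive()
        response = self.determine(speech)
        if response is None:
            self.send(speech)       # forward the speech to the call center
            self.establish_call()   # trigger the concierge call
        return response
```

    Keeping the modules as injected callables makes it easy to realize each one onboard, off-board, or as a hybrid, as described for the AI assistant system above.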

    [0049] The determining module 112 comprises a recognizing module adapted to recognize, by using natural language understanding, NLU, an intention of the user according to the speech, a response generating module adapted to generate the response according to the recognized intention of the user, and a deciding module adapted to decide whether the response can be generated by the response generating module.

    [0050] Furthermore, the sending module 113 comprises a storing module adapted to store the speech of the user and a speech sending module adapted to send the speech to the call center. Alternatively or additionally, the sending module 113 comprises a generating module adapted to generate text information according to the speech and a text sending module adapted to send the text information to the call center. Accordingly, both the speech of the driver and the text information interpreted from the speech can be sent to the call center, so that the human assistant in the call center can clearly understand the user.

    [0051] Additionally, the speech of the driver which the artificial intelligence assistant module cannot deal with correctly, together with the answer of the human assistant, can be sent to the artificial intelligence assistant module and analyzed. Such data can complement the database of the artificial intelligence assistant module and are very useful for training the artificial intelligence assistant. Therefore, questions that the artificial intelligence assistant was not able to answer can later be solved by the artificial intelligence assistant using the updated database. The performance of the artificial intelligence assistant can thus be improved.
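    This feedback loop can be sketched as follows. The dict-based "database" and the exact-match lookup are deliberate simplifications chosen for illustration; a real assistant would feed such pairs into its training pipeline.

```python
# Sketch of paragraph [0051]: pair each question the AI assistant could
# not answer with the human agent's answer and add it to the assistant's
# database, so the same question can be answered automatically later.

def update_database(database: dict, failed_question: str, agent_answer: str) -> dict:
    """Complement the assistant's database with a resolved case."""
    database[failed_question.lower()] = agent_answer
    return database

def answer_from_database(database: dict, question: str):
    """Return a stored answer, or None if the question is still unknown."""
    return database.get(question.lower())
```

    After the agent resolves a request, the question/answer pair is stored, and the next occurrence of the same question no longer needs the concierge path.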

    [0052] The foregoing disclosure has been set forth merely to illustrate the invention and is not intended to be limiting. Since modifications of the disclosed embodiments incorporating the spirit and substance of the invention may occur to persons skilled in the art, the invention should be construed to include everything within the scope of the appended claims and equivalents thereof.