Multimodal dialog in a motor vehicle

11551679 · 2023-01-10

Abstract

A method for carrying out a multimodal dialog in a vehicle, in particular a motor vehicle, improves the interaction between the vehicle and a vehicle user so that the dialog is as natural as possible. For this purpose, the following acts are performed: sensing an input of a vehicle user for activating a voice dialog, and activating gesture recognition.

Claims

1. A method for carrying out a multimodal dialog in a vehicle, comprising the acts of: permanently activating a complex gesture recognition in the vehicle such that complex gestures are able to be permanently recognized; capturing a first input of a vehicle user for activating a voice dialog; activating the voice dialog in response to the capturing wherein a simple gesture recognition is not yet activated such that simple gestures are not yet able to be recognized; before activating the simple gesture recognition, checking whether the voice dialog has been concluded; and activating the simple gesture recognition such that simple gestures are able to be recognized when the checking determines that the voice dialog has not been concluded.

2. The method as claimed in claim 1, wherein the first input of the vehicle user for activating the voice dialog is a first voice complex gesture of the vehicle user.

3. The method as claimed in claim 1 further comprising the acts of: capturing a second input of the vehicle user; and processing the second input.

4. The method as claimed in claim 1 further comprising the acts of: checking whether the voice dialog has been concluded after activating the simple gesture recognition; and deactivating the simple gesture recognition when the checking determines that the voice dialog has been concluded.

5. The method as claimed in claim 3 further comprising the acts of: outputting an input request in response to the first input of the vehicle user; and/or outputting the input request in response to the second input of the vehicle user.

6. The method as claimed in claim 5, wherein the input request is a spoken input request.

7. The method as claimed in claim 5 further comprising the act of capturing and/or processing the second input of the vehicle user on a basis of the input request.

8. A multimodal dialog machine for a vehicle for carrying out the method as claimed in claim 1.

9. A motor vehicle comprising the multimodal dialog machine as claimed in claim 8.

Description

BRIEF DESCRIPTION OF THE DRAWING

(1) The FIGURE shows a flowchart of an embodiment of the method according to the invention.

DETAILED DESCRIPTION OF THE DRAWING

(2) It is pointed out that the FIGURE and the associated description are merely an exemplary embodiment of the invention. In particular, the illustration of combinations of features in the FIGURE and/or in the description of the FIGURE should not be interpreted to the effect that the invention necessarily requires the implementation of all features mentioned. Other embodiments of the invention may contain fewer, more and/or different features. The scope of protection and the disclosure of the invention emerge from the accompanying patent claims and the complete description.

(3) At the start of the method, a first voice input of a vehicle user is captured in step 10. At this time, the gesture recognition is not yet active. The voice input could be, for example, “Call Robert Meyer”.

(4) The first voice input is then processed in step 20, for example by searching a telephone book stored in the vehicle or in a mobile telephone connected to the vehicle.

(5) Step 30 checks whether the dialog has been concluded. If this is the case, the method ends. This could be the case, for example, if the first voice input were so clear that it could be carried out immediately. In the present example, however, it shall be assumed that a plurality of telephone numbers are stored for the telephone book entry of “Robert Meyer”.
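The processing and check of steps 20 and 30 can be illustrated with a minimal Python sketch. The telephone book contents, the function names and the rule that the dialog counts as concluded only when exactly one number matches are assumptions made for illustration; the description does not prescribe an implementation.

```python
# Hypothetical telephone book; names and numbers are illustrative only.
PHONE_BOOK = {
    "Robert Meyer": {"main": "+49 89 0000-100", "mobile": "+49 170 0000-200"},
    "Anna Schmidt": {"main": "+49 89 0000-300"},
}


def process_first_input(name):
    """Step 20 (sketch): search the telephone book for the spoken name."""
    return PHONE_BOOK.get(name, {})


def dialog_concluded(numbers):
    """Step 30 (sketch): assume the dialog is concluded only when the
    request is unambiguous, i.e. exactly one number was found."""
    return len(numbers) == 1


numbers = process_first_input("Robert Meyer")
if dialog_concluded(numbers):
    print("Calling", next(iter(numbers.values())))
else:
    # Several numbers are stored, so the dialog has to continue.
    print("Dialog continues; candidates:", sorted(numbers))
```

With the sample data above, the entry for Robert Meyer holds two numbers, so the sketch takes the branch in which the dialog continues.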

(6) An input request is therefore output in step 35. For this purpose, a list of all telephone numbers for Robert Meyer is output on a head-up display, and the telephone number stored as the main number is graphically highlighted. At the same time, the input request comprises a voice output with the content “Would you like to call Robert Meyer's main number?”.

(7) The gesture recognition is now activated in step 40. From this time, the vehicle user can conduct the dialog in a multimodal manner. This means that he/she can, but need not, use gestures. He/she could also continue the dialog with further voice inputs or could use conventional operating elements.

(8) In step 50, a further input of the vehicle user which is a gesture is captured. The vehicle user could now abort the dialog with a swiping gesture or could scroll through the list of telephone numbers displayed on the head-up display using a “scrolling gesture” (pointing gesture in the upward or downward direction). However, it shall be assumed that the user would like to select the suggested option (Robert Meyer's main number). The further input of the vehicle user therefore comprises a pointing gesture in which the vehicle user holds his/her extended index finger in the direction of the head-up display (and therefore in the direction of the windshield of the vehicle) and moves it slightly forward and then back again in this direction (that is to say in the pointing direction).
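The gestures named in this paragraph can be sketched as a simple mapping from a recognized gesture to a dialog action. The enum members, the function name and the clamping of the highlighted entry at the ends of the list are assumptions introduced for illustration only.

```python
from enum import Enum


class Gesture(Enum):
    """Hypothetical labels for the gestures named in the description."""
    SWIPE = "swipe"                  # abort the dialog
    POINT_UP = "point_up"            # scroll the list upward
    POINT_DOWN = "point_down"        # scroll the list downward
    POINT_FORWARD = "point_forward"  # select the highlighted entry


def apply_gesture(gesture, highlighted, count):
    """Return (action, new_highlighted_index) for a list of `count` entries."""
    if gesture is Gesture.SWIPE:
        return ("abort", highlighted)
    if gesture is Gesture.POINT_UP:
        return ("scroll", max(highlighted - 1, 0))
    if gesture is Gesture.POINT_DOWN:
        return ("scroll", min(highlighted + 1, count - 1))
    if gesture is Gesture.POINT_FORWARD:
        return ("select", highlighted)
    raise ValueError(f"unknown gesture: {gesture!r}")


# The main number is highlighted at index 0 of a two-entry list.
print(apply_gesture(Gesture.POINT_FORWARD, highlighted=0, count=2))
```

In the example of the description, the pointing gesture toward the head-up display corresponds to `POINT_FORWARD`, which selects the highlighted main number.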

(9) During the processing of the further input which is carried out in step 60, this gesture is interpreted and carried out. The telephone call is made.

(10) Step 70 checks whether the dialog has been concluded. This is so in the present example, with the result that the gesture recognition is deactivated in step 80 and the method ends. If, in contrast, it were necessary to continue the dialog (for example because the further input was misleading or ambiguous), an input request could be output in step 75 (“I did not understand you. Please repeat your input.”). The method would then be continued with step 50, in which a further input is captured.
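The control flow of steps 10 to 80 can be summarized in a short Python sketch that returns the sequence of flowchart steps visited. The function name and the reduction of each input to a boolean (unambiguous or not, understood or not) are simplifying assumptions; the sketch only illustrates the order of the steps and is not a definitive implementation.

```python
def run_dialog(first_input_is_unambiguous, further_inputs):
    """Return the list of flowchart steps visited (sketch).

    further_inputs is an iterable of booleans. True means the further input
    was understood and concludes the dialog, False means it was ambiguous.
    """
    steps = [10, 20, 30]  # capture, process, check whether concluded
    if first_input_is_unambiguous:
        return steps  # the method ends without gesture recognition

    steps += [35, 40]  # input request, then activate gesture recognition
    for understood in further_inputs:
        steps += [50, 60, 70]  # capture, process, check whether concluded
        if understood:
            steps.append(80)  # deactivate gesture recognition; method ends
            return steps
        steps.append(75)  # ask the user to repeat the input
    # No inputs left in this simulation; the dialog would wait at step 50.
    return steps


print(run_dialog(False, [True]))
```

Passing `[False, True]` as the further inputs reproduces the alternative described above, in which step 75 leads back to step 50 before the dialog is concluded.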

(11) The foregoing disclosure has been set forth merely to illustrate the invention and is not intended to be limiting. Since modifications of the disclosed embodiments incorporating the spirit and substance of the invention may occur to persons skilled in the art, the invention should be construed to include everything within the scope of the appended claims and equivalents thereof.