METHOD FOR ACQUIRING AT LEAST TWO PIECES OF INFORMATION TO BE ACQUIRED, COMPRISING INFORMATION CONTENT TO BE LINKED, USING A SPEECH DIALOGUE DEVICE, SPEECH DIALOGUE DEVICE, AND MOTOR VEHICLE
20170249941 · 2017-08-31
Cpc classification
G10L15/22
PHYSICS
G10L15/222
PHYSICS
International classification
G10L15/22
PHYSICS
Abstract
A voice output is produced by a speech dialogue device between the acquisitions of two pieces of information. Each piece of information is acquired by acquiring natural verbal voice input data and extracting the respective piece of information from the voice input data using a speech recognition algorithm. When a repetition condition has been satisfied, a natural speech summary output is generated by the speech dialogue device and output as a voice output which includes a natural voice reproduction of at least one previously acquired piece of information or a part of this piece of information or a piece of information derived from this piece of information.
Claims
1.-13. (canceled)
14. A method for acquiring pieces of information including information content to be linked, the method comprising: acquiring the pieces of information, using a speech dialogue device, by natural voice input data and by extracting respective pieces of information from the natural voice input data using a speech recognition algorithm; producing, by the speech dialogue device, a voice output between each acquisition of the pieces of information; generating, by the speech dialogue device when a repetition condition has been satisfied, a natural speech summary output and producing a voice output including a natural voice reproduction of the natural speech summary output which includes at least a part of at least one previously acquired piece of information or a derived piece of information derived from the at least one previously acquired piece of information, a necessary condition for the satisfaction of the repetition condition being that an initially satisfied interrupt condition, during the satisfaction of which the acquiring of the pieces of information is interrupted, is no longer satisfied; determining, by the speech dialogue device when a plurality of pieces of information have been acquired when the repetition condition was satisfied and based on a repetition parameter for the plurality of pieces of information, whether the natural voice reproduction of at least a part of the plurality of pieces of information or a derived piece of information derived from the plurality of pieces of information is incorporated into the natural speech summary output; determining, by the speech dialogue device, at least one of a duration of the interrupt condition and a complexity value corresponding to a measure of a load on a user by an event causing the interrupt when the interrupt condition is satisfied; and determining the repetition parameter based on at least one of the duration and the complexity value.
15. The method as claimed in claim 14, further comprising determining whether the repetition condition is satisfied based on at least one of a number of the previously acquired pieces of information, the previously acquired pieces of information, the duration of the interrupt condition, and the complexity value.
16. The method as claimed in claim 14, further comprising: determining the repetition parameter based on the complexity value; and determining whether the interrupt condition is satisfied based on a signal of a device connected to the speech dialogue device, wherein the device transmits a device complexity value used to determine the complexity value.
17. The method as claimed in claim 14, further comprising: determining the repetition parameter based on the complexity value; and determining whether the interrupt condition is satisfied based on signals of a plurality of devices connected to the speech dialogue device, wherein the determining the complexity value is based on at least one of the connected devices which output at least one signal among the signals which cause the interrupt condition to be satisfied.
18. The method as claimed in claim 14, wherein the determining the complexity value is based on at least one of the pieces of information to be acquired or at least one of the previously acquired pieces of information.
19. A speech dialogue device for a motor vehicle, comprising: a voice input device configured to acquire natural voice input data; a voice output device configured to produce a voice output between each acquisition of pieces of information extracted from the natural voice input data; and a controller configured to: acquire the pieces of information, including information content to be linked, by extracting the pieces of information from the natural voice input data using a speech recognition algorithm, generate, when a repetition condition has been satisfied, a natural speech summary output and produce a voice output including a natural voice reproduction of the natural speech summary output which includes at least a part of at least one previously acquired piece of information or a derived piece of information derived from the at least one previously acquired piece of information, a necessary condition for the satisfaction of the repetition condition being that an initially satisfied interrupt condition, during the satisfaction of which the acquiring of the pieces of information is interrupted, is no longer satisfied, determine, when a plurality of pieces of information have been acquired when the repetition condition was satisfied and based on a repetition parameter for the plurality of pieces of information, whether the natural voice reproduction of at least a part of the plurality of pieces of information or a derived piece of information derived from the plurality of pieces of information is incorporated into the natural speech summary output, determine at least one of a duration of the interrupt condition and a complexity value corresponding to a measure of a load on a user by an event causing the interrupt when the interrupt condition is satisfied, and determine the repetition parameter based on at least one of the duration and the complexity value.
20. A motor vehicle, comprising: a chassis; and a speech dialogue device including: a voice input device configured to acquire natural voice input data; a voice output device configured to produce a voice output between each acquisition of pieces of information extracted from the natural voice input data; and a controller configured to: acquire the pieces of information, including information content to be linked, by extracting the pieces of information from the natural voice input data using a speech recognition algorithm, generate, when a repetition condition has been satisfied, a natural speech summary output and produce a voice output including a natural voice reproduction of the natural speech summary output which includes at least a part of at least one previously acquired piece of information or a derived piece of information derived from the at least one previously acquired piece of information, a necessary condition for the satisfaction of the repetition condition being that an initially satisfied interrupt condition, during the satisfaction of which the acquiring of the pieces of information is interrupted, is no longer satisfied, determine, when a plurality of pieces of information have been acquired when the repetition condition was satisfied and based on a repetition parameter for the plurality of pieces of information, whether the natural voice reproduction of at least a part of the plurality of pieces of information or a derived piece of information derived from the plurality of pieces of information is incorporated into the natural speech summary output, determine at least one of a duration of the interrupt condition and a complexity value corresponding to a measure of a load on a user by an event causing the interrupt when the interrupt condition is satisfied, and determine the repetition parameter based on at least one of the duration and the complexity value.
21. The motor vehicle as claimed in claim 20, further comprising at least one vehicle system which, when a signal condition is satisfied, outputs an interrupt signal to the speech dialogue device to satisfy the interrupt condition.
22. The motor vehicle as claimed in claim 21, wherein the at least one vehicle system includes at least one of a communication device, a radio, a driver assistance system configured to evaluate a traffic situation, and a navigation system.
23. The motor vehicle as claimed in claim 22, wherein the at least one vehicle system includes the driver assistance system, and the driver assistance system is further configured to determine a device complexity value based on the traffic situation when the interrupt signal is output by the at least one vehicle system.
24. The motor vehicle as claimed in claim 20, wherein the pieces of information acquired via the voice input device are used to control at least one function of the motor vehicle, the interrupt condition is satisfied before all of the pieces of information are acquired, the speech dialogue device selects a structure for the natural speech summary output from among a plurality of possible structures of natural speech summary outputs, the natural speech summary output including at least a part of the pieces of information acquired before the interrupt condition was satisfied, and a remaining part of the pieces of information acquired via the voice input device are acquired after the interrupt condition is no longer satisfied and after the voice output device produces the voice output including the natural voice reproduction of the natural speech summary output.
25. The motor vehicle as claimed in claim 20, wherein the controller determines the repetition parameter based on the duration for which the interrupt condition is present, and the controller determines the repetition condition is satisfied if the duration for which the interrupt condition is present is greater than a predefined duration.
26. The motor vehicle as claimed in claim 20, wherein the controller determines the repetition parameter based on the complexity value, and a content of the natural speech summary output increases as the load on the user increases.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0031] These and other aspects and advantages will become more apparent and more readily appreciated from the following description of the exemplary embodiments, taken in conjunction with the accompanying drawings of which:
[0032]
[0033]
[0034]
[0035]
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
[0036] Reference will now be made in detail to the preferred embodiments, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to like elements throughout.
[0037] Referring to the drawings,
[0038] First, the first process, including S1 to S8, will be described. Note that this process can be interrupted at any time by one of the other processes and that, after an interrupt has taken place, it can under certain circumstances be continued. In S1, voice input data are first acquired. For this purpose, the speech dialogue device includes a voice input device, for example one or more microphones. The acoustically acquired speech data are initially digitized and stored in a memory device.
[0039] In S2, a speech recognition algorithm is applied to the voice input data acquired in S1. In this context it is determined whether at least one of the pieces of information to be acquired is contained in the voice input data. If this is not the case, the acquisition of voice input data is continued until at least one piece of information is contained in the acquired voice input data. The one or more pieces of information contained in the acquired voice input data are subsequently stored. In this context, the storage of the pieces of information takes place in an ontology-based database. Each of the pieces of information is therefore assigned a meaning within the scope of the dialogue conducted by the speech dialogue device. The assignment of meaning is carried out by adding meta data to the acquired pieces of information.
[0040] In S3, the addressed function is determined. The method can be used to control a plurality of devices, and each of these devices can offer different input possibilities. For example, a speech dialogue which controls a communication device can trigger a call, add an entry to a telephone directory, edit such an entry or compose a text message using voice recognition. Likewise, an appointment planner can be controlled by the speech dialogue device, wherein new appointments can be created, appointment lists can be output or settings can be changed using speech dialogues. To control a navigation system, a target address for the navigation can be input, for example, or a search for the closest refueling station can be conducted. In S3, the speech dialogue device therefore evaluates the pieces of information extracted in S2 and determines both the other pieces of information to be acquired and the voice outputs of the speech dialogue device which are output, in particular as enquiries, between acquiring the pieces of information.
[0041] In S4, the piece of information which is extracted in S2 or the pieces of information which are extracted in S2 are stored. In this context, the meta-information described in S2 is stored in relation to the pieces of information, whereby an ontology-based data structure, in which not only the extracted pieces of information but also a meaning content of the extracted pieces of information and a relationship between the pieces of information is stored, is formed.
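By way of illustration only, the storage described for S2 and S4 could be sketched in Python as follows. The class and field names (`Piece`, `DialogueStore`, `meaning`, `related_to`) are hypothetical and are not taken from the described embodiment; a plain dictionary stands in for the ontology-based database.

```python
from dataclasses import dataclass, field


@dataclass
class Piece:
    """One acquired piece of information plus its meta-information."""
    value: str
    meaning: str                                      # role in the dialogue, e.g. "town"
    related_to: list = field(default_factory=list)    # meanings of linked pieces


class DialogueStore:
    """Toy stand-in for the ontology-based data structure (assumption)."""

    def __init__(self):
        self._pieces = {}

    def add(self, value, meaning, related_to=()):
        # Store the value together with its meaning and its relationships.
        self._pieces[meaning] = Piece(value, meaning, list(related_to))

    def get(self, meaning):
        piece = self._pieces.get(meaning)
        return piece.value if piece else None

    def missing(self, required):
        # Meanings that the addressed function still needs.
        return [m for m in required if m not in self._pieces]
```

Keying the store by meaning rather than by order of acquisition is a design choice of this sketch; it makes the check in S5 a simple lookup of which required meanings are still absent.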
[0042] In S5 it is checked whether all the pieces of information to be acquired have already been acquired. The number and type of the pieces of information to be acquired is defined by the function determined in S3. As explained there, it is possible that the number of the pieces of information to be acquired and the type of the pieces of information to be acquired are adapted in further acquisition operations.
[0043] If not all the pieces of information to be acquired have yet been acquired, in S6 a voice output which requests from the user at least one further piece of information to be acquired is output using the speech dialogue device. A dialogue is therefore set up by alternately requesting pieces of information and acquiring the corresponding pieces of information. The renewed acquisition and evaluation of the further piece of information subsequently takes place in S7, as also explained with respect to S1 and S2. Subsequently, in S4 the newly acquired piece of information is also stored with associated meta-information, and in S5 it is checked again whether all the pieces of information to be acquired have now been acquired. This is repeated until it is determined in S5 that all the pieces of information to be acquired have been acquired. In this case, the method ends in S8, after which components to be controlled can be controlled and pieces of information obtained using the speech dialogue, or the like, can be stored.
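The alternation of enquiry and acquisition in S1 to S8 can be sketched, under simplifying assumptions, as a loop in Python. The function `run_dialogue` and its callables `listen`, `speak` and `extract` are hypothetical placeholders for the voice input device, the voice output device and the speech recognition algorithm; interrupts are not modelled here.

```python
def run_dialogue(required, listen, speak, extract):
    """Acquire every required piece of information, asking for missing ones.

    required: ordered list of meanings that the addressed function needs.
    listen:   callable returning the next utterance (S1 / S7).
    speak:    callable producing a voice output (S6).
    extract:  callable mapping an utterance to {meaning: value} (S2).
    """
    acquired = {}
    acquired.update(extract(listen()))                 # S1, S2, S4
    while True:
        missing = [m for m in required if m not in acquired]   # S5
        if not missing:
            return acquired                            # S8
        speak(f"Please tell me the {missing[0]}.")     # S6
        acquired.update(extract(listen()))             # S7, S4
```

With a scripted list of utterances and an extractor that splits strings of the assumed form `"meaning: value"`, the loop issues one enquiry per missing piece of information and ends as soon as nothing is missing.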
[0044] In addition to the process in S1 to S8 described above, processes S9 to S11 are carried out in parallel. In S9, input data are acquired using the speech dialogue device and evaluated within the scope of a repetition condition. These data include, in particular, the time which has currently passed between a last voice output of the speech dialogue device and the acquisition of the pieces of information to be acquired. In addition, it is possible to acquire signals of operator control elements which, when activated by a user, cause a natural speech summary output to be output. In addition, the voice input data are continuously examined for the presence of a key word by which a user can cause a summary output to be generated and output as a voice output.
[0045] In S10 it is checked whether the input data mentioned above satisfy the repetition condition. The repetition condition is satisfied here if the key word has been detected, an output of a summary output has been requested by the user by activating an operator control element or a predefined time interval is exceeded. If the repetition condition is not satisfied, the method is continued with S9. When the repetition condition is satisfied, the method is continued in S11, in which a summary output is generated and output. This is explained in more detail with respect to
[0046] In parallel with the sequences including S1 to S8 and S9 to S11, in S12 it is continuously checked whether an interrupt condition is present, and when an interrupt condition is present the dialogue process which is described in relation to S1 to S8 is interrupted. In particular, in this context the voice output and the voice acquisition are interrupted. When an interrupt signal is no longer present, a natural speech summary output is, under certain circumstances, generated and output before the dialogue process is continued. The procedure during S12 is explained more precisely with respect to
[0047] Referring again to the drawings,
[0048] In S13, there is initially acquisition of data which determine the structure of the summary output, in particular the number and selection of the pieces of information contained in the summary output as well as the method of the reproduction of the information. In this context, it is detected, in particular, what has triggered the generation of the summary output. In addition to triggering as a result of S11 in
[0049] In S14 it is determined which of the acquired pieces of information are reproduced within the scope of the summary output. For this purpose, a common value which determines the verbosity is determined from the values determined in S13, and one of a plurality of predefined structures for a summary output is selected as a function of this common value. In this context, a pre-selection of a group of structures of summary outputs is made as a function of the type and number of the pieces of information acquired and the addressed function. Within this group, one of the structures is subsequently selected as a function of the determined common value. A structure of the summary output specifies which of the acquired pieces of information is output in which form at what point and typically also contains additional phrases for connecting the pieces of information.
[0050] In S15, the summary output is generated by inserting the acquired pieces of information into the previously determined structure of the summary output in accordance with the structure, in order to obtain a natural speech summary output.
[0051] This will be explained using the example of inputting a navigation destination. In this type of dialogue, the type of dialogue, a town, a road and a house number are provided as the pieces of information to be acquired. In this example, all the pieces of information have already been acquired apart from the house number. If “Munich” was acquired as the town and “Innere Wiener Straße” as the road, it is possible to generate the detailed summary output “We were just about to create a destination input to ‘Innere Wiener Straße, Munich’.” By using “destination input”, the addressed function, specifically the determination of a navigation destination, is specified, and by using “Munich” and “Innere Wiener Straße” the location and the road are specified. Therefore, all three previously acquired pieces of information are reproduced. However, depending on the specified parameters, in particular the duration and the complexity value, it may be advantageous to generate a relatively short summary output. The summary output can also be generated here together with the enquiry as to the next piece of information to be acquired. For example, the sentence “Which house number in the Innere Wiener Straße is to be driven to?” could be generated as a summary output.
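The selection of a structure in S14 and its filling in S15 can be illustrated with the navigation example above. The following Python sketch is an assumption-laden simplification: the dictionary `STRUCTURES`, the function `build_summary`, the scalar `verbosity` and the threshold of 0.5 are all hypothetical, and only two structures are shown.

```python
# Predefined structures for the "destination input" function (illustrative).
STRUCTURES = {
    "detailed": 'We were just about to create a destination input to "{road}, {town}".',
    "short": "Which house number in the {road} is to be driven to?",
}


def build_summary(acquired, verbosity, threshold=0.5):
    """Pick a structure from the common verbosity value and fill it in.

    acquired:  {meaning: value} of the pieces of information acquired so far.
    verbosity: common value derived in S14 (assumed here to lie in 0..1).
    """
    key = "detailed" if verbosity >= threshold else "short"    # S14
    return STRUCTURES[key].format(**acquired)                  # S15
```

For the acquired pieces of information `{"town": "Munich", "road": "Innere Wiener Straße"}`, a high verbosity value selects the detailed structure, which reproduces the function, the road and the town, whereas a low value selects the short structure, which combines the reminder with the enquiry for the house number.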
[0052] The summary output is subsequently output in S16 using a voice output device, and in S17 the system returns to the sequence shown in process S1 to S8 in
[0053] Referring again to the drawings,
[0054] In S19 it is determined whether an interrupt condition is satisfied. An interrupt condition is satisfied if at least one of the motor vehicle systems whose signals have been acquired in S18 sends an interrupt signal. Additionally or alternatively, further interrupt conditions could also be checked in S19, for example whether a further passenger of the motor vehicle is speaking, whether the user is speaking without continuing the speech dialogue or whether a certain minimum time during which no piece of information has been acquired has passed since the last voice output of the speech dialogue device. If no interrupt condition is present, the method is continued from S18.
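A minimal Python sketch of the check in S19 is given below, assuming that each connected system reports a simple flag. The names `DeviceSignal`, `interrupt_sources` and `interrupt_condition` and the silence limit of 15 seconds are hypothetical; the additional conditions correspond to the optional checks mentioned above.

```python
from dataclasses import dataclass


@dataclass
class DeviceSignal:
    source: str        # e.g. "radio", "navigation" (labels are assumptions)
    interrupt: bool    # True if the system currently sends an interrupt signal


def interrupt_sources(signals):
    """Names of all systems that currently send an interrupt signal."""
    return [s.source for s in signals if s.interrupt]


def interrupt_condition(signals, passenger_speaking=False,
                        silence_s=0.0, max_silence_s=15.0):
    """S19: satisfied if any system interrupts or an optional check fires."""
    return (bool(interrupt_sources(signals))
            or passenger_speaking
            or silence_s > max_silence_s)
```

Returning the interrupting sources separately is useful in this sketch because the source of the interrupt can later be taken into account when a complexity value is determined.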
[0055] In S20 a timing process is started or a time counter is reset. In the method shown, the time counter is used to acquire the total length of the interrupt of the speech dialogue once the interrupt ends, that is to say when the interrupt condition is eliminated, so that the generation and outputting of the summary output can be controlled as a function of this total length.
[0056] In S21, the execution of the dialogue, that is to say the execution of S1 to S8, is stopped and the instantaneous state of the dialogue is stored. For this purpose, at least the pieces of information already acquired and the associated pieces of meta-information are stored. In addition, the speech dialogue device can also store the last voice output or further preceding voice outputs, the addressed function and/or the point within the dialogue which has been reached at that particular time. Depending on the specific embodiment of the speech dialogue device, S21 can, under certain circumstances, be omitted, since the state of the dialogue may be present in the memory in any case. However, since a further dialogue could under certain circumstances be carried out during the interrupt, separate storage of the state of the dialogue is generally advantageous.
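The separate storage of the state of the dialogue in S21 could, for example, be realized as a deep copy, so that a further dialogue carried out during the interrupt cannot alter the stored state. The following Python sketch is illustrative only; `DialogueState`, `save_state` and `restore_state` are hypothetical names.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class DialogueState:
    function: str                                   # addressed function
    acquired: dict = field(default_factory=dict)    # pieces of information so far
    last_output: str = ""                           # last voice output
    step: str = "S6"                                # point reached in the dialogue


def save_state(state):
    """S21: take an independent snapshot of the running dialogue."""
    return copy.deepcopy(state)


def restore_state(snapshot):
    """Reload a snapshot without exposing the stored copy to later changes."""
    return copy.deepcopy(snapshot)
```

A deep copy rather than a plain reference is chosen in this sketch because the dictionary of acquired pieces of information is mutable and would otherwise be shared with the live dialogue.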
[0057] In S22 the signals of further devices which are connected to the speech dialogue device are acquired again, and in S23 it is checked again whether the interrupt condition is satisfied. S22 and S23 therefore correspond to S18 and S19. However, in this context a repetition of S22 and S23 takes place as long as the interrupt condition is satisfied. As soon as the interrupt condition is no longer satisfied, the method is continued with S24. In S24, the timer which has been started in S20 is initially stopped, and the value of the timer is read out. Therefore, in S24 the duration for which the interrupt was present is acquired.
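The timing of the interrupt in S20 to S24 can be sketched in Python as a polling loop. The function name `measure_interrupt`, the polling interval and the injectable `clock` and `sleep` parameters are assumptions of this sketch; `time.monotonic` is used as the default clock because it is unaffected by changes to the system time.

```python
import time


def measure_interrupt(interrupt_active, clock=time.monotonic,
                      sleep=time.sleep, poll_interval_s=0.05):
    """Return how long the interrupt condition remained satisfied.

    interrupt_active: callable returning True while the condition holds.
    """
    started = clock()                 # S20: start the timer
    while interrupt_active():         # S22/S23: re-check the condition
        sleep(poll_interval_s)
    return clock() - started          # S24: stop the timer and read it out
```

Passing the clock and the sleep function as parameters is merely a convenience that allows the loop to be exercised with simulated time.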
[0058] Subsequently, in S25 it is checked whether the dialogue which was interrupted in S21 is to be continued. In this context, the duration of the interrupt is evaluated. If the duration is greater than a predefined limiting value, the dialogue is aborted in S26 and can be started again by the user in S1. Other conditions can also cause the dialogue to be aborted. For example, a dialogue should typically not be continued in a motor vehicle if a user has left the vehicle between the start of the interrupt and the end of the interrupt. It is also possible that further operator control inputs of a user make a dialogue invalid. For example, a station selection may no longer be made if the user has previously switched off the radio.
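The decision in S25 can be summarized, as a sketch under stated assumptions, by the following Python function. The name `should_continue`, the flags `user_in_vehicle` and `function_still_valid` and the limiting value of 120 seconds are hypothetical and serve only to illustrate the conditions named above.

```python
def should_continue(duration_s, max_duration_s=120.0,
                    user_in_vehicle=True, function_still_valid=True):
    """S25: decide whether the interrupted dialogue is resumed.

    Returns False (dialogue aborted, S26) if the interrupt lasted longer than
    the limiting value, the user has left the vehicle, or an operator control
    input has made the dialogue invalid.
    """
    if duration_s > max_duration_s:
        return False
    return user_in_vehicle and function_still_valid
```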
[0059] If it has been determined in S25 that the dialogue is to be continued, a complexity value is determined in S27. The complexity value is a measure of the load placed on a user by the event causing the interrupt. The complexity value is determined here as a function of the connected device which outputs the signal that causes the interrupt condition to be satisfied. Alternatively or additionally, it would also be possible to determine the complexity value as a function of a device complexity value determined by the connected device, or to take into account, during the determination of the complexity value, the pieces of information to be acquired or at least one of the pieces of information which have already been acquired.
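One conceivable way to determine the complexity value in S27 is a lookup per connected device which is overridden when the device itself transmits a device complexity value. The Python sketch below is an assumption: the table `DEFAULT_COMPLEXITY`, its numerical values, the fallback of 0.5 and the function `complexity_value` are invented for illustration.

```python
# Illustrative per-device defaults on an assumed scale from 0 (low) to 1 (high).
DEFAULT_COMPLEXITY = {
    "radio": 0.2,
    "navigation": 0.4,
    "communication": 0.7,
    "driver_assistance": 0.9,
}


def complexity_value(source, device_value=None):
    """S27: load on the user caused by the interrupting event.

    source:       name of the device whose signal satisfied the interrupt condition.
    device_value: optional device complexity value transmitted by that device.
    """
    if device_value is not None:
        return device_value                       # the device knows best
    return DEFAULT_COMPLEXITY.get(source, 0.5)    # fall back to the table
```

Giving priority to a transmitted device complexity value reflects, for example, a driver assistance system that evaluates the current traffic situation itself.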
[0060] In S28, the repetition condition is subsequently evaluated, and it is determined as a function of the duration acquired in S24 and the complexity value determined in S27 whether a natural speech summary output is to be generated and output as a voice output. In this context, a value is determined as a function of the duration and the complexity value and is compared with a predefined limiting value. The value can be determined here by using a two-dimensional value table, calculating a weighted sum, forming products or applying other functions which depend on the duration, the complexity value and optionally further parameters.
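Of the options named for S28, the weighted sum is perhaps the simplest to illustrate. In the following Python sketch, the weights, the limiting value and the function names `repetition_score` and `should_summarize` are hypothetical and would have to be tuned for a real system.

```python
def repetition_score(duration_s, complexity, w_duration=0.02, w_complexity=1.0):
    """Combine the interrupt duration and the complexity value into one value."""
    return w_duration * duration_s + w_complexity * complexity


def should_summarize(duration_s, complexity, limit=1.0):
    """S28: repetition condition, satisfied if the value exceeds the limit."""
    return repetition_score(duration_s, complexity) > limit
```

With these assumed weights, a long interrupt of 60 seconds triggers a summary output even at a low complexity value of 0.2, while a short interrupt of 5 seconds at the same complexity value does not; a brief but demanding event, such as 10 seconds at a complexity value of 0.9, also exceeds the limit.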
[0061] If it is determined in S28 that the repetition condition is not satisfied, in S30 the state of the dialogue is restored. In this context, the values stored in S21 are reloaded. The state of the dialogue is changed here in such a way that the dialogue is continued with the last preceding process S6 from
[0062] If it is determined in S28 that a natural speech summary output is to be generated and output as a voice output, this takes place in S29. The generation and outputting of the summary output takes place as explained with respect to
[0063] Referring again to the drawings,
[0064] For some of the voice controllers, a plurality of pieces of information including information content to be linked are to be acquired. If, for example, a new entry needs to be made in the address book which is integrated into the communication device 7, a first name, a family name, an address, which in turn may include a road, house number and location, as well as at least one telephone number, are to be acquired. In order to acquire complex inputs of this kind, the speech dialogue device 2 can request further missing pieces of information from the user by a voice output whenever pieces of information have been acquired from a voice input of the user.
[0065] However, this multi-part dialogue can be interrupted as a function of signals of the other vehicle devices. For example, a speech dialogue is to be interrupted if the navigation system 8 outputs a navigation instruction, a traffic message is received and output by the radio 6, a call is received and/or accepted at the communication device 7 or the driver assistance system acquires a complex traffic situation which requires the driver's complete attention. Therefore, there is provision that the radio 6, the communication device 7, the navigation system 8 as well as the driver assistance system 9 can transmit interrupt signals to the speech dialogue device 2 via the vehicle bus. The speech dialogue device is designed here, as explained with respect to
[0066] In addition, the speech dialogue device 2 can then also generate a natural speech summary output and output it as a voice output if a corresponding voice command of a user has been acquired.
[0067] A description has been provided with particular reference to preferred embodiments thereof and examples, but it will be understood that variations and modifications can be effected within the spirit and scope of the claims which may include the phrase “at least one of A, B and C” as an alternative expression that means one or more of A, B and C may be used, contrary to the holding in Superguide v. DIRECTV, 358 F3d 870, 69 USPQ2d 1865 (Fed. Cir. 2004).