Online automatic audio transcription for hearing aid users
11373654 · 2022-06-28
Cpc classification
G10L15/22
PHYSICS
G10L15/30
PHYSICS
H04R25/554
ELECTRICITY
H04R2225/43
ELECTRICITY
International classification
G10L15/22
PHYSICS
Abstract
An automatic audio transcription method comprises: sending an audio stream and an identifier of the audio stream from a microphone device to an audio support server and to at least one hearing aid system comprising a hearing aid and a portable device connected to the hearing aid; playing the audio stream with the hearing aid; registering the at least one hearing aid system at the audio support server by sending the identifier from the hearing aid system to the audio support server; transcribing the audio stream into a text stream; sending the text stream from the audio support server to the portable device associated with the identifier of the audio stream; and displaying the text stream with the portable device.
Claims
1. An automatic audio transcription method, comprising: sending an audio stream and an identifier of the audio stream from a microphone device to an audio support server and to a hearing aid included in a hearing aid system, the hearing aid system further including a portable device connected to the hearing aid; playing the audio stream with the hearing aid; registering the hearing aid system at the audio support server by sending the identifier from the hearing aid to the audio support server; transcribing the audio stream into a text stream; sending the text stream from the audio support server to the portable device associated with the identifier of the audio stream; displaying the text stream with the portable device.
2. The method of claim 1, wherein the identifier is sent from the hearing aid to the portable device.
3. The method of claim 1, wherein the audio stream and the identifier are sent via a wireless digital radio communication connection directly from the microphone device to the hearing aid.
4. The method of claim 1, wherein the identifier is sent via Bluetooth from the hearing aid to the portable device.
5. The method of claim 1, where the audio stream and the identifier from the microphone device are sent to the audio support server via the portable device.
6. The method of claim 5, wherein the audio support server receives the audio stream from at least two portable devices and the audio stream is transcribed into the text stream solely one time.
7. The method of claim 1, wherein the audio stream and the identifier are sent via an Internet connection to the audio support server; and/or wherein the audio stream is sent via an Internet connection from the portable device to the audio support server; and/or wherein the identifier is sent via an Internet connection from the portable device to the audio support server; and/or wherein the text stream is sent via an Internet connection from the audio support server to the portable device.
8. The method of claim 1, wherein the audio stream is sent from the audio support server to a text transcription server, which transcribes the audio stream into the text stream and sends the text stream back to the audio support server.
9. The method of claim 1, further comprising: translating the text stream into another language.
10. The method of claim 1, wherein the audio stream and the identifier are sent to a plurality of hearing aid systems, which are registering at the audio support server with the identifier and which receive the text stream.
11. The method of claim 1, further comprising: registering a further portable device at the audio support server by inputting the identifier into the portable device and sending the identifier to the audio support server; sending the text stream from the audio support server to the further portable device associated with the identifier of the audio stream.
12. The method of claim 1, wherein the identifier is generated by the microphone device.
13. An audio transcription system, comprising: a microphone device, an audio support server adapted for at least one of transcribing an audio stream into a text stream and receiving a transcribed text stream from an audio transcription server; a hearing aid system comprising a hearing aid for playing the audio stream and a portable device adapted for displaying the text stream; wherein the microphone device is adapted for sending the audio stream and an identifier of the audio stream to the audio support server and the hearing aid; wherein the hearing aid is adapted for registering at the audio support server by sending the identifier to the audio support server; wherein the audio support server is adapted for sending the text stream to the portable device.
14. A non-transitory computer-readable medium storing a computer program that, when executed, direct a processor to: send an audio stream and an identifier of the audio stream from a microphone device to an audio support server and to a hearing aid included in a hearing aid system, the hearing aid system further including a portable device connected to the hearing aid; play the audio stream with the hearing aid; register the hearing aid system at the audio support server by sending the identifier from the hearing aid to the audio support server; transcribe the audio stream into a text stream; send the text stream from the audio support server to the portable device associated with the identifier of the audio stream; and display the text stream with the portable device.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) Below, embodiments of the present invention are described in more detail with reference to the attached drawings.
(2)
(3)
(4)
(5)
(6) The reference symbols used in the drawings, and their meanings, are listed in summary form in the list of reference symbols. In principle, identical parts are provided with the same reference symbols in the figures.
DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS
(7)
(8) A hearing aid system 14 comprises a hearing aid 20 that is usually worn in or near the ear of the listener 18, and a portable device 22, such as a smartphone, tablet computer, laptop, etc.
(9) Further portable devices 22′ that are not members of a hearing aid system 14 and that are used by further listeners 18 may also be part of the audio transcription system 10.
(10) The parts of the audio transcription system 10 described up to now may all be located near each other, i.e. less than 100 m apart and/or in the same room. Usually, the persons 16, 18 will all be within a range in which they would be able to talk to each other.
(11) Other parts of the audio transcription system 10 may be an audio support server 24 and an optional audio transcription and/or translation server 26, which may be more remote from the parts of the system 10 mentioned above. For example, the servers 24, 26 may be located in a different building or different buildings, for example in different cloud computing facilities. The audio support server 24 and the optional audio transcription and/or translation server 26 may be connected with an Internet connection 30.
(12) The microphone device 12 and the audio support server 24 may be connected with an Internet connection 30, which may be locally provided via a wireless communication connection such as Bluetooth® and/or Wi-Fi. The portable devices 22, 22′ may also be connected with the audio support server 24 via an Internet connection 30, which may be locally provided via a wireless communication connection such as Bluetooth® and/or Wi-Fi.
(13) The microphone device 12 and the hearing aids 20 may be connected with a wireless communication connection 32 of a second type that is adapted for transferring audio data so fast that a hearing aid wearer does not perceive a delay between watching the talker 16 and hearing him or her.
(14) Within each hearing aid system 14, the hearing aid 20 and the portable device 22 may communicate via a further wireless communication connection 34, which also may be provided with Bluetooth®.
(15)
(16) In general, the system 10 makes it possible to provide an automatic transcription and optionally an automatic translation of the speech of the talker 16 for a plurality of listeners 18. In particular, an audio stream 38 and an identifier 36 for the audio stream 38 are generated by the microphone device 12, and the audio stream 38 is transcribed and optionally translated into a text stream 40, which is then displayed by the portable devices 22, 22′.
(17) During the method, the microphone device 12 generates an audio stream 38 from the speech of the talker 16. The talker 16 may speak into a microphone of the microphone device 12, which may digitize the recorded sound into the audio stream 38. Since the microphone device 12 may be worn by the talker 16 or at least may be situated nearer to the talker 16 than the hearing aid wearers 18, the audio stream 38 may have a better signal-to-noise ratio than the audio data that may be gathered by a microphone of a hearing aid 20.
(18) Additionally, the microphone device 12 may generate an identifier 36 of the audio stream 38. This identifier 36 will be used in the system 10 for associating the hearing aid systems 14 with the correct text stream 40. The identifier 36 may be unique with respect to the audio stream 38. The identifier 36 may be generated randomly with a suitable seed value that depends, for example, on the serial number of the microphone device 12, the time of day, etc.
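The seeded random generation of the identifier 36 may be sketched as follows. This is only an illustrative sketch, not the disclosed implementation: the function name make_stream_identifier, the alphabet, the identifier length of 8 characters and the way the serial number and start time are combined into a seed are all assumptions. Python's general-purpose random module is used here for brevity; it is not a cryptographically secure generator.

```python
import random
import string

# Assumed alphabet for a code that can also be typed in manually.
ALPHABET = string.ascii_uppercase + string.digits


def make_stream_identifier(serial_number: str, start_time: float, length: int = 8) -> str:
    """Derive a pseudo-random identifier from a seed built from the device
    serial number and the stream start time (both hypothetical inputs)."""
    rng = random.Random(f"{serial_number}:{start_time}")
    return "".join(rng.choices(ALPHABET, k=length))


# Example: the same seed always yields the same identifier.
identifier = make_stream_identifier("MIC-0001", 1700000000.0)
```

Because the seed includes the start time, a new audio stream from the same microphone device would normally receive a different identifier.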
(19) The audio stream 38 and the identifier 36 of the audio stream 38 are then sent from the microphone device 12 to the hearing aid systems 14. It may be that the hearing aid systems 14 have registered themselves at the microphone device 12. It also may be that the microphone device 12 broadcasts the audio stream 38 and the identifier 36 via its interface for the communication connection 32, so that every hearing aid system 14 in a suitable range of the microphone device 12 may receive the data 36, 38.
(20) The audio stream 38 and the identifier 36 may be sent to the hearing aid 20 via the wireless digital radio communication connection 32 directly from the microphone device 12 to the hearing aid 20. The hearing aid 20 may play the audio stream 38, which usually has a better quality than the audio data generated within the hearing aid 20 with its internal microphone.
(21) The hearing aid 20 may then send the identifier 36 via the communication connection 34 to the portable device 22. A hearing aid application, which may be used for controlling the hearing aid 20, may be running in the portable device 22. Such an application also may be used for further processing the identifier 36.
(22) The audio stream 38 and the identifier 36 are also sent by the microphone device 12 to the audio support server 24 via the Internet connection 30, for example via Bluetooth® and/or Wi-Fi to a router and from there via wired communication lines to the server 24.
(23) Each hearing aid system 14 is now able to register at the audio support server 24 with the identifier 36 of the audio stream 38. The identifier 36 may be sent from the portable device 22 to the audio support server 24 also via the Internet connection 30, for example first via Bluetooth® and/or Wi-Fi to a router and from there via wired connections to the server 24. The audio support server 24 may generate a list of listening hearing aid systems 14, which are associated with the audio stream 38.
(24) For example, the above-mentioned control application running in the portable device 22 may connect to the audio support server 24 and may request a text stream 40 of the audio stream 38 with the respective identifier 36. Optionally, the portable device 22, and in particular the application, may send a desired target language to the audio support server 24. This language may also be saved for the respective hearing aid system 14 in the list of listening hearing aid systems 14.
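The list of listening hearing aid systems 14 kept by the audio support server 24 may be sketched as a simple in-memory registry keyed by the identifier 36. The class names Listener and ListenerRegistry, the device_id field and the convention that a target language of None means "original language" are assumptions made for illustration; the description does not prescribe a data structure.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Listener:
    device_id: str                          # hypothetical handle of a portable device
    target_language: Optional[str] = None   # None: original language (assumption)


class ListenerRegistry:
    """Maps a stream identifier to the portable devices registered for it."""

    def __init__(self) -> None:
        self._listeners: Dict[str, Dict[str, Listener]] = defaultdict(dict)

    def register(self, identifier: str, device_id: str,
                 target_language: Optional[str] = None) -> None:
        # Registering again with the same device replaces the stored language.
        self._listeners[identifier][device_id] = Listener(device_id, target_language)

    def listeners_for(self, identifier: str) -> List[Listener]:
        return list(self._listeners.get(identifier, {}).values())


registry = ListenerRegistry()
registry.register("A7K29QXZ", "phone-1", "de")
registry.register("A7K29QXZ", "phone-2")
```

Keying the registry by identifier means a portable device only needs to present the identifier it received from its hearing aid to be associated with the matching stream.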
(25) After receiving the audio stream 38 and when at least one hearing aid system 14 has registered at the audio support server 24, the audio support server 24 controls a transcription of the audio stream 38 into a text stream 40. To this end, the audio support server 24 may forward the audio stream 38 to the transcription/translation server 26, which may provide a transcription/translation service. The audio stream 38 may be sent from the audio support server 24 to the text transcription/translation server 26 via the Internet connection 30. The server 26 then transcribes the audio stream 38 into the text stream 40 and sends the text stream 40 back to the audio support server 24.
(26) It also may be that the text stream 40 is translated into one or more target languages. The server 24 may collect all translation requests and target languages from the portable devices 22, 22′ and may request only one translation per language. For that purpose, the server 24 may send the audio stream 38 to the transcription/translation server 26 once and may directly request a translation into the one or more target languages.
(27) Alternatively, the server 24 also may request only the transcription in the original language from the server 26. With the transcription, it could then access the same or a further translation server 26 to get one or more translated text streams 40.
(28) Since only one transcribed text stream in the original language and/or at least one text stream 40 per demanded target language is generated by the server 26, the system 10 scales easily with more hearing aid systems 14.
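The scaling property described above, one text stream per requested language regardless of the number of listeners, may be illustrated with the following sketch. The function build_text_streams and the translate callback are hypothetical; the callback stands in for whatever service the transcription/translation server 26 provides, and None is assumed to denote the original language.

```python
from typing import Callable, Dict, Iterable, Optional


def build_text_streams(
    transcript: str,
    requested_languages: Iterable[Optional[str]],
    translate: Callable[[str, str], str],
) -> Dict[Optional[str], str]:
    """Return one text per distinct language; None keys the original transcript."""
    streams: Dict[Optional[str], str] = {None: transcript}
    # Deduplicate so that each target language is translated exactly once.
    for language in set(requested_languages):
        if language is not None:
            streams[language] = translate(transcript, language)
    return streams
```

With five listeners requesting "de", "fr", "de", the original language and "de", the translate callback would be invoked only twice, once for "de" and once for "fr".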
(29) The audio support server 24 then distributes the transcript/translation to the hearing aid systems 14. The transcribed and optionally translated text stream 40 may be sent via the Internet connection 30 from the audio support server 24 to the portable device 22 associated with the identifier 36 of the audio stream 38.
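The distribution step may be sketched as a fan-out over the registered portable devices. The function distribute, the (device_id, target_language) pairs and the send callback are hypothetical names. Falling back to the original-language text when no translation exists for a requested language is an assumption of this sketch, not something the description specifies.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple


def distribute(
    text_streams: Dict[Optional[str], str],
    listeners: Iterable[Tuple[str, Optional[str]]],
    send: Callable[[str, str], None],
) -> int:
    """Send each listener the text in its target language; return the count sent."""
    delivered = 0
    for device_id, language in listeners:
        # Assumed fallback: use the original transcript (key None) if the
        # requested language is not available.
        text = text_streams.get(language, text_streams[None])
        send(device_id, text)
        delivered += 1
    return delivered
```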
(30) In the end, the text stream 40 may be displayed by the portable device 22. Without any user interaction to register with the correct source, a listener 18 may get the correct transcript/translation of the speech of the talker 16. It also may be that the transcribed and optionally translated text stream 40 is transformed back into an audible audio stream, for example by a text-to-speech (TTS) synthesizer. This may be performed locally in the portable device 22. Likewise, the further translation server may be located within the portable device 22. Thus, the audio support server 24 may send the transcribed text back to the portable device 22, which shows the transcribed text to the user and/or translates it first and/or transforms it back to an audible audio signal.
(31) Potentially, the portable device 22 may adapt the transcription and/or translation locally, for example given meta-information about the probability of correct transcription and/or translation or alternative words as provided by the transcription server and a local database with terms and expressions as used by the wearer and people in his/her social network.
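One possible form of such a local adaptation is sketched below. It assumes that the transcription server supplies, per word, a confidence value and a list of alternative words, and that the local database is available as a set of lower-case terms. The names WordHypothesis and adapt_locally and the threshold of 0.6 are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class WordHypothesis:
    text: str
    confidence: float                       # assumed range 0.0 .. 1.0
    alternatives: List[str] = field(default_factory=list)


def adapt_locally(words: Iterable[WordHypothesis], local_terms: Set[str],
                  threshold: float = 0.6) -> str:
    """Replace low-confidence words by an alternative found in the local terms."""
    adapted = []
    for word in words:
        choice = word.text
        if word.confidence < threshold and word.text.lower() not in local_terms:
            # Take the first alternative that the wearer's vocabulary knows.
            for alternative in word.alternatives:
                if alternative.lower() in local_terms:
                    choice = alternative
                    break
        adapted.append(choice)
    return " ".join(adapted)
```

For instance, a low-confidence "Ana" with the alternatives "Anna" and "Hannah" would be replaced by "Anna" if only "anna" appears in the wearer's local terms.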
(32) It also may be that a listener 18 (possibly without a hearing aid 20) may manually register a further portable device 22′ at the audio support server 24 by inputting the identifier 36 or a representation thereof into the portable device 22′ and sending the identifier 36 to the audio support server 24. For example, a listener 18 using a hearing aid 20 without direct connection to the microphone device 12 may register manually in this way. The identifier 36 may be manually entered as an alphanumerical code. Also, the identifier 36 may be input by scanning a QR code or by NFC (near field communication), i.e. by briefly holding the portable device 22′ near the microphone device 12.
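For the manual entry of the identifier 36 as an alphanumerical code, the portable device 22′ may tolerate differences in spacing, hyphenation and case. The helper names format_code and normalize_code and the grouping into blocks of four characters are assumptions for illustration only.

```python
def format_code(identifier: str, group: int = 4) -> str:
    """Split an identifier into hyphen-separated groups for display."""
    return "-".join(identifier[i:i + group] for i in range(0, len(identifier), group))


def normalize_code(typed: str) -> str:
    """Strip separators and whitespace from user input and upper-case it."""
    return "".join(ch for ch in typed.upper() if ch.isalnum())


shown = format_code("A7K29QXZ")          # "A7K2-9QXZ"
entered = normalize_code(" a7k2-9qxz ")  # "A7K29QXZ"
```

The normalized value is what would then be sent to the audio support server 24 for registration.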
(33) This portable device 22′ and optionally a target language may also be listed in the server 24, which may then send the text stream 40 to the further portable device 22′ associated in such a way with the identifier 36.
(34) Additionally, encryption methods may be used between the microphone device 12 and the audio support server 24, between the audio support server 24 and the transcription/translation server 26, and/or the audio support server 24 and the portable devices 22, and/or the microphone device 12 and the hearing aids 20, and/or the hearing aids 20 and the portable device 22. In particular, the identifier 36 as well as the audio stream 38 and/or the transcribed/translated text 40 may be sent between the microphone device 12 and the server 24 and/or between a portable device 22, 22′ and the server 24 in an encrypted way.
(35)
(36) As shown in
(37) In the case that the audio support server 24 receives more than one audio stream 38 from at least two portable devices 22, 22′, which are associated with the same identifier 36, the audio support server 24 may discard all but one of the audio streams 38. Only one audio stream 38 may be sent to the transcription/translation server 26 and may be transcribed and optionally translated solely one time.
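Discarding all but one of the relayed audio streams 38 may be sketched as follows. The class RelayDeduplicator and the rule that the first portable device to deliver audio for an identifier becomes the accepted source are assumptions; the description only requires that a single audio stream per identifier 36 is transcribed.

```python
from typing import Dict


class RelayDeduplicator:
    """Accepts audio for an identifier from one relaying device only."""

    def __init__(self) -> None:
        self._primary: Dict[str, str] = {}  # identifier -> accepted device_id

    def accept(self, identifier: str, device_id: str) -> bool:
        # The first device seen for an identifier becomes its sole source
        # (assumed policy); audio from any other device is discarded.
        primary = self._primary.setdefault(identifier, device_id)
        return primary == device_id


dedup = RelayDeduplicator()
dedup.accept("A7K29QXZ", "phone-1")  # True: forwarded for transcription
dedup.accept("A7K29QXZ", "phone-2")  # False: duplicate stream, discarded
```

A practical server would presumably also need to switch to another relaying device if the accepted one disconnects; that is outside the scope of this sketch.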
(38) The other steps of the method illustrated in
(39) While the invention has been illustrated and described in detail in the drawings and foregoing description, such illustration and description are to be considered illustrative or exemplary and not restrictive; the invention is not limited to the disclosed embodiments. Other variations to the disclosed embodiments can be understood and effected by those skilled in the art and practicing the claimed invention, from a study of the drawings, the disclosure, and the appended claims. In the claims, the word “comprising” does not exclude other elements or steps, and the indefinite article “a” or “an” does not exclude a plurality. A single processor or controller or other unit may fulfil the functions of several items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage. Any reference signs in the claims should not be construed as limiting the scope.
LIST OF REFERENCE SYMBOLS
(40) 10 automatic audio transcription system
12 microphone device
14 hearing aid system
16 talker
18 listener
20 hearing aid
22, 22′ portable device
24 audio support server
26 audio transcription and/or translation server
30 Internet connection
32 first type wireless communication connection
34 second type wireless communication connection
36 identifier
38 audio stream
40 text stream