Abstract
A method for personalized bandwidth extension in an audio device. The method comprises obtaining an input microphone signal with a first bandwidth, obtaining a first user parameter indicative of one or more characteristics of a user of the audio device, determining, based on the first user parameter, a bandwidth extension model, and generating an output signal with a second bandwidth by applying the determined bandwidth extension model to the input microphone signal.
Claims
1. A computer-implemented method for training a bandwidth extension model for personalized bandwidth extension, wherein the method comprises: obtaining an audio dataset comprising one or more first audio signals with a first bandwidth, obtaining a hearing dataset comprising a hearing profile, applying the bandwidth extension model to the one or more first audio signals to generate one or more bandwidth extended audio signals with a second bandwidth, determining one or more perceptual losses associated with the one or more bandwidth extended audio signals based on the hearing dataset; and training, based on the one or more perceptual losses, the bandwidth extension model.
2. A method for personalized bandwidth extension in an audio device, wherein the method comprises: a. obtaining an input microphone signal with a first bandwidth, b. obtaining a first user parameter comprising a result of a hearing test carried out on a user of the audio device and/or physiological information regarding the user of the audio device, such as gender and/or age, c. determining, based on the first user parameter, a bandwidth extension model, wherein the bandwidth extension model comprises a trained neural network, wherein the trained neural network is trained according to claim 1, and d. generating an output signal with a second bandwidth by applying the determined bandwidth extension model to the input microphone signal.
3. A method for personalized bandwidth extension in an audio device according to claim 2, wherein the step c. comprises: obtaining a codebook comprising a plurality of bandwidth extension models each associated with one or more user parameters, comparing the first user parameter to the codebook, and determining, based on the comparison between the codebook and the first user parameter, the bandwidth extension model.
4. A method for personalized bandwidth extension in an audio device according to claim 2, comprising: analysing the input microphone signal to determine the first bandwidth, and determining, based on the first user parameter and the determined first bandwidth, the bandwidth extension model.
5. A method for personalized bandwidth extension in an audio device according to claim 2, wherein the first user parameter is stored on a local storage of the audio device.
6. A method for personalized bandwidth extension in an audio device according to claim 2, wherein the step a. comprises: receiving the input microphone signal from a far-end station, wherein the received input microphone signal from the far-end station is an encoded signal, and wherein the steps b. to d. are carried out as part of decoding the input microphone signal from the far-end station.
7. An audio device for personalized bandwidth extension, the audio device comprising a processor, and a memory storing instructions which, when executed by the processor, cause the processor to: a. obtain an input microphone signal with a first bandwidth, b. obtain a first user parameter comprising a result of a hearing test carried out on a user of the audio device and/or physiological information regarding the user of the audio device, such as gender and/or age, c. determine based on the first user parameter a bandwidth extension model, wherein the bandwidth extension model comprises a trained neural network, wherein the trained neural network is trained according to claim 1, and d. generate an output signal with a second bandwidth using the determined bandwidth extension model.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0109] The above and other features and advantages of the present invention will become readily apparent to those skilled in the art by the following detailed description of example embodiments thereof with reference to the attached drawings, in which:
[0110] FIG. 1 schematically illustrates a flow chart of a method for personalized bandwidth extension in an audio device according to an embodiment of the disclosure.
[0111] FIG. 2 schematically illustrates a flow chart of a method for personalized bandwidth extension in an audio device according to an embodiment of the disclosure.
[0112] FIG. 3 schematically illustrates a flow chart of a method for personalized bandwidth extension in an audio device according to an embodiment of the disclosure.
[0113] FIG. 4 schematically illustrates a flow chart of a method for personalized bandwidth extension in an audio device according to an embodiment of the disclosure.
[0114] FIG. 5 schematically illustrates a communication system with an audio device according to an embodiment of the disclosure.
[0115] FIG. 6 schematically illustrates a block diagram of a training set-up for training a bandwidth extension model for personalized bandwidth extension according to an embodiment of the disclosure.
DETAILED DESCRIPTION
[0116] Various example embodiments and details are described hereinafter, with reference to the figures when relevant. It should be noted that the figures may or may not be drawn to scale and that elements of similar structures or functions are represented by like reference numerals throughout the figures. It should also be noted that the figures are only intended to facilitate the description of the embodiments. They are not intended as an exhaustive description of the invention or as a limitation on the scope of the invention. In addition, an illustrated embodiment need not have all the aspects or advantages shown. An aspect or an advantage described in conjunction with a particular embodiment is not necessarily limited to that embodiment and can be practiced in any other embodiments even if not so illustrated, or if not so explicitly described.
[0117] Referring initially to FIG. 1, which depicts a flow chart of a method for personalized bandwidth extension in an audio device according to an embodiment of the disclosure. In a first step 100 an input microphone signal is obtained. The input microphone signal has a first bandwidth. The input microphone signal may be obtained as part of an ongoing communication session between a near-end station and a far-end station. In a second step 101 a first user parameter is obtained. The first user parameter is indicative of one or more characteristics of a user of the audio device. The first user parameter may comprise physiological information regarding the user of the audio device, such as gender and/or age. The first user parameter may comprise a result of a hearing test carried out on the user of the audio device. The first user parameter may be obtained by retrieving it from a local storage of the audio device, such as a local memory, e.g., a flash drive. In a third step 102 a bandwidth extension model is determined based on the obtained first user parameter. The bandwidth extension model may be determined by being generated based on the first user parameter. Alternatively, the bandwidth extension model may be determined by matching the first user parameter to a pre-generated bandwidth extension model from a plurality of pre-generated bandwidth extension models. Each of the plurality of pre-generated bandwidth extension models may have been pre-generated based on different user parameters. Matching the first user parameter to a pre-generated bandwidth extension model may be carried out by associating each of the plurality of pre-generated bandwidth extension models with the one or more user parameters used for generating it, and matching the first user parameter to the pre-generated bandwidth extension model whose associated user parameters match the first user parameter most closely.
The determined bandwidth extension model comprises a trained neural network. In a fourth step 103 an output signal is generated by applying the determined bandwidth extension model to the input microphone signal. The output signal is generated with a second bandwidth. The determined bandwidth extension model may be applied by providing the input microphone signal as an input to the determined bandwidth extension model. The output of the determined bandwidth extension model may then be the output signal with the second bandwidth.
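To make the generation step concrete, the sketch below applies a classical spectral-folding bandwidth extender to a narrowband signal. This is only an illustrative stand-in for the trained neural network of step 103; the function name, attenuation factor, and sample rates are assumptions, not part of the disclosure.

```python
import numpy as np

def extend_bandwidth(signal, fs_in, fs_out):
    """Resample `signal` from fs_in to fs_out and fill the new high band
    by mirroring (folding) the low-band spectrum with attenuation."""
    n = len(signal)
    spec = np.fft.rfft(signal)                 # low-band spectrum, n//2 + 1 bins
    n_out = n * fs_out // fs_in                # same duration at the higher rate
    spec_out = np.zeros(n_out // 2 + 1, dtype=complex)
    spec_out[: len(spec)] = spec               # keep the original low band
    hi = np.conj(spec[::-1])[: len(spec_out) - len(spec)]
    spec_out[len(spec):] = 0.25 * hi           # folded, attenuated high band
    return np.fft.irfft(spec_out, n_out)

x = np.sin(2 * np.pi * 1000 * np.arange(160) / 8000)  # 1 kHz tone at 8 kHz
y = extend_bandwidth(x, 8000, 16000)
print(len(y))  # 320 samples: same duration at twice the sampling rate
```

A trained model would instead predict a perceptually plausible high band; the folding here merely shows where such a model plugs into the signal path.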
[0118] Referring to FIG. 2, which depicts a flow chart of a method for personalized bandwidth extension in an audio device according to an embodiment of the disclosure. The method illustrated in FIG. 2 comprises steps corresponding to the steps of the method depicted in FIG. 1. In a first step 200 an input microphone signal is obtained. In a second step 201 a first user parameter is obtained. In a third step 202 a codebook is obtained. The codebook comprises a plurality of bandwidth extension models, each associated with one or more user parameters. The codebook may be obtained by retrieving it from a local storage on the audio device; alternatively, the codebook may be obtained by retrieving it from a cloud storage communicatively connected with the audio device. In a fourth step 203 the first user parameter is compared to the codebook. The comparison may determine which of the plurality of bandwidth extension models is the best match for the first user parameter; this may be done by comparing the first user parameter to the one or more user parameters associated with each of the bandwidth extension models. The result of the comparison may be a list of values, where each value indicates to what degree the first user parameter matches a bandwidth extension model. In a fifth step 204 the bandwidth extension model is determined. The bandwidth extension model is determined based on the comparison between the codebook and the first user parameter. The determined bandwidth extension model is a model comprised in the obtained codebook. In a sixth step 205 an output signal is generated by applying the determined bandwidth extension model to the input microphone signal.
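Steps 202-204 can be sketched as a simple codebook lookup. The model identifiers, parameter vectors (here, age and mean high-frequency hearing loss), and the inverse-distance score are illustrative assumptions; the scores correspond to the "list of values" described above.

```python
import numpy as np

# Hypothetical codebook: model identifier -> user parameters it was generated for.
codebook = {
    "bwe_model_a": np.array([30.0, 15.0]),   # e.g. [age, mean HF loss in dB]
    "bwe_model_b": np.array([65.0, 45.0]),
    "bwe_model_c": np.array([75.0, 60.0]),
}

def compare_to_codebook(first_user_param, codebook):
    """Return {model_id: score}; a higher score indicates a closer match."""
    p = np.asarray(first_user_param, dtype=float)
    return {mid: 1.0 / (1.0 + np.linalg.norm(p - params))
            for mid, params in codebook.items()}

def determine_model(first_user_param, codebook):
    """Pick the codebook entry with the highest match score (step 204)."""
    scores = compare_to_codebook(first_user_param, codebook)
    return max(scores, key=scores.get)

print(determine_model([68.0, 50.0], codebook))  # closest entry: bwe_model_b
```

Any similarity measure over the user parameters could replace the Euclidean distance; the essential point is that the selected model is always one comprised in the obtained codebook.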
[0119] Referring to FIG. 3, which depicts a flow chart of a method for personalized bandwidth extension in an audio device according to an embodiment of the disclosure. The method illustrated in FIG. 3 comprises steps corresponding to the steps of the method depicted in FIG. 1. In a first step 300 an input microphone signal is obtained. In a second step 301 a first user parameter is obtained. In a third step 302 the input microphone signal is analysed. The input microphone signal is analysed to determine a first bandwidth of the input microphone signal. In a fourth step 303 a bandwidth extension model is determined. The bandwidth extension model is determined based on the first user parameter and the determined first bandwidth. In some embodiments, the detected first bandwidth may be used in conjunction with an obtained codebook comprising a plurality of bandwidth extension models. The plurality of bandwidth extension models may be separated into different groups, each group corresponding to a different bandwidth. Hence, the detected first bandwidth may be compared to the codebook to select the group from which a bandwidth extension model should be selected. In a fifth step 304 an output signal is generated by applying the determined bandwidth extension model to the input microphone signal.
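One way to realize the analysis of step 302 is to estimate the signal's effective bandwidth from its magnitude spectrum and then pick the matching codebook group. The -40 dB threshold and the group names below are illustrative assumptions, not taken from the disclosure.

```python
import numpy as np

def detect_bandwidth(signal, fs, floor_db=-40.0):
    """Estimate the highest frequency whose windowed magnitude lies within
    `floor_db` of the spectral peak (a simple effective-bandwidth estimate)."""
    mag = np.abs(np.fft.rfft(signal * np.hanning(len(signal))))
    mag_db = 20.0 * np.log10(mag / (mag.max() + 1e-12) + 1e-12)
    freqs = np.fft.rfftfreq(len(signal), 1.0 / fs)
    above = freqs[mag_db > floor_db]
    return float(above.max()) if len(above) else 0.0

fs = 16000
t = np.arange(1024) / fs
# Narrowband test signal: content only below ~3 kHz.
narrowband = np.sin(2 * np.pi * 800 * t) + 0.5 * np.sin(2 * np.pi * 3000 * t)
bw = detect_bandwidth(narrowband, fs)
group = "narrowband_models" if bw < 4000 else "wideband_models"
print(round(bw), group)
```

In a deployed system the estimate would typically be smoothed over frames, but the principle of gating the codebook group on the detected bandwidth is the same.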
[0120] Referring to FIG. 4, which depicts a flow chart of a method for personalized bandwidth extension in an audio device according to an embodiment of the disclosure. The method illustrated in FIG. 4 comprises steps corresponding to the steps of the method depicted in FIG. 1. In a first step 400 a communication connection with a far-end station is established. The communication connection may be established as part of a handshake protocol between a far-end station and a near-end station. In a second step 401 a first user parameter is transmitted to the far-end station. The first user parameter may be transmitted to the far-end station as part of the handshake protocol. In a third step 402 the input microphone signal is received from the far-end station. The input microphone signal is received as an encoded signal. The input microphone signal may have been encoded according to an audio codec scheme. The encoded input microphone signal comprises the first user parameter. In a fourth step 403 the first user parameter is determined from the input microphone signal. In a fifth step 404 a bandwidth extension model is determined based on the determined first user parameter. In a sixth step 405 an output signal is generated by applying the determined bandwidth extension model to the input microphone signal. The fourth step 403, the fifth step 404, and the sixth step 405 are carried out as part of the decoding process of the received encoded input microphone signal.
[0121] Referring to FIG. 5, which depicts a communication system with an audio device 500 according to an embodiment of the disclosure. The communication system comprises a far-end station 600 in communication with a near-end station 500. The near-end station 500 is the audio device 500; in other embodiments, the audio device 500 may communicate with the far-end station via an intermediate device, for example a smartphone paired to the audio device 500. When setting up the communication connection between the far-end device 600 and the near-end device 500, the far-end device 600 may receive a first user parameter in the form of a signal 606, 607. The far-end device 600 may receive the signal 606, 607 comprising the first user parameter from a cloud storage 604, or from a local storage 506 on the audio device. The far-end device 600 transmits a TX signal 601. The TX signal 601 in the present embodiment is an encoded input microphone signal. The encoded input microphone signal may have been encoded with the first user parameter. The TX signal 601 is sent over a communication channel 602. The communication channel 602 may perform one or more actions to prevent the TX signal from degrading, such as packet loss concealment or buffering of the signal. At the near-end device 500 an RX signal 603 is received. The RX signal 603 may be the encoded input microphone signal transmitted as the TX signal 601 from the far-end station 600. The RX signal 603 may be received at a decoder module 501. The decoder module 501 is configured to decode the RX signal 603 to provide the input microphone signal 502. The decoder module 501 may also perform processing of the RX signal 603, such as noise suppression, echo cancellation, or bandwidth extension. A processor 503 of the audio device 500 obtains the input microphone signal 502 from the decoder module 501; in some embodiments, the decoder module 501 is comprised in the processor 503.
The processor 503 then obtains the first user parameter indicative of one or more characteristics of a user of the audio device 500. The first user parameter may be obtained from the decoder module 501, if the RX signal 603 was encoded with the first user parameter. Alternatively, the first user parameter 507 may be retrieved from a local memory 506 on the audio device, or from a cloud storage 604 communicatively connected with the audio device 500. The processor 503 then determines a bandwidth extension model based on the first user parameter, and generates an output signal 504 with a second bandwidth using the determined bandwidth extension model. The output signal 504 may undergo further processing in a digital signal processing module 505. Further processing may involve echo cancellation, noise suppression, dereverberation, etc. The output signal 504 may be outputted through one or more output transducers of the audio device 500.
[0122] Referring to FIG. 6, which schematically illustrates a block diagram of a training set-up for training a bandwidth extension model for personalized bandwidth extension according to an embodiment of the disclosure. In the set-up an audio dataset 700 is obtained. The audio dataset comprises one or more first audio signals with a first bandwidth. The audio dataset 700 is given as input to the bandwidth extension model 701. The bandwidth extension model is applied to the one or more first audio signals to generate one or more bandwidth extended audio signals with a second bandwidth. The generated one or more bandwidth extended audio signals are given as input to a loss function 702. Furthermore, the audio dataset 700 is also given as an input to the loss function 702. A hearing dataset 703 comprising a hearing profile is also obtained. The hearing dataset 703 is also given as an input to the loss function 702. Based on the hearing dataset 703, the one or more bandwidth extended audio signals, and the audio dataset 700, one or more perceptual losses are determined by the loss function 702. The determined one or more perceptual losses are fed back to the bandwidth extension model to train the bandwidth extension model. In the case of the bandwidth extension model being a neural network, the perceptual losses may be back-propagated through the bandwidth extension model to train the bandwidth extension model. To facilitate training of the bandwidth extension model 701, additional inputs may be given to the bandwidth extension model 701. In an embodiment where the bandwidth extension model 701 comprises a neural network, pre-trained weights 704 may be given as an input to the bandwidth extension model 701 to facilitate training of the bandwidth extension model 701.
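The FIG. 6 training loop can be sketched under strong simplifying assumptions: a single linear layer stands in for the neural bandwidth extension model 701, and the hearing profile reduces to per-band weights inside the loss function 702. Everything below (feature sizes, weights, learning rate) is an illustrative assumption; a real system would use a deep network and an auditory-model-based perceptual loss.

```python
import numpy as np

rng = np.random.default_rng(0)
n_low, n_high = 8, 8

# Audio dataset 700: paired low-band features and target high-band features.
X = rng.normal(size=(200, n_low))
true_W = rng.normal(size=(n_low, n_high))
Y = X @ true_W

# Hearing dataset 703: per-band perceptual weights derived from a hearing
# profile (here, decreasing weight toward bands with lower audibility).
hearing_weights = np.linspace(1.0, 0.2, n_high)

W = np.zeros((n_low, n_high))  # model parameters; pre-trained weights 704 could seed this
lr = 0.05
for step in range(2000):
    Y_hat = X @ W                                   # apply model -> extended bands
    err = Y_hat - Y
    loss = np.mean(hearing_weights * err**2)        # perceptual (weighted) loss, 702
    grad = 2.0 * X.T @ (hearing_weights * err) / len(X)  # feed the loss back
    W -= lr * grad                                  # gradient step (back-propagation)
print(f"final perceptual loss: {loss:.2e}")
```

The weighting makes errors in bands the user perceives well cost more than errors in bands the user barely hears, which is the essence of personalizing the training to the hearing profile.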
[0123] It may be appreciated that FIGS. 5 and 6 comprise some modules or operations which are illustrated with a solid line and some modules or operations which are illustrated with a dashed line. The modules or operations which are comprised in a dashed line are example embodiments which may be comprised in, or a part of, or are further modules or operations which may be taken in addition to the modules or operations of the solid line example embodiments. It should be appreciated that these operations need not be performed in the order presented. Furthermore, it should be appreciated that not all the operations need to be performed. The example operations may be performed in any order and in any combination.
[0124] It is to be noted that the word comprising does not necessarily exclude the presence of other elements or steps than those listed.
[0125] It is to be noted that the words a or an preceding an element do not exclude the presence of a plurality of such elements.
[0126] It should further be noted that any reference signs do not limit the scope of the claims, that the example embodiments may be implemented at least in part by means of both hardware and software, and that several means, units or devices may be represented by the same item of hardware.
[0127] The various example methods, devices, and systems described herein are described in the general context of method steps or processes, which may be implemented in one aspect by a computer program product, embodied in a computer-readable medium, including computer-executable instructions, such as program code, executed by computers in networked environments. A computer-readable medium may include removable and non-removable storage devices including, but not limited to, Read Only Memory (ROM), Random Access Memory (RAM), compact discs (CDs), digital versatile discs (DVDs), etc. Generally, program modules may include routines, programs, objects, components, data structures, etc. that perform specified tasks or implement specific abstract data types. Computer-executable instructions, associated data structures, and program modules represent examples of program code for executing steps of the methods disclosed herein. The sequence of such executable instructions or associated data structures represents examples of corresponding acts for implementing the functions described in such steps or processes.
[0128] Although features have been shown and described, it will be understood that they are not intended to limit the claimed invention, and it will be made obvious to those skilled in the art that various changes and modifications may be made without departing from the scope of the claimed invention. The specification and drawings are, accordingly, to be regarded in an illustrative rather than restrictive sense. The claimed invention is intended to cover all alternatives, modifications, and equivalents.
Items:
[0129] 1. A method for personalized bandwidth extension in an audio device, wherein the method comprises: [0130] a. obtaining an input microphone signal with a first bandwidth, [0131] b. obtaining a first user parameter indicative of one or more characteristics of a user of the audio device, [0132] c. determining, based on the first user parameter, a bandwidth extension model, and [0133] d. generating an output signal with a second bandwidth by applying the determined bandwidth extension model to the input microphone signal. [0134] 2. A method for personalized bandwidth extension in an audio device according to item 1, wherein the first user parameter comprises physiological information regarding the user of the audio device, such as gender and/or age. [0135] 3. A method for personalized bandwidth extension in an audio device according to item 1, wherein the first user parameter comprises a result of a hearing test carried out on the user of the audio device. [0136] 4. A method for personalized bandwidth extension in an audio device according to any of the preceding items, wherein the step c. comprises: [0137] obtaining a codebook comprising a plurality of bandwidth extension models each associated with one or more user parameters, [0138] comparing the first user parameter to the codebook, and [0139] determining, based on the comparison between the codebook and the first user parameter, the bandwidth extension model. [0140] 5. A method for personalized bandwidth extension in an audio device according to any of the preceding items, comprising: [0141] analysing the input microphone signal to determine the first bandwidth, and [0142] determining, based on the first user parameter and the determined first bandwidth, the bandwidth extension model. [0143] 6. A method for personalized bandwidth extension in an audio device according to any of the preceding items, wherein the bandwidth extension model comprises a trained neural network. [0144] 7. 
A method for personalized bandwidth extension in an audio device according to any of the preceding items, wherein the first user parameter is stored on a local storage of the audio device. [0145] 8. A method for personalized bandwidth extension in an audio device according to any of the preceding items, wherein the step a. comprises: [0146] receiving the input microphone signal from a far-end station, wherein the received [0147] input microphone signal from the far-end station is an encoded signal, and [0148] wherein the steps b. to d. are carried out as part of decoding the input microphone signal from the far-end station. [0149] 9. A method for personalized bandwidth extension in an audio device according to item 8, comprising: [0150] establishing a communication connection with a far-end station, [0151] transmitting the first user parameter to the far-end station, and [0152] receiving the input microphone signal from the far-end station, wherein the encoded input microphone signal comprises the first user parameter, and [0153] wherein the step b. comprises: [0154] determining the first user parameter from the received input microphone signal. [0155] 10. A computer-implemented method for training a bandwidth extension model for personalized bandwidth extension, wherein the method comprises: [0156] obtaining an audio dataset comprising one or more first audio signals with a first bandwidth, [0157] obtaining a hearing dataset comprising a hearing profile, [0158] applying the bandwidth extension model to the one or more first audio signals to generate one or more bandwidth extended audio signals with a second bandwidth, [0159] determining one or more perceptual losses associated with the one or more bandwidth extended audio signals based on the hearing dataset; and [0160] training, based on the one or more perceptual losses, the bandwidth extension model. [0161] 11.
An audio device for personalized bandwidth extension, the audio device comprising a processor, and a memory storing instructions which, when executed by the processor, cause the processor to: [0162] a. obtain an input microphone signal with a first bandwidth, [0163] b. obtain a first user parameter indicative of one or more characteristics of a user of the audio device, [0164] c. determine based on the first user parameter a bandwidth extension model, and [0165] d. generate an output signal with a second bandwidth using the determined bandwidth extension model.