Processing of audio signals for a tinnitus therapy
09549269 · 2017-01-17
Assignee
Inventors
- Marc Adrian Nötzel (Hamburg, DE)
- Johannes Abraxas Wittig (Braunschweig, DE)
- Jörg Land (Hamburg, DE)
- Matthias Lanz (Hamburg, DE)
Cpc classification
A61M21/00
HUMAN NECESSITIES
H04R25/75
ELECTRICITY
International classification
Abstract
A method for processing audio signals in particular for a therapy of subjective tinnitus with an individual tinnitus frequency. The method includes: providing a first audio signal, determining a blocking range in the frequency spectrum of the first audio signal with a predefinable frequency width, creating a second audio signal from the first audio signal using a filter, and determining an auditory energy of the first audio signal or the second audio signal within at least one predefined therapeutically applicable frequency range and specifying an evaluation parameter for the second audio signal.
Claims
1. A method for processing audio signals (10, 12), for a therapy of subjective tinnitus with an individual tinnitus frequency, the method comprising: provision of a first audio signal (10), determination of a blocking range in a frequency spectrum of the first audio signal (10) with a predefinable frequency width (22) on the basis of a predefinable therapy frequency (20); creation of a second audio signal (12) from the first audio signal (10) using a filter (120, 121) for a portion of the signal in the first audio signal (10) in the blocking range; determination of an auditory energy of the first audio signal (10) or of the second audio signal (12) within at least one predefined or predefinable therapeutically applicable frequency range; and specification of an evaluation parameter (30) for the second audio signal (12) as a function of the auditory energy and of a frequency separation between the therapeutically applicable frequency range and the blocking range, wherein the therapeutically applicable frequency range is analyzed subdivided into frequency intervals, wherein in particular respectively an auditory energy of the first audio signal (10) or of the second audio signal (12) is determined within each frequency interval and the evaluation parameter (30) is determined depending on the respective auditory energy and of a respective frequency distance between the blocking range and the respective frequency interval taking all frequency intervals of the therapeutically applicable frequency range into consideration.
2. The method according to claim 1, wherein at least one of the first audio signal (10) and the second audio signal (12) are respectively a digital audio signal, in particular a digital audio file or a digital audio data flow.
3. The method according to claim 1, wherein the first audio signal (10) is normalized before the creation of the second audio signal (12).
4. The method according to claim 1, wherein at least one of the first audio signal (10) and the second audio signal (12) is corrected to compensate for at least one of frequency-dependent elevations and dampings by a playback device with a non-linear frequency path.
5. The method according to claim 4, wherein the correction of the first audio signal (10) and the second audio signal (12) is carried out by means of a filter (121, 120).
6. The method according to claim 1, wherein a used filter (120, 121) is a filter with finite impulse response.
7. The method according to claim 1, wherein the first audio signal (10) or respectively the second audio signal (12) has at least two channels, wherein each channel is analyzed individually for determining the evaluation parameter (30).
8. The method according to claim 1, wherein the method is performed using a data processing device (40).
9. The method according to claim 8, wherein the data processing device (40) is connected with a playback device (44) by means of a first data connection, wherein the second audio signal (12) is transmitted by the data processing device (40) to the playback device (44) via the first data connection.
10. The method according to claim 8, wherein the data processing device (40) is connected with a data storage device (44) by means of a second data connection, wherein the first audio signal (10) is transmitted by the data storage device (44) to the data processing device (40) via the second data connection.
11. The method according to claim 9, wherein at least one of the first data connection and the second data connection is or will be established via a data network (42).
12. A computer program product with program code means, the program code means being designed to execute a method according to claim 1 when the program code means are executed on a data processing device (40).
13. A computer system with a data processing device (40), which is set up to execute a method according to claim 1.
14. A method for processing audio signals (10, 12), for a therapy of subjective tinnitus with an individual tinnitus frequency, the method comprising: provision of a first audio signal (10), determination of a blocking range in a frequency spectrum of the first audio signal (10) with a predefinable frequency width (22) on the basis of a predefinable therapy frequency (20); creation of a second audio signal (12) from the first audio signal (10) using a filter (120, 121) for a portion of the signal in the first audio signal (10) in the blocking range; determination of an auditory energy of the first audio signal (10) or of the second audio signal (12) within at least one predefined or predefinable therapeutically applicable frequency range; and specification of an evaluation parameter (30) for the second audio signal (12) as a function of the auditory energy and of a frequency separation between the therapeutically applicable frequency range and the blocking range, wherein the first audio signal (10) or the second audio signal (12) is analyzed subdivided into temporally consecutive sections for determining the evaluation parameter (30), wherein in particular each section comprises a predefinable duration or a predefinable number of digital audio samples.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) The invention is described below, without restricting the general idea of the invention, using exemplary embodiments with reference to the drawings, whereby we expressly refer to the drawings with regard to all details according to the invention that are not explained in greater detail in the text. The figures show in:
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10) In the drawings, the same or similar types of elements and/or parts are provided with the same reference numbers so that a re-introduction is omitted.
DETAILED DESCRIPTION
(11) An exemplary implementation of the method according to the invention is shown in
(12) Via the client computer 44, an original audio signal 10, for example an audio file saved on the client computer 44, is transferred to the server 40 via the Internet 42, where a therapy signal 12 is created by means of a digital filter 120. For the configuration of the filter, an individual tinnitus frequency 20 or therapy frequency 20 and optionally a blocking range 22, in particular blocking range width, are provided via the client computer 44, which have been determined for example by the treating doctor for the individual tinnitus patient. If no blocking range 22 is specified, a standard value, for example an octave, is used for the blocking range 22.
(13) In a signal analysis 130, the therapy signal 12 is analyzed and at least one evaluation parameter 30 is determined. Because it serves as a measure of the inhibition of neuronal activity, the evaluation parameter 30 is also called the inhibition parameter 30 below.
(14) In a parameter evaluation 140, the inhibition parameter 30 is compared with reference parameters in order to determine the suitability of the therapy signal 12 for the therapy or treatment of the individual tinnitus with the tinnitus frequency 20.
(15) The reference parameters are based, for example, on reference signals that have proven to be suitable or unsuitable for tinnitus therapy in empirical studies, wherein the reference parameters are given by the inhibition parameters 30 determined for the reference signals by means of the signal analysis 130.
(16) The result of the parameter evaluation 140 is transmitted to the client computer 44 via a user dialog 150. The user dialog 150 simultaneously provides an audio data flow 160 with the therapy signal 12 for playback by means of the client computer 44 or an audio file 162 with the therapy signal 12 for storage on the client computer 44.
(17) The amplitude frequency path of the filter 120 is represented schematically in
(18) The characteristic curve 60 of the filter 120 has a band-stop filter 70 around a center frequency F0, which corresponds in particular to the individual tinnitus frequency 20. The band-stop filter 70 has a therapeutic target range or blocking range with a blocking range width 22, which is for example one octave or is specified as a variable blocking range width 22. The blocking range defines a lower threshold frequency F2 and an upper threshold frequency F3 of the therapeutic target range or blocking range, wherein the blocking range is arranged, for example, symmetrically around the center frequency F0 on a logarithmic frequency scale. The damping of the band-stop filter 70, in particular the damping in the therapeutic target range, is determined in particular depending on the quantization word width M of the digital audio signal 10 to be filtered and is, for example, M*6 dB+2 dB.
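As a rough illustration of this geometry, the following Python sketch computes the threshold frequencies F2 and F3 for a blocking range placed symmetrically around F0 on a logarithmic frequency scale, together with the example damping M*6 dB+2 dB. The function names and the example values (a tinnitus frequency of 4000 Hz and 16-bit samples) are assumptions chosen for illustration and are not taken from the specification.

```python
import math

def blocking_range_edges(f0_hz, width_octaves=1.0):
    """Return (F2, F3) for a blocking range centered on f0_hz on a log scale."""
    half = width_octaves / 2.0
    return f0_hz * 2.0 ** (-half), f0_hz * 2.0 ** half

def stopband_damping_db(word_width_bits):
    """Example damping from the text: M * 6 dB + 2 dB."""
    return word_width_bits * 6.0 + 2.0

f2, f3 = blocking_range_edges(4000.0)  # hypothetical tinnitus frequency
damping = stopband_damping_db(16)      # hypothetical 16-bit audio
```

With a width of one octave, F3 is exactly twice F2 and F0 is their geometric mean, which is what symmetry on a logarithmic scale amounts to.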
(19) Above and below the blocking range, the band-stop filter 70 has transition areas, which are characterized by the lower threshold frequency F1 of the band-stop filter 70 and the upper threshold frequency F4 of the band-stop filter 70. The width of the transition areas is thereby dependent on the implementation of the respective filter 120, wherein a decreasing width of the transition areas in the case of a digitally implemented filter 120 is generally connected with increased computing effort and thus with increased time effort during the creation of the therapy signal 12. The filter 120 is preferably designed or configured such that the width of the transition areas is small compared to the blocking range 22. For example, each of the widths of the transition areas is a quarter tone when the blocking range 22 is one octave or respectively six whole tone steps.
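The four threshold frequencies can be derived from F0 as in the following sketch, which assumes the example values above: a blocking range of one octave and transition areas of a quarter tone each. Since an octave spans six whole tones, a quarter tone is taken as 1/24 of an octave. The function name and the 4000 Hz input are illustrative assumptions.

```python
QUARTER_TONE_OCTAVES = 1.0 / 24.0  # six whole tones per octave, four quarter tones per whole tone

def band_stop_edges(f0_hz, width_octaves=1.0, transition_octaves=QUARTER_TONE_OCTAVES):
    """Return (F1, F2, F3, F4) for a band-stop centered on f0_hz."""
    f2 = f0_hz * 2.0 ** (-width_octaves / 2.0)  # lower edge of the blocking range
    f3 = f0_hz * 2.0 ** (width_octaves / 2.0)   # upper edge of the blocking range
    f1 = f2 * 2.0 ** (-transition_octaves)      # start of the lower transition area
    f4 = f3 * 2.0 ** transition_octaves         # end of the upper transition area
    return f1, f2, f3, f4

f1, f2, f3, f4 = band_stop_edges(4000.0)  # hypothetical tinnitus frequency
```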
(20) Outside of the transition areas, the characteristic curve 60 of the filter 120 has passbands, in which the audio signal to be filtered remains essentially unchanged. In these areas, the damping is correspondingly zero.
(21)
(22) In a signal preparation 210, the original audio signal 10 is decoded and, if applicable, converted to a linear PCM format (Pulse Code Modulation) if the original audio signal 10 is not yet available in such a format.
(23) The audio signal prepared in this manner undergoes a normalization 212 in order to keep the noise in the filtered audio signal low relative to the signal, i.e. to keep the signal-to-noise ratio high. If the step response of the filter 120 produces overshoots, the audio signal is also reduced approximately by the height of the overshoot in a linear damping 214 in order to avoid distortions in the filtered audio signal.
(24) Furthermore, the sampling rate of the audio signal 10 as well as the quantization word width M of the prepared audio signal are determined in a parameter determination 220.
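The normalization 212 and the linear damping 214 can be pictured with the following minimal Python sketch. It assumes floating-point samples with a full-scale value of 1.0 and assumes that the overshoot is known as a fraction of the signal peak. Both function names and the 9 % overshoot are hypothetical.

```python
def normalize_peak(samples, target_peak=1.0):
    """Scale the signal so that its largest absolute sample equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence cannot be normalized
    gain = target_peak / peak
    return [s * gain for s in samples]

def apply_headroom(samples, overshoot_fraction):
    """Attenuate linearly so that a filter overshoot of the given relative height stays in range."""
    gain = 1.0 / (1.0 + overshoot_fraction)
    return [s * gain for s in samples]

prepared = apply_headroom(normalize_peak([0.1, -0.25, 0.2]), overshoot_fraction=0.09)
```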
(25) The actual filtering of the audio signal takes place by means of an FIR filter 250 through numerical convolution with suitable filter coefficients, which were determined previously (block 240) taking into consideration the sampling rate, the quantization word width M, the individual tinnitus frequency 20 as well as the blocking range 22.
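One possible way to obtain such coefficients is the windowed-sinc method sketched below in plain Python. This is an assumption for illustration only: the specification does not name a design method, and the Hamming window used here reaches roughly 50 dB of stop-band damping, well short of the M*6 dB+2 dB example given above. All function names, the tap count and the 4 kHz example are hypothetical.

```python
import math

def lowpass_taps(cutoff_hz, fs_hz, num_taps):
    """Hamming-windowed sinc low-pass coefficients (num_taps must be odd)."""
    mid = (num_taps - 1) // 2
    fc = cutoff_hz / fs_hz  # cutoff in cycles per sample
    taps = []
    for n in range(num_taps):
        k = n - mid
        ideal = 2.0 * fc if k == 0 else math.sin(2.0 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps

def band_stop_taps(f_low_hz, f_high_hz, fs_hz, num_taps=255):
    """Band-stop = unit impulse minus band-pass (difference of two low-passes)."""
    lo = lowpass_taps(f_low_hz, fs_hz, num_taps)
    hi = lowpass_taps(f_high_hz, fs_hz, num_taps)
    taps = [a - b for a, b in zip(lo, hi)]
    taps[(num_taps - 1) // 2] += 1.0
    return taps

def convolve(signal, taps):
    """Direct-form numerical convolution of the signal with the coefficients."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for i, s in enumerate(signal):
        for j, t in enumerate(taps):
            out[i + j] += s * t
    return out

def gain_at(taps, freq_hz, fs_hz):
    """Magnitude of the filter's frequency response at one frequency."""
    w = 2.0 * math.pi * freq_hz / fs_hz
    re = sum(t * math.cos(w * n) for n, t in enumerate(taps))
    im = sum(t * math.sin(w * n) for n, t in enumerate(taps))
    return math.hypot(re, im)

fs = 44100.0
taps = band_stop_taps(2828.0, 5657.0, fs)   # one octave around a hypothetical 4 kHz
filtered = convolve([1.0, 0.0, 0.0], taps)  # an impulse returns the coefficients
```

Narrower transition areas require more coefficients, which is the trade-off between transition width and computing effort mentioned earlier.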
(26) In a subsequent noise suppression 260, the so-called dithering, quantization rounding errors in the filtered audio signal are randomized. In a signal post-processing 270, the filtered audio signal is then converted to a freely selectable data format and made available as a therapy signal 12. For example, the data format of the original audio signal 10 is used.
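A common form of dithering adds triangularly distributed noise of about one quantization step before rounding. The sketch below shows this variant as an assumption, since the specification does not state which dither is used. The function name and parameters are hypothetical.

```python
import random

def quantize_with_dither(samples, bits=16, rng=None):
    """Requantize floats in [-1, 1] to signed integers using triangular (TPDF) dither."""
    if rng is None:
        rng = random.Random()
    full_scale = 2 ** (bits - 1) - 1
    out = []
    for s in samples:
        # The sum of two uniform variables is triangular noise spanning +/- 1 step.
        noise = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
        value = int(round(s * full_scale + noise))
        out.append(max(-full_scale - 1, min(full_scale, value)))  # clip to the integer range
    return out

quantized = quantize_with_dither([0.0, 0.5, -1.0, 1.0], bits=16, rng=random.Random(0))
```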
(27)
(28) The audio signal 10, 12 to be analyzed is analyzed in sections, wherein one section comprises for example 576 audio samples at a sampling rate of 44.1 kHz and is called a granule below. Moreover, if present, the left stereo channel and the right stereo channel of the audio signal can be analyzed individually.
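Splitting one channel into granules might look like the following sketch. Zero-padding the last, incomplete section is an assumption made here so that all granules have the same length. The function name is hypothetical.

```python
def split_into_granules(samples, granule_size=576):
    """Return consecutive sections of granule_size samples; the last one is zero-padded."""
    granules = []
    for start in range(0, len(samples), granule_size):
        chunk = list(samples[start:start + granule_size])
        chunk.extend([0.0] * (granule_size - len(chunk)))
        granules.append(chunk)
    return granules

GRANULE_SECONDS = 576 / 44100.0  # about 13 ms per granule at 44.1 kHz

# For a stereo signal, each channel would be split separately.
left_granules = split_into_granules([0.0] * 1500)
```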
(29) Each granule of the audio signal 10, 12 to be analyzed is analyzed in the frequency domain on the basis of the functionality of human hearing. The modeling of human hearing is generally based on auditory filters with differing and usually relative bandwidths. Examples are the frequency groups (critical bands) according to Zwicker, i.e. the so-called Bark scale, or the equivalent rectangular bandwidth according to Moore, i.e. the so-called ERB scale. Both the Bark scale and the ERB scale are related to frequency non-linearly and are chosen such that the division of the scale into integer scale sections corresponds to the signal processing of human hearing. For a differentiated analysis, each scale section can be divided into several, for example three, parts. Such a part is called a partition band below and has, for example, a width of 1/3 Bark or 1/3 ERB.
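The specification does not state which conversion formulas are used. As an assumption, the sketch below uses two widely cited approximations: Traunmüller's formula for the Bark scale and the Glasberg and Moore formula for the ERB-rate scale. It then derives partition band edges by stepping through the Bark scale in thirds. The function names are hypothetical.

```python
import math

def hz_to_bark(f_hz):
    """Traunmueller approximation of the Bark scale."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53

def bark_to_hz(z):
    """Inverse of hz_to_bark, valid for z below 26.28."""
    return 1960.0 * (z + 0.53) / (26.28 - z)

def hz_to_erb_rate(f_hz):
    """Glasberg and Moore approximation of the ERB-rate scale."""
    return 21.4 * math.log10(1.0 + 0.00437 * f_hz)

def partition_band_edges(f_max_hz, parts_per_bark=3):
    """Edges in Hz of partition bands that are 1/parts_per_bark Bark wide."""
    edges = []
    step = 0
    while step / parts_per_bark < 26.28:
        f = bark_to_hz(step / parts_per_bark)
        if f > f_max_hz:
            break
        edges.append(f)
        step += 1
    return edges

edges = partition_band_edges(16000.0)
```

Because both scales are non-linear in frequency, the partition bands are narrow in Hz at low frequencies and wide at high frequencies.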
(30) In each granule of the audio signal 10, 12 to be analyzed, the auditory energy contained in each partition band is determined (block 310). This takes place, for example, using a Fast Fourier Transform (FFT) and assuming a sound pressure level that leads to a volume that is considered moderate when listening to the audio signal 10, 12. For example, the audio signal 10, 12 to be analyzed is scaled such that the maximum sound pressure level is approximately 70 dB.
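The per-band energy computation can be sketched as follows. To stay self-contained, the example evaluates the discrete Fourier transform directly instead of calling an FFT routine, applies a Hann window (an assumption), and omits the calibration to a sound pressure level of about 70 dB. The band edges are passed in by the caller, and all names are hypothetical.

```python
import cmath
import math

def power_spectrum(granule, fs_hz):
    """Return (frequencies, powers) for the non-negative DFT bins of a Hann-windowed granule."""
    n = len(granule)
    windowed = [s * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / n)) for i, s in enumerate(granule)]
    freqs, powers = [], []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(windowed))
        freqs.append(k * fs_hz / n)
        powers.append(abs(acc) ** 2)
    return freqs, powers

def band_energies(granule, fs_hz, band_edges_hz):
    """Sum the bin powers that fall into each band [edge[b], edge[b + 1])."""
    freqs, powers = power_spectrum(granule, fs_hz)
    energies = [0.0] * (len(band_edges_hz) - 1)
    for f, p in zip(freqs, powers):
        for b in range(len(energies)):
            if band_edges_hz[b] <= f < band_edges_hz[b + 1]:
                energies[b] += p
                break
    return energies

fs = 44100.0
tone = [math.sin(2.0 * math.pi * 4000.0 * i / fs) for i in range(576)]  # test tone
energies = band_energies(tone, fs, [0.0, 2000.0, 6000.0, 22051.0])       # three coarse bands
```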
(31) Furthermore, a tonality is determined for each partition band in each granule (block 320). The tonality is a measure of whether a sound event is noise-like, i.e. wide-band, or tonal, i.e. narrow-band. It can be determined, for example, via the predictability or periodicity of the audio signal over time, which however requires observing several successive temporal sections of the audio signal 10, 12 to be analyzed. Alternative determination processes, for example based on the distribution of the sound energy in the frequency spectrum of the granule currently being analyzed, in particular within the individual partition bands of the granule, are thus preferred.
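One measure based on the distribution of energy within a band is the spectral flatness, i.e. the ratio of the geometric to the arithmetic mean of the bin powers. The sketch below uses one minus the flatness as the tonality. This choice is an assumption for illustration and is not named in the specification. The function name and the small floor value are hypothetical.

```python
import math

def tonality(bin_powers, floor=1e-12):
    """Return a value in [0, 1]: near 0 for noise-like, near 1 for tonal band content."""
    if not bin_powers:
        return 0.0
    p = [max(x, floor) for x in bin_powers]  # avoid log(0)
    arithmetic = sum(p) / len(p)
    geometric = math.exp(sum(math.log(x) for x in p) / len(p))
    return 1.0 - geometric / arithmetic

flat = tonality([1.0, 1.0, 1.0, 1.0])      # evenly spread energy, noise-like
peaked = tonality([0.0, 0.0, 100.0, 0.0])  # energy in a single bin, tonal
```

Unlike a predictability measure, this needs only the spectrum of the current granule.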
(32) If the current partition band lies in full or in part outside of the therapeutic target range, an excitation strength is first determined for the current partition band based on the auditory energy and the tonality (block 330), wherein it can be taken into consideration that noise-like sound events are perceived as stronger or louder than tonal sound events at the same sound pressure. It can also be taken into consideration that higher frequencies are perceived as weaker or less loud than lower frequencies at the same sound energy, for example by reducing the excitation strength when the current partition band lies above the tinnitus frequency 20 or above the therapeutic target range. The excitation strength is a measure of the stimulation of the neurons of the primary auditory cortex that are tonotopic to the current partition band.
(33) From the excitation strength of the current partition band, a damping strength is determined for each of the other partition bands (block 332), which is a measure of the lateral inhibition of the neurons tonotopic to the respective other partition band. In particular, neuroacoustic or psychoacoustic spreading functions are used for this. It is taken into consideration in particular that the range of the lateral inhibition depends greatly on the excitation strength, i.e. on the strength of the stimulation of the neurons tonotopic to the current partition band. The greater the excitation strength, the greater the frequency range, or the number of neighboring partition bands, in which the lateral inhibition shows relevant effects. If empirically determined psychoacoustic spreading functions are used, a correction based on the equal-loudness contours (isophones) according to ISO 226:2003 preferably takes place in order to compensate in particular for the frequency-weighting properties of the outer, middle and inner ear.
(34) If the current partition band lies within the therapeutic target range, which is defined in particular by the tinnitus frequency 20 and the blocking range 22, an excitation strength is also determined (block 334). A distinction is made here as to whether the audio signal 10, 12 to be analyzed is an unfiltered or original audio signal 10 or a therapy signal 12.
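The dependence of the inhibition range on the excitation strength can be illustrated with a deliberately simple, purely hypothetical spreading function: the damping falls off linearly with the distance in partition bands and is cut off at zero, so that a stronger excitation reaches more neighboring bands. The linear shape, the slope value and the use of dB-like units are assumptions and do not represent the neuroacoustic or psychoacoustic functions referred to above.

```python
def damping_strengths(source_band, excitation_db, num_bands, slope_db_per_band=9.0):
    """Damping that one excited band exerts on every other band (0 for itself)."""
    out = []
    for band in range(num_bands):
        if band == source_band:
            out.append(0.0)  # a band does not damp itself in this sketch
        else:
            distance = abs(band - source_band)
            out.append(max(0.0, excitation_db - slope_db_per_band * distance))
    return out

weak = damping_strengths(source_band=5, excitation_db=20.0, num_bands=11)
strong = damping_strengths(source_band=5, excitation_db=60.0, num_bands=11)
```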
(35) In the case of an unfiltered audio signal 10, the excitation strength is set to zero. The unfiltered audio signal 10 is treated correspondingly as if it had been processed with an ideally damping band-stop filter with infinitely narrow transition areas.
(36) In the case of a filtered audio signal or therapy signal 12, the excitation strength is determined in the same way as for a partition band outside the therapeutic target range or blocking range.
(37) The analysis steps 310, 320, 330, 332, 334 described above are repeated for all partition bands of a granule. An excitation strength and a plurality of damping strengths are then available for each partition band of the granule.
(38) The damping strengths for each partition band are combined respectively into a total damping strength for this partition band (block 340). This takes place for example by means of intensity addition, by means of non-linear addition or by means of maximum value calculation.
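The three combination rules can be sketched as follows, assuming that the damping strengths are expressed as levels in dB. The specification leaves the units and the exact form of the non-linear addition open, so the power-law variant and its exponent are assumptions. The function name is hypothetical.

```python
import math

def combine(values_db, mode="intensity", exponent=0.5):
    """Combine several dB-like strengths into one total strength."""
    if not values_db:
        raise ValueError("no values to combine")
    if mode == "max":
        return max(values_db)
    if mode == "intensity":
        # Add the underlying intensities, then convert back to a level.
        return 10.0 * math.log10(sum(10.0 ** (v / 10.0) for v in values_db))
    if mode == "nonlinear":
        # Power-law addition of amplitudes; the exponent is a tunable assumption.
        amplitudes = [10.0 ** (v / 20.0) for v in values_db]
        total = sum(a ** exponent for a in amplitudes) ** (1.0 / exponent)
        return 20.0 * math.log10(total)
    raise ValueError("unknown mode: " + mode)

total_intensity = combine([60.0, 60.0])           # about 63 dB
total_max = combine([60.0, 40.0], mode="max")     # 60 dB
total_nonlinear = combine([60.0, 60.0], mode="nonlinear")
```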
(39) Optionally, the excitation strengths and the total damping strengths of all partition bands of a granule are corrected with respect to one or more other granules (block 350).
(40) Through a correction with respect to one or more preceding granules, it can be taken into consideration, for example, that a strong excitation or damping of a neuronal area continues to have an effect for a short time even after the stimulus has faded.
(41) Similarly, through a correction with respect to a simultaneous granule of another channel of the audio signal 10, 12, it can be taken into consideration that, if applicable, an excitation of the neurons responsible for one ear results in a damping of the neurons responsible for the other ear.
(42) The total damping strengths of those partition bands lying within the therapeutic target range are subsequently combined into an inhibition parameter 30 (block 360), which takes place for example by means of intensity addition, by means of non-linear addition or by means of maximum value calculation. If the audio signal to be analyzed is a therapy signal 12, the excitation strengths of the partition bands lying within the blocking range are also included with the opposite sign.
(43) An inhibition parameter 30 is thus available for each granule, which is a measure of the inhibition of neuronal activity in the primary auditory cortex caused by the current granule of the analyzed therapy signal 12 or of a therapy signal created from the analyzed unfiltered audio signal 10.
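A minimal sketch of block 360 follows. For brevity it uses plain summation in place of the combination rules named above, which is an assumption. The function name, the argument layout and the example numbers are hypothetical.

```python
def inhibition_parameter(total_damping, excitation, in_target, is_therapy_signal):
    """Combine damping over the target-range bands; subtract residual excitation for therapy signals."""
    value = 0.0
    for damping, excite, inside in zip(total_damping, excitation, in_target):
        if not inside:
            continue  # only bands within the therapeutic target range contribute
        value += damping
        if is_therapy_signal:
            value -= excite  # excitation enters with the opposite sign
    return value

damping = [1.0, 4.0, 5.0, 2.0]
excite = [9.0, 0.5, 0.25, 9.0]
inside = [False, True, True, False]
param_therapy = inhibition_parameter(damping, excite, inside, is_therapy_signal=True)
param_original = inhibition_parameter(damping, excite, inside, is_therapy_signal=False)
```

For an unfiltered signal the excitation within the target range is treated as zero, so only the damping terms remain.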
(44) Additional parameters, which also correlate with the inhibition of neuronal activity in the primary auditory cortex, can be determined from the inhibition parameters 30.
(45) In particular, the inhibition parameters 30 of simultaneous granules of the two stereo channels can be combined into a sum parameter and into a difference parameter. The sum parameter, for which the inhibition parameters 30 of the granules of both stereo channels are taken in particular with the same sign, is for example a measure of the therapy potential of the audio signal 10, 12. This also applies if the therapy takes place by means of a loudspeaker, so that both ears are equally exposed to the two stereo channels. The difference parameter, for which the inhibition parameters 30 of the granules of both stereo channels are taken in particular with opposite signs, specifies in contrast how the therapy potential of the audio signal 10, 12 is distributed between the stereo channels. This is of particular interest when the therapy takes place with headphones, so that each ear is exposed to only one stereo channel.
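Expressed per granule, the sum and difference parameters could be computed as in this short sketch. The function name and the example values are hypothetical.

```python
def stereo_parameters(left, right):
    """Return per-granule sum and difference of the two channels' inhibition parameters."""
    sums = [l + r for l, r in zip(left, right)]
    diffs = [l - r for l, r in zip(left, right)]
    return sums, diffs

# Hypothetical inhibition parameters for three simultaneous granules.
sums, diffs = stereo_parameters([2.0, 3.0, 1.5], [2.0, 1.0, 2.5])
```

A difference near zero indicates that both channels contribute equally, while a large magnitude indicates that one ear would receive most of the therapy potential over headphones.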
(46) The inhibition parameters 30, the sum parameters or the difference parameters of all granules of an audio signal 10, 12 can also be combined into one total parameter, which accordingly specifies in particular the therapeutic potential of the total audio signal 10, 12. This takes place for example by means of intensity addition, non-linear addition, maximum value formation or even average value formation.
(47)
(48) The implementations according to
(49)
(50) The method according to the invention is also suitable for preparing or processing audio signals for a tinnitus therapy for use with playback devices that have a non-linear frequency path.
(51) For example, commercially available headphones often have a non-linear frequency path, either due to their design or as a result of deliberate tuning, wherein the non-linearity is generally uniform for all models of a series and is correspondingly known or at least determinable.
(52) Through use of a non-linear playback device or a playback device with non-linear frequency path, the therapeutic qualities of the audio signal provided for tinnitus therapy are reduced and the assessment of the therapeutic suitability of the audio signal is falsified according to the above description.
(53) In order to prevent this, an optional correction of the audio signal provided for the therapy is provided within the framework of the invention. An exemplary design of this correction is described in
(54)
(55) In front of the FIR filter 250, a further correction filter 251, designed for example as an FIR filter, is used, by means of which a correction is performed with respect to the non-linearity of the playback device. For example, filter coefficients 241 or correction coefficients 241 from a database are used for this, which are adjusted to the playback device to be corrected. Such frequencies, which are played back in a damped manner due to the non-linearity of the playback device, are increased by the correction filter 251 in the filtered audio signal. Such frequencies, which are played back excessively or strengthened due to the non-linearity of the playback device, are correspondingly damped in the filtered audio signal.
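The idea of the correction can be sketched at the level of gains per frequency: wherever the playback device deviates from a flat response, the correction applies the opposite deviation. The device data below is invented for illustration, and the limit on the maximum boost is an added assumption to avoid excessive amplification. The function name is hypothetical.

```python
def correction_gains_db(device_response_db, max_boost_db=12.0):
    """Invert a playback device's deviation per frequency, limiting the boost."""
    return {freq: min(-deviation, max_boost_db)
            for freq, deviation in device_response_db.items()}

# Hypothetical headphone deviations in dB at a few frequencies in Hz.
device = {100: -3.0, 1000: 0.0, 8000: 4.0, 16000: -20.0}
correction = correction_gains_db(device)
```

In a real implementation these gains would be turned into the filter coefficients 241 of the correction filter 251.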
(56) The occurring correction in the audio signal 12 provided for the therapy is also preferably taken into consideration in the determination of the inhibition parameter 30, as shown in
(57) A non-linearity simulation 311 is performed here before the determination of the auditory energy (block 310), in order to correctly consider the non-linearity of the playback device. The non-linearity simulation 311 is based on the correction coefficients 241 already used for the correction filter 251.
(58) All named characteristics, including those taken from the drawings alone and also individual characteristics, which are disclosed in combination with other characteristics, are considered alone and in combination as essential for the invention. Embodiments according to the invention can be realized by individual characteristics, or a combination of several characteristics.