AUDIO PROCESSING APPARATUS AND AUDIO PROCESSING METHOD

Abstract

An upper limit of a frequency range of audio indicated by input audio data is detected. A representative point extraction unit downsamples the input audio data to a sampling rate set to be less than or equal to twice the detected upper limit to obtain representative-point audio data. An interpolation processing unit upsamples the representative-point audio data by using a fractal interpolation function (FIF) that uses a mapping function calculated by a mapping function calculation unit, while using the input audio data, if necessary, to generate high-frequency interpolated audio data.

Claims

1. An audio processing apparatus for performing high-frequency interpolation of audio data, comprising: a frequency range upper-limit detector configured to detect an upper limit of a frequency range of audio indicated by input audio data that is audio data to be subjected to the high-frequency interpolation; a downsampler configured to downsample the input audio data by decimating samples from the input audio data so as to achieve a sampling rate that is less than or equal to twice the upper limit detected by the frequency range upper-limit detector to generate intermediate audio data; and an upsampler configured to upsample the intermediate audio data generated by the downsampler using a fractal interpolation function to generate high-frequency interpolated audio data.

2. The audio processing apparatus according to claim 1, wherein the downsampler is configured to downsample the input audio data to a maximum sampling rate equal to the sampling rate that is less than or equal to twice the upper limit detected by the frequency range upper-limit detector among sampling rates that are reciprocals of powers of 2 of the sampling rate of the input audio data, by decimating samples from the input audio data, to generate intermediate audio data, and wherein the upsampler is configured to upsample the intermediate audio data to a sampling rate that is a multiple of a power of 2 of a sampling rate of the intermediate audio data using a fractal interpolation function to generate the high-frequency interpolated audio data.

3. The audio processing apparatus according to claim 2, wherein the input audio data comprises audio data obtained by decoding compressed and encoded audio data, and wherein the frequency range upper-limit detector is configured to detect the upper limit of the frequency range of the audio indicated by the input audio data on the basis of a bit rate indicating a number of bits of the compressed and encoded audio data to be processed per unit time when the compressed and encoded audio data is reproduced.

4. The audio processing apparatus according to claim 1, wherein the input audio data comprises audio data obtained by decoding compressed and encoded audio data, and wherein the frequency range upper-limit detector is configured to detect the upper limit of the frequency range of the audio indicated by the input audio data on the basis of a bit rate indicating the number of bits of the compressed and encoded audio data to be processed per unit time when the compressed and encoded audio data is reproduced.

5. An audio processing method for an audio processing apparatus that performs audio processing, for performing high-frequency interpolation of audio data, the audio processing method comprising: a frequency range upper-limit detection step of detecting, by the audio processing apparatus, an upper limit of a frequency range of audio indicated by input audio data that is audio data to be subjected to the high-frequency interpolation; a downsampling step of downsampling, by the audio processing apparatus, the input audio data by decimating samples from the input audio data so as to achieve a sampling rate that is less than or equal to twice the upper limit detected in the frequency range upper-limit detection step to generate intermediate audio data; and an upsampling step of upsampling, by the audio processing apparatus, the intermediate audio data generated in the downsampling step by using a fractal interpolation function to generate high-frequency interpolated audio data.

6. The audio processing method according to claim 5, wherein the downsampling step includes downsampling the input audio data to a maximum sampling rate equal to the sampling rate that is less than or equal to twice the upper limit detected in the frequency range upper-limit detection step among sampling rates that are reciprocals of powers of 2 of the sampling rate of the input audio data, by decimating samples from the input audio data, to generate intermediate audio data, and wherein the upsampling step includes upsampling the intermediate audio data to a sampling rate that is a multiple of a power of 2 of a sampling rate of the intermediate audio data by using a fractal interpolation function to generate the high-frequency interpolated audio data.

7. The audio processing method according to claim 6, wherein the input audio data comprises audio data obtained by decoding compressed and encoded audio data, and wherein the frequency range upper-limit detection step includes detecting the upper limit of the frequency range of the audio indicated by the input audio data on the basis of a bit rate indicating the number of bits of the compressed and encoded audio data to be processed per unit time when the compressed and encoded audio data is reproduced.

8. The audio processing method according to claim 5, wherein the input audio data comprises audio data obtained by decoding compressed and encoded audio data, and wherein the frequency range upper-limit detection step includes detecting the upper limit of the frequency range of the audio indicated by the input audio data on the basis of a bit rate indicating the number of bits of the compressed and encoded audio data to be processed per unit time when the compressed and encoded audio data is reproduced.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0015] FIG. 1 is a block diagram illustrating a configuration of an audio processing apparatus;

[0016] FIG. 2 illustrates functional blocks in the audio processing apparatus for performing high-frequency interpolation;

[0017] FIG. 3 is a flowchart illustrating a high-frequency interpolation operation control process;

[0018] FIGS. 4A to 4C illustrate examples of a high-frequency interpolation operation;

[0019] FIGS. 5A1 and 5A2 illustrate an example of the high-frequency interpolation operation;

[0020] FIGS. 5B1 and 5B2 illustrate an example of the high-frequency interpolation operation;

[0021] FIG. 6 illustrates the principle of high-frequency interpolation using an FIF; and

[0022] FIGS. 7A and 7B illustrate the results of high-frequency interpolation of the related art and the results of high-frequency interpolation according to forms of the present disclosure, respectively, for illustrative comparison.

DETAILED DESCRIPTION OF THE DRAWINGS

[0023] FIG. 1 illustrates a configuration of an audio processing apparatus.

[0024] As illustrated in FIG. 1, the audio processing apparatus includes an audio source 1, an input processing unit 2, a digital sound processor 3, an amplifier 4, a speaker 5, and a control unit 6 that controls the audio source 1, the input processing unit 2, the digital sound processor 3, and the amplifier 4. In the configuration described above, the audio source 1 is a device that outputs audio data, such as a recording medium having audio files recorded thereon or a broadcast receiver that receives audio data.

[0025] The input processing unit 2 captures audio data from the audio source 1 in accordance with control of the control unit 6, performs pre-processing such as decoding the captured audio data, if necessary, and outputs the audio data subjected to the pre-processing to the digital sound processor 3 as input audio data.

[0026] Further, prior to the output of the input audio data to the digital sound processor 3, the input processing unit 2 detects the sampling rate of the input audio data and the upper limit of the frequency range of audio indicated by the input audio data, and notifies the control unit 6 of the detected input sampling rate and upper limit. The upper limit of the frequency range of the audio indicated by the audio data may be detected by analyzing the frequency spectra of the input audio data. Alternatively, if the audio data captured from the audio source 1 is compressed and encoded audio data, the upper limit of the frequency range of the audio indicated by the input audio data may be detected in accordance with the bit rate of the compressed and encoded audio data (the number of bits of the compressed and encoded audio data to be processed per second when reproduced). In a case where the upper limit of the frequency range of the audio indicated by the input audio data is detected in accordance with the bit rate of the compressed and encoded audio data, the relationship between the bit rate of audio data and the upper limit of the frequency range of audio indicated by the audio data is registered in advance and the upper limit of the frequency range of audio indicated by input audio data is detected in accordance with the registered relationship.
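The bit-rate-based detection described in paragraph [0026] can be pictured as a table lookup. The following is a minimal sketch only: the table name `REGISTERED_LIMITS`, the function name, and every numeric value in the table are hypothetical placeholders for illustration, since the registered relationship itself is not specified here and would depend on the codec.

```python
import bisect

# Hypothetical registered relationship: (bit rate in kbit/s, upper limit in Hz).
# These numbers are placeholders for illustration, not values from the disclosure.
REGISTERED_LIMITS = [(64, 11000), (128, 16000), (192, 19000), (320, 20000)]


def upper_limit_from_bit_rate(bit_rate_kbps):
    """Return the registered upper limit for the largest registered bit rate
    that does not exceed bit_rate_kbps (the lowest entry if none qualifies)."""
    rates = [rate for rate, _ in REGISTERED_LIMITS]
    index = bisect.bisect_right(rates, bit_rate_kbps) - 1
    index = max(index, 0)  # below the first entry: fall back to the lowest limit
    return REGISTERED_LIMITS[index][1]
```

Rounding down to the nearest registered bit rate is one possible design choice; it errs toward a lower upper limit, which in turn leads to a lower representative-point sampling rate.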

[0027] The digital sound processor 3 is a processor that performs audio processing in accordance with a preset program. The digital sound processor 3 performs audio processing, such as high-frequency interpolation, on the input audio data input from the input processing unit 2 in accordance with control of the control unit 6, and outputs the resulting audio data to the amplifier 4 as output audio data.

[0028] The amplifier 4 amplifies the output audio data input from the digital sound processor 3 with a gain set by the control unit 6, and outputs the resulting audio data to the speaker 5.

[0029] FIG. 2 illustrates a configuration of functions of the digital sound processor 3 for performing high-frequency interpolation.

[0030] As illustrated in FIG. 2, the digital sound processor 3 includes a representative point extraction unit 31, a mapping function calculation unit 32, and an interpolation processing unit 33. The operation of the representative point extraction unit 31, the mapping function calculation unit 32, and the interpolation processing unit 33 will be described below.

[0031] As described above, the representative point extraction unit 31, the mapping function calculation unit 32, and the interpolation processing unit 33 are implemented by the digital sound processor 3 executing a preset program.

[0032] The control unit 6 is a processor that performs processes in accordance with preset programs, and performs a high-frequency interpolation operation control process as one of the processes in accordance with the preset programs.

[0033] FIG. 3 illustrates the procedure of the high-frequency interpolation operation control process.

[0034] In the high-frequency interpolation operation control process, as illustrated in FIG. 3, the control unit 6 monitors the input processing unit 2 to determine whether the input processing unit 2 has provided a notification of the sampling rate of the input audio data and the upper limit of the frequency range of audio indicated by the input audio data (step 302).

[0035] When the notification is provided, the control unit 6 determines the sampling rate of representative-point audio data so that the sampling rate of the representative-point audio data is less than or equal to twice the upper limit of the frequency range of the audio indicated by the input audio data, and sets the determined sampling rate of the representative-point audio data in the representative point extraction unit 31 (step 304).

[0036] The sampling rate of the representative-point audio data is assumed to be, for example, a maximum sampling rate that is less than or equal to twice the upper limit of the frequency range of the audio indicated by the input audio data from the input processing unit 2 among sampling rates that are reciprocals of powers of 2 of the sampling rate of the input audio data.
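The selection rule of paragraph [0036] can be sketched as follows. This is an illustrative sketch, not the disclosed implementation: the function name is hypothetical, and it assumes that at least one halving is always applied (that is, the candidate rates start at one half of the input sampling rate).

```python
def representative_rate(input_rate_hz, upper_limit_hz):
    """Pick the largest rate of the form input_rate / 2**k (k >= 1, an
    assumption) that is less than or equal to twice the detected upper limit.

    Returns (representative_rate_hz, decimation_factor).
    """
    limit = 2 * upper_limit_hz
    factor = 2
    while input_rate_hz / factor > limit:
        factor *= 2  # halve the candidate rate until it fits under the limit
    return input_rate_hz / factor, factor
```

For example, under these assumptions a 44.1 kHz input with a detected upper limit of 16 kHz yields a representative-point sampling rate of 22.05 kHz (decimation factor 2).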

[0037] The representative point extraction unit 31 in which the sampling rate of the representative-point audio data has been set in the way described above downsamples, for each unit processing interval T that is a time interval having a predetermined time length, the input audio data to audio data having a sampling rate set as the sampling rate of the representative-point audio data. The representative point extraction unit 31 outputs the downsampled audio data to the mapping function calculation unit 32 and the interpolation processing unit 33 as representative-point audio data for the unit processing interval T.

[0038] The downsampling of the input audio data to the representative-point audio data is performed by selecting a sample as a representative point from among samples within the unit processing interval T of the input audio data so that the sampling rate of the representative point is equal to the set sampling rate of the representative-point audio data and by using, as representative-point audio data within the unit processing interval T, audio data obtained by decimating samples other than the sample selected as the representative point from the input audio data.

[0039] That is, for example, a sampling rate that is one half of the sampling rate of the input audio data is set as the sampling rate of the representative-point audio data. In this case, every second sample is extracted as a representative point from samples indicated by white circles in FIG. 4A within the unit processing interval T of the input audio data, and, as indicated by black circles in FIG. 4B, samples of the input audio data extracted as representative points are used as samples of representative-point audio data within the unit processing interval T.

[0040] For another example, a sampling rate that is one-quarter of that of the input audio data is set as the sampling rate of the representative-point audio data. In this case, every fourth sample is extracted as a representative point from samples indicated by white circles in FIG. 4A within the unit processing interval T of the input audio data, and, as indicated by black circles in FIG. 4C, samples of the input audio data extracted as representative points are used as samples of representative-point audio data within the unit processing interval T.
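The extraction of representative points in paragraphs [0038] to [0040] amounts to keeping every k-th sample of the unit processing interval T. A minimal sketch follows; the function name is hypothetical, and it assumes the interval holds a number of samples such that both its start point and its end point are kept as representative points.

```python
def extract_representative_points(samples, factor):
    """Keep every `factor`-th sample of one unit processing interval and
    decimate the rest.

    Assumes len(samples) == factor * M + 1 so that the first and last samples
    of the interval are both representative points.
    """
    if (len(samples) - 1) % factor != 0:
        raise ValueError("interval length must be factor * M + 1 samples")
    return samples[::factor]
```

With nine input samples, a factor of 2 keeps samples 0, 2, 4, 6, and 8 (the case of FIG. 4B), and a factor of 4 keeps samples 0, 4, and 8 (the case of FIG. 4C).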

[0041] In the following description, an interval t.sub.i between adjacent samples within the unit processing interval T of the representative-point audio data generated in the way described above is referred to as an “interpolation interval”.

[0042] Upon receipt of the representative-point audio data in the manner described above, the mapping function calculation unit 32 calculates, for each interpolation interval t.sub.i of the unit processing interval T, a mapping function ω.sub.i that is a contraction mapping of a signal for the unit processing interval T of the input audio data into the interpolation interval t.sub.i, as a mapping function ω.sub.i for the interpolation interval t.sub.i and sets the mapping function ω.sub.i in the interpolation processing unit 33.

[0043] The calculation of a mapping function ω.sub.i for an interpolation interval t.sub.i is performed in the following way.

[0044] When x.sub.i represents a time position of the i-th sample within a unit time interval T of representative-point audio data and y.sub.i represents a sample value (magnitude) of the i-th sample within the unit time interval T, a.sub.i, e.sub.i, c.sub.i, and f.sub.i are defined by Eqs. (1) to (4) below, respectively. In the equations, x.sub.0 represents the time position of a sample that is the start point of the unit time interval T of the representative-point audio data, y.sub.0 represents the sample value (magnitude) of the sample that is the start point of the unit time interval T, x.sub.M represents the time position of a sample that is the end point of the unit time interval T of the representative-point audio data, and y.sub.M represents the sample value (magnitude) of the sample that is the end point of the unit time interval T.

[00001] $\begin{matrix} a_{i} = \frac{x_{i} - x_{i - 1}}{x_{M} - x_{0}} & \text{Eq. (1)} \\ e_{i} = \frac{x_{M} \cdot x_{i - 1} - x_{0} \cdot x_{i}}{x_{M} - x_{0}} & \text{Eq. (2)} \\ c_{i} = \frac{y_{i} - y_{i - 1}}{x_{M} - x_{0}} - d_{i} \cdot \frac{y_{M} - y_{0}}{x_{M} - x_{0}} & \text{Eq. (3)} \\ f_{i} = \frac{x_{M} \cdot y_{i - 1} - x_{0} \cdot y_{i}}{x_{M} - x_{0}} - d_{i} \cdot \frac{x_{M} \cdot y_{0} - x_{0} \cdot y_{M}}{x_{M} - x_{0}} & \text{Eq. (4)} \end{matrix}$

[0045] Note that d.sub.i is selected as the value that minimizes Eq. (5) below, where u.sub.n represents the time position of the n-th sample of the input audio data within the unit time interval T, and v.sub.n represents the sample value (magnitude) of the n-th sample of the input audio data within the unit time interval T.

$E_{i} = \sum_{n = 0}^{N} (c_{i} \cdot u_{n} + d_{i} \cdot v_{n} + f_{i} - v_{m})^{2}$ Eq. (5)

[0046] In Eq. (5), m is determined by Eq. (6) below, where D represents the time interval between adjacent samples of the input audio data.

$m = [a_{i} \cdot u_{n} + e_{i} + 0.5 D]$ Eq. (6)

[0047] In Eq. (6), [ ] is the Gauss bracket (floor function), and [X] represents the largest integer that does not exceed X.

[0048] When α.sub.n and β.sub.n are defined by Eqs. (7) and (8), respectively, Eq. (5) can be modified to Eq. (9).

[00002] $\begin{matrix} \alpha_{n} = v_{n} - \frac{y_{M} - y_{0}}{x_{M} - x_{0}} \cdot u_{n} - \frac{x_{M} \cdot y_{0} - x_{0} \cdot y_{M}}{x_{M} - x_{0}} & \text{Eq. (7)} \\ \beta_{n} = v_{m} - \frac{y_{i} - y_{i - 1}}{x_{M} - x_{0}} \cdot u_{n} - \frac{x_{M} \cdot y_{i - 1} - x_{0} \cdot y_{i}}{x_{M} - x_{0}} & \text{Eq. (8)} \\ E_{i} = \sum_{n = 0}^{N} (\alpha_{n} \cdot d_{i} - \beta_{n})^{2} & \text{Eq. (9)} \end{matrix}$

[0049] Then, d.sub.i that minimizes Eq. (5) and Eq. (9) can be determined by Eq. (10).

[00003] $\begin{matrix} d_{i} = \frac{\sum_{n = 0}^{N} \alpha_{n} \cdot \beta_{n}}{\sum_{n = 0}^{N} \alpha_{n}^{2}} & \text{Eq. (10)} \end{matrix}$
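
As a brief check of how Eq. (10) follows from Eq. (9), the derivative of E.sub.i with respect to d.sub.i can be set to zero. The steps below are a sketch that uses only the quantities defined in Eqs. (7) to (9):

$\begin{matrix} \frac{\partial E_{i}}{\partial d_{i}} = 2 \sum_{n = 0}^{N} \alpha_{n} \cdot (\alpha_{n} \cdot d_{i} - \beta_{n}) = 0 \\ d_{i} \cdot \sum_{n = 0}^{N} \alpha_{n}^{2} = \sum_{n = 0}^{N} \alpha_{n} \cdot \beta_{n} \end{matrix}$

Because E.sub.i is quadratic in d.sub.i with a non-negative leading coefficient, this stationary point is a minimum, provided that the sum of the squared α.sub.n values is not zero.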

[0050] Then, the mapping function ω.sub.i for the interpolation interval t.sub.i is set by Eq. (11) below by using a.sub.i, e.sub.i, c.sub.i, d.sub.i, and f.sub.i, which are determined in the way described above.

[00004] $\begin{matrix} \begin{pmatrix} p_{n} \\ q_{n} \end{pmatrix} = \omega_{i} \begin{pmatrix} u_{n} \\ v_{n} \end{pmatrix} = \begin{pmatrix} a_{i} \cdot u_{n} + e_{i} \\ c_{i} \cdot u_{n} + d_{i} \cdot v_{n} + f_{i} \end{pmatrix} & \text{Eq. (11)} \end{matrix}$

[0051] In Eq. (11), p.sub.n represents the time position of the n-th sample of the input audio data within the unit time interval T which has been subjected to mapping using the mapping function ω.sub.i, and q.sub.n represents the sample value (magnitude) of the n-th sample of the input audio data within the unit time interval T which has been subjected to mapping using the mapping function ω.sub.i.

[0052] The calculation of the mapping function ω.sub.i described above may be performed after the duration of each time segment is normalized so that the time length of the unit time interval T is equal to 1, for simplicity of calculation.
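The calculation of Eqs. (1) to (10) for one unit interval can be sketched as below. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function name and the tuple layout `(a, e, c, d, f)` are hypothetical, time positions are measured in input-sample indices (so the sample spacing D equals 1 and the interval starts at index 0), and d.sub.i is set to 0 when the denominator of Eq. (10) vanishes.

```python
import math


def fif_coefficients(v, step):
    """Compute (a_i, e_i, c_i, d_i, f_i) for each interpolation interval.

    v    : input samples of one unit interval, len(v) == step * M + 1
    step : decimation factor used to pick the representative points
    Assumes time is measured in input-sample indices (D == 1).
    """
    n_last = len(v) - 1
    xs = list(range(0, n_last + 1, step))   # representative time positions
    ys = [v[x] for x in xs]                 # representative sample values
    x0, xm, y0, ym = xs[0], xs[-1], ys[0], ys[-1]
    span = xm - x0
    slope_t = (ym - y0) / span              # line through the interval end points
    icpt_t = (xm * y0 - x0 * ym) / span
    coeffs = []
    for i in range(1, len(xs)):
        a = (xs[i] - xs[i - 1]) / span                  # Eq. (1)
        e = (xm * xs[i - 1] - x0 * xs[i]) / span        # Eq. (2)
        slope_i = (ys[i] - ys[i - 1]) / span
        icpt_i = (xm * ys[i - 1] - x0 * ys[i]) / span
        num = den = 0.0
        for n in range(n_last + 1):
            m = math.floor(a * n + e + 0.5)             # Eq. (6) with D == 1
            alpha = v[n] - slope_t * n - icpt_t         # Eq. (7)
            beta = v[m] - slope_i * n - icpt_i          # Eq. (8)
            num += alpha * beta
            den += alpha * alpha
        d = num / den if den else 0.0                   # Eq. (10)
        c = slope_i - d * slope_t                       # Eq. (3)
        f = icpt_i - d * icpt_t                         # Eq. (4)
        coeffs.append((a, e, c, d, f))
    return coeffs
```

A useful sanity check on any such implementation is that each mapping sends the start and end points of the unit interval onto the two representative points that bound its interpolation interval, whatever value d.sub.i takes.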

[0053] Referring back to FIG. 3, when the sampling rate of the representative-point audio data is set in the representative point extraction unit 31 (step 304), the control unit 6 determines, by calculation, a mapping-source sample location in accordance with the ratio of the sampling rate of the high-frequency interpolated audio data and the sampling rate of the representative-point audio data, and sets the mapping-source sample location in the interpolation processing unit 33 (step 306). Then, the process returns to step 302. The sampling rate of the high-frequency interpolated audio data is a sampling rate set in advance as a sampling rate of high-frequency interpolated audio data that is audio data subjected to high-frequency interpolation by the digital sound processor 3.

[0054] In some forms of the present disclosure, it is assumed that the sampling rate of the high-frequency interpolated audio data and the sampling rate of the input audio data have a relationship in which the sampling rate of the high-frequency interpolated audio data is equal to the sampling rate of the input audio data or is a multiple of a power of 2 of the sampling rate of the input audio data and that the sampling rate of the high-frequency interpolated audio data is a multiple of a power of 2 of the sampling rate of the representative-point audio data.

[0055] The determination of the mapping-source sample location by calculation in step 306 is performed in the following way.

[0056] If the sampling rate of the high-frequency interpolated audio data is a multiple of the n-th power of 2 of the sampling rate of the representative-point audio data, a time position that is a position of division when the unit processing interval T is divided into 2.sup.n time intervals having an equal time length is determined as a mapping-source sample location by calculation. Note that the start point and end point of the unit processing interval T are not determined as mapping-source sample locations by calculation. However, the end point of the unit processing interval T may be determined as a mapping-source sample location by calculation.

[0057] As a result, for example, if the sampling rate of the high-frequency interpolated audio data is twice the sampling rate of the representative-point audio data, as indicated by a double circle in FIG. 5A1 indicating a sample at a mapping-source sample location of the input audio data, the median time position of the unit processing interval T is determined as a mapping-source sample location by calculation. If the sampling rate of the high-frequency interpolated audio data is four times the sampling rate of the representative-point audio data, as indicated by double circles in FIG. 5B1 indicating samples at mapping-source sample locations of the input audio data, a time position that is spaced one-quarter of the time length of the unit processing interval T from the start point of the unit processing interval T, the median time position of the unit processing interval T, and a time position that is spaced one-quarter of the time length of the unit processing interval T from the end point of the unit processing interval T are determined as mapping-source sample locations by calculation.
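The determination of mapping-source sample locations in paragraphs [0056] and [0057] can be sketched as follows. The function name is hypothetical, and locations are expressed as fractions of the time length of the unit processing interval T; the sketch follows the variant in which neither the start point nor the end point is included.

```python
from fractions import Fraction


def mapping_source_locations(ratio):
    """Return the interior division points of the unit processing interval,
    as fractions of its length, when it is split into `ratio` equal parts.

    ratio : output rate divided by representative-point rate (a power of 2)
    """
    if ratio < 1 or ratio & (ratio - 1):
        raise ValueError("ratio must be a power of 2")
    return [Fraction(k, ratio) for k in range(1, ratio)]
```

A ratio of 2 gives the single median location 1/2 (FIG. 5A1), and a ratio of 4 gives the locations 1/4, 1/2, and 3/4 (FIG. 5B1).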

[0058] The interpolation processing unit 33, in which the mapping-source sample locations have been set in the way described above, upsamples the representative-point audio data as follows. For each interpolation interval t.sub.i of the unit processing interval T, the interpolation processing unit 33 maps the sample at each mapping-source sample location of the input audio data for the unit processing interval T to a position between representative points of the representative-point audio data by using the mapping function ω.sub.i calculated for the interpolation interval t.sub.i by the mapping function calculation unit 32. The interpolation processing unit 33 then outputs the upsampled audio data as high-frequency interpolated audio data.

[0059] Specifically, for example, as in FIG. 5A1, when the median time position of the unit processing interval T is set as a mapping-source sample location, as illustrated in FIG. 5A2, for each interpolation interval t.sub.i of the representative-point audio data, the sample indicated by the double circle at the mapping-source sample location of the input audio data is mapped onto the median time position of the interpolation interval t.sub.i by using a mapping function ω.sub.i for the interpolation interval t.sub.i to upsample the representative-point audio data, and the upsampled representative-point audio data is output as high-frequency interpolated audio data.

[0060] As in FIG. 5B1, when the time position spaced one-quarter of the unit processing interval T from the start point of the unit processing interval T, the median time position of the unit processing interval T, and the time position spaced one-quarter of the unit processing interval T from the end point of the unit processing interval T are set as mapping-source sample locations, as illustrated in FIG. 5B2, for each interpolation interval t.sub.i of the representative-point audio data, the samples indicated by the three double circles at the mapping-source sample locations of the input audio data are mapped onto a time position that is spaced one-quarter of the time length of the interpolation interval from the start point of the interpolation interval t.sub.i, the median time position of the interpolation interval t.sub.i, and a time position that is spaced one-quarter of the time length of the interpolation interval from the end point of the interpolation interval t.sub.i by using a mapping function ω.sub.i for the interpolation interval t.sub.i to upsample the representative-point audio data, and the upsampled representative-point audio data is output as high-frequency interpolated audio data.

[0061] In the process of the interpolation processing unit 33 described above, if a sample of the input audio data at each mapping-source sample location is included in representative-point audio data as a sample of the representative-point audio data, the sample of the representative-point audio data may be used in place of the sample of the input audio data at the mapping-source sample location.
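The upsampling of paragraphs [0058] to [0061] can be sketched as below. This is an illustrative sketch under assumptions: the function name and the coefficient tuple layout `(a, e, c, d, f)` are hypothetical, time positions are measured in input-sample indices starting at 0, and the number of input sample intervals in the unit interval is assumed to be divisible by the upsampling ratio so that every mapping-source location falls on an input sample.

```python
def fif_upsample(v, rep_values, coeffs, ratio):
    """Upsample representative points by mapping input samples with Eq. (11).

    v          : input samples of one unit interval (indices are time positions)
    rep_values : representative sample values y_0 .. y_M
    coeffs     : one (a, e, c, d, f) tuple per interpolation interval
    ratio      : output rate divided by representative-point rate
    """
    n_last = len(v) - 1
    if n_last % ratio != 0:
        raise ValueError("mapping-source locations must fall on input samples")
    # Interior division points of the unit interval, as input-sample indices.
    sources = [k * n_last // ratio for k in range(1, ratio)]
    out = []
    for i, (a, e, c, d, f) in enumerate(coeffs):
        out.append(rep_values[i])                 # representative point y_{i-1}
        for n in sources:
            out.append(c * n + d * v[n] + f)      # q_n of Eq. (11)
    out.append(rep_values[-1])                    # end point y_M
    return out
```

The mapped time position p.sub.n of Eq. (11) is not computed explicitly here, because mapping the k-th division point of the unit interval always lands on the k-th division point of the interpolation interval, so the output order already encodes it.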

[0062] The high-frequency interpolated audio data output from the interpolation processing unit 33 in the way described above is output to the amplifier 4 as is or after being subjected to any other audio signal processing such as frequency characteristic adjustment processing by the digital sound processor 3, as output audio data.

[0063] FIG. 7B illustrates the frequency characteristics of the high-frequency interpolated audio data generated in the way described above.

[0064] FIG. 7B illustrates a case where audio data has a sampling rate of 96 kHz, the upper limit of the frequency range of audio indicated by the audio data is 20 kHz, downsampled representative-point audio data has a sampling rate of 48 kHz, and high-frequency interpolated audio data has a sampling rate of 96 kHz. In FIG. 7B, SI represents a frequency characteristic of audio data and SO represents a frequency characteristic of audio data whose high-frequency components have been interpolated by using an FIF.

[0065] As demonstrated from comparison with FIG. 7A described above, in high-frequency interpolated audio data subjected to high-frequency interpolation according to forms of the present disclosure, as illustrated in FIG. 7B, high-frequency components around the upper limit Fmax of the frequency range of the audio indicated by the input audio data are also interpolated without being lost.

[0066] Therefore, embodiments and forms of the present disclosure may enable desired interpolation of high-frequency components of audio data regardless of the upper limit of the frequency range of audio indicated by the audio data.

[0067] It is therefore intended that the foregoing detailed description be regarded as illustrative rather than limiting, and that it be understood that it is the following claims, including all equivalents, that are intended to define the spirit and scope of this invention.

CPC classification

G10L21/0388 (PHYSICS)
H03H17/0621 (ELECTRICITY)
G10L21/038 (PHYSICS)
H03H17/028 (ELECTRICITY)

International classification

G10L21/038 (PHYSICS)
