Fast Fourier transform circuit of audio processing device
11630880 · 2023-04-18
Assignee
Inventors
Cpc classification
G06F17/142
PHYSICS
International classification
G06F17/14
PHYSICS
G06F7/48
PHYSICS
Abstract
A fast Fourier transform (FFT) circuit of an audio processing device configured to perform an N-points FFT and including a memory circuit and a butterfly operation unit circuit is provided. The butterfly operation unit circuit reads two points input data from the memory circuit, performs a butterfly operation for the two points input data according to a twiddle factor to generate two points output data, and writes the two points output data into the memory circuit. The butterfly operation unit circuit includes a multiplier and a plurality of adders/subtractors. The multiplier sequentially multiplies real or imaginary coefficients of one of the two points input data by real or imaginary coefficients of the twiddle factor in multiple clock cycles. The multiplier performs a multiplication once in each clock cycle. The adders/subtractors perform addition/subtraction, such that the butterfly operation unit circuit generates the two points output data.
Claims
1. A fast Fourier transform circuit of an audio processing device, configured to perform an N-points fast Fourier transform where N is a power of 2, and comprising: a memory circuit; and a butterfly operation unit circuit, coupled to the memory circuit, reading two points input data from the memory circuit, performing a butterfly operation for the two points input data according to a twiddle factor to generate two points output data, and writing the two points output data into the memory circuit, wherein the butterfly operation unit circuit comprises: a multiplier, sequentially multiplying a real coefficient or an imaginary coefficient of one of the two points input data by a real coefficient or an imaginary coefficient of the twiddle factor in a plurality of clock cycles, wherein the multiplier performs a multiplication once in each of the clock cycles; and a plurality of adder-subtractors, coupled to the multiplier, and performing an addition or a subtraction according to an output of the multiplier or a real coefficient or an imaginary coefficient of the other one of the two points input data, such that the butterfly operation unit circuit generates the two points output data, wherein the clock cycles comprise a first clock cycle and a second clock cycle, the multiplier multiplies the real coefficient of one of the two points input data by one of the real coefficient and the imaginary coefficient of the twiddle factor in the first clock cycle to generate a first multiplication output, and the multiplier multiplies the imaginary coefficient of one of the two points input data by the other one of the real coefficient and the imaginary coefficient of the twiddle factor in the second clock cycle to generate a second multiplication output, wherein the adder-subtractors comprise a first adder-subtractor and a second adder-subtractor, the first adder-subtractor performs an addition or a subtraction according to the first multiplication output and the second multiplication output in the second clock cycle to obtain a first output, and the second adder-subtractor performs an addition or a subtraction according to the first output and the real coefficient of the other one of the two points input data in the second clock cycle to obtain the real coefficient of one of the two points output data, wherein the clock cycles comprise a third clock cycle, the second adder-subtractor performs an addition or a subtraction according to the first output and the real coefficient of the other one of the two points input data in the third clock cycle to obtain the real coefficient of the other one of the two points output data.
2. The fast Fourier transform circuit of the audio processing device according to claim 1, further comprising: a twiddle factor memory circuit, coupled to the butterfly operation unit circuit, and storing the twiddle factor.
3. The fast Fourier transform circuit of the audio processing device according to claim 1, wherein the N-points fast Fourier transform is based on R points, the N-points fast Fourier transform comprises log.sub.RN level stage operations, and the butterfly operation unit circuit performs a butterfly operation N/4 times in an i.sup.th level stage operation and sequentially writes N/2 points output data generated in the i.sup.th stage operation into the memory circuit, wherein R is an integer greater than 1, and i is an integer greater than or equal to 1 and less than or equal to log.sub.RN, wherein the N/2 points output data generated in the i.sup.th level stage operation by the butterfly operation unit circuit are not conjugate symmetric to each other, the memory circuit has (N/2+1)*2 memory addresses, and each of the memory addresses is configured to record a real coefficient or an imaginary coefficient of one of the N/2 points output data.
4. The fast Fourier transform circuit of the audio processing device according to claim 3, wherein in the i.sup.th level stage operation, the butterfly operation unit circuit reads the two points input data from two memory addresses of the memory circuit, and overwrites the two points output data back to the two memory addresses.
5. A fast Fourier transform circuit of an audio processing device, configured to perform an N-points fast Fourier transform where N is a power of 2, and comprising: a memory circuit; and a butterfly operation unit circuit, coupled to the memory circuit, reading two points input data from the memory circuit, performing a butterfly operation for the two points input data according to a twiddle factor to generate two points output data, and writing the two points output data into the memory circuit, wherein the butterfly operation unit circuit comprises: a multiplier, sequentially multiplying a real coefficient or an imaginary coefficient of one of the two points input data by a real coefficient or an imaginary coefficient of the twiddle factor in a plurality of clock cycles, wherein the multiplier performs a multiplication once in each of the clock cycles; and a plurality of adder-subtractors, coupled to the multiplier, and performing an addition or a subtraction according to an output of the multiplier or a real coefficient or an imaginary coefficient of the other one of the two points input data, such that the butterfly operation unit circuit generates the two points output data, wherein the clock cycles comprise a first clock cycle and a second clock cycle, the multiplier multiplies the real coefficient of one of the two points input data by one of the real coefficient and the imaginary coefficient of the twiddle factor in the first clock cycle to generate a first multiplication output, and the multiplier multiplies the imaginary coefficient of one of the two points input data by the other one of the real coefficient and the imaginary coefficient of the twiddle factor in the second clock cycle to generate a second multiplication output, wherein the adder-subtractors comprise a first adder-subtractor and a second adder-subtractor, the first adder-subtractor performs an addition or a subtraction according to the first multiplication output and the second multiplication output in the second clock cycle to obtain a first output, and the second adder-subtractor performs an addition or a subtraction according to the first output and the imaginary coefficient of the other one of the two points input data in the second clock cycle to obtain the imaginary coefficient of one of the two points output data, wherein the clock cycles comprise a third clock cycle, the second adder-subtractor performs an addition or a subtraction according to the first output and the imaginary coefficient of the other one of the two points input data in the third clock cycle to obtain the imaginary coefficient of the other one of the two points output data.
6. A fast Fourier transform circuit of an audio processing device, configured to perform an N-points fast Fourier transform where N is a power of 2, and comprising: a plurality of radix-2 butterfly operation circuits, wherein each of the radix-2 butterfly operation circuits performs steps of: receiving input data, and performing a butterfly operation for the input data according to one twiddle factor of M twiddle factors to generate output data, wherein the butterfly operation comprises a plurality of additions and subtractions and a plurality of multiplications decomposed based on a complex operation, and M is a positive integer less than N/2, wherein each of the radix-2 butterfly operation circuits sequentially perform the additions or the subtractions and the multiplications in a plurality of clock cycles, and the multiplication is performed at most once in each of the clock cycles, wherein one single multiplier is used in each of the radix-2 butterfly operation circuits to perform the multiplications, wherein each of the radix-2 butterfly operation circuits further comprises: a memory circuit, configured to store the input data and the output data, and having (N/2+1)*2 memory addresses, wherein each of the memory addresses is configured to record a real coefficient or an imaginary coefficient of one of N/2 points output data.
7. The fast Fourier transform circuit of the audio processing device according to claim 6, wherein each of the radix-2 butterfly operation circuits further comprises: a plurality of adder-subtractors, coupled to the multiplier, and performing an addition or a subtraction according to an output of the multiplier or a real coefficient or an imaginary coefficient of the input data, such that the butterfly operation unit circuit generates the output data.
8. The fast Fourier transform circuit of the audio processing device according to claim 6, further comprising: a read-only memory, coupled to each of the radix-2 butterfly operation circuits, and storing the M twiddle factors, wherein M is equal to N/4.
9. The fast Fourier transform circuit of the audio processing device according to claim 6, wherein at least part of the radix-2 butterfly operation circuits constitute one or more butterfly operation circuits having a radix greater than 2.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) The accompanying drawings are included to provide a further understanding of the disclosure, and are incorporated in and constitute a part of this specification. The drawings illustrate embodiments of the present disclosure and, together with the description, serve to explain the principles of the present disclosure.
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
(11)
(12)
DETAILED DESCRIPTION
(13) Reference will now be made in detail to the present preferred embodiments of the invention, examples of which are illustrated in the accompanying drawings. Wherever possible, the same reference numbers are used in the drawings and the description to refer to the same or like parts.
(14)
(15)
(16) In one embodiment, the FFT circuit 62 includes a memory circuit 621, a butterfly operation unit circuit 622 and a twiddle factor memory circuit 623. The memory circuit 621 may be a static random-access memory (SRAM) configured to buffer data during the FFT operation, but not limited thereto. The memory circuit 621 may be coupled to the butterfly operation unit circuit 622 via an internal bus. In addition, the twiddle factor memory circuit 623 is coupled to the FFT circuit 62 and used to store the twiddle factor. The twiddle factor memory circuit 623 may be a read-only memory (ROM) or other non-transitory memory.
(17) In one embodiment, the N-points fast Fourier transform performed by the FFT circuit 62 is based on R points, i.e., the N-points fast Fourier transform includes log.sub.RN level stage operations. R is an integer greater than 1. However, in order to clearly illustrate the invention, the following description will continue by taking R=2 as an example. That is, the butterfly operation unit circuit 622 performs a radix-2 butterfly operation. The butterfly operation unit circuit 622 reads two points input data from the memory circuit 621, and performs the radix-2 butterfly operation for the two points input data according to the twiddle factor to generate two points output data. Here, the FFT circuit 62 may read the twiddle factor from the twiddle factor memory circuit 623. Then, the butterfly operation unit circuit 622 writes the two points output data into the memory circuit 621. In one embodiment, the butterfly operation unit circuit 622 reads the two points input data from a plurality of memory addresses of the memory circuit 621, and overwrites the two points output data generated by the butterfly operation back to the same memory addresses, so as to reuse a storage space of the memory circuit 621.
(18) It should be noted that, the butterfly operation described above includes a plurality of additions/subtractions and a plurality of multiplications decomposed based on a complex number operation.
(19) In one embodiment, the butterfly operation unit circuit 622 may sequentially perform the additions/subtractions and the multiplications in a plurality of clock cycles, and the multiplication is performed at most once in each of the clock cycles. In other words, the butterfly operation unit circuit 622 may use one single multiplier to perform the multiplications. More specifically, the butterfly operation unit circuit 622 may include a multiplier 6221 and a plurality of adders/subtractors 6222 coupled to the multiplier 6221. The multiplier 6221 may sequentially multiply a real coefficient or an imaginary coefficient of one of the two points input data (e.g., x.sub.2[k] in the previous example) by a real coefficient or an imaginary coefficient of the twiddle factor in the plurality of clock cycles. It should be noted that, the multiplier 6221 performs the multiplication at most once in each of the plurality of clock cycles. The adders/subtractors 6222 perform an addition/subtraction according to an output of the multiplier 6221 or a real coefficient or an imaginary coefficient of the other one of the two points input data (e.g., x.sub.1[k] in the previous example), such that the butterfly operation unit circuit 622 may generate the two points output data accordingly.
(20) In one embodiment, the butterfly operation unit circuit 622 may sequentially perform the additions/subtractions and the multiplication operation in the plurality of clock cycles. These clock cycles include a first clock cycle and a second clock cycle and a third clock cycle. The multiplier 6221 multiplies the real coefficient of one of the two points input data by one of the real coefficient and the imaginary coefficient of the twiddle factor in the first clock cycle to generate a first multiplication output, and the multiplier 6221 multiplies the imaginary coefficient of one of the two points input data by the other one of the real coefficient and the imaginary coefficient of the twiddle factor in the second clock cycle to generate a second multiplication output.
(21) Specifically, in one embodiment, the multiplier 6221 may first multiply the real coefficient of one of the two points input data by the real coefficient of the twiddle factor in the first clock cycle to generate the first multiplication output. Then, the multiplier 6221 may multiply the imaginary coefficient of one of the two points input data by the imaginary coefficient of the twiddle factor in the second clock cycle to generate the second multiplication output. Alternatively, in one embodiment, the multiplier 6221 may first multiply the imaginary coefficient of one of the two points input data by the imaginary coefficient of the twiddle factor in the first clock cycle to generate the first multiplication output. Then, the multiplier 6221 may multiply the real coefficient of one of the two points input data by the real coefficient of the twiddle factor in the second clock cycle to generate the second multiplication output. It should be noted that, the first multiplication output and the second multiplication output of the above two configurations can be used to generate a real coefficient of the two points output data.
(22) In one embodiment, the adders/subtractors 6222 include a first adder/subtractor and a second adder/subtractor. In the case where the first multiplication output and the second multiplication output for generating the real coefficients of the two points output data have been generated, the first adder/subtractor performs an addition/subtraction according to the first multiplication output and the second multiplication output in the second clock cycle to obtain a first addition/subtraction output. Further, the second adder/subtractor performs an addition/subtraction according to the first addition/subtraction output and the real coefficient of the other one of the two points input data in the second clock cycle to obtain the real coefficient of one of the two points output data. Then, the second adder/subtractor may perform an addition/subtraction according to the first addition/subtraction output and the real coefficient of the other one of the two points input data in the third clock cycle to obtain the real coefficient of the other one of the two points output data.
(23) On the other hand, in one embodiment, the multiplier 6221 may multiply the real coefficient of one of the two points input data by the imaginary coefficient of the twiddle factor in the first clock cycle to generate the first multiplication output. Then, the multiplier 6221 may multiply the imaginary coefficient of one of the two points input data by the real coefficient of the twiddle factor in the second clock cycle to generate the second multiplication output. Alternatively, in one embodiment, the multiplier 6221 may first multiply the imaginary coefficient of one of the two points input data by the real coefficient of the twiddle factor in the first clock cycle to generate the first multiplication output. Then, the multiplier 6221 may multiply the real coefficient of one of the two points input data by the imaginary coefficient of the twiddle factor in the second clock cycle to generate the second multiplication output. It should be noted that, the first multiplication output and the second multiplication output of the above two configurations can be used to generate an imaginary coefficient of the two points output data.
(24) In one embodiment, the adders/subtractors 6222 include a first adder/subtractor and a second adder/subtractor. In the case where the first multiplication output and the second multiplication output for generating the imaginary coefficients of the two points output data have been generated, the first adder/subtractor performs an addition/subtraction according to the first multiplication output and the second multiplication output in the second clock cycle to obtain a first addition/subtraction output. Further, the second adder/subtractor performs an addition/subtraction according to the first addition/subtraction output and the imaginary coefficient of the other one of the two points input data in the second clock cycle to obtain the imaginary coefficient of one of the two points output data. Then, the second adder/subtractor may perform an addition/subtraction according to the first addition/subtraction output and the imaginary coefficient of the other one of the two points input data in the third clock cycle to obtain the imaginary coefficient of the other one of the two points output data.
(25) For clear description,
(26) In a clock cycle CC92, the butterfly operation unit circuit 622 reads an imaginary coefficient b of the input data x.sub.2[k] and an imaginary coefficient d of the twiddle factor W.sub.N.sup.k respectively from the memory circuit 621 and the twiddle factor memory circuit 623, and records the imaginary coefficients in the register circuit. The multiplier 6221 may multiply the real coefficient a of the input data x.sub.2[k] by the real coefficient c of the twiddle factor W.sub.N.sup.k in the clock cycle CC92 to generate a multiplication output a*c, and record it in the register circuit.
(27) In a clock cycle CC93, the butterfly operation unit circuit 622 reads a real coefficient e of the input data x.sub.1[k] from the memory circuit 621 and records the real coefficient in the register circuit. The multiplier 6221 may multiply the imaginary coefficient b of the input data x.sub.2[k] by the imaginary coefficient d of the twiddle factor W.sub.N.sup.k in the clock cycle CC93 to generate a multiplication output b*d, and record it in the register circuit. The first adder/subtractor 6222_1 performs a subtraction according to the multiplication output a*c and the multiplication output b*d in the clock cycle CC93 to obtain a subtraction output a*c-b*d, and records it in the register circuit. The second adder/subtractor 6222_2 performs an addition according to the subtraction output a*c−b*d and the real coefficient e of the input data x.sub.1[k] in the clock cycle CC93 to obtain a real coefficient e+(a*c−b*d) of output data.
(28) In a clock cycle CC94, the butterfly operation unit circuit 622 reads an imaginary coefficient f of the input data x.sub.1[k] from the memory circuit 621 and records the imaginary coefficient in the register circuit. The multiplier 6221 may multiply the real coefficient a of the input data x.sub.2[k] by the imaginary coefficient d of the twiddle factor W.sub.N.sup.k to generate a multiplication output a*d, and record it in the register circuit. The second adder/subtractor 6222_2 performs a subtraction according to the subtraction output a*c−b*d and the real coefficient e of the input data x.sub.1[k] in the clock cycle CC94 to obtain a real coefficient e−(a*c−b*d) of another output data.
(29) In a clock cycle CC95, the multiplier 6221 may multiply the imaginary coefficient b of the input data x.sub.2[k] by the real coefficient c of the twiddle factor W.sub.N.sup.k in the clock cycle CC95 to generate a multiplication output b*c, and record it in the register circuit. The first adder/subtractor 6222_1 performs an addition according to the multiplication output a*d and the multiplication output b*c in the clock cycle CC95 to obtain an addition output a*d+b*c, and records it in the register circuit. The second adder/subtractor 6222_2 performs an addition according to the addition output a*d+b*c and the imaginary coefficient f of the input data x.sub.1[k] in the clock cycle CC95 to obtain an imaginary coefficient f+(a*d+b*c) of the output data. Lastly, in a clock cycle CC96, the second adder/subtractor 6222_2 performs a subtraction according to the addition output a*c+b*d and the imaginary coefficient f of the input data x.sub.1[k] to obtain an imaginary coefficient f−(a*d+b*c) of the another output data.
(30) Based on the description of
(31) In one embodiment, the FFT circuit 62 may include one butterfly operation unit circuit 622 to complete the FFT operation by reusing the butterfly operation unit circuit 622. In one embodiment, the FFT circuit 62 may include a plurality of radix-2 butterfly operation circuits similar in structure to the butterfly operation unit circuit 622. In one implementation, these radix-2 butterfly operation circuits may constitute one or more butterfly operation circuits having a radix greater than 2, such as a radix-4 butterfly operation circuit or a radix-8 butterfly operation circuit. Alternatively, in one implementation, these radix-2 butterfly operation circuits in the FFT circuit 62 may perform all butterfly operations of one specific stage in parallel. For instance, the FFT circuit 62 may include 4 radix-2 butterfly operation circuits similar in structure to the butterfly operation unit circuit 622 to perform 4 butterfly operations of one specific stage in the FFT operation in parallel.
(32) The following description is provided by taking an example in which the FFT circuit 62 completes the FFT operations by reusing one single butterfly operation unit circuit 622.
(33) Referring to
(34) In detail, the FFT circuit 62 may first perform an odd-even separation on the 512 points input data x[0] to x[511], and obtains 256 input data (e.g., {x[0], x[1]}, {x[256], x[257]}, {x[128], x[129]} {x[510], x[511]}) for the subsequent butterfly operations. That is to say, the above operations can first divide the 512 input data into 256 arrays such that each array is a parity pair from the perspective of address. In a 1.sup.st level stage operation (i.e., i=1), the butterfly operation unit circuit 622 sequentially obtains the 256 input data and uses one single multiplier 6221 to perform Pt to 128.sup.th butterfly operations (128 times in total) in time-sharing manner to generate 256 operation results. Two operation results of the butterfly operation performed each time may include the real coefficient and the imaginary coefficient and will be stored in the memory circuit 621. In a 2.sup.nd level stage operation (i.e., i=2), the butterfly operation unit circuit 622 sequentially obtains the 256 operation results of the Pt level stage operation from the memory circuit 621 and uses one single multiplier 6221 to perform 129.sup.th to 256.sup.th butterfly operations (128 times in total) again in time-sharing manner to generate 256 operation results.
(35) By analogy, in an 8.sup.th level stage operation (i.e., i=8), the butterfly operation unit circuit 622 sequentially obtains 256 operation results of the 7.sup.th level stage operation from the memory circuit 621 and uses one single multiplier 6221 to perform 897.sup.th to 1024.sup.th butterfly operations (128 times in total) again in time-sharing manner to generate 256 operation results. It should be noted that, in a 9.sup.th level stage operation (i.e., i=9), the butterfly operation unit circuit 622 obtains the 256 operation results of the 8.sup.th level stage operation from the memory circuit 621 and performs a conjugate symmetric transform operation 128 times to obtain complete 512 points output data X[0] to X[511]. Based on the above, for 512 real number points FFT of this embodiment, in the butterfly operations of the 1.sup.st to 8.sup.th level stages, 256 complex number results are generated in each stage, and the 256 complex number results are not conjugate symmetric to each other. In the butterfly operation of the 9.sup.th level stage, another 256 complex result are derived according to the conjugate symmetry, and thus 512 complex results are obtained in total.
(36) As can be known, each time after the butterfly operation unit circuit 622 performs one specific stage operation (i.e., the 1.sup.st stage operation to the 9.sup.th stage operation), the butterfly operation unit circuit 622 may record the operation result in the memory circuit 621. Here, based on the aforementioned symmetry property, this embodiment only needs to record a total of 257 complex number results, including a peak value (the 256.sup.th complex number result) and symmetry values (X [0] to X [255]). Further, as each complex result includes the real coefficient and the imaginary coefficient, the memory circuit 621 needs 257*2 memory addresses in order to complete one 512 points FFT operation. Each of the memory addresses is configured to record the real coefficient or the imaginary coefficient of the complex result.
(37) In addition, in one embodiment, when the FFT circuit 62 includes the radix-2 butterfly operation circuits similar in structure to the butterfly operation unit circuit 622, each of the radix-2 butterfly operation circuits performs the following operations of: receiving input data, and performing a butterfly operation for the input data according one twiddle factor of M twiddle factors to generate output data. The butterfly operation includes a plurality of additions/subtractions and a plurality of multiplications decomposed based on a complex operation, and M is a positive integer less than N/2. Each of the radix-2 butterfly operation circuits sequentially performs the additions/subtractions and the multiplications in a plurality of clock cycles, and the multiplication is performed at most once in each of the clock cycles. In one embodiment, the read-only memory used as the twiddle factor memory circuit 623 may be coupled to the radix-2 butterfly operation circuits and configured to store M twiddle factors, wherein M is equal to N/4.
(38) In detail, based on the example in
(39) In summary, in the embodiments of the invention, the butterfly operation of the FFT can be implemented by reusing the multiplier, thereby greatly reducing the number of multipliers. In addition, by optimizing a look-up table recorded with the twiddle factors, the use of the read-only memory can be effectively reduced. As a result, the FFT circuit in the embodiments of the invention can greatly reduce the hardware cost and reduce the circuit area.
(40) Lastly, it should be noted that, each of the above embodiments merely serves as an example in the invention instead of limitation thereto. Despite that the invention has been described with reference to above embodiments, it will be apparent to those skilled in the art that various modifications and variations can be made to the structure of the technical content disclosed in above embodiments of the invention without departing from the scope or spirit of the invention. In view of the foregoing, it is intended that the invention cover modifications and variations of this invention provided they fall within the scope of the following claims and their equivalents.