NTT PROCESSOR INCLUDING A PLURALITY OF MEMORY BANKS
20210318869 · 2021-10-14
Assignee
Inventors
- Joel CATHEBRAS (Palaiseau, FR)
- Alexandre CARBON (MASSY, FR)
- Renaud Sirdey (Cernay-la-Ville, FR)
- Nicolas VENTROUX (Gif-sur-Yvette, FR)
Cpc classification
G06F17/144
PHYSICS
International classification
Abstract
The present invention relates to a stream-based NTT processor comprising: a plurality (K) of processing stages (210.sub.k, k=0, . . . , K−1) organised in a pipeline (210); a plurality (G+1) of memory banks (220.sub.g, g=0, . . . , G); a read management module (260) for reading, within one memory) of a memory bank (220.sub.g) of the processor, sets of twiddle factors intended for parameterising a processing stage (210.sub.k); a write management module (270) for receiving, in the form of successive blocks, a set of twiddle factors and writing said sets of twiddle factors into the memories of a memory bank, the writing being carried out cyclically in the memory banks, each new set of twiddle factors being written into a new memory bank; and a control module for controlling the writing and reading of twiddle factors as well as the progression of data blocks through the processing stages.
Claims
1. A NTT stream processor, comprising: a plurality K of processing stages 210.sub.k k=0, . . . , K−1 organized in pipeline, each stage comprising a permutation module followed by a radix module, each data sequence to be transformed being input to the processor in the form of .sub.p.sub.
2. The NTT stream processor according to claim 1, G+1 wherein G=┌Lat_NTT/T┐, Lat_NTT being the processing latency of the NTT processor and T=N/W is the data sequence flow rate at the input to the NTT processor, each memory bank being associated with the transform of a data sequence and containing the sets of twiddle factors of successive stages for the NTT transform of this sequence.
3. The NTT stream processor according to claim 2, wherein, before a data sequence passes from a first processing stage to a second processing stage, the control module controls reading of the set of twiddle factors in the corresponding memory associated with this second stage within the memory bank associated with this sequence, and programming of the second stage using the set of twiddle factors thus read.
4. The NTT stream processor according to claim 2, wherein the control module comprises a memory banks counter, incremented each time that a new set of twiddle factors is supplied to the write management module and reset to zero after reaching the value G, the output from said counter being compared with a plurality of comparators to be compared with the values g=0, . . . , G, the corresponding outputs from these comparators providing G+1 selection signals enabling one write command in each corresponding memory bank.
5. The NTT stream processor according to claim 4, wherein each memory bank associated with the NTT transformation of a sequence also comprises a register in which is stored the characteristic (p.sub.l) of the field (.sub.p.sub.
6. The NTT stream processor according to claim 5, wherein characteristics of the fields in which are made L NTT transforms of L successive data sequences are distinct, wherein L≤G.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0030] Other characteristics and advantages of the invention will become clear after reading a preferred embodiment of the invention, described with reference to the appended figures among which:
[0031]
[0032]
[0033]
[0034]
[0035]
[0036]
DETAILED PRESENTATION OF PARTICULAR EMBODIMENTS
[0037] An NTT stream processor comprises essentially a plurality of calculation stages, each stage comprising a permutation module adapted to perform a permutation operation and a combination module adapted to perform a combination operation (butterfly radix-R) on R integer numbers making use of twiddle factors belonging to a field □.sub.p=□/p□ with characteristic p.
[0038]
[0039] The input data are relative integers (elements of □) and are supplied in blocks with size W to the NTT processor, 100. In other words, W is the width of the data path. Thus, if N (or the number of points) is the size of the NTT transform, and if a block is supplied to the processor during each clock cycle, the entirety of a sequence of N elements, hereinafter referred to as an N-sequence, will be received after time T=N/W (counted as a number of cycles), in other words the input flow rate (of data sequences) is equal to N/W.
[0040] The NTT processor comprises a plurality K of processing stages, 110.sub.0, . . . , 110.sub.K−1, arranged in pipeline, 110, each stage 110.sub.k comprising a permutation module 113.sub.k, followed by a combination module, 115.sub.k, forming a combination operation on R integers supplied by the permutation module 113.sub.k making use of N/R.sup.K−k−1 twiddle factors (in the case of an implementation by time decimation) or R.sup.K−k−1 (in the case of an implementation by frequency decimation). Thus, each stage is configured by an associated set of twiddle factors.
[0041] In the remainder of the presentation, in the interests of simplification and without any loss of generality, we will assume that R=W. Furthermore, the size (or the number of points) N of the NTT transform is related to the number of stages in the processor by K=log.sub.R(N).
[0042] Each permutation module comprises switches and delays (FIFO buffers) so as to present the R integers to be combined to the next combination module simultaneously. The architecture of these permutation modules has been described for example in patent US-B-8321823 incorporated herein by reference.
[0043] The combination modules perform the radix-R operations (as for an FFT) making use of twiddle factors supplied to them, the calculations being made in the field .sub.p.
[0044] The flow rate through each stage of the NTT processor, in other words the time after which an N-sequence of data is processed by this stage, is equal to T so that a stream operation can be performed.
[0045] Since the data N-sequences are supplied to the NTT processor with flow rate T=N/W, the maximum number of N-sequences present in the processor at any moment is G=┌Lat_NTT/T┐ wherein Lat_NTT is the processing latency of the NTT processor expressed as a number of clock cycles.
[0046] The architecture of the NTT stream processor according to this invention enables the NTT transform to be made on L≤G distinct fields in pipeline, which is particularly advantageous when operations have to be carried out in RNS representation.
[0047] The basic concept of this invention is to enable each stage k=0, . . . , K of the NTT processor to access the set of twiddle factors associated with this stage and the N-sequence of data currently being processed by this stage, said twiddle factors belonging to
={(
).sup.n, n=0, . . . , N−1} wherein
=1, . . . , L, the NTT transforms on the different fields
taking place simultaneously on data blocks within the successive stages of the processor, access of the different stages to the sets
being made synchronously with the progress of the N-sequences of data from one stage to the next.
[0048]
[0049] The NTT processor comprises a control module 250, a plurality K of processing stages, 210.sub.0, . . . , 210.sub.K−1, arranged in pipeline, 210, these processing stages having the structure previously described in relation to
[0050] The different stages are programmed using G+1 memory banks 220.sub.0, . . . , 220.sub.G. Each memory bank, 220.sub.g, comprises K memories, MEM.sub.k.sup.g, with index k=0, . . . , K−1, each memory MEM.sub.k.sup.g containing the set of twiddle factors necessary for setting parameters (in other words for programming) the corresponding stage 210.sub.k when this stage is to be performed in the field
. Furthermore, the memory bank 220.sub.g stores the characteristic
of the field in which the combination operations will be done in a register (not shown). This characteristic is also supplied to the stage 210.sub.k so that the combination module of this stage can perform the operations (multiplication by a twiddle factor, addition, subtraction) modulo
.
[0051] A stage with given index k accesses a single memory bank at each instant. It should be noted that the contents of different memory banks can be different or identical, depending on whether successive NTT transforms are made on different or identical fields. It is important to understand that each memory bank is associated with a sequence of . When these
, to be configured by this set. A transform can then be made on the
or in a different field
. In the first case, the memory bank 220.sub.g+1 (if it is assumed that g<G) will contain the same sets of twiddle factor
, k=0, . . . , K−1 as the memory bank 220.sub.g. In the second case, it will contain the sets of twiddle factors
, k=0, . . . , K−1.
[0052] Each stage 210.sub.k cyclically accesses memories with index k, MEM.sub.k.sup.g of memory banks 220.sub.g, g=0, . . . , G. It should be noted that when a stage 210.sub.k has been configured by means of the memory MEM.sub.k.sup.G of the memory bank 220.sub.G, it is then configured using the memory MEM.sub.k.sup.0 of the memory bank 220.sub.0 for the next sequence of
[0053] The function of the control block 250 is particularly to control writing of twiddle factors in the memory banks 220.sub.g, g=0, . . . , G and to read these twiddle factors within the memory banks to configure stages 210.sub.k, k=0, . . . , K−1.
[0054] More precisely, the control module 250 controls the read management module 260, the write management module, 270, the memory bank write interface, 280, and the memory bank read interface, 290. The control module 250 selects memory banks individually so that each can be accessed in write or in read.
[0055] The read management module, 260 is responsible for the generation of read addresses. It comprises K output ports, each output port with index k providing the address addr.sub.k to be read in the selected memory bank to configure the corresponding stage 210.sub.k.
[0056] The function of the write management module, 270, is primarily to generate a write command prg_we.sub.k, k=0, . . . , K−1 and a write address prg_addr.sub.k, k=0, . . . , K−1, for each memory in the memory bank. Only the memory for which the write command is activated is accessed in write at address prg_addr.sub.k. It is understood that when a selected memory bank is not accessed in writing, it can be accessed in reading.
[0057] Another function is to provide twiddle factors, denoted as prg_dat, k=0, . . . , K−1, that will be written in the corresponding memories of the selected memory bank, and the characteristic
of the field to be written in the register of the selected memory bank.
[0058] These twiddle factors may have been previously stored in an external memory and provided to the memory banks, through a FIFO buffer provided in the write management module, 270. Alternatively, the twiddle factors can be supplied directly by a twiddle factor generation circuit like that described in application FR1856340 deposited on the same day and incorporated herein by reference. It will be remembered that such a circuit can provide twiddle factors by blocks with size W with flow rate N/W.
[0059]
[0060] This figure shows the memory bank 220.sub.g, herein assumed to be selected by the control module for the write operation. This memory bank contains the memories MEM.sub.k.sup.g, k=0, . . . , K−1. Each memory MEM.sub.k.sup.g receives successive twiddle factors on its data bus. When the twiddle factor prg_dat is present on the data bus, the write control signal prg_we.sub.k writes it at the address addr.sub.k in the memory MEM.sub.k.sup.g. More precisely, le signal prg_we.sub.k uses the multiplexer 310.sub.k to select the write address between prg_addr.sub.k (address provided by the write management module) and the address (addr.sub.k provided by the read management module). Generation of addresses prg_addr.sub.k depends on the distribution of rotation factors
within the different successive blocks provided to the write management module. When these
, k=0, . . . , K−1, are progressively present in the memories MEM.sub.k.sup.g, k=0, . . . , K−1.
[0061]
[0062] This figure once again shows the processing stages pipeline 210, the memory banks 220.sub.g, g=0, . . . , G, the control module 250, the read management module 260, the write management module 270, the memory bank write interface, 280, and the memory bank read interface, 290.
[0063] The control module 250 comprises a memory bank write counter, 430, that is incremented each time that the signal CE_TW becomes active, in other words each time a new set of twiddle factors is supplied to the write management module 270. When the counter 430 reaches the value G, the next increment resets its output to zero. In other words, the memory banks are cyclically addressed in writing.
[0064] The output from the counter 430 is compared with the values 0, . . . , G using G+1 comparators 440.sub.g, g=0, . . . , G. The outputs from each of these comparators provide signals sel.sub.g, g=0, . . . , G. If the signal sel.sub.g uses a first logical value (in this case “1”), the memory bank 220.sub.g is selected in write, and if it takes on a second logical value (“0”), opposite to the first, this memory bank is not selected. The signal sel.sub.g authorises the transmission of write control signals prg_we.sub.k, k=0, . . . , K−1 to the selected memory bank if it is equal to a first logical value, and inhibits this transmission (for example by means of an AND gate or a multiplexer) if it is equal to a second logical value, opposite to the first. In other words, if the memory bank 220.sub.g is selected by the control module, the write control signals prg_we.sub.k, k=0, . . . , K−1 are transmitted to the different memories in the memory bank concerned, in the form of signals mem.sub.g_prg_we.sub.k, k=0, . . . , K−1.
[0065]
[0066] It is assumed herein that the control module would configure stage 210.sub.k of the NTT processor.
[0067] The arrival of a new sequence of
[0068] Thus, when the first data block sequence reaches stage 210.sub.k, the rotation factors read from the memory MEM.sub.k.sup.0 are supplied to this stage to configure it, when the second data block sequence reaches the same stage, the twiddle factors read from the memory MEM.sub.k.sup.1 are presented to it to configure it, and so on. When the memory MEM.sub.k.sup.G of the last memory bank, 220.sub.G is reached, the memory bank counter, 550.sub.k, is reset to zero and the twiddle factors are once again read from memory MEM.sub.k.sup.0. The dynamic configuration process continues cyclically, for each stage of the NTT processor.
[0069]
[0070] Clk denotes the timer clock (size W) of the twiddle factor blocks. This same clock is also used to clock the arrival of data blocks (also with size W).
[0071] The signal CE_TW informs the processor that a new set of twiddle factors ={(
).sup.n, n=0, . . . , N−1} is available. This set of twiddle factors is supplied in the form of successive blocks with size W. This set of twiddle factors is broken down into sets of twiddle factors
, k=0, . . . , K−1 that will be stored in the K corresponding memories of a memory bank and will be used to configure the corresponding processing stages 210.sub.k, k=0, . . . , K−1, when the data flow to be transformed passes through these stages. It is important to note that the twiddle factors of a set of twiddle factors can be distributed on several successive blocks of
.
[0072] The twiddle_factors line represents the twiddle factors. For each clock tick Clk, a block of W twiddle factors is supplied to the write management module 270.
[0073] The signal prg_start indicates the beginning of the write of a set of twiddle factors in a memory bank.
[0074] The signal #prg represents the output from the memory bank write counter and the signals sel.sub.g, g=0, . . . , G−1 are output signals from the comparators 440.sub.g, g=0, . . . , G−1. When a signal sel.sub.g is active (in this case logical level “1”), the memory bank 220.sub.g is selected in writing.
[0075] The K signals prg_data.sub.{0, . . . , K−1} represent data to be written in the corresponding memories MEM.sub.k.sup.g, k=0, . . . , K−1 of the memory bank 220.sub.g selected in write. More precisely, these data appear on the data bus that supplies all memory banks but will only be written in the selected memory bank.
[0076] Similarly, the K signals prg_addr.sub.{0, . . . , K−1} represent the corresponding addresses in the memories MEM.sub.k.sup.g, k=0, . . . , K−1 in which the data prg_data.sub.{0, . . . , K−1} will be written. More precisely, these addresses appear on the address bus that supplies all memory banks but will only be used for the selected memory bank. The signals mem.sub.g_prg_we.sub.{0, . . . , K−1} are commands to write in the selected memory MEM.sub.k.sup.g, k=0, . . . , K−1 In this case, the write command mem.sub.g_prg_we.sub.k is simply the logical product sel.sub.g.prg_we.sub.k: it triggers writing data prg_data.sub.k at address prg_addr.sub.k in the memory MEM.sub.k.sup.g of the selected memory bank, 220.sub.g. As mentioned above, when a block of twiddle factors arrives, several memories MEM.sub.k.sup.g, k=0, . . . , K−1 of the selected memory bank can be selected successively and addressed in writing.
[0077] It will be understood from this chronogram that memory banks are cyclically addressed in write so that successive sets of twiddle factors can be stored in them, each set being composed of sets of twiddle factors
, k=0, . . . , K−1 that are stored in the corresponding memories of a memory bank and will be used to program the different corresponding processing steps of the NTT stream processor.