System and method for pipelined time-domain computing using time-domain flip-flops and its application in time-series analysis
11467831 · 2022-10-11
Assignee
Inventors
Cpc classification
G06F7/605
PHYSICS
G06F17/18
PHYSICS
International classification
G06F9/30
PHYSICS
G06F7/60
PHYSICS
Abstract
Systems and/or methods can include a ring-based inverter chain that constructs multi-bit flip-flops that store time. The time flip-flops serve as storage units and enable pipelined operations. Single cells used in time series analysis, such as dynamic time warping, are rendered by the time-domain circuits. The circuits include time flip-flops, MIN, and ABS circuits. The matrix can be constructed from the single cells.
Claims
1. A time-domain hardware implemented method for processing time sequences signals, comprising: performing, using a scalable pipelined structure that comprises multi-bit ring-based time flip-flops (TFF), dynamic time warping (DTW) between two time sequences A and B to detect data similarities as a node D.sub.i,j in the pipeline structure, wherein the time sequences A and B each comprises time sample signals A.sub.i and B.sub.j of variable speed, respectively, wherein: each of the A.sub.i and B.sub.j time sample signals has a numerical value which is proportional to a duration of time pulse at a time instance i and j, respectively, and the node D.sub.i,j is stored as a resultant time pulse T(D.sub.i,j) in the TFF for a next clock cycle operation, and the resultant time pulse T(D.sub.i,j) is obtained by directly performing at least a time-domain pulses comparison operation on a pair of time sample signals A.sub.i and B.sub.j.
2. The time-domain hardware implemented method of claim 1, wherein the time-domain pulses comparison operation comprising performing an absolute difference operation (ABS) on the pair of time sample signals A.sub.i and B.sub.j to obtain the resultant time pulse T(D.sub.i,j), wherein T(D.sub.i,j)=ABS (A.sub.i−B.sub.j) and the resultant time pulse T(D.sub.i,j) is stored in the TFF for carrying out more time-domain pulses comparison operation with neighboring nodes in a next clock cycle to generate new resultant time pulses T(D.sub.i,j), until all nodes D.sub.i,j within an i×j dimension matrix between the time sequences A and B are determined.
3. The time-domain hardware implemented method of claim 2, wherein a path of nodes D.sub.i,j within the i×j dimension matrix having the lowest pulse widths forms a warping path which aligns respective pairs of time sample signals A.sub.i and B.sub.j between the time sequences A and B, and wherein a pulse width value of a terminating node D*.sub.i,j at an opposite diagonal corner from a starting node of an i×j dimension matrix represents a final distance which indicates how much similarities between the time sequences A and B.
4. The time-domain hardware implemented method of claim 1, wherein the time-domain pulses comparison operation comprising taking a minimum value operation (MIN) of three ancestor nodes (D.sub.i-1,j, D.sub.i-1,j-1, D.sub.i,j-1), wherein T(D.sub.i,j)=MIN (D.sub.i-1,j, D.sub.i-1,j-1, D.sub.i,j-1) and the pulse T(D.sub.i,j) forms a node D.sub.i,j to populate an i×j dimension matrix between the time sequences A and B.
5. The time-domain hardware implemented method of claim 4, wherein the time-domain pulses comparison operation comprising operating in a pipeline mode by reading the resultant time pulse T(D.sub.i,j) of an absolute difference operation (ABS) operation stored in the TFF, for a next cycle MIN operation to form respective nodes D.sub.i,j to populate the i×j dimension matrix.
6. The time-domain hardware implemented method of claim 5, comprising using a multiplexer (MUX) to by-pass node pulse data reading from an input TFF to speed up the DTW operations.
7. The time-domain hardware implemented method of claim 1, wherein the time signal samples A.sub.i and B.sub.j are read in digital from the sequences A and B, and the time signal samples A.sub.i and B.sub.j are converted into quantized pulse widths prior to performing an absolute difference operation ABS or a minimum value operation (MIN) operation.
8. The time-domain hardware implemented method of claim 5, further comprising fine tuning diagonally in the i×j matrix, the stored pulse widths values of the nodes D.sub.i,j.
9. The time-domain hardware implemented method of claim 2, comprising using a NAND gate having a plurality of NAND gate inputs to perform the ABS operation, wherein only one of the plurality of NAND gate inputs is buffered by an inverter.
10. The time-domain hardware implemented method of claim 4, comprising using a multi-input NOR gate to directly perform the MIN operation on time-domain pulses.
11. The time-domain hardware implemented method of claim 1, wherein the TFF in the pipelines are used to store the resultant time pulse T(D.sub.i,j) for a future clock cycle operations and to perform accumulation operations in DTW calculations when enabled.
12. The time-domain hardware implemented method of claim 1, wherein the pipelined structure is scalable by simply cascading the pipelined structure horizontally and vertically to increase sequence length for reading more time sample signals A.sub.i and B.sub.j simultaneously.
Description
BRIEF DESCRIPTION OF DRAWINGS
(1) The disclosure is better understood with reference to the following drawings and description.
(2) The elements in the figures are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the disclosure. Moreover, in the figures, like-referenced numerals may designate corresponding parts throughout the different views.
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
(11)
(12)
(13)
(14)
(15)
(16)
DESCRIPTION
(17) The systems and/or methods can improve the energy efficiency of time-domain signal processing. One example of energy efficient time domain signal processing is described in commonly assigned Pub. Patent Application No. 2017/0194982, entitled, “System and Method for Energy Efficient Time Domain Signal Processing,” the entire contents of which are incorporated by reference herein. In some existing designs, the lack of a storage unit prevents time-domain pipelined computing.
(18) The disclosure describes systems and/or methods using new time-domain flip-flops (TFF) to build a pipelined structure. The enabled pipeline structure may enhance the throughput of the design. Simple logic operations in the pipeline structure, such as MIN and ABS operations in the time domain, are also described; they carry out a dynamic time warping algorithm using time-domain circuit hardware. A hardware implemented method that uses these circuit techniques to carry out the dynamic time warping algorithm for time series analysis finds applications in voice recognition, motion detection, DNA sequencing, etc.
(19) Dynamic time warping (DTW), a variant of the dynamic programming algorithm, is used for time signal classification. Its strong capability for measuring the distance between variable-speed temporal sequences makes DTW a prevalent method for time series classification in broad applications such as ECG diagnosis, DNA sequencing, etc. Several efforts have been made to accelerate the operation, including a recent demonstration of race logic. However, that demonstration is confined to single-bit operation, is not scalable with variable sequence length, and has low throughput due to its non-pipelined operation. To overcome those challenges, the systems and/or methods describe a DTW engine for time series classification using time-domain computing. A pipelined operation uses time flip-flops (TFF), which can lead to an order of magnitude improvement in throughput and a scalable processing capability for time sequences. Compared with recent time-domain designs, which suffer from low bit precision and the lack of a memory element, the pipeline structure is implemented entirely in the time domain with up to 10-bit resolution.
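For orientation, the DTW computation referred to above can be sketched in software. The following is a minimal reference model in Python, not the disclosed circuit. It assumes the textbook DTW recurrence, in which each node accumulates the absolute difference of its two samples onto the minimum of its three ancestor nodes. The function name dtw_matrix and the sample values are hypothetical and chosen only for illustration.

```python
def dtw_matrix(a, b):
    """Fill the DTW matrix for sequences a and b (textbook recurrence, assumed here).

    d[i][j] = |a[i] - b[j]| + min(d[i-1][j], d[i-1][j-1], d[i][j-1])
    """
    n, m = len(a), len(b)
    inf = float("inf")
    d = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            cost = abs(a[i] - b[j])  # ABS stage
            if i == 0 and j == 0:
                best = 0  # starting node has no ancestors
            else:
                best = min(  # MIN stage over the three ancestor nodes
                    d[i - 1][j] if i > 0 else inf,
                    d[i - 1][j - 1] if i > 0 and j > 0 else inf,
                    d[i][j - 1] if j > 0 else inf,
                )
            d[i][j] = cost + best  # accumulation
    return d


# Hypothetical sample values: b is a "slower" copy of a.
a = [1, 2, 3]
b = [1, 2, 2, 3]
d = dtw_matrix(a, b)
print(d[-1][-1])  # final distance at the terminating node: 0
```

The value at the terminating node is the final distance. In this example it is 0 because the second sequence only repeats one sample of the first, which the warping absorbs.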
(20)
(21) A “warping path” 120 is produced in order to align the two signals Ai and Bj in time t as marked in graph 100 of
(22)
(23)
(24)
(25)
(26) During the readout phase, the stored pulse is sent out from the output pin 615 of the ring 610 with a pulse width equivalent to the stored value. When the ring is filled, a carry signal rises and the ring rotates back with the remainder value stored inside. The “rotation” operation provides scalable operation across multi-bit groups. Unlike conventional D-flip-flops 510, each TFF 650 can process multi-bit signals. In one example, each TFF 650 can store a 6-bit time domain signal and two TFFs 810 (see
(27)
(28)
(29)
(30)
(31) At step 1606, the process performs the time-domain pulse comparison operation by taking an absolute difference (ABS) of the pair of time sample signals A.sub.i and B.sub.j to obtain the resultant time pulse T(D.sub.i,j), wherein T(D.sub.i,j)=ABS (A.sub.i−B.sub.j). The resultant time pulse T(D.sub.i,j) is stored in the TFF for carrying out further time-domain pulse comparison operations with neighboring nodes in a next clock cycle to generate new resultant time pulses T(D.sub.i,j), until all nodes D.sub.i,j within an i×j dimension matrix between the time sequences A and B are determined.
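The ABS operation on pulse widths can be illustrated with a small behavioral model in Python. This is a sketch under stated assumptions, not the disclosed gate-level circuit: it assumes both pulses rise at the same instant, samples them in unit time steps, and models the absolute difference as the interval during which exactly one pulse is high (an XOR of the two waveforms). The helper names pulse, width_of, and abs_diff_pulse are hypothetical.

```python
def pulse(width, length):
    """Sampled waveform: high for `width` steps starting at t=0, then low."""
    return [t < width for t in range(length)]


def width_of(wave):
    """Pulse width, counted as the number of high samples."""
    return sum(wave)


def abs_diff_pulse(a_wave, b_wave):
    """High only while exactly one input is high (behavioral XOR model, assumed)."""
    return [a != b for a, b in zip(a_wave, b_wave)]


a = pulse(5, 16)  # hypothetical sample A_i with value 5
b = pulse(9, 16)  # hypothetical sample B_j with value 9
d = abs_diff_pulse(a, b)
print(width_of(d))  # 4, i.e. ABS(5 - 9)
```

Because the two pulses share a rising edge, the shorter one falls first and the output stays high until the longer one falls, so the output width equals the difference of the two input widths regardless of which input is longer.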
(32) At step 1608, the time-domain pulses comparison operation may additionally take a minimum value operation (MIN) of three ancestor nodes (D.sub.i-1,j, D.sub.i-1,j-1, D.sub.i,j-1), wherein T(D.sub.i,j)=MIN (D.sub.i-1,j, D.sub.i-1,j-1, D.sub.i,j-1) and the pulse T(D.sub.i,j) forms a node D.sub.i,j to populate an i×j dimension matrix between the time sequences A and B. The time-domain pulses comparison operation includes operating in a pipeline mode by reading the resultant time pulse T(D.sub.i,j) of the ABS operation stored in the TFF, for a next cycle MIN operation to form respective nodes D.sub.i,j to populate the i×j dimension matrix.
(33) At step 1610, the time-domain hardware implemented method may include a multiplexer (MUX) to facilitate by-passing node pulse data reading from an input TFF to speed up the DTW operations. The stored pulse width values of the nodes D.sub.i,j may be improved by performing fine tuning diagonally in the i×j matrix from a bottom right to top left direction. The time-domain hardware may use a NAND gate to perform the ABS operation, wherein only one of the NAND gate's inputs is buffered by an inverter. The time-domain hardware implemented method may also include using a multi-input NOR gate to directly perform the MIN operation on time-domain pulses. The TFFs in the pipelines are used to store the resultant time pulse T(D.sub.i,j) for future clock cycle operations and to perform accumulation operations in DTW calculations when enabled. The accumulation function in the TFF may be disabled when not used. In another example, the pipelined structure is scalable by simply cascading the pipelined structure horizontally and vertically to increase sequence length for reading more time sample signals A.sub.i and B.sub.j simultaneously.
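The MIN operation on pulse widths can likewise be modeled behaviorally. The Python sketch below is illustrative only and rests on assumptions that the text does not state: all pulses rise at the same instant, time is sampled in unit steps, and the multi-input NOR gate recited in claim 10 receives the pulses in inverted (active-low) form, so that its output is high only while every pulse is still high. The helper names pulse, invert, nor_gate, and min_pulse are hypothetical.

```python
def pulse(width, length):
    """Sampled waveform: high for `width` steps starting at t=0, then low."""
    return [t < width for t in range(length)]


def invert(wave):
    """Logical inversion of every sample."""
    return [not s for s in wave]


def nor_gate(*waves):
    """Multi-input NOR: high only when every input is low."""
    return [not any(samples) for samples in zip(*waves)]


def min_pulse(*waves):
    """NOR of inverted inputs (assumed polarity); high while all pulses are high."""
    return nor_gate(*(invert(w) for w in waves))


# Hypothetical widths for the three ancestor nodes.
up, diag, left = pulse(7, 16), pulse(3, 16), pulse(5, 16)
print(sum(min_pulse(up, diag, left)))  # 3, the shortest of the three widths
```

The output falls as soon as the shortest input pulse falls, so its width equals the minimum of the input widths without any explicit comparison.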
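One way to reason about how many nodes of the matrix can be evaluated concurrently is a wavefront schedule. The Python sketch below is an assumption for illustration and is not taken from the figures: it relies only on the stated dependency of each node D.sub.i,j on its three ancestor nodes, from which it follows that nodes sharing the same index sum i+j do not depend on one another. The function name wavefronts and the matrix size are hypothetical.

```python
def wavefronts(n, m):
    """Group nodes (i, j) of an n x m matrix by anti-diagonal index k = i + j.

    Every node in group k depends only on nodes in groups k-1 and k-2,
    so the nodes within one group are mutually independent.
    """
    return [
        [(i, k - i) for i in range(max(0, k - m + 1), min(n, k + 1))]
        for k in range(n + m - 1)
    ]


groups = wavefronts(3, 4)  # hypothetical 3 x 4 matrix
print(len(groups))                  # 6 wavefront steps
print(sum(len(g) for g in groups))  # 12 nodes in total
print(groups[2])                    # [(0, 2), (1, 1), (2, 0)]
```

Under this schedule an n×m matrix needs n+m−1 wavefront steps rather than n·m sequential node evaluations. The step counts are arithmetic on the dependency structure only and are not measured figures for the disclosed hardware.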
(34) Alternate systems may include any combination of structure and functions described or shown in one or more of
(35) Other systems, methods, features and advantages will be, or will become, apparent to one with skill in the art upon examination of the figures and detailed description. It is intended that all such additional systems, methods, features and advantages be included within this description, be within the scope of the disclosure, and be protected by the following claims.