Method and apparatus for efficient multiplication to improve performance in computational machines
11474791 · 2022-10-18
Assignee
Inventors
Cpc classification
G06F7/70
PHYSICS
G06F7/57
PHYSICS
International classification
Abstract
A method and apparatus is disclosed for determining a stochastic binary string (SBS) representing a value based on the value represented in binary two's complement. Several different generators are disclosed for generating SBS strings, each being generated to have particular features that are advantageous under various conditions in which the string is to be multiplied with another SBS string. Several such generators can be presented and selected depending upon the particular values to be converted to SBS representation and the functions to be performed on those values.
Claims
1. A stochastic binary string (SBS) generator, comprising: (a) a plurality of SBS generator sub-blocks, each sub-block generating an SBS sequence using a different technique; (b) a selector for selectively coupling an input value to a selected one from among the plurality of SBS generator sub-blocks; and (c) a processor coupled to the selector to control the selection performed by the selector.
2. The SBS generator of claim 1, wherein the plurality of SBS generator sub-blocks include at least one SBS ρ-sequence generator sub-block that generates ρ-sequence SBS sequences.
3. The SBS generator of claim 1, wherein the plurality of SBS generator sub-blocks include at least one SBS δ-sequence generator sub-block that generates δ-sequence SBS sequences.
4. The SBS generator of claim 1, wherein the plurality of SBS generator sub-blocks include at least one SBS ω-sequence generator sub-block that generates ω-sequence SBS sequences.
5. The SBS generator of claim 1, wherein: (a) the plurality of SBS generator sub-blocks includes a first SBS ρ-sequence generator sub-block generates SBS ρ-sequence strings; (b) the plurality of SBS generator sub-blocks includes a second SBS δ-sequence generator sub-block that generates SBS δ-sequence strings; (c) the processor selects from among the plurality of SBS generator sub-blocks, the first SBS generator sub-block to generate a first SBS sequence; and (d) the processor selects from among the plurality of SBS generator sub-blocks, the second SBS generator sub-block to generate a second SBS sequence to be multiplied with the first SBS sequence.
6. The SBS generator of claim 5, further including at least one exclusive nor (XNOR) gate having a first input coupled to the first SBS generator sub-block to receive the first SBS sequence and having a second input coupled to the second SBS generator sub-block to receive the second SBS sequence, the output of the XNOR gate being the product of the first SBS sequence and the second SBS sequence.
7. The SBS generator of claim 1, wherein the plurality of SBS generator sub-blocks include at least one SBS generator sub-block that generates SBS sequences using a σ-sequence method.
8. The SBS generator of claim 7, wherein the plurality of SBS generator sub-blocks include at least one SBS generator sub-block that generates SBS sequences using a δ-sequence method.
9. The SBS generator of claim 7, wherein the plurality of SBS generator sub-blocks include at least one SBS generator sub-block that generates SBS sequences using a ω-sequence method.
10. The SBS generator of claim 7, wherein: (a) the plurality of SBS generator sub-blocks includes a first SBS generator sub-block that generates SBS sequences using a σ-sequence method; (b) the plurality of SBS generator sub-blocks includes a second SBS generator sub-block that generates SBS sequences using a δ-sequence method; (c) the processor selects from among the plurality of SBS generator sub-block, the first SBS generator sub-block to generate a first SBS sequence; and (d) the processor selects from among the plurality of SBS generator sub-blocks, the second SBS generator sub-block to generate a second SBS sequence to be multiplied with the first SBS sequence.
11. The SBS generator of claim 10, further including at least one exclusive nor (XNOR) gate having a first input coupled to the first SBS generator sub-block to receive the first SBS sequence and having a second input coupled to the second SBS generator sub-block to receive the second SBS sequence, the output of the XNOR gate being the product of the first SBS sequence and the second SBS sequence.
12. The SBS generator of claim 1, wherein the plurality of SBS generator sub-blocks include at a first SBS generator sub-block, wherein the first SBS generator sub-block generates SBS sequences using ζ-sequence method.
13. The SBS generator of claim 12, wherein the processor selects from among the plurality of SBS generator sub-blocks the first SBS generator sub-block when an SBS sequence generated by one of the other SBS generator sub-blocks is to be multiplied by zero and wherein the first SBS generator outputs an SBS sequence of zero.
Description
DESCRIPTION OF THE DRAWINGS
(1)
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
(11)
(12)
(13)
(14)
(15)
(16)
(17)
(18)
(19)
(20)
(21)
(22)
(23)
(24)
(25)
(26)
(27)
(28)
(29)
(30)
(31)
(32) Like reference numbers and designations in the various drawings indicate like elements.
DETAILED DESCRIPTION
(33) Some embodiments of the disclosed method and apparatus use several types of stochastic binary string (SBS) generators to generate SBS sequences based on the particular values to be multiplied.
(34) One such SBS generator generates an ω-sequence. An ω-sequence is a string that presents a numerical value in SBS format. An ω-sequence may be either a binary SBS (BSBS) or a unipolar SBS (USBS). The ω-sequence may be used either once or repeated multiple times within an SBS sequence. Within each ω-sequence, the number of transitions from “1” to “0” is minimized by grouping the 1-bits together and grouping the 0-bits together. Accordingly, the output of an ω-sequence generator is a string of is followed by a string of 0s. The number of ones in the first part of the string will be equal to pT, the product of the probability associated with each bit of the SBS sequence N and the length T of the SBS sequence. In general, the probability p that a bit of a BSBS sequence is assigned a value of “1” is determined as a function of a value n being represented by the SBS sequence. For a BSBS, the probability p=(c+n)/2c, and the BSBS sequence N can represent values n in a range from −c to c. For a USBS, the probability is p=n/c. and the sequence N can represent values n in a range from f to c, where f≤n≤c.
(35) Conversion of a value b represented in binary form to a value n represented by an either a USBS or a BSBS sequence N using an ω-sequence can be performed as follows. For a USBS sequence, the probability is p=c/n, where c is the ceiling (i.e., largest number that can be represented by the USBS). For a BSBS, the probability p can be expressed as (2.sup.L−1+b)/2.sup.L, where L is bit precision in the binary representation and b is the value represented in binary form. The bit precision of the binary representation is the length of the binary number (i.e., number of bits) used to represent the value. The probability p can be used to determine the number of bits that are assigned a value of “1” within the ω-sequence. That is, the number of ones in the ω-sequence is equal to pT. In one embodiment, the value pT is compared to the output of a counter that runs from 0 to T. If the value pT is larger than the output of the counter, a “1” is generated within the ω-sequence; otherwise a zero is generated. Once the counter runs the full range from 0 to T, a complete ω-sequence is generated in which the value b is represented in SBS format as a string of pT bits followed by a string of T(1−p) bits assigned a value of zero. An example is provided below to illustrate this.
(36)
p=(2.sup.L−1+b)/2.sup.L EQ. 1
T=2.sup.L EQ. 2
2.sup.L−1=T/2 EQ. 3
pT=(2.sup.L−1+b) EQ. 4 where: p is the probability that a bit in the BSBS sequence is assigned a value of “1”; L is the bit precision (i.e., the number of bits in the binary format used to express the magnitude) of the value to be converted from binary format to BSBS format; T is the number of bits in the BSBS sequence; and b expressed in binary format that is to be converted for representation in BSBS format.
(37) Therefore, pT can be generated by summing T/2 with the binary value of b. For example, if the value b is a binary value of “0010” (representing a value of 2), represented in a 4-bit 2s-complement binary representation (a signed binary representation that can hold a value from −8 to 7), then the value of L is 3. For this example, T=2.sup.L=2.sup.3=8; represented in binary form as “01000”. The value of 2.sup.L−1 is 4; “00100” in binary format, which is a simple right shift of T. Adding this binary value to the binary value of b yields b+2.sup.L−1=pT=00110. It should be noted that the bit precision of the register needed to hold the value L must be at least one greater than the bit precision of the register needed to hold the value of b.
(38) By applying b+2.sup.L−1=00110 to the positive input of the comparator 803 and applying the value of i output from the counter 801 to the negative input of the comparator 801, the comparator will output a string in which the first 10 bits are ones and the following 6 bits are zeros (i.e., the desired ω-sequence representing the value 0010 binary as a BSBS sequence N having a length T=16). It can be seen that the ω-sequence generator 800 can generate the w-sequence using only simple addition and comparisons. The bits output from the comparator 803 are coupled to the input of a register 809 and stored under the control of an index, i output from the counter 801. The register outputs the string N having bits n.sub.0−n.sub.T-1.
(39)
(40) A BSBS sequence generated using a ω-sequence has several properties of interest. The first of these properties is that the maximum relative error for the ω-sequence is less than 1/T. That is, converting a binary value to a SBS sequence, and then converting the SBS sequence back to a binary number will result in an relative error that is less than 1/T, where T is the length of the ω-sequence. Another property of a SBS sequence generated with an ω-sequence is that it can be generated without the use of a random number generator. Yet another property of a SBS that has been generated with an ω-sequence is that the SBS sequence will have the fewest possible transitions (state changes between one and zero). However, an accurate product cannot be obtained when multiplying a SBS sequence that has been generated using a ω-sequence generator with another SBS sequence that has been generated using a ω-sequence. Neither can an accurate product be attained when multiplying a SBS sequence that has been generated using a w-sequence with a time-shifted version of itself.
(41) Another SBS generator that can be used in accordance with some embodiments of the disclosed method and apparatus generates an δ-sequence. Like the ω-sequence, a δ-sequence is also a string that presents a numerical value in either unipolar SBS format or bipolar SBS format. The δ-sequence may also be used either once or repeated multiple times within an SBS sequence. However, the δ-sequence evenly spreads the bit positions of the “1”s in the BSBS sequence out so that the “1”s are as spaced across the string.
(42)
(43) By initializing the value of an accumulator, A.sub.0 to 2.sup.L+b, the value of the accumulator, A is offset by an amount equal to the value of b from the value at which b overflows. When generating long sequences (i.e., in which T is much greater than c), the value to which the accumulator A is initialized has less consequence. Accordingly, the accumulator can be initialized to any arbitrary value. However, for shorter strings in which T≤c, initializing the accumulator to 2.sup.L+b provides a more accurate result. The content A.sub.i of the accumulator A is then compared to 2.sup.L to determine whether A.sub.i is greater than 2.sup.L (STEP 1003). If b is positive, the comparison will be true. Alternatively, if b is negative, then the result of the comparison will be false. If true, then a bit n.sub.i of the δ-sequence, N is set to “1” and the value A.sub.i of the accumulator, A is decremented by 2.sup.L. (STEP 1005). If false, the bit n.sub.i of the δ-sequence, N is set to “0” and the value A.sub.i of the accumulator, A remains the same (STEP 1007). In either case, the index i is then incremented (STEP 1009). After incrementing the index, the value A.sub.i of the accumulator is updated to A.sub.i=A.sub.i−1+b+2.sup.L−1 (STEP 1011).
(44)
(45)
(46)
(47)
(48)
(49)
(50)
(51)
(52) The multiplexer has two additional inputs, 2.sup.L−1 and b. The first of these additional inputs, 2.sup.L−1 is coupled a source of the value 2.sup.L−1 In accordance with some embodiments of the disclosed method and apparatus, the output of a register 1814 in which a value of 2.sup.L is stored is coupled to the input of a right shift register 1816. In such embodiments, the value 2.sup.L−1 is derived by performing a right shift of the output of the register 1814. Alternatively, a register (not shown) stores the value 2.sup.L−1 The second of the additional inputs, b is coupled to a memory register 1407 in which the value b is stored. The value b is the value to be converted from a two's complement binary representation to a BSBS sequence.
(53) The output of the summing circuit 1806 is the sum of the three outputs A.sub.i−1+b+2.sup.L−1 The output of the summing circuit 1806 is an intermediate value of A.sub.i. This intermediate value of A.sub.i coupled to three different circuits. The first of these three circuits is a comparator 1808. The second of the three circuits is a difference circuit 1810. The third of the circuits is a multiplexer 1812. The positive input to the comparator 1808 receives the output of the summing circuit 1806. The negative input to the comparator 1808 is coupled to the register 1814 in which the value 2.sup.L is stored. If the positive input is greater than the negative input, then the output of the comparator 1808 is a one. That is, if the sum A.sub.1=A.sub.i−1+b+2.sup.L−1 produced by the summing circuit 1806 is greater than 2.sup.L then the comparator outputs a “1”. Otherwise, the comparator outputs a “0”.
(54) The output of the comparator 1808 is coupled to the select input of the multiplexer 1812. When the comparator outputs a “1”, the multiplexer 1808 couples the “x” input to the output of the multiplexer 1808. The “x” input to the multiplexer 1808 is coupled to the output of the difference circuit 1810. The difference circuit 1810 outputs the difference between 2.sup.L and the intermediate value of A.sub.i. When the comparator 1808 outputs a “0”, the multiplexer 1810 couples the intermediate value of A.sub.i to the output of the multiplexer 1810. The output of the multiplexer 1812 is coupled to the input of the accumulator A 1804. A clock, which may either be the clock that increments the index counter 1801 or a clock derived from that clock, determines when the value A.sub.i should be updated.
(55) Accordingly, when the value of the sum A.sub.i−1+b+2.sup.L−1 is greater than 2.sup.L, the multiplexer 1812 sets the value of A.sub.1 equal to A.sub.1 minus 2.sup.L. Alternatively, when the value of the sum A.sub.i−1+b+2.sup.L−1 is less than, or equal to 2.sup.L, the multiplexer 1812 sets the value of A.sub.i equal to the intermediate value of A.sub.i output from the summing circuit 1806.
(56) The output of the comparator 1808 is also coupled to the input to an N register 1818 that holds the values of each of the bits n.sub.i of the δ-sequence N. The index i is used to save the bits n.sub.i of the δ-sequence in distinct bit locations associated with the value of the index i. It should be noted that the δ-sequence generator 1800 comprises very simple circuit element, such as a counter 1801, a summing circuit 1806, a difference circuit 1810, 2 comparators 1802, 1808, 2 multiplexers 1803, 1812, a shift register 1816 and four registers 1804, 1807, 1814, 1818.
(57)
(58) An SBS sequence generated using a δ-sequence has several properties of interest. The first of these properties is that, similar to the ω-sequence, the maximum error for the δ-sequence is less than 1/T. That is, converting a binary value to an SBS sequence, and then converting the SBS sequence back to a binary number will result in an error that is less than 1/T, where T is the length of the δ-sequence. Another property of an SBS sequence generated with an δ-sequence is that it can be generated without the use of a random number generator. Yet another property of an SBS that has been generated with an δ-sequence is that the SBS sequence will have the maximum number of transitions possible (state changes between one and zero). However, similar to an ω-sequence, an accurate product cannot be obtained when multiplying an SBS sequence that has been generated using a δ-sequence generator with another SBS sequence that has been generated using a δ-sequence. Neither can an accurate product be attained when multiplying a SBS sequence that has been generated using a δ-sequence with a time-shifted version of itself. Nonetheless, a SBS sequence that has been generated using an ω-sequence can be multiplied with a SBS sequence that has been generated using a δ-sequence.
(59) Another SBS generator that can be used in accordance with some embodiments of the disclosed method and apparatus generates an σ-sequence. Like the ω-sequence and δ-sequence, the σ-sequence is also a string that presents a numerical value in bipolar BSBS format. The σ-sequence may also be used either once or repeated multiple times within an SBS sequence. That is, several σ-sequence strings can be concatenated together to form one SBS sequence. The σ-sequence is similar to the ρ-sequence, but is more useful in cases in which it is desirable for the SBS sequences generated to obey the additive inverse identity property. That is, a ρ-sequence representing a binary value of b subtracted from another ρ-sequence representing the same value will result in zero if the two ρ-sequences are constructed using the same random number sequence.
(60)
(61)
(62) The comparator 2110 compares the absolute value of b to each random number, R.sub.i generated by the PRNG 2106 as the index counter 2102 increments through the index values, i. For each value of i, the comparator determines whether the absolute value of b is greater than the difference R.sub.i−2.sup.L−1 With the random values, R.sub.i output from the RNG 2106 being in the range from 0 through 2.sup.L−1, and assumed to be truly random, the probability that the output of the comparator 2110 will be true is equal to p=(b+2.sup.−1)/2.sup.L for values of b that are in the range of −2.sup.L−1 to 2.sup.L−1. It can be seen that the term (b+2.sup.L−1) shifts the value of the numerator upward from the value of b by 2.sup.L. This serves to effectively place the range of numerator in a range from 0 to 2.sup.L−1.
(63) The output of the comparator 2110 is coupled to a select input of a multiplexer 2115. Accordingly, the output of the comparator 2110 determines whether the “x input” or the “y input” of the multiplexer 2115 is coupled to the multiplexer 2115 output. The “x input” is coupled directly to the output of a SIGN register 2116. The SIGN register 2116 stores a value that is provided by the output of a comparator 2118. The comparator 2118 is a “1” for values of b that are less than zero (negative) and “0” for values of b that are greater than or equal to zero. The “y input” to the multiplexer 2115 is coupled to an inverter 2120. The input of the inverter 2120 is coupled to the output of the SIGN register 2116. The output of the multiplexer 2115 is coupled to a register 2122 that stores each of the values of the bits, n.sub.i of the string N under the control of the index, i.
(64) Accordingly, the probability that n.sub.i is set to the value stored in the SIGN register 2116 is equal to p=(b+2.sup.L−1)/2.sup.L the probability that the output of the comparator 2110 is a “1”. The benefit gained by generating an SBS sequence using the α-sequence generator is that such a SBS sequence will obey the additive inverse identity, assuming the same string of values R.sub.i is used to generate both the SBS sequences that represent the positive value and the negative value that are summed. In the case in which the PRNG 2104 is a pseudo random number generator, the values of R.sub.i will be the same if the same seed is used to generate the values of R.sub.i. All of the other characteristics of SBS sequences generated using the σ-sequence generator are the same as those of SBS sequences generated using a ρ-sequence generator.
(65) Another SBS generator that can be used in accordance with some embodiments of the disclosed method and apparatus generates a ζ-sequence. Like the ω-sequence, δ-sequence, and σ-sequence, the ζ-sequence is also a string that presents a numerical value in bipolar BSBS format. The ζ-sequence may be used either once or repeated multiple times within an SBS sequence. That is, several ζ-sequence strings can be concatenated together to form one BSBS sequence. ζ-sequence strings always represent a value of zero, and are of particular use when multiplying by zero.
(66) A ζ-sequence has two parameters that define the nature of strings generated using a ζ-sequence generator. These parameters are referred to herein as “r” and “s”, where r is an integer of value 1 or greater and s is an integer in the range of 0 to r−1. A ζ-sequence string is defined as having s bits that are assigned a value of “0”, followed by r bits assigned a value of “1”, followed by r−s bits assigned a value of “0”. Accordingly, the total number of bits, T in string generated with a ζ-sequence generator is T=r+s+r−s=2r.
(67) The following are examples of some strings that contain multiple instances of a ζ-sequence, ζ(r,s): ζ(1,0): [10].sup.T=2r=2=10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 1 . . . ζ(2,0): [1100].sup.T=2r=4=1100, 1100, 1100, 1100, 1100, 1100, 1100, 1 . . . ζ(2,1): [0110].sup.T=2r=4=0110, 0110, 0110, 0110, 0110, 0110, 0110, 0 . . . ζ(3,2): [001110].sup.T=2r=6=001110, 001110, 001110, 001110, 001110, 0 . . .
(68) Commas are inserted between each individual ζ-sequence within the string generated by concatenating several ζ-sequences together. The value within the brackets indicates the pattern of ones and zeros within the ζ-sequence. The superscript indicates the length of the ζ-sequence. It can be seen that there will be r bits assigned to a value of “1” and the same number, s+r−s=r bits assigned to a value of “0”. Thus, the resulting ζ-sequence always represents a value of zero. However, the pattern of “1”s and “0”s can be defined by setting the values of the parameters, r and s.
(69)
(70)
(71) A string generated with a ζ-sequence generator will always represent a value of zero, assuming the length of the BSBS sequence is either 2r or an integer multiple of 2r. It should be noted that a BSBS sequence generated using a ζ-sequence (1, 0) generator will be identical to a BSBS sequence of the same length generated with a δ-sequence generator and representing a value of zero. This is not true for other sets of parameters. In addition, the product of a first string ζ(r.sub.1,s.sub.1) and a second string ζ(r.sub.2,s.sub.2) will always be equal to 0, if r.sub.1 is not equal to r.sub.2 or s.sub.1 is not equal to s.sub.2 and the length of the product of the two strings is equal to 2×r.sub.1×r.sub.2, or an integer multiple thereof. Furthermore, a BSBS sequence generated using a ζ-sequence generator multiplied by a either a BSBS sequence generated using a δ-sequence generator or a ω-sequence generator will result in a product that has a value, P that is less than or equal to r of the BSBS sequence generated with a δ-sequence generator, assuming the length, T of the string is equal to, or an integer multiple of 2r. Also, the product, P of a BSBS sequence generated using a ζ-sequence generator and a BSBS sequence generated using a σ-sequence generator will have a value with a mean of 0 and a scaled standard deviation proportional to the square root of (p(1−p))/T. Lastly, a BSBS sequence generated using a ζ-sequence generator can be multiplied by a time shifted version of itself, as long as the amount of the shift is not equal to r or an integer multiple of r.
(72)
(73)
(74)
(75)
(76)
(77) It can be seen that the implementation of ζ(r,s) generator can be generalized by noting that r register elements are provided, with the output being taken by the register element indicated by s. In the case of the generator 2800, there are r=3 register elements 2402, 2502, 2702, with the output being taken from the first register element 2502, as indicated by s=1. In a ζ(3,0) generator, the output would be taken from the zeroth register element 2402.
(78) Yet another SBS generator that can be used in accordance with some embodiments of the disclosed method and apparatus generates a “prime”-sequence, i.e., σ′.sub.r,s-sequence and ρ′.sub.r,s-sequence. These prime-sequence generators are modifications of their underlying sequences. That is, the σ′.sub.r,s sequence is a modification of the σ-sequence. The modification is that for b=0, the BSBS sequence is generated using a ζ-sequence generator, but for all other values of b, a σ-sequence generator is used to generate the BSBS sequence. Similarly, the ρ′.sub.r,s-sequence is generated using a ρ-sequence generator for all values of b except b=0, for which a ζ-sequence generator is used. Generating BSBS sequences using a ρ′.sub.r,s-sequence generator or a σ′.sub.r,s-sequence generator improves the behavior of the standard ρ-sequence and σ-sequence generators for b=zero by ensuring that the error for multiplication by zero is minimized.
(79) γ-sequences
(80) γ-sequences are a slight modification to the ρ-sequence and σ-sequence, in which a comparison values (i.e., those values that are compared to b to determine the bit value) are created by a generator function G.sub.L(i) that maps values of the index, i into the space that includes integer values in the range from −2.sup.L−to 2.sup.L−1−1. The generator function may be a hash function that is “random”, such as an L-bit cyclic redundancy code (CRC) in which a fixed bit sequence serves as the seed and the input is the index value, i.
(81) The γ-sequence generator operates the same way as the ρ-sequence generator described above, but for the use of the function G.sub.L(i) in place of the RNG output. A prime sequence generator for the γ′.sub.r,s-sequence operates the same way as the γ-sequence generator described above, but for the use of the function GL(i) in place of the RNG output. The advantage provided by the γ-sequence generator is the fact that it does not require a linear-feedback shift register (LFSR), as is required for the PRNG that is used to generate the RNG output in the ρ-sequence generator and the σ-generator described above.
(82) Each of the above described sequence generators can be provided as sub-blocks in a single BSBS sequence generator and selected based upon the particular characteristics of the operation to be performed on the resulting BSBS sequence and the conditions under which the string is being generated. By providing flexibility in determining which particular generator is best suited to a particular situation, the advantages of each generator can be maximized for each particular situation.
(83)
CONCLUSION
(84) A number of embodiments of the disclosed method and apparatus have been described. It is to be understood that various modifications may be made without departing from the spirit and scope of the disclosed method and apparatus. For example, some of the steps described above may be order independent, and thus can be performed in an order different from that described. Further, some of the steps described above may be optional. Various activities described with respect to the methods identified above can be executed in repetitive, serial, or parallel fashion. It is to be understood that the foregoing description is intended to illustrate and not to limit the scope of the any claims that are presented in later filed applications that might claim priority to this disclosure.