FAST AND RESOURCE-EFFICIENT APPROXIMATION FOR THE EXPONENTIAL FUNCTION

20250371105 ยท 2025-12-04

    Inventors

    Cpc classification

    International classification

    Abstract

    A method for computing an approximate value A of the exponential function e.sup.x of an argument x. The method includes: approximating e.sup.x with a Taylor expansion T around x=0 that includes a predetermined number n of terms with i-th powers x.sup.i of the argument x divided by the respective factorial of i, with i=1, . . . , n, and in the computation of each term, approximating the factorial of i to the nearest power of 2, p(i!).

    Claims

    1. A method for computing an approximate value A of the exponential function e.sup.x of an argument x, comprising the following steps: approximating e.sup.x with a Taylor expansion around x=0 that includes a predetermined number n of terms with i-th powers x.sup.i of the argument x divided by a respective factorial of i, with i=1, . . . , n; and computing each of the terms by approximating the factorial of i to a nearest power of 2.

    2. The method of claim 1, further comprising: decomposing the argument x into a product of an integer X.sub.q and a non-integer scaling factor .sub.x; and expressing the scaling factor .sub.x as a power of 2 with an exponent of .sub.x*, so that .sub.x=2.sup..sup.x.sup.*.

    3. The method of claim 2, further comprising: decomposing the approximation of e.sup.x=e.sup..sup.x.sup.X.sup.q into a product of 2.4% and a remaining part f.

    4. The method of claim 1, further comprising: using the computed approximate value of e.sup.x in the computation of a softmax function S ( y k ) = exp ( y k ) / .Math. l = 1 m exp ( y l ) of an element y.sub.k of an input vector y with m elements.

    5. The method of claim 3, wherein computations of two instances of 2.sup.n.Math..sup.x.sup.* that appear in a numerator and in a denominator of S(y.sub.k) are omitted.

    6. The method of claim 1, wherein at least one multiplication of one number with a power of 2 to an exponent, and/or division of the number by the power of 2, is computed by bit-shifting the number for a number of bits corresponding to the exponent.

    7. The method of claim 1, further comprising: using the computed approximate value of e.sup.x, and/or the computed value S(y.sub.k) of the softmax function, in a computation of output O of a neural network.

    8. The method of claim 7, wherein the computed approximate value of e.sup.x, and/or the computed value S(y.sub.k) of the softmax function, is used to compute: (i) the output O of a classifier network for images or other records of measurement data, and/or (ii) the output O of a multi-head attention module of a transformer network.

    9. The method of claim 7, further comprising: determining a confidence C of the output O of the neural network; and in response to the confidence C meeting a predetermined condition, modifying the number n of terms used in subsequent computations of the approximate value of e.sup.x.

    10. The method of claim 9, wherein the number n of terms is controlled to be kept at a lowest value that is sufficient to achieve a predetermined minimum confidence of the output O.

    11. The method of claim 7, wherein the neural network is implemented on a hardware platform with less memory, and/or less processing resources, than those which would be necessary to compute the output O without approximating the value of e.sup.x.

    12. The method of claim 7, wherein the argument x is derived from measurement data acquired using at least one sensor, and wherein the method further comprises: determining, from the output O of the neural network, an actuation signal; and actuating, using the actuation signal, a vehicle and/or a driving assistance system and/or a robot and/or a quality inspection system and/or a surveillance system and/or a medical imaging system.

    13. A non-transitory machine-readable storage medium on which is stored a computer program for computing an approximate value A of the exponential function e.sup.x of an argument x, the computer program, when executed by one or more computers and/or compute instances, causing the one or more computers and/or compute instances to perform the following steps: approximating e.sup.x with a Taylor expansion around x=0 that includes a predetermined number n of terms with i-th powers x.sup.i of the argument x divided by a respective factorial of i, with i=1, . . . , n; and computing each of the terms by approximating the factorial of i to a nearest power of 2.

    14. One or more computers and/or compute instances including a non-transitory machine-readable storage medium on which is stored a computer program for computing an approximate value A of the exponential function e.sup.x of an argument x, the computer program, when executed by the one or more computers and/or compute instances, causing the one or more computers and/or compute instances to perform the following steps: approximating e.sup.x with a Taylor expansion around x=0 that includes a predetermined number n of terms with i-th powers x.sup.i of the argument x divided by a respective factorial of i, with i=1, . . . , n; and computing each of the terms by approximating the factorial of i to a nearest power of 2.

    Description

    BRIEF DESCRIPTION OF THE DRAWINGS

    [0031] FIG. 1 shows an exemplary embodiment of the method 100 for computing an approximate value A of the exponential function e.sup.x of an argument x, according to the present invention.

    [0032] FIG. 2 shows an illustration of the simplifications introduced by the approximation as per the method 100, according to an example embodiment of the present invention.

    DETAILED DESCRIPTION OF EXAMPLE EMBODIMENTS

    [0033] FIG. 1 is a schematic flow chart of an embodiment of the method 100 for computing an approximate value A of the exponential function e.sup.x of an argument x.

    [0034] According to block 105, the argument x may be derived from measurement data acquired using at least one sensor.

    [0035] In step 110, e.sup.x is approximated with a Taylor expansion T around x=0 that comprises a predetermined number n of terms with i-th powers x.sup.i of the argument x divided by the respective factorial of i, with i=1, . . . , n.

    [0036] In step 120, the argument x is decomposed into a product of an integer X.sub.q and a non-integer scaling factor .sub.x.

    [0037] In step 130, this scaling factor .sub.x is expressed as a power of 2 with an exponent of .sub.x*, so that .sub.x=24%.

    [0038] According to block 131, the sought approximation of e.sup.x=e.sup..sup.x.sup.X.sup.q may be decomposed into a product of 2.sup.n.Math..sup.x.sup.* and a remaining part f.

    [0039] In step 140, in the computation of each term of the Taylor expansion T, the factorial of i is approximated to the nearest power of 2, p(i!). The computation of all terms of the Taylor expansion T yields the sought approximate value of e.sup.x.

    [0040] In step 150, the computed approximate value A of e.sup.x is used in the computation of the softmax function

    [00009] S ( y k ) = exp ( y k ) / .Math. l = 1 m exp ( y l )

    of an element y.sub.k of an input vector y with m elements.

    [0041] According to block 151, if A has been decomposed into 2.sup.n.Math..sup.x.sup.*.Math.f according to block 131, computations of two instances of 2.sup.n.Math..sup.x.sup.* that appear in the numerator and in the denominator of S(y.sub.k) may be omitted.

    [0042] In step 160, the computed approximate value A of e.sup.x, and/or the computed value S(y.sub.k) of the softmax function, is used in the computation of the output O of a neural network.

    [0043] According to block 161, the computed approximate value A of e.sup.x, and/or the computed value S(y.sub.k) of the softmax function, may be used to compute the output O of a classifier network for images or other records of measurement data, and/or the output O of a multi-head attention module of a transformer network.

    [0044] According to block 162, a confidence C of the output O of the neural network may be determined. It may then be determined in block 163 whether this confidence C meets a predetermined condition, such as being above or below a predetermined threshold value. If this is the case (truth value 1), then, according to block 164, the number n of terms used in subsequent computations of the approximate value A of e.sup.x may be modified. In particular, according to block 164a, the number n of terms may be controlled to be kept at the lowest value that is sufficient to achieve a predetermined minimum confidence C of the output O.

    [0045] According to block 165, the neural network may be implemented on a hardware platform with less memory, and/or less processing resources, than those which would be necessary to compute the output O without approximating the value of e.sup.x.

    [0046] In step 170, an actuation signal 170a is determined from the output O of the neural network.

    [0047] In step 180, a vehicle 50, a driving assistance system 51, a robot 60, a quality inspection system 70, a surveillance system 80, and/or a medical imaging system 90, is actuated with the actuation signal 170a.

    [0048] FIG. 2 illustrates the simplifications introduced by the approximation as per the method 100.

    [0049] The task of computing e.sup.x is sketched as lifting the argument x, drawn as a container filled with water, up a given height h. This is difficult because the water-filled container is heavy. The container can be lightened a lot by decomposing the argument x into an integer X.sub.q, symbolized by a near-empty container, and a non-integer scaling factor .sub.x. Computing the i-th power X.sub.q.sup.i of the integer X.sub.q corresponds to lifting the near-empty container up the given height h.

    [0050] There are two more computations required in the approximation. A power of the scaling factor .sub.x has to be computed, and a division by the factorial i! needs to be made. The scaling factor .sub.x is expressed as a power of 2 with an exponent of .sub.x*, so that .sub.x=2.sup..sup.x.sup.*. This reduces the multiplication with the scaling factor 4, or a power thereof to a simple bit-shift operation, symbolized in FIG. 2 as an anti-clockwise rotation of the lifted near-empty container. The factorial i! is approximated to a nearest power of 2, p(i!). This reduces division by the factorial i! to a bit-shift operation in the other direction, symbolized by a clockwise rotation of the lifted near-empty container.

    [0051] The end result is an approximation A that is close but not identical to the true value of e.sup.x. This is symbolized by the container A not being at the full height h of e.sup.x and having a lesser fill level than the container e.sup.x.