RADIATION DETECTION WITH NON-PARAMETRIC DECOMPOUNDING OF PULSE PILE-UP
20220137111 · 2022-05-05
Inventors
- Christopher MCLEAN (Kangaroo Flat, Victoria, AU)
- Michael Pauley (Carlton North, Victoria, AU)
- Jonathan Manton (Carlton North, Victoria, AU)
Cpc classification
G01T1/36
PHYSICS
G01R23/15
PHYSICS
International classification
G01R23/15
PHYSICS
Abstract
A method of determining a spectrum of energies of individual quanta of radiation received in a radiation detector is disclosed. Spectrum sensitive statistics are computed from a time series of digital observations from the radiation detector, defining a mapping from a density of amplitudes of the pulses to the spectrum sensitive statistics. The spectrum is determined by estimating the density of amplitudes of the pulses by applying an inversion of the mapping to the spectrum sensitive statistics. The statistics may be based on a first set of nonoverlapping time intervals of constant length L at least as long as a duration of the pulses without regard to entirety of clusters of the pulses; and a second set of nonoverlapping time intervals of constant length L1 less than L also without regard to entirety of clusters of the pulses. A method of estimating count rate is also disclosed.
Claims
1. A method of determining a spectrum of energies of individual quanta of radiation received in a radiation detector, the method comprising the steps of: (1) obtaining a time series of digital observations from the radiation detector comprising pulses corresponding to the detection of the individual quanta; (2) computing spectrum sensitive statistics from the detector signal, the spectrum sensitive statistics defining a mapping from a density of amplitudes of the pulses to the spectrum sensitive statistics; and (3) determining the spectrum by estimating the density of amplitudes of the pulses by applying an inversion of the mapping to the spectrum sensitive statistics.
2. The method of claim 1 further comprising basing the spectrum sensitive statistics on a sum of the digital observations over a plurality of time intervals.
3. The method of claim 2 further comprising defining the mapping using an approximate compound Poisson process.
4. The method of claim 3, further comprising augmenting the approximate compound Poisson process by a modelled noise.
5. The method of claim 4, further comprising expressing the mapping as a relation between characteristic functions of the amplitudes, the spectrum sensitive statistics and the modelled noise.
6. The method of claim 5, further comprising computing the characteristic functions of the spectrum sensitive statistics by applying an inverse Fourier transform to a histogram of the sum of the digital observations.
7. The method of claim 5, further comprising computing the characteristic functions of the amplitudes with a low pass filter.
8. The method of claim 2, further comprising selecting each of the plurality of time intervals to encompass zero or more approximately entire clusters of the pulses, and defining the plurality of time intervals to be nonoverlapping and have a constant length L.
9. The method of claim 8, further comprising requiring a maximum value of the detector signal at a beginning and end of each time interval.
10. The method of claim 8, further comprising defining the approximate compound Poisson process as a sum of the amplitudes of the pulses in each time interval.
11. (canceled)
12. (canceled)
13. The method of claim 2, further comprising selecting the plurality of intervals to include: a first set of nonoverlapping time intervals of constant length L without regard to entirety of clusters of the pulses; and a second set of nonoverlapping time intervals of constant length L1 less than L also without regard to entirety of clusters of the pulses; wherein L is at least as long as a duration of the pulses.
14. The method of claim 13, further comprising selecting L1 to be less than the duration of the pulses.
15. (canceled)
16. (canceled)
17. The method of claim 1, further comprising using a data driven strategy selected to result in a near optimal choice for a kernel parameter which minimizes an integrated square of errors of an estimated probability density function of the energies of the individual quanta of radiation.
18. A method of estimating count rate of individual quanta of radiation received in a radiation detector, the method comprising the steps of: (1) obtaining a time series of digital observations from the radiation detector comprising pulses corresponding to the detection of the individual quanta; (2) computing spectrum sensitive statistics from the detector signal, based on a sum of the digital observations over a plurality of time intervals, the spectrum sensitive statistics defining a mapping from a density of amplitudes of the pulses to the spectrum sensitive statistics using an approximate compound Poisson process, the plurality of time intervals including: a first set of nonoverlapping time intervals of constant length L selected without regard to entirety of clusters of the pulses; and a second set of nonoverlapping time intervals of constant length L1 less than L also selected without regard to entirety of clusters of the pulses; wherein L is at least as long as a duration of the pulses; (3) determining an estimate of a characteristic function of the approximate compound Poisson process using:
19. The method of claim 18, further comprising estimating the count rate by using an optimization routine or other means to fit a curve, estimating a DC offset of a logarithm of the estimate of the characteristic function, or fitting a curve to the logarithm of the estimate of the characteristic function.
Description
3 DERIVATION OF ESTIMATOR OF THE FIRST EMBODIMENT
[0015] The general approach we take to addressing pile-up is based on the following strategy; i) obtain statistics from s(t) that are sensitive to the distribution of incident photon energies, and estimate those statistics using the observed, finite-length sampled version of s(t), ii) obtain a mapping from the density of incident photon energies to the statistical properties of the observed statistics, iii) estimate the density of the incident photon energies by inverting the mapping. Section 3.1 describes our choice of statistics. Section 3.2 argues that these statistics (approximately) have the same distribution as a compound Poisson process. Section 3.3 introduces a decompounding technique for recovering the spectrum from these statistics. It is based on the decompounding algorithm in [18] but further developed to obtain near optimal performance in terms of the integrated square of error. Our approach to the pile-up problem follows the general theme of finding some statistics of s(t) that are sensitive to the underlying spectrum, estimating these statistics from a finite-time sampled version of s(t), then inverting the map that describes the statistical properties of these statistics given the underlying spectrum, thereby producing an estimate of the spectrum.
3.1 Choice of Statistic
[0016] We wish to obtain estimates of the photon energies from the observed signal given in (2). In typical modern spectroscopic systems, the detector output s(t) is uniformly sampled by an ADC. Without loss of generality, we assume the raw observations available to the algorithm are {s(k): k∈.sub.≥0}. Since identification of individual pulses can be difficult, we look instead for intervals of fixed length L∈
.sub.>0 containing zero or more clusters of pulses. Precisely, we define these intervals to be [T.sub.j, T.sub.j+L) where
[0017] Here, ϵ is chosen as a trade-off between errors in the energy estimate and the probability of creating an interval. The value of ϵ should be sufficiently small to ensure the error in the estimate of total photon energy arriving within each interval is acceptably low, yet sufficiently large with respect to the noise variance to ensure a large number of intervals are obtained. Although the probability of partitioning the observed data into intervals approaches zero as the count-rate goes to infinity, this approach succumbs to paralysis at higher count-rates compared to pile-up rejection strategies based on individual pulses, since multiple photons are permitted to pile-up within in each interval. Section 4.2 describes the selection of L and ϵ for real data. Each interval contains an unknown, random number of pulses and may contain zero pulses.
[0018] We estimate the total photon energy x.sub.j in the interval [T.sub.j, T.sub.j+L) using the sampled raw observations. Since the area under each pulse is proportional to the photon energy A.sub.j defined in (1), we let
[0019] The number of photon arrivals, the energy of each arriving photon and the detector output noise in each interval [T.sub.j, T.sub.j+L) are assumed to be random and independent of other intervals. For pulse shapes with exponential decay, a small amount of the photon energy arriving in an interval may be recorded in the next interval. The amount of leakage is proportional to ϵ, and is negligible for sufficiently small ϵ. Consequently, the estimates x.sub.1, x.sub.2, . . . may be treated as the realization of a weakly-dependent, stationary process where each estimate is identically distributed according to the random variable X. This relationship is illustrated in
3.2 Approximation with Compound Poisson Process
[0020] In this subsection we describe the distribution of X in terms of ƒ.sub.A(x). We will then invert this in section 3.3, to obtain an estimator for the density ƒ.sub.A(x). Using (9), (2), (1) and the fact that (t) is causa we have
[0021] As justified below, this simplifies to
[0022] Both y.sub.j and z.sub.j are i.i.d. sequences of random variables. We denote their distributions by Y and Z. The distribution of Z is fully determined from the distribution of w(t), which is assumed zero-mean Gaussian with known variance σ.sup.2. Moreover, Y is a compound Poisson process since the number of terms in the summation (number of photon arrivals in an interval of length L) has Poisson statistics. Equations (11)-(13) are justified as follows. The first term of (10) represents leakage from earlier intervals and is approximately zero. This is easily shown for Gaussian noise by performing a Taylor expansion about ϵ=0
[0023] Thus there is a finite but small probability that some energy belonging to a previous interval will be included in the current estimate. In practice, this contribution is comparable to the noise for sufficiently small E. The third term in (10) is zero since (t) is causal. The second term in (10) can be written as
where we assume the pulse shapes (t) are sufficiently smooth such that
It approximates the total energy of all the photons arriving in the interval [T.sub.j, T.sub.j+L). Let ν.sub.j designate the number of photon arrivals in the interval [T.sub.j, T.sub.j+L). We assume ν.sub.j is a realization of a homogeneous Poisson process with rate parameter λ, where λ is expressed in terms of the expected number of photons per interval of length L. Henceforth we shall assume that (11) holds exactly, and write
[0024] Finally, we write x.sub.j as
where we assume Z has known variance σ.sup.2. In this subsection we model the statistic of section 3.1 using a compound Poisson process. This allows us to derive an estimator for the density ƒ.sub.A(x) in terms of observable quantities. The number of photons arriving in the interval [T.sub.j, T.sub.j+L) is a Poisson random variable which we designate ν.sub.j. The total energy in the interval Y can be modelled as a compound Poisson process i.e.,
where .sub.j,1=min{
: T.sub.j≤τ
<T.sub.j+L} is the index of the first-photon arrival time in the interval, the arrival times are assumed ordered, and
representing photon energy are independent realizations of the random variable A with density function ƒ.sub.A(x). The {ν.sub.j} form a homogeneous Poisson process with rate parameter λ. The Poisson rate λ is expressed in terms of the expected number of photons per interval of length L.
[0025] The relationship between realizations of Y and the sampled detector response is illustrated in by substituting (2) into (9),
where z.sub.j is the realization of the unobservable random variable Y that represents the photon energy in an interval of the discrete-time detector response,
[0026] where z.sub.j is a realization of Z, an independent random variable representing errors in the sampling process and estimation of T.sub.j. We assume Z has known variance σ.sup.2. With these definitions of X and Y, the number of intervals which can be found in a finite length of detector output is a random variable N. At high count-rates this approach succumbs to paralysis, as the probability of being able to partition the observed data into intervals approaches zero. The onset of paralysis occurs at higher count-rates compared to pile-up rejection based strategies, since multiple photons are permitted to pile-up within in each interval. Assume the time-series defined in (3)-(6) has been sampled uniformly. Without loss of generality, assume unit sample intervals beginning at t.sub.0=0 i.e., t.sub.k=k, 0≤k<K. Let R be a discrete-time random process representing the sampled detector response of (1). Let Y={Y.sub.j: 0≤j<N} be a discrete-time random process whose components Y.sub.j represent the total photon energy arriving during a fixed time interval. A compound Poisson process can be used to model Y, i.e.,
where ν.sub.j is an independent Poisson random variable, and A.sub.k are independent identically distributed random variables with density function ƒ.sub.A(x). The {ν.sub.j} form a homogeneous Poisson process with rate parameter λ. The process Y is not directly observable. Assume the pulse shape Φ(t) has finite support. Let .sub.A(t) be the indicator function for the set A. Let the pulse length
be given by
=sup({t: Φ>0})−inf({t: Φ(t)>0}). Let S=R+W={s(k):0≤k<K} be a discrete-time random process representing the observed detector output given by (2). It consists of the detector response R corrupted by a noise process W. Without loss of generality, we assume unit sample intervals. From the observations S we form the process X, where
and where Z.sub.j is a random variable from an independent noise process of known variance σ.sup.2. A simple model for testing theory is obtained when we let the pulse shape Φ(t)=.sub.(0,1)(t) in (1), in which case we let X.sub.j=S.sub.j, and N is simply the sample length K. Obtaining X.sub.j from S is more complicated for real data. In that case we partition the process S into non-overlapping blocks of length L, where L>
. The Poisson rate λ is expressed in photons per block. The start of each block T.sub.j∈
is chosen such that the total energy of any pulse is fully contained within the block in which it arrives
We let
[0027]
where {circumflex over (T)}.sub.j is an estimate of T.sub.j. Section 4.2 describes the selection of L and ϵ for real data. With this definition of X.sub.j, the number of components in Y becomes a random variable for a given sample length K. At high count-rates this approach succumbs to paralysis, as the probability of being able to create a block approaches zero. The onset of paralysis occurs at higher count-rates compared to pile-up rejection based strategies, since multiple photons are permitted to pile-up within in each block. Let Y={Y.sub.j: 0≤j<N} be a discrete-time random process whose components Y.sub.j are given by
[0028] where L∈ is a constant chosen such that h and d is a small threshold value close to zero. The random variable Y.sub.j thus represents the total photon energy arriving during a fixed time interval of length L. The value of d ensures the signal associated with photon arrivals is very small at the start and end of each interval. This is illustrated in
where ν.sub.λ is a homogeneous Poisson process with rate parameter λ, and A.sub.k are independent identically distributed random variables with density function ƒ.sub.A(x). Let S=R+W be a discrete-time random process representing the sampled detector output given by (2). It consists of the detector response R corrupted by a noise process W. The process Y is not directly observable. Using (2), (25) and (32), we model observations by the process X={X.sub.j: 0≤j<N}, i.e.,
3.3 Basic Form of Estimator
[0030] We seek to invert the mapping from the distribution of photon energy A to the distribution of X. Our strategy is to first obtain the characteristic function of X in terms of ƒ.sub.A, then invert the mapping assuming the count-rate and noise characteristics are known. Let ϕ.sub.X, ϕ.sub.Y, ϕ.sub.Z, ϕ.sub.A be the characteristic functions of X, Y, Z, A. It is well known [15] that for the compound Poisson process Y with rate λ,
and since X=Y+Z then
[0031] Given the observations x.sub.j we can form an empirical estimate {circumflex over (ϕ)}.sub.X of the characteristic function of X. Treating this as the true characteristic function, we can invert (40), (41) to obtain the characteristic function of A and then take the Fourier transform to find the amplitude spectrum ƒ.sub.A. Specifically, using (40), (41) and exploiting the assumption that Z is Gaussian to ensure ϕ.sub.Z(u) will be non-zero ∀u∈, we let γ:
.fwdarw.
be the curve described by
[0032] Temporarily assuming ∀u, γ(u)≠0, after taking the distinguished logarithm of (43) and rearranging we have
[0033] Ideally, ƒ.sub.A is recovered by taking a Fourier transform
[0034] The basic form of our proposed estimator is given in (88) and is derived from (45) via a sequence of steps. First, ϕ.sub.X is estimated from the data (Step 1). Simply substituting this estimate for ϕ.sub.X in (42) does not produce an ISE optimal estimate of γ. The approximate ISE is obtained from an approximate estimate of the error distribution of ϕ.sub.X (Step 2). We then determine a sensible windowing function G(u) (in Step 3) and estimate γ by
[0035] The windowing function G(u) is designed to minimise the approximate ISE between ƒ.sub.A and our estimate of ƒ.sub.A based on (44), (45) and (46), but with γ in (44) replaced by (46). A similar idea is used for estimating ϕ.sub.A from (44): a weighting function H(u) is found (in Step 4) such that replacing ϕ.sub.A in (45) by
produces a better estimate of ƒ.sub.A than using the unweighted estimate 1/λd log ({circumflex over (γ)}). Finally, the weighting function H(u) is modified (in Step 5) to account for the integral in (45) having to be replaced by a finite sum in practice. The following subsections expand on these five steps.
3.4 Estimating ϕ.SUB.X
[0036] An estimate of ϕ.sub.X(u) is required to estimate γ(u). In this subsection we define a histogram model and describe our estimation of ϕ.sub.X(u) based on a histogram of the x.sub.j values. Assume N intervals (and corresponding x.sub.j values) have been obtained from a finite length data sample. Although the empirical characteristic function
provides a consistent, asymptotically normal estimator of the characteristic function [21], it has the disadvantage of rapid growth in computational burden as the number of data points N and the required number of evaluation points u∈ increases. Instead, we use a histogram based estimator that has a lower computational burden. Assume that a histogram of the observed X values is represented by the 2M×1 vector n, where the count in the mth bin is given by
[0037] All bins of the histogram have equal width. The bin-width is chosen in relation to the magnitude of the x.sub.j values. Since the effect of choosing a different bin width is simply equivalent to scaling the x.sub.j values, we assume the bin-width to be unity without loss of generality. The bins are apportioned equally between non-negative and negative data values. The number of histogram bins 2M influences the estimator in various ways, as discussed in later subsections. For now, it is sufficient to assume that 2M is large enough to ensure the histogram includes all x.sub.j values. We estimate ϕ.sub.X(u) by forming a histogram of scaled x.sub.j values and take the inverse Discrete Fourier transform i.e.,
[0038] This is a close approximation of the empirical characteristic function but where x.sub.j terms have been rounded to the nearest histogram bin centre (and u contracted by a factor of 2π). The term n.sub.m simply counts the number of rounded terms with the same value. Clearly, this function can be efficiently evaluated at certain discrete points u∈−M, . . . , M−1 using the fast Fourier Transform (FFT).
3.5 Error Distribution of {circumflex over (ϕ)}.SUB.X
[0039] The design of the filters G(u) and H(u) in (46) and (47) rely on the statistics of the errors between {circumflex over (ϕ)}.sub.X and the true characteristic function. In this subsection we define and describe the characteristics of these errors. We assume the density function ƒ.sub.X is sufficiently smooth (i.e., |d.sup.nƒ.sub.X(u)/du.sup.n|≤C.sub.n∈∀n∈
) and that the width of the histogram bins are sufficiently small (relative to the standard deviation of the additive noise Z) such that the errors introduced by rounding x.sub.j values to the centre of each histogram bin are approximately uniformly distributed across each bin, have zero mean and are small relative to the peak spreading caused by Z. In other words, the source of error arising from the binning of x.sub.j values is considered negligible. Due to both the statistical nature of Poisson counting and the expected count in each bin being non-integer (
[n.sub.m]∈
.sub.≥0), discrepancies exist between the observed number of counts in any given histogram bin and the expected number of counts for that bin. We combine these two sources of error in our model and refer to it as ‘histogram noise’. We emphasize that this noise is distinct from the additive noise Z modelled in (11), which causes peak spreading in the histogram. Let the probability that a realization of X falls in the m-th bin be
[0040] Let the normalized histogram error ϵ.sub.m in the m-th bin be the difference between the observed count n.sub.m and the expected count [n.sub.m]=Npx.sub.m in the mth bin, relative to the total counts in the histogram N i.e.,
[0041] Using (50), (51) and (52) we have
[0042] If the histogram is modelled as a Poisson vector, show that
[0043] Since the characteristics of the histogram noise can be expressed in terms of the total number of observed intervals N, the impact of using observation data of finite length may be accounted for by incorporating this information into the design of G(u) and H(u).
3.6 Estimating γ
[0044] Having obtained {circumflex over (ϕ)}.sub.X, the next task is to estimate γ. Rather than substitute {circumflex over (ϕ)}.sub.X(u) for ϕ.sub.X(u) in (42), we instead use (46) as the estimator, which requires us to choose a windowing function G(u). In this subsection we attempt to find a function G(u) that is close to optimal. When the distribution of errors in {circumflex over (ϕ)}.sub.X(u) are considered, the windowing function G(u)=G.sub.opt(u) that results in the lowest ISE estimator of the form given in (46) is
where {z} denotes the real component of z∈
. We cannot calculate G.sub.opt(u) since ϕ.sub.A(u) is unknown, so instead we attempt to find an approximation. We let
[0045] This is justified by considering the magnitude of the relative error between the functions g.sub.opt(u) and g.sub.1(u) where
[0046] The magnitude of the relative error is given by
[0047] Since {ϕ.sub.A}∈[−1,1], we see the right hand side of (63) is maximum when
{ϕ.sub.A(u)}=−1. The relative error is thus bound by
which justifies the approximation when λ is small, or when N|ϕ.sub.Z|.sup.2(u)>>e.sup.4λ. Furthermore, we note that the above bound is quite conservative. The distribution of photon energies in spectroscopic systems can typically be modelled as a sum of K Gaussian peaks, where the kth peak has location μ.sub.k and scale σ.sub.k i.e.,
[0048] Consequently, the characteristic function will have the form
i.e., oscillations within an envelope that decays as e.sup.−cu.sup.{ϕ.sub.A}|>>1 for most values of u. The approximation error will be significantly smaller at most evaluation points across the spectrum. Having chosen G(u), we can form an estimate of γ using (46). The windowing function reduces the impact of histogram noise arising from the finite number of data samples. For large values of Ne.sup.−2λ|ϕ.sub.Z(u)|.sup.2, the impact of windowing is negligible and the estimator is essentially the same as using (42) directly. However, in the regions where
the windowing becomes significant, and acts to bound our estimate of γ i.e., Using the fact that the noise Z is Gaussian (so ϕ.sub.Z(u)∈ and hence |ϕ.sub.Z|.sup.2=ϕ.sub.Z.sup.2), and since e.sup.−2λ>0 we see that
[0049] This ensures the argument to the distinguished logarithm in (47) remains finite even though lim.sub.u.fwdarw.∞ϕ.sub.Z(u)=0.
3.7 Estimating ϕ.SUB.A
[0050] Once {circumflex over (γ)} has been obtained, we proceed to estimate ϕ.sub.A using (47). This requires another windowing function H(u). In this subsection we find a function H(u) for estimating ϕ.sub.A that is close to ISE optimal. We begin by defining a function ψ(u) for notational convenience
[0051] The ISE is minimized when H(u)=H.sub.opt(u), where the optimal filter H.sub.opt(u) is given by
[0052] Again, we cannot calculate the optimal filter by using (73)-(74) since ϕ.sub.X(u), ϕ.sub.A(u) and ϕ.sub.ϵ(u) are unknown. We instead make the following observations to obtain an approximation of the ISE-optimal filter.
3.7.1 Initial Observations
[0053] The optimal filter remains close to unity as long as the estimated {circumflex over (ϕ)}.sub.A(u) remains close to the true value of ϕ.sub.A(u). This will invariably be the case for small values of u since
[0054] Furthermore, equation (73) shows that if |ϕ.sub.ϵ(u)|≤≤|ϕ.sub.X(u)|, then {circumflex over (ϕ)}.sub.X(u)=ϕ.sub.X(u)+ϕ.sub.ϵ(u)≈ϕ.sub.X(u) so H.sub.opt(u)≈1. For larger values of u, when the magnitude of |ϕ.sub.X(u)| becomes comparable to or less than |ϕ.sub.ϵ(u)|, the estimator
is dominated by noise and no longer provides useful estimates of ϕ.sub.A(u). In the extreme case |ϕ.sub.X(u)|<<ϕ.sub.ϵ(u)| so |{circumflex over (ϕ)}.sub.X(u)|≈|ϕ.sub.ϵ(u)| and hence
[0055] The window H(u) should exclude these regions from the estimate, as the bias introduced in doing so will be less than the variance of the unfiltered noise. Unfortunately the estimate of ϕ.sub.A(u) can be severely degraded well before this boundary condition is reached, so (77) is not particularly helpful. A more useful method for detecting when noise begins to dominate is as follows.
3.7.2 Filter Design Function
[0056] Further manipulation of (67) shows that for typical spectroscopic systems, the magnitude of ϕ.sub.A will have the form
i.e.; a mean component that decays according to the peak widths σ.sub.k, and a more rapidly decaying oscillatory component that varies according to the location of the spectral peaks μ.sub.k. In designing the window H(u), we are interested in attenuating the regions in |{circumflex over (ϕ)}.sub.A| where |ϕ.sub.A|.sup.2≲|ϕ.sub.ϵ/ψ|.sup.2, i.e., where the signal power is less than the histogram noise that has been enhanced by the removal of ϕ.sub.Z during the estimation of γ. To obtain an estimate of |ϕ.sub.A|, a low-pass, Gaussian shaped filter H.sub.lpf(u) is convolved with |{circumflex over (ϕ)}.sub.A| to attenuate all but the slowly varying, large scale features of |{circumflex over (ϕ)}.sub.A|. We denote this |{circumflex over (ϕ)}.sub.Asmooth|(u)
[0057] We see that |ϕ.sub.ϵ(u)| has a Rayleigh distribution with scale parameter
Consequently
[0058]
[0059] It is well known that the cumulative distribution function of a Rayleigh distributed random variable X.sub.Ray is given by
[0060] Hence, to assist with computing the window H(u) we will make use of the function
to control the shape of H(u). The function α.sub.min(u) provides an indication of how confident we can be that the estimate {circumflex over (ϕ)}.sub.A(u) contains more signal energy than noise energy. The approximation in (84) arises from the fact that |{circumflex over (ϕ)}.sub.Asmooth| is also a random variable slightly affected by the noise ϵ. On occasion—particularly for larger values of |u|—the histogram noise may result in sufficiently large values of α.sub.min(u) to give a false sense of confidence, and potentially allow noisy results to corrupt the estimate of ϕ.sub.A. To overcome this problem, the function was modified to be uni-modal in u
[0061] This modification was justified on the assumption that Gaussian noise causes ϕ.sub.Z(u) to be decreasing in |u|. Consequently we expect [|ϕ.sub.ϵ(u)|/ψ(u)] to be increasing in |u|. If we ignore the local oscillations in ϕ.sub.A(u) that are due to peak locations in ƒ.sub.A(x), the envelope approximated by the smoothed |ϕ.sub.Asmooth|(u) will be non-increasing in |u|. Equation (74) indicates the optimal window has the form λϕ.sub.A(u)/(λϕ.sub.A(u)+d log({circumflex over (ϕ)}.sub.X/ϕ.sub.X)(u)+d log(G)(u), so the overall window shape will be decreasing in |u|. Hence, if the estimated characteristic function in the region of some u.sub.0 (where the signal to noise ratio is high) has determined that the window value should be H(u.sub.0)<1, then it is reasonable to reject the suggestion that in the region u.sub.1>u.sub.0 (where the signal to noise ratio will be worse) that H(u.sub.1)>H(u.sub.0). Using the knowledge that |H.sub.opt(u)| should be close to unity for small |u|, close to zero for large |u|, and should ‘roll off’ as the signal-to-noise-ratio decreases—we consider two potential windowing functions as approximations of H.sub.opt(u).
3.7.3 Rectangle Window
[0062] The indicator function provides a very simple windowing function
[0063] The threshold value α.sub.0 determines the point at which cut-off occurs, and can be selected manually as desired (e.g., α.sub.0=0.95). Once the threshold is chosen, the estimator exhibits similar ISE performance regardless of peak locations in the incident spectra. Rather than requiring the user to select a window width depending on the incident spectrum.sup.1, the width of the window is automatically selected by the data via α.sub.mod(u). While simplicity is the primary advantage of the rectangular window, the abrupt transition region provides a poor model for the roll-off region of the optimal filter. The second filter shape attempts to improve on that. .sup.1Gugushvili [18] proposed a non-parametric estimator for the general decompounding problem in which a rectangular windowing scheme was used. It requires manual selection of window width, which varies with ϕ.sub.A.
3.7.4 Logistic Window
[0064] A window based on the logistic function attempts to model smoother roll-off. It is given by
where α.sub.0 again acts as a threshold of acceptance of the hypothesis that the signal energy is greater than the noise energy in the estimate {circumflex over (ϕ)}.sub.A(u). The rate of filter roll off in the vicinity of the threshold region is controlled by β.sub.0>0. This provides a smoother transition region than the rectangle window, reducing Gibbs oscillations in the final estimate of ϕ.sub.A. Once again, although the parameters α.sub.0, β are chosen manually, they are much less dependent on ϕ.sub.A and can be used to provide close to optimal filtering for a wide variety of incident spectra. Typical values used were α.sub.0=0.95, β.sub.0=40.0. The performance of the rectangle and logistic window functions are compared in section 4.
3.8 Estimating ƒ.SUB.A
[0065] Having designed a window function H(u) and thus an estimator {circumflex over (ϕ)}.sub.A(u), the final task is to estimate ƒ.sub.A(x) by inverting the Fourier transform. This sub-section describes several issues that arise with numerical implementation. Firstly, it is infeasible to evaluate {circumflex over (ϕ)}.sub.X, {circumflex over (γ)}(u) and {circumflex over (ϕ)}.sub.A numerically on the whole real line. Instead we estimate it at discrete points over a finite interval. The finite interval is chosen sufficiently large such that a tolerably small error is incurred as a result of excluding signal values outside the interval. This is justified for ƒ.sub.A(x) being a Gaussian mixture, since the magnitudes of ϕ.sub.X and ϕ.sub.A will decay as e.sup.−cu.sup.}. Secondly, the distinguished logarithm in (47) is undefined if {circumflex over (γ)}(u)=0. In estimating γ(u) from the data, there is a small but non-zero probability that the estimate will be zero. In this case, the distinguished logarithm in (47) is undefined and the technique fails. As |u| increases, |ϕ.sub.X|(u) decreases and may approach |ϕ.sub.ϵ|(u). When |ϕ.sub.X|(u) and |ϕ.sub.ϵ|(u) have similar magnitudes, the probability of |ϕ.sub.X+ϕ.sub.ϵ| (and hence {circumflex over (γ)}) being close to zero can become significant. The filter H(u) should roll off faster than |ϕ.sub.X|(u) approaches |ϕ.sub.ϵ|(u) to reduce the impact this may have on the estimate. Ideally H(u) should be zero in regions where noise may result in |{circumflex over (γ)}|(u) being close to zero. Gugushvili has shown [18] that for a rectangular window, the probability of inversion failure approaches zero as the length of the data set increases N.fwdarw.∞.
3.9 Discrete Notation
[0066] We digress momentarily to introduce additional notation. Throughout the rest of the paper, bold font will be used to indicate a 2M×1 vector corresponding to a discretely sampled version of the named function, e.g., {circumflex over (ϕ)}.sub.A represents a 2M×1 vector whose values are given by the characteristic function {circumflex over (ϕ)}.sub.A(u) evaluated at the points u∈{0, 1, . . . , M−1, −M, . . . , −2, −1}. Square bracket notation [k] is used to index a particular element in the vector, e.g., {circumflex over (ϕ)}.sub.A[M−1] has the value of {circumflex over (ϕ)}.sub.A(M−1). We also use negative indexes for accessing elements of a vector in a manner similar to the python programming language. Negative indexes should be interpreted relative to the length of the vector, i.e., {circumflex over (ϕ)}.sub.A[−1] refers to the last element in the vector (which is equivalent to {circumflex over (ϕ)}.sub.A[2M−1]).
3.10 Summary of Estimator
[0067] The estimation procedure we use may be summarized in the following steps. [0068] 1. Partition sampled time series into intervals using (8). [0069] 2. Calculate x.sub.j value for each interval according to (9). [0070] 3. Generate histogram n from the x.sub.j values. [0071] 4. Calculate {circumflex over (ϕ)}.sub.X using the inverse FFT to efficiently evaluate (50) at various sample points. [0072] 5. Calculate ϕ.sub.Z and G at the appropriate points. [0073] 6. Calculate {circumflex over (γ)} via (46) using {circumflex over (ϕ)}.sub.X, G and ϕ.sub.Z. [0074] 7. Calculate |ϕ.sub.Asmooth(u)|, a low-pass filtered version of
3.11 Performance Measures
[0079] The performance of the estimator is measured using the integrated square of the error (ISE). The ISE measures the global fit of the estimated density.
[0080] The discrete ISE measure is given by
where p.sub.A is a 2M×1 vector whose elements contain the probability mass in the region of each histogram bin i.e.,
[0081] The vector {circumflex over (p)}.sub.A represents the corresponding estimated probability mass vector.
4 NUMERICAL RESULTS OF THE FIRST EMBODIMENT
[0082] Experiments were performed using simulated and real data.
4.1 Simulations
[0083] The ideal density used by Trigano et al. [11] was used for these simulations. It consists of a mixture of six Gaussian and one gamma distribution to simulate Compton background. The mixture density is given by
where (μ, σ.sup.2) is the density of a normal distribution with mean μ and variance σ.sup.2. The density of the gamma distribution is given by g(x)=(0.5+x/200)e.sup.−(0.5+x/200). The density was sampled at 8192 equally spaced integer points to produce the discrete vector p.sub.A of probability mass. The FFT was taken to obtain ϕ.sub.A, a sampled vector of ϕ.sub.A values. [0084] A particular count rate λ was chosen for an experiment, corresponding to the expected number of events per observation interval. The expected pile-up density was obtained via (40). i.e., the discrete vector ϕ.sub.A was scaled by λ, exponentiated, then scaled by e.sup.−λ and finally an FFT was applied
[0085] Equation (93) was convolved with a Gaussian to simulate the effect of noise Z smearing out the observed spectrum
[0086] This represents the expected density of the observed spectrum, including pile-up and additive noise. Observation histograms were created using random variables that were distributed according to (94). Experiments were parameterized by the pair (N, λ) where N∈{10.sup.4, 10.sup.5, 10.sup.6, 10.sup.7, 10.sup.8, 10.sup.9} and λ∈{1.0, 3.0, 5.0}. For each parameter pair (N, λ), one thousand observed histograms were made. Estimates of the probability mass vector p.sub.A were made using (88) with both (86) and (87) used for H(k). A threshold value of α.sub.0=0.95 was used for both window shapes, and β.sub.0=40.0 for the logistic shape. The discrete ISE measure of the error between each estimate {circumflex over (p)}.sub.A and the true vector p.sub.A were recorded. For comparison with asymptotic bandwidth results, estimates were made using a rectangular window whose bandwidth was selected according to the condition 1.3 specified by Gugushvili in [18] i.e., h.sub.N=(ln N).sup.−β where β<½. We emphasize that the β of Gugushvili's filter is not to be confused with the β.sub.0 of (87). The asymptotic bandwidth criterion was implemented by using
[0087] Three values for Gugushvilli's β were trialed, namely β=½, ⅓, ¼. [0088] Estimates were also made using a rectangular filter (95) with fixed bandwidths of various values α/M∈[0.2, 0.4, 0.6, 0.8]. Finally time-series data was created according to (1) with an idealised rectangular pulse shape and 10.sup.7 pulses whose energies were distributed according to (92). The pulse length and count rate were chosen to give a Poisson rate Δ=1.0. The algorithm described by Trigano et al. [11] was used to estimate the underlying amplitude density from a bi-dimensional histogram containing 32×1024 (duration×energy) bins—this choice of bins reportedly giving the best accuracy and reasonable execution times. The performance and processing time of the core algorithm were recorded for comparison with our proposed algorithm.
TABLE-US-00001 TABLE 1 Comparison With Algorithm Described in [11] Algorithm Avg. ISE Avg. Time (sec) Fast Trigano Algorithm 1.3 × 10.sup.−5 3.19 32 × 1024 (duration × energy) bins Proposed Algorithm .sup. 1 × 10.sup.−5 0.019
4.2 Real Data
[0089] The estimator was applied to real data to assess its usefulness in practical applications. The threshold value ϵ found in (8) was chosen to be approximately one half the standard deviation of the additive noise w(t). This ensured a reasonably high probability of creating intervals, yet ensured errors in the estimation of interval energy were low. A value for the interval length L was chosen approximately four times the ‘length’ of a typical pulse—that is, four times the length of the interval {t: Φ(t)>ϵ}. An energy histogram was obtained from a manganese sample, with a photon flux rate of nominally 10.sup.5 events per second. A slight negative skew was present in the shape of the main peaks of the observed histogram, suggesting a complicated noise source had influenced the system. This is barely visible in (μ.sub.1, σ.sub.1.sup.2)+α.sub.2
(μ.sub.2, σ.sub.2.sup.2) to the noise peak located around bin index zero. A suitable value for λ was chosen manually. The logistic filter with data-driven bandwidth was used to estimate the true density.
5 SUMMARY OF THE FIRST EMBODIMENT
[0090] We have taken the estimator proposed by Gugushvili [18] for decompounding under Gaussian noise, and adapted it for correcting pulse pile-up in X-ray spectroscopy. We have proposed a data-driven bandwidth selection mechanism that is easily implemented, and provides significant reduction in ISE/MISE across a broad range of sample counts of interest to spectroscopic applications (10.sup.4˜10.sup.9 counts). The data-driven rectangular bandwidth selection is close to optimal (for rectangular filters), and over the range of interest outperforms bandwidth selection based on asymptotic results or fixed bandwidth. [0091] Although initial results appear promising, further work is required to improve the performance for practical implementations. The estimation still contains ‘ringing’ artefacts associated with the Gibbs phenomenon. Additional filter shape attempts to reduce this, there are other shapes that are closer to MSE optimal.
6 SECOND EMBODIMENT
[0092] This section gives a summary of the spectrum estimator of the 2nd embodiment. The 2nd embodiment solves the problem of the 1st embodiment which requires entire clusters to be approximately encompassed in each interval. In the 2nd embodiment, the entire data series if desired can be used, and the overlap is compensated by introduction of 2 different interval lengths L and L1. [0093] We need to include a few additional terms not mentioned in the first embodiment. In particular {circumflex over (ϕ)}.sub.X1. The spectrum estimator is based on
[0094] The introduction of the filter H(u; α) allows us to address several implementation issues that arise. The estimation procedure we use may be summarized in the following steps. [0095] 1. Partition sampled time-series into fixed length intervals [T.sub.j, T.sub.j+L), j∈ [0096] 2. Calculate x.sub.j value for each interval according to x.sub.j=Σ.sub.k∈[T.sub.
6.1 Algorithm Details
[0107] Partition the detector output stream into a set of non-overlapping intervals of length L i.e., [T.sub.j, T.sub.j+L), T.sub.0∈.sub.≥0, T.sub.j+1≥T.sub.j+L, j∈
.sub.≥0. Let x.sub.j be the sum of the detector output samples in the jth interval i.e.,
[0108] Assuming L is greater than a pulse length, the jth interval may contain ‘complete’ pulses as well as pulses which have been truncated by the ends of the interval. It can be shown that x.sub.j consists of a superposition of the energy of ‘complete’ pulses which we denote , the energies of truncated pulses which we denote with
.sub.1j and noise z.sub.j [0109] Let the detector output stream be partitioned into a second set of non-overlapping intervals [T.sub.1j, T.sub.1j+L.sub.1), T.sub.1,0∈
.sub.≥0, T.sub.1,j+1≥T.sub.1,j+L, j∈
.sub.≥0 where L.sub.1<L. Let x.sub.1j be given by
[0110] If L.sub.1 is chosen to be slightly less than the pulse length, the x.sub.1j term will contain no ‘complete’ pulses, but consist of a superposition of only the energies of truncated pulses y.sub.lj and noise z.sub.j. The number of truncated pulses in any interval has a Poisson distribution. We have
where Z.sub.0 represents noise in the regions where pulses are fully contained in the interval (a length of L−L.sub.1), and Z.sub.1 represents noise in the regions where pulses are truncated (a length of L.sub.1). Hence,
[0113] Rearranging gives
[0114] We can estimate ϕ.sub.X.sub.
6.2 Visualization of Internal Quantities
[0116] To aid the reader's understanding, [|ϕ.sub.ƒ(k)|]. It can be see that |{circumflex over (ϕ)}.sub.ƒ| provides a reasonably good estimate of |ϕ.sub.ƒ| in the region where |{circumflex over (ϕ)}.sub.fsmooth|>>
[|ϕ.sub.ϵ|/(λϕ.sub.Ze.sup.−λ)]. As these two quantities approach each other, the quality of the estimate deteriorates until it is eventually dominated by noise. The filter H(k) should include good estimates of |ϕ.sub.ƒ| while excluding poor estimates. To find the regions where good estimates of |ϕ.sub.ƒ| are obtained we address the question: Given
[|ϕ.sub.ϵ/(λϕ.sub.Ze.sup.−λ)], what is the probability that the calculated values of {circumflex over (ϕ)}.sub.ƒ in a local region arise largely from noise?
7 COUNT RATE ESTIMATION
[0117] The previous estimator assumed λ was known. An estimate of λ can be obtained without prior knowledge as follows. [0118] 1. Using {circumflex over (ϕ)}.sub.X, {circumflex over (ϕ)}.sub.X1, {circumflex over (ϕ)}.sub.Z, G from the previous section, calculate
is chosen to allow the curve fit to sufficient accuracy. The parameter λ provides an estimate of the count rate. The optimization engine is not required to give equal weighting to each point in Ψ.
8 DESCRIPTION OF FIGURES
[0124] The following figures are to aid understanding of the process.
REFERENCES
[0125] [1] G. F. Knoll, Radiation Detection and Measurement, 3rd Edition. New York: Wiley, 2000. [0126] [2] P. A. B. Scoullar and R. J. Evans, “Maximum likelihood estimation techniques for high rate, high throughput digital pulse processing,” in 2008 IEEE Nuclear Science Symposium Conference Record, pp. 1668-1672, October 2008. [0127] [3] M. Haselman, J. Pasko, S. Hauck, T. Lewellen, and R. Miyaoka, “FPGA-based pulse pile-up correction with energy and timing recovery,” IEEE Transactions on Nuclear Science, vol. 59, pp. 1823-1830, October 2012. [0128] [4] T. Petrovic, M. Vencelj, M. Lipoglavsek, R. Novak, and D. Savran, “Efficient reduction of piled-up events in gamma-ray spectrometry at high count rates,” IEEE Transactions on Nuclear Science, vol. 61, pp. 584-589, February 2014. [0129] [5] B. A. VanDevender, M. P. Dion, J. E. Fast, D. C. Rodriguez, M. S. Taubman, C. D. Wilen, L. S. Wood, and M. E. Wright, “High-purity germanium spectroscopy at rates in excess of 10.sup.6 events/s,” IEEE Transactions on Nuclear Science, vol. 61, pp. 2619-2627, October 2014. [0130] [6] Y. Sepulcre, T. Trigano, and Y. Ritov, “Sparse regression algorithm for activity estimation in spectrometry,” IEEE Transactions on Signal Processing, vol. 61, pp. 4347-4359, September 2013. [0131] [7] T. Trigano, I. Gildin, and Y. Sepulcre, “Pileup correction algorithm using an iterated sparse reconstruction method,” IEEE Signal Processing Letters, vol. 22, pp. 1392-1395, September 2015. [0132] [8] L. Wielopolski and R. P. Gardner, “Prediction of the pulse-height spectral distortion caused by the peak pile-up effect,” Nuclear Instruments and Methods, vol. 133, pp. 303-309, March 1976. [0133] [9] N. P. Barradas and M. A. Reis, “Accurate calculation of pileup effects in PIXE spectra from first principles,” X-Ray Spectrometry, vol. 35, pp. 232-237, July 2006. [0134] [10] T. Trigano, A. Souloumiac, T. Montagu, F. Roueff, and E. Moulines, “Statistical pileup correction method for HPGe detectors,” IEEE Transactions on Signal Processing, vol. 55, pp. 4871-4881, October 2007. [0135] [11] T. Trigano, E. Barat, T. Dautremer, and T. Montagu, “Fast digital filtering of spectrometric data for pile-up correction,” IEEE Signal Processing Letters, vol. 22, pp. 973-977, July 2015. [0136] [12] P. Ilhe, E. Moulines, F. Roueff, and A. Souloumiac, “Nonparametric estimation of mark's distribution of an exponential shot-noise process,” Electronic Journal of Statistics, vol. 9, no. 2, pp. 3098-3123, 2015. [0137] [13] P. Ilhe, F. Roueff, E. Moulines, and A. Souloumiac, “Nonparametric estimation of a shot-noise process,” in 2016 IEEE Statistical Signal Processing Workshop (SSP), pp. 1-4, June 2016. [0138] [14] C. McLean, M. Pauley, and J. H. Manton, “Limitations of decision based pile-up correction algorithms,” in 2018 IEEE Statistical Signal Processing Workshop (SSP), pp. 693-697, June 2018. [0139] [15] D. Snyder and M. Miller, Random Point Processes In Time And Space. New York: Springer-Verlag, 2, revised ed., September 2011. [0140] [16] B. Buchmann and R. Grübel, “Decompounding: An estimation problem for Poisson random sums,” Annals of Statistics, pp. 1054-1074, 2003. [0141] [17] B. van Es, S. Gugushvili, and P. Spreij, “Deconvolution for an atomic distribution,” Electronic Journal of Statistics, vol. 2, pp. 265-297, 2008. [0142] [18] S. Gugushvili, Non-Parametric Inference for Partially Observed Levy Processes. PhD, University of Amsterdam, Thomas Stieltjes Institute, 2008. [0143] [19] S. Said, C. Lageman, N. Le Bihan, and J. H. Manton, “Decompounding on compact Lie groups,” IEEE Transactions on Information Theory, vol. 56, pp. 2766-2777, June 2010. [0144] [20] B. van Es, S. Gugushvili, and P. Spreij, “A kernel type nonparametric density estimator for decompounding,” Bernoulli, vol. 13, pp. 672-694, August 2007. [0145] [21] J. Yu, “Empirical characteristic function estimation and its applications,” Econometric Reviews, vol. 23, pp. 93-123, December 2004.