LEARNING-BASED SUBSAMPLING
20170109650 · 2017-04-20
Inventors
- Volkan Cevher (Chavannes-Renens, CH)
- Yen-Huan LI (Ecublens, CH)
- Ilija Bogunovic (Lausanne, CH)
- Luca Baldassarre (Lausanne, CH)
- Jonathan Scarlett (Lausanne, CH)
- Baran Gözcü (Grandvaux, CH)
Cpc classification
G01R33/54
PHYSICS
H03M7/00
ELECTRICITY
H03M7/3059
ELECTRICITY
International classification
Abstract
The present invention concerns a method of sampling a test signal. The method comprises: acquiring (21) training signals sampled at a plurality of sampling locations; running (23) an optimization procedure for determining an index set of n indices, representing a subset of the sampling locations, that maximize a function, over the training signals, of a quality parameter representing how well a given training signal is represented by the n indices; and sampling (25) the test signal at the sampling locations represented by the n indices.
Claims
1. A method of sampling a test signal, the method comprising: acquiring training signals sampled at a plurality of sampling locations; running an optimization procedure for determining an index set of n indices, representing a subset of the sampling locations, that maximize a function, over at least some of the training signals, of a quality parameter representing how well a given training signal is represented by the n indices; and sampling the test signal at the sampling locations represented by the n indices.
2. The method according to claim 1, further comprising obtaining an estimate of the test signal based only on samples obtained at the sampling locations represented by the n indices.
3. The method according to claim 1, wherein the optimization procedure forms training signal estimates for at least some of the training signals, based only on samples obtained at sampling locations represented by a candidate index set to obtain the index set.
4. The method according to claim 1, wherein the optimization procedure maximizes an average of the quality parameters of the training signals.
5. The method according to claim 1, wherein the optimization procedure maximizes a minimum of the quality parameters of the training signals.
6. The method according to claim 1, wherein the quality parameter corresponds to a given training signal energy.
7. The method according to claim 1, wherein the quality parameter is derived from a mean-square error of the training signal estimates.
8. The method according to claim 1, wherein the optimization procedure defines an optimization problem for determining the index set, and wherein the optimization problem is solved by determining modularity and/or submodularity structures in the optimization problem.
9. The method according to claim 1, wherein the optimization procedure defines an optimization problem of the type: $\mathcal{A} \subseteq \{\Omega : |\Omega| = n\}$, $\psi_i$ is the i-th row of $\Psi$, which is an orthonormal basis matrix such that $\Psi \in \mathbb{C}^{p \times p}$, where p is a dimension of the test signal.
10. The method according to claim 1, wherein the optimization procedure defines an optimization problem of the type:
11. The method according to claim 1, further comprising applying one or more weighing factors in the optimization procedure to prioritize some training signals and/or some parts of the training signals.
12. The method according to claim 1, wherein sampling of the test signal is done non-uniformly.
13. The method according to claim 1, wherein substantially no sampling of the test signal is done outside the locations given by the n indices.
14. The method according to claim 1, wherein the data sample set comprises data samples from fully sampled training signals.
15. The method according to claim 1, wherein samples of the one or more training signals have the most energy substantially at the same locations as samples of the test signal.
16. The method according to claim 1, wherein the index set is used for sampling further test signals.
17. The method according to claim 1, wherein in the determination of the index set n, one or more constraints relating to at least one of the following are taken into account: sampling apparatus, settings of the sampling apparatus, test signal type and user.
18. The method according to claim 1, wherein in the determination of the index set n, one or more constraints are imposed in the optimization procedure.
19. A signal sampling apparatus for sampling a test signal, the apparatus comprising means for: acquiring training signals sampled at a plurality of sampling locations; running an optimization procedure for determining an index set of n indices, representing a subset of the sampling locations, that maximize a function, over at least some of the training signals, of a quality parameter representing how well a given training signal is represented by the n indices; and sampling the test signal at the sampling locations represented by the n indices.
20. The signal sampling apparatus according to claim 19, wherein the signal sampling apparatus is a linear signal sampling apparatus.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0016] Other features and advantages of the invention will become apparent from the following description of a non-limiting exemplary embodiment, with reference to the appended drawings, in which:
[0017]
[0018]
[0019]
[0020]
[0021]
DETAILED DESCRIPTION OF AN EMBODIMENT OF THE INVENTION
[0022] An embodiment of the present invention will now be described in detail with reference to the attached figures. Identical or corresponding functional and structural elements which appear in different drawings are assigned the same reference numerals.
[0023] The goal of the present invention is to design so-called principled learning-based algorithms for performing subsampling in compressive measurement systems. In the present invention, the term subsampling is understood to refer to a sampling process in which less than the whole test signal is sampled. Instead, only a portion of the signal is sampled at locations given by the indices of the index set $\Omega$. Currently, despite their importance, learning-based sampling techniques are largely unexplored. The present invention provides a data-driven approach in which a fixed index set $\Omega$ is chosen based on a set of training signals that are either fully known or already accurately known. In the present invention the word signal is to be understood broadly, and can be understood to be an analog signal or a discrete signal, such as a set of pixels or a data set.
[0024] The goal is to select indices of the index set $\Omega$ that perform well on unknown real test signals, also referred to as measurement signals, that are similar in some sense to the training signals. For example, the training signals can be considered to be similar to the test signal when most of the energy (for example in the Fourier domain) captured by samples of the training signals is substantially at the same sampling locations as the energy captured by samples of the test signal. Another notion of similarity is that in which the locations of the large coefficients have common clustering patterns. More generally, the similarity can be measured according to a quality measure, as defined below. The training signals may represent for instance the same body parts of human beings, such as knees, taken from different people. The real test signal would represent a patient's body part, e.g. the same body part as the training signals.
[0025] This invention relies on three aspects of the system being chosen in advance in a manner that may depend on the application at hand:
[0026] 1. The measurements that can be obtained are assumed to be restricted to some finite or infinite set from which subsamples are to be chosen. For example, in MRI systems, this set may be the set of all points in the Fourier domain. Other orthonormal bases include the Hadamard and Wavelet bases. Non-orthogonal designs are also possible; for example, if the measurements represent photon counts then the measurements may be based on inner products with positive-valued vectors, which are non-orthogonal when there is any overlap in the corresponding supports (i.e. locations of the non-zero entries).
[0027] 2. An estimator (also referred to as a decoder) is chosen that, given measurements of a subsampled signal, produces an estimate of that signal. Two specific examples are given below, and several more estimators are available in the existing literature. For example, estimators for compressive sensing include the least absolute shrinkage and selection operator, orthogonal matching pursuit, iterative hard thresholding, and iterative reweighted least squares.
[0028] 3. A quality measure, also referred to as a quality parameter, is chosen that, given the original signal and its estimate, produces a number reflecting how favorable the estimate is. For example, this number may equal the negative mean-square error between the two signals, so that a smaller mean-square error is considered more favorable. More generally, the quality can be the negative of any distance between the two signals, according to the formal mathematical definition of distance. The quality measure can also be a divergence measure between the two signals. Depending on the application, further choices may be preferable; for example, in comparing two audio signals, one may be primarily interested in quality measures based on amplitude but not phase, and such a measure will typically not correspond to a mathematical distance.
[0029] To determine the index set $\Omega$, combinatorial optimization problems are formed seeking to maximize the quality measure of the estimator when applied to the training signals, either in an average-case or worst-case sense, as explained later. In some cases, such optimization problems can be solved without explicitly applying the estimator to any of the training signals.
[0030] While combinatorial optimization problems are difficult in general, the present invention permits choices of objective functions that can be optimized by identifying beneficial discrete structures or properties, such as modularity or submodularity as explained later, that permit approximate, near-optimal, or exact solutions for the optimization problems to be obtained efficiently.
[0031] The block diagram of
[0032] In the block diagram of
[0033] Alternatively, it could be a digital signal or any set of data, such as a data matrix. In this example, the signal sampler 17 takes two inputs: the test signal and the optimized index set comprising optimized indices from the index set selector 13. Discrete samples obtained by the signal sampler 17 are configured to be fed to a signal estimator 19, also referred to as a decoder, configured to reconstruct from the discrete samples a signal, which one desires to be as close as possible to the original test signal. The estimated signal is thus ideally almost identical or substantially identical to the test signal.
[0034] A method of subsampling a test signal according to an embodiment of the present invention can be summarized by the flow chart of
[0035] We first present two optimization problems that can be considered in vast generality, even though the difficulty of solving these problems can vary depending on the choice of estimator and quality measure. Let a set of m training signals $x_1, \ldots, x_m$ be given, or if the training signals have not been fully sampled, let these be the best estimates obtained via state-of-the-art existing techniques. Suppose that the estimator 19, when applied to these training signals one-by-one for a certain index set $\Omega$, produces the m training signal estimates $\hat{x}_1(\Omega), \ldots, \hat{x}_m(\Omega)$, and suppose that the corresponding quality measures are $q(x_1, \hat{x}_1(\Omega)), \ldots, q(x_m, \hat{x}_m(\Omega))$.
[0036] The index selection scheme that maximizes the average quality is defined as
$$\hat{\Omega} = \arg\max_{\Omega \in \mathcal{A}} \frac{1}{m} \sum_{j=1}^{m} q(x_j, \hat{x}_j(\Omega)) \qquad (3)$$
where $\mathcal{A} \subseteq \{\Omega : |\Omega| = n\}$ is a family of cardinality-constrained subsets of $\{1, \ldots, p\}$ in which the selected index set is assumed to lie, n being typically much smaller than p. This set can simply be $\{\Omega : |\Omega| = n\}$ itself, or it can be further constrained to suit the application, as outlined below. The average can also be replaced by a weighted average, as outlined below.
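[0037] The index selection scheme that maximizes the minimal quality is defined as
$$\hat{\Omega} = \arg\max_{\Omega \in \mathcal{A}} \min_{j = 1, \ldots, m} q(x_j, \hat{x}_j(\Omega)) \qquad (4)$$
In words, this rule optimizes the quality for the worst case rather than the average case.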
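The average-quality and minimal-quality selection rules described above can be illustrated with a minimal sketch, assuming a problem small enough that every index set of size n can be enumerated. The helper names (`zero_fill_estimate`, `neg_squared_error`, `select_index_set`) and the toy training signals are hypothetical; the zero-filling estimator in the canonical basis is chosen only for illustration and is not prescribed by the description.

```python
from itertools import combinations

def zero_fill_estimate(x, omega):
    """Toy estimator (an assumption): keep the sampled entries, set the rest to zero."""
    keep = set(omega)
    return [v if i in keep else 0.0 for i, v in enumerate(x)]

def neg_squared_error(x, x_hat):
    """Quality measure: negative squared error, so larger values are better."""
    return -sum((a - b) ** 2 for a, b in zip(x, x_hat))

def mean(values):
    return sum(values) / len(values)

def select_index_set(signals, n, aggregate):
    """Exhaustively search all index sets of size n and return the best one.

    `aggregate` maps the per-signal qualities to a single score:
    `mean` gives the average-case rule, `min` gives the worst-case rule.
    """
    p = len(signals[0])
    best_omega, best_score = None, float("-inf")
    for omega in combinations(range(p), n):
        qualities = [neg_squared_error(x, zero_fill_estimate(x, omega))
                     for x in signals]
        score = aggregate(qualities)
        if score > best_score:
            best_omega, best_score = omega, score
    return best_omega

# Hypothetical training set: two signals concentrated on index 0, one on index 2.
training = [
    [2.0, 0.0, 0.0, 1.0],
    [2.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 2.0, 1.0],
]
omega_avg = select_index_set(training, 1, mean)  # favours the majority of signals
omega_min = select_index_set(training, 1, min)   # protects the worst-served signal
```

With a single sample allowed, the average-case rule picks the index that serves two of the three signals well, whereas the worst-case rule picks the index shared by all three. Exhaustive search is only viable for very small p; the structured cases discussed later avoid it.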
[0038] While it may appear that we require the estimator 19 to be applied to the training signals several times in order to solve these optimization problems, there are cases in which this is not necessary. We proceed by providing an example of a finite-dimensional and noiseless scenario in which it is indeed not necessary, even though the invention is also applicable to infinite-dimensional and noisy scenarios. We consider the measurement model
$$b = P_\Omega \Psi x \qquad (5)$$
for some orthonormal basis matrix $\Psi \in \mathbb{C}^{p \times p}$, and some subsampling matrix $P_\Omega$ whose rows are canonical basis vectors. We assume without loss of generality that $\|x_j\|_2 = 1$ for all j, since the signals can always be normalized to satisfy this condition.
[0039] While the focus in the present invention is on the selection of the index set $\Omega$, the recovery algorithm used by the signal estimator 19 also contributes towards achieving the goal. In this example, we consider a procedure that expands b to a p-dimensional vector by placing zeros in the entries corresponding to $\Omega^c$, and then applies the adjoint $\Psi^* = \Psi^{-1}$:
$$\hat{x} = \Psi^* P_\Omega^T b. \qquad (6)$$
[0040] It is to be noted that this is a linear decoder, and can be implemented highly efficiently even in large-scale systems for suitably structured matrices (e.g. Fourier or Hadamard). Another noteworthy feature of this estimator is that under the quality measure
$$q(x, \hat{x}) = -\|x - \hat{x}\|_2^2 \qquad (7)$$
that favors estimates that are close to the corresponding signal in the $\ell_2$-norm, the quality function is an increasing function of the energy captured in the index set, $\|P_\Omega \Psi x\|_2^2$. Indeed, since $\Psi$ is orthonormal and $\|x\|_2 = 1$, we have $\|x - \hat{x}\|_2^2 = 1 - \|P_\Omega \Psi x\|_2^2$. Thus, maximizing the quality measure amounts to maximizing the captured energy.
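The relation between the linear decoder and the captured energy can be checked numerically. The following is a minimal sketch, assuming the unitary discrete Fourier transform as the orthonormal basis; the function names, the signal values and the index set are hypothetical and chosen only for illustration.

```python
import cmath
import math

def unitary_dft_matrix(p):
    """Rows of the p x p unitary DFT matrix (one possible orthonormal basis)."""
    scale = 1.0 / math.sqrt(p)
    return [[scale * cmath.exp(-2j * math.pi * i * k / p) for k in range(p)]
            for i in range(p)]

def subsample(psi, x, omega):
    """Measurement model (5): keep only the basis coefficients indexed by omega."""
    return [sum(psi[i][k] * x[k] for k in range(len(x))) for i in omega]

def linear_decode(psi, b, omega):
    """Linear decoder (6): zero-fill the unsampled coefficients, apply the adjoint."""
    p = len(psi)
    coeffs = [0j] * p
    for value, i in zip(b, omega):
        coeffs[i] = value
    # Entry k of the adjoint applied to coeffs: sum_i conj(psi[i][k]) * coeffs[i]
    return [sum(psi[i][k].conjugate() * coeffs[i] for i in range(p))
            for k in range(p)]

p = 8
psi = unitary_dft_matrix(p)
x = [1.0, 2.0, 0.5, -1.0, 0.0, 3.0, -2.0, 1.5]
norm = math.sqrt(sum(v * v for v in x))
x = [v / norm for v in x]            # normalize to unit Euclidean norm
omega = [0, 1, 3, 7]                 # hypothetical index set with n = 4

b = subsample(psi, x, omega)
x_hat = linear_decode(psi, b, omega)

captured_energy = sum(abs(v) ** 2 for v in b)
squared_error = sum(abs(a - c) ** 2 for a, c in zip(x, x_hat))
# For a unit-norm signal, squared_error equals 1 - captured_energy
# up to floating-point rounding.
```

The explicit matrix products cost O(p^2) per signal; for the Fourier or Hadamard case a fast transform would be used in practice.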
[0041] Alternatively, if x is known to be approximately sparse in some known basis, i.e. $x = \Phi^* z$ for some approximately sparse vector z and basis matrix $\Phi$, then stable and robust recovery is possible using standard CS algorithms. A particularly popular choice is basis pursuit (BP), which estimates
$$\hat{z} = \arg\min_{\tilde{z} \,:\, b = P_\Omega \Psi \Phi^* \tilde{z}} \|\tilde{z}\|_1 \qquad (8)$$
and then sets $\hat{x} = \Phi^* \hat{z}$. It is also possible to replace the basis pursuit recovery with other convex programs that leverage additional structured sparsity of the coefficients.
[0042] For the linear decoder, our subsampling strategy is as follows: given the training signals $x_1, \ldots, x_m$, we seek a subsampling scheme that optimizes a function subject to some constraints:
$$\hat{\Omega} = \arg\max_{\Omega \in \mathcal{A}} F(\Omega, x_1, \ldots, x_m) \qquad (9)$$
where $\mathcal{A} \subseteq \{\Omega : |\Omega| = n\}$ is a family of cardinality-constrained subsets of $\{1, \ldots, p\}$, and the function F is either chosen directly based on the quality measure or a heuristic. In the present example, we consider the choice
$$F(\Omega, x_1, \ldots, x_m) := f\left(\|P_\Omega \Psi x_1\|_2^2, \ldots, \|P_\Omega \Psi x_m\|_2^2\right) \qquad (10)$$
for some function f. Specifically, we find that both (3) and (4) can be written in this form, due to the above-mentioned correspondence between maximum quality measure and maximum energy captured.
[0043] Optimization problems of the form (9) are combinatorial, and in general finding the exact solution is non-deterministic polynomial-time (NP) hard, meaning that no algorithm is known that finds the optimal solution in time polynomial in the problem size, so that exact solutions are not computationally feasible for large p. The key idea in all of the examples below is to identify advantageous combinatorial structures in the optimization problem in order to efficiently obtain near-optimal solutions. For instance, it will be shown that submodularity properties in the optimization problems, and more specifically in the function $F(\cdot)$, are helpful.
[0044] Definition 1. A set function $h(\Omega)$ mapping subsets $\Omega \subseteq \{1, \ldots, p\}$ to real numbers is said to be submodular if, for all $\Omega_1, \Omega_2 \subseteq \{1, \ldots, p\}$ with $\Omega_1 \subseteq \Omega_2$, and all $i \in \{1, \ldots, p\} \setminus \Omega_2$, we have
$$h(\Omega_1 \cup \{i\}) - h(\Omega_1) \geq h(\Omega_2 \cup \{i\}) - h(\Omega_2). \qquad (11)$$
The function is said to be modular if the same holds true with equality in place of the inequality.
[0045] This definition formalizes the notion of diminishing returns: adding an element to a smaller set increases the objective function more compared to when it is added to a larger set. The focus in this example will be on submodular functions that are also monotonic, i.e., $h(\Omega_2) \geq h(\Omega_1)$ whenever $\Omega_1 \subseteq \Omega_2$. Submodular or modular set functions often allow us to efficiently obtain near-optimal solutions.
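The diminishing-returns condition of Definition 1 can be verified by brute force on a small ground set. The sketch below is illustrative only: the per-index weights and the three set functions are hypothetical, and indices run from 0 to p-1 rather than 1 to p. It checks that a sum of weights behaves as a modular function, that a concave function of that sum is submodular, and that a convex function of it is not.

```python
from itertools import combinations
import math

def all_subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = list(ground)
    for r in range(len(items) + 1):
        for subset in combinations(items, r):
            yield frozenset(subset)

def is_submodular(h, p, tol=1e-12):
    """Check the diminishing-returns inequality for every nested pair of
    subsets and every element outside the larger subset."""
    ground = set(range(p))
    for omega2 in all_subsets(ground):
        for omega1 in all_subsets(omega2):
            for i in ground - omega2:
                gain_small = h(omega1 | {i}) - h(omega1)
                gain_large = h(omega2 | {i}) - h(omega2)
                if gain_small < gain_large - tol:
                    return False
    return True

weights = [0.4, 0.3, 0.2, 0.1]  # hypothetical per-index energies

def modular(omega):
    return sum(weights[i] for i in omega)

def concave_of_modular(omega):
    return math.sqrt(modular(omega))

def convex_of_modular(omega):
    return modular(omega) ** 2

results = {
    "modular": is_submodular(modular, 4),
    "concave_of_modular": is_submodular(concave_of_modular, 4),
    "convex_of_modular": is_submodular(convex_of_modular, 4),
}
```

The check enumerates all nested pairs of subsets, so it is only practical for very small p; it is meant as a sanity check of the definition, not as part of the selection procedure.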
[0046] In general, the ability to find approximate or exact solutions to (9) also depends on the choice of the constraint set $\mathcal{A} \subseteq \{\Omega : |\Omega| = n\}$. In the context of structured signal recovery, a notable choice is multi-level subsampling, where the indices $\{1, \ldots, p\}$ are split into K disjoint groups, also referred to as levels, with sizes $\{p_k\}_{k=1}^{K}$, and the number of measurements within the k-th group is constrained to be $n_k$, with $\sum_{k=1}^{K} n_k = n$. Thus, the total number of possible sparsity patterns is
$$\prod_{k=1}^{K} \binom{p_k}{n_k}.$$
This structure can be efficiently handled within our framework. When the combinatorial problem is not tractable, then cutting plane and branch and bound techniques may be applied to obtain satisfactory solutions. In other instances, we can attempt brute force solutions.
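To give a sense of how much the multi-level constraint restricts the search space, the number of admissible patterns (a product of binomial coefficients, one per level) can be compared with the unconstrained count. The level sizes and per-level budgets below are hypothetical.

```python
from math import comb, prod

def count_multilevel_patterns(level_sizes, samples_per_level):
    """Number of index sets that pick exactly n_k indices from the k-th level
    of size p_k, i.e. the product of the binomial coefficients C(p_k, n_k)."""
    return prod(comb(p_k, n_k)
                for p_k, n_k in zip(level_sizes, samples_per_level))

# Hypothetical example: p = 16 indices in K = 3 levels, n = 6 samples in total.
patterns = count_multilevel_patterns([2, 4, 10], [2, 2, 2])
unconstrained = comb(16, 6)  # all index sets of size 6 without level structure
```

In this toy setting the constraint leaves 270 of the 8008 possible index sets of size 6.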
[0047] In the context of image compression with image-independent subsampling, $\Psi$ may correspond to the wavelet basis, and a suitable choice for $\mathcal{A}$ forces the coefficients to form a rooted connected subtree of the wavelet tree of cardinality n. In this case, the total number of subsampling patterns is the Catalan number
$$\frac{1}{n+1} \binom{2n}{n}.$$
This structure can be handled using dynamic programming, as shown by Bhan et al., "Tractability of Interpretability via Selection of Group-Sparse Models", 2013 IEEE International Symposium on Information Theory.
[0048] Average-case criterion ($f = f_{\mathrm{avg}}$)
[0049] We first consider the function
$$f_{\mathrm{avg}}(\alpha_1, \ldots, \alpha_m) := \frac{1}{m} \sum_{j=1}^{m} \alpha_j$$
yielding the optimization problem
$$\hat{\Omega} = \arg\max_{\Omega \in \mathcal{A}} \frac{1}{m} \sum_{j=1}^{m} \sum_{i \in \Omega} |\langle \psi_i, x_j \rangle|^2 \qquad (12)$$
where $\psi_i$ is the i-th row of $\Psi$. This corresponds to maximizing the average energy in the training signals. In other words, according to this optimization problem, a sum of the energies in the training signals, captured by samples at the sampling locations given by the n indices, is as large as possible. As mentioned above, this in fact corresponds to minimizing the squared recovery error of the linear decoder in (6). That is, equation (12) is simply a re-write of equation (3). It is to be noted that in equation (12), the exponent, instead of being 2, could be q, where q>0.
[0050] Since the sum of (sub)modular functions is again (sub)modular, it can be seen that (12) is a modular maximization problem, thus permitting an exact solution to be found efficiently in several cases of interest.
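[0051] Case 1 (no additional constraints): In the case where the problem is
$$\hat{\Omega} = \arg\max_{\Omega \in \mathcal{A}} \frac{1}{m} \sum_{j=1}^{m} \sum_{i \in \Omega} |\langle \psi_i, x_j \rangle|^2 \qquad (13)$$
where $\mathcal{A} = \{\Omega : |\Omega| = n\}$, the exact solution is found by sorting: select the n indices whose values of $\sum_{j=1}^{m} |\langle \psi_i, x_j \rangle|^2$ are the largest. The running time is dominated by the precomputation of the values $\sum_{j=1}^{m} |\langle \psi_i, x_j \rangle|^2$, and behaves as $O(mp^2)$ for general matrices $\Psi$, or $O(mp \log p)$ for suitably structured matrices such as Fourier and Hadamard.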
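The sorting rule of Case 1 can be sketched as follows, assuming the unitary discrete Fourier transform as the basis and computing the coefficients directly, which corresponds to the general O(mp^2) cost (a fast transform would give the structured cost). The function names and the two training signals are hypothetical.

```python
import cmath
import math

def normalize(x):
    """Scale a signal to unit Euclidean norm."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x]

def dft_energies(signals):
    """Return e[i] = sum over signals of |<psi_i, x_j>|^2 for the unitary DFT basis."""
    p = len(signals[0])
    energies = [0.0] * p
    for x in signals:
        for i in range(p):
            coeff = sum(x[k] * cmath.exp(-2j * math.pi * i * k / p)
                        for k in range(p)) / math.sqrt(p)
            energies[i] += abs(coeff) ** 2
    return energies

def select_by_sorting(energies, n):
    """Exact solution when only the cardinality is constrained:
    keep the n indices with the largest accumulated energy."""
    order = sorted(range(len(energies)), key=lambda i: energies[i], reverse=True)
    return sorted(order[:n])

p = 8
# Hypothetical training signals: two phase-shifted sinusoids at the same frequency.
training = [
    normalize([math.cos(2 * math.pi * k / p) for k in range(p)]),
    normalize([math.cos(2 * math.pi * k / p + 0.3) for k in range(p)]),
]
energies = dft_energies(training)
omega = select_by_sorting(energies, 2)  # indices 0..p-1 rather than 1..p
```

Both training signals concentrate their energy on the same pair of conjugate frequencies, so those two indices are selected.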
[0052] Case 2 (multi-level sampling): In the case that the constraints defining $\mathcal{A}$ correspond to multi-level sampling as defined above, the exact solution is found by simply performing sorting within each level.
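Sorting within each level can be sketched in a few lines. The accumulated energies, the level partition and the per-level budgets below are hypothetical inputs chosen for illustration.

```python
def select_multilevel(energies, levels, samples_per_level):
    """Within each level, keep the n_k indices with the largest accumulated energy."""
    chosen = []
    for level, n_k in zip(levels, samples_per_level):
        ranked = sorted(level, key=lambda i: energies[i], reverse=True)
        chosen.extend(ranked[:n_k])
    return sorted(chosen)

# Hypothetical accumulated energies for p = 8 indices split into K = 3 levels.
energies = [0.9, 0.8, 0.7, 0.1, 0.3, 0.2, 0.05, 0.6]
levels = [[0, 1, 2], [3, 4, 5], [6, 7]]
omega_levels = select_multilevel(energies, levels, [1, 1, 1])

# For comparison: the unconstrained top-3 would all come from the first level.
omega_free = sorted(sorted(range(8), key=lambda i: energies[i], reverse=True)[:3])
```

Because the objective is modular, the per-level choices do not interact, which is why independent sorting inside each level is exact.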
[0053] Case 3 (rooted connected tree constraint): Choosing the set $\mathcal{A}$ that forces the coefficients to form a rooted connected wavelet tree, there exists a dynamic program for finding the optimal solution in $O(np)$ time.
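A dynamic program for the rooted connected tree constraint can be sketched as below. This is a simple illustrative version and not the algorithm of the cited reference: it assumes a binary tree stored in heap layout (the children of node v are 2v+1 and 2v+2), uses hypothetical node weights standing in for the accumulated energies, and in this plain form tries every split of the budget at every node, so it does not attain the O(np) bound.

```python
from functools import lru_cache

def best_rooted_subtree(weights, n):
    """Maximum-weight connected subtree with exactly n nodes containing the root.

    Returns (total_weight, sorted list of selected nodes).
    """
    p = len(weights)
    neg_inf = float("-inf")

    @lru_cache(maxsize=None)
    def best(v, k):
        # Best (value, nodes) for a connected subtree rooted at v with k nodes.
        if k == 0:
            return 0.0, ()
        if v >= p:
            return neg_inf, ()          # no node here, but k > 0 nodes requested
        top = (neg_inf, ())
        for a in range(k):              # a nodes in the left branch, k-1-a in the right
            left_val, left_nodes = best(2 * v + 1, a)
            right_val, right_nodes = best(2 * v + 2, k - 1 - a)
            value = weights[v] + left_val + right_val
            if value > top[0]:
                top = (value, (v,) + left_nodes + right_nodes)
        return top

    value, nodes = best(0, n)
    return value, sorted(nodes)

# Hypothetical weights on a 7-node tree: node 4 is heavy but two levels deep.
weights = [0.1, 0.5, 0.05, 0.02, 0.9, 0.3, 0.01]
tree_value, tree_nodes = best_rooted_subtree(weights, 3)
```

Here the three largest weights sit at nodes 4, 1 and 5, which do not form a rooted connected subtree; the dynamic program instead returns the root together with the path to node 4.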
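[0054] Generalized average-case criterion ($f = f_{\mathrm{gen}}$)
[0055] We generalize the choice $f = f_{\mathrm{avg}}$ by considering
$$f_{\mathrm{gen}}(\alpha_1, \ldots, \alpha_m) := \frac{1}{m} \sum_{j=1}^{m} g(\alpha_j), \quad \text{yielding} \quad \hat{\Omega} = \arg\max_{\Omega \in \mathcal{A}} \frac{1}{m} \sum_{j=1}^{m} g\Big( \sum_{i \in \Omega} |\langle \psi_i, x_j \rangle|^2 \Big) \qquad (14)$$
for some function $g : [0,1] \to \mathbb{R}$. We consider the case that g is an increasing concave function with g(0)=0. It is to be noted that in equation (14), the exponent, instead of being 2, could be q, where q>0.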
[0056] It is to be noted that this example also permits weights $\beta_{ij}$ in front of the terms $|\langle \psi_i, x_j \rangle|^2$ in all of the optimization problems. This permits certain training signals to be treated as more important than others (via higher values of $\{\beta_{ij}\}_{i=1}^{p}$), and also permits the energy of certain parts of the signal to be considered more important than the energy in other parts (via higher values of $\{\beta_{ij}\}_{j=1}^{m}$). In the present description, we focus only on uniform weights for clarity of explanation.
[0057] We established above that the argument to g in (14) is a modular function of . Recalling that g is concave and increasing by assumption, it follows that (14) is a submodular optimization problem. While finding the exact solution is hard in general, it is possible to efficiently find an approximate solution with rigorous guarantees in several cases.
[0058] Case 1 (no additional constraints): In the case that $\mathcal{A} = \{\Omega : |\Omega| = n\}$, a solution whose objective value is within a multiplicative factor of $1 - \frac{1}{e}$ of the optimum can be found via a simple greedy algorithm: start with the empty set, and repeatedly add the item that increases the objective value by the largest amount, terminating once n indices have been selected. The values $|\langle \psi_i, x_j \rangle|^2$ can be computed in $O(mp^2)$ time (general case), or $O(mp \log p)$ time (structured case), and the greedy algorithm itself can be implemented in $O(nmp)$ time, with the factor of m arising due to the summation in (14).
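The greedy algorithm of this case can be sketched as follows, assuming the per-index, per-signal energies have already been computed and are passed in as a table. The function name, the energy table and the choice of the square root as the concave function g are hypothetical.

```python
import math

def greedy_select(energy, n, g=math.sqrt):
    """Greedily maximize (1/m) * sum_j g(sum of energy[i][j] over selected i).

    energy[i][j] holds the energy of training signal j at index i.
    """
    p, m = len(energy), len(energy[0])
    selected = []
    captured = [0.0] * m               # energy captured so far per training signal
    for _ in range(n):
        best_i, best_gain = None, float("-inf")
        for i in range(p):
            if i in selected:
                continue
            gain = sum(g(captured[j] + energy[i][j]) - g(captured[j])
                       for j in range(m)) / m
            if gain > best_gain:
                best_i, best_gain = i, gain
        selected.append(best_i)
        for j in range(m):
            captured[j] += energy[best_i][j]
    return sorted(selected)

# Hypothetical energies for p = 4 indices and m = 2 training signals.
energy = [
    [0.6, 0.0],
    [0.3, 0.0],
    [0.0, 0.2],
    [0.1, 0.1],
]
omega_concave = greedy_select(energy, 2)                   # g = square root
omega_linear = greedy_select(energy, 2, g=lambda t: t)     # g = identity (average case)
```

Each of the n rounds scans p candidates and sums over m signals, which matches the O(nmp) cost stated above. In this toy table the concave g spreads the selection across both training signals, while the identity g reduces to the average-case criterion and concentrates on the first signal.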
[0059] Case 2 (matroid constraints): Although we do not explore them in detail in the present description, there exist algorithms for obtaining solutions that are within a factor of $1 - \frac{1}{e}$ of the optimum for more general constraint sets $\mathcal{A}$, known as matroid constraints, see for example Filmus and Ward, "A tight combinatorial algorithm for submodular maximization subject to a matroid constraint", Foundations of Computer Science conference 2012. For example, the above-mentioned multi-level sampling constraint falls into this category.
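[0060] Worst-case criterion ($f = f_{\min}$)
[0061] Finally, we consider the choices $\mathcal{A} = \{\Omega : |\Omega| = n\}$ and $f_{\min}(\alpha_1, \ldots, \alpha_m) := \min_{j = 1, \ldots, m} \alpha_j$, yielding
$$\hat{\Omega} = \arg\max_{\Omega \in \mathcal{A}} \min_{j = 1, \ldots, m} \sum_{i \in \Omega} |\langle \psi_i, x_j \rangle|^2 \qquad (15)$$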
[0062] This can be thought of as seeking robustness with respect to the worst image in the training set, which may be desirable in some cases. In other words, this optimization problem is solved so that a minimum of the energies in the training signals, captured by samples at the sampling locations given by the n indices, is as large as possible. Note that in this example, equation (15) is equivalent to equation (4). It is to be noted that in equation (15), the exponent, instead of being 2, could be q, where q>0.
[0063] The objective function in (15) is the minimum of m modular functions, and there exist algorithms for solving general problems of this form, including an algorithm called Saturate proposed by Krause et al. in "Robust Submodular Observation Selection", Journal of Machine Learning Research, vol. 9, pp. 2761-2801, 2008. The running time is $O(p^2 m \log m)$ in the worst case. In practice, the algorithm has been observed to run much faster. Moreover, it was found that the total number of samples returned by the algorithm is very close to its maximum value $\alpha n$ (e.g. within 1%).
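The difference between the worst-case and average-case criteria can be seen on a toy instance. The sketch below is an exhaustive-search baseline, not the Saturate algorithm, and is only viable for very small p; the function names and the energy table are hypothetical.

```python
from itertools import combinations

def captured(energy, omega):
    """Energy captured in each training signal by the indices in omega."""
    m = len(energy[0])
    return [sum(energy[i][j] for i in omega) for j in range(m)]

def best_worst_case(energy, n):
    """Exhaustive search for the index set maximizing the minimum captured energy."""
    p = len(energy)
    return max(combinations(range(p), n),
               key=lambda omega: min(captured(energy, omega)))

def best_average_case(energy, n):
    """Average-case solution by sorting the accumulated energies."""
    p = len(energy)
    order = sorted(range(p), key=lambda i: sum(energy[i]), reverse=True)
    return tuple(sorted(order[:n]))

# Hypothetical energies for p = 4 indices and m = 2 training signals.
energy = [
    [0.55, 0.05],
    [0.40, 0.02],
    [0.00, 0.30],
    [0.05, 0.10],
]
omega_worst = best_worst_case(energy, 2)
omega_average = best_average_case(energy, 2)
worst_of_average = min(captured(energy, omega_average))
worst_of_worst = min(captured(energy, omega_worst))
```

The average-case selection captures almost all the energy of the first signal but very little of the second, whereas the worst-case selection gives up some average energy to raise the minimum.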
[0064] It is also possible to consider a generalization of all the previous optimization problem approaches, where instead of defining the energy in terms of the $\ell_2$ norm, we generalize it to any $\ell_q$ norm, for $q \geq 1$. That is, we can replace $|\langle \psi_i, x_j \rangle|^2$ with $|\langle \psi_i, x_j \rangle|^q$.
[0065] While the above approaches are finite-dimensional, the teachings of the present invention can directly be applied to infinite-dimensional settings. In such cases, we have $\Omega \subseteq \{1, 2, \ldots\}$, so it is practically not possible to search over all possible indices. Nevertheless, truncation methods are directly applicable to the training signals (see Adcock and Hansen, "Generalized sampling and infinite-dimensional compressed sensing", University of Cambridge technical report, 2011), and a suitable truncation length can be learned from the training signals.
[0066] The above signal sampling method is summarized by the flow chart of
[0067] The proposed invention has a wide range of sensing applications as explained next in more detail.
[0068] Wearable and Implantable Bio-Sensors
[0069] Nowadays, there is a growing interest in designing and building sensors for real-time monitoring of health-related biological signals. These sensors are either worn around the body, e.g., heart-rate monitors, or directly implanted in the body. One such field of application is the recording of neuro-electrical signals from the brain, which may allow for the early detection of epileptic seizures or an understanding of the neurological basis of depression. For these cases, the sensor must be small enough to avoid damaging the cerebral tissue and also consume as little power as possible to avoid overheating the brain. To overcome these issues, area and power-efficient compressive sensing-based circuits have been proposed for reducing the data size before transmission, which is the main power drain. Furthermore, it has often been reported that biological signals are structured and not completely random. Therefore, the proposed invention can directly exploit the structure in the signals in order to provide a fixed sampling pattern that allows for greater compression rates than randomized approaches, while maintaining high signal reconstruction performance.
[0070] Magnetic Resonance Imaging
[0071] In MRI, Fourier samples of an object are measured in order to create a 3D image of its interior. Many modern systems already exploit compressive sensing and the structure in the object for reducing the number of samples necessary for reconstruction. However, none of the methods proposed in the literature so far are capable of adapting to a specific class of objects. For instance, in many research applications of MRI, the scientists are interested in acquiring images of the same objects over and over again, subject to small changes. The method according to the present invention can adapt to the specific structure of these objects and provide a more compact sampling pattern that allows a reduction of the scan time, while providing the same or better reconstruction quality.
[0072] Computed Tomography
[0073] In X-ray computed tomography (CT), parallel beams of X-rays shine through an object of interest and are partially absorbed along the way. Multiple projected images are recorded by varying the angle of incidence of the beam. From these projections, it is possible to reconstruct the internal 3D structure of the object. However, current methods require many projections to be recorded at small angular increments in order to obtain high quality reconstructions. As is the case for MRI, CT scans are often made of similar objects, and the present invention can therefore leverage the similarities and the structures of these objects in order to select only those angles that yield the most informative projections, without sacrificing the quality of the reconstructions.
[0074] CMOS-Based Imaging
[0075] Recently, a complementary metal-oxide-semiconductor (CMOS) image sensor that is capable of directly capturing transformations of images has been proposed, allowing one to compressively sense images, instead of first acquiring the signals for all the pixels and then compressing. The main limitation of this sensor is the need to define a suitable set of coefficients to sample in order to achieve an optimal trade-off between the compression rate and the image reconstruction quality. The present invention can be directly applied to this problem by optimizing the sampling strategy from previous images.
[0076] Ultrasound
[0077] Real-time 3D ultrasound imaging (UI) has the potential to overcome many of the limitations of standard 2D ultrasound imaging, such as limited probe manipulation for certain tissues, while providing real-time volumetric images of organs in motion. However, 3D UI requires the measurement and processing of a tremendous amount of data in order to achieve good image quality. Recently, leveraging ideas from 2D frequency beamforming, a technique to reduce the sampled data has been proposed. A key element of this technique is to define a small range of discrete Fourier coefficients to be retained per 2D slice. The present invention can be directly applied to this domain in order to optimize the subset of discrete Fourier coefficients to sample for each class of images.
[0078] As explained above, the present invention provides a principled learning-based approach to subsampling the rows of an orthonormal basis matrix for structured signal recovery, thus providing an attractive alternative to traditional approaches based on parameterized random sampling distributions. Combinatorial optimization problems were proposed based on the average-case and worst-case energy captured in the training signals, and solutions were obtained via the identification of modularity and submodularity structures.
[0079] While the invention has been illustrated and described in detail in the drawings and foregoing description, such illustration and description are to be considered illustrative or exemplary and not restrictive, the invention being not limited to the disclosed embodiment. Other embodiments and variants are understood, and can be achieved by those skilled in the art when carrying out the claimed invention, based on a study of the drawings, the disclosure and the appended claims.
[0080] In the claims, the word "comprising" does not exclude other elements or steps, and the indefinite article "a" or "an" does not exclude a plurality. The mere fact that different features are recited in mutually different dependent claims does not indicate that a combination of these features cannot be advantageously used.