Guided Bayesian experimental design

09720130 · 2017-08-01

Abstract

A Bayesian methodology is described for designing experiments or surveys that are improved by utilizing available prior information to guide the design toward maximally reducing posterior uncertainties in the interpretation of the future experiment. Synthetic geophysical tomography examples are used to illustrate benefits of this approach.

Claims

1. A method for characterizing a subterranean formation, the method comprising: selecting model parameters of a linear model that predicts physical observations of the formation; determining a prior model probability density relating to the model parameters of the linear model, the prior model probability density including (i) a mean prior model, and (ii) a first covariance matrix describing uncertainty around the mean prior model; determining a second covariance matrix relating to the model parameters of the linear model and describing uncertainty around anticipated observation noise; and iteratively adding a select candidate physical observation to a set of physical observations of the formation as predicted by the linear model based on D-optimality augmented criteria using a processor, which involves determining a sensitivity matrix for the model parameters and the set of physical observations based on a determinant derived from the first covariance matrix and components of the second covariance matrix that are associated with the select candidate physical observation; and configuring survey equipment that characterizes the formation using the set of physical observations.

2. The method of claim 1, further comprising assigning the physical model parameters to a probability density function.

3. The method of claim 1, wherein each physical observation is associated with multiple measurements.

4. The method of claim 3, further comprising processing k measurements at a time for a physical observation with k associated measurements.

5. The method of claim 1, wherein the determinant is further derived from components of the second covariance matrix that are associated with a number of base physical observations.

6. The method of claim 1, wherein at least one select candidate physical observation is iteratively added to the set of physical observations of the formation until a stop condition is satisfied.

7. The method of claim 1, wherein the survey equipment comprises an array of seismic receivers.

8. Apparatus for characterizing a subterranean formation, the apparatus comprising: a memory storing physical model parameters of a linear model that predicts physical observations of the formation; and a processor which is configured to: determine a prior model probability density relating to the model parameters of the linear model, the prior model probability density including (i) a mean prior model, and (ii) a first covariance matrix describing uncertainty around the mean prior model, determine a second covariance matrix relating to the model parameters of the linear model and describing uncertainty around anticipated observation noise, and iteratively add a select candidate physical observation to a set of physical observations of the formation as predicted by the linear model based on D-optimality augmented criteria, which involves determining a sensitivity matrix for the model parameters and the set of physical observations based on a determinant derived from the first covariance matrix and components of the second covariance matrix that are associated with the select candidate physical observation; wherein the set of physical observations are used to configure survey equipment that characterizes the formation.

9. The apparatus of claim 8, wherein the determinant is further derived from components of the second covariance matrix that are associated with a number of base physical observations.

10. The apparatus of claim 8, wherein at least one select candidate physical observation is iteratively added to the set of physical observations of the formation until a stop condition is satisfied.

11. The apparatus of claim 10, wherein the stop condition involves at least one of: i) a condition related to the determinant, ii) a condition related to the number of candidate physical observations added to the set of physical observations and to the total number of model parameters of the linear model, and iii) a condition related to the number of candidate physical observations added to the set of physical observations and a predefined maximum number.

12. The apparatus of claim 8, wherein the survey equipment comprises an array of seismic receivers.

13. The method of claim 6, wherein the stop condition involves at least one of: i) a condition related to the determinant, ii) a condition related to the number of candidate physical observations added to the set of physical observations and to the total number of model parameters of the linear model, and iii) a condition related to the number of candidate physical observations added to the set of physical observations and a predefined maximum number.

Description

BRIEF DESCRIPTION OF THE FIGURES

(1) FIG. 1 illustrates a general arrangement of a vertical seismic profile (VSP) survey.

(2) FIG. 2 illustrates an acoustic logging tool for estimating formation stresses using radial profiles of three shear moduli.

(3) FIG. 3 is a flow diagram of an embodiment of guided Bayesian experimental design.

(4) FIG. 4a illustrates the set of all rays from the depicted potential source-receiver pairs.

(5) FIG. 4b illustrates a survey designed by a non-Bayesian algorithm where noise is neglected.

(6) FIG. 5a illustrates a survey designed by a Bayesian algorithm where both noise and prior model uncertainty are neglected.

(7) FIG. 5b illustrates a survey designed by the Bayesian algorithm using the same noise model as above, but with the uncorrelated prior model covariance having a standard deviation within the disk being 10 times larger than in the other portion of the model.

(8) FIG. 5c is a plot of the convergence of |C.sub.n+1.sup.−1|/|C.sub.n.sup.−1|−1 versus the iteration index n for the solution shown in FIG. 5b.

(9) FIG. 6a illustrates the set of all (successful) rays from the depicted potential source-receiver pairs.

(10) FIG. 6b illustrates a survey designed by a non-Bayesian algorithm.

(11) FIG. 6c illustrates a survey designed by the Bayesian algorithm using the same noise model as above, but with the uncorrelated prior model covariance having a standard deviation within the disk being 10 times larger than in the other portion of the model.

DETAILED DESCRIPTION

(12) A general arrangement of a vertical seismic profile (VSP) survey is shown in FIG. 1. A tool 10, typically including an array of seismic receivers 12, e.g., geophones, is positioned in a borehole 14 by means of a logging cable 16 connected to surface equipment 18. One or more seismic sources 20, e.g., airguns, are positioned at the surface some distance from the borehole 14. When the sources are fired, seismic waves S travel through the formation 22 surrounding the borehole 14 and are reflected in part from changes in acoustic impedance in the formation due to the presence of bed boundaries 24, and are detected by the receivers 12 in the borehole 14. The signals recorded from the receivers 12 can be interpreted by use of a suitable geophysical model to characterize the formation 22, e.g., in terms of the shape and location of the boundary 24. Often, the VSP survey is designed with the intention of investigating a particular area or target within a roughly known boundary. Variations of such surveys can include reverse VSP (sources in borehole, receivers at the surface), walkaway VSP (measurements made from a series of source firings as the source is moved progressively further from the borehole), 3D VSP (use of a 2D array of sources at the surface), and drill bit seismic (drill bit as source of signals, receivers at the surface). The surface equipment may include a memory with prior information regarding the target and a computer program for performing guided Bayesian experimental design. The location, number and type of receivers and sources used in a given test are some factors that may be selected and controlled based on guided Bayesian experimental design.

(13) FIG. 2 illustrates a general arrangement of a logging tool (106) used to acquire and analyze sonic data that describes a subterranean formation. The illustrated tool has a plurality of acoustic receivers and transmitters, including multi-pole transmitters such as crossed dipole transmitters (120, 122) (only one end of dipole (120) is visible in the figure) and monopole transmitters (109) (close) and (124) (far) capable of exciting compressional, shear, Stoneley, and flexural waves. The logging tool (106) also includes receivers (126), which are spaced apart some distance from the transmitters. The logging tool (106) is suspended from an armored cable (108) and may have optional centralizers (not shown). The cable (108) extends from the borehole (104) over a sheave wheel (110) on a derrick (112) to a winch forming part of surface equipment, which may include an analyzer unit (114). Well known depth gauging equipment (not shown) may be provided to measure cable displacement over the sheave wheel (110). The tool (106) may include any of many well known devices to produce a signal indicating tool orientation. Processing and interface circuitry within the tool (106) amplifies, samples and digitizes the tool's information signals for transmission and communicates them to the analyzer unit (114) via the cable (108). Electrical power and control signals for coordinating operation of the tool (106) may be generated by the analyzer unit (114) or some other device, and communicated via the cable (108) to circuitry provided within the tool (106). The surface equipment includes a processor subsystem (116) (which may include a microprocessor, computer readable memory, clock and timing, and input/output functions—not separately shown), standard peripheral equipment (not separately shown), and a recorder (118). The memory may include prior information regarding the formation and a computer program for performing guided Bayesian experimental design. 
The location, number and type of receivers and transmitters used in a given test are some factors that may be selected and controlled based on guided Bayesian experimental design.

(14) FIG. 3 is a flow diagram of a guided Bayesian experimental design algorithm for designing experiments including but not limited to seismic and acoustic surveys. In order to start 300, the relevant physical model parameters are first selected and assigned to m. Further, the relevant candidate physical observations of interest are selected. Further, the available prior information on model parameters and observation noise is characterized in the form of: a prior mean model; a prior covariance matrix C describing the uncertainty around the prior mean model; and a covariance matrix C.sub.D describing the uncertainty around the anticipated zero-mean observation noise. A sensitivity matrix G is then computed for these model parameters and candidate observations, e.g., using a blackbox simulator. In the case of observations with single measurements and uncorrelated noise, the following conditions are set as indicated by step 302: n=0; C.sub.0=C. Then, in step 304, find g.sub.1 that maximizes

(15) \[ \frac{|C_1^{-1}|}{|C_0^{-1}|} = 1 + \sigma_1^{-2}\, g_1^{T} C_0\, g_1 . \]
In step 306, n is incremented. Then, for n=1,

(16) \[ C_1 = C_0 - \frac{\sigma_1^{-2}\, C_0\, g_1 g_1^{T} C_0}{1 + \sigma_1^{-2}\, g_1^{T} C_0\, g_1} \]
as indicated by step 308. Steps 304, 306 and 308 are repeated until convergence as determined in step 310. For example, in a subsequent iteration for n=1, find g.sub.2 that maximizes

(17) \[ \frac{|C_2^{-1}|}{|C_1^{-1}|} = 1 + \sigma_2^{-2}\, g_2^{T} C_1\, g_2 , \]
and then for n=2,

(18) \[ C_2 = C_1 - \frac{\sigma_2^{-2}\, C_1\, g_2 g_2^{T} C_1}{1 + \sigma_2^{-2}\, g_2^{T} C_1\, g_2} , \]
and find g.sub.3 that maximizes

(19) \[ \frac{|C_3^{-1}|}{|C_2^{-1}|} = 1 + \sigma_3^{-2}\, g_3^{T} C_2\, g_3 . \]
Examples of stop 312 conditions include: adding a new observation to the experiment no longer results in significant improvement in |C.sub.n+1.sup.−1|/|C.sub.n.sup.−1|; when the number of added measurements exceeds the total number of model parameters; and when the number of added measurements exceeds a predefined maximum number. After completion, the optimal design is the set of observations that would maximally impact the resolution of the model parameters given the information available on the model parameters and the anticipated observation noise prior to collecting the measurement.
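The loop of steps 302 through 312 can be sketched in a few lines of Python. This is a minimal illustration for the single-measurement, uncorrelated-noise case only, not the patented implementation. The function names, the tolerance `tol`, the default cap of M observations, and the choice to drop a candidate once it has been selected are all assumptions made for the example. Plain lists are used so the sketch has no dependencies.

```python
def matvec(C, g):
    """Product C g for a square matrix stored as a list of rows."""
    return [sum(c * x for c, x in zip(row, g)) for row in C]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def greedy_design(G, sigma2, C0, max_obs=None, tol=1e-6):
    """Greedy selection for single-measurement observations, uncorrelated noise.

    G       -- candidate sensitivity kernels g_i, one list per candidate
    sigma2  -- noise variance sigma_i^2 of each candidate
    C0      -- prior model covariance C (symmetric, list of rows)
    max_obs -- predefined maximum number of observations (assumed default: M)
    tol     -- assumed threshold on |C_{n+1}^-1| / |C_n^-1| - 1
    """
    M = len(C0)
    if max_obs is None:
        max_obs = M
    C = [row[:] for row in C0]              # step 302: n = 0, C_0 = C
    chosen, gains = [], []
    remaining = list(range(len(G)))         # assumption: no repeated picks
    while remaining and len(chosen) < max_obs:
        # Step 304: maximize 1 + sigma^-2 g^T C_n g over the candidates.
        scored = []
        for i in remaining:
            Cg = matvec(C, G[i])
            scored.append((dot(G[i], Cg) / sigma2[i], i, Cg))
        gain, best, Cg = max(scored, key=lambda t: t[0])
        if gain < tol:                      # step 310: no significant gain
            break
        # Step 308: C_{n+1} = C_n - C_n g g^T C_n / (sigma^2 + g^T C_n g).
        denom = sigma2[best] * (1.0 + gain)
        C = [[C[r][c] - Cg[r] * Cg[c] / denom for c in range(M)]
             for r in range(M)]
        chosen.append(best)                 # step 306: n <- n + 1
        gains.append(gain)
        remaining.remove(best)
    return chosen, gains, C


# Two parameters; the second has a 10x larger prior standard deviation.
chosen, gains, C_post = greedy_design(
    G=[[1.0, 0.0], [0.0, 1.0]], sigma2=[1.0, 1.0],
    C0=[[1.0, 0.0], [0.0, 100.0]])
print(chosen)  # the more uncertain parameter is observed first
```

Because the gain is computed from C.sub.n g, the same product is reused in the covariance update, so each candidate evaluation costs one matrix-vector product.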

(20) In view of the description above, it will be appreciated that the algorithm is advantageously generic in the sense that it can be applied with any physical measurement or observation, i.e., it is not limited to seismic and acoustic surveys. However, at least with regard to seismic and acoustic surveys the algorithm provides the advantage of utilizing prior model information which often exists but might not otherwise be used. For instance, in geophysical tomography, the prior mean model and associated covariance matrix could come from surface seismic data interpretation when one is considering 3D vertical seismic profile acquisition to refine a particular area of the subsurface model. When one is performing a real-time survey design, the prior information on the model could come from the interpretation of the already acquired measurements.

(21) It should be noted that although the algorithm has been described for the case of single measurement observations and uncorrelated noise, it is also applicable when considering observations with multiple measurements and correlated noise by changing the function to maximize for finding g.sub.n and by changing the formula used to update the base experiment matrix C.sub.n. In the case where each observation may be associated with multiple measurements, the observation selection algorithm considers k measurements at a time for an observation with k associated measurements. To compare the numerical advantage of performing one-time rank-k updates versus performing k consecutive rank-one updates, let
Γ=[Γ.sub.1Γ.sub.2 . . . Γ.sub.k].sup.T  (26)
be the matrix whose rows Γ.sub.i, 1≦i≦k, are the sensitivity kernels of the relevant data stations. For a diagonal data covariance matrix, i.e., for

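(22) \[ (C_D)_{n+k} = \begin{bmatrix} (C_D)_n & 0 \\ 0 & S \end{bmatrix}, \tag{27} \]
wherein
\[ S = \begin{bmatrix} \sigma_1^{2} & 0 & \cdots & 0 \\ 0 & \sigma_2^{2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_k^{2} \end{bmatrix}. \tag{28} \]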
It follows that for a one-time rank-k augmentation
C.sub.n+k.sup.−1=C.sub.n.sup.−1+Γ.sup.TS.sup.−1Γ,  (29)
from which it is straightforward to show that

(23) \[ \frac{|C_{n+k}^{-1}|}{|C_n^{-1}|} = |S|^{-1}\, \left| S + \Gamma C_n \Gamma^{T} \right| . \tag{30} \]
This expression reduces for k=2 to

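(24) \begin{align}
\frac{|C_{n+2}^{-1}|}{|C_n^{-1}|} &= \sigma_1^{-2} \sigma_2^{-2} \begin{vmatrix} \sigma_1^{2} + \Gamma_1^{T} C_n \Gamma_1 & \Gamma_1^{T} C_n \Gamma_2 \\ \Gamma_2^{T} C_n \Gamma_1 & \sigma_2^{2} + \Gamma_2^{T} C_n \Gamma_2 \end{vmatrix} \tag{31} \\
&= \left(1 + \sigma_1^{-2}\, \Gamma_1^{T} C_n \Gamma_1\right) \left(1 + \sigma_2^{-2}\, \Gamma_2^{T} C_n \Gamma_2\right) - \sigma_1^{-2} \sigma_2^{-2} \left| \Gamma_1^{T} C_n \Gamma_2 \right|^{2} . \tag{32}
\end{align}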
Examining the operations count for a one-time rank-k augmentation, one has to sum the operations occurring in four computation steps:
1. The product of a typically large M×M matrix with an M×k matrix requires M×M×k multiplications and M×k×(M−1) additions. This step requires O(M.sup.2k) operations.
2. The product of a k×M matrix by an M×k matrix requires k×k×M multiplications and k×k×(M−1) additions. This requires O(k.sup.2M) operations.
3. The determinant of a typically small k×k matrix requires O(k.sup.3) operations.
4. The product of k+1 scalars requires O(k) operations.

(25) Therefore the total cost of this one-time rank-k augmentation procedure is
O(M.sup.2k)+O(k.sup.2M)+O(k.sup.3)+O(k)≈O(M.sup.2k) for k<<M.  (33)
Now, for k consecutive rank-one updates

(26) \[ C_{n+k}^{-1} = C_n^{-1} + \sum_{l=1}^{k} \sigma_l^{-2}\, \Gamma_l \Gamma_l^{T} . \tag{34} \]
An analytic expression for |C.sub.n+k.sup.−1|/|C.sub.n.sup.−1|, for a given k, is

(27) \[ \frac{|C_{n+k}^{-1}|}{|C_n^{-1}|} = \prod_{l=1}^{k} \left(1 + \sigma_{k-l+1}^{-2}\, \Gamma_{k-l+1}^{T} \Lambda_{k-l}^{-1} \Gamma_{k-l+1}\right), \tag{35} \]
where
\[ \Lambda_{k-l} \equiv C_n^{-1} + \sum_{r=1}^{k-l} \sigma_r^{-2}\, \Gamma_r \Gamma_r^{T} . \tag{36} \]
The explicit form of Eq. 35 for an arbitrary k is somewhat cumbersome, but for k=2 it can be seen that it is identical to Eq. 32. Examining the operations count for k consecutive rank-one updates, one has to perform k(M.sup.2+M+2)+k multiplications and k(M(M−1)+(M−1)+1) additions, yielding the operations count estimate
O(M.sup.2k) for k<<M.  (37)
Comparing the operations counts for both update approaches, there is no advantage to using one over the other. However, in comparing the implementation complexity of Eq. 35 versus Eq. 30, one might prefer the simplicity of the former over the latter.
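The agreement between the two update strategies for k=2 can be checked numerically. The sketch below evaluates the one-time rank-2 determinant form (Eq. 31) and the product of two consecutive rank-one factors, and compares them. The helper names and the 3-parameter test values are illustrative assumptions, and the variances are passed as `s1` and `s2` (that is, σ.sub.1.sup.2 and σ.sub.2.sup.2).

```python
def matvec(C, g):
    """Product C g for a square matrix stored as a list of rows."""
    return [sum(c * x for c, x in zip(row, g)) for row in C]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rank_one_update(C, g, s):
    """Covariance after adding one measurement with kernel g, variance s."""
    Cg = matvec(C, g)
    denom = s + dot(g, Cg)
    n = len(C)
    return [[C[r][c] - Cg[r] * Cg[c] / denom for c in range(n)]
            for r in range(n)]


def ratio_block(C, g1, g2, s1, s2):
    """Determinant ratio from a single 2x2 determinant (rank-2 update)."""
    Cg1, Cg2 = matvec(C, g1), matvec(C, g2)
    a = s1 + dot(g1, Cg1)
    b = dot(g1, Cg2)
    d = s2 + dot(g2, Cg2)
    return (a * d - b * b) / (s1 * s2)


def ratio_sequential(C, g1, g2, s1, s2):
    """Determinant ratio as a product of two rank-one factors."""
    first = 1.0 + dot(g1, matvec(C, g1)) / s1
    C1 = rank_one_update(C, g1, s1)
    second = 1.0 + dot(g2, matvec(C1, g2)) / s2
    return first * second


# Illustrative symmetric positive definite covariance and two kernels.
C_n = [[2.0, 0.5, 0.0], [0.5, 1.0, 0.2], [0.0, 0.2, 3.0]]
g1, g2 = [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]
r_block = ratio_block(C_n, g1, g2, 0.5, 2.0)
r_seq = ratio_sequential(C_n, g1, g2, 0.5, 2.0)
print(r_block, r_seq)
```

Both routes spend most of their time in the matrix-vector products with C.sub.n, which is consistent with the O(M.sup.2k) estimate for either approach.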

(28) In the most general case the data covariance matrix, C.sub.D, is a symmetric, positive definite matrix; it is conveniently written in block form as

(29) \[ (C_D)_{n+1} = \begin{bmatrix} (C_D)_n & c_{n+1} \\ c_{n+1}^{T} & \sigma_{n+1}^{2} \end{bmatrix} \tag{38} \]
wherein (C.sub.D).sub.n is the covariance matrix of the base experiment data, σ.sub.n+1.sup.2 is the variance of the data measurement that corresponds to the new candidate observation, and c.sub.n+1 is the vector whose components are the covariance terms of this measurement. Using the formula for the inverse of a block matrix (G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins, Baltimore, Md., USA, 1996),

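(30) \[ (C_D)_{n+1}^{-1} = \begin{bmatrix} \left[ (C_D)_n - \sigma_{n+1}^{-2}\, c_{n+1} c_{n+1}^{T} \right]^{-1} & -(C_D)_n^{-1} c_{n+1} \left[ \sigma_{n+1}^{2} - c_{n+1}^{T} (C_D)_n^{-1} c_{n+1} \right]^{-1} \\ -\sigma_{n+1}^{-2}\, c_{n+1}^{T} \left[ (C_D)_n - \sigma_{n+1}^{-2}\, c_{n+1} c_{n+1}^{T} \right]^{-1} & \left[ \sigma_{n+1}^{2} - c_{n+1}^{T} (C_D)_n^{-1} c_{n+1} \right]^{-1} \end{bmatrix} . \tag{39} \]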
Substituting (39) into (8) yields

(31) \begin{align}
C_{n+1}^{-1} ={}& G_n^{T} \left[ (C_D)_n - \sigma_{n+1}^{-2}\, c_{n+1} c_{n+1}^{T} \right]^{-1} G_n - \sigma_{n+1}^{-2}\, g_{n+1} c_{n+1}^{T} \left[ (C_D)_n - \sigma_{n+1}^{-2}\, c_{n+1} c_{n+1}^{T} \right]^{-1} G_n \nonumber \\
& - \left[ \sigma_{n+1}^{2} - c_{n+1}^{T} (C_D)_n^{-1} c_{n+1} \right]^{-1} G_n^{T} (C_D)_n^{-1} c_{n+1}\, g_{n+1}^{T} + \left[ \sigma_{n+1}^{2} - c_{n+1}^{T} (C_D)_n^{-1} c_{n+1} \right]^{-1} g_{n+1} g_{n+1}^{T} + C^{-1} . \tag{40}
\end{align}
Upon using the Sherman-Morrison formula described by Golub and Van Loan, Eq. (40) reduces to

(32) \begin{align}
C_{n+1}^{-1} &= A + h\, g_{n+1}^{T} + g_{n+1} k^{T}, \quad \text{wherein} \tag{41} \\
A &= C_n^{-1} + G_n^{T} B\, G_n , \tag{42} \\
B &\equiv \frac{\sigma_{n+1}^{-2}\, (C_D)_n^{-1} c_{n+1} c_{n+1}^{T} (C_D)_n^{-1}}{1 - \sigma_{n+1}^{-2}\, c_{n+1}^{T} (C_D)_n^{-1} c_{n+1}} , \tag{43} \\
h &\equiv \left[ \sigma_{n+1}^{2} - c_{n+1}^{T} (C_D)_n^{-1} c_{n+1} \right]^{-1} \left[ g_{n+1} - G_n^{T} (C_D)_n^{-1} c_{n+1} \right] , \quad \text{and} \tag{44} \\
k^{T} &\equiv -\sigma_{n+1}^{-2}\, c_{n+1}^{T} \left[ (C_D)_n^{-1} + \frac{\sigma_{n+1}^{-2}\, (C_D)_n^{-1} c_{n+1} c_{n+1}^{T} (C_D)_n^{-1}}{1 - \sigma_{n+1}^{-2}\, c_{n+1}^{T} (C_D)_n^{-1} c_{n+1}} \right] G_n . \tag{45}
\end{align}
The repeated use of the matrix determinant lemma (D. A. Harville. Matrix Algebra from a Statistician's Perspective. Springer-Verlag, New York, N.Y., USA, 1997) and the Sherman-Morrison formula yields for Eq. 41

(33) \[ |C_{n+1}^{-1}| = |A| \left( 1 + g_{n+1}^{T} A^{-1} h \right) \left[ 1 + k^{T} \left( A^{-1} - \frac{A^{-1} h\, g_{n+1}^{T} A^{-1}}{1 + g_{n+1}^{T} A^{-1} h} \right) g_{n+1} \right] . \tag{46} \]
Note that the Woodbury formula described by Harville, a rank-k-generalization of the Sherman-Morrison result, could be used to calculate A.sup.−1. This yields
A.sup.−1=(C.sub.n.sup.−1+G.sub.n.sup.TBG.sub.n).sup.−1=C.sub.n−C.sub.nG.sub.n.sup.T(B.sup.−1+G.sub.nC.sub.nG.sub.n.sup.T).sup.−1G.sub.nC.sub.n.  (47)
However, from a computational point of view this is not very helpful as one would still have to calculate the inverse of two large matrices. To calculate |A| the generalized matrix determinant lemma can be used, which yields
|A|=|C.sub.n.sup.−1+G.sub.n.sup.TBG.sub.n|=|C.sub.n.sup.−1| |I+BG.sub.nC.sub.nG.sub.n.sup.T|,  (48)
wherein I is the identity matrix. With these results, Eqs. (46) and (48), the objective function can be expressed as

(34) \[ \frac{|C_{n+1}^{-1}|}{|C_n^{-1}|} = \left( 1 + g_{n+1}^{T} A^{-1} h \right) \left| I + B\, G_n C_n G_n^{T} \right| \left[ 1 + k^{T} \left( A^{-1} - \frac{A^{-1} h\, g_{n+1}^{T} A^{-1}}{1 + g_{n+1}^{T} A^{-1} h} \right) g_{n+1} \right] , \tag{49} \]
wherein A is given by Eq. 42, B is given by Eq. 43, h is given by Eq. 44, and k is given by Eq. 45. It can be shown that when the data covariance matrix is diagonal, i.e., when c.sub.n+1=0, Eq. 49 reduces to Eq. 16. This result, Eq. 49, would clearly be computationally expensive to implement, but may be an acceptable cost when the data measurements are not independent from one another.
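As a sanity check on Eq. 49, the diagonal-covariance limit can be worked out directly from the definitions in Eqs. 42 through 45. The following is a sketch of that reduction rather than part of the original derivation:

\[
\begin{aligned}
c_{n+1} = 0 \;&\Rightarrow\; B = 0, \quad A = C_n^{-1}, \quad h = \sigma_{n+1}^{-2}\, g_{n+1}, \quad k = 0, \\
\frac{|C_{n+1}^{-1}|}{|C_n^{-1}|} &= \left( 1 + g_{n+1}^{T} C_n\, \sigma_{n+1}^{-2}\, g_{n+1} \right) |I| \, [\, 1 + 0 \,] = 1 + \sigma_{n+1}^{-2}\, g_{n+1}^{T} C_n\, g_{n+1} ,
\end{aligned}
\]

which has the same form as the objective maximized in step 304 for single measurements with uncorrelated noise.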

(35) FIGS. 4a-4b, 5a-5c, and 6a-6c illustrate the benefits of accounting for prior information in optimal experimental design. More particularly, the illustrated examples show that while the non-Bayesian and the Bayesian approaches yield the same survey design when no prior model information is available and when observation noise is ignored, the Bayesian algorithm guides the design process to primarily focus on areas of higher model uncertainty. FIG. 4a shows the set of all rays from potential source-receiver pairs in a homogeneous model for which compressional-wave velocity is uniformly 2400 m/s. Two vertical boreholes are positioned on the left and right sides of the model. Twelve uniformly-spaced seismic sources are placed in the left borehole, and twelve uniformly-spaced seismic receivers are placed in the right borehole. The measurement is the compressional-wave traveltime between each source-receiver pair. Consequently, a total of 12×12 potential source-receiver pairs are considered in this experimental design problem. FIGS. 4b and 5a illustrate survey designs from non-Bayesian and Bayesian algorithms, respectively, where the Bayesian design neglects measurement noise and model uncertainty. In the Bayesian algorithm these can be neglected by setting C.sub.D∝I and C∝I. The rays are numbered in order of their selection by the optimization algorithms. Note that the two designs are identical. FIG. 5b illustrates the result of modifying the Bayesian design by using the same noise model as in the previous designs, but with the uncorrelated prior model covariance having a standard deviation within a disk that is 10 times larger than in the other portion of the model. Note that the Bayesian algorithm responds to the locally larger uncertainty by concentrating more rays within this region of uncertainty in order to reduce the uncertainty toward the background level. FIG. 5c is a plot of the convergence of |C.sub.n+1.sup.−1|/|C.sub.n.sup.−1|−1 versus n. Note that initial convergence is rapid, while the overall convergence is monotonic.
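An uncorrelated prior of the kind used for FIG. 5b can be represented by the diagonal of C alone. The sketch below builds that diagonal for a rectangular cell grid with a disk of elevated uncertainty. The grid size, disk position, radius, and background standard deviation are hypothetical values chosen for illustration, since the text does not state them, and the function name is likewise an assumption.

```python
def disk_prior_variances(nx, nz, center, radius, sigma_background, ratio=10.0):
    """Diagonal of an uncorrelated prior covariance on an nx-by-nz cell grid.

    Cells whose centers fall inside the disk get a standard deviation
    `ratio` times the background value. Variances are returned in
    row-major order (one entry per model parameter).
    """
    cx, cz = center
    variances = []
    for iz in range(nz):
        for ix in range(nx):
            # Cell centers sit at half-integer grid coordinates.
            dx, dz = ix + 0.5 - cx, iz + 0.5 - cz
            inside = dx * dx + dz * dz <= radius * radius
            sigma = sigma_background * (ratio if inside else 1.0)
            variances.append(sigma ** 2)
    return variances


# Hypothetical 4x4 grid with a unit-radius disk at the grid center.
prior_diag = disk_prior_variances(4, 4, center=(2.0, 2.0), radius=1.0,
                                  sigma_background=1.0)
print(prior_diag.count(100.0), "cells inside the disk")
```

A 10 times larger standard deviation means a 100 times larger variance, so kernels that cross the disk contribute far more to g.sup.TC.sub.ng and are favored by the selection criterion.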

(36) FIGS. 6a through 6c illustrate comparative results with a heterogeneous model. Geometry of the velocity grid is the same as that of the homogeneous model described above, but the velocities and the distribution of the sources and receivers are different. The uniformly-spaced sources are now placed on the surface, and the receivers are distributed in two boreholes, one of which is vertical, the other of which is deviated. Hence, a total of 325 source-receiver pairs are considered. A simple two-layer inhomogeneous velocity model is considered. It consists of an upper layer in which the velocity of the seismic waves is 2000 m/s and a lower layer in which the velocity is 2400 m/s. The two layers are separated by an interface with a Gaussian hump. Out of the 325 rays, the ray tracer failed to trace 28% of the rays. The reason behind this failure is that they lie in the shadow zone, making these failed rays poor candidates for observation stations. FIGS. 6b and 6c show the first ten observation stations in the survey designs created by the non-Bayesian and Bayesian algorithms, respectively. The Bayesian algorithm uses an uncorrelated prior model covariance in which the standard deviation within the disk is 10 times larger than in the other portion of the model. Note that the Bayesian design yields a survey design that concentrates on those areas with higher uncertainty, even in the presence of inhomogeneities.

(37) While the invention is described through the above exemplary embodiments, it will be understood by those of ordinary skill in the art that modification to and variation of the illustrated embodiments may be made without departing from the inventive concepts herein disclosed. Moreover, while the preferred embodiments are described in connection with various illustrative structures, one skilled in the art will recognize that the system may be embodied using a variety of specific structures. Accordingly, the invention should not be viewed as limited except by the scope and spirit of the appended claims.