Harmonic encoding for FWI

Abstract

A deterministic method for selecting a set of encoding weights for simultaneous encoded-source inversion of seismic data that will cause the iterative inversion to converge faster than randomly chosen weights. The encoded individual source gathers are summed (83), forming a composite gather, and simulated in a single simulation operation. The invention creates multiple realizations of the simulation (84), each with its own encoding vector (82) whose components are the weights for the shots in the composite gather. The encoding vectors of the invention are required to be orthogonal (82), which condition cannot be satisfied by random weights, and in various embodiments of the invention are related to eigenvectors of a Laplacian matrix, sine or cosine functions, or Chebyshev nodes as given by the roots of Chebyshev polynomials. For non-fixed receiver geometry, an encoded mask (61) may be used to approximately account for non-listening receivers.

Claims

1. A computer-implemented iterative method for inversion of seismic data to update a model of subsurface velocity or other physical property, wherein a plurality of encoded source gathers of data are inverted simultaneously, said method comprising: (a) with a computer, selecting a plurality of individual source gathers of the seismic data; (b) in a first iteration, encoding the selected gathers with weights using the computer, said weights forming components of a weight vector, and summing the encoded gathers to form a composite gather; (c) generating, with the computer, at least one realization of predicted data for the entire composite gather, wherein the predicted data are computer-simulated, using a current model, in a single forward-modeling operation, a different realization being characterized by a different weight vector; (d) updating, with the computer, the current model using the composite gather and the simulated composite gather from each of the at least one realization; (e) in a second iteration, repeating (b)-(d), using the updated model from the first iteration as the current model for the second iteration, resulting in a further updated model; and (f) using the further updated model in prospecting for hydrocarbons; wherein, (i) each iteration has a plurality of realizations, and the weight vectors for each realization are orthogonal to one another; or (ii) the weight vector or weight vectors for the first iteration are orthogonal to the weight vector or weight vectors for the second iteration; or both (i) and (ii); and wherein the orthogonal weight vectors are generated by selecting a set of random, linearly independent vectors and applying a Gram-Schmidt orthogonalization algorithm to them to produce an orthonormal set of vectors.

2. The method of claim 1, further comprising, after (c), repeating (a)-(c), selecting source gathers in (a) that were not previously selected, and using all composite gathers and the predicted data for each composite gather to generate the model update in (d).

3. The method of claim 1, wherein the seismic data are acquired using a non-fixed spread geometry for seismic receivers, and further comprising: designing a mask for each source gather in the composite gather, to mask non-listening receiver locations; generating a mask for the composite gather from the individual source gather masks; encoding each composite gather mask using weights selected to form a mask weight vector that is orthogonal to mask weight vectors used for the composite gather in one or more other realizations or in one or more other iterations; and applying the encoded composite gather mask to the composite gather in the forward modeling of predicted data for the composite gather.

4. The method of claim 1, wherein the orthogonal weight vectors are generated from a periodic harmonic function whose period is a function of seismic shot number.

5. The method of claim 4, wherein the periodic harmonic function (w) may be represented mathematically as $w = \sin (\frac{x}{n} * k)$ where n is the number of individual source gathers selected to form the composite source gather; x is the seismic shot number, i.e. an integer ranging from 1 to n; and k is a selected reference frequency unique to each weight vector.

6. The method of claim 5, wherein a low range of frequencies is selected from which to choose a k for each weight vector for the first iteration, and a progressively higher frequency range is used for the second and any subsequent iterations.

7. The method of claim 1, wherein weight vectors a.sup.(i)=(a.sub.1.sup.(i), . . . , a.sub.n.sup.(i)) for i=1 . . . k realizations are constructed so as to minimize a selected measure of A.sub.k−I.sub.n×n, where I is an identity matrix; n is the number of shots, meaning individual-source gathers, in the composite gather; A.sub.k=Σ.sub.i=1.sup.k A.sup.(i), where A.sup.(i) is a matrix given by an outer product of the weight vector a.sup.(i) with itself.

8. The method of claim 1, wherein updating the current model using the composite gather and the simulated composite gather comprises: computing a cost function measuring misfit between the composite gather and the simulated composite gather; computing a gradient of the cost function with respect to the model parameters; and using the gradient to update the current model.

9. The method of claim 8, wherein the gradient is computed by correlating a forward simulation time series representing the simulated composite gather with a backward simulation time series at each model location, wherein the backward simulation time series is computed from the composite gather and the simulated composite gather in a computation that depends on the cost function.

10. The method of claim 1, wherein the random vectors each have components all of which are selected from +1 and −1 with equal probability, and the random vectors are checked to ensure they are all linearly independent.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

(1) The present invention and its advantages will be better understood by referring to the following detailed description and the attached drawings in which:

(2) FIGS. 1A and 1B compare the error (sequential-source inversion is assumed to give the correct answer) for +1/−1 random encoding to the error for orthogonal encoding (as per the present inventive method) for fixed-spread acquisition geometry;

(3) FIG. 2 compares two different encoding approaches on FWI inversion of the Marmousi velocity model: random +1/−1 encoding versus the present inventive method's orthogonal encoding from the SVD of the matrix exp{−|x−y|.sub.2};

(4) FIG. 3 furthers the comparison of FIG. 2 by showing a measure of model convergence versus iteration number for the two encoding approaches and also for conventional sequential source inversion;

(5) FIG. 4 is a schematic diagram illustrating the contrast between random +1/−1 source encoding and the harmonic encoding of the present invention;

(6) FIG. 5 illustrates an example of an advantage of harmonic encoding of the present invention over other inversion techniques, even sequential-source inversion;

(7) FIG. 6 is a symbolic description of the mask encoding embodiment of the present invention, designed to deal with non-fixed spread acquisition geometry;

(8) FIGS. 7A-7C are schematic diagrams that illustrate full wavefield inversion, and two alternative approaches thereto, whereby the gradient computation is done sequentially, one source at a time, versus the approach where the gradient is computed for multiple encoded sources simultaneously; and

(9) FIG. 8 is a flow chart showing basic steps in some embodiments of the present inventive method.

(10) FIG. 5 is a black and white reproduction of original color drawings due to patent restrictions on use of color.

(11) The invention will be described in connection with example embodiments. However, to the extent that the following detailed description is specific to a particular embodiment or a particular use of the invention, this is intended to be illustrative only, and is not to be construed as limiting the scope of the invention. On the contrary, it is intended to cover all alternatives, modifications and equivalents that may be included within the scope of the invention, as defined by the appended claims.

DETAILED DESCRIPTION OF EXAMPLE EMBODIMENTS

(12) Fixed-Spread Acquisition Geometry.

(13) The above-referenced simultaneous encoded-source method disclosed by Krebs et al. uses weights (Krebs called them encoding functions) that are, in a preferred embodiment of his invention, randomly chosen. By contrast, the present disclosure gives several ways to modify how these weights are chosen. In embodiments of the present invention, the weights (also called scaling constants herein) are obtained deterministically, not randomly or arbitrarily, and the weights are orthogonal relative to each other according to some inner product definition, including roots of orthogonal polynomials. Stated more precisely, a set of weights, one for each source gather in the super gather, form the components of a weight vector, and the weight vector, according to the present invention, must be orthogonal to weight vectors for other realizations of the super gather, or to weight vectors in other iteration cycles of the inversion, or both. Deterministic weights are weights that are generated according to a prescribed procedure that produces an orthogonal set, or converts a set of weights into an orthogonal set. Generating deterministic weights does not involve use of a random number generator. Specific embodiments include: eigenvectors of a graph Laplacian matrix, sine/cosine pairs, and roots of orthogonal polynomials (i.e. Chebyshev nodes as given by the roots of Chebyshev polynomials). Each of these is a deterministic choice of weights and a key to observed improvements in inversion behavior.

(14) In many embodiments of the invention, the weight (encoding) vectors are smoothly varying, almost periodic functions of source location or some other source identification parameter. This applies to a single vector of weights, one weight for each source gather in a composite gather, selected for a single realization. Preferably, the weights assigned to individual source gathers in a composite source gather are unique, although good results may be obtained when vectors are repeated. Selecting/constructing suitable weights is described further below, including FIGS. 4 and 5.

(15) FIG. 4 is a schematic diagram that illustrates the contrast between basic encoding, as described by Krebs et al., and the harmonic encoding of the present invention. In harmonic encoding, the scaling constants may be picked from eigenvectors of a Laplace operator. In practice, the Laplace operator in question may be obtained as a Graph Laplacian matrix (see http://en.wikipedia.org/wiki/Laplacian_matrix for a definition and discussion) on the graph whose nodes are sources and whose connections link each source to nearby sources (e.g., a grid defines up to four connections for each source location). The eigenvectors of such a matrix are orthogonal (the inner product of any two such vectors is zero), and that is the property exploited in some embodiments of the present invention. The eigenvectors of a Laplace operator are called harmonic functions, which is why this encoding technique may be called Harmonic Encoding.
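The graph Laplacian construction described above can be sketched numerically. The following is a minimal illustration only, assuming a small rectangular grid of source locations with unit edge weights; the function name grid_laplacian and the grid size are hypothetical and are not taken from the disclosure.

```python
import numpy as np

def grid_laplacian(nx, ny):
    """Graph Laplacian L = D - W for an nx-by-ny grid of sources,
    each connected to its (up to four) nearest neighbours."""
    n = nx * ny
    W = np.zeros((n, n))
    for ix in range(nx):
        for iy in range(ny):
            p = ix * ny + iy
            if ix + 1 < nx:              # neighbour in the x direction
                q = (ix + 1) * ny + iy
                W[p, q] = W[q, p] = 1.0
            if iy + 1 < ny:              # neighbour in the y direction
                q = ix * ny + iy + 1
                W[p, q] = W[q, p] = 1.0
    return np.diag(W.sum(axis=1)) - W

L = grid_laplacian(4, 3)                 # 12 hypothetical source locations
eigvals, eigvecs = np.linalg.eigh(L)     # symmetric input: orthonormal eigenvectors
weights = eigvecs.T                      # row i is one candidate weight vector

# Gram matrix of the weight vectors; it should equal the identity to round-off.
gram = weights @ weights.T
print(np.allclose(gram, np.eye(L.shape[0])))
```

Each row of weights holds one weight per source, so a row could serve as the encoding vector for one realization under the assumptions stated above.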

(16) FIG. 5 illustrates an example of an advantage that harmonic encoding can have over other inversion techniques. A Gaussian anomaly was introduced into a velocity model (the diffuse square in the panel labeled "Target model"). Two types of inversion are shown: harmonic encoding on the left, and the standard (sequential) inversion on the right. The initial model is the same for both and has no information about the anomaly. Observe that the harmonic encoding approach is able to recover the anomaly and, eventually, to invert the model. By contrast, the standard (sequential) inversion fails to recover the anomaly and to converge. Although not shown here, the random encoding technique described by Krebs et al. (see FIG. 4) fails to converge as well.

(17) Unless adjustments are made to the basic method, simultaneous encoded-source inversion assumes a fixed spread of receivers, i.e. that all receivers are listening to all shots. This is often not the case in actual surveys, particularly marine streamer surveys. In addition to the deterministic method for improving the multiple realization approach described briefly above (and in more detail below) for fixed-receiver geometry, the present disclosure extends this concept to non-fixed spread geometries (see FIG. 6). In the fixed spread case, the present disclosure shows how to pick weights for n shots so the result of using n realizations is identical to the sequential approach (within numerical precision). It is also explained below how to pick sets of weights optimally, so that the best result is obtained if only k<n realizations are generated. Additionally, at least one of the embodiments disclosed herein appears to converge faster than the random approach.

(18) It should be noted that the benefits of orthogonal weight vectors in simultaneous encoded source inversion can be obtained with as few as one realization per iteration cycle. In this case, it is the single weight vector from each iteration cycle that must be orthogonal to the weight vectors from the other iteration cycles.

(19) The deterministic approach of the present disclosure may be derived by generalizing the following example.

Example 1

(20) Let there be only two sources, and define the vector of weights as a=(a.sub.1, a.sub.2). Denote by u.sub.1, w.sub.1 the forward and adjoint wavefields (the inner product of the two at each spatial location gives the gradient) due to source 1 and u.sub.2, w.sub.2 the respective wavefields due to source 2. Much of the inversion procedure in FWI may be reduced to quadratic forms of such wavefields. Specifically, the gradient used in iterative methods for conventional sequential-source inversion requires the sum of the inner products (u.sub.1, w.sub.1)+(u.sub.2, w.sub.2), each obtained by an independent simulation. By contrast, the simultaneous source simulation will produce the inner product

(21) $\begin{aligned} (a_{1} u_{1} + a_{2} u_{2},\; a_{1} w_{1} + a_{2} w_{2}) = {} & a_{1} a_{1} (u_{1}, w_{1}) + a_{1} a_{2} (u_{1}, w_{2}) \\ & + a_{2} a_{1} (u_{2}, w_{1}) + a_{2} a_{2} (u_{2}, w_{2}) . \end{aligned} \qquad [\text{Eq. 1}]$
A second realization using weights b=(b.sub.1, b.sub.2) will produce a similar inner product where only the weights are different.

(22) Applying [Eq. 1] to a and b, one obtains

(23) $\begin{aligned} & (a_{1} u_{1} + a_{2} u_{2},\; a_{1} w_{1} + a_{2} w_{2}) + (b_{1} u_{1} + b_{2} u_{2},\; b_{1} w_{1} + b_{2} w_{2}) \\ & \quad = (a_{1} a_{1} + b_{1} b_{1}) (u_{1}, w_{1}) + (a_{1} a_{2} + b_{1} b_{2}) (u_{1}, w_{2}) \\ & \qquad + (a_{2} a_{1} + b_{2} b_{1}) (u_{2}, w_{1}) + (a_{2} a_{2} + b_{2} b_{2}) (u_{2}, w_{2}) \end{aligned} \qquad [\text{Eq. 2}]$
Therefore, the sum of the two simultaneous source inner products is completely described by the sum of the matrices A and B, whose entries are a.sub.ij=a.sub.ia.sub.j and b.sub.ij=b.sub.ib.sub.j (i.e., each matrix is the outer product of a weight vector with itself):

(24) $A + B = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} + \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} = \begin{bmatrix} a_{11} + b_{11} & a_{12} + b_{12} \\ a_{21} + b_{21} & a_{22} + b_{22} \end{bmatrix}$
Observe that the sequential approach is captured here also: pick a=(1,0) and b=(0,1) and

(25) $A + B = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} .$
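The two-source example can be checked numerically. The sketch below assumes random arrays as stand-ins for the wavefields u.sub.1, u.sub.2, w.sub.1, w.sub.2 (no wave simulation is performed) and uses the orthonormal pair a=(1,1)/√2, b=(1,−1)/√2, chosen here only for illustration; the helper name encoded_inner_product is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 50                             # hypothetical samples per wavefield
u = rng.standard_normal((2, n_samples))    # stand-ins for forward wavefields u1, u2
w = rng.standard_normal((2, n_samples))    # stand-ins for adjoint wavefields w1, w2

def encoded_inner_product(weights):
    """Left-hand side of Eq. 1: (a1*u1 + a2*u2, a1*w1 + a2*w2)."""
    return np.dot(weights @ u, weights @ w)

# Sequential-source result: (u1, w1) + (u2, w2).
sequential = np.dot(u[0], w[0]) + np.dot(u[1], w[1])

# Two realizations with an orthonormal pair of weight vectors.
a = np.array([1.0, 1.0]) / np.sqrt(2.0)
b = np.array([1.0, -1.0]) / np.sqrt(2.0)
two_realizations = encoded_inner_product(a) + encoded_inner_product(b)

# Matrix form: sum of the outer products of a and b with themselves.
A_plus_B = np.outer(a, a) + np.outer(b, b)
print(np.isclose(two_realizations, sequential), np.allclose(A_plus_B, np.eye(2)))
```

In this sketch the cross terms (u.sub.1, w.sub.2) and (u.sub.2, w.sub.1) cancel between the two realizations, so an orthonormal pair other than a=(1,0), b=(0,1) also reproduces the sequential sum.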

(26) In general, when there are n sources and k realizations, one has k weight vectors denoted a.sup.(i)=(a.sub.1.sup.(i), . . . , a.sub.n.sup.(i)), the effect of each of which may be described with a matrix A.sup.(i). The effect of summing all k realizations is therefore described by the matrix A.sub.k which, in turn, approximates the identity matrix (i.e. the sequential approach). In symbols:

(27) $A_{k} = \sum_{i = 1}^{k} A^{(i)} \approx I_{n \times n}$

(28) The following formalizes the preceding discussion and is a direct consequence of singular value decomposition theory.

(29) Proposition 1: Let the a.sup.(i) form an orthonormal set, i.e.

(30) $(a^{(i)}, a^{(j)}) = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases} .$
Then,
1. A.sub.k is the best k-realization approximation to I.sub.n×n, i.e. error(k,n)=|A.sub.k−I.sub.n×n|.sub.Frobenius is lowest.
2. A.sub.n=Σ.sub.i=1.sup.nA.sup.(i)=I.sub.n×n.
3. At least n realizations are needed to reproduce the sequential approach exactly.
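Proposition 1 can be illustrated with a short numerical sketch. It assumes an arbitrary orthonormal set obtained from a QR factorization of a random matrix (a stand-in, not one of the disclosed constructions), and the function name encoding_error is hypothetical. For an orthonormal set, A.sub.k is a projection of rank k, so the Frobenius error equals √(n−k) and vanishes only at k=n.

```python
import numpy as np

def encoding_error(vectors):
    """error(k, n) = ||A_k - I||_F, where A_k is the sum of the outer
    products of the k weight vectors stored as rows of `vectors`."""
    n = vectors.shape[1]
    A_k = vectors.T @ vectors
    return np.linalg.norm(A_k - np.eye(n), ord="fro")

n = 8
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))   # columns form an orthonormal basis
ortho = Q.T                                        # rows are weight vectors a^(i)

# Error after using the first k vectors, for k = 1, ..., n.
errors = [encoding_error(ortho[:k]) for k in range(1, n + 1)]
expected = [np.sqrt(n - k) for k in range(1, n + 1)]
print(np.allclose(errors, expected))
```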
Choosing a Set of Weights for Simultaneous Source Encoding Assuming a Fixed Spread Geometry
Method 1: Convert an Existing Set of Weights Using Gram-Schmidt

(31) FIGS. 1A and 1B illustrate a straightforward application of Proposition 1. The procedure consists of two steps:
1. Choose k random vectors r.sup.(i), each of size n, with entries +1 and −1 with equal probability. Make sure that they are linearly independent.
2. Apply the Gram-Schmidt orthogonalization algorithm (http://en.wikipedia.org/wiki/Gram-Schmidt_process) to the set of random vectors to produce the orthonormal set a.sup.(i).
The example in FIGS. 1A and 1B used k=n in the first step. The left-hand graph in FIG. 1A shows the error according to Proposition 1 (shown is error(k,100) for k=1, . . . , 100). Random +1/−1 encoding is shown by the dark solid curve. An orthogonalized version of that encoding using Gram-Schmidt is shown by the lighter dashed curve. The right-hand graph in FIG. 1B shows the error in the gradient using the same encodings. Note that the error in the left panel does not take into account the actual wavefield inner products whereas the error in the right panel does. Also, observe that after 100 realizations, the random encoding does not describe the sequential gradient exactly whereas, as expected from Proposition 1, the orthogonal version does. In addition, the error of the random encoding at 100 realizations is achieved by the orthogonal version after 50 realizations.
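The two steps of Method 1 can be sketched as follows. This is a minimal illustration, assuming a modified Gram-Schmidt loop and a rank test as the linear-independence check; the function names, the random seed, and the size n=16 are hypothetical choices and do not reproduce the example of FIGS. 1A and 1B.

```python
import numpy as np

def random_sign_vectors(k, n, rng):
    """Step 1: k vectors of length n with entries +1/-1, redrawn until
    the set is linearly independent (checked via the matrix rank)."""
    while True:
        R = rng.choice([-1.0, 1.0], size=(k, n))
        if np.linalg.matrix_rank(R) == k:
            return R

def gram_schmidt(R):
    """Step 2: modified Gram-Schmidt applied to the rows of R."""
    Q = np.array(R, dtype=float)
    for i in range(Q.shape[0]):
        for j in range(i):
            Q[i] -= np.dot(Q[j], Q[i]) * Q[j]   # remove component along earlier vector
        Q[i] /= np.linalg.norm(Q[i])            # normalize to unit length
    return Q

rng = np.random.default_rng(42)
n = 16
R = random_sign_vectors(n, n, rng)   # k = n, as in the example described above
a = gram_schmidt(R)                  # rows form the orthonormal set a^(i)

A_n = a.T @ a                        # sum of outer products over all n vectors
print(np.allclose(A_n, np.eye(n)))
```

With k=n the summed outer products reproduce the identity to round-off, which is item 2 of Proposition 1.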
Method 2: Exploit Properties of Wavefield Inner Products

(32) It can be shown that the cross-talk (noise) due to simultaneously simulating nearby sources is larger than the cross-talk due to sources that are far apart. In other words, the inner products corresponding to off-diagonal entries that are close to the diagonal are the most problematic. Thus, if we can afford k realizations, then we can group shots into clusters of k and apply to each cluster orthogonal weight vectors of length k. For example, we could pick the canonical basis for dimension k (b.sup.(i)=(0, . . . , 1, . . . , 0) where the 1 is at position i). Another approach is to apply the procedure of Method 1 to a problem of size k. Thus, the final vectors for the full problem of size n>k are formed by concatenating n/k copies of the vectors for the problem of size k, i.e. a.sup.(i)=(b.sup.(i), . . . , b.sup.(i)). After k realizations there will be no cross-talk due to sources that are closer than k units apart. This optimal-k encoding may be randomly perturbed from iteration to iteration by multiplying the b's above by 1 or −1 chosen at random for each group. Alternatively, the perturbation may be achieved with an orthogonal set of vectors with dimension equal to n/k. Additionally, the location of the 1 in the optimal-k vectors may be randomly perturbed to improve the inversion results.
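The clustered encoding of Method 2 can be sketched for the canonical-basis choice. The sizes n=12 and k=4 are hypothetical, and the sketch assumes that k divides n and that shots are ordered so that neighbouring indices correspond to neighbouring sources.

```python
import numpy as np

n, k = 12, 4                         # hypothetical sizes; k divides n in this sketch
B = np.eye(k)                        # canonical basis vectors b^(i) of dimension k
a = np.tile(B, (1, n // k))          # a^(i) = (b^(i), ..., b^(i)), shape (k, n)

A_k = a.T @ a                        # sum over k realizations of the outer products

# The cross-talk coefficient between shots p and q is A_k[p, q] for p != q.
p, q = np.indices((n, n))
near = (p != q) & (np.abs(p - q) < k)        # distinct shots closer than k apart
far_multiple = (p != q) & ((p - q) % k == 0) # distinct shots a multiple of k apart

print(np.all(A_k[near] == 0), np.all(A_k[far_multiple] == 1))
```

In this sketch the only remaining cross-talk terms couple shots whose indices differ by a multiple of k, which matches the statement that sources closer than k units apart do not interfere after k realizations.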

(33) Method 3: Design a Matrix of Rank N and Non-Repeating Singular Values that Approximates the Identity Matrix.

(34) A matrix M can be designed that approximates a desired behavior. For example, the identity matrix (because it represents the sequential approach) can be approximated by defining M.sub.i,j=exp(−|loc(i)−loc(j)|). Applying SVD to M, one obtains an orthonormal set of weights. This choice seems to give the best results in 2D inversion tests that were run. See FIG. 2 and FIG. 3, which compare this embodiment of the present invention to random +1/−1 encoding for FWI inversion of the Marmousi velocity model. Note the faster convergence of this method compared to the random approach. Also, note that, as FIG. 5 further demonstrates, this embodiment may be preferred (faster convergence) even over the standard (sequential) FWI.
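Method 3 can be sketched in a few lines. The example assumes a single line of equally spaced shots with unit spacing, so that loc(i) is simply the shot index; the spacing, and therefore the decay rate of the exponential, is a hypothetical choice that the text does not fix.

```python
import numpy as np

n = 20
loc = np.arange(n, dtype=float)                    # hypothetical 1-D shot positions
M = np.exp(-np.abs(loc[:, None] - loc[None, :]))   # M_ij = exp(-|loc(i) - loc(j)|)

U, s, Vt = np.linalg.svd(M)                        # singular values in descending order
weights = U.T                                      # row i: one orthonormal weight vector

print(np.allclose(weights @ weights.T, np.eye(n)))
```

Because M is symmetric, its singular vectors coincide with its eigenvectors up to sign, which is why the text that follows can speak of the eigenvectors of this matrix.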

(35) It may be noted that the eigenvectors of the example matrix given in the preceding paragraph are related to harmonic functions, i.e. eigenvectors of a Laplace operator. In some special geometries, e.g. a line graph, these eigenvectors may be obtained as sine and cosine functions as described below. In other words, the analytical expression of the eigenvectors above is given by sine and cosine functions, and so similar results are obtained by defining the weights analytically as given by sine and cosine functions. For example, if n is the number of shots along a spatial dimension, then the weights may be given by

(36) $w = \sin (\frac{x}{n} * k)$
or by the cosine of the same arguments. Here, the argument x is an integer between 1 and n, and k is a spatial (i.e. reference) frequency for this weight. Note that it is by varying this k that different weight vectors are obtained, i.e. the ones used for independent realizations. This is a 1-D example (i.e. there is a single line of shots), but the same idea applies in 2-D: simply multiply two 1-D weight vectors. (The 2-D case needs two spatial frequencies, i.e. frequencies in space, and so we can take two 1-D vectors and then their outer product to get a matrix, i.e. a 2-D distribution of weights.)
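A sinusoidal family of weight vectors, and its 2-D extension by outer product, can be sketched as follows. Note the assumption: the sketch uses the argument πkx/(n+1) and the normalization √(2/(n+1)), i.e. the discrete sine transform basis, because that particular family is exactly orthonormal for x=1, . . . , n. This scaling is an illustrative choice and is not necessarily the scaling intended by the expression above; the function name sine_weights is hypothetical.

```python
import numpy as np

def sine_weights(n):
    """Rows are w_k(x) = sqrt(2/(n+1)) * sin(pi*k*x/(n+1)) for k, x = 1..n.
    The (n+1) scaling and the normalization are assumptions of this sketch."""
    x = np.arange(1, n + 1)
    k = np.arange(1, n + 1)
    return np.sqrt(2.0 / (n + 1)) * np.sin(np.pi * np.outer(k, x) / (n + 1))

n = 10
w = sine_weights(n)                  # row k-1 is the weight vector for frequency k

# 2-D distribution of weights: outer product of two 1-D weight vectors.
w2d_a = np.outer(w[0], w[2])         # spatial frequencies (1, 3)
w2d_b = np.outer(w[1], w[2])         # spatial frequencies (2, 3)

print(np.allclose(w @ w.T, np.eye(n)))
print(abs(np.sum(w2d_a * w2d_b)) < 1e-12)
```

Low values of k give slowly varying weights along the shot line, and higher values give more rapidly oscillating weights.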

(37) Using the above sinusoidal function as an example, experience has shown that it may be preferred to use lower frequency vectors for the first iteration of the inversion process, then progressively higher frequency vectors for each succeeding iteration. In other words, the range of k-values used for the different realizations in the first iteration would be a low range, and a progressively higher range would be used for each succeeding iteration. The next section discloses other schemes for varying (or not varying) the weights from one iteration to the next.

(38) Using the Set of Weights in an Inversion

(39) Given a set of weights, one can choose k vectors (one vector for each of the k realizations) to use for each iteration, but how to vary these vectors from one iteration to another is a decision that remains. Following are a few of the possible choices (some of which may be applied in conjunction with others). Regarding nomenclature, each vector will have n components, where each component is a weight for one of the n shots in the super (composite) gather.
1. Pick the same k vectors for each iteration.
2. Pick a different set of k vectors at random for each iteration.
3. Pick a set of k vectors that have not yet been picked in previous iterations.
   a. Choose at random from available ones.
   b. Choose in sequence: iteration i selects the i.sup.th group of k vectors.
4. If all vectors are exhausted:
   a. Ignore all picks and start anew.
   b. Generate a different orthogonal set using any method above.
   c. Use a random set instead of an orthogonal one.
5. Each vector in the set of k vectors may be multiplied by a random constant.
6. Any of the above methods may be used in conjunction with applying time shifts to the data.
   a. The time shift may be randomly chosen to be within a determined time window for each shot separately.
   b. The time shift as above but where the encoding is performed only for shots that have the same time shift.
   c. As in (b) but where the encoding is applied regardless of the time shift.
7. Any of the above where the shots may have already been encoded by applying frequency selection filters either prior to encoding or following the encoding.
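One of the listed choices, option 3(b) combined with fallback 4(a), can be sketched as follows. The orthonormal set used here comes from a QR factorization of a random matrix purely as a stand-in, and the function name pick_in_sequence and the sizes are hypothetical.

```python
import numpy as np

def pick_in_sequence(weights, k, iteration):
    """Option 3(b) with fallback 4(a): iteration i uses the i-th group of k
    vectors; once every group has been used, start again from the first."""
    n_groups = weights.shape[0] // k
    g = iteration % n_groups
    return weights[g * k:(g + 1) * k]

n, k = 12, 3                                       # hypothetical sizes
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
weights = Q.T                                      # rows: stand-in orthonormal vectors

first = pick_in_sequence(weights, k, 0)            # vectors for iteration 0
second = pick_in_sequence(weights, k, 1)           # vectors for iteration 1
wrapped = pick_in_sequence(weights, k, n // k)     # all groups used: start anew

# Vectors used in different iterations are mutually orthogonal.
print(np.allclose(first @ second.T, 0.0))
```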
Non-Fixed Spread Geometry (e.g., Marine Streamer).

(40) Mask encoding, as disclosed herein, is a deterministic method that allows encoding multiple shots and simulating them simultaneously even for a non-fixed spread acquisition geometry. FIG. 6 describes this embodiment of the present inventive method. At the top of FIG. 6, the diagram illustrates, for a simple example of two shots (denoted by stars), how only the trailing receivers (solid inverted triangles) will be recording any shot; the positions where there is no receiver for that shot (but where there is a receiver for the other shot) are the open inverted triangles. When predicted data are model-simulated using a computer, data will be simulated for all receiver locations in the composite (encoded) gather being simulated. When the residuals are computed, i.e. some measure of the difference between predicted data and corresponding measured data, the residuals corresponding to non-listening receivers will not be zero, although they should be zero. This will cause the computed cost function and the resulting model update to be incorrect. One way of preventing this is to superimpose a mask on the data for each composite gather, the mask corresponding approximately (as contrasted with a hard mask acting on each individual source gather) to the non-listening receivers for the multi-source composite gather.

(41) If M.sub.i is the hard mask for the i.sup.th source gather G.sub.i, and CM.sub.k is the desired composite mask for the k.sup.th composite gather CG.sub.k, then the composite mask may be created such that
Σ.sub.iM.sub.i*G.sub.i≈Σ.sub.kCM.sub.k*CG.sub.k,
where the sum on the left is over all gathers in the composite gather, and the sum on the right is over all realizations the user may elect to have.

(42) For simulation of simultaneous sources, the masks may then be encoded (61 in FIG. 6) by treating them as shots and using the same harmonic encoding technique as in the fixed spread case; this is denoted by Maski in the equation for the adjoint source in FIG. 6. (The gradient of the cost function is computed by the adjoint method in this embodiment of the invention using mask encoding; see Tarantola, A., Inversion of seismic reflection data in the acoustic approximation, Geophysics 49, 1259-1266 (1984), which paper is incorporated herein by reference in all jurisdictions that allow it.) Then, shots are encoded as well (using the same technique but independently of the masks); the shot encoding is the ci in the equation for the adjoint source in FIG. 6. As in the fixed-spread case, to complete a cycle of the iterative inversion, more than one mask encoding realization is needed to produce a good approximation to the gradient of the cost function, i.e. to the model update. Thus, the cost is Rmasks*Rgrad forward simulations, where Rmasks is the number of mask encoding realizations and Rgrad is the number of shot encoding realizations used for the gradient. In contrast, only Rgrad simulations are required in the fixed spread case.

(43) In the aforementioned adjoint method, the gradient of the objective (cost) function may be computed by correlating a forward simulation time series at each model location with a backward simulation time series at the same location. The forward simulation ensues from simulating an encoded source signature (wavelet); the backward simulation ensues from an adjoint source (instead of the signature) computed in a way that depends on the choice of objective function. For example, the adjoint source for the L2 norm objective function is simply the difference between recorded data and forward simulated data, but each objective function may produce a different backward simulation source term.
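The two ingredients named above, the L2-norm adjoint source and the correlation of forward and backward time series at each model location, can be sketched as follows. This is only a schematic: the arrays are small hand-made stand-ins rather than simulated wavefields, no wave propagation is performed, and sign and scaling conventions for the gradient vary between formulations. The function names are hypothetical.

```python
import numpy as np

def l2_adjoint_source(recorded, simulated):
    """Adjoint source for an L2-norm objective: recorded minus simulated data."""
    return recorded - simulated

def correlate_gradient(forward, backward):
    """Zero-lag correlation over time at every model location.
    Both inputs have shape (n_time, n_locations)."""
    return np.sum(forward * backward, axis=0)

# Stand-in data traces: rows are time samples, columns are receivers.
recorded = np.array([[1.0, 2.0], [0.5, 0.0]])
simulated = np.array([[0.5, 2.0], [0.5, 1.0]])
adjoint_source = l2_adjoint_source(recorded, simulated)

# Stand-in forward and backward time series at two model locations.
forward = np.array([[1.0, 2.0], [3.0, 4.0]])
backward = np.array([[1.0, 0.0], [1.0, 1.0]])
gradient = correlate_gradient(forward, backward)
print(gradient)
```

In an actual inversion the backward time series would come from a simulation driven by the adjoint source; here the two steps are shown separately only to make each operation explicit.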

(44) Hermann and Haber (PCT Patent Application Publication WO 2011/160201) describe a method that, like the method of Krebs et al. for a fixed spread geometry, may greatly reduce the number of gradient calculations during an inversion. The key to their method is a stochastic (i.e. random choice of samples) inversion that utilizes randomly chosen weights to encode multiple shots into one together with a method that corrects for simulated data at receiver locations that do not record any data (this is the key difference between fixed spread, in which all receivers record data from all sources, and non-fixed spread, in which some receivers do not record data from some sources). By contrast, the present inventive method is totally deterministic and uses double encoding: to encode masks that perform the necessary correction as in Hermann and Haber's approach, and to encode the shots as taught herein for the fixed-spread case.

(45) The foregoing description is directed to particular embodiments of the present invention for the purpose of illustrating it. It will be apparent, however, to one skilled in the art, that many modifications and variations to the embodiments described herein are possible. All such modifications and variations are intended to be within the scope of the present invention, as defined by the appended claims.

Cpc classification

G01V2210/614; G01V2210/1214; G01V1/306; G01V2210/622; G01V1/282; G01V1/005; G01V2210/675 (all PHYSICS)

International classification

G01V1/28; G01V1/30; G01V1/00 (all PHYSICS)
