Harmonic encoding for FWI
10422899 · 2019-09-24
Cpc classification
G01V1/306
PHYSICS
G01V1/005
PHYSICS
Abstract
A deterministic method for selecting a set of encoding weights for simultaneous encoded-source inversion of seismic data that will cause the iterative inversion to converge faster than randomly chosen weights. The encoded individual source gathers are summed (83), forming a composite gather, and simulated in a single simulation operation. The invention creates multiple realizations of the simulation (84), each with its own encoding vector (82) whose components are the weights for the shots in the composite gather. The encoding vectors of the invention are required to be orthogonal (82), which condition cannot be satisfied by random weights, and in various embodiments of the invention are related to eigenvectors of a Laplacian matrix, sine or cosine functions, or Chebyshev nodes as given by the roots of Chebyshev polynomials. For non-fixed receiver geometry, an encoded mask (61) may be used to approximately account for non-listening receivers.
Claims
1. A computer-implemented iterative method for inversion of seismic data to update a model of subsurface velocity or other physical property, wherein a plurality of encoded source gathers of data are inverted simultaneously, said method comprising: (a) with a computer, selecting a plurality of individual source gathers of the seismic data; (b) in a first iteration, encoding the selected gathers with weights using the computer, said weights forming components of a weight vector, and summing the encoded gathers to form a composite gather; (c) generating, with the computer, at least one realization of predicted data for the entire composite gather, wherein the predicted data are computer-simulated, using a current model, in a single forward-modeling operation, a different realization being characterized by a different weight vector; (d) updating, with the computer, the current model using the composite gather and the simulated composite gather from each of the at least one realization; (e) in a second iteration, repeating (b)-(d), using the updated model from the first iteration as the current model for the second iteration, resulting in a further updated model; and (f) using the further updated model in prospecting for hydrocarbons; wherein, (i) each iteration has a plurality of realizations, and the weight vectors for each realization are orthogonal to one another; or (ii) the weight vector or weight vectors for the first iteration are orthogonal to the weight vector or weight vectors for the second iteration; or both (i) and (ii); and wherein the orthogonal weight vectors are generated by selecting a set of random, linearly independent vectors and applying a Gram-Schmidt orthogonalization algorithm to them to produce an orthonormal set of vectors.
2. The method of claim 1, further comprising, after (c), repeating (a)-(c), selecting source gathers in (a) that were not previously selected, and using all composite gathers and the predicted data for each composite gather to generate the model update in (d).
3. The method of claim 1, wherein the seismic data are acquired using a non-fixed spread geometry for seismic receivers, and further comprising: designing a mask for each source gather in the composite gather, to mask non-listening receiver locations; generating a mask for the composite gather from the individual source gather masks; encoding each composite gather mask using weights selected to form a mask weight vector that is orthogonal to mask weight vectors used for the composite gather in one or more other realizations or in one or more other iterations; and applying the encoded composite gather mask to the composite gather in the forward modeling of predicted data for the composite gather.
4. The method of claim 1, wherein the orthogonal weight vectors are generated from a periodic harmonic function whose period is a function of seismic shot number.
5. The method of claim 4, wherein the periodic harmonic function (w) may be represented mathematically as
6. The method of claim 5, wherein a low range of frequencies is selected from which to choose a k for each weight vector for the first iteration, and a progressively higher frequency range is used for the second and any subsequent iterations.
7. The method of claim 1, wherein weight vectors $a^{(i)}=(a_1^{(i)}, \ldots, a_n^{(i)})$ for $i=1,\ldots,k$ realizations are constructed so as to minimize a selected measure of $A_k - I_{n\times n}$, where $I$ is an identity matrix; $n$ is the number of shots, meaning individual-source gathers, in the composite gather; $A_k=\sum_{i=1}^{k} A^{(i)}$, where $A^{(i)}$ is a matrix given by an outer product of the weight vector $a^{(i)}$ with itself.
8. The method of claim 1, wherein updating the current model using the composite gather and the simulated composite gather comprises: computing a cost function measuring misfit between the composite gather and the simulated composite gather; computing a gradient of the cost function in model parameter space; and using the gradient to update the current model.
9. The method of claim 8, wherein the gradient is computed by correlating a forward simulation time series representing the simulated composite gather with a backward simulation time series at each model location, wherein the backward simulation time series is computed from the composite gather and the simulated composite gather in a computation that depends on the cost function.
10. The method of claim 1, wherein the random vectors each have components all of which are selected from +1 and -1 with equal probability, and the random vectors are checked to ensure they are all linearly independent.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) The present invention and its advantages will be better understood by referring to the following detailed description and the attached drawings in which:
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
(11) The invention will be described in connection with example embodiments. However, to the extent that the following detailed description is specific to a particular embodiment or a particular use of the invention, this is intended to be illustrative only, and is not to be construed as limiting the scope of the invention. On the contrary, it is intended to cover all alternatives, modifications and equivalents that may be included within the scope of the invention, as defined by the appended claims.
DETAILED DESCRIPTION OF EXAMPLE EMBODIMENTS
(12) Fixed-Spread Acquisition Geometry.
(13) The above-referenced simultaneous encoded-source method disclosed by Krebs et al. uses weights (Krebs called them encoding functions) that are, in a preferred embodiment of his invention, randomly chosen. By contrast, the present disclosure gives several ways to modify how these weights are chosen. In embodiments of the present invention, the weights (also called scaling constants herein) are obtained deterministically, not randomly or arbitrarily, and the weights are orthogonal relative to each other according to some inner product definition, including roots of orthogonal polynomials. Stated more precisely, a set of weights, one for each source gather in the super gather, form the components of a weight vector, and the weight vector, according to the present invention, must be orthogonal to weight vectors for other realizations of the super gather, or to weight vectors in other iteration cycles of the inversion, or both. Deterministic weights are weights that are generated according to a prescribed procedure that produces an orthogonal set, or converts a set of weights into an orthogonal set. Generating deterministic weights does not involve use of a random number generator. Specific embodiments include: eigenvectors of a graph Laplacian matrix, sine/cosine pairs, and roots of orthogonal polynomials (i.e. Chebyshev nodes as given by the roots of Chebyshev polynomials). Each of these is a deterministic choice of weights and a key to observed improvements in inversion behavior.
(14) In many embodiments of the invention, the weight (encoding) vectors are smoothly varying, almost periodic functions of source location or some other source identification parameter. This applies to a single vector of weights, one weight for each source gather in a composite gather, selected for a single realization. Preferably, the weights assigned to individual source gathers in a composite source gather are unique, although good results may be obtained when vectors are repeated. Selecting and constructing suitable weights is described further below.
(15)
(16)
(17) Unless adjustments are made to the basic method, simultaneous encoded-source inversion assumes a fixed spread of receivers, i.e. that all receivers are listening to all shots. This is often not the case in actual surveys, particularly marine streamer surveys. In addition to the deterministic method for improving the multiple realization approach described briefly above (and in more detail below) for fixed-receiver geometry, the present disclosure extends this concept to non-fixed spread geometries.
(18) It should be noted that the benefits of orthogonal weight vectors in simultaneous encoded source inversion can be obtained with as few as one realization per iteration cycle. In this case, it is the single weight vector from each iteration cycle that must be orthogonal to the weight vectors from the other iteration cycles.
(19) The deterministic approach of the present disclosure may be derived by generalizing the following example.
Example 1
(20) Let there be only two sources, and define the vector of weights as $a=(a_1, a_2)$. Denote by $u_1, w_1$ the forward and adjoint wavefields (the inner product of the two at each spatial location gives the gradient) due to source 1, and by $u_2, w_2$ the respective wavefields due to source 2. Much of the inversion procedure in FWI may be reduced to quadratic forms of such wavefields. Specifically, the gradient used in iterative methods for conventional sequential-source inversion requires the sum of the inner products $(u_1, w_1)+(u_2, w_2)$, each obtained by an independent simulation. By contrast, the simultaneous source simulation will produce the inner product
(21)
A second realization using weights $b=(b_1, b_2)$ will produce a similar inner product where only the weights are different.
(22) Applying [Eq. 1] to a and b, one obtains
(23)
Therefore, the sum of the two simultaneous source inner products is completely described by the sum of the matrices A and B:
(24)
Observe that the sequential approach is captured here also: pick a=(1,0) and b=(0,1) and
(25)
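As a sketch of the algebra behind Example 1, assume the inner product is bilinear and that the encoded forward and adjoint wavefields are $a_1u_1+a_2u_2$ and $a_1w_1+a_2w_2$. The matrix names $A$ and $B$ follow the text; the expansion itself is worked out from those definitions and is not a quotation of the displayed equations.

```latex
\begin{align*}
(a_1u_1 + a_2u_2,\; a_1w_1 + a_2w_2)
  &= a_1^2\,(u_1,w_1) + a_1a_2\,\bigl[(u_1,w_2) + (u_2,w_1)\bigr] + a_2^2\,(u_2,w_2) \\
  &= \sum_{i,j=1}^{2} A_{ij}\,(u_i,w_j),
  \qquad A = a\,a^{T} =
  \begin{pmatrix} a_1^2 & a_1a_2 \\ a_1a_2 & a_2^2 \end{pmatrix}, \\[4pt]
\text{two realizations } a, b:\quad
  & \sum_{i,j=1}^{2} (A+B)_{ij}\,(u_i,w_j), \qquad B = b\,b^{T}, \\[4pt]
a=(1,0),\; b=(0,1):\quad
  & A + B = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
  \;\Longrightarrow\; (u_1,w_1) + (u_2,w_2).
\end{align*}
```

The off-diagonal entries of $A+B$ multiply the cross terms $(u_1,w_2)$ and $(u_2,w_1)$, so the closer the summed matrix is to the identity, the closer the simultaneous-source result is to the sequential one.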
(26) In general, when there are n sources and k realizations, one has k weight vectors denoted $a^{(i)}=(a_1^{(i)}, \ldots, a_n^{(i)})$, the effect of each of which may be described with a matrix $A^{(i)}$. The effect of summing all k realizations is therefore described by the matrix $A_k$ which, in turn, approximates the identity matrix (i.e. the sequential approach). In symbols:
(27)
(28) The following formalizes the preceding discussion and is a direct consequence of singular value decomposition theory.
(29) Proposition 1: Let the $a^{(i)}$ form an orthonormal set, i.e.
(30)
Then:
1. $A_k$ is the best k-realization approximation to $I_{n\times n}$, i.e. $\mathrm{error}(k,n)=\|A_k - I_{n\times n}\|_{\mathrm{Frobenius}}$ is lowest.
2. $A_n=\sum_{i=1}^{n}A^{(i)}=I_{n\times n}$.
3. At least n realizations are needed to reproduce the sequential approach exactly.
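Proposition 1 can be checked numerically. The following is a minimal sketch, assuming NumPy and a hypothetical orthonormal set taken from the QR factorization of a random matrix; the function names are illustrative and do not come from the disclosure. For orthonormal vectors, the summed outer products form a rank-k projector, so the Frobenius error is the square root of n minus k and vanishes at k = n.

```python
import numpy as np

def summed_outer_products(vectors):
    """Return A_k = sum_i a^(i) a^(i)^T for the rows of `vectors` (shape k x n)."""
    vectors = np.asarray(vectors, dtype=float)
    return vectors.T @ vectors

def identity_error(vectors):
    """Frobenius norm of A_k - I for the rows of `vectors`."""
    a_k = summed_outer_products(vectors)
    n = a_k.shape[0]
    return np.linalg.norm(a_k - np.eye(n), ord="fro")

n = 8
rng = np.random.default_rng(0)
# Hypothetical orthonormal set: the Q factor of a random square matrix.
q, _ = np.linalg.qr(rng.standard_normal((n, n)))
orthonormal = q.T  # rows are orthonormal weight vectors

# Error after using the first k vectors, for k = 1..n.
errors = [identity_error(orthonormal[:k]) for k in range(1, n + 1)]
```

The error decreases monotonically with k and reaches zero only when all n vectors have been used, which matches items 2 and 3 of the proposition.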
Choosing a Set of Weights for Simultaneous Source Encoding Assuming a Fixed Spread Geometry
Method 1: Convert an Existing Set of Weights Using Gram-Schmidt
(31)
The example in
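A minimal sketch of Method 1, assuming NumPy: the starting vectors are drawn with components +1 or -1 and checked for linear independence, as in claim 10, and then orthonormalized with Gram-Schmidt. The function names, the seed, and the sizes are hypothetical choices for illustration.

```python
import numpy as np

def random_sign_vectors(k, n, rng):
    """Draw k vectors of length n with entries +1/-1, redrawing until linearly independent."""
    while True:
        v = rng.choice([-1.0, 1.0], size=(k, n))
        if np.linalg.matrix_rank(v) == k:
            return v

def gram_schmidt(vectors):
    """Orthonormalize the rows of `vectors` with modified Gram-Schmidt."""
    basis = []
    for v in np.asarray(vectors, dtype=float):
        w = v.copy()
        for b in basis:
            w -= np.dot(b, w) * b  # remove the component along b
        norm = np.linalg.norm(w)
        if norm < 1e-12:
            raise ValueError("input vectors are linearly dependent")
        basis.append(w / norm)
    return np.array(basis)

rng = np.random.default_rng(42)
weights = gram_schmidt(random_sign_vectors(k=4, n=16, rng=rng))  # 4 realizations, 16 shots
```

Each row of `weights` would serve as the weight vector for one realization; the first row is simply the first input vector scaled to unit length.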
Method 2: Exploit Properties of Wavefield Inner Products
(32) It can be shown that the cross-talk (noise) due to simultaneously simulating nearby sources is larger than the cross-talk due to sources that are far apart. In other words, the inner products corresponding to off-diagonal entries that are close to the diagonal are the most problematic. Thus, if we can afford k realizations, then we can group shots into clusters of k and apply to each cluster orthogonal weight vectors of length k. For example, we could pick the canonical basis for dimension k ($b^{(i)}=(0, \ldots, 1, \ldots, 0)$, where the 1 is at position i). Another approach is to apply the procedure of Method 1 to a problem of size k. The final vectors for the full problem of size n then consist of the vectors for the problem of size k concatenated n/k times, i.e. $a^{(i)}=(b^{(i)}, \ldots, b^{(i)})$. After k realizations there will be no cross-talk due to sources that are closer than k units apart. This optimal-k encoding may be randomly perturbed from iteration to iteration by multiplying the $b$'s above by +1 or -1 chosen at random for each group. Alternatively, the perturbation may be achieved with an orthogonal set of vectors with dimension equal to n/k. Additionally, the location of the 1 in the optimal-k vectors may be randomly perturbed to improve the inversion results.
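The clustered encoding of Method 2 can be sketched as follows, assuming NumPy, the canonical basis as the size-k set, and n an exact multiple of k (a simplification made here, not a requirement stated in the text). The function name is hypothetical.

```python
import numpy as np

def clustered_weights(n, k):
    """Tile the canonical basis of dimension k across n shots.

    Row i is a^(i) = (b^(i), ..., b^(i)), with b^(i) the i-th canonical basis vector.
    """
    if n % k != 0:
        raise ValueError("n must be a multiple of k in this sketch")
    return np.tile(np.eye(k), (1, n // k))

n, k = 12, 4
a = clustered_weights(n, k)  # shape (k, n): one weight vector per realization
a_k = a.T @ a                # summed outer products over the k realizations
```

In `a_k`, every entry whose row and column indices differ by less than k (other than the diagonal) is zero, so shots closer than k units apart produce no cross-talk; the remaining nonzero off-diagonal entries sit at separations that are multiples of k.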
(33) Method 3: Design a Matrix of Rank N and Non-Repeating Singular Values that Approximates the Identity Matrix.
(34) A matrix M can be designed that approximates a desired behavior. For example, the identity matrix (because it represents the sequential approach) can be approximated by defining $M_{i,j}=\exp(-|\mathrm{loc}(i)-\mathrm{loc}(j)|)$. Applying SVD to M, one obtains an orthonormal set of weights. This choice seems to give the best results in 2D inversion tests that were run.
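A minimal sketch of Method 3, assuming NumPy, equally spaced shot locations, and a unit length scale in the exponent (the spacing and scale are assumptions made here for illustration). The function name is hypothetical.

```python
import numpy as np

def svd_weights(locations):
    """Build M[i, j] = exp(-|loc(i) - loc(j)|) and return its left singular vectors as rows."""
    loc = np.asarray(locations, dtype=float)
    m = np.exp(-np.abs(loc[:, None] - loc[None, :]))
    u, s, _ = np.linalg.svd(m)
    # Rows of u.T are orthonormal weight vectors; s is sorted in descending order.
    return u.T, s

locations = np.arange(10.0)  # hypothetical: 10 equally spaced shots
weights, singular_values = svd_weights(locations)
```

Because the singular values are returned in descending order, the leading rows of `weights` are the smoothest vectors, which lines up with the low-to-high frequency ordering discussed for the sinusoidal weights.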
(35) It may be noted that the eigenvectors of the example matrix given in the preceding paragraph are related to harmonic functions, i.e. eigenvectors of a Laplace operator. In some special geometries, e.g. a line graph, these eigenvectors may be obtained as sine and cosine functions as described below. In other words, the analytical expression of the eigenvectors above is given by sine and cosine functions, and so similar results are obtained by defining the weights analytically as given by sine and cosine functions. For example, if n is the number of shots along a spatial dimension, then the weights may be given by
(36)
or by the cosine of the same arguments. Here, the argument x is an integer between 1 and n, and k is a spatial (i.e. reference) frequency for this weight. Note that it is by varying this k that different weight vectors are obtained, i.e. the ones used for independent realizations. This is a 1-D example (i.e. there is a single line of shots), but the same idea applies in 2-D: simply multiply two 1-D weight vectors. (The 2-D case needs two spatial frequencies, i.e. frequencies in space, and so we can take two 1-D vectors and form their outer product to get a matrix, i.e. a 2-D distribution of weights.)
(37) Using the above sinusoidal function as an example, experience has shown that it may be preferred to use lower frequency vectors for the first iteration of the inversion process, then progressively higher frequency vectors for each succeeding iteration. In other words, the range of k-values used for the different realizations in the first iteration would be a low range, and a progressively higher range would be used for each succeeding iteration. The next section discloses other schemes for varying (or not varying) the weights from one iteration to the next.
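The sinusoidal weights and the low-to-high frequency schedule can be sketched as follows, assuming NumPy. The exact argument of the sine in the source's formula is not reproduced here, so this sketch assumes the discrete sine transform form sin(pi*k*x/(n+1)), which is exactly orthogonal over x = 1..n; the normalization, function names, and schedule sizes are likewise illustrative assumptions.

```python
import numpy as np

def sine_weights(n, ks):
    """Return one weight vector per spatial frequency k, evaluated at shots x = 1..n.

    Assumed form (not taken from the source): w_k(x) = sqrt(2/(n+1)) * sin(pi*k*x/(n+1)).
    """
    x = np.arange(1, n + 1)
    ks = np.asarray(ks)
    return np.sqrt(2.0 / (n + 1)) * np.sin(np.pi * np.outer(ks, x) / (n + 1))

def frequency_schedule(per_iteration, iterations):
    """Yield a low range of k values first, then progressively higher ranges."""
    for it in range(iterations):
        start = 1 + it * per_iteration
        yield list(range(start, start + per_iteration))

n = 32
schedule = list(frequency_schedule(per_iteration=4, iterations=3))
all_vectors = sine_weights(n, [k for ks in schedule for k in ks])

# 2-D distribution of weights from the outer product of two 1-D vectors.
weights_2d = np.outer(all_vectors[0], all_vectors[1])
```

With this schedule the first iteration uses k = 1..4, the second k = 5..8, and so on, so the vectors are orthogonal both within an iteration and across iterations.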
(38) Using the Set of Weights in an Inversion
(39) Given a set of weights, one can choose k vectors (one vector for each of the k realizations) to use for each iteration, but how to vary these vectors from one iteration to the next is another decision that remains. Following are a few of the possible choices (some of which may be applied in conjunction with others). Regarding nomenclature, each vector will have n components, where each component is a weight for one of the n shots in the super (composite) gather.
1. Pick the same k vectors for each iteration.
2. Pick a different set of k vectors at random for each iteration.
3. Pick a set of k vectors that have not yet been picked in previous iterations.
   a. Choose at random from available ones.
   b. Choose in sequence: iteration i selects the i-th group of k vectors.
4. If all vectors are exhausted:
   a. Ignore all picks and start anew.
   b. Generate a different orthogonal set using any method above.
   c. Use a random set instead of an orthogonal one.
5. Each vector in the set of k vectors may be multiplied by a random constant.
6. Any of the above methods may be used in conjunction with applying time shifts to the data.
   a. The time shift may be randomly chosen to be within a determined time window for each shot separately.
   b. The time shift as above but where the encoding is performed only for shots that have the same time shift.
   c. As in (b) but where the encoding is applied regardless of the time shift.
7. Any of the above where the shots may have already been encoded by applying frequency selection filters either prior to encoding or following the encoding.
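One combination of the choices listed above, option 3(b) with the restart of option 4(a), can be sketched as follows, assuming NumPy. The function names, the hypothetical orthonormal set, and the synthetic gathers are illustrative only.

```python
import numpy as np

def pick_vectors(weight_set, k, iteration):
    """Option 3(b) with restart 4(a): iteration i takes the i-th group of k rows, wrapping when exhausted."""
    n_groups = len(weight_set) // k
    group = iteration % n_groups
    return weight_set[group * k:(group + 1) * k]

def encode(gathers, weights):
    """Form a composite gather: weighted sum of shot gathers (shape n_shots x n_samples)."""
    return weights @ gathers

# Hypothetical orthonormal weight set for 6 shots (rows are weight vectors).
weight_set = np.linalg.qr(np.random.default_rng(1).standard_normal((6, 6)))[0].T
# Hypothetical data: 6 shot gathers with 50 samples each.
gathers = np.random.default_rng(2).standard_normal((6, 50))

# Iteration 1 with k = 2 realizations: one composite gather per weight vector.
composites = [encode(gathers, w) for w in pick_vectors(weight_set, k=2, iteration=1)]
```

Because the groups are disjoint slices of one orthonormal set, the vectors used in one iteration are orthogonal to those used in the other iterations until the set is exhausted and the selection wraps around.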
Non-Fixed Spread Geometry (e.g., Marine Streamer).
(40) Mask encoding, as disclosed herein, is a deterministic method that allows encoding multiple shots and simulating them simultaneously even for a non-fixed spread acquisition geometry.
(41) If $M_i$ is the hard mask for the i-th source gather $G_i$, and $CM_k$ is the desired composite mask for the k-th composite gather $CG_k$, then the composite mask may be created such that
$\sum_i M_i * G_i \approx \sum_k CM_k * CG_k$,
where the sum on the left is over all gathers in the composite gather, and the sum on the right is over all realizations the user may elect to have.
(42) For simulation of simultaneous sources, the masks may then be encoded (61).
(43) In the aforementioned adjoint method, the gradient of the objective (cost) function may be computed by correlating a forward simulation time series at each model location with a backward simulation time series at the same location. The forward simulation ensues from simulating an encoded source signature (wavelet); the backward simulation ensues from an adjoint source (instead of the signature) computed in a way that depends on the choice of objective function. For example, the adjoint source for the L2 norm objective function is simply the difference between recorded data and forward simulated data, but each objective function may produce a different backward simulation source term.
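The two ingredients named in this paragraph can be sketched as follows, assuming NumPy. The sign convention (simulated minus recorded), the factor of one half in the cost, and the function names are assumptions made here; the wave-equation simulations themselves and any scaling that depends on the model parameterization are omitted.

```python
import numpy as np

def l2_cost_and_adjoint_source(recorded, simulated):
    """L2 misfit and its adjoint source.

    Assumed convention: cost = 0.5 * sum(residual^2) with residual = simulated - recorded,
    so the adjoint source is the derivative of the cost with respect to the simulated data.
    """
    residual = simulated - recorded
    return 0.5 * np.sum(residual ** 2), residual

def gradient_by_correlation(forward, backward):
    """Zero-lag correlation over time of two (n_time, n_locations) time series."""
    return np.sum(forward * backward, axis=0)

recorded = np.array([1.0, 2.0, 3.0])
simulated = np.array([1.5, 2.0, 2.0])
cost, adjoint_source = l2_cost_and_adjoint_source(recorded, simulated)

# Hypothetical wavefield histories: 4 time steps at 3 model locations.
forward = np.ones((4, 3))
backward = 2.0 * np.ones((4, 3))
gradient = gradient_by_correlation(forward, backward)
```

A different objective function would change only `l2_cost_and_adjoint_source`; the correlation step stays the same.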
(44) Hermann and Haber (PCT Patent Application Publication WO 2011/160201) describe a method that, like the method of Krebs et al. for a fixed spread geometry, may greatly reduce the number of gradient calculations during an inversion. The key to their method is a stochastic (i.e. random choice of samples) inversion that utilizes randomly chosen weights to encode multiple shots into one, together with a method that corrects for simulated data at receiver locations that do not record any data. (This is the key difference between fixed spread, in which all receivers record data from all sources, and non-fixed spread, in which some receivers do not record data from some sources.) By contrast, the present inventive method is totally deterministic and uses double encoding: one encoding for the masks that perform the necessary correction, as in Hermann and Haber's approach, and one for the shots, as taught herein for the fixed-spread case.
(45) The foregoing description is directed to particular embodiments of the present invention for the purpose of illustrating it. It will be apparent, however, to one skilled in the art, that many modifications and variations to the embodiments described herein are possible. All such modifications and variations are intended to be within the scope of the present invention, as defined by the appended claims.