COMPUTER-IMPLEMENTED METHOD FOR HIGH SPEED MULTI-SOURCE LOADING AND RETRIEVAL OF WAVEFIELDS EMPLOYING FINITE DIFFERENCE MODELS
20210406427 · 2021-12-30
CPC classification: G06F30/23 (PHYSICS)
Abstract
A method for efficiently injecting sources and retrieving receiver wavefields in finite difference wavefield simulations on a GPU when the sources and receivers are not on numerical grid points and are therefore arbitrarily located. To accomplish this, the method employs GPU texture memory to increase the memory read bandwidth, positions the sources at arbitrary locations in the finite difference grid, and extends each source over a set of neighboring grid points.
Claims
1. A computer-implemented method for high-speed loading of source points of incidence using a finite difference model, when placed at arbitrary locations on a grid, the method comprising:
storing a source time wavelet, source points, and a modeled grid, having source gridpoints placed at arbitrary locations, using an acoustic wave velocity model in 3D coordinates, to a memory resource on a central processing unit;
retrieving the modeled grid from the memory resource by the central processing unit;
extracting pressure wavefield variables with arbitrary source gridpoint positions from the retrieved modeled grid by the central processing unit;
storing the extracted pressure wavefield variables with arbitrary source gridpoint positions to the memory resource on the central processing unit;
computing effective source gridpoint positions for the stored pressure wavefield variables with arbitrary gridpoint positions, by the central processing unit;
storing the computed effective source gridpoint positions for the pressure wavefield variables to the memory resource on the central processing unit;
copying the stored effective source gridpoint positions for the pressure wavefield variables from the memory resource on the central processing unit to an off-chip global memory resource located on a graphics processing unit, using a CUDA copy application programming interface;
mapping the off-chip global memory resource of the graphics processing unit to a texture memory resource of the graphics processing unit;
generating an initial time-step value by the graphics processing unit;
storing the generated initial time-step value to the off-chip global memory resource of the graphics processing unit;
inputting to the graphics processing unit a user-defined maximum time-step value;
storing the inputted user-defined maximum time-step value to the off-chip global memory resource of the graphics processing unit;
retrieving the initial time-step value and the user-defined maximum time-step value from the off-chip global memory resource of the graphics processing unit;
computing by the graphics processing unit a wave propagation algorithm, using the copied effective source gridpoint positions for the pressure wavefield variables, at every inputted time-step value between the retrieved initial time-step value and the user-defined maximum time-step value;
repeating the step of computing a wave propagation algorithm by the graphics processing unit until the last inputted time-step value is equal to the user-defined maximum time-step value;
loading by the graphics processing unit a new grid having new source gridpoints for the pressure wavefield variables at effective positions from each computed wave propagation algorithm;
copying from the graphics processing unit to the central processing unit the new grid having new source gridpoints for the pressure wavefield variables at effective positions from each computed wave propagation algorithm, using the CUDA copy application programming interface; and
storing to the memory resource of the central processing unit the copied new grid having new source gridpoints for the pressure wavefield variables at effective positions from each computed wave propagation algorithm.
2. The computer-implemented method of claim 1, wherein the step of storing a source time wavelet, source points, and a modeled grid, having source gridpoints placed at arbitrary locations, using an acoustic wave velocity model in 3D coordinates, to a memory resource on a central processing unit, further comprises a source wavelet represented in time domain by the expression:
s(t)
3. The computer-implemented method of claim 1, wherein the step of storing a source time wavelet, source points, and a modeled grid, having source gridpoints placed at arbitrary locations, using an acoustic wave velocity model in 3D coordinates, to a memory resource on a central processing unit, further comprises individual source points represented by the expression:
r_s = (x_s, y_s, z_s)
4. The computer-implemented method of claim 1, wherein the step of storing a source time wavelet, source points, and a modeled grid, having source gridpoints placed at arbitrary locations, using an acoustic wave velocity model in 3D coordinates, to a memory resource on a central processing unit, further comprises the acoustic wave velocity model represented by the expression:
V_P
5. The computer-implemented method of claim 1, wherein the step of extracting pressure wavefield variables with arbitrary source gridpoint positions, from the retrieved modeled grid by the central processing unit, further comprises the expression:
P(r, r_s; t)
6. The computer-implemented method of claim 1, wherein the step of computing effective source gridpoint positions for the stored pressure wavefield variables with arbitrary gridpoint positions, by the central processing unit, further comprises up to 512 gridpoint positions for each off-gridpoint source.
7. The computer-implemented method of claim 1, wherein the step of computing effective source gridpoint positions for the stored pressure wavefield variables with arbitrary gridpoint positions, by the central processing unit, further comprises assigning a weight to each gridpoint position represented by the expression:
S_{i_s+i, j_s+j, k_s+k}(n) = w_{i_s+i} w_{j_s+j} w_{k_s+k} s(n)
8. The computer-implemented method of claim 1, wherein the step of inputting to the graphics processing unit a user-defined maximum time-step value further comprises defining the maximum time-step value as the minimum of a stability requirement ranging from 0.1 ms to 2 ms, an input data sample interval ranging from 1 ms to 4 ms, and an imaging condition interval ranging from 2 ms to 8 ms.
9. The computer-implemented method of claim 1, wherein the step of computing by the graphics processing unit, a wave propagation algorithm, using the copied effective source gridpoint positions for the pressure wavefield variables, at every inputted time-step value, between the retrieved initial time-step value and the user-defined maximum time-step value, further comprises the expression:
10. The computer-implemented method of claim 1, wherein the step of loading by the graphics processing unit a new grid having new source gridpoints for the pressure wavefield variables at effective positions from each computed wave propagation algorithm further comprises the expression:
P^n_{i_s+i, j_s+j, k_s+k} = P^n_{i_s+i, j_s+j, k_s+k} + S_{i_s+i, j_s+j, k_s+k}(n)
11. A computer-implemented method for high-speed retrieval of receiver wavefield locations using a finite difference model, when placed at arbitrary locations on a grid, the method comprising:
storing a source time wavelet, receiver locations, and a modeled grid, having receiver gridpoints placed at arbitrary locations, using an acoustic wave velocity model in 3D coordinates, to a memory resource on a central processing unit;
retrieving the modeled grid from the memory resource by the central processing unit;
extracting receiver pressure wavefield variables with arbitrary receiver gridpoint positions from the retrieved modeled grid by the central processing unit;
storing the extracted receiver pressure wavefield variables with arbitrary receiver gridpoint positions to the memory resource on the central processing unit;
computing effective receiver gridpoint positions for the stored receiver pressure wavefield variables with arbitrary gridpoint positions, by the central processing unit;
storing the computed effective receiver gridpoint positions for the receiver pressure wavefield variables to the memory resource on the central processing unit;
copying the stored effective receiver gridpoint positions for the receiver pressure wavefield variables from the memory resource on the central processing unit to an off-chip global memory resource located on a graphics processing unit, using a CUDA copy application programming interface;
mapping the off-chip global memory resource of the graphics processing unit to a texture memory resource of the graphics processing unit;
generating an initial time-step value by the graphics processing unit;
storing the generated initial time-step value to the off-chip global memory resource of the graphics processing unit;
inputting to the graphics processing unit a user-defined maximum time-step value;
storing the inputted user-defined maximum time-step value to the off-chip global memory resource of the graphics processing unit;
retrieving the initial time-step value and the user-defined maximum time-step value from the off-chip global memory resource of the graphics processing unit;
computing by the graphics processing unit a wave propagation algorithm, using the copied effective receiver gridpoint positions for the receiver pressure wavefield variables, at every inputted time-step value between the retrieved initial time-step value and the user-defined maximum time-step value;
repeating the step of computing a wave propagation algorithm by the graphics processing unit until the last inputted time-step value is equal to the user-defined maximum time-step value;
loading by the graphics processing unit a new grid having new receiver gridpoints for the receiver pressure wavefield variables at effective positions from each computed wave propagation algorithm;
copying from the graphics processing unit to the central processing unit the new grid having new receiver gridpoints for the receiver pressure wavefield variables at effective positions from each computed wave propagation algorithm, using the CUDA copy application programming interface; and
storing to the memory resource of the central processing unit the copied new grid having new receiver gridpoints for the receiver pressure wavefield variables at effective positions from each computed wave propagation algorithm.
12. The computer-implemented method of claim 11, wherein the step of storing receiver locations, and a modeled grid, having receiver gridpoints placed at arbitrary locations, using an acoustic wave velocity model in 3D coordinates, to a memory resource on a central processing unit, further comprises individual receiver points represented by the expression:
r_r = (x_r, y_r, z_r)
13. The computer-implemented method of claim 11, wherein the step of storing receiver locations and a modeled grid, having receiver gridpoints placed at arbitrary locations, using an acoustic wave velocity model in 3D coordinates, to a memory resource on a central processing unit, further comprises the acoustic wave velocity model represented by the expression:
V_P
14. The computer-implemented method of claim 11, wherein the step of extracting receiver pressure wavefield variables with arbitrary receiver gridpoint positions, from the retrieved modeled grid by the central processing unit, further comprises the expression:
P^n_{i_r+i, j_r+j, k_r+k}
15. The computer-implemented method of claim 11, wherein the step of computing effective receiver gridpoint positions for the stored receiver pressure wavefield variables with arbitrary gridpoint positions, by the central processing unit, further comprises up to 512 gridpoint positions for each off-gridpoint receiver.
16. The computer-implemented method of claim 1, wherein the step of inputting to the graphics processing unit a user-defined maximum time-step value further comprises defining the maximum time-step value as the minimum of a stability requirement ranging from 0.1 ms to 2 ms, an input data sample interval ranging from 1 ms to 4 ms, and an imaging condition interval ranging from 2 ms to 8 ms.
17. The computer-implemented method of claim 1, wherein the step of computing by the graphics processing unit, a wave propagation algorithm, using the copied effective source gridpoint positions for the pressure wavefield variables, at every inputted time-step value, between the retrieved initial time-step value and the user-defined maximum time-step value, further comprises the expression:
18. The computer-implemented method of claim 11, wherein the step of loading by the graphics processing unit a new grid having new receiver gridpoints for the receiver pressure wavefield variables at effective positions from each computed wave propagation algorithm further comprises the expression:
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0036] The teachings of the present invention can be readily understood by considering the following description in conjunction with the accompanying drawings.
DETAILED DESCRIPTION OF THE INVENTION
[0042] Reference will now be made in detail to several embodiments of the present disclosure, examples of which are illustrated in the accompanying figures. It is noted that wherever practicable, similar or like reference numbers may be used in the figures and may indicate similar or like functionality. The figures depict embodiments of the present disclosure for purposes of illustration only. One skilled in the art will readily recognize from the following description that alternative embodiments of the structures, systems, and methods illustrated herein may be employed without departing from the principles of the disclosure described herein.
[0043] The computer-implemented method of the present invention introduces an efficient way to inject sources, and to retrieve wavefields, using finite difference models for wavefield simulation on the GPU when the source or receiver is not on numerical gridpoints. In the present state of the art, memory access is a bottleneck when compared to computation instructions. Furthermore, when sources are positioned at arbitrary locations in a finite difference grid, they must be extended over a number of grid points, thereby effectively increasing the number of source points to inject into the simulation. As these extended gridpoints are often not located in a contiguous memory region, this greatly increases the memory access cost; when the actual number of sources is small, the increased memory access cost is limited. But when performing reverse time migration (RTM) or full waveform inversion (FWI), the number of actual sources during backward wave simulations is equal to the number of data traces, thereby significantly increasing memory access costs. Similarly, in a forward wave simulation, when receivers are not on the grid points, the wavefield recorded at the receivers must be obtained from their neighboring grid points. In one of the embodiments of the present invention, the computer-implemented method aims to overcome these memory access costs by using GPU texture memory. Even though a GPU thread for source injection or wavefield retrieval will have to read noncontiguous memory, the memory access patterns exhibit a great deal of spatial locality. In a computing application, this roughly implies that a thread is likely to read from an address “near” the addresses that nearby threads read. GPU texture memory is designed to increase the effective bandwidth of such spatially local memory reads. Using this texture memory, the efficiency of source injection and wavefield retrieval can improve by a factor of ten.
1. Deployment of a Technique to Increase GPU Bandwidth
[0044] In order to compute an acoustic wave propagation equation by the present computer-implemented method, an algorithm, referred to herein as equation (1), was developed in simplified form.
[0046] In the finite difference numerical solution, equation (1) is discretized in space and time, with grid sizes (N_x, N_y, N_z, N_t) in the X, Y, Z and t dimensions respectively. Consequently, the pressure wavefield only has values on the gridpoints P^n_{i,j,k}, with (i,j,k) = (1,1,1), . . . , (N_x, N_y, N_z) and n = 1, . . . , N_t.
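Equation (1) is the acoustic wave propagation equation governing the pressure wavefield P(r, r_s; t) for the velocity model V_P. Assuming its standard constant-density form, ∂²P/∂t² = V_P² ∇²P + s(t) δ(r − r_s), a minimal CUDA sketch of one discretized time step is given below; the kernel name, array layout, and the second-order Laplacian (used here for brevity, whereas the method contemplates higher-order operators such as 16th order) are illustrative assumptions rather than the patented implementation.

// Illustrative sketch only: one time step of a second-order-in-time,
// second-order-in-space discretization of the assumed acoustic wave equation:
//   P^{n+1} = 2 P^n - P^{n-1} + (Vp*dt)^2 * Laplacian(P^n)
// Array names (p_new, p_cur, p_old, vp) and the row-major layout are assumptions.
__global__ void acoustic_step(float *p_new, const float *p_cur, const float *p_old,
                              const float *vp, int nx, int ny, int nz,
                              float dt, float dx, float dy, float dz)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    int k = blockIdx.z * blockDim.z + threadIdx.z;
    if (i < 1 || j < 1 || k < 1 || i >= nx - 1 || j >= ny - 1 || k >= nz - 1) return;

    size_t idx = (size_t)k * nx * ny + (size_t)j * nx + i;
    // second-order central-difference Laplacian of the current pressure wavefield
    float lap = (p_cur[idx + 1] - 2.0f * p_cur[idx] + p_cur[idx - 1]) / (dx * dx)
              + (p_cur[idx + nx] - 2.0f * p_cur[idx] + p_cur[idx - nx]) / (dy * dy)
              + (p_cur[idx + (size_t)nx * ny] - 2.0f * p_cur[idx] + p_cur[idx - (size_t)nx * ny]) / (dz * dz);
    float c = vp[idx] * dt;
    // leapfrog update; source injection is applied separately (see Section 2)
    p_new[idx] = 2.0f * p_cur[idx] - p_old[idx] + c * c * lap;
}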
[0047] When the source location r_s is on a grid point (i_s, j_s, k_s), the source loading at time-step value n in the finite difference model is simply computed as:
P^n_{i_s, j_s, k_s} = P^n_{i_s, j_s, k_s} + s(n)   (2)
[0048] The memory cost of this source loading is simply one read and one write in the memory resource for each source. As such, most individuals skilled in the art ignore this cost when compared to other operations.
[0049] A. When Receiver Location is on Grid Point
[0050] On the other hand, when the receiver location r_r is on the grid point (i_r, j_r, k_r), the pressure wavefield R^n_{i_r, j_r, k_r} recorded at that receiver at time-step value n is simply read from the simulated wavefield as:
R^n_{i_r, j_r, k_r} = P^n_{i_r, j_r, k_r}   (3)
[0051] Likewise, the memory cost of this retrieval process is simply one read and one write in memory for each receiver. As such, most individuals skilled in the art also ignore this cost when compared to other operations.
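For on-grid sources and receivers, expressions (2) and (3) therefore reduce to a single read-modify-write and a single read per time step, as in the following minimal sketch; the array names p, wavelet and trace and the linear indices idx_s and idx_r are illustrative assumptions.

/* Illustrative sketch of expressions (2) and (3) for on-grid positions.
   idx_s and idx_r are the linear indices of the source and receiver grid
   points; p, wavelet and trace are assumed array names. */
void load_source_on_grid(float *p, const float *wavelet, size_t idx_s, int n)
{
    p[idx_s] += wavelet[n];          /* expression (2): one read, one write */
}

void record_receiver_on_grid(float *trace, const float *p, size_t idx_r, int n)
{
    trace[n] = p[idx_r];             /* expression (3): one read, one write */
}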
[0052] However, when the source point r_s = (x_s, y_s, z_s) is not on a finite difference grid site, one must find a set of grid-point sources (i,j,k) that are combined to be equivalent to the source located at r_s. As such, the following equation must be implemented:
S_{i_s+i, j_s+j, k_s+k}(n) = w_{i_s+i} w_{j_s+j} w_{k_s+k} s(n)   (4)
where (i_s, j_s, k_s) is the nearest grid point to the off-grid source location r_s, and w_{i_s+i}, w_{j_s+j}, w_{k_s+k} are the interpolation weights assigned to the extended grid points in the x, y and z directions.
[0053] In theory, the set of grid-point sources (i,j,k) for each actual point-source extends over the entire spatial grid. Each grid-point source from expression (4) is then loaded according to expression (2), which significantly increases the cost. However, in practice, because of accuracy considerations, a windowed grid-point source of equation (4) is used, such that for each point-source at r_s one uses the set of grid-point sources −n_x/2+1 ≤ i ≤ n_x/2, −n_y/2+1 ≤ j ≤ n_y/2, −n_z/2+1 ≤ k ≤ n_z/2. The effective number of gridpoint sources is therefore n_x n_y n_z. Once again for accuracy considerations, in practice 2n_{x(y,z)} is usually taken equal to the order N of the finite difference operator; therefore, for a 16th-order finite difference, n_{x(y,z)} = 8, and the effective number of combined grid-point source terms ends up being up to 512, increasing the loading cost by up to 512 times. Furthermore, since for each point-source the extended gridpoints −n_x/2+1 ≤ i ≤ n_x/2, −n_y/2+1 ≤ j ≤ n_y/2, −n_z/2+1 ≤ k ≤ n_z/2 are not in contiguous memory resources, the memory access cost is further increased. This increased memory access cost becomes even worse in the backward wave simulation stage for RTM or FWI, because there are N_x N_y actual point-sources and the number of extended grid point-sources becomes n_x n_y n_z N_x N_y.
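To make this access pattern concrete, the following host-side sketch enumerates the 8 × 8 × 8 = 512 extended grid points around the nearest grid point of a single off-grid source and computes their linear addresses in a row-major wavefield array; the stride jumps between rows illustrate why these accesses are not contiguous. Grid sizes, coordinates, and variable names are illustrative assumptions.

#include <stdio.h>

/* Sketch: enumerate the 8 x 8 x 8 = 512 extended grid points of one off-grid
   source for a 16th-order finite difference and compute their linear addresses
   in a row-major wavefield of size Nx*Ny*Nz; names and sizes are illustrative. */
int main(void)
{
    const int Nx = 1000, Ny = 1000;            /* fast and middle grid sizes       */
    const int nw = 8;                          /* window size per axis, N/2, N=16  */
    const int is_ = 417, js_ = 533, ks_ = 88;  /* nearest grid point of the source */

    int count = 0;
    for (int k = -nw / 2 + 1; k <= nw / 2; k++)
        for (int j = -nw / 2 + 1; j <= nw / 2; j++) {
            size_t row_start = (size_t)(ks_ + k) * Nx * Ny + (size_t)(js_ + j) * Nx
                             + (size_t)(is_ - nw / 2 + 1);
            /* only the nw points along i are contiguous; each new j jumps by Nx
               floats and each new k jumps by Nx*Ny floats in memory              */
            printf("row (j=%2d, k=%2d) starts at element %zu\n", j, k, row_start);
            count += nw;
        }
    printf("%d extended grid points for one off-grid source\n", count);   /* 512 */
    return 0;
}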
[0054] B. When Receiver Location is not on Grid Point
[0055] Similarly, when the receiver r_r = (x_r, y_r, z_r) is not on the finite difference grid site, one must find a set of grid-point wavefield terms (i,j,k) that are combined to obtain the wavefield located at r_r, using the following expression:
R(x_r, y_r, z_r, n) = Σ_{(i,j,k) = −n_{x,y,z}/2+1}^{n_{x,y,z}/2} w_{i_r+i} w_{j_r+j} w_{k_r+k} P^n_{i_r+i, j_r+j, k_r+k}   (5)
[0056] Under expression (5), the wavefield R(x_r, y_r, z_r, n) at the receiver is combined from the n_x n_y n_z grid-point wavefield values P^n_{i_r+i, j_r+j, k_r+k}, weighted by w_{i_r+i} w_{j_r+j} w_{k_r+k}.
[0057] Although the gridpoints −n_x/2+1 ≤ i ≤ n_x/2, −n_y/2+1 ≤ j ≤ n_y/2, −n_z/2+1 ≤ k ≤ n_z/2 needed for source loading or wavefield retrieval, when the source or the receivers are not on grid points, do not occupy a contiguous area in computer memory, they are still spatially localized around the source or receiver position. This locality can be exploited via the texture cached memory on the GPU, which was originally designed for graphics applications, where memory access patterns exhibit a great deal of spatial locality, and which offers roughly 10 times higher effective read bandwidth than reads from the off-chip global memory. This texture memory can also be used for general-purpose computing, but it has thus far not been exploited for source loading and wavefield retrieval.
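As an orientation for the source code excerpts in the following sections, a minimal sketch of this texture path is given below. It uses the legacy texture reference API (cudaBindTexture and tex1Dfetch) that those excerpts employ; variable and function names are illustrative assumptions, and newer CUDA toolkits would express the same idea with texture objects.

// Minimal sketch of reading a wavefield array through the texture cache.
// Legacy texture reference API, matching the excerpts below; names are assumptions.
texture<float, cudaTextureType1D, cudaReadModeElementType> d_p_tex;

__global__ void gather_through_texture(float *out, const size_t *addr, int n)
{
    int t = blockIdx.x * blockDim.x + threadIdx.x;
    if (t >= n) return;
    // scattered but spatially local reads are served by the texture cache
    out[t] = tex1Dfetch(d_p_tex, (int)addr[t]);
}

void bind_wavefield(float *d_p, size_t nbytes)
{
    // map the off-chip global memory of the wavefield to the texture reference
    cudaBindTexture(0, d_p_tex, d_p, nbytes);
}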
2. Loading or Injection of Sources or Points of Incidence
[0058] In general terms, the sequence of steps presented in one of the embodiments of the invention for fast multi-source injection at arbitrary locations in the finite difference method on a GPU includes:
[0059] a) Determining the effective number of the combined gridpoint sources for each actual source. When the source point r_s = (x_s, y_s, z_s) is not on the finite difference grid site, one must find a set of grid-point sources (i,j,k) that are combined to be equivalent to the source located at r_s, in accordance with formula (4), where
[0060] (i_s, j_s, k_s) is the nearest grid point of the (off-grid) source, Δx (Δy, Δz) is the grid step length in the x (y, z) direction, and w_{i_s+i} is the interpolation weight assigned,
[0061] for a grid point i with coordinate x_{i_s+i}, by a Kaiser-window-based interpolant,
where n_x is the user-input window size, or n_x = N/2, half the finite difference order, I_0 is the zero-order modified Bessel function, and b = 4.31 is the shape control parameter. Similarly one obtains the grid point j and the weight w_{j_s+j}, and the grid point k and the weight w_{k_s+k} (a sketch of this weight computation is given after the source code listings below).
[0062] b) Allocating these n_x n_y n_z N_s effective independent gridpoint source grid positions, their weights, as well as the wavefield variables, on the off-chip GPU global memory, as represented by the following source code:
[0063] // allocate the source positions, on grid points or not on grid points
[0064] cudaMalloc((void**)&config->srcpos_shot, config->NSRC_shot*sizeof(int4));
[0065] // allocate the source weights when offset from grid points
[0066] cudaMalloc((void**)&config->srcSw8_shot, config->NSRC_shot*sizeof(float));
[0067] c) Copying from the CPU memory to the off-chip GPU global memory resource using the CUDA copy API, having the following source code:
// copy the point source positions from CPU to GPU
cudaMemcpy(config->srcpos_shot, &config->srcpos_shot_thread[0], config->NSRC_shot*sizeof(int4), cudaMemcpyHostToDevice);
// copy the source weights from CPU to GPU
cudaMemcpy(config->srcSw8_shot, &config->srcSw8_shot_thread[0], config->NSRC_shot*sizeof(float), cudaMemcpyHostToDevice);
[0068] d) Mapping those allocated off-chip global memories to texture memories using the CUDA API (application programming interface), as represented by the following source code:
[0069] cudaBindTexture(0, &d_srcSw8_tex, config->srcSw8, &channelDesc, nx*ny*nz*config->NREC*sizeof(float));
[0071] e) Using the CUDA texture memory fetch API to read the independent point-source grid positions, the weights, and the wavefields P^n_{i_s+i, j_s+j, k_s+k} at those grid positions; and
[0072] f) Loading or injecting the effective independent gridpoint sources with the weights on the wavefields on the grid point (i_s+i, j_s+j, k_s+k), as represented by the following formula and source code:
P^n_{i_s+i, j_s+j, k_s+k} = P^n_{i_s+i, j_s+j, k_s+k} + S_{i_s+i, j_s+j, k_s+k}(n)
// fetch source location
int nslowsrc = tex1Dfetch(d_srcpos_tex, isrc) - RADIUS;
int nmiddlesrc = tex1Dfetch(d_srcpos_tex, 1*nsrc + isrc);
int nfastsrc = tex1Dfetch(d_srcpos_tex, 2*nsrc + isrc);
int j = tex1Dfetch(d_srcpos_tex, 3*nsrc + isrc);
size_t globalAddr = (nslowsrc + RADIUS)*stridemf + nmiddlesrc*stridef + nfastsrc;
// fetch the weighted source sample for this time step
float tmp = tex1Dfetch(src, it*ntsrc + j);
// fetch the current wavefield value through the texture cache
float pp = tex1Dfetch(d_p_tex, globalAddr);
// inject the weighted source sample into the wavefield
atomicAdd(&p[globalAddr], tmp);
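As referenced in step a), the per-axis interpolation weights and effective grid positions can be computed on the CPU before being copied to the GPU in steps b) and c). The exact windowing function is only partially described above; the sketch below assumes a Kaiser-windowed sinc interpolant of the kind commonly used for off-grid source placement, with I_0 the zero-order modified Bessel function and shape parameter b = 4.31 as stated above, so the precise weight formula, as well as the function and variable names, should be read as assumptions rather than the method's definitive choice.

#include <math.h>

static const double PI = 3.14159265358979323846;

/* Zero-order modified Bessel function I0 via its power series. */
static double bessel_i0(double x)
{
    double sum = 1.0, term = 1.0;
    for (int m = 1; m < 25; m++) {
        term *= (x / (2.0 * m)) * (x / (2.0 * m));
        sum  += term;
    }
    return sum;
}

/* Assumed Kaiser-windowed sinc weight for one axis: u is the distance, in grid
   cells, between grid point i_s + i and the true source coordinate. */
static float kaiser_sinc_weight(double u, int nw, double b)
{
    double r = 0.5 * nw;                       /* window half-width in cells */
    if (fabs(u) > r) return 0.0f;
    double kaiser = bessel_i0(b * sqrt(1.0 - (u / r) * (u / r))) / bessel_i0(b);
    double s = (u == 0.0) ? 1.0 : sin(PI * u) / (PI * u);       /* sinc */
    return (float)(kaiser * s);
}

/* Fill the nw weights along one axis for an off-grid source at coordinate xs
   (grid spacing dx); *i_near receives the nearest grid index i_s. */
void axis_weights(double xs, double dx, int nw, double b, int *i_near, float *w)
{
    int is_ = (int)floor(xs / dx + 0.5);       /* nearest grid point */
    for (int i = -nw / 2 + 1; i <= nw / 2; i++)
        w[i + nw / 2 - 1] = kaiser_sinc_weight((double)(is_ + i) - xs / dx, nw, b);
    *i_near = is_;
}

The three per-axis weight arrays of a source are then multiplied together to form w_{i_s+i} w_{j_s+j} w_{k_s+k} and stored, together with the extended grid positions, in the arrays that steps b) and c) allocate and copy to the GPU.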
3. Retrieving Receiver Wavefields
[0073] The sequence of steps for fast wavefield retrieval at arbitrary locations in the finite difference method on a GPU typically includes:
[0074] a) Determining the effective number of the combined grid points for each receiver, and calculating the contribution weights. When the receiver r_r is not on the finite difference grid site, one must find a set of grid-point wavefield terms (i,j,k) that are combined to obtain the wavefield located at r_r, using the formula:
R(x_r, y_r, z_r, n) = Σ_{(i,j,k) = −n_{x,y,z}/2+1}^{n_{x,y,z}/2} w_{i_r+i} w_{j_r+j} w_{k_r+k} P^n_{i_r+i, j_r+j, k_r+k}   (5)
[0075] Here n_{x(y,z)} is the window size, which a person having ordinary skill in the art would typically recognize as 2n_{x(y,z)} = N, the finite difference order;
[0076] (i_r, j_r, k_r) is the nearest grid point of the (off-grid) receiver, and w_{i_r+i} is the corresponding interpolation weight, computed in the same Kaiser-window fashion as for the sources. Similarly, one calculates w_{j_r+j} and w_{k_r+k};
[0077] b) Allocating these n_x n_y n_z N_r grid-point positions, their weights, as well as the wavefield variables at these grid positions, on the off-chip global memory;
[0078] c) Mapping those allocated off-chip global memories to texture memories;
[0079] d) Using the CUDA texture memory fetch API to read the wavefields P^n_{i_r+i, j_r+j, k_r+k} at those grid positions;
[0080] e) Summing up the fields P^n_{i_r+i, j_r+j, k_r+k}, weighted by w_{i_r+i} w_{j_r+j} w_{k_r+k}, to obtain the receiver wavefield R(x_r, y_r, z_r, n) according to expression (5); and
[0081] f) Copying from the off-chip GPU global memory resource to the CPU memory using the CUDA copy API.
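A device-side sketch of steps d) and e) is given below, with one thread per receiver: the wavefield at the receiver's n_x n_y n_z extended grid points is read through the texture cache, multiplied by the precomputed combined weights, and accumulated into the receiver sample of expression (5). The texture declarations follow the style of the listing in Section 2, while the kernel name and the weight and trace layouts are illustrative assumptions.

// Sketch of receiver wavefield retrieval (steps d and e): one thread per receiver.
// d_p_tex is the wavefield texture, d_recw8_tex holds the precomputed combined
// weights per receiver, d_recaddr holds the corresponding linear wavefield
// addresses per receiver; layouts and names are assumptions for illustration.
texture<float, cudaTextureType1D, cudaReadModeElementType> d_p_tex;
texture<float, cudaTextureType1D, cudaReadModeElementType> d_recw8_tex;

__global__ void retrieve_receivers(float *d_trace, const int *d_recaddr,
                                   int nrec, int npts, int it, int nt)
{
    int irec = blockIdx.x * blockDim.x + threadIdx.x;
    if (irec >= nrec) return;

    float r = 0.0f;
    for (int m = 0; m < npts; m++) {
        int addr = d_recaddr[(size_t)irec * npts + m];       // extended grid point
        float w  = tex1Dfetch(d_recw8_tex, irec * npts + m); // combined weight
        float p  = tex1Dfetch(d_p_tex, addr);                // wavefield via texture cache
        r += w * p;                                          // expression (5)
    }
    d_trace[(size_t)irec * nt + it] = r;                     // store sample at time step it
}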
4. Detailed Description of the Figures
[0082]-[0086] The accompanying figures depict a survey region with a receiving system or sensors 105, points of incidence or shots 104, a potential well location 103, and an acquisition grid 201.
[0087] Persons of ordinary skill in the art will recognize that grids like 201 produce detailed images of local geology in order to determine the location and size of possible hydrocarbon (oil and gas) reservoirs, and therefore a potential well location 103.
[0088] As it pertains to the flowchart of the source-loading embodiment, the central processing unit computes the effective source gridpoint positions and their interpolation weights, where n_x = N/2 (i.e., half the finite difference order); similarly, the algorithm for the finite difference model of a gridpoint j includes the computation of w_{j_s+j}, and for a gridpoint k the computation of w_{k_s+k}. The effective positions, weights, and wavefield variables are then copied to the off-chip global memory resource of the graphics processing unit and mapped at step 311 to its texture memory resource using the CUDA application programming interface, for example:
[0089] cudaBindTexture(0, &d_srcSw8_tex, config->srcSw8, &channelDesc, nx*ny*nz*config->NREC*sizeof(float));
[0091] Once the mapping step 311 is complete, the graphics processing unit generates at step 312 an initial time-step value, which comprises the minimum of the values obtained from the source time wavelet, the source points, and the modeled grid, all of which involve the following three variables: a) a stability requirement, b) an input data sample interval, and c) an imaging condition interval. The graphics processing unit then stores the initial time-step value to the off-chip global memory resource at 313, and sends a graphics user interface to a display monitor connected to the graphics processing unit, for a user, typically a person having ordinary skill in the art, to input at step 314 a user-defined maximum time-step value. Using the same variables of step 312, the user then defines the maximum time-step value as the minimum of: a) the stability requirement ranging from 0.1 ms to 2 ms, b) the input data sample interval ranging from 1 ms to 4 ms, and c) the imaging condition interval ranging from 2 ms to 8 ms. Upon entering the maximum time-step value, the user confirms through the graphics user interface, which in turn signals the graphics processing unit to store the maximum time-step value to its off-chip global memory resource at step 315. The graphics processing unit at step 316 retrieves the initial time-step value and the user-defined maximum time-step value in order to begin computing, at step 317, a wave propagation algorithm, using the effective source gridpoint positions for the pressure wavefield variables, at every inputted time-step value between the retrieved initial time-step value and the user-defined maximum time-step value. The computation of the wave propagation executed at step 317 is an iterative process that resolves the algorithm presented by formula (1). Iterative step 317 uses the proposed efficient computer-implemented method to inject the sources that generate the wavefield on the grids, and then propagates them until the last inputted time-step value is equal to the user-defined maximum time-step value. The graphics processing unit verifies at step 318 whether the last inputted time-step value is equal to the user-defined maximum time-step value. If it is not, the graphics processing unit repeats the computation done at 317 until it exhausts all of the inputted time-step values. At the same time, the graphics processing unit, making use of its parallel processing power, verifies whether the last computation performed at 317 equals the user-defined maximum time-step value, completes the iterative 317-318 process of wave propagation, and loads or injects at step 319 new source gridpoints onto a newly generated grid having the source pressure wavefield variables at effective positions. After step 319 is completed, the graphics processing unit begins at step 320 copying, to a random access memory device in the central processing unit, the new grid having the new source gridpoints for the pressure wavefield variables at effective positions from each computed wave propagation algorithm, using the CUDA copy application programming interface. Once the graphics processing unit has completed step 320, it signals the central processing unit to begin storing, at step 321, to its memory resource (typically a hard drive), the new grid having new source gridpoints for the pressure wavefield variables at effective positions from each computed wave propagation algorithm.
A graphics user interface is then displayed on the monitor connected to the central processing unit, indicating to the user that the first part of the computer-implemented method, which comprises high-speed loading of source points of incidence using a finite difference model, has been completed, that the final data has been stored, and that the method is ready to begin the second phase, which comprises high-speed retrieval of receiver wavefields using a finite difference model. This data can be used for further geophysical and seismic imaging analysis.
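The time-step handling of steps 312 through 318 amounts to taking the smallest of the three intervals and iterating the propagation, injection, and retrieval kernels until the user-defined maximum is reached. A minimal host-side sketch is given below; the interval values (chosen inside the ranges quoted above) and the kernel names in the loop comment are illustrative assumptions.

#include <stdio.h>

/* Sketch of the time-step selection and the iterative propagation loop
   (steps 312-318).  The interval values and kernel names are illustrative. */
static float min3(float a, float b, float c)
{
    float m = a < b ? a : b;
    return m < c ? m : c;
}

int main(void)
{
    const float stability_dt = 0.5f;   /* ms, within the 0.1-2 ms range */
    const float sample_dt    = 2.0f;   /* ms, within the 1-4 ms range   */
    const float imaging_dt   = 4.0f;   /* ms, within the 2-8 ms range   */

    float dt    = min3(stability_dt, sample_dt, imaging_dt);  /* initial time-step value */
    float t_max = 6000.0f;             /* user-defined maximum, ms      */

    for (float t = 0.0f; t <= t_max; t += dt) {
        /* one wave-propagation step at time t:
           propagate_wavefield(...); inject_sources(...); retrieve_receivers(...); */
    }
    printf("time step %.3f ms, %d steps\n", dt, (int)(t_max / dt) + 1);
    return 0;
}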
[0092] Turning over to the flowchart of the receiver-retrieval embodiment, the central processing unit computes the effective receiver gridpoint positions and their interpolation weights, where n_x = N/2 (i.e., half the finite difference order); similarly, the algorithm for the finite difference model of a gridpoint j includes the computation of w_{j_r+j}, and for a gridpoint k the computation of w_{k_r+k}. The effective positions, weights, and wavefield variables are then copied to the off-chip global memory resource of the graphics processing unit and mapped at step 411 to its texture memory resource using the CUDA application programming interface, for example:
[0093] cudaBindTexture(0, &d_srcSw8_tex, config->srcSw8, &channelDesc, nx*ny*nz*config->NREC*sizeof(float));
[0095] Once the mapping step 411 is complete, the graphics processing unit generates at step 412 an initial time-step value, which comprises the minimum of the values obtained from the receiver points and the modeled grid, all of which involve the following three variables: a) a stability requirement, b) an input data sample interval, and c) an imaging condition interval. The graphics processing unit then stores the initial time-step value to the off-chip global memory resource at 413, and sends a graphics user interface to a display monitor connected to the graphics processing unit, for a user, typically a person having ordinary skill in the art, to input at step 414 a user-defined maximum time-step value. Using the same variables of step 412, the user then defines the maximum time-step value as the minimum of: a) the stability requirement ranging from 0.1 ms to 2 ms, b) the input data sample interval ranging from 1 ms to 4 ms, and c) the imaging condition interval ranging from 2 ms to 8 ms. Upon entering the maximum time-step value, the user confirms through the graphics user interface, which in turn signals the graphics processing unit to store the maximum time-step value to its off-chip global memory resource at step 415. The graphics processing unit at step 416 retrieves the initial time-step value and the user-defined maximum time-step value in order to begin computing, at step 417, a wave propagation algorithm, using the effective receiver gridpoint positions for the pressure wavefield variables, at every inputted time-step value between the retrieved initial time-step value and the user-defined maximum time-step value. The computation of the wave propagation executed at step 417 is an iterative process that resolves the algorithm presented by formula (1). Iterative step 417 uses the proposed efficient computer-implemented method to retrieve the receiver wavefields that are generated on the grids, and then propagates them until the last inputted time-step value is equal to the user-defined maximum time-step value. The graphics processing unit verifies at step 418 whether the last inputted time-step value is equal to the user-defined maximum time-step value. If it is not, the graphics processing unit repeats the computation done at 417 until it exhausts all of the inputted time-step values. At the same time, the graphics processing unit, making use of its parallel processing power, verifies whether the last computation performed at 417 equals the user-defined maximum time-step value, completes the iterative 417-418 process of wave propagation, and retrieves at step 419 new receiver gridpoints onto a newly generated grid having the receiver pressure wavefield variables at effective positions. After step 419 is completed, the graphics processing unit begins at step 420 copying, to a random access memory device in the central processing unit, the new grid having the new receiver gridpoints for the pressure wavefield variables at effective positions from each computed wave propagation algorithm, using the CUDA copy application programming interface. Once the graphics processing unit has completed step 420, it signals the central processing unit to begin storing, at step 421, to its memory resource (typically a hard drive), the new grid having new receiver gridpoints for the pressure wavefield variables at effective positions from each computed wave propagation algorithm.
A graphics user interface is then displayed on the monitor connected to the central processing unit, indicating to the user that the computer-implemented method has been completed and that the final data has been stored and can be used for further geophysical and seismic imaging analysis.
[0097] According to the preferred embodiment of the present invention, certain hardware and software descriptions were detailed merely as example embodiments and are not intended to limit the structure or implementation of the disclosed embodiments. For example, although many internal and external components of the central processing unit and of the graphics processing unit have been described, those with ordinary skill in the art will appreciate that such components and their interconnection are well known. Additionally, certain aspects of the disclosed invention may be embodied in software that is executed using one or more receiving systems, memory resources, central processing units, graphics processing units, or non-transitory computer readable memory devices.
[0098] Computer programs, or codes of the technology of the present invention may be thought of as “products” or “articles of manufacture” typically in the form of executable code and/or associated data that is carried on, or embodied in, a type of machine readable medium.
[0099] Tangible non-transitory “storage” type media and devices include any or all memory or other storage for the computers, processors, or the like, or associated modules thereof, such as various semiconductor memories, tape drives, disk drives, optical or magnetic disks, and the like, which may provide storage at any time for the software programming.
[0100] As used herein the term “survey region” refers to an area or volume of geologic interest, and may be associated with the geometry, attitude and arrangement of the area or volume at any measurement scale. A region may have characteristics such as folding, faulting, cooling, unloading, and/or fracturing that has occurred therein.
[0101] As used herein, the term “computing” encompasses a wide variety of actions, including calculating, determining, processing, deriving, investigating, looking up (e.g. looking up in a table, a database, or another data structure), ascertaining and the like. It may also include receiving (e.g. receiving information), accessing (e.g. accessing data in a memory) and the like. Also, “computing” may include resolving, selecting, choosing, establishing, and the like.
[0102] As used herein, “subsurface”, and “subterranean” means beneath the top surface of any mass of land at any elevation or over a range of elevations, whether above, below or at sea level, and/or beneath the floor surface of any mass of water, whether above, below or at sea level.
[0103] Unless specifically stated otherwise, terms such as “defining”, “creating”, “including”, “representing”, “pre-analyzing”, “pre-defining”, “choosing”, “building”, “assigning”, “creating”, “introducing”, “eliminating”, “re-meshing”, “integrating”, “discovering”, “performing”, “predicting”, “determining”, “inputting”, “outputting”, “identifying”, “analyzing”, “using”, “assigning”, “disturbing”, “increasing”, “adjusting”, “incorporating”, “simulating”, “decreasing”, “distributing”, “specifying”, “extracting”, “displaying”, “executing”, “implementing”, and “managing”, or the like, may refer to the action and processes of a retrieving system, or other electronic device, that transforms data represented as physical (electronic, magnetic, or optical) quantities within some electrical device's storage, like memory resources, or non-transitory computer readable memory, into other data similarly represented as physical quantities within the storage, or in transmission or display devices.
[0104] Embodiments disclosed herein also relate to computer-implemented system, used as part of the retrieving system for performing the operations herein. This system may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program or code stored in the memory resources, or non-transitory computer readable memory. As such, the computer program or code may be stored or encoded in a computer readable medium or implemented over some type of transmission medium. A computer-readable medium, or a memory resource includes any medium or mechanism for storing or transmitting information in a form readable by a machine, such as a computer (‘machine’ and computer may be used synonymously herein). As a non-limiting example, a computer-readable medium or a memory resource may include a computer-readable storage medium (e.g., read only memory (“ROM”), random access memory (“RAM”), magnetic disk storage media, optical storage media, flash memory devices, etc.). A transmission medium may be twisted wire pairs, coaxial cable, optical fiber, or some other suitable wired or wireless transmission medium, for transmitting signals such as electrical, optical, acoustical, or other form of propagated signals (e.g., carrier waves, infrared signals, digital signals, etc.).
[0105] A receiving system or sensor 105 as used herein, typically includes at least hardware capable of executing machine readable instructions, as well as the software for executing acts (typically machine-readable instructions) that produce a desired result. In addition, a retrieving system may include hybrids of hardware and software, as well as computer sub-systems.
[0106] Hardware generally includes at least processor-capable platforms, such as client-machines (also known as servers), and hand-held processing devices (for example smart phones, personal digital assistants (PDAs), or personal computing devices (PCDs)). Further, hardware may include any physical device that can store machine-readable instructions, such as memory or other data storage devices. Other forms of hardware include hardware sub-systems, including transfer devices such as modems, modem cards, ports, and port cards, for example.
[0107] Software includes any machine code stored in any memory medium, such as RAM or ROM, and machine code stored on other devices (such as non-transitory computer readable media like external hard drives, or flash memory, for example). Software may include source or object code, encompassing any set of instructions capable of being executed in a client machine, server machine, remote desktop, or terminal.
[0108] Combinations of software and hardware could also be used for providing enhanced functionality and performance for certain embodiments of the disclosed invention. One example is to directly manufacture software functions into a silicon chip. Accordingly, it should be understood that combinations of hardware and software are also included within the definition of a retrieving system and are thus envisioned by the invention as possible equivalent structures and equivalent methods.
[0109] Computer-readable mediums or memory resources include passive data storage, such as a random-access memory (RAM) as well as semi-permanent data storage such as external hard drives, and external databases, for example. In addition, an embodiment of the invention may be embodied in the RAM of a computer to transform a standard computer into a new specific computing machine.
[0110] Data structures are defined organizations of data that may enable an embodiment of the invention. For example, a data structure may provide an organization of data, or an organization of executable code. Data signals could be carried across non-transitory transmission mediums and stored and transported across various data structures, and, thus, may be used to transport an embodiment of the invention.
[0111] The system computer may be designed to work on any specific architecture. For example, the system may be executed on a high-performance computing system, which typically comprise the aggregation of multiple single computers, physically connected, or connected over local area networks, client-server networks, wide area networks, internets, hand-held and other portable and wireless devices, and networks.
[0112] An “output device” includes the direct act that causes generating, as well as any indirect act that facilitates generation. Indirect acts include providing software to a user, maintaining a website through which a user is enabled to affect a display, hyperlinking to such a website, or cooperating or partnering with an entity who performs such direct or indirect acts. Thus, a user may operate alone or in cooperation with a third-party vendor to enable the reference signal to be generated on a display device. A display device may be included as an output device, and shall be suitable for displaying the required information, such as without limitation a CRT monitor, an LCD monitor, a plasma device, a flat panel device, or printer. The display device may include a device which has been calibrated through the use of any conventional software intended to be used in evaluating, correcting, and/or improving display results (e.g., a color monitor that has been adjusted using monitor calibration software). Rather than (or in addition to) displaying the reference image on a display device, a method, consistent with the invention, may include providing a reference image to a subject. “Providing a reference image” may include creating or distributing the reference image to the subject by physical, telephonic, or electronic delivery, providing access over a network to the reference, or creating or distributing software to the subject configured to run on the subject's workstation or computer including the reference image. In one example, providing of the reference image could involve enabling the subject to obtain the reference image in hard copy form via a printer.
[0113] For example, information, software, and/or instructions could be transmitted (e.g., electronically or physically via a data storage device or hard copy) and/or otherwise made available (e.g., via a network) in order to facilitate the subject using a printer to print a hard copy form of reference image. In such an example, the printer may be a printer which has been calibrated through the use of any conventional software intended to be used in evaluating, correcting, and/or improving printing results (e.g., a color printer that has been adjusted using color correction software).
[0114] A database, or multiple databases, may comprise any standard or proprietary database software, such as Oracle, Microsoft Access, SyBase, or DBase II, for example. The database may have fields, records, data, and other database elements that may be associated through database specific software. Additionally, data may be mapped. Mapping is the process of associating one data entry with another data entry. For example, the data contained in the location of a character file can be mapped to a field in a second table. The physical location of the database is not limiting, and the database may be distributed. For example, the database may exist remotely from the server, and run on a separate platform. Further, the database may be accessible across a local network, a wireless network, or the Internet.
[0115] Furthermore, modules, features, attributes, methodologies, and other aspects can be implemented as software, hardware, firmware, or any combination thereof. Wherever a component of the invention is implemented as software, the component can be implemented as a standalone program, as part of a larger program, as a plurality of separate programs, as a statically or dynamically linked library, as a kernel loadable module, as a device driver, and/or in every and any other way known now or in the future to those of skill in the art of computer programming. Additionally, the invention is not limited to implementation in any specific operating system or environment.
[0116] Various terms as used herein are defined below. To the extent a term used in a claim is not defined below, it should be given the broadest possible definition persons in the pertinent art have given that term as reflected in at least one printed publication or issued patent.
[0117] As used herein, “and/or” placed between a first entity and a second entity means one of (1) the first entity, (2) the second entity, and (3) the first entity and the second entity. Multiple elements listed with “and/or” should be construed in the same fashion, i.e., “one or more” of the elements so conjoined.
[0118] Additionally, the flowcharts and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the Figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
[0119] While in the foregoing specification this disclosure has been described in relation to certain preferred embodiments thereof, and many details have been set forth for purpose of illustration, the invention is not to be unduly limited to the foregoing which has been set forth for illustrative purposes. On the contrary, a wide variety of modifications and alternative embodiments will be apparent to a person skilled in the art, without departing from the true scope of the invention, as defined in the claims set forth below. Additionally, it should be appreciated that structural features or method steps shown or described in any one embodiment herein can be used in other embodiments as well.