Method of stabilizing sonar images

Abstract

A method of tracking a known object is presented, wherein a sonar image of an object which is distorted by an artifact associated with sonar imaging is compared with an image generated from a model of the object, and at least one of the two images is modified to reduce differences between them.

Claims

1. A method of tracking a known object, comprising: constructing a first image of the known object from received sonar data returned from the known object; comparing the first image to an second image generated from known characteristics of the known object; and modifying at least one of the first image and second image to reduce differences between the first image and the second image, wherein the differences between the first image and the second image arise from an artifact associated with sonar imaging.

2. The method of claim 1, wherein; the artifact arises from low resolution of a two dimensional sonar receiving device.

3. The method of claim 2, wherein; the artifact is a dilation.

4. The method of claim 1, wherein; the artifact is an erosion of the first image of a first area of the surface of the object.

5. The method of claim 1, wherein; the artifact at a voxel element of the first image arises from interference signals reflected from neighboring voxel elements of the object.

6. The method of claim 1, wherein; the artifact arises from interference signals from local multipath reflection.

7. The method of claim 1, wherein; the artifact arises from interference signals from multipath reflection from a surface different from the object removed from the surface of the object.

8. The method of claim 1, wherein; the second image is modified to more closely fit the first image to within a criterion.

9. The method of claim 8, wherein the criterion is that the second image is stable over a series of second images.

10. The method of claim 1, wherein; the first image is modified to more closely fit the second image to within a criterion.

11. The method of claim 10, wherein the criterion is that the second image is stable over a series of second images.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0011] FIG. 1 shows a sketch of a sonar source sending sonar wave to a surface and a sonar array receiving device receiving reflected sonar signals from the surface.

[0012] FIGS. 2A-2D show images of different cast blocks used for breakwaters.

[0013] FIG. 3 shows a sketch of an excavator arm carrying a suspended block for placement in a breakwater.

[0014] FIG. 4A shows a sonar image of a block casting a shadow against a background.

[0015] FIG. 4B shows a sonar image of the data of FIG. 4A shown from a different viewpoint.

[0016] FIG. 4C shows a non-sonar image of a model block.

[0017] FIG. 4D shows a set of points on the surface of the model of FIG. 4C.

[0018] FIG. 4E shows the image of FIG. 4A combined with an image of a model of the block wherein the set of points of FIG. 4D have been fit to points measured from the data of FIG. 4A to give the orientation of the model of FIG. 4C.

[0019] FIG. 4F shows the combined image of FIG. 4E taken from the viewpoint of FIG. 4B.

[0020] FIGS. 5A-5F show sonar images of a block and images chosen from a model of the block in the steps of finding orientation of the model which will be the same as the orientation of the block.

[0021] FIGS. 6A and 6B shows show photographs of portions of breakwaters above water level.

[0022] FIGS. 7A and 7B show near field sonar images of Accropodes having different resolution.

[0023] FIGS. 8 A-D show sketches of a barbell changed by scaling and by inflation.

[0024] FIGS. 9 A-D show sketches of shapes changed by inflation based on face normals and surface normals.

[0025] FIGS. 10 A-F shows results of scaling and inflation Accropode model images.

DETAILED DESCRIPTION OF THE INVENTION

[0026] It has long been known that data presented in visual form is much better understood by humans than data presented in the form of tables, charts, text, etc. However, even data presented visually as bar graphs, line graphs, maps, or topographic maps requires experience and training to interpret them. Humans can, however, immediately recognize and understand patterns in visual images which would be impossible for even the best and fastest computers to pick out. Much effort has thus been spent in turning data into images.

[0027] In particular, images which are generated from data which are not related to light are difficult to produce. One such type of data is sonar data, wherein a sonar signal is sent out from a generator into a volume of fluid, and the reflected sound energy from objects in the ensonified volume is recorded one or more detector elements. The term ensonified volume is known to one of skill in the art and is defined herein as being a volume of fluid through which sound waves are directed.

[0028] The sonar data from multielement detectors is generally recorded as points in three dimensional space as a function of range and of two orthogonal angles. These data in polar coordinate space are in turn generally reduced and presented as data from a three dimensional Cartesian coordinate space. The data may then be presented as height above the sea bed, for example, or depth below the surface, as a z coordinate, while the x and y coordinates could be chosen as west and north, for example. In other examples, the x or y coordinate could be chosen to be parallel to a wall or other long, mostly straight object.

[0029] One characteristic of sonar data is that it is very sparse, as the ensonified volume is generally water having only one or a few objects of interest. The volume of the fluid is generally divided into a series of cubes, and data is returned from a small percentage of the cubes. The resolution of the sonar is proportional to the linear dimension of the cubes, while the computation cost of recording the signal from each detector element and calculating from whence the signals have come is inversely proportional to the cube dimensions to the third power. There is then a tradeoff between resolution and computer power and time taken to produce an image from received data.

[0030] In other electromagnetic or ultra sound imaging technologies, the data are very dense. In an art unrelated to sonar imaging, medical imaging essentially has signals from each voxel, and the techniques for such imaging as CT scans, MRI scans, PET scans, and Ultrasound Imaging is not applicable to the sparse sonar data. In the same way, signals from sound waves sent out from the earths surface into the depths to return data of rock formations in the search for oil produce dense data, and techniques developed for such fields would not in general be known or used by one of skill in the art of sonar imaging.

[0031] The present invention is used to treat the sparse data from sonar imaging equipment to produce images which would be comparable to an optical image of an object in a sound wave transmitting medium if the object could in fact be seen through turbid water or other fluid or gas. These images are used to track and precisely place objects Optical inspection of objects in a fluid is often not possible because of smoke and fog in air, for example, or turbidity in water or other fluid. Sonar imaging of such objects is often used. However, if objects are to be placed, grasped, or moved in the fluid, a typical sonar image taken from a single point of view is not sufficient. The backside of the object is not viewable, nor is the background of the object in the sonar shadow viewable.

[0032] FIG. 1 shows a typical sonar imaging set up, where a source 10 projects an outgoing sound wave noted as a wave front 12. The sound wave can be traced as a ray 13 which strikes a sonar reflecting surface 14 and is reflected as ray 15 to a sonar imaging array 16. If an object, or part of the background, stops the sound waves from striking a region 18, the no data is collected from the region and so it is not imaged and is said to be in a sonar shadow. A sonar imaging array, such as an Echoscope from CodaOctopus Inc., comprises an array of microphones which send electrical signals representing the received sound waves to electronics and a computer system for analysis. The sonar signal 12 will be a ping of duration milliseconds to microseconds. A typical sonar ping will be, say, 70 microseconds in duration and have a frequency of 300-1000 KHz. Each microphone of the array receives a many reflected sonar signals of the frequency of the sound wave sent out from the source, but with differing phases which add coherently to produce a signal. The reflected sonar signals from parts of an object nearer to the imaging array will arrive sooner that a signal from objects in the background, and a filter, for example a time window, can be used to sort out the signals from the different parts of the object and background. The reflected sonar signals are digitally sampled at a rate higher than the frequency of the sound wave, and the amplitude and phase of the signals are recorded. The device works very much like an interferometric imaging device where light from an object is imaged on a plane, and where the intensity and phase of the light are measured simultaneously. From the measured sonar signals, the range of the various parts of the object from the imaging array can be calculated and the intensity and phase of the sound waves are recorded, for example, in a computer accessible data storage device for further manipulation. From the recorded data, and image can be created and viewed, for example, on a computer monitor, where the image is shown with the height of the different parts of the image shown as differing colors. We define this type of image herein as a three dimensional (3D) image. Another type of image would be an image where the surfaces are shaded to give the impression of light and shade which also can be interpreted by a human observer as a 3D image. Another type of 3D image is an image which appears to shift as the viewpoint is shifted, which serves very well to show up the range differences by parallax. Another type of 3D image is the familiar topographic map image, where the regions of equal height are connected with lines. All images where a third dimension can be read on a two dimensional image display are anticipated by the inventor.

[0033] As with optical holograms, images may be produced as would be seen from differing viewpoints. The inventor anticipates that a binocular image could be produced for display on a 3 dimensional display device for projecting the image to two eyes of a human observer.

[0034] When building a breakwater, the top (armor) layer is usually made with large heavy concrete blocks. These blocks must be placed sufficiently densely so as to minimize gaps between them to stop the egress of the underlying layers, and must be sufficiently heavy so as not to be moved by the action of waves and tides. Traditionally two layers of boulders, or in most cases cubic concrete blocks have been used. In order to reduce the amount of material required a new approach was introduced, where complex geometric volumes with overlapping parts were chosen. This allows only one layer of armor to be used while still meeting the minimum gap requirement. Photographs of typical blocks are shown in FIGS. 2A-2D. These blocks are generally made from concrete cast in steel molds, and may be several meters high and weigh many tons. The advantage of the Echoscope data is that as it is three dimensional (3D), the virtual eye-point can be moved when displaying the data to give the user a better overview of the scene.

[0035] FIG. 3 shows a sketch of an excavator arm 30 carrying a suspended block 31 for placement in a breakwater. The angular portion 32 of the ensonified region of the background behind the block 31 is called a sonar shadow, and the portion of the background 38 does not receive sound waves and is invisible to the detector 16. This effect is shown in the sonar image shown in FIG. 4A where the background immediately behind the block is shaded.

[0036] One advantage of the 3D visualization made possible by the 3D sonar detector is that the view point of the images drawn may be moved to take advantage of the human recognition of parallax to give the 3.sup.rd dimensional image information. As the Echoscope itself is fixed with respect to the scene, this virtual movement makes the shadowing effect more apparent. When the image shown from a viewpoint apart from the sonar array 16 as in FIG. 4B, the points corresponding to the backside of the block are missing and the block appears truncated.

[0037] In order to show the backside of the block as the eyepoint is moved around, we obtain the sonar data on the relative coordinates of the surface of the block, and construct a model of the block in the computer as in FIG. 4C. The coordinates of the model are best taken from the machine drawings of the mold for the block, but any measurement of the surface of the block may be used. The data may be stored as a set of x, y, z coordinates detailing the entire surface of the block, or a subset of the data may be stored if there are symmetries like reflection in a plane or rotation about an axis. In general, more of the surface must be measured than can be viewed from a single viewpoint by the sonar system. Generally, the entire surface of the block is measured. The center of mass of the block is determined by calculation from the measured surface, or by measurement. In order to track and align the model with the sonar image data, a set of points such as sketched in FIG. 4D are created based on the vertices and faces of the 3D model. These points are then aligned with the sonar data points using a recognized image matching technique such as Iterated Closest Point (ICP).

[0038] The model data image has now the same rotational orientation as the object, and appears to be the same distance away from the detector. FIG. 4E shows the data of FIG. 4A overlaid on to an image of the model aligned with the sonar data. FIG. 4F shows data from the viewpoint of FIG. 4B where the model image is drawn in to replace the missing sonar data.

[0039] Many other methods of finding the best fit between sets of points in three dimensions could be used.

[0040] The ICP algorithm and other point matching algorithms require a time proportional to the number n of points in the first set of points to be matched times the number m of points in the second set of points. This time proportional to nm may be reduced to n log m by reducing the set of points from the model to just those points which could be seen from an Echoscope. FIG. 5A shows the Echoscope data with the data from the background. FIG. 5B shows the data of FIG. 5A where the background has been set to zero. FIG. 5C is the sonar data of FIG. 5B reduced to a set of coordinate points.

[0041] FIG. 5D shows one of a small (<20) set of points representing a part of the surface of the block, which are iteratively matched to the points of FIG. 5C to find the tilt and rotation needed to bring the model into matching position FIG. 5E with the block. FIG. 5E shows the rotated and tilted points used in FIG. 5D. Note that some points in FIG. 5D would not be seen, and can be removed to give a set of points as in FIG. 5F. Once the orientation of the block has been roughly calculated, some points may be removed as they would become invisible as the block rotates, and other points that would come into view are added, to further match the model orientation with the sonar data from the object as in FIG. 5F. This process is repeated with each of the small set of points, and the best match chosen to give the best orientation of the model to the block Now, the orientated model image is added to the sonar data to provide views of the backside of the object as in FIG. 4F. Alternatively, only the calculated model image of the correctly placed and oriented is shown to the operator. The total time taken to match the model points to the sonar points is much less when matching each of the small set of points and using the best match than using all the data points from the model.

[0042] FIGS. 6A and 6B shows show photographs of portions of breakwaters above water level.

[0043] Before the first block in a set of blocks is laid, a sonar image of the background is recorded. The position and orientation of the sonar source and sonar imaging device are recorded, so that the background of the sonar shadow recalled from the recording, and can be filled in as the block is moved into place. The orientation of the block is known after it is placed, and the image of the block can be added to the background. As the blocks are placed, the position, orientation, etc. of each block is recorded so that the entire background may be matched. The measurement of the exact positions of the background blocks and the exact position of the equipment supporting the block being placed is at times not accurate enough to ensure correct placement of the blocks from positional data alone, and it is often preferable that the sonar background objects be measured as the block is being moved into position. As the block is being swung into place, the background is measured in the field of view int front of the swinging block. This background image is used by itself, or fit to a previously recorded background.

[0044] The block is moved into position to place it in a location and orientation with respect to the other blocks. The location and orientation must satisfy a criterion. One such criterion is that each block is supported by contact of at least three contact points with other blocks.

[0045] As the block is being moved and rotated, the movement and rotation is slow compared to the rate at which sonar images are recorded. The velocity and rotation of the block is measured by measuring the location of the excavator arm and the distance from the excavator arm to the block, and measuring the rotation of the block from ping to ping. The position and rotational velocity of the block is predicted at the time of the next ping, and and the previous set of points for matching model to sonar image is adjusted take into account the new position and rotation angle, so the iterative process of matching takes much less time, which allows us to track the block more accurately. For example, we anticipate that a set of points along one edge of the block can disappear from the sonar image, while another set of points on the opposite edge swings into view.

[0046] In viewing the block and background in the sonar image, the background can also be enhanced by using previously recorded orientations and positions to draw in the previously placed blocks. The sonar data is then much easier to understand, especially when the eyepoint is rotated back and forth to give enhanced 3D visualization. The previously recorded background orientations and positions may be augmented or replaced by images collected in as the blocks move into place.

[0047] A number of artifacts combine to produce sonar images which are quite distorted. When models are used to produce additional data for the sonar imaging visualization, the position and orientation of the model must be fit to the sonar data. Artifacts which distort the sonar image then affect the program which tries to match the sonar data points to the model data points, and different orientations of the model image with respect to the sonar image may give a fit to within the criterion chosen to end the iterative process. In particular, orientations chosen for each ping differ enough that the model image appears to jitter, even when the object is stationary.

[0048] Image artifacts arise, for example, due to the resolution of the sonar system. If the resolution of the system changes because the distance between the object and the Echoscope changes, the protrubences on an Accropode (a large concrete object used in underwater breakwaters to armor rip rap) may appear to be thicker than they should be because the diameter is measured at high resolution would have an uncertainty of a resolution element of 10 cm, and at low resolution of 30 cm.

[0049] Objects can appear smaller as at some angles the reflected energy is below the detection threshold. Consider a sphere. The surface normal of the center (surface) points directly at the sonar source and receiver, so reflects directly back at high intensity, which measurement is set to unity. The surface normal half way out to the edge of the sphere is indicates that 70% of the energy reflects back, while 30% is scattered more than 90 degrees to the incoming beam. The surface normal of the edge (surface) forms an angle of 90 degrees to the direction of the sonar beam, so reflects no energy directly back from that point. Setting the threshold for detection to 80% will show a sphere less than half the true size (even accounting for inflation). Another artifact of sonar imaging is sidelobe illumination. Every beam has 4 neighbors with a lower intensity, and so some energy from neighboring points of the surface will arrive at the detector and appear to come from another point. The beams can combine to show a surface where there is indeed a hole. Random data from other sound sources is an artifact which is very difficult to deal with if it is truly random, or even if it is not understood. Local Reflection/MultiPath effects is where a point on the object reflects sound onto another part of the object, which further reflects to the detector, and which causes points to appear in the wrong place.

[0050] The further away objects are, the less accurately we can track them. An Echoscope produces comparatively low resolution images compared to Images generated by light. A standard frequency (375 KHz) Echoscope has a resolution of 4848 elements (50 Degrees by 50 Degrees), giving an approximate angular resolution of 1 degree. However due to the way the image is constructed, we also have a limiting factor based on the physical size of the array (20 cm20 cm). This is known as Aperture size. The range above the point where 1 degree is greater than 20 cm is known as Far Field. The range below the point where 1 degree is less than 20 cm is known as Near Field. So the resolution of the standard frequency Echoscope is either 1 degree or 20 cm, whichever is greater.

[0051] (In Near Field you can make the aperture smaller by only using, say, 2424 elements. This gives a resolution of 2 degrees or 10 cm, whichever is greater. Limiting the number of elements to a 12 by 12 element array gives 5 cm resolution, etc.)

[0052] For 4848 resolution elements, or one degree resolution, the standard frequency Echoscope Far Field starts at around 11.5 m.

[0053] FIGS. 7A and 7B show how resolution changes an image.

[0054] The Accropode Image in FIG. 7A was captured by a Standard frequency Echoscope and is of a 5 cubic Meter Block (2.58 m high) at a range of 4 m, giving us an effective near field resolution of 12 cm which works out to 21 beam widths across the Accropode height. (Note: there is also a Range resolution. The block resolution that is due to the generation of images as slices of data, It is dependent on Sonar pulse length and how often we generate range slices. For an Echoscope it is usually less than 10 cm, so the aperture range resolution is the a dominant factor.)

[0055] The Accropode Image in FIG. 7B was captured by a High frequency Echoscope which uses 610 KHz sound waves. Increasing the sound frequency changes the beam width, resolution, near to far field transition distance, etc. The near to far field transition is at 19 m, FIG. 7B is an image of a 2cubic Meter Block (which is 1.9 m high) at a range of 12 m, giving us an effective near field resolution of 16 cm. The block appears to be 12 beam widths high and the resolution is 8% of the block height The blocks appear approximately the same size in images 7A and B. However the protrubences on the image in FIG. 7B appear to be fatter (not larger) than in 7A because of the lower resolution.

[0056] If we place Accropode model data over the Accropode sonar image in FIG. 7B, we can easily see how its size does not match up well in all dimensions. The protrubences of the model Accropode fit loosely inside the data of the protrubences of the sonar image in FIG. 7B, and when the program tries to match the images from ping to ping, the best fit to the sonar image data shifts a lot from ping to ping, and the model image appears to jiggle with small and random motions.

[0057] The Accropode in FIG. 7A appears less fat as its beam resolution is 4.6% of the block size. The Accropode in FIG. 7B appears fatter as its beam resolution is 8.4% of the block size.

[0058] A novel method for reducing this visual jittering effect has been implemented.

[0059] Since the cause of the distortion in the sonar image is known, we can preferably reverse the distortion of the sonar image. More preferably, we can distort the model image in a way which matches the distortion of the sonar image since we have more and more accurate data about the model than we do sonar data.

[0060] Preferable ways to match the distortion are dilation of the model image or erosion of the sonar image.

[0061] If we were to break the volume of interest into a volume of (say 1 cm cubes (voxels)), we can place the model inside this volume, if a 1 cm cube is predominantly inside the volume of the model we set that cube to be 1, otherwise it is 0. We can then inflate the model (by 1 cm), by looking at each cube, and its neighbors (the cubes that share any of its faces (there are 6 of these), or share any of its vertices (there are 26 of these)). If any of its neighbors have the value of 1, then set the value of this cube to 1. Every time we repeat this process the volume inflates by one voxel. Similarly if we wanted to make the data object smaller we could do some by Erosion. If we were to break the volume of interest into a volume of (say 1 cm cubes (voxels)), we can place the model inside this volume, if a 1 cm cube is predominantly inside the volume of the model we set that cube to be 1, otherwise it is 0. We can then deflate the model (by 1 cm), by looking at each cube, and its neighbors (the cubes that share any of its faces (there are 6 of these), or share any of its vertices (there are 26 of these)). If the voxel has a value of one and the number of its neighbors is less than or equal to some value (say one if we are looking at face neighbors), then we set that's voxel value to zero (removing it from the volume). Every time we repeat this process the volume deflates by one voxel.

[0062] Inflation, deflation, or scaling of the data takes less computer time than dilation or erosion, and is more preferable with limited computer equipment. Scaling of the model or sonar image data is one preferred embodiment of the invention. More preferably, inflation of the model data or deflation of the sonar data changes the fitting of the model to the sonar data better than scaling the data. The most preferable embodiment of the data is to inflate the model data to better fit the sonar data.

[0063] For simple (convex) objects, such as Cubes and spheres Inflation and scaling are the same. However this is not the case for more complex non-Convex objects. FIGS. 8A-D compares images of a Barbell that has been both inflated and scaled. When scaled the image uniformly increases in size, however when inflated, both bells get larger, but there centers remain the same, and the bar gets both shorter and thicker i.e. The scaling is non uniform for complex (non Convex) Objects.

[0064] There are many ways to inflate or deflate and scale objects. Preferable ways are based on Face Normals or Vertex Normals. Although Face Normals give a more uniform result, Vertex Normal technique as it is much simpler to implement and for our needs gives adequate results, and is the most preferred way to change the image of the model. FIGS. 9A and 9B show two differing shapes two dimensional shapes (solid lines) standing in for the three dimensional objects we are discussing. Face normals given by the arrows determine the inflated (dotted lines), while FIGS. 9C and 9 D show the same shapes inflated by the vertex normals (scaled by square root of 2).

[0065] FIGS. 10A and 10D shows an image of an undistorted model Accropode. FIGS. 10 B and C show images of the Accropode scaled by 5% and 10%. FIGS. 10 E and 10F show images of the Accropode inflated along the vertex normals by 5% and 10%. Clearly, the inflated images more closely resemble the sonar images than do the scaled images.

[0066] The percentage inflation is of the model image is increased until the image of the Accropode is stable from ping to ping. As conditions (ie range) change, the percentage may be adjusted automatically or by hand.

[0067] Once the model Accropode image orientation and range have been determined, the sonar images may have the missing points drawn in. Or, the entire sonar image of the object may be replaced with an image of the model, and the image of the model can be drawn from any viewpoint at all. In particular, the model image may be used to guide the model with respect to either the sonar images or the model images of the background to a fit better than the resolution of sonar images.

[0068] Obviously, many modifications and variations of the present invention are possible in light of the above teachings. It is therefore to be understood that, within the scope of the appended claims, the invention may be practiced otherwise than as specifically described.

Method of stabilizing sonar images

Assignee

Inventors

Cpc classification

Classification Explorer

G06T5/50

PHYSICS

Classification Explorer

G01S15/89

PHYSICS

Classification Explorer

G01S7/62

PHYSICS

Classification Explorer

G01S7/52003

PHYSICS

Classification Explorer

G06T5/002

PHYSICS

Classification Explorer

G01S7/6245

PHYSICS

International classification

Classification Explorer

G01S7/52

PHYSICS

Classification Explorer

G01S15/89

PHYSICS

Classification Explorer

G06T5/00

PHYSICS

Classification Explorer

G06T5/50

PHYSICS

Abstract

Claims

Description