Stereo vision for sensing vehicles operating environment
09819925 · 2017-11-14
Assignee
Inventors
- John H. Posselius (Ephrata, PA, US)
- Christopher A. Foster (Mohnton, PA, US)
- Bret T. Turpin (Wellsville, UT, US)
- Daniel J. Morwood (Amalga, UT, US)
- Thomas M. Petroff (Richmond, UT, US)
- Brad A. Baillio (Smithfield, UT, US)
- Chad B. Jeppesen (Richmond, UT, US)
Cpc classification
G06V20/56
PHYSICS
H04N2013/0092
ELECTRICITY
G05D1/0251
PHYSICS
H04N13/243
ELECTRICITY
H04N13/25
ELECTRICITY
International classification
H04N13/00
ELECTRICITY
H04N9/80
ELECTRICITY
Abstract
A vehicle including a chassis, a drive system carrying the chassis, and a vision system carried by the chassis. The vision system having a stereo visible light camera producing a colorized 3D point cloud and a stereo long wave infrared camera producing 3D data. The vision system being configured to fuse the 3D data with the 3D point cloud thereby producing an enhanced 3D point cloud.
Claims
1. A vehicle, comprising: a chassis; a drive system carrying the chassis; and a vision system carried by said chassis, said vision system including: a stereo visible light camera producing a colorized 3D data point cloud; a stereo long wave infrared (LWIR) camera producing 3D LWIR data; and a near infrared (NIR) camera, wherein both said LWIR camera and said NIR camera produce data that is fused with said 3D point cloud, wherein said fusing includes detecting foliage by way of measuring and comparing NIR and red spectrum energy level ratios, wherein the LWIR camera data is used to fill in information in areas of said 3D point cloud including areas where there is low light at night or dust which obscures a field of vision of said stereo visible light camera, and wherein sparse computations are used to determine sparse points based on edge features, and wherein a segmentation computation and a hole-filling computation are used to fill in regions, including depth regions, between sparse points to fuse with the 3D point cloud; wherein said vision system is configured to fuse said 3D data, detected data, and filled-in data with said 3D point cloud to produce an enhanced 3D point cloud.
2. The vehicle of claim 1, wherein said drive system is directed to at least one of steer, change velocity, start and stop the vehicle, dependent upon said enhanced 3D point cloud.
3. The vehicle of claim 1, wherein said vision system is further configured to extract features from the 3D point cloud and the 3D data as part of producing said enhanced 3D point cloud.
4. The vehicle of claim 3, wherein said vision system is further configured to match at least some of the extracted features from the 3D point cloud and the 3D data as part of producing said enhanced 3D point cloud.
5. The vehicle of claim 4, wherein said vision system is further configured to perform a consistency validation of the 3D data that is matched.
6. A vision system for use by a vehicle having a drive system, the vision system comprising: a stereo visible light camera producing a colorized 3D point cloud; a stereo long wave infrared (LWIR) camera producing 3D LWIR data; and a near infrared (NIR) camera, wherein both said stereo LWIR camera and said NIR camera produce data that is fused with said 3D point cloud; wherein said fusing includes detecting foliage by way of measuring and comparing NIR and red spectrum energy level ratios, wherein LWIR data is used to fill in information in areas of said 3D point including areas where there is low light at night or dust which obscures a field of vision of said stereo visible light camera, and wherein sparse computations are used to determine sparse points based on edge features, and wherein a segmentation computation and a hole-filling computation are used to fill in regions, including depth regions, between sparse points to fuse with the 3D point cloud, wherein said vision system is configured to fuse said 3D LWIR data, detected data, and filled-in data with said 3D point cloud to produce an enhanced 3D point cloud.
7. The vision system of claim 6, wherein the drive system is directed to at least one of steer, change velocity, start and stop the vehicle, dependent upon said enhanced 3D point cloud.
8. The vision system of claim 6, wherein said vision system is further configured to extract features from the 3D point cloud and the 3D data as part of producing said enhanced 3D point cloud.
9. The vision system of claim 8, wherein said vision system is further configured to match at least some of the extracted features from the 3D point cloud and the 3D data as part of producing said enhanced 3D point cloud.
10. The vision system of claim 9, wherein said vision system is further configured to perform a consistency validation of the 3D data that is matched.
11. A method of directing a vehicle using a vision system, the method comprising the steps of: producing a colorized 3D point cloud with data from a stereo visible light camera; fusing data from a stereo long wave infrared (LWIR) camera with said 3D point cloud; fusing data from a near infrared (NIR) camera with said 3D point cloud; detecting foliage via fusing by way of measuring and comparing NIR and red spectrum energy level ratios, filling in information via the LWIR camera data in areas of said 3D point cloud including areas where there is low light at night or dust which obscures a field of vision of said stereo visible light camera; filling in regions, including depth regions via a segmentation computation and a hole-filling computation between sparse points to fuse with the 3D point cloud; and producing from the fused, detected, and filled-in 3D data an enhanced 3D point cloud that is used to direct tasks of the vehicle.
12. The method of claim 11, further comprising the step of directing a drive system of the vehicle to at least one of steer, change velocity, start and stop the vehicle, dependent upon said enhanced 3D point cloud.
13. The method of claim 11, wherein said vision system is further configured to extract features from the 3D point cloud and the 3D data as part of producing said enhanced 3D point cloud.
14. The method of claim 13, wherein said vision system is further configured to match at least some of the extracted features from the 3D point cloud and the 3D data as part of producing said enhanced 3D point cloud.
15. The method of claim 14, wherein said vision system is further configured to perform a consistency validation of the 3D data that is matched.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) The above-mentioned and other features and advantages of this invention, and the manner of attaining them, will become more apparent and the invention will be better understood by reference to the following description of an embodiment of the invention taken in conjunction with the accompanying drawings, wherein:
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
(11)
(12)
(13)
(14)
(15)
(16) Corresponding reference characters indicate corresponding parts throughout the several views. The exemplification set out herein illustrates one embodiment of the invention, in one form, and such exemplification is not to be construed as limiting the scope of the invention in any manner.
DETAILED DESCRIPTION OF THE INVENTION
(17) Referring now to the drawings, there is illustrated (in
(18) As represented in
(19) Near Infrared (NIR) and Short Wavelength Infrared (SWIR) cameras operate on the edge of the visible light spectrum, at wavelengths from 0.75 to 3 μm. The infrared radiation in this spectrum interacts with objects in a similar manner as the visible wavelengths. This makes the images from NIR and SWIR cameras similar to visible light images in resolution and detail. The main difference is that the infrared images are not in color. Energy within these wavelengths must usually be reflected in order to obtain good imagery. This means that there must be some external illumination; at night, these cameras typically require some type of artificial illumination.
(20) Long Wavelength Infrared cameras 18 operate on the far end of the infrared spectrum, at wavelengths from 8 to 15 μm. The Long Wavelength Infrared (LWIR) band is called the thermal band. Cameras in this range can be completely passive. This means that they require no external illumination. In
(21) Although there has been significant work in the area of visible light stereo depth perception, there has been little research into infrared spectrum stereo vision. This may be partially due to inaccessibility of sensors in previous years due to cost, as well as experimentation on older sensors that failed to perform using dense stereo computation due to high noise levels. With the cost reductions of IR sensors in recent years, there have been further attempts at computing range from infrared pairs; the most common methods have been for pedestrian detection and have used a priori models of objects (humans) and blob detection to perform depth computation. The present invention uses sparse algorithms to determine a set of feature correspondences based on edge features, and a segmentation/interpolation algorithm is used to fill in depth regions between sparse points. This algorithm is more adaptable to a specific camera and allows cameras with noisier images to still be functional, where dense algorithms will fail. The algorithm is very suitable to overhead imagery where edge data is prominent, and especially to urban environments where terrain is structured and surfaces between edges tend to be linear.
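Once a sparse edge feature has been matched between the two cameras, its range follows from the disparity between the matched pixel columns. The sketch below uses the standard pinhole relation for a rectified stereo pair, Z = f * B / d. The focal length, baseline, and pixel coordinates are hypothetical values chosen for illustration and are not taken from this disclosure.

```python
from typing import Optional


def depth_from_disparity(disparity_px: float, focal_px: float,
                         baseline_m: float) -> Optional[float]:
    """Range in metres for one matched feature in a rectified stereo pair.

    Uses the standard pinhole relation Z = f * B / d. Returns None when the
    disparity is not positive (no valid match, or a point at infinity).
    """
    if disparity_px <= 0:
        return None
    return focal_px * baseline_m / disparity_px


# Hypothetical sparse edge matches: (column in left image, column in right image).
matches = [(320.0, 310.0), (150.0, 146.0), (400.0, 400.0)]

# Assumed camera parameters: 500 px focal length, 0.2 m stereo baseline.
ranges = [depth_from_disparity(xl - xr, focal_px=500.0, baseline_m=0.2)
          for xl, xr in matches]
```

Because range varies inversely with disparity, a fixed matching error of one pixel produces a larger range error for distant features than for near ones, which is one reason the stereo baseline appears as a tunable parameter.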
(22) TABLE 1: System capabilities
Features | Summary
Terrain sensing and mapping based on passive stereo imagery with low signal to noise ratio | Rely on edge based features, interpolate segmented regions
Can use typical aerial data from urban and rural settings | Data collection done locally using ground system and/or quad rotor assigned thereto
Use of many existing sensors for range, field of view, resolution, frame rate and power | Sensor data fusion
Used on various scales of Unmanned Vehicles, with a target of low cost and 60 degree field of view | Sensor performance of various levels which can be integrated with the vision system
Uses low light sensors and machine vision stereo algorithms | Use vision algorithms with embedded stereo processor
Produces accurate real time terrain data | Use a vehicle vision system with algorithms that operate at 1 Hz refresh rate
Robust software system | Algorithms have a high noise tolerance and scalability, modular to camera selection and stereo baseline
(23)
(24) The vision system of the present invention can be used with existing robots (ground or aerial) to model terrain and detect features/objects.
(25) Sensors were studied to determine relevant sensors that could be used in a stereo camera system, with an emphasis on tradeoffs for quality, cost, weight, etc. Sensors are selected for the vision system using existing hardware and baseline sensors. The sensors allow:
- Data collection: collecting IR imagery of terrain data.
- Feature extraction and evaluation: determining which features can be used for sufficient pixel correspondence in infrared image pairs.
The present invention uses a stereo algorithm on infrared imagery to resolve depth data of terrain, with the algorithm tuned to give the best results in real time. These features are explained in more detail in the following sections. The features and a summary of requirements are contained in Table 1.
(26) A survey was performed of commercially available IR camera modules, and prime candidates were selected for use in the three desired scale classes. It was important to identify sensors which allow for stereo vision processing; that is, the cameras allow for hardware synchronization between cameras, have a global shutter to handle motion, allow for digital interfacing, have rugged properties for military and off-road vehicle usage, allow for short exposure time to minimize motion blur, allow for proper calibration between cameras (cameras must have very similar properties), and do not require extensive external components for cooling or illumination. Further, signal to noise characteristics were used to help quantify camera quality for comparison.
(27) The stereo algorithm functions in the desired environment, using sensors selected and hardware bread-boarded because of the ease of accessibility for collecting data. The present invention uses an embedded processor platform suitable for real time stereo algorithms that are used to interface to the IR camera and collect terrain images.
(28) Representative data sets that match the requirements for targets and expected environments were collected. Data sets were gathered around testing grounds (buildings, varying brush types, varying tree leaf densities, coniferous and deciduous trees). Each data set was captured from a moving vehicle such that detection range could be investigated.
(29) For the aerial photography, a quad-rotor platform was used (see
(30) In addition to aerial analysis, the technology of unmanned ground vehicles (UGVs) is leveraged. Several automated tractors are used as platforms for rural data acquisition, and an automated Ford F-150 was used for urban collection. While some of the urban settings may not easily lend themselves to automated testing (for safety concerns), all of the rural testing is automated. This eliminates variability in the test data from human path-following inconsistencies. By collecting multiple data sets, there is increased statistical significance of the findings, and the automation decreases the variability between each data set. Automating the testing also helps increase safety, especially for testing in low-light/night conditions.
(31) Correspondence feature extraction and matching algorithm: The main software development of the present invention assessed which vision feature type is most effective for pixel correspondence of noisy IR images. Log-Gabor edge type features are sufficient for good correspondence, but newer features such as a Ferns algorithm, Binary Robust Independent Elementary Features (BRIEF), and Speeded Up Robust Features (SURF) improve accuracy/speed/coverage tradeoffs of processed terrain images collected. The present invention has a stereo algorithm framework (see
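As a rough illustration of correspondence matching with binary descriptors of the BRIEF kind, the sketch below compares descriptors by Hamming distance and keeps only pairs that select each other in both directions, a simple form of consistency validation. The descriptor values, the `max_dist` threshold, and the function names are hypothetical; this disclosure does not specify them.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def best_match(query, candidates):
    """Index and distance of the candidate closest to `query` in Hamming distance."""
    dists = [hamming(query, c) for c in candidates]
    idx = min(range(len(dists)), key=dists.__getitem__)
    return idx, dists[idx]


def match_cross_checked(left, right, max_dist=2):
    """Return (i, j) pairs where left[i] and right[j] pick each other.

    max_dist is an assumed rejection threshold, not a value from the source.
    """
    matches = []
    if not left or not right:
        return matches
    for i, desc in enumerate(left):
        j, dist = best_match(desc, right)
        back, _ = best_match(right[j], left)  # right-to-left consistency check
        if back == i and dist <= max_dist:
            matches.append((i, j))
    return matches


# Hypothetical 8-bit descriptors from a left and a right infrared image.
left_desc = [0b10110010, 0b01001101, 0b11110000]
right_desc = [0b01001111, 0b10110011, 0b00001111]
pairs = match_cross_checked(left_desc, right_desc)
```

The third left descriptor is rejected because its nearest right descriptor prefers a different left feature; discarding such one-sided matches trades coverage for fewer false correspondences in noisy images.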
(32) Sparse features can be used to detect obstacles in Infrared imagery using thermal cameras (see
(33) Sparse to Depth Computation, segmentation and hole filling: Because a sparse stereo algorithm won't completely generate an estimate of depth for every pixel, a secondary hole-filling algorithm is used. For this task, a segmentation algorithm is used to partition overhead terrain and a reconstruction algorithm performs in real time on an embedded processor. A Mean shift segmentation algorithm and a linear plane fit interpolation algorithm are used to fill empty regions (
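The hole-filling step can be sketched as a least-squares plane fit per segment. The example below assumes the segmentation labels are already available (for example from a mean shift segmentation, which is not implemented here) and fills each missing depth from the plane z = a*x + b*y + c fitted to the sparse depths in the same segment. The grid layout, the use of `None` for holes, and the function names are assumptions made for illustration.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(points):
    """Least-squares plane z = a*x + b*y + c through (x, y, z) samples.

    Returns None when the samples are too few or collinear.
    """
    n = len(points)
    if n < 3:
        return None
    sx = sum(x for x, _, _ in points)
    sy = sum(y for _, y, _ in points)
    sz = sum(z for _, _, z in points)
    sxx = sum(x * x for x, _, _ in points)
    syy = sum(y * y for _, y, _ in points)
    sxy = sum(x * y for x, y, _ in points)
    sxz = sum(x * z for x, _, z in points)
    syz = sum(y * z for _, y, z in points)
    normal = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(n)]]
    rhs = [sxz, syz, sz]
    d = det3(normal)
    if abs(d) < 1e-9:
        return None
    coeffs = []
    for k in range(3):  # Cramer's rule: replace column k with the right-hand side
        mk = [row[:] for row in normal]
        for r in range(3):
            mk[r][k] = rhs[r]
        coeffs.append(det3(mk) / d)
    return tuple(coeffs)


def fill_holes(depth, labels):
    """Fill None entries of `depth` using one fitted plane per segment label."""
    segments = {}
    for y, row in enumerate(depth):
        for x, z in enumerate(row):
            if z is not None:
                segments.setdefault(labels[y][x], []).append((x, y, z))
    planes = {lab: fit_plane(pts) for lab, pts in segments.items()}
    filled = [row[:] for row in depth]
    for y, row in enumerate(depth):
        for x, z in enumerate(row):
            plane = planes.get(labels[y][x])
            if z is None and plane is not None:
                a, b, c = plane
                filled[y][x] = a * x + b * y + c
    return filled


# Hypothetical 3x3 segment with sparse depths at the four corners only.
depth = [[1.0, None, 5.0], [None, None, None], [7.0, None, 11.0]]
labels = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
filled = fill_holes(depth, labels)
```

Segments with fewer than three non-collinear sparse points are left unfilled rather than guessed, so a downstream consumer can still tell measured and interpolated regions from missing ones.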
(34) The sensing and perception systems on vehicle 10 are to collect data such that the vehicle can interact with its environment. One of the data types/formats that appears to be most useful is colorized 3D point clouds. This is typically achieved by overlaying a camera image on top of range data. The combination of image analysis (edge/blob detection) with detecting features in the 3D range data allows for better reliability in the detection and identification of features.
(35) RGB cameras and laser sensors have significant problems seeing through obscurants such as dust or smoke. In agriculture, dust is a common problem. Sensing and detecting clouds of dust can be of interest, but the features hidden behind a dust cloud are typically of greater interest. The present invention provides sensing systems that can penetrate dust clouds and provide information on the objects/features hidden behind them. While radar is very good at penetrating dust, it typically only detects macro features (large scale features), so it is normally very difficult to get higher resolution information on smaller features. Typically, the longer the wavelength of the energy, the better it can penetrate obscurants. There is a relationship between obscurant particle sizes and how well a certain wavelength can penetrate, so longer wavelengths (e.g. LWIR) are typically better at penetrating obscurants. Data from a stereo LWIR camera is used to fill in the 3D data where dust obscures certain fields of vision of other sensing devices (RGB cameras, laser scanners, etc.).
(36) There is also a known ratio between energy levels in the NIR/SWIR and red spectrum ranges for detection of chlorophyll. The present invention can use this to detect foliage, which is abundantly present in many agricultural environments. The present invention measures and compares near infrared and red spectrum energy level ratios. This can be used to help differentiate plants from other objects (rock vs shrub). For this reason it is useful to have an NIR sensor included in the sensing system of the present invention.
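A minimal sketch of the ratio comparison follows. It relies on the fact that chlorophyll absorbs red light while healthy foliage reflects strongly in the near infrared, so the NIR-to-red ratio is high for plants. The threshold of 2.0, the reflectance values, and the function names are hypothetical; a real system would calibrate the threshold for its sensors and lighting.

```python
def nir_red_ratio(nir: float, red: float, eps: float = 1e-6) -> float:
    """Ratio of NIR to red energy; eps guards against division by zero."""
    return nir / (red + eps)


def foliage_mask(nir_img, red_img, threshold=2.0):
    """Per-pixel True where the NIR/red ratio exceeds an assumed threshold."""
    return [[nir_red_ratio(n, r) > threshold for n, r in zip(nir_row, red_row)]
            for nir_row, red_row in zip(nir_img, red_img)]


# Hypothetical registered reflectance images (values between 0 and 1).
# Left column: shrub-like pixels; right column: rock-like pixels.
nir_img = [[0.50, 0.30], [0.45, 0.20]]
red_img = [[0.08, 0.25], [0.10, 0.22]]
mask = foliage_mask(nir_img, red_img)
```

The sketch assumes the NIR and red images are already registered to the same pixel grid, so that each ratio compares energy from the same surface point.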
(37) In summary, the present invention would have and provide:
- A stereo RGB camera (provides a colorized 3D point cloud)
- A stereo LWIR camera (fills in 3D data where obscurants/dust are present, and detects warm blooded creatures)
- An NIR sensor (for foliage/plant detection)
The algorithms fuse data from these 3 sensors to provide an enhanced 3D point cloud that will allow the software to make decisions with higher levels of confidence.
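One simple way to picture the fusion is a per-pixel rule: keep the colorized visible-light depth where it exists, fall back to LWIR depth where the visible stereo has no measurement (for example behind dust or in low light), and attach the foliage flag from the NIR comparison. The sketch below assumes all three inputs are already registered to a common pixel ordering; the `FusedPoint` type, field names, and sample values are hypothetical and do not describe the actual fusion algorithm of this disclosure.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FusedPoint:
    depth: float                            # range in metres
    color: Optional[Tuple[int, int, int]]   # RGB, or None for LWIR-only points
    source: str                             # "rgb" or "lwir"
    foliage: bool                           # flag from the NIR/red comparison


def fuse(rgb_depth, rgb_color, lwir_depth, foliage) -> List[Optional[FusedPoint]]:
    """Fuse per-pixel measurements; None marks a missing measurement."""
    fused = []
    for z_rgb, color, z_lwir, is_foliage in zip(rgb_depth, rgb_color,
                                                lwir_depth, foliage):
        if z_rgb is not None:
            fused.append(FusedPoint(z_rgb, color, "rgb", is_foliage))
        elif z_lwir is not None:
            # Visible stereo failed here, so fill in from LWIR.
            fused.append(FusedPoint(z_lwir, None, "lwir", is_foliage))
        else:
            fused.append(None)
    return fused


# Hypothetical three-pixel example: clear view, dust-obscured, no data at all.
cloud = fuse(rgb_depth=[4.0, None, None],
             rgb_color=[(10, 200, 30), None, None],
             lwir_depth=[4.2, 7.5, None],
             foliage=[True, False, False])
```

Recording the source of each point lets later decision logic weigh LWIR-only points differently from points confirmed by the visible-light camera.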
(38) While this invention has been described with respect to at least one embodiment, the present invention can be further modified within the spirit and scope of this disclosure. This application is therefore intended to cover any variations, uses, or adaptations of the invention using its general principles. Further, this application is intended to cover such departures from the present disclosure as come within known or customary practice in the art to which this invention pertains and which fall within the limits of the appended claims.