METHOD FOR DETECTING OBJECTS IN AUTOMOTIVE-GRADE RADAR SIGNALS
20220242443 · 2022-08-04
Inventors
- Johannes TRAA (Medford, MA, US)
- Andrew M. SCHWEITZER (Cambridge, MA, US)
- Atulya YELLEPEDDI (Medford, MA, US)
Cpc classification
B60W2554/4048
PERFORMING OPERATIONS; TRANSPORTING
B60W60/001
PERFORMING OPERATIONS; TRANSPORTING
B60W2554/4049
PERFORMING OPERATIONS; TRANSPORTING
International classification
Abstract
A method includes an operation to collect radar signals reflected from objects in a field of view. Range-angle-doppler bins representing three-dimensional objects in the field of view are formed. A local median operation is used across a selected dimension of the range-angle-doppler bins to eliminate background noise in the range-angle-doppler bins. Low energy peak regions are masked by removing radial velocity values in the selected dimension to form a sparse range-angle two-dimensional grid. The radar signals reflected from objects in the field of view are processed to extract reflection point detections. Reflection point detections are tracked in accordance with short-term filter rules to form tracked reflection point detections. The tracked reflection point detections are formed into clusters. The clusters are processed with long-term filter rules.
Claims
1. A method, comprising: collect radar signals reflected from objects in a field of view; form range-angle-doppler bins representing three-dimensional objects in the field of view; use a local median operation across a selected dimension of the range-angle-doppler bins to eliminate background noise in the range-angle-doppler bins; and mask low energy peak regions by removing radial velocity values in the selected dimension to form a sparse range-angle two-dimensional grid.
2. The method of claim 1 further comprising an operation to utilize peak region validation of the range-angle-doppler bins.
3. The method of claim 1 further comprising up-sampling remaining sparse range-angle two-dimensional grid values with bilinear interpolation.
4. The method of claim 1 wherein the selected dimension is the Doppler dimension.
5. The method of claim 1 further comprising threshold masking.
6. The method of claim 1 further comprising background estimation.
7. The method of claim 1 wherein the radar signals reflected from objects in the field of view are processed to extract reflection point detections.
8. The method of claim 7 further comprising applying short-term tracking rules to the reflection point detections to form tracklets, where each tracklet corresponds to a reflection point detection tracked over several frames.
9. The method of claim 8 further comprising applying long-term tracking rules to a cluster of tracklets to track objects having several reflection point detections.
10. The method of claim 9 further comprising applying rules to the tracklets to establish births of the tracklets.
11. The method of claim 9 further comprising applying rules to the tracklets to establish deaths of the tracklets.
12. The method of claim 9 further comprising applying rules to the group to establish birth of the group.
13. The method of claim 9 further comprising applying rules to the group to establish death of the group.
14. The method of claim 9 further comprising applying a Kalman filter to the tracklets.
15. The method of claim 9 further comprising applying a Kalman filter to the group.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0029] The present disclosure is best understood from the following detailed description when read with the accompanying figures. It is emphasized that, in accordance with the standard practice in the industry, various features are not necessarily drawn to scale, and are used for illustration purposes only. Where a scale is shown, explicitly or implicitly, it provides only one illustrative example. In other embodiments, the dimensions of the various features may be arbitrarily increased or reduced for clarity of discussion. Similarly, for the purposes of clarity and brevity, not every component may be labeled in every drawing.
[0030] For a fuller understanding of the nature and advantages of the present invention, reference is made to the following detailed description of preferred embodiments and in connection with the accompanying drawings, in which:
[0031]
[0032]
[0033]
[0034]
[0035]
[0036]
[0037]
[0038]
[0039]
[0040]
[0041]
DETAILED DESCRIPTION
[0042] The present disclosure relates to techniques for detecting objects in automotive-grade radar signals. More particularly, the disclosure describes background estimation and peak region validation in a radar environment. During radar signal processing, background subtraction tends to be overly aggressive and computationally intensive. One method disclosed herein runs a median operation along one dimension of a radar cube.
[0043] This removes one dimension from the computation of the background statistics, e.g., 3D to 2D or 4D to 3D, as may be used in multidimensional processing. In a 3D to 2D example, Doppler velocity can be removed using the median operation. A local 2D median can then be applied on a sparse range-angle grid. The result can then be up-sampled with bilinear interpolation. Additionally, spurious peaks can be eliminated by a validation step.
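By way of non-limiting illustration, the up-sampling step mentioned above may be sketched as follows. The function name `bilinear_upsample`, the corner-aligned sampling convention, and the toy 2×2 grid are assumptions made for this example only and are not taken from the disclosure.

```python
def bilinear_upsample(grid, out_rows, out_cols):
    """Up-sample a 2D grid (list of rows) with bilinear interpolation.

    Assumption: the corners of the sparse grid map onto the corners of
    the dense grid (corner-aligned sampling).
    """
    in_rows, in_cols = len(grid), len(grid[0])

    def source_coord(index, out_size, in_size):
        # Map an output index to (low neighbor, high neighbor, fraction).
        if out_size == 1:
            return 0, 0, 0.0
        pos = index * (in_size - 1) / (out_size - 1)
        low = min(int(pos), in_size - 1)
        high = min(low + 1, in_size - 1)
        return low, high, pos - low

    dense = []
    for i in range(out_rows):
        y0, y1, fy = source_coord(i, out_rows, in_rows)
        row = []
        for j in range(out_cols):
            x0, x1, fx = source_coord(j, out_cols, in_cols)
            # Interpolate along the columns first, then along the rows.
            top = grid[y0][x0] * (1 - fx) + grid[y0][x1] * fx
            bottom = grid[y1][x0] * (1 - fx) + grid[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        dense.append(row)
    return dense


# Toy sparse range-angle background estimate (hypothetical values).
sparse_background = [[0.0, 2.0],
                     [4.0, 6.0]]
dense_background = bilinear_upsample(sparse_background, 3, 3)
```

In practice an optimized image resampling routine would likely be preferred; the explicit loops are shown only to make the interpolation weights visible.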
[0044] The following description and drawings set forth certain illustrative implementations of the disclosure in detail, which are indicative of several exemplary ways in which the various principles of the disclosure may be carried out. The illustrative examples, however, are not exhaustive of the many possible embodiments of the disclosure. Other objects, advantages and novel features of the disclosure are set forth in the following description when considered in view of the drawings, where applicable.
[0045] The present disclosure generally relates to millimeter wave sensing, although other wavelengths and applications are not beyond the scope of the invention. Specifically, the present method pertains to a sensing technology called Frequency Modulated Continuous Wave (FMCW) radar, which is very popular in the automotive and industrial segments.
[0046] FMCW radar measures the range, velocity, and angle of arrival of objects in front of it. At the heart of an FMCW radar is a signal called a chirp.
[0047] A chirp is a sinusoid or a sine wave whose frequency increases linearly with time.
[0048] Thus, the chirp is a continuous wave whose frequency is linearly modulated. Hence the term frequency modulated continuous wave or FMCW for short.
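By way of non-limiting illustration, the linear frequency ramp of a chirp may be sketched as follows. The start frequency, bandwidth, and chirp duration are hypothetical values chosen only for this example, and the names `chirp_frequency`, `F_START`, `BANDWIDTH`, `T_CHIRP`, and `SLOPE` are assumptions that do not appear in the disclosure.

```python
def chirp_frequency(t, f_start, slope):
    """Instantaneous frequency (Hz) of a linear chirp at time t (seconds)."""
    return f_start + slope * t


# Hypothetical chirp parameters, for illustration only.
F_START = 77e9                 # start frequency, Hz
BANDWIDTH = 4e9                # total swept bandwidth, Hz
T_CHIRP = 40e-6                # chirp duration, s
SLOPE = BANDWIDTH / T_CHIRP    # rate of frequency change, Hz per second

# Frequency at five evenly spaced instants from chirp start to chirp end.
sample_times = [k * T_CHIRP / 4 for k in range(5)]
frequencies = [chirp_frequency(t, F_START, SLOPE) for t in sample_times]

# Equal time steps give equal frequency steps, which is what "linearly
# modulated" means in this context.
steps = [b - a for a, b in zip(frequencies, frequencies[1:])]
```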
[0049]
[0050] The resultant signal is called an intermediate frequency (IF) signal. The IF signal is prepared for signal processing by low-pass (LP) filtering and is sampled using an analog-to-digital converter (ADC). The significance of the mixer will now be described in greater detail.
[0051]
[0052] The instantaneous frequency of the output equals the difference of the instantaneous frequencies of the two input sinusoids. So, the frequency of the output at any point in time is equal to the difference of the input frequencies of the two time-varying sinusoids at that point in time. Tau, τ, represents the round-trip delay from the radar to the object and back. It can also be expressed as twice the distance to the object, d, divided by the speed of light, c. A single object in front of the radar produces an IF signal with a constant frequency given by S·2d/c, where S is the slope of the chirp.
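By way of non-limiting illustration, the relationship between object distance and IF frequency may be sketched as follows. Here S is taken to be the chirp slope in Hz per second, which is an assumption consistent with common FMCW usage; the function names and the numerical values are hypothetical.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def round_trip_delay(distance_m):
    """Round-trip delay tau = 2d / c, in seconds."""
    return 2.0 * distance_m / SPEED_OF_LIGHT


def if_frequency(distance_m, slope_hz_per_s):
    """Constant IF frequency S * 2d / c produced by a single object."""
    return slope_hz_per_s * round_trip_delay(distance_m)


def range_from_if(f_if_hz, slope_hz_per_s):
    """Invert the relationship: d = f_IF * c / (2 * S)."""
    return f_if_hz * SPEED_OF_LIGHT / (2.0 * slope_hz_per_s)


# Hypothetical example: object at 30 m, chirp slope of 1e14 Hz/s.
slope = 1e14
f_if = if_frequency(30.0, slope)
recovered_range = range_from_if(f_if, slope)
```

Because the IF frequency is proportional to distance, doubling the distance doubles the IF frequency, which is why a frequency axis can be relabeled as a range axis.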
[0053]
[0054] Each row corresponds to one chirp. That is, for every chirp there is a row in the chirp index, i.e., N rows for N chirps. Each box in a particular row represents one ADC sample. Accordingly, if each chirp is sampled M times, there will be M columns in the matrix. The transformation of the data matrix into range and velocity matrices will now be described.
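By way of non-limiting illustration, the N-by-M data matrix and a row-wise frequency transform may be sketched as follows. The sketch assumes a single stationary object whose IF tone falls exactly on one frequency bin, and it uses a plain discrete Fourier transform in place of an optimized FFT; the sizes and names (`N_CHIRPS`, `M_SAMPLES`, `TONE_BIN`) are hypothetical.

```python
import cmath
import math

N_CHIRPS = 4     # rows: one per chirp (hypothetical size)
M_SAMPLES = 16   # columns: ADC samples per chirp (hypothetical size)
TONE_BIN = 3     # IF frequency of the object, expressed in DFT bins (assumed)


def if_sample(m):
    """Complex IF sample m of a chirp reflected by one stationary object."""
    return cmath.exp(2j * math.pi * TONE_BIN * m / M_SAMPLES)


# Data matrix: N rows (chirps) by M columns (ADC samples).
data_matrix = [[if_sample(m) for m in range(M_SAMPLES)]
               for _ in range(N_CHIRPS)]


def dft_magnitudes(row):
    """Magnitude of the discrete Fourier transform of one row."""
    n = len(row)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * m / n)
                    for m, x in enumerate(row)))
            for k in range(n)]


# Transforming each row yields one range profile per chirp.
range_profiles = [dft_magnitudes(row) for row in data_matrix]
peak_bins = [max(range(M_SAMPLES), key=profile.__getitem__)
             for profile in range_profiles]
```

Every chirp shows its peak in the same bin because the object is assumed stationary; a moving object would additionally rotate the phase of that bin from chirp to chirp.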
[0055]
[0056] The application of the range-FFT resolves objects in range. As one skilled in the art can appreciate, the x-axis is actually the frequency corresponding to the range FFT bins. But, since range is proportional to the IF frequency, this can be plotted directly as the range axis. Therefore,
[0057]
[0058] As can be appreciated,
[0059]
[0060] In this example depicted in
[0061]
[0062] The radar cube data is in the form of a 3-dimensional, complex-valued array with dimensions corresponding to azimuth (angle), radial velocity (doppler), and radial distance (range). The magnitude in each angle-doppler-range bin is taken to describe how much energy the radar sensor sees coming from that point in space (angle and range) for that radial velocity. For demonstrative purposes, linear antenna arrays oriented parallel to the ground are assumed. Pillars with peak energies can be selected for peak validation, which will be discussed later in the disclosure.
[0063]
[0064] The result is depicted in
[0065] Foreground extraction can be performed by various methods for suppressing noise and artifacts and extracting salient regions of a radar cube. Most, if not all, involve estimating either a background model and somehow removing it or creating a mask that emphasizes desired bins and suppresses others through element-wise multiplication.
[0066] Constant False Alarm Rate (CFAR) thresholding is probably the most well-known and well-studied technique and involves estimating a background model through local averaging. CFAR detection refers to a common form of adaptive algorithm used in radar systems to detect target returns against a background of noise, clutter and interference.
[0067] The primary idea is that noise statistics may be non-uniform across the array. CA-CFAR (cell averaging) computes a moving mean while excluding a region at the center of the averaging window (guard cells) to avoid including a desired object in the background estimate. OS-CFAR (order-statistic) does the same computation but with a percentile operation instead of a mean. Given the background model (estimate of background value in each bin) $b_{ijk}$, the foreground can be estimated as:
$$\forall i,j,k:\quad x_{ijk} \leftarrow x_{ijk} \odot \left(x_{ijk} \ge \alpha \cdot b_{ijk}\right) \tag{1}$$
[0068] for some factor α that controls the amount of background suppression.
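By way of non-limiting illustration, cell-averaging CFAR followed by the thresholding of equation (1) may be sketched in one dimension as follows. The window sizes, the factor α = 3, the toy signal, and the function names are assumptions made for this example; an order-statistic variant would replace the mean with a percentile.

```python
def ca_cfar_background(x, num_train, num_guard):
    """Cell-averaging background estimate for every cell of a 1D signal.

    For each cell under test, the mean is taken over `num_train` training
    cells on each side, skipping the cell itself and `num_guard` guard
    cells on each side. Assumes at least one training cell always falls
    inside the array.
    """
    n = len(x)
    background = []
    for i in range(n):
        cells = []
        for offset in range(num_guard + 1, num_guard + num_train + 1):
            for j in (i - offset, i + offset):
                if 0 <= j < n:
                    cells.append(x[j])
        background.append(sum(cells) / len(cells))
    return background


def apply_threshold(x, background, alpha):
    """Keep a value only where it is at least alpha times the background."""
    return [v if v >= alpha * b else 0.0 for v, b in zip(x, background)]


# Toy magnitude profile (hypothetical): noise floor near 1, target at index 4.
signal = [1.0, 1.2, 0.9, 1.1, 9.0, 1.0, 0.8, 1.1, 1.0, 1.2]
background = ca_cfar_background(signal, num_train=2, num_guard=1)
foreground = apply_threshold(signal, background, alpha=3.0)
```

The guard cells keep the target itself out of its own background estimate, so the estimate at index 4 stays near the noise floor and the target survives the threshold.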
[0069] An object of the present disclosure is to efficiently isolate objects in an automotive scene from noise. The motivation is that automotive radar signals consist of reflections off of objects (e.g., cars, cyclists, buildings) and that traditional approaches may not simultaneously achieve accuracy, efficiency, and usability in a machine learning pipeline.
[0070] Previous solutions have attempted 2D-only or full 3D background estimation. Other solutions do not include peak validation or retention of peak regions.
[0071] Solutions to similar problems in related fields and other disciplines are untenable in their application to 3D radar data cubes. They also create barriers to computational efficiency.
[0072]
[0073] The OS-CFAR described above is expensive and a bit inflexible because it operates on the full radar cube. This implementation can be improved without significantly affecting the result as follows. For demonstrative purposes, it is assumed that the background model is only a function of range-angle and not doppler. By doing so, a full 3D moving median can be approximated with a median over doppler followed by a 2D moving median over range-angle.
[0074] This allows much more flexibility in terms of choosing (2D) window size and overlap. The moving median on a sparse grid can be calculated and quickly upsampled to the original range-angle grid with widely used and optimized image resampling techniques. The two median operations give this approach its name.
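By way of non-limiting illustration, the two median operations may be sketched as follows. The cube is indexed here as `cube[range][angle][doppler]`, which is an assumption made for convenience, as are the window size, the stride of the sparse grid, the toy data, and the function names.

```python
from statistics import median


def median_over_doppler(cube):
    """Collapse cube[range][angle][doppler] to a 2D range-angle map.

    Each output value is the median of one doppler pillar.
    """
    return [[median(pillar) for pillar in row] for row in cube]


def sparse_moving_median(grid, half_window, stride):
    """2D moving median evaluated only at every `stride`-th grid point."""
    n_rows, n_cols = len(grid), len(grid[0])
    sparse = []
    for r in range(0, n_rows, stride):
        out_row = []
        for a in range(0, n_cols, stride):
            # Window is clipped at the edges of the grid.
            window = [grid[i][j]
                      for i in range(max(0, r - half_window),
                                     min(n_rows, r + half_window + 1))
                      for j in range(max(0, a - half_window),
                                     min(n_cols, a + half_window + 1))]
            out_row.append(median(window))
        sparse.append(out_row)
    return sparse


# Toy cube (hypothetical): 6 range x 6 angle x 5 doppler bins with a noise
# floor of 1.0 and one strong return in a single doppler bin.
cube = [[[1.0] * 5 for _ in range(6)] for _ in range(6)]
cube[2][3][4] = 50.0

range_angle_map = median_over_doppler(cube)
sparse_background = sparse_moving_median(range_angle_map,
                                         half_window=1, stride=2)
```

Because the strong return occupies only one of the five doppler bins in its pillar, the doppler median ignores it and the background estimate stays at the noise floor.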
[0075] Turning to
[0076] Typically, in the art, aggressive thresholding is applied to a radar cube, which compresses the data by effectively throwing away a large amount of information, i.e., the data under the threshold. Image thresholding is a simple, yet effective, way of partitioning an image into a foreground and background. This image analysis technique is a type of image segmentation that isolates objects by converting grayscale images into binary images. Image thresholding is most effective in images with high levels of contrast.
[0077] By thresholding less aggressively, a larger area of surrounding regions could be kept. This is useful in intelligent perception analysis. Constructive perception or intelligent perception is the theory of perception in which the perceiver uses sensory information and other sources of information to construct a cognitive understanding of a stimulus. In contrast to this top-down approach, there is the bottom-up approach of direct perception. Perception is more of a hypothesis, and the evidence to support this is that “Perception allows behavior to be generally appropriate to non-sensed object characteristics,” meaning that we react to obvious things that, for example, are like doors even though we only see a “long, narrow rectangle as the door is ajar.”
[0078] In the present embodiment, intelligent perception not only allows a radar system to identify an object's (e.g., a car's) position, speed and direction, it can also dispositively associate that object with a previously identified car. That is, the radar system can more easily temporally correlate objects between scenes, thereby mitigating surprises.
[0079] In one or more embodiments, a fourth dimension includes tangential velocity. In state-of-the-art radar systems, doppler can only resolve velocity toward or away from the observer, that is, toward or away from the radar antenna. In some embodiments, tangential velocity can resolve velocities substantially normal to the plane of the observer. In layman's terms, this includes left-right or up-down movement and speed (actually velocity, since it is a vector). A 4D radar cube can be referred to as a tesseract.
[0080] Turning back to
[0081] However, medians are computationally intensive since they require sorting. If the background is assumed to be independent of the radial velocity direction, one dimension of the radar cube can be removed as follows. A moving median is performed along the radial velocity direction. This median can then be used to reduce the 3D radar cube to just range-angle by substituting the velocity median in the third dimension. The importance of which will be described in more detail in the discussion of
[0082] Once background estimation is achieved, masking can be performed. Masking removes any background noise while allowing peak regions of interest to remain. The result is a threshold cube which is computationally far more tenable. Peak detection and filtering can now be performed which will also be discussed later in the disclosure.
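By way of non-limiting illustration, the masking step may be sketched as follows. The sketch assumes a background model that depends only on range and angle, a cube indexed as `cube[range][angle][doppler]`, and a suppression factor `alpha`; these choices, the toy values, and the function name `mask_cube` are hypothetical.

```python
def mask_cube(cube, background, alpha):
    """Zero every bin that falls below alpha times its range-angle background.

    `cube` is indexed as cube[range][angle][doppler] and `background` as
    background[range][angle]; the same background value is applied to
    every doppler bin of a pillar.
    """
    return [[[v if v >= alpha * background[r][a] else 0.0 for v in pillar]
             for a, pillar in enumerate(row)]
            for r, row in enumerate(cube)]


# Toy magnitudes (hypothetical): 2 range x 2 angle x 3 doppler bins.
cube = [[[1.0, 1.1, 0.9], [1.0, 8.0, 1.2]],
        [[0.8, 1.0, 1.0], [1.1, 0.9, 1.0]]]
background = [[1.0, 1.0],
              [1.0, 1.0]]

masked = mask_cube(cube, background, alpha=3.0)

# Indices of the bins that survive the mask.
surviving_bins = [(r, a, d)
                  for r, row in enumerate(masked)
                  for a, pillar in enumerate(row)
                  for d, v in enumerate(pillar) if v > 0.0]
```

A smaller `alpha` thresholds less aggressively and retains more of the region surrounding each peak.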
[0083] An aspect of the radar cube processing includes using short-term track hypotheses (referred to as “tracklets”) to model detections. Tracklets are clustered into groups to form longer-term tracking hypotheses (referred to as “groups”). The tracklets and groups are subject to birth-death rules and different filters, as discussed below.
[0084] The peak list can then be processed with different state of the art filtering and detection methods. These methods do not take into account the small variation of the reflection points from measurement to measurement. For example, a reflection point on a curved surface of an object like a car hood can change its position depending on orientation and position of the car. This movement of the reflection point then adds an artificial movement not corresponding to the real movement of the object. It becomes difficult to associate all the reflection points to the same object due to this offset. Further, as the object moves, new reflection points on the object show up and others vanish. To mitigate this problem, the following novel techniques are used.
[0085] As a first step, a short-term hypothesis is used to model the dynamic/short-term movement of each detection from frame to frame. The tracked/filtered detections are the previously referenced tracklets. As a next step, several tracklets (minimum one, but preferably more than one) are clustered. This cluster is then tracked by a long-term hypothesis to model the underlying movement of the object. The individual tracklets and groups are preferably modeled with alpha/beta Kalman filter rules and with birth and death rules. Kalman filters use a series of measurements observed over time to produce estimates of unknown variables that tend to be more accurate than those based on a single measurement alone. The accuracy is attributable to the joint probability distribution over the variables for each timeframe. An example of a birth rule is that a reflection has to be present in a number of frames, e.g., 3, to be tracked as a tracklet. Similarly, a death rule embodiment is that the corresponding detection of the tracklet has to be missing in 3 consecutive frames for the tracklet to die. The technical advantage of this method is that new and/or vanishing reflection points are modeled accurately, yet small fluctuations are compensated. Similar techniques are used for the objects (grouped or clustered tracklets) that are present in, entering, or leaving the scene.
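By way of non-limiting illustration, a tracklet with birth and death rules may be sketched as follows. The sketch uses a fixed-gain alpha-beta filter on a one-dimensional position as a simplified stand-in for the filter rules described above; the gains, the state labels, the class name `Tracklet`, and the choice to count the initial detection toward the birth rule are assumptions made for this example.

```python
class Tracklet:
    """One reflection point tracked over frames (illustrative sketch)."""

    BIRTH_HITS = 3     # frames with a detection before the tracklet is born
    DEATH_MISSES = 3   # consecutive missing frames before the tracklet dies

    def __init__(self, position, alpha=0.5, beta=0.1, dt=1.0):
        self.position = position
        self.velocity = 0.0
        self.alpha, self.beta, self.dt = alpha, beta, dt
        self.hits = 1            # the initial detection counts as one hit
        self.misses = 0
        self.state = "tentative"

    def update(self, measurement):
        """Advance one frame; pass None when the detection is missing."""
        predicted = self.position + self.velocity * self.dt
        if measurement is None:
            # Coast on the prediction and count the miss.
            self.position = predicted
            self.misses += 1
            if self.misses >= self.DEATH_MISSES:
                self.state = "dead"
            return
        # Alpha-beta correction of position and velocity.
        residual = measurement - predicted
        self.position = predicted + self.alpha * residual
        self.velocity += self.beta / self.dt * residual
        self.hits += 1
        self.misses = 0
        if self.state == "tentative" and self.hits >= self.BIRTH_HITS:
            self.state = "alive"


# Hypothetical sequence: three further detections, then three missing frames.
track = Tracklet(10.0)
states = []
for z in [10.5, 11.0, 11.4, None, None, None]:
    track.update(z)
    states.append(track.state)
```

A group of tracklets could be handled with the same structure, using a more constrained motion model and its own birth and death counts.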
[0086] It is advantageous to use filters with no constraints on the movements for the short-term hypothesis filtering, e.g., using a hovercraft object motion filter for the short-term hypothesis filter. For the long-term hypothesis filter, it is preferred to use filters with a more confined motion model, e.g., a vehicle motion model.
[0087]
[0088] In some embodiments, median background estimation on the sparse range-angle grid comprises computing a median over the eight points surrounding the origin of the square in question. In other embodiments, any number of two-dimensional points can be used. In yet other embodiments, other averaging metrics can be utilized and remain within the scope of the present disclosure.
[0089] Peak detection, thresholding and validation are disclosed as follows. Thresholding does most of the work of extracting foreground regions in the radar cube, but it is not perfect. Some spurious peaks rise above the threshold. These can be removed by assuming that all regions of interest are well-described as 3-dimensional blob shapes. This consists of finding all local maxima via:
[0090] and scoring these peaks as the sum of magnitudes within a 5×5×5 neighborhood centered on each bin and then masking out peak regions whose scores are below a minimum value. This can reduce the number of local maxima from hundreds or thousands to between 30 and 100.
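By way of non-limiting illustration, peak detection and validation may be sketched as follows. The local-maximum criterion used here (a nonzero bin strictly greater than all of its immediate neighbors) is an assumption made for this example, as are the minimum score, the toy cube, and the function names; only the 5×5×5 scoring neighborhood is taken from the description above.

```python
from itertools import product


def neighborhood(shape, center, radius):
    """Indices within `radius` of `center` along each axis, clipped to shape."""
    ranges = [range(max(0, c - radius), min(n, c + radius + 1))
              for c, n in zip(center, shape)]
    return product(*ranges)


def cube_shape(cube):
    return (len(cube), len(cube[0]), len(cube[0][0]))


def local_maxima(cube):
    """Nonzero bins strictly greater than all immediate (3x3x3) neighbors."""
    shape = cube_shape(cube)
    peaks = []
    for idx in product(*(range(n) for n in shape)):
        value = cube[idx[0]][idx[1]][idx[2]]
        if value > 0 and all(value > cube[i][j][k]
                             for i, j, k in neighborhood(shape, idx, 1)
                             if (i, j, k) != idx):
            peaks.append(idx)
    return peaks


def peak_score(cube, idx):
    """Sum of magnitudes in the 5x5x5 neighborhood centered on `idx`."""
    return sum(cube[i][j][k]
               for i, j, k in neighborhood(cube_shape(cube), idx, 2))


def validate_peaks(cube, min_score):
    """Keep only the peaks whose neighborhood score reaches `min_score`."""
    return [p for p in local_maxima(cube) if peak_score(cube, p) >= min_score]


# Toy thresholded cube (hypothetical): one blob and one isolated spurious peak.
cube = [[[0.0] * 7 for _ in range(7)] for _ in range(7)]
cube[3][3][3] = 5.0
for i, j, k in [(2, 3, 3), (4, 3, 3), (3, 2, 3), (3, 4, 3), (3, 3, 2), (3, 3, 4)]:
    cube[i][j][k] = 2.0
cube[0][6][0] = 4.0   # isolated bin with no surrounding energy

candidates = local_maxima(cube)
validated = validate_peaks(cube, min_score=10.0)
```

The isolated bin is a local maximum but has no supporting energy around it, so its score stays low and it is rejected, while the blob-shaped region is retained.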
[0091] The utility of the aforementioned is robustness in that spurious peaks are eliminated by the validation step. Additionally, usability is augmented by peak regions (rather than just the peaks) which can be used by a machine learning engine later in the pipeline.
[0092] While one or more of the previous embodiments entail performing a median operation over the velocity degree of freedom, the operator can be performed on any parameter in order to simplify (compress) data for further analysis. For example, medians or other averages (moving or otherwise) can be used on range, angle, and/or tangential velocity, in order to attenuate the data matrices.
[0093]
[0094] Having thus described several aspects and embodiments of the technology of this application, it is to be appreciated that various alterations, modifications, and improvements will readily occur to those of ordinary skill in the art. Such alterations, modifications, and improvements are intended to be within the spirit and scope of the technology described in the application. For example, those of ordinary skill in the art will readily envision a variety of other means and/or structures for performing the function and/or obtaining the results and/or one or more of the advantages described herein, and each of such variations and/or modifications is deemed to be within the scope of the embodiments described herein.
[0095] The above-described embodiments may be implemented in any of numerous ways. One or more aspects and embodiments of the present application involving the performance of processes or methods may utilize program instructions executable by a device (e.g., a computer, a processor, or other device) to perform, or control performance of, the processes or methods.
[0096] In this respect, various inventive concepts may be embodied as a computer readable storage medium (or multiple computer readable storage media) (e.g., a computer memory, one or more compact discs, optical discs, magnetic tapes, flash memories, circuit configurations in Field Programmable Gate Arrays or other semiconductor devices, or other tangible computer storage medium) encoded with one or more programs that, when executed on one or more computers or other processors, perform methods that implement one or more of the various embodiments described above.
[0097] The computer readable medium or media may be transportable, such that the program or programs stored thereon may be loaded onto one or more different computers or other processors to implement various ones of the aspects described above. In some embodiments, computer readable media may be non-transitory media.