METHOD FOR DETECTING OBJECTS IN AUTOMOTIVE-GRADE RADAR SIGNALS

20220242443 · 2022-08-04

    Abstract

    A method includes an operation to collect radar signals reflected from objects in a field of view. Range-angle-doppler bins representing three-dimensional objects in the field of view are formed. A local median operation is used across a selected dimension of the range-angle-doppler bins to eliminate background noise in the range-angle-doppler bins. Low energy peak regions are masked by removing radial velocity values in the selected dimension to form a sparse range-angle two-dimensional grid. The radar signals reflected from objects in the field of view are processed to extract reflection point detections. Reflection point detections are tracked in accordance with short-term filter rules to form tracked reflection point detections. The tracked reflection point detections are formed into clusters. The clusters are processed with long-term filter rules.

    Claims

    1. A method, comprising: collect radar signals reflected from objects in a field of view; form range-angle-doppler bins representing three-dimensional objects in the field of view; use a local median operation across a selected dimension of the range-angle-doppler bins to eliminate background noise in the range-angle-doppler bins; and mask low energy peak regions by removing radial velocity values in the selected dimension to form a sparse range-angle two-dimensional grid.

    2. The method of claim 1 further comprising an operation to utilize peak region validation of the range-angle-doppler bins.

    3. The method of claim 1 further comprising up-sampling remaining sparse range-angle two-dimensional grid values with bilinear interpolation.

    4. The method of claim 1 wherein the selected dimension is the Doppler dimension.

    5. The method of claim 1 further comprising threshold masking.

    6. The method of claim 1 further comprising background estimation.

    7. The method of claim 1 wherein the radar signals reflected from objects in the field of view are processed to extract reflection point detections.

    8. The method of claim 7 further comprising applying short-term tracking rules to the reflection point detections to form tracklets, where each tracklet corresponds to a reflection point detection tracked over several frames.

    9. The method of claim 8 further comprising applying long-term tracking rules to a cluster of tracklets to track objects having several reflection point detections.

    10. The method of claim 9 further comprising applying rules to the tracklets to establish births of the tracklets.

    11. The method of claim 9 further comprising applying rules to the tracklets to establish deaths of the tracklets.

    12. The method of claim 9 further comprising applying rules to the group to establish birth of the group.

    13. The method of claim 9 further comprising applying rules to the group to establish death of the group.

    14. The method of claim 9 further comprising applying a Kalman filter to the tracklets.

    15. The method of claim 9 further comprising applying a Kalman filter to the group.

    Description

    BRIEF DESCRIPTION OF THE DRAWINGS

    [0029] The present disclosure is best understood from the following detailed description when read with the accompanying figures. It is emphasized that, in accordance with the standard practice in the industry, various features are not necessarily drawn to scale, and are used for illustration purposes only. Where a scale is shown, explicitly or implicitly, it provides only one illustrative example. In other embodiments, the dimensions of the various features may be arbitrarily increased or reduced for clarity of discussion. Similarly, for the purposes of clarity and brevity, not every component may be labeled in every drawing.

    [0030] For a fuller understanding of the nature and advantages of the present invention, reference is made to the following detailed description of preferred embodiments and in connection with the accompanying drawings, in which:

    [0031] FIGS. 1A and 1B depict an exemplary radar chirp as a function of time, as known in the art;

    [0032] FIG. 2 depicts an exemplary auto-grade radar system, according to some embodiments;

    [0033] FIGS. 3A and 3B illustrate the frequency difference in exemplary send and receive radar chirps, according to some embodiments;

    [0034] FIG. 4 illustrates an exemplary two-dimensional range array being populated, according to some embodiments;

    [0035] FIGS. 5A and 5B illustrate the creation of a velocity-range array from a chirp index-range array, according to some embodiments;

    [0036] FIG. 6 depicts an exemplary antenna array used to calculate angle, according to some embodiments;

    [0037] FIG. 7 depicts an exemplary range-angle-velocity radar cube, according to some embodiments;

    [0038] FIGS. 8A and 8B depict a scene before and after image processing, according to some embodiments;

    [0039] FIG. 9 is an exemplary method for detecting objects in automotive-grade radar, according to some embodiments;

    [0040] FIG. 10 depicts an exemplary median background estimation on a sparse range angle grid, according to some embodiments; and

    [0041] FIG. 11 is a schematic of an exemplary radar system, according to some embodiments.

    DETAILED DESCRIPTION

    [0042] The present disclosure relates to techniques for detecting objects in automotive-grade radar signals. More particularly, the disclosure describes background estimation and peak region validation in a radar environment. During radar signal processing, background subtraction tends to be overly aggressive and is computationally intensive. One method disclosed herein runs a median operation on one dimension of a radar cube.

    [0043] This removes one dimension from the computation of the background statistics, e.g., 3D to 2D or 4D to 3D as may be used in multidimensional processing. In a 3D to 2D example, doppler-velocity can be removed using the median operation. A local 2D median can then be applied on a sparse range-angle grid. The result can then be up-sampled with bilinear interpolation. Additionally, spurious peaks can be eliminated by a validation step.

    [0044] The following description and drawings set forth certain illustrative implementations of the disclosure in detail, which are indicative of several exemplary ways in which the various principles of the disclosure may be carried out. The illustrative examples, however, are not exhaustive of the many possible embodiments of the disclosure. Other objects, advantages and novel features of the disclosure are set forth in the following description in view of the drawings where applicable.

    [0045] The present disclosure generally relates to millimeter-wave sensing, although other wavelengths and applications are not beyond the scope of the invention. Specifically, the present method pertains to a sensing technology called Frequency Modulated Continuous Wave (FMCW) radar, which is widely used in the automotive and industrial segments.

    [0046] FMCW radar measures the range, velocity, and angle of arrival of objects in front of it. At the heart of an FMCW radar is a signal called a chirp. FIGS. 1A and 1B depict an exemplary radar chirp as a function of time, as known in the art.

    [0047] A chirp is a sinusoid or a sine wave whose frequency increases linearly with time. FIG. 1A shows this as an amplitude versus time, or A-t plot. Turning to FIG. 1B, the chirp starts as a sine wave with a frequency of fc and gradually increases its frequency, ending up with a frequency of fc plus B, where B is the bandwidth of the chirp. The frequency of the chirp increases linearly with time, linear being the operative word. So, in the f-t plot, the chirp would be a straight line with a slope S.

    [0048] Thus, the chirp is a continuous wave whose frequency is linearly modulated. Hence the term frequency modulated continuous wave or FMCW for short.

    [0049] FIG. 2 depicts an exemplary auto-grade radar system, according to some embodiments. It is represented as a simplified block diagram of an FMCW radar with a single TX and a single RX antenna. In one or more embodiments, the radar operates as follows. A synthesizer generates a chirp. This chirp is transmitted by the TX antenna. The chirp is then reflected off an object, such as a car. The reflected chirp is received at the RX antenna. The RX signal and the TX signal are mixed at a mixer.

    [0050] The resultant signal is called an intermediate frequency (IF) signal. The IF signal is prepared for signal processing by low-pass (LP) filtering and is sampled using an analog to digital converter (ADC). The significance of the mixer will now be described in greater detail.

    [0051] FIGS. 3A and 3B illustrate the frequency difference in exemplary send and receive radar chirps, according to some embodiments. In one or more embodiments, this difference is estimated using a mixer. A mixer has two inputs and one output, as is known in the art. If two sinusoids are input to the two input ports of the mixer, the output of the mixer is also a sinusoid as described below.

    [0052] The instantaneous frequency of the output equals the difference of the instantaneous frequencies of the two input sinusoids. So, the frequency of the output at any point in time would be equal to the difference of the input frequencies of two time-varying sinusoids at that point in time. Tau, τ, represents the round-trip delay from the radar to the object and back. It can also be expressed as twice the distance to the object divided by the speed of light, τ = 2d/c. A single object in front of the radar produces an IF signal with a constant frequency given by Sτ = S·2d/c.
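
    The relation between distance and IF frequency can be illustrated with a minimal sketch. The chirp bandwidth, chirp duration, and target distance below are hypothetical values chosen only for illustration, and the slope is assumed to be S = B / (chirp duration); none of these numbers come from the disclosure.

```python
# Hypothetical chirp parameters, chosen only for illustration.
C = 299_792_458.0  # speed of light in m/s


def chirp_slope(bandwidth_hz: float, chirp_duration_s: float) -> float:
    """Slope S of a linear chirp in Hz per second (assumes S = B / duration)."""
    return bandwidth_hz / chirp_duration_s


def round_trip_delay(distance_m: float) -> float:
    """Round-trip delay tau = 2d / c."""
    return 2.0 * distance_m / C


def if_frequency(slope_hz_per_s: float, distance_m: float) -> float:
    """IF tone of a single static object: f_IF = S * tau = S * 2d / c."""
    return slope_hz_per_s * round_trip_delay(distance_m)


def range_from_if(slope_hz_per_s: float, f_if_hz: float) -> float:
    """Invert the relation to recover distance from a measured IF frequency."""
    return f_if_hz * C / (2.0 * slope_hz_per_s)


S = chirp_slope(bandwidth_hz=1.0e9, chirp_duration_s=50e-6)
f_if = if_frequency(S, distance_m=30.0)
print(f"IF frequency: {f_if / 1e6:.3f} MHz")
print(f"Recovered range: {range_from_if(S, f_if):.3f} m")
```

    Because the IF frequency is proportional to distance, a frequency analysis of the IF signal is enough to separate objects in range.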

    [0053] FIG. 4 illustrates an exemplary two-dimensional range matrix being populated by a radar frame, according to some embodiments. A radar frame (left) has a duration T_F and comprises a plurality of chirps, 1-N, each separated in time by T_c.

    [0054] Each row corresponds to one chirp. That is, for every chirp there is a row in the chirp index, i.e., N rows for N chirps. Each box in a particular row represents one ADC sample. Accordingly, if each chirp is sampled M times, there will be M columns in the matrix. The transformation of the data matrix into range and velocity matrices will now be described.

    [0055] FIG. 5A illustrates the creation of a chirp-range matrix from the previous data matrix, according to some embodiments. As mentioned above, each row corresponds to samples from a specific chirp. To determine range(s), a range-FFT is performed on each row. A fast Fourier transform (FFT) is an algorithm that computes the discrete Fourier transform (DFT) of a sequence, or its inverse (IDFT). Fourier analysis converts a signal from its original domain (often time or space) to a representation in the frequency domain and vice versa.

    [0056] The application of the range-FFT resolves objects in range. As one skilled in the art can appreciate, the x-axis is actually the frequency corresponding to the range FFT bins. But, since range is proportional to the IF frequency, this can be plotted directly as the range axis. Therefore, FIG. 5A is a matrix of chirps with each chirp having an array of frequency bins. Pursuant to the discussion above, these bins correspond directly to the range via the IF.

    [0057] FIG. 5B illustrates the creation of a velocity-range matrix from the previous chirp index-range matrix, according to some embodiments. A Doppler-FFT is performed along the columns of these range-FFT results shown in FIG. 5A. This resolves objects in the velocity dimension.
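
    The two FFT stages can be sketched with NumPy as follows. This is an illustrative sketch only: the matrix size and the target's range and Doppler bins are hypothetical, and the data matrix is synthesized as an ideal complex tone rather than taken from real ADC samples.

```python
import numpy as np

N_CHIRPS, N_SAMPLES = 16, 32      # N rows (chirps), M columns (ADC samples)
RANGE_BIN, DOPPLER_BIN = 5, 3     # hypothetical target location, in bins

# Synthetic data matrix: one row per chirp, one column per ADC sample.
n = np.arange(N_CHIRPS)[:, None]   # chirp index (slow time)
m = np.arange(N_SAMPLES)[None, :]  # sample index within a chirp (fast time)
data = np.exp(2j * np.pi * (RANGE_BIN * m / N_SAMPLES + DOPPLER_BIN * n / N_CHIRPS))

# Range-FFT on each row: resolves objects in range.
range_fft = np.fft.fft(data, axis=1)

# Doppler-FFT down each column of the range-FFT result: resolves velocity.
range_doppler = np.fft.fft(range_fft, axis=0)

# The strongest bin is indexed as (doppler bin, range bin).
peak = np.unravel_index(np.argmax(np.abs(range_doppler)), range_doppler.shape)
print(peak)
```

    No `fftshift` is applied here, so the Doppler axis starts at zero velocity instead of being centered on it.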

    [0058] As can be appreciated, FIG. 5B depicts two objects in the third range bin traveling at two different speeds. Similarly, there are three objects in the eighth range bin traveling at three different speeds. It should be noted that these are accurate for a fixed range-angle. Angle determination will now be discussed in greater detail.

    [0059] FIG. 6 depicts an exemplary antenna array used to calculate angle, according to some embodiments. Angle estimation requires at least 2 receiver (RX) antennas. The differential distance of the object to each of these antennas is exploited to estimate angle. So, the transmit (TX) antenna transmits a signal that is a chirp. It is reflected off the object with one ray going from the object to the first RX antenna and another ray going from the object to the second RX antenna.

    [0060] In the example depicted in FIG. 6, the ray to the second RX antenna has to travel a little longer, that is, an additional distance of Δd. This additional distance results in an additional phase ω equal to 2πΔd/λ, where λ is the wavelength. So, this is the phase difference between the signal at the first RX antenna and the signal at the second RX antenna.
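
    A short sketch shows how this phase difference can be turned into an angle. It assumes a far-field target and two RX antennas separated by a known spacing, so that the extra path is Δd = spacing · sin(θ). That geometric relation, the half-wavelength spacing, and the 20 degree test angle are assumptions for illustration and are not stated in the disclosure.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def phase_difference(delta_d_m: float, wavelength_m: float) -> float:
    """Phase difference omega = 2 * pi * delta_d / lambda."""
    return 2.0 * math.pi * delta_d_m / wavelength_m


def angle_of_arrival(phase_rad: float, wavelength_m: float, spacing_m: float) -> float:
    """Angle in radians, assuming delta_d = spacing * sin(theta) (far field)."""
    return math.asin(phase_rad * wavelength_m / (2.0 * math.pi * spacing_m))


wavelength = C / 77e9        # wavelength at 77 GHz
spacing = wavelength / 2.0   # hypothetical half-wavelength antenna spacing

theta_true = math.radians(20.0)
delta_d = spacing * math.sin(theta_true)   # extra path to the second antenna
omega = phase_difference(delta_d, wavelength)
theta_est = angle_of_arrival(omega, wavelength, spacing)
print(f"estimated angle: {math.degrees(theta_est):.2f} degrees")
```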

    [0061] FIG. 7 depicts an exemplary range-angle-velocity radar cube, according to some embodiments. As can be appreciated by one skilled in the art, the assembly of the matrices results in a 3D radar cube with axes of range-angle-velocity. Methods disclosed herein describe techniques for processing and interpreting radar data extracted from one (or more) 77-GHz DigiMMIC (FMCW) radar sensor mounted on a moving vehicle, although other frequencies and applications are not beyond the scope of the present disclosure.

    [0062] The radar cube data is in the form of a 3-dimensional, complex-valued array with dimensions corresponding to azimuth (angle), radial velocity (doppler), and radial distance (range). Magnitude in each angle-doppler-range bin is taken to describe how much energy the radar sensor sees coming from that point in space (angle and range) for that radial velocity. For demonstrative purposes, linear antenna arrays oriented parallel to the ground are assumed. Pillars with peak energies can be selected for peak validation, which will be discussed later in the disclosure.

    [0063] FIGS. 8A and 8B depict a scene before and after image processing, according to some embodiments. An object of the present disclosure is foreground extraction/background subtraction. FIG. 8A is the raw image of a car in a radar cube. From the relatively noisy image depicted in FIG. 8A, distinct peak energy regions can be extracted. The high-level idea is to isolate reflections off of objects in an automotive scene by combining background noise estimation with peak region masking. The former finds regions in the radar signal that stand out from the background and the latter validates and isolates them for further processing.

    [0064] The result is depicted in FIG. 8B. FIG. 8B is a processed image of a car in a radar cube with detected peaks (dots) and peak regions (non-black pixels). This process will now be discussed in greater detail.

    [0065] Foreground extraction can be performed by various methods for suppressing noise and artifacts and extracting salient regions of a radar cube. Most, if not all, involve either estimating a background model and removing it, or creating a mask that emphasizes desired bins and suppresses others through element-wise multiplication.

    [0066] Constant False Alarm Rate (CFAR) thresholding is probably the most well-known and well-studied technique and involves estimating a background model through local averaging. CFAR detection refers to a common form of adaptive algorithm used in radar systems to detect target returns against a background of noise, clutter and interference.

    [0067] The primary idea is that noise statistics may be non-uniform across the array. CA-CFAR (cell averaging) computes a moving mean while excluding a region at the center of the averaging window (guard cells) to avoid including a desired object in the background estimate. OS-CFAR (order-statistic) does the same computation but with a percentile operation instead of a mean. Given the background model (estimate of background value in each bin) $b_{ijk}$, the foreground can be estimated as:

    $$\forall\, i,j,k:\quad x_{ijk} \leftarrow x_{ijk} \odot \left(x_{ijk} \ge \alpha \cdot b_{ijk}\right) \tag{1}$$

    [0068] for some factor α that controls the amount of background suppression.
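
    The difference between CA-CFAR and OS-CFAR, and the masking step of equation (1), can be sketched in one dimension as follows. The function names, the number of training and guard cells, the percentile, and the value of α are all hypothetical choices for illustration and are not values taken from the disclosure.

```python
import numpy as np


def cfar_background(x, train=4, guard=1, percentile=None):
    """Per-bin background estimate from training cells on both sides of each bin.

    Guard cells next to the bin under test are skipped. With percentile=None
    the estimate is a mean (CA-CFAR); otherwise it is a percentile (OS-CFAR).
    """
    x = np.asarray(x, dtype=float)
    b = np.empty_like(x)
    for i in range(x.size):
        left = x[max(0, i - guard - train): max(0, i - guard)]
        right = x[i + guard + 1: i + guard + 1 + train]
        cells = np.concatenate([left, right])
        b[i] = cells.mean() if percentile is None else np.percentile(cells, percentile)
    return b


def suppress_background(x, b, alpha):
    """Equation (1): keep x where x >= alpha * b, set it to zero elsewhere."""
    x = np.asarray(x, dtype=float)
    return x * (x >= alpha * b)


signal = np.ones(32)
signal[16] = 20.0                                # one strong reflection

b_ca = cfar_background(signal)                   # cell-averaging estimate
b_os = cfar_background(signal, percentile=50)    # order-statistic (median) estimate
foreground = suppress_background(signal, b_os, alpha=3.0)
print(np.flatnonzero(foreground))
```

    In this toy signal the mean-based estimate is raised in the bins next to the strong reflection, while the median-based estimate stays at the noise level.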

    [0069] One object of the present disclosure is to efficiently isolate objects in an automotive scene from noise. The motivation is that automotive radar signals consist of reflections off of objects (e.g., cars, cyclists, buildings) and that traditional approaches may not simultaneously achieve accuracy, efficiency, and usability in a machine learning pipeline.

    [0070] Previous solutions have attempted 2D-only or full 3D background estimation. Other solutions do not include peak validation or retention of peak regions.

    [0071] Solutions to similar problems in related fields and other disciplines are untenable in their application to 3D radar data cubes. They also create barriers to computational efficiency.

    [0072] FIG. 9 is an exemplary method for detecting objects in automotive-grade radar, according to some embodiments. The method comprises medmed background modeling. Medmed refers to median-median, although other averaging techniques are not beyond the scope of the present disclosure. A median is a value in an ordered set of values below and above which there is an equal number of values or which is the arithmetic mean of the two middle values if there is no one middle number.

    [0073] The OS-CFAR described above is expensive and a bit inflexible because it operates on the full radar cube. This implementation can be improved without significantly affecting the result as follows. For demonstrative purposes, it is assumed that the background model is only a function of range-angle and not doppler. By doing so, a full 3D moving median can be approximated with a median over doppler followed by a 2D moving median over range-angle.

    [0074] This allows much more flexibility in terms of choosing (2D) window size and overlap. The moving median on a sparse grid can be calculated and quickly upsampled to the original range-angle grid with widely used and optimized image resampling techniques. The two median operations give this approach its name.
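
    The medmed idea can be sketched with NumPy as a median over doppler, a 2D moving median evaluated on a sparse grid, and a bilinear up-sampling step. This is a sketch under stated assumptions: the function names, the window size, the stride, and the (angle, doppler, range) axis order are illustrative choices, and bilinear interpolation is implemented here as two passes of one-dimensional linear interpolation.

```python
import numpy as np


def doppler_median(cube_mag, doppler_axis=1):
    """First median: collapse the doppler axis of the magnitude cube."""
    return np.median(cube_mag, axis=doppler_axis)


def sparse_moving_median(grid, window=5, stride=4):
    """Second median: 2D moving median evaluated only every `stride` bins."""
    half = window // 2
    padded = np.pad(grid, half, mode="edge")
    rows = np.arange(0, grid.shape[0], stride)
    cols = np.arange(0, grid.shape[1], stride)
    out = np.empty((rows.size, cols.size))
    for a, r in enumerate(rows):
        for b, c in enumerate(cols):
            # Window of size `window` centered on original bin (r, c).
            out[a, b] = np.median(padded[r:r + window, c:c + window])
    return rows, cols, out


def bilinear_upsample(rows, cols, sparse, shape):
    """Bilinear interpolation back to the dense grid, one axis at a time."""
    full_rows = np.arange(shape[0])
    full_cols = np.arange(shape[1])
    tmp = np.stack(
        [np.interp(full_rows, rows, sparse[:, j]) for j in range(sparse.shape[1])],
        axis=1,
    )
    return np.stack(
        [np.interp(full_cols, cols, tmp[i, :]) for i in range(tmp.shape[0])],
        axis=0,
    )


def medmed_background(cube_mag, doppler_axis=1, window=5, stride=4):
    grid = doppler_median(cube_mag, doppler_axis)
    rows, cols, sparse = sparse_moving_median(grid, window, stride)
    return bilinear_upsample(rows, cols, sparse, grid.shape)


cube = np.ones((16, 8, 16))     # (angle, doppler, range) magnitudes
cube[5, 2, 7] = 50.0            # one strong reflection
background = medmed_background(cube)
print(background.shape)
```

    With a stride of 4 in each direction, the moving median is evaluated on one sixteenth of the range-angle bins, which is where the saving over a dense moving median comes from.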

    [0075] Turning to FIG. 9, the radar cube comprises a large amount of data which includes both background and energy clusters. These energy clusters or clouds may be regions of interest which are confirmed during peak detection. However, before that can occur, some of the capacious background needs to be removed (subtracted) in order to have a tractable data cube which can be more efficiently processed.

    [0076] Typically, in the art, aggressive thresholding is applied to a radar cube, which compresses the data by effectively throwing away a large amount of information, i.e., the data under the threshold. Image thresholding is a simple, yet effective, way of partitioning an image into a foreground and background. This image analysis technique is a type of image segmentation that isolates objects by converting grayscale images into binary images. Image thresholding is most effective in images with high levels of contrast.

    [0077] By thresholding less aggressively, a larger area of surrounding regions could be kept. This is useful in intelligent perception analysis. Constructive perception or intelligent perception is the theory of perception in which the perceiver uses sensory information and other sources of information to construct a cognitive understanding of a stimulus. In contrast to this top-down approach, there is the bottom-up approach of direct perception. Perception is more of a hypothesis, and the evidence to support this is that “Perception allows behavior to be generally appropriate to non-sensed object characteristics,” meaning that we react to obvious things that, for example, are like doors even though we only see a “long, narrow rectangle as the door is ajar.”

    [0078] In the present embodiment, intelligent perception not only allows a radar system to identify the position, speed and direction of an object (for example, a car), it can also dispositively associate that object with a previously identified car. That is, the radar system can more easily temporally correlate objects between scenes, thereby mitigating surprises.

    [0079] In one or more embodiments, a fourth dimension includes tangential velocity. In state-of-the-art radar systems, doppler can only resolve radial velocity, that is, motion toward or away from the radar antenna. In some embodiments, tangential velocity can resolve velocities substantially normal to the plane of the observer. In layman's terms, this includes left-right or up-down movement and speed (actually velocity, since it is a vector). A 4D radar cube can be referred to as a tesseract.

    [0080] Turning back to FIG. 9, the absolute value operation is performed on the radar cube. As is known, absolute value returns the magnitude of a complex number, which is simply the distance from the origin in the complex plane. In one or more embodiments, a median is performed to achieve some thresholding standard. A moving median is desirable because it is a more robust statistic than the mean, which can be thrown off by high-energy objects. That is, the median can ignore high energy objects in the scene. This is important in the background subtraction step.
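
    A small numerical sketch illustrates both points: the absolute value of a complex bin, and the robustness of the median compared with the mean. The numbers are made up for illustration; the array stands in for the magnitudes along one doppler pillar that contains a single strong reflection.

```python
import numpy as np

# Magnitude of a complex bin is its distance from the origin.
z = 3.0 + 4.0j
magnitude = np.abs(z)

# Hypothetical magnitudes along one doppler pillar: noise plus one strong object.
doppler_pillar = np.array([1.0, 1.2, 0.9, 1.1, 40.0, 1.0, 0.8])

mean_estimate = doppler_pillar.mean()          # pulled upward by the strong object
median_estimate = np.median(doppler_pillar)    # stays at the noise level

print(magnitude, mean_estimate, median_estimate)
```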

    [0081] However, medians are computationally intensive since they require sorting. If the background is assumed to be independent of the radial velocity direction, one dimension of the radar cube can be truncated as follows. A median is performed along the radial velocity direction. This median can then be used to reduce the 3D radar cube to just range-angle by substituting the velocity median in the third dimension. The importance of this will be described in more detail in the discussion of FIG. 10.

    [0082] Once background estimation is achieved, masking can be performed. Masking removes any background noise while allowing peak regions of interest to remain. The result is a thresholded cube which is computationally far more tenable. Peak detection and filtering can now be performed, which will also be discussed later in the disclosure.
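
    The masking step can be sketched as a comparison of the full magnitude cube against a range-angle background that is reused for every doppler bin. The function name, the (angle, doppler, range) axis order, and the factor alpha are assumptions for illustration.

```python
import numpy as np


def threshold_cube(cube_mag, background_2d, alpha):
    """Zero every bin that falls below alpha times the range-angle background."""
    # Insert a doppler axis so the 2D background broadcasts over all velocities.
    b = background_2d[:, np.newaxis, :]
    return cube_mag * (cube_mag >= alpha * b)


cube = np.ones((4, 6, 5))          # (angle, doppler, range) magnitudes
cube[1, 2, 3] = 10.0               # one bin that stands out
background = np.ones((4, 5))       # range-angle background estimate

masked = threshold_cube(cube, background, alpha=3.0)
print(np.argwhere(masked > 0))
```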

    [0083] An aspect of the radar cube processing includes using short-term track hypotheses (referred to as “tracklets”) to model detections. Tracklets are clustered into groups to form longer-term tracking hypotheses (referred to as “groups”). The tracklets and groups are subject to birth-death rules and different filters, as discussed below.

    [0084] The peak list can then be processed with different state of the art filtering and detection methods. These methods do not take into account the small variation of the reflection points from measurement to measurement. For example, a reflection point on a curved surface of an object like a car hood can change its position depending on orientation and position of the car. This movement of the reflection point then adds an artificial movement not corresponding to the real movement of the object. It becomes difficult to associate all the reflection points to the same object due to this offset. Further, as the object moves, new reflection points on the object show up and others vanish. To mitigate this problem, the following novel techniques are used.

    [0085] As a first step, a short-term hypothesis is used to model the dynamic/short-term movement of each detection from frame to frame. The tracked/filtered detections are the previously referenced tracklets. As a next step, several tracklets (minimum one, but preferably more than one) are clustered. This cluster is then tracked by a long-term hypothesis to model the underlying movement of the object. The individual tracklets and groups are preferably modeled with alpha-beta Kalman filter rules and with birth and death rules. Kalman filters use a series of measurements observed over time to produce estimates of unknown variables that tend to be more accurate than those based on a single measurement alone. The accuracy is attributable to a joint probability distribution over the variables for each timeframe. An example of a birth rule is that a reflection has to be present in a number of frames, e.g., 3, to be tracked as a tracklet. Similarly, a death rule embodiment is that the corresponding detection of the tracklet has to be missing in 3 consecutive frames for the tracklet to die. The technical advantage of this method is that new and/or vanishing reflection points are modeled accurately, yet small fluctuations are compensated. Similar techniques are used for the objects (grouped or clustered tracklets) that are present in the scene, entering the scene, or leaving the scene.
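
    The birth and death rules can be sketched together with a simple alpha-beta update. This is a minimal sketch, not the disclosed implementation: the `Tracklet` class, the one-dimensional position state, and the gain values are hypothetical, and the alpha-beta filter is used here as a fixed-gain stand-in for a full Kalman filter. The thresholds of 3 frames follow the examples given above.

```python
from dataclasses import dataclass

BIRTH_HITS = 3      # frames a reflection must be seen before it is tracked
DEATH_MISSES = 3    # consecutive missed frames before the tracklet dies


@dataclass
class Tracklet:
    position: float
    velocity: float = 0.0
    hits: int = 1          # the detection that created the tracklet counts as one
    misses: int = 0
    alive: bool = True

    @property
    def confirmed(self) -> bool:
        """Birth rule: confirmed once seen in BIRTH_HITS frames."""
        return self.alive and self.hits >= BIRTH_HITS

    def update(self, measurement, dt=1.0, alpha=0.85, beta=0.005):
        """Advance one frame; `measurement` is None when the detection is missing."""
        if not self.alive:
            return
        predicted = self.position + self.velocity * dt
        if measurement is None:
            self.position = predicted          # coast on the prediction
            self.misses += 1
            if self.misses >= DEATH_MISSES:    # death rule
                self.alive = False
            return
        residual = measurement - predicted
        self.position = predicted + alpha * residual
        self.velocity += (beta / dt) * residual
        self.hits += 1
        self.misses = 0                        # misses must be consecutive


track = Tracklet(position=10.0)
for z in (10.5, 11.0, None, None, None):
    track.update(z)
    print(track.confirmed, track.alive)
```

    The same counters could be applied to a group of tracklets, with a more constrained motion model replacing the free-moving update.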

    [0086] It is advantageous to use filters with no constraints on the movements for the short-term hypothesis filtering, e.g., using a hovercraft object motion filter for the short-term hypothesis filter. For the long-term hypothesis filter it is preferred to use filters with a more confined motion model, e.g., a vehicle motion model.

    [0087] FIG. 10 depicts an exemplary median background estimation on a sparse range angle grid, according to some embodiments. Pursuant to previous discussion, background estimation can be performed on range-angle with the use of velocity (doppler) median. Median background estimation on sparse range-angle grid comprises a moving median over a 2D grid.

    [0088] In some embodiments, median background estimation on sparse range-angle comprises sorting a median over eight points surrounding an origin of a square in question. In other embodiments, any number of two-dimensional points can be used. In yet other embodiments, other averaging metrics can be utilized and remain within the scope of the present disclosure.
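
    One reading of the eight-point embodiment is a median over the eight bins that surround each bin of the range-angle grid, with the center bin itself left out. The sketch below follows that reading; the function name and the edge-replication padding at the grid border are assumptions for illustration.

```python
import numpy as np


def eight_neighbor_median(grid):
    """Median of the eight bins surrounding each bin (center excluded)."""
    padded = np.pad(grid, 1, mode="edge")
    h, w = grid.shape
    neighbors = [
        padded[1 + di: 1 + di + h, 1 + dj: 1 + dj + w]
        for di in (-1, 0, 1)
        for dj in (-1, 0, 1)
        if (di, dj) != (0, 0)
    ]
    return np.median(np.stack(neighbors, axis=0), axis=0)


grid = np.ones((5, 5))      # range-angle grid after the doppler median
grid[2, 2] = 30.0           # one strong bin
background = eight_neighbor_median(grid)
print(background)
```

    Leaving out the center bin means a strong bin never contributes to its own background estimate, and a single strong neighbor among eight does not move the median.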

    [0089] Peak detection, thresholding and validation are disclosed as follows. Thresholding does most of the work of extracting foreground regions in the radar cube, but it is not perfect. Some spurious peaks rise above the threshold. These can be removed by assuming that all regions of interest are well-described as 3-dimensional blob shapes. This consists of finding all local maxima via:

    $$p_{ijk} = \begin{cases} 1 & \text{if } x_{ijk} = \max\left(x_{i-1:i+1,\; j-1:j+1,\; k-1:k+1}\right) \text{ and } x_{ijk} > 0 \\ 0 & \text{otherwise} \end{cases} \tag{5}$$

    [0090] and scoring these peaks as the sum of magnitudes within a 5×5×5 neighborhood centered on each bin and then masking out peak regions whose scores are below a minimum value. This can reduce the number of local maxima from 100s-1,000s to 30-100.
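
    Peak detection, scoring, and validation can be sketched with NumPy as follows. The function names and the minimum score are hypothetical, zero padding is assumed at the cube border, and the retained peak region is assumed to have the same 5×5×5 extent as the scoring neighborhood.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def local_maxima(x):
    """Equation (5): bins equal to the max of their 3x3x3 neighborhood and > 0."""
    padded = np.pad(x, 1, mode="constant", constant_values=0.0)
    neighborhood_max = sliding_window_view(padded, (3, 3, 3)).max(axis=(-3, -2, -1))
    return (x == neighborhood_max) & (x > 0)


def peak_scores(x):
    """Sum of magnitudes in the 5x5x5 neighborhood centered on each bin."""
    padded = np.pad(x, 2, mode="constant", constant_values=0.0)
    return sliding_window_view(padded, (5, 5, 5)).sum(axis=(-3, -2, -1))


def validate_peaks(x, min_score):
    """Keep only local maxima whose neighborhood score reaches min_score."""
    return local_maxima(x) & (peak_scores(x) >= min_score)


def peak_region_mask(x, min_score):
    """True inside the 5x5x5 region around every validated peak."""
    valid = validate_peaks(x, min_score)
    padded = np.pad(valid, 2, mode="constant", constant_values=False)
    return sliding_window_view(padded, (5, 5, 5)).any(axis=(-3, -2, -1))


x = np.zeros((9, 9, 9))
x[4, 4, 4], x[4, 4, 5], x[3, 4, 4] = 10.0, 4.0, 3.0   # a small blob
x[0, 0, 8] = 2.0                                      # an isolated spurious peak

valid = validate_peaks(x, min_score=5.0)
cleaned = x * peak_region_mask(x, min_score=5.0)
print(np.argwhere(valid))
```

    The isolated bin is a local maximum but has a low neighborhood score, so it is dropped, while the blob around the strong peak is kept as a region.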

    [0091] The utility of the aforementioned is robustness in that spurious peaks are eliminated by the validation step. Additionally, usability is augmented by peak regions (rather than just the peaks) which can be used by a machine learning engine later in the pipeline.

    [0092] While one or more of the previous embodiments entail performing a median operation over the velocity degree of freedom, the operator can be performed on any parameter in order to simplify (compress) data for further analysis. For example, medians or other averages (moving or otherwise) can be used on range, angle, and/or tangential velocity, in order to attenuate the data matrices.

    [0093] FIG. 11 is a schematic of an exemplary radar system, according to some embodiments. The radar system comprises a transmitter, duplexer, low noise amplifier, mixer, local oscillator, matched filter, IF filter, 2nd detector, video amplifier and display.

    [0094] Having thus described several aspects and embodiments of the technology of this application, it is to be appreciated that various alterations, modifications, and improvements will readily occur to those of ordinary skill in the art. Such alterations, modifications, and improvements are intended to be within the spirit and scope of the technology described in the application. For example, those of ordinary skill in the art will readily envision a variety of other means and/or structures for performing the function and/or obtaining the results and/or one or more of the advantages described herein, and each of such variations and/or modifications is deemed to be within the scope of the embodiments described herein.

    [0095] The above-described embodiments may be implemented in any of numerous ways. One or more aspects and embodiments of the present application involving the performance of processes or methods may utilize program instructions executable by a device (e.g., a computer, a processor, or other device) to perform, or control performance of, the processes or methods.

    [0096] In this respect, various inventive concepts may be embodied as a computer readable storage medium (or multiple computer readable storage media) (e.g., a computer memory, one or more compact discs, optical discs, magnetic tapes, flash memories, circuit configurations in Field Programmable Gate Arrays or other semiconductor devices, or other tangible computer storage medium) encoded with one or more programs that, when executed on one or more computers or other processors, perform methods that implement one or more of the various embodiments described above.

    [0097] The computer readable medium or media may be transportable, such that the program or programs stored thereon may be loaded onto one or more different computers or other processors to implement various ones of the aspects described above. In some embodiments, computer readable media may be non-transitory media.