Method and system for detecting objects in a vehicle blind spot
11257375 · 2022-02-22
Assignee
Inventors
CPC classification
B60R11/04
PERFORMING OPERATIONS; TRANSPORTING
G08G1/167
PHYSICS
International classification
B60R11/00
PERFORMING OPERATIONS; TRANSPORTING
B60Q9/00
PERFORMING OPERATIONS; TRANSPORTING
Abstract
A method for detecting objects in a vehicle blind spot comprises the following steps: generating a region of interest onto an image taken from one camera placed on one side of the vehicle; generating a top view of the region of interest; detecting an object in the region of interest; determining if the object in the region of interest is a target object; and triggering a signal if it is determined that there is a target object in the region of interest.
Claims
1. A method for detecting objects in a vehicle blind spot, wherein the method comprises: generating a region of interest onto a first image taken from one camera placed on one side of the vehicle; generating a top view of the region of interest; dividing the top view of the region of interest into a plurality of rows; determining an average value of each row of the plurality of rows; generating a first 1D array containing the average value of each row; detecting an object based on the 1D array; comparing the average values of the first 1D array with values of a second 1D array containing a second set of average values; determining that the object in the top view of the region of interest is a moving object based on the comparing; and triggering a signal indicating that the moving object has been determined in the region of interest.
2. The method according to claim 1, wherein the generation of the top view of the region of interest is carried out by inverse perspective mapping homography computation.
3. The method according to claim 1, wherein the determination that the object in the region of interest is moving is carried out by track horizontal mean computation.
4. The method according to claim 1, wherein the method further comprises filtering out objects with inconsistent motion.
5. The method according to claim 4, wherein the filtering out is carried out by horizontal mean computation.
6. The method according to claim 2, wherein the mapping homography computation comprises mapping a 3D grid onto an image plane of the ground, so that each point in the grid is assigned a corresponding intensity value from the image plane.
7. The method according to claim 3, wherein the track horizontal mean computation comprises a method of measuring a level of similarity between signals from two different frames.
8. The method according to claim 7, wherein the method of measuring the level of similarity comprises a normalized cross correlation.
9. The method according to claim 4, wherein the filtering out is carried out by ego-motion computation.
10. The method according to claim 1, wherein in determining that a target object in the region of interest is present, only objects with movements similar to a vehicle are considered.
11. A system for detecting objects in a vehicle blind spot using the method according to claim 1, wherein the system comprises a blind spot detection module including only one camera on each side of the vehicle for detecting target objects in a zone adjacent to the vehicle.
12. The system according to claim 11, wherein the system also comprises a side collision warning module for detecting target objects in a rear zone.
13. The system according to claim 11, wherein said camera is a fixed camera during operation.
14. The method according to claim 9, wherein the ego-motion computation comprises determining the vehicle speed with respect to the road.
15. The method of claim 1, wherein the second set of average values of the second 1D array is generated based on average values derived from a second plurality of rows of the region of interest of a second image taken by the camera subsequent to the first image.
16. The method of claim 15, wherein the determining that the object is the moving object is made when a first portion of the second 1D array corresponding to the detected object includes values similar to values of the first 1D array corresponding to the detected object.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) For a better understanding of the above explanation and for the sole purpose of providing an example, some non-limiting drawings are included that schematically depict a practical embodiment.
DESCRIPTION OF A PREFERRED EMBODIMENT
(8) According to a preferred embodiment, the system comprises the following modules, shown in
(9) The Blind Spot Detection (BSD) module uses the following inputs and provides the following output.
(10) Inputs: Camera images, such as grey images (e.g. a 1280×800 grey-level image with 8-bit depth), but other types, such as color images or LIDAR images, are possible; Camera calibration (intrinsic and/or extrinsic); Vehicle information (optional), such as speed (m/s) and yaw rate (rad/s); Time stamp, e.g. in milliseconds.
(11) The output is a determination of whether the adjacent lane is occupied.
(13) In particular these steps are: take an image of the rear-side (the rear zone and/or the adjacent zone) of the vehicle; generate a region of interest; compute the IPM homography; compute the horizontal mean; compute the ego-motion (an optional step); and track the horizontal mean (also called the “tracker”).
(14) Now these steps are described individually: Compute IPM Homography:
(15) After the steps of taking an image of the rear-side of the vehicle from a single camera provided on a side of the vehicle and generating a region of interest of the taken image, the system creates a top view image of the region of interest.
(16) For example, a grey-scale 2D image (but other types, such as color images or LIDAR images, are possible) is mapped to a ground plane by means of IPM. In IPM, the angle of view under which a scene is acquired and the distance of the objects from the camera (namely the perspective effect) contribute a different information content to each pixel of an image. The perspective effect must therefore be taken into account when processing images, in order to weigh each pixel according to its information content.
(17) IPM allows removal of the perspective effect from the acquired image, remapping it into a new 2-dimensional domain in which the information content is homogeneously distributed among all pixels, thus allowing the efficient implementation of the following processing steps with a Single Instruction, Multiple Data (SIMD) paradigm. Obviously, the application of the IPM transform requires the knowledge of the specific acquisition conditions (camera position, orientation, optics, etc.) and some assumption on the scene represented in the image (here defined as a-priori knowledge, for example assuming the road in front of the vision system is planar). Thus, the IPM transform can be of use in structured environments, where, for example, the camera is mounted in a fixed position or in situations where the calibration of the system and the surrounding environment can be sensed via another kind of sensor.
(18) The IPM is not based on lane detection. The method obtains the four end-points of a pair of lanes in the perspective image without relying on any lane detection algorithm.
(19) Then, a 3D grid is mapped onto an image plane of the ground (
(20) Bear in mind that the inverse perspective mapping (IPM) scheme is another method for obtaining a bird's eye view of the scene from a perspective image. The inverse perspective mapping technique can also be used to remove the perspective distortion caused by the perspective projection of a 3D scene into a 2D image (
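As a rough, non-limiting sketch of the grid-to-pixel assignment described above (not the patented implementation), the IPM can be expressed as projecting each ground-grid cell through a homography and copying the intensity of the image pixel it lands on. The homography values, grid dimensions and scale below are hypothetical:

```python
def apply_homography(H, x, y):
    """Project a ground-plane point (x, y) through the 3x3 homography H
    to image coordinates (u, v)."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return u, v

def ipm_top_view(image, H, grid_w, grid_h, metres_per_cell=1.0):
    """Build a top-view (bird's eye) image: each point of the ground grid
    is assigned the intensity of the image pixel it projects onto."""
    img_h, img_w = len(image), len(image[0])
    top = [[0] * grid_w for _ in range(grid_h)]
    for gy in range(grid_h):
        for gx in range(grid_w):
            u, v = apply_homography(H, gx * metres_per_cell, gy * metres_per_cell)
            ui, vi = int(round(u)), int(round(v))
            if 0 <= vi < img_h and 0 <= ui < img_w:
                top[gy][gx] = image[vi][ui]
    return top
```

In practice H would be derived from the camera calibration (position, orientation, optics) under the planar-road assumption stated above; nearest-neighbour sampling is used here only for brevity.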
(22) Compute Horizontal Mean:
(23) In this step, the average value of the rows of the top view (IPM) image is determined and a 1D array is generated. The computed horizontal mean converts the top view image into 1-D vector information, for example by summing the intensity values of the same row in the horizontal direction for each position of the 1-D vector.
(24) The average value of each row on the IPM image is computed. This produces a 1D array with the same size as the IPM image height. All values, as preferred, are normalized with respect to the maximum mean value. In addition, spatio-temporal smoothing is applied in order to remove noise caused by small spikes. A spatial smoothing may be computed by applying a 3×1 mean mask on each element of the array, i.e. every element may be averaged with its direct neighbors. Temporal smoothing may be computed by means of weighted average of each element of the array with its previous value. Current values may be given higher weights.
(26) One can observe that the horizontal mean values are almost uniform when no vehicle is present, which is not the case when a vehicle is present.
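The horizontal-mean step described above (row averaging, normalization to the maximum, a 3×1 spatial mean mask, and weighted temporal smoothing) can be sketched as follows; the temporal weight of 0.7 is an assumed value, chosen only to reflect that current values are given higher weights:

```python
def horizontal_mean(top_view):
    """Average each row of the IPM image into a 1D profile, normalize by
    the maximum mean value, then apply a 3x1 spatial mean mask so that
    every element is averaged with its direct neighbours."""
    means = [sum(row) / len(row) for row in top_view]
    peak = max(means)
    if peak > 0:
        means = [m / peak for m in means]
    smoothed = []
    for i in range(len(means)):
        lo, hi = max(0, i - 1), min(len(means), i + 2)
        smoothed.append(sum(means[lo:hi]) / (hi - lo))
    return smoothed

def temporal_smooth(current, previous, weight=0.7):
    """Weighted average of each element with its previous value; the
    current value receives the higher weight."""
    return [weight * c + (1 - weight) * p for c, p in zip(current, previous)]
```

The resulting 1D array has the same size as the IPM image height, as stated above.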
(27) Ego-Motion Computation:
(28) This step is optional and is used for filtering out objects with inconsistent motion, such as shadows from bridges or trees, false alarms from an adjacent fence, or vehicles moving in the opposite direction. The ego-motion computation is used as a double check for reducing false negatives in combination with said tracker.
(29) The ego-motion is obtained by computing vehicle motion, and then mapping this metric in pixels onto the IPM image.
(30) For example, if the vehicle is moving at 30 m/s, it means that at 30 fps it will move 1 m/frame.
(31) Then, knowing the number of pixels per meter in the IPM image, allows one to obtain the motion in pixels on the IPM image. In particular, the ego-motion takes into account: (i) speed of the car through the Controller Area Network (CAN), (ii) time between frames, and (iii) size of the IPM (i.e. relationship between pixel and distance of the exterior world of the camera). Therefore, the ego-motion returns distance expressed in pixels. This information will later on be useful to filter out elements inside the 1D array that have for instance opposite motion. It is also useful to cluster them together based on their motion.
(32) In the ego-motion computation, the relative speed between the road and the vehicle is first calculated.
(33) From this information, the number of pixels moved within the area of interest is calculated, and blocks of a predefined size that have the same movement are grouped together.
(34) Next, the blocks that have an incoherent or opposite movement are deleted (e.g. cars travelling in the opposite direction), and finally the vehicles that are approaching are identified.
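The conversion from vehicle speed to IPM pixels, and the removal of incoherent blocks, can be sketched as below; the pixels-per-metre value and the plausibility bound are hypothetical:

```python
def ego_motion_pixels(speed_mps, dt_s, pixels_per_metre):
    """Vehicle displacement between two frames, expressed in IPM pixels:
    speed (from the CAN), time between frames, and the IPM pixel/metre
    relationship, as listed in the text."""
    return speed_mps * dt_s * pixels_per_metre

def filter_incoherent(block_displacements, max_plausible):
    """Delete blocks with opposite (negative) or implausibly large motion,
    keeping the rest as approaching-object candidates."""
    return [d for d in block_displacements if 0 <= d <= max_plausible]
```

Using the example from the text, 30 m/s at 30 fps gives 1 m/frame; at an assumed 20 pixels per metre this is 20 pixels of ego-motion per frame.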
(35) Track Horizontal Mean (Tracker):
(36) This step determines the motion of objects on the road, mainly on the IPM image. According to one preferred example, after the computation of the mean value of each row and the possible ego-motion, a tracking is computed on the 1D array. The core of the tracking is the Normalized Cross Correlation (NCC).
(37) The Tracker is used to determine any change in the relative position of the detected object between two frames. A change in relative position means that the detected object has undergone a displacement. The Tracker therefore checks whether the detected object has moved with respect to the previous frame or whether it is static. In particular, the objective of the Tracker is to determine whether the objects captured by the camera have a displacement, and whether that displacement is realistic.
(38) For this, a method that compares two signals is required: in particular, a value of the current frame is compared with a value of the previous frame. In a preferred example the comparison function is the Normalized Cross Correlation (NCC), but another function could be used, such as a Sum of Squared Differences (SSD) or a Sum of Absolute Differences (SAD).
(39) NCC outputs a measure of similarity between two series as a function of the displacement of one relative to the other. In a preferred example, NCC returns a value between −1 and 1, where a value of 1 means a high level of similarity between the signals. NCC is used to match two one-dimensional signals: the current frame with the previous frame. This match allows identification of the position of an object within the IPM image, so the tracker may determine the displacement of said object as well as its motion direction.
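The NCC measure described above can be sketched for two equal-length 1D signals as follows:

```python
def ncc(a, b):
    """Normalized cross-correlation of two equal-length signals.
    Returns a value in [-1, 1]; 1 means a high level of similarity."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den_a = sum((x - mean_a) ** 2 for x in a) ** 0.5
    den_b = sum((y - mean_b) ** 2 for y in b) ** 0.5
    if den_a == 0 or den_b == 0:
        return 0.0  # a flat signal carries no correlatable structure
    return num / (den_a * den_b)
```

For example, two signals that differ only by a positive scale factor correlate at 1, while a signal and its reversal correlate at −1.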
(40) Ego-motion, which is optional in the tracker, can be used as an additional filter of objects based on the vehicle's (absolute) speed. In addition, a persistence filter is applied to remove sporadic or inconsistent motion.
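As an illustrative, non-limiting sketch of the persistence filter mentioned above, a track might be retained only if the object was detected in enough of the recent frames; the threshold of 3 is an assumed value:

```python
def persistence_filter(detections, min_hits=3):
    """Keep a track only if the object was detected in at least min_hits
    of the recent frames; sporadic or inconsistent detections are removed."""
    return sum(1 for d in detections if d) >= min_hits
```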
(41) In particular, the tracker includes three iterations (loops), wherein in a preferred example the 3rd loop is inside the 2nd loop and the 2nd loop inside the 1st loop:
(42) The 1st loop checks a plurality of positions (from two positions to all positions) of the 1D array (vertical vector). In a preferred example, the 1st loop checks all positions of the 1D array (vertical vector) from the first position to the last position (the last position of the vector corresponds to the height of the IPM image, in particular, to the height of the 2D image of the top view image). In a further example, the 1st loop checks from position 1 to position 250, the position 250 being the last position of the 1D array (vertical vector).
(43) In the 2nd loop the different possible displacements of the previous frame (frame −1) are iterated. Thus, it is possible to check (compare) the value of a determined position of the current frame within a preselected range for the previous frame (frame −1). Said preselected range is calculated from a displacement parameter. Therefore, it is possible to compare between a pattern (e.g. a value of determined position of the current frame) and a range (e.g. a plurality of values of the previous frame). The 2nd loop is iterated in the previous frame (frame −1) until a maximum value is reached (i.e. the maximum displacement taken from the position), and therefore, the 2nd loop does not take into account values out of the displacement range in order not to take into account unnecessary search positions.
(44) For example, we are in position 100 of our 1D array (vertical vector). A displacement parameter is selected (e.g. 30). Then, we compare position 100 of said 1D array of the current frame with positions 70 (100−30) to 130 (100+30) of the previous frame (frame −1). In a preferred example, we first compare position 100 of the current frame with position 70 (100+(−30)) of the previous frame, returning a similarity value between them. Then, we continue with the iteration increasing the displacement, that is: position 71 (100+(−29)) of the previous frame is compared with position 100 of the current frame, returning a similarity value between them. In one example, the tracker only takes into account the highest similarity value. Thus, if the similarity value between position 71 of previous frame and position 100 of current frame is higher than the similarity value between position 70 of the previous frame and position 100 of current frame, then the similarity value between position 70 of the previous frame and position 100 of the current frame is disregarded. Then, we continue with the iteration increasing the displacement, that is: position 72 (100+(−28)) of the previous frame is compared with position 100 of the current frame. Said similarity value is compared with the highest similarity value obtained so far. As explained, the tracker only takes into account the highest similarity value, and so the other similarity values are disregarded. Thus, the tracker allows a rapid skipping of the positions that cannot provide a better degree of match than the current best-matching one. We continue with the iteration until the tracker reaches position 130 (100+30). Therefore, position 100 of the current frame has been compared to the range from position 70 to position 130 of the previous frame (frame −1).
(45) In some examples, positions below 70, and positions above 130, are not taken into consideration. In some other examples, positions below 70, and positions above 130 are taken into account. In some other examples, all positions of the 1D array of the current frame are taken into account. Parameter 30 can be changed (e.g. it can be 29, 28, etc. or 31, 32, etc.).
(46) In the 3rd loop a mean Kernel is iterated. This arises because an object can have a single position of the vertical vector (1D array), or it can be two positions, or it can have a size of three positions, etc. Therefore, we will compare the displacement of 1 position of the current frame with a position of the previous frame, a block of 2 positions of the current frame with a block of 2 positions of the previous frame, a block of 3 positions of the current frame with a block of 3 positions of the previous frame, and so on.
(47) For example, the maximum Kernel value is 150, but it could be another value (e.g. 149, 145, etc.). Half of the maximum value of the Kernel is calculated (150/2) which is 75. If for example, we are in position 100, we calculate: 100−75.
(48) Therefore, we will go from position 25 (100−75) to 175 (100+75). In conclusion, here we consider the measure of the object, that is, the displacement of a group of positions (this group can be a position of the vertical vector or many positions).
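The displacement search performed by the 2nd and 3rd loops can be sketched as below. For brevity, the kernel size is fixed rather than iterated, and the parameter values (max_disp=30, kernel=3) are the illustrative ones used in the examples above; only the highest similarity value is kept, as described:

```python
def ncc(a, b):
    """Normalized cross-correlation (similarity in [-1, 1])."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    da = sum((x - ma) ** 2 for x in a) ** 0.5
    db = sum((y - mb) ** 2 for y in b) ** 0.5
    return 0.0 if da == 0 or db == 0 else num / (da * db)

def best_displacement(curr, prev, pos, max_disp=30, kernel=3):
    """For one position of the current 1D array (one step of the 1st loop),
    compare a kernel-wide block around pos with blocks of the previous
    frame shifted by every displacement in [-max_disp, +max_disp]
    (2nd loop), keeping only the displacement with the highest NCC."""
    half = kernel // 2
    n = len(curr)
    block = curr[max(0, pos - half):min(n, pos + half + 1)]
    best_d, best_score = 0, -2.0
    for d in range(-max_disp, max_disp + 1):
        centre = pos + d
        ref = prev[max(0, centre - half):min(n, centre + half + 1)]
        if len(ref) != len(block):
            continue  # outside the displacement range: skipped, not searched
        score = ncc(block, ref)
        if score > best_score:
            best_score, best_d = score, d
    return best_d, best_score
```

With this sign convention, a returned displacement of −3 means the best-matching block sat three positions lower in the previous frame, i.e. the object has moved three positions between frames.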
(49) Therefore, we group the blocks of a predefined size that have the same movement.
(50) The tracker, in addition to including the aforementioned 3 loops, also uses the information provided by the loops, that is, computes the displacement, discards objects with unusual movement or that are not of interest, and performs minor tasks such as updating variables, copying vectors, etc. For example, negative movements (in the opposite direction) and/or too large movements (errors) are discarded.
(51) The tracker does not process any image, so it is very light in terms of computational power; it works on numerical values derived from the top view image.
(52) In summary, the invention provides a system and a method for detecting objects in the blind spot of a vehicle.
(53) Changing lanes can be particularly hazardous when another vehicle is continuously operating in the blind spot of the vehicle in an adjacent lane.
(54) The Side Collision Warning (SCW) and Blind Spot Detection (BSD) modules operate independently.
(55) For the “right/left rear zone” the Side Collision Warning (SCW) module is used, whereas for the “right/left adjacent zone” the BSD module is used.
(56) The Lane Recognition (LR) module is used only for the Side Collision Warning (SCW) module, not for the BSD module.
(57) It must be pointed out that BSD does not take into account the speed of the object as such. The Track Horizontal Mean compares the displacement of two consecutive frames to identify the object's motion direction, and to remove sporadic or inconsistent motion.
(58) Additionally, there is an optional module called ego-motion computation, which is the only part of the BSD module that takes the detected object's speed into account.
(59) The BSD module comprises only one camera per side. No other sensors (e.g. positioning sensors) are needed, and there is no actuator/motor to rotate the camera (change its field of view).
(60) Even though reference has been made to a specific embodiment of the invention, it is obvious to a person skilled in the art that the method and system described herein are susceptible to numerous variations and modifications, and that all of the details mentioned can be replaced by other technically equivalent details without departing from the scope of protection defined by the attached claims.