VISION SYSTEM FOR A MOTOR VEHICLE

20230171510 · 2023-06-01

    Abstract

    A vision system (10) for a motor vehicle comprises an imaging apparatus (11) adapted to capture images from a surrounding of the motor vehicle, and a data processing unit (14) adapted to perform image processing on images captured by said imaging apparatus (11) in order to detect objects in the surrounding of the motor vehicle. The data processing unit (14) comprises a flicker mitigation software module (33) adapted to generate a flicker mitigated current image (30′) for a current image frame by filter processing involving a captured current image (30.sub.N+1) corresponding to the current image frame and at least one captured earlier image (30.sub.N) corresponding to an earlier image frame.

    Claims

    1-15. (canceled)

    16. A vision system for a motor vehicle, comprising: a memory; and a processor communicatively coupled to the memory and configured to: receive, from a camera, a first image frame and a second image frame; process the first image frame and the second image frame to detect objects within the first image frame and the second image frame; and generate a flicker mitigated current image based on filter processing the first image frame and the second image frame.

    17. The vision system of claim 16, wherein the processor is configured to: detect a light source in the first image frame and the second image frame; and time filter a region around the detected light source in the first image frame and the second image frame.

    18. The vision system of claim 17, wherein the processor is configured to blend a first image region around the detected light source in the first image frame with a corresponding second image region in the second image frame.

    19. The vision system of claim 18, wherein the processor is configured to blend the first image region with the second image region based on first and second weights.

    20. The vision system of claim 19, wherein the first and second weights vary within the first and second image regions.

    21. The vision system of claim 20, wherein the first and second weights vary monotonically from a center to an edge of the first and second image regions.

    22. The vision system of claim 18, wherein the processor is configured to determine which of the first image region and the second image region has at least one of a higher brightness and a pre-defined color.

    23. The vision system of claim 17, wherein the processor is configured to blend the second image region around the detected light source in the second image frame over the first image region in the first image frame, wherein the detected light source is visible in the second image frame and not visible in the first image frame.

    24. The vision system of claim 17, wherein the processor is configured to track the detected light source over a plurality of image frames comprising the first image frame and the second image frame.

    25. The vision system of claim 17, wherein the processor is configured to predict the position of the detected light source in a future image frame.

    26. The vision system of claim 16, wherein the processor is configured to: calculate a spatially low pass filtered difference image between the first image frame and the second image frame; and compensate the first image frame based on the spatially low pass filtered difference image.

    27. The vision system of claim 26, wherein the processor is configured to calculate the spatially low pass filtered difference image based on a color intensity of the first image frame and the second image frame.

    28. The vision system of claim 27, wherein the processor is configured to calculate the spatially low pass filtered difference image between a green pixel intensity of the first image frame and a green pixel intensity of the second image frame.

    29. The vision system of claim 16, wherein the camera is configured to capture a plurality of image frames comprising the first image frame and the second image frame at a plurality of exposure settings, and wherein the processor is configured to generate flicker mitigated images from the captured images based on the plurality of exposure settings, the flicker mitigated images comprising the flicker mitigated current image.

    30. The vision system of claim 16, wherein the processor is configured to resample the second image frame before the filter processing to compensate for movement of the motor vehicle from a first time associated with the second image frame to a second time associated with the first image frame.

    31. The vision system of claim 16, comprising the camera.

    32. A method by at least one processor, the method comprising: receiving, from a camera, a first image frame and a second image frame; processing the first image frame and the second image frame to detect objects within the first image frame and the second image frame; and generating a flicker mitigated current image based on filter processing the first image frame and the second image frame.

    33. The method of claim 32, comprising: detecting a light source in the first image frame and the second image frame; and time filtering a region around the detected light source in the first image frame and the second image frame.

    34. The method of claim 33, comprising blending a first image region around the detected light source in the first image frame with a corresponding second image region in the second image frame.

    35. The method of claim 34, comprising blending the first image region with the second image region based on first and second weights.

    Description

    [0032] In the following, the invention shall be illustrated on the basis of preferred embodiments with reference to the accompanying drawings, wherein:

    [0033] FIG. 1 shows a scheme of an on-board vision system;

    [0034] FIG. 2 shows a drawing for illustrating the LED flicker effect in a video stream;

    [0035] FIG. 3 shows a flow diagram illustrating image processing according to a first embodiment of the invention;

    [0036] FIGS. 4, 5 show captured images corresponding to consecutive image frames;

    [0037] FIG. 6 shows a flicker mitigated image;

    [0038] FIG. 7 shows a captured image at night time;

    [0039] FIG. 8 shows a diagram with green pixel intensities averaged over a row for five consecutive image frames;

    [0040] FIG. 9 shows a diagram with differences between any two consecutive curves of FIG. 8;

    [0041] FIG. 10 shows a 2D spatially low pass filtered difference image between a captured current image and a captured earlier image; and

    [0042] FIG. 11 shows a flicker mitigated current image generated by compensating the captured current image with the 2D spatially low pass filtered difference image of FIG. 10.

    [0043] The on-board vision system 10 is mounted, or to be mounted, in or to a motor vehicle and comprises an imaging apparatus 11 for capturing images of a region surrounding the motor vehicle, for example a region in front of the motor vehicle. The imaging apparatus 11, or parts thereof, may be mounted for example behind the vehicle windscreen or windshield, in a vehicle headlight, and/or in the radiator grille. Preferably, the imaging apparatus 11 comprises one or more optical imaging devices 12, in particular cameras, preferably operating in the visible wavelength range, in the infrared wavelength range, or in both the visible and infrared wavelength ranges. In some embodiments the imaging apparatus 11 comprises a plurality of imaging devices 12, in particular forming a stereo imaging apparatus 11. In other embodiments, only one imaging device 12 forming a mono imaging apparatus 11 can be used. Each imaging device 12 preferably is a fixed-focus camera, where the focal length f of the lens objective is constant and cannot be varied.

    [0044] The imaging apparatus 11 is coupled to an on-board data processing unit 14 (or electronic control unit, ECU) adapted to process the image data received from the imaging apparatus 11. The data processing unit 14 is preferably a digital device which is programmed or programmable and preferably comprises a microprocessor, a microcontroller, a digital signal processor (DSP), and/or a microprocessor part in a System-On-Chip (SoC) device, and preferably has access to, or comprises, a digital data memory 25. The data processing unit 14 may comprise a dedicated hardware device, like a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), a Graphics Processing Unit (GPU), or an FPGA and/or ASIC and/or GPU part in a System-On-Chip (SoC) device, for performing certain functions, for example controlling the capture of images by the imaging apparatus 11, receiving the electrical signal containing the image information from the imaging apparatus 11, rectifying or warping pairs of left/right images into alignment and/or creating disparity or depth images. The data processing unit 14 may be connected to the imaging apparatus 11 via a separate cable or a vehicle data bus. In another embodiment, the ECU and one or more of the imaging devices 12 can be integrated into a single unit, where a one-box solution including the ECU and all imaging devices 12 may be preferred. All steps from imaging and image processing to possible activation or control of a safety device 18 are performed automatically and continuously during driving, in real time.

    [0045] Image and data processing carried out in the data processing unit 14 advantageously comprises identifying and preferably also classifying possible objects (object candidates) in front of the motor vehicle, such as pedestrians, other vehicles, bicyclists and/or large animals, tracking over time the position of objects or object candidates identified in the captured images, and activating or controlling at least one safety device 18 depending on an estimation performed with respect to a tracked object, for example on an estimated collision probability.

    [0046] The safety device 18 may comprise at least one active safety device and/or at least one passive safety device. In particular, the safety device 18 may comprise one or more of: at least one safety belt tensioner, at least one passenger airbag, one or more restraint systems such as occupant airbags, a hood lifter, an electronic stability system, at least one dynamic vehicle control system, such as a brake control system and/or a steering control system, a speed control system, a display device to display information relating to a detected object, and a warning device adapted to provide a warning to a driver by suitable optical, acoustical and/or haptic warning signals.

    [0047] The invention is applicable to autonomous driving, where the ego vehicle is an autonomous vehicle adapted to drive partly or fully autonomously or automatically, and driving actions of the driver are partially and/or completely replaced or executed by the ego vehicle.

    [0048] The problem underlying the present invention is illustrated in FIG. 2, which has been taken from B. Deegan, “LED flicker: root cause, impact and measurement for automotive imaging applications”, IS&T Electronic Imaging, Autonomous Vehicles and Machines 2018, p. 146-1 to 146-6. It displays an LED traffic light signalling red in two consecutive time frames N and N+1. The LED pulse scheme of the traffic light is shown in the second line under the traffic lights. In the last line, the exposure scheme of the imaging device 12 (more specifically, of the imaging sensor in the camera 12) is shown. In time frame N, the exposure time of the imaging sensor overlaps the LED pulse ON, such that the red light is visible in the image of time frame N. However, in time frame N+1, there is no overlap between the exposure time and the LED pulse ON, since the exposure time lies completely in the blanking interval of the imaging sensor. Consequently, time frame N+1 completely misses the LED pulses, and the traffic light appears completely OFF in time frame N+1, which causes an unwanted flicker effect in the video stream.
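
    Purely by way of illustration (the following sketch is not part of the original disclosure, and all names and numbers in it are hypothetical), the overlap logic behind this effect can be expressed in a few lines of Python:

```python
import math

def exposure_sees_pulse(exp_start: float, exp_time: float,
                        led_period: float, led_on: float) -> bool:
    """True if the exposure window [exp_start, exp_start + exp_time)
    overlaps any ON phase of an LED that is ON for the first led_on
    seconds of every led_period seconds."""
    k_first = math.floor(exp_start / led_period)
    k_last = math.floor((exp_start + exp_time) / led_period)
    for k in range(k_first, k_last + 1):
        on_start, on_end = k * led_period, k * led_period + led_on
        # Standard interval overlap test against the k-th ON pulse.
        if exp_start < on_end and exp_start + exp_time > on_start:
            return True
    return False

# A 100 Hz LED with 25% duty cycle sampled by a 30 fps camera with a
# 1 ms exposure: some frames catch the ON pulse while others fall entirely
# in the OFF phase, which is the flicker effect illustrated in FIG. 2.
for frame in range(5):
    print(frame, exposure_sees_pulse(frame / 30.0, 0.001, 0.01, 0.0025))
```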

    [0049] In order to solve the above problem, the data processing unit 14 comprises a flicker mitigation software module 33 adapted to generate a flicker mitigated current image for a current image frame by filter processing involving a captured current image corresponding to the current image frame and at least one captured earlier image corresponding to an earlier image frame. This is explained in the following for two basic embodiments of the invention. The flicker mitigation software module 33 has access to the data memory 25, where the one or more earlier images needed for the flicker mitigation are stored for use in the current time frame processing.

    [0050] A first basic embodiment of the invention is explained with reference to FIGS. 3 to 6. In FIG. 3, image processing in the data processing unit 14 is illustrated in a flow diagram. Images 30 captured by the imaging apparatus 11 are input to a light source detector 31, which is adapted to detect light sources, like traffic lights, traffic signs and/or other vehicles' headlights or backlights, in the images 30.

    [0051] A simple practical example of two images 30.sub.N, 30.sub.N+1 corresponding to consecutive time frames N and N+1 is shown in FIGS. 4 and 5, where N+1 is the current image frame, such that FIG. 5 shows the captured current image 30.sub.N+1, and N is the last time frame before the current time frame, such that FIG. 4 shows the captured earlier image 30.sub.N. Two traffic lights for a level crossing are visible, where the light source detector 31 is adapted to detect these traffic lights and output a so-called bounding box 40.sub.N, 41.sub.N (40.sub.N+1, 41.sub.N+1) for each detected light source or traffic light, which delimits a small, usually rectangular, image region around and including the detected light source. The image region within a bounding box 40.sub.N, 41.sub.N (40.sub.N+1, 41.sub.N+1) defines the corresponding region-of-interest (ROI) of the corresponding traffic light in the flicker mitigation processing. In the following, the terms “bounding box” and “ROI” are used synonymously, where it should be understood that an ROI is actually an image region (or an image patch, i.e. an image content) within the boundaries defined by the bounding box.
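
    As an illustrative aside (not part of the patent text; the helper names below are invented for this sketch), the bounding box/ROI relationship can be captured as follows:

```python
from dataclasses import dataclass

import numpy as np

@dataclass
class BoundingBox:
    """Rectangular image region around a detected light source, in pixels."""
    x: int       # left edge (column index)
    y: int       # top edge (row index)
    width: int
    height: int

def extract_roi(image: np.ndarray, box: BoundingBox) -> np.ndarray:
    """Return the image patch (ROI) delimited by the bounding box."""
    return image[box.y:box.y + box.height, box.x:box.x + box.width]
```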

    [0052] By comparing FIGS. 4 and 5, it is evident that FIG. 4 corresponds to an ON phase of the LED light pulse of the traffic lights, such that the traffic lights are brightly visible, while FIG. 5 corresponds to an OFF phase of the LED light pulse, such that the green traffic lights are barely visible in the captured current image 30.sub.N+1 shown in FIG. 5, although the traffic lights are actually on (green lights). This leads to a disadvantageous flicker in a video comprising the time frames . . . , N, N+1, . . . .

    [0053] The light source detector 31 outputs information relating to the bounding boxes 40, 41, like their position and size, and the image patches (ROIs) limited by the bounding boxes, to an optional light source tracker 32. The light source tracker 32, if present, is adapted to track the detected light sources over several time frames, and to output corresponding bounding box information 40, 41. For example, FIG. 5 shows an image from the same imaging apparatus 11 as FIG. 4, but corresponding to the next image frame N+1. The light source tracker 32 is adapted to track the traffic lights of FIG. 4 also in the image of the consecutive image frame N+1 (FIG. 5) and to determine corresponding bounding boxes 40.sub.N+1, 41.sub.N+1 also in FIG. 5. Of course, detected light sources may be tracked over more than two consecutive image frames.

    [0054] The light source detector 31 and the light source tracker 32 are software modules similar to conventional object detectors and trackers for detecting and tracking objects like, for example, other vehicles, pedestrians etc., and may be implemented using techniques known per se.

    [0055] All information on the bounding boxes 40.sub.N, 41.sub.N, 40.sub.N+1, 41.sub.N+1 of consecutive image frames N, N+1, . . . is forwarded to a flicker mitigation software module 33. The flicker mitigation software module 33 takes the region of interest (ROI) of the traffic light from time frame N (the image region in bounding box 40.sub.N and 41.sub.N, respectively), and resamples the ROI of time frame N to the size of the traffic light ROI in time frame N+1 (the image region in bounding box 40.sub.N+1 and 41.sub.N+1, respectively).

    [0056] In one embodiment, the flicker mitigation software module 33 calculates an average ROI 40′.sub.N+1, 41′.sub.N+1 from the resampled ROI of time frame N and the ROI of time frame N+1, where calculating an average ROI means calculating an average z value (RGB value, greyscale value or intensity value) for each pixel of the ROI. The flicker mitigation software module 33 then creates a flicker mitigated current image 30′.sub.N+1 by taking the captured current image 30.sub.N+1 everywhere outside the ROIs of detected light sources (here, everywhere outside the ROIs 40.sub.N+1, 41.sub.N+1), while filling the averaged ROIs 40′.sub.N+1, 41′.sub.N+1 into the bounding boxes of the detected light sources.
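
    Again purely as an illustrative sketch (assuming NumPy and OpenCV, and reusing the hypothetical BoundingBox/extract_roi helpers from the sketch above), the resample-and-average step could look like:

```python
import cv2
import numpy as np

def average_roi_blend(current: np.ndarray, earlier_roi: np.ndarray,
                      box: BoundingBox) -> np.ndarray:
    """Fill the bounding box in the current image with the pixelwise average
    of the current ROI and the earlier ROI, resampled to the current size."""
    out = current.copy()
    cur_roi = extract_roi(current, box).astype(np.float32)
    # Resample the earlier ROI to the exact size of the current ROI.
    resampled = cv2.resize(earlier_roi, (box.width, box.height),
                           interpolation=cv2.INTER_LINEAR).astype(np.float32)
    blended = 0.5 * (cur_roi + resampled)
    out[box.y:box.y + box.height, box.x:box.x + box.width] = \
        blended.astype(current.dtype)
    return out
```

    Equal weights of 0.5 are used here for simplicity; the weighted blending of claims 19 to 21, with weights varying within the image regions, would replace the constant factor by per-pixel weight maps.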

    [0057] As a result, the flicker mitigated current image 30′.sub.N+1 shown in FIG. 6 is obtained, where the traffic lights are much better visible than in the captured (non-flicker mitigated) current image 30.sub.N+1 shown in FIG. 5, such that flicker in a video comprising the time frames . . . , N, N+1, . . . can be strongly reduced. Flicker mitigated images 30′ are output by said flicker mitigation software module 33, see FIG. 3.

    [0058] In another embodiment, the flicker mitigation software module 33 comprises a brightness and/or color detector which is adapted to detect the brightness and/or color (like green/orange/red in the case of traffic lights) of the detected light sources in the ROIs 40.sub.N, 41.sub.N, 40.sub.N+1, 41.sub.N+1, and to decide which of the ROIs 40.sub.N, 41.sub.N, 40.sub.N+1, 41.sub.N+1 is preferable. In the example of FIGS. 4 and 5, the brightness and/or color detector would be able to detect that the ROIs 40.sub.N, 41.sub.N are bright and green (corresponding to a green traffic light), while the ROIs 40.sub.N+1, 41.sub.N+1 are essentially dark. Therefore, the brightness and/or color detector decides that the ROIs 40.sub.N, 41.sub.N are preferable over the ROIs 40.sub.N+1, 41.sub.N+1. The flicker mitigation software module 33 then creates a flicker mitigated current image 30′.sub.N+1 by taking the captured current image 30.sub.N+1 everywhere outside the ROIs of detected light sources (here, everywhere outside the ROIs 40.sub.N+1, 41.sub.N+1), while filling the brighter and/or colored, and therefore preferred, ROIs 40.sub.N, 41.sub.N into the bounding boxes of the detected light sources. As a result, a flicker mitigated current image is obtained in which the traffic lights are very well visible (as in FIG. 4), such that flicker in a video comprising the time frames . . . , N, N+1, . . . can be strongly reduced or even eliminated.
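
    One simple way to score the two candidate ROIs, purely as an illustration of the brightness and/or color decision (the scoring heuristic below is an assumption for a green light and is not taken from the patent; a real detector would also handle orange and red):

```python
import numpy as np

def roi_score(roi: np.ndarray) -> float:
    """Score an ROI by overall brightness plus a bonus for green dominance
    (RGB channel order assumed)."""
    brightness = float(roi.mean())
    green_dominance = float(roi[..., 1].mean() - roi[..., [0, 2]].mean())
    return brightness + max(green_dominance, 0.0)

def select_preferred_roi(roi_n: np.ndarray, roi_n1: np.ndarray) -> np.ndarray:
    """Pick the ROI that more likely shows the light source in its ON phase."""
    return roi_n if roi_score(roi_n) >= roi_score(roi_n1) else roi_n1
```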

    [0059] In a second basic embodiment of the invention, the flicker mitigation software module 33 is adapted to calculate a spatially low pass filtered difference image between a captured current image 30.sub.N+1 and a captured earlier image 30.sub.N; and preferably to compensate the captured current image 30.sub.N+1 on the basis of the calculated spatially low pass filtered difference image.

    [0060] The second basic embodiment of the invention is described in the following with reference to FIGS. 7 to 11.

    [0061] FIG. 7 shows a captured image 30 of a city scene with a fairly uniform illumination of the scene. As an example, it can be assumed that the street lights are powered by a 50 Hz mains supply.

    [0062] Before coming to the general case, a simple example with a fairly uniform illumination of the scene will be investigated for a better understanding. Here, the flicker mitigation software module 33 is adapted to calculate the mean (average) of the green pixel intensity (in an RGB color sensor) over every image row of captured images 30 like the one shown in FIG. 7. The result is shown in FIG. 8 for five consecutive image or time frames (frames 1-5), where the y-axis denotes the green pixel intensity averaged over an image row, for example expressed in Least Significant Bits (LSB), and the x-axis denotes the row number. Since the street lights in the scene in this example flicker at 100 Hz (twice the 50 Hz mains frequency), similar row mean intensity values are obtained for all the odd frames (1, 3, 5 in the plot), and other similar row mean intensity values for the even frames (2 and 4 in the plot). This is expected due to the relationship between the mains frequency and the frame rate of the camera 12.

    [0063] The flicker mitigation software module 33 is adapted to calculate the differences between the row mean intensity values (row mean differences) of consecutive frames. The corresponding differences between the row mean intensity values of image frames 1 and 2, frames 2 and 3, frames 3 and 4, and frames 4 and 5 of FIG. 8 are shown in FIG. 9, where the y-axis denotes the difference of the curves of FIG. 8 for two consecutive frames, and the x-axis again denotes the row number. By low pass filtering the row mean differences, the solid curves in FIG. 9 are obtained. Here, a clear pattern is visible, due to the camera frame rate and rolling shutter line time in relation to the mains frequency driving the street lights.

    [0064] Generalizing the above, the following compensation scheme, performed in the flicker mitigation software module 33, is suited for removing the flicker/banding in a perfectly evenly illuminated scene:

    [0065] calculate the green pixel intensity averaged over each image row (row mean) for consecutive frames N+1 and N;

    [0066] calculate the row mean difference between frame N+1 and frame N;

    [0067] spatially low pass filter the row mean difference;

    [0068] compensate frame N+1 with half of the spatially low pass filtered row mean difference.
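
    A minimal sketch of this 1D scheme (assuming NumPy/SciPy, 8-bit RGB input images, and an arbitrary smoothing window; for simplicity the compensation is applied to all color channels, not only green):

```python
import numpy as np
from scipy.ndimage import uniform_filter1d

def compensate_rows(frame_n1: np.ndarray, frame_n: np.ndarray,
                    green: int = 1, window: int = 51) -> np.ndarray:
    """1D flicker/banding compensation for an evenly illuminated scene."""
    # Green pixel intensity averaged over each image row (row mean).
    row_mean_n1 = frame_n1[..., green].astype(np.float32).mean(axis=1)
    row_mean_n = frame_n[..., green].astype(np.float32).mean(axis=1)
    # Row mean difference between frame N+1 and frame N, spatially
    # low pass filtered along the row index.
    diff_lp = uniform_filter1d(row_mean_n1 - row_mean_n, size=window)
    # Compensate frame N+1 with half of the filtered difference.
    out = frame_n1.astype(np.float32) - 0.5 * diff_lp[:, None, None]
    return np.clip(out, 0, 255).astype(frame_n1.dtype)
```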

    [0069] In reality, the illumination of a scene can vary much more. Therefore, instead of calculating one compensation value per row (1D compensation), the flicker mitigation software module 33 should preferably be adapted to perform a 2D compensation. In a similar fashion as above, green pixel intensity differences between two frames are calculated by the flicker mitigation software module 33 in a 2D fashion (instead of 1D). This can be done in several ways, e.g.:

    [0070] A. Calculate a complete 2D difference image from images N and N+1 and spatially low pass filter it. An example of a complete low pass filtered 2D difference image for the scene of FIG. 7 is shown in FIG. 10. Use the low pass filtered complete 2D difference image for compensation. An example of the compensated current image for the scene of FIG. 7, where the compensation has been performed on the basis of the complete low pass filtered 2D difference image, is shown in FIG. 11. In the scene of FIG. 11, there are strong downward-facing street lights which produce local flicker in the scene without flicker mitigation.

    [0071] B. Divide the image into sub-regions (e.g. 64 px×32 px sub-regions) and calculate pixel mean values for these regions. Calculate a small difference sub-image between the two sub-images corresponding to the sub-regions, using the regional averages. Optionally perform spatial low pass filtering. Perform compensation of the captured current image N+1 by interpolating the small difference image, as sketched below.
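
    A corresponding sketch of approach B, under the same assumptions as the 1D sketch above (the factor 0.5 is carried over from the 1D scheme; block size and interpolation method are illustrative choices):

```python
import cv2
import numpy as np

def compensate_2d(frame_n1: np.ndarray, frame_n: np.ndarray,
                  block_h: int = 32, block_w: int = 64,
                  green: int = 1) -> np.ndarray:
    """2D compensation (approach B): block-averaged green-channel difference,
    interpolated back to full resolution and subtracted from frame N+1."""
    h, w = frame_n1.shape[:2]
    # Crop to whole blocks so the image tiles evenly into sub-regions.
    hh, ww = h // block_h * block_h, w // block_w * block_w
    diff = (frame_n1[:hh, :ww, green].astype(np.float32)
            - frame_n[:hh, :ww, green].astype(np.float32))
    # Pixel mean difference per sub-region -> small difference sub-image.
    small = diff.reshape(hh // block_h, block_h,
                         ww // block_w, block_w).mean(axis=(1, 3))
    # Interpolate the small difference image to full resolution and compensate.
    full = cv2.resize(small, (w, h), interpolation=cv2.INTER_LINEAR)
    out = frame_n1.astype(np.float32) - 0.5 * full[..., None]
    return np.clip(out, 0, 255).astype(frame_n1.dtype)
```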

    [0072] When the vehicle is moving, subsequent images N and N+1 capture slightly different views of the environment, since the camera has moved relative to the environment. This can preferably be compensated by resampling image N before calculating the difference image. This is computationally more efficient with approach B above than with approach A, since only the lower resolution sub-region image needs to be resampled, rather than the full resolution image.

    [0073] The pixel resampling locations can be calculated, e.g., from optical flow, from a model of the environment, or from a combination thereof. The model would use the camera calibration and the vehicle movement. The vehicle movement can be known from vehicle signals like speed and yaw rate, or be calculated from visual odometry. The simplest model of the environment is a flat world model, in which the ground is flat and nothing exists above the ground. Several models could be used; e.g., a tunnel model can be used when driving in a tunnel.
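
    To make the flat world model concrete, here is a deliberately minimal sketch (pinhole camera without lens distortion, pure forward ego motion; yaw, pitch and all parameter names are hypothetical simplifications, not taken from the patent):

```python
import numpy as np

def flat_world_resample_locations(u: np.ndarray, v: np.ndarray,
                                  f: float, cx: float, cy: float,
                                  cam_height: float,
                                  speed: float, dt: float):
    """For ground pixels (u, v) of the current frame N+1, estimate where the
    same ground points appeared in the earlier frame N, assuming a flat world
    and a forward ego motion of speed*dt metres between the frames."""
    # Back-project to the ground plane, cam_height metres below the camera;
    # only valid below the horizon row (v > cy).
    z = f * cam_height / (v - cy)     # forward distance of the ground point
    x = (u - cx) * z / f              # lateral offset of the ground point
    # In the earlier frame, the same ground point was speed*dt farther away.
    z_old = z + speed * dt
    u_old = f * x / z_old + cx
    v_old = f * cam_height / z_old + cy
    return u_old, v_old
```

    The resulting (u_old, v_old) grid could then be passed to a standard remapping routine such as cv2.remap to resample image N onto the pixel grid of image N+1 before the difference image is calculated.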