High quality lighting resilient segmentation system using active background
20170287140 · 2017-10-05
Inventors
- Sylvain Cardin (Essertines sur Yverdon, CH)
- Julien Vincent (Renens, CH)
- Frederic Condolo (Montreux, CH)
CPC classification
- H04N5/262 (ELECTRICITY)
- H04N5/2226 (ELECTRICITY)
Abstract
The present invention refers to the field of video processing, and, in particular, to a system and a method for achieving high quality foreground segmentation using an active background.
The present invention is embodied in a system and a method capable of achieving high quality foreground segmentation using an active background, wherein the foreground is any object or person located between a camera and a background. The system comprises an active background, one or several multispectral cameras, a hardware synchronizer, an invisible light driver and a main computer.
The main features of the system consist of one or several of the following:
- a. A sub-system acquiring reference images of the active background.
- b. A sub-system acquiring the images of each video frame.
- c. A sub-system performing real-time frame processing.
- d. A sub-system performing noise reduction.
Claims
1. A method, comprising: controlling emission of invisible light from an active background; employing a multispectral camera to record an image from invisible light (IL) received from the active background and to record an image from visible light (VL); and processing the image from the invisible light to determine pixels associated with a foreground located between the active background and the camera.
2. The method according to claim 1, wherein the invisible light is generated from behind the active background which is at least partially translucent.
3. The method according to claim 1, wherein the invisible light is reflected by the active background.
4. The method according to claim 1, further including triggering flashes of the invisible light from the active background.
5. The method according to claim 1, further including controlling the emission of the invisible light based upon a level of visible ambient light.
6. The method according to claim 1, wherein the VL image comprises RGB pixel values and the IL image comprises a pixel intensity value for the IL spectrum.
7. The method according to claim 1, further including processing the image using pixel maps including a first reference map of RGB values for each pixel, a second reference map of IL spectrum values without IL emission, and a third reference map of IL spectrum values with IL emission.
8. The method according to claim 7, further including generating a foreground mask by comparing the IL spectrum values for a current frame with the second reference map of IL spectrum values without IL emission.
9. The method according to claim 1, further including generating a foreground mask by comparing IL spectrum values for a current frame with IL spectrum values for a previous frame.
10. The method according to claim 1, further including performing recalibration when lighting conditions have changed by more than a selected threshold.
11. The method according to claim 1, further including synchronizing emission of the IL and image acquisition for the VL image.
12. The method according to claim 11, further including synchronizing a pulse of IL emission and image acquisition for the IL image.
13. The method according to claim 1, further including assigning each pixel as background, foreground or unknown, and processing the unknown pixels to determine an alpha channel corresponding to the VL image.
14. The method according to claim 13, further including determining the alpha channel using a foreground visibility ratio.
15. A system comprising: a backlighting system to selectably provide invisible light emission; a multispectral camera to acquire an invisible light image from the invisible light emitted by the backlighting system and to acquire a visible light image; a signal generator to control the invisible light emission by the backlighting system; and a processing module to process the image from the invisible light to determine pixels associated with a foreground located between the backlighting system and the camera.
16. A system according to claim 15, wherein the backlighting system consists of one or several surfaces emitting or reflecting light in the invisible spectrum.
17. A system according to one of claims 15-16, wherein the backlighting system is able to produce short flashes of invisible light by means of: a programmable hardware trigger signal generator; and an invisible light driver to power up and control the IL emitter.
18. A system of claim 17, wherein the system is able to filter out the invisible light which does not come from the backlighting system.
19. A system according to one of the preceding claims, wherein the backlighting system illuminates in one of the infrared, near infrared (NIR) or ultraviolet spectrums.
20. A system of claim 19, wherein LED strips are used as the IL emitter and a LED driver is used to reach maximal burst electric current during flash.
Description
DESCRIPTION OF THE DRAWINGS
DETAILED DESCRIPTION
[0069] In a preferred embodiment, the system is composed of an active background (110), one or several multispectral cameras (120), a hardware synchronizer (130), an invisible light driver (140) and a main computer (150). A general scheme of the components is presented in
[0070] In one embodiment of the invention, a system setup (312) is initially performed. The components can be arranged in such configurations as to allow the subject to be located between the camera (120) and the active background (110). One example of such a system configuration is schematically illustrated in
[0071] In one embodiment of the invention, subsequently to the system setup (312) the system can be triggered to start a new recording (314). In a preferred embodiment the whole system can start up by itself once plugged in: the main computer (150) auto-boots a client application dedicated to the real time processing, and the hardware synchronizer (130) always emits its trigger signals. The client application will further send the startup sequence and change the hardware synchronizer (130) mode if needed.
[0072] In one embodiment of the invention, the real time processing is preceded by a calibration step in which background reference images (316) are acquired and stored in the main computer (150). This step requires that no foreground object is present. The reference images are acquired both with and without illumination in the VL and IL spectrums. The main computer (150) creates three reference maps, storing a mean value and standard deviation for each pixel, as follows: the first reference map for the three color channels of the visible spectrum (RGB), the second for the IL spectrum values with backlighting on, and the third for the IL spectrum values with backlighting off. According to some embodiments of the present invention such a recalibration (330) can be executed during the video recording, reference images being automatically recomputed when no foreground object is detected for a given period of time. For example, recalibration is mostly needed when the lighting conditions change drastically. Other situations that could require recalibration include scenarios where the background has been moved slightly (for example, when the subject collides with the background while playing).
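The calibration step above can be sketched as follows. This is a minimal illustration under assumed conventions (frame stacks as NumPy arrays, illustrative function names), not the patent's implementation:

```python
# Sketch of the calibration step: build three per-pixel reference maps
# (mean and standard deviation) from stacks of frames captured while
# no foreground object is present. Function names are illustrative.
import numpy as np

def build_reference_map(frames):
    """frames: array of shape (n_frames, H, W[, channels]).
    Returns the per-pixel mean and standard deviation."""
    stack = np.asarray(frames, dtype=np.float64)
    return stack.mean(axis=0), stack.std(axis=0)

def calibrate(rgb_frames, il_on_frames, il_off_frames):
    """Create the three reference maps described in the text."""
    return {
        "rgb": build_reference_map(rgb_frames),       # visible spectrum (RGB)
        "il_on": build_reference_map(il_on_frames),   # IL with backlighting on
        "il_off": build_reference_map(il_off_frames), # IL with backlighting off
    }
```

Averaging several frames per map reduces sensor noise in the stored references; the standard deviation gives a per-pixel tolerance usable when classifying pixels later.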
[0073] In a preferred embodiment, the real time processing is performed frame by frame (320) by the main computer (150) based on the frame images (318) acquired by the camera (120) in both visible and invisible spectrums. During the IL frame image acquisition (318) the background lighting is ensured either by keeping the IL emitter always switched on or by using the hardware synchronizer (130) to set the timing for the invisible lighting system and the camera exposure, wherein the IL exposure time is kept very short compared to the visible light exposure time. Different scenarios for the usage of the hardware synchronizer (130) are described further below.
[0074] A step-by-step method for performing the real time frame processing (320) is schematically illustrated in
[0075] In one embodiment of the invention, in order to compute the alpha-channel (327) the system estimates the ratio of foreground visibility (“alpha”) as follows: “alpha”=(il−bg_il)/(fg_il−bg_il), where bg_il is an estimate of the IL intensity that would be measured if the pixel were showing only the background, and fg_il is an estimate of the IL intensity that would be measured if the pixel were totally showing the foreground object. One possibility to estimate bg_il is to use the proper reference map taken when no foreground is present (with either backlighting on or off). Another option is to search for the closest pixel marked as background in the trimap and to use the current frame IL pixel at this location as an estimate for bg_il. One possibility to estimate fg_il is to search for the closest pixel on the trimap marked as foreground, and to use the IL value for the pixel at this location as fg_il. When searching for the nearest pixel, different methodologies can be applied. One possibility is to accelerate the search for the nearest pixel satisfying some properties by using a Distance Transform algorithm, modified to keep track of which pixel is the closest [Felzenszwalb 2004].
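The alpha formula above can be sketched as follows. For clarity this uses a brute-force nearest-pixel search instead of the Distance Transform acceleration; the trimap encoding (0 = background, 1 = foreground, 2 = unknown) and the function names are assumptions:

```python
# Sketch of the foreground visibility ratio ("alpha") estimation.
import numpy as np

def nearest_value(il, mask):
    """For each pixel, the IL value of the nearest pixel where mask is
    True (brute force; a Distance Transform does this in linear time)."""
    ys, xs = np.nonzero(mask)
    h, w = il.shape
    gy, gx = np.mgrid[0:h, 0:w]
    # Squared distance from every pixel to every masked pixel.
    d2 = (gy[..., None] - ys) ** 2 + (gx[..., None] - xs) ** 2
    nearest = d2.argmin(axis=-1)
    return il[ys[nearest], xs[nearest]]

def estimate_alpha(il, trimap):
    """il: current-frame IL intensities (H, W).
    trimap: 0 = background, 1 = foreground, 2 = unknown (assumed codes)."""
    bg_il = nearest_value(il, trimap == 0)  # nearest background IL
    fg_il = nearest_value(il, trimap == 1)  # nearest foreground IL
    denom = np.where(fg_il == bg_il, 1.0, fg_il - bg_il)
    # alpha = (il - bg_il) / (fg_il - bg_il), clamped to [0, 1]
    return np.clip((il - bg_il) / denom, 0.0, 1.0)
```

A pixel measuring halfway between the local background and foreground IL estimates thus receives alpha = 0.5, i.e. a half-transparent foreground.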
[0076] In one embodiment of the invention, the visibility ratio “alpha” can be used to remove the influence of the background on “unknown” pixels. In one of the embodiments the color “fg” without background influence can be computed in the following way: “fg”=measured/alpha−estimatedBg*(1/alpha−1), where “measured” is the measured RGB pixel from the input image, and estimatedBg is an estimate of the RGB color of the background at this location. One possible method to estimate the background color for a pixel within the unknown zone of the trimap is to look at the nearest pixel marked as background in the trimap and use its color, or to use an average of neighboring colors. Another option is to rely on a color background model. In a preferred embodiment a reference color model showing the background is acquired. It would not be used directly, since illumination and camera acquisition settings might have changed between background acquisition time and the current frame; it is therefore necessary to estimate illumination locally to correct the color background model. Doing so consists in multiplying the background model pixel at the estimated location by the ratio between a current frame pixel at a close known background location and the background model pixel at the same location.
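The de-mixing formula above follows from the compositing equation measured = alpha*fg + (1−alpha)*bg, solved for fg. A minimal sketch, with assumed function and parameter names and a small epsilon clamp to avoid dividing by zero:

```python
# Sketch of removing the background influence from "unknown" pixels:
# fg = measured/alpha - estimatedBg*(1/alpha - 1)
import numpy as np

def remove_background_influence(measured, alpha, estimated_bg, eps=1e-6):
    """measured: RGB values of shape (..., 3); alpha: visibility ratio
    of shape (...); estimated_bg: background RGB estimate, shape (..., 3)."""
    a = np.clip(alpha, eps, 1.0)[..., None]  # broadcast over color channels
    fg = measured / a - estimated_bg * (1.0 / a - 1.0)
    return np.clip(fg, 0.0, 255.0)           # keep result in 8-bit range
```

For example, a pixel measured at 150 with alpha = 0.5 over an estimated background of 100 recovers a foreground color of 150/0.5 − 100*(2 − 1) = 200.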
[0077] According to some embodiments of the present invention the hardware synchronizer (130) is used either in Strobe or Double Strobe mode so that noise reduction (321) can be performed as the first step in frame processing. This can be achieved by comparing two consecutive IL images (one with the background IL on and the other with the background IL off), subtracting one from the other and storing the absolute value for each pixel. This value has the ambient IL influence removed and can be further used in the real time processing. For example, when in Double Strobe mode one possibility to achieve this is to acquire the two IL frames with a very short aperture time (for example close to 200 us) and a negligible delay between them (for example less than 1 ms). Additionally, when in Strobe mode, the last two frames are stored; by comparing the current frame with the one taken under the same lighting conditions, movement compensation can be achieved. In one embodiment of the invention compensation for camera movement can also be achieved by using, for example, external augmented reality engines to compute camera position and movement and find the proper pixel-to-pixel matching to the reference background model.
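The per-pixel absolute difference described above cancels ambient IL because it contributes equally to both exposures. A minimal sketch with assumed function names and 8-bit input frames:

```python
# Sketch of the Double Strobe noise reduction: subtract two consecutive
# IL frames (backlight on / backlight off) and keep the absolute value,
# which removes the ambient IL common to both exposures.
import numpy as np

def ambient_free_il(frame_il_on, frame_il_off):
    """frame_il_on / frame_il_off: 8-bit IL images of the same size."""
    a = np.asarray(frame_il_on, dtype=np.int32)   # widen before subtracting
    b = np.asarray(frame_il_off, dtype=np.int32)  # to avoid uint8 wraparound
    return np.abs(a - b).astype(np.uint8)
```

Widening to a signed type before subtracting matters: subtracting `uint8` arrays directly would wrap around wherever the "off" frame is brighter than the "on" frame.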
[0078] Referring to
[0079] The processes described herein are not limited to use with the hardware and software of
[0080] The system may be implemented, at least in part, via a computer program product, (e.g., in a non-transitory machine-readable storage medium such as, for example, a non-transitory computer-readable medium), for execution by, or to control the operation of, data processing apparatus (e.g., a programmable processor, a computer, or multiple computers). Each such program may be implemented in a high level procedural or object-oriented programming language to communicate with a computer system. However, the programs may be implemented in assembly or machine language. The language may be a compiled or an interpreted language and it may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program may be deployed to be executed on one computer or on multiple computers at one site or distributed across multiple sites and interconnected by a communication network. A computer program may be stored on a non-transitory machine-readable medium that is readable by a general or special purpose programmable computer for configuring and operating the computer when the non-transitory machine-readable medium is read by the computer to perform the processes described herein. For example, the processes described herein may also be implemented as a non-transitory machine-readable storage medium, configured with a computer program, where upon execution, instructions in the computer program cause the computer to operate in accordance with the processes. A non-transitory machine-readable medium may include but is not limited to a hard drive, compact disc, flash memory, non-volatile memory, volatile memory, magnetic diskette and so forth but does not include a transitory signal per se.
[0081] The processing blocks (for example, in the processes described herein) associated with implementing the system may be performed by one or more programmable processors executing one or more computer programs to perform the functions of the system. All or part of the system may be implemented as special purpose logic circuitry (e.g., an FPGA (field-programmable gate array) and/or an ASIC (application-specific integrated circuit)). All or part of the system may be implemented using electronic hardware circuitry that includes electronic devices such as, for example, at least one of a processor, a memory, a programmable logic device or a logic gate.
[0082] Elements of different embodiments described herein may be combined to form other embodiments not specifically set forth above. Various elements, which are described in the context of a single embodiment, may also be provided separately or in any suitable sub combination. Other embodiments not specifically described herein are also within the scope of the following claims.
CITATIONS
[0083] Davis, James W. and Sharma, Vinay (2007). “Background-subtraction using contour-based fusion of thermal and visible imagery.” Computer Vision and Image Understanding.
[0085] Cerny, J. (2014). “System for capturing scene and NIR relighting effects in movie postproduction transmission.” Patent, publication number WO2014057335.
[0087] Relyea, D. and Felt, M. (2011). “Image compositing via multi-spectral detection.” Patent, publication number US20110117532.
[0090] Felzenszwalb, Pedro F. and Huttenlocher, Daniel P. (2004). “Distance Transforms of Sampled Functions.”