IMAGE STITCHING IN THE PRESENCE OF A FULL FIELD OF VIEW REFERENCE IMAGE
20230065883 · 2023-03-02
Inventors
- Paz Ilan (Tel Aviv, IL)
- Shai Vaisman (Tel Aviv, IL)
- Ruthy Katz (Tel Aviv, IL)
- Adi Teitel (Tel Aviv, IL)
- Hagai Tzafrir (Tel Aviv, IL)
- Gal Shabtay (Tel Aviv, IL)
- Oded Gigushinski (Tel Aviv, IL)
- Itamar Azoulai (Tel Aviv, IL)
Cpc classification
H04N23/81
ELECTRICITY
H04N5/2628
ELECTRICITY
H04N23/10
ELECTRICITY
G02B13/02
PHYSICS
G06T3/4038
PHYSICS
H04N5/2625
ELECTRICITY
H04N23/90
ELECTRICITY
G03B37/04
PHYSICS
H04N23/69
ELECTRICITY
H04N23/951
ELECTRICITY
International classification
Abstract
Systems and methods for obtaining a seamless, high resolution, large field of view image comprise capturing a plurality of Tele images in a scene using a scanning Tele camera, each captured Tele image having an associated Tele field of view FOV.sub.T, retrieving a R image having a respective R image scene with a field of view greater than FOV.sub.T, analyzing the R image for defining an order of scanning positions according to which the folded Tele camera scans a scene to capture the plurality of Tele images, aligning the plurality of Tele images and the R image to obtain aligned Tele images, and composing the aligned Tele images into an output image. The output image may include at least parts of the R image and may be one of a stream of output images.
Claims
1. A method, comprising: providing a folded Tele camera configured to scan and capture a plurality of Tele images, each captured image having a Tele image resolution (RES.sub.T), a Tele image signal-to-noise-ratio (SNR.sub.T) and a Tele field of view (FOV.sub.T); obtaining and analyzing a reference (R) image with a R field of view FOV.sub.R > FOV.sub.T having a R image resolution RES.sub.R < RES.sub.T, and/or a R image with a signal-to-noise-ratio SNR.sub.R < SNR.sub.T; determining an order of one or more scanning FOV.sub.T positions for consecutive captures of the Tele images; capturing a Tele image at each respective scanning FOV.sub.T position; aligning the captured Tele images with segments of the R image to obtain aligned Tele images; and using the aligned Tele images and the R image to create a new image having a field of view FOV.sub.N ≤ FOV.sub.R, wherein the image resolution of the new image RES.sub.N > RES.sub.R and/or wherein the SNR of the new image SNR.sub.N > SNR.sub.R, wherein the determining an order of one or more scanning FOV.sub.T positions is performed so that a desired coverage of FOV.sub.R with a plurality of FOV.sub.TS is performed in a fastest manner.
2. (canceled)
3. (canceled)
4. The method of claim 1, further comprising aligning each Tele image with the R image immediately after its capture and prior to the capture of an immediately following Tele image, analyzing each Tele image for faults, and if faults are detected in the Tele image, re-capturing the Tele image at a same FOV.sub.T position, or, if faults are not detected in the Tele image, proceeding to capture an immediately following Tele image at a respective FOV.sub.T position.
5. The method of claim 1, further comprising analyzing the aligned Tele images for faults, and if faults are detected in a particular Tele image, re-capturing the particular Tele image at a same FOV.sub.T position, or, if faults are not detected, using the aligned Tele images and the R image to create the new image.
6. (canceled)
7. The method of claim 1, wherein the aligned Tele images and the R image are fed into an algorithm to create a super wide image having a field of view FOV.sub.SW, wherein a FOV segment within FOV.sub.R included in at least one FOV.sub.T of the captured Tele images has a field-of-view union-FOV.sub.T, and wherein union-FOV.sub.T < FOV.sub.SW ≤ FOV.sub.R.
8-15. (canceled)
16. The method of claim 1, wherein the determining an order of one or more scanning FOV.sub.T positions is performed so that each of the one or more Tele images exhibits a specific amount of natural Bokeh.
17-19. (canceled)
20. A method, comprising: providing a folded Tele camera configured to scan and capture a plurality of Tele images, each captured image having a Tele image resolution (RES.sub.T), a Tele image signal-to-noise-ratio (SNR.sub.T) and a Tele field of view (FOV.sub.T); obtaining and analyzing a reference (R) image with a R field of view FOV.sub.R > FOV.sub.T having a R image resolution RES.sub.R < RES.sub.T, and/or a R image with a signal-to-noise-ratio SNR.sub.R < SNR.sub.T; determining an order of one or more scanning FOV.sub.T positions for consecutive captures of the Tele images; capturing a Tele image at each respective scanning FOV.sub.T position; aligning the captured Tele images with segments of the R image to obtain aligned Tele images; and using the aligned Tele images and the R image to create a new image having a field of view FOV.sub.N < FOV.sub.R, wherein the image resolution of the new image RES.sub.N > RES.sub.R and/or wherein the SNR of the new image SNR.sub.N > SNR.sub.R wherein the determining an order of one or more scanning FOV.sub.T positions is performed so that the composed new image covers a maximal FOV according to a mechanical limitation of the scanning.
21. The method of claim 1, wherein the determining an order of one or more scanning FOV.sub.T positions is performed so that the new image covers a region of interest selected by a user.
22. The method of claim 1, wherein the determining an order of one or more scanning FOV.sub.T positions is performed so that the new image covers a region of interest defined by an algorithm.
23. The method of claim 1, wherein the determining an order of one or more scanning FOV.sub.T positions is performed so that each T image includes scene segments having a specific depth range or includes scene segments that do not exceed a specific depth threshold.
24. The method of claim 1, wherein the determining an order of one or more scanning FOV.sub.T positions is performed so that first moving objects are captured, and after the moving objects are captured, stationary objects are captured.
25. (canceled)
26. A method, comprising: providing a folded Tele camera configured to scan and capture a plurality of Tele images, each captured image having a Tele image resolution (RES.sub.T), a Tele image signal-to-noise-ratio (SNR.sub.T) and a Tele field of view (FOV.sub.T); obtaining and analyzing a reference (R) image with a R field of view FOV.sub.R > FOV.sub.T having a R image resolution RES.sub.R < RES.sub.T, and/or a R image with a signal-to-noise-ratio SNR.sub.R < SNR.sub.T; determining an order of one or more scanning FOV.sub.T positions for consecutive captures of the Tele images; capturing a Tele image at each respective scanning FOV.sub.T position; aligning the captured Tele images with segments of the R image to obtain aligned Tele images; and using the aligned Tele images and the R image to create a new image having a field of view FOV.sub.N < FOV.sub.R, wherein the image resolution of the new image RES.sub.N > RES.sub.R and/or wherein the SNR of the new image SNR.sub.N > SNR.sub.R, and wherein the determining an order of two or more FOV.sub.T positions is performed so that capturing a minimal number of T images is required.
27. The method of claim 1, wherein the determining an order of two or more FOV.sub.T positions is performed so that Tele images including specific scene characteristics within their respective FOV.sub.Ts may be captured consecutively, and wherein the scene characteristics may be visual data such as texture or physical data such as brightness, depth or spectroscopic composition of a scene.
28-35. (canceled)
36. The method of claim 4 5, wherein the faults are selected from the group consisting of motion blur, electronic noise, rolling shutter, defocus blur and incorrect image alignment or obstructions.
37. The method of claim 4, wherein the faults are mechanical faults.
38. (canceled)
39. The method of claim 1, wherein the folded Tele camera captures two or more Tele images at two or more respective FOV.sub.T positions within FOV.sub.R, wherein the determining an order of two or more scanning FOV.sub.T positions is performed so that a moving object is removed from a scene included in FOV.sub.R.
40-42. (canceled)
43. The method of claim 7, wherein the determining an order of one or more scanning FOV.sub.T positions includes capturing an object in a Tele image with specific FOV.sub.T to improve RES or SNR of a similar object included in FOV.sub.N but not included in the specific FOV.sub.T.
44-46. (canceled)
47. The method of claim 1, wherein the folded Tele camera is a multi-zoom Tele camera having different zoom states for capturing Tele images having different respective zoom factors (ZF), and wherein the R image is a Tele image having a first ZF (ZF1), wherein the Tele images that are captured consecutively according to the order have a second zoom factor (ZF2), and wherein ZF1≤1.25xZF2.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0052] Non-limiting examples of embodiments disclosed herein are described below with reference to figures attached hereto that are listed following this paragraph. Identical structures, elements or parts that appear in more than one figure are generally labeled with a same numeral in all the figures in which they appear. If identical elements are shown but numbered in only one figure, it is assumed that they have the same number in all figures in which they appear. The drawings and descriptions are meant to illuminate and clarify embodiments disclosed herein and should not be considered limiting in any way. In the drawings:
DETAILED DESCRIPTION
[0079] Returning now to the figures,
[0080] In step 408, a subsequent Tele image is acquired (captured) using the scanning position selected or updated in step 406. For a SIM, the subsequently acquired Tele image is aligned with previously found Tele images that have some shared FOV and with the R image in step 410 to obtain an aligned Tele image. For a SWM, the subsequently acquired Tele image is aligned with the R image in step 410 to obtain an aligned Tele image. The aligned Tele image is analyzed for faults in step 412 and, based on the detected faults, a subsequent scanning position is updated by returning to step 406. Steps 406-412 are repeated until the desired coverage of the R image has been achieved. Afterwards, the SI or SW are composed as described in
[0081] In some embodiments, image composition step 414 may be performed after all the Tele images are acquired and aligned as described above. In other embodiments, image composition step 414 may be performed after each iteration of Tele image acquisition and image alignment steps 406-412, to perform “on the fly” blending with intermediate viable results. In such embodiments, a SI exists after each iteration of steps 406-412.
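The acquisition loop of steps 406-412 can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the callables `capture`, `align` and `has_fault`, the `max_retries` limit, and the choice to skip a position after repeated faults are all hypothetical and introduced here only for the example.

```python
from typing import Callable, List, Sequence, Tuple

Position = Tuple[float, float]  # hypothetical (x, y) scan position within FOV_R


def scan_with_recapture(
    positions: Sequence[Position],
    capture: Callable[[Position], dict],
    align: Callable[[dict], dict],
    has_fault: Callable[[dict], bool],
    max_retries: int = 2,  # assumption: the source does not state a retry limit
) -> List[dict]:
    """Capture, align and fault-check one Tele image per scanning position."""
    aligned_images = []
    for pos in positions:                      # scanning position (step 406)
        for _attempt in range(max_retries + 1):
            tele = capture(pos)                # step 408: acquire Tele image
            candidate = align(tele)            # step 410: align with R image
            if not has_fault(candidate):       # step 412: fault analysis
                aligned_images.append(candidate)
                break
        # If every attempt is faulty, this sketch simply skips the position.
    return aligned_images
```

Composition (step 414) could then run once on the returned list, or be called inside the loop after each accepted image for "on the fly" blending.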
[0082]
[0083] Steps 432-440 describe the process of aligning the T images captured in step 430 with the R image retrieved in step 422. Further details on the image alignment are described in
[0084] In step 442, the R image and the aligned T images are fed into a super-resolution algorithm. Relevant super-resolution algorithms are described for example in Daniel Glasner et al., "Super-Resolution from a Single Image", ICCV, 2009; Tamar Rott Shaham et al., "SinGAN: Learning a Generative Model from a Single Natural Image", ICCV, 2019, arXiv:1905.01164; or Assaf Shocher et al., "Zero-Shot Super-Resolution using Deep Internal Learning", 2017, arXiv:1712.06087.
[0085] A new image having RES.sub.N > RES.sub.R and/or SNR.sub.N > SNR.sub.R is output in step 444. In general, FOV.sub.N is larger than the union of all FOV.sub.TS that are fed into the super-resolution algorithm in step 442, i.e. FOV.sub.N>union-FOV.sub.T. Union-FOV.sub.T represents the FOV within FOV.sub.R which is included in at least one FOV.sub.T of one of the T images captured in step 428.
[0086] The FOV.sub.T scanning may be performed by actuating (e.g. for rotation) one or more optical path folding elements (OPFEs) of the scanning Tele camera. Fast actuation may be desired. Actuation may be performed in 2-20 ms for scanning e.g. 2°-5° and in 10-70 ms for scanning 15°-25°. A scanning Tele camera may have a maximal diagonal scanning range of 60°. “Maximal diagonal scanning range” is defined by the center of the FOV in the maximum state bottom-left of a center FOV and the center of the FOV in the maximum state top-right of a center FOV. For example and referring to FOV diagonal, a scanning T camera having FOV.sub.T=20° and a 60° scanning range covers an overall FOV of 80°. A diagonal scanning range of 40° may cover around 60-100% of a FOV.sub.W. The scanning Tele camera may have an effective focal length (EFL) of 7 mm-40 mm. Typical zoom factors (ZF) may be 2x-10x zoom with respect to a W camera hosted in the same mobile device, meaning that an image of a same object captured at a same distance is projected at a size 2x-10x larger on the image sensor of the T camera than on the W camera. Assuming that the same sensor is used in the R camera and the T camera, the image resolution scales linearly with the ZF. For same sensors, typically, RES.sub.T>2x RES.sub.W. In some examples, RES.sub.T>5x RES.sub.W.
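The coverage arithmetic in the example above can be checked with a short sketch. Because the scanning range is measured between FOV centers, half of FOV.sub.T extends beyond each extreme center position, so the overall FOV is the scanning range plus one FOV.sub.T. The function names are hypothetical, and the linear resolution scaling holds only under the same-sensor assumption stated in the text.

```python
def overall_fov_deg(fov_t_deg: float, scan_range_deg: float) -> float:
    """Overall diagonal FOV covered by a scanning Tele camera.

    The scanning range is a center-to-center angle, so half a Tele FOV
    extends past each extreme center position.
    """
    return scan_range_deg + fov_t_deg


def relative_resolution(zoom_factor: float) -> float:
    """Linear resolution gain of T over W, assuming identical sensors."""
    return zoom_factor


# Example from the text: FOV_T = 20 deg with a 60 deg scanning range.
print(overall_fov_deg(20.0, 60.0))  # 80.0
```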
[0087]
[0088]
[0089]
[0090]
[0091]
[0092] It is noted that determining a scanning order includes determining the respective FOV.sub.T position, meaning that FOV.sub.T positions and their scanning order are determined.
[0093]
[0094] In other embodiments for SIM and SWM, the scanning positions may be determined based on the maximal coverage of an object of interest or ROI as obtained from an algorithm, e.g. from a Saliency map, for example as described in “Salient Object Detection: A Discriminative Regional Feature Integration Approach” by Jiang et al. or as in “You Only Look Once: Unified, Real-Time Object Detection” by Redmon et al. The FOV of a SI or a SW may be selected based on the Saliency map.
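One way to pick a scanning position from a saliency map is to slide a window the size of FOV.sub.T over the map and keep the placement with the highest total saliency. The sketch below assumes the saliency map is already available as a 2D list of non-negative scores in R-image coordinates; the brute-force search and the name `best_fov_position` are illustrative choices, not taken from the source or from the cited papers.

```python
from typing import List, Tuple


def best_fov_position(
    saliency: List[List[float]], win_h: int, win_w: int
) -> Tuple[Tuple[int, int], float]:
    """Return ((top, left), score) of the window with maximal summed saliency."""
    rows, cols = len(saliency), len(saliency[0])
    best_score = float("-inf")
    best_pos = (0, 0)
    for top in range(rows - win_h + 1):
        for left in range(cols - win_w + 1):
            # Sum the saliency covered by a Tele FOV placed at (top, left).
            score = sum(
                saliency[r][c]
                for r in range(top, top + win_h)
                for c in range(left, left + win_w)
            )
            if score > best_score:
                best_score, best_pos = score, (top, left)
    return best_pos, best_score
```

For realistic map sizes a summed-area table would avoid recomputing each window sum, but the selection criterion stays the same.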
[0095] In yet other embodiments for SIM, the scanning positions may be determined such that specific features within an ROI are located in a center region of a FOV.sub.T and not in an overlap region. A specific feature may be for example the face of a person. Locating specific features in a center region may avoid stitching artifacts in the SI's FOV segments where the ROI is located, e.g. artifacts caused by applying “stitching seams” in the FOV covered by the specific feature.
[0096] In yet other embodiments for SIM and SWM, scanning positions may be determined so that a minimal number of T image captures is required for a given selected ROI covering a particular FOV which is larger than FOV.sub.T, e.g. for reducing power consumption and capture time.
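As a rough illustration of the capture count involved, the sketch below computes how many T images a regular grid needs to cover a rectangular ROI. The regular-grid layout, the `overlap` parameter (useful for SIM, where neighbouring T images share some FOV) and the function names are assumptions made for this example; the source does not prescribe a particular tiling.

```python
import math


def captures_per_axis(roi_deg: float, fov_t_deg: float, overlap_deg: float = 0.0) -> int:
    """Number of Tele FOVs needed along one axis to span roi_deg."""
    if overlap_deg >= fov_t_deg:
        raise ValueError("overlap must be smaller than the Tele FOV")
    if roi_deg <= fov_t_deg:
        return 1
    step = fov_t_deg - overlap_deg  # new angle gained by each extra capture
    return math.ceil((roi_deg - fov_t_deg) / step) + 1


def min_grid_captures(
    roi_w: float, roi_h: float, fov_w: float, fov_h: float, overlap: float = 0.0
) -> int:
    """Total captures for a regular grid covering a roi_w x roi_h ROI."""
    return captures_per_axis(roi_w, fov_w, overlap) * captures_per_axis(roi_h, fov_h, overlap)
```

For instance, a 50°x30° ROI and a 20°x15° FOV.sub.T give a 3x2 grid without overlap and a 3x3 grid with a 2° overlap.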
[0097] In yet other embodiments for SIM and SWM, a criterion for determining an order of scanning positions may be based on artistic or visual effects, e.g. a desired amount of natural Bokeh. The amount of natural Bokeh depends on differences in the object-lens distance of foreground objects (in-focus) and background objects (out-of-focus). A scanning position criterion may e.g. be an image background with uniform natural Bokeh.
[0098] In yet other embodiments for SIM and SWM, a criterion for determining an order of scanning positions may be based on desired data for computational photography. Such data may be for example stereo image data including T image data and image data from the R image. From stereo image data of a single FOV.sub.T and the overlapping image FOV segment of the FOV.sub.R, a stereo depth map covering FOV.sub.T may be calculated as known in the art, e.g. by triangulation. The stereo depth map may enable application of artificial Bokeh algorithms to the R image or to the SI. In some embodiments, the SI output in step 414 may not be an image including visual data, but an output that includes stereo depth data.
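The triangulation mentioned above reduces, for an idealized stereo pair, to depth = focal length x baseline / disparity. The sketch below assumes that the T image and the overlapping R-image segment have already been rectified and rescaled to a common focal length in pixels, which is a simplification since the two cameras have different native focal lengths. All names and the example values are hypothetical.

```python
from typing import List


def depth_from_disparity(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Pinhole stereo triangulation: Z = f * B / d (metres)."""
    if disparity_px <= 0:
        return float("inf")  # no measurable parallax: treat as very far away
    return focal_px * baseline_m / disparity_px


def depth_map(
    disparities: List[List[float]], focal_px: float, baseline_m: float
) -> List[List[float]]:
    """Convert a per-pixel disparity map covering FOV_T into a depth map."""
    return [
        [depth_from_disparity(d, focal_px, baseline_m) for d in row]
        for row in disparities
    ]
```

With an assumed focal length of 1000 px and a 10 mm baseline, a disparity of 5 px corresponds to a depth of about 2 m.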
[0099] In other embodiments, a scanning order criterion may include desired artistic SI effects. Such effects may be created by synchronizing T image capture and FOV scanning, wherein capture happens during FOV movement, so that a motion blur effect in the T image is achieved. For this, a scanning order criterion may be a desired amount of motion blur of a specific scene segment.
[0100] In yet other embodiments for SIM and SWM, a criterion for scanning position determination may be based on a depth estimation of the scene included in the R image. For example, one may select scanning positions so that single T images include scene segments having a specific depth range (i.e. a specific camera-object distance range) or include scene segments that do not exceed a specific depth threshold. In another example, one may select scanning positions so that single T images include ROIs covering a particular FOV size. As an example, a scanning order criterion may be to capture scene segments having similar depths or including ROIs of particular FOV sizes consecutively. This may be beneficial for a scanning camera that has not one fixed FOV (i.e. zoom state) but several different FOVs (zoom states). For fast SI or SW capture, one may prefer to capture FOV segments with identical zoom states consecutively (sequentially), as it may e.g. be desired to minimize the number of (time-consuming) zoom state switches. As another example, a scanning order criterion may be to capture scene segments having similar depths consecutively, because this may minimize the amount of time required for re-focusing the T camera between single T image captures and may also facilitate the alignment of the T images.
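A simple ordering that follows both criteria is to group scene segments by zoom state and then sort each group by estimated depth. The sketch below assumes each segment already carries a zoom state and a depth estimate from the R image analysis; the `Segment` type, its field names and the lexicographic sort are hypothetical illustrations rather than the disclosed method.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    name: str
    zoom_state: int   # hypothetical index of the zoom state used for this segment
    depth_m: float    # estimated camera-object distance from the R image


def order_by_zoom_then_depth(segments: List[Segment]) -> List[Segment]:
    """Capture identical zoom states consecutively, similar depths adjacent."""
    return sorted(segments, key=lambda s: (s.zoom_state, s.depth_m))


def count_zoom_switches(ordered: List[Segment]) -> int:
    """Number of zoom state changes between consecutive captures."""
    return sum(1 for a, b in zip(ordered, ordered[1:]) if a.zoom_state != b.zoom_state)
```

Sorting by depth within a zoom state keeps the focus position of consecutive captures close, which is the re-focusing benefit the paragraph describes.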
[0101] In yet another embodiment for SIM and SWM, a scanning order criterion may be that T images comprising specific scene characteristics within their respective FOV.sub.TS may be captured consecutively. In some embodiments, T images with similar scene characteristics within their respective FOV.sub.TS may be captured consecutively. Scene characteristics may be visual data such as texture. Scene characteristics may be physical data such as brightness, depth or spectroscopic composition of a scene. A spectroscopic composition may be defined by the intensity values of all wavelengths present in the scene.
[0102]
[0103] One can determine the order of capturing the T images such that the moving object will not appear in the scene at all, as illustrated in
[0104] The T scanning order (i.e. the scanning order criteria) may alternatively be based on camera or scene properties. In some embodiments, a scanning order criterion may be based on fast SI capture. In some embodiments, the SI output in step 414 or the SW output in step 444 may not be an image including visual data, but it may be an output including spectroscopic data, stereo depth data or other image data that is generated by computational photography or physical analysis.
[0105] In some embodiments, a plurality of sub-SIs that form a single SI may be captured in the FOV of a R image simultaneously, i.e. in a single capture process as described in
[0106]
[0107] In contrast with SIM, in a SWM one need not necessarily capture a T image having a FOV.sub.T that includes a given FOV.sub.R segment in order to increase RES or SNR in that segment. It may be sufficient to capture a T image that includes similar features present in the same scene. As an example and with reference to
[0108] Furthermore, for SWM the T images need not necessarily be aligned with each other, but only with the R image. Therefore, the captured T images need not necessarily include an overlapping FOV, which is required for SIM.
[0109] There are several options for determining a T scanning order, as follows.
[0110]
[0111] In another example, a T scanning order is determined so that a desired coverage of FOV.sub.R with a plurality of FOV.sub.TS is performed in a fastest manner.
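Since actuation time grows with the scanned angle, one plausible heuristic for a fast scan is to always move to the nearest position not yet captured. The sketch below is such a greedy nearest-neighbour ordering; it is an assumption chosen for illustration, it does not guarantee the shortest possible path, and the source does not state which algorithm is used.

```python
import math
from typing import List, Sequence, Tuple

Position = Tuple[float, float]  # hypothetical (x, y) FOV_T center in degrees


def greedy_scan_order(positions: Sequence[Position], start: Position) -> List[Position]:
    """Order positions by repeatedly visiting the nearest remaining one."""
    remaining = list(positions)
    order: List[Position] = []
    current = start
    while remaining:
        nxt = min(remaining, key=lambda p: math.dist(current, p))
        remaining.remove(nxt)
        order.append(nxt)
        current = nxt
    return order


def total_travel_deg(order: Sequence[Position], start: Position) -> float:
    """Sum of angular moves along the scan, a proxy for total actuation time."""
    path = [start, *order]
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))
```

The travel total can be used to compare candidate orders, for example against a row-by-row serpentine scan over the same positions.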
[0112] In yet another example and for a Tele camera which is a multi-zoom camera, a T scanning order is determined so that a desired coverage of FOV.sub.R with a desired zoom factor (ZF) is performed in a fastest manner. A user or an algorithm may select the desired ZF. One criterion for selecting the ZF may be a desired ratio of RES.sub.T/RES.sub.R and/or of SNR.sub.T/SNR.sub.R, another criterion may be a desired FOV.sub.T. In some embodiments, the R image may be a Tele image which is captured with a first ZF (ZF1) and the Tele images that are captured consecutively according to the order have a second ZF (ZF2), wherein ZF1<ZF2, for example ZF1≤1.1xZF2, ZF1≤1.25xZF2, ZF1≤2xZF2.
[0113] In yet another example and for a Tele camera which is a multi-zoom camera, a T scanning order is determined so that Tele images with a same ZF are captured consecutively. For example, first all Tele images with a particular first ZF (ZF1) are captured, and afterwards all Tele images with a particular second ZF (ZF2) are captured.
[0114]
[0115]
[0116] Some reasons may be related to scene characteristics that were not identified in the R image analysis. Consider for example a bright oscillating light source in FOV.sub.N. The light source may have been "Off" when the R image was captured, but it may have been "On" when the respective T image was captured, causing large differences in the T camera parameters deployed for this T image in contrast to prior or consecutive T images. In such a scenario re-capturing the T image with the light source "Off" may be desired.
[0117] An additional fault reason may relate to mechanical faults, e.g. the OPFE did not reach the desired location accurately, and therefore issues in the alignment of the image may occur and the image needs to be recaptured.
[0118]
[0119] The influence of color correction step 1506 on the SI is shown in
[0120]
[0121]
[0122] Mobile device 1700 may further comprise a R (e.g. W or UW) camera module 1730 with a FOV larger than the FOV of camera module 1710. Camera module 1730 includes a second lens module 1732 that forms an image recorded by a second image sensor 1734. A second lens actuator 1736 may move lens module 1732 for focusing and/or OIS.
[0123] In some embodiments, first calibration data may be stored in a first memory 1722 of a camera module, e.g. in an EEPROM (electrically erasable programmable read only memory). In other embodiments, first calibration data may be stored in a third memory 1750 such as a NVM (nonvolatile memory) of mobile device 1700. The first calibration data may comprise calibration data for calibration between sensors of R camera module 1730 and of T camera module 1710. In some embodiments, second calibration data may be stored in a second memory 1738. In some embodiments, the second calibration data may be stored in third memory 1750. The second calibration data may comprise calibration data between sensors of R camera module 1730 and T camera module 1710.
[0124] Mobile device 1700 may further comprise an application processor (AP) 1740. In use, AP 1740 may receive respective first and second (reference) image data from camera modules 1710 and 1730 and supply camera control signals to camera modules 1710 and 1730. In some embodiments, AP 1740 may receive first image data from camera module 1710 and R image data from third memory 1750. In other embodiments, AP 1740 may receive calibration data stored in a first memory located on camera module 1710 and in a second memory located in camera module 1730. In yet another embodiment, AP 1740 may receive R image data stored in third memory 1750. In yet another embodiment, AP 1740 may retrieve R images from an external database. AP 1740 includes an image analyzer 1742 for analyzing R images (e.g. for scene understanding and defining a Tele scanning order) and T images (e.g. for fault detection), a FOV scanner 1744 that calculates an OPFE control signal (e.g. for implementing a Tele scanning order) and an image generator 1744 for composing new images as outlined in steps 402-414 and in steps 1502-1510 (for SIM) and in steps 422-444 and in steps 1522-1528 (for SWM).
[0125] While this disclosure has been described in terms of certain embodiments and generally associated methods, alterations and permutations of the embodiments and methods will be apparent to those skilled in the art. The disclosure is to be understood as not limited by the specific embodiments described herein.
[0126] All references mentioned in this application are incorporated herein by reference in their entirety. It is emphasized that citation or identification of any reference in this application shall not be construed as an admission that such a reference is available or admitted as prior art.