DYNAMIC TARGETING OF PREFERRED OBJECTS IN VIDEO STREAM OF SMARTPHONE CAMERA
20220269396 · 2022-08-25
Inventors
- Alexander Pashintsev (Cupertino, CA, US)
- Boris Gorbatov (Sunnyvale, CA, US)
- Eugene Livshitz (San Mateo, CA, US)
- Vitaly Glazkov (Moscow, RU)
CPC classification
G06F1/1694
PHYSICS
G06F3/04842
PHYSICS
G06F3/017
PHYSICS
G06F3/002
PHYSICS
H04M1/72454
ELECTRICITY
G06F1/1626
PHYSICS
International classification
G06F3/04842
PHYSICS
Abstract
Selecting objects in a video stream of a smart phone includes detecting quiescence of frame content in the video stream, detecting objects in a scene corresponding to the frame content, presenting at least one of the objects to a user of the smart phone, and selecting at least one of the objects in a group of objects in response to input by the user. Detecting quiescence of frame content in the video stream may include using motion sensors in the smart phone to determine an amount of movement of the smart phone. Detecting quiescence of frame content in the video stream may include detecting changes in view angles and distances of the smart phone with respect to the scene. Detecting objects in a scene may use heuristics, custom user preferences, and/or specifics of scene layout. At least one of the objects may be a person or a document.
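As an illustrative sketch (not part of the original disclosure), the quiescence detection described in the abstract could be implemented by watching recent motion-sensor magnitudes from the smartphone; the threshold and window size below are assumed values, not ones specified by the application:

```python
from collections import deque


class QuiescenceDetector:
    """Declares the frame content quiescent when recent motion-sensor
    readings stay below a threshold for a full observation window."""

    def __init__(self, threshold=0.05, window=30):
        self.threshold = threshold           # max tolerated motion magnitude
        self.samples = deque(maxlen=window)  # most recent sensor magnitudes

    def add_sample(self, magnitude):
        """Record one accelerometer/gyroscope magnitude sample."""
        self.samples.append(magnitude)

    def is_quiescent(self):
        # Quiescent only once the window is full and every sample is small.
        return (len(self.samples) == self.samples.maxlen
                and all(m < self.threshold for m in self.samples))
```

A real implementation would feed this from the device's inertial sensors (and could additionally compare view angles and distances between frames, as the abstract suggests).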
Claims
1. A method of capturing a subset of objects within a video stream captured by an electronic device, the method comprising: receiving a video stream captured by an electronic device; detecting within a frame of the video stream one or more objects for capture; determining a plurality of scenarios based on the one or more objects, wherein each scenario of the plurality of scenarios is a distinct subset of the one or more objects; displaying, via the electronic device, the frame of the video stream in conjunction with a first scenario of the plurality of scenarios; responsive to a first user input rejecting the first scenario of the plurality of scenarios, displaying, via the electronic device, the frame of the video stream in conjunction with a second scenario of the plurality of scenarios; and responsive to a second user input selecting the second scenario of the plurality of scenarios, extracting a respective subset of the one or more objects corresponding to the second scenario from the frame of the video stream.
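The accept/reject loop of claim 1 can be summarized as a sketch (illustrative only; scenario contents and the `user_response` callback are assumptions, not the claimed implementation):

```python
def select_scenario(scenarios, user_response):
    """Walks the scenario list in order; user_response(scenario) returns
    True to accept or False to reject, mirroring the first and second
    user inputs of claim 1. Returns the accepted scenario, or None."""
    for scenario in scenarios:
        if user_response(scenario):
            return scenario  # extract this subset of detected objects
    return None              # user rejected every scenario


# Example: each scenario is a distinct subset of the detected objects.
scenarios = [{"person"}, {"person", "document"}, {"document"}]
chosen = select_scenario(scenarios, lambda s: "document" in s)
```

In the claimed method the rejected scenario's frame overlay would be replaced by the next scenario's overlay before the next user input is collected.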
2. The method of claim 1, further comprising: after determining the plurality of scenarios, pre-selecting the first scenario of the plurality of scenarios to be displayed, via the electronic device, based on a third user input.
3. The method of claim 2, wherein the third user input includes one or more of a change in a view angle and a change in a distance of the electronic device with respect to the one or more objects.
4. The method of claim 1, wherein: displaying the frame of the video stream in conjunction with the first scenario of the plurality of scenarios includes displaying the first scenario with an overlay highlighting a respective subset of the one or more objects corresponding to the first scenario; and displaying the frame of the video stream in conjunction with the second scenario of the plurality of scenarios includes displaying the second scenario with an overlay highlighting a respective subset of the one or more objects corresponding to the second scenario.
5. The method of claim 1, wherein the one or more objects include one or more of a person and a document.
6. The method of claim 1, further comprising: after extracting the respective subset of the one or more objects corresponding to the second scenario from the frame of the video stream, displaying, via the electronic device, the respective subset of the one or more objects corresponding to the second scenario.
7. The method of claim 1, further comprising: after extracting the respective subset of the one or more objects corresponding to the second scenario from the frame of the video stream, displaying, via the electronic device, one or more affordances including: a first affordance that allows the user to store the respective subset of the one or more objects corresponding to the second scenario, and a second affordance that allows the user to share the respective subset of the one or more objects corresponding to the second scenario.
8. The method of claim 1, wherein detecting within the frame of the video stream the one or more objects for capture includes one or more of determining one or more objects in focus, determining one or more objects with a predetermined distance relative to the electronic device, and determining one or more unobstructed objects.
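A minimal sketch of the candidate filtering in claim 8, assuming hypothetical per-object fields for focus, distance, and obstruction (the field names and distance limit are illustrative):

```python
def detect_capture_candidates(objects, max_distance=2.0):
    """Keeps objects that are in focus, within a preset distance of the
    electronic device, and unobstructed, per claim 8's three criteria."""
    return [obj for obj in objects
            if obj["in_focus"]
            and obj["distance_m"] <= max_distance
            and not obj["obstructed"]]
```

Any one of the three tests could be used alone, since the claim recites "one or more of" the criteria.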
9. The method of claim 1, wherein the first user input includes one or more of selection of a rejection affordance displayed on the electronic device and a rejection gesture including shaking the electronic device left-and-right.
10. The method of claim 1, wherein the second user input includes one or more of selection of an approval affordance displayed on the electronic device, allowing a predetermined amount of time to elapse without moving the electronic device, eye-tracking, spatial gestures, and facial expressions.
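One of the approval inputs in claim 10, letting a predetermined amount of time elapse without moving the device, amounts to a dwell timer. A hedged sketch (the clock injection and two-second hold are assumptions for testability, not claimed parameters):

```python
import time


class DwellAcceptor:
    """Treats the displayed scenario as accepted once `hold_seconds`
    elapse without device movement, per one option in claim 10."""

    def __init__(self, hold_seconds=2.0, now=time.monotonic):
        self.hold = hold_seconds
        self.now = now
        self.still_since = None  # timestamp when the device last became still

    def update(self, moving):
        """Call per sensor tick with the current movement flag; returns
        True once the device has been still for the full hold period."""
        t = self.now()
        if moving:
            self.still_since = None  # movement resets the dwell timer
            return False
        if self.still_since is None:
            self.still_since = t
        return t - self.still_since >= self.hold
```

The same update loop could be fed by the quiescence test used for scene analysis, so acceptance and stillness detection share one motion signal.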
11. An electronic device, comprising: one or more processors; and memory storing one or more programs for execution by the one or more processors, the one or more programs including instructions for: receiving a video stream captured by the electronic device; detecting within a frame of the video stream one or more objects for capture; determining a plurality of scenarios based on the one or more objects, wherein each scenario of the plurality of scenarios is a distinct subset of the one or more objects; displaying, via the electronic device, the frame of the video stream in conjunction with a first scenario of the plurality of scenarios; responsive to a first user input rejecting the first scenario of the plurality of scenarios, displaying, via the electronic device, the frame of the video stream in conjunction with a second scenario of the plurality of scenarios; and responsive to a second user input selecting the second scenario of the plurality of scenarios, extracting a respective subset of the one or more objects corresponding to the second scenario from the frame of the video stream.
12. The electronic device of claim 11, wherein the one or more programs further include instructions for: after determining the plurality of scenarios, pre-selecting the first scenario of the plurality of scenarios to be displayed, via the electronic device, based on a third user input.
13. The electronic device of claim 12, wherein the third user input includes one or more of a change in a view angle and a change in a distance of the electronic device with respect to the one or more objects.
14. The electronic device of claim 11, wherein: displaying the frame of the video stream in conjunction with the first scenario of the plurality of scenarios includes displaying the first scenario with an overlay highlighting a respective subset of the one or more objects corresponding to the first scenario; and displaying the frame of the video stream in conjunction with the second scenario of the plurality of scenarios includes displaying the second scenario with an overlay highlighting a respective subset of the one or more objects corresponding to the second scenario.
15. The electronic device of claim 11, wherein the one or more objects include one or more of a person and a document.
16. A non-transitory computer-readable storage medium storing one or more programs for execution by one or more processors of an electronic device, the one or more programs comprising instructions for: receiving a video stream captured by the electronic device; detecting within a frame of the video stream one or more objects for capture; determining a plurality of scenarios based on the one or more objects, wherein each scenario of the plurality of scenarios is a distinct subset of the one or more objects; displaying, via the electronic device, the frame of the video stream in conjunction with a first scenario of the plurality of scenarios; responsive to a first user input rejecting the first scenario of the plurality of scenarios, displaying, via the electronic device, the frame of the video stream in conjunction with a second scenario of the plurality of scenarios; and responsive to a second user input selecting the second scenario of the plurality of scenarios, extracting a respective subset of the one or more objects corresponding to the second scenario from the frame of the video stream.
17. The non-transitory computer-readable storage medium of claim 16, wherein the one or more programs further include instructions for: after determining the plurality of scenarios, pre-selecting the first scenario of the plurality of scenarios to be displayed, via the electronic device, based on a third user input.
18. The non-transitory computer-readable storage medium of claim 17, wherein the third user input includes one or more of a change in a view angle and a change in a distance of the electronic device with respect to the one or more objects.
19. The non-transitory computer-readable storage medium of claim 16, wherein: displaying the frame of the video stream in conjunction with the first scenario of the plurality of scenarios includes displaying the first scenario with an overlay highlighting a respective subset of the one or more objects corresponding to the first scenario; and displaying the frame of the video stream in conjunction with the second scenario of the plurality of scenarios includes displaying the second scenario with an overlay highlighting a respective subset of the one or more objects corresponding to the second scenario.
20. The non-transitory computer-readable storage medium of claim 16, wherein the one or more objects include one or more of a person and a document.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0026] Embodiments of the system described herein will now be explained in more detail in accordance with the figures of the drawings, which are briefly described as follows.
[0027]
[0028]
[0029]
[0030]
[0031]
DETAILED DESCRIPTION OF VARIOUS EMBODIMENTS
[0032] The system described herein provides a mechanism for identifying preferred objects in frames of a preview video stream of a smartphone camera, building possible object selection scenarios, providing a user with choice options and tools, and creating photographs of the chosen objects or their combinations for subsequent use.
[0033]
[0034]
[0035]
[0036] In contrast with
[0037]
[0038] An original position of the smartphone 110 with the camera 120 indicates the frame 340 pre-selected by the user according to
[0039] Referring to
[0040] If it is determined at the test step 515 that the change in position and view angle (if applicable) of the device is not occurring rapidly, processing proceeds from the step 515 to a step 525, where the system enters a scene analysis mode. After the step 525, processing proceeds to a step 530, where the system selects a frame from the video stream for processing. After the step 530, processing proceeds to a step 535, where the system detects preferred object candidates in the scene. After the step 535, processing proceeds to a test step 540, where it is determined whether a set of candidates for preferred objects is stable over a period of time (time-based sequencing for scene analysis, described elsewhere herein, is not shown in
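The stability check at step 540 can be sketched as counting consecutive frames in which the detected candidate set does not change (an illustrative reading of the paragraph; the required repeat count is an assumed parameter):

```python
class StabilityTracker:
    """Counts consecutive frames whose detected candidate set is
    unchanged; the set is considered stable after `required` repeats."""

    def __init__(self, required=5):
        self.required = required
        self.last = None   # candidate set seen in the previous frame
        self.count = 0     # consecutive frames with an identical set

    def update(self, candidates):
        """Feed one frame's candidate set; returns True when stable."""
        candidates = frozenset(candidates)
        self.count = self.count + 1 if candidates == self.last else 1
        self.last = candidates
        return self.count >= self.required
```

Once `update` returns True, processing would continue to scenario construction; any change in the candidate set restarts the count.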
[0041] After the step 555, processing proceeds to a step 560, where a first scenario is selected and a corresponding pictogram for displaying to the user is built, as illustrated, for example, by items 420a-420e in
[0042] Various embodiments discussed herein may be combined with each other in appropriate combinations in connection with the system described herein. Additionally, in some instances, the order of steps in the flowcharts, flow diagrams and/or described flow processing may be modified, where appropriate. Similarly, elements and areas of screens described in screen layouts may vary from the illustrations presented herein. Further, various aspects of the system described herein may be implemented using software, hardware, a combination of software and hardware and/or other computer-implemented modules or devices having the described features and performing the described functions. The smartphone may include software that is pre-loaded with the device, installed from an app store, installed from a desktop (after possibly being pre-loaded thereon), installed from media such as a CD, DVD, etc., and/or downloaded from a Web site. The smartphone 110 may use an operating system selected from the group consisting of: iOS, Android OS, Windows Phone OS, Blackberry OS and mobile versions of Linux OS. The smartphone 110 may be connected by various types of wireless and other connections, such as cellular connections in Wide Area Networks, Wi-Fi, Bluetooth, NFC, USB, infrared, ultrasound and other types of connections. A mobile device other than a smartphone may be used. Note that the system described herein may be used with other devices capable of taking a photograph and providing appropriate feedback to a user, such as a wireless digital camera with a screen for providing messages to the user and a mechanism for providing an intermediate image stream.
[0043] Software implementations of the system described herein may include executable code that is stored in a computer readable medium and executed by one or more processors. The computer readable medium may be non-transitory and include a computer hard drive, ROM, RAM, flash memory, portable computer storage media such as a CD-ROM, a DVD-ROM, a flash drive, an SD card and/or other drive with, for example, a universal serial bus (USB) interface, and/or any other appropriate tangible or non-transitory computer readable medium or computer memory on which executable code may be stored and executed by a processor. The software may be bundled (pre-loaded), installed from an app store or downloaded from a location of a network operator. The system described herein may be used in connection with any appropriate operating system.
[0044] Other embodiments of the invention will be apparent to those skilled in the art from a consideration of the specification or practice of the invention disclosed herein. It is intended that the specification and examples be considered as exemplary only, with the true scope and spirit of the invention being indicated by the following claims.