SYSTEM AND METHOD FOR HUMAN INTERACTION WITH VIRTUAL OBJECTS
20210389818 · 2021-12-16
Assignee
Inventors
CPC classification
G06F3/0393
PHYSICS
G06F3/041
PHYSICS
G06F3/04815
PHYSICS
G06F3/0488
PHYSICS
G06F3/011
PHYSICS
G06F3/0421
PHYSICS
G01S5/22
PHYSICS
G06F2203/04106
PHYSICS
G01S5/30
PHYSICS
International classification
G01S5/22
PHYSICS
G06F3/041
PHYSICS
G06F3/0481
PHYSICS
Abstract
A system for human interaction with virtual objects comprises: a touch sensitive surface, configured to detect a position of a contact made on the touch sensitive surface; a reference layer rigidly attached to the touch sensitive surface and comprising one or more patterns; a display device, configured to display a virtual object that is registered in a reference coordinate fixed with respect to the touch sensitive surface; one or more image sensors rigidly attached to the display device, configured to capture an image of at least a portion of the one or more patterns; and at least one processor, configured to determine a position and an orientation of the display device with respect to the touch sensitive surface based on the captured image, and identify an interaction with the virtual object based on the detected position of the contact made on the touch sensitive surface.
Claims
1.-26. (canceled)
27. A system for human interaction with virtual objects, comprising: a touch sensitive surface, configured to detect a position of a contact made on the touch sensitive surface; a reference layer rigidly attached to the touch sensitive surface, wherein the reference layer comprises: a fiducial layer comprising one or more patterns; and a planar light source layer positioned adjacent the fiducial layer and configured to illuminate the one or more patterns of the fiducial layer; a wearable display device, configured to display a virtual object that is registered in a reference coordinate fixed with respect to the touch sensitive surface; one or more image sensors rigidly attached to the display device, configured to capture an image of at least a portion of the one or more patterns in the reference layer; and at least one processor, configured to: determine a position and an orientation of the display device relative to the touch sensitive surface based, at least in part, on image data associated with the one or more patterns found in the captured image; access touch data from the touch sensitive surface, the touch data comprising a detected position of contact made on the touch sensitive surface; cause the virtual object to be displayed by the wearable display device; generate a manipulation with the virtual object based on the detected position of the contact made on the touch sensitive surface; and cause the manipulation with the virtual object to be displayed by the wearable display device.
28. The system of claim 27, wherein the one or more patterns comprises light transmissive portions and light blocking portions.
29. The system of claim 27, wherein the one or more patterns comprises a light absorbing portion.
30. The system of claim 27, wherein the planar light source comprises a light guide plate configured to reflect light towards the fiducial layer.
31. The system of claim 27, wherein the planar light source layer is further configured to receive light emitted by one or more light sources from at least one lateral side of the planar light source layer and direct at least a portion of the light to the fiducial layer adjacent the planar light source layer.
32. The system of claim 27, wherein the virtual object is a two-dimensional virtual object.
33. The system of claim 27, wherein the virtual object is a three-dimensional virtual object.
34. The system of claim 27, wherein the display device is a see-through display device.
35. The system of claim 27, wherein the one or more patterns comprise one or a plurality of fiducial markers.
36. The system of claim 35, wherein the one or plurality of fiducial markers are configured to absorb infrared light, and the one or more image sensors are configured to sense infrared light.
37. The system of claim 35, where each of the one or plurality of fiducial markers comprises a rectangle containing an internal grid representation of binary codes.
38. The system of claim 35, wherein each of the one or plurality of fiducial markers comprises a plurality of image features with known positions, wherein each of the image features corresponds to a unique feature descriptor.
39. The system of claim 27, wherein the planar light source layer is positioned underneath the fiducial layer and wherein the light transmissive portions and the light blocking portions of the fiducial layer are configured as a mask.
40. The system of claim 27, wherein the planar light source layer is positioned between the touch sensitive surface and the fiducial layer.
41. A method for human interaction with virtual objects, the method comprising: detecting a position of a contact made on a touch sensitive surface; displaying, via a display device, a virtual object that is registered in a reference coordinate fixed with respect to the touch sensitive surface; illuminating, via a planar light source, one or more patterns of a fiducial layer onto the touch sensitive surface; capturing, with one or more image sensors rigidly attached to the display device, an image of at least a portion of the one or more patterns; determining a position and an orientation of the display device with respect to the touch sensitive surface based on the captured image; accessing touch data from the touch sensitive surface, the touch data comprising a detected position of contact made on the touch sensitive surface; causing the virtual object to be displayed by the display device; generating a manipulation with the virtual object based on the detected position of the contact; and causing the manipulation with the virtual object to be displayed by the display device.
42. The method of claim 41, wherein the one or more patterns of the fiducial layer comprises a light transmissive portion and a light blocking portion, and wherein the method further comprises: transmitting, via the one or more patterns of the fiducial layer, light through the light transmissive portion; and blocking, via the one or more patterns of the fiducial layer, light via the light blocking portion.
43. The method of claim 41, wherein the one or more patterns of the fiducial layer comprises a light absorbing portion and wherein the method further comprises absorbing, via the one or more patterns of the fiducial layer, light via the light absorbing portion.
44. The method of claim 41, wherein the one or more patterns comprises a fiducial marker.
45. The method of claim 41, wherein the one or more patterns comprises a light source with a known position.
46. The method of claim 41, wherein the one or more patterns comprises a mask and a light source, wherein at least a portion of light emitted from the light source and passing through the mask is captured by the one or more image sensors.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0019]
[0020]
[0021]
[0022]
[0023]
[0024]
[0025]
[0026]
[0027]
[0028]
DETAILED DESCRIPTION
[0029]
[0030] In some embodiments, the displaying device 105 may be a see-through displaying device through which the viewer can perceive computer generated virtual contents as well as the real world. As a result, the system 100 can be used in an augmented reality application. In some embodiments, the displaying device 105 may be opaque such that it may block the light from the real world and display only the computer-generated virtual contents. As a result, the system 100 can be used in a virtual reality application. In some embodiments, the displaying device 105 may be a head-worn device which is placed in front of the viewer's eye(s) 109 when in use.
[0031] In some embodiments, the interaction surface 103 is able to detect and report precise locations of touch events on the surface, wherein the touch events may be generated by contacts made between an object and the surface, wherein the object may be a finger or a hand of the user, or a stylus, etc. In some embodiments, the touch input device 101 may further detect and report a shape of a contact area of the touch event. In some embodiments, the touch input device 101 may further detect and report a force distribution over the contact area of the touch event.
[0032] In some embodiments, the reference layer 102 is perceivable by the pose tracking device 106 to determine a position and orientation of the touch input device 101 with respect to the pose tracking device 106. In some embodiments, the reference layer 102 may be a layer of fiducial patterns which may contain a predetermined set of points, lines, or shapes. In some embodiments, the reference layer 102 may comprise a layer of light emitting diodes arranged in a predetermined pattern.
[0033] In some embodiments, the interaction surface 103 may comprise a tactile sensor that precisely measures a position of a contact between the sensor and a finger or an object. In some embodiments, the interaction surface 103 may be fully or semi-transparent. As a result, the reference layer 102 may be disposed below the interaction surface 103 while still being perceivable by the pose tracking device 106. In some embodiments, the interaction surface 103 may be opaque. As a result, the reference layer 102 may be disposed on top of the interaction surface 103 or attached to one or more side(s) of the interaction surface 103. In some embodiments, the tactile sensor may further measure an area and/or a force distribution of a contact. In some embodiments, the tactile sensor described above may be resistive sensing or capacitive sensing. The tactile sensor may be any type that one skilled in the art recognizes as suitable for performing the functionalities described herein.
[0034] As described above, the touch input device 101 may comprise a reference layer 102. In some embodiments, the reference layer 102 may comprise a predetermined set of fiducial patterns, wherein the fiducial patterns comprise a predetermined combination of features including shapes, lines, and points, wherein the sizes, positions, or orientations of such features are known. As a result, when a portion or the entirety of the fiducial pattern is captured by one or more imaging sensor(s), the position and orientation of the pattern can be determined. In some embodiments, the fiducial patterns may be printed or etched, e.g., with material that absorbs visible light and/or infrared light, on a layer of supporting substrate. In some embodiments, the fiducial patterns may be created by applying an opaque mask, with portions of it cut out, over a diffused illumination source. The fiducial patterns can be fabricated in many forms that one skilled in the art recognizes as suitable for performing the functionalities described herein.
[0035]
[0036] In some embodiments, the reference layer may comprise a plurality of light sources, such as light emitting diodes (LEDs), arranged in a predetermined pattern, wherein the position of each LED is known. As a result, when a portion or all of the LEDs are captured by one or more imaging sensor(s), the position and orientation of the patterns can be determined. In some embodiments, the LEDs are lit up sequentially such that in each frame only one or a few LED(s) are captured by the imaging sensor(s). If all LEDs were lit at the same time, multiple LEDs might share similar characteristics in a captured image frame, which may cause ambiguity when the correspondence between each of the observed LEDs and each of the known positions needs to be established. The ambiguity may therefore be resolved by turning on the LEDs in a predetermined sequence. At the same time, because not all LEDs are required to be on at all times for the pose tracking device to determine the position and orientation, sequentially lighting up the LEDs may save battery power.
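The sequential lighting scheme described above can be illustrated with a minimal Python sketch. It assumes, beyond what the description states, that exactly one LED is lit per frame in a round-robin order, that the image sensor and the LED driver share a frame counter, and that a blob detector has already reduced each frame to at most one image point. The names LED_POSITIONS, led_for_frame, and label_observations are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Point2D = Tuple[float, float]

# Hypothetical layout: LED index -> known (x, y) position on the
# reference layer, in millimetres. These values are illustrative only.
LED_POSITIONS: Dict[int, Point2D] = {
    0: (0.0, 0.0),
    1: (100.0, 0.0),
    2: (100.0, 60.0),
    3: (0.0, 60.0),
}


def led_for_frame(frame_index: int, num_leds: int) -> int:
    """Return the index of the single LED lit during this frame.

    Assumes a round-robin schedule synchronized with the image sensor.
    """
    return frame_index % num_leds


def label_observations(
    blobs_by_frame: Dict[int, Optional[Point2D]],
) -> List[Tuple[Point2D, Point2D]]:
    """Pair each detected image blob with the known position of the LED
    that was lit in that frame. Frames with no detection are skipped."""
    pairs: List[Tuple[Point2D, Point2D]] = []
    for frame_index in sorted(blobs_by_frame):
        blob = blobs_by_frame[frame_index]
        if blob is None:
            continue  # LED was occluded or outside the field of view
        led = led_for_frame(frame_index, len(LED_POSITIONS))
        pairs.append((blob, LED_POSITIONS[led]))
    return pairs
```

Because the frame index alone identifies the lit LED, no appearance-based matching is needed; the resulting image-to-layout pairs could then be passed to a pose estimator.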
[0037] In some embodiments, the pose tracking device 106 may comprise a single or a plurality of image sensors. In some embodiments, each image sensor may further contain a filter that allows only light within a predetermined range of wavelengths (e.g., infrared light) to pass through while attenuating the intensity of light with other wavelengths (e.g., visible light), wherein the predetermined range may be dependent on the wavelength of the light reflected or emitted by the patterns described above. As a result, the patterns may be clearly captured by the image sensor(s) while other features in the field of view of the sensor(s) may be partially or completely invisible to the sensor(s). In some embodiments, the pose tracking device 106 may further include an illumination device, wherein the illumination device may comprise a single or a plurality of light emitting diodes.
[0038] The computing unit 107 may comprise one or more processor(s). Although in the example shown in
[0039] For example, a virtual object may be enlarged upon being interacted with, and textual information associated with the particular object may be optionally shown. As another example, a virtual object may be dragged from a first location to a second location by a touch interaction.
[0040] In one embodiment, the fiducial patterns may comprise a plurality of square-based fiducial markers, each of which contains an external border and an internal grid representation of binary codes. An example of such a fiducial marker is shown in
[0041] In another embodiment, the fiducial patterns may comprise a predetermined image target containing a plurality of features with known positions in the image target and known descriptors. Features in computer vision or image processing are distinct local structures found in an image, such as an “edge” (a set of points in the image which have strong gradient magnitudes), a “corner/interest point” (a set of points where the direction of the gradient changes rapidly within the local region), or a local image patch. A descriptor encodes the characteristics of a feature, such as the magnitude and orientation of the local gradient of pixel intensities, or a vector of intensity comparisons between a set of pixel pairs around the feature. The descriptor can be in many forms, including a numerical value, a vector of numerical values, or a vector of Boolean variables. Descriptors can be used to uniquely identify the corresponding features in an image. For example, in the case of BRIEF (Binary Robust Independent Elementary Features) descriptors, the Hamming distance between the known descriptor and the descriptor of a candidate feature is calculated and a match is confirmed if the distance is less than a threshold.
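The Hamming-distance test for binary descriptors can be sketched in a few lines of Python. The sketch assumes each descriptor is packed into an integer bit string; the function names and the choice to return the closest known descriptor are illustrative assumptions, not details taken from the description.

```python
from typing import Optional, Sequence


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two binary descriptors differ."""
    return bin(a ^ b).count("1")


def match_descriptor(
    candidate: int, known: Sequence[int], threshold: int
) -> Optional[int]:
    """Return the index of the closest known descriptor, or None.

    A match is confirmed only if the smallest Hamming distance is
    strictly less than the threshold.
    """
    best_index: Optional[int] = None
    best_distance: Optional[int] = None
    for index, descriptor in enumerate(known):
        distance = hamming_distance(candidate, descriptor)
        if best_distance is None or distance < best_distance:
            best_index, best_distance = index, distance
    if best_distance is not None and best_distance < threshold:
        return best_index
    return None
```

For example, with known descriptors 0b10110010 and 0b01001101, the candidate 0b10110110 differs from the first in a single bit and would match it under a threshold of 3.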
[0042] When the image target is captured by the image sensor, a feature detection algorithm is used to extract all candidate feature points inside the captured frame, and corresponding descriptors are calculated for the candidate feature points. By comparing the descriptors, some of the candidate feature points are matched with the known feature points in the image target. The matched pairs are used to estimate the relative position and orientation of the image target with respect to the image sensor(s), for example by solving the Perspective-n-Point problem. Therefore, pose estimation based on the fiducial patterns is achieved.
[0043]
[0044]
[0045] As an alternative example, as shown in
[0046] In some embodiments, the touch input device may comprise a plurality of ultrasonic transmitters placed in a predetermined pattern, and the pose tracking device may comprise a plurality of ultrasonic receivers. In some embodiments, the pose tracking device may comprise a plurality of ultrasonic transmitters placed in a predetermined pattern, and the touch input device may comprise a plurality of ultrasonic receivers. As a result, distances between the transmitters and the receivers can be determined by measuring the time-of-flight for ultrasonic signals. Therefore, the position and the orientation of the touch input device can be determined.
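As a minimal illustration of the time-of-flight principle, the following Python sketch converts an emission time and a reception time into a distance. It assumes one-way propagation, a clock shared by transmitter and receiver, and a nominal speed of sound of about 343 m/s (dry air near 20 °C); none of these values come from the description, and the function name is hypothetical.

```python
# Assumed nominal speed of sound in dry air at roughly 20 degrees Celsius.
SPEED_OF_SOUND_M_PER_S = 343.0


def tof_distance(
    emit_time_s: float,
    receive_time_s: float,
    speed_m_per_s: float = SPEED_OF_SOUND_M_PER_S,
) -> float:
    """Distance in metres travelled by a pulse between emission and
    reception, assuming both timestamps come from a shared clock."""
    elapsed = receive_time_s - emit_time_s
    if elapsed < 0:
        raise ValueError("receive time precedes emit time")
    return speed_m_per_s * elapsed
```

A pulse received 2 ms after emission would correspond to roughly 0.69 m under these assumptions. In practice the speed of sound varies with temperature, so a real system would likely need calibration.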
[0047]
[0048] A variety of methods can be used to track the position and orientation of a virtual object using ultrasonic receivers and transmitters. For example, in one implementation, three ultrasonic receivers are rigidly attached to one of the displaying device and the touch input device in a non-collinear arrangement, three transmitters are rigidly attached to the other one of the displaying device and the touch input device, and the computing unit is coupled to the transmitters and the receivers. The three transmitters generate ultrasonic pulses at three different frequencies respectively. Each of the three receivers separates the received ultrasonic waves with three different frequencies into three signals, resulting in a total of nine signals. Based on the time-of-flight principle, the nine signals are processed into nine distances between each of the three transmitters and each of the three receivers. As a result, the relative orientation and position between the transmitter assembly and receiver assembly can be estimated.
[0049] In another implementation, one ultrasonic transmitter and a 9-axis inertial measurement unit (IMU) are rigidly attached to one of the displaying device and the touch input device, three ultrasonic receivers and a 9-axis IMU are rigidly attached to the other one of the displaying device and the touch input device, wherein the receivers are arranged in a non-collinear arrangement, and the computing unit is coupled to the transmitter, the receivers, and the IMUs. Alternatively, three transmitters and one receiver may be used. The transmitter generates ultrasonic acoustic pulses at a known frequency and the receivers convert the received ultrasonic pulses into three signals. Based on the time-of-flight principle, the signals result in three distances between the transmitter and the three receivers, respectively. As a result, the relative position between the transmitter and the receiver assembly can be calculated. The IMUs measure the absolute orientations of the displaying device and the touch input device, such that the relative orientation between them can be determined.
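One way to compute the transmitter position from three distances is classical trilateration, sketched below in Python. This is an illustrative technique under stated assumptions, not necessarily the method used by the computing unit: the receiver positions are assumed known in a common frame, and because three spheres generally intersect in two mirror-image points, the caller must say on which side of the receiver plane the transmitter lies. The function and parameter names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _sub(a: Sequence[float], b: Sequence[float]) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _scale(a: Sequence[float], s: float) -> Vec3:
    return (a[0] * s, a[1] * s, a[2] * s)


def _cross(a: Sequence[float], b: Sequence[float]) -> Vec3:
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def trilaterate(
    p1: Vec3, p2: Vec3, p3: Vec3,
    r1: float, r2: float, r3: float,
    positive_side: bool = True,
) -> Vec3:
    """Locate a transmitter from its distances r1..r3 to three
    non-collinear receivers at p1..p3."""
    # Build an orthonormal frame with p1 at the origin and p2 on the x-axis.
    d = math.sqrt(_dot(_sub(p2, p1), _sub(p2, p1)))
    ex = _scale(_sub(p2, p1), 1.0 / d)
    i = _dot(ex, _sub(p3, p1))
    ey_raw = _sub(_sub(p3, p1), _scale(ex, i))
    j = math.sqrt(_dot(ey_raw, ey_raw))
    if j == 0.0:
        raise ValueError("receivers must be non-collinear")
    ey = _scale(ey_raw, 1.0 / j)
    ez = _cross(ex, ey)
    # Solve the three sphere equations in the local frame.
    x = (r1 * r1 - r2 * r2 + d * d) / (2.0 * d)
    y = (r1 * r1 - r3 * r3 + i * i + j * j) / (2.0 * j) - (i / j) * x
    # Clamp small negative values caused by measurement noise.
    z = math.sqrt(max(r1 * r1 - x * x - y * y, 0.0))
    if not positive_side:
        z = -z
    return (
        p1[0] + x * ex[0] + y * ey[0] + z * ez[0],
        p1[1] + x * ex[1] + y * ey[1] + z * ez[1],
        p1[2] + x * ex[2] + y * ey[2] + z * ez[2],
    )
```

The sign ambiguity is usually easy to resolve in this setting, since the displaying device would normally sit on one known side of the touch input device.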
[0050] In some embodiments, a footprint may be displayed for an elevated virtual object. Because user interaction is sensed by a touch input device, the interaction is limited to the proximity of a 2D plane. However, the virtual contents may be displayed above the touch input device with a non-negligible vertical distance. To overcome such a limitation, in one implementation, a virtual footprint, projected from the elevated virtual object onto the interaction layer, is displayed through the displaying device. As a result, the user can interact with the elevated virtual object via its virtual footprint using various touch gestures. For example, the user can perform a pinch gesture on the touch input device over the area of the virtual footprint to scale the corresponding virtual object, or the user can perform a press-and-drag gesture on the virtual footprint to move the corresponding virtual object. Furthermore, when the user touches the area of the virtual footprint, a virtual two-dimensional menu element can be displayed on the interaction layer near the virtual footprint to provide additional operations on the corresponding virtual object. The user can tap on different areas of the interaction layer where menu items are displayed to activate related functions. For example, additional operations may include but are not limited to starting an animation associated with the corresponding virtual object, deleting the corresponding virtual object, or changing an attribute of the corresponding virtual object.
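The footprint idea can be sketched in Python as a projection followed by a hit test. The sketch makes several assumptions not stated in the description: the virtual object is approximated by an axis-aligned bounding box expressed in the reference coordinate of the touch sensitive surface, the z-axis is normal to the surface, and the footprint is an orthogonal projection along that normal. The names Box3D, footprint, and touch_hits_footprint are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Box3D:
    """Axis-aligned bounds of a virtual object in surface coordinates.

    z is the height above the interaction layer (assumed convention).
    """
    x_min: float
    x_max: float
    y_min: float
    y_max: float
    z_min: float
    z_max: float


def footprint(box: Box3D) -> Tuple[float, float, float, float]:
    """Project the box straight down onto the surface plane.

    Returns the footprint rectangle as (x_min, y_min, x_max, y_max).
    """
    return (box.x_min, box.y_min, box.x_max, box.y_max)


def touch_hits_footprint(touch_xy: Tuple[float, float], box: Box3D) -> bool:
    """True if a detected touch position lies inside the footprint."""
    x_min, y_min, x_max, y_max = footprint(box)
    x, y = touch_xy
    return x_min <= x <= x_max and y_min <= y <= y_max
```

A touch that lands inside the footprint could then be routed to the elevated object, for instance to start a drag or to open the menu element described above.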
[0051]
[0052] In some embodiments, the light blocking portions may comprise a polymer with a light blocking additive. In some embodiments, the light blocking portions may comprise light blocking paint deposited over a substrate. In some embodiments, the light transmissive portion is simply formed by voids, or the lack of any material. The materials of the mask layer 603 may be any type that one skilled in the art recognizes as suitable for performing the functionalities described herein.
[0053] In some embodiments, the interaction surface is positioned above the mask layer 603. In some embodiments, the interaction surface is positioned below the mask layer 603. In some embodiments, light blocking material is directly deposited over the interaction surface to form the mask layer 603.
[0054]
[0055] In some embodiments, the interaction surface is positioned above the mask layer 703. In some embodiments, the interaction surface is positioned below the mask layer 703. In some embodiments, light blocking material is directly deposited over the touch sensitive surface to form the mask layer 703.
[0056]
[0057] In some embodiments, the interaction surface is positioned above the light guide plate 801. In some embodiments, the interaction surface is positioned below the fiducial layer 803. In some embodiments, the interaction surface is positioned between the light guide plate 801 and the fiducial layer 803.
[0058] The foregoing description, for purpose of explanation, has been described with reference to specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The embodiments were chosen and described in order to best explain the principles of the invention and its practical applications, to thereby enable others skilled in the art to best utilize the invention and various embodiments with various modifications as are suited to the particular use contemplated.