METHOD AND DEVICE FOR PROVIDING AUGMENTED REALITY (AR) SERVICE BASED ON VIEWER ENVIRONMENT
20220358728 · 2022-11-10
CPC classification
H04W4/021
ELECTRICITY
Abstract
The disclosure relates to a 5G or 6G communication system for supporting a higher data transmission rate. A method for operating a terminal for an augmented reality (AR) service in a mobile communication system includes generating terminal anchoring metadata based on environment information obtained from at least one sensor included in the terminal, transmitting the terminal anchoring metadata to a server, receiving, from the server, a 3D model generated based on the terminal anchoring metadata, and rendering a virtual object based on the 3D model and the environment information.
Claims
1. A method for operating a terminal for an augmented reality (AR) service in a mobile communication system, the method comprising: generating terminal anchoring metadata based on environment information obtained from at least one sensor included in the terminal; transmitting, to a server, the terminal anchoring metadata; receiving, from the server, a three-dimensional (3D) model generated based on the terminal anchoring metadata; and rendering a virtual object based on the 3D model and the environment information.
2. The method of claim 1, further comprising: determining an anchor where the 3D model is to be positioned based on the 3D model and the environment information.
3. The method of claim 1, wherein the terminal anchoring metadata includes at least one of a geometry of an environment where the terminal is located, coordinate information on the location of the terminal, a display resolution of the terminal, a display resolution corresponding to a region of interest, a position and direction of the terminal, position information on the anchor, attribute information on the anchor, and information on content processing required for anchoring.
4. The method of claim 1, further comprising: receiving, from the server, content anchoring metadata including at least one of space information on a scene representing AR content, space information on each virtual object constituting the scene, an anchor requirement, orientation information on the scene and the virtual object, and possible content processing information.
5. A method for operating a server for an augmented reality (AR) service in a mobile communication system, the method comprising: receiving, from a terminal, terminal anchoring metadata generated based on environment information obtained from at least one sensor included in the terminal; generating a 3D model using the terminal anchoring metadata; and transmitting, to the terminal, the 3D model, wherein a virtual object is rendered by the terminal based on the 3D model and the environment information.
6. The method of claim 5, wherein the terminal anchoring metadata includes at least one of a geometry of an environment where the terminal is located, coordinate information on the location of the terminal, a display resolution of the terminal, a display resolution corresponding to a region of interest, a position and direction of the terminal, position information on the anchor, attribute information on the anchor, and information on content processing required for anchoring.
7. The method of claim 5, further comprising: generating content anchoring metadata including at least one of space information on a scene representing AR content, space information on each virtual object constituting the scene, an anchor requirement, orientation information on the scene and the virtual object, and possible content processing information; and transmitting, to the terminal, the content anchoring metadata.
8. A terminal for an augmented reality (AR) service in a mobile communication system, comprising: a transceiver; and a controller coupled with the transceiver and configured to: generate terminal anchoring metadata based on environment information obtained from at least one sensor included in the terminal, transmit, to a server, the terminal anchoring metadata, receive, from the server, a 3D model generated based on the terminal anchoring metadata, and render a virtual object based on the 3D model and the environment information.
9. The terminal of claim 8, wherein the controller is further configured to determine an anchor where the 3D model is to be positioned based on the 3D model and the environment information.
10. The terminal of claim 8, wherein the terminal anchoring metadata includes at least one of a geometry of an environment where the terminal is located, coordinate information on the location of the terminal, a display resolution of the terminal, a display resolution corresponding to a region of interest, a position and direction of the terminal, position information on the anchor, attribute information on the anchor, and information on content processing required for anchoring.
11. The terminal of claim 8, wherein the controller is further configured to receive, from the server, content anchoring metadata including at least one of space information on a scene representing AR content, space information on each virtual object constituting the scene, an anchor requirement, orientation information on the scene and the virtual object, and possible content processing information.
12. A server for an augmented reality (AR) service in a mobile communication system, comprising: a transceiver; and a controller coupled with the transceiver and configured to: receive, from a terminal, terminal anchoring metadata generated based on environment information obtained from at least one sensor included in the terminal, generate a 3D model using the terminal anchoring metadata, and transmit, to the terminal, the 3D model, wherein a virtual object is rendered by the terminal based on the 3D model and the environment information.
13. The server of claim 12, wherein the terminal anchoring metadata includes at least one of a geometry of an environment where the terminal is located, coordinate information on the location of the terminal, a display resolution of the terminal, a display resolution corresponding to a region of interest, a position and direction of the terminal, position information on the anchor, attribute information on the anchor, and information on content processing required for anchoring.
14. The server of claim 12, wherein the controller is further configured to: generate content anchoring metadata including at least one of space information on a scene representing AR content, space information on each virtual object constituting the scene, an anchor requirement, orientation information on the scene and the virtual object, and possible content processing information, and transmit, to the terminal, the content anchoring metadata.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0017] The above and other aspects, features, and advantages of certain embodiments of the disclosure will be more apparent from the following description taken in conjunction with the accompanying drawings.
DETAILED DESCRIPTION
[0026] Hereinafter, embodiments of the disclosure are described in detail with reference to the accompanying drawings. Detailed descriptions of known art or functions are omitted where they would obscure the subject matter of the disclosure. The terms used herein are defined in consideration of their functions in the disclosure and may be replaced with other terms according to the intention or practice of users or operators. Therefore, the terms should be defined based on the disclosure as a whole.
[0027] Embodiments of the disclosure may also be applied, with minor changes that do not significantly depart from the scope of the disclosure, to communication systems having a similar technical background, as may be determined by those skilled in the art to which the disclosure pertains. As used herein, the term “communication system” encompasses broadcast systems; however, when a broadcast service is the main service, the system may be explicitly referred to as a broadcast system.
[0028] Advantages and features of the disclosure, and methods for achieving them, may be understood through the embodiments described below taken in conjunction with the accompanying drawings. However, the disclosure is not limited to the embodiments disclosed herein, and various changes may be made thereto. The embodiments disclosed herein are provided only to inform one of ordinary skill in the art of the scope of the disclosure. The disclosure is defined only by the appended claims. The same reference numerals denote the same elements throughout the specification.
[0029] The methods described below in connection with embodiments are described as being based on hardware. However, embodiments of the disclosure encompass technology using both hardware and software and thus do not exclude software-based methods. The disclosure is not limited to the terms used herein; other terms having equivalent technical meanings may also be used.
[0030] The 3D model for the AR service considered in the disclosure may be defined as a sequence of volumetric frames that change over time. A volumetric frame may be regarded as a set of primitive elements, such as points, lines, and planes, existing in a three-dimensional (3D) space at a specific time, and the primitive elements may have attributes, such as color, reflectance, and the like. The volumetric frames may be stored and transmitted in a format specialized for the characteristics and applications of the content. For example, the GL transmission format (glTF) spatially and logically structures and expresses a scene in a three-dimensional space. More specifically, the scene may be structured with nodes having a tree or graph structure and expressed in JavaScript object notation (JSON) format, and the actual media data referenced by a node may be specified in the above-described primitive element-based 3D model structure.
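As an illustrative sketch of this structure (not the actual glTF schema of any particular service), the following Python fragment builds a small scene as a JSON-expressed node tree, serializes it, and parses it back; all node and mesh names are hypothetical.

```python
import json

# A minimal glTF-style scene description: the scene is a tree of nodes
# expressed in JSON, and a node may reference mesh data, i.e., the actual
# primitive-element-based 3D model. All names and values are illustrative.
scene_doc = {
    "scenes": [{"nodes": [0]}],
    "nodes": [
        {"name": "room", "children": [1]},
        {"name": "table", "mesh": 0, "translation": [1.0, 0.0, 2.0]},
    ],
    "meshes": [
        {"name": "table_mesh",
         "primitives": [{"attributes": {"POSITION": 0, "COLOR_0": 1}}]},
    ],
}

text = json.dumps(scene_doc)   # serialize for storage or transmission
restored = json.loads(text)    # parse back into the node tree
root = restored["nodes"][restored["scenes"][0]["nodes"][0]]
```

Because the scene graph and the media it references are kept separate, a server can rewrite the node tree (e.g., repositioning objects) without touching the heavier primitive data.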
[0032] Referring to
[0033] The AR terminal may determine, at step 130, an anchor where the 3D model is to be located via real world analysis at step 120 and, at step 140, render a virtual object represented as the 3D model. The real world analysis includes a process in which the AR terminal recognizes the environment around it using a sensor, e.g., a camera, and the anchor may be a geometrical structure referenced for synthesizing the 3D model with the real world context.
[0034] For example, in a service in which the user of an AR terminal in a room with a table is able to put a virtual object on the table, the real world analysis may be a process for determining the size of the room and the position and size of the table, and the process of determining the anchor may be a process for determining the position on the top surface of the table where the 3D model representing the virtual object may be placed. Since the real world may be varied by the movement of the AR terminal or by external factors, the position of the anchor may be changed, if necessary, by continuously performing the real world analysis after the virtual object rendering at step 140.
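The table example above can be sketched as a simple anchor-selection routine. The plane list stands in for the output of real world analysis, surfaces and footprints are reduced to axis-aligned rectangles, and all field names are illustrative rather than part of any standardized API.

```python
# Sketch of anchor determination: among the horizontal surfaces found by
# real world analysis, choose one whose area can hold the 3D model's
# footprint. This is an illustration, not a full scene-understanding pipeline.

def pick_anchor(planes, footprint):
    """planes: dicts with 'center' (x, y, z) and 'size' (width, depth) in meters.
    footprint: (width, depth) of the virtual object's base.
    Returns the first plane that fits, or None."""
    fw, fd = footprint
    for plane in planes:
        pw, pd = plane["size"]
        # also allow a 90-degree rotation of the footprint on the surface
        if (fw <= pw and fd <= pd) or (fd <= pw and fw <= pd):
            return plane
    return None

planes = [
    {"center": (0.0, 0.75, 1.0), "size": (1.2, 0.6)},  # table top
    {"center": (0.0, 0.40, 2.0), "size": (0.4, 0.4)},  # stool
]
anchor = pick_anchor(planes, footprint=(0.5, 1.0))     # fits the table rotated
```

Re-running such a routine on every analysis pass is what allows the anchor to move when the environment changes, as described above.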
[0036] Referring to
[0037] According to an embodiment, the network interface 202 may be denoted as a communication interface or a transceiver. The AR application 201, the vision engine 203, the AR renderer 204, and the pose correction 205 function are collectively referred to as a controller or a processor.
[0038] The AR application 201 denotes a service logic that executes the AR service based on the user's input. The vision engine 203 may perform the real world analysis 120 of
[0039] Meanwhile, for the above-described AR terminal to determine the anchor for rendering the virtual object, information on the 3D model representing the virtual object is required. For example, the user of the AR terminal may run an application that places a 3D model representing a piece of furniture, as a virtual object, in the user's room in order to decide which piece of furniture to purchase. The 3D model representing the piece of furniture in the application should be rendered so that the user perceives it as having the same size as the real piece of furniture and, to that end, the anchor should be set in a position with an empty space large enough to hold the real piece of furniture. Some pieces of furniture may be impossible to place depending on the structure of the user's room. In this case, the AR service may provide the AR terminal user with only the 3D models of placeable furniture using terminal anchoring metadata, which is described below.
[0040] As another example, an application in which a virtual object is not placed in a fixed position but is movable in the real world of the AR terminal user may be performed. To correctly synthesize the virtual object with the AR terminal user's real world, information on the space necessary for representing the movement of the virtual object as well as information regarding the size of the virtual object is required. Accordingly, in this case, the AR service may transfer information on a movement range of the virtual object to the AR terminal using the content anchoring metadata which is to be described below.
[0041] The terminal anchoring metadata may include at least one of: the geometry of the real world where the AR terminal is located; the coordinate system; the display resolution of the user AR terminal; the display resolution corresponding to the region of interest; the position and direction (or pose) of the user AR terminal; the anchor position and attributes; and information on the content processing required for anchoring. The geometry of the real world may be represented as a 3D model, and additional information may be specified as attributes of the nodes constituting the 3D model depending on its application range.
[0042] Examples of the parameters constituting the above-described terminal anchoring metadata are as follows:
[0043] Geometry of the real world where the AR terminal is located: may be represented as a set of primitive elements, such as points, lines, and planes, present in a 3D space, and may further include information regarding an object if an object, such as a desk or a chair, is recognized;
[0044] Coordinate system: the relationship between the coordinate system representing the geometry and the real-world coordinate system;
[0045] Pose of the user AR terminal: the position and direction of the user AR terminal, which may be used when the AR server performs view frustum culling on the 3D model;
[0046] Anchor position and attributes: the position of the anchor in the real world of the AR terminal, or a candidate area where the anchor may be positioned, which may further include such attributes as the anchor normal, geometric characteristics (horizontal plane, vertical plane, boundary plane, etc.), and information on real objects (next to a water cup, in a frame, etc.). The anchor position and attributes may be set based on the requirements in the content anchoring metadata described below, the results of the real world analysis, and user input;
[0047] Content processing required for anchoring: the type of content processing required for anchoring the 3D model in the AR terminal's real world and the parameters for the content processing. Examples of content processing include scaling, rotation, and translation, and the processing is applicable to individual virtual objects or to the entire scene constituted of the virtual objects. The content processing may be set based on the requirements in the content anchoring metadata described below, the results of the real world analysis, and user input;
[0048] Display resolution of the user AR terminal: the resolution corresponding to the entire field of view (FoV) of the AR terminal, which may be used when the AR server sets the precision of the 3D model; and
[0049] Display resolution corresponding to the region of interest: the resolution of the display of the AR terminal corresponding to the region of interest of the AR terminal. The region of interest may be a region in which the geometry of the content anchoring metadata described below, or the 3D model, is to be rendered at the position of the anchor.
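Under the assumption of a JSON encoding (one of the formats the disclosure mentions later for this metadata), the parameters above might be assembled as follows; every field name here is a hypothetical illustration, not a standardized schema.

```python
import json

# Hypothetical terminal anchoring metadata assembled from the parameters
# listed above. Field names and units are illustrative only.
terminal_anchoring_metadata = {
    "geometry": {"primitives": [
        {"type": "plane", "normal": [0, 1, 0],
         "center": [0.0, 0.75, 1.0], "size": [1.2, 0.6],
         "object": "table"},
    ]},
    "coordinate_system": {"origin": "local", "meters_per_unit": 1.0},
    "pose": {"position": [0.0, 1.6, 0.0], "direction": [0.0, 0.0, -1.0]},
    "anchor": {"position": [0.0, 0.75, 1.0],
               "attributes": ["horizontal_plane"]},
    "content_processing": {"scaling": {"max_reduction_ratio": 0.5}},
    "display_resolution": {"full_fov": [1920, 1080],
                           "region_of_interest": [640, 480]},
}

# Serialize, e.g., as the body of a message to the AR server.
payload = json.dumps(terminal_anchoring_metadata)
```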
[0050] The terminal anchoring metadata may further include parameters defined in the content anchoring metadata in response to the content anchoring metadata described below.
[0051] The content anchoring metadata may include at least one of the space to be occupied by the scene representing the AR content, the space to be occupied by each of the virtual objects constituting the scene, anchor requirements, the frontal direction of the scene and virtual object, and possible content processing. The space to be occupied by the scene and the space to be occupied by each of the virtual objects constituting the scene may be represented as a 3D model, and additional information may be specified as the attributes of the nodes constituting the 3D model depending on its application range.
[0052] Examples of the parameters constituting the above-described content anchoring metadata are as follows:
[0053] Space to be occupied by the scene: may be represented as a set of primitive elements, such as points, lines, and planes, present in a 3D space;
[0054] Space to be occupied by each of the virtual objects constituting the scene: may be represented as a set of primitive elements, such as points, lines, and planes, present in a 3D space, and may further include information for recognizing virtual objects (e.g., a desk or a chair);
[0055] Anchor requirements: requirements for locating the anchor, which may further include such attributes as the anchor normal, geometric characteristics (horizontal plane, vertical plane, boundary plane, etc.), and information on real objects (next to a water cup, in a frame, etc.);
[0056] Frontal direction of the scene and each virtual object; and
[0057] Possible content processing: the type of content processing that may be provided by the AR server to anchor the 3D model in the AR terminal's real world and the parameters for the content processing. Examples of content processing include scaling, rotation, and translation, and the processing is applicable to individual virtual objects or to the entire scene constituted of the virtual objects. Possible content processing may further include a range (e.g., a maximum reduction ratio) allowed for each processing.
[0058] The space to be occupied by each of the virtual objects constituting the scene may be represented as a set of simple structures, e.g., boxes or cylinders, which may include virtual objects or precisely specify the outer shape of the actual virtual object. The space to be occupied by the scene may also be represented as a set of simple structures, e.g., boxes or cylinders, which may include the scene or precisely specify the area to be occupied by the scene.
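A minimal sketch of how such simplified bounding shapes, together with an allowed scaling range from the possible-content-processing parameter, could be used to test whether a scene fits a candidate anchor area. Boxes are axis-aligned and all names are illustrative assumptions, not part of the disclosure's format.

```python
# Sketch: reduce the space to be occupied by the scene and the candidate
# anchor area to axis-aligned boxes (w, h, d), then check whether the scene
# fits, possibly after uniform scaling within the allowed reduction range.

def fits_with_scaling(scene_box, anchor_box, max_reduction_ratio):
    """Return the uniform scale in [max_reduction_ratio, 1.0] at which
    scene_box fits anchor_box, or None if no allowed scale fits."""
    needed = max(s / a for s, a in zip(scene_box, anchor_box))
    if needed <= 1.0:
        return 1.0                       # fits without any content processing
    scale = 1.0 / needed                 # shrink just enough to fit
    return scale if scale >= max_reduction_ratio else None

scene_box = (2.0, 1.0, 2.0)              # space to be occupied by the scene
anchor_box = (1.0, 1.0, 1.0)             # candidate anchor area in the real world
scale = fits_with_scaling(scene_box, anchor_box, max_reduction_ratio=0.4)
```

The same check applies per virtual object when each object's occupied space is anchored individually rather than the scene as a whole.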
[0059] Each of the terminal anchoring metadata and the content anchoring metadata may be created, stored, transmitted, and processed in a format such as JSON, binary object, or XML and, depending on the implementation, the terminal anchoring metadata and the content anchoring metadata may be created, stored, transmitted, and processed as a single data unit. Further, the terminal anchoring metadata and the content anchoring metadata may be created, stored, transmitted, and processed in the same file as, or in a separate file from, the 3D model for the AR service and, when present as a separate file, the AR service provider may provide a signaling method by which the AR terminal user may associate the 3D model with the terminal anchoring metadata and the content anchoring metadata. Examples of the signaling method include an electronic service guide (ESG), a user service description, and the service access information (SAI) of 3GPP 5G media streaming (5GMS).
[0060] The terminal anchoring metadata and the content anchoring metadata may be shared, in the form of an HTTP resource, between the AR terminal and the AR server or may be transmitted between the AR terminal and the AR server through a separate control protocol or media transport protocol.
[0062] Referring to
[0063] The network interface 302 may be referred to as a communication interface or a transceiver. The AR application 301, the vision engine 303, the anchoring metadata processor 304, the AR renderer 305, and the pose correction 306 function are collectively referred to as a controller or a processor.
[0064] The following are examples of the functions of the anchoring metadata processor 304 shown in
[0068] Referring to
[0069] The network interface 413 may be referred to as a communication interface or a transceiver. The original contents 411, the anchoring metadata processor 412, and the 3D model reconstruction 414 function are collectively referred to as a controller or a processor.
[0070] The following are examples of the functions of the anchoring metadata processor 412 shown in
[0074] Referring to
[0075] The AR server may analyze the terminal anchoring metadata to generate a 3D model and transmit the generated 3D model to the AR terminal. The AR terminal may obtain the 3D model from the AR server at step 530, determine at step 540 the anchor where the 3D model is to be positioned based on the result of the real world analysis at step 510, and render the virtual object represented as the 3D model at step 550.
[0076] Since the real world may be varied by the movement of the AR terminal or external factors, the position of the anchor may be changed or terminal anchoring metadata may be updated, if necessary, by continuously performing the real world analysis after the virtual object rendering at step 550.
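The terminal-side procedure just described (real world analysis, terminal anchoring metadata transmission, 3D model reception, anchor determination, and rendering in a continuing loop) can be sketched as follows. Every function and the server object are stubs standing in for real sensor, network, and rendering APIs; none of the names comes from the disclosure.

```python
# Sketch of the terminal-side flow of steps 510-550: each sensor frame is
# analyzed, terminal anchoring metadata is sent to the server, the returned
# 3D model is anchored and rendered, and the loop repeats so the anchor can
# follow changes in the real world. All functions below are illustrative stubs.

def run_ar_session(sensor_frames, server):
    rendered = []
    for frame in sensor_frames:
        environment = analyze_real_world(frame)            # real world analysis (step 510)
        metadata = build_terminal_anchoring_metadata(environment)
        model = server.request_3d_model(metadata)          # transmit metadata, obtain model (steps 520-530)
        anchor = determine_anchor(model, environment)      # anchor determination (step 540)
        rendered.append(render(model, anchor))             # virtual object rendering (step 550)
    return rendered

def analyze_real_world(frame):
    return {"planes": frame.get("planes", [])}

def build_terminal_anchoring_metadata(environment):
    return {"geometry": environment["planes"]}

def determine_anchor(model, environment):
    planes = environment["planes"]
    return planes[0] if planes else None

def render(model, anchor):
    return {"model": model, "anchor": anchor}

class StubServer:
    def request_3d_model(self, metadata):
        # stands in for the AR server generating a 3D model from the metadata
        return {"vertices": [], "for_geometry": metadata["geometry"]}

frames = [{"planes": [{"id": "table"}]}, {"planes": [{"id": "floor"}]}]
results = run_ar_session(frames, StubServer())
```

Running the loop per frame is what makes the anchor position and the terminal anchoring metadata updatable as the environment changes.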
[0078] Referring to
[0079] The AR server may analyze the terminal anchoring metadata to generate a 3D model and transmit the generated 3D model to the AR terminal. The AR terminal may obtain, at step 650, the 3D model from the AR server and render, at step 660, the virtual object represented as the 3D model based on the real world analysis at step 620 and the anchor identification at step 630. Since the real world may be varied by the movement of the AR terminal or external factors, the position of the anchor may be changed or terminal anchoring metadata may be updated, if necessary, by continuously performing the real world analysis after the virtual object rendering at step 660.
[0080] In the above-described content anchoring metadata-based AR service reproduction procedure of the AR terminal, the process of anchor identification at step 630 may extract one or more anchor candidates, in which case the final anchor determination may be additionally performed after the 3D model is obtained at step 650. Further, the content anchoring metadata may be updated by the AR service provider and, in this case, the process after step 610 of obtaining the content anchoring metadata may be repeated.
[0081] Depending on the implementation, the AR terminal may generate terminal anchoring metadata by performing real world analysis before obtaining content anchoring metadata, select content anchoring metadata matching the terminal anchoring metadata, and then perform the subsequent processes.
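This implementation-dependent ordering, deriving the terminal anchoring metadata first and then selecting matching content, can be sketched as a filter over candidate content anchoring metadata whose anchor requirements are satisfied by the anchors found in the terminal's real world. All field names are illustrative assumptions.

```python
# Sketch: keep only the content anchoring metadata whose anchor requirements
# are a subset of the attributes of some anchor in the terminal anchoring
# metadata. Attribute vocabularies here are illustrative.

def select_matching_content(terminal_metadata, content_candidates):
    available = [set(a["attributes"]) for a in terminal_metadata["anchors"]]
    return [c for c in content_candidates
            if any(set(c["anchor_requirements"]) <= attrs for attrs in available)]

terminal_metadata = {"anchors": [
    {"position": [0.0, 0.75, 1.0], "attributes": ["horizontal_plane"]},
    {"position": [2.0, 1.20, 0.0], "attributes": ["vertical_plane", "boundary_plane"]},
]}
content_candidates = [
    {"name": "vase",    "anchor_requirements": ["horizontal_plane"]},
    {"name": "picture", "anchor_requirements": ["vertical_plane"]},
    {"name": "window",  "anchor_requirements": ["vertical_plane", "outdoor"]},
]
matching = select_matching_content(terminal_metadata, content_candidates)
```

Content whose requirements cannot be met by any detected anchor (the hypothetical "window" here) is excluded, mirroring the furniture example in which only placeable pieces are offered to the user.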
[0083] The terminal described above in connection with
[0084] The transceiver 710 collectively refers to a transmitter and a receiver of the terminal and may transmit and receive signals to/from a base station, network entity, server, or another terminal. The signals transmitted and received to/from the base station, network entity, server, or the other terminal may include control information and data. To that end, the transceiver 710 may include a radio frequency (RF) transmitter for up-converting the frequency of transmitted signals and amplifying them, and an RF receiver for low-noise amplifying received signals and down-converting their frequency. However, this is merely an example of the transceiver 710, and the components of the transceiver 710 are not limited to the RF transmitter and the RF receiver.
[0085] The transceiver 710 may receive signals via a radio channel, output the signals to the controller 730, and transmit signals output from the controller 730 via a radio channel.
[0086] The memory 720 may store programs and data necessary for the operation of the terminal. The memory 720 may store control information or data that is included in the signal obtained by the terminal. The memory 720 may include a storage medium, such as ROM, RAM, hard disk, CD-ROM, and DVD, or a combination of storage media. Rather than being separately provided, the memory 720 may be embedded in the controller 730.
[0087] The controller 730 may control a series of processes for the terminal to be able to operate according to the above-described embodiments. For example, the controller 730 may generate terminal anchoring metadata based on environment information obtained from at least one sensor included in a terminal, transmit the terminal anchoring metadata to a server, receive, from the server, a 3D model generated based on the terminal anchoring metadata, and render a virtual object based on the 3D model and the environment information. There may be provided a plurality of controllers 730. The controller 730 may control the components of the terminal by executing a program stored in the memory 720.
[0089] The server described above in connection with
[0090] The transceiver 810 collectively refers to the transmitter of the server and the receiver of the server and may transmit and receive signals to/from the terminal, base station, or network entity. The signals transmitted/received with the terminal, base station, or network entity may include control information and data. To that end, the transceiver 810 may include an RF transmitter for up-converting the frequency of transmitted signals and amplifying them, and an RF receiver for low-noise amplifying received signals and down-converting their frequency. However, this is merely an example of the transceiver 810, and the components of the transceiver 810 are not limited to the RF transmitter and the RF receiver.
[0091] The transceiver 810 may receive signals via a radio channel, output the signals to the controller 830, and transmit signals output from the controller 830 via a radio channel.
[0092] The memory 820 may store programs and data necessary for the operation of the server. The memory 820 may store control information or data that is included in the signal obtained by the server. The memory 820 may include a storage medium, such as ROM, RAM, hard disk, CD-ROM, and DVD, or a combination of storage media. Rather than being separately provided, the memory 820 may be embedded in the controller 830.
[0093] The controller 830 may control a series of processes for the server to be able to operate according to the above-described embodiments. For example, the controller 830 may control to receive, from a terminal, terminal anchoring metadata generated based on environment information obtained from at least one sensor included in a terminal, generate a 3D model using the terminal anchoring metadata, and control to transmit the 3D model to the terminal. In this case, a virtual object may be rendered by the terminal based on the 3D model and the environment information. There may be provided a plurality of controllers 830. The controller 830 may control the components of the server by executing a program stored in the memory 820.
[0094] The methods according to the embodiments described in the specification or claims of the disclosure may be implemented in hardware, software, or a combination of hardware and software.
[0095] When implemented in software, there may be provided a computer readable storage medium storing one or more programs (software modules). One or more programs stored in the computer readable storage medium are configured to be executed by one or more processors in an electronic device. One or more programs include instructions that enable the electronic device to execute methods according to the embodiments described in the specification or claims of the disclosure.
[0096] The programs (software modules or software) may be stored in random access memories, non-volatile memories including flash memories, read-only memories (ROMs), electrically erasable programmable read-only memories (EEPROMs), magnetic disc storage devices, compact disc ROMs (CD-ROMs), digital versatile discs (DVDs), other types of optical storage devices, or magnetic cassettes. Alternatively, the programs may be stored in a memory constituted of a combination of some or all thereof. A plurality of each constituent memory may also be included.
[0097] The programs may be stored in attachable storage devices that may be accessed via a communication network, such as the Internet, Intranet, local area network (LAN), wide area network (WAN), or storage area network (SAN) or a communication network configured of a combination thereof. The storage device may connect to the device that performs embodiments of the disclosure via an external port. A separate storage device over the communication network may be connected to the device that performs embodiments of the disclosure.
[0098] In the above-described specific embodiments, the components included in the disclosure are represented in singular or plural forms depending on specific embodiments proposed. However, the singular or plural forms are selected to be adequate for contexts suggested for ease of description, and the disclosure is not limited to singular or plural components. As used herein, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise.
[0099] Although specific embodiments of the disclosure have been described above, various changes may be made thereto without departing from the scope of the disclosure. Thus, the scope of the disclosure should not be limited to the above-described embodiments, and should rather be defined by the following claims and equivalents thereof.