ENCODER, DECODER AND SCENE DESCRIPTION DATA SUPPORTING MULTIPLE ANIMATIONS AND/OR MOVEMENTS FOR AN OBJECT
20230326113 · 2023-10-12
Inventors
- Cornelius Hellge (Berlin, DE)
- Thomas Schierl (Berlin, DE)
- Peter Eisert (Berlin, DE)
- Anna Hilsmann (Berlin, DE)
- Robert Skupin (Berlin, DE)
- Yago Sánchez de la Fuente (Berlin, DE)
- Wieland Morgenstern (Berlin, DE)
- Gurdeep Singh Bhullar (Berlin, DE)
Abstract
Scene description data having first data defining a 3D object and second data triggering an animation of the 3D object. The second data triggers an application of the animation to the 3D object and has a parameter discriminating between several animation modes of the application.
Claims
1. An apparatus, configured to receive scene description data, acquire, from the scene description data, first data defining a 3D object; acquire, from the scene description data, second data triggering an animation of the 3D object, apply the animation to the 3D object dependent on a mode parameter comprised by the second data, according to one of several animation modes, comprising one or more of apply the animation to the 3D object repeatedly in loops with starting each loop from an initial pose of the 3D object, and apply the animation to the 3D object repeatedly in loops with using a pose assumed by the 3D object at the end of one loop for starting a subsequent loop, and apply the animation to the 3D object with maintaining a pose assumed by the 3D object at the end of the animation, and apply the animation to the 3D object with returning to a pose assumed by the 3D object upon a start of the application of the animation to the 3D object, and apply the animation to the 3D object in reverse with starting from a pose assumed by the 3D object at the end of a previously applied animation.
2. An apparatus, configured to receive scene description data, acquire, from the scene description data, first data defining a 3D object; acquire, from the scene description data, second data triggering an animation of the 3D object, apply a temporal subinterval of the animation or of a cyclic application of the animation to the 3D object based on a trimming parameter comprised by the second data, wherein the trimming parameter controls as to which temporal subinterval of the animation or of a cyclic application of the animation is to be applied to the 3D object by the apparatus.
3. The apparatus according to claim 2, configured to apply the animation to the 3D object dependent on a mode parameter comprised by the second data, according to one of several animation modes, comprising one or more of apply the animation to the 3D object repeatedly in loops with starting each loop from an initial pose of the 3D object, and apply the animation to the 3D object repeatedly in loops with using a pose assumed by the 3D object at the end of one loop for starting a subsequent loop, and apply the animation to the 3D object with maintaining a pose assumed by the 3D object at the end of the animation, and apply the animation to the 3D object with returning to a pose assumed by the 3D object upon a start of the application of the animation to the 3D object, and apply the animation to the 3D object in reverse with starting from a pose assumed by the 3D object at the end of a previously applied animation.
4. The apparatus according to claim 2, wherein the trimming parameter indicates a start frame and an end frame of the temporal subinterval of the animation or of the cyclic application of the animation to be applied to the 3D object.
5. An apparatus, configured to receive scene description data, acquire, from the scene description data, first data defining a 3D object; acquire, from the scene description data, second data triggering an animation of the 3D object, and amplify or dampen pose movements of the 3D object caused by the animation using a weighting parameter comprised by the second data.
6. The apparatus of claim 5, wherein the animation of the 3D object is defined in the scene description data in a manner decomposed into channels and the apparatus is configured to, using the weighting parameter, amplify or dampen the pose movements of the 3D object caused by the animation dependent on the channel with which the respective pose movement is associated.
7. The apparatus of claim 5, wherein the animation of the 3D object is defined in the scene description data in a manner decomposed into channels and the apparatus is configured to, using the weighting parameter, amplify or dampen the pose movements of the 3D object caused by the animation specifically with respect to one or more predetermined channels and leave the pose movements of the 3D object caused by the animation uninfluenced with respect to remaining channels.
8. The apparatus of claim 5, wherein the animation of the 3D object is defined in the scene description data in a manner decomposed into channels and the apparatus is configured to, using the weighting parameter, amplify or dampen the pose movements of the 3D object caused by the animation specifically with respect to one or more predetermined channels and the apparatus is configured to infer a second weighting parameter by setting the second weighting parameter to a predetermined value and, using the second weighting parameter, amplify or dampen the pose movements of the 3D object caused by the animation with respect to one or more second channels.
9. The apparatus of claim 5, wherein the animation of the 3D object is defined in the scene description data in a manner decomposed into channels and the apparatus is configured to, using the weighting parameter, amplify or dampen the pose movements of the 3D object caused by the animation specifically with respect to one or more predetermined channels and the apparatus is configured to, using another weighting parameter comprised by the second data, amplify or dampen pose movements of the 3D object caused by the animation with respect to one or more further channels.
10. The apparatus of claim 5, wherein the animation triggered by the second data represents a first animation of the 3D object, and the apparatus is configured to acquire, from the scene description data, third data triggering a second animation of the 3D object, and apply the first animation and the second animation to the 3D object so that the first animation and the second animation are running simultaneously at least for a certain time interval, wherein the first animation of the 3D object is defined in the scene description data in a manner decomposed into a first set of channels and the second animation of the 3D object is defined in the scene description data in a manner decomposed into a second set of channels, wherein for each channel of the first set of channels, the respective channel defines a pose movement for a joint of the 3D object and wherein for each channel of the second set of channels, the respective channel defines a pose movement for a joint of the 3D object, wherein the apparatus is configured to apply pose movements defined by one or more channels of the first set of channels to the same joints as pose movements defined by one or more channels of the second set of channels, and the apparatus is configured to, using the weighting parameter comprised by the second data, amplify or dampen the pose movements defined by the one or more channels of the first set of channels and leave the pose movements of the 3D object caused by the first animation uninfluenced with respect to remaining channels, and the apparatus is configured to, using another weighting parameter comprised by the third data, amplify or dampen the pose movements defined by the one or more channels of the second set of channels and leave the pose movements of the 3D object caused by the second animation uninfluenced with respect to remaining channels.
11. The apparatus of claim 5, wherein the animation triggered by the second data represents a first animation of the 3D object, and the apparatus is configured to acquire, from the scene description data, third data triggering a second animation of the 3D object, and apply the first animation and the second animation to the 3D object so that the first animation and the second animation are running simultaneously at least for a certain time interval, wherein the first animation of the 3D object is defined in the scene description data in a manner decomposed into a first set of channels and the second animation of the 3D object is defined in the scene description data in a manner decomposed into a second set of channels, wherein for each channel of the first set of channels, the respective channel defines a pose movement for a joint of the 3D object and wherein for each channel of the second set of channels, the respective channel defines a pose movement for a joint of the 3D object, wherein the apparatus is configured to apply pose movements defined by one or more channels of the first set of channels to the same joints as pose movements defined by one or more channels of the second set of channels, and the apparatus is configured to, using the weighting parameter comprised by the second data, amplify or dampen the pose movements defined by the one or more channels of the first set of channels and the apparatus is configured to infer a further weighting parameter and, using the further weighting parameter, amplify or dampen the pose movements of the 3D object caused by the first animation with respect to remaining channels, and the apparatus is configured to, using another weighting parameter comprised by the third data, amplify or dampen the pose movements defined by the one or more channels of the second set of channels and the apparatus is configured to infer a further another weighting parameter and, using the further another weighting parameter, amplify or dampen the pose movements of the 3D object caused by the second animation with respect to remaining channels.
12. The apparatus of claim 5, wherein the animation triggered by the second data represents a first animation of the 3D object, and the apparatus is configured to acquire, from the scene description data, third data triggering a second animation of the 3D object, and apply the first animation and the second animation to the 3D object so that the first animation and the second animation are running simultaneously at least for a certain time interval, wherein the first animation of the 3D object is defined in the scene description data in a manner decomposed into a first set of channels and the second animation of the 3D object is defined in the scene description data in a manner decomposed into a second set of channels, wherein for each channel of the first set of channels, the respective channel defines a pose movement for a joint of the 3D object and wherein for each channel of the second set of channels, the respective channel defines a pose movement for a joint of the 3D object, wherein the apparatus is configured to apply pose movements defined by one or more channels of the first set of channels to the same joints as pose movements defined by one or more channels of the second set of channels, and the apparatus is configured to, using the weighting parameter comprised by the second data, amplify or dampen the pose movements defined by the one or more channels of the first set of channels and the apparatus is configured to, using a further weighting parameter comprised by the second data, amplify or dampen the pose movements of the 3D object caused by the first animation with respect to remaining channels, and the apparatus is configured to, using another weighting parameter comprised by the third data, amplify or dampen the pose movements defined by the one or more channels of the second set of channels and the apparatus is configured to, using a further another weighting parameter comprised by the third data, amplify or dampen the pose movements of the 3D object caused by the second animation with respect to remaining channels.
13. The apparatus of claim 10, configured to add the second animation to the first animation so that a combined animation is applied to the 3D object, wherein the apparatus is configured to acquire the combined animation by forming a sum of pose movements caused by the first animation and the second animation, divided by a number of animations adding to the combined animation.
14. The apparatus of claim 10, wherein the apparatus is configured to apply the first animation to the 3D object during a first time interval triggered by the second data, and wherein the apparatus is configured to apply the second animation to the 3D object during a second time interval triggered by the third data, wherein the first time interval and the second time interval are at least partially overlapping so that the first animation and the second animation are running simultaneously at least for the certain time interval, wherein the apparatus is configured to add the second animation to the first animation so that a combined animation is applied to the 3D object during the certain time interval, wherein the apparatus is configured to acquire the combined animation by forming a sum of pose movements caused by the first animation and the second animation, divided by a number of animations adding to the combined animation.
15. The apparatus of claim 14, configured to, using the weighting parameter, amplify or dampen the pose movements of the 3D object caused by the first animation and the second animation during the certain time interval using a first scaling and during the remaining first time interval using a second scaling and/or during the remaining second time interval using a third scaling.
16. The apparatus of claim 5, configured to, using two or more weighting parameters comprised by the second data, amplify or dampen pose movements of the 3D object caused by the animation.
17. An apparatus, configured to receive scene description data, acquire, from the scene description data, first data defining a 3D object; acquire, from the scene description data, second data triggering an animation of the 3D object, acquire, from the scene description data, third data triggering a second animation of the 3D object, apply the first animation and the second animation to the 3D object so that the first animation and the second animation are running simultaneously at least for a certain time interval, wherein the apparatus is configured to apply the second animation comprised by the third data to the 3D object in a manner acting on the 3D object along with the first animation, and/or apply the second animation to the 3D object based on an inter-animation-control parameter comprised by the third data, wherein the inter-animation-control parameter discriminates between different animation combination modes, comprising two or more of applying the second animation overriding the first animation so that the first animation is not applied to the 3D object as long as the second animation lasts; applying the second animation overriding the first animation with respect to a portion of the 3D object affected by the first animation; adding the second animation to the first animation so that a combined animation is applied to the 3D object, wherein the combined animation is acquired by forming a sum of pose movements caused by the first animation and the second animation, divided by a number of animations adding to the combined animation, adding the second animation to the first animation so that a combined animation is applied to the 3D object, wherein the combined animation is acquired by forming a sum of pose movements of the first animation and the second animation.
18. The apparatus of claim 17, configured to apply the first and second animation to the 3D object dependent on animation IDs, wherein a first animation ID is associated with the first animation and a second animation ID is associated with the second animation.
19. An apparatus, configured to receive scene description data, acquire, from the scene description data, first data defining a 3D object; acquire, from the scene description data, second data triggering a first animation of the 3D object, acquire, from the scene description data, third data triggering a second animation of the 3D object, apply the first animation and the second animation to the 3D object so that the first animation and the second animation are running simultaneously at least for a certain time interval, apply the second animation comprised by the third data to the 3D object in a manner acting on the 3D object along with the first animation, and apply the first and second animation to the 3D object dependent on animation IDs, wherein a first animation ID is associated with the first animation and a second animation ID is associated with the second animation.
20. The apparatus of claim 19, wherein the first and second animation IDs are defined on an ordinal scale and the apparatus is configured to determine a final animation of the 3D object emerging from the first and second animations based on a rank of the second animation ID relative to the first animation ID.
21. The apparatus of claim 19, wherein the scene description data has the first and second animations defined therein using fourth data in a manner where each animation is tagged with the associated animation ID, and the second and third data trigger the first and second animations by indexing the first and second animations using the first and second animation IDs.
22. An apparatus, configured to receive scene description data, acquire, from the scene description data, first data defining a 3D object and a movement of the 3D object; acquire, from the scene description data, second data triggering an animation of the 3D object, wherein the apparatus is configured to apply the animation to the 3D object in a manner acting on the 3D object along with the movement of the 3D object defined by the first data, and/or apply the animation based on an animation-movement-interaction-control parameter comprised by the second data, wherein the animation-movement-interaction-control parameter discriminates between different modes of applying the animation to the 3D object, comprising one or more of applying the animation in a manner overriding the movement defined by the first data by using a pose of the 3D object at the time instant at which the animation is triggered by the second data as an initial pose of the 3D object to which the animation is applied; applying the animation in a manner overriding the movement defined by the first data by using a default pose as an initial pose of the 3D object to which the animation is applied instead of the pose of the 3D object at the time instant at which the animation is triggered by the second data; and applying the animation to the 3D object in a manner acting on the 3D object along with the movement defined by the first data.
23. The apparatus of claim 22, wherein the apparatus is configured to apply the second animation to the 3D object in a manner acting on the 3D object along with the movement of the 3D object using a weighting parameter comprised by the first data to amplify or dampen the movement of the 3D object defined by the first data and/or a weighting parameter comprised by the second data to amplify or dampen pose movements of the 3D object caused by the animation, and/or wherein the apparatus is configured to, in case of the animation-movement-interaction-control parameter indicating the mode, apply the animation to the 3D object in a manner acting on the 3D object along with the movement defined by the first data using a weighting parameter comprised by the first data to amplify or dampen the movement of the 3D object defined by the first data and/or a weighting parameter comprised by the second data to amplify or dampen pose movements of the 3D object caused by the animation.
24. The apparatus of claim 23, wherein the animation of the 3D object is defined in the scene description data in a manner decomposed into channels and the apparatus is configured to, using the weighting parameter comprised by the second data, amplify or dampen the pose movements of the 3D object caused by the animation dependent on the channel with which the respective pose movement is associated.
25. The apparatus of claim 23, wherein the movement of the 3D object is defined in the scene description data in a manner decomposed into channels and the apparatus is configured to, using the weighting parameter comprised by the first data, amplify or dampen individual movements corresponding to the movement of the 3D object defined by the first data dependent on the channel with which the respective individual movement is associated.
26. The apparatus of claim 22, configured to acquire, from the scene description data, third data defining a further 3D object and a movement of the further 3D object, and acquire, from the scene description data, a weighting parameter associated with an object identification, wherein the apparatus is configured to decide based on the object identification whether the weighting parameter is to be used to amplify or dampen the movement of the 3D object or the movement of the further 3D object.
27. An apparatus, configured to receive scene description data, acquire, from the scene description data, first data defining a 3D object and a movement of the 3D object; wherein the first data defines the movement of the 3D object, decomposed into a set of one or more channels, so that one joint of the 3D object is moved, concurrently, by more than one channel.
28. The apparatus of claim 27, wherein each channel of the set of one or more channels indicates a pose movement for a joint of the 3D object, wherein two or more channels of the set of one or more channels are associated with the same joint of the 3D object, and wherein the second data stores the actual transformation value for each channel of the set of one or more channels.
29. An apparatus, configured to receive scene description data, acquire, from the scene description data, first data defining a 3D object and a movement of the 3D object, wherein the movement of the 3D object is defined by the first data in units of time frames so that per time frame a pose of the 3D object is defined, wherein the apparatus is configured to apply a pose transition mode indicated by the second data to render the 3D object on the basis of the first data, wherein, if the second data indicates a first predetermined mode, the apparatus is configured to interpolate between the poses of the 3D object at the time frames and, if the second data indicates a second predetermined mode, the apparatus is configured to acquire, from the scene description data, third data which triggers, for each of one or more of the time frames, one or more animations of the 3D object, wherein the apparatus is configured to apply the one or more animations to transition the 3D object from the pose of the 3D object at the respective time frame towards the pose of the object at a subsequent time frame.
30. The apparatus of claim 29, wherein the second data defines the animation of the 3D object or each of the animations of the 3D object into a set of one or more channels each of which indicates a pose movement for a joint of the 3D object and into a set of samplers defining values for a pose movement at certain time instants, wherein the apparatus is configured to apply the one or more animations defined by the second data to the 3D object in a manner so that values for a pose movement at time instants between the certain time instants defined by the sampler are interpolated for the respective channel.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0077] The drawings are not necessarily to scale, emphasis instead generally being placed upon illustrating the principles of the invention. In the following description, various embodiments of the invention are described with reference to the following drawings, in which:
DETAILED DESCRIPTION OF THE INVENTION
[0123] Equal or equivalent elements or elements with equal or equivalent functionality are denoted in the following description by equal or equivalent reference numerals even if occurring in different figures.
[0124] In the following description, a plurality of details is set forth to provide a more thorough explanation of embodiments of the present invention. However, it will be apparent to those skilled in the art that embodiments of the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form rather than in detail in order to avoid obscuring embodiments of the present invention. In addition, features of the different embodiments described hereinafter may be combined with each other, unless specifically noted otherwise.
[0125] In order to ease the understanding of the following embodiments of the present application, the description starts with a presentation of possible concepts into which the subsequently outlined embodiments of the present application could be built or with which they could be combined. In the following chapters one to three, various examples are described which may assist in achieving a more effective compression when using multiple animations, volumetric video and/or combinations of animations and volumetric video.
[0126] 1 glTF Scene and Nodes
[0127] In a glTF file, the major entry point is the scenes property. There can be multiple scenes in the scenes array. Typically, there will be only a single scene, as shown in the corresponding figure.
[0128] Each node can contain an array called children that contains the indices of its child nodes. So each node is one element of a hierarchy of nodes, and together they define the structure of the scene as a scene graph; see the corresponding figure.
[0129] Each node can have a local transform. Such a transform will define a translation, rotation and/or scale. This transform will be applied to all elements attached to the node itself and to all its child nodes. For example:
    "node0": {
        "translation": [ 10.0, 20.0, 30.0 ],
        "rotation": [ 0.259, 0.0, 0.0, 0.966 ],
        "scale": [ 2.0, 1.0, 0.5 ]
    }
[0130] When computing the final local transform matrix of the node, these matrices are multiplied together. The convention is to compute the local transform matrix as LocalMatrix=translation*rotation*scale.
[0131] The global transform of a node is given by the product of all local transforms on the path from the root to the respective node:
    structure        local transform    global transform
    root             R                  R
     +- nodeA        A                  R*A
         +- nodeB    B                  R*A*B
         +- nodeC    C                  R*A*C
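To make the two conventions above concrete, the following is a minimal sketch (not part of the glTF specification; numpy and a column-vector convention are assumed) that composes LocalMatrix = T * R * S from the node properties of the earlier JSON example and chains local matrices into a global transform:

```python
import numpy as np

def trs_matrix(translation, rotation_q, scale):
    """Compose LocalMatrix = T * R * S from glTF node properties.
    rotation_q is a unit quaternion in glTF order (x, y, z, w)."""
    x, y, z, w = rotation_q
    R = np.eye(4)
    R[:3, :3] = [
        [1 - 2*(y*y + z*z), 2*(x*y - z*w),     2*(x*z + y*w)],
        [2*(x*y + z*w),     1 - 2*(x*x + z*z), 2*(y*z - x*w)],
        [2*(x*z - y*w),     2*(y*z + x*w),     1 - 2*(x*x + y*y)],
    ]
    T, S = np.eye(4), np.eye(4)
    T[:3, 3] = translation                 # translation part
    S[[0, 1, 2], [0, 1, 2]] = scale        # per-axis scale on the diagonal
    return T @ R @ S

# Global transform of a node = product of local transforms from the root:
R_root = trs_matrix([10.0, 20.0, 30.0], [0.259, 0.0, 0.0, 0.966], [2.0, 1.0, 0.5])
A = trs_matrix([0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0], [1.0, 1.0, 1.0])
B = trs_matrix([1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0], [1.0, 1.0, 1.0])
global_nodeB = R_root @ A @ B   # matches the R*A*B entry in the table above
```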
[0132] One can import such a glTF file into any renderer which understands the glTF syntax. The Microsoft Office suite also allows one to add a glTF asset to a Word or PowerPoint file.
[0133] 2 glTF Animations
[0134] Each node has a local transform. An animation can be used to describe how the translation, rotation or scale of a node changes over time; an example is shown in the corresponding figure.
[0136] The top-level animations array contains a single animation object. It consists of two elements:
[0137] Samplers, which describe the sources of animation data
[0138] Channels, which can be imagined as connecting a “source” (i.e. sampler) of the animation data to a “target”
[0139] In the given example there is one sampler. Each sampler defines an input and an output property. They both refer to an accessor object. Here, these are the times accessor (with index 2) and the rotations accessor (with index 3). Additionally, the sampler defines an interpolation type, which is “LINEAR” in this example.
[0140] There is also one channel in the example shown in the corresponding figure.
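For orientation, the animation object described here corresponds to the following structure (a sketch consistent with the glTF 2.0 schema; the index values mirror the example discussed in the text):

```json
"animations": [
  {
    "samplers": [
      { "input": 2, "output": 3, "interpolation": "LINEAR" }
    ],
    "channels": [
      { "sampler": 0, "target": { "node": 0, "path": "rotation" } }
    ]
  }
]
```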
[0141] The animation data is in the buffer at index 1. Animation.sampler.input and Animation.sampler.output point to the accessors at index 2 and index 3, respectively. Accessor 2 and accessor 3 both point to the bufferView at index 2. The bufferView at index 2 points to the buffer at index 1. This is how the animation data is reached by resolving these dependencies.
[0142] During the animation, the animated values are obtained from the “rotations” accessor. They are interpolated linearly, based on the current simulation time and the key frame times that are provided by the times accessor. The interpolated values are then written into the “rotation” property of the node with index 0.
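As an illustration of this sampling step, here is a minimal sketch (assuming numpy, and that the key-frame times and rotation quaternions have already been decoded from accessors 2 and 3; production players would use spherical interpolation for rotations, which is simplified here to a normalized linear blend):

```python
import numpy as np

def sample_linear(times, values, t):
    """Linearly interpolate key-framed values at simulation time t.
    times:  (N,) ascending key-frame times (the sampler 'input' accessor).
    values: (N, D) key-frame values (the sampler 'output' accessor)."""
    t = np.clip(t, times[0], times[-1])
    i = min(np.searchsorted(times, t, side="right") - 1, len(times) - 2)
    a = (t - times[i]) / (times[i + 1] - times[i])
    v = (1.0 - a) * values[i] + a * values[i + 1]
    return v / np.linalg.norm(v)  # re-normalize, since these are quaternions

# Rotation key frames at t = 0 s, 1 s, 2 s (quaternions in x, y, z, w order)
times = np.array([0.0, 1.0, 2.0])
rotations = np.array([[0.0, 0.0,   0.0, 1.0],
                      [0.0, 0.707, 0.0, 0.707],
                      [0.0, 0.0,   0.0, 1.0]])
node0_rotation = sample_linear(times, rotations, 0.5)  # written to node 0
```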
[0143] 3 Rigged 3D Object
[0144] Rigging is a technique used in skeleton animations for representing a 3D character model using a series of interconnected digital bones. Specifically, rigging refers to the process of creating the bone structure of a 3D model. This bone structure is used to manipulate the 3D model. The 3D object could be anything: a spaceship, a soldier, a galaxy, a door. Rigging is most common in animated characters for games and movies. This technique simplifies the animation process. Once rigged with skeleton bones, any 3D object can be controlled and deformed as needed. Once a 3D object is rigged, it is animatable.
[0146] Most 3D animation packages come with a solution for rigging a model. Maya, 3ds Max, Blender, Cinema 4D and Houdini all contain skeleton animation functionality. This is direct rigging of the asset of interest.
[0147] There could also be implicit rigging for a simple 3D object (without a skeleton). The 3D object will have vertex correspondences to a template 3D object which is already rigged. For example, a volumetric capture of an actor can be implicitly rigged by establishing vertex correspondences to a humanoid representation which resembles the actor.
[0148] 3.1 Skeleton
[0149] The skeleton is expressed by the joint hierarchy. An example of such a skeleton structure is given in the pseudo file representing the joint hierarchy shown in the corresponding figure.
[0150] The transformation of a joint can also be driven by animations, as explained in section 2. An animation controls the transformation of a joint using channels. At a given instance, only one property of the joint node can be transformed by a channel. Therefore, the complete joint transformation is decomposed into multiple channels which, when combined together, result in the final output.
[0151] For example, a JointTransformationSample contains the transformation values for the model. These transformation values will be passed to the player and each joint node will receive an updated transformation value.
[0152] 3.2 Animating a Rigged 3D Object
[0153] The rigging process results in a hierarchical structure where each joint is in a parent/child relationship with the joint it connects to. This simplifies the animation process as a whole. How the 3D model interacts with the joints is determined by a weight scale, as shown in the corresponding figure.
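The weight scale mentioned above is commonly realized as linear blend skinning; the following sketch (an illustration under that assumption, not a quote from any specification) shows how per-joint weights deform a single vertex:

```python
import numpy as np

def skin_vertex(v, joint_matrices, joint_ids, weights):
    """Linear blend skinning: deform one vertex by its weighted joints.
    v: (3,) rest-pose vertex position.
    joint_matrices: (J, 4, 4) global joint transforms times inverse bind matrices.
    joint_ids: indices of the joints influencing this vertex.
    weights: per-joint influence weights, summing to 1."""
    v_h = np.append(v, 1.0)  # homogeneous coordinates
    blended = sum(w * (joint_matrices[j] @ v_h)
                  for j, w in zip(joint_ids, weights))
    return blended[:3]

# A vertex influenced 70/30 by two joints
M = np.stack([np.eye(4), np.eye(4)])
M[1][:3, 3] = [0.0, 1.0, 0.0]  # second joint translated upward
print(skin_vertex(np.array([1.0, 0.0, 0.0]), M, [0, 1], [0.7, 0.3]))
```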
[0154] Some characters share the same skeleton structure. For example, a male 3D object and a female 3D object will share a similar structure, so an animation defined for a male 3D object can also be applied to a female 3D object. Similarly, in a scene there can be multiple instances of a (rigged) male 3D object, each object placed at a different position in the scene. Imagine a dance animation is presented, and the player wants to apply the same animation to all the rigged male 3D objects, for example 3D Bollywood dancers each performing the same dance sequence in the background.
[0155] A directly or indirectly rigged volumetric scan can also be used to apply such animations.
[0156] 4 Solutions
[0157] One of the identified issues with the current approach comes with the looping functionality. There are cases where looping cannot simply mean restarting the animation. E.g., let's consider an animation that resembles a person walking, which starts with a step with the left foot and ends with a step with the left foot. Simply looping over would result in the person stepping twice in a row with the left foot, which would not make sense.
[0159] The scene description data 100 comprises first data 110 defining a 3D object 112.
[0160] Additionally, the scene description data 100 comprises second data 120 triggering an animation 122 of the 3D object 112.
[0161] The apparatus 200 obtains from the scene description data 100 the first data 110 and the second data 120 and applies 210 the animation 122 to the 3D object 112 dependent on the mode parameter 124. The apparatus, for example, is configured to select one of the several animation modes 220 based on the mode parameter 124, e.g. to obtain a selected mode 230, and apply 210 the animation 122 to the 3D object according to the selected mode 230.
[0163] According to one of the animation modes 220.sub.1, e.g., a first animation mode, the apparatus 200 is configured to apply 210 the animation 122 to the 3D object 112 repeatedly in loops 222.sub.1 to 222.sub.n with starting each loop from an initial pose 114 of the 3D object 112.
[0164] According to another animation mode 220.sub.2, e.g., a second animation mode, the apparatus 200 is configured to apply 210 the animation 122 to the 3D object 112 repeatedly in loops 222.sub.1 to 222.sub.3 with using a pose assumed by the 3D object at the end of one loop for starting a subsequent loop, as shown in the corresponding figure.
[0165] According to another animation mode 220.sub.3, e.g., a third animation mode, the apparatus 200 is configured to apply 210 the animation 122 to the 3D object 112 with maintaining a pose 116 assumed by the 3D object 112 at the end of the animation 122. The animation 122, for example, is applied 210 for a certain duration and then stopped to end the animation 122. The pose 116 assumed by the 3D object 112 at this stop of the animation 122 is then maintained. This mode 220.sub.3 might be combinable with one of the other modes, e.g., 220.sub.1, 220.sub.2, 220.sub.4 and/or 220.sub.5, of the several modes 220. Thus, the animation 122 might be applied 210 to the 3D object 112 in several loops 222 and the pose assumed by the 3D object at a stop of the animation 122, i.e. at the end of the animation 122, is then maintained. The end of the animation 122 might be defined by the second data 120 of the scene description data 100, e.g. by indicating a stopping of the animation 122. The end of the animation might differ from a predetermined end associated with the animation 122. The animation 122 indicated by the second data 120 may run for a predetermined duration and end at the predetermined end. Mode 220.sub.3 makes it possible to stop the animation 122 at a time other than the predetermined end and to maintain the pose 116 assumed by the 3D object 112 at this other end of the animation 122.
[0166] According to another animation mode 220.sub.4, e.g., a fourth animation mode, the apparatus 200 is configured to apply 210 the animation 122 to the 3D object 112 with returning to a pose 114, e.g. an initial pose, assumed by the 3D object 112 upon a start of the application 210 of the animation 122 to the 3D object 112, as shown in the corresponding figure.
[0167] According to another animation mode 220.sub.5, e.g., a fifth animation mode, the apparatus 200 is configured to apply 210 the animation 122.sub.2, e.g., once or repeatedly in loops 222, to the 3D object 112 in reverse 123 with starting from a pose 116 assumed by the 3D object 112 at the end of a previously applied animation 122.sub.1. The previously applied animation 122.sub.1 might be the animation 122 indicated by the second data 120 or might be a different animation.
[0168] With mode 220.sub.2 the object 112 could be rotated by 90 degrees, and by looping such an animation the rotation could be continued by a further 90 degrees (achieving overall 180 degrees) or the same 90 degrees could be rotated again (achieving overall 270 degrees). Another alternative would be to mirror the animation by rotating backwards 90 degrees 123 to return to the origin position 114 afterwards, see mode 220.sub.5.
[0169] In a first embodiment, signalling, e.g., the mode parameter 124, is provided to indicate how the looping should be carried out. The looping indication contains one or more of the following parameters (or combinations thereof):
[0170] Start of the animation (from where to loop over)
[0171] End of the animation (at which point to loop over)
[0172] Return to the initial state (go back to the initial pose of the object at each loop), e.g., mode 220.sub.4
[0173] Continue the animation (keep the last pose of the object at each loop), e.g., mode 220.sub.2
[0174] Inverse the animation from the last position, i.e. “mirroring” (e.g. when rotating clockwise 90 degrees, start rotating anti-clockwise 90 degrees), e.g., mode 220.sub.5
[0175] Do not loop and go back to the initial pose, e.g., mode 220.sub.4
[0176] Do not loop and keep the last pose, e.g., mode 220.sub.3
Examples of how the syntax could look are shown in the following.
[0177] One option would be to use the existing “state” syntax, see Table 2 below.
TABLE 2. Semantics of the state value of glTFAnimationSample

| value | identifier | description |
| --- | --- | --- |
| 0 | play | Play the animation |
| 1 | stop | Stop the animation and return to the initial state |
| 2 | pause | Pause the animation |
| 3 | restart | Restart the animation, equivalent to stopping the animation and playing it from the beginning |
| 4 | update | Update the animation characteristic, e.g. speed |
| 5 | loop | Sets the animation to be run repeatedly in a loop |
| 6 | loop_cycle | Every loop begins from the initial state of the animation (i.e. reset to the original pose), e.g., mode 220.sub.1 |
| 7 | loop_relative | The loop begins from the last state of the previous animation (continue with the last pose), e.g., mode 220.sub.2 |
| 8 | keep_final | At animation end, keep the final state (stop the animation but keep the last pose), e.g., mode 220.sub.3 |
| 9 | go_to_initial | At animation end, go back to the initial state (stop the animation and go back to the initial pose), e.g., mode 220.sub.4 |
| 10 | mirror | The loop begins mirroring the previous interval from the last state of the previous animation, e.g., mode 220.sub.5 |
| 11 ... 63 | reserved | Reserved for future use |
[0178] Another alternative would be to extend the Animation sample with further syntax elements, as shown in Table 3 below.
TABLE 3. Semantics of further syntax elements

- start_frame: frame offset in order to reposition the start of the animation
- end_frame: frame offset in order to reposition the end of the animation
- loop_mode: behaviour of the animation at the end key frame in case of loop:
  - 0 Cycle: the animation restarts from the beginning (same as value 6 in Table 2)
  - 1 Relative: the animation continues from the last key value (same as value 7 in Table 2)
  - 2 Keep_Final: the animation will stop at the last key value (same as value 8 in Table 2)
  - 3 Mirror: the loop mirrors the previous interval (same as value 10 in Table 2)
  - 4 Go_to_Initial: at animation end, go back to the initial state (same as value 9 in Table 2)
  - 5-63: reserved
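How a player could act on start_frame, end_frame and loop_mode is sketched below (hypothetical helper names; the constants mirror the loop_mode values of Table 3). Note that for the Relative mode the accumulation of the last pose into the next loop happens when the sampled pose is applied, which is outside this frame-mapping helper:

```python
CYCLE, RELATIVE, KEEP_FINAL, MIRROR, GO_TO_INITIAL = 0, 1, 2, 3, 4  # Table 3

def effective_frame(tick, start_frame, end_frame, loop_mode):
    """Map a monotonically increasing tick onto the trimmed key-frame
    interval [start_frame, end_frame], according to loop_mode."""
    length = end_frame - start_frame
    if loop_mode in (CYCLE, RELATIVE):
        return start_frame + tick % (length + 1)        # wrap around
    if loop_mode == KEEP_FINAL:
        return min(start_frame + tick, end_frame)       # freeze on last pose
    if loop_mode == GO_TO_INITIAL:
        return start_frame + tick if tick <= length else start_frame
    if loop_mode == MIRROR:  # forward, then backward, then forward ...
        phase = tick % (2 * length)
        return start_frame + (phase if phase <= length else 2 * length - phase)
    raise ValueError(f"reserved loop_mode {loop_mode}")

# A walk cycle trimmed to frames 5..25, played in ping-pong fashion:
frames = [effective_frame(t, 5, 25, MIRROR) for t in range(45)]
```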
[0179] According to the embodiment shown in the corresponding figure, the second data 120 comprises a trimming parameter 126 with the syntax elements start_frame 126.sub.1 and end_frame 126.sub.2.
[0180] In addition to the use-case of looping, the start_frame 126.sub.1 and end_frame 126.sub.2 syntax could also be useful when not looping, e.g. when a subset of an animation 122 needs to be played. This could also be entailed when playing it alone or back to back sequentially with other animations so that the end result looks acceptable. I.e., in scenarios where multiple animations are sequenced to convey a story, this would include a smooth transition from one animation to another animation as the playback progresses. In such a case, the presence of the two syntax elements would not only be based on the state being “loop”, but the syntax could be extended as shown in Table 4 below.
TABLE 4. Semantics of further syntax elements

- enable_offsets: if equal to true, the animation sample includes the start_frame and end_frame syntax elements
- further syntax as described above
[0181] A trimming parameter 126 will be described in more detail with regard to the following figures.
[0183] The scene description data 100 comprises first data 110 defining a 3D object 112.
[0184] Additionally, the scene description data 100 comprises second data 120 triggering an animation 122 of the 3D object 112.
[0185] The apparatus 200 obtains from the scene description data 100 the first data 110 and the second data 120 and applies 210 the temporal subinterval 240 of the animation 122 or of a cyclic application 210 of the animation 122 to the 3D object 112 based on the trimming parameter 126.
[0186] According to an embodiment, the trimming parameter 126 indicates a start frame k′.sub.1 and an end frame k′.sub.2 of the temporal subinterval 240 of the animation 122 or of the cyclic application of the animation 122 to be applied to the 3D object 112.
[0187] As shown in the corresponding figure, the subinterval 240 extends from the start frame k′.sub.1 to the end frame k′.sub.2 within the animation 122.
[0188] In case of the animation 122 being looped, i.e. a cyclic application 210 of the animation 122, a subinterval 240 of a duration of all loops might be chosen. Thus, it is possible that the initial pose k′.sub.1 of the subinterval equals a pose of the first loop k.sub.1 to k.sub.2 and that the last pose k′.sub.2 of the subinterval 240 equals a pose of a subsequent loop of the animation 122, e.g., a pose in the last loop of the animation 122.
[0189] According to another embodiment, the subinterval 240 of the animation 122 can be applied 210 to the 3D object 112 in loops, i.e. a cyclic application 210 of the subinterval 240 to the object 112.
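A minimal sketch of how such a temporal subinterval could be cut out of a sampler before playback (hypothetical helper; it assumes key-frame times in seconds and a fixed frame rate to convert the trimming parameter's frame indices):

```python
import numpy as np

def trim_keyframes(times, values, start_frame, end_frame, fps=30.0):
    """Cut a sampler's key frames down to the trimmed subinterval.
    start_frame/end_frame mirror the trimming parameter 126; fps converts
    the frame indices to the sampler's time axis (an assumption here)."""
    t0, t1 = start_frame / fps, end_frame / fps
    keep = (times >= t0) & (times <= t1)
    return times[keep] - t0, values[keep]  # re-base time to the new start

times = np.linspace(0.0, 1.0, 31)             # 31 key frames over 1 second
values = np.linspace(0.0, 1.0, 31)[:, None]   # dummy one-dimensional channel
trimmed_times, trimmed_values = trim_keyframes(times, values, 5, 25)
```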
[0190] The cyclic application 210 of the animation 122 to the object 112 or the cyclic application 210 of the subinterval 240 to the object 112 might be indicated by a mode parameter 124, which can be comprised by the second data 120 and which can be associated with features and/or functionalities as described with regard to the mode parameter above.
[0191] As discussed above, another aspect that is tackled herein is the case that there are multiple animations 122 acting on the same target. For this purpose, it is identified how these several animations 122 are used to transform the object 112.
[0192] In one embodiment, a weight, e.g., a weighting parameter 128, is provided that may indicate how much of each animation is taken for the final result. This is based on the idea that, if several animations are acting on the same joints and/or bones of the object 112, it might look more realistic if not all animations are applied at 100% to the object 112. In particular, the inventors found that it is advantageous to dampen or amplify individual pose movements associated with an animation, since only some pose movements associated with one animation might be applied to the same joints and/or bones of the object 112 as pose movements associated with another animation.
[0194] The scene description data 100 comprises first data 110 defining a 3D object 112; see also the description above.
[0195] Additionally, the scene description data 100 comprises second data 120 triggering an animation 122 of the 3D object 112.
[0196] The apparatus 200 obtains from the scene description data 100 the first data 110 and the second data 120 and amplifies or dampens the pose movements 250 of the 3D object 112 caused by the animation 122 using the weighting parameter 128.
[0197] According to an embodiment, the weighting parameter 128 indicates for at least one of the pose movements 250 associated with the animation 122 a weight specifying how much the respective pose movement is to be amplified or dampened.
[0198] According to an embodiment, the weighting parameter 128 can indicate for all pose movements 250 associated with the animation 122 the same amplification or damping. Alternatively, the weighting parameter 128 can indicate for each pose movement 250 associated with the animation 122 a different amplification or damping, or the weighting parameter 128 can indicate for some pose movements 250 associated with the animation 122 the same amplification or damping and for others a different amplification or damping. The weighting parameter 128 may also indicate whether some pose movements 250 may not be amplified or dampened. This might either be realized by specifying a weight of one for a pose movement 250 which is not to be amplified or dampened, or by specifying no weight for this pose movement 250, wherein the apparatus 200 may then be configured to infer that pose movements 250 to which no weight is assigned are not to be amplified or dampened.
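The following sketch illustrates this weighting behaviour (illustrative names only; a weight of 1 is inferred for pose movements with no assigned weight, and rotation channels would in practice be blended by spherical interpolation rather than the linear blend shown):

```python
import numpy as np

def weight_pose_movement(rest, animated, weight):
    """Amplify (weight > 1) or dampen (weight < 1) one pose movement.
    Sketch for a translation-like channel: the movement is scaled
    relative to the rest value; weight 1 leaves the channel untouched."""
    return rest + weight * (animated - rest)

weights = {"left_arm.translation": 0.5}   # hypothetical channel names
def channel_weight(channel):
    return weights.get(channel, 1.0)      # unlisted channels: inferred weight 1

rest = np.zeros(3)
animated = np.array([0.0, 0.2, 0.0])
out = weight_pose_movement(rest, animated, channel_weight("left_arm.translation"))
```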
[0199] The inventors found that such a weighting of pose movements 250 associated with an animation is advantageous in terms of flexibility and in terms of providing a more realistically animated object 112. This is based on the idea that the weighting parameter 128 makes it possible to adapt a predetermined animation 122 to different objects 112 to be animated or to different scenes comprising the object 112 to be animated. With the weighting parameter 128, an object 112 can be animated very individually, avoiding or reducing an unrealistic animation of the respective object 112.
[0200] The weighting parameter 128 is especially advantageous if more than one animation 122 is applied to the same object 112, since it is possible to amplify or dampen pose movements 250 of two or more animations 122 associated with the same joints and/or bones of the object.
[0201] According to the following three embodiments, the animation 122 triggered by the second data 120 represents a first animation of the 3D object 112, and the scene description data comprises additionally third data triggering a second animation of the 3D object 112. The third data triggers the second animation of the 3D object 112 so that the first animation and the second animation are running simultaneously at least for a certain time interval. The first animation of the 3D object 112 is defined in the scene description data 100 in a manner decomposed into a first set of channels and, for example, into remaining channels, and the second animation of the 3D object 112 is defined in the scene description data 100 in a manner decomposed into a second set of channels and, for example, into remaining channels. For each channel of the first set of channels, the respective channel defines a pose movement for a joint of the 3D object 112 and, for each channel of the second set of channels, the respective channel defines a pose movement for a joint of the 3D object. Pose movements defined by the one or more channels of the first set of channels are applied to the same joints, e.g., common joints, as pose movements defined by the one or more channels of the second set of channels, e.g., the channels of the first- and second set of channels are associated with the same joints.
[0202] According to a first embodiment, the weighting parameter 128 comprised by the second data 120 amplifies or dampens the pose movements 250 defined by the one or more channels of the first set of channels and leaving the pose movements 250 of the 3D object 112 caused by the first animation 122 uninfluenced with respect to the remaining channels into which the first animation is decomposed. The third data comprises another weighting parameter amplifying or dampening the pose movements defined by the one or more channels of the second set of channels and leaving the pose movements of the 3D object caused by the second animation uninfluenced with respect to the remaining channels into which the second animation is decomposed.
[0203] According to a second embodiment, the weighting parameter 128 comprised by the second data 120 amplifies or dampens the pose movements 250 defined by the one or more channels of the first set of channels and a further weighting parameter is to be inferred, which further weighting parameter amplifies or dampens the pose movements 250 of the 3D object 112 caused by the first animation 122 with respect to the remaining channels into which the first animation is decomposed. The third data comprises another weighting parameter amplifying or dampening the pose movements defined by the one or more channels of the second set of channels and a further another weighting parameter is to be inferred, which further another weighting parameter amplifies or dampens the pose movements of the 3D object caused by the second animation with respect to the remaining channels into which the second animation is decomposed.
[0204] According to a third embodiment, the weighting parameter 128 comprised by the second data 120 amplifies or dampens the pose movements 250 defined by the one or more channels of the first set of channels and the second data 120 comprises a further weighting parameter amplifying or dampening the pose movements 250 of the 3D object 112 caused by the first animation 122 with respect to the remaining channels into which the first animation is decomposed. The third data comprises another weighting parameter amplifying or dampening the pose movements defined by the one or more channels of the second set of channels and the third data comprises a further another weighting parameter amplifying or dampening the pose movements of the 3D object caused by the second animation with respect to the remaining channels into which the second animation is decomposed.
[0205] According to an embodiment, additionally to the features described in one of the embodiments one to three above, the following feature might be implemented: The second animation can be added to the first animation 122 so that a combined animation is applied to the 3D object 112, wherein the combined animation is obtained by forming a sum of pose movements caused by the first animation 122 and the second animation, e.g., a sum of all pose movements, i.e. of the amplified or dampened pose movements and of the unprocessed (e.g., unamplified or undampened) pose movements, divided by a number of animations adding to the combined animation, e.g. divided by two, if only the first and the second animation are running at the same time.
[0206] According to an alternative embodiment, additionally to the features described in one of the embodiments one to three above, the following feature might be implemented: The second data might trigger the first animation 122 so that the first animation 122 is applied to the 3D object 112 during a first time interval, and the third data triggers the second animation of the 3D object so that the second animation is applied to the 3D object during a second time interval, wherein the first time interval and the second time interval are at least partially overlapping so that the first animation and the second animation are running simultaneously at least for the certain time interval, e.g. the time interval during which the first animation and the second animation are overlapping represents the certain time interval. The second animation is added to the first animation so that a combined animation is applied to the 3D object during the certain time interval, wherein the combined animation is obtained by forming a sum of pose movements caused by the first animation and the second animation (e.g., a sum of all pose movements, i.e. of the amplified or dampened pose movements and of the unprocessed (e.g., unamplified or undamped) pose movements), divided by a number of animations adding to the combined animation (e.g. divided by two, if only the first and the second animation are running at the same time).
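A sketch of the combining rule described in the two embodiments above, i.e. the sum of the (possibly already weighted) pose movements divided by the number of contributing animations:

```python
import numpy as np

def combine_animations(pose_movements):
    """Combine simultaneously running animations acting on one joint:
    the sum of the pose movements divided by the number of animations
    adding to the combined animation."""
    return sum(pose_movements) / len(pose_movements)

walk = np.array([0.0, 0.1, 0.0])   # pose movement from the first animation
wave = np.array([0.2, 0.0, 0.0])   # pose movement from the second animation
combined = combine_animations([walk, wave])   # -> [0.1, 0.05, 0.0]
```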
[0207] Additionally, the weighting parameter may amplify or dampen the pose movements of the 3D object 112 caused by the first animation 122 and the second animation during the certain time interval using a first scaling and during the remaining first time interval using a second scaling and/or during the remaining second time interval using a third scaling.
[0208] According to an embodiment, the second data 120 comprises two or more weighting parameters amplifying or dampening pose movements 250 of the 3D object 112 caused by the animation 122, e.g., a weighting parameter per joint or a weighting parameter per channel.
[0209] According to an embodiment, the scene description data 100 and/or the apparatus 200 may comprise features and/or functionalities which are described with regard to one of the other embodiments herein.
[0211] Note that animations typically involve acting on several joints or bones (represented by nodes in a glTF file) of a skeleton, and providing a single weight value for the whole animation might lead to results that look visually unrealistic. One could envision, for instance, two animations: one acting on all joints and another only acting on a subset thereof. Therefore, the joints that are only affected by one animation could be fully affected (weight of 100%) by the single animation that has an impact on these joints, while the other joints could be weighted differently (e.g. 50% per animation if there are two).
[0212] Therefore, in another embodiment a more flexible weight signalling is provided that allows a finer granular approach, for instance one weight per joint or even one weight per transformation described for a joint.
[0213] An example is shown in the corresponding figure.
[0214] Note that glTF can describe more complex animations with more than one transformation. In fact, it describes an animation 122 as a collection of transformations, i.e. pose movements 250, in the form of channels; an example is shown in the corresponding figure.
[0215] So, basically, samplers are defined that indicate as the “input” a time interval and as the “output” the transformation key-frame values. Then an animation (e.g. Animation1) has a given set of channels that point to the index in the samplers array that needs to be applied to perform an animation and indicate which node (target) is affected thereby. For example, the animation 122 of the 3D object 112 is defined in the scene description data 100 (e.g., the scene description data 100 shown in the corresponding figure).
[0216] According to an embodiment, the weighting parameter 128 amplifies or dampens the pose movements 250 of the 3D object 112 caused by the animation 122 dependent on the channel with which the respective pose movement 250 is associated, e.g., the weighting parameter 128 amplifies or dampens the pose movements 250 of the 3D object 112 channel individually.
[0217] According to an embodiment, the weighting parameter 128 amplifies or dampens the pose movements 250 of the 3D object caused by the animation 122 specifically with respect to one or more predetermined channels and leaves the pose movements of the 3D object caused by the animation uninfluenced with respect to remaining channels. For example, only channels which assign a pose movement 250 to a node to which one or more further pose movements 250 associated with another animation, i.e. a second animation, have to be applied are amplified or dampened using the weighting parameter 128. For example, the scene description data 100 additionally comprises third data triggering the second animation of the 3D object 112. The third data may trigger the second animation of the 3D object 112 so that the animation 122, i.e. a first animation, and the second animation are running simultaneously at least for a certain time interval.
[0218] According to another embodiment, the weighting parameter 128 amplifies or dampens the pose movements 250 of the 3D object 112 caused by the animation 122 specifically with respect to one or more predetermined channels. Additionally, the second data 120 comprises another weighting parameter amplifying or dampening pose movements 250 of the 3D object 112 caused by the animation 122 with respect to one or more further channels. Alternatively, the weighting parameter 128 may indicate for each channel of the one or more predetermined channels and of the one or more further channels individually a weight specifying the amplification or damping for the respective channel.
[0219] According to an embodiment, the weighting parameter 128 may indicate, channel-individually, a weight specifying the amplification or damping for the respective channel. The weighting parameter need not indicate a weight for each channel into which the animation 122 is decomposed. In case the weighting parameter 128 does not specify a weight for a channel, the apparatus 200 may be configured to infer that this channel is not to be amplified or dampened.
[0220] For example, the channels into which the animation 122 is decomposed consist of the one or more predetermined channels and the remaining channels. The one or more predetermined channels may correspond to channels associated with joints and/or bones of the 3D object 112 to which one or more animations are applied simultaneously and the remaining channels may correspond to channels associated with joints and/or bones of the 3D object 112 to which only the animation 122 is applied.
[0221] Note that the animation 122 described within a glTFAnimationSample, and for which a weight, i.e. a weighting parameter 128, and a channel_index are provided, might have a different number of channels than the triggered animation described in glTF.
[0222] One possibility would be that the non-listed channels get a weight of 1 (meaning fully applied). Another possibility is that they get a weight of 0 (meaning not applied). A further possibility is that a default weight is specified for the other channels as shown in
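For illustration only, the handling of non-listed channels may be sketched in Python as follows; the function and parameter names are hypothetical, with the weighting parameters 128 given as a map from channel index to weight and the default realizing one of the three possibilities above.

def channel_weight(channel_index, weights, default_weight=1.0):
    # Non-listed channels fall back to the default: 1 (fully applied),
    # 0 (not applied), or an explicitly signalled default weight.
    return weights.get(channel_index, default_weight)

# Example: only channel 3 is dampened; all other channels are fully applied.
assert channel_weight(3, {3: 0.25}) == 0.25
assert channel_weight(0, {3: 0.25}) == 1.0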
[0223] According to the embodiment shown in
[0224] An important aspect of such transformations is that the order in which they are applied plays a significant role, in particular if rotation is involved in the transformation. Such information is clearly provided in
[0225] One issue with such an approach arises in the case that there are overlapping animations, as shown in
[0226]
[0227] The scene description data 100 comprises first data 110 defining a 3D object 112, see also the description with regard to
[0228] Additionally, the scene description data 100 comprises second data 120 triggering a first animation 122.sub.1 of the 3D object 112 and third data 130 triggering a second animation 122.sub.2 of the 3D object 112.
[0229] The apparatus 200 obtains from the scene description data 100 the first data 110, the second data 120 and the third data 130.
[0230] The second data 120 triggers the first animation 122.sub.1 of the 3D object 112 so that the first animation 122.sub.1 is running during a first time interval t.sub.1 to t.sub.4 and the third data 130 triggers the second animation 122.sub.2 of the 3D object 112 so that the second animation 122.sub.2 is running during a second time interval t.sub.2 to t.sub.3, wherein the first time interval and the second time interval overlap at least for a certain time interval, e.g., in this case t.sub.2 to t.sub.3.
[0231] The apparatus 200 is configured to apply 210 the first animation 122.sub.1 and the second animation 122.sub.2 to the 3D object 112 so that the first animation 122.sub.1 and the second animation 122.sub.2 are running simultaneously at least for the certain time interval t.sub.2 to t.sub.3 and the second animation 122.sub.2 is applied 210 in a manner acting on the 3D object 112 along with the first animation 122.sub.1. As shown in
[0232] The apparatus 200 is configured to apply the first 122.sub.1 and second 122.sub.2 animations to the 3D object 112 dependent on the animation IDs 121.sub.1 and 121.sub.2. The animation IDs 121.sub.1 and 121.sub.2 may indicate an order according to which the first 122.sub.1 and second 122.sub.2 animations might be applied to the 3D object 112 at least during the certain time interval t.sub.2 to t.sub.3.
[0233] According to an embodiment, the first 121.sub.1 and second 121.sub.2 animation IDs are defined on an ordinal scale and the apparatus 200 is configured to determine a final animation 212 of the 3D object 112 emerging from the first 122.sub.1 and second 122.sub.2 animations based on a rank of the second animation ID 121.sub.2 relative to the first animation ID 121.sub.1. For example, the apparatus 200 is configured to derive, based on the rank, an order according to which the two animations 122.sub.1 and 122.sub.2 have to be applied 210 to the 3D object 112. For example, animations 122 associated with an animation ID 121 with a lower value are applied before animations 122 associated with an animation ID 121 with a higher value. In case multiple animations 122 are acting on the same object 112 and running simultaneously at least for a certain time interval, the apparatus 200 might be configured to apply 210 the animations 122 dependent on their associated animation IDs 121, applying the animations in animation ID order, starting with the animation associated with the lowest animation ID and ending with the animation associated with the highest animation ID. According to the example shown in
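For illustration only, the ordinal ordering by animation ID may be sketched in Python as follows; the pairing of an animation ID 121 with an application function is a hypothetical stand-in for the apparatus behaviour described above.

def apply_in_id_order(active_animations, pose):
    # active_animations: list of (animation_id, apply_fn) pairs that are
    # running simultaneously; lower IDs act on the pose first.
    for _, apply_fn in sorted(active_animations, key=lambda entry: entry[0]):
        pose = apply_fn(pose)
    return pose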
[0234] According to an embodiment, the scene description data 100 has the first 122.sub.1 and second 122.sub.2 animations defined therein using fourth data in a manner where each animation 122 is tagged with the associated animation ID 121, and the second 120 and third 130 data trigger the first 122.sub.1 and second 122.sub.2 animations by indexing them using the first 121.sub.1 and second 121.sub.2 animation IDs. For example, the second 120 and third 130 data may each only indicate the respective animation ID 121 and a time instant at which the animation 122 associated with the respective animation ID 121 is to be started by the apparatus 200. The apparatus 200 may be configured to derive the animation 122 associated with the respective animation ID 121 from the fourth data.
[0235] Note that animations 122 are timed in the sense that any animation 122 described in glTF is described as a set of transformations, e.g., pose movements, of an object 112 applied during the timeframe of the given animation, e.g. the time frame t.sub.1 to t.sub.4 of the first animation 122.sub.1 and the time frame t.sub.2 to t.sub.3 of the second animation 122.sub.2. This means that when a glTFAnimationSample, e.g. the second data 120 and/or the third data 130, is provided to a player, triggering the start of an animation 122, the time instant of that sample indicates the time at which the animation 122 is started; the duration, however, is not a single timed output as for regular video, but can be longer, as already determined by the animation 122 itself, which is described for a particular time interval. Thus, it can be seen in the
[0236] In such a case, whenever a new animation 122 is added, an animation sample is updated with all animations active at that time (see the indicated points in time).
[0237] In a further embodiment, in order not to redundantly re-introduce already running animations 122 into a sample, explicit signalling indicating ordering information, i.e. the animation IDs 121.sub.1 and 121.sub.2, is added, see
[0238] As already discussed with respect to
[0239] Concerning
[0240] In most scenarios, the mode later referred to as “additive”, i.e., simply adding the animations as they are, seems to be the most appropriate.
[0241] However, there might be other scenarios for which options different from simply applying this additive operation could be envisioned. For instance, whenever a further animation is added, the weights can be scaled appropriately (e.g. the provided weights are multiplied by 1 divided by the number of active animations 122, wherein the weights might be provided by a weighting parameter 128). Similarly, if two animations 122 are started at the same time but have different durations, an implicit scaling for the time during which the two animations are played simultaneously, and a further scaling for the time interval in which only a single animation is active, could make sense. In this case, for example, the data 120/130 triggering the respective animation 122 of the object 112 may comprise two or more weighting parameters 128 associated with different time intervals of the respective animation 122, wherein each of the two or more weighting parameters 128 may indicate amplifications and/or attenuations of pose movements 250 associated with the respective animation 122.
[0242] A further option could be that an animation 122 simply overrides a previous animation, meaning only one is applied.
[0243] Therefore, in another embodiment, shown in
TABLE-US-00007 TABLE 5 Semantics of the syntax element order_mode

order_mode 0 (Override): An animation can override an animation with a lower order_id. The apparatus 200 might be configured to apply 210 the second animation 122.sub.2 overriding the first animation 122.sub.1 so that the first animation 122.sub.1 is not applied to the 3D object 112 as long as the second animation 122.sub.2 lasts. For example, the first animation 122.sub.1 might only be applied 210 to the object 112 in the time intervals t.sub.1 to t.sub.2 and t.sub.3 to t.sub.4, and in the time interval t.sub.2 to t.sub.3 only the second animation 122.sub.2 may be applied to the object 112. Alternatively, the apparatus 200 might be configured to apply the second animation 122.sub.2 overriding the first animation 122.sub.1 with respect to a portion of the 3D object 112 affected by the first animation 122.sub.1, e.g., so that the first animation 122.sub.1 is not applied to the joints of the 3D object 112 which the first animation 122.sub.1 has in common with the second animation 122.sub.2, as long as the second animation 122.sub.2 lasts.

order_mode 1 (Normalized): If several animations act on the same target, they will be combined together according to their weight values normalized by the number of animations, i.e. multiplied by 1 divided by the number of animations. The apparatus 200 might be configured to add the second animation 122.sub.2 to the first animation 122.sub.1 so that a combined animation is applied to the 3D object 112, wherein the combined animation is obtained by forming a sum of the pose movements caused by the first animation 122.sub.1 and the second animation 122.sub.2, divided by the number of animations adding to the combined animation.

order_mode 2 (Additive): If several animations act on the same target, their animation transformations will be added together with exactly the weights indicated explicitly. The apparatus 200 might be configured to add the second animation 122.sub.2 to the first animation 122.sub.1 so that a combined animation is applied to the 3D object 112, wherein the combined animation is obtained by forming a sum of the pose movements of the first animation and the second animation.

order_mode 3-63: Reserved
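For illustration only, the three order_mode behaviours of Table 5 may be sketched in Python, here on scalar per-channel pose-movement contributions; the (order_id, weight, movement) triples are a hypothetical representation of animations acting on the same target.

def combine(order_mode, contributions):
    # contributions: list of (order_id, weight, movement) triples
    if order_mode == 0:  # Override: the animation with the highest order_id wins
        return max(contributions, key=lambda c: c[0])[2]
    if order_mode == 1:  # Normalized: weighted sum scaled by 1 / number of animations
        return sum(w * m for _, w, m in contributions) / len(contributions)
    if order_mode == 2:  # Additive: weighted sum with exactly the indicated weights
        return sum(w * m for _, w, m in contributions)
    raise ValueError("order_mode values 3 to 63 are reserved")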
[0244] Although the syntax with a single weight 128 per animation is shown in
[0245] 5 Volumetric Video in glTF and Animations
[0246] Similar to the combination of several animations, a combination of dynamic behaviour from animations and real capture can be achieved with volumetric video. A volumetric video is a sequence of 3D captures of a subject/actor. The actor or subject may have a pose of their own in each volumetric frame. For example, a human actor is volumetrically captured in three dimensions. The volumetric capture may be self-rigged or indirectly rigged, for example using the method of vertex correspondence, whereby the volumetric scan is virtually glued to a model with an underlying skeleton. The pose of the subject is conveyed by the skeleton structure as explained in section 3.1.
[0247] For example, glTF may also contain different animations, such as humanoid animations from third-party providers such as https://www.mixamo.com/. Such animations can be statically stored in a glTF file. An application may thereby be interested in applying the glTF animations to a volumetrically captured subject.
[0248] Importantly, with volumetric video the base 3D geometry is dynamic and changes over time. This means that the default pose of the captured object is not static but changes over time. The associated pose of the human body in a frame is a characteristic which could be expressed by, e.g., the joint locations of the skeleton for the volumetric capture. Any update to the pose of the human body can be carried out in a JointsTransformationSample or another mechanism that provides such an update to the pose corresponding to the volumetric video.
[0249] The first question that arises is how the volumetric video is attached to the scene, and then whether the volumetric scan and animations can be applied simultaneously, as there might be transformations that are not combinable, e.g. a volumetric scan video that is jumping and an animation that is lying on the floor.
[0250]
[0251] The scene description data 100 comprises first data 110 defining a 3D object 112 and a movement of the 3D object 112. The first data might represent a volumetric video of the object 112. The movement of the 3D object 112 might be specified by a sequence 111 of 3D captures of the 3D object 112. In other words, the movement of the 3D object 112 might be defined by a sequence of frames, wherein each frame defines a certain pose of the 3D object 112 and wherein a pose of the 3D object changes over the sequence of frames.
[0252] Additionally, the scene description data 100 comprises second data 120 triggering an animation 122 of the 3D object 112.
[0253] The apparatus 200 obtains from the scene description data 100 the first data 110 and the second data 120 and applies 210.sub.1/210.sub.2 the animation 122 to the moving 3D object 112. In the example shown in
[0254] The apparatus 200 might be configured to apply 210.sub.1 the animation 122 to the 3D object 112 in a manner acting on the 3D object 112 along with the movement of the 3D object 112 defined by the first data 110. In this case, the animation-movement-interaction-control parameter 127 is not needed. As shown in
[0255] This might be achieved by decomposing the movement defined by the first data 110 into time frames defining poses of the 3D object 112 at different time instants over time and combining per time frame these poses with pose movements associated with the animation 122 at the respective time instant.
[0256] Alternatively, the apparatus 200 might be configured to use the animation-movement-interaction-control parameter 127 for selecting a mode 214.sub.1 to 214.sub.3 according to which the animation 122 is to be applied to the moving 3D object 112. At a first mode 214.sub.1 the apparatus 200 might be configured to apply 210.sub.2 the animation 122 in a manner overriding the movement defined by the first data 110 by using a pose of the 3D object at the time instant t.sub.1 at which the animation 122 is triggered by the second data 120 as an initial pose of the 3D object 112 to which the animation 122 is applied. At a second mode 214.sub.2 the apparatus 200 might be configured to apply 210.sub.2 the animation 122 in a manner overriding the movement defined by the first data 110 by using a default pose 114 as an initial pose of the 3D object 112 to which the animation 122 is applied instead of the pose of the 3D object 112 at the time instant t.sub.1 at which the animation 122 is triggered by the second data 120. At a third mode 214.sub.3 the apparatus 200 might be configured to apply 210.sub.2 the animation 122 as described with regard to the application 210.sub.1 of the animation 122 to the moving 3D object 112.
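For illustration only, the selection among the three modes 214.sub.1 to 214.sub.3 may be sketched in Python; the pose arguments and the combine helper are hypothetical stand-ins for the apparatus behaviour described above.

def animated_pose(mode, frame_pose, pose_at_trigger, default_pose,
                  animation_delta, combine):
    if mode == 1:               # first mode 214.1: override the movement,
        base = pose_at_trigger  #   start from the pose at trigger time t1
    elif mode == 2:             # second mode 214.2: override the movement,
        base = default_pose     #   start from the default pose 114
    elif mode == 3:             # third mode 214.3: act along with the movement,
        base = frame_pose       #   combined per time frame
    else:
        raise ValueError("unknown mode")
    return combine(base, animation_delta)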
[0257] According to an embodiment, the scene description data 100 and/or the apparatus 200 may comprise features and/or functionalities which are described with regard to one of
[0258] 5.1 Volumetric Video in glTF
[0259] In glTF, anything that is to be added to the scene is added in the scene property of the glTF scene, like in the example shown in
[0260] Similarly, a volumetric video can be attached in the scene using a node. In the example shown in
[0261] 5.2 Volumetric Video + glTF Animations
[0262] In glTF, absolute transformation values are provided for animations 122 (see sampler.output). This means that during an animation 122 the object 112 is transformed from one absolute state to another absolute state. Both the start state and the end state are independent. However, it is also interesting to note that there might be cases where each state depends on the previous one. For example: move ObjectA by (1,0,0) units in 1 second, and then move ObjectA by (5,0,0) units but relative to the previous position (1,0,0), the final state thus being ObjectA sitting at position (6,0,0).
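For illustration only, the worked example above can be traced in Python; the tuples are hypothetical positions, contrasting absolute key-frame states with accumulated relative steps.

# Absolute mode: each key-frame value already states the final position.
absolute_keyframes = [(1.0, 0.0, 0.0), (6.0, 0.0, 0.0)]

# Relative mode: each step is applied relative to the previous position.
relative_steps = [(1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
position = (0.0, 0.0, 0.0)
for dx, dy, dz in relative_steps:
    position = (position[0] + dx, position[1] + dy, position[2] + dz)
assert position == (6.0, 0.0, 0.0)  # ObjectA ends up sitting at (6,0,0)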
[0263] To this extent, in a further embodiment, signalling, e.g., by the animation-movement-interaction-control parameter 127, is provided that indicates how to act at the receiver side, as shown in Table 6 below.
[0264] There might be different modes of applying the glTF animation to a volumetric video.
TABLE-US-00008 TABLE 6 Semantics of the syntax element mode

Value 0 (Last-frame relative): Use the last frame pose of the volumetric video (the last received jointTransformationSample value) as the initial pose for the subject in the animation (e.g., the first mode 214.sub.1).

Value 1 (T-Posed): Apply a T-pose animation, overriding the pose of the volumetric video (e.g., the second mode 214.sub.2).

Value 2 (Combinatory): Use the joint transformation of the subject in each frame and the current interpolated value of the joint from the glTF animations, and combine the two animations using methods as described in section 7. It may be that the "rotation" of the joints is controlled by the animation and the "translation" of the joint node is controlled by the ChannelTransformationSample (e.g., the third mode 214.sub.3).
[0265] Thereby, for instance, if they cannot be combined and the animation 122 is activated, the video needs to be paused. Whether they are to be combined, or whether the video is paused and either the T-pose or the last frame of the video before it is paused is used for the animation 122, needs to be signalled to the user. An example is shown in
[0266] For the case that they are combinable, similarly as shown for multiple animations, it might be necessary to indicate to what extent each transformation is considered. In a further embodiment, shown in
[0267] However, there is an issue of how to identify the volumetric object 112.sub.1 or 112.sub.2 to which the weights described above apply. This is illustrated in
[0268] Linking the weight 128b of the volumetric video 260.sub.1 specified in the glTFAnimationSample to a particular volumetric scan video 260.sub.1 is entailed, as indicated in
[0269] A first option consists of limiting the provided syntax of a glTFAnimationSample to apply to a single object 112. This can be done by linking the glTFAnimationSample to the object 112.sub.1 in the glTF so that it is known to which transformation vol_scan_weight applies.
[0270] The linkage of the glTFAnimationSample to the object 112.sub.1 in the glTF can be provided under the extension "MPEG_ANIMATION_TIMING", as shown in
[0271] In the example, this link to an object 112 is done by the target node pointing to node 0. Note that glTF describes nodes to which properties are attached, such as meshes, etc. In the example given, where there are 2 objects 112.sub.1 and 112.sub.2 (see
[0272] For example, an animation 122 associated with a first set of nodes, e.g., node0 to node9, corresponding to the 3D object 112.sub.1 is triggered. An animation associated with a second set of nodes, e.g., node10 to node19, differing from the first set of nodes may be associated with a further 3D object 112.sub.2, and further data comprised in the scene description data 100 might trigger such an animation of the further 3D object 112.sub.2. Alternatively, the second data 120 might comprise an object identification associated with the animation 122 to be triggered, to identify the 3D object 112.sub.1 to which the animation 122 is to be applied.
[0273] The provided solution means that the animations 122 are grouped into a glTFAnimationSample that applies only to a single object 112, although this is not necessary. Actually, since animations 122 in glTF clearly define which nodes are affected by the transformation described therein, a glTFAnimationSample could be kept generic and not be linked in the glTF, as described above, to a single object 112.sub.1 or 112.sub.2.
[0274] Another alternative to avoid this is to keep the glTFAnimationSample as generic as it currently is, able to indicate several animations that apply to different objects. In such a case, the object to which vol_scan_weight 128b applies needs to be indicated. For instance, as shown in
[0275] The object_index could be the node index to which an object 112 is attached. In particular, this means that linking the weight 128b of the volumetric scan 260 to an object 112 is done within the glTFAnimationSample by pointing to the node (e.g., node 0 in the previous example).
[0276] As for the animation case described above, there may be scenarios where different joints in the skeleton are affected and the combination cannot use a single weight 128b for all joints. In another embodiment, several weights 128b are provided.
[0277] An example of such a case is shown in
[0278] Then, it is possible to apply different weights 128b to the different components of the transformation of the video 260. For instance, in the example given above, the weight 128b of a potential head rotation movement in the volumetric scan 260 can be set to 0 so that the rotation is only taken from the animations 122. Obviously, other examples with other (non-zero) values for the weights 128b can benefit from weighting different transformations differently.
[0279] An example is shown in
[0280] Please note that the highlighted syntax introduced before this example can be incorporated in this glTFAnimationSample syntax format.
TABLE-US-00009 TABLE 7 Semantics of syntax elements used in FIG. 36

num_objects: Number of objects for which animations are triggered. These objects may receive additional transformational information.
vol_num_channels: Number of transformations for the object.
vol_channel_index: The index of the transformation for the object.
vol_scan_weight: The weight for each of the transformations.
order_id: The order for the object.
object_index: The node index value to which the object is attached.
num_channels: Number of channels for an animation.
weight: The weight value for each channel in an animation.
channel_index: The index of the channel in an animation.
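For illustration only, a payload carrying the Table 7 fields may be sketched as a Python dict; the grouping and the concrete values are hypothetical, not a normative serialization of the glTFAnimationSample.

sample = {
    "num_objects": 1,
    "objects": [{
        "object_index": 0,      # node index to which the object is attached
        "order_id": 0,          # order for the object
        "vol_num_channels": 2,  # transformations of the volumetric scan
        "vol_channels": [
            {"vol_channel_index": 0, "vol_scan_weight": 1.0},  # keep translation
            {"vol_channel_index": 1, "vol_scan_weight": 0.0},  # suppress head rotation
        ],
        "num_channels": 1,      # channels of the triggered animation
        "channels": [{"channel_index": 0, "weight": 1.0}],
    }],
}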
[0281] Above, different embodiments for using a weighting parameter for volumetric scans are described. In the following it is described how the weighting parameter can be incorporated in the embodiment described with regard to
[0282] According to an embodiment, the application 210.sub.1 and/or the application 210.sub.2 might be performed using a weighting parameter 128b comprised by the first data 110 to amplify or dampen the movement 260 of the 3D object 112 defined by the first data 110 and/or using a weighting parameter 128 comprised by the second data 120 to amplify or dampen pose movements 250 of the 3D object 112 caused by the animation 122. The weighting parameter 128 might be used as described with regard to one or more of
[0283] According to an embodiment, the animation 122 of the 3D object 112 is defined in the scene description data 100 in a manner decomposed into channels and the apparatus 200 is configured to, using the weighting parameter 128 comprised by the second data 120, amplify or dampen the pose movements 250 of the 3D object 112 caused by the animation 122 dependent on the channel with which the respective pose movement 250 is associated.
[0284] According to an embodiment, the movement 260 of the 3D object 112 is defined in the scene description data 100 in a manner decomposed into channels, e.g., the movement 260 of the 3D object can be split into individual movements and each channel defines an individual movement 260 to be applied to a certain joint of the 3D object, e.g., using a rigged 3D object. For example, different channels might define different movements for the same joint of the 3D object. Additionally, the apparatus 200 is configured to, using the weighting parameter 128b comprised by the first data 110, amplify or dampen individual movements corresponding to the movement 260 of the 3D object 112 defined by the first data 110 dependent on the channel with which the respective individual movement 260 is associated.
[0285] Optionally, features and/or functionalities as described with regard to the weighting parameter 128 in one of
[0286] According to an embodiment, the apparatus 200 is configured to obtain, from the scene description data 100, third data defining a further 3D object 112.sub.2 and a movement 260.sub.2 of the further 3D object 112.sub.2, and obtain, from the scene description data 100, a weighting parameter 128b associated with an object identification. The apparatus 200 may be configured to decide based on the object identification whether the weighting parameter 128b is to be used to amplify or dampen the movement 260.sub.1 of the 3D object 112.sub.1 or the movement 260.sub.2 of the further 3D object 112.sub.2.
[0287] 6 Channel Transformations
[0288] The discussion above is mainly focused on how a set of animations 122 can be triggered and combined. Additionally, the animations 122 can be combined with the volumetric video 260 as well. There are syntaxes available above which point to the volumetric_object and volumetric object_channel, which could be combined together. Each operation could be weighted or ordered as explained above, e.g., as described with regard to one of
[0289] The discussion of how the skeleton transformation can be carried out is presented in section 3.1. The jointTransformationSample might provide a complete joint transformation value. However, it might also be interesting to provide much finer control over the different individual property transformations of a joint. That is, a jointNode is translated and rotated. Rather than combining these two transformation properties in a single JointTransformationSample, a separate sample can be provided for the property concerned.
[0290] This is, for example, shown in
[0291] The two or more channels 262.sub.1 and 262.sub.2 define different movements 260.sub.1/260.sub.2 associated with the same joint 113 of the 3D object 112. In the example shown in
[0292] An apparatus for animating or moving a 3D object might be configured to receive the scene description data 100 and obtain from the scene description data 100 the first data. The apparatus is configured to apply the movements 260.sub.1 and 260.sub.2 defined by the one or more channels of the set of one or more channels to the one joint, e.g., by the two or more channels of the set of two or more channels to the one joint.
[0293] Therefore, for a volumetric video stream, e.g., the movement 260, an additional timed metadata track can be used to indicate how many such node.property transformations, e.g., pose movements 260.sub.1 and 260.sub.2 or channels 262.sub.1 and 262.sub.2, are applied to the volumetric video 260.sub.1 and which node.property transformations are affecting the node. Each node.property transformation is indexed and can be accessed by other samples such as the glTFAnimationSample. In the section below, node.property transformations can thus be understood as channels 262.sub.1 and 262.sub.2, where a channel 262 affects only one particular property, e.g., a particular pose movement 260.sub.1 or 260.sub.2, of a node at a time.
[0294] This metadata track discussed here is the one that provides the updates of the skeleton (joints of the object) over time so that additional transformations provided by the animation 122 as discussed previously can be applied.
[0295] Also, instead of adding a single transformation, e.g., a particular pose movement 260.sub.1 or 260.sub.2, to achieve the end position, different transformations need to be indexed (see the discussion on channels above and vol_channel_index in the exemplary syntax in
[0296] As shown in
[0297] The samples are provided in tracks which store the actual transformation value for each channel 262 as defined. Each channel 262 has a target node, e.g., a joint 113, which is accessed by the node_index. The transform_property of the channel 262 determines the transformation type for the node. Multiple channels 262 can transform the same node. Each channel can be accessed by its unique channel_index.
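For illustration only, the channel definitions and one track sample may be sketched in Python; the dict layout is an assumption, while node_index, transform_property and channel_index mirror the syntax elements named above.

channels = [
    {"channel_index": 0, "node_index": 113, "transform_property": "translation"},
    {"channel_index": 1, "node_index": 113, "transform_property": "rotation"},
]  # two channels 262 transforming the same target node, e.g., joint 113

# One sample from the timed metadata track: the actual value per channel_index.
channel_sample = {
    0: (0.0, 0.1, 0.0),       # translation of the joint
    1: (0.0, 0.0, 0.0, 1.0),  # rotation of the joint as a quaternion
}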
[0298] According to an embodiment, each channel 262.sub.1 and 262.sub.2 of the set of one or more channels indicates a pose movement 260.sub.1 and 260.sub.2 for a joint 113 of the 3D object 112. Two or more channels 262 of the set of one or more channels 262 might be associated with the same joint 113 of the 3D object 112. For example, each channel 262 is only associated with one joint 113 of the 3D object 112, but more than one channel 262 can be applied to the same joint 113 of the 3D object 112, whereby different pose movements 260 can be applied to the same joint 113. Second data, e.g., representing the sample, might store the actual transformation value for each channel of the set of one or more channels.
[0299] In the example shown in
[0300] 7 Dynamic Animation
[0301] In section 5 above we have discussed how a volumetric video 260 can be attached to a glTF scene. For example, in the glTF pseudo file shown in
[0302] So, the motion of the volumetric video is expressed by the dynamic update of the corresponding mesh data.
[0303] In some cases, when there is enough throughput to transmit the data, the framerate of the volumetric video could be high enough to provide a smooth movement 260. In such a case, rendering the transmitted volumetric scan frames would be good enough. However, in other cases, e.g., when the throughput of the network is not high enough, only some key frames would be transmitted and some kind of interpolation would be used at the rendering side.
[0304] In one embodiment, signalling is provided that indicates whether such interpolation is entailed or not. Also, the type of interpolation to be applied could be signalled.
[0305]
[0306] An apparatus 200 obtains the first data 110 and the second data 125 from the scene description data 100. The apparatus 200 is configured to apply a pose transition mode 218 indicated by the second data 125 to render the 3D object 112 on the basis of the first data 110.
[0307] If the second data 125 indicates a first predetermined mode 218.sub.1, e.g., a first predetermined pose transition mode, the apparatus 200 is configured to interpolate 270 between the poses of the 3D object 112 at the time frames 264. The apparatus 200 is configured to interpolate between the poses of two temporally adjacent/neighbouring time frames 264, so that a smooth transition is realized between these two time frames 264. Thus, the apparatus 200 may interpolate between a current time frame i and a temporally subsequent time frame i+1, wherein 1≤i≤n−1 for a movement 260 defined by n time frames. This interpolation 270 might be done between all received temporally neighbouring time frames defining the movement 260. Received temporally neighbouring time frames represent the time frames received by the apparatus 200, e.g., from the scene description data 100.
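For illustration only, the interpolation 270 of the first predetermined mode 218.sub.1 may be sketched in Python; poses are reduced to per-joint translation vectors here, whereas rotations would typically use spherical interpolation, and all names are hypothetical.

def interpolate_pose(pose_i, pose_i1, t):
    # t in [0, 1]: normalized time between time frame i and time frame i+1
    return {joint: tuple(a + t * (b - a) for a, b in zip(pose_i[joint], pose_i1[joint]))
            for joint in pose_i}

pose_a = {"knee": (0.0, 0.0, 0.0)}
pose_b = {"knee": (0.0, 1.0, 0.0)}
assert interpolate_pose(pose_a, pose_b, 0.5) == {"knee": (0.0, 0.5, 0.0)}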
[0308] If the second data 125 indicates a second predetermined mode 218.sub.2, e.g., a second predetermined pose transition mode, the apparatus 200 is configured to obtain, from the scene description data 100, third data 120 which triggers, for each of one or more of the time frames 264, one or more animations 122 of the 3D object 112. The apparatus 200 is configured to apply the one or more animations 122 to transition the 3D object 112 from the pose of the 3D object 112 at the respective time frame i towards the pose of the object 112 at a subsequent time frame i+1, wherein 1≤i≤n−1 for a movement 260 defined by n time frames. By applying the one or more animations 122 to a pose defined by a time frame 264, a smooth transition between this pose and the pose of a temporally subsequent time frame is possible. The movement 260 defined by the time frames 264 is in the example shown in
[0309] According to an embodiment, the second data 125 defines the animation 122 of the 3D object 112, or each of the animations of the 3D object 112, as a set of one or more channels, each of which indicates a pose movement for a joint of the 3D object 112, and a set of samplers defining values for a pose movement at certain time instants. The second data 125 defines the one or more animations to be applied to the 3D object in a manner so that values for a pose movement at time instants between the certain time instants defined by the samplers are interpolated for the respective channel.
[0310] The signalling could just be added to glTF as an extension to the meshes, as shown in
[0311] Another possible way of doing it would be to use glTF animations. Animations 122 in glTF provide the functionality to deform the 3D geometry. The deformation of the 3D geometry is controlled through animation.channel and animation.sampler. Therefore, the dynamic motion update of the volumetric video 260 could also be expressed by a set of corresponding animation.sampler and animation.channel entries.
[0312] In such a mechanism, as discussed, the mesh data would not necessarily have to be updated constantly. Rather, the deformations of the mesh data which represent the actual motion 260 of the character 112 can be expressed with glTF animations 122.
[0313] This can be understood in line with video coding. Mesh data is provided periodically or randomly in the volumetric video sequence (like intra pictures), and updates representing the changes in the period are addressed by glTF animations (like inter pictures), using the actual mesh data as the reference for the animations.
[0314] Therefore, a volumetric video sequence 260 can be expressed as a set of key-frame states 264. The client interpolates 270 between the key-frames 264 to give the illusion of a motion 260. The volumetrically captured object 112 and its relevant pose information are stored in a glTF file. The sampler.output and sampler.input point to the key frame timing and the transformation property for a node. The information received in the jointTransformationSample can be put in dynamic buffers which can be accessed by timed accessors. The sampler.output and sampler.input store the index values of the timedAccessors to retrieve dynamic updates for the animation. The key idea would be to indicate that the key frames 264 used for the animation 122 are taken from a volumetric video scan. See, for example
[0315] Accessor 5 and accessor 6 can be timedAccessors (i.e., the buffer data referred to by the accessors is dynamic). The relevant dynamic buffer data of the different joint nodes should be updated. Therefore, the pose update received should also include the relevant accessor index indicating where the transformation data gets stored in the binary blob.
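For illustration only, the key idea above may be sketched in Python; the dict layout and the update hook are hypothetical, with sampler.input and sampler.output referring to the timedAccessors (here 5 and 6) whose buffer data is refreshed from the volumetric video pose updates.

animation = {
    "samplers": [{"input": 5, "output": 6, "interpolation": "LINEAR"}],
    "channels": [{"sampler": 0, "target": {"node": 2, "path": "translation"}}],
}

# Dynamic buffers behind the timedAccessors, indexed by accessor index.
timed_buffers = {5: [0.0, 1.0], 6: [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]}

def on_pose_update(accessor_index, new_data):
    # The received pose update includes the accessor index telling where the
    # transformation data is to be stored in the binary blob.
    timed_buffers[accessor_index] = new_data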
[0316] Above, different inventive embodiments and aspects have been described in a chapter "Solutions", in a chapter "Volumetric video in glTF and animations", in a chapter "Channel transformations" and in a chapter "Dynamic Animation".
[0317] Also, further embodiments will be defined by the enclosed claims.
[0318] It should be noted that any embodiments as defined by the claims can be supplemented by any of the details (features and functionalities) described in the above mentioned chapters.
[0319] Also, the embodiments described in the above mentioned chapters can be used individually, and can also be supplemented by any of the features in another chapter, or by any feature included in the claims.
[0320] Also, it should be noted that individual aspects described herein can be used individually or in combination. Thus, details can be added to each of said individual aspects without adding details to another one of said aspects.
[0321] It should also be noted that the present disclosure describes, explicitly or implicitly, features usable in an encoder and in a decoder. Thus, any of the features described herein can be used in the context of an encoder and in the context of a decoder.
[0322] Moreover, features and functionalities disclosed herein relating to a method can also be used in an apparatus (configured to perform such functionality). Furthermore, any features and functionalities disclosed herein with respect to an apparatus can also be used in a corresponding method. In other words, the methods disclosed herein can be supplemented by any of the features and functionalities described with respect to the apparatuses.
[0323] Also, any of the features and functionalities described herein can be implemented in hardware or in software, or using a combination of hardware and software, as will be described in the section “implementation alternatives”.
[0324] Implementation Alternatives:
[0325] Although some aspects have been described in the context of an apparatus, it is clear that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. Analogously, aspects described in the context of a method step also represent a description of a corresponding block or item or feature of a corresponding apparatus. Some or all of the method steps may be executed by (or using) a hardware apparatus, like for example, a microprocessor, a programmable computer or an electronic circuit. In some embodiments, one or more of the most important method steps may be executed by such an apparatus.
[0326] Depending on certain implementation requirements, embodiments of the invention can be implemented in hardware or in software. The implementation can be performed using a digital storage medium, for example a floppy disk, a DVD, a Blu-Ray, a CD, a ROM, a PROM, an EPROM, an EEPROM or a FLASH memory, having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed. Therefore, the digital storage medium may be computer readable.
[0327] Some embodiments according to the invention comprise a data carrier having electronically readable control signals, which are capable of cooperating with a programmable computer system, such that one of the methods described herein is performed.
[0328] Generally, embodiments of the present invention can be implemented as a computer program product with a program code, the program code being operative for performing one of the methods when the computer program product runs on a computer. The program code may for example be stored on a machine readable carrier.
[0329] Other embodiments comprise the computer program for performing one of the methods described herein, stored on a machine readable carrier.
[0330] In other words, an embodiment of the inventive method is, therefore, a computer program having a program code for performing one of the methods described herein, when the computer program runs on a computer.
[0331] A further embodiment of the inventive methods is, therefore, a data carrier (or a digital storage medium, or a computer-readable medium) comprising, recorded thereon, the computer program for performing one of the methods described herein. The data carrier, the digital storage medium or the recorded medium are typically tangible and/or non-transitory.
[0332] A further embodiment of the inventive method is, therefore, a data stream [e.g. the scene description data] or a sequence of signals representing the computer program for performing one of the methods described herein. The data stream or the sequence of signals may for example be configured to be transferred via a data communication connection, for example via the Internet.
[0333] A further embodiment comprises a processing means, for example a computer, or a programmable logic device, configured to or adapted to perform one of the methods described herein.
[0334] A further embodiment comprises a computer having installed thereon the computer program for performing one of the methods described herein.
[0335] A further embodiment according to the invention comprises an apparatus or a system configured to transfer (for example, electronically or optically) a computer program for performing one of the methods described herein to a receiver. The receiver may, for example, be a computer, a mobile device, a memory device or the like. The apparatus or system may, for example, comprise a file server for transferring the computer program to the receiver.
[0336] In some embodiments, a programmable logic device (for example a field programmable gate array) may be used to perform some or all of the functionalities of the methods described herein. In some embodiments, a field programmable gate array may cooperate with a microprocessor in order to perform one of the methods described herein. Generally, the methods may be performed by any hardware apparatus.
[0337] The apparatus described herein may be implemented using a hardware apparatus, or using a computer, or using a combination of a hardware apparatus and a computer.
[0338] The apparatus described herein, or any components of the apparatus described herein, may be implemented at least partially in hardware and/or in software.
[0339] The methods described herein may be performed using a hardware apparatus, or using a computer, or using a combination of a hardware apparatus and a computer.
[0340] The methods described herein, or any components of the apparatus described herein, may be performed at least partially by hardware and/or by software.
[0341] While this invention has been described in terms of several embodiments, there are alterations, permutations, and equivalents which fall within the scope of this invention. It should also be noted that there are many alternative ways of implementing the methods and compositions of the present invention. It is therefore intended that the following appended claims be interpreted as including all such alterations, permutations and equivalents as fall within the true spirit and scope of the present invention.