G06T13/00

AUGMENTED REALITY TRANSLATION OF SIGN LANGUAGE CLASSIFIER CONSTRUCTIONS

A method, computer system, and computer program product for translating a classifier construction into a graphical representation are provided. The present invention may include observing a classifier handshape by an augmented reality device. The present invention may include analyzing the observed classifier handshape according to an object recognition algorithm to determine a contextual meaning of the classifier handshape. The present invention may include converting the contextual meaning of the observed classifier handshape into a graphical representation. The present invention may include displaying the graphical representation alongside the observed classifier handshape on the augmented reality device.
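The translation step described above can be sketched as a lookup pipeline from a recognized handshape (plus discourse context) to a contextual meaning, then to a graphic. A minimal sketch, assuming hypothetical handshape labels, meanings, and graphic names; a real system would get the handshape and context from the object recognition stage:

```python
# Illustrative sketch of the classifier-translation pipeline.
# All handshape labels, meanings, and graphic names are invented examples.

CLASSIFIER_MEANINGS = {
    # (handshape, discourse context) -> contextual meaning
    ("3-handshape", "traffic"): "vehicle moving",
    ("bent-V", "person"): "person sitting",
}

GRAPHICS = {
    "vehicle moving": "car_icon_animated",
    "person sitting": "seated_figure_icon",
}

def translate_classifier(handshape: str, context: str) -> dict:
    """Map an observed classifier handshape to a graphical representation."""
    meaning = CLASSIFIER_MEANINGS.get((handshape, context))
    if meaning is None:
        # No contextual interpretation found; display nothing alongside the sign.
        return {"meaning": None, "graphic": None}
    return {"meaning": meaning, "graphic": GRAPHICS[meaning]}

print(translate_classifier("3-handshape", "traffic"))
```

The two-stage lookup mirrors the abstract's split between determining a contextual meaning and converting that meaning into a graphic.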

Personalized speech-to-video with three-dimensional (3D) skeleton regularization and expressive body poses

Presented herein are novel embodiments for converting a given speech audio or text into a photo-realistic speaking video of a person with synchronized, realistic, and expressive body dynamics. In one or more embodiments, 3D skeleton movements are generated from the audio sequence using a recurrent neural network, and an output video is synthesized via a conditional generative adversarial network. To make movements realistic and expressive, the knowledge of an articulated 3D human skeleton and a learned dictionary of personal speech iconic gestures may be embedded into the generation process in both the learning and testing pipelines. The former prevents the generation of unreasonable body distortion, while the latter helps the model quickly learn meaningful body movements from a few videos. To produce photo-realistic and high-resolution video with motion details, a part-attention mechanism is inserted in the conditional GAN, where each detailed part is automatically zoomed in to have its own discriminator.
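The skeleton-regularization idea, preventing unreasonable body distortion, can be illustrated by projecting predicted joint positions back onto known bone lengths after each prediction step. A minimal sketch, assuming a made-up two-bone skeleton; the actual patent embeds this constraint inside the learning and testing pipelines rather than as a post-hoc projection:

```python
import numpy as np

# Sketch of enforcing an articulated skeleton: rescale each predicted
# bone vector to its canonical length, walking root-outward so each
# child joint is re-anchored to its (already corrected) parent.
# The joint topology and bone lengths below are invented for the example.

BONES = [(0, 1), (1, 2)]               # (parent, child) joint pairs, root first
BONE_LENGTHS = np.array([0.3, 0.25])   # canonical lengths of the skeleton

def regularize_skeleton(joints: np.ndarray) -> np.ndarray:
    """Project predicted 3D joints onto fixed bone lengths."""
    fixed = joints.copy()
    for (parent, child), length in zip(BONES, BONE_LENGTHS):
        vec = fixed[child] - fixed[parent]
        norm = np.linalg.norm(vec)
        if norm > 0:
            fixed[child] = fixed[parent] + vec * (length / norm)
    return fixed

raw = np.array([[0.0, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.5, 0.4]])
out = regularize_skeleton(raw)
```

After projection, every bone in `out` has its canonical length regardless of how far the raw prediction drifted, which is the distortion guarantee the abstract attributes to the embedded skeleton knowledge.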

METHOD FOR ANIMATION SYNTHESIS, ELECTRONIC DEVICE AND STORAGE MEDIUM
20220375456 · 2022-11-24

A method for animation synthesis includes: obtaining an audio stream to be processed and a syllable sequence, wherein both the audio stream and the syllable sequence correspond to the same text and each syllable in the syllable sequence is the pinyin of a character of the text; obtaining a phoneme information sequence of the audio stream by performing phoneme detection on the audio stream, wherein each piece of phoneme information in the phoneme information sequence comprises a phoneme category and a pronunciation time period; determining a pronunciation time period corresponding to each syllable in the syllable sequence based on the syllable sequence, phoneme categories and pronunciation time periods in the phoneme information sequence; and generating an animation video corresponding to the audio stream based on the pronunciation time period corresponding to each syllable in the syllable sequence and an animation frame sequence corresponding to each syllable.
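The syllable-alignment step above can be sketched as merging the detected phoneme time periods that belong to each pinyin syllable. A minimal sketch, assuming each syllable is known to consume a fixed number of phonemes; the phoneme data and syllable-to-phoneme counts below are invented for illustration:

```python
# Hedged sketch: a syllable's pronunciation time period spans from the
# start of its first phoneme to the end of its last phoneme.

def align_syllables(syllables, phoneme_counts, phoneme_info):
    """Return (syllable, start, end) tuples from detected phoneme periods."""
    periods, i = [], 0
    for syl in syllables:
        n = phoneme_counts[syl]          # phonemes this syllable consumes
        group = phoneme_info[i:i + n]    # its slice of the phoneme sequence
        periods.append((syl, group[0][1], group[-1][2]))
        i += n
    return periods

# phoneme_info entries: (phoneme category, start seconds, end seconds)
phonemes = [("n", 0.00, 0.08), ("i", 0.08, 0.20),
            ("h", 0.20, 0.26), ("ao", 0.26, 0.45)]
print(align_syllables(["ni", "hao"], {"ni": 2, "hao": 2}, phonemes))
# → [('ni', 0.0, 0.2), ('hao', 0.2, 0.45)]
```

These per-syllable time periods are what the final step would use to schedule each syllable's animation frame sequence against the audio.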

LATE WARPING TO MINIMIZE LATENCY OF MOVING OBJECTS
20220375026 · 2022-11-24

A method for minimizing latency of moving objects in an augmented reality (AR) display device is described. In one aspect, the method includes determining an initial pose of a visual tracking device, identifying an initial location of an object in an image that is generated by an optical sensor of the visual tracking device, the image corresponding to the initial pose of the visual tracking device, rendering virtual content based on the initial pose and the initial location of the object, retrieving an updated pose of the visual tracking device, tracking an updated location of the object in an updated image that corresponds to the updated pose, and applying a time warp transformation to the rendered virtual content based on the updated pose and the updated location of the object to generate transformed virtual content.
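The late-warp step can be illustrated in two dimensions: content rendered for an initial pose and object location is shifted just before display by the pose delta and the object's motion. A deliberately simplified sketch; a real time warp applies a full reprojection (e.g. a homography) rather than a 2-D translation, and all coordinates here are invented:

```python
# Simplified 2-D sketch of late warping for a moving, tracked object.

def late_warp(render_pos, initial_pose, updated_pose, initial_obj, updated_obj):
    """Shift rendered content by the camera-pose delta and the object's motion."""
    pose_dx = updated_pose[0] - initial_pose[0]
    pose_dy = updated_pose[1] - initial_pose[1]
    obj_dx = updated_obj[0] - initial_obj[0]
    obj_dy = updated_obj[1] - initial_obj[1]
    # Camera motion moves the scene opposite to the pose change, while
    # object motion carries the anchored content along with the object.
    return (render_pos[0] - pose_dx + obj_dx,
            render_pos[1] - pose_dy + obj_dy)

print(late_warp((100, 50), (0, 0), (5, 0), (40, 40), (42, 40)))
# → (97, 50)
```

Because the warp reuses the already-rendered content instead of re-rendering it, the correction costs far less than a frame, which is where the latency reduction for moving objects comes from.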

Portable multifunction device with animated sliding user interface transitions

In accordance with some embodiments, a computer-implemented method is performed at a portable multifunction device with a touch screen display. The computer-implemented method includes: displaying a home menu comprising a plurality of application launch icons; detecting activation of any respective application launch icon; and, in response to detecting the activation, displaying a first animation of a transition from display of the home menu to display of an application that corresponds to the activated application launch icon. The first animation comprises expanding an image of the application.
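The expanding-image transition can be sketched as linear interpolation of the launch icon's bounding rectangle toward the full screen over the animation's frames. The rectangle format `(x, y, w, h)`, screen size, and step count are assumptions for illustration:

```python
# Sketch of the "first animation": expand an image of the application
# from the icon's bounds to the full screen over a fixed number of steps.

def expand_animation(icon_rect, screen_rect, steps):
    """Linearly interpolate (x, y, w, h) from icon_rect to screen_rect."""
    frames = []
    for k in range(steps + 1):
        t = k / steps  # 0.0 at the icon, 1.0 at full screen
        frames.append(tuple(a + (b - a) * t
                            for a, b in zip(icon_rect, screen_rect)))
    return frames

frames = expand_animation((10, 10, 40, 40), (0, 0, 320, 480), 4)
# First frame matches the icon; last frame fills the screen.
```

In practice such transitions use an easing curve rather than a linear ramp; the linear version keeps the sketch minimal.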

System and method for rendering a design including a dynamic design element

A computer-implemented method for rendering a page that includes one or more dynamic design elements into an output video is described. The method comprises processing the page to generate one or more layers, each layer being either a static layer associated with one or more static design elements of the page or a dynamic layer associated with a single dynamic design element of the page. Output frames are then generated using the layers and then encoded into the output video.
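The layer split described above can be sketched as caching static layers once while re-rendering each dynamic layer per frame, then compositing in page order. A minimal sketch in which strings stand in for rasterized layer images; the layer dictionary shape and the `"+".join` compositing are invented for illustration:

```python
# Sketch of layer-based frame generation: static layers are rasterized
# once and reused; dynamic layers are rendered for every frame.

def render_video(layers, num_frames):
    """Return one composited 'frame' string per time step."""
    static_cache = {i: spec["render"]() for i, spec in enumerate(layers)
                    if spec["kind"] == "static"}
    frames = []
    for f in range(num_frames):
        parts = []
        for i, spec in enumerate(layers):
            if spec["kind"] == "static":
                parts.append(static_cache[i])    # reuse the cached raster
            else:
                parts.append(spec["render"](f))  # dynamic: render per frame
        frames.append("+".join(parts))           # composite in page order
    return frames

layers = [
    {"kind": "static", "render": lambda: "bg"},
    {"kind": "dynamic", "render": lambda f: f"vid{f}"},
]
print(render_video(layers, 3))
# → ['bg+vid0', 'bg+vid1', 'bg+vid2']
```

The resulting frame sequence is what would then be handed to a video encoder to produce the output video.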
