G06N3/094

THREE-DIMENSIONAL POSE ESTIMATION

Devices and techniques are generally described for estimating three-dimensional pose data. In some examples, a first machine learning network may generate first three-dimensional (3D) data representing input 2D data. In various examples, a first 2D projection of the first 3D data may be generated. A determination may be made that the first 2D projection conforms to a distribution of natural 2D data. A second machine learning network may generate parameters of a 3D model based at least in part on the input 2D data and based at least in part on the first 3D data. In some examples, second 3D data may be generated using the parameters of the 3D model.
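
The projection step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which camera model is used, so the pinhole projection below, the function name `project_to_2d`, and the `focal_length` parameter are all assumptions made for illustration only.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]
Point2D = Tuple[float, float]


def project_to_2d(points: List[Point3D], focal_length: float = 1.0) -> List[Point2D]:
    """Project 3D points onto an image plane with a pinhole model.

    Assumption: a pinhole camera at the origin looking along +z.
    The abstract does not specify the projection that is actually used.
    """
    projected = []
    for x, y, z in points:
        if z <= 0:
            raise ValueError("points must lie in front of the camera (z > 0)")
        # Perspective divide: scale x and y by focal_length / depth.
        projected.append((focal_length * x / z, focal_length * y / z))
    return projected


# Hypothetical 3D joint positions, for illustration only.
joints_3d = [(0.0, 0.0, 2.0), (1.0, -1.0, 4.0)]
joints_2d = project_to_2d(joints_3d, focal_length=2.0)
```

In a pipeline like the one described, the resulting 2D points would then be checked against a distribution of natural 2D data; that check is not sketched here.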

STABLE POSE ESTIMATION WITH ANALYSIS BY SYNTHESIS
20220392099 · 2022-12-08

One embodiment of the present invention sets forth a technique for generating a pose estimation model. The technique includes generating one or more trained components included in the pose estimation model based on a first set of training images and a first set of labeled poses associated with the first set of training images, wherein each labeled pose includes a first set of positions on a left side of an object and a second set of positions on a right side of the object. The technique also includes training the pose estimation model based on a set of reconstructions of a second set of training images, wherein the set of reconstructions is generated by the pose estimation model from a set of predicted poses outputted by the one or more trained components.

VIDEO REENACTMENT TAKING INTO ACCOUNT TEMPORAL INFORMATION
20220392490 · 2022-12-08

Apparatuses, methods, and computer-readable media are described for inserting identity information from a source image (static image or video) (301) into a destination video (302), while mimicking motion of the destination video (302). In an apparatus embodiment, an identity encoder (304) is configured to encode identity information of the source image (301). When the source image (301) is a multi-frame static image or a video, an identity code aggregator (307) is positioned at an output of the identity encoder (304) and produces an identity vector (314). A driver encoder (313) is coupled to the destination (driver) video (302) and has two components: a pose encoder (305) configured to encode pose information of the destination video (302), and a motion encoder (315) configured to separately encode motion information of the destination video (302). The driver encoder (313) produces two vectors: a pose vector (308) and a motion vector (316). A neural network generator (310) has three inputs: the identity vector (314), the pose vector (308), and the motion vector (316). The neural network generator (310) is configured to generate, in response to these three inputs, a composite video (303) comprising identity information of the source image (301) inserted into the destination video (302), where the composite video (303) has substantially the same temporal information as the destination video (302).

ARTIFICIAL INTELLIGENCE APPROACHES FOR PREDICTING CONVERSION ACTIVITY PROBABILITY SCORES AND KEY PERSONAS FOR TARGET ENTITIES

The present disclosure relates to systems, methods, and non-transitory computer readable media for accurately and efficiently predicting conversion probability scores and key personas for target entities utilizing an artificial intelligence approach. For example, the disclosed systems utilize a conversion activity score neural network to predict conversion activity probability scores for target entities and utilize a persona prediction machine learning model to predict key personas associated with target entities. In particular, the disclosed systems utilize the conversion activity score neural network to generate a predicted conversion activity probability score for a target entity from input data including client device interactions of digital profiles belonging to the target entity as well as an entity feature vector representing characteristics of the target entity. The disclosed systems also (or alternatively) utilize a persona prediction machine learning model to determine a set of key personas for the target entity from the entity feature vector.

NON-LINEAR LATENT TO LATENT MODEL FOR MULTI-ATTRIBUTE FACE EDITING

Systems and methods for image processing are described. One or more embodiments of the present disclosure identify a latent vector representing an image of a face, identify a target attribute vector representing a target attribute for the image, generate a modified latent vector using a mapping network that converts the latent vector and the target attribute vector into a hidden representation having fewer dimensions than the latent vector, wherein the modified latent vector is generated based on the hidden representation, and generate a modified image based on the modified latent vector, wherein the modified image represents the face with the target attribute.
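
The mapping described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the single `tanh` layer, the random weights, the vector sizes, and the names `map_latent`, `matvec`, and `random_matrix` are hypothetical, and a real mapping network would be trained rather than randomly initialized.

```python
import math
import random
from typing import List

Vector = List[float]
Matrix = List[Vector]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def random_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    """Hypothetical untrained weights, used only to make the sketch runnable."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def map_latent(latent: Vector, attribute: Vector, w_in: Matrix, w_out: Matrix) -> Vector:
    """Map (latent, attribute) through a smaller hidden layer to a new latent."""
    # Non-linear bottleneck: concatenate the inputs, project, apply tanh.
    hidden = [math.tanh(h) for h in matvec(w_in, latent + attribute)]
    if len(hidden) >= len(latent):
        raise ValueError("hidden representation must have fewer dimensions than the latent vector")
    # Expand the hidden representation back to the latent dimensionality.
    return matvec(w_out, hidden)


rng = random.Random(0)
latent = [rng.gauss(0.0, 1.0) for _ in range(8)]   # stand-in for a face latent vector
attribute = [1.0, 0.0]                             # stand-in for a target attribute vector
w_in = random_matrix(3, len(latent) + len(attribute), rng)
w_out = random_matrix(len(latent), 3, rng)
modified_latent = map_latent(latent, attribute, w_in, w_out)
```

The modified latent vector has the same dimensionality as the input latent, so it could be passed to an image generator in place of the original; the generator itself is outside the scope of this sketch.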

THREE-DIMENSIONAL PRINTING USING GENERATIVE ADVERSARIAL NETWORK TECHNIQUES

Provided is a system, method, and computer program product for generating a three-dimensional (3D) printable file of a complete object by re-assembling pieces of a broken object using generative adversarial network techniques. A processor may generate a 3D scan of each piece of a plurality of pieces of a broken object. The processor may assemble the 3D scan of each piece of the plurality of pieces to generate a re-assembled object, where the re-assembled object includes one or more gaps. The processor may fill the one or more gaps in the re-assembled object to create a complete object. The processor may generate a 3D printable file of the complete object.

PASSWORD DISCOVERY SYSTEM USING A GENERATIVE ADVERSARIAL NETWORK
20220391491 · 2022-12-08

Systems and methods for password discovery are provided. A system receives a first password data set comprising known passwords and applies a rule-set to the first data set to generate a second password data set comprising passwords that are believed likely to be human-generated. The system trains a generative adversarial network (GAN) to generate predicted passwords using the second data set, for example by incentivizing the GAN to favor passwords in the second data set. The system applies the GAN to generate a third password data set comprising predicted passwords. The system compares the third password data set to a data corpus to identify a string in the data corpus that matches one of the predicted passwords in the third password data set. The identified string may thus be identified as a previously undiscovered password, which may be applied to unlock password-protected systems and/or to further improve password discovery systems.
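
The two non-neural steps of this pipeline, rule-based filtering and corpus matching, can be sketched as follows. The abstract does not disclose the rule-set or the matching method, so the rule in `looks_human_generated`, the whitespace tokenization, and all function names are assumptions for illustration; the GAN itself is not sketched.

```python
import re
from typing import Iterable, List, Set


def looks_human_generated(password: str) -> bool:
    """Hypothetical rule: 6 to 20 characters with a run of at least 3 letters."""
    return 6 <= len(password) <= 20 and re.search(r"[A-Za-z]{3,}", password) is not None


def filter_known_passwords(known: Iterable[str]) -> List[str]:
    """Build the second data set by applying the rule to the first data set."""
    return [p for p in known if looks_human_generated(p)]


def find_matches(predicted: Iterable[str], corpus_text: str) -> Set[str]:
    """Return corpus tokens that exactly match a predicted password.

    Assumption: the corpus is plain text split on whitespace.
    """
    tokens = set(corpus_text.split())
    return tokens & set(predicted)


second_data_set = filter_known_passwords(["summer2021", "x9", "8371902846"])
matches = find_matches(["summer2022", "winter99"], "note: login winter99 then logout")
```

Here `second_data_set` would feed GAN training, and the list passed to `find_matches` stands in for the GAN's predicted passwords.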

RESTORING DEGRADED DIGITAL IMAGES THROUGH A DEEP LEARNING FRAMEWORK
20220392025 · 2022-12-08

The present disclosure relates to systems, methods, and non-transitory computer readable media for accurately, efficiently, and flexibly restoring degraded digital images utilizing a deep learning framework for repairing local defects, correcting global imperfections, and/or enhancing depicted faces. In particular, the disclosed systems can utilize a defect detection neural network to generate a segmentation map indicating locations of local defects within a digital image. In addition, the disclosed systems can utilize an inpainting algorithm to determine pixels for inpainting the local defects to reduce their appearance. In some embodiments, the disclosed systems utilize a global correction neural network to determine and repair global imperfections. Further, the disclosed systems can enhance one or more faces depicted within a digital image utilizing a face enhancement neural network.

VIDEO REENACTMENT WITH HAIR SHAPE AND MOTION TRANSFER
20220392255 · 2022-12-08

Methods and apparatuses are described for inserting face and hair information from a source video (401) into a destination (driver) video (402) while mimicking pose, illumination, and hair motion of the destination video (402). An apparatus embodiment comprises an identity encoder (404) configured to encode face and hair information of the source video (401) and to produce as an output an identity vector; a pose encoder (405) configured to encode pose information of the destination video (402) and to produce as an output a pose vector; an illumination encoder (406) configured to encode head and hair illumination of the destination video (402) and to produce as an output an illumination vector; and a hair motion encoder (414) configured to encode hair motion of the destination video (402) and to produce as an output a hair motion vector. The identity vector, pose vector, illumination vector, and hair motion vector are fed as inputs to a neural network generator (410). The neural network generator (410) is configured to generate, in response to the four inputs, a composite video (403) comprising face and hair information from the source video (401) inserted into the destination video (402).

METHOD, APPARATUS, ELECTRONIC DEVICE AND MEDIUM FOR IMAGE SUPER-RESOLUTION AND MODEL TRAINING
20220383452 · 2022-12-01

The embodiments of the present application provide a method, an apparatus, an electronic device, and a medium for image super-resolution and model training. The method includes: inputting an image to be processed into a first super-resolution network model and a second super-resolution network model, each trained in advance, wherein the first super-resolution network model is a trained convolutional neural network and the second super-resolution network model is a generative network included in a trained generative adversarial network; obtaining a first image output from the first super-resolution network model and a second image output from the second super-resolution network model; and fusing the first image and the second image to obtain a target image, wherein the resolution of the target image is greater than the resolution of the image to be processed.
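
The fusion step can be illustrated with a minimal sketch. The abstract does not state how the two outputs are fused, so the per-pixel weighted average below, the `weight` parameter, and the name `fuse_images` are assumptions chosen for illustration. Images are represented as nested lists of grayscale values to keep the example self-contained.

```python
from typing import List

Image = List[List[float]]


def fuse_images(first: Image, second: Image, weight: float = 0.5) -> Image:
    """Blend two same-sized images pixel by pixel.

    Assumption: fusion is a weighted average, where `weight` is the share
    given to the first image (e.g. the CNN output) and the remainder goes
    to the second image (e.g. the GAN generator output).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    if len(first) != len(second) or any(len(a) != len(b) for a, b in zip(first, second)):
        raise ValueError("images must have the same dimensions")
    return [
        [weight * a + (1.0 - weight) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(first, second)
    ]


# Hypothetical 1x2 outputs from the two super-resolution models.
cnn_output = [[0.0, 1.0]]
gan_output = [[1.0, 0.0]]
target_image = fuse_images(cnn_output, gan_output, weight=0.5)
```

A weighted average is one plausible way to trade the sharper detail typically associated with GAN outputs against the more conservative CNN output; the weighting in an actual system may differ or vary per pixel.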