Patent classifications
G06V10/776
METHOD FOR TRAINING FEATURE EXTRACTION MODEL, METHOD FOR CLASSIFYING IMAGE, AND RELATED APPARATUSES
The present disclosure provides a method for training a feature extraction model, a method for classifying an image and related apparatuses, and relates to the field of artificial intelligence technology such as deep learning and image recognition. The scheme comprises: extracting an image feature of each sample image in a sample image set using a basic feature extraction module of an initial feature extraction model, to obtain an initial feature vector set; performing normalization processing on each initial feature vector in the initial feature vector set using a normalization processing module of the initial feature extraction model, to obtain each normalized feature vector; and guiding training for the initial feature extraction model through a preset high discriminative loss function, to obtain a target feature extraction model as a training result.
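The normalization step described above can be sketched as follows. The abstract does not say which normalization is used, so this minimal example assumes L2 (unit-length) normalization; the function names and the sample vectors are hypothetical.

```python
import math

def l2_normalize(vector, eps=1e-12):
    """Scale one feature vector to unit L2 norm (assumed normalization choice)."""
    norm = math.sqrt(sum(x * x for x in vector))
    # eps guards against division by zero for an all-zero vector.
    return [x / max(norm, eps) for x in vector]

def normalize_feature_set(feature_vectors):
    """Apply the normalization to every vector in the initial feature vector set."""
    return [l2_normalize(v) for v in feature_vectors]

# Hypothetical initial feature vectors, standing in for the extractor's output.
initial_features = [[3.0, 4.0], [0.0, 2.0], [1.0, 1.0]]
normalized = normalize_feature_set(initial_features)
```

After this step every vector lies on the unit sphere, so a loss function applied to the vectors compares directions rather than magnitudes.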
AR POSITION AND ORIENTATION ALONG A PLANE
Aspects of the present disclosure involve a system for presenting AR items. The system performs operations including receiving a video that includes a depiction of one or more real-world objects in a real-world environment and obtaining depth data related to the real-world environment. The operations include generating a three-dimensional (3D) model of the real-world environment based on the video and the depth data and adding an augmented reality (AR) item to the video based on the 3D model of the real-world environment. The operations include determining that the AR item has been placed on a vertical plane of the real-world environment and modifying an orientation of the AR item to correspond to an orientation of the vertical plane.
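One way to picture the final two operations is the sketch below. It assumes a Y-up world coordinate system, a plane described by its normal vector, and an AR item whose orientation is reduced to a single yaw angle; the tolerance value and all function names are hypothetical and not taken from the disclosure.

```python
import math

UP = (0.0, 1.0, 0.0)  # assumed world up axis

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def is_vertical_plane(normal, tolerance_deg=10.0):
    """A vertical plane has a normal roughly perpendicular to the up axis."""
    n = normalize(normal)
    dot_up = sum(a * b for a, b in zip(n, UP))
    return abs(dot_up) <= math.sin(math.radians(tolerance_deg))

def yaw_facing_out_of_plane(normal):
    """Yaw in degrees about the up axis that points the item's forward (+Z) axis along the normal."""
    n = normalize(normal)
    return math.degrees(math.atan2(n[0], n[2]))

def orient_item(item_yaw_deg, plane_normal):
    """Return the item's yaw, re-aligned only when it sits on a vertical plane."""
    if is_vertical_plane(plane_normal):
        return yaw_facing_out_of_plane(plane_normal)
    return item_yaw_deg

wall_yaw = orient_item(0.0, (1.0, 0.0, 0.0))    # wall whose normal points along +X
floor_yaw = orient_item(30.0, (0.0, 1.0, 0.0))  # floor: orientation left unchanged
```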
Deep network lung texture recognition method combined with multi-scale attention
The invention discloses a deep network lung texture recognition method combined with multi-scale attention, which belongs to the field of image processing and computer vision. In order to accurately recognize the typical textures of diffuse lung disease in computed tomography (CT) images of the lung, a unique attention mechanism module and a multi-scale feature fusion module were designed to construct a deep convolutional neural network combining multi-scale feature fusion and attention, which achieves high-precision automatic recognition of typical textures of diffuse lung diseases. In addition, the proposed network structure is clear, easy to construct, and easy to implement.
METHOD, ELECTRONIC DEVICE, AND COMPUTER PROGRAM PRODUCT FOR TRAINING MODEL
Embodiments of the present disclosure provide a method, an electronic device, and a computer program product for training a model. The method may include determining image features, audio features, and text features of a reference object based on reference image information, reference audio information, and reference text information associated with the reference object, respectively. The method may also include constructing a feature tensor from the image features, the audio features, and the text features. In addition, the method may further include decomposing the feature tensor into a first feature vector, a second feature vector, and a third feature vector corresponding to the image features, the audio features, and the text features, respectively, to determine a loss function value of the model. The method may also include updating parameters of the model based on the loss function value.
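The construct-then-decompose idea can be illustrated with a small sketch. It assumes the feature tensor is the three-way outer product of the modality vectors, that the decomposition is a rank-1 approximation found by alternating power iteration, and that the loss is the squared reconstruction error; the abstract specifies none of these choices, and every name and value below is hypothetical.

```python
import math

def outer3(a, b, c):
    """Three-way outer product: T[i][j][k] = a[i] * b[j] * c[k]."""
    return [[[x * y * z for z in c] for y in b] for x in a]

def unit(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def rank1_decompose(T, iterations=20):
    """Rank-1 approximation T ~ scale * (u outer v outer w) by alternating updates."""
    I, J, K = len(T), len(T[0]), len(T[0][0])
    # All-ones start; assumes it is not orthogonal to the true factors.
    u, v, w = unit([1.0] * I), unit([1.0] * J), unit([1.0] * K)
    for _ in range(iterations):
        u = unit([sum(T[i][j][k] * v[j] * w[k] for j in range(J) for k in range(K)) for i in range(I)])
        v = unit([sum(T[i][j][k] * u[i] * w[k] for i in range(I) for k in range(K)) for j in range(J)])
        w = unit([sum(T[i][j][k] * u[i] * v[j] for i in range(I) for j in range(J)) for k in range(K)])
    scale = sum(T[i][j][k] * u[i] * v[j] * w[k]
                for i in range(I) for j in range(J) for k in range(K))
    return scale, u, v, w

def reconstruction_loss(T, scale, u, v, w):
    """Squared error between the tensor and its rank-1 reconstruction (assumed loss)."""
    return sum((T[i][j][k] - scale * u[i] * v[j] * w[k]) ** 2
               for i in range(len(u)) for j in range(len(v)) for k in range(len(w)))

# Hypothetical image, audio, and text feature vectors.
image_f, audio_f, text_f = [1.0, 2.0], [0.5, 1.5, 1.0], [2.0, 1.0]
feature_tensor = outer3(image_f, audio_f, text_f)
scale, u, v, w = rank1_decompose(feature_tensor)
loss = reconstruction_loss(feature_tensor, scale, u, v, w)
```

In a training loop, a value such as `loss` would drive the parameter update described in the last sentence of the abstract.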
METHODS AND APPARATUS FOR PREDICTING A USER CONVERSION EVENT
In some examples, a system may be configured to obtain a set of features of a set of users including one or more features of transaction data of the set of users and one or more features of engagement data of the set of users. Additionally, the system may be configured to implement a first set of operations that generate output data including a plurality of conversion scores, based on the set of features. In some examples, each conversion score of the plurality of conversion scores is associated with a particular user of the set of users and characterizes a likelihood of a conversion event in which the corresponding user changes from a trial-member status to a full-member status prior to a predetermined future time.
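As a rough illustration of turning per-user features into a conversion score, the sketch below uses a logistic model. The abstract does not name the model, so the logistic form, the feature names, and the weights are all assumptions made for this example.

```python
import math

def conversion_score(features, weights, bias=0.0):
    """Map a user's features to a score in (0, 1) with a logistic function (assumed model)."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    # Numerically stable sigmoid for both signs of z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

# Hypothetical weights for one transaction feature and one engagement feature.
weights = {"orders_last_30d": 0.4, "sessions_last_7d": 0.2}

# Hypothetical trial-member users.
users = {
    "user_a": {"orders_last_30d": 5.0, "sessions_last_7d": 10.0},
    "user_b": {"orders_last_30d": 0.0, "sessions_last_7d": 0.0},
}

scores = {uid: conversion_score(f, weights, bias=-2.0) for uid, f in users.items()}
```

Each entry in `scores` is tied to one user, which matches the one-score-per-user structure the abstract describes.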
Apparatus and method for retraining object detection using undetected image
An apparatus for retraining an object detector according to an exemplary embodiment includes an inputter configured to receive an undetected image, a style transferer configured to generate one or more first augmented images that have the same content attribute as an object area of the undetected image, but a different style attribute, a content transferer configured to generate one or more second augmented images that have the same style attribute as the object area, but a different content attribute, and an influence analyzer configured to analyze a cause of non-detection of the undetected image by comparing object detection reliabilities of the undetected image, the first augmented image, and the second augmented image.
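The comparison performed by the influence analyzer might look like the sketch below. It assumes that the attribute whose variation raises detection reliability the most is taken as the likely cause of non-detection; this decision rule, the function name, and the reliability values are hypothetical, since the abstract only states that the reliabilities are compared.

```python
from statistics import mean

def analyze_non_detection(original, style_varied, content_varied):
    """Attribute non-detection to 'style' or 'content' by comparing reliability gains.

    original:       detection reliability of the undetected image
    style_varied:   reliabilities of first augmented images (same content, new style)
    content_varied: reliabilities of second augmented images (same style, new content)
    """
    style_gain = mean(style_varied) - original
    content_gain = mean(content_varied) - original
    # No variation helped, or both helped equally: no clear cause (assumed rule).
    if max(style_gain, content_gain) <= 0 or style_gain == content_gain:
        return "undetermined"
    return "style" if style_gain > content_gain else "content"

# Changing the style helps far more here, so style is the suspected cause.
cause = analyze_non_detection(0.2, [0.7, 0.8], [0.25, 0.3])
```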
Optimizing supervised generative adversarial networks via latent space regularizations
A method of training a generator G of a Generative Adversarial Network (GAN) includes receiving, by an encoder E, a target data Y; receiving, by the encoder E, an output G(Z) of the generator G, where the generator G generates the output G(Z) in response to receiving a random sample Z and where a discriminator D of the GAN is trained to distinguish between the output G(Z) and the target data Y; training the encoder E to minimize a difference between a first latent space representation E(G(Z)) of the output G(Z) and a second latent space representation E(Y) of the target data Y, where the output G(Z) and the target data Y are input to the encoder E; and using the first latent space representation E(G(Z)) and the second latent space representation E(Y) to constrain the training of the generator G.
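The latent-space constraint can be sketched with toy stand-ins for E and G. The sketch assumes the difference between E(G(Z)) and E(Y) is measured as a squared Euclidean distance and is added to the generator's adversarial loss with a weighting factor; the toy encoder, the toy generator, the `weight` parameter, and the numeric values are all hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def encoder(sample):
    """Hypothetical stand-in for E: a fixed linear projection to a 2-D latent space."""
    return [sample[0] + sample[1], sample[0] - sample[1]]

def generator(z, gain):
    """Hypothetical stand-in for G: scales the random sample Z by one parameter."""
    return [gain * z[0], gain * z[1]]

def latent_loss(z, target, gain):
    """Difference between E(G(Z)) and E(Y), measured as squared distance (assumed)."""
    return squared_distance(encoder(generator(z, gain)), encoder(target))

def generator_loss(adversarial_term, z, target, gain, weight=0.1):
    """Adversarial loss plus the weighted latent-space term (assumed combination)."""
    return adversarial_term + weight * latent_loss(z, target, gain)

z, y = [1.0, 2.0], [2.0, 4.0]
loss_mismatched = generator_loss(0.5, z, y, gain=1.0)  # G(Z) differs from Y
loss_matched = generator_loss(0.5, z, y, gain=2.0)     # G(Z) equals Y
```

When G(Z) matches Y, the latent term vanishes and only the adversarial term remains, which is the sense in which the latent representations constrain the generator's training.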