G06V10/7747

SYSTEMS AND METHODS FOR A LIGHTWEIGHT PATTERN-AWARE GENERATIVE ADVERSARIAL NETWORK

A computer-implemented method includes training at least a generative adversarial network, the method operable on one or more processors. The method includes at least (1) applying pattern extraction to a set of training data to extract one or more feature embeddings representing one or more features of the training data, (2) attenuating the one or more feature embeddings to create one or more attenuated feature embeddings, (3) providing the one or more attenuated embeddings to a generator of the generative adversarial network as a condition to at least partly control the generator in generating synthetic data, the providing being performed automatically and dynamically during training of the generator, and (4) with the generator, generating synthetic data based at least in part on the attenuated embeddings.

Deep graph de-noise by differentiable ranking
11645540 · 2023-05-09

A method for employing a differentiable ranking based graph sparsification (DRGS) network to use supervision signals from downstream tasks to guide graph sparsification is presented. The method includes, in a training phase, generating node representations by neighborhood aggregation operators, generating sparsified subgraphs by top-k neighbor sampling from a learned neighborhood ranking distribution, feeding the sparsified subgraphs to a task, generating a prediction, and collecting a prediction error to update parameters in the generating and feeding steps to minimize an error, and, in a testing phase, generating node representations by neighborhood aggregation operators related to testing data, generating sparsified subgraphs by top-k neighbor sampling from a learned neighborhood ranking distribution related to the testing data, feeding the sparsified subgraphs related to the testing data to a task, and outputting prediction results to a visualization device.
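
The top-k neighbor selection step can be illustrated with a minimal sketch. It assumes the ranking scores are already available as plain numbers and uses a hard, deterministic top-k, so it is not the differentiable ranking or the sampling procedure of the patent itself. The names `sparsify_top_k`, `graph`, and `scores` are hypothetical.

```python
def sparsify_top_k(neighbors, scores, k):
    """Keep, for every node, the k neighbors with the highest ranking score.

    neighbors: dict mapping node -> list of neighbor nodes
    scores:    dict mapping (node, neighbor) -> ranking score
               (assumed here to come from some already trained ranker)
    """
    sparsified = {}
    for node, nbrs in neighbors.items():
        # Sort neighbors from highest to lowest score, then truncate to k.
        ranked = sorted(nbrs, key=lambda n: scores[(node, n)], reverse=True)
        sparsified[node] = ranked[:k]
    return sparsified


# Toy graph with made-up scores, for illustration only.
graph = {"a": ["b", "c", "d"], "b": ["a"], "c": ["a", "d"], "d": ["a", "c"]}
scores = {
    ("a", "b"): 0.9, ("a", "c"): 0.2, ("a", "d"): 0.5,
    ("b", "a"): 0.7,
    ("c", "a"): 0.1, ("c", "d"): 0.8,
    ("d", "a"): 0.3, ("d", "c"): 0.6,
}
subgraph = sparsify_top_k(graph, scores, k=2)
```

In this toy run, node `a` keeps only `b` and `d`, and the resulting `subgraph` is what would be fed to the downstream task.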

Hypercomplex deep learning methods, architectures, and apparatus for multimodal small, medium, and large-scale data representation, analysis, and applications

A method and system for creating hypercomplex representations of data includes, in one exemplary embodiment, at least one set of training data with associated labels or desired response values, transforming the data and labels into hypercomplex values, methods for defining hypercomplex graphs of functions, training algorithms to minimize the cost of an error function over the parameters in the graph, and methods for reading hierarchical data representations from the resulting graph. Another exemplary embodiment learns hierarchical representations from unlabeled data. The method and system, in another exemplary embodiment, may be employed for biometric identity verification by combining multimodal data collected using many sensors, including, for example, anatomical characteristics, behavioral characteristics, demographic indicators, and artificial characteristics. In other exemplary embodiments, the system and method may learn hypercomplex function approximations in one environment and transfer the learning to other target environments. Other exemplary applications of the hypercomplex deep learning framework include: image segmentation; image quality evaluation; image steganalysis; face recognition; event embedding in natural language processing; machine translation between languages; object recognition; medical applications such as breast cancer mass classification; multispectral imaging; audio processing; color image filtering; and clothing identification.

CAMERA LOCALIZATION

In various embodiments there is a method for camera localization within a scene. An image of a scene captured by the camera is input to a machine learning model, which has been trained for the particular scene to detect a plurality of 3D scene landmarks. The 3D scene landmarks are pre-specified in a pre-built map of the scene. The machine learning model outputs a plurality of predictions, each prediction comprising: either a 2D location in the image which is predicted to depict one of the 3D scene landmarks, or a 3D bearing vector, being a vector originating at the camera and pointing towards a predicted 3D location of one of the 3D scene landmarks. Using the predictions, an estimate of a position and orientation of the camera in the pre-built map of the scene is computed.
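
The two prediction types, a 2D image location and a 3D bearing vector, can be related as in the sketch below. It assumes an ideal pinhole camera with known intrinsics and no lens distortion, none of which is stated in the abstract. The function name `pixel_to_bearing` and the parameters `fx`, `fy`, `cx`, `cy` are hypothetical.

```python
import math


def pixel_to_bearing(u, v, fx, fy, cx, cy):
    """Convert a pixel (u, v) to a unit bearing vector in the camera frame.

    Assumes a pinhole model: fx, fy are focal lengths in pixels and
    (cx, cy) is the principal point. Lens distortion is ignored.
    """
    # Back-project the pixel onto the normalized image plane (z = 1).
    x = (u - cx) / fx
    y = (v - cy) / fy
    # Normalize so the vector has unit length.
    norm = math.sqrt(x * x + y * y + 1.0)
    return (x / norm, y / norm, 1.0 / norm)


# A pixel at the principal point looks straight down the optical axis.
center_ray = pixel_to_bearing(320, 240, 500.0, 500.0, 320.0, 240.0)
# A pixel one focal length to the right lies 45 degrees off the axis.
side_ray = pixel_to_bearing(820, 240, 500.0, 500.0, 320.0, 240.0)
```

Given several such bearings paired with the known 3D landmark positions from the pre-built map, a standard pose solver could then estimate the camera position and orientation. That step is omitted here.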

SYSTEMS AND METHODS FOR OPEN VOCABULARY OBJECT DETECTION
20230154213 · 2023-05-18

Embodiments described herein provide methods and systems for open vocabulary object detection of images. Given a pre-trained vision-language model and an image-caption pair, an activation map may be computed in the image that corresponds to an object of interest mentioned in the caption. The activation map is then converted into a pseudo bounding-box label for the corresponding object category. The open vocabulary detector is then directly supervised by these pseudo box-labels, which enables training object detectors with no human-provided bounding-box annotations.
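
One simple way to turn an activation map into a box is sketched below. It assumes thresholding at a fraction of the peak activation and taking the tightest box around the surviving cells. The abstract does not say how the conversion is done, so both the rule and the names `activation_to_box` and `rel_threshold` are illustrative assumptions.

```python
def activation_to_box(activation, rel_threshold=0.5):
    """Return (x_min, y_min, x_max, y_max) around strongly activated cells.

    activation:    2D list of non-negative activation values (rows of columns)
    rel_threshold: assumed cutoff, expressed as a fraction of the peak value
    """
    peak = max(max(row) for row in activation)
    cutoff = rel_threshold * peak
    # Collect the row and column indices of every cell above the cutoff.
    rows = [r for r, row in enumerate(activation) if any(v >= cutoff for v in row)]
    cols = [c for row in activation for c, v in enumerate(row) if v >= cutoff]
    return (min(cols), min(rows), max(cols), max(rows))


# Toy 4x4 activation map with a bright 2x2 region in the middle.
activation = [
    [0.0, 0.1, 0.0, 0.0],
    [0.0, 0.8, 0.9, 0.0],
    [0.0, 0.7, 1.0, 0.1],
    [0.0, 0.0, 0.2, 0.0],
]
pseudo_box = activation_to_box(activation)
```

The returned tuple, in map coordinates, would then serve as the pseudo bounding-box label for the object category named in the caption.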

SOURCE-FREE CROSS DOMAIN DETECTION METHOD WITH STRONG DATA AUGMENTATION AND SELF-TRAINED MEAN TEACHER MODELING
20230154167 · 2023-05-18

A method for implementing source-free domain adaptive detection is presented. The method includes, in a pretraining phase, applying strong data augmentation to labeled source images to produce perturbed labeled source images and training an object detection model by using the perturbed labeled source images to generate a source-only model. The method further includes, in an adaptation phase, training a self-trained mean teacher model by generating a weakly augmented image and multiple strongly augmented images from unlabeled target images, generating a plurality of region proposals from the weakly augmented image, selecting a region proposal from the plurality of region proposals as a pseudo ground truth, detecting, by the self-trained mean teacher model, object boxes and selecting pseudo ground truth boxes by employing a confidence constraint and a consistency constraint, and training a student model by using one of the multiple strongly augmented images jointly with an object detection loss.
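
The pseudo ground truth selection can be sketched as a filter over teacher detections. The abstract names a confidence constraint and a consistency constraint without defining them. This sketch assumes a score threshold for the former and an IoU overlap with reference boxes (for example, the selected region proposals) for the latter. The thresholds and the names `iou` and `select_pseudo_boxes` are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_pseudo_boxes(detections, reference_boxes, min_score=0.8, min_iou=0.5):
    """Keep detections that are both confident and consistent.

    detections:      list of (box, score) pairs from the teacher model
    reference_boxes: boxes to agree with (an assumed form of consistency)
    """
    kept = []
    for box, score in detections:
        confident = score >= min_score
        consistent = any(iou(box, ref) >= min_iou for ref in reference_boxes)
        if confident and consistent:
            kept.append(box)
    return kept


detections = [((0, 0, 10, 10), 0.95), ((20, 20, 30, 30), 0.90), ((0, 0, 9, 10), 0.40)]
reference_boxes = [(1, 1, 10, 10)]
pseudo_gt = select_pseudo_boxes(detections, reference_boxes)
```

Here only the first detection survives: the second has no overlapping reference box and the third falls below the score threshold.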

SELF-SUPERVISED LEARNING FOR ARTIFICIAL INTELLIGENCE-BASED SYSTEMS FOR MEDICAL IMAGING ANALYSIS

Systems and methods for training an artificial intelligence-based system using self-supervised learning are provided. For each respective training medical image of a set of unannotated training medical images, the following steps are performed. A first augmented image is generated by applying a first augmentation operation to the respective training medical image. A second augmented image is generated by applying a second augmentation operation to the respective training medical image. A first representation vector is created from the first augmented image using an encoder network. A second representation vector is created from the second augmented image using the encoder network. The first representation vector is mapped to first cluster codes. The second representation vector is mapped to second cluster codes. The encoder network is optimized using the first and second representation vectors and the first and second cluster codes.
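
The mapping from a representation vector to cluster codes can be sketched as a soft assignment over a set of cluster prototypes. This assumes the codes are a softmax over dot-product similarities, which the abstract does not specify. The names `cluster_codes`, `prototypes`, and `temperature` are hypothetical, and the encoder and its optimization are left out.

```python
import math


def cluster_codes(vector, prototypes, temperature=0.1):
    """Softly assign a representation vector to cluster prototypes.

    Returns one probability per prototype. The softmax-over-similarity
    form is an assumption made for illustration.
    """
    sims = [
        sum(v * p for v, p in zip(vector, proto)) / temperature
        for proto in prototypes
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(sims)
    exps = [math.exp(s - top) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]


# Three made-up prototypes and two views of the same (toy) image.
prototypes = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
first_codes = cluster_codes([0.9, 0.1], prototypes)
second_codes = cluster_codes([0.8, 0.3], prototypes)
```

In this toy case both views are assigned mainly to the first prototype. A training objective would typically encourage that kind of agreement between the two augmented views.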

ADAPTIVE ARTIFICIAL INTELLIGENCE FOR THREE-DIMENSIONAL OBJECT DETECTION USING SYNTHETIC TRAINING DATA

Embodiments described herein are directed to an adaptive AI model for 3D object detection using synthetic training data. For example, an ML model is trained to detect certain items of interest based on a training set that is synthetically generated in real time during the training process. The training set comprises a plurality of images depicting containers that are virtually packed with items of interest. Each image of the training set is a composite of an image comprising a container that is packed with items of non-interest and an image comprising an item of interest scanned in isolation. A plurality of such images is generated during any given training iteration of the ML model. Once trained, the ML model is configured to detect items of interest in actual containers and output a classification indicative of a likelihood that a container comprises an item of interest.
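
The on-the-fly compositing can be sketched with small grayscale grids. The sketch assumes the item scan has a known background value and is simply pasted over the container image at a random position. The actual blending used for scanned containers is not described in the abstract, and the names `composite` and `random_composite` are hypothetical.

```python
import random


def composite(container, item, top, left, background=0):
    """Paste the non-background pixels of `item` into a copy of `container`."""
    out = [row[:] for row in container]  # leave the original untouched
    for r, item_row in enumerate(item):
        for c, value in enumerate(item_row):
            if value != background:
                out[top + r][left + c] = value
    return out


def random_composite(container, item, rng):
    """Place the item at a random position that keeps it fully inside."""
    max_top = len(container) - len(item)
    max_left = len(container[0]) - len(item[0])
    top = rng.randint(0, max_top)
    left = rng.randint(0, max_left)
    return composite(container, item, top, left)


container = [[1] * 4 for _ in range(4)]   # container packed with non-interest items
item = [[0, 9], [9, 9]]                   # item of interest, 0 = background
fixed = composite(container, item, top=1, left=1)
sample = random_composite(container, item, random.Random(0))
```

Calling `random_composite` repeatedly inside the training loop would yield a fresh batch of synthetic images at each iteration.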

IMAGE LEARNING METHOD, APPARATUS, PROGRAM, AND RECORDING MEDIUM USING GENERATIVE ADVERSARIAL NETWORK
20230154165 · 2023-05-18

The present disclosure relates to an image learning method, apparatus, program, and recording medium using a generative adversarial network.

The present disclosure enables learning of various images, as well as medical radiographic images, while maintaining structural information, on the basis of a generative adversarial network. The present disclosure prevents the structural information of the generated image with respect to an original image from being lost, and improves image qualities, such as resolution, noise level, and contrast, to the level of a target reference dataset. When the present disclosure is used for image standardization, medical radiographic images captured by different institutions and any number of image datasets of varying quality can be standardized universally.

Document security enhancement

A method of providing, by a computing device, a user with access to sections of an electronic document is presented. The method includes receiving, by the computing device, a computerized image of a user accessing an electronic document. The computing device further accesses a facial recognition database and compares the computerized image to one or more entries in the facial recognition database to determine an identity of the user. The user is provided access to one or more sections of the electronic document based upon the identity of the user.