G06V10/7747

Automated building of expanded datasets for training of autonomous agents
11455494 · 2022-09-27

Improved systems and methods for generating training data for classification models are disclosed. In an example, a training application accesses two fragments of text. The application represents each fragment of text as a parse thicket. The parse thickets jointly represent syntactic and discourse information. From the parse thickets, the application generalizes the text by identifying common entities or common rhetorical relations between parse thickets. The generalized text is added to a training data set, thereby increasing the coverage of the training set.
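The generalization step described above can be sketched as a set intersection over pre-extracted entities and rhetorical relations. This is a minimal illustration, not the patent's parse-thicket machinery: the fragment dictionaries, the `generalize` helper, and the example entities are all invented for the sketch.

```python
# Minimal sketch: generalize two text fragments by intersecting their
# pre-extracted entities and rhetorical relations, then add the result
# to a training set. All names here are illustrative, not the patent's API.

def generalize(fragment_a, fragment_b):
    """Return the common entities and rhetorical relations of two fragments."""
    common_entities = fragment_a["entities"] & fragment_b["entities"]
    common_relations = fragment_a["relations"] & fragment_b["relations"]
    return {"entities": common_entities, "relations": common_relations}

training_set = []
frag1 = {"entities": {"camera", "sensor"},
         "relations": {("Elaboration", "camera", "sensor")}}
frag2 = {"entities": {"camera", "lidar"},
         "relations": {("Elaboration", "camera", "sensor")}}

# The generalized (more abstract) example increases training-set coverage.
training_set.append(generalize(frag1, frag2))
```

The generalized example keeps only what the two fragments share, which is why adding it broadens rather than duplicates the coverage of the training set.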

Method of detecting object in image and image processing device

At least one example embodiment discloses a method of detecting an object in an image. The method includes receiving an image, generating first images for performing a first classification operation based on the received image, reviewing first-image features of the first images using a first feature extraction method with first-type features, first classifying at least some of the first images as second images, the classified first images having first-image features matching the first-type features, reviewing second-image features of the second images using a second feature extraction method with second-type features, second classifying at least some of the second images as third images, the classified second images having second-image features matching the second-type features and detecting an object in the received image based on results of the first and second classifying.
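The two-stage flow above amounts to a classifier cascade: a cheap first stage prunes candidate windows, and a costlier second stage confirms the survivors. The sketch below illustrates only that control flow; the feature extractors and thresholds are placeholders, not the patent's feature-extraction methods.

```python
# Illustrative two-stage cascade: first classification filters candidate
# windows with cheap features; second classification re-checks survivors
# with a different feature; what remains counts as detections.

def first_features(window):       # e.g. a coarse intensity statistic
    return sum(window) / len(window)

def second_features(window):      # e.g. a finer contrast statistic
    return max(window) - min(window)

def cascade_detect(windows, t1=0.5, t2=0.3):
    second = [w for w in windows if first_features(w) > t1]   # first classifying
    third = [w for w in second if second_features(w) > t2]    # second classifying
    return third                                              # detected objects

windows = [[0.9, 0.8, 0.7], [0.1, 0.2, 0.1], [0.9, 0.2, 0.8]]
detections = cascade_detect(windows)
```

Because most windows are rejected by the cheap first stage, the expensive second-stage features are computed only on a small remainder, which is the usual motivation for cascades.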

Systems and methods of generating photorealistic garment transference in images

Systems and methods are provided for determining a first semantic segmentation image of a first image, wherein the first image includes at least a portion of a person wearing a first fashion item. A plurality of keypoints of the person's body may be determined in the first image. Using the determined first semantic segmentation image, the determined keypoints, and a second image that includes a second fashion item, a second semantic segmentation image of the person in the first image with the second fashion item of the second image may be generated. The first image may be masked to occlude pixels of the first fashion item that is to be replaced with the second fashion item. Using the masked first image, the second semantic segmentation image, and the second image that includes the second fashion item, a third image may be generated that includes the person with the second fashion item.
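The masking step in this pipeline can be shown with a toy example: occlude every pixel whose semantic-segmentation label matches the garment being replaced. The class id, image shape, and helper name below are invented for illustration; a real system would operate on per-pixel network outputs.

```python
# Sketch of the masking step: set to `fill` every pixel whose
# segmentation label equals the (hypothetical) garment class id.

GARMENT_LABEL = 2   # illustrative class id for the first fashion item

def mask_garment(image, segmentation, label=GARMENT_LABEL, fill=0):
    """Return a copy of `image` with garment pixels occluded."""
    return [
        [fill if seg == label else px for px, seg in zip(img_row, seg_row)]
        for img_row, seg_row in zip(image, segmentation)
    ]

image = [[10, 20], [30, 40]]
segmentation = [[2, 0], [0, 2]]
masked = mask_garment(image, segmentation)
```

The masked image then serves as one of the inputs from which the final image with the second fashion item is generated.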

NEURAL NETWORK MODEL TRAINED USING GENERATED SYNTHETIC IMAGES

Training deep neural networks requires a large amount of labeled training data. Conventionally, labeled training data is generated by gathering real images that are manually labeled, which is very time-consuming. Instead of manually labeling a training dataset, a domain randomization technique is used to generate training data that is automatically labeled. The generated training data may be used to train neural networks for object detection and segmentation (labeling) tasks. In an embodiment, the generated training data includes synthetic input images generated by rendering three-dimensional (3D) objects of interest in a 3D scene. In an embodiment, the generated training data includes synthetic input images generated by rendering 3D objects of interest on a 2D background image. The 3D objects of interest are objects that a neural network is trained to detect and/or label.
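The reason the generated data is "automatically labeled" is that the generator places the object itself, so the label is known by construction. The toy generator below pastes a 2D patch onto a blank grid and emits the bounding box as the label; real systems render 3D objects with randomized pose, lighting, and backgrounds, which this sketch does not attempt.

```python
import random

# Toy domain-randomization generator: place an "object of interest" patch
# at a random location on a background grid; the bounding-box label comes
# for free because the generator chose the location itself.

def synthesize(bg_h, bg_w, obj_h, obj_w, rng):
    image = [[0] * bg_w for _ in range(bg_h)]      # blank background
    y = rng.randrange(bg_h - obj_h + 1)
    x = rng.randrange(bg_w - obj_w + 1)
    for dy in range(obj_h):
        for dx in range(obj_w):
            image[y + dy][x + dx] = 1              # rendered object pixels
    label = (x, y, obj_w, obj_h)                   # automatic bbox label
    return image, label

rng = random.Random(0)
image, label = synthesize(8, 8, 2, 3, rng)
```

Each call yields a new randomized placement, so an arbitrarily large labeled dataset can be produced without any manual annotation.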

TECHNOLOGIES FOR DISTRIBUTING GRADIENT DESCENT COMPUTATION IN HETEROGENEOUS MULTI-ACCESS EDGE COMPUTING (MEC) NETWORKS

Systems, apparatuses, methods, and computer-readable media, are provided for distributed machine learning (ML) training using heterogeneous compute nodes in a heterogeneous computing environment, where the heterogeneous compute nodes are connected to a master node via respective wireless links. ML computations are performed by individual heterogeneous compute nodes on respective training datasets, and a master combines the outputs of the ML computations obtained from individual heterogeneous compute nodes. The ML computations are balanced across the heterogeneous compute nodes based on knowledge of network conditions and operational constraints experienced by the heterogeneous compute nodes. Other embodiments may be described and/or claimed.
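The balancing idea can be sketched as proportional batch sizing: the master sizes each node's share of the work according to its observed throughput, then combines the returned gradients weighted by how much data each node processed. The throughput numbers, gradient vectors, and helper names below are illustrative, not the patent's protocol.

```python
# Sketch: master-side load balancing and gradient combination for
# heterogeneous compute nodes (values are illustrative).

def balance_batches(total, throughputs):
    """Split `total` samples across nodes in proportion to throughput."""
    cap = sum(throughputs)
    sizes = [total * t // cap for t in throughputs]
    sizes[0] += total - sum(sizes)          # give rounding remainder to node 0
    return sizes

def combine(gradients, sizes):
    """Master combines per-node gradients as a weighted average."""
    total = sum(sizes)
    dim = len(gradients[0])
    return [sum(g[i] * n for g, n in zip(gradients, sizes)) / total
            for i in range(dim)]

sizes = balance_batches(100, [3, 1])        # fast node gets 3x the work
grads = combine([[2.0, 4.0], [6.0, 8.0]], sizes)
```

Weighting by batch size keeps the combined gradient equal to the gradient that a single node would have computed over the full batch, while letting slow or poorly connected nodes do proportionally less work.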

METHOD AND SYSTEM FOR TRAINING A MODEL FOR IMAGE GENERATION

A method and system for training a model for image generation. The model includes a hybrid variational auto-encoder (VAE)—generative adversarial network (GAN) framework. The method includes the steps of: inputting an input image multiple times into the VAE, which outputs in response multiple distinct output image samples; determining the best of the multiple output image samples as a best-of-many sample, the best-of-many sample having the minimum reconstruction cost; and training the model based on a predefined training objective, the predefined training objective integrating the best-of-many sample reconstruction cost and a GAN-based synthetic likelihood term.
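The objective described above can be sketched in a few lines: draw K samples, keep the one with minimum reconstruction cost, and add a GAN-derived term. The squared-error cost, the scalar `gan_score`, and the weight `lam` are placeholders chosen for the sketch, not the patent's exact loss.

```python
# Sketch of a best-of-many training objective: only the best of K decoder
# samples contributes its reconstruction cost, plus a GAN-based term.

def reconstruction_cost(sample, target):
    """Illustrative squared-error reconstruction cost."""
    return sum((s - t) ** 2 for s, t in zip(sample, target))

def best_of_many_loss(samples, target, gan_score, lam=0.1):
    best = min(reconstruction_cost(s, target) for s in samples)
    return best + lam * gan_score      # GAN-based synthetic-likelihood term

target = [1.0, 0.0]
samples = [[0.0, 0.0], [1.0, 0.5], [0.9, 0.1]]   # K = 3 VAE output samples
loss = best_of_many_loss(samples, target, gan_score=0.5)
```

Penalizing only the best sample leaves the decoder free to produce diverse outputs for the same input, since the other K-1 samples are not pulled toward the target.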

METHOD AND APPARATUS FOR ADJUSTING CABIN ENVIRONMENT
20220237943 · 2022-07-28

A cabin interior environment adjustment method and apparatus are provided. Said method comprises: acquiring a face image of a person in a cabin; determining attribute information and state information of the person in the cabin on the basis of the face image; and adjusting a cabin interior environment on the basis of the attribute information and the state information of the person in the cabin.

VEHICLE ENVIRONMENT MODELING WITH CAMERAS

Various systems and methods for modeling a scene. A device for modeling a scene includes a hardware interface to obtain a time-ordered sequence of images representative of a scene, the time-ordered sequence including a plurality of images, one of the sequence of images being a current image, the scene captured by a monocular imaging system; and processing circuitry to: provide a data set to an artificial neural network (ANN) to produce a three-dimensional structure of the scene, the data set including: a portion of the sequence of images, the portion of the sequence of images including the current image; and motion of a sensor that captured the sequence of images; and model the scene using the three-dimensional structure of the scene, wherein the three-dimensional structure is determined for both moving and fixed objects in the scene.

DETERMINATION DEVICE, DETERMINATION METHOD, AND PROGRAM

A determination device includes an image information acquirer configured to acquire image information of a subject image obtained by photographing an internal space of a toilet bowl in excretion; an estimator configured to perform estimation regarding a determination matter relating to excretion by inputting the image information to a learned model, the learned model having learned a correspondence relationship between an image for learning and a determination result of the determination matter relating to excretion, the learned model learned by machine learning using a neural network, the image for learning representing an internal space of a toilet bowl in excretion; and a determiner configured to perform determination regarding the determination matter of the subject image based on an estimation result obtained by the estimator.

CONVERSION DEVICE, CONVERSION LEARNING DEVICE, CONVERSION METHOD, CONVERSION LEARNING METHOD, CONVERSION PROGRAM, AND CONVERSION LEARNING PROGRAM

A conversion apparatus includes: an input unit which receives an image for conversion; a mask generation unit which uses the image as an input to an identifier trained in advance and stored in a storage unit and generates a target attribute mask representing an attribute desired to be assigned to each position of a converted image of the image and an attribute degree of the converted image according to an output from the identifier; and an image conversion unit which uses the image and the target attribute mask as inputs to a converter trained in advance and stored in the storage unit and generates a converted image according to an output from the converter, and the identifier and the converter are trained under various restrictions including restrictions with respect to attributes.