G06V10/7753

Random sampling consensus federated semi-supervised learning

A method and systems for random sampling consensus federated (RSCFed) learning in non-IID settings are provided. The method includes randomly sampling local clients, assigning the current global model to the randomly sampled local clients for initialization at the beginning of a synchronization round, conducting local training on the randomly sampled local clients, collecting local models from the randomly sampled local clients and executing distance-reweighted model aggregation (DMA) on the collected local models to obtain a sub-consensus model, repeating the above steps multiple times to obtain a set of sub-consensus models, and aggregating the sub-consensus models into a new model that serves as the next global model.
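
The round described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: models are plain lists of floats, `local_train` is a hypothetical callback standing in for client-side training, the exponential distance weighting with parameter `beta` is an assumed form of DMA, and the final aggregation is assumed to be a plain average of the sub-consensus models.

```python
import math
import random


def average(models):
    """Element-wise mean of equally sized parameter vectors."""
    n = len(models)
    return [sum(vals) / n for vals in zip(*models)]


def distance_reweighted_aggregate(models, beta=1.0):
    """Aggregate models, down-weighting those far from the plain average.

    Assumption: weights are proportional to exp(-beta * distance); the
    abstract does not state the weighting rule.
    """
    center = average(models)
    dists = [math.dist(m, center) for m in models]
    raw = [math.exp(-beta * d) for d in dists]
    total = sum(raw)
    weights = [r / total for r in raw]
    return [sum(w * v for w, v in zip(weights, vals)) for vals in zip(*models)]


def rscfed_round(global_model, clients, local_train,
                 num_subsets=3, subset_size=2, rng=random):
    """One synchronization round: sample, train locally, DMA, repeat, aggregate."""
    sub_consensus = []
    for _ in range(num_subsets):
        sampled = rng.sample(clients, subset_size)
        # Each sampled client starts from a copy of the current global model.
        local_models = [local_train(list(global_model), c) for c in sampled]
        sub_consensus.append(distance_reweighted_aggregate(local_models))
    # Assumption: the next global model is the plain mean of sub-consensus models.
    return average(sub_consensus)


def toy_local_train(model, client_target, lr=0.5):
    """Hypothetical stand-in for local training: step toward a client target."""
    return [p + lr * (client_target - p) for p in model]


clients = [0.0, 1.0, 2.0, 10.0]  # toy clients, each described by a target value
new_global = rscfed_round([0.0, 0.0], clients, toy_local_train,
                          rng=random.Random(0))
```

In this toy setup a client whose model lies far from the subset average contributes less to its sub-consensus model than it would under plain averaging.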

OPTICAL AND OTHER SENSORY PROCESSING OF COMPLEX OBJECTS
20250225633 · 2025-07-10

Systems and methods for optical and other sensory analysis of nutritional and other complex objects are disclosed. For example, techniques may include capturing an RGB-D image of a food using an integrated camera; inputting the RGB-D image into an instance detection network configured to detect food items; segmenting a plurality of food items from the RGB-D image into a plurality of masks, the plurality of masks representing individual food items; classifying a particular food item among the individual food items using a multimodal large language model; estimating a volume of the particular food item by overlaying an RGB image associated with the RGB-D image with a depth-map to create a point cloud; and estimating the calories of the particular food item using the estimated volume and a nutritional database.

METHOD FOR ESTABLISHING 3D MEDICAL IMAGE SEGMENTATION MODEL BASED ON MASKED MODELING AND APPLICATION THEREOF

Disclosed are a method for establishing a 3D medical image segmentation model based on masked modeling and an application thereof. The method includes: establishing a semi-supervised learning network, wherein a student network includes an encoding module for extracting latent features and a segmentation decoder that predicts segmentation results, and a teacher network includes an encoding module and a segmentation decoder that are structurally consistent with those of the student network; training the semi-supervised learning network, wherein during training, two random masking operations are performed on each image, and the image is input to the two networks respectively; and optimizing and updating the weights of the student network and transferring the updated weights to the teacher network, wherein the training loss function includes a prototype representation loss, which characterizes the difference between the prototypes extracted and generated by the two networks. The student network may further include a reconstruction decoder and an auxiliary segmentation decoder.
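
The weight transfer and the prototype comparison can be illustrated with a small sketch. The abstract does not say how weights are transferred or how prototypes are computed, so everything here is an assumption: `ema_update` uses an exponential moving average (a common choice for teacher networks), `class_prototype` takes the mean feature vector of the elements labeled with a class, and `prototype_loss` is a mean squared difference. All three function names are hypothetical.

```python
def ema_update(teacher, student, decay=0.99):
    """Move each teacher weight toward the student weight.

    Assumption: exponential moving average transfer; the abstract only
    says the updated student weights are transferred to the teacher.
    """
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]


def class_prototype(features, labels, cls):
    """Mean feature vector over elements labeled `cls` (assumed definition).

    Returns None when no element carries the label.
    """
    picked = [f for f, y in zip(features, labels) if y == cls]
    if not picked:
        return None
    return [sum(vals) / len(picked) for vals in zip(*picked)]


def prototype_loss(student_proto, teacher_proto):
    """Mean squared difference between a student and a teacher prototype."""
    diffs = [(a - b) ** 2 for a, b in zip(student_proto, teacher_proto)]
    return sum(diffs) / len(diffs)


# Toy features for three voxels, two of which belong to class 1.
features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
labels = [1, 0, 1]
proto = class_prototype(features, labels, cls=1)
```

A loss of this kind would be small when both networks produce similar class prototypes from their differently masked inputs.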

LEARNING SYSTEMS AND METHODS

A sequence of images depicting an object is captured, e.g., by a camera at a point-of-sale terminal in a retail store. The object is identified, such as by a barcode or watermark that is detected from one or more of the images. Once the object's identity is known, such information is used in training a classifier (e.g., a machine learning system) to recognize the object from others of the captured images, including images that may be degraded by blur, inferior lighting, etc. In another arrangement, such degraded images are processed to identify feature points useful in fingerprint-based identification of the object. Feature points extracted from such degraded imagery aid in fingerprint-based recognition of objects under real life circumstances, as contrasted with feature points extracted from pristine imagery (e.g., digital files containing label artwork for such objects). A great variety of other features and arrangements, some involving designing classifiers so as to combat classifier copying, are also detailed.

RELEVANCE FACTOR VARIATION AUTOENCODER ARCHITECTURE FOR ANALYZING COGNITIVE DRAWING TESTS

A method for performing predictive operations, the method comprising receiving a classification dataset comprising clock drawing images and generating, using a classifier, one or more classification outputs, the one or more classification outputs comprising one or more identifications of dementia or non-dementia for respective ones of the clock drawing images. The classifier comprises one or more weights based on a latent space associated with a relevance factor variational autoencoder (RF-VAE). The RF-VAE comprises an encoder configured to generate the latent space. The RF-VAE comprises a decoder configured to generate reconstructions of the second one or more clock drawings based on the latent space. The latent space comprises one or more latent dimensions representative of one or more unique aspects of variation associated with the second one or more clock drawings. The one or more latent dimensions comprise minimal total correlation between the one or more latent dimensions and two dimensions.

METHOD AND ELECTRONIC DEVICE FOR TRAINING A MACHINE LEARNING MODEL

A computer-implemented method for training a machine learning, ML, model to perform object detection, the method comprising: obtaining a first training dataset comprising a plurality of unlabelled images, each unlabelled image containing at least one object; analysing the first training dataset by using an object detector module of the ML model; forming a second training dataset using the unlabelled images of the first training dataset and their corresponding extracted bounding boxes and pseudo-labels; and training the object detector module, using the second training dataset, to output bounding boxes and pseudo-labels for input pseudo-labelled images.

Systems, Methods, and Apparatuses for Anatomically Consistent Embeddings in Composition and Decomposition

A method performed by a system having at least a processor and a memory therein to execute instructions for a self-supervised learning framework to learn anatomically consistent embeddings of anatomical structures in medical images of a plurality of patients across varying scales of anatomical structures in the plurality of patients receives a medical image as an input, and obtains a first cropped image and a second overlapping cropped image of the received medical image, each having respective representative global embeddings and corresponding overlapped patch embeddings and non-overlapped patch embeddings. The system calculates a global consistency loss using the respective representative global embeddings and calculates a local consistency loss using the corresponding overlapped patch embeddings and non-overlapped patch embeddings.

System and method for adaptive resource-efficient mitigation of catastrophic forgetting in continuous deep learning

Aspects of the present disclosure provide systems, methods, and computer-readable storage media that support adaptive machine learning (ML) classification that mitigates effects of catastrophic forgetting while reducing overall resource requirements. To illustrate, a computing device may train first and second ML classifiers based on historical streamed data. The second ML classifier is trained to use continuous learning, and the first ML classifier is not. If data drift of a data stream is below a lower threshold, the data stream is provided as input to the first ML classifier to generate classification output (e.g., predictions). If the data drift is above the lower threshold, dynamic switching occurs and the data stream is provided as input to the second ML classifier instead of the first ML classifier to generate the classifier output. If the data drift is above an upper threshold, operations are performed to train new ML classifiers.
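
The threshold logic above can be sketched as a small routing function. This is an illustration under stated assumptions: `route_by_drift` and its return labels are hypothetical names, the drift score is taken as a precomputed number because the abstract does not name a drift metric, and a score exactly equal to a threshold is assumed to stay on the lower branch.

```python
def route_by_drift(drift, lower, upper):
    """Pick the action for a data stream given its drift score.

    Returns one of:
      "static"    - use the first classifier (no continuous learning)
      "continual" - switch to the second, continuously learning classifier
      "retrain"   - drift is severe enough to train new classifiers
    """
    if lower > upper:
        raise ValueError("lower threshold must not exceed upper threshold")
    if drift > upper:
        return "retrain"
    if drift > lower:
        return "continual"
    return "static"


# Toy drift scores checked against thresholds 0.1 (lower) and 0.5 (upper).
decisions = [route_by_drift(d, lower=0.1, upper=0.5) for d in (0.05, 0.3, 0.9)]
```

Keeping the cheaper first classifier in use while drift stays low is what reduces resource use in this arrangement; the continuously learning classifier is consulted only when the stream has moved away from the historical data.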

Learning systems and methods

A sequence of images depicting an object is captured, e.g., by a camera at a point-of-sale terminal in a retail store. The object is identified, such as by a barcode or watermark that is detected from one or more of the images. Once the object's identity is known, such information is used in training a classifier (e.g., a machine learning system) to recognize the object from others of the captured images, including images that may be degraded by blur, inferior lighting, etc. In another arrangement, such degraded images are processed to identify feature points useful in fingerprint-based identification of the object. Feature points extracted from such degraded imagery aid in fingerprint-based recognition of objects under real life circumstances, as contrasted with feature points extracted from pristine imagery (e.g., digital files containing label artwork for such objects). A great variety of other features and arrangements, some involving designing classifiers so as to combat classifier copying, are also detailed.

DOMAIN ADAPTATION THROUGH MODEL PRUNING

In one implementation, a device receives, via a user interface, a selection of a labeled training dataset and a selection of an unlabeled training dataset, wherein the unlabeled training dataset is captured from a target domain. The device forms a domain-adapted training dataset by pruning the labeled training dataset based on the unlabeled training dataset. The device trains a machine learning model using the domain-adapted training dataset. The device prunes the machine learning model to form a domain-adapted model for the target domain.