Patent classifications
G06V10/77
METHOD FOR RECOGNIZING TEXT, ELECTRONIC DEVICE AND STORAGE MEDIUM
A method for recognizing a text, an electronic device and a storage medium are provided. An implementation of the method comprises: obtaining a multi-dimensional first feature map of a to-be-recognized image; performing, based on feature values in the first feature map, feature enhancement processing on each feature value in the first feature map; and performing text recognition on the to-be-recognized image based on the first feature map after the enhancement processing.
NEURAL NETWORK TRAINING DEVICE, SYSTEM AND METHOD
A device includes image generation circuitry and convolutional-neural-network circuitry. The image generation circuitry, in operation, generates a digital image representation of a wafer defect map (WDM). The convolutional-neural-network circuitry, in operation, generates a defect classification associated with the WDM based on: the digital image representation of the WDM and a data-driven model associating WDM images with classes of a defined set of classes of wafer defects and generated using a training data set augmented based on defect pattern orientation types associated with training images.
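One way to picture orientation-aware augmentation of the training set is the sketch below. It assumes, hypothetically, that each training image carries an orientation type of either "invariant" (the class label survives any rotation or mirror) or "fixed" (the label depends on orientation, so no geometric variants are generated). The type names, the `augment` function, and the grid encoding are illustrative and are not taken from the abstract.

```python
def rotate90(grid):
    """Rotate a 2D grid (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def mirror(grid):
    """Mirror a 2D grid left to right."""
    return [row[::-1] for row in grid]


def augment(grid, orientation_type):
    """Return the distinct geometric variants of a wafer defect map.

    orientation_type is a hypothetical label:
      "invariant": all 4 rotations and their mirrors are valid samples
      "fixed":     only the original map is kept
    """
    if orientation_type == "fixed":
        return [[list(row) for row in grid]]
    if orientation_type != "invariant":
        raise ValueError(f"unknown orientation type: {orientation_type!r}")

    variants = []
    current = [list(row) for row in grid]
    for _ in range(4):
        for candidate in (current, mirror(current)):
            if candidate not in variants:  # drop duplicates from symmetric maps
                variants.append(candidate)
        current = rotate90(current)
    return variants


wdm = [[1, 0], [0, 0]]  # toy map: 1 marks a defective die
print(len(augment(wdm, "invariant")), "variants generated")
```

Deduplicating the variants keeps symmetric patterns from being over-represented in the augmented set.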
ELECTRONIC DEVICE AND OPERATION METHOD THEREOF
According to an embodiment of the disclosure, an electronic device may include a display, a memory, and a processor operatively connected to the display and the memory. The memory may store instructions that, when executed, cause the processor to: obtain a first image of a first shape; obtain linear information indicating a morphological characteristic of an object in the first image; determine, based on the obtained linear information, a conversion method for converting the first image of the first shape into an image of a second shape; convert the first image into a second image of the second shape based on the determined conversion method; and control the display to display the converted second image.
AUTOMATED AND ASSISTED IDENTIFICATION OF STROKE USING FEATURE-BASED BRAIN IMAGING
Provided herein are systems and methods for automated identification of volumes of interest in volumetric brain images using artificial intelligence (AI) enhanced imaging to diagnose and treat acute stroke. The methods can include receiving image data of a brain having header data and voxel values that represent an interruption in blood supply of the brain when imaged, extracting the header data from the image data, populating an array of cells with the voxel values, applying a segmenting analysis to the array to generate a segmented array, applying a morphological neighborhood analysis to the segmented array to generate a features relationship array, where the features relationship array includes features of interest in the brain indicative of stroke, identifying three-dimensional (3D) connected volumes of interest in the features relationship array, and generating output, for display at a user device, indicating the identified 3D volumes of interest.
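The segmenting and connected-volume steps can be sketched as follows. This is a minimal illustration, assuming a plain intensity threshold for segmentation and 6-connectivity (shared faces) for grouping voxels. The abstract specifies neither choice, and the morphological neighborhood analysis it describes is omitted here. The function names and the toy data are hypothetical.

```python
from collections import deque

# 6-connectivity: two voxels are neighbors when they share a face.
NEIGHBORS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def segment(volume, threshold):
    """Binary segmentation: 1 where the voxel value reaches the threshold."""
    return [[[1 if v >= threshold else 0 for v in row] for row in plane]
            for plane in volume]


def connected_volumes(mask):
    """Group foreground voxels into 3D connected components via BFS."""
    depth, height, width = len(mask), len(mask[0]), len(mask[0][0])
    seen = set()
    volumes = []
    for z in range(depth):
        for y in range(height):
            for x in range(width):
                if not mask[z][y][x] or (z, y, x) in seen:
                    continue
                seen.add((z, y, x))
                queue = deque([(z, y, x)])
                component = []
                while queue:
                    cz, cy, cx = queue.popleft()
                    component.append((cz, cy, cx))
                    for dz, dy, dx in NEIGHBORS:
                        nz, ny, nx = cz + dz, cy + dy, cx + dx
                        inside = (0 <= nz < depth and 0 <= ny < height
                                  and 0 <= nx < width)
                        if inside and mask[nz][ny][nx] and (nz, ny, nx) not in seen:
                            seen.add((nz, ny, nx))
                            queue.append((nz, ny, nx))
                volumes.append(component)
    return volumes


# Toy 2x2x2 volume indexed as voxels[z][y][x].
voxels = [
    [[5, 0], [0, 0]],
    [[6, 0], [0, 7]],
]
volumes = connected_volumes(segment(voxels, threshold=4))
print([len(v) for v in volumes])
```

Each returned component is a list of (z, y, x) coordinates, which a later step could convert into a volume measurement or a display overlay.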
LEARNING DEVICE, TRAINED MODEL GENERATION METHOD, AND RECORDING MEDIUM
In a learning device, a feature extraction means extracts image features from an input image. A class discrimination means discriminates a class of the input image based on the image features, and generates a class discriminative result. A class discriminative loss calculation means calculates a class discriminative loss based on the class discriminative result. A normal/abnormal discrimination means discriminates whether the class is a normal class or an abnormal class, based on the image features, and generates a normal/abnormal discriminative result. An AUC loss calculation means calculates an AUC loss based on the normal/abnormal discriminative result. A first learning means updates parameters of the feature extraction means, the class discrimination means, and the normal/abnormal discrimination means, based on the class discriminative loss and the AUC loss.
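To make the AUC loss concrete, the sketch below pairs every abnormal score with every normal score and penalizes pairs that are ranked the wrong way. The pairwise logistic surrogate, the function names, and the `weight` parameter are assumptions for illustration. The abstract does not state which surrogate is used or how the two losses are combined.

```python
import math


def auc(abnormal_scores, normal_scores):
    """Empirical AUC: share of (abnormal, normal) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    total = 0.0
    for a in abnormal_scores:
        for n in normal_scores:
            if a > n:
                total += 1.0
            elif a == n:
                total += 0.5
    return total / (len(abnormal_scores) * len(normal_scores))


def auc_loss(abnormal_scores, normal_scores):
    """Smooth stand-in for 1 - AUC: mean logistic loss over all pairs.

    The loss shrinks as abnormal scores rise above normal scores.
    """
    losses = [math.log1p(math.exp(-(a - n)))
              for a in abnormal_scores for n in normal_scores]
    return sum(losses) / len(losses)


def total_loss(class_loss, abnormal_scores, normal_scores, weight=1.0):
    """Combine the class discriminative loss with the weighted AUC loss."""
    return class_loss + weight * auc_loss(abnormal_scores, normal_scores)


abnormal = [0.9, 0.7]
normal = [0.2, 0.8]
print(auc(abnormal, normal), auc_loss(abnormal, normal))
```

A differentiable surrogate is needed because the empirical AUC is a step function of the scores and gives no useful gradient for parameter updates.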
TRAINING APPARATUS, CONTROL METHOD, AND NON-TRANSITORY COMPUTER-READABLE STORAGE MEDIUM
The training apparatus (2000) performs a first phase training and a second phase training of a discriminator (10). The discriminator (10) acquires a ground-view image and an aerial-view image, and determines whether the acquired ground-view image matches the acquired aerial-view image. The first phase training is performed using a ground-view image and a first level negative example of aerial-view image. The first level negative example of aerial-view image includes scenery of a different type from scenery in the ground-view image. The second phase training is performed using the ground-view image and a second level negative example of aerial-view image. The second level negative example of aerial-view image includes scenery of a same type as scenery in the ground-view image.
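The two training phases differ only in which negative examples are drawn, which can be sketched as a selection rule. The sketch assumes, hypothetically, that every image is a dictionary with `location` and `scene_type` fields. These fields and the `negatives_for_phase` function are illustrative and do not come from the abstract.

```python
def negatives_for_phase(ground, aerial_pool, phase):
    """Pick aerial-view negatives for one training phase.

    phase 1: aerial images of a different scenery type (first level)
    phase 2: aerial images of the same scenery type taken at another
             location (second level, harder to tell apart)
    """
    if phase not in (1, 2):
        raise ValueError("phase must be 1 or 2")
    negatives = []
    for aerial in aerial_pool:
        if aerial["location"] == ground["location"]:
            continue  # the matching image is a positive, never a negative
        same_type = aerial["scene_type"] == ground["scene_type"]
        if (phase == 1 and not same_type) or (phase == 2 and same_type):
            negatives.append(aerial)
    return negatives


ground = {"location": "A", "scene_type": "residential"}
aerial_pool = [
    {"location": "A", "scene_type": "residential"},  # positive match
    {"location": "B", "scene_type": "forest"},       # first-level negative
    {"location": "C", "scene_type": "residential"},  # second-level negative
]
print(negatives_for_phase(ground, aerial_pool, phase=1))
print(negatives_for_phase(ground, aerial_pool, phase=2))
```

Ordering the phases this way presents easily distinguished negatives first and visually similar negatives second.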
LANE LINE DETECTION METHOD, RELATED DEVICE, AND COMPUTER-READABLE STORAGE MEDIUM
A method comprises: first, obtaining a to-be-recognized lane line image; then determining, based on the lane line image, a candidate pixel used to recognize a lane line region, to obtain a candidate point set, where the lane line region is a region of a location of a lane line in the lane line image and a surrounding region of the location of the lane line; then selecting a target pixel from the candidate point set, and obtaining at least three location points associated with the target pixel in a neighborhood, where the at least three location points are on one lane line; and finally, performing extension by using the target pixel as a start point and based on the at least three location points associated with the target pixel, to obtain a lane line point set corresponding to the target pixel.
SYSTEM AND METHOD OF COUNTING LIVESTOCK
A system is configured to: receive video and/or images from an image capture device over a livestock path; generate feature maps from an image of the video by applying at least a first convolutional neural network; slide a window across the feature maps to obtain a plurality of anchor shapes; determine if each anchor shape contains an object to generate a plurality of regions of interest, each of the plurality of regions of interest being a non-rectangular, polygonal shape; extract feature maps from each region of interest; classify objects in each region of interest; in parallel with classification, predict segmentation masks on at least a subset of the regions of interest in a pixel-to-pixel manner; identify individual animals within the objects based on classifications and the segmentation masks; count individual animals based on identification; and provide the count to a digital device for display, processing, and/or reporting.
METHOD FOR TRAINING MULTI-MODAL DATA MATCHING DEGREE CALCULATION MODEL, METHOD FOR CALCULATING MULTI-MODAL DATA MATCHING DEGREE, AND RELATED APPARATUSES
The present disclosure provides a method and apparatus for training a multi-modal data matching degree calculation model, a method and apparatus for calculating a multi-modal data matching degree, an electronic device, a computer readable storage medium and a computer program product, and relates to the field of artificial intelligence technology such as deep learning, image processing and computer vision. The method comprises: acquiring first sample data and second sample data that are different in modalities; constructing a contrastive learning loss function comprising a semantic perplexity parameter, the semantic perplexity parameter being determined based on a semantic feature distance between the first sample data and the second sample data; and training, by using the contrastive learning loss function, an initial multi-modal data matching degree calculation model through a contrastive learning approach, to obtain a target multi-modal data matching degree calculation model.
MODEL GENERATING APPARATUS AND METHOD
A model generating apparatus and method are provided. The apparatus receives a plurality of sample images. The apparatus generates a plurality of adversarial samples corresponding to the sample images. The apparatus inputs the sample images and the adversarial samples respectively to a first encoder and a second encoder in a self-supervised neural network to generate a plurality of first feature extractions and a plurality of second feature extractions. The apparatus calculates a similarity of each of the first feature extractions and the second feature extractions to train the self-supervised neural network. The apparatus generates a task model based on the first encoder and a plurality of labeled data.
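The similarity step can be illustrated with a short sketch. It assumes cosine similarity as the measure and pairs each sample's feature vector from the first encoder with the feature vector of its adversarial counterpart from the second encoder. The abstract names neither the measure nor the pairing, so both are assumptions, as are the function names and the toy feature values.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_loss(first_features, second_features):
    """Mean of (1 - cosine similarity) over paired feature vectors.

    The loss reaches 0 when every clean feature points in the same
    direction as the feature of its adversarial counterpart.
    """
    pairs = list(zip(first_features, second_features))
    return sum(1.0 - cosine_similarity(u, v) for u, v in pairs) / len(pairs)


# Hypothetical encoder outputs for two samples and their adversarial versions.
clean = [[1.0, 0.0], [0.0, 2.0]]
adversarial = [[0.9, 0.1], [0.2, 1.8]]
print(similarity_loss(clean, adversarial))
```

Minimizing such a loss pushes the two encoders toward features that stay stable under the adversarial perturbation.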