G06V10/424

Method, apparatus, and storage medium for recognizing image object

The present disclosure describes methods, devices, and storage media for recognizing a target object in a target image. The method includes obtaining, by a device, an image recognition instruction carrying object identification information that indicates a target object in a target image. The device includes a memory storing instructions and a processor in communication with the memory. The method further includes obtaining, by the device, an instruction feature vector matching the image recognition instruction; obtaining, by the device, an image feature vector set matching the target image, the image feature vector set comprising an i-th image feature vector indicating an image feature of the target image at an i-th scale, i being a positive integer; and recognizing, by the device, the target object from the target image according to the instruction feature vector and the image feature vector set.

Lithographic hotspot detection using multiple machine learning kernels

A hotspot detection system classifies a set of hotspot training data into a plurality of hotspot clusters according to their topologies, where the hotspot clusters are associated with different hotspot topologies, and classifies a set of non-hotspot training data into a plurality of non-hotspot clusters according to their topologies, where the non-hotspot clusters are associated with different topologies. The system extracts topological and non-topological critical features from the hotspot clusters and centroids of the non-hotspot clusters. The system also creates a plurality of kernels configured to identify hotspots, where each kernel is constructed using the extracted critical features of the non-hotspot clusters and the extracted critical features from one of the hotspot clusters, and each kernel is configured to identify hotspot topologies different from hotspot topologies that the other kernels are configured to identify.

Computer architecture for mapping analog data values to a string correlithm object in a correlithm object processing system
11238072 · 2022-02-01

A string correlithm object generator is configured to output a string correlithm object comprising a plurality of sub-string correlithm objects. A node is configured to receive a plurality of data values. A memory is configured to store a node table that associates sub-string correlithm objects with the data values such that a first sub-string correlithm object is associated with a first data value and a second sub-string correlithm object is associated with a second data value. A processor is configured to receive a third data value that is between the first data value and the second data value, determine a third sub-string correlithm object that is interpolated between the first sub-string correlithm object and the second sub-string correlithm object, and associate the third sub-string correlithm object with the third data value.
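
The abstract does not say how the third sub-string correlithm object is interpolated. The following is a minimal sketch under two loud assumptions that are not in the source: each sub-string correlithm object is modeled as a list of bits, and "between" is measured by Hamming distance. The names `interpolate_bits` and `interpolate_for_value` are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def interpolate_bits(first, second, fraction):
    """Return a bit list that lies `fraction` of the way from first to second.

    Assumption: interpolation copies a proportional share of the differing
    bits from `second` into a copy of `first`.
    """
    if len(first) != len(second):
        raise ValueError("bit lists must have the same length")
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    differing = [i for i, (a, b) in enumerate(zip(first, second)) if a != b]
    flips = round(fraction * len(differing))
    result = list(first)
    for i in differing[:flips]:
        result[i] = second[i]
    return result


def interpolate_for_value(first_value, first_obj, second_value, second_obj, third_value):
    """Map a data value between two known values to an interpolated bit list."""
    fraction = (third_value - first_value) / (second_value - first_value)
    return interpolate_bits(first_obj, second_obj, fraction)


first_obj = [0, 0, 0, 0, 0, 0, 0, 0]
second_obj = [1, 1, 1, 1, 0, 0, 0, 0]
third_obj = interpolate_for_value(10.0, first_obj, 20.0, second_obj, 15.0)
print(third_obj, hamming(first_obj, third_obj), hamming(third_obj, second_obj))
```

With a data value halfway between the two known values, the interpolated object in this sketch sits at equal Hamming distance from both neighbors.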

Electronic apparatus and method for controlling the electronic apparatus

An electronic apparatus and a method for controlling the same are disclosed. The method for controlling an electronic apparatus includes acquiring multimedia content including a plurality of image frames, acquiring information related to the multimedia content, selecting at least one image frame including an object related to the acquired information among objects included in the plurality of image frames, generating description information for the at least one selected image frame based on the acquired information, and acquiring description information for the multimedia content based on the generated description information. Thus, the electronic apparatus may generate description information for more elaborate scene analysis regarding multimedia content.

Stacked cross-modal matching

The present concepts relate to matching data of two different modalities using two stages of attention. First data is encoded as a set of first vectors representing components of the first data, and second data is encoded as a set of second vectors representing components of the second data. In the first stage, the components of the first data are attended by comparing the first vectors and the second vectors to generate a set of attended vectors. In the second stage, the components of the second data are attended by comparing the second vectors and the attended vectors to generate a plurality of relevance scores. Then, the relevance scores are pooled to calculate a similarity score that indicates a degree of similarity between the first data and the second data.
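
The two attention stages and the pooling step can be sketched roughly as follows. This is an illustrative sketch, not the patented implementation: the cosine comparison, the softmax weighting, the `sharpness` parameter, and mean pooling are all assumptions, and `stacked_similarity` is a hypothetical name.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot(u, v) / (nu * nv)


def softmax(scores, sharpness=1.0):
    """Numerically stable softmax; larger `sharpness` gives peakier weights."""
    top = max(scores)
    exps = [math.exp(sharpness * (s - top)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def stacked_similarity(first_vectors, second_vectors, sharpness=4.0):
    dim = len(first_vectors[0])
    # Stage 1: for each second vector, attend over the first vectors and
    # build an attended vector as their weighted sum.
    attended = []
    for s in second_vectors:
        weights = softmax([cosine(s, f) for f in first_vectors], sharpness)
        attended.append(
            [sum(w * f[k] for w, f in zip(weights, first_vectors)) for k in range(dim)]
        )
    # Stage 2: compare each second vector with its attended vector to get
    # one relevance score per component of the second data.
    relevance = [cosine(s, a) for s, a in zip(second_vectors, attended)]
    # Pool the relevance scores into a single similarity score (mean pooling).
    return sum(relevance) / len(relevance)


regions = [[1.0, 0.0], [0.0, 1.0]]   # e.g. encoded image regions
matching_words = [[1.0, 0.0]]        # e.g. encoded words that fit a region
unrelated_words = [[-1.0, 0.0]]      # e.g. encoded words that fit no region
print(stacked_similarity(regions, matching_words))
print(stacked_similarity(regions, unrelated_words))
```

In this toy setup, the second data that aligns with one of the first vectors receives a much higher pooled score than the second data that aligns with none.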

METHOD AND APPARATUS FOR ESCAPE REORDER MODE FOR NEURAL NETWORK MODEL COMPRESSION
20210248410 · 2021-08-12

A method of an escape reorder mode for neural network model compression is performed by at least one processor and includes determining whether a frequency count of a codebook index included in a predicted codebook is less than a predetermined value, the codebook index corresponding to a neural network. The method further includes, based on the frequency count of the codebook index being determined to be greater than the predetermined value, maintaining the codebook index, and based on the frequency count of the codebook index being determined to be less than the predetermined value, assigning the codebook index to be an escape index of 0 or a predetermined number. The method further includes encoding the codebook index, and transmitting the encoded codebook index.
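
The frequency test described above might look like the following minimal sketch. Several details are assumptions not stated in the abstract: a count equal to the threshold is treated as "maintain", real codebook indices are assumed to start at 1 so that 0 is free to serve as the escape index, and the encoding and transmission steps are omitted. `apply_escape` is a hypothetical name.

```python
from collections import Counter


def apply_escape(indices, min_count, escape_index=0):
    """Replace rarely used codebook indices with an escape index.

    An index is kept when it occurs at least `min_count` times in `indices`
    (assumption: equality counts as "keep"); otherwise it becomes
    `escape_index`.
    """
    counts = Counter(indices)
    return [idx if counts[idx] >= min_count else escape_index for idx in indices]


codebook_indices = [3, 3, 3, 5, 7, 7]
print(apply_escape(codebook_indices, min_count=2))  # [3, 3, 3, 0, 7, 7]
```

Collapsing rare indices onto one escape value shrinks the set of symbols the entropy coder has to model, at the cost of signaling the escaped values separately.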

METHOD, APPARATUS, AND STORAGE MEDIUM FOR RECOGNIZING IMAGE OBJECT

The present disclosure describes methods, devices, and storage media for recognizing a target object in a target image. The method includes obtaining, by a device, an image recognition instruction carrying object identification information that indicates a target object in a target image. The device includes a memory storing instructions and a processor in communication with the memory. The method further includes obtaining, by the device, an instruction feature vector matching the image recognition instruction; obtaining, by the device, an image feature vector set matching the target image, the image feature vector set comprising an i-th image feature vector indicating an image feature of the target image at an i-th scale, i being a positive integer; and recognizing, by the device, the target object from the target image according to the instruction feature vector and the image feature vector set.

Relational model based natural language querying to identify object relationships in scene
10789288 · 2020-09-29

Various aspects of the subject technology relate to systems, methods, and machine-readable media for relational image querying. A system may receive a search query for content from a client device, where the query specifies one or more objects and one or more spatial relationships between the one or more objects. The system may generate a query vector for the query using a computer-operated neural language model. The system may compare the query vector to an indexed vector for each of the one or more spatial relationships between the one or more objects of an image. The system may determine a listing of relational images from a collection of images based on the comparison. The system may determine a ranking for each image in the listing of relational images, and provide search results responsive to the search query to the client device, which may include a prioritized listing of the relational images based on the determined ranking.
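
The comparison and ranking steps can be illustrated with a small sketch. It assumes the query vector has already been produced by some language model (not shown), that each image is indexed by one vector per spatial relationship, that vectors are compared by cosine similarity, and that an image is scored by its best-matching relationship. None of these choices is confirmed by the abstract, and `rank_images` is a hypothetical name.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    num = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return num / (nu * nv)


def rank_images(query_vector, index):
    """Rank images by their best-matching relationship vector.

    `index` maps an image id to a list of vectors, one per spatial
    relationship found in that image. Returns (image_id, score) pairs,
    highest score first.
    """
    scored = [
        (image_id, max(cosine(query_vector, vec) for vec in vectors))
        for image_id, vectors in index.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy index with made-up three-dimensional vectors.
index = {
    "a": [[1.0, 0.0, 0.0]],
    "b": [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0]],
    "c": [[0.0, 0.0, 1.0]],
}
ranking = rank_images([1.0, 0.0, 0.0], index)
print([image_id for image_id, _ in ranking])  # ['a', 'b', 'c']
```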

ELECTRONIC APPARATUS AND METHOD FOR CONTROLLING THE ELECTRONIC APPARATUS
20200112771 · 2020-04-09

An electronic apparatus and a method for controlling the same are disclosed. The method for controlling an electronic apparatus includes acquiring multimedia content including a plurality of image frames, acquiring information related to the multimedia content, selecting at least one image frame including an object related to the acquired information among objects included in the plurality of image frames, generating description information for the at least one selected image frame based on the acquired information, and acquiring description information for the multimedia content based on the generated description information. Thus, the electronic apparatus may generate description information for more elaborate scene analysis regarding multimedia content.

STACKED CROSS-MODAL MATCHING

The present concepts relate to matching data of two different modalities using two stages of attention. First data is encoded as a set of first vectors representing components of the first data, and second data is encoded as a set of second vectors representing components of the second data. In the first stage, the components of the first data are attended by comparing the first vectors and the second vectors to generate a set of attended vectors. In the second stage, the components of the second data are attended by comparing the second vectors and the attended vectors to generate a plurality of relevance scores. Then, the relevance scores are pooled to calculate a similarity score that indicates a degree of similarity between the first data and the second data.