G06V10/86

DEVICE AND COMPUTER-IMPLEMENTED METHOD FOR OBJECT TRACKING
20230051014 · 2023-02-16

A device and a computer-implemented method for object tracking. The method comprises providing a sequence of digital images and determining a sequence of relational graph embeddings. A first relational graph embedding of the sequence comprises a first object embedding representing a first object in a first digital image of the sequence of digital images, and a first relation embedding of a relation for the first object embedding. The first relation embedding relates the first object embedding to embeddings representing other objects of the first digital image in the first relational graph embedding, and to embeddings in a second relational graph embedding of the sequence that represent objects of a second digital image of the sequence of digital images.

Scene-aware video encoder system and method

Embodiments of the present disclosure disclose a scene-aware video encoder system. The scene-aware video encoder system transforms a sequence of video frames of a video of a scene into a spatio-temporal scene graph. The spatio-temporal scene graph includes nodes representing one or multiple static and dynamic objects in the scene. Each node of the spatio-temporal scene graph describes an appearance, a location, and/or a motion of each of the objects (static and dynamic) at different time instances. The nodes of the spatio-temporal scene graph are embedded into a latent space using a spatio-temporal transformer that encodes different combinations of different nodes of the spatio-temporal scene graph, corresponding to different spatio-temporal volumes of the scene. Each node encoded in each of the combinations is weighted with an attention score determined as a function of similarities of spatio-temporal locations of the different nodes in the combination.
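
The location-based attention weighting described above can be illustrated with a minimal sketch. The abstract does not specify the similarity function, so the Gaussian-style kernel on (x, y, t) coordinates, the `bandwidth` parameter, and the function name `location_attention` are all assumptions made for illustration, not the disclosed method.

```python
import math

def location_attention(locations, query_index, bandwidth=1.0):
    """Return softmax attention weights over nodes for one query node.

    locations: list of (x, y, t) tuples, one per scene-graph node.
    The similarity used here is an assumption: the negative squared
    Euclidean distance between spatio-temporal locations, scaled by
    a hypothetical bandwidth parameter.
    """
    qx, qy, qt = locations[query_index]
    scores = [
        -((x - qx) ** 2 + (y - qy) ** 2 + (t - qt) ** 2) / (2.0 * bandwidth ** 2)
        for x, y, t in locations
    ]
    # Numerically stable softmax over the similarity scores.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Three nodes: the query itself, a nearby node, and a distant node.
nodes = [(0.0, 0.0, 0.0), (0.1, 0.0, 1.0), (5.0, 5.0, 10.0)]
weights = location_attention(nodes, query_index=0)
```

In this sketch, nodes that are close to the query node in space and time receive larger weights, and the weights over one combination of nodes sum to one.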

METHOD AND ELECTRONIC DEVICE FOR AUTOMATICALLY GENERATING REGION OF INTEREST CENTRIC IMAGE
20230237757 · 2023-07-27

A method for automatically generating a Region Of Interest (ROI) centric image in an electronic device is provided. The method includes receiving an image frame(s), where the image frame(s) includes a plurality of objects. Further, the method includes identifying a first ROI, a second ROI, and a non-ROI in the image frame(s). Further, the method includes rescaling the second ROI in the image frame(s), summarizing the non-ROI in the image frame(s), and automatically generating the ROI centric image, where the ROI centric image includes the rescaled-first ROI, the rescaled-second ROI, the rescaled-non-ROI, and the summarized non-ROI.

IDENTIFICATION OF A VEHICLE HAVING VARIOUS DISASSEMBLY STATES

Aspects of the present disclosure relate to a method of identifying a vehicle, and a system thereof. The method can include receiving a first image of a vehicle from a first camera and classifying the vehicle in the first image with a vehicle class label. The method can also include determining a first vehicle fingerprint for the vehicle. The method can also include detecting any changes in the first vehicle fingerprint and the vehicle class label after a first time period. The detected changes in the first vehicle fingerprint can correspond to a disassembly state of the vehicle. The method can also include performing, if the vehicle class label is unchanged, at least one action in response to detected changes in the first vehicle fingerprint.
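
The decision step at the end of this abstract (act only when the fingerprint has changed but the class label has not) can be sketched as follows. The representation of a fingerprint as a non-zero feature vector, the cosine-distance comparison, the `threshold` value, and the names `cosine_distance` and `check_vehicle` are assumptions for illustration; the abstract does not state how fingerprints are computed or compared.

```python
import math

def cosine_distance(a, b):
    """Cosine distance between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def check_vehicle(before, after, threshold=0.1):
    """Compare two observations of a vehicle taken a time period apart.

    before/after: dicts with 'label' (str) and 'fingerprint' (list of floats).
    The threshold is a hypothetical tuning parameter.
    """
    label_changed = before["label"] != after["label"]
    distance = cosine_distance(before["fingerprint"], after["fingerprint"])
    fingerprint_changed = distance > threshold
    if fingerprint_changed and not label_changed:
        return "act"  # e.g. record a possible disassembly state
    return "no_action"

first = {"label": "sedan", "fingerprint": [1.0, 0.0, 1.0]}
later = {"label": "sedan", "fingerprint": [1.0, 1.0, 0.0]}
result = check_vehicle(first, later)
```

Here the class label is unchanged while the fingerprint has moved well past the threshold, so the sketch returns `"act"`.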

IMAGE PROCESSING APPARATUS, IMAGE PROCESSING METHOD, AND NON-TRANSITORY STORAGE MEDIUM

The present invention provides an image processing apparatus (100) including an image acquisition unit (101) that acquires a query image, based on an input keyword, a skeleton structure detection unit (102) that detects a two-dimensional skeleton structure of a person included in the query image, a feature value computation unit (103) that computes a feature value of the detected two-dimensional skeleton structure, and a search unit (105) that searches, based on a degree of similarity of the computed feature value, for an analysis target image including a person in a state similar to a state of a person included in the query image from the analysis target image.
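
The search step, which ranks analysis target images by the similarity of skeleton feature values, can be sketched roughly as below. The feature definition (keypoints normalised by bounding-box origin and height), the similarity formula, and the names `skeleton_feature`, `similarity`, and `search` are assumptions for illustration; the abstract does not define them. Skeleton detection itself is assumed to have already produced the 2D keypoints.

```python
import math

def skeleton_feature(keypoints):
    """Flatten 2D keypoints into a translation- and scale-normalised vector."""
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    height = (max(ys) - min(ys)) or 1.0  # avoid division by zero
    return [
        v
        for x, y in keypoints
        for v in ((x - min(xs)) / height, (y - min(ys)) / height)
    ]

def similarity(f1, f2):
    """Map Euclidean distance between feature vectors into (0, 1]."""
    return 1.0 / (1.0 + math.dist(f1, f2))

def search(query_keypoints, targets, top_k=1):
    """Return the names of the top_k targets most similar to the query pose."""
    q = skeleton_feature(query_keypoints)
    ranked = sorted(
        targets,
        key=lambda name: similarity(q, skeleton_feature(targets[name])),
        reverse=True,
    )
    return ranked[:top_k]

query = [(0, 0), (0, 2), (1, 1)]
targets = {
    "a": [(10, 10), (10, 14), (12, 12)],  # same pose, shifted and scaled
    "b": [(0, 0), (3, 0), (1, 1)],        # different pose
}
best = search(query, targets)
```

Because the features are normalised for position and scale, target `"a"` matches the query pose exactly and is ranked first.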

METHOD AND APPARATUS FOR TEXT-TO-IMAGE GENERATION USING SELF-SUPERVISED DISCRIMINATOR TO EXTRACT IMAGE FEATURE

An apparatus for text-to-image generation that is self-supervised, is based on a one-stage generative adversarial network, and uses a discriminator network that extracts an image feature may comprise: a text encoder that extracts a sentence vector from input text; a discriminator that determines, from the sentence vector and an image input from a generator, whether or not the image matches the text; and a decoder that is connected to an encoder inside the discriminator, wherein the decoder and the encoder form an autoencoder structure inside the discriminator.
