G06V10/23

Action-object recognition in cluttered video scenes using text

A mechanism is provided for detecting action-object interactions to recognize actions in cluttered video scenes. An object bounding box is computed around an object of interest, identified in a corresponding label, in an initial frame where the object of interest appears. The object bounding box is propagated from the initial frame to a subsequent frame. For the initial frame and the subsequent frame, the object bounding boxes are refined and the frames are cropped based on the associated refined object bounding boxes. The set of cropped frames is processed to determine a probability that an action that is to be verified from the corresponding label is being performed. Responsive to determining that the probability equals or exceeds a verification threshold, a confirmation is provided that the action-object interaction video performs the action that is to be verified.
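The crop-and-verify steps of this abstract can be illustrated with a minimal sketch. The box convention, the `classifier` callable, and the function names are assumptions made for illustration only; the abstract does not specify how boxes are refined or which model yields the probability.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical conventions, not taken from the abstract:
Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), end-exclusive
Frame = List[List[int]]          # row-major grid of pixel values


def crop(frame: Frame, box: Box) -> Frame:
    """Crop a frame to its (already refined) object bounding box."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in frame[y0:y1]]


def verify_action(frames: Sequence[Frame],
                  boxes: Sequence[Box],
                  classifier: Callable[[List[Frame]], float],
                  threshold: float) -> bool:
    """Crop each frame to its box, score the crops, and compare to the threshold.

    `classifier` stands in for whatever model returns the probability
    that the labelled action is being performed.
    """
    crops = [crop(f, b) for f, b in zip(frames, boxes)]
    probability = classifier(crops)
    # Confirmation is given when the probability equals or exceeds the threshold.
    return probability >= threshold
```

The comparison is inclusive, matching the "equals or exceeds" wording of the abstract.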

Cognitive multiple-level highlight contrasting for entities

A computer identifies entity-containing content. The computer analyzes the entity-containing content for entities. The computer identifies a plurality of hierarchy levels for the entities. The computer receives selections of highlights for the entities, wherein the highlights for the entities within each hierarchy level share one or more characteristics. The computer applies entity contrasting. The computer outputs the entity-containing content with applied entity contrasting to a user.

Method and apparatus for displaying business object in video image and electronic device

Embodiments of the present disclosure provide a method and an apparatus for displaying a business object in a video image and an electronic device. The method for displaying a business object in a video image includes: detecting at least one target object from a video image, and determining a feature point of the at least one target object; determining a display position of a to-be-displayed business object in the video image according to the feature point of the at least one target object; and drawing the business object at the display position by using computer graphics. According to the embodiments of the present disclosure, the method and apparatus are conducive to saving network resources and system resources of a client.
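One simple way to derive a display position from feature points is sketched below. Placing the business object at the centroid of the feature points plus a fixed offset is an assumed rule chosen for illustration; the abstract does not state how the position is computed, and `display_position` is a hypothetical name.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def display_position(feature_points: Sequence[Point],
                     offset: Point = (0.0, 0.0)) -> Point:
    """Return the centroid of the feature points, shifted by `offset`.

    Assumed rule for illustration only: anchor the overlay at the
    centroid of the detected feature points of the target object.
    """
    if not feature_points:
        raise ValueError("at least one feature point is required")
    n = len(feature_points)
    cx = sum(p[0] for p in feature_points) / n
    cy = sum(p[1] for p in feature_points) / n
    return (cx + offset[0], cy + offset[1])
```

An offset such as `(0.0, -3.0)` would place the object slightly above the centroid, assuming image coordinates with y increasing downward.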

IMAGE PROCESSING APPARATUS FOR CHARACTER RECOGNITION, CONTROL METHOD OF THE SAME, STORAGE MEDIUM AND IMAGE PROCESSING SYSTEM
20210201065 · 2021-07-01

An image processing apparatus acquires a plurality of captured images of characters captured in time series, each of the characters including a plurality of segments, recognizes the characters captured for each of the plurality of captured images, and determines which one of the characters recognized from each of the plurality of captured images is to be output. The image processing apparatus makes this determination in accordance with a change aspect in time series of the characters recognized from each of the plurality of captured images.
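As an illustration of choosing an output from a time series of recognition results, the sketch below emits a reading only after it has stayed unchanged for several consecutive frames. This stability rule and the `min_run` parameter are assumptions; the abstract says only that the decision follows the "change aspect in time series" and does not define it.

```python
from typing import Optional, Sequence


def stable_reading(readings: Sequence[str], min_run: int = 3) -> Optional[str]:
    """Return the latest reading that persisted for `min_run` consecutive frames.

    Assumed rule for illustration: readings that appear only briefly
    (for example while segments are switching) are ignored.
    """
    result: Optional[str] = None
    run_value: Optional[str] = None
    run_len = 0
    for reading in readings:
        if reading == run_value:
            run_len += 1
        else:
            run_value, run_len = reading, 1
        if run_len >= min_run:
            result = run_value
    return result
```

With the readings `["12", "12", "18", "13", "13", "13"]`, the transient `"18"` is skipped and `"13"` is returned.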

Information extraction from images using neural network techniques and anchor words
11847806 · 2023-12-19

Scene text information extraction of desired text information from an image can be performed and managed. An information management component (IMC) can determine an anchor word based on analysis of an image. To facilitate determining desired text information in the image, IMC can re-orient the image to zero or substantially zero degrees if it determines that the orientation is skewed. IMC can utilize a neural network to determine and apply bounding boxes to text strings in the image. Using a rules-based approach or machine learning techniques, employing a trained machine learning component, IMC can utilize the anchor word along with inline grouping of textual information in the image, deep text recognition analysis, or bounding box prediction to determine or predict the desired text information in the image. IMC can facilitate presenting the desired text information, anchor word, or other information obtained from the image in an editable format.
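The rules-based use of an anchor word with inline grouping can be sketched as follows. The `TextBox` type, the same-line test, and the "nearest box to the right" rule are assumptions for illustration; the abstract does not specify the grouping rules, and the boxes here are taken as given rather than predicted by a neural network.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class TextBox:
    """Hypothetical bounding box around a recognized text string."""
    text: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def value_right_of_anchor(boxes: Sequence[TextBox], anchor: str) -> Optional[str]:
    """Return the text of the nearest box to the right of the anchor word on the same line."""
    anchor_box = next((b for b in boxes if b.text.lower() == anchor.lower()), None)
    if anchor_box is None:
        return None
    center_y = anchor_box.y + anchor_box.h / 2
    # Same line: the anchor's vertical center falls inside the candidate box.
    candidates = [
        b for b in boxes
        if b is not anchor_box
        and b.x >= anchor_box.x + anchor_box.w
        and b.y <= center_y <= b.y + b.h
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda b: b.x).text
```

This assumes the image has already been re-oriented to roughly zero degrees, so that boxes on one line share a vertical band.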

IMAGE PROCESSING APPARATUS, IMAGE PROCESSING METHOD, AND STORAGE MEDIUM
20210056336 · 2021-02-25

An image processing apparatus obtains a read image of a document including a handwritten character, generates a first image formed by pixels of the handwritten character by extracting the pixels of the handwritten character from pixels of the read image using a first learning model for extracting the pixels of the handwritten character, estimates a handwriting area including the handwritten character using a second learning model for estimating the handwriting area, and performs handwriting OCR processing based on the generated first image and the estimated handwriting area.
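A minimal sketch of combining the two model outputs before OCR is shown below. It assumes the first model yields a per-pixel mask and the second yields a rectangular area; the function name, the area convention, and the white background value are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]            # row-major grayscale pixels
Area = Tuple[int, int, int, int]   # hypothetical (x0, y0, x1, y1), end-exclusive


def handwriting_crop(image: Image, mask: Image, area: Area,
                     background: int = 255) -> Image:
    """Keep only handwriting pixels inside the estimated handwriting area.

    `mask` stands in for the first model's output (nonzero = handwriting
    pixel); `area` stands in for the second model's output. The result
    is what would be handed to a handwriting OCR engine.
    """
    x0, y0, x1, y1 = area
    return [
        [image[y][x] if mask[y][x] else background for x in range(x0, x1)]
        for y in range(y0, y1)
    ]
```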

SYSTEMS AND METHODS FOR AUGMENTING A DISPLAYED DOCUMENT

A server comprises a communications module; a processor coupled to the communications module; and a memory coupled to the processor, the memory storing processor-executable instructions which, when executed, configure the processor to receive, via the communications module and from a computing device, a signal comprising image data representing a first document; automatically analyze text in the first document based on stored classification data to identify a first parameter from the text in the first document; compare the first parameter to a second parameter, the second parameter being obtained from a data store and being associated with a second document; determine annotation data based on the comparison, the annotation data determined based on the first parameter and the second parameter; and provide, to the computing device via the communications module, a signal that includes an instruction to cause the annotation data to be overlaid on a display of the computing device, the display of the computing device displaying a real-time image of the first document, the instruction including marker data identifying a location associated with the first document and influencing a location of the annotation in the display.

Monitoring device, monitoring system, method, computer program and machine-readable storage medium
11869245 · 2024-01-09

The invention relates to a monitoring device (10) for recognizing persons in a monitoring region (2), the monitoring region (2) being video-monitored by means of at least one camera (6) and the camera (6) being designed to provide monitoring images (7) to the monitoring device (10) as video data, the monitoring device comprising: a feature determination apparatus (13), the feature determination apparatus (13) being designed to determine a feature vector (19) for each object in at least one of the monitoring images (7); a person recognition apparatus (16), the person recognition apparatus (16) being designed to detect in the monitoring images (7) a person to be recognized (11), on the basis of the determined feature vector and/or the determined feature vectors (19) of the feature determination apparatus (13) and/or a combined feature vector (18); an association apparatus (14), the association apparatus (14) being designed to determine a feature vector (19) for each person to be recognized (11) and each associated environment object of the person to be recognized (11), the association apparatus (14) being designed to determine the combined feature vector (18) on the basis of the feature vector (19) of the person to be recognized (11) and the feature vector or the feature vectors (20) of the associated environment objects.
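The idea of a combined feature vector (18) can be sketched as below. Concatenating the person's vector with the mean of the environment-object vectors, and matching by cosine similarity, are assumptions chosen for illustration; the abstract does not say how the vectors are combined or compared.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def combine(person: Vector, env_vectors: Sequence[Vector]) -> List[float]:
    """Concatenate the person vector with the mean environment-object vector.

    Assumed combination rule for illustration only. With no environment
    objects, the person vector is returned unchanged.
    """
    if not env_vectors:
        return list(person)
    n = len(env_vectors)
    dim = len(env_vectors[0])
    mean_env = [sum(v[i] for v in env_vectors) / n for i in range(dim)]
    return list(person) + mean_env


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

A person carrying the same environment objects (for example a bag) in two monitoring images would then score higher than a match on the person vector alone, under these assumptions.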

COGNITIVE MULTIPLE-LEVEL HIGHLIGHT CONTRASTING FOR ENTITIES
20200301999 · 2020-09-24

A computer identifies entity-containing content. The computer analyzes the entity-containing content for entities. The computer identifies a plurality of hierarchy levels for the entities. The computer receives selections of highlights for the entities, wherein the highlights for the entities within each hierarchy level share one or more characteristics. The computer applies entity contrasting. The computer outputs the entity-containing content with applied entity contrasting to a user.

Systems and methods for augmenting a displayed document

There may be provided a processor-implemented method of causing annotation data to be overlaid on a viewport. The method may include: receiving a signal comprising image data, the image data representing a first document; performing optical character recognition on the image data to identify text in the first document; automatically analyzing the text based on stored classification data to identify a first parameter associated with the first document; comparing the first parameter to a second parameter, the second parameter being obtained from a data store and being associated with a second document; determining annotation data based on the comparison, the annotation data determined based on the first parameter and the second parameter; and providing a signal that includes an instruction to cause the annotation data to be overlaid on a viewport displaying a real-time image of the first document.
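The compare-and-annotate step can be illustrated with a small sketch. Treating the parameters as numeric values and reporting their difference as text is an assumption; the abstract does not say what the parameters are or what form the annotation data takes, and `annotation_for` is a hypothetical name.

```python
def annotation_for(name: str, first: float, second: float) -> str:
    """Build annotation text from a parameter of the first document and
    the corresponding stored parameter of the second document.

    Assumed format for illustration only.
    """
    delta = first - second
    if delta == 0:
        return f"{name}: unchanged at {first:.2f}"
    direction = "up" if delta > 0 else "down"
    return f"{name}: {direction} {abs(delta):.2f} from {second:.2f}"
```

The resulting string would be the annotation overlaid on the viewport next to the relevant location in the live image of the first document.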