G06V30/1463

Systems and methods for trigger-based updates to camograms for autonomous checkout in a cashier-less shopping

Systems and methods for tracking inventory items in an area of real space are disclosed. The method includes receiving a signal generated in dependence on sensors. The signal indicates a change to a portion of an image of an area of real space. The method includes, in response to receiving the signal, implementing a trained location detection model to determine, based on inputs, whether an inventory item identified in the portion of the image has changed a position in the area of real space. The method includes implementing a trained item classification model to determine a classification of the inventory item. The method includes updating an inventory database with inventory item data determined in dependence on the classification of the inventory item to provide an updated map of the area of real space as a result of the received signal indicating the change to the portion of the image.

IMAGE PROCESSING APPARATUS, IMAGE PROCESSING SYSTEM, OUTPUT APPARATUS, IMAGE PROCESSING METHOD, AND RECORDING MEDIUM IN WHICH IMAGE PROCESSING PROGRAM IS RECORDED
20250329146 · 2025-10-23

An image processing apparatus includes an acquisition processing unit that acquires character image data, and a generation processing unit that generates learning data by executing predetermined augmentation processing on the character image data. In a case where the character image data is a specific character, the generation processing unit generates learning data by executing, on the specific character, augmentation processing different from augmentation processing for character image data other than the specific character.

SYSTEMS AND METHODS FOR OPTICAL CHARACTER RECOGNITION USING TARGETED REGIONS OF INTEREST

Embodiments of the present disclosure provide systems and methods for optical character recognition (OCR) using targeted regions of interest (ROIs). In one embodiment, a method includes receiving, by one or more processors, data representative of an image comprising a text string, causing, by the one or more processors, a user interface to display the image comprising the text string, causing, by the one or more processors, the user interface to display a window on the image, the window representative of a region for performing an OCR operation, and performing, by the one or more processors, the OCR operation for the region based at least in part on a composite directionality condition of the text string. In some examples, the composite directionality condition of the text string includes a reading direction of the text string and a character orientation of the text string.

IMAGE PROCESSING APPARATUS

A control portion recognizes, as an independent character region, a region of which the absolute value of the difference between the width in a first direction, which is the writing direction, and the width in a second direction, which is orthogonal to the first direction, is smaller than a first threshold value, and checks, based on the width in the first direction of a reference region that is a character region adjacent, in the second direction, to a plurality of independent character regions aligned in the first direction but that is not an independent character region, whether a character string composed of characters in the plurality of independent character regions is one word. On judging it to be one word, the control portion deals with the character string resulting from uniting the characters in the plurality of independent character regions as one word.
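The two checks in this abstract can be sketched as follows. This is a minimal illustration, not the claimed implementation: the `Region` type, the function names, and the `ratio` parameter are hypothetical. The abstract states only that the one-word check is based on the reference region's width in the first direction, so the comparison used in `is_one_word` is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    x: float       # start along the first (writing) direction
    y: float       # start along the second (orthogonal) direction
    width: float   # extent in the first direction
    height: float  # extent in the second direction


def is_independent(region: Region, first_threshold: float) -> bool:
    """A region is 'independent' when its two widths differ by less than the threshold."""
    return abs(region.width - region.height) < first_threshold


def is_one_word(independent_run: list[Region], reference: Region, ratio: float = 1.0) -> bool:
    """Hypothetical criterion (assumption): treat the run as one word when its
    total span in the first direction fits within the reference region's width."""
    start = min(r.x for r in independent_run)
    end = max(r.x + r.width for r in independent_run)
    return (end - start) <= ratio * reference.width


run = [Region(0, 0, 10, 10), Region(12, 0, 10, 11), Region(24, 0, 10, 9)]
reference = Region(0, 14, 40, 10)  # adjacent line, not an independent region
one_word = all(is_independent(r, 3) for r in run) and is_one_word(run, reference)
```

Here the three near-square regions pass the independence test, the wide reference region does not, and the run is united into one word under the assumed criterion.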

System and method for automated document analysis

A method involves detecting primary entities in a document, involving determining that a subset of the primary entities are associated with a first primary entity type, and determining a second primary entity type of one of the primary entities. The method further involves processing the primary entity of the second primary entity type to determine a secondary entity type of the primary entity. The secondary entity type is a subcategory of the second primary entity type. The method also involves hierarchically organizing the primary entities into a document layout structure that includes a top level and a child level. The top level is established by the first subset of primary entities based on the first primary entity type identifying the first subset as headings, and the child level is established by the primary entity based on the second primary entity type, the child level identifying the secondary entity type.
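One way to picture the two-level layout structure is the sketch below. The dictionary keys, the example type names, and the rule that each non-heading entity attaches to the most recent heading in document order are all assumptions made for illustration; the abstract does not specify how children are assigned to headings.

```python
def build_layout(entities):
    """Group entities into a top level (headings) and a child level.

    Each entity is a dict with hypothetical keys 'text', 'primary', and an
    optional 'secondary' (a subcategory of the primary type).
    """
    layout = []
    current = None
    for entity in entities:
        if entity["primary"] == "heading":
            # Headings establish the top level.
            current = {"heading": entity["text"], "children": []}
            layout.append(current)
            continue
        if current is None:
            # Assumption: entities before any heading go under an untitled node.
            current = {"heading": None, "children": []}
            layout.append(current)
        # The child level records the secondary (subcategory) type.
        current["children"].append(
            {
                "text": entity["text"],
                "primary": entity["primary"],
                "secondary": entity.get("secondary"),
            }
        )
    return layout


entities = [
    {"text": "1. Pricing", "primary": "heading"},
    {"text": "Item / Cost", "primary": "table", "secondary": "price_table"},
    {"text": "2. Terms", "primary": "heading"},
]
layout = build_layout(entities)
```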

CHARACTER RECOGNITION AND DOCUMENT INTERPRETATION METHOD AND SYSTEM BASED ON LAYOUT RECOGNITION

A character recognition system includes: a character-related information extraction unit configured to include a deep learning model trained to extract character area information, inter-character space area information, interline scale information of each character, and orientation information of each character from an image including text; a word unit division recognition unit configured to obtain word division information obtained by dividing characters included in the image into word units based on the character area information and inter-character space area information; a text line recognition unit configured to recognize text lines in the image based on the character area information, interline scale information, and orientation information; a layout analysis unit configured to obtain layout information of the text included in the image based on the recognized text lines; and a character recognition unit configured to recognize each of the characters included in the image and obtain text data in which the recognized characters are aligned based on the word division information and the layout information.

Systems and methods for generating textual instructions for manufacturers from hybrid textual and image data

A system for generating textual instructions for manufacturers from hybrid textual and image data includes a manufacturing instruction generator that may generate a language processing module from a first training set including at least a training annotated file describing at least a first product to manufacture, the at least an annotated file containing one or more textual data, and at least an instruction set containing one or more manufacturing instructions to manufacture the at least a first product. The manufacturing instruction generator may use the language processing module to generate textual instructions for manufacturers from at least an annotated file and may initiate manufacture using the generated manufacturing instructions.

Automatic orientation correction for captured images

In some implementations, a device may receive an image of a document, the image depicting a reference feature associated with the document, the reference feature including at least one of: a face of a person, a machine-readable code, or a text field. The device may identify a rotational angle of the reference feature as depicted in the image based on comparing the reference feature as depicted in the image to one or more orientation parameters of the reference feature associated with a display orientation associated with the document. The device may rotate the image of the document by an angle to obtain an orientated image of the document, the angle being based on the rotational angle of the reference feature as depicted in the image. The device may provide the orientated image of the document for display.
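The angle computation described here can be sketched as below. This is an illustration under stated assumptions: angles are measured counter-clockwise in degrees, the image is a row-major nested list, and the rotation is restricted to quarter turns to keep the example dependency-free. The function names are hypothetical, and detecting the reference feature itself is out of scope.

```python
def correction_angle(observed_deg: float, expected_deg: float = 0.0) -> float:
    """Counter-clockwise angle mapping the observed feature orientation onto
    the expected one, normalized to the range (-180, 180]."""
    delta = (expected_deg - observed_deg) % 360.0
    return delta - 360.0 if delta > 180.0 else delta


def rotate_quarter_turns(image, turns: int):
    """Rotate a row-major nested-list image counter-clockwise by 90 * turns degrees."""
    for _ in range(turns % 4):
        image = [list(row) for row in zip(*image)][::-1]
    return image


def orient(image, observed_deg: float, expected_deg: float = 0.0):
    """Assumption: snap the correction to the nearest quarter turn."""
    turns = round(correction_angle(observed_deg, expected_deg) / 90.0)
    return rotate_quarter_turns(image, turns)


upright = [[1, 2], [3, 4]]
captured = rotate_quarter_turns(upright, 1)    # document photographed rotated 90 degrees
restored = orient(captured, observed_deg=90.0)  # feature detected at 90 degrees
```

A production pipeline would more likely rotate by an arbitrary angle using an imaging library; the quarter-turn restriction is only a simplification for this sketch.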

VISUAL CODE AUTHENTICATION VIA HUMAN MOTION AND SENSOR MEASUREMENTS
20260051187 · 2026-02-19

Systems, apparatuses, and methods may provide for technology that identifies user data decoded from a visual code and an orientation of a mobile device that displayed the visual code. The technology identifies, based on the user data, a sensor measurement generated by a sensor of the mobile device, and determines whether to perform a computing process based on the sensor measurement and the orientation.

Dental scanning
12551313 · 2026-02-17

The present teachings relate to a method for assisting an intraoral scan including providing an intraoral image of a patient, and providing an extraoral image; the extraoral image being representative of the position of an extraoral scanner part. The teachings further include generating, using the intraoral image and the extraoral image, a mapping function correlating the position of the extraoral scanner part with the position of the intraoral scanner part; and computing, using the mapping function, a desired extraoral position of the extraoral scanner part; the desired extraoral position corresponding to a preferable intraoral position of the intraoral scanner part. The present teachings also relate to a system, a device, a use, data, and a storage medium.