G06V30/155

TEXT EXTRACTION USING OPTICAL CHARACTER RECOGNITION

Provided herein are systems and methods for extracting text from a document. Different optical character recognition (OCR) tools are used to extract different versions of the text in the document. Metrics evaluating the quality of the extracted text are compared to identify and select higher quality extracted text. A selected portion of text is compared to a threshold to ensure a minimum level of quality. The selected portion of text is then saved. Error correction can be applied to the selected portion of text based on errors specific to the OCR tools or the document contents.
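
The selection step described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the `OcrResult` type, the tool labels, and the assumption that each tool reports a single quality score in [0, 1] are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OcrResult:
    tool: str       # hypothetical label for the OCR tool that produced the text
    text: str       # the version of the text extracted by that tool
    quality: float  # assumed quality metric in [0, 1]; higher is better


def select_text(results: List[OcrResult], min_quality: float) -> Optional[OcrResult]:
    """Pick the highest-quality version, then enforce the quality threshold."""
    if not results:
        return None
    best = max(results, key=lambda r: r.quality)
    # Reject even the best version if it does not reach the threshold.
    return best if best.quality >= min_quality else None


candidates = [
    OcrResult("tool_a", "Invoice No. 1O24", 0.71),
    OcrResult("tool_b", "Invoice No. 1024", 0.93),
]
selected = select_text(candidates, min_quality=0.80)
```

Here `selected` is the `tool_b` result. Raising `min_quality` above 0.93 would return `None`, which a caller could treat as a signal to retry or to route the page for manual review.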

LINE REMOVAL METHOD, APPARATUS, AND COMPUTER-READABLE MEDIUM
20190095743 · 2019-03-28

Complete removal of an underline which intersects a character may cause problems in a subsequent character recognition or conversion process, when parts of the character which coincided with the underline are also removed. To help reduce the problems, parts of the underline may be removed from an image while parts of the character that coincide with the underline are maintained in the image. Areas where the character coincides with the underline are defined from a reduced version of the underline. When the underline is removed, the areas where the character coincides with the underline are maintained in a second image. The second image may then be subjected to a character recognition or conversion process with potentially fewer problems.

IMAGE PROCESSING APPARATUS, IMAGE PROCESSING METHOD AND NON-TRANSITORY READABLE STORAGE MEDIUM
20190096040 · 2019-03-28

In accordance with an embodiment, an image processing apparatus comprises a scanner interface, a display section interface, and a processor. The scanner interface acquires a scanned image obtained by scanning a document. The display section interface communicates with a display. The processor generates a dropout image by deleting an area of a predetermined color from the scanned image and displays the dropout image and the scanned image on the display.
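
Generating a dropout image can be illustrated with a small sketch. It assumes, purely for illustration, that the image is a nested list of RGB tuples, that "an area of a predetermined color" means every pixel within a per-channel tolerance of a target color, and that deleted pixels are replaced with white. The function name and parameters are hypothetical.

```python
def dropout(image, target, tolerance=30, background=(255, 255, 255)):
    """Return a copy of `image` with pixels close to `target` replaced by `background`."""
    def is_close(pixel):
        # A pixel matches when every channel is within `tolerance` of the target.
        return all(abs(c - t) <= tolerance for c, t in zip(pixel, target))

    return [[background if is_close(p) else p for p in row] for row in image]


# Toy 2x2 scan: red form markings, black ink, white paper.
scanned = [
    [(255, 0, 0), (0, 0, 0)],
    [(250, 10, 5), (255, 255, 255)],
]
dropped = dropout(scanned, target=(255, 0, 0))
```

Because `dropout` builds a new image, both `scanned` and `dropped` remain available, so the two can be shown side by side as the abstract describes.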

Scoring method and system

A method to score a round of golf using a golf scoring system is provided. The method includes capturing a photograph of a physical scorecard including handwritten text using a camera of a user device, wherein the physical scorecard includes handwritten characters disposed at least partially within rectilinear boxes. The method also includes accessing the photograph in an application of the user device. The method also includes identifying the rectilinear boxes by using a color contrast between the rectilinear boxes of the physical scorecard and a background color of the physical scorecard. The method also includes removing the rectilinear boxes. The method also includes, after removing the rectilinear boxes, extracting at least the handwritten characters from the physical scorecard. The method also includes, after extracting at least the handwritten characters, calculating a score of the round of golf in the application using the extracted handwritten characters.
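
The final calculation step can be sketched as below. It assumes each extracted box yields one string holding the stroke count for a hole, and that the score is the total number of strokes. The function name and the error handling are illustrative assumptions.

```python
from typing import List


def calculate_score(hole_entries: List[str]) -> int:
    """Sum per-hole stroke counts given as extracted character strings."""
    strokes = []
    for hole, entry in enumerate(hole_entries, start=1):
        cleaned = entry.strip()
        # Refuse to guess when the extracted characters are not a plain number.
        if not cleaned.isdigit():
            raise ValueError(f"hole {hole}: cannot read strokes from {entry!r}")
        strokes.append(int(cleaned))
    return sum(strokes)


front_nine = ["4", "5", "3", "4", "6", "4", "3", "5", "4"]
total = calculate_score(front_nine)  # 38 strokes
```

Raising an error on an unreadable entry, rather than skipping it, keeps a misrecognized box from silently lowering the score.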

INFORMATION PROCESSING SYSTEM, METHOD, AND NON-TRANSITORY COMPUTER-EXECUTABLE MEDIUM
20240257547 · 2024-08-01

An information processing system includes circuitry. The circuitry acquires a captured image by capturing a document. The circuitry performs an analysis process using the captured image. The circuitry selects, for each of at least one setting item of a plurality of setting items relating to image processing to be performed on the captured image, at least one setting value from among configurable setting values as a candidate for a recommended setting. The circuitry performs image processing repeatedly on the captured image while changing setting values of the plurality of setting items with a setting value of the at least one setting item restricted to the at least one setting value selected as the candidate for the recommended setting. The circuitry determines recommended settings for the plurality of setting items relating to image processing to obtain an image suitable for character recognition.
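
The restricted search described above resembles a grid search over a narrowed space, sketched below under that assumption. The setting names, the `score` callback (a stand-in for "how suitable is the processed image for character recognition"), and the exhaustive search are all hypothetical.

```python
from itertools import product


def recommend_settings(configurable, restricted, score):
    """Try every combination, using restricted candidates where provided.

    configurable: dict of setting item -> all configurable values
    restricted:   dict of setting item -> candidate values chosen by the analysis
    score:        callable(settings dict) -> float, higher is better (assumed)
    """
    search_space = {
        item: restricted.get(item, values) for item, values in configurable.items()
    }
    items = list(search_space)
    best, best_score, trials = None, float("-inf"), 0
    for combo in product(*(search_space[item] for item in items)):
        settings = dict(zip(items, combo))
        trials += 1
        current = score(settings)
        if current > best_score:
            best, best_score = settings, current
    return best, trials


configurable = {
    "binarize_threshold": [96, 128, 160, 192],
    "denoise": [False, True],
    "sharpen": [0, 1, 2],
}
restricted = {"binarize_threshold": [128, 160]}  # candidates from the analysis step


def toy_score(s):
    # Toy objective with a known optimum, used only to exercise the search.
    return -abs(s["binarize_threshold"] - 160) + (1 if s["denoise"] else 0) - abs(s["sharpen"] - 1)


best, trials = recommend_settings(configurable, restricted, toy_score)
```

Restricting one item to two candidates cuts the number of image-processing passes from 24 to 12 in this toy case, which is the practical point of selecting candidates before the repeated processing.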

EXTRACTING DATA FROM ELECTRONIC DOCUMENTS
20180276462 · 2018-09-27

A structured data processing system includes hardware processors and a memory in communication with the hardware processors. The memory stores a data structure and an execution environment. The data structure includes an electronic document. The execution environment includes a data extraction solver configured to perform operations including identifying a particular page of the electronic document; performing an optical character recognition (OCR) on the page to determine a plurality of alphanumeric text strings on the page; determining a type of the page; determining a layout of the page; determining at least one table on the page based at least in part on the determined type of the page and the determined layout of the page; and extracting a plurality of data from the determined table on the page. The execution environment also includes a user interface module that generates a user interface that renders graphical representations of the extracted data; and a transmission module that transmits data that represents the graphical representations.

INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING SYSTEM, AND NON-TRANSITORY COMPUTER READABLE MEDIUM
20180268212 · 2018-09-20

An information processing apparatus includes: a first extracting unit that extracts a position of a character entry box in an input image; a recognizing unit that recognizes a character string written in the character entry box; a calculating unit that calculates recognition accuracy of each of characters of the character string recognized by the recognizing unit; a first detector that detects that a value based on the recognition accuracy is equal to or larger than a preset threshold value; a second extracting unit that extracts a position of a circumscribed rectangle for each character of the character string in the input image; a second detector that detects contact of the circumscribed rectangle with the character entry box; and a display that displays the character string to be corrected on the basis of a result of detection by the first detector and a result of detection by the second detector.
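
The two detectors can be sketched as simple predicates. Several details here are assumptions rather than statements from the abstract: the "value based on the recognition accuracy" is taken to be `1 - accuracy`, rectangles are `(left, top, right, bottom)` in pixels, and a character is flagged for correction when either detector fires.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    left: int
    top: int
    right: int
    bottom: int


def touches_box(char: Rect, box: Rect, margin: int = 0) -> bool:
    """Second detector: the circumscribed rectangle reaches the entry box border."""
    return (
        char.left <= box.left + margin
        or char.top <= box.top + margin
        or char.right >= box.right - margin
        or char.bottom >= box.bottom - margin
    )


def low_confidence(accuracy: float, threshold: float = 0.2) -> bool:
    """First detector: assumed value (1 - accuracy) is at or above the threshold."""
    return (1.0 - accuracy) >= threshold


entry_box = Rect(0, 0, 100, 40)
recognized = [
    (Rect(10, 8, 28, 32), 0.97),   # clean character
    (Rect(35, 5, 55, 40), 0.91),   # stroke reaches the bottom border
    (Rect(60, 10, 80, 30), 0.62),  # poorly recognized character
]
# Assumed combination rule: flag when either detector fires.
flags = [touches_box(rect, entry_box) or low_confidence(acc) for rect, acc in recognized]
```

In this toy input the second and third characters are flagged, the former for contact with the box and the latter for low recognition accuracy.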

Document optical character recognition
10068132 · 2018-09-04

Vehicles and other items often have corresponding documentation, such as registration cards, that includes a significant amount of textual information that can be used in identifying the item. Traditional OCR may be unsuccessful when dealing with non-cooperative images. Accordingly, features such as dewarping, text alignment, and line identification and removal may aid in OCR of non-cooperative images. Dewarping involves determining the curvature of a document depicted in an image and processing the image so that the document more closely conforms to the ideal of a cooperative image. Text alignment involves determining an actual alignment of depicted text, even when the depicted text is not aligned with depicted visual cues. Line identification and removal involves identifying portions of the image that depict lines and removing those lines prior to OCR processing of the image.

TEXT RECOGNIZER USING CONTOUR SEGMENTATION

Examples of a computing device for text recognition are provided. The computing device comprises a processor coupled to a storage medium that stores instructions, which upon execution by the processor, cause the processor to receive a data file comprising an image, identify at least one contour in the image, partition the at least one contour into a plurality of segments, and identify a text character in each segment of the plurality of segments.

Shadow detection and removal in license plate images

A method, system, and apparatus for license plate relighting comprise collecting an image of a license plate; performing license plate recognition on the image of the license plate; calculating a confidence metric for the license plate recognition; and, if the confidence metric is below a predetermined threshold, performing a shadow detection and relighting method comprising identifying a shaded region of said license plate, determining whether the shaded region is actually shaded, and relighting the actually shaded region.
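
The confidence-gated control flow can be sketched as follows. The `recognize` and `relight` callables are hypothetical stand-ins, the toy versions below are not real recognizers, and repeating recognition on the relit image is an assumption that the abstract does not state.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]  # toy representation: grayscale rows with values 0-255


def read_plate(
    image: Image,
    recognize: Callable[[Image], Tuple[str, float]],
    relight: Callable[[Image], Image],
    threshold: float = 0.8,
) -> Tuple[str, float, bool]:
    """Return (text, confidence, was_relit)."""
    text, confidence = recognize(image)
    if confidence >= threshold:
        return text, confidence, False  # confident enough: skip relighting
    # Assumption: recognition is repeated on the relit image.
    text, confidence = recognize(relight(image))
    return text, confidence, True


def fake_recognize(image: Image) -> Tuple[str, float]:
    # Toy stand-in: confidence simply grows with mean brightness.
    pixels = [p for row in image for p in row]
    return "ABC123", sum(pixels) / (255 * len(pixels))


def fake_relight(image: Image) -> Image:
    # Toy stand-in: brighten every pixel instead of detecting shaded regions.
    return [[min(255, p * 2) for p in row] for row in image]


shaded = [[100, 110], [120, 110]]
result = read_plate(shaded, fake_recognize, fake_relight)
```

Gating on the confidence metric means the relatively costly shadow detection and relighting step runs only for plates that were read poorly the first time.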