G06V30/155

Masking non-public content

Systems and techniques for masking non-public content in screen images are provided. An example system includes a screen capture tool, a region-based object detection system, a classifier, and an image masking engine. The screen capture tool may be configured to generate a screen image representing a screen being displayed by the system. The region-based object detection system may be configured to identify multiple regions within the screen image as potential non-public content regions. The classifier may be configured to selectively classify the identified regions as non-public content regions. The image masking engine may be configured to generate a masked image by masking the regions classified as non-public content regions in the screen image.
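The detect, classify, and mask flow described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the image is assumed to be a grid of integer pixel values, the candidate regions are assumed to arrive from some upstream detector, and the names `mask_regions`, `is_non_public`, and `fill` are hypothetical.

```python
from typing import Callable, List, Tuple

# (left, top, right, bottom); right and bottom are exclusive. Hypothetical layout.
Region = Tuple[int, int, int, int]
Image = List[List[int]]


def mask_regions(
    image: Image,
    regions: List[Region],
    is_non_public: Callable[[Region], bool],
    fill: int = 0,
) -> Image:
    """Return a copy of `image` with every region the classifier flags filled in.

    `regions` stands in for the output of the region-based object detector and
    `is_non_public` stands in for the classifier; both are assumptions.
    """
    masked = [row[:] for row in image]  # leave the captured screen image untouched
    height = len(masked)
    width = len(masked[0]) if masked else 0
    for region in regions:
        if not is_non_public(region):
            continue  # candidate region was not classified as non-public
        left, top, right, bottom = region
        # Clamp to the image bounds so an oversized region cannot raise an error.
        for y in range(max(top, 0), min(bottom, height)):
            for x in range(max(left, 0), min(right, width)):
                masked[y][x] = fill
    return masked
```

Returning a copy keeps the original capture available, for example for comparison against the masked image.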

SYSTEMS AND METHODS FOR SEPARATING LIGATURE CHARACTERS IN DIGITIZED DOCUMENT IMAGES
20200302209 · 2020-09-24

Embodiments disclosed herein provide for systems and methods of separating characters associated with ligatures in digitized documents. The systems and methods provide for a ligature detection engine configured to identify the ligatures, and a ligature processing engine configured to identify and remove the glyphs attaching the separate characters forming the ligature.

AUTOMATIC IMAGE FEATURE REMOVAL
20200257919 · 2020-08-13

Apparatus and methods are described including receiving, via a computer processor, at least one image of a portion of a subject's body. One or more features that are present within the image of the portion of the subject's body, and that were artificially added to the image subsequent to acquisition of the image, are identified. In response thereto, an output is generated on an output device.

Extracting data from electronic documents

A structured data processing system includes hardware processors and a memory in communication with the hardware processors. The memory stores a data structure and an execution environment. The data structure includes an electronic document. The execution environment includes a data extraction solver configured to perform operations including identifying a particular page of the electronic document; performing an optical character recognition (OCR) on the page to determine a plurality of alphanumeric text strings on the page; determining a type of the page; determining a layout of the page; determining at least one table on the page based at least in part on the determined type of the page and the determined layout of the page; and extracting a plurality of data from the determined table on the page. The execution environment also includes a user interface module that generates a user interface that renders graphical representations of the extracted data; and a transmission module that transmits data that represents the graphical representations.

INFORMATION PROCESSING APPARATUS AND NON-TRANSITORY COMPUTER READABLE MEDIUM STORING PROGRAM

An information processing apparatus includes a character recognition section that performs character recognition of an input image to output a character recognition result, a receiving section that receives an input of a character recognition result by a person on the input image, a detection section that detects a strikethrough from the input image, a matching section that matches the character recognition result output by the character recognition section with the character recognition result by the person, which is received by the receiving section, and a control section that performs control for causing the matching section to perform matching so as to obtain a final character recognition result based on a result of the matching, in a case where the detection section detects the strikethrough.

IMAGE PROCESSING APPARATUS, IMAGE PROCESSING METHOD, AND NON-TRANSITORY COMPUTER-READABLE STORAGE MEDIUM
20200175308 · 2020-06-04

An image processing apparatus includes a determination unit configured to determine a region of the image on which to perform character recognition processing, a decision unit configured to decide, based on a number of black pixels in contact with the region determined by the determination unit, whether to perform the character recognition processing on an expanded region obtained by expanding the region determined by the determination unit rather than on the region determined by the determination unit, and a character recognition unit configured to perform the character recognition processing on that region of the image decided by the decision unit.
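The decision step in this abstract can be sketched as follows, under stated assumptions. The image is assumed to be binary with 1 for black; "in contact with the region" is interpreted here as black pixels in the one-pixel ring just outside the region; and `threshold` and `margin` are hypothetical parameters that the abstract does not specify.

```python
from typing import List, Tuple

Region = Tuple[int, int, int, int]  # (left, top, right, bottom), exclusive right/bottom


def count_touching_black(image: List[List[int]], region: Region) -> int:
    """Count black pixels (value 1) in the one-pixel ring just outside `region`."""
    left, top, right, bottom = region
    height, width = len(image), len(image[0])
    count = 0
    for y in range(top - 1, bottom + 1):
        for x in range(left - 1, right + 1):
            inside = left <= x < right and top <= y < bottom
            in_bounds = 0 <= y < height and 0 <= x < width
            if not inside and in_bounds and image[y][x] == 1:
                count += 1
    return count


def choose_ocr_region(
    image: List[List[int]], region: Region, threshold: int = 1, margin: int = 2
) -> Region:
    """Return the region to run character recognition on.

    If at least `threshold` black pixels touch the region, the region probably
    cuts through a character, so an expanded region (clamped to the image) is
    returned instead. Both defaults are illustrative assumptions.
    """
    if count_touching_black(image, region) < threshold:
        return region
    left, top, right, bottom = region
    height, width = len(image), len(image[0])
    return (
        max(left - margin, 0),
        max(top - margin, 0),
        min(right + margin, width),
        min(bottom + margin, height),
    )
```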

UTILIZING A MACHINE LEARNING MODEL TO PREDICT METRICS FOR AN APPLICATION DEVELOPMENT PROCESS
20200174774 · 2020-06-04

A device receives historical application creation data that includes data associated with creation of a plurality of applications, and processes the historical application creation data, with one or more data processing techniques, to generate processed historical application creation data. The device trains a machine learning model, with the processed historical application creation data, to generate a trained machine learning model, and receives new application data associated with a new application to be created. The device processes the new application data, with the trained machine learning model, to generate one or more predictions associated with the new application, and performs one or more actions based on the one or more predictions associated with the new application.

AUTOMATING TEXT AND GRAPHICS COVERAGE ANALYSIS OF A WEBSITE PAGE
20240021003 · 2024-01-18

Methods, systems, and non-transitory processor-readable storage media for a website page density and readability system are provided herein. An example method includes capturing an image of a website page rendered in a web browser. The website page density and readability system determines a text density associated with text content in the image, and then removes the text content from the image. The system then determines a graphic density associated with graphic content in the image, and determines a website page density associated with the website page using the text density and the graphic density.
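The order of operations in this abstract (measure text, remove text, measure graphics, combine) can be sketched as follows. The sketch assumes the captured image has already been segmented into a grid of per-pixel labels, with "T" for text, "G" for graphics, and "." for background. That labeling, the function name `page_density`, and the choice to combine the two densities by addition are all assumptions; the abstract does not say how the densities are combined.

```python
from typing import List, Sequence, Tuple


def page_density(labels: Sequence[Sequence[str]]) -> Tuple[float, float, float]:
    """Return (text_density, graphic_density, page_density) for a labeled image.

    Each density is the fraction of all pixels carrying that label.
    """
    total = sum(len(row) for row in labels)
    if total == 0:
        return (0.0, 0.0, 0.0)

    # Step 1: text density measured on the captured image.
    text_pixels = sum(1 for row in labels for cell in row if cell == "T")
    text_density = text_pixels / total

    # Step 2: remove the text content so that it cannot be counted as graphics.
    without_text: List[List[str]] = [
        ["." if cell == "T" else cell for cell in row] for row in labels
    ]

    # Step 3: graphic density measured on the text-free image.
    graphic_pixels = sum(1 for row in without_text for cell in row if cell == "G")
    graphic_density = graphic_pixels / total

    # Step 4: combine the two values. A plain sum is an assumption made here.
    return (text_density, graphic_density, text_density + graphic_density)
```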

Line removal method, apparatus, and computer-readable medium
10586125 · 2020-03-10

Complete removal of an underline that intersects a character may cause problems in a subsequent character recognition or conversion process, because parts of the character that coincide with the underline are also removed. To help reduce these problems, parts of the underline may be removed from an image while parts of the character that coincide with the underline are maintained in the image. Areas where the character coincides with the underline are defined from a reduced version of the underline. When the underline is removed, the areas where the character coincides with the underline are maintained in a second image. The second image may then be subjected to a character recognition or conversion process with potentially fewer problems.
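The goal of this abstract, erasing the underline while keeping the character strokes that cross it, can be illustrated with a simplified sketch. It does not implement the reduced-underline step named in the abstract. It swaps in a simpler column heuristic: an underline pixel is kept only when there is ink both directly above and directly below the underline band in the same column. The binary image (1 for ink), the known band rows `top` and `bottom`, and the function name are assumptions.

```python
from typing import List


def remove_underline(image: List[List[int]], top: int, bottom: int) -> List[List[int]]:
    """Return a second image with the underline band in rows [top, bottom) erased.

    Columns where a stroke passes through the band (ink directly above and
    directly below it) are kept, so a crossing character is not cut in two.
    This column test is a stand-in heuristic, not the patented method.
    """
    result = [row[:] for row in image]  # the "second image"; input stays unchanged
    height = len(image)
    width = len(image[0]) if image else 0
    for x in range(width):
        ink_above = top > 0 and image[top - 1][x] == 1
        ink_below = bottom < height and image[bottom][x] == 1
        if ink_above and ink_below:
            continue  # a character stroke crosses the underline here; keep it
        for y in range(top, bottom):
            result[y][x] = 0
    return result
```

A stroke that only touches the underline from above, without continuing below it, is not preserved by this heuristic, which is one reason a real system would need a more careful definition of the coinciding areas.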

System and method for segmenting text lines in documents
RE047889 · 2020-03-03

Methods and systems of the present embodiment provide segmenting of connected components of markings found in document images. Segmenting includes detecting aligned text. From this detected material an aligned text mask is generated and used in processing of the images. The processing includes breaking connected components in the document images into smaller pieces or fragments by detecting and segregating the connected components and fragments thereof likely to belong to aligned text.