Patent classifications
G06V30/184
END-TO-END SYSTEM FOR EXTRACTING TABULAR DATA PRESENT IN ELECTRONIC DOCUMENTS AND METHOD THEREOF
The present disclosure describes a method, system, and a computer readable medium for extracting tabular data present in a document. The method comprises detecting presence of at least one table in the document using a deep learning based model and a statistical method. The method further comprises identifying a type of the table based on determining a count of horizontal and vertical lines, presence of outer borders, and presence of row-column intersections in the table. The type of the table comprises a bordered table, a partially bordered table, or a borderless table. The method further comprises processing the detected table, depending on its type, to identify one or more cells present in the table. The method further comprises generating an output file by extracting the tabular data present in the table, where the extracting comprises performing optical character recognition on the identified one or more cells.
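The table-type decision described above (counting ruling lines, checking for an outer border, and comparing row-column intersections against a full grid) could be sketched roughly as follows. This is a hypothetical illustration, not the patented implementation; the function name and the full-grid intersection heuristic are assumptions.

```python
def classify_table(h_lines: int, v_lines: int, has_outer_border: bool,
                   intersections: int, rows: int, cols: int) -> str:
    """Label a detected table as bordered, partially bordered, or borderless.

    A fully bordered table with `rows` x `cols` cells has a complete grid of
    (rows + 1) * (cols + 1) line intersections plus an outer border.
    """
    expected = (rows + 1) * (cols + 1)  # intersection count of a full grid
    if has_outer_border and intersections >= expected:
        return "bordered"
    if h_lines == 0 and v_lines == 0:
        return "borderless"
    return "partially bordered"
```

Once a cell grid is known for the classified table, each cell region would be passed to OCR and the recognized strings assembled into the output file.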
POLYGON DETECTION DEVICE, POLYGON DETECTION METHOD, AND POLYGON DETECTING PROGRAM
An object is to provide a polygon detection device, a polygon detection method, and a polygon detection program to accurately detect a polygon resembling a reference polygon from an image.
The polygon detection device acquires a ratio among lengths of sides of a reference polygon included in an appearance of a predetermined object. The polygon detection device acquires a photographic image of the predetermined object. The polygon detection device detects line segments from the acquired photographic image. The polygon detection device forms at least one polygon based on the detected line segments. The polygon detection device identifies, from among the formed polygons, a polygon corresponding to the reference polygon based on a degree of similarity between the ratio among the lengths of sides of the formed polygon and the acquired ratio among the lengths of sides of the reference polygon.
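The side-ratio comparison could be sketched as below: normalize each candidate polygon's side lengths by its perimeter, then score it against the reference ratio over all cyclic rotations so the choice of starting vertex does not matter. The function names and the L1 distance score are assumptions for illustration.

```python
import math


def side_ratio(points):
    """Side lengths of a polygon (vertex list), normalized by perimeter."""
    n = len(points)
    lengths = [math.dist(points[i], points[(i + 1) % n]) for i in range(n)]
    total = sum(lengths)
    return [length / total for length in lengths]


def ratio_similarity(poly, ref_ratio):
    """Dissimilarity between a candidate polygon's side ratio and a reference
    ratio, minimized over cyclic rotations. Lower means more similar."""
    r = side_ratio(poly)
    best = float("inf")
    for k in range(len(r)):
        rotated = r[k:] + r[:k]
        best = min(best, sum(abs(a - b) for a, b in zip(rotated, ref_ratio)))
    return best
```

The candidate with the lowest score would then be identified as the polygon corresponding to the reference polygon.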
Image processing apparatus for placing a character recognition target region at a position of a predetermined region in an image conforming to a predetermined format
An image processing apparatus includes a storage device for storing a position of a predetermined region in an image conforming to a predetermined format, a processor for acquiring an input image including a character recognition target region, cutting out a region corresponding to the character recognition target region from the input image or an image generated from the input image to generate a corrected image in which the region is placed at the position of the predetermined region in the image conforming to the predetermined format, and detecting a character from the corrected image, and an output device for outputting information related to the detected character.
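The core correction step (cutting out the recognition target region and pasting it at the stored position on a format-conforming canvas) could be sketched as follows, using nested lists as a stand-in for image data. The function name, box convention, and fill value are assumptions, not the apparatus's actual interface.

```python
def place_region(image, src_box, dst_pos, canvas_h, canvas_w, fill=255):
    """Cut src_box = (top, left, height, width) out of `image` and paste it
    at dst_pos = (top, left) on a blank canvas, so downstream character
    detection always finds the field at the predetermined position."""
    t, l, h, w = src_box
    region = [row[l:l + w] for row in image[t:t + h]]
    canvas = [[fill] * canvas_w for _ in range(canvas_h)]  # blank page
    dt, dl = dst_pos
    for i, row in enumerate(region):
        canvas[dt + i][dl:dl + len(row)] = row
    return canvas
```

Character detection would then run on the corrected canvas rather than the raw input image.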
Systems and methods for strike through detection
The present disclosure is directed to systems and methods for strike through detection and, more particularly, to systems and methods for detecting a strike through in an address block of a mailpiece. The method is implemented in a computing device and includes: generating edges of lines within a text block identified through optical character recognition processes; locating text lines within the text block; characterizing the edges within the text lines and outside of the text lines; and grouping identified edges of the characterized edges outside of the text lines into co-linear groups.
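The final step above (grouping edges that fall outside located text lines into co-linear groups) could be sketched as below. Edges are represented as (x0, y0, x1, y1) segments and clustered by mid-height; the function name, tolerance, and clustering rule are illustrative assumptions, not the disclosed implementation.

```python
def colinear_groups(edges, text_bands, y_tol=2.0):
    """Group edges lying outside text-line bands into co-linear clusters.

    edges: iterable of (x0, y0, x1, y1) segments.
    text_bands: list of (top, bottom) vertical extents of located text lines.
    Edges whose mid-height falls inside a band are attributed to the text
    itself; the rest are candidate strike-through marks, grouped when their
    mid-heights lie within y_tol of each other.
    """
    def mid_y(e):
        return (e[1] + e[3]) / 2.0

    outside = [e for e in edges
               if not any(top <= mid_y(e) <= bot for top, bot in text_bands)]
    groups = []
    for e in sorted(outside, key=mid_y):
        if groups and abs(mid_y(e) - mid_y(groups[-1][-1])) <= y_tol:
            groups[-1].append(e)  # continues the current co-linear group
        else:
            groups.append([e])    # starts a new group
    return groups
```

A sufficiently long co-linear group spanning a text line would then be flagged as a strike through.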
Optical character recognition of series of images
Systems and methods for performing optical character recognition (OCR) are disclosed. An example method may include receiving a current image that overlaps with a previous image of a series of images of an original document; performing OCR of the current image to produce an OCR text; identifying a plurality of textual artifacts in the images that are each represented by a sequence of symbols having a frequency of occurrence within the OCR text falling below a threshold frequency; identifying corresponding base points that are each associated with a textual artifact; identifying parameters of a coordinate transformation converting coordinates of the previous image into coordinates of the current image; associating part of the OCR text with a cluster of symbol sequences, the symbol sequences being produced by processing previously received images; identifying a median string representing the cluster; and producing a resulting OCR text representing a portion of the original document.
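The median-string step could be approximated as below: among a cluster of symbol sequences recognized from overlapping images, pick the member that minimizes the summed edit distance to all the others (a set-median approximation). This is a hedged sketch; the abstract does not specify how the median string is computed.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[-1] + 1,          # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def median_string(cluster):
    """Member of the cluster minimizing total edit distance to the others,
    used as the representative reading of a repeatedly OCR'd text fragment."""
    return min(cluster, key=lambda s: sum(edit_distance(s, t) for t in cluster))
```

Because OCR errors vary between overlapping frames, the median string tends to suppress per-frame misrecognitions.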
Interactive 3D annotation tool with slice interpolation
A 3D segmentation editing system accurately updates the segmentations of non-edited images of a 3D scan to reflect segmentation edits applied to other images of the scan using localized interpolation. In one or more embodiments, rather than replacing the entireties of the initial segmentations of non-edited images with newly generated, globally interpolated segmentations, the segmentation editing system applies a distance-based criterion to the interpolation of segmentation edits, such that only portions of the segmentations of the non-edited images that correspond to areas that were manually annotated in the edited images will be modified by the interpolation process, and the initial segmentations will be maintained outside of those edited areas. In this way, the system merges the interpolated segmentation with the initial segmentation for each non-edited image in a manner that mitigates unreliable modifications to the initial segmentations in areas far from the edited areas.
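The distance-based merge described above could be sketched per slice as follows: keep the initial label everywhere except within a distance threshold of an edited pixel, where the interpolated label takes over. Nested lists stand in for label masks, and the Chebyshev distance and function name are illustrative assumptions.

```python
def merge_segmentation(initial, interpolated, edit_mask, max_dist):
    """Merge an initial and an interpolated segmentation of one slice.

    A pixel takes the interpolated label only if it lies within max_dist
    (Chebyshev distance) of a pixel marked True in edit_mask; the initial
    label is preserved everywhere else, so areas far from the manual edits
    are never modified by the interpolation.
    """
    h, w = len(initial), len(initial[0])
    edited = [(i, j) for i in range(h) for j in range(w) if edit_mask[i][j]]
    out = [row[:] for row in initial]
    for i in range(h):
        for j in range(w):
            if any(max(abs(i - ei), abs(j - ej)) <= max_dist
                   for ei, ej in edited):
                out[i][j] = interpolated[i][j]
    return out
```

Applied across the non-edited slices between two edited ones, this keeps the original segmentations intact outside the edited neighborhoods.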