Patent classifications
G06V30/19093
Glyph accessibility system
Glyph accessibility techniques are described as implemented by a digital content processing system involving accessing glyphs and glyph alternatives. These techniques include preprocessing techniques in which a base font is used to determine similarity of glyphs within the base font to each other. Glyph metadata that describes this similarity is cached in a storage device and used during runtime to increase efficiency in locating similar glyphs in other fonts.
IMAGE PROCESSING APPARATUS, INFORMATION PROCESSING APPARATUS, IMAGE PROCESSING SYSTEM, IMAGE PROCESSING METHOD, INFORMATION PROCESSING METHOD, AND STORAGE MEDIUM
The image processing apparatus obtains scanned images of each page obtained by scanning business forms including a plurality of pages or business forms of different types collectively, manages a division method associated with feature information on each of previous scanned images and the previous scanned images, analyzes, based on the feature information, whether any of the previous scanned images similar to a scanned image of the first page of the obtained scanned images exists, and divides, in a case where any of the previous scanned images similar to the scanned image of the first page exists, the obtained scanned images by a division method associated with the previous scanned image similar to the scanned image of the first page.
System and method for real-time automated project specifications analysis
Various methods, apparatuses/systems, and media for real-time automated analysis of project specifications are disclosed. A processor calls an API to invoke an OCR micro-service with the project specifications data as input data received from a plurality of applications each including a file corresponding to real-time project specifications data; determines whether the file corresponding to the project specification data is an image file; implements, based on determining, a neural network based image processing algorithm to extract data corresponding to the project specifications data from the input data; compares the extracted data corresponding to the project specifications data with predefined expected business results data; generates a similarity score, based on comparing, that identifies how similar the project specifications data is compared to the predefined expected business results data; and automatically generates a real-time analysis report on the project specifications in connection with the plurality of applications based on the similarity score.
PERFORMING OPTICAL CHARACTER RECOGNITION BASED ON FUZZY PATTERN SEARCH GENERATED USING IMAGE TRANSFORMATION
A system recognizes text in an input image. The system provides the input image to one or more optical character recognition (OCR) models to obtain predicted texts. The system determines a set of candidate text predictions by performing text recognition on each image in a set of transformed images. The system generates a regular expression based on the predicted characters of the candidate text predictions and the confidence score corresponding to each predicted character. The system matches the regular expression against text values in a database. The system selects one or more text values from the database based on the matching and returns the one or more text values as results of recognition of text of the input image.
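As a rough illustration of the fuzzy pattern search described in this abstract, the following Python sketch builds a regular expression from per-character predictions and matches it against a list of known values. The data layout (one list of (character, confidence) pairs per position), the 0.9 threshold, and the function names are assumptions made for illustration; none of them come from the patent itself.

```python
import re

def build_pattern(positions, threshold=0.9):
    """Build a regex from per-position (char, confidence) candidates.

    A position whose best candidate meets the threshold becomes a literal;
    otherwise it becomes a character class of every candidate seen there.
    """
    parts = []
    for candidates in positions:
        best_char, best_conf = max(candidates, key=lambda c: c[1])
        if best_conf >= threshold:
            parts.append(re.escape(best_char))
        else:
            chars = sorted({ch for ch, _ in candidates})
            parts.append("[" + "".join(re.escape(ch) for ch in chars) + "]")
    return re.compile("".join(parts))

def match_database(pattern, values):
    """Return the stored values that the pattern matches in full."""
    return [v for v in values if pattern.fullmatch(v)]

# Hypothetical predictions collected from several transformed images.
positions = [
    [("A", 0.99)],
    [("B", 0.55), ("8", 0.40)],
    [("1", 0.60), ("I", 0.35)],
    [("2", 0.97)],
]
pattern = build_pattern(positions)
matches = match_database(pattern, ["AB12", "A812", "ABI2", "AC12", "AB123"])
```

Low-confidence positions widen into character classes, so an ambiguous reading such as "B" versus "8" still matches a stored value instead of failing outright.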
FRAMEWORK FOR DOCUMENT LAYOUT AND INFORMATION EXTRACTION
Provided herein are system, apparatus, device, method, and/or computer program product embodiments, and/or combinations and sub-combinations thereof, for extracting data from a file. Embodiments described herein provide a framework to merge outputs of various models comprising extracted information from a file with its location information and annotated regions of interest into an output file ingestible by a database or knowledge base.
Generation of Training Materials for Optical Character Recognition
The application is directed to the generation of training materials for optical character recognition. Generating the training materials for optical character recognition can include selecting a plurality of terms that include a string of characters. For each term, multiple digital term images are generated, each including the term with a different visual appearance. For generation of a training document, the method includes positioning the term images on a digital background and generating the digital training material.
METHODS AND SYSTEMS FOR AUTOMATED CROSS-BROWSER USER INTERFACE TESTING
Methods and apparatuses are described for automated cross-browser user interface testing. A computing device captures (i) a first image file corresponding to a first current user interface view of a web application on a first testing platform and (ii) a second image file corresponding to a second current user interface view of a web application on a second testing platform. The computing device prepares the image files, and compares the prepared image files using a structural similarity index measure. The computing device determines that the prepared first image file and the prepared second image file represent a common user interface view when the structural similarity index measure is within a predetermined range. The computing device highlights corresponding regions that visually diverge from each other in each of the prepared image files and transmits a notification message comprising the highlighted image files.
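The comparison step in this abstract can be sketched with a simplified structural similarity index. The Python example below computes a single global SSIM value over two grayscale images given as flat lists of pixel intensities; practical implementations usually average SSIM over sliding windows. The 0.9 lower bound stands in for the "predetermined range" and, like the function names, is an assumption rather than a value from the patent.

```python
def ssim_global(x, y, dynamic_range=255.0):
    """Single-window SSIM over two equally sized grayscale pixel lists."""
    if len(x) != len(y) or not x:
        raise ValueError("images must be non-empty and the same size")
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    # Standard stabilizing constants from the SSIM definition.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mx * my + c1) * (2 * cov + c2)
    denominator = (mx * mx + my * my + c1) * (vx + vy + c2)
    return numerator / denominator

def same_view(x, y, min_score=0.9):
    """Treat two captures as the same UI view when SSIM is high enough."""
    return ssim_global(x, y) >= min_score

# Tiny hypothetical 2x2 captures, flattened row by row.
baseline = [10, 20, 30, 40]
other = [10, 20, 30, 200]
score_same = ssim_global(baseline, baseline)
score_diff = ssim_global(baseline, other)
```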
METHOD AND APPARATUS FOR DECIPHERING OBFUSCATED TEXT FOR CYBER SECURITY
Provided is a method for deciphering obfuscated text for cyber security and an apparatus for the same. The method according to some embodiments includes: converting text including a target character string into an image; recognizing a character string in the image using a text recognition model; and determining that the target character string is an obfuscated character string, based on a similarity between the target character string and the recognized character string being equal to or less than a first reference value.
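The decision rule in this abstract can be illustrated with a short Python sketch. The rendering and text recognition steps are not implemented here: the recognized string is passed in directly, as if a text recognition model had already read the rendered image. The use of `difflib.SequenceMatcher` as the similarity measure and the 0.8 reference value are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

def is_obfuscated(target, recognized, reference_value=0.8):
    """Flag the target as obfuscated when it differs enough from what was read.

    `recognized` is assumed to be the output of a text recognition model
    applied to an image of `target`; that step is outside this sketch.
    """
    similarity = SequenceMatcher(None, target, recognized).ratio()
    return similarity <= reference_value

# Homoglyph example: U+0430 (Cyrillic "a") looks like Latin "a" when rendered,
# so a recognizer would plausibly read the string as plain "paypal".
spoofed = "p\u0430yp\u0430l"
flag_spoofed = is_obfuscated(spoofed, "paypal")
flag_plain = is_obfuscated("paypal", "paypal")
```

A string that renders like ordinary text but is stored with different code points scores low against its own recognized form, which is what triggers the flag.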
Image processing apparatus, information processing apparatus, image processing system, image processing method, information processing method, and storage medium
The image processing apparatus obtains scanned images of each page obtained by scanning business forms including a plurality of pages or business forms of different types collectively, manages a division method associated with feature information on each of previous scanned images and the previous scanned images, analyzes, based on the feature information, whether any of the previous scanned images similar to a scanned image of the first page of the obtained scanned images exists, and divides, in a case where any of the previous scanned images similar to the scanned image of the first page exists, the obtained scanned images by a division method associated with the previous scanned image similar to the scanned image of the first page.
CLASSIFICATION METHOD AND APPARATUS, ELECTRONIC DEVICE AND STORAGE MEDIUM
Provided are a classification method and apparatus, an electronic device and a storage medium, which relate to the field of artificial intelligence and in particular, to the fields of natural language processing and deep learning. The classification method comprises: performing coding processing on to-be-classified data to obtain a to-be-classified coding feature; determining reference coding features of reference classification data similar to the to-be-classified data according to the to-be-classified coding feature; and determining a target category of the to-be-classified data according to the reference coding features and reference categories of the reference classification data.
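One simple way to read this abstract is as a nearest-neighbor scheme over encoded features. The Python sketch below assumes the coding step has already produced numeric feature vectors, retrieves the most similar reference vectors by cosine similarity, and picks the target category by majority vote. Cosine similarity, majority voting, the value of `k`, and the sample data are all illustrative assumptions, not details stated in the patent.

```python
from collections import Counter
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def classify(query, references, k=3):
    """Vote among the k reference items most similar to the query feature.

    `references` is a list of (feature_vector, category) pairs.
    """
    ranked = sorted(references, key=lambda r: cosine(query, r[0]), reverse=True)
    votes = Counter(category for _, category in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical reference coding features with their reference categories.
references = [
    ([1.0, 0.1], "sports"),
    ([0.9, 0.2], "sports"),
    ([0.1, 1.0], "finance"),
    ([0.2, 0.9], "finance"),
]
label = classify([0.8, 0.3], references, k=3)
```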