G06V30/18086

Fraud detection via automated handwriting clustering

A computer-implemented method for automatically analyzing handwritten text to determine a mismatch between a purported writer and an actual writer is disclosed. The method comprises receiving two samples of digitized handwriting each allegedly created by one individual and received and entered into a digital system by another. The method further comprises performing a series of feature extractions to convert the samples into two vectors of extracted features; automatically clustering a set of vectors such that the first vector and the second vector are assigned to the same cluster among multiple clusters, based on vector similarity; and automatically determining that a same individual being associated with both the first and second samples indicates a heightened probability that the individual fraudulently created both samples. Finally, the method comprises automatically transmitting a message to flag additional samples of digitized handwriting entered into a digital system as possibly fraudulent.
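
The clustering and flagging steps can be illustrated with a minimal sketch. It assumes feature vectors have already been extracted, and it substitutes a simple single-linkage threshold clustering for whatever clustering the method actually uses. The sample data, the `threshold` value, and all function names are hypothetical.

```python
import math
from collections import defaultdict

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cluster_vectors(vectors, threshold):
    """Single-linkage clustering via union-find: vectors closer than
    `threshold` end up with the same cluster label."""
    parent = list(range(len(vectors)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if euclidean(vectors[i], vectors[j]) < threshold:
                parent[find(i)] = find(j)
    return [find(i) for i in range(len(vectors))]

def flag_suspect_submitters(samples, threshold):
    """samples: list of (purported_writer, submitter, feature_vector).
    Flags a submitter when one cluster holds samples attributed to
    different writers that were all entered by that same submitter."""
    labels = cluster_vectors([s[2] for s in samples], threshold)
    by_cluster = defaultdict(list)
    for (writer, submitter, _), label in zip(samples, labels):
        by_cluster[label].append((writer, submitter))
    flagged = set()
    for members in by_cluster.values():
        writers = {w for w, _ in members}
        submitters = {s for _, s in members}
        if len(writers) > 1 and len(submitters) == 1:
            flagged |= submitters
    return flagged

# Hypothetical data: two "different" writers with near-identical handwriting,
# both entered by the same agent.
samples = [
    ("alice", "agent_7", [0.90, 0.10, 0.40]),
    ("bob", "agent_7", [0.88, 0.12, 0.41]),
    ("carol", "agent_2", [0.10, 0.80, 0.90]),
]
print(flag_suspect_submitters(samples, threshold=0.1))
```

In a real system the flagged submitter would then drive the message that marks that person's further submissions for review.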

Resume document parsing using computer vision and optical character recognition with reblocking feedback

Systems and methods are disclosed for parsing resume documents using computer vision and optical character recognition, combined with a user feedback interface that lets users improve the processing quality of resumes imported into computer resume processing systems. In at least one embodiment, the system and method prompt a user to upload an input resume document, which is processed with a first parsing pass to generate initial resume data by extracting a plurality of resume text blocks. Further processing identifies an initial set of bounding blocks and visually displays the initial resume data for user review and feedback, so that the user can regroup one or more of the initial set of bounding blocks into a regrouped bounding block. Additional processing consolidates into a group text block each of the resume text blocks corresponding to the regrouped one or more of the initial set of bounding blocks.
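
The regrouping step can be sketched as follows. This is only an illustration: the `Block` type, the `regroup` function, and the top-to-bottom reading-order rule used to join the text are assumptions, not details taken from the abstract.

```python
from dataclasses import dataclass

@dataclass
class Block:
    x0: float
    y0: float
    x1: float
    y1: float
    text: str

def regroup(blocks, selected):
    """Merge the blocks whose indices are in `selected` into one regrouped
    block; all other blocks are returned unchanged, followed by the merged one."""
    chosen = [blocks[i] for i in sorted(selected)]
    # Assumed reading order: top to bottom, then left to right.
    chosen.sort(key=lambda b: (b.y0, b.x0))
    merged = Block(
        x0=min(b.x0 for b in chosen),
        y0=min(b.y0 for b in chosen),
        x1=max(b.x1 for b in chosen),
        y1=max(b.y1 for b in chosen),
        text="\n".join(b.text for b in chosen),  # consolidated group text block
    )
    rest = [b for i, b in enumerate(blocks) if i not in selected]
    return rest + [merged]

# Hypothetical first-pass output: a heading split from its body text.
blocks = [
    Block(0, 0, 100, 20, "Experience"),
    Block(0, 25, 100, 60, "Acme Corp, 2015-2019"),
    Block(0, 80, 100, 100, "Education"),
]
regrouped = regroup(blocks, {0, 1})
```

Here the user feedback is reduced to a set of selected block indices; an actual interface would derive that set from the user's selection on the displayed page.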

System and method for detecting forgeries

A document forgery detection method comprising using at least one processor for: providing at least one histogram of gray level values occurring in at least a portion of at least one channel of an image assumed to represent a document including text, the histogram having been generated by image processing that portion of the image, and the image having been sent by a remote end user to an online service over a computer network; evaluating monotony of at least a portion of the at least one histogram; and determining whether the image is authentic or forged based on at least one output of the evaluating.
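
One way to picture "evaluating monotony" of a histogram is to count how often it switches between rising and falling. The sketch below does that for 8-bit gray levels. The measure, the `max_breaks` threshold, and the rule that many switches indicate a forgery are all assumptions made for illustration; the abstract does not specify them.

```python
def gray_histogram(pixels, bins=256):
    """Histogram of integer gray levels in the range 0..bins-1."""
    hist = [0] * bins
    for p in pixels:
        hist[p] += 1
    return hist

def monotony_breaks(hist, start=0, end=None):
    """Count the number of times hist[start:end] switches between
    rising and falling (flat stretches are ignored)."""
    segment = hist[start:end]
    breaks = 0
    prev_sign = 0
    for a, b in zip(segment, segment[1:]):
        diff = b - a
        sign = (diff > 0) - (diff < 0)
        if sign != 0:
            if prev_sign != 0 and sign != prev_sign:
                breaks += 1
            prev_sign = sign
    return breaks

def looks_forged(pixels, max_breaks=20):
    """Hypothetical decision rule: too many direction changes in the
    histogram is treated as a sign of manipulation."""
    return monotony_breaks(gray_histogram(pixels)) > max_breaks
```

A comb-shaped histogram such as the one produced by pixels `[0, 2, 4, 6]` yields six direction changes, while a histogram that rises steadily and then drops yields one.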

Apparatus, method and non-transitory storage medium for changing position coordinates of a character area stored in association with a character recognition result
10395131 · 2019-08-27

When a user extracts a desired character string by specifying a range with a finger or the like on an image that includes characters, a specific character (a space or the like) adjacent to the desired character string is prevented from being unintentionally included in the selected range. The character area corresponding to each character included in the image is identified, and character recognition processing is performed for each of the identified character areas. Then, from the results of the character recognition processing, a specific character is determined and the character area corresponding to the determined specific character is extended. Finally, the range selected by the user in the displayed image is acquired, and the character recognition results corresponding to the character areas included in the selected range are output.
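
The effect of extending a character area can be shown with a one-dimensional sketch. It assumes that a character counts as selected only when its whole area lies inside the selected range, and it reduces each area to a horizontal interval. The `margin` value and both function names are hypothetical.

```python
def extend_specific_areas(chars, specific=" ", margin=4):
    """chars: list of (char, x0, x1). Widen the area of every occurrence
    of the specific character by `margin` on both sides."""
    return [
        (c, x0 - margin, x1 + margin) if c == specific else (c, x0, x1)
        for c, x0, x1 in chars
    ]

def select_text(chars, sel_x0, sel_x1):
    """Return the characters whose areas lie entirely inside the selection."""
    return "".join(c for c, x0, x1 in chars if sel_x0 <= x0 and x1 <= sel_x1)

# Recognized text "ab cd", each character 10 units wide.
chars = [("a", 0, 10), ("b", 10, 20), (" ", 20, 30), ("c", 30, 40), ("d", 40, 50)]

# The user wants "ab" but the finger overshoots slightly, to x = 31.
print(repr(select_text(chars, 0, 31)))                         # 'ab '
print(repr(select_text(extend_specific_areas(chars), 0, 31)))  # 'ab'
```

With the widened area, the space is picked up only when the selection clearly extends past it, for example when the user selects the whole string.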

Robust audio identification with interference cancellation

Audio distortion compensation methods to improve the accuracy and efficiency of audio content identification are described. The methods are also applicable to speech recognition. Methods to detect interference from speakers and other sources, and distortion to audio from the environment and devices, are discussed. Additional methods to detect distortion to the content after performing search and correlation are illustrated. The causes of actual distortion at each client are measured, registered, and learned to generate rules for determining likely distortion and interference sources. The learned rules are applied at the client, and likely distortions that are detected are compensated, or heavily distorted sections are ignored, at the audio level or at the signature and feature level, depending on the compute resources available. Further methods to subtract the likely distortions in the query, both at the audio level and after processing at the signature and feature level, are described.

Digital video fingerprinting using motion segmentation
10318813 · 2019-06-11

Methods of processing video are presented to generate signatures for motion segmented regions over two or more frames. Two frames are differenced using an adaptive threshold to generate a two-frame difference image. The adaptive threshold is based on a motion histogram analysis, which may vary according to motion history data. In addition, a count of pixels is determined in image regions of the motion-adapted two-frame difference image; when the count is not within a threshold range, the motion adaptive threshold is modified. A motion history image is created from the two-frame difference image. The motion history image is segmented to generate one or more motion segmented regions, and a descriptor and a signature are generated for a selected motion segmented region.
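
The differencing, threshold feedback, and motion history steps can be sketched on tiny grayscale frames. This is a simplified stand-in: it adapts the threshold only from the fraction of changed pixels rather than from a motion histogram analysis, it uses a conventional decaying motion history update, and it omits segmentation and signature generation. All names and parameter values are assumptions.

```python
def difference_image(prev, curr, threshold):
    """Binary two-frame difference: 1 where the gray level changed by more
    than `threshold`, else 0. Frames are lists of rows of ints."""
    return [[1 if abs(c - p) > threshold else 0 for p, c in zip(pr, cr)]
            for pr, cr in zip(prev, curr)]

def adapt_threshold(prev, curr, threshold, lo=0.01, hi=0.5, step=5, max_iter=20):
    """Raise or lower the threshold until the fraction of changed pixels
    falls inside [lo, hi], then return the threshold and its difference image."""
    total = len(curr) * len(curr[0])
    for _ in range(max_iter):
        diff = difference_image(prev, curr, threshold)
        frac = sum(map(sum, diff)) / total
        if frac > hi:
            threshold += step
        elif frac < lo and threshold > step:
            threshold -= step
        else:
            break
    return threshold, difference_image(prev, curr, threshold)

def update_motion_history(mhi, diff, tau=3):
    """Set moving pixels to `tau`; let all other pixels decay by 1 toward 0."""
    return [[tau if d else max(0, m - 1) for m, d in zip(mr, dr)]
            for mr, dr in zip(mhi, diff)]

# Hypothetical 2x4 frames: small noise at one pixel, real motion in one column.
prev = [[10, 10, 10, 10], [10, 10, 10, 10]]
curr = [[10, 12, 200, 10], [10, 10, 210, 10]]
threshold, diff = adapt_threshold(prev, curr, threshold=1, hi=0.25)
mhi = update_motion_history([[0] * 4 for _ in range(2)], diff)
```

In this example the initial threshold of 1 marks three of eight pixels as changed, which exceeds the assumed upper bound, so the threshold is raised once and the noisy pixel drops out of the difference image.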

Reference line setting device, reference line setting method and reference line setting program
10311318 · 2019-06-04

A reference line setting device includes an image acquisition means to acquire an image containing a character region, a recognition means to recognize characters from the character region of the image by a specified recognition method, a line position information acquisition means to acquire line position information of a plurality of characters out of the characters recognized by the recognition means with reference to a storage means storing, for each character, line position information concerning a position which at least two reference lines pass through in a vertical direction of characters, the reference lines being lines drawn in an alignment direction of characters, along which a certain part of each character is to be placed, and a setting means to set each of the reference lines to the image based on a plurality of line position information for each reference line acquired by the line position information acquisition means.
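
A rough sketch of the setting step follows. It assumes the stored line position information is a per-character fraction of the bounding-box height at which each reference line crosses, and that each reference line is then fitted to the collected points by least squares. The `LINE_POSITIONS` table, its values, the line names, and the fitting method are all hypothetical.

```python
# Hypothetical storage: for each character, the fraction of its box height
# (measured from the top) at which each reference line passes.
LINE_POSITIONS = {
    "x": {"mean": 0.0, "base": 1.0},
    "h": {"mean": 0.4, "base": 1.0},  # ascender rises above the mean line
    "p": {"mean": 0.0, "base": 0.6},  # descender drops below the baseline
}

def fit_line(points):
    """Least-squares fit of y = slope * x + intercept to (x, y) points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx if sxx else 0.0
    return slope, my - slope * mx

def set_reference_lines(chars):
    """chars: list of (char, x_center, top, bottom) for recognized characters.
    Returns {line_name: (slope, intercept)} for lines with at least two points."""
    points = {}
    for ch, xc, top, bottom in chars:
        for name, frac in LINE_POSITIONS.get(ch, {}).items():
            points.setdefault(name, []).append((xc, top + frac * (bottom - top)))
    return {name: fit_line(pts) for name, pts in points.items() if len(pts) >= 2}

# Hypothetical recognition result for "hxp" on a level text line.
lines = set_reference_lines([("h", 10, 0, 50), ("x", 30, 20, 50), ("p", 50, 20, 70)])
```

Using several characters per line is what makes the fit robust: a single misrecognized character contributes only one of the points that determine each reference line.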

Image reading systems, methods and storage medium for performing geometric extraction

Geometric extraction is performed on an unstructured document by recognizing textual blocks on at least a portion of a page of the unstructured document, generating bounding boxes that surround and correspond to the textual blocks, determining search paths having coordinates of two endpoints and connecting at least two bounding boxes, and generating a graph representation of the at least a portion of the page, the graph representation including the plurality of textual blocks, the coordinates of the vertices of each bounding box and the coordinates of the two endpoints of each search path.
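
The graph representation can be sketched as a set of nodes (textual blocks with their bounding-box vertices) and search paths (pairs of endpoints connecting two boxes). In this illustration the endpoints are placed at box centers and a path is created whenever two centers are within `max_distance`; both choices, along with the `TextBlock` type and `build_graph` function, are assumptions rather than details from the abstract.

```python
import math
from dataclasses import dataclass
from itertools import combinations

@dataclass(frozen=True)
class TextBlock:
    text: str
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def vertices(self):
        """Corner coordinates of the bounding box, clockwise from top-left."""
        return [(self.x0, self.y0), (self.x1, self.y0),
                (self.x1, self.y1), (self.x0, self.y1)]

    @property
    def center(self):
        return ((self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2)

def build_graph(blocks, max_distance):
    """Build a graph whose nodes carry block text and box vertices and whose
    search paths connect boxes with centers at most `max_distance` apart."""
    nodes = [{"text": b.text, "vertices": b.vertices} for b in blocks]
    paths = []
    for (i, a), (j, b) in combinations(enumerate(blocks), 2):
        if math.dist(a.center, b.center) <= max_distance:
            paths.append({"boxes": (i, j), "endpoints": (a.center, b.center)})
    return {"nodes": nodes, "paths": paths}

# Hypothetical page fragment: a label, its value, and an unrelated block.
blocks = [
    TextBlock("Invoice No.", 0, 0, 100, 20),
    TextBlock("12345", 120, 0, 180, 20),
    TextBlock("Total", 0, 200, 60, 220),
]
graph = build_graph(blocks, max_distance=150)
```

With these values only the label and its neighboring value are connected, which is the kind of relationship a downstream extractor could use to pair a field name with its content.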

FRAUD DETECTION VIA AUTOMATED HANDWRITING CLUSTERING

A computer-implemented method for automatically analyzing handwritten text to determine a mismatch between a purported writer and an actual writer is disclosed. The method comprises receiving two samples of digitized handwriting each allegedly created by one individual and received and entered into a digital system by another. The method further comprises performing a series of feature extractions to convert the samples into two vectors of extracted features; automatically clustering a set of vectors such that the first vector and the second vector are assigned to the same cluster among multiple clusters, based on vector similarity; and automatically determining that a same individual being associated with both the first and second samples indicates a heightened probability that the individual fraudulently created both samples. Finally, the method comprises automatically transmitting a message to flag additional samples of digitized handwriting entered into a digital system as possibly fraudulent.

Digital Video Fingerprinting Using Motion Segmentation
20190138813 · 2019-05-09

Methods of processing video are presented to generate signatures for motion segmented regions over two or more frames. Two frames are differenced using an adaptive threshold to generate a two-frame difference image. The adaptive threshold is based on a motion histogram analysis, which may vary according to motion history data. In addition, a count of pixels is determined in image regions of the motion-adapted two-frame difference image; when the count is not within a threshold range, the motion adaptive threshold is modified. A motion history image is created from the two-frame difference image. The motion history image is segmented to generate one or more motion segmented regions, and a descriptor and a signature are generated for a selected motion segmented region.