G06V30/166

SYSTEMS AND METHODS FOR AUTOMATIC IMAGE CAPTURE ON A MOBILE DEVICE

Real-time evaluation and enhancement of image quality prior to capturing an image of a document on a mobile device is provided. An image capture process is initiated on a mobile device during which a user of the mobile device prepares to capture the image of the document, utilizing hardware and software on the mobile device to measure and achieve optimal parameters for image capture. Feedback may be provided to a user of the mobile device to instruct the user on how to manually optimize certain parameters relating to image quality, such as the angle, motion and distance of the mobile device from the document. When the optimal parameters for image capture of the document are achieved, at least one image of the document is automatically captured by the mobile device.
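The capture loop described above can be sketched as a simple threshold check over the measured parameters. All sensor values, thresholds, and function names below are hypothetical illustrations, not the patented implementation.

```python
# Minimal sketch of the auto-capture decision described above.
# Thresholds and units are invented for illustration.

MAX_TILT_DEGREES = 5.0          # device should be nearly parallel to the document
MAX_MOTION = 0.02               # motion-sensor magnitude threshold (arbitrary units)
DISTANCE_RANGE = (15.0, 40.0)   # acceptable camera-to-document distance in cm

def capture_feedback(tilt, motion, distance):
    """Return None when all parameters are optimal, else a hint for the user."""
    if abs(tilt) > MAX_TILT_DEGREES:
        return "Hold the device flatter over the document"
    if motion > MAX_MOTION:
        return "Hold the device steady"
    lo, hi = DISTANCE_RANGE
    if distance < lo:
        return "Move the device farther from the document"
    if distance > hi:
        return "Move the device closer to the document"
    return None  # optimal parameters achieved: trigger automatic capture

def should_capture(tilt, motion, distance):
    return capture_feedback(tilt, motion, distance) is None
```

In a real app, `capture_feedback` would be polled each frame and its messages shown as on-screen guidance until `should_capture` becomes true.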

Electronic handwriting analysis through adaptive machine-learning
10740601 · 2020-08-11


An improved machine learning system is provided. For example, a content management server may provide a digital assessment of a user's handwriting to assess the user's knowledge of a language. The assessment may comprise adaptive technology to help determine initial questions to provide to the user as well as follow-up questions to clarify appropriate remediation content in a particular context. The content management server may also provide real-time analysis, including assessing multiple users at the same time and adjusting the assessment based on the digital input from each of these users. In some examples, the content management server may incorporate handwriting analysis methods to perform object detection and score handwriting input.
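The adaptive routing of questions can be sketched as a score-based lookup: a weak handwriting score routes the user to remediation content, a strong one to advanced material. The question IDs and score bands below are invented for illustration and are not drawn from the patent.

```python
# Hedged sketch of adaptive follow-up selection based on a handwriting score.
# Question identifiers and the 0.5 cutoff are hypothetical.

QUESTION_BANK = {
    "initial": ["q1", "q2"],
    "remedial": ["r1", "r2"],   # clarifying questions for weak responses
    "advanced": ["a1", "a2"],
}

def next_questions(handwriting_score: float) -> list[str]:
    """Route the learner to remediation or advanced content by score."""
    if handwriting_score < 0.5:
        return QUESTION_BANK["remedial"]
    return QUESTION_BANK["advanced"]

def assess_users(scores: dict[str, float]) -> dict[str, list[str]]:
    """Real-time analysis of multiple users at once: one routing per user."""
    return {user: next_questions(score) for user, score in scores.items()}
```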

SYSTEMS AND METHODS FOR MOBILE IMAGE CAPTURE AND PROCESSING OF DOCUMENTS
20200151703 · 2020-05-14

Techniques for processing images of documents captured using a mobile device are provided. The images can include different sides of a document from a mobile device for an authenticated transaction. In an example implementation, a method includes inspecting the images to detect a feature associated with a first side of the document. In response to determining an image is the first side of the document, a type of content is selected to be analyzed on the image of the first side, and one or more regions of interest (ROIs) that are known to include the selected type of content are identified on the image of the first side. A process can include receiving a sub-image of the image of the first side from a preprocessing unit, and performing a content detection test on the sub-image.
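The side-detection and ROI-lookup steps can be sketched as a table keyed by document side and content type. The feature names (e.g. a MICR line), content types, and region coordinates below are hypothetical examples, not taken from the patent.

```python
# Illustrative sketch of side-dependent ROI selection.
# (side, content_type) -> list of (x, y, width, height) regions; all invented.

ROI_TABLE = {
    ("front", "account_number"): [(10, 80, 120, 20)],
    ("front", "name"): [(10, 10, 200, 20)],
    ("back", "endorsement"): [(0, 0, 300, 40)],
}

def detect_side(features: set) -> str:
    """Classify the side from detected features (here: a MICR line implies the front)."""
    return "front" if "micr_line" in features else "back"

def regions_of_interest(features: set, content_type: str):
    """Look up the ROIs known to include the selected content type on this side."""
    side = detect_side(features)
    return ROI_TABLE.get((side, content_type), [])
```

A content detection test would then run only on the sub-images cropped from these regions rather than on the full frame.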

Image data extraction using neural networks
10650230 · 2020-05-12

Embodiments of the present disclosure pertain to extracting data from images using neural networks. In one embodiment, an image is fit to a predetermined bounding window. The image is then processed with a convolutional neural network to produce a three-dimensional data cube. Slices of the cube are processed by an encoder RNN, and the results concatenated. The concatenated results are processed by an attention layer with input from a downstream decoder RNN. The attention layer output is provided to the decoder RNN to generate a probability array where values in the probability array correspond to particular characters in a character set. The maximum value is selected, and translated into an output character. In one embodiment, an amount may be extracted from an image of a receipt.
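The final step above, where the maximum value of a probability array selects a character, can be sketched directly. The character set and probability values below are invented for illustration; the CNN/RNN/attention stack that produces the arrays is omitted.

```python
# Minimal sketch of the decoding step: each probability array maps to the
# character at the index of its maximum value. CHARSET is hypothetical.

CHARSET = "0123456789.$"

def decode_step(probabilities: list[float]) -> str:
    """Translate one probability array into its output character."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return CHARSET[best]

def decode_sequence(prob_arrays: list[list[float]]) -> str:
    """Decode a sequence of probability arrays, e.g. an amount from a receipt."""
    return "".join(decode_step(p) for p in prob_arrays)
```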

SYSTEMS AND METHODS FOR MOBILE AUTOMATED CLEARING HOUSE ENROLLMENT
20200097930 · 2020-03-26

Systems and methods for mobile enrollment in automated clearing house (ACH) transactions using mobile-captured images of financial documents are provided. Applications running on a mobile device provide for the capture and processing of images of documents needed for enrollment in an ACH transaction, such as a blank check, remittance statement and driver's license. Data from the mobile-captured images that is needed for enrolling in ACH transactions is extracted from the processed images, such as a user's name, address, bank account number and bank routing number. The user can edit the extracted data, select the type of document that is being captured, authorize the creation of an ACH transaction and select an originator of the ACH transaction. The extracted data and originator information are transmitted to a remote server along with the user's authorization so the ACH transaction can be set up between the originator's and receiver's bank accounts.
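Assembling the extracted, user-edited fields into an enrollment request can be sketched as below. The field names, payload shape, and sample values are illustrative assumptions, not a real ACH or banking API.

```python
# Hedged sketch: combine fields extracted from the captured images with the
# user's authorization and chosen originator. All names are hypothetical.

from dataclasses import dataclass, asdict

@dataclass
class ExtractedFields:
    name: str
    address: str
    account_number: str
    routing_number: str

def build_enrollment_payload(fields: ExtractedFields,
                             originator: str,
                             authorized: bool) -> dict:
    """Build the request sent to the remote server; authorization is required."""
    if not authorized:
        raise ValueError("user authorization is required before enrollment")
    return {"originator": originator, **asdict(fields)}
```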

IMAGE DATA EXTRACTION USING NEURAL NETWORKS
20190384970 · 2019-12-19

Embodiments of the present disclosure pertain to extracting data from images using neural networks. In one embodiment, an image is fit to a predetermined bounding window. The image is then processed with a convolutional neural network to produce a three-dimensional data cube. Slices of the cube are processed by an encoder RNN, and the results concatenated. The concatenated results are processed by an attention layer with input from a downstream decoder RNN. The attention layer output is provided to the decoder RNN to generate a probability array where values in the probability array correspond to particular characters in a character set. The maximum value is selected, and translated into an output character. In one embodiment, an amount may be extracted from an image of a receipt.

Image processing apparatus, image processing method, and non-transitory storage medium
11941903 · 2024-03-26

An image processing apparatus that generates an image for character recognition from a read image includes at least one memory that stores instructions, and at least one processor that executes the instructions to perform extracting of an area of handwritten character information and an area of printed character information from the read image, clipping of a partial image of the area of handwritten character information and a partial image of the area of printed character information out of the read image, and generating of the image for character recognition by combining the partial image of the area of handwritten character information and the partial image of the area of printed character information in association with each other.
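The clip-and-combine step can be sketched on a toy image represented as a 2D list of pixel values: cut the handwritten and printed areas out of the read image, then join the two partial images into one image for character recognition. The region coordinates and horizontal layout are hypothetical choices for illustration.

```python
# Toy sketch of clipping two areas out of a read image (a 2D pixel grid)
# and combining them side by side; coordinates are invented.

def clip(image, top, left, height, width):
    """Cut a partial image (sub-grid) out of the read image."""
    return [row[left:left + width] for row in image[top:top + height]]

def combine(printed, handwritten):
    """Join the two partial images horizontally, keeping rows aligned."""
    assert len(printed) == len(handwritten), "regions must share a height"
    return [p_row + h_row for p_row, h_row in zip(printed, handwritten)]
```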

Commodity sales data processing apparatus and method
11928987 · 2024-03-12

A commodity sales data processing apparatus includes an image processor and a controller. The image processor captures an image including a symbol. The symbol includes discount information relating to a discount on a price of a commodity. The controller acquires commodity information that uniquely specifies the commodity. The controller extracts a region from the image captured by the image processor. The region includes the symbol. The controller transmits, to a server, an image of the region extracted by the controller. The controller acquires, from the server, the discount information based on the image transmitted to the server. The controller registers the commodity information acquired by the controller and a discounted price of the commodity based on the discount information acquired by the controller in association with each other.
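The controller-side flow can be sketched as: extract the symbol region, ask a server to decode it into a discount, then register the commodity with its discounted price. The server call is stubbed out; all names, the symbol format, and the prices are invented for illustration.

```python
# Hedged sketch of the discount-lookup flow; the server is a local stub.

def lookup_discount(symbol_region_image) -> float:
    """Stand-in for the server call that decodes the discount symbol."""
    return 0.20 if symbol_region_image == "20%-off-symbol" else 0.0

def register_sale(commodity: dict, symbol_region_image) -> dict:
    """Register the commodity together with its discounted price."""
    discount = lookup_discount(symbol_region_image)
    price = round(commodity["price"] * (1 - discount), 2)
    return {"code": commodity["code"], "discounted_price": price}
```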

CONNECTING VISION AND LANGUAGE USING FOURIER TRANSFORM
20240127616 · 2024-04-18

A method for text-image integration is provided. The method may include receiving a question related to pairable data comprising text data and image data. Embeddings, including text embeddings and image embeddings, are generated from the text tokens and image encodings. A spectral conversion of the text embeddings and the image embeddings is performed to generate spectral data. The spectral data is processed to extract text-image features. The text-image features are processed to generate inferred answers to the question.
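The spectral-conversion step can be sketched as applying a Fourier transform to each embedding vector, so every output value mixes information from all input positions, in the spirit of FNet-style token mixing. The naive DFT below is pure Python for illustration; whether the patent keeps real parts or the full complex spectrum is an assumption here.

```python
# Hedged sketch of "spectral conversion": a naive discrete Fourier
# transform per embedding vector, keeping the real part (an assumption).

import cmath

def dft(vector):
    """Naive discrete Fourier transform of a real-valued sequence."""
    n = len(vector)
    return [sum(vector[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n)) for k in range(n)]

def spectral_convert(embeddings):
    """Transform each embedding (text or image) into spectral data."""
    return [[c.real for c in dft(e)] for e in embeddings]
```

A production system would use an FFT routine instead of this O(n²) loop; the sketch only shows where the spectral data comes from.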