G06V30/19173

Systems and methods for processing images
11514702 · 2022-11-29

Systems and methods for identifying landmarks of a document from a digital representation of the document. The method comprises accessing the digital representation of the document and operating a Machine Learning Algorithm (MLA), the MLA having been trained based on a set of training digital representations of documents associated with labels. Operating the MLA comprises down-sampling the digital representation of the document from a first resolution to a second resolution, detecting landmarks, and generating fractional pixel coordinates for the detected landmarks. The method further determines the pixel coordinates of the landmarks by upscaling the fractional pixel coordinates from the second resolution to the first resolution, and outputs the pixel coordinates of the landmarks.
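The upscaling step can be illustrated with a minimal sketch. It assumes landmarks are given as fractional (x, y) coordinates on the down-sampled grid and that a plain per-axis scale factor maps them back to the original resolution; the function name `upscale_landmarks` and the rounding and clamping choices are hypothetical and are not taken from the patent.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def upscale_landmarks(
    points_low: List[Point],
    low_size: Tuple[int, int],
    full_size: Tuple[int, int],
) -> List[Tuple[int, int]]:
    """Map fractional (x, y) coordinates from the low-resolution grid
    to integer pixel coordinates at the full resolution.

    Sizes are (width, height). The scaling rule is an assumption.
    """
    low_w, low_h = low_size
    full_w, full_h = full_size
    scale_x = full_w / low_w
    scale_y = full_h / low_h

    result = []
    for x, y in points_low:
        # Scale, round to the nearest pixel, then clamp to the image bounds.
        px = min(max(int(round(x * scale_x)), 0), full_w - 1)
        py = min(max(int(round(y * scale_y)), 0), full_h - 1)
        result.append((px, py))
    return result
```

Keeping the coordinates fractional until this step means the sub-pixel precision obtained at the lower resolution is not lost before scaling.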

IMAGE RECOGNITION METHOD AND APPARATUS, TRAINING METHOD, ELECTRONIC DEVICE, AND STORAGE MEDIUM
20220375207 · 2022-11-24

An image recognition method and apparatus, a training method, an electronic device, and a storage medium are provided. The image recognition method includes: acquiring an image to be recognized, the image to be recognized including a target text; and determining text content of the target text based on knowledge information and image information of the image to be recognized.

SYSTEM AND METHOD FOR DETECTING PHISHING-DOMAINS IN A SET OF DOMAIN NAME SYSTEM (DNS) RECORDS

This document describes a system and method for detecting phishing-domains, which are used by cyber-attackers to carry out phishing attacks, in a set of Domain Name System (DNS) records, the system comprising a homoglyph phishing domain detection module, a typo-squatting phishing domain detection module, a general phishing domain detection module and an alert module. The detection modules are configured to collaboratively detect and identify phishing-domains from the set of DNS records using a combination of homoglyph, typo-squatting and general phishing domain techniques. Subsequently, the alert module may be used to correlate the alerts from the various phishing detection modules to discover phishing campaigns occurring in DNS network data.
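As a rough illustration of how homoglyph and typo-squatting checks might be combined, the sketch below normalizes look-alike characters and then falls back to an edit-distance test. The `HOMOGLYPHS` table, the `classify_domain` function, and the `max_edits` threshold are hypothetical; the abstract does not specify how the modules work internally.

```python
from typing import Set

# Hypothetical, deliberately tiny look-alike table (assumption).
HOMOGLYPHS = {"0": "o", "1": "l", "rn": "m", "vv": "w"}


def normalize_homoglyphs(label: str) -> str:
    """Replace look-alike character sequences with their plain form."""
    out = label.lower()
    for fake, real in HOMOGLYPHS.items():
        out = out.replace(fake, real)
    return out


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def classify_domain(label: str, protected: Set[str], max_edits: int = 1) -> str:
    """Label a domain name relative to a set of protected names."""
    if label in protected:
        return "legitimate"
    normalized_protected = {normalize_homoglyphs(p) for p in protected}
    if normalize_homoglyphs(label) in normalized_protected:
        return "homoglyph"
    if any(edit_distance(label, p) <= max_edits for p in protected):
        return "typo-squatting"
    return "unflagged"
```

In this sketch the homoglyph check runs first, so a name such as `paypa1` is reported as a homoglyph even though it is also one edit away from `paypal`.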

Deep learning based on image encoding and decoding
11593632 · 2023-02-28

A deep learning based compression (DLBC) system trains multiple models that, when deployed, generate a compressed binary encoding of an input image that achieves a reconstruction quality and a target compression ratio. The applied models effectively identify structures of an input image, quantize the input image to a target bit precision, and compress the binary code of the input image via adaptive arithmetic coding to a target codelength. During training, the DLBC system reconstructs the input image from the compressed binary encoding and determines the loss in quality from the encoding process. Thus, the models can be continually trained to, when applied to an input image, minimize the loss in reconstruction quality that arises due to the encoding process while also achieving the target compression ratio.

Object detection and image cropping using a multi-detector approach
11593585 · 2023-02-28

Computer-implemented methods for detecting objects within digital image data based on color transitions include: receiving or capturing a digital image depicting an object; sampling color information from a first plurality of pixels of the digital image, wherein each of the first plurality of pixels is located in a background region of the digital image; optionally sampling color information from a second plurality of pixels of the digital image, wherein each of the second plurality of pixels is located in a foreground region of the digital image; assigning each pixel a label of either foreground or background using an adaptive label learning process; binarizing the digital image based on the labels assigned to each pixel; detecting contour(s) within the binarized digital image; and defining edge(s) of the object based on the detected contour(s). Corresponding systems and computer program products configured to perform the inventive methods are also described.
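The labeling and binarization steps can be sketched in simplified form. This sketch makes two substitutions that should be read as assumptions: a fixed color-distance threshold from the mean background color stands in for the adaptive label learning process, and an axis-aligned bounding box stands in for contour detection. The function names and the image representation (a list of rows of RGB tuples) are hypothetical.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]


def binarize_by_background(
    image: Image,
    background_samples: List[Tuple[int, int]],
    threshold: float,
) -> List[List[int]]:
    """Label each pixel 1 (foreground) or 0 (background) by its color
    distance from the mean of the sampled background pixels."""
    n = len(background_samples)
    mean = tuple(
        sum(image[r][c][k] for r, c in background_samples) / n
        for k in range(3)
    )
    limit = threshold ** 2  # compare squared distances to avoid sqrt
    return [
        [1 if sum((px[k] - mean[k]) ** 2 for k in range(3)) > limit else 0
         for px in row]
        for row in image
    ]


def bounding_box(mask: List[List[int]]) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) of the foreground, or None."""
    coords = [(r, c) for r, row in enumerate(mask)
              for c, v in enumerate(row) if v]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))
```

Sampling the background at known locations such as the image corners gives the labeling step a reference color without requiring any prior knowledge of the object itself.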

Self-supervised hierarchical motion learning for video action recognition

There are numerous features in video that can be detected using computer-based systems, such as objects and/or motion. The detection of these features, and in particular the detection of motion, has many useful applications, such as action recognition, activity detection, object tracking, etc. The present disclosure provides a neural network that learns motion from unlabeled video frames. In particular, the neural network uses the unlabeled video frames to perform self-supervised hierarchical motion learning. The present disclosure also describes how the learned motion can be used in video action recognition.

Systems and methods for searching audiovisual data using latent codes from generative networks and models

Systems and methods for viewing, storing, transmitting, searching, and editing application-specific audiovisual content (or other unstructured data) are disclosed in which edge devices generate content on the fly from a partial set of instructions rather than merely accessing the content in its final or near-final form. An image processing architecture may include a generative model that may be a deep learning model. The generative model may include a latent space comprising a plurality of latent codes and a trained generator mapping. The trained generator mapping may convert points in the latent space to uncompressed data points, which in the case of audiovisual content may be generated image frames. The generative model may be capable of closely approximating (up to noise or perceptual error) most or all potential data points in the relevant compression application, which in the case of audiovisual content may be source images.

On-device artificial intelligence systems and methods for document auto-rotation
11509795 · 2022-11-22

An auto-rotation module having a single-layer neural network on a user device can convert a document image to a monochrome image having black and white pixels and segment the monochrome image into bounding boxes, each bounding box defining a connected segment of black pixels in the monochrome image. The auto-rotation module can determine textual snippets from the bounding boxes and prepare them into input images for the single-layer neural network. The single-layer neural network is trained to process each input image, recognize a correct orientation, and output a set of results for each input image. Each result indicates a probability associated with a particular orientation. The auto-rotation module can examine the results, determine what degree of rotation is needed to achieve a correct orientation of the document image, and automatically rotate the document image by the degree of rotation needed to achieve the correct orientation of the document image.
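The final decision step, turning per-snippet results into a single rotation, might look like the following sketch. It assumes four candidate orientations (0, 90, 180, 270 degrees), that each result lists one probability per orientation in that order, and that probabilities are simply summed across snippets; none of these details are stated in the abstract, and the name `rotation_needed` is hypothetical.

```python
from typing import List, Sequence

# Assumed candidate orientations, in degrees clockwise.
ORIENTATIONS = (0, 90, 180, 270)


def rotation_needed(snippet_probs: List[Sequence[float]]) -> int:
    """Return the clockwise rotation (in degrees) that would bring the
    document upright, given per-snippet orientation probabilities."""
    totals = [0.0] * len(ORIENTATIONS)
    for probs in snippet_probs:
        for i, p in enumerate(probs):
            totals[i] += p

    # The orientation with the highest summed probability is taken as
    # the document's current orientation.
    best = max(range(len(ORIENTATIONS)), key=lambda i: totals[i])
    detected = ORIENTATIONS[best]

    # Rotating by the complement brings the document back to 0 degrees.
    return (360 - detected) % 360
```

Summing over many snippets lets confident snippets outweigh ambiguous ones, so a few misread text fragments are less likely to flip the overall decision.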

Machine learning prediction and document rendering improvement based on content order

Various disclosed embodiments can resolve output inaccuracies produced by many machine learning models. Embodiments use content order as input to machine learning model systems so that they can process documents according to the position or rank of instances in a document or image. In this way, the model is less likely to misclassify or incorrectly detect instances or the ordering between predicted instances. The content order in various embodiments can be used as an additional signal to classify or make predictions.

Information processing method, information processing apparatus, and computer readable storage medium

An information processing method includes: reading a layer structure and parameters of layers from each of models of two neural networks; and determining a degree of matching between the models of the two neural networks, by comparing layers, of the respective models of the two neural networks, that are configured as a graph-like form in respective hidden layers, in order from an input layer using breadth first search or depth first search, based on similarities between respective layers.
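A minimal sketch of the breadth-first comparison follows. It assumes each model is a dictionary with an `input` layer name, a `layers` map from name to type and parameters, and an `edges` map to successor layers; the similarity scores (1.0, 0.5, 0.0) and the normalization by the longer layer sequence are illustrative choices, not values from the abstract.

```python
from collections import deque
from typing import Any, Dict, List

Model = Dict[str, Any]


def bfs_layers(model: Model) -> List[Dict[str, Any]]:
    """Return the model's layers in breadth-first order from the input."""
    start = model["input"]
    order, seen = [], {start}
    queue = deque([start])
    while queue:
        name = queue.popleft()
        order.append(model["layers"][name])
        for nxt in model["edges"].get(name, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


def layer_similarity(a: Dict[str, Any], b: Dict[str, Any]) -> float:
    """Score two layers: same type and parameters, same type only, or neither."""
    if a["type"] != b["type"]:
        return 0.0
    return 1.0 if a["params"] == b["params"] else 0.5


def matching_degree(m1: Model, m2: Model) -> float:
    """Average layer similarity over the longer of the two BFS sequences."""
    l1, l2 = bfs_layers(m1), bfs_layers(m2)
    longest = max(len(l1), len(l2))
    return sum(layer_similarity(a, b) for a, b in zip(l1, l2)) / longest
```

Dividing by the longer sequence means that layers present in only one model lower the score rather than being ignored.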