Patent classifications
G06V30/19
TEXT DETECTION METHOD, TEXT RECOGNITION METHOD AND APPARATUS
The present disclosure provides a text detection method, a text recognition method and an apparatus, which relate to the field of artificial intelligence technology, in particular to deep learning and computer vision, and can be applied to scenarios such as optical character recognition. The text detection method includes: acquiring an image feature of a text strip in a to-be-recognized image; performing visual enhancement processing on the to-be-recognized image to obtain an enhanced feature map of the to-be-recognized image; and comparing the image feature of the text strip with the enhanced feature map for similarity to obtain a target bounding box of the text strip on the enhanced feature map.
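The localization step described above could be sketched as a sliding-window similarity search: pool each window of the enhanced feature map into one vector and compare it against the text strip's feature. The average pooling, cosine similarity, and exhaustive window scan here are illustrative assumptions, not the patented implementation.

```python
import math

def cosine(a, b):
    # Cosine similarity between two feature vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def locate_text_strip(strip_feat, feature_map, window):
    """Slide a window over the enhanced feature map and return the
    bounding box whose pooled feature is most similar to the strip."""
    h, w = len(feature_map), len(feature_map[0])
    wh, ww = window
    dim = len(feature_map[0][0])
    best, best_box = -1.0, None
    for y in range(h - wh + 1):
        for x in range(w - ww + 1):
            # Average-pool the window into one feature vector.
            pooled = [0.0] * dim
            for dy in range(wh):
                for dx in range(ww):
                    for k in range(dim):
                        pooled[k] += feature_map[y + dy][x + dx][k]
            pooled = [v / (wh * ww) for v in pooled]
            s = cosine(strip_feat, pooled)
            if s > best:
                best, best_box = s, (x, y, x + ww, y + wh)
    return best_box, best
```

In practice the patent's features would come from a CNN backbone; the sketch only shows how a similarity comparison can yield a target bounding box on the feature map.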
Optimizing inference time of entity matching models
Methods, systems, and computer-readable storage media for optimizing inference time of entity matching models. Input data is received that includes a set of entities of a first type and a set of entities of a second type. A set of features is provided based on the entities of the first type, the set of features including features expected to be included in entities of the second type. Entities of the second type are filtered based on the set of features to provide a sub-set of entities of the second type. An output is generated by processing the set of entities of the first type and the sub-set of entities of the second type through a ML model, the output comprising a set of matching pairs, each matching pair comprising an entity of the set of entities of the first type and at least one entity of the sub-set of entities of the second type.
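The filter-then-match flow above could look roughly like the following sketch. The token-based features, the dictionary entity shapes, and the pluggable `score_fn` standing in for the ML model are all assumptions for illustration.

```python
def expected_features(entities_a):
    """Features derived from type-1 entities that are expected to appear
    in matching type-2 entities -- here simply lower-cased name tokens."""
    feats = set()
    for e in entities_a:
        feats.update(e["name"].lower().split())
    return feats

def filter_candidates(entities_b, feats):
    """Keep only type-2 entities sharing at least one expected feature,
    so the expensive matcher sees a smaller candidate set."""
    return [e for e in entities_b
            if feats & set(e["description"].lower().split())]

def match(entities_a, entities_b, score_fn, threshold=0.5):
    # score_fn stands in for the trained ML matching model.
    feats = expected_features(entities_a)
    candidates = filter_candidates(entities_b, feats)
    pairs = []
    for a in entities_a:
        for b in candidates:
            if score_fn(a, b) >= threshold:
                pairs.append((a["name"], b["description"]))
    return pairs
```

The inference-time saving comes from running the model only on the filtered sub-set rather than on the full cross product of the two entity sets.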
METHODS AND DEVICES FOR GENERATING TRAINING SAMPLE, TRAINING MODEL AND RECOGNIZING CHARACTER
Methods and devices for generating a training sample, training a model and recognizing a character are provided. The method for generating a training sample comprises: acquiring an image of characters, and determining respective characters contained in the image; and using a projection method to determine weights of the respective characters contained in the image, tagging the image with labels according to the weights of the respective characters contained in the image, and forming a training sample. The method for training a model comprises: using the training sample to train a character recognition model. The method for recognizing a character comprises: using the character recognition model to perform character recognition. The above methods and devices realize accurate recognition of characters, such as double-half characters, contained in an image of a wheel-type meter, and can provide a highly accurate biased recognition result.
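One plausible reading of the projection-based weighting, sketched below, is that a partially visible character on a rotating wheel shows ink in only some image rows, so its horizontal projection yields a visibility weight for the training label. The binary images and the rows-with-ink heuristic are assumptions; the patent does not specify this exact computation.

```python
def row_projection(img):
    """Horizontal projection: number of foreground pixels in each row."""
    return [sum(row) for row in img]

def character_weight(img):
    """Weight a (possibly half-visible) character by the fraction of
    rows that actually contain ink -- a simple projection heuristic."""
    proj = row_projection(img)
    visible = sum(1 for v in proj if v > 0)
    return visible / len(proj) if proj else 0.0

def label_with_weights(char_imgs, chars):
    """Pair each character with its projection weight to form one
    weighted training label for the sample image."""
    return [(c, character_weight(img)) for c, img in zip(chars, char_imgs)]
```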
Text recognition for a neural network
Image data having text associated with a plurality of text-field types is received, the image data including target image data and context image data. The target image data includes target text associated with a text-field type, and the context image data provides context for the target image data. A trained neural network that is constrained to a set of characters for the text-field type is applied to the image data. The trained neural network identifies the target text of the text-field type using a vector embedding that is based on learned patterns for recognizing the context provided by the context image data. One or more predicted characters are provided for the target text of the text-field type in response to identifying the target text using the trained neural network.
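Constraining a network to a character set is commonly done by masking the output logits before the softmax, so only characters valid for the text-field type can be predicted. The sketch below shows that general technique under that assumption; it is not taken from the patent itself.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of logits.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def constrained_decode(logits, vocab, allowed):
    """Mask out characters outside the text-field type's character set
    before the softmax, so the prediction is constrained to `allowed`
    (e.g. digits for a numeric field)."""
    masked = [l if c in allowed else float("-inf")
              for l, c in zip(logits, vocab)]
    probs = softmax(masked)
    best = max(range(len(vocab)), key=lambda i: probs[i])
    return vocab[best], probs[best]
```

Even when an out-of-set character has the highest raw logit, the masked decode returns the best in-set character.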
Electronic document data extraction
Methods, systems, and computer storage media are provided for data extraction. A target document representation may be generated based on modified text of a target electronic document. A measure of similarity may be determined between the target document representation and a reference document representation, which may be based on modified text of a reference electronic document. Based on the measure of similarity, the reference document representation may be selected. An extraction model associated with the selected reference document representation can then be used to extract data from the target document.
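The select-by-similarity step could be sketched as below: represent the target's modified text and each reference's modified text as bags of words, pick the most similar reference, and then apply that reference's extraction model. The bag-of-words representation and cosine measure are illustrative assumptions.

```python
import math
from collections import Counter

def bow_vector(text):
    """Bag-of-words representation of a document's (modified) text."""
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse bag-of-words vectors.
    common = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in common)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def select_reference(target_text, references):
    """Pick the reference document whose representation is most similar
    to the target; its extraction model would then be applied."""
    tv = bow_vector(target_text)
    best_name, best_sim = None, -1.0
    for name, ref_text in references.items():
        sim = cosine(tv, bow_vector(ref_text))
        if sim > best_sim:
            best_name, best_sim = name, sim
    return best_name, best_sim
```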
Table item information extraction with continuous machine learning through local and global models
A bipartite application implements a table auto-completion (TAC) algorithm on the client side and the server side. A client module runs a local model of the TAC algorithm on a user device and a server module runs a global model of the TAC algorithm on a server machine. The local model is continuously adapted through on-the-fly training, with as few as a negative example, to perform TAC on the client side, one document at a time. Knowledge thus learned by the local model is used to improve the global model on the server side. The global model can be utilized to automatically and intelligently extract table information from a large number of documents with significantly improved accuracy, requiring minimal human intervention even on complex tables.
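The local/global interplay might be sketched with a toy linear classifier: the local model adapts on-the-fly from a single example, and its learned weights are then folded into the global model. The perceptron-style update and the weighted-average merge are stand-in assumptions, not the patented TAC algorithm.

```python
class TableCellClassifier:
    """Tiny linear model over cell features; stands in for the TAC model."""
    def __init__(self, dim):
        self.w = [0.0] * dim

    def score(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def update(self, x, label, lr=0.5):
        """On-the-fly training from a single (possibly negative) example."""
        pred = 1 if self.score(x) >= 0 else -1
        if pred != label:
            self.w = [wi + lr * label * xi for wi, xi in zip(self.w, x)]

def merge_into_global(global_model, local_model, alpha=0.1):
    """Fold knowledge learned by a local model into the global model by
    moving the global weights a small step toward the local ones."""
    global_model.w = [(1 - alpha) * g + alpha * l
                      for g, l in zip(global_model.w, local_model.w)]
```

After one negative example the local model already rejects that cell, and the merge lets many such locally learned corrections gradually improve the server-side model.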
System and Method of Identifying Visual Objects
A system and method of identifying objects are provided. In one aspect, the system and method includes a hand-held device with a display, camera and processor. As the camera captures images and displays them on the display, the processor compares the information retrieved in connection with one image with information retrieved in connection with subsequent images. The processor uses the result of such comparison to determine the object that is likely to be of greatest interest to the user. The display simultaneously displays the images as they are captured, the location of the object in an image, and information retrieved for the object.
MULTI-DOMAIN CONVOLUTIONAL NEURAL NETWORK
In one embodiment, an apparatus comprises a memory and a processor. The memory is to store visual data associated with a visual representation captured by one or more sensors. The processor is to: obtain the visual data associated with the visual representation captured by the one or more sensors, wherein the visual data comprises uncompressed visual data or compressed visual data; process the visual data using a convolutional neural network (CNN), wherein the CNN comprises a plurality of layers, wherein the plurality of layers comprises a plurality of filters, and wherein the plurality of filters comprises one or more pixel-domain filters to perform processing associated with uncompressed data and one or more compressed-domain filters to perform processing associated with compressed data; and classify the visual data based on an output of the CNN.
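The multi-domain idea, reduced to its simplest form, is a layer holding two filter sets and routing each input through the pixel-domain or compressed-domain filters depending on how the visual data arrived. The 1-D valid-mode convolution and the boolean routing flag below are illustrative simplifications of the claimed CNN.

```python
def convolve1d(signal, kernel):
    """Valid-mode 1-D convolution (cross-correlation), enough for a sketch."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def multi_domain_layer(data, compressed, pixel_kernel, compressed_kernel):
    """Route the input through pixel-domain or compressed-domain filters
    depending on whether the visual data arrived compressed."""
    kernel = compressed_kernel if compressed else pixel_kernel
    return convolve1d(data, kernel)
```

In the patent's CNN, both filter families live inside one network and classification is based on the network's output; the sketch only shows the per-domain routing of a single layer.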
WINE LABEL RECOGNITION METHOD, WINE INFORMATION MANAGEMENT METHOD AND APPARATUS, DEVICE, AND STORAGE MEDIUM
A wine label recognition method, a wine information management method and apparatus, a computer device, and a computer-readable storage medium are provided. The method includes: obtaining a wine image, and performing optical character recognition (OCR) on the wine image in a preset OCR manner, to obtain text included in the wine image (S21); performing deep learning recognition on the wine image in a preset deep learning recognition manner, to obtain an image feature included in the wine image (S22); and sifting out a target wine label matching the text and the image feature from a preset wine label database according to the text and the image feature, and using the target wine label as a wine label corresponding to the wine image (S23). Advantages of deep learning and OCR are fully utilized, thereby improving accuracy and efficiency of wine label recognition and improving automation efficiency of wine information management.
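The database-matching step, which scores candidates on both the OCR text and the image feature, could be sketched as a weighted combination of two similarity measures. The token-overlap text score, the L1-based feature score, and the 50/50 weighting are assumptions chosen for illustration.

```python
def text_score(ocr_text, label_text):
    """Token-overlap (Jaccard) similarity between the OCR output and a
    database label's text."""
    a, b = set(ocr_text.lower().split()), set(label_text.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0

def feature_score(f1, f2):
    """Similarity of image features; here 1 / (1 + L1 distance)."""
    d = sum(abs(x - y) for x, y in zip(f1, f2))
    return 1.0 / (1.0 + d)

def sift_wine_label(ocr_text, image_feat, database, w_text=0.5):
    """Score every label in the database on both modalities and return
    the best-matching wine label."""
    best, best_score = None, -1.0
    for label in database:
        s = (w_text * text_score(ocr_text, label["text"])
             + (1 - w_text) * feature_score(image_feat, label["feature"]))
        if s > best_score:
            best, best_score = label["name"], s
    return best, best_score
```

Combining both modalities is what lets OCR errors be compensated by the image feature, and vice versa.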
WINE PRODUCT POSITIONING METHOD, WINE PRODUCT INFORMATION MANAGEMENT METHOD AND APPARATUS, DEVICE, AND STORAGE MEDIUM
Disclosed are a wine product positioning method, a wine product information management method and apparatus, a computer device, and a computer-readable storage medium. Based on a preset camera in a wine cellar, a wine product image captured by the preset camera and corresponding to a target wine product is acquired (S21). Based on a preset wine label recognition method combining optical character recognition (OCR) and deep learning recognition, the wine product image is recognized to obtain a wine label corresponding to the wine product image (S22). A preset capture position corresponding to the camera is acquired, and the preset capture position is taken as a current position corresponding to the target wine product (S23). A position corresponding to the target wine product is described by using the wine label and the current position, to position the target wine product (S24).