Patent classifications
G06V30/148
INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING SYSTEM, AND NON-TRANSITORY COMPUTER READABLE MEDIUM
An information processing apparatus includes a processor configured to acquire, from a read image, a predetermined item, and a value corresponding to the item, the read image being obtained by reading a document and being subjected, prior to acquisition of the item and the value, to preprocessing and character recognition. Further, the processor is configured to, in response to not successfully acquiring at least one of the item and the value, change a setting on the preprocessing or a setting on the character recognition in accordance with the acquisition or non-acquisition state of the item and the value, and then perform the preprocessing or the character recognition. In response to not successfully acquiring at least one of the item and the value, the processor is further configured to identify where the item and the value are located.
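The retry behaviour described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the claimed implementation: `run_pipeline`, `adjust_settings`, the setting names `deskew` and `ocr_mode`, and the policy of which setting to change for which failure are all hypothetical.

```python
def adjust_settings(settings, item_found, value_found):
    """Hypothetical policy: pick the setting to change from what was missed."""
    updated = dict(settings)
    if not item_found:
        # Assumption: a missing item suggests the page image itself is poor,
        # so a preprocessing setting is changed.
        updated["deskew"] = True
    elif not value_found:
        # Assumption: item found but value missing suggests a recognition
        # problem, so a character recognition setting is changed.
        updated["ocr_mode"] = "handwritten"
    return updated


def acquire(image, settings, run_pipeline, max_attempts=3):
    """Run preprocessing + recognition + extraction, retrying with new settings.

    run_pipeline(image, settings) is a caller-supplied stand-in that returns
    an (item, value) pair, with None for anything it could not acquire.
    """
    item = value = None
    for _ in range(max_attempts):
        item, value = run_pipeline(image, settings)
        if item is not None and value is not None:
            break
        settings = adjust_settings(settings, item is not None, value is not None)
    return item, value, settings
```

The pipeline is passed in as a callable so the sketch stays independent of any particular OCR engine.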
Text Classification Method and Text Classification Device
Disclosed are a text classification method and a text classification device. The text classification method includes: receiving text data (S1), the text data comprising one or more text semantic units; replacing the text semantic unit with a corresponding text keyword (S2), based on a correspondence between text semantic elements and text keywords; extracting, with a semantic model, a semantic feature of the text keyword (S3); and classifying, with a classification model, the text keyword at least based on the semantic feature, as a classification result of the text data (S4).
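Steps S2 to S4 might be wired together as in the minimal sketch below. The function names are hypothetical, the semantic model and classification model are passed in as plain callables rather than real trained models, and keeping unmapped units unchanged is an assumption the abstract does not state.

```python
def replace_with_keywords(units, correspondence):
    """S2: swap each text semantic unit for its keyword.

    Assumption: a unit with no entry in the correspondence is kept as-is.
    """
    return [correspondence.get(unit, unit) for unit in units]


def classify_text(units, correspondence, semantic_model, classification_model):
    """S2-S4: keyword replacement, feature extraction, classification.

    semantic_model and classification_model are stand-in callables here;
    a real system would use trained models.
    """
    keywords = replace_with_keywords(units, correspondence)
    features = semantic_model(keywords)       # S3
    return classification_model(features)     # S4
```

For example, with the correspondence `{"can't log in": "login_failure"}`, the units `["can't log in", "hello"]` become `["login_failure", "hello"]` before feature extraction.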
VIDEO CROPPING METHOD AND APPARATUS, DEVICE, AND STORAGE MEDIUM
Provided are a video cropping method and apparatus, a device, and a storage medium. The method includes: obtaining at least one detection box of a first image frame; determining, based on at least one of an importance score, a coverage area, or a smoothing distance of any detection box in the at least one detection box, a cost of the detection box; determining a first detection box having a minimum cost among the at least one detection box as a cropping box; and cropping the first image frame based on the cropping box. Based on a cost of each detection box, the first detection box having the minimum cost among the at least one detection box is determined as the cropping box to crop the first image frame, which can not only improve flexibility of video cropping, but also improve a cropping effect while simplifying the video cropping process.
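The minimum-cost selection could look something like the sketch below. The abstract does not give the cost formula, so the weighted sum, the `Box` type, and the way each term is computed are assumptions: coverage is taken as the fraction of the frame a box covers, and smoothing distance as the distance between a box's center and the previous frame's crop center.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Box:
    x: float               # left edge
    y: float               # top edge
    w: float
    h: float
    importance: float = 0.0  # assumed to lie in [0, 1], higher is better

    @property
    def center(self):
        return (self.x + self.w / 2, self.y + self.h / 2)


def box_cost(box, prev_crop, frame_w, frame_h, weights=(1.0, 1.0, 1.0)):
    """Hypothetical cost (lower is better); set a weight to 0 to ignore a term."""
    w_imp, w_cov, w_smooth = weights
    importance_term = 1.0 - box.importance
    coverage_term = 1.0 - (box.w * box.h) / (frame_w * frame_h)
    (cx, cy), (px, py) = box.center, prev_crop.center
    # Normalise by the frame diagonal so the term stays in [0, 1].
    smoothing_term = hypot(cx - px, cy - py) / hypot(frame_w, frame_h)
    return (w_imp * importance_term
            + w_cov * coverage_term
            + w_smooth * smoothing_term)


def select_crop_box(boxes, prev_crop, frame_w, frame_h):
    """Return the detection box with the minimum cost."""
    return min(boxes, key=lambda b: box_cost(b, prev_crop, frame_w, frame_h))
```

With these assumed terms, a slightly less important box that sits close to the previous crop can win over a more important box far away, which is the kind of trade-off a smoothing term is meant to capture.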
METHOD FOR EXTRACTING CHARACTERS FROM VEHICLE LICENSE PLATE, AND LICENSE PLATE CHARACTER EXTRACTION DEVICE FOR PERFORMING METHOD
There is provided a method of extracting characters from a license plate of a vehicle performed by a license plate character extraction device. The method comprises: converting an input image obtained by capturing the license plate of the vehicle into a grayscale image; generating a converted image based on a result of comparing a value of at least one pixel included in the grayscale image with a first average of values of pixels adjacent to the at least one pixel; generating a refined image based on a result of comparing the converted image with a binarized image obtained by binarizing the converted image; and extracting characters included in the refined image.
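The first two steps (grayscale conversion and comparison with a neighbourhood average) resemble local mean thresholding and can be sketched as below. This is an illustration only: the luminance weights, the 3x3 neighbourhood, the output values 0 and 255, and all function names are assumptions, and the later refinement and character extraction steps are not covered.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples to rows of 0-255 gray values.

    Assumption: the common 0.299/0.587/0.114 luminance weights.
    """
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb_image]


def neighbour_mean(gray, y, x, radius=1):
    """Average of the pixels around (y, x), excluding the pixel itself."""
    h, w = len(gray), len(gray[0])
    values = [gray[j][i]
              for j in range(max(0, y - radius), min(h, y + radius + 1))
              for i in range(max(0, x - radius), min(w, x + radius + 1))
              if (j, i) != (y, x)]
    # A 1x1 image has no neighbours; fall back to the pixel's own value.
    return sum(values) / len(values) if values else gray[y][x]


def compare_with_neighbours(gray, radius=1):
    """Mark a pixel 0 if it is darker than its neighbourhood mean, else 255."""
    return [[0 if gray[y][x] < neighbour_mean(gray, y, x, radius) else 255
             for x in range(len(gray[0]))]
            for y in range(len(gray))]
```

Comparing against a local average rather than a single global threshold tends to tolerate uneven lighting across the plate, which is a likely motivation for this step.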
METHOD AND APPARATUS FOR IDENTIFYING KEY INFORMATION IN A MULTI-PARTY MULTIMEDIA COMMUNICATION
An apparatus for identifying key information in a multi-party multimedia communication includes a processor, and a memory storing instructions that, when executed by the processor, configure the apparatus to perform a method. The method includes receiving multi-modal data including video data and audio data for each of multiple participants, and presentation data presented on one or more multimedia devices. Vision information and at least one of tonal information or text information are used to determine a representative participation score (RPS) for one or more participants. Content from presentation data presented during or proximate to pronounced RPS movement is identified and sent for display.
IMAGE PROCESSING APPARATUS, IMAGE PROCESSING METHOD, AND STORAGE MEDIUM
A method of controlling an image processing apparatus includes: obtaining a candidate image group including a plurality of images; determining a specific condition for preferentially selecting an image from the candidate image group; analyzing the images in the candidate image group; analyzing captions attached to the images in the candidate image group; and selecting a specific image from the candidate image group based on results of the determining the specific condition, the analyzing the images, and the analyzing the captions.
METHOD AND ELECTRONIC DEVICE FOR RECOGNIZING TEXT IN IMAGE
A method and an electronic device for recognizing text are provided. The method includes detecting positions of pieces of text included in the text in the image, generating cropped images by cropping areas corresponding to the pieces of text in the image, recognizing characters of the pieces of text based on the cropped images, generating a sentence by inputting the positions of the pieces of text and the characters of the pieces of text to a multimodal language model, wherein the multimodal language model is an artificial intelligence (AI) model for inferring an original sentence of the text, and displaying the sentence.
Text refinement network
Systems and methods for text segmentation are described. Embodiments of the inventive concept are configured to receive an image including a foreground text portion and a background portion, classify each pixel of the image as foreground text or background using a neural network that refines a segmentation prediction using a key vector representing features of the foreground text portion, wherein the key vector is based on the segmentation prediction, and identify the foreground text portion based on the classification.
Code reader and method of reading an optical code
A method of reading an optical code is provided in which a brightness profile of the code is recorded, light and dark part regions are identified in the brightness profile, and the code content of the optical code is read. First sum measurements for the light quantity in the respective light part regions and second sum measurements for the light quantity lacking for a white level in the respective dark part regions are determined from the brightness profile and the code content is read based on the first and second sum measurements.
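The two kinds of sum measurement can be illustrated with a short sketch. The abstract does not say how light and dark part regions are identified, so splitting the profile at a fixed threshold (half the white level by default) is an assumption, as are the function name and the 1-D list representation of the brightness profile.

```python
from itertools import groupby


def sum_measurements(profile, white_level=255, threshold=None):
    """Return one (kind, measurement) pair per light or dark part region.

    - light region: sum of its brightness values (first sum measurement)
    - dark region:  sum of (white_level - brightness), i.e. the light that
      is lacking relative to the white level (second sum measurement)
    """
    if threshold is None:
        threshold = white_level / 2  # assumed region split, not from the source
    measurements = []
    # groupby collects consecutive samples on the same side of the threshold.
    for is_light, run in groupby(profile, key=lambda v: v >= threshold):
        values = list(run)
        if is_light:
            measurements.append(("light", sum(values)))
        else:
            measurements.append(("dark", sum(white_level - v for v in values)))
    return measurements
```

For the profile `[250, 250, 10, 10, 10, 250]` with a white level of 250, this yields `[("light", 500), ("dark", 720), ("light", 250)]`. Dividing each measurement by the white level gives roughly 2, 2.88, and 1 samples, so the sums act as width estimates that degrade gracefully when the dark samples are not fully black.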