G06V30/147

INTEGRATING OVERLAID TEXTUAL DIGITAL CONTENT INTO DISPLAYED DATA VIA GRAPHICS PROCESSING CIRCUITRY USING A FRAME BUFFER

An apparatus, method, and computer-readable medium for generating and displaying a dynamic language translation overlay. The method includes: accessing a frame buffer of a GPU; analyzing, in the frame buffer of the GPU, a frame representing a section of a stream of displayed data that is being displayed by a display device; based on the analyzed frame, identifying a reference patch that includes an instruction to identify an object comprising original text; based on the instruction included in the reference patch, recognizing the original text; generating translated text; generating an overlay comprising an augmentation layer, the augmentation layer including the translated text; and overlaying the overlay onto the displayed data such that the translated text is viewable while the original text is obscured from view.

Logo picture processing method, apparatus, device and medium

The present disclosure provides a logo picture processing method, apparatus, device and medium, and relates to the technical field of image processing, specifically to artificial intelligence fields such as deep learning and computer vision. The logo picture processing method includes: obtaining a logo picture that includes a current logo graph and current text information; performing text recognition on the logo picture to obtain the current text information; and searching for a picture that matches both the current logo graph and the current text information, to obtain a matched picture. The present disclosure may improve the accuracy of the matched picture of the logo picture and thereby improve the logo picture recognition accuracy.

System and Method for Analyzing Videos in Real-Time

A method and a sports analytics system (SAS) for analyzing a live video broadcast stream (LVBS) of a sporting event are provided. The SAS splits the LVBS into a real time messaging protocol (RTMP) stream and a hypertext transfer protocol live stream (HLS) and analyses the RTMP stream using a phase difference between the RTMP stream and the HLS. The SAS detects persons present in a frame of the RTMP stream using a first set of cues and tracks the detected persons by analyzing preceding frames. The SAS recognizes the tracked persons using a second set of cues, assigns individual weights to each of the second set of cues, and compares the assigned weights of each of the recognized persons with pre-existing data of all players to identify the players in the frame. The SAS transmits the HLS and contextual interactive content of the identified players to a user device.
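The weighted comparison in the final identification step might look roughly like the following Python sketch. The cue names (`jersey_number`, `jersey_color`, `position`), the weights, and the roster are hypothetical, and summing the weights of agreeing cues is an assumed scoring rule, not one stated in the abstract.

```python
from typing import Dict, Tuple

Cues = Dict[str, str]


def identify_player(
    observed: Cues, weights: Dict[str, float], roster: Dict[str, Cues]
) -> Tuple[str, float]:
    """Return the roster entry whose stored cues best agree with the observed cues.

    Assumed scoring rule: add up the weights of every cue whose observed
    value equals the value stored for that player.
    """
    if not roster:
        raise ValueError("roster must not be empty")

    def score(profile: Cues) -> float:
        return sum(
            weight
            for cue, weight in weights.items()
            if cue in observed and observed[cue] == profile.get(cue)
        )

    best = max(roster, key=lambda name: score(roster[name]))
    return best, score(roster[best])


# Hypothetical cue weights and pre-existing player data.
weights = {"jersey_number": 0.6, "jersey_color": 0.3, "position": 0.1}
roster = {
    "Player A": {"jersey_number": "10", "jersey_color": "red", "position": "forward"},
    "Player B": {"jersey_number": "7", "jersey_color": "red", "position": "defender"},
}

# Cues recognized for one tracked person in the current frame.
observed = {"jersey_number": "7", "jersey_color": "red"}
name, confidence = identify_player(observed, weights, roster)
```

A heavier weight on the jersey number reflects the assumption that it discriminates between players better than a color shared by the whole team.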

ENTRY DETECTION AND RECOGNITION FOR CUSTOM FORMS

The disclosure herein describes providing signature data of an input document. Text data of the input document is obtained (e.g., OCR data generated from image data) and a first set of signature fields is identified using signature key-value pairs of the text data. A first subset of signed signature fields and a first subset of unsigned signature fields are determined based on mapping to a set of predicted values. A second set of signature fields is determined using a region prediction model applied to image data of the input document. Region images associated with the first subset of unsigned signature fields and with the second set of signature fields are obtained, and a second set of signed signature fields and a second set of unsigned signature fields are determined using a signature recognition model. Signature output data is provided including signed signature fields and/or unsigned signature fields.

COMPUTER-READABLE, NON-TRANSITORY RECORDING MEDIUM CONTAINING THEREIN IMAGE PROCESSING PROGRAM FOR GENERATING LEARNING DATA OF CHARACTER DETECTION MODEL, AND IMAGE PROCESSING APPARATUS

A computer-readable, non-transitory recording medium contains therein an image processing program. The image processing program generates learning data for a character detection model that, in order to recognize a character in a document contained in an image, at least detects a position of the character in the image. The program is configured to cause a computer to generate a cropped image by cropping the image, and to adopt as the learning data a cropped image that does not contain an image representing a split character, while not adopting a cropped image that contains an image representing a split character.
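One way to picture the crop-selection rule is the Python sketch below. It assumes that each character's bounding box is already known and treats a character as "split" when the crop rectangle overlaps its box without fully containing it. The `Box` type and the function names are hypothetical.

```python
from typing import Iterable, NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle; right and bottom edges are exclusive."""
    left: int
    top: int
    right: int
    bottom: int


def intersects(a: Box, b: Box) -> bool:
    """True if the two rectangles share at least one pixel."""
    return a.left < b.right and b.left < a.right and a.top < b.bottom and b.top < a.bottom


def contains(outer: Box, inner: Box) -> bool:
    """True if inner lies entirely inside outer."""
    return (
        outer.left <= inner.left
        and outer.top <= inner.top
        and inner.right <= outer.right
        and inner.bottom <= outer.bottom
    )


def crop_is_usable(crop: Box, char_boxes: Iterable[Box]) -> bool:
    """Adopt the crop only if its border cuts through no character."""
    return not any(intersects(crop, c) and not contains(crop, c) for c in char_boxes)


# Two hypothetical characters side by side on one text line.
chars = [Box(0, 0, 10, 10), Box(12, 0, 22, 10)]

clean_crop = Box(0, 0, 11, 10)   # ends in the gap between the characters
split_crop = Box(0, 0, 15, 10)   # cuts through the second character
```

Here `crop_is_usable(clean_crop, chars)` accepts the crop, while `crop_is_usable(split_crop, chars)` rejects it because the second character would be only partly visible.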

Image processing method, apparatus, electronic device and computer readable storage medium

The present application discloses an image processing method, apparatus, electronic device and computer readable storage medium. The image processing method comprises detecting a text region in an image to be processed, and recognizing the text region to obtain a text recognition result. In this application, text recognition in the image to be processed is realized, the manner of recognizing text in the image is simplified, and the recognition effect for the text is improved.

Information processing apparatus, information processing method, and non-transitory storage medium
11637937 · 2023-04-25

A printed matter reviewed by a user is read, and a difference image is generated based on a first image obtained as a result of reading the printed matter and an electronic document that is a printing source from which the printed matter is generated. Based on the difference image, a process is performed to identify an instruction relating to a revision additionally written on the printed matter and a character string to be subjected to the revision in the electronic document. Thereafter, a particular process is executed on the electronic document based on the instruction related to the revision and the character string to which the revision is applied.

IDENTIFYING WRITING SYSTEMS UTILIZED IN DOCUMENTS

Systems and methods for identifying writing systems utilized in documents. An example method comprises: receiving a document image; splitting the document image into a plurality of image fragments; generating, by a neural network processing the plurality of image fragments, a plurality of probability vectors, wherein each probability vector of the plurality of probability vectors is produced by processing a corresponding image fragment and contains a plurality of numeric elements, and wherein each numeric element of the plurality of numeric elements reflects a probability of the image fragment containing a text associated with a respective writing system; computing an aggregated probability vector by aggregating the plurality of probability vectors, wherein each numeric element of the aggregated probability vector reflects a probability of the image containing a text associated with a writing system that is identified by an index of the numeric element within the aggregated probability vector; and responsive to determining that a maximum numeric element of the aggregated probability vector exceeds a predefined threshold value, concluding that the document image contains one or more symbols associated with a respective writing system.
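The aggregation and threshold steps can be illustrated with a short Python sketch. The abstract does not say how the vectors are aggregated, so the element-wise mean used here is an assumption. The per-fragment probabilities, the list of writing systems, and the threshold value are likewise hypothetical stand-ins for the neural network's output.

```python
from typing import List, Optional, Sequence


def aggregate(prob_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of the per-fragment probability vectors (assumed rule)."""
    if not prob_vectors:
        raise ValueError("need at least one probability vector")
    width = len(prob_vectors[0])
    if any(len(v) != width for v in prob_vectors):
        raise ValueError("all probability vectors must have the same length")
    count = len(prob_vectors)
    return [sum(v[i] for v in prob_vectors) / count for i in range(width)]


def detect_writing_system(
    prob_vectors: Sequence[Sequence[float]],
    systems: Sequence[str],
    threshold: float = 0.5,
) -> Optional[str]:
    """Return the writing system at the index of the maximum aggregated
    element, or None if that maximum does not exceed the threshold."""
    aggregated = aggregate(prob_vectors)
    best = max(range(len(aggregated)), key=aggregated.__getitem__)
    return systems[best] if aggregated[best] > threshold else None


# Hypothetical outputs for two image fragments over three writing systems.
systems = ["Latin", "Cyrillic", "Arabic"]
fragments = [[0.8, 0.1, 0.1], [0.6, 0.3, 0.1]]
result = detect_writing_system(fragments, systems, threshold=0.5)
```

With these inputs the aggregated vector is approximately `[0.7, 0.2, 0.1]`, so the first index (Latin) is reported. Raising the threshold above 0.7 would make the function return `None`.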

DOCUMENT READING DEVICE AND METHOD FOR CONTROLLING THE SAME
20230069064 · 2023-03-02

A document reading device includes a document conveyer, a first reader reading, in a first reading position, a first surface of a conveyed document such that a read area is larger than the conveyed document, a second reader reading, in a second reading position, a surface (second surface) opposite to the first surface such that a read area is larger than the conveyed document, a region detector executing a process of detecting a first document region that is a region of a document in first document image data and a process of detecting a second document region that is a region of the document in second document image data, and a cropping processor cropping a document portion on the first surface as first cropped image data and cropping a document portion on the second surface as second cropped image data, based on one of the document regions successfully detected.

DISPLAY CONTROL INTEGRATED CIRCUIT APPLICABLE TO PERFORMING REAL-TIME VIDEO CONTENT TEXT DETECTION AND SPEECH AUTOMATIC GENERATION IN DISPLAY DEVICE

A display control integrated circuit (IC) applicable to performing real-time video content text detection and speech automatic generation in a display device may include a pre-processing circuit, a character recognition circuit and a post-processing circuit. The pre-processing circuit may input a video signal to obtain a real-time video content carried by the video signal, and perform preliminary text detection on the real-time video content to generate a series of segmented character images to indicate a subtitle. The character recognition circuit may perform character recognition on the series of segmented character images to generate a series of characters, respectively. The post-processing circuit may perform vocabulary correction on the series of characters to selectively replace any erroneous character with a correct character to generate one or more vocabularies, for performing speech automatic generation.
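The vocabulary-correction step of the post-processing circuit can be approximated in software, as in the Python sketch below. The abstract does not specify the correction method, so string-similarity matching with the standard `difflib` module is swapped in here as an assumption. The lexicon, the cutoff, and the function name `correct_words` are hypothetical.

```python
import difflib
from typing import Iterable, List


def correct_words(words: Iterable[str], lexicon: List[str], cutoff: float = 0.7) -> List[str]:
    """Replace each recognized word that is not in the lexicon with its
    closest lexicon entry, leaving it unchanged if nothing is close enough."""
    corrected = []
    for word in words:
        if word in lexicon:
            corrected.append(word)
            continue
        # get_close_matches returns up to n entries scoring at least cutoff, best first.
        matches = difflib.get_close_matches(word, lexicon, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else word)
    return corrected


# Hypothetical recognizer output in which "l" was misread as "1".
lexicon = ["hello", "world"]
recognized = ["he1lo", "world", "zzzzz"]
fixed = correct_words(recognized, lexicon)
```

The misread `he1lo` is mapped to `hello`, while `zzzzz` has no sufficiently similar lexicon entry and passes through unchanged, mirroring the "selectively replace" wording of the abstract.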