Patent classifications
G06V30/148
DISPLAY CONTROL INTEGRATED CIRCUIT APPLICABLE TO PERFORMING REAL-TIME VIDEO CONTENT TEXT DETECTION AND SPEECH AUTOMATIC GENERATION IN DISPLAY DEVICE
A display control integrated circuit (IC) applicable to performing real-time video content text detection and speech automatic generation in a display device may include a pre-processing circuit, a character recognition circuit and a post-processing circuit. The pre-processing circuit may input a video signal to obtain a real-time video content carried by the video signal, and perform preliminary text detection on the real-time video content to generate a series of segmented character images to indicate a subtitle. The character recognition circuit may perform character recognition on the series of segmented character images to generate a series of characters, respectively. The post-processing circuit may perform vocabulary correction on the series of characters to selectively replace any erroneous character with a correct character to generate one or more vocabularies, for performing speech automatic generation.
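The vocabulary-correction step in the abstract above can be sketched in software as a nearest-word lookup. This is only an illustration: the abstract does not say how erroneous characters are found or replaced, so the lexicon `LEXICON`, the function name `correct_word`, and the use of string similarity from Python's `difflib` are all assumptions.

```python
import difflib

# Hypothetical lexicon; a real system would use a far larger vocabulary.
LEXICON = ["weather", "forecast", "tonight"]

def correct_word(chars, lexicon=LEXICON, cutoff=0.7):
    """Join recognized characters and snap the result to the closest lexicon word.

    The word is returned unchanged when no lexicon entry is similar enough,
    so the replacement is selective rather than forced.
    """
    word = "".join(chars)
    if word in lexicon:
        return word
    matches = difflib.get_close_matches(word, lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else word
```

For instance, a subtitle word recognized as `f0recast` (digit zero instead of the letter o) would be mapped to `forecast`, while a string with no close lexicon entry passes through untouched.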
DETECTION OF HEAT TREATED MARKINGS ON A WOODEN PALLET
A pallet inspection system includes a frame configured to have a pallet receiving area to receive a wooden pallet to be inspected for having at least one mark indicating that wood in the pallet has been heat treated. Cameras are carried by the frame to generate images of the wooden pallet in response to the wooden pallet being in the pallet receiving area. A processor is to perform object detection on each image to detect if the mark is present, crop each image having the mark so that an area surrounding the mark within the image is removed, and perform image segmentation on each cropped image so that pixels within the cropped image are classified into regions. The processor determines readability of the regions in each cropped image based on respective readability criteria thresholds. The mark is classified in each cropped image as readable based on the mark meeting the respective readability criteria thresholds.
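The final readability decision described above can be sketched as a per-region threshold check. The region names and score values used here are hypothetical, since the abstract does not list the readability criteria; the sketch only assumes that each segmented region yields one score and has its own minimum threshold.

```python
def mark_is_readable(region_scores, thresholds):
    """Return True only if every required region meets its own threshold.

    region_scores: dict mapping region name -> readability score (assumed 0..1).
    thresholds:    dict mapping region name -> minimum acceptable score.
    A region missing from region_scores is treated as unreadable.
    """
    return all(
        region_scores.get(region, 0.0) >= minimum
        for region, minimum in thresholds.items()
    )
```

With thresholds such as `{"logo": 0.5, "treatment_code": 0.7}` (illustrative values), a cropped image is classified as readable only when both regions score at or above their respective minimums.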
Training digital content classification models utilizing batchwise weighted loss functions and scaled padding based on source density
Methods, systems, and non-transitory computer-readable storage media are disclosed for training a machine-learning model utilizing batchwise weighted loss functions and scaled padding based on source density. For example, the disclosed systems can determine a density of words or phrases in digital content from a digital content source that indicate an affinity towards one or more content classes. In some embodiments, the disclosed systems can use the determined source density to split digital content from the source into segments and pad the segments with padding characters based on the source density. The disclosed systems can also generate document embeddings using the padded segments and then train the machine-learning model using the document embeddings. Furthermore, the disclosed systems can use batchwise weighted cross-entropy loss to apply different class weightings on a per-batch basis during training of the machine-learning model.
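A minimal sketch of a batchwise weighted cross-entropy loss follows. The abstract does not state how the per-batch class weights are chosen; this sketch assumes inverse class frequency within the batch, and the function names are hypothetical.

```python
import math
from collections import Counter

def batch_class_weights(labels, num_classes):
    """Weight each class by inverse frequency within this batch (an assumption).

    Classes absent from the batch get weight 0.0; they contribute no loss terms.
    """
    counts = Counter(labels)
    n = len(labels)
    return [
        n / (num_classes * counts[c]) if counts[c] else 0.0
        for c in range(num_classes)
    ]

def batchwise_weighted_cross_entropy(probs, labels, num_classes):
    """Weighted mean of -log p(true class), with weights recomputed per batch.

    probs:  list of per-example probability lists (each sums to 1).
    labels: list of integer class labels, one per example.
    """
    weights = batch_class_weights(labels, num_classes)
    total = sum(-weights[y] * math.log(p[y]) for p, y in zip(probs, labels))
    norm = sum(weights[y] for y in labels)
    return total / norm
```

Because the weights are derived from each batch's own label counts, a class that is rare in one batch is up-weighted there even if it is common elsewhere in the training set.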
Generation of translated electronic document from an input image by consolidating each of identical untranslated text strings into a single element for translation
A method of generating an editable translated electronic document from an input image of an original document with a first layout includes: segmenting the input image to generate a first region including first untranslated text; extracting, from the first region, the first untranslated text and first layout information; generating editable output data including the first untranslated text and the first layout information; translating the first untranslated text into translated text; editing the output data to include the translated text; and generating, using the first layout information, the translated electronic document including the translated text and a second layout that is identical to the first layout.
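The consolidation named in the title (translating each distinct untranslated string only once) can be sketched as below. The region dictionaries with `text` and `layout` keys and the `translate` callable are hypothetical stand-ins; the abstract does not define a data format or a translation service.

```python
def translate_regions(regions, translate):
    """Translate each distinct string once, then write results back per region.

    regions:   list of dicts with 'text' (untranslated) and 'layout' entries.
    translate: callable taking a string and returning its translation.
    Layout information is carried over unchanged so the output keeps the
    original arrangement.
    """
    # dict.fromkeys removes duplicates while preserving first-seen order.
    unique_texts = list(dict.fromkeys(r["text"] for r in regions))
    translated = {text: translate(text) for text in unique_texts}
    return [
        {"text": translated[r["text"]], "layout": r["layout"]}
        for r in regions
    ]
```

If a column header such as "Total" appears in several regions, `translate` is called for it a single time and every region receives the same result.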
Method for optical character recognition in document subject to shadows, and device employing method
A method for optical recognition of characters in an unclear or non-optimal image of an object document, the image carrying shadows or other impediments, inputs the document into a shadow prediction model to obtain a shadow mask. A determination is made as to whether the shadow mask of the document affects optical character recognition (OCR) performance. If the shadow mask is deemed to affect the OCR performance, the method further inputs the document into a shadow removing model to remove the shadows and obtain an intermediate document. OCR can then be performed on the final object document.
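The conditional flow described above can be sketched as follows. The three model callables are placeholders, and the test for whether the mask "affects OCR performance" is assumed here to be the fraction of shadowed pixels exceeding a threshold; the abstract does not specify the actual criterion.

```python
def ocr_with_shadow_handling(image, predict_mask, remove_shadows, run_ocr,
                             max_shadow_fraction=0.1):
    """Run OCR, removing shadows first only when the mask is judged harmful.

    predict_mask(image)          -> 2D list of 0/1 values (1 = shadowed pixel)
    remove_shadows(image, mask)  -> image with shadows removed
    run_ocr(image)               -> recognized text
    max_shadow_fraction is a hypothetical decision threshold.
    """
    mask = predict_mask(image)
    total = sum(len(row) for row in mask)
    shadowed = sum(sum(row) for row in mask)
    if total and shadowed / total > max_shadow_fraction:
        image = remove_shadows(image, mask)
    return run_ocr(image)
```

Skipping the removal model when the mask is negligible avoids an unnecessary inference pass on documents that are already clean.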
USER INTERFACES FOR MANAGING VISUAL CONTENT IN MEDIA
The present disclosure generally relates to methods and user interfaces for managing visual content at a computer system. In some embodiments, methods and user interfaces for managing visual content in media are described. In some embodiments, methods and user interfaces for managing visual indicators for visual content in media are described. In some embodiments, methods and user interfaces for inserting visual content in media are described. In some embodiments, methods and user interfaces for identifying visual content in media are described. In some embodiments, methods and user interfaces for translating visual content in media are described.
Systems and methods of product identification within an image
Some embodiments provide systems to identify products, comprising: a product vector database; and a plurality of portable computing devices comprising a camera and a control circuit configured to: access an image captured by the camera; perform optical character recognition on the image; apply a vector modeling rule to key text, generate a first query product vector, and wirelessly communicate the first query product vector to a product recommendation system. The product recommendation system is configured to apply a vector evaluation rule to the first query product vector to identify a first product, and to wirelessly communicate a first product identifier to the portable computing device. The control circuit receives the first product identifier, accesses product information, causes the product information to be displayed, and causes the first product to be virtually added to a virtual cart.
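One way to picture the lookup against the product vector database is a nearest-neighbour search. The abstract does not define the "vector evaluation rule"; cosine similarity is assumed here purely for illustration, and the product identifiers are made up.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def identify_product(query_vector, database):
    """Return the identifier whose stored vector is most similar to the query.

    database: dict mapping product identifier -> product vector.
    """
    return max(database, key=lambda pid: cosine(query_vector, database[pid]))
```

A query vector built from OCR key text would be sent to the recommendation side, which returns the best-matching identifier for display and for adding to the virtual cart.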
Methods, systems, articles of manufacture and apparatus to decode receipts based on neural graph architecture
Methods, apparatus, systems, and articles of manufacture are disclosed to decode receipts based on neural graph architecture. An example apparatus for decoding receipts includes vertex feature representation circuitry to extract features from optical-character-recognition (OCR) words, polar coordinate circuitry to calculate polar coordinates of the OCR words based on respective ones of the extracted features, graph neural network circuitry to generate an adjacency matrix based on the extracted features, post-processing circuitry to traverse the adjacency matrix to generate cliques of OCR processed words, and output circuitry to generate lines of text based on the cliques of OCR processed words.
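The traversal from adjacency matrix to text lines can be sketched as below. In the abstract the adjacency matrix comes from a graph neural network; this sketch substitutes a simple vertical-distance rule so the example stays self-contained, and it treats each connected group as a clique. The word format (`text`, `center`) and the tolerance `y_tol` are assumptions.

```python
import math

def polar(word):
    """Polar coordinates (radius, angle) of a word's center point.

    Shown as an example feature; the substitute adjacency rule below
    does not consume it.
    """
    x, y = word["center"]
    return math.hypot(x, y), math.atan2(y, x)

def adjacency(words, y_tol=5.0):
    """Stand-in for the learned adjacency: link words at a similar height."""
    n = len(words)
    return [
        [int(abs(words[i]["center"][1] - words[j]["center"][1]) <= y_tol)
         for j in range(n)]
        for i in range(n)
    ]

def lines_from_adjacency(words, adj):
    """Traverse the adjacency matrix, grouping linked words into text lines."""
    seen, lines = set(), []
    for start in range(len(words)):
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:
            i = stack.pop()
            group.append(i)
            for j, linked in enumerate(adj[i]):
                if linked and j not in seen:
                    seen.add(j)
                    stack.append(j)
        # Order words left to right within the line.
        group.sort(key=lambda i: words[i]["center"][0])
        lines.append(" ".join(words[i]["text"] for i in group))
    return lines
```

Given four OCR words from two receipt rows, the traversal yields one string per row with the item name followed by its price.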
Identifying invalid identification documents
The method, system, and non-transitory computer-readable medium embodiments described herein provide for identifying invalid identification documents. In various embodiments, an application executing on a user device prompts the user device to transmit an image of the identification document. The application receives an image including the identification document in response to the identification document being within a field of view of a camera of the user device. The identification document includes a plurality of visual elements, and one or more visual elements of the plurality of visual elements are one or more invalidating marks. The application detects a predetermined pattern on the identification document in the image, the predetermined pattern formed from the one or more invalidating marks. The application determines that the identification document is invalid based on the detected predetermined pattern.
METHOD AND APPARATUS FOR GENERATING PREDICTION INFORMATION, AND ELECTRONIC DEVICE AND MEDIUM
Disclosed are a method and apparatus for generating prediction information, and an electronic device and a medium. One embodiment of the method comprises: acquiring at least one input word; generating a word vector of each input word of the at least one input word to obtain a word vector set, wherein the at least one input word is obtained by performing word segmentation on target input text; generating an input text vector on the basis of the word vector set; and on the basis of the input text vector and a user vector, generating prediction information for predicting a user intention, wherein the user vector is obtained on the basis of user historical record information. In this embodiment, prediction information for predicting a user intention is generated, such that the popping up of unnecessary information is reduced. A user can be prevented from being disturbed, thereby improving the user experience.
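The vector pipeline in the abstract above can be sketched as follows. The embedding table `EMBED`, the averaging of word vectors, and the dot-product-plus-sigmoid scoring are all assumptions made for illustration; the abstract only states that an input text vector and a user vector are combined to produce prediction information.

```python
import math

# Hypothetical two-dimensional word embeddings.
EMBED = {"send": [1.0, 0.0], "gif": [0.0, 1.0]}

def text_vector(words, embed=EMBED):
    """Average the vectors of known words; zero vector if none are known."""
    dim = len(next(iter(embed.values())))
    vecs = [embed[w] for w in words if w in embed]
    if not vecs:
        return [0.0] * dim
    return [sum(v[k] for v in vecs) / len(vecs) for k in range(dim)]

def predict_intent(words, user_vector, embed=EMBED):
    """Score how likely the user wants the suggestion, as a value in (0, 1)."""
    tv = text_vector(words, embed)
    score = sum(t * u for t, u in zip(tv, user_vector))
    return 1.0 / (1.0 + math.exp(-score))
```

A caller could suppress a pop-up whenever the returned value falls below a chosen cut-off, which is the behaviour the abstract credits with reducing unnecessary information.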