G06F16/5846

Identifying product metadata from an item image

A metadata extraction machine accesses an image that depicts an item. The item depicted in the image may have an attribute that describes a characteristic of the item and an attribute descriptor that corresponds to the attribute of the item and specifies a value of the attribute. The metadata extraction machine performs an analysis of the image. The analysis may include identifying the attribute descriptor corresponding to the attribute based on image segmentation of the image. The metadata extraction machine transmits a communication to a device of a user based on the identifying of the attribute descriptor corresponding to the attribute of the item depicted in the image.
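The identification step described above might be sketched, in highly simplified form, as a lookup from segmented region labels to attribute names. All region labels, attribute names, and values here are hypothetical stand-ins for what a real segmentation model would produce:

```python
# Hypothetical sketch: map segmented regions of an item image to
# attribute descriptors (e.g. a "sleeve" region -> a sleeve-length value).
def extract_attribute_descriptors(segments):
    """segments: list of (region_label, detected_value) pairs, as an
    upstream image-segmentation model (assumed here) might emit them."""
    attribute_map = {
        "sleeve": "sleeve_length",
        "collar": "collar_type",
        "fabric": "material",
    }
    descriptors = {}
    for region_label, detected_value in segments:
        attribute = attribute_map.get(region_label)
        if attribute is not None:
            descriptors[attribute] = detected_value
    return descriptors

segments = [("sleeve", "long"), ("collar", "v-neck"), ("logo", "acme")]
print(extract_attribute_descriptors(segments))
# {'sleeve_length': 'long', 'collar_type': 'v-neck'}
```

Regions without a known attribute mapping (the "logo" segment) are simply ignored, mirroring the abstract's focus on descriptors that correspond to a recognized attribute.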

In-store card activation
11538022 · 2022-12-27

A user having an account with a payment provider receives an unregistered payment card that is associated with the payment provider and that includes a magnetic stripe encoded with a number unique to the card and a machine-readable code, such as a QR code or barcode, embossed thereon. The user may then open an application on the user's mobile device to capture the number associated with the card by, for example, scanning the QR code or barcode, capturing an image of the number, speaking the number into the device, or manually entering the number into the user's device. The user may also authenticate with the payment provider by entering login credentials. The user may then confirm a request to link the number of the card with the user's payment provider account, which activates and links the card to the user account so that the user can immediately use the card for purchases.
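The activation flow might be sketched as three steps: capture the card number (however obtained), authenticate, then link. The `PaymentProvider` class, account names, and card number below are all invented for illustration:

```python
# Hypothetical sketch of the activation flow: capture the card number,
# authenticate the user, then link the card to the user's account.
class PaymentProvider:
    def __init__(self):
        self.accounts = {"alice": {"password": "s3cret", "cards": []}}
        self.issued_cards = {"4111111111111111"}  # unactivated card numbers

    def authenticate(self, user, password):
        account = self.accounts.get(user)
        return account is not None and account["password"] == password

    def activate_and_link(self, user, card_number):
        if card_number not in self.issued_cards:
            return False          # unknown or already-activated card
        self.accounts[user]["cards"].append(card_number)
        self.issued_cards.discard(card_number)
        return True

provider = PaymentProvider()
number = "4111111111111111"  # captured via QR scan, photo, voice, or typing
if provider.authenticate("alice", "s3cret"):
    provider.activate_and_link("alice", number)
print(provider.accounts["alice"]["cards"])  # ['4111111111111111']
```

The point of the sketch is the ordering: the number capture is input-method-agnostic, and linking only succeeds after authentication, matching the abstract's "activate and link" confirmation step.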

System and method to support synchronization, closed captioning and highlight within a text document or a media file
11537781 · 2022-12-27

The present invention relates to a system and method for synchronizing and highlighting a target text and audio associated with a reference document. The system and method may comprise one or more of an input unit, an extracting unit, a mapping unit, a processing unit, and an image resizing unit. The system and method may synchronize the target text and audio in order to provide a user with a read-along experience. The invention further synchronizes and highlights closed captions and audio, helping people with hearing impairments comprehend content better while watching a movie or listening to songs.
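The core of such synchronization can be sketched as a lookup from a playback position to the word currently being spoken, given word-level timestamps such as a mapping unit might produce. The words and times below are invented:

```python
import bisect

# Hypothetical sketch: given word-level start times, return the word to
# highlight at a playback position (for read-along or caption highlight).
def word_at(timestamps, words, position):
    """timestamps[i] is the start time (seconds) of words[i];
    timestamps must be sorted ascending."""
    i = bisect.bisect_right(timestamps, position) - 1
    return words[i] if i >= 0 else None

words = ["Read", "along", "with", "me"]
timestamps = [0.0, 0.4, 0.9, 1.3]
print(word_at(timestamps, words, 1.0))  # with
```

Binary search keeps each highlight update O(log n), which matters when the same lookup runs on every playback tick.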

SYSTEM AND METHOD FOR IDENTIFYING A LOCATION USING IMAGE RECOGNITION
20220398839 · 2022-12-15

A system and method for identifying a location using image recognition. Short-term rental (STR) listing images are analyzed and assigned an archetype. Optionally, STR listing images are analyzed with an object detection model and associated with archetypes. The STR dwelling unit type may be determined from the combination of STR image archetypes. A location for the STR listing may then be determined by comparing the listing images to images of dwelling units retrieved from databases.
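The "combination of archetypes" step might be sketched as rule matching over the set of archetype labels assigned to a listing's photos. The archetype names, rules, and unit types below are hypothetical:

```python
# Hypothetical sketch: infer a dwelling-unit type from the set of image
# archetypes assigned to a listing's photos. Rules are checked in order;
# the first rule whose required archetypes are all present wins.
DWELLING_RULES = [
    ({"bedroom", "kitchen", "living_room", "exterior_house"}, "house"),
    ({"bedroom", "kitchen", "balcony"}, "apartment"),
    ({"bedroom"}, "private_room"),
]

def dwelling_type(archetypes):
    present = set(archetypes)
    for required, unit_type in DWELLING_RULES:
        if required <= present:
            return unit_type
    return "unknown"

print(dwelling_type(["bedroom", "kitchen", "balcony", "bathroom"]))
# apartment
```

Ordering the rules from most to least specific means extra archetypes (the "bathroom" image) never prevent a match.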

METHOD OF TRAINING IMAGE-TEXT RETRIEVAL MODEL, METHOD OF MULTIMODAL IMAGE RETRIEVAL, ELECTRONIC DEVICE AND MEDIUM

A method of training an image-text retrieval model, a method of multimodal image retrieval, an electronic device and a storage medium, each relating to the technical field of artificial intelligence, and in particular, to the fields of computer vision and deep learning technologies. Sample data including a sample text and a sample image is acquired. The sample text includes a sample text in a first language and a sample text in a second language. The sample text in the first language and the sample text in the second language are processed by using a text encoding sub-model of the image-text retrieval model to obtain a sample text feature of the sample data. The sample image is processed by using an image encoding sub-model of the image-text retrieval model to obtain a sample image feature of the sample data. The image-text retrieval model is trained according to the sample text feature and the sample image feature.
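The abstract does not state the training objective, but dual text/image encoders of this kind are commonly trained with an in-batch contrastive loss: each text feature should be most similar to its own image feature. A minimal sketch under that assumption, with toy 2-D features standing in for encoder outputs:

```python
import math

# Hypothetical sketch of a dual-encoder contrastive objective: pull
# matching text/image feature pairs together, push mismatches apart.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def contrastive_loss(text_feats, image_feats, temperature=0.1):
    """In-batch softmax loss: text i should match image i."""
    loss = 0.0
    for i, t in enumerate(text_feats):
        logits = [cosine(t, v) / temperature for v in image_feats]
        m = max(logits)  # stabilize the log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += -(logits[i] - log_denom)
    return loss / len(text_feats)

text_feats = [[1.0, 0.0], [0.0, 1.0]]   # from the text encoding sub-model
image_feats = [[0.9, 0.1], [0.1, 0.9]]  # from the image encoding sub-model
print(round(contrastive_loss(text_feats, image_feats), 4))
```

A well-aligned batch yields a loss near zero; permuting the images so pairs no longer match drives it up, which is the gradient signal the encoders would train on.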

Generating sentiment metrics using emoji selections
11521149 · 2022-12-06

Methods, devices and systems for measuring emotions expressed through emoji responses to videos are described. An example method includes receiving user input corresponding to an emoji at a selected time, assigning at least one meaning-bearing word to the emoji, wherein the at least one meaning-bearing word has an intended use or meaning that is represented by the emoji, associating a corresponding vector with the at least one meaning-bearing word, wherein the corresponding vector is a vector of a plurality of vectors in a vector space, and aggregating the plurality of vectors to generate an emoji vector that corresponds to the user sentiment.
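The emoji-to-words-to-vector pipeline can be sketched directly. The word vectors and emoji-to-word assignments below are toy stand-ins for a real embedding model; aggregation here is a simple average, which is one plausible reading of "aggregating the plurality of vectors":

```python
# Hypothetical sketch: map an emoji selection to meaning-bearing words,
# look up toy word vectors, and average them into an emoji vector.
WORD_VECTORS = {               # stand-in for a trained word-embedding model
    "joy":   [0.9, 0.1],
    "happy": [0.8, 0.2],
    "anger": [-0.7, 0.3],
}
EMOJI_WORDS = {"😂": ["joy", "happy"], "😠": ["anger"]}

def emoji_vector(emoji):
    vectors = [WORD_VECTORS[w] for w in EMOJI_WORDS[emoji]]
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]

print(emoji_vector("😂"))
```

Averaging in a shared vector space lets emoji responses collected at different timestamps be compared and pooled into per-moment sentiment metrics for the video.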

INFORMATION PROCESSING APPARATUS, NON-TRANSITORY COMPUTER READABLE MEDIUM STORING PROGRAM, AND INFORMATION PROCESSING METHOD
20220383023 · 2022-12-01

An information processing apparatus includes a processor configured to acquire a text recognition result including a text string included in an image and position information of the text string in the image, display the text string included in the text recognition result, and specify, in a case where the displayed text string is corrected, position information corresponding to the corrected text string, among pieces of the position information associated with each text string included in the text recognition result.
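The correction-to-position mapping might be sketched as follows; the OCR result structure, strings, and bounding boxes are invented for illustration:

```python
# Hypothetical sketch: an OCR result pairs each recognized string with
# its position in the image; when the user corrects a displayed string,
# recover the position information that belongs to the corrected text.
ocr_result = [
    {"text": "Invoise", "box": (10, 20, 80, 40)},   # OCR misread
    {"text": "Total",   "box": (10, 60, 60, 80)},
]

def position_of_correction(result, original, corrected):
    """Apply the user's correction and return its position info."""
    for entry in result:
        if entry["text"] == original:
            entry["text"] = corrected
            return entry["box"]
    return None

print(position_of_correction(ocr_result, "Invoise", "Invoice"))
# (10, 20, 80, 40)
```

Keeping the box keyed to the entry rather than the string means the position survives the edit, so a corrected string can still be highlighted at its original location in the image.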

INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING SYSTEM, AND METHOD OF INFORMATION PROCESSING
20220382804 · 2022-12-01

An information processing apparatus includes circuitry that outputs a search request using first item information as a search key to a search engine, the first item information corresponding to a first item name included in a character string group extracted from form image data, and identifies second item information corresponding to a second item name included in the character string group, based on a search result acquired from the search engine.
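A toy sketch of the idea: use a reliably extracted first item (say, a company name) as a search key, then match the search result against the remaining strings extracted from the form to identify a second item (say, the address). The search engine, index contents, and field names are all stand-ins:

```python
# Hypothetical sketch: identify a second form field by cross-checking
# search-engine results for a known first field against the extracted text.
def fake_search_engine(query):
    """Stand-in for a real search API."""
    index = {"Acme Corp": {"address": "1 Acme Way", "phone": "555-0100"}}
    return index.get(query, {})

def identify_second_item(extracted_strings, first_item):
    result = fake_search_engine(first_item)
    for value in result.values():
        if value in extracted_strings:
            return value     # a search-result value also seen on the form
    return None

strings = ["Acme Corp", "Invoice #42", "1 Acme Way"]
print(identify_second_item(strings, "Acme Corp"))  # 1 Acme Way
```

The cross-check against the extracted character string group is what distinguishes this from plain lookup: the search result only confirms an item that actually appears on the form.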

INFORMATION RETRIEVAL SYSTEM AND METHOD OF INFORMATION RETRIEVAL

A system for retrieving information from an instructional document includes a processor configured to: receive a query from a user; compare the query with one or more text sections in the instructional document; obtain, from the one or more text sections, top x text sections relevant to the query using a pre-trained encoder; compare the query with one or more images in the instructional document; obtain, from the images, top y images relevant to the query; generate top y image-text sections based on the top y images; obtain top k sections from the top x text sections and the top y image-text sections; obtain one or more most relevant sections from the top k sections using a domain-specific pre-trained encoder and a sequential classification model; and generate an answer based on the one or more most relevant sections and device context information using the domain-specific pre-trained encoder.
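The merge step of this pipeline, pooling top-x text sections with top-y image-derived sections into a top-k candidate set, might be sketched as below. Scores are assumed to come from the encoders mentioned in the abstract; section IDs and values are invented:

```python
# Hypothetical sketch of the pipeline's merge step: pool the top-x text
# sections and top-y image-text sections (scored elsewhere by encoders)
# and keep the top-k candidates for the final reranking stage.
def top_k_sections(text_scores, image_text_scores, x, y, k):
    """*_scores: list of (section_id, relevance) pairs."""
    top_x = sorted(text_scores, key=lambda s: -s[1])[:x]
    top_y = sorted(image_text_scores, key=lambda s: -s[1])[:y]
    pooled = {}
    for sec, score in top_x + top_y:       # keep the best score per section
        pooled[sec] = max(score, pooled.get(sec, float("-inf")))
    return sorted(pooled, key=lambda s: -pooled[s])[:k]

text_scores = [("s1", 0.9), ("s2", 0.4), ("s3", 0.7)]
image_text_scores = [("s3", 0.8), ("s4", 0.6)]
print(top_k_sections(text_scores, image_text_scores, x=2, y=1, k=3))
# ['s1', 's3']
```

Deduplicating by section ID matters because, as here with "s3", the same section can surface through both the text route and the image route.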

DYNAMIC FILE IDENTIFICATION FROM SCREEN SHARING

A computer-implemented method, a computer system and a computer program product dynamically identify a shared document or a similar document. The method includes establishing a screen sharing session, where a presenter computing device transmits a first image of a shared document to a participant computing device. The method also includes generating one or more search parameters by scanning the first image using optical character recognition or object recognition in response to receiving the first image at the participant computing device. The method further includes performing a search of one or more memories accessible to the participant computing device using the generated one or more search parameters. Finally, the method includes displaying a prioritized list of search results.
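The search-and-prioritize step might be sketched as term-overlap scoring between OCR-derived search parameters and locally stored documents. The file names, contents, and scoring scheme below are all invented:

```python
# Hypothetical sketch: derive search parameters from text recognized in
# a shared-screen image, then rank locally stored documents by how many
# recognized terms each one contains.
def search_local_files(files, ocr_terms):
    """files: {name: content}; ocr_terms: words OCR'd from the image."""
    terms = {t.lower() for t in ocr_terms}
    scored = []
    for name, content in files.items():
        overlap = len(terms & set(content.lower().split()))
        if overlap:
            scored.append((overlap, name))
    # highest term overlap first -> the "prioritized list" of results
    return [name for overlap, name in sorted(scored, reverse=True)]

files = {
    "q3_report.txt": "quarterly revenue report for q3",
    "notes.txt": "meeting notes",
}
print(search_local_files(files, ["Quarterly", "Revenue"]))
# ['q3_report.txt']
```

Lower-casing both sides makes the match robust to OCR and filename casing; a real implementation would presumably add fuzzier matching to tolerate recognition errors.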