Patent classifications
G06V30/148
TERM WEIGHT GENERATION METHOD, APPARATUS, DEVICE AND MEDIUM
A term weight determination method includes: obtaining a video and video-associated text, the video-associated text including at least one term; generating an intermediate vector of the at least one term by performing multimodal feature fusion on features of the video, the video-associated text, and the at least one term; and generating a weight of the at least one term based on the intermediate vector of the at least one term.
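A minimal sketch of the pipeline the abstract describes: fuse per-term features with video-level and text-level features into one intermediate ("halfway") vector, then map that vector to a scalar weight. The feature dimensions and the fusion rule (concatenation, dot product, sigmoid) are illustrative assumptions, not the patent's actual method.

```python
import math

def fuse_features(video_feat, text_feat, term_feat):
    """Multimodal fusion by concatenation -> one intermediate vector per term."""
    return video_feat + text_feat + term_feat  # list concatenation

def term_weight(intermediate, proj):
    """Project the intermediate vector to a scalar and squash to (0, 1)."""
    score = sum(x * w for x, w in zip(intermediate, proj))
    return 1.0 / (1.0 + math.exp(-score))

video_feat = [0.2, 0.4]   # e.g. pooled frame embeddings (hypothetical)
text_feat = [0.1, 0.3]    # embedding of the whole video-associated text
term_feat = [0.5, 0.0]    # embedding of one term in that text

vec = fuse_features(video_feat, text_feat, term_feat)
w = term_weight(vec, proj=[1.0] * len(vec))
```

In practice the fusion and projection would be learned layers; the fixed projection here only shows the data flow from fused vector to per-term weight.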
INFORMATION PROCESSING APPARATUS, COMPUTER-READABLE STORAGE MEDIUM, AND INFORMATION PROCESSING METHOD
According to an embodiment, an information processing apparatus includes a second recognition unit, an information processing unit, and an information output unit. The second recognition unit recognizes, by second recognition processing, a destination of an article whose destination was not recognized by first recognition processing performed by a first recognition unit. The information processing unit generates recognition processing information proving that the second recognition processing has been executed by the second recognition unit. The information output unit outputs the recognition processing information.
Information processing apparatus, information processing method, and program
To provide an information processing apparatus, an information processing method, and a program that make it possible to suitably provide three-dimensional property information. A floor-plan identifying unit that generates floor plan information on the basis of a floor plan image and a model generating unit that generates a three-dimensional model using the floor plan information are included. The floor-plan identifying unit includes: a line-segment detecting unit that detects a line segment corresponding to a wall on the floor plan; a segmentation processing unit that identifies a room region corresponding to a room on the floor plan; a character recognizing unit that recognizes a character string included in the floor plan image; a fixture detecting unit that detects a fixture sign included in the floor plan image; and an integration unit that identifies the type of room of the room region and complements the room structure. The model generating unit includes an estimating unit that estimates a scale of the floor plan and a generating unit that generates a three-dimensional model of the real-estate property on the basis of the floor plan identified from the floor plan information, the scale, and an estimated ceiling height.
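The scale-estimation step can be sketched as follows: if character recognition reads a dimension annotation (here assumed to be "3640" millimetres) along a wall whose detected line segment spans a known pixel length, the scale is millimetres per pixel, and room regions can then be converted to real-world extents. The function names and the millimetre unit are our assumptions for illustration.

```python
def estimate_scale(annotated_mm, segment_px):
    """Millimetres represented by one image pixel."""
    return annotated_mm / segment_px

def room_size_mm(width_px, depth_px, scale):
    """Convert a room region's pixel extents to real-world extents."""
    return width_px * scale, depth_px * scale

# A recognized "3640" annotation along a wall segment 182 px long.
scale = estimate_scale(annotated_mm=3640, segment_px=182)
w_mm, d_mm = room_size_mm(width_px=150, depth_px=120, scale=scale)
```

Combined with an estimated ceiling height, such extents are what a generating unit would need to extrude a three-dimensional model.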
Cross-platform content muting
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, facilitate cross-platform content muting. Methods include detecting a request from a user to remove, from a user interface, a media item that is provided by a first content source and presented on a first platform. One or more tags that represent the media item are determined. These tags, which indicate that the user removed the media item represented by the one or more tags from presentation on the first platform, are stored in a storage device. Subsequently, content provided by a second content source (different from the first content source) on a second platform (different from the first platform) is prevented from being presented. This content is prevented from being presented based on a tag representing the content matching the one or more tags stored in the storage device.
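The tag-matching mechanism above can be sketched briefly: removal of a media item on the first platform stores its tags; any later item, from any source or platform, whose tags intersect the stored set is prevented from being presented. An in-memory set stands in for the storage device; the tag strings are invented for illustration.

```python
muted_tags = set()  # stands in for the storage device

def mute(item_tags):
    """Record the tags of an item the user removed on the first platform."""
    muted_tags.update(item_tags)

def should_present(item_tags):
    """An item is blocked if any of its tags matches a stored (muted) tag."""
    return muted_tags.isdisjoint(item_tags)

mute({"brand:acme", "campaign:spring"})              # removed on platform 1
ok = should_present({"brand:other"})                 # platform 2, unrelated item
blocked = should_present({"brand:acme", "fmt:vid"})  # platform 2, same brand
```

Because the check is on tags rather than item identity, the mute carries across content sources and platforms, which is the cross-platform property the abstract claims.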
Content capturing system and content capturing method
A content capturing system is suitable for capturing content in an image of a document. The content capturing system includes a processor and a storage device. The processor accesses a program stored in the storage device to implement a cutting module and a processing module. The cutting module receives a corrected image. The content in the corrected image includes a plurality of text areas, and the cutting module inputs the corrected image or a first text area into a convolutional neural network. The convolutional neural network outputs the coordinates of the first text area. The cutting module cuts the first text area according to the coordinates of the first text area. The cutting module inputs the cut first text area into a text recognition system and obtains a plurality of first characters in the first text area from the text recognition system.
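The cutting step can be sketched as follows: a detector (the convolutional neural network in the abstract) returns a bounding box for the first text area, and the cutting module crops that box from the corrected image before handing the crop to a text recognition system. The image is modelled here as a list of pixel rows, and the (x, y, width, height) box format is an assumption.

```python
def cut_text_area(image, box):
    """Crop box = (x, y, w, h) out of a row-major image."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]

# Toy corrected image; letters mark the detected first text area.
image = [list(row) for row in ["....", ".AB.", ".CD.", "...."]]
crop = cut_text_area(image, box=(1, 1, 2, 2))  # coordinates from the detector
```

The crop is what would be passed to the text recognition system to obtain the first characters.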
Media management system for video data processing and adaptation data generation
In various embodiments, methods and systems for implementing a media management system, for video data processing and adaptation data generation, are provided. At a high level, a video data processing engine relies on different types of video data properties and additional auxiliary data resources to perform video optical character recognition operations for recognizing characters in video data. In operation, video data is accessed to identify recognized characters. A video OCR operation to perform on the video data for character recognition is determined from video character processing and video auxiliary data processing. Video auxiliary data processing includes processing an auxiliary reference object; the auxiliary reference object is an indirect reference object that is a derived input element used as a factor in determining the recognized characters. The video data is processed based on the video OCR operation and based on processing the video data, at least one recognized character is communicated.
MULTI-ANCHOR BASED EXTRACTION, RECOGNITION, AND MACHINE LEARNING OF USER INTERFACE (UI)
Multiple anchors may be utilized for robotic process automation (RPA) of a user interface (UI). The multiple anchors may be utilized to determine relationships between elements in the captured image of the UI for RPA. The results of the anchoring may be utilized for training or retraining of a machine learning (ML) component.
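One plausible reading of multi-anchor localization, sketched under our own assumptions: each anchor element stores a relative offset to the target element, and the target's position in a new screen capture is estimated from several anchors at once (here, by averaging their predictions), making the lookup robust if any single anchor shifts.

```python
def locate_target(anchors, offsets):
    """Estimate a target UI element's position from several (anchor, offset) pairs.

    anchors: (x, y) positions of anchor elements in the captured image.
    offsets: stored (dx, dy) from each anchor to the target.
    """
    preds = [(ax + dx, ay + dy) for (ax, ay), (dx, dy) in zip(anchors, offsets)]
    n = len(preds)
    return (sum(p[0] for p in preds) / n, sum(p[1] for p in preds) / n)

# Two anchors (e.g. a label and a toolbar icon) agreeing on the same target.
target = locate_target(anchors=[(10, 10), (100, 10)],
                       offsets=[(5, 40), (-85, 40)])
```

Disagreement between the per-anchor predictions could also serve as a signal for retraining the ML component the abstract mentions.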
Determining payment details based on contextual and historical information
A computing system may be configured to determine payment details for funds transfers from unstructured sets of data. The system may maintain historical transaction information associated with each of a plurality of users. The system may receive data associated with a transaction. The system may identify a user associated with the data, wherein the user is one of the plurality of users. The system may determine, based on contextual information of the data and the historical transaction information associated with the user, payment details about the transaction. The payment details may include a payment amount and one or more recipients. The system may execute a funds transfer for the payment amount from a source account associated with the user to one or more destination accounts associated with the one or more recipients.
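A hedged sketch of the matching idea: combine contextual cues parsed from the unstructured data (an amount and a free-text memo) with the user's historical transactions to pick the likely recipient. The history format and the "most frequent known recipient whose name appears in the memo" rule are illustrative assumptions, not the claimed method.

```python
def determine_payment_details(memo, amount, history):
    """Return (amount, recipient) inferred from memo text and past transfers."""
    counts = {}
    for past in history:
        counts[past["recipient"]] = counts.get(past["recipient"], 0) + 1
    # Prefer a known recipient mentioned in the memo, most frequent first.
    for name, _ in sorted(counts.items(), key=lambda kv: -kv[1]):
        if name.lower() in memo.lower():
            return amount, name
    return amount, None

history = [
    {"recipient": "Acme Utilities", "amount": 80},
    {"recipient": "Acme Utilities", "amount": 82},
    {"recipient": "Jo's Rent", "amount": 1200},
]
details = determine_payment_details("pay acme utilities bill", 81, history)
```

The returned payment details (amount plus recipient) are what the system would use to execute the funds transfer from the user's source account.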
Method and apparatus for improved presentation of information
A method and apparatus for generating a dynamic personalized webpage are disclosed. At least two webpages are loaded in a fashion that is hidden from the user. Content from the at least two webpages is extracted based on being classified as "of interest" by an artificial intelligence algorithm. A dynamic personalized webpage comprising the extracted content is then generated and displayed to the user. In the preferred embodiment, the user's dynamic personalized webpage is filled with advertisements tailored to the user, and the user receives at least some revenue from those advertisements.
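The extraction-and-assembly step can be sketched as a filter over fragments of the hidden-loaded pages, keeping only those a classifier marks "of interest". The classifier here is a stand-in predicate; in the abstract it would be the artificial intelligence algorithm.

```python
def build_personalized_page(pages, classify):
    """Assemble a page from fragments the classifier marks 'of interest'.

    pages: list of webpages, each a list of content fragments.
    classify: predicate standing in for the AI 'of interest' classifier.
    """
    return [frag for page in pages for frag in page if classify(frag)]

# Two hidden-loaded pages with toy fragments; keep only news items.
pages = [["ad:car", "news:ai"], ["news:sports", "ad:soda"]]
personalized = build_personalized_page(pages, lambda f: f.startswith("news:"))
```

Tailored advertisements would then be interleaved into the assembled fragment list before display.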
Object collating device and object collating method
It is an object of the present invention to provide an object collating device and an object collating method that enable matching of images of a dividable medical article with desirable accuracy and easy confirmation of matching results. In the object collating device according to the first aspect, when the object is determined to be divided, the first image for matching is collated with the second image for matching (an image of the object in the undivided state), so that the region to be matched is not narrowed, and matching of the images of the dividable medical article is achieved with desirable accuracy. In addition, since the first and second display processing are performed on the images for display determined to contain objects of the same type, matching results can easily be confirmed.