Patent classifications
G06V30/191
LOW POWER MACHINE LEARNING USING REAL-TIME CAPTURED REGIONS OF INTEREST
Systems and methods are described for generating image content. The systems and methods may include, in response to receiving a request to cause a sensor of a computing device to identify image content associated with optical data captured by the sensor, detecting a first sensor data stream having a first image resolution, and detecting a second sensor data stream having a second image resolution. The systems and methods may also include identifying, by processing circuitry of the computing device, at least one region of interest in the first sensor data stream, determining cropping coordinates that define a first plurality of pixels in the at least one region of interest in the first sensor data stream, and generating a cropped image representing the at least one region of interest.
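A minimal sketch of the cropping step described above; the function names and the intensity-threshold rule for locating a region of interest are illustrative assumptions, not the patent's method. A low-resolution frame is scanned for a region of interest, cropping coordinates are computed, and the corresponding pixels are extracted.

```python
def find_roi(frame, threshold=128):
    """Return cropping coordinates (top, left, bottom, right) bounding the
    pixels above `threshold`, or None if no pixel qualifies.
    `frame` is a 2-D list of pixel intensities."""
    rows = [r for r, row in enumerate(frame) if any(p > threshold for p in row)]
    cols = [c for c in range(len(frame[0]))
            if any(row[c] > threshold for row in frame)]
    if not rows or not cols:
        return None
    return (rows[0], cols[0], rows[-1] + 1, cols[-1] + 1)

def crop(frame, coords):
    """Extract the plurality of pixels defined by the cropping coordinates."""
    top, left, bottom, right = coords
    return [row[left:right] for row in frame[top:bottom]]

low_res = [
    [0, 0,   0,   0],
    [0, 200, 210, 0],
    [0, 190, 205, 0],
    [0, 0,   0,   0],
]
coords = find_roi(low_res)        # (1, 1, 3, 3)
cropped = crop(low_res, coords)   # 2x2 block of bright pixels
```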
DYNAMIC CAPTURE PARAMETER PROCESSING FOR LOW POWER
In one general aspect, a method can include capturing, using an image sensor, a first raw image at a first resolution, converting the first raw image to a digitally processed image using an image signal processor, and analyzing at least a portion of the digitally processed image based on a processing condition. The method can include determining that the first resolution does not satisfy the processing condition; and triggering capture of a second raw image at the image sensor at a second resolution greater than the first resolution.
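The escalation logic can be sketched as follows; the specific processing condition (a minimum feature size in pixels) and the resolution-doubling rule are hypothetical stand-ins for whatever condition and capture parameters an implementation would use.

```python
def needs_recapture(min_feature_px, processing_min_px=24):
    # Hypothetical processing condition: a target feature must span at
    # least `processing_min_px` pixels for downstream analysis to succeed.
    return min_feature_px < processing_min_px

def choose_resolution(current_width, condition_satisfied):
    # Trigger capture of a second raw image at a greater resolution when
    # the processing condition is not satisfied.
    return current_width if condition_satisfied else current_width * 2

width = 640
ok = not needs_recapture(min_feature_px=12)
new_width = choose_resolution(width, ok)   # escalates to 1280
```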
WINE PRODUCT POSITIONING METHOD, WINE PRODUCT INFORMATION MANAGEMENT METHOD AND APPARATUS, DEVICE, AND STORAGE MEDIUM
Disclosed are a wine product positioning method, a wine product information management method and apparatus, a computer device, and a computer-readable storage medium. Based on a preset camera in a wine cellar, a wine product image captured by the preset camera and corresponding to a target wine product is acquired (S21). Based on a preset wine label recognition method combining optical character recognition (OCR) and deep learning recognition, the wine product image is recognized to obtain a wine label corresponding to the wine product image (S22). A preset capture position corresponding to the camera is acquired, and the preset capture position is taken as a current position corresponding to the target wine product (S23). A position corresponding to the target wine product is described by using the wine label and the current position, to position the target wine product (S24).
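A hypothetical sketch of steps S23 and S24: the preset capture position of the camera is looked up and combined with the recognized wine label to describe the target wine product's position. The camera ids, positions, and label below are invented for illustration; the OCR/deep-learning recognition of S22 is assumed to have already produced the label.

```python
# Preset capture position per camera id (illustrative values).
CAMERA_POSITIONS = {
    "cam-01": "cellar A, rack 3",
    "cam-02": "cellar A, rack 7",
}

def position_wine(camera_id, recognized_label):
    # S23: take the camera's preset capture position as the current position.
    position = CAMERA_POSITIONS[camera_id]
    # S24: describe the target wine product by wine label and current position.
    return {"label": recognized_label, "position": position}

record = position_wine("cam-01", "Chateau Example 2015")
```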
SYSTEM AND METHOD FOR AUTOMATICALLY OBTAINING AND PROCESSING LOGISTICS AND TRANSPORTATION REQUESTS
The present disclosure relates to a system and method for automatically obtaining and processing logistics requests. Embodiments include automatically identifying, using a processor, a logistics request from a logistics request receiver. In response to automatically identifying, embodiments include extracting information from the logistics request and providing the extracted information to an automated quoting validator. Embodiments also include automatically generating at least one quote for the request at the automated quoting validator based upon, at least in part, one or more user defined parameters.
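The quoting step can be sketched as a function of user-defined parameters; the parameter names and the linear pricing formula below are assumptions for illustration only, not the patent's pricing logic.

```python
def generate_quote(distance_km, weight_kg, params):
    """Generate a quote from extracted request data (distance, weight)
    and user-defined parameters (base fee and per-unit rates)."""
    return round(params["base_fee"]
                 + distance_km * params["rate_per_km"]
                 + weight_kg * params["rate_per_kg"], 2)

quote = generate_quote(120, 40, {"base_fee": 25.0,
                                 "rate_per_km": 0.8,
                                 "rate_per_kg": 0.5})
# 25.0 + 96.0 + 20.0 = 141.0
```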
INFORMATION PROCESSING SYSTEM, INFORMATION PROCESSING DEVICE, AND INFORMATION PROCESSING METHOD
An information processing system includes an imaging unit that generates an image signal by imaging, and an information processing device. The information processing device performs at least one of a plurality of kinds of image processing on a captured image corresponding to the image signal. The information processing device specifies an object corresponding to a partial image included in the captured image on the basis of a state of that object or a degree of reliability given to a processing result of the performed image processing.
VIDEO PROCESSING OPTIMIZATION AND CONTENT SEARCHING
Techniques are disclosed for automatic scene detection and character extraction. In one example, audiovisual content with video frames, an audio recording, and timing information is received. A score based on each frame's visual characteristics is determined for a first frame and for subsequent frames. The first frame's score and the subsequent frames' scores are compared to determine whether the difference between the scores is above a threshold. When the difference in scores is above the threshold, the subsequent frame is classified as a new scene. The audiovisual content is segmented into scenes, and textual characters are identified in at least one frame from each scene. The characters are stored and indexed in a searchable database with the timing information for the scene in which the characters were identified. The audio recording is transcribed, and the transcribed words are stored and indexed in the searchable database with timing information.
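The scene-segmentation logic described above can be sketched as follows; the visual-characteristic score (mean pixel intensity) and the threshold value are illustrative stand-ins for whatever characteristics an implementation would actually score.

```python
def frame_score(frame):
    # Stand-in visual-characteristic score: mean pixel intensity.
    return sum(frame) / len(frame)

def segment_scenes(frames, threshold=50):
    """Classify a frame as the start of a new scene when its score differs
    from the previous frame's score by more than `threshold`.
    Returns a list of scenes, each a list of frame indices."""
    scenes, current = [], [0]
    prev = frame_score(frames[0])
    for i, frame in enumerate(frames[1:], start=1):
        score = frame_score(frame)
        if abs(score - prev) > threshold:
            scenes.append(current)   # close the previous scene
            current = []
        current.append(i)
        prev = score
    scenes.append(current)
    return scenes

frames = [[10] * 4, [12] * 4, [200] * 4, [198] * 4]
print(segment_scenes(frames))   # [[0, 1], [2, 3]]
```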
METHODS, SYSTEMS, AND MEDIA FOR GENERATING VIDEO CLASSIFICATIONS USING MULTIMODAL VIDEO ANALYSIS
Methods, systems, and media for generating video classifications using multimodal video analysis are provided. In some embodiments, a method for classifying videos comprises: receiving, from a computing device, a video identifier; parsing a video associated with the video identifier into an audio portion and a plurality of image frames; analyzing the plurality of image frames using (i) an optical character recognition technique to obtain first textual information corresponding to text appearing in at least one of the plurality of image frames and (ii) an image classifier to obtain, for each of a plurality of objects appearing in at least one of the plurality of frames of the video, a probability that the object falls within an image class; concurrently with analyzing the plurality of image frames, analyzing the audio portion of the video using an automated speech recognition technique to obtain second textual information corresponding to words spoken in the video; combining the first textual information, the probability of each of the plurality of objects, and the second textual information to obtain a combined analysis output for the video; determining, using a neural network that takes the combined analysis output as input, a safety score for each of a plurality of categories indicating whether the video contains content belonging to that category; and, in response to receiving the video identifier, transmitting a plurality of safety scores corresponding to the plurality of categories to the computing device.
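A much-simplified sketch of the combination and scoring steps: OCR text, per-object class probabilities, and speech-recognition text are merged into one record that is then scored per category. The keyword-matching rule below merely stands in for the neural network, and all names and thresholds are assumptions.

```python
def combine(ocr_text, object_probs, asr_text):
    """Merge first textual information (OCR), per-object probabilities,
    and second textual information (ASR) into a combined analysis output."""
    return {
        "text": ocr_text + " " + asr_text,
        "objects": object_probs,   # {object class: probability}
    }

def safety_scores(combined, categories):
    # Stand-in for the neural network: flag a category when either the
    # merged text mentions it or a detected object matches it strongly.
    words = set(combined["text"].lower().split())
    return {c: 1.0 if c in words or combined["objects"].get(c, 0) > 0.5 else 0.0
            for c in categories}

combined = combine("BIG SALE", {"weapon": 0.9}, "buy now")
scores = safety_scores(combined, ["weapon", "alcohol"])
# {'weapon': 1.0, 'alcohol': 0.0}
```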
CONVERSION OF TABULAR FORMAT DATA TO MACHINE READABLE TEXT FOR QA OPERATIONS
A system and method for table conversion including converting a table containing text in tabular form to an image; labeling each text area of the image with a bounding box; determining, for each bounding box, position information, semantic information, and image information; reconstructing the image into a graph form having a plurality of nodes, wherein each node represents the bounding box of a text area of the image; inputting at least two nodes into a trained neural network to determine a relative relationship between the at least two nodes; building a knowledge graph using the relative relationship of the at least two nodes; and translating the knowledge graph into machine-readable natural language.
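The graph-reconstruction step can be sketched as follows; a simple positional rule stands in for the trained neural network that infers the relative relationship between node pairs, and all names and table contents are illustrative.

```python
def relation(pos_a, pos_b):
    """Stand-in for the trained neural network: infer the relative
    relationship between two bounding boxes from their grid positions."""
    (ra, ca), (rb, cb) = pos_a, pos_b
    if ra == 0 and rb > 0 and ca == cb:
        return "header_of"   # a is the column header of b
    return None

def build_graph(boxes):
    """`boxes` maps a text area's content -> (row, col) of its bounding
    box. Returns knowledge-graph edges (node, relation, node)."""
    nodes = list(boxes)
    edges = []
    for a in nodes:
        for b in nodes:
            rel = relation(boxes[a], boxes[b])
            if rel:
                edges.append((a, rel, b))
    return edges

edges = build_graph({"Item": (0, 0), "Price": (0, 1), "12.50": (1, 1)})
# [('Price', 'header_of', '12.50')]
```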
SYSTEM AND METHOD FOR APPLYING DEEP LEARNING TOOLS TO MACHINE VISION AND INTERFACE FOR THE SAME
This invention overcomes disadvantages of the prior art by providing a vision system and method of use, and a graphical user interface (GUI), which employs a camera assembly having an on-board processor of low to modest processing power. At least one vision system tool analyzes image data, and generates results therefrom, based upon a deep learning process. A training process provides training image data to a processor remote from the on-board processor to cause generation of the vision system tool therefrom, and provides a stored version of the vision system tool for runtime operation on the on-board processor. The GUI allows manipulation of thresholds applicable to the vision system tool and refinement of training of the vision system tool by the training process. A scoring process allows unlabeled images from a set of acquired and/or stored images to be selected automatically for labelling as training images using a computed confidence score.
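The confidence-based selection of unlabeled images can be sketched as follows; the image ids, scores, and threshold are illustrative, and how the confidence score is computed is left to the vision system tool.

```python
def select_for_labelling(image_scores, threshold=0.6):
    """`image_scores` maps image id -> confidence score computed for that
    image. Images the current tool is least confident about are selected
    automatically for labelling as training images."""
    return sorted(name for name, score in image_scores.items()
                  if score < threshold)

picked = select_for_labelling({"img_001": 0.95, "img_002": 0.41,
                               "img_003": 0.58, "img_004": 0.88})
# ['img_002', 'img_003']
```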
IDENTIFICATION ASSISTANCE SYSTEM, IDENTIFICATION ASSISTANCE CLIENT, IDENTIFICATION ASSISTANCE SERVER, AND IDENTIFICATION ASSISTANCE METHOD
The present invention aims to provide an identification assistance system, an identification assistance client, and an identification assistance method that enable the user to identify drugs accurately and easily. In the identification assistance system according to an aspect of the present invention, first text which is the result of voice recognition is corrected, and thus errors of the voice recognition can be corrected. In addition, the first text is corrected with reference to a drug search dictionary having learned expressions used for drug identification, and thus expressions unique to drug identification can be taken into consideration. The user can perform a search not only by using the code and/or the name of the drug but also by speaking aloud the external appearance information on the drug. Thus, even if the code and the name are unknown, the user can perform a search by using the external appearance information.
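The correction of voice-recognition output against a drug search dictionary can be sketched with Python's `difflib`; fuzzy string matching stands in for the learned dictionary of expressions used for drug identification, and the dictionary entries are invented for illustration.

```python
import difflib

# Illustrative stand-in for the drug search dictionary of learned
# expressions (codes, names, and external appearance information).
DRUG_DICTIONARY = ["aspirin", "ibuprofen", "white round tablet"]

def correct(first_text, dictionary=DRUG_DICTIONARY, cutoff=0.6):
    """Correct first text (the voice-recognition result) against the
    dictionary; return it unchanged when nothing matches closely."""
    matches = difflib.get_close_matches(first_text, dictionary,
                                        n=1, cutoff=cutoff)
    return matches[0] if matches else first_text

print(correct("asprin"))   # -> aspirin
```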