G06V30/274

ANALYTIC IMAGE FORMAT FOR VISUAL COMPUTING

In one embodiment, an apparatus comprises a storage device and a processor. The storage device stores a plurality of images captured by a camera. The processor: accesses visual data associated with an image captured by the camera; determines a tile size parameter for partitioning the visual data into a plurality of tiles; partitions the visual data into the plurality of tiles based on the tile size parameter, wherein the plurality of tiles corresponds to a plurality of regions within the image; compresses the plurality of tiles into a plurality of compressed tiles, wherein each tile is compressed independently; generates a tile-based representation of the image, wherein the tile-based representation comprises an array of the plurality of compressed tiles; and stores the tile-based representation of the image on the storage device.
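The partition-and-compress flow described above can be illustrated with a minimal sketch. It assumes an 8-bit grayscale image held as row-major bytes and uses zlib as a stand-in codec; the function names `tile_image` and `read_tile` are hypothetical, and the abstract does not specify a compression scheme or a storage layout.

```python
import zlib


def tile_image(pixels, width, height, tile_size):
    """Split a row-major 8-bit image into a 2D array of compressed tiles.

    Every tile is compressed on its own, so one region can later be
    decoded without touching the rest of the image.
    """
    tiles = []
    for top in range(0, height, tile_size):
        bottom = min(top + tile_size, height)  # edge tiles may be smaller
        tile_row = []
        for left in range(0, width, tile_size):
            right = min(left + tile_size, width)
            # Gather the tile's pixels row by row from the full image.
            raw = b"".join(
                pixels[y * width + left : y * width + right]
                for y in range(top, bottom)
            )
            tile_row.append(zlib.compress(raw))
        tiles.append(tile_row)
    return tiles


def read_tile(tiles, tile_row, tile_col):
    """Decompress a single tile from the tile-based representation."""
    return zlib.decompress(tiles[tile_row][tile_col])


# A 4x4 image with pixel values 0..15, split into 2x2 tiles.
tiles = tile_image(bytes(range(16)), width=4, height=4, tile_size=2)
top_right = read_tile(tiles, 0, 1)
```

Because the tiles are independent, reading the top-right region decompresses only that one tile, which is the property the tile-based representation is built around.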

Visually guided machine-learning language model

Visually guided machine-learning language model and embedding techniques are described that overcome the challenges of conventional techniques in a variety of ways. In one example, a model is trained to support a visually guided machine-learning embedding space that supports visual intuition as to “what” is represented by text. The visually guided language embedding space supported by the model, once trained, may then be used to support visual intuition as part of a variety of functionality. In one such example, the visually guided language embedding space as implemented by the model may be leveraged as part of a multi-modal differential search to support search of digital images and other digital content with real-time focus adaptation, which overcomes the challenges of conventional techniques.

FLOORPLAN GENERATION BASED ON ROOM SCANNING

Various implementations disclosed herein include devices, systems, and methods that generate floorplans and measurements using a three-dimensional (3D) representation of a physical environment generated based on sensor data.

Dynamic radiometric thermal imaging compensation

Systems and methods for dynamic radiometric thermal imaging compensation. The method includes analyzing a visible light image to determine an emissivity value for each of a plurality of visible light pixels making up the visible light image. The method includes associating each of the plurality of thermal pixels making up a thermal image corresponding to the visible light image with at least one of the plurality of visible light pixels making up the visible light image. The method includes generating a second thermal image by, for each of the plurality of thermal pixels making up the thermal image, determining a temperature value based on the thermal pixel value of the thermal pixel and the emissivity value of the at least one of the plurality of visible light pixels associated with the thermal pixel.
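The per-pixel compensation step can be sketched as follows. This is only an illustration under several assumptions that the abstract does not state: each thermal pixel value is taken to be an apparent temperature in kelvin computed as if emissivity were 1, reflected ambient radiation is ignored, the correction uses the simplified Stefan-Boltzmann relation T_object = T_apparent / emissivity^(1/4), and each thermal pixel is associated with a square block of visible light pixels whose emissivities are averaged. The name `compensate` and the `scale` parameter are hypothetical.

```python
def compensate(thermal_k, emissivity, scale):
    """Build a second thermal image corrected for per-pixel emissivity.

    thermal_k:  2D list of apparent temperatures in kelvin (assumed input).
    emissivity: 2D list of emissivity values, one per visible light pixel;
                it is `scale` times larger than thermal_k along each axis.
    """
    corrected = []
    for ty, row in enumerate(thermal_k):
        corrected_row = []
        for tx, t_apparent in enumerate(row):
            # Visible light pixels associated with this thermal pixel.
            block = [
                emissivity[ty * scale + dy][tx * scale + dx]
                for dy in range(scale)
                for dx in range(scale)
            ]
            eps = sum(block) / len(block)  # assumed: average over the block
            # Assumed simplified model: radiated power scales with eps * T^4.
            corrected_row.append(t_apparent / eps ** 0.25)
        corrected.append(corrected_row)
    return corrected


# One thermal pixel covering a 2x2 block of visible light pixels.
second_image = compensate(
    thermal_k=[[300.0]],
    emissivity=[[0.0625, 0.0625], [0.0625, 0.0625]],
    scale=2,
)
```

A production radiometric model would also account for reflected ambient temperature and atmospheric transmission, which this sketch leaves out.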

EXTRACTION OF GENEALOGY DATA FROM OBITUARIES

Systems, methods, and other techniques for extracting data from obituaries are provided. In some embodiments, an obituary containing a plurality of words is received. Using a machine learning model, an entity tag from a set of entity tags may be assigned to each of one or more words of the plurality of words. Each particular tag from the set of entity tags may include a relationship component and a category component. The relationship component may indicate a relationship between a particular word and the deceased individual. The category component may indicate a categorization of the particular word to a particular category from a set of categories. The extracted data may be stored in a genealogical database.
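The two-component tag structure can be sketched as a small data type. The machine learning model itself is not shown; the example assumes its output is already available as (word, tag) pairs, and the tag values, the class name `EntityTag`, and the helper `group_tagged_words` are hypothetical illustrations rather than the tag set used in the disclosure.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class EntityTag:
    relationship: str  # relation of the word to the deceased, e.g. "spouse"
    category: str      # what kind of information the word is, e.g. "name"


def group_tagged_words(tagged_words):
    """Merge consecutive words that share a tag into one extracted record.

    tagged_words is a list of (word, EntityTag or None) pairs; words
    tagged None carry no genealogical information and are skipped.
    """
    records = []
    for tag, group in groupby(tagged_words, key=lambda pair: pair[1]):
        if tag is None:
            continue
        text = " ".join(word for word, _ in group)
        records.append((tag.relationship, tag.category, text))
    return records


# Assumed model output for the phrase "John Smith married Mary".
tagged = [
    ("John", EntityTag("self", "name")),
    ("Smith", EntityTag("self", "name")),
    ("married", None),
    ("Mary", EntityTag("spouse", "name")),
]
records = group_tagged_words(tagged)
```

Keeping the relationship and the category as separate fields lets the same category (a name, a date, a place) be attributed to different relatives without multiplying tag definitions by hand.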

METHODS AND SYSTEMS FOR SEMANTIC SCENE COMPLETION FOR SPARSE 3D DATA
20230105331 · 2023-04-06

Methods and systems for performing semantic scene completion of sparse 3D data are described. A frame of sparse 3D data is preprocessed into a sparse 3D tensor and a sparse 2D tensor. A partially completed 3D tensor is generated from the sparse 3D tensor using a 3D prediction network, and a semantically completed 2D tensor is generated from the sparse 2D tensor using a 2D prediction network. The partially completed 3D tensor is completed to obtain a semantically completed 3D tensor by assigning a given class label, which has been assigned to a given pixel in the semantically completed 2D tensor, to a voxel at a corresponding x-y coordinate in the partially completed 3D tensor.
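The final label-assignment step can be sketched as below. The two prediction networks are not shown. The sketch assumes the partially completed 3D tensor is a binary occupancy grid indexed as [x][y][z], that the semantically completed 2D tensor is a grid of integer class labels indexed as [x][y], and that label 0 means empty; these conventions and the name `complete_voxels` are assumptions made for illustration.

```python
def complete_voxels(partial_3d, labels_2d, empty_label=0):
    """Give every occupied voxel the class label of its x-y column.

    partial_3d[x][y][z] is truthy where the 3D network predicted occupancy.
    labels_2d[x][y] is the class label predicted by the 2D network.
    """
    completed = []
    for x, plane in enumerate(partial_3d):
        completed_plane = []
        for y, column in enumerate(plane):
            label = labels_2d[x][y]
            # Occupied voxels inherit the 2D label; others stay empty.
            completed_plane.append(
                [label if occupied else empty_label for occupied in column]
            )
        completed.append(completed_plane)
    return completed


# A grid with 1 x-position, 2 y-positions, and 2 z-levels.
partial = [[[1, 0], [0, 1]]]
labels = [[3, 5]]
semantic_3d = complete_voxels(partial, labels)
```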

DEEP LEARNING BASED TABLE DETECTION AND ASSOCIATED DATA EXTRACTION FROM SCANNED IMAGE DOCUMENTS

The need for extracting information trapped in unstructured document images is becoming more acute. A major hurdle to this objective is that these images often contain information in the form of tables, and extracting data from tabular sub-images presents a unique set of challenges. Embodiments of the present disclosure provide systems and methods that implement a deep learning network for both table detection and structure recognition, wherein the interdependence between table detection and table structure recognition is exploited to segment out the table and column regions. This is followed by semantic rule-based row extraction from the identified tabular sub-regions.

Information processing apparatus, non-transitory computer readable medium, and character recognition system
11659106 · 2023-05-23

An information processing apparatus includes a processor configured to acquire a result of character recognition of a character string formed on a medium and read by scanning that is subject to character recognition and replace a character or a symbol in a subject with a reference character string that is referred to by the character or the symbol.

A Method for Detecting Correspondence of a Built Structure with a Designed Structure
20230147610 · 2023-05-11

This invention relates to a computer implemented method for detecting correspondence of a built structure with a designed structure. The method comprises the steps of: receiving an unorganized point cloud (PC) of the built structure, the unorganized PC containing a data set with a plurality of data points, each of which has an RGB value associated therewith; receiving an Industry Foundation Classes (IFC) model of the designed structure, the IFC model including an IFC schema file classifying all object types and object characteristics in the IFC model, the object characteristics including a model object RGB value; and integrating the unorganized PC with the IFC model and generating a semantic point cloud by overwriting the RGB value of each data point in the PC with the model object RGB value from the IFC model. In this way, the resulting PC will be easier to evaluate and the evaluation will be more effective.
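The overwrite step can be sketched as follows. The abstract does not say how a data point is matched to a model object, so this example assumes each designed object is reduced to an axis-aligned bounding box with an assigned RGB value. No IFC parsing library is used, and the dictionary keys and the name `build_semantic_point_cloud` are hypothetical.

```python
def build_semantic_point_cloud(points, model_objects):
    """Overwrite each point's scanned RGB with its model object's RGB.

    points:        list of (x, y, z, (r, g, b)) tuples from the scan.
    model_objects: list of dicts with "min" and "max" corners and "rgb";
                   a simplified stand-in for geometry from the IFC model.
    Points outside every object keep their scanned color.
    """
    semantic = []
    for x, y, z, rgb in points:
        for obj in model_objects:
            (x0, y0, z0), (x1, y1, z1) = obj["min"], obj["max"]
            if x0 <= x <= x1 and y0 <= y <= y1 and z0 <= z <= z1:
                rgb = obj["rgb"]  # take the model object RGB value
                break
        semantic.append((x, y, z, rgb))
    return semantic


# One designed wall, colored pure red in the model (assumed values).
wall = {"min": (0.0, 0.0, 0.0), "max": (4.0, 0.2, 3.0), "rgb": (255, 0, 0)}
scan = [
    (1.0, 0.1, 1.5, (120, 118, 110)),  # lies inside the designed wall
    (1.0, 2.0, 1.5, (90, 95, 100)),    # lies outside every designed object
]
semantic_pc = build_semantic_point_cloud(scan, [wall])
```

Under these assumptions, points that keep their scanned color are the ones that match no designed object, which makes candidate deviations straightforward to pick out during evaluation.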
