G06V10/806

Lane detection method and system based on vision and lidar multi-level fusion

A lane detection method based on vision and lidar multi-level fusion includes: calibrating obtained point cloud data and an obtained video image; constructing a point cloud clustering model by fusing height information and reflection intensity information of the point cloud data with RGB information of the video image, obtaining point clouds of a road based on the point cloud clustering model, and obtaining a lane surface as a first lane candidate region by performing least squares fitting on the point clouds; obtaining four-channel road information by fusing the reflection intensity information of the point cloud data and the RGB information of the video image, inputting the four-channel road information into the semantic segmentation network 3D-LaneNet, and outputting an image of a second lane candidate region; and fusing the first lane candidate region and the second lane candidate region into a final lane region.
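
Two steps of this abstract can be illustrated with a minimal sketch: the least squares fit of a surface to road points, and the stacking of reflection intensity with RGB into four-channel input. The plane model z = a*x + b*y + c, the function names, and the assumption that intensity has already been projected onto the image grid are illustrative choices, not details taken from the patent.

```python
import numpy as np

def fit_road_plane(points):
    """Least squares fit of z = a*x + b*y + c to an (N, 3) array of road points.

    The planar model is an assumption made for this sketch.
    """
    design = np.column_stack([points[:, 0], points[:, 1], np.ones(len(points))])
    coeffs, *_ = np.linalg.lstsq(design, points[:, 2], rcond=None)
    return coeffs  # (a, b, c)

def stack_four_channels(rgb, intensity):
    """Append a per-pixel reflection intensity map to an RGB image -> (H, W, 4).

    Assumes the lidar intensity was already projected onto the image grid.
    """
    return np.concatenate([rgb, intensity[..., None]], axis=-1)

# Synthetic road points lying exactly on a gently sloped plane.
rng = np.random.default_rng(0)
xy = rng.uniform(-10, 10, size=(200, 2))
z = 0.02 * xy[:, 0] - 0.01 * xy[:, 1] + 1.5
a, b, c = fit_road_plane(np.column_stack([xy, z]))

rgb = np.zeros((4, 6, 3), dtype=np.float32)
intensity = np.ones((4, 6), dtype=np.float32)
four_channel = stack_four_channels(rgb, intensity)
```

The four-channel array is the kind of tensor a segmentation network could consume; the network itself is outside the scope of this sketch.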

Method and apparatus for remote sensing of objects utilizing radiation speckle
10935364 · 2021-03-02 ·

Disclosed are systems and methods to extract information about the size and shape of an object by observing variations of the radiation pattern caused by illuminating the object with coherent radiation sources and changing the wavelengths of the source. Sensing and image-reconstruction systems and methods are described for recovering the image of an object utilizing projected and transparent reference points and radiation sources such as tunable lasers. Sensing and image-reconstruction systems and methods are also described for rapid sensing of such radiation patterns. A computational system and method is also described for sensing and reconstructing the image from its autocorrelation. This computational approach uses the fact that the autocorrelation is the weighted sum of shifted copies of an image, where the shifts are obtained by sequentially placing each individual scattering cell of the object at the origin of the autocorrelation space.
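
The closing statement about the autocorrelation can be checked numerically. The sketch below uses a one-dimensional real-valued "object" (a simplification chosen here for brevity) and compares the directly computed autocorrelation with the weighted sum of shifted copies, where each copy is shifted so that one scattering cell sits at the origin and is weighted by that cell's value.

```python
import numpy as np

def autocorr_from_shifted_copies(f):
    """Build the autocorrelation of a 1-D real signal as a weighted sum of shifted copies.

    For each cell x0, the whole signal is shifted so that x0 lands on the
    origin (index n - 1 of the output) and is weighted by f[x0].
    """
    n = len(f)
    out = np.zeros(2 * n - 1)
    for x0, weight in enumerate(f):
        if weight == 0:
            continue  # empty cells contribute nothing
        start = n - 1 - x0
        out[start:start + n] += weight * f
    return out

obj = np.array([0.0, 2.0, 0.0, 1.0, 3.0])
direct = np.correlate(obj, obj, mode="full")   # reference autocorrelation
shifted = autocorr_from_shifted_copies(obj)
```

Both arrays hold the same values, which is the property the reconstruction approach relies on.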

ACTION PREDICTION

According to one aspect, action prediction may be implemented via a spatio-temporal feature pyramid graph convolutional network (ST-FP-GCN) including a first pyramid layer, a second pyramid layer, a third pyramid layer, etc. The first pyramid layer may include a first graph convolution network (GCN), a fusion gate, and a first long short-term memory (LSTM) gate. The second pyramid layer may include a first convolution operator, a first summation operator, a first mask pool operator, a second GCN, a first upsampling operator, and a second LSTM gate. An output summation operator may sum a first LSTM output and a second LSTM output to generate an output indicative of an action prediction for an inputted image sequence and an inputted pose sequence.
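
As a rough illustration of the GCN building block only, the sketch below applies one graph convolution to per-joint pose features. It assumes the widely used symmetric normalization with self-loops and a ReLU activation; the abstract does not state which GCN formulation is used, and the fusion gate, mask pooling, and LSTM gates are not modeled.

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 X W).

    This normalization is an assumption for the sketch, not the patent's definition.
    """
    a_hat = adj + np.eye(adj.shape[0])            # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    norm_adj = d_inv_sqrt @ a_hat @ d_inv_sqrt    # symmetric normalization
    return np.maximum(norm_adj @ feats @ weight, 0.0)

# Three pose joints connected in a chain (hypothetical skeleton).
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
feats = rng.normal(size=(3, 2))    # 2 input features per joint
weight = rng.normal(size=(2, 4))   # projects to 4 output features
out = gcn_layer(adj, feats, weight)
```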

Method and system for classifying content using scoring

An interface module obtains content comprising one or more elements. One or more feature vectors are extracted from the content. The one or more feature vectors comprise a feature vector that identifies an element of the one or more elements of the content. A classification scoring module generates one or more classification vectors from the one or more feature vectors. The one or more classification vectors comprise a classification vector that identifies one or more characteristics of the element from the content. The one or more classification vectors are combined and one or more characteristics of the content are identified to form an aggregated vector. A goal of the content is detected by generating a string that describes the content from the aggregated vector. The goal is presented with the content.

METHODS AND SYSTEMS FOR FIRE MONITORING AND EARLY WARNING IN A SMART CITY BASED ON INTERNET OF THINGS

The present disclosure provides a method and system for fire monitoring and early warning in a smart city based on the Internet of Things. The method is executed by a management platform and comprises: obtaining monitoring data collected by object platforms through a sensor network platform, the monitoring data including smoke data, temperature data, image data of a drone, and manual inspection data, the manual inspection data being obtained based on a manual inspection interval; determining a fire risk level based on the monitoring data; in response to the fire risk level meeting a preset condition, sending an alarm to a user platform through a service platform; and in response to the fire risk level not meeting the preset condition, determining the manual inspection interval based on the fire risk level, and sending the manual inspection interval to the service platform and/or the user platform.
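
The branching at the end of this method can be sketched as follows. The threshold value, the 0-to-1 risk scale, and the linear mapping from risk level to inspection interval are all hypothetical; the abstract only states that an alarm is sent when a preset condition is met and that the interval is otherwise derived from the risk level.

```python
from dataclasses import dataclass
from typing import Optional

ALARM_THRESHOLD = 0.8        # hypothetical preset condition on a 0..1 risk scale
MAX_INTERVAL_HOURS = 24.0    # hypothetical interval at zero risk
MIN_INTERVAL_HOURS = 2.0     # hypothetical interval just below the threshold

@dataclass
class Decision:
    send_alarm: bool
    inspection_interval_hours: Optional[float]

def decide(risk_level: float) -> Decision:
    """Send an alarm if the preset condition is met, else schedule the next inspection."""
    if risk_level >= ALARM_THRESHOLD:
        return Decision(send_alarm=True, inspection_interval_hours=None)
    # Assumed rule: higher risk gives a shorter manual inspection interval.
    span = MAX_INTERVAL_HOURS - MIN_INTERVAL_HOURS
    interval = MAX_INTERVAL_HOURS - span * (risk_level / ALARM_THRESHOLD)
    return Decision(send_alarm=False, inspection_interval_hours=interval)

high = decide(0.9)
moderate = decide(0.4)
```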

FEATURE EXTRACTION METHOD AND APPARATUS
20230419646 · 2023-12-28 ·

Embodiments of this disclosure relate to the field of artificial intelligence, and disclose a feature extraction method and apparatus. The method includes: obtaining a to-be-processed object, and obtaining a segmented object based on the to-be-processed object, where the segmented object includes some elements in the to-be-processed object, a first vector indicates the segmented object, and a second vector indicates some elements in the segmented object; performing feature extraction on the first vector to obtain a first feature, and performing feature extraction on the second vector to obtain a second feature; fusing at least two second features based on a first target weight, to obtain a first fused feature; and performing fusion processing on the first feature and the first fused feature to obtain a second fused feature, where the second fused feature is used to obtain a feature of the to-be-processed object.
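
The two fusion steps can be illustrated with a small sketch. It assumes that the "first target weight" is a set of per-feature scalar weights normalized to sum to one, and that the final fusion is element-wise addition; the abstract does not specify either operation.

```python
import numpy as np

def fuse(first_feature, second_features, weights):
    """Weighted fusion of second features, then fusion with the first feature.

    second_features has shape (K, D); weights has length K.
    Normalized weights and additive fusion are assumptions for this sketch.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    first_fused = np.tensordot(w, second_features, axes=1)  # weighted sum over K -> (D,)
    second_fused = first_feature + first_fused
    return first_fused, second_fused

first_feature = np.array([1.0, 1.0])
second_features = np.array([[2.0, 0.0],
                            [0.0, 4.0]])
first_fused, second_fused = fuse(first_feature, second_features, weights=[1.0, 3.0])
```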

Temporally distributed neural networks for video semantic segmentation

A Video Semantic Segmentation System (VSSS) is disclosed that performs accurate and fast semantic segmentation of videos using a set of temporally distributed neural networks. The VSSS receives as input a video signal comprising a contiguous sequence of temporally-related video frames. The VSSS extracts features from the video frames in the contiguous sequence and based upon the extracted features, selects, from a set of labels, a label to be associated with each pixel of each video frame in the video signal. In certain embodiments, a set of multiple neural networks are used to extract the features to be used for video segmentation and the extraction of features is distributed among the multiple neural networks in the set. A strong feature representation representing the entirety of the features is produced for each video frame in the sequence of video frames by aggregating the output features extracted by the multiple neural networks.
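
The distribution of feature extraction over several networks can be sketched as below. The round-robin assignment of frames to sub-networks, the sliding window of the most recent outputs, and aggregation by concatenation are assumptions made for illustration; the random-projection "sub-networks" are placeholders for real models.

```python
from collections import deque

import numpy as np

def make_subnetwork(seed, in_dim, out_dim):
    """Placeholder sub-network: a fixed random projection followed by tanh."""
    w = np.random.default_rng(seed).normal(size=(in_dim, out_dim))
    return lambda frame: np.tanh(frame @ w)

def aggregate_features(frames, subnetworks):
    """Run one sub-network per frame and aggregate the last m outputs per frame."""
    m = len(subnetworks)
    window = deque(maxlen=m)          # keeps the m most recent partial features
    outputs = []
    for t, frame in enumerate(frames):
        window.append(subnetworks[t % m](frame))   # round-robin assignment
        if len(window) == m:
            outputs.append(np.concatenate(list(window), axis=-1))
    return outputs

# Five frames, each flattened to 6 pixels with 3 channels.
frames = [np.full((6, 3), float(t)) for t in range(5)]
subnetworks = [make_subnetwork(seed, in_dim=3, out_dim=4) for seed in range(2)]
features = aggregate_features(frames, subnetworks)
```

Each frame only pays for one lightweight sub-network, while the aggregated representation still draws on every sub-network in the set.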

MAP ELEMENT EXTRACTION METHOD AND APPARATUS, AND SERVER
20210035314 · 2021-02-04 ·

This application discloses a map element extraction method and apparatus, and a server. The map element extraction method includes obtaining a laser point cloud and an image of a target scene, the target scene including a map element; performing registration between the laser point cloud and the image to obtain a depth map of the image; performing image segmentation on the depth map of the image to obtain a segmented image of the map element in the depth map; and converting a two-dimensional location of the segmented image in the depth map to a three-dimensional location of the map element in the target scene according to a registration relationship between the laser point cloud and the image.
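
The registration and the final 2D-to-3D conversion can be sketched with a pinhole camera model. The intrinsic matrix K, the assumption that the laser points are already expressed in the camera frame, and the function names are illustrative and not taken from the application.

```python
import numpy as np

def project_to_depth_map(points_cam, K, height, width):
    """Project (N, 3) camera-frame points into a sparse depth map.

    If several points hit one pixel, the last one written wins; keeping the
    nearest point would be a natural refinement.
    """
    depth = np.zeros((height, width))
    pts = points_cam[points_cam[:, 2] > 0]        # keep points in front of the camera
    uvw = (K @ pts.T).T
    u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    depth[v[inside], u[inside]] = pts[inside, 2]
    return depth

def back_project(u, v, depth_value, K):
    """Convert a pixel location plus depth back to a 3-D camera-frame point."""
    return depth_value * (np.linalg.inv(K) @ np.array([u, v, 1.0]))

K = np.array([[100.0, 0.0, 32.0],
              [0.0, 100.0, 24.0],
              [0.0, 0.0, 1.0]])                   # hypothetical intrinsics
points = np.array([[0.5, -0.2, 5.0]])
depth_map = project_to_depth_map(points, K, height=48, width=64)
recovered = back_project(42, 20, depth_map[20, 42], K)
```

In the method described above, the pixel passed to the back-projection would come from the segmented map element rather than being chosen by hand.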

SPOOFING DETECTION APPARATUS, SPOOFING DETECTION METHOD, AND COMPUTER-READABLE RECORDING MEDIUM
20210034893 · 2021-02-04 ·

A spoofing detection apparatus obtains, from an image capture apparatus, a first image frame that includes the face of a subject person captured while a light-emitting apparatus is emitting light and a second image frame that includes the face of the subject person captured while the light-emitting apparatus is turned off; extracts, from each of the first image frame and the second image frame, information specifying a face portion of the subject person; extracts, from the first image frame, a portion that includes a bright point formed by reflection in an iris region of an eye of the subject person; extracts, from the second image frame, a portion corresponding to the portion that includes the bright point; calculates a feature that is independent of the position of the bright point; and determines the authenticity of the subject person based on the feature.

ANNOTATION TASK INSTRUCTION GENERATION
20210034698 · 2021-02-04 ·

One embodiment provides a method, including: receiving, from a client, (i) a task of annotating information, (ii) a set of instructions for performing the task, and (iii) client annotations for a subset of the information within the task; assigning the subset to a plurality of annotators; obtaining (i) annotator annotations for the subset and (ii) a response time for providing the annotator annotation for each piece of information within the subset; identifying improvements to the set of instructions by (i) comparing the annotator annotations to the client annotations and (ii) identifying discrepancies made by the annotators in view of the response time; and generating a new set of instructions, wherein the generating comprises (i) identifying at least one feature of the information that distinguishes correctly annotated information from incorrectly annotated information and (ii) generating an instruction from the at least one feature.
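
The comparison step can be sketched as follows. The rule used here, flagging items that annotators got wrong and also answered slowly, is one plausible reading of "identifying discrepancies ... in view of the response time"; the function name, the data layout, and the `slow_seconds` cutoff are hypothetical.

```python
def find_confusing_items(client_labels, annotator_labels, response_times, slow_seconds):
    """Flag items where annotators disagree with the client and respond slowly.

    Returns a list of (item_id, error_rate, mean_response_time) tuples.
    """
    flagged = []
    for item_id, truth in client_labels.items():
        answers = annotator_labels[item_id]
        times = response_times[item_id]
        error_rate = sum(answer != truth for answer in answers) / len(answers)
        mean_time = sum(times) / len(times)
        # Assumed heuristic: slow and wrong suggests unclear instructions.
        if error_rate > 0 and mean_time >= slow_seconds:
            flagged.append((item_id, error_rate, mean_time))
    return flagged

client_labels = {"a": "cat", "b": "dog"}
annotator_labels = {"a": ["cat", "cat", "cat"], "b": ["dog", "cat", "cat"]}
response_times = {"a": [2.0, 3.0, 2.5], "b": [9.0, 12.0, 10.5]}
flagged = find_confusing_items(client_labels, annotator_labels, response_times,
                               slow_seconds=8.0)
```

Items flagged this way would be the candidates for the feature analysis from which a new instruction is generated.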