Patent classifications
G06V10/513
NOVEL VIEW SYNTHESIS FROM SPARSE VOLUME DATA STRUCTURE
A computer-implemented method for transforming a neural radiance field model is described. A plurality of inputs are provided to a neural radiance field (NeRF) model that represents a 3-dimensional space having a subject, wherein each input of the plurality of inputs includes a location and a view direction and corresponds to respective colors of voxels that represent the 3-dimensional space. A spectral analysis is performed on a plurality of outputs of the NeRF model based on the plurality of inputs, wherein the plurality of outputs include the respective colors of the voxels. Frequency components of the spectral analysis that represent colors for at least some of the voxels are extracted. A sparse volume data structure that represents the 3-dimensional space and the respective colors for the at least some of the voxels is generated.
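Illustratively, the claimed bake-out could look like the sketch below, which is not the patent's implementation: a toy analytic function stands in for the trained NeRF, a 3-D FFT serves as the spectral analysis, only the strongest frequency components are retained, and voxels whose reconstructed colors deviate from the background are written into a dictionary-backed sparse volume. `toy_nerf`, `bake_sparse_volume`, and all thresholds are illustrative assumptions.

```python
import numpy as np

def toy_nerf(xyz, view_dir):
    # Stand-in for a trained NeRF: maps (location, view direction) to RGB.
    r = 0.5 + 0.5 * np.sin(2 * np.pi * xyz[..., 0])
    g = 0.5 + 0.5 * np.cos(2 * np.pi * xyz[..., 1])
    b = np.full(xyz.shape[:-1], 0.5)
    return np.stack([r, g, b], axis=-1)

def bake_sparse_volume(nerf, n=16, keep_ratio=0.1, tau=0.3):
    # Query the model once per voxel centre of an n x n x n grid.
    ax = (np.arange(n) + 0.5) / n
    grid = np.stack(np.meshgrid(ax, ax, ax, indexing="ij"), axis=-1)
    colors = nerf(grid, view_dir=np.array([0.0, 0.0, 1.0]))      # (n, n, n, 3)

    # Spectral analysis: 3-D FFT per colour channel; keep only the
    # strongest frequency components and discard the rest.
    spec = np.fft.fftn(colors, axes=(0, 1, 2))
    mags = np.abs(spec).sum(axis=-1)
    spec[mags < np.quantile(mags, 1.0 - keep_ratio)] = 0.0
    approx = np.real(np.fft.ifftn(spec, axes=(0, 1, 2)))

    # Sparse volume: store only voxels whose colour deviates from the mean.
    mean = approx.mean(axis=(0, 1, 2))
    occupied = np.linalg.norm(approx - mean, axis=-1) > tau
    return {tuple(i): approx[tuple(i)] for i in np.argwhere(occupied)}

volume = bake_sparse_volume(toy_nerf)
```

Storing only occupied voxels is what makes the result a sparse volume data structure: lookup is by integer voxel index, and empty space costs nothing.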
System and method for fast object detection
One embodiment provides a method comprising identifying, at an electronic device, a salient part of an object in an input image based on processing of a region of interest (RoI) in the input image. The method further comprises determining an estimated full appearance of the object in the input image based on the salient part and a relationship between the salient part and the object. The electronic device is operated based on the estimated full appearance of the object.
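A minimal sketch of the salient-part-to-full-appearance step, assuming the "relationship" is a learned offset-and-scale prior between the part's bounding box and the whole object's box (the prior values and the face-to-pedestrian example below are hypothetical, not from the patent):

```python
def full_box_from_part(part_box, rel):
    # part_box: (x, y, w, h) of the detected salient part.
    # rel: (dx, dy, sw, sh) -- offsets and scales of the full object
    # relative to its salient part, assumed learned beforehand.
    x, y, w, h = part_box
    dx, dy, sw, sh = rel
    return (x + dx * w, y + dy * h, w * sw, h * sh)

# e.g. a face detected at (100, 80, 40, 40); a hypothetical
# face -> pedestrian relationship widens and elongates the box.
body = full_box_from_part((100, 80, 40, 40), rel=(-0.5, 0.0, 2.0, 7.0))
```

Detecting only the small salient part and extrapolating the full extent is what makes the detection fast: the expensive search runs on the RoI, not the whole object.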
Image recognition method, electronic apparatus and readable storage medium
An image recognition method includes: determining, based on a type of a current frame image, a first feature map of the current frame image by using a convolutional neural network; determining a second feature map of a key frame image preceding the current frame image; performing feature alignment on the first feature map and the second feature map to obtain a first aligned feature map; fusing the first feature map and the first aligned feature map to obtain a first fused feature map; and recognizing content in the current frame image based on the first fused feature map.
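The align-then-fuse steps could be sketched as follows; the patent does not specify the alignment operator, so this stand-in aligns channel-wise statistics of the key-frame features to the current frame and fuses with a fixed weight (both `align` and the weight `w` are assumptions for illustration):

```python
import numpy as np

def align(feat, ref, eps=1e-6):
    # Channel-wise statistics alignment: shift/scale `feat` so each
    # channel matches the mean and std of the same channel in `ref`.
    f_mu = feat.mean(axis=(1, 2), keepdims=True)
    f_sd = feat.std(axis=(1, 2), keepdims=True)
    r_mu = ref.mean(axis=(1, 2), keepdims=True)
    r_sd = ref.std(axis=(1, 2), keepdims=True)
    return (feat - f_mu) / (f_sd + eps) * r_sd + r_mu

def fuse(current, aligned, w=0.7):
    # Weighted fusion favouring the current frame's own features.
    return w * current + (1.0 - w) * aligned

rng = np.random.default_rng(0)
cur = rng.normal(0.0, 1.0, size=(8, 16, 16))   # first feature map (C, H, W)
key = rng.normal(2.0, 3.0, size=(8, 16, 16))   # key-frame feature map
fused = fuse(cur, align(key, cur))
```

The point of fusing with an aligned key-frame map is temporal consistency: the current frame borrows evidence from a nearby key frame without inheriting its distribution shift.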
UAV Real-Time Path Planning Method for Urban Scene Reconstruction
A method for urban scene reconstruction uses the top view of a scene as a priori information to generate a UAV initial flight path, optimizes the initial path in real time, and realizes 3D reconstruction of the urban scene. There are four steps: (1) analyze the top view of the scene, obtain the scene layout, and generate a UAV initial path; (2) reconstruct the sparse point cloud of the buildings and estimate the building heights along the initial path, combine them with the scene layout to generate a rough scene model, and adjust the initial path height; (3) use the rough scene model, the sparse point cloud, and the UAV flight trajectory to obtain a scene coverage confidence map and the details that need close-ups, and optimize the flight path in real time; and (4) obtain high-resolution images and reconstruct them to obtain a 3D model of the scene.
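The coverage-confidence-driven replanning in step (3) could be sketched on a toy grid, where each visited waypoint raises confidence in nearby cells and the next waypoint is greedily chosen at the least-covered cell; the grid model, coverage radius, and greedy rule are assumptions, not the patent's optimizer:

```python
import numpy as np

def update_coverage(conf, waypoint, radius=1):
    # Raise confidence for grid cells within `radius` of the waypoint.
    r, c = waypoint
    conf[max(r - radius, 0):r + radius + 1,
         max(c - radius, 0):c + radius + 1] += 1.0
    return conf

def next_waypoint(conf):
    # Fly next toward the least-covered cell (lowest confidence).
    return np.unravel_index(np.argmin(conf), conf.shape)

conf = np.zeros((6, 6))         # toy scene coverage confidence map
path = [(0, 0)]                 # initial waypoint from the top-view path
for _ in range(5):
    update_coverage(conf, path[-1])
    path.append(next_waypoint(conf))
```

A real planner would weight the map by the rough scene model and the detail cues from step (3), but the loop structure — observe, update confidence, replan — is the same.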
Convolutional neural network pruning method based on feature map sparsification
A convolutional neural network pruning method based on feature map sparsification, which relates to compressing a convolutional neural network to reduce the number of parameters and the amount of computation so as to facilitate actual deployment, is provided. In the training process, L1 or L2 regularization of the feature maps after the activation layer is added to the loss function, so that the corresponding feature map channels acquire different degrees of sparsity. Under a given pruned ratio, the convolution kernels corresponding to the channels are pruned according to the sparsity of the feature map channels. After pruning and fine-tuning, the network obtains a new accuracy, and the pruned ratio is adjusted according to the change in accuracy before and after pruning. After multiple iterations, a near-optimal pruned ratio is found, and pruning is carried out to the maximum extent under the condition that the accuracy does not decrease.
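The channel-selection step can be sketched as follows: rank output channels by the mean L1 magnitude of their post-activation feature maps over a batch, and zero the kernels of the least-active channels under a given pruned ratio (a numpy sketch of the ranking rule, not the full train/regularize/fine-tune loop):

```python
import numpy as np

def channel_sparsity(feature_maps):
    # Mean L1 magnitude of each channel's activations, batch (N, C, H, W).
    return np.abs(feature_maps).mean(axis=(0, 2, 3))

def prune_kernels(kernels, feature_maps, pruned_ratio):
    # Zero the conv kernels whose output channels are the sparsest.
    s = channel_sparsity(feature_maps)
    n_prune = int(round(pruned_ratio * len(s)))
    drop = np.argsort(s)[:n_prune]        # least-active channels first
    pruned = kernels.copy()
    pruned[drop] = 0.0
    return pruned, drop

rng = np.random.default_rng(1)
acts = rng.random((4, 8, 10, 10)) * rng.random(8)[None, :, None, None]
w = rng.normal(size=(8, 3, 3, 3))         # kernels: (C_out, C_in, k, k)
w_pruned, dropped = prune_kernels(w, acts, pruned_ratio=0.25)
```

In the full method the L1/L2 term in the loss is what drives some channels toward near-zero activations in the first place, making this ranking meaningful; the pruned ratio is then raised or lowered depending on how accuracy changes after fine-tuning.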
VISION SYSTEM FOR OBJECT DETECTION, RECOGNITION, CLASSIFICATION AND TRACKING AND THE METHOD THEREOF
Aspects of the present disclosure are directed to, for example, a method for object detection, recognition, classification and tracking using a distributed networked architecture. In some embodiments, the distributed network architecture may include one or more sensor units wherein the image acquisition and the initial feature extraction are performed and a gateway processor for further data processing. Some aspects of the present disclosure are also directed to a vision system for object detection, and to algorithms implemented in the vision system for executing the method acts for object detection, recognition, classification and/or tracking.
SPECIFYING METHOD, DETERMINATION METHOD, NON-TRANSITORY COMPUTER READABLE RECORDING MEDIUM, AND INFORMATION PROCESSING APPARATUS
An information processing apparatus (100) determines, by referring to a storage unit that stores contour data of a plurality of objects, whether a plurality of pieces of contour data are associated with a contour of a subject included in a captured image. When the determination result is affirmative, the information processing apparatus (100) acquires, by referring to the storage unit, a plurality of pieces of region data associated with the plurality of pieces of contour data corresponding to the contour of the subject, and specifies, based on the plurality of pieces of acquired region data, an object associated with the subject from among the plurality of objects.
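A toy version of the lookup could read: match the subject's contour against stored contour pieces, gather the region data of the matches, and specify the object most of them belong to. The storage-unit contents, contour signatures, and distance threshold below are all hypothetical:

```python
import numpy as np

# Hypothetical "storage unit": each entry associates a stored contour
# signature with its region data and the object it belongs to.
STORAGE = [
    {"contour": np.array([1.0, 0.20, 0.80]), "region": "handle", "object": "mug"},
    {"contour": np.array([1.0, 0.25, 0.78]), "region": "rim",    "object": "mug"},
    {"contour": np.array([0.1, 0.90, 0.40]), "region": "spout",  "object": "teapot"},
]

def specify_object(subject_contour, tol=0.1):
    # Step 1: find all stored contour pieces associated with the subject.
    matches = [e for e in STORAGE
               if np.linalg.norm(e["contour"] - subject_contour) < tol]
    if not matches:
        return None, []
    # Step 2: collect their region data and pick the most supported object.
    regions = [e["region"] for e in matches]
    objects = [e["object"] for e in matches]
    best = max(set(objects), key=objects.count)
    return best, regions

obj, regions = specify_object(np.array([1.0, 0.22, 0.79]))
```

Requiring several contour pieces to agree before specifying an object is what distinguishes this from a single nearest-neighbor match.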
EFFICIENT IMAGE CLASSIFICATION METHOD BASED ON STRUCTURED PRUNING
The present invention provides an efficient image classification method based on structured pruning, which incorporates a spatial pruning method based on variation regularization and includes steps such as image data preprocessing, inputting images to a neural network, pruning and retraining the image model, and predicting and classifying new image classes. The present invention adopts a structured pruning method that removes unimportant weight parameters of the original network model and reduces the unnecessary computation and memory consumption caused by the network model in image classification, thereby simplifying the image classifier, and then uses the sparsified network model to predict and classify new images. The simplified method according to the present invention nearly doubles the image classification efficiency of the original network model, reduces memory consumption by about 30%, and produces a better classification result.
Identification and/or verification by a consensus network using sparse parametric representations of biometric images
Image data is run through a neural network, and the neural network produces a vector representation of the image data. Random sparse sampling masks are created, and the vector representation of the image data is masked with each of the random sparse sampling masks to generate corresponding sparsely sampled vectors. The sparsely sampled vectors are transmitted to nodes of a consensus network, wherein each sparsely sampled vector is transmitted to a respective node of the consensus network. Votes from the nodes of the consensus network are received, and whether a consensus is achieved in the votes is determined. Responsive to determining that the consensus is achieved, at least one of identification and verification of the image data may be provided.
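The mask-vote-consensus loop could be sketched as below; the network embedding is faked with a random vector, each node votes by cosine similarity on its own sparse sample, and consensus is a simple majority. The mask density, similarity threshold, and voting rule are assumptions, not the patent's protocol:

```python
import numpy as np

def make_masks(rng, n_masks, dim, keep=0.2):
    # Random sparse sampling masks: each keeps ~20% of vector entries.
    return rng.random((n_masks, dim)) < keep

def node_vote(sampled, template_sampled, thresh=0.9):
    # A node votes "match" if its sparsely sampled vectors are close.
    num = float(sampled @ template_sampled)
    den = np.linalg.norm(sampled) * np.linalg.norm(template_sampled) + 1e-9
    return num / den > thresh

rng = np.random.default_rng(2)
embedding = rng.normal(size=128)   # vector representation from the network
probe = embedding + rng.normal(scale=0.05, size=128)   # same subject, noisy

masks = make_masks(rng, n_masks=7, dim=128)
votes = [node_vote(probe * m, embedding * m) for m in masks]
consensus = sum(votes) > len(votes) // 2   # simple majority
```

Because each node only ever sees a sparse sample, no single node holds enough of the biometric vector to reconstruct it, which is the privacy argument behind the scheme.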
DIGITAL FOVEATION FOR MACHINE VISION
A machine vision method includes obtaining a first representation of an image captured by an image sensor array and analyzing the first representation to assess whether it is sufficient to support execution of a machine vision task by the processor. If the first representation is not sufficient, a region of interest of the image is determined, based on the first representation, for the execution of the machine vision task. The image captured by the image sensor array is then reused to obtain a further representation of the image, by directing the image sensor array to sample the captured image in a manner guided by the determined region of interest and by the assessment. Finally, the further representation is analyzed to assess whether it is sufficient to support the execution of the machine vision task, by implementing a procedure for the execution of the machine vision task in accordance with the further representation.
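The coarse-then-foveated readout could be sketched as below: a block-averaged image stands in for the first representation, a toy brightness check for the sufficiency assessment, and a full-resolution crop around the most active coarse cell for the foveated resampling (all three are illustrative stand-ins for whatever the vision task actually needs):

```python
import numpy as np

def coarse_readout(image, factor=8):
    # First, low-cost representation: block-averaged (downsampled) image.
    h, w = image.shape
    return image.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def sufficient(rep, thresh=0.5):
    # Toy sufficiency test: is any coarse cell bright enough to decide on?
    return rep.max() >= thresh

def foveate(image, rep, factor=8, pad=1):
    # Resample the sensor densely around the most active coarse cell.
    r, c = np.unravel_index(np.argmax(rep), rep.shape)
    r0 = max(r - pad, 0) * factor
    c0 = max(c - pad, 0) * factor
    r1 = min(r + pad + 1, rep.shape[0]) * factor
    c1 = min(c + pad + 1, rep.shape[1]) * factor
    return image[r0:r1, c0:c1]        # full-resolution crop of the RoI

img = np.zeros((64, 64))
img[40:44, 20:24] = 1.0               # a small bright target
rep = coarse_readout(img)
if not sufficient(rep):
    patch = foveate(img, rep)         # second, foveated representation
```

The economy of the scheme is in the second read: instead of processing the full image at full resolution, only the region the assessment flags is sampled densely.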