Patent classifications
G06V10/513
FAST SPARSE NEURAL NETWORKS
A neural network system includes at least one layer which applies a 1×1 convolution to a dense activation matrix, using a kernel defined by a sparse weight matrix. The layer is implemented by a processor with access to a sparsity dataset which indicates where the null weights are located in the weight matrix. The processor selects the feature values corresponding to the other weights from a memory unit configured to store the activation matrix, and then uses these extracted feature values for calculating the convolved values.
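The selection step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the activation matrix is taken to be a list of rows, one per input channel, and the sparsity dataset is taken to be a per-output-channel list of column indices of the non-null weights. The function names `to_sparse` and `sparse_1x1_conv` are hypothetical.

```python
def to_sparse(weights):
    """Build a hypothetical sparsity dataset from a dense weight matrix.

    Returns, per output channel, the input-channel indices of the
    non-null weights and the weight values themselves.
    """
    cols = [[j for j, w in enumerate(row) if w != 0] for row in weights]
    vals = [[w for w in row if w != 0] for row in weights]
    return cols, vals


def sparse_1x1_conv(cols, vals, activations):
    """Apply a 1x1 convolution using only the non-null weights.

    activations[c][p] is the feature value of input channel c at
    spatial position p (the dense activation matrix).
    """
    n_positions = len(activations[0])
    output = []
    for channel_cols, channel_vals in zip(cols, vals):
        row = [0.0] * n_positions
        for c, w in zip(channel_cols, channel_vals):
            feats = activations[c]  # select only the rows that matter
            for p in range(n_positions):
                row[p] += w * feats[p]
        output.append(row)
    return output


# Two output channels, three input channels, two spatial positions.
weights = [[0, 2, 0], [1, 0, -1]]
activations = [[1, 2], [3, 4], [5, 6]]
cols, vals = to_sparse(weights)
result = sparse_1x1_conv(cols, vals, activations)
```

Rows of the activation matrix whose weights are null are never read, which is where the saving over a dense matrix product comes from.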
SYSTEMS AND METHODS FOR A LIGHTWEIGHT PATTERN-AWARE GENERATIVE ADVERSARIAL NETWORK
A computer-implemented method includes training at least a generative adversarial network, the method operable on one or more processors. The method includes at least (1) applying pattern extraction to a set of training data to extract one or more feature embeddings representing one or more features of the training data, (2) attenuating the one or more feature embeddings to create one or more attenuated feature embeddings, (3) providing the one or more attenuated embeddings to a generator of the generative adversarial network as a condition to at least partly control the generator in generating synthetic data, the providing being performed automatically and dynamically during training of the generator, and (4) with the generator, generating synthetic data based at least in part on the attenuated embeddings.
METHOD FOR IMPROVING LOCALIZATION ACCURACY OF A SELF-DRIVING VEHICLE
The invention relates to a method for improving the localization accuracy of a self-driving vehicle (100). The method comprises the steps of: receiving, from one or more range sensing devices (110), point cloud data related to surface (130) characteristics of an environment of the self-driving vehicle (100); based on the received data, constructing a modified normal distributions transform (NDT) histogram having a set of Gaussian distributions in a plurality of histogram bins, each of the plurality of histogram bins providing different constraining features; performing subsampling for each histogram bin in the constructed NDT histogram, in which a number of Gaussian distributions is removed from each histogram bin to construct a vector h^S representing the target height of each histogram bin; and, after subsampling, selecting h_i^S Gaussian distributions from the corresponding histogram bins of vector h^S based on the constraining features given by the Gaussian distributions and adding them to the subsample set S in order to localize the self-driving vehicle (100) with respect to the received point cloud data.
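The subsampling step can be illustrated with a short sketch. The abstract does not say how the target heights are computed or how distributions are ranked within a bin, so both are assumptions here: the heights are obtained by levelling the histogram (filling bins one item at a time until a total budget is reached), and each Gaussian carries a hypothetical numeric score standing in for its constraining value. The names `target_heights` and `subsample` are illustrative only.

```python
def target_heights(counts, total):
    """Assumed levelling rule: raise every bin by one in turn until
    `total` items are allocated or every bin is exhausted."""
    heights = [0] * len(counts)
    remaining = min(total, sum(counts))
    while remaining > 0:
        for i, count in enumerate(counts):
            if remaining == 0:
                break
            if heights[i] < count:
                heights[i] += 1
                remaining -= 1
    return heights


def subsample(bins, total, score):
    """Keep the highest-scoring items of each bin, up to its target height."""
    heights = target_heights([len(b) for b in bins], total)
    subset = []
    for items, height in zip(bins, heights):
        ranked = sorted(items, key=score, reverse=True)
        subset.extend(ranked[:height])
    return subset


# Each entry is (label, score); the score is a stand-in for how
# strongly the Gaussian constrains the pose estimate.
bins = [
    [("a", 0.9), ("b", 0.1), ("c", 0.5), ("d", 0.3), ("e", 0.7)],
    [("f", 0.2)],
    [("g", 0.4), ("h", 0.8), ("i", 0.6)],
]
kept = subsample(bins, total=6, score=lambda g: g[1])
```

With this rule, sparsely populated bins keep all of their distributions while over-represented bins are trimmed, so no single constraining direction dominates the subsample.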
Method and system for depth map reconstruction
A method includes accessing image data and depth data corresponding to image frames to be displayed on an extended reality (XR) display device, and determining sets of feature points corresponding to the image frames based on a multi-layer sampling of the image data and the depth data. The method further includes generating a set of sparse feature points based on an integration of the sets of feature points. The set of sparse feature points is determined based on relative changes in depth data with respect to the sets of feature points. The method further includes generating a set of sparse depth points based on the set of sparse feature points and the depth data, and sending the set of sparse depth points to the XR display device for reconstruction of a dense depth map corresponding to the image frames utilizing the set of sparse depth points.
SIGNAL PROCESSING
A computer-implemented method is provided for classifying an input signal against a set of pre-classified signals. A computer system may calculate, for each of one or more signals of the set of pre-classified signals, a parallelism value indicating a level of the parallelism between that signal and the input signal. The computer system may calculate, for a first subset of the set of pre-classified signals, a sparse vector, wherein each element of the sparse vector serves as a coefficient for a corresponding signal of the first subset. The computer system may determine, for each of the signals in the set of pre-classified signals, a similarity value indicating a level of similarity between that signal and the input signal.
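The abstract does not define how the parallelism value is computed. One plausible reading, used here purely as an assumption, is the absolute cosine of the angle between two signals treated as vectors, which is 1 for parallel signals and 0 for orthogonal ones. The sketch below computes that value and picks the most parallel pre-classified signals as a candidate first subset; the sparse-vector step is not shown. The names `parallelism` and `most_parallel` are hypothetical.

```python
import math


def parallelism(a, b):
    """Absolute cosine similarity between two equal-length signals
    (an assumed stand-in for the patent's parallelism value)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero signal is parallel to nothing
    return abs(dot) / (norm_a * norm_b)


def most_parallel(input_signal, classified, k):
    """Return the k (label, signal) pairs most parallel to the input."""
    ranked = sorted(
        classified,
        key=lambda item: parallelism(item[1], input_signal),
        reverse=True,
    )
    return ranked[:k]


classified = [
    ("sine-like", [0.0, 1.0, 0.0, -1.0]),
    ("ramp", [1.0, 2.0, 3.0, 4.0]),
    ("constant", [1.0, 1.0, 1.0, 1.0]),
]
subset = most_parallel([0.0, 2.0, 0.0, -2.0], classified, k=1)
```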
Objective assessment method for color image quality based on online manifold learning
An objective assessment method for color image quality based on online manifold learning considers the relationship between saliency and objective image quality assessment. A visual saliency detection algorithm produces saliency maps of a reference image and a distorted image, from which a maximum fusion saliency map is obtained. Based on the maximum saliencies of image blocks in the maximum fusion saliency map, the saliency difference between each reference image block and the corresponding distorted image block is measured as an absolute difference, and the visually important reference image blocks and visually important distorted image blocks are thereby screened and extracted. From the manifold eigenvectors of these visually important reference and distorted image blocks, an objective quality assessment value of the distorted image is calculated. The method improves assessment performance and yields a higher correlation between the objective assessment result and subjective perception.
VIDEO PROCESSING METHOD, APPARATUS AND DEVICE, AND COMPUTER-READABLE STORAGE MEDIUM
A video processing method is provided. The method includes extracting at least two adjacent video frame images from a frame image sequence corresponding to a video, positioning a text region of each video frame image in the at least two adjacent video frame images, determining a degree of similarity between text regions of each video frame image in the at least two adjacent video frame images, determining, based on the degree of similarity, a key video frame segment comprising a same text in the video, and determining a text key frame in the video based on the key video frame segment.
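The grouping of frames into key segments can be sketched as below, assuming the text-region similarity between each pair of adjacent frames has already been computed as a number in [0, 1]. The threshold value and the choice of the middle frame as the text key frame are illustrative assumptions, not details taken from the abstract, and the function names are hypothetical.

```python
def key_segments(similarities, threshold=0.8):
    """Split a frame sequence into runs of frames showing the same text.

    similarities[i] is the text-region similarity between frame i and
    frame i + 1. Returns inclusive (start, end) frame index pairs.
    """
    segments = []
    start = 0
    for i, similarity in enumerate(similarities):
        if similarity < threshold:
            segments.append((start, i))  # text changed after frame i
            start = i + 1
    segments.append((start, len(similarities)))
    return segments


def text_key_frames(segments):
    """Pick one representative frame per segment (here, the middle one)."""
    return [(start + end) // 2 for start, end in segments]


# Five frames; the text changes between frame 2 and frame 3.
similarities = [0.9, 0.95, 0.2, 0.85]
segments = key_segments(similarities)
key_frames = text_key_frames(segments)
```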
Configuring spanning elements of a signature generator
Systems, methods, and computer-readable media that store instructions for configuring spanning elements of a signature generator.
A GENERIC MODULAR SPARSE THREE-DIMENSIONAL (3D) CONVOLUTION DESIGN UTILIZING SPARSE 3D GROUP CONVOLUTION
Embodiments are generally directed to sparse 3D convolution acceleration in a convolutional layer of an artificial neural network model. An embodiment of an apparatus includes one or more processors including a graphics processor to process data; and a memory for storage of data, including feature maps. The one or more processors are to provide for sparse 3D convolution acceleration by applying a shared 3D convolutional kernel/filter to an input feature map to produce an output feature map, including increasing sparsity of the input feature map by partitioning it into multiple disjoint input groups; generation of multiple disjoint output groups corresponding to the input groups by performing a convolution calculation represented by the shared 3D convolutional kernel/filter on all feature values associated with active/valid voxels of each input group to produce corresponding feature values within corresponding output groups; and outputting the output feature map by sequentially stacking the output groups.
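A rough sketch of the partition, convolve, and stack sequence follows. It makes several assumptions the abstract does not confirm: the sparse feature map is a dictionary from voxel coordinates to scalar feature values, the partition is a simple round-robin over sorted coordinates, outputs are produced only at active voxels, and the shared kernel is a dictionary from coordinate offsets to weights. All function names are hypothetical.

```python
def partition(voxels, n_groups):
    """Split active voxels into disjoint groups (assumed round-robin rule)."""
    groups = [dict() for _ in range(n_groups)]
    for i, coord in enumerate(sorted(voxels)):
        groups[i % n_groups][coord] = voxels[coord]
    return groups


def sparse_conv3d(group, kernel):
    """Apply the shared kernel at every active voxel of one group,
    summing only over neighbours that are active in the same group."""
    output = {}
    for (x, y, z) in group:
        acc = 0.0
        for (dx, dy, dz), weight in kernel.items():
            value = group.get((x + dx, y + dy, z + dz))
            if value is not None:
                acc += weight * value
        output[(x, y, z)] = acc
    return output


def grouped_sparse_conv3d(voxels, kernel, n_groups):
    """Convolve each group with the same kernel and stack the outputs."""
    return [sparse_conv3d(g, kernel) for g in partition(voxels, n_groups)]


voxels = {(0, 0, 0): 1.0, (1, 0, 0): 2.0, (2, 0, 0): 3.0, (3, 0, 0): 4.0}
kernel = {(0, 0, 0): 1.0, (2, 0, 0): 0.5}
stacked = grouped_sparse_conv3d(voxels, kernel, n_groups=2)
```

Because every group is sparser than the full feature map, each per-group convolution touches fewer neighbours, and the groups are independent of one another.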
Anomaly Detector for Detecting Anomaly using Complementary Classifiers
Embodiments of the present disclosure disclose an anomaly detector for detecting an anomaly in a sequence of poses of a human performing an activity. The anomaly detector includes an input interface configured to accept input data indicative of a distribution of the sequence of poses, a memory configured to store a discriminative one-class classifier having a pair of complementary classifiers bounding normal distribution of pose sequences in a reproducing kernel Hilbert space (RKHS), a processor configured to embed the input data into an element of the RKHS and classify the embedded data using the discriminative one-class classifier, and an output interface configured to render a classification result.