Patent classifications
G06V10/955
LANDMARK DETECTION USING CURVE FITTING FOR AUTONOMOUS DRIVING APPLICATIONS
In various examples, one or more deep neural networks (DNNs) are executed to regress on control points of a curve, and the control points may be used to perform a curve fitting operation (e.g., Bezier curve fitting) to identify landmark locations and geometries in an environment. The outputs of the DNN(s) may thus indicate the two-dimensional (2D) image-space and/or three-dimensional (3D) world-space control point locations, and post-processing techniques, such as clustering and temporal smoothing, may be executed to determine landmark locations and poses with precision and in real time. As a result, reconstructed curves corresponding to the landmarks (e.g., lane line, road boundary line, crosswalk, pole, text, etc.) may be used by a vehicle to perform one or more operations for navigating an environment.
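To make the curve reconstruction step concrete, the following is a minimal sketch of how a set of regressed control points could be turned into a sampled Bezier curve. It uses De Casteljau's algorithm, which is one standard way to evaluate a Bezier curve; the abstract does not say which evaluation method is used. The names `bezier_point` and `reconstruct_curve` and the sample control points are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def bezier_point(control_points: List[Point], t: float) -> Point:
    """Evaluate a Bezier curve at parameter t (0 <= t <= 1) with De Casteljau's algorithm."""
    pts = list(control_points)
    # Repeatedly interpolate between neighbouring points until one point remains.
    while len(pts) > 1:
        pts = [
            ((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
            for (x0, y0), (x1, y1) in zip(pts, pts[1:])
        ]
    return pts[0]


def reconstruct_curve(control_points: List[Point], num_samples: int = 20) -> List[Point]:
    """Sample the curve at evenly spaced parameter values (num_samples >= 2)."""
    return [
        bezier_point(control_points, i / (num_samples - 1))
        for i in range(num_samples)
    ]


# Hypothetical control points standing in for a DNN output (2D image space).
example_control_points = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]
example_curve = reconstruct_curve(example_control_points, num_samples=5)
```

The sampled points in `example_curve` start at the first control point and end at the last one, which is a general property of Bezier curves.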
Machine learning runtime library for neural network acceleration
Embodiments herein describe techniques for interfacing a neural network application with a neural network accelerator using a library. The neural network application may execute on a host computing system while the neural network accelerator executes on a massively parallel hardware system, e.g., an FPGA. The library operates a pipeline for submitting the tasks received from the neural network application to the neural network accelerator. In one embodiment, the pipeline includes a pre-processing stage, an FPGA execution stage, and a post-processing stage, each of which corresponds to a different thread. When receiving a task from the neural network application, the library generates a packet that includes the information required for the different stages in the pipeline to perform the tasks. Because the stages correspond to different threads, the library can process multiple packets in parallel, which can increase the utilization of the neural network accelerator on the hardware system.
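The three-stage, thread-per-stage pipeline can be sketched as follows. This is only an illustration under stated assumptions: packets are plain dictionaries, the stages are connected by in-memory queues, and the "execution" stage is an ordinary Python function standing in for the accelerator. All function names here are hypothetical and are not taken from the described library.

```python
import queue
import threading

_STOP = object()  # sentinel that tells a stage to shut down


def stage(fn, inbox, outbox):
    """Run one pipeline stage: apply fn to each packet and pass it on."""
    while True:
        packet = inbox.get()
        if packet is _STOP:
            outbox.put(_STOP)  # propagate shutdown to the next stage
            return
        outbox.put(fn(packet))


def run_pipeline(tasks, pre, execute, post):
    """Wrap each task in a packet and push it through three threaded stages."""
    q_pre, q_exec, q_post, q_done = (queue.Queue() for _ in range(4))
    threads = [
        threading.Thread(target=stage, args=(pre, q_pre, q_exec)),
        threading.Thread(target=stage, args=(execute, q_exec, q_post)),
        threading.Thread(target=stage, args=(post, q_post, q_done)),
    ]
    for t in threads:
        t.start()
    for task_id, data in enumerate(tasks):
        q_pre.put({"id": task_id, "data": data})  # one packet per task
    q_pre.put(_STOP)
    results = []
    while True:
        packet = q_done.get()
        if packet is _STOP:
            break
        results.append(packet)
    for t in threads:
        t.join()
    return results


def pre_process(packet):
    packet["input"] = [x / 255.0 for x in packet["data"]]  # normalize
    return packet


def fake_accelerator(packet):
    packet["raw"] = sum(packet["input"])  # stand-in for the FPGA execution stage
    return packet


def post_process(packet):
    packet["result"] = round(packet["raw"], 3)
    return packet
```

Because each stage runs in its own thread, one packet can be in pre-processing while another is in the execution stage and a third is in post-processing. With a single thread per stage and FIFO queues, packets leave the pipeline in the order they were submitted.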
Machine learning technique for automatic modeling of multiple-valued outputs
A method and system are disclosed for training a model that implements a machine-learning algorithm. The technique utilizes latent descriptor vectors to change a multiple-valued output problem into a single-valued output problem and includes the steps of receiving a set of training data, processing, by a model, the set of training data to generate a set of output vectors, and adjusting a set of model parameters and component values for at least one latent descriptor vector in the plurality of latent descriptor vectors based on the set of output vectors. The set of training data includes a plurality of input vectors and a plurality of desired output vectors, and each input vector in the plurality of input vectors is associated with a particular latent descriptor vector in a plurality of latent descriptor vectors. Each latent descriptor vector comprises a plurality of scalar values that are initialized prior to training the model.
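As a rough illustration of training with latent descriptor vectors, the sketch below fits a tiny linear model in which every training sample owns its own latent vector, and gradient descent adjusts both the model parameters and the latent components. The linear model form, the squared-error loss, the learning rate, and all names are assumptions made for this example; the abstract does not specify them.

```python
import random


def train(inputs, targets, latent_dim=1, lr=0.01, epochs=500, seed=0):
    """Jointly fit model parameters (w, b, v) and one latent vector per sample."""
    rng = random.Random(seed)
    n = len(inputs)
    w, b = 0.0, 0.0
    v = [1.0] * latent_dim
    # Latent descriptor vectors are initialized before training begins.
    latents = [[rng.uniform(-0.1, 0.1) for _ in range(latent_dim)] for _ in range(n)]

    def predict(i):
        return w * inputs[i] + b + sum(vk * zk for vk, zk in zip(v, latents[i]))

    def loss():
        return sum((predict(i) - targets[i]) ** 2 for i in range(n)) / n

    history = [loss()]
    for _ in range(epochs):
        errs = [predict(i) - targets[i] for i in range(n)]
        # Gradients of the mean squared error with respect to each parameter.
        gw = sum(2 * e * x for e, x in zip(errs, inputs)) / n
        gb = sum(2 * e for e in errs) / n
        gv = [
            sum(2 * errs[i] * latents[i][k] for i in range(n)) / n
            for k in range(latent_dim)
        ]
        # Update each sample's latent components (using the current v).
        for i in range(n):
            for k in range(latent_dim):
                latents[i][k] -= lr * 2 * errs[i] * v[k] / n
        w -= lr * gw
        b -= lr * gb
        v = [vk - lr * g for vk, g in zip(v, gv)]
        history.append(loss())
    return history, latents


# Multiple-valued toy data: each input x appears with two targets, x + 1 and x - 1.
xs = [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]
ys = [2.0, 0.0, 3.0, 1.0, 4.0, 2.0]
loss_history, learned_latents = train(xs, ys)
```

Without the latent vectors, a single-valued model cannot separate the two targets that share an input. The per-sample latent gives the model an extra input that can tell them apart, which is the sense in which the multiple-valued problem becomes single-valued.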
Computer-implemented perceptual apparatus
A method for compressing a digital representation of a stimulus includes encoding the digital representation as a feature vector within a feature space. The method also includes multiplying the feature vector with a Jacobian that maps the feature space to a non-Euclidean perceptual space according to a perceptual system that is capable of perceiving the stimulus. This multiplication generates a perceptual vector within the non-Euclidean perceptual space. The method also includes applying an update operator to the perceptual vector to move the perceptual vector in the perceptual space to an updated vector such that the updated vector has a lower entropy than the perceptual vector. The method also includes rounding the updated vector into a compressed vector that is smaller than the feature vector.
Optical neural network apparatus including passive phase modulator
An optical neural network apparatus that optically implements an artificial neural network includes an input layer, a hidden layer, and an output layer sequentially arranged in a traveling direction of light, wherein the output layer includes an image sensor including a plurality of light sensing pixels arranged in two dimensions, and wherein the input layer or the hidden layer includes at least one passive phase modulator configured to locally modulate a phase of incident light depending on positions on a two-dimensional plane.
System and method for dynamic scheduling of distributed deep learning training jobs
A scheduling algorithm for scheduling training of deep neural network (DNN) weights on processing units identifies a next job to which a processing unit (PU) is provisionally assigned based on a doubling heuristic. The doubling heuristic makes use of an estimated number of training sets needed to complete training of weights for a given job and/or a training speed function which indicates how fast the weights are converging. The scheduling algorithm solves a problem of efficiently assigning PUs when multiple DNN weight data structures must be trained. In some embodiments, the training of the weights uses a ring-based message passing architecture. In some embodiments, the scheduling is performed in a nested-loop fashion. In inner iterations of the nested loop, PUs are scheduled and jobs are launched or restarted. In outer iterations of the nested loop, jobs are stopped, parameters are updated, and the inner iteration is re-entered.
Gated truncated readout system
A gated truncated readout system for position-sensitive or imaging detectors improves resolution over traditional readout systems. The readout system includes two or more amplifiers that receive multichannel analog output data from the detector. Analog gate control circuitry, included in the readout circuit, receives the signals from the amplifiers, determines a fractional value of the sum-integral of the signals, and enables analog gate operation around an area of interest, disabling all other channels where noise dominates the signal value and thereby improving the interpolation accuracy of the signal centroid position and the detector resolution. Filtered signals are transmitted to a centroid interpolation signal processing device for computation of the centroid position. As a result of disabling all channels where noise dominates the signal value, the gated truncated readout system provides better accuracy and improved detector resolution.
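A software analogue of the gating idea can be sketched as follows. The assumption here is that a channel is kept only when its amplitude reaches a chosen fraction of the summed signal, and that the centroid is an amplitude-weighted mean of channel indices. The abstract describes analog circuitry, so this is an illustration of the arithmetic only; `gated_centroid`, `plain_centroid`, and the `fraction` parameter are hypothetical names.

```python
from typing import Sequence


def plain_centroid(signals: Sequence[float]) -> float:
    """Amplitude-weighted mean channel index over all channels."""
    total = sum(signals)
    if total <= 0:
        raise ValueError("signals must have a positive sum")
    return sum(i * s for i, s in enumerate(signals)) / total


def gated_centroid(signals: Sequence[float], fraction: float = 0.1) -> float:
    """Centroid computed only from channels at or above fraction * sum(signals)."""
    total = sum(signals)
    if total <= 0:
        raise ValueError("signals must have a positive sum")
    threshold = fraction * total  # fractional value of the summed signal
    # Channels below the threshold are treated as noise-dominated and disabled.
    gated = [s if s >= threshold else 0.0 for s in signals]
    weight = sum(gated)
    if weight == 0:
        raise ValueError("no channel passed the gate; lower the fraction")
    return sum(i * s for i, s in enumerate(gated)) / weight


# Hypothetical channel readout: a symmetric peak at channel 4 plus asymmetric noise.
channels = [1.0, 0.0, 1.0, 10.0, 20.0, 10.0, 0.0, 1.0, 1.0]
with_gate = gated_centroid(channels, fraction=0.1)
without_gate = plain_centroid(channels)
```

In this example the gate keeps only channels 3 to 5, so the low-amplitude channels at the edges no longer pull the centroid away from the peak.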
Discrete three-dimensional processor
A discrete three-dimensional (3-D) processor comprises first and second dice. The first die comprises 3-D memory (3D-M) arrays, whereas the second die comprises logic circuits and at least an off-die peripheral-circuit component of the 3D-M array(s). A typical off-die peripheral-circuit component could be an address decoder, a sense amplifier, a programming circuit, a read-voltage generator, a write-voltage generator, a data buffer, or a portion thereof.
Scalable architectures for reference signature matching and updating
Methods, apparatus, systems and articles of manufacture are disclosed for scalable architectures for reference signature matching and updating. An example method for scalable architectures for reference signature matching and updating includes accessing site signatures to be compared to reference signatures from a first group of media sources. The example method also includes determining if a first reference node is an owner of a first one of the site signatures, comparing a neighborhood of site signatures including the first site signature to reference signatures in a first subset of reference signatures when the first reference node is the owner of the first site signature, the first subset of reference signatures stored in a first memory partition associated with the first reference node, and not comparing the site signature to reference signatures when the first reference node is not the owner of the first one of the site signatures.
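The ownership check and neighborhood comparison can be sketched as below. The abstract does not say how ownership is decided, so this example assumes integer signatures and a simple modulo partitioning rule; the class `ReferenceNode`, its methods, and the `radius` parameter are all hypothetical.

```python
from typing import List, Optional, Set


class ReferenceNode:
    """One node holding a subset of reference signatures in its own partition."""

    def __init__(self, node_id: int, num_nodes: int, references: Set[int]):
        self.node_id = node_id
        self.num_nodes = num_nodes
        self.references = set(references)  # this node's memory partition

    def owns(self, signature: int) -> bool:
        # Assumed ownership rule: partition signatures by value modulo node count.
        return signature % self.num_nodes == self.node_id

    def match(
        self, site_signatures: List[int], index: int, radius: int = 1
    ) -> Optional[List[int]]:
        """Compare the neighborhood around site_signatures[index] if this node owns it.

        Returns the neighborhood signatures found in this node's partition,
        or None when the node is not the owner and therefore skips the comparison.
        """
        if not self.owns(site_signatures[index]):
            return None
        lo = max(0, index - radius)
        hi = index + radius + 1
        neighborhood = site_signatures[lo:hi]
        return [s for s in neighborhood if s in self.references]


# Hypothetical data: node 0 of 2 holds three reference signatures.
node0 = ReferenceNode(node_id=0, num_nodes=2, references={10, 11, 12})
site = [11, 10, 12, 7]
```

Since each site signature has exactly one owner under this rule, every comparison is done by one node against its own partition, which is what lets the work be spread across nodes.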
IMAGE DETECTION METHOD AND APPARATUS, COMPUTER DEVICE, AND COMPUTER-READABLE STORAGE MEDIUM
The present application provides an image detection method performed by a server. The method includes: intercepting a first image and a second image at a preset time interval from a video stream; performing pixel matching on the first image and the second image to obtain a value of total matching pixels between the first image and the second image; performing picture content detection on the second image in response to determining that the value of total matching pixels satisfies a preset matching condition; and determining that the video stream is abnormal in response to determining, by the picture content detection, that no picture content is in the second image. In this way, image recognition can be used to perform detection on pictures of the video stream at the preset time interval.
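The decision flow of this method can be sketched with small grayscale frames represented as nested lists. The specific matching condition (a minimum ratio of identical pixels) and the content test (a minimum spread between the darkest and brightest pixel) are assumptions chosen for illustration; the abstract leaves both unspecified. All function names and thresholds are hypothetical.

```python
from typing import List

Frame = List[List[int]]  # grayscale pixel rows, values 0..255


def count_matching_pixels(a: Frame, b: Frame, tolerance: int = 0) -> int:
    """Count pixel positions whose values differ by at most tolerance."""
    return sum(
        1
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
        if abs(pa - pb) <= tolerance
    )


def has_picture_content(frame: Frame, min_range: int = 10) -> bool:
    """Assumed content test: a frame with almost no brightness spread is empty."""
    pixels = [p for row in frame for p in row]
    return max(pixels) - min(pixels) >= min_range


def is_stream_abnormal(first: Frame, second: Frame, match_ratio: float = 0.95) -> bool:
    """Flag the stream when the two frames match and the second one shows no content."""
    total = sum(len(row) for row in second)
    matching = count_matching_pixels(first, second)
    if matching < match_ratio * total:
        return False  # frames differ enough: the picture is still changing
    return not has_picture_content(second)


# Hypothetical 2x2 frames captured one preset interval apart.
black = [[0, 0], [0, 0]]
still = [[0, 255], [128, 64]]
moved = [[255, 0], [64, 128]]
```

Under these assumptions, two identical black frames are flagged as abnormal, while two identical frames that still show a picture are not, since the content check only runs after the matching condition is met.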