G06N3/084

Machine vision as input to a CMP process control algorithm

During chemical mechanical polishing of a substrate, a signal value that depends on a thickness of a layer in a measurement spot on a substrate undergoing polishing is determined by a first in-situ monitoring system. An image of at least the measurement spot of the substrate is generated by a second in-situ imaging system. Machine vision processing, e.g., a convolutional neural network, is used to determine a characterizing value for the measurement spot based on the image. Then a measurement value is calculated based on both the characterizing value and the signal value.

Pointer sentinel mixture architecture

The technology disclosed provides a so-called “pointer sentinel mixture architecture” for neural network sequence models that can either reproduce a token from a recent context or produce a token from a predefined vocabulary. In one implementation, a pointer sentinel-LSTM architecture achieves state-of-the-art language modeling performance of 70.9 perplexity on the Penn Treebank dataset, while using far fewer parameters than a standard softmax LSTM.
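
The mixing of the two token sources can be illustrated with a minimal sketch. It assumes, beyond what the abstract states, that a single softmax is taken over the pointer scores for the context positions plus one extra "sentinel" score, and that the sentinel's share of that softmax serves as the weight of the vocabulary distribution. The function and parameter names are hypothetical.

```python
import numpy as np

def softmax(x):
    z = np.exp(x - np.max(x))
    return z / z.sum()

def pointer_sentinel_mixture(vocab_logits, pointer_scores, sentinel_score, context_ids):
    """Mix a vocabulary softmax with a pointer distribution over recent tokens.

    Assumption: the gate is the sentinel's share of a joint softmax over
    the context positions and the sentinel slot.
    """
    vocab_size = vocab_logits.shape[0]
    p_vocab = softmax(vocab_logits)
    # Joint softmax over context positions plus one sentinel slot (last entry).
    a = softmax(np.append(pointer_scores, sentinel_score))
    gate = a[-1]  # probability mass handed to the vocabulary component
    # Scatter the pointer mass onto the token id found at each context position;
    # np.add.at accumulates correctly when a token occurs more than once.
    p_ptr = np.zeros(vocab_size)
    np.add.at(p_ptr, context_ids, a[:-1])
    return gate * p_vocab + p_ptr, gate
```

Because the pointer mass sums to one minus the gate, the returned vector is a valid probability distribution over the vocabulary.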

Learning device, signal processing device, and learning method

A learning data processing unit accepts, as input, a plurality of pieces of learning data, one for each of a plurality of tasks, and calculates a batch size for each task such that the value obtained by dividing the data size of the task's learning data by its batch size is the same across all tasks. A batch sampling unit draws, for each task, samples from the corresponding learning data using the batch size calculated by the learning data processing unit. A learning unit updates a weight of a discriminator for each task using the samples drawn by the batch sampling unit.
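
The batch-size condition can be sketched as follows. The sketch assumes that a common number of steps per epoch is chosen first and that each task's batch size is its data size divided by that number, rounded to an integer, so the ratio is exactly equal only when the sizes divide evenly. The names `matched_batch_sizes`, `sample_batches`, and `steps_per_epoch` are hypothetical.

```python
import random

def matched_batch_sizes(data_sizes, steps_per_epoch):
    """Return one batch size per task so that size / batch_size matches across tasks.

    Rounding makes the match approximate when a size is not divisible
    by steps_per_epoch.
    """
    return {
        task: max(1, round(size / steps_per_epoch))
        for task, size in data_sizes.items()
    }

def sample_batches(datasets, batch_sizes, rng):
    """Draw one batch per task, without replacement, at that task's batch size."""
    return {
        task: rng.sample(data, batch_sizes[task])
        for task, data in datasets.items()
    }
```

With this choice, a large task and a small task complete one pass over their data in roughly the same number of update steps.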

Neural network learning device, method, and program
11580383 · 2023-02-14

A large amount of training data is typically required to perform deep network learning, making it difficult to achieve with only a few pieces of data. In order to solve this problem, the neural network device according to the present invention is provided with: a feature extraction unit which extracts features from training data using a learning neural network; an adversarial feature generation unit which generates an adversarial feature from the extracted features using the learning neural network; a pattern recognition unit which calculates a neural network recognition result using the training data and the adversarial feature; and a network learning unit which performs neural network learning so that the recognition result approaches a desired output.
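
One way to picture adversarial feature generation is a gradient-sign perturbation applied in feature space. The abstract does not say how the adversarial feature is computed, so the method below is an assumption: a linear softmax classifier on top of the extracted feature, and a step of size `eps` in the direction that increases the cross-entropy loss. All names are hypothetical.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def cross_entropy(feature, W_out, label):
    """Loss of a linear softmax classifier applied to one feature vector."""
    return -np.log(softmax(W_out @ feature)[label])

def adversarial_feature(feature, W_out, label, eps=0.1):
    """Perturb a feature vector by eps along the sign of the loss gradient.

    Assumption: gradient-sign perturbation stands in for the unspecified
    adversarial feature generation step.
    """
    probs = softmax(W_out @ feature)
    probs[label] -= 1.0        # d(loss)/d(logits) for cross-entropy
    grad = W_out.T @ probs     # d(loss)/d(feature) through the linear layer
    return feature + eps * np.sign(grad)
```

Training on both the original and the perturbed feature is then one plausible way to enlarge a small training set.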

Tensor dropout using a mask having a different ordering than the tensor

A method for selectively dropping out feature elements from a tensor in a neural network is disclosed. The method includes receiving a first tensor from a first layer of a neural network. The first tensor includes multiple feature elements arranged in a first order. A compressed mask for the first tensor is obtained. The compressed mask includes single-bit mask elements respectively corresponding to the multiple feature elements of the first tensor and has a second order that is different than the first order of their corresponding feature elements in the first tensor. Feature elements from the first tensor are selectively dropped out based on the compressed mask to form a second tensor which is propagated to a second layer of the neural network.
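
A minimal sketch of the idea, assuming the tensor is laid out in row-major order while the single-bit mask elements are packed in column-major order. The choice of orderings and the function names are illustrative assumptions, not taken from the abstract.

```python
import numpy as np

def make_compressed_mask(keep):
    """Pack a boolean keep-mask into bits, stored in column-major order."""
    return np.packbits(keep.ravel(order="F").astype(np.uint8))

def apply_compressed_mask(tensor, packed):
    """Drop feature elements of a row-major tensor using a column-major bit mask."""
    # Unpack exactly one bit per feature element, discarding padding bits.
    bits = np.unpackbits(packed, count=tensor.size)
    # Map the column-major bit order back onto the tensor's own layout.
    keep = bits.reshape(tensor.shape, order="F").astype(bool)
    return np.where(keep, tensor, 0.0)
```

Packing one bit per element keeps the mask at one eighth of a byte per feature element, and the reshape step is where the two orderings are reconciled.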

Electrical meter for training a mathematical model for a device using a smart plug

An electrical panel or an electrical meter may provide improved functionality by interacting with a smart plug. A smart plug may provide a smart-plug power monitoring signal that includes information about power consumption of devices connected to the smart plug. The smart-plug power monitoring signal may be used in conjunction with power monitoring signals from the electrical mains of the building for providing information about the operation of devices in the building. For example, the power monitoring signals may be used to (i) determine the main of the house that provides power to the smart plug, (ii) identify devices receiving power from the smart plug, (iii) improve the accuracy of identifying device state changes, and (iv) train mathematical models for identifying devices and device state changes.

Disk drive failure prediction with neural networks

Techniques are described herein for predicting disk drive failure using a machine learning model. The framework involves receiving disk drive sensor attributes as training data, preprocessing the training data to select a set of enhanced feature sequences, and using the enhanced feature sequences to train a machine learning model to predict disk drive failures from disk drive sensor monitoring data. Prior to the training phase, the RNN LSTM model is tuned using a set of predefined hyper-parameters. The preprocessing, which is performed during the training and evaluation phase as well as later during the prediction phase, involves using predefined values for a set of parameters to generate the set of enhanced feature sequences from raw sensor readings. The enhanced feature sequences are generated to maintain a desired healthy/failed disk ratio, and use only samples leading up to a last-valid-time sample in order to honor a pre-specified heads-up-period alert requirement.
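
The two preprocessing constraints can be sketched as follows. The sketch assumes that readings are indexed by sample number, that the last valid sample is the failure index minus the heads-up period, and that the healthy/failed ratio is enforced by subsampling healthy disks. These details and all names are hypothetical.

```python
import random

def window_before_failure(readings, failure_index, heads_up, window):
    """Return the `window` readings that end at the last valid sample.

    Samples after the last valid sample fall inside the heads-up period
    and are never used.
    """
    last_valid = failure_index - heads_up
    start = last_valid - window + 1
    if start < 0:
        return None  # not enough history for a full sequence
    return readings[start:last_valid + 1]

def balance(healthy, failed, ratio, rng):
    """Subsample healthy sequences to at most `ratio` per failed sequence."""
    n = min(len(healthy), ratio * len(failed))
    return rng.sample(healthy, n), failed
```

For example, with a failure at sample 15, a heads-up period of 5, and a window of 4, the sequence covers samples 7 through 10.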

System and method for compressing activation data
11580402 · 2023-02-14

A method for adapting a trained neural network is provided. Input data is input to the trained neural network and a plurality of filters are applied to generate a plurality of channels of activation data. Differences between corresponding activation values in the plurality of channels of activation data are calculated and an order of the plurality of channels is determined based on the calculated differences. The neural network is adapted so that it will output channels of activation data in the determined order. The ordering of the channels of activation data is subsequently used to compress activation data values by taking advantage of a correlation between activation data values in adjacent channels.
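
The ordering and compression steps can be sketched as follows. The sketch assumes a greedy nearest-neighbour ordering on the sum of absolute differences between channels, and simple delta encoding between adjacent channels as the compression step; neither choice is stated in the abstract. It also reorders the activations directly, rather than adapting the network to emit channels in the determined order. All names are hypothetical.

```python
import numpy as np

def order_channels(acts):
    """Greedily order channels so that adjacent channels are similar.

    acts has shape (channels, height, width) and integer values.
    """
    flat = acts.reshape(acts.shape[0], -1).astype(np.int64)
    # Pairwise sum of absolute differences between channels.
    dist = np.abs(flat[:, None, :] - flat[None, :, :]).sum(axis=2)
    order = [0]
    remaining = set(range(1, flat.shape[0]))
    while remaining:
        # Pick the unused channel closest to the last one; ties go to the lower index.
        nxt = min(remaining, key=lambda c: (dist[order[-1], c], c))
        order.append(nxt)
        remaining.remove(nxt)
    return order

def delta_encode(acts, order):
    """Store the first channel as-is and each later channel as a difference."""
    reordered = acts[order].astype(np.int64)
    deltas = reordered.copy()
    deltas[1:] -= reordered[:-1]
    return deltas

def delta_decode(deltas):
    """Invert delta_encode, returning the channels in the determined order."""
    return np.cumsum(deltas, axis=0)
```

When adjacent channels are correlated, the deltas are small in magnitude and therefore cheaper to store with an entropy or variable-length coder.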
