G06N3/047

Electronic apparatus and method for optimizing trained model

An electronic apparatus is provided. The electronic apparatus includes: a memory storing a trained model including a plurality of layers; and a processor configured to initialize a parameter matrix and a plurality of split variables of the trained model, calculate a new parameter matrix having a block-diagonal form and the plurality of split variables by minimizing an objective function that includes a loss function for the trained model, a weight decay regularization term, and a split regularization term defined by the parameter matrix and the plurality of split variables, vertically split the plurality of layers into groups based on the calculated split variables, and reconstruct the trained model using the calculated new parameter matrix as parameters of the vertically split layers.
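The abstract's split step — zeroing cross-group weights so each layer can be split vertically into independent groups — can be sketched in plain Python. The function name and group encoding below are hypothetical; in the patent, the groups come from optimizing the split variables against the regularized objective.

```python
# Hypothetical sketch: given a dense layer weight matrix W and split
# variables assigning each input/output unit to a group, keep only the
# weights whose input and output units share a group. The result is
# block-diagonal (up to a permutation), so each group can run as an
# independent, vertically split layer.

def block_diagonalize(W, row_groups, col_groups):
    """Zero out every weight that connects units in different groups."""
    return [[w if row_groups[i] == col_groups[j] else 0.0
             for j, w in enumerate(row)]
            for i, row in enumerate(W)]

W = [[1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0],
     [9.0, 1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0, 7.0]]
row_groups = [0, 0, 1, 1]   # output units -> group (hypothetical split)
col_groups = [0, 0, 1, 1]   # input units  -> group

W_split = block_diagonalize(W, row_groups, col_groups)
# Cross-group blocks are now zero, so group 0 and group 1 can be
# evaluated as two smaller independent layers.
```

A split regularization term of the kind the abstract mentions would penalize exactly the cross-group weights this function zeroes out, driving them toward zero during training so the final hard split loses little accuracy.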

Systems and methods for generating names using machine-learned models
11580310 · 2023-02-14

A computing system can include one or more machine-learned models configured to receive context data that describes one or more entities to be named. In response to receipt of the context data, the machine-learned model(s) can generate output data that describes one or more names for the entity or entities described by the context data. The computing system can be configured to perform operations including inputting the context data into the machine-learned model(s). The operations can include receiving, as an output of the machine-learned model(s), the output data that describes the name(s) for the entity or entities described by the context data. The operations can include storing at least one name described by the output data.

Pointer sentinel mixture architecture

The technology disclosed provides a so-called “pointer sentinel mixture architecture” for neural network sequence models that has the ability to either reproduce a token from a recent context or produce a token from a predefined vocabulary. In one implementation, a pointer sentinel-LSTM architecture achieves state-of-the-art language modeling performance of 70.9 perplexity on the Penn Treebank dataset, while using far fewer parameters than a standard softmax LSTM.
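The mixture itself follows directly from the description: a pointer distribution over recent context tokens competes with a sentinel, and the probability mass assigned to the sentinel gates the vocabulary softmax. A minimal plain-Python illustration (names and toy values are hypothetical; the actual model computes the logits with an LSTM and attention):

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def pointer_sentinel_mixture(vocab_logits, pointer_logits, sentinel_logit,
                             context_tokens):
    """Mix a pointer over the recent context with a vocabulary softmax."""
    # The sentinel competes with the pointer positions; its share of the
    # attention mass, g, becomes the gate on the vocabulary distribution.
    attn = softmax(pointer_logits + [sentinel_logit])
    g = attn[-1]                       # weight given to the vocabulary
    p_vocab = softmax(vocab_logits)
    p = [g * pv for pv in p_vocab]
    for pos, tok in enumerate(context_tokens):
        p[tok] += attn[pos]            # pointer mass copies context tokens
    return p

# Toy example: 5-word vocabulary, recent context tokens [2, 4, 2].
p = pointer_sentinel_mixture([0.1] * 5, [1.0, 0.5, 2.0], 0.0, [2, 4, 2])
```

Because the pointer concentrates mass on token 2 (it appears twice in the context with large logits), `p[2]` dominates even though the vocabulary logits are uniform — this is the "reproduce a token from a recent context" behavior the abstract describes.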

Computer-implemented interfaces for identifying and revealing selected objects from video

A computer-implemented visual interface for identifying and revealing objects from video-based media provides visual cues to enable users to interact with video-based media. Objects in videos are inferred and identified based upon automatic interpretations of the video and/or audio that is associated with the video. The automatic interpretations may be performed by a computer-implemented neural network. The computer-implemented visual interface is integrated with the video to enable users to interact with the identified objects. User interactions with the visual interface may be through either touch or non-touch means. Information is delivered to users that is based upon the identified objects, including in augmented or virtual reality-based form, responsive to user interactions with the computer-implemented visual interface.

Machine-learning training service for synthetic data

Various embodiments, methods, and systems for implementing a distributed computing system machine-learning training service are provided. Initially, a machine learning model is accessed. A plurality of synthetic data assets are accessed, where a synthetic data asset is associated with asset-variation parameters that are programmable for machine learning. The machine learning model is trained using the plurality of synthetic data assets. The machine-learning training service is further configured for executing real-time calls to generate an on-the-fly-generated synthetic data asset such that the on-the-fly-generated synthetic data asset is rendered in real time, precluding pre-rendering and storing the on-the-fly-generated synthetic data asset. The machine-learning training service further supports hybrid-based machine learning training, where the machine learning model is trained based on a combination of the plurality of synthetic data assets, a plurality of non-synthetic data assets, and synthetic data asset metadata associated with the plurality of synthetic data assets.

System and method for convolutional layer structure for neural networks

An electronic device, method, and computer readable medium for 3D association of detected objects are provided. The electronic device includes a memory and at least one processor coupled to the memory. The at least one processor is configured to convolve an input to a neural network with a basis kernel to generate a convolution result, scale the convolution result by a scalar to create a scaled convolution result, and combine the scaled convolution result with one or more of a plurality of scaled convolution results to generate an output feature map.
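The abstract describes factoring a convolutional layer into shared basis kernels combined by learned scalars: each basis kernel is convolved with the input once, and output filters are formed as cheap scalar combinations of those results. A minimal 1-D sketch in plain Python (all names and values hypothetical):

```python
def conv1d(signal, kernel):
    """Valid-mode 1-D convolution (no padding, stride 1)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

signal = [1.0, 2.0, 3.0, 4.0, 5.0]
basis = [[1.0, 0.0, -1.0], [0.0, 1.0, 0.0]]   # shared basis kernels
scalars = [0.5, 2.0]                          # learned per-filter scalars

# One convolution per basis kernel...
basis_results = [conv1d(signal, b) for b in basis]

# ...then the output feature map is just a scalar combination of the
# scaled convolution results, as in the abstract.
feature_map = [sum(s * r[i] for s, r in zip(scalars, basis_results))
               for i in range(len(basis_results[0]))]
```

The saving comes from reuse: with many output filters sharing the same small set of basis kernels, the expensive convolutions are done once and each additional filter costs only a few multiply-adds per position.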

Machine-learning based optimization of data center designs and risks

In exemplary aspects of optimizing data centers, historical data corresponding to a data center is collected. The data center includes a plurality of systems. A data center representation is generated. The data center representation can be one or more of a schematic and a collection of data from among the historical data. The data center representation is encoded into a neural network model. The neural network model is trained using at least a portion of the historical data. The trained model is deployed using a first set of inputs, causing the model to generate one or more output values for managing or optimizing the data center with respect to design and risk aspects.

Generating approximations of cardiograms from different source configurations
11576624 · 2023-02-14

Systems are provided for generating data representing electromagnetic states of a heart for medical, scientific, research, and/or engineering purposes. The systems generate the data based on source configurations such as dimensions of, and scar or fibrosis or pro-arrhythmic substrate location within, a heart and a computational model of the electromagnetic output of the heart. The systems may dynamically generate the source configurations to provide representative source configurations that may be found in a population. For each source configuration of the electromagnetic source, the systems run a simulation of the functioning of the heart to generate modeled electromagnetic output (e.g., an electromagnetic mesh for each simulation step with a voltage at each point of the electromagnetic mesh) for that source configuration. The systems may generate a cardiogram for each source configuration from the modeled electromagnetic output of that source configuration for use in predicting the source location of an arrhythmia.

Accelerated training of a machine learning based model for semiconductor applications

Methods and systems for accelerated training of a machine learning based model for semiconductor applications are provided. One method for training a machine learning based model includes acquiring information for non-nominal instances of specimen(s) on which a process is performed. The machine learning based model is configured for performing simulation(s) for the specimens. The machine learning based model is trained with only information for nominal instances of additional specimen(s). The method also includes re-training the machine learning based model with the information for the non-nominal instances of the specimen(s), thereby performing transfer learning of the information for the non-nominal instances of the specimen(s) to the machine learning based model.
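The two-stage procedure the abstract describes — train on nominal instances only, then re-train on the scarcer non-nominal instances so the learned information transfers — can be sketched with a toy one-parameter model (the data, learning rates, and function name are all hypothetical):

```python
def sgd_fit(w, data, lr, epochs):
    """Fit a one-parameter model y ≈ w * x by gradient descent."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2 * (w * x - y) * x   # d/dw of squared error
            w -= lr * grad
    return w

# Stage 1: train on nominal instances only (here, data following y = 2x).
nominal = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w = sgd_fit(0.0, nominal, lr=0.05, epochs=50)

# Stage 2: transfer learning — re-train the same weights on the scarce
# non-nominal instances (y = 3x), starting from the nominal solution
# instead of from scratch.
non_nominal = [(1.0, 3.0), (2.0, 6.0)]
w = sgd_fit(w, non_nominal, lr=0.05, epochs=50)
```

Starting stage 2 from the nominal solution rather than from random initialization is the point of the acceleration claim: the re-training only has to move the model the short distance from nominal to non-nominal behavior.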

Method for generating web code for UI based on a generative adversarial network and a convolutional neural network
11579850 · 2023-02-14

Provided is a method for generating web codes for a user interface (UI) based on a generative adversarial network (GAN) and a convolutional neural network (CNN). The method includes steps described below. A mapping relationship between display effects of a HyperText Markup Language (HTML) element and source codes of the HTML element is constructed. A location of an HTML element in an image I is recognized. Complete HTML codes of the image I are generated. The similarity between manually written HTML codes and the generated complete HTML codes, and the similarity between the image I and an image I₁ rendered from the generated complete HTML codes, are obtained. After training, an image-to-HTML-code generation model M is obtained. A to-be-processed UI image is input into the model M so as to obtain corresponding HTML codes. According to the method of the present disclosure, an image-to-HTML-code generation model M can be obtained.
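The abstract leaves both similarity measures unspecified. As a purely illustrative sketch (all names and measures are hypothetical), a token-overlap score could stand in for the code similarity between manually written and generated HTML, and a pixel-error score for the similarity between the image I and the rendered image I₁:

```python
def token_similarity(code_a, code_b):
    """Jaccard overlap of whitespace-separated HTML tokens — a simple
    stand-in for the code-similarity signal used during training."""
    a, b = set(code_a.split()), set(code_b.split())
    return len(a & b) / len(a | b)

def image_similarity(img_a, img_b):
    """1 / (1 + mean squared pixel error) between the input UI image
    and the image rendered from the generated HTML (flat pixel lists)."""
    mse = sum((p - q) ** 2 for p, q in zip(img_a, img_b)) / len(img_a)
    return 1.0 / (1.0 + mse)

manual = "<div> <p> hello </p> </div>"
generated = "<div> <p> hi </p> </div>"
s_code = token_similarity(manual, generated)   # 4 shared of 6 tokens
```

In a real pipeline both scores would feed the training objective: the code similarity rewards HTML that matches the reference markup, while the image similarity rewards HTML whose rendered appearance matches the input screenshot.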