G06N3/045

Phased deployment of deep-learning models to customer facing APIs

Techniques for phased deployment of machine learning models are described. Customers can call a training API to initiate model training, but must then wait for training to complete before the model can be used to perform inference. Depending on the type of model, the machine learning algorithm used for training, the size of the training dataset, and so on, this training process may take hours or days to complete. This leads to significant downtime during which inference requests cannot be served. Embodiments improve upon existing systems by providing phased deployment of custom models. For example, a simple, less accurate model can be provided synchronously in response to a request for a custom model. At the same time, one or more machine learning models can be trained asynchronously in the background. When the trained machine learning model is ready for use, the customers' traffic and jobs can be transferred over to the better model.
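
The phased hand-over described above can be illustrated with a minimal sketch. It is not the patented implementation: the class `PhasedModelEndpoint`, the functions `baseline_model` and `slow_training`, and the use of a background thread are all assumptions made for illustration. A simple model answers requests immediately while a better model trains in the background, and requests are switched over once training finishes.

```python
import threading
import time


class PhasedModelEndpoint:
    """Serves a quick baseline at once, then swaps in a better model when ready."""

    def __init__(self, baseline, train_fn):
        self._model = baseline          # simple model, available synchronously
        self._lock = threading.Lock()
        self.phase = "baseline"
        # Train the better model asynchronously in the background.
        self._worker = threading.Thread(target=self._train_and_swap, args=(train_fn,))
        self._worker.start()

    def _train_and_swap(self, train_fn):
        better = train_fn()             # may take hours or days in practice
        with self._lock:
            self._model = better        # transfer traffic to the better model
            self.phase = "trained"

    def predict(self, x):
        with self._lock:
            model = self._model
        return model(x)

    def wait_until_trained(self):
        self._worker.join()


def baseline_model(x):
    # Hypothetical low-accuracy rule: always predict the same class.
    return "negative"


def slow_training():
    time.sleep(0.1)                     # stand-in for a long training job
    return lambda x: "positive" if x > 0 else "negative"


endpoint = PhasedModelEndpoint(baseline_model, slow_training)
first = endpoint.predict(5)             # answered without waiting for training
endpoint.wait_until_trained()
second = endpoint.predict(5)            # answered by the trained model
```

Callers use the same `predict` method in both phases, so the switch needs no change on the client side.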

Autonomous vehicle operation feature monitoring and evaluation of effectiveness

Methods and systems for monitoring use and determining risks associated with operation of a vehicle having one or more autonomous operation features are provided. According to certain aspects, operating data may be recorded during operation of the vehicle. This may include information regarding the vehicle, the vehicle environment, use of the autonomous operation features, and/or control decisions made by the features. The control decisions may include actions the feature would have taken to control the vehicle, but which were not taken because a vehicle operator was controlling the relevant aspect of vehicle operation at the time. The operating data may be recorded in a log, which may then be used to determine risk levels associated with vehicle operation based upon risk levels associated with the autonomous operation features. The risk levels may further be used to adjust an insurance policy associated with the vehicle.

Methods, systems, and computer readable media for mask embedding for realistic high-resolution image synthesis
11580673 · 2023-02-14

The subject matter described herein includes methods, systems, and computer readable media for mask embedding for realistic high-resolution image synthesis. One method for mask embedding for realistic high-resolution image synthesis includes receiving, as input, a mask embedding vector and a latent features vector, wherein the mask embedding vector acts as a semantic constraint; generating, using a trained image synthesis algorithm and the input, a realistic image, wherein the realistic image is constrained by the mask embedding vector; and outputting, by the trained image synthesis algorithm, the realistic image to a display or a storage device.

Multimodal based punctuation and/or casing prediction

Techniques for predicting punctuation and casing using multimodal fusion are described. An exemplary method includes processing generated text by: tokenizing the generated text into sub-words, and generating a sequence of lexical features for the sub-words using a pre-trained lexical encoder; processing audio by: generating a sequence of frame level acoustic embeddings using a pre-trained acoustic encoder on the audio, and generating task specific embeddings from the frame level acoustic embeddings; performing multimodal fusion of the sub-word level acoustic embeddings and the sequence of lexical features by: aligning the task specific embeddings to the sequence of lexical features, and combining the sequence of lexical features and aligned acoustic sequence; predicting punctuation and casing from the combined sequence of lexical features and aligned acoustic sequence; concatenating the sub-words of the text, and applying the predicted punctuation and casing; and outputting text having the predicted punctuation and casing.
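
The align-and-combine step can be sketched as follows. The abstract does not say how alignment is done, so this sketch assumes that each sub-word comes with a known span of audio frames, that alignment is mean pooling over that span, and that combining is plain concatenation. The names `mean_pool`, `fuse`, and `spans` are hypothetical.

```python
def mean_pool(frames):
    """Average a list of equal-length vectors element-wise."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def fuse(lexical, acoustic_frames, spans):
    """Align frame-level vectors to sub-words, then concatenate per sub-word.

    spans[i] = (start, end) frame indices (end exclusive) for sub-word i.
    This span-based alignment is an assumption, not taken from the source.
    """
    fused = []
    for lex_vec, (start, end) in zip(lexical, spans):
        aligned = mean_pool(acoustic_frames[start:end])
        fused.append(lex_vec + aligned)   # list concatenation of both modalities
    return fused


# Two sub-words with 2-dimensional lexical features (toy values).
lexical = [[1.0, 0.0], [0.0, 1.0]]
# Five audio frames with 1-dimensional task specific embeddings (toy values).
acoustic_frames = [[2.0], [4.0], [6.0], [8.0], [10.0]]
# Sub-word 0 covers frames 0-1, sub-word 1 covers frames 2-4.
spans = [(0, 2), (2, 5)]

fused = fuse(lexical, acoustic_frames, spans)
# fused == [[1.0, 0.0, 3.0], [0.0, 1.0, 8.0]]
```

The fused sequence has one vector per sub-word, which is the shape a punctuation and casing classifier over sub-words would need.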

Method for training speech recognition model, method and system for speech recognition

Disclosed are a method for training a speech recognition model, and a method and a system for speech recognition. The disclosure relates to the field of speech recognition and includes: inputting an audio training sample into the acoustic encoder to represent acoustic features of the audio training sample in an encoded way and determine an acoustic encoded state vector; inputting a preset vocabulary into the language predictor to determine a text prediction vector; inputting the text prediction vector into the text mapping layer to obtain a text output probability distribution; calculating a first loss function according to a target text sequence corresponding to the audio training sample and the text output probability distribution; inputting the text prediction vector and the acoustic encoded state vector into the joint network to calculate a second loss function; and performing iterative optimization according to the first loss function and the second loss function.

Reinforcement learning using a relational network for generating data encoding relationships between entities in an environment

A neural network system is proposed, including an input network for extracting, from state data, respective entity data for each of a plurality of entities which are present, or at least potentially present, in the environment. The entity data describes the entity. The neural network contains a relational network for parsing this data, which includes one or more attention blocks which may be stacked to perform successive actions on the entity data. The attention blocks each include a respective transform network for each of the entities. The transform network for each entity is able to transform data which the transform network receives for the entity into modified entity data for the entity, based on data for a plurality of the other entities. An output network is arranged to receive data output by the relational network, and use the received data to select a respective action.
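
A stackable attention block over entity vectors can be sketched as follows. This is a simplified stand-in, not the claimed architecture: it assumes scaled dot-product attention with no learned projections and a residual connection, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_block(entities):
    """Return modified entity vectors, each informed by all entity vectors.

    Assumption: queries, keys, and values are the raw entity vectors
    (no learned weights), and the result is added back as a residual.
    """
    dim = len(entities[0])
    scale = math.sqrt(dim)
    modified = []
    for query in entities:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in entities]
        weights = softmax(scores)
        mixed = [sum(w * value[d] for w, value in zip(weights, entities))
                 for d in range(dim)]
        modified.append([x + m for x, m in zip(query, mixed)])
    return modified


entities = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
once = attention_block(entities)
twice = attention_block(once)   # blocks can be stacked
```

Because the output has the same shape as the input, several blocks can be applied in sequence before an output network selects an action.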

Convolutional layer acceleration unit, embedded system having the same, and method for operating the embedded system

Disclosed herein are a convolutional layer acceleration unit, an embedded system having the convolutional layer acceleration unit, and a method for operating the embedded system. The method for operating an embedded system, the embedded system performing an accelerated processing capability programmed using a Lightweight Intelligent Software Framework (LISF), includes initializing and configuring, by a parallelization managing function entity (FE), entities present in resources for performing mathematical operations in parallel, and processing in parallel, by an acceleration managing FE, the mathematical operations using the configured entities.

Automated personalized classification of journey data captured by one or more movement-sensing devices

A technique is described herein for automatically logging journeys taken by a user, and then automatically classifying the purposes of the journeys. In one implementation, the technique obtains journey data from one or more movement-sensing devices as a user travels from a starting location to an ending location in a vehicle. The technique generates a set of features based on the journey data, and then uses a machine-trainable model (such as a neural network) to make its classification based on the features. The machine-trainable model accepts at least one feature that is based on statistical information regarding at least one aspect of prior journeys that the user has taken. Overall, the technique provides a resource-efficient solution that rapidly provides personalized results to individual respective users. In some implementations, the technique performs its personalization without sharing journey data with a remote server.
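
The use of a feature derived from a user's prior journeys can be sketched as follows. A logistic model with hand-picked weights stands in for the machine-trainable model. The feature set, the weights, the labels "business" and "personal", and all function names are assumptions for illustration only.

```python
import math


def prior_business_rate(history, destination):
    """Fraction of this user's prior journeys to the destination labelled 'business'."""
    labels = [label for dest, label in history if dest == destination]
    if not labels:
        return 0.5   # no history for this destination: neutral value (assumption)
    return sum(1 for label in labels if label == "business") / len(labels)


def classify(journey, history, weights, bias):
    """Score a journey using journey data plus one statistic over prior journeys."""
    features = [
        journey["distance_km"],
        1.0 if journey["weekday"] else 0.0,
        prior_business_rate(history, journey["destination"]),
    ]
    z = bias + sum(w * f for w, f in zip(weights, features))
    probability = 1.0 / (1.0 + math.exp(-z))
    label = "business" if probability >= 0.5 else "personal"
    return label, probability


# Hypothetical per-user history, kept locally: (destination, label) pairs.
history = [("office", "business"), ("office", "business"), ("gym", "personal")]
weights = [0.01, 0.5, 3.0]   # illustrative values, not trained
bias = -2.0

to_office = classify({"distance_km": 12.0, "weekday": True, "destination": "office"},
                     history, weights, bias)
to_gym = classify({"distance_km": 3.0, "weekday": False, "destination": "gym"},
                  history, weights, bias)
```

Everything in the sketch runs on data held in `history`, which is consistent with performing personalization without sending journey data to a remote server.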

Method for generating a model for generating a synthetic ECG and a method and system for analysis of heart activity

A method of generating a model for generating a synthetic electrocardiography (ECG) signal comprises: receiving subject-specific training data for machine learning, said training data comprising a photoplethysmography (PPG) signal acquired from the subject and an ECG signal acquired from the subject, wherein the ECG signal provides a ground truth of the subject for associating the ECG signal with the PPG signal; using associated pairs of a time-series of the PPG signal and a corresponding time-series of the ECG signal as input to a deep neural network, DNN; and determining, through the DNN, a subject-specific model relating the PPG signal of the subject to the ECG signal of the subject for converting the PPG signal to a synthetic ECG signal using the subject-specific model.
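
The preparation of associated PPG and ECG time-series pairs can be sketched as follows. The sketch assumes both signals are already synchronized and sampled at the same rate, and that fixed-length sliding windows are used; neither detail is stated in the abstract. The function name `paired_windows` and the toy signals are hypothetical.

```python
def paired_windows(ppg, ecg, window, stride):
    """Cut two aligned signals into matching (ppg_window, ecg_window) pairs.

    Assumes ppg[i] and ecg[i] were recorded at the same instant.
    """
    if len(ppg) != len(ecg):
        raise ValueError("signals must have the same number of samples")
    pairs = []
    for start in range(0, len(ppg) - window + 1, stride):
        pairs.append((ppg[start:start + window], ecg[start:start + window]))
    return pairs


# Toy signals standing in for recordings from one subject.
ppg = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ecg = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]

pairs = paired_windows(ppg, ecg, window=4, stride=3)
# Windows start at samples 0, 3, and 6, giving three training pairs.
```

Each pair supplies a PPG window as network input and the matching ECG window as ground truth for training a subject-specific model.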

Processing communications signals using a machine-learning network

Methods, systems, and apparatus, including computer programs encoded on computer-storage media, for processing communications signals using a machine-learning network are disclosed. In some implementations, pilot and data information are generated for a data signal. The data signal is generated using a modulator for orthogonal frequency-division multiplexing (OFDM) systems. The data signal is transmitted through a communications channel to obtain modified pilot and data information. The modified pilot and data information are processed using a machine-learning network. A prediction corresponding to the data signal transmitted through the communications channel is obtained from the machine-learning network. The prediction is compared to a set of ground truths and updates, based on a corresponding error term, are applied to the machine-learning network.
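
The predict, compare, and update loop can be sketched with a deliberately small model. Here a single complex weight trained by a least-mean-squares rule stands in for the machine-learning network, and a flat one-tap channel stands in for the communications channel. The OFDM modulator is omitted, and all names and values are assumptions.

```python
def train_equalizer(pilots_tx, pilots_rx, step=0.5, epochs=20):
    """Learn one complex weight w so that w * received approximates transmitted."""
    w = 1 + 0j
    for _ in range(epochs):
        for truth, received in zip(pilots_tx, pilots_rx):
            prediction = w * received
            error = truth - prediction             # compare to ground truth
            w += step * error * received.conjugate()   # update from the error term
    return w


# Hypothetical flat channel with unit magnitude.
channel = 0.6 + 0.8j

# Known pilot symbols (QPSK-like points on the unit circle).
pilots_tx = [1 + 0j, 0 + 1j, -1 + 0j, 0 - 1j]
pilots_rx = [channel * symbol for symbol in pilots_tx]

w = train_equalizer(pilots_tx, pilots_rx)

# Apply the learned weight to a received data symbol.
data_tx = 0 + 1j
recovered = w * (channel * data_tx)
```

With these values the learned weight approaches the inverse of the channel, so `recovered` lands close to the transmitted data symbol. A real system would replace the single weight with a network and include noise and multipath effects.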