G06N3/092

METHOD AND APPARATUS FOR AUTONOMOUS DRIVING CONTROL BASED ON ROAD GRAPHICAL NEURAL NETWORK
20230211799 · 2023-07-06

Provided are an autonomous driving control apparatus and method based on a Road-GNN. By using road graph-based data, a network can more accurately and efficiently understand road shape information, and driving performance is improved.

Method and Device for Optimum Parameterization of a Driving Dynamics Control System for Vehicles

A method and device parameterize a driving dynamics controller of a vehicle, which intervenes in a controlling manner in the driving dynamics of the vehicle. The driving dynamics controller ascertains an action depending on a vehicle state. The method includes providing a model for predicting a vehicle state. The model is configured to predict a subsequent vehicle state depending on the vehicle state and the action. At least one data tuple is ascertained, including a sequence of vehicle states and respectively associated actions. The vehicle states are ascertained by the driving dynamics controller using the model depending on an ascertained action. The parameters of the driving dynamics controller are adjusted such that a cost function is minimized, the cost function ascertaining costs of the trajectory depending on the vehicle states and on the ascertained actions respectively associated with those vehicle states, and being dependent on the parameters of the driving dynamics controller.
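The parameterization loop described in this abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the application: a scalar vehicle state, a linear model, a proportional controller with a single parameter `gain`, a quadratic cost, and a plain grid search standing in for whatever optimizer is actually used.

```python
def model(state, action, a=1.0, b=0.5):
    """Hypothetical linear model: predicts the subsequent vehicle state."""
    return a * state + b * action


def controller(state, gain):
    """Hypothetical proportional controller: the action depends on the state."""
    return -gain * state


def rollout(gain, initial_state=1.0, horizon=20):
    """Build one data tuple: a sequence of (state, action) pairs from the model."""
    trajectory = []
    state = initial_state
    for _ in range(horizon):
        action = controller(state, gain)
        trajectory.append((state, action))
        state = model(state, action)
    return trajectory


def cost(gain, action_weight=0.1):
    """Quadratic cost of the trajectory over states and associated actions."""
    return sum(s * s + action_weight * a * a for s, a in rollout(gain))


def tune_gain():
    """Grid search (an assumed optimizer): pick the gain with the lowest cost."""
    candidates = [i * 0.05 for i in range(41)]  # gains from 0.0 to 2.0
    return min(candidates, key=cost)


if __name__ == "__main__":
    best = tune_gain()
    print("selected gain:", best, "cost:", cost(best))
```

The cost depends on the controller parameter only through the simulated trajectory, which is why each candidate gain requires a fresh rollout of the model.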

DECISION OPTIMIZATION UTILIZING TABULAR DATA

A computer-implemented method for automated policy decision making optimization is disclosed. The computer-implemented method includes creating a dataset from a tabular database, wherein the dataset includes one or more columns selected as state variables, a column selected as action variables, and a column selected as reward variables. The computer-implemented method further includes determining a candidate function approximator Q based on applying at least one state variable, one action variable, and one reward variable to a trained regression model. The computer-implemented method further includes learning a decision policy based on applying the candidate function approximator Q to a reinforcement learning algorithm. The computer-implemented method further includes determining, based on the learned decision policy, an expected reward.
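As a rough illustration of the pipeline in this abstract, the Python sketch below goes from table rows to a Q estimate, a decision policy, and an expected reward. It is a one-step simplification under stated assumptions: the column names and data are hypothetical, a per-group mean stands in for the trained regression model, and a greedy argmax stands in for the reinforcement learning algorithm.

```python
from collections import defaultdict

# Hypothetical table: one state column, one action column, one reward column.
rows = [
    {"segment": "new", "offer": "discount", "revenue": 5.0},
    {"segment": "new", "offer": "discount", "revenue": 7.0},
    {"segment": "new", "offer": "none", "revenue": 2.0},
    {"segment": "loyal", "offer": "discount", "revenue": 4.0},
    {"segment": "loyal", "offer": "none", "revenue": 6.0},
    {"segment": "loyal", "offer": "none", "revenue": 8.0},
]


def fit_q(rows, state_col, action_col, reward_col):
    """Stand-in regression: mean observed reward per (state, action) pair."""
    totals = defaultdict(lambda: [0.0, 0])
    for row in rows:
        key = (row[state_col], row[action_col])
        totals[key][0] += row[reward_col]
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def greedy_policy(q):
    """For each state, choose the action with the highest estimated Q value."""
    best = {}
    for (state, action), value in q.items():
        if state not in best or value > q[(state, best[state])]:
            best[state] = action
    return best


def expected_reward(rows, q, policy, state_col):
    """Average Q value of the policy's action over the states in the table."""
    values = [q[(row[state_col], policy[row[state_col]])] for row in rows]
    return sum(values) / len(values)


q = fit_q(rows, "segment", "offer", "revenue")
policy = greedy_policy(q)
reward = expected_reward(rows, q, policy, "segment")
```

A multi-step setting would also need a next-state column and a bootstrapped target; the sketch omits both to keep the column-to-variable mapping visible.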

Data retrieval using reinforced co-learning for semi-supervised ranking
11544553 · 2023-01-03

A computer-implemented method comprises: training a classifier with labeled data from a dataset; classifying, by the trained classifier, unlabeled data from the dataset; providing, by the classifier to a policy gradient, a reward signal for each data/query pair; transferring, by the classifier to a ranker, learning; training, by the policy gradient, the ranker; ranking data from the dataset based on a query; and retrieving data from the ranked data in response to the query.

AI ENGINE-SUPPORTING DOWNLINK RADIO RESOURCE SCHEDULING METHOD AND APPARATUS

An Artificial Intelligence (AI) engine-supporting downlink radio resource scheduling method and apparatus are provided. The AI engine-supporting downlink radio resource scheduling method includes: constructing an AI engine, establishing a Socket connection between the AI engine and an Open Air Interface (OAI) system, and configuring the AI engine into an OAI running environment so that the AI engine replaces the Round-Robin scheduling algorithm and the fair Round-Robin scheduling algorithm adopted by Long Term Evolution (LTE) at the Media Access Control (MAC) layer in the OAI system for resource scheduling and takes over the downlink radio resource scheduling process; sending scheduling information to the AI engine through the Socket connection during the downlink radio resource scheduling process of the OAI system; and utilizing the AI engine to carry out resource allocation according to the scheduling information, and returning a resource allocation result to the OAI system.

METHOD AND APPARATUS TO FACILITATE GENERATING A LEAF SEQUENCE FOR A MULTI-LEAF COLLIMATOR

A memory has a fluence map that corresponds to a particular patient stored therein. This memory also has at least one deep learning model stored therein trained to deduce a leaf sequence for a multi-leaf collimator from a fluence map. A control circuit operably coupled to that memory iteratively optimizes a radiation treatment plan to administer therapeutic radiation to that patient by, at least in part, generating a leaf sequence as a function of the at least one deep learning model and the fluence map that corresponds to the patient.

REINFORCEMENT-LEARNING BASED SYSTEM FOR CAMERA PARAMETER TUNING TO IMPROVE ANALYTICS

A method for automatically adjusting camera parameters to improve video analytics accuracy during continuously changing environmental conditions is presented. The method includes capturing a video stream from a plurality of cameras, performing video analytics tasks on the video stream, the video analytics tasks defined as analytics units (AUs), applying image processing to the video stream to obtain processed frames, filtering the processed frames through a filter to discard low-quality frames, and dynamically fine-tuning parameters of the plurality of cameras. The fine-tuning includes passing the filtered frames to an AU-specific proxy quality evaluator, employing State-Action-Reward-State-Action (SARSA) reinforcement learning (RL) computations to automatically fine-tune the parameters of the plurality of cameras, and, based on the reinforcement computations, applying a new policy for an agent to take actions and learn to maximize a reward.
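The SARSA computation named in this abstract follows the standard on-policy update, in which the target uses the action actually chosen in the next state. The Python sketch below shows that update in a toy loop. The environment is an assumption for illustration only: a single discretized camera setting with five levels, three adjustment actions, and a hypothetical `proxy_quality` function standing in for the AU-specific proxy quality evaluator.

```python
import random

ACTIONS = (-1, 0, 1)  # hypothetical: lower, keep, or raise one camera setting
LEVELS = range(5)     # hypothetical discretized setting, e.g. brightness 0..4


def proxy_quality(level, best_level=3):
    """Stand-in for the quality evaluator: higher is better, peak at best_level."""
    return -abs(level - best_level)


def choose_action(q, state, rng, epsilon=0.1):
    """Epsilon-greedy selection over the current Q estimates."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.5, gamma=0.9):
    """SARSA: move Q(s, a) toward r + gamma * Q(s', a') for the chosen a'."""
    target = reward + gamma * q[(next_state, next_action)]
    q[(state, action)] += alpha * (target - q[(state, action)])


def train(steps=2000, seed=0):
    """Run the agent on the toy setting and return the learned Q table."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in LEVELS for a in ACTIONS}
    state = 0
    action = choose_action(q, state, rng)
    for _ in range(steps):
        next_state = min(max(state + action, 0), 4)  # clamp to valid levels
        reward = proxy_quality(next_state)
        next_action = choose_action(q, next_state, rng)
        sarsa_update(q, state, action, reward, next_state, next_action)
        state, action = next_state, next_action
    return q
```

Because the update uses the next action the agent really takes, exploration noise is reflected in the learned values, which distinguishes SARSA from off-policy Q-learning.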

SYSTEM AND METHOD FOR PROVIDING AUTOMATIC GUIDANCE IN DATA FLOW JOURNEYS
20220414527 · 2022-12-29

A system and method of offering task-specific guidance to users of software. The system and method can intelligently determine which task the user is likely performing and what sequence of steps (data journey) will offer the user the most efficient route in completing the task. In some embodiments, the proposed system collects data representing in-app behavior for a large group of users in order to train a model that will predict what the user's next actions are likely to be. Furthermore, in some cases, current data for a user may include screen captures or other image data that can be compared with stored image data in order to help identify the user's current task.

CONTROLLING AGENTS INTERACTING WITH AN ENVIRONMENT USING BRAIN EMULATION NEURAL NETWORKS
20220414419 · 2022-12-29

In one aspect, there is provided a method performed by one or more data processing apparatus for selecting actions to be performed by an agent interacting with an environment, the method including, at each of multiple time steps, receiving an observation characterizing a current state of the environment at the time step, providing an input including the observation to an action selection neural network having a brain emulation sub-network with an architecture that is based on synaptic connectivity between biological neurons in a brain of a biological organism, processing the input including the observation characterizing the current state of the environment at the time step using the action selection neural network having the brain emulation sub-network to generate an action selection output, and selecting an action to be performed by the agent at the time step based on the action selection output.

LEARNING ROBOTIC SKILLS WITH IMITATION AND REINFORCEMENT AT SCALE

Utilizing an initial set of offline positive-only robotic demonstration data for pre-training an actor network and a critic network for robotic control, followed by further training of the networks based on online robotic episodes that utilize the network(s). Implementations enable the actor network to be effectively pre-trained, while mitigating occurrences of and/or the extent of forgetting when further trained based on episode data. Implementations additionally or alternatively enable the actor network to be trained to a given degree of effectiveness in fewer training steps. In various implementations, one or more adaptation techniques are utilized in performing the robotic episodes and/or in performing the robotic training. The adaptation techniques can each, individually, result in one or more corresponding advantages and, when used in any combination, the corresponding advantages can accumulate. The adaptation techniques include Positive Sample Filtering, Adaptive Exploration, Using Max Q Values, and Using the Actor in CEM.