CONTROL DEVICE, ROBOT, AND ROBOT SYSTEM

A control device includes a processor that is configured to execute computer-executable instructions so as to control a robot, wherein the processor is configured to calculate, by using machine learning, an image processing parameter related to image processing on an image of a target object captured by a camera, detect the target object on the basis of an image on which the image processing is performed by using the calculated image processing parameter, and control the robot on the basis of a detection result of the target object.
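The abstract leaves the learning procedure unspecified. As a minimal sketch of the idea — searching for an image-processing parameter that maximizes a detection score — a toy exhaustive search over a binarization threshold can stand in for the learning step (the score function and intensity values here are hypothetical):

```python
def detection_score(threshold, image):
    # Toy proxy for detection quality: how closely the pixels passing the
    # threshold match the pixels of the (assumed bright) target object.
    hits = sum(1 for p in image if p >= threshold)
    target = sum(1 for p in image if p >= 180)  # assumed object intensity
    return 1.0 - abs(hits - target) / max(len(image), 1)

def learn_threshold(image):
    # Stand-in for the patent's unspecified machine-learning step:
    # score every candidate threshold and keep the best one.
    best = max(range(256), key=lambda t: detection_score(t, image))
    return best, detection_score(best, image)
```

A real system would replace the exhaustive search with a learned model and the proxy score with feedback from the actual detector.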

CONTROL DEVICE, ROBOT, AND ROBOT SYSTEM
20180225113 · 2018-08-09

A control device includes a processor that is configured to execute computer-executable instructions so as to control a robot, wherein the processor is configured to calculate a force control parameter related to force control of a robot by using machine learning, and control the robot on the basis of the calculated force control parameter.
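Here too the learning step is unspecified. As a hedged illustration, selecting a force-control parameter (e.g. a stiffness gain) can be sketched as picking the candidate with the lowest simulated contact error; the plant model below is entirely hypothetical:

```python
def contact_error(stiffness):
    # Hypothetical plant response: tracking error after a contact task,
    # minimized at an intermediate stiffness (too soft drifts, too stiff
    # oscillates). A real system would measure this on the robot.
    return abs(stiffness - 40.0) + 5.0

def learn_stiffness(candidates):
    # Stand-in for the learning step: choose the force-control parameter
    # with the lowest observed error.
    return min(candidates, key=contact_error)
```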

Bayesian-centric autonomous robotic learning
09984332 · 2018-05-29

Various apparatus and methods include autonomous robot operations to perturb a current Bayesian equation and determine whether the perturbed Bayesian equation yields an improved probability of success of achieving a goal relative to the current Bayesian equation. In an illustrative example, the perturbation may modulate a coefficient of a parameter in the Bayesian equation. In some examples, the perturbation may include an assessment of whether adding or removing a parameter would improve the probability of success of achieving the goal. The parameters of the Bayesian equation may include, for example, current state information, alone or in combination with sensor input values and/or historical information. In some implementations, the robot may advantageously optimize its operations autonomously by perturbing a current Bayesian equation associated with, for example, a current goal, sub-goal, task, or probability-of-success criterion.
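The perturb-and-test loop described here can be sketched in a few lines. The "Bayesian equation" below is a toy logistic model of success probability (an assumption, not the patent's formulation); only perturbations that improve the probability are kept:

```python
import math
import random

def success_prob(coeffs, params):
    # Toy "Bayesian equation": probability of success as a logistic
    # function of coefficient-weighted parameters (state, sensor values).
    z = sum(c * p for c, p in zip(coeffs, params))
    return 1.0 / (1.0 + math.exp(-z))

def perturb_step(coeffs, params, scale=0.2, rng=random):
    # Modulate one coefficient; keep the perturbed equation only if it
    # yields an improved probability of success of achieving the goal.
    i = rng.randrange(len(coeffs))
    trial = list(coeffs)
    trial[i] += rng.uniform(-scale, scale)
    if success_prob(trial, params) > success_prob(coeffs, params):
        return trial
    return coeffs
```

By construction, iterating `perturb_step` never decreases the success probability, which is the core of the autonomous optimization the abstract describes.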

System(s) and method(s) of using imitation learning in training and refining robotic control policies

Implementations described herein relate to training and refining robotic control policies using imitation learning techniques. A robotic control policy can be initially trained based on human demonstrations of various robotic tasks. Further, the robotic control policy can be refined based on human interventions while a robot is performing a robotic task. In some implementations, the robotic control policy may determine whether the robot will fail in performance of the robotic task, and prompt a human to intervene in performance of the robotic task. In additional or alternative implementations, a representation of the sequence of actions can be visually rendered for presentation to the human, so that the human can proactively intervene in performance of the robotic task.
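The intervention-driven refinement loop can be illustrated with a deliberately simple stand-in policy: 1-nearest-neighbor behavioral cloning over scalar states, with a distance-based failure predictor (all thresholds and names here are assumptions, not the patent's method):

```python
def nearest_action(dataset, state):
    # 1-nearest-neighbor behavioral cloning over (state, action) pairs
    # collected from human demonstrations.
    _, action = min(dataset, key=lambda sa: abs(sa[0] - state))
    return action

def likely_to_fail(dataset, state, radius=1.0):
    # Hypothetical failure predictor: prompt a human whenever the current
    # state is far from every demonstrated state.
    return min(abs(s - state) for s, _ in dataset) > radius

def step(dataset, state, ask_human):
    # Refine the policy online: human interventions are appended to the
    # demonstration set, so the same situation is handled autonomously later.
    if likely_to_fail(dataset, state):
        action = ask_human(state)
        dataset.append((state, action))
        return action
    return nearest_action(dataset, state)
```

After one intervention in a novel state, the refined dataset covers that state and the policy no longer needs to prompt the human there.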

SYSTEM AND METHOD OF CONTROLLING ROBOT
20170113350 · 2017-04-27

A robot control system includes: an interface configured to receive a user input; a controller configured to generate a motion command corresponding to the user input and a motion command group including the motion command, and to generate hardware-interpretable data by analyzing the motion command; and a driver configured to drive a motion of at least one hardware module based on the hardware-interpretable data to be interpreted by the at least one hardware module.
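The controller-to-driver pipeline can be sketched as a command analyzer plus a dispatching driver. The command grammar and module names below are invented for illustration; the patent does not specify a data format:

```python
def analyze(command):
    # Hypothetical analysis step: translate a high-level motion command
    # into per-module data that a hardware driver can interpret.
    verb, *args = command.split()
    if verb == "move":
        return {"joint_driver": [float(a) for a in args]}
    if verb == "stop":
        return {"joint_driver": []}
    raise ValueError(f"unrecognized motion command: {command}")

def drive(data, modules):
    # Driver: hand each hardware module the data addressed to it.
    for name, setpoints in data.items():
        modules[name](setpoints)
```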

SERVO CONTROL SYSTEM EQUIPPED WITH LEARNING CONTROL APPARATUS HAVING FUNCTION OF OPTIMIZING LEARNING MEMORY ALLOCATION
20170031349 · 2017-02-02 ·

A servo control system for controlling a plurality of axes of a machine tool comprises: a plurality of servo control units for controlling the plurality of axes, respectively; a plurality of learning control units, provided one each in the plurality of servo control units, each configured to control a cyclic operation with high precision; a common learning memory for storing correction data generated by at least a portion of the plurality of learning control units; a memory allocation unit for allocating at least a portion of a memory area in the learning memory to the axis controlled by the learning control unit that generated the correction data; and a memory amount notifying unit for notifying the memory allocation unit of the amount of memory that each of the plurality of learning control units of the respective axes requires.
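The allocation scheme can be sketched as follows. The proportional-scaling policy for an oversubscribed memory is an assumption for illustration; the patent only states that allocation is based on the amounts each axis's learning control unit reports:

```python
def allocate(total_bytes, required):
    # required: bytes each axis's learning control unit reports that it
    # needs (the role of the memory amount notifying unit).
    need = sum(required.values())
    if need <= total_bytes:
        return dict(required)
    # Shared learning memory is oversubscribed: scale each axis's share
    # proportionally so the correction data still fits (assumed policy).
    return {axis: total_bytes * r // need for axis, r in required.items()}
```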

SYSTEMS AND METHODS FOR SKILL LEARNING WITH MULTIPLE CRITICS
20250164966 · 2025-05-22

Systems and methods are disclosed for determining a policy to recommend transition in a position-representing space for a robotic device using a multi-critic architecture. To learn policy in a multi-critic architecture, a set of critics is defined pertaining to a position-representing space where each critic corresponds to a different objective function such as reach-reward, discovery-reward, and safety-reward. For each one of the critics of the set of critics, a learned value function in position-representing space is determined. The policy is learned based on the weighted feedback of the learned value functions to recommend transitions that are safe in the position-representing space. The multi-critic architecture minimizes interference between multiple reward functions and learns a safe and stable policy for the robotic device.
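The weighted-feedback idea can be illustrated in a 1-D position space. The critics below are hand-written value functions standing in for the learned ones, and the greedy one-step policy is an assumption for brevity:

```python
def weighted_value(critics, weights, pos):
    # Combine per-objective value functions (reach-reward, safety-reward,
    # ...) with scalar weights, as in the weighted-feedback policy update.
    return sum(w * v(pos) for v, w in zip(critics, weights))

def recommend(critics, weights, pos, moves=(-1, 0, 1)):
    # Recommend the transition in a 1-D position-representing space that
    # maximizes the weighted critic feedback.
    return max((pos + m for m in moves),
               key=lambda p: weighted_value(critics, weights, p))
```

With a reach critic pulling toward a goal and a safety critic penalizing forbidden positions, the recommended transitions approach the goal only while staying safe — the safety critic vetoes otherwise attractive moves without interfering with the reach objective elsewhere.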

Systems and methods for skill learning with multiple critics
12393175 · 2025-08-19

Systems and methods are disclosed for determining a policy to recommend transition in a position-representing space for a robotic device using a multi-critic architecture. To learn policy in a multi-critic architecture, a set of critics is defined pertaining to a position-representing space where each critic corresponds to a different objective function such as reach-reward, discovery-reward, and safety-reward. For each one of the critics of the set of critics, a learned value function in position-representing space is determined. The policy is learned based on the weighted feedback of the learned value functions to recommend transitions that are safe in the position-representing space. The multi-critic architecture minimizes interference between multiple reward functions and learns a safe and stable policy for the robotic device.

Device and Method for Natural Language Controlled Industrial Assembly Robotics

A computer-implemented method of determining actions for controlling a robot, in particular an assembly robot, includes (i) receiving a first and a second input, wherein the first input is a sentence describing an action to be carried out by the robot and the second input is an image of a current state of an environment of the robot, (ii) feeding the first input into a first machine learning model and the second input into a second machine learning model, wherein the first and second machine learning models are configured to determine tokens for their respective inputs, and (iii) feeding the tokens into a third machine learning model, wherein the third machine learning model produces two outputs: the first output is a switch for incorporating specialized skill networks, and the second output is a set of actions.
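The three-model pipeline can be sketched with trivial stand-ins for the encoders and the fusion model. The tokenizers, skill names, and placeholder action output below are all invented for illustration:

```python
def text_tokens(sentence):
    # First model (stand-in): tokenize the instruction sentence.
    return [sum(map(ord, word)) % 97 for word in sentence.split()]

def image_tokens(pixels):
    # Second model (stand-in): summarize the scene image as one token.
    return [sum(pixels) % 97]

def fuse(t_tokens, i_tokens, skills):
    # Third model (stand-in): from the combined tokens, output a switch
    # selecting a specialized skill network, plus an action sequence.
    switch = skills[(sum(t_tokens) + sum(i_tokens)) % len(skills)]
    actions = ["approach", "execute"]  # placeholder action output
    return switch, actions
```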

System and Method for Controlling Robotic Manipulator with Self-Attention Having Hierarchically Conditioned Output

A method for controlling a robotic manipulator according to a task comprises accepting a feedback signal including a sequence of multi-modal observations of a state of execution of the task. The multi-modal observations are processed with a neural network having a self-attention module with a hierarchically conditioned output to produce a skill of the robotic manipulator and an action conditioned on the skill. The neural network is trained in a supervised manner with demonstration data to produce a sequence of skills and a corresponding sequence of actions for the actuators of the robotic manipulator to perform the task. The method further comprises determining one or more control commands for the one or more actuators based on the produced action and submitting the one or more control commands to the one or more actuators causing a change of the state of execution of the task.
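The hierarchically conditioned output — a skill chosen first, then an action conditioned on that skill — can be sketched without any learned weights. Mean pooling stands in for self-attention, and the heads below are hand-written assumptions:

```python
def attend(observations):
    # Stand-in for the self-attention module: pool the sequence of
    # multi-modal observation vectors into one context vector.
    n = len(observations)
    return [sum(col) / n for col in zip(*observations)]

def skill_head(context, skills):
    # Upper level of the hierarchy: choose a skill from the context.
    return skills[max(range(len(context)), key=context.__getitem__) % len(skills)]

def action_head(context, skill):
    # Lower level: the action output is conditioned on the chosen skill
    # (here, a skill-specific gain applied to the context).
    gain = {"reach": 1.0, "insert": 0.1}[skill]
    return [gain * c for c in context]
```

In the trained system both heads share the attention trunk, so the action head sees the same context the skill decision was made from — which is the point of conditioning the output hierarchically.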