
Robot system

A robot system includes: at least one non-learned robot that has not learned a learning compensation amount of position control based on an operation command; at least one learned robot that has learned the learning compensation amount of the position control based on the operation command; and a storage device that stores the operation command and the learning compensation amount of the learned robot, the non-learned robot comprising a compensation amount estimation unit that corrects the learning compensation amount of the learned robot stored in the storage device based on a difference between the operation command of the learned robot stored in the storage device and its own operation command, and estimates the corrected learning compensation amount as its own learning compensation amount.
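As a rough sketch of the estimation above: the stored compensation of the learned robot is adjusted by the difference between the stored operation command and the non-learned robot's own command. The linear correction law and the unit gain are illustrative assumptions; the abstract does not specify the correction function.

```python
import numpy as np

def estimate_compensation(stored_cmd, stored_comp, own_cmd, gain=1.0):
    """Estimate a learning compensation amount for a non-learned robot.

    Corrects the learned robot's stored compensation by the difference
    between its stored operation command and the own robot's command.
    The additive linear correction and the gain are assumptions made
    for illustration only.
    """
    diff = np.asarray(own_cmd) - np.asarray(stored_cmd)
    return np.asarray(stored_comp) + gain * diff

# stored command [1.0, 2.0] with compensation [0.1, -0.2];
# the own robot's command differs by +0.5 on each axis
comp = estimate_compensation([1.0, 2.0], [0.1, -0.2], [1.5, 2.5])
```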

SIMULATION APPARATUS
20190351520 · 2019-11-21

A simulation apparatus includes a machine learning device for learning a change in a machining route in machining of a workpiece. The machine learning device observes data indicating the changed machining route and data indicating a machining condition of the workpiece as a state variable, and also acquires determination data for determining whether or not a cycle time obtained by simulation using the changed machining route is appropriate, and learns by associating the machining condition of the workpiece with the change in the machining route, using the state variable and the determination data.
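The observe-determine-learn loop above might be sketched as a toy tabular learner that keeps, per machining condition, the route change with the best simulated cycle time. The dictionary storage and the cycle-time threshold are illustrative stand-ins for the machine learning device and its determination data.

```python
def learn_route_change(model, condition, route_change, cycle_time, limit):
    """Associate a machining condition with a route change, keeping the
    change whose simulated cycle time is best, and accepting it only
    when the cycle time is within an allowable limit (standing in for
    the determination data)."""
    if cycle_time <= limit:  # determination: cycle time is appropriate
        best = model.get(condition)
        if best is None or cycle_time < best[1]:
            model[condition] = (route_change, cycle_time)
    return model

# hypothetical condition and route-change labels, simulated cycle times
m = {}
learn_route_change(m, "steel-rough", "shortcut-A", 12.0, limit=15.0)
learn_route_change(m, "steel-rough", "shortcut-B", 10.5, limit=15.0)
```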

DEEP REINFORCEMENT LEARNING FOR ROBOTIC MANIPULATION
20240131695 · 2024-04-25

Implementations utilize deep reinforcement learning to train a policy neural network that parameterizes a policy for determining a robotic action based on a current state. Some of those implementations collect experience data from multiple robots that operate simultaneously. Each robot generates instances of experience data during iterative performance of episodes that are each explorations of performing a task, and that are each guided based on the policy network and the current policy parameters for the policy network during the episode. The collected experience data is generated during the episodes and is used to train the policy network by iteratively updating policy parameters of the policy network based on a batch of collected experience data. Further, prior to performance of each of a plurality of episodes performed by the robots, the current updated policy parameters can be provided (or retrieved) for utilization in performance of the episode.
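The collect-and-update loop could look roughly like the following toy sketch, where a scalar linear "policy" and a synthetic objective stand in for the policy neural network and the real manipulation task. The environment, reward, learning rate, and gradient form are all assumptions; only the structure (multiple robots generating episodes under the current parameters, pooled batches updating those parameters) mirrors the abstract.

```python
def run_episode(policy_params, robot_id, steps=3):
    """One robot's episode: experience collected while guided by the
    current policy parameters (fetched before the episode starts).
    Scalar state, linear policy, and toy reward are assumptions."""
    experience = []
    state = float(robot_id)
    for _ in range(steps):
        action = policy_params["w"] * state        # policy output
        reward = -abs(action - 1.0)                # toy task: drive action to 1
        experience.append((state, action, reward))
        state *= 0.5
    return experience

def update_policy(policy_params, batch, lr=0.1):
    """Gradient-style update of the policy parameters from a batch of
    pooled experience (squared-error surrogate for the toy task)."""
    grad = sum((a - 1.0) * s for s, a, _ in batch) / len(batch)
    policy_params["w"] -= lr * grad
    return policy_params

params = {"w": 0.0}
for _ in range(50):
    # both "robots" operate with the current (updated) parameters
    batch = [t for rid in (1, 2) for t in run_episode(params, rid)]
    update_policy(params, batch)
```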

ROBOT PATH GENERATING DEVICE AND ROBOT SYSTEM
20190314989 · 2019-10-17

To generate a more appropriate path, provided is a robot path generation device including circuitry configured to: hold a track planning module learning data set, in which a plurality of pieces of path data generated based on a motion constraint condition of a robot, and evaluation value data, which corresponds to each of the plurality of pieces of path data and is a measure under a predetermined evaluation criterion, are associated with each other; and generate, based on a result of a machine learning process that is based on the track planning module learning data set, a path of the robot between a freely set start point and a freely set end point.
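A minimal sketch of the data-set-driven generation step: a lookup over stored (path, evaluation value) pairs stands in for the trained track planning module, returning the best-scoring path between the requested endpoints. The tuple layout and scoring are illustrative assumptions; the real device generalizes beyond stored paths via learning.

```python
def generate_path(dataset, start, end):
    """Return the stored path between start and end with the highest
    evaluation value. Each dataset entry is an assumed tuple of
    (path, evaluation_value, start_point, end_point)."""
    candidates = [(score, path) for path, score, s, e in dataset
                  if s == start and e == end]
    if not candidates:
        return None
    return max(candidates)[1]  # highest evaluation value wins

# hypothetical learning data set: two candidate paths from A to C
data = [
    (["A", "B", "C"], 0.7, "A", "C"),
    (["A", "D", "C"], 0.9, "A", "C"),
]
best = generate_path(data, "A", "C")
```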

Automatic Robot Perception Programming by Imitation Learning

Apparatus, systems, methods, and articles of manufacture for automatic robot perception programming by imitation learning are disclosed. An example apparatus includes a percept mapper to identify a first percept and a second percept from data gathered from a demonstration of a task and an entropy encoder to calculate a first saliency of the first percept and a second saliency of the second percept. The example apparatus also includes a trajectory mapper to map a trajectory based on the first percept and the second percept, the first percept skewed based on the first saliency, the second percept skewed based on the second saliency. In addition, the example apparatus includes a probabilistic encoder to determine a plurality of variations of the trajectory and create a collection of trajectories including the trajectory and the variations of the trajectory. The example apparatus also includes an assemble network to imitate an action based on a first simulated signal from a first neural network of a first modality and a second simulated signal from a second neural network of a second modality, the action representative of a perceptual skill.
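The saliency-skewed trajectory mapping and the probabilistic variations might be sketched as follows, with entropy as the saliency measure, a saliency-weighted blend as the "skew", and Gaussian perturbations for the variations. All three choices are assumptions; the abstract names the components but not their exact mathematics.

```python
import math
import random

def saliency(probabilities):
    """Entropy of a percept's observation distribution, used as its
    saliency (an assumed reading of the entropy encoder)."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)

def map_trajectory(percept_a, percept_b, sal_a, sal_b):
    """Blend two percept vectors into one trajectory point, each
    skewed (weighted) by its saliency; the convex blend is assumed."""
    total = sal_a + sal_b
    return [(sal_a * a + sal_b * b) / total
            for a, b in zip(percept_a, percept_b)]

def trajectory_variations(traj, n=3, scale=0.05, seed=0):
    """Sample perturbed copies of a trajectory, mimicking the
    probabilistic encoder's collection of variations."""
    rng = random.Random(seed)
    return [traj] + [[x + rng.gauss(0, scale) for x in traj]
                     for _ in range(n)]

# a uniform (high-entropy) percept outweighs a peaked (low-entropy) one
point = map_trajectory([0.0, 1.0], [2.0, 3.0],
                       saliency([0.5, 0.5]), saliency([0.9, 0.1]))
```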

CONTROLLER AND MACHINE LEARNING DEVICE
20190291271 · 2019-09-26

A machine learning device is provided in a versatile controller capable of inferring command data to be issued to each axis of a robot. The device includes: an axis angle conversion unit that calculates, from trajectory data, an amount of change of an axis angle of an axis of the robot; a state observation unit that observes axis angle data relating to the amount of change of the axis angle as a state variable representing a current state of an environment; a label data acquisition unit that acquires axis angle command data relating to command data for the axis of the robot as label data; and a learning unit that learns the amount of change of the axis angle of the axis of the robot and the command data for the axis in association with each other by using the state variable and the label data.
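The supervised pairing of state variable (axis-angle change) and label (axis command) could be sketched as a one-parameter least-squares fit; the linear model `command ≈ k · delta_angle` is an illustrative stand-in for the learning unit, and the sample values are invented.

```python
def fit_axis_command(samples):
    """Least-squares slope relating the amount of change of an axis
    angle (state variable) to the axis command data (label data),
    assuming the linear model command = k * delta_angle through the
    origin."""
    num = sum(delta * cmd for delta, cmd in samples)
    den = sum(delta * delta for delta, _ in samples)
    return num / den

# hypothetical (delta_angle, command) pairs derived from trajectory data
k = fit_axis_command([(1.0, 2.1), (2.0, 3.9), (0.5, 1.0)])
predicted = k * 1.5  # inferred command for a new axis-angle change
```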

ADAPTIVE PREDICTOR APPARATUS AND METHODS
20190255703 · 2019-08-22

Apparatus and methods for training and operation of robotic devices. A robotic controller may comprise a predictor apparatus configured to generate motor control output. The predictor may be operable in accordance with a learning process based on a teaching signal comprising the control output. An adaptive controller block may provide a control output that may be combined with the predicted control output. The predictor learning process may be configured to learn the combined control signal. Predictor training may comprise a plurality of trials. During an initial trial, the control output may be capable of causing a robot to perform a task. During intermediate trials, individual contributions from the controller block and the predictor may be inadequate for the task. Upon learning, the control knowledge may be transferred to the predictor so as to enable task execution in the absence of subsequent inputs from the controller. The control output and/or predictor output may comprise multi-channel signals.
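The controller-to-predictor transfer above can be sketched with scalar signals: the controller supplies a corrective teaching signal, the combined output drives the task, and the predictor learns that combined signal until the controller's contribution is no longer needed. The error-correcting controller and the learning rate are assumptions for illustration.

```python
def train_predictor(trials, target, lr=0.5):
    """Over repeated trials, the predictor learns the combined control
    signal. As its output approaches the target, the controller's
    corrective contribution shrinks, leaving the predictor able to
    execute the task alone. Scalar signals are an assumption."""
    predicted = 0.0
    for _ in range(trials):
        controller_out = target - predicted       # corrective teaching signal
        combined = predicted + controller_out     # combined control output
        predicted += lr * (combined - predicted)  # learn the combined signal
    return predicted

# after ten trials the predictor output is close to the target
out = train_predictor(trials=10, target=1.0)
```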

DEEP REINFORCEMENT LEARNING FOR ROBOTIC MANIPULATION
20190232488 · 2019-08-01

Implementations utilize deep reinforcement learning to train a policy neural network that parameterizes a policy for determining a robotic action based on a current state. Some of those implementations collect experience data from multiple robots that operate simultaneously. Each robot generates instances of experience data during iterative performance of episodes that are each explorations of performing a task, and that are each guided based on the policy network and the current policy parameters for the policy network during the episode. The collected experience data is generated during the episodes and is used to train the policy network by iteratively updating policy parameters of the policy network based on a batch of collected experience data. Further, prior to performance of each of a plurality of episodes performed by the robots, the current updated policy parameters can be provided (or retrieved) for utilization in performance of the episode.

APPARATUS AND METHODS FOR OPERATING ROBOTIC DEVICES USING SELECTIVE STATE SPACE TRAINING
20190217467 · 2019-07-18

Apparatus and methods for training and control of, e.g., robotic devices. In one implementation, a robot may be utilized to perform a target task characterized by a target trajectory. The robot may be trained by a user using supervised learning. The user may interface with the robot, such as via a control apparatus configured to provide a teaching signal to the robot. The robot may comprise an adaptive controller comprising a neuron network, which may be configured to generate actuator control commands based on the user input and the output of the learning process. During one or more learning trials, the controller may be trained to navigate a portion of the target trajectory. Individual trajectory portions may be trained during separate training trials. Some portions may be associated with the robot executing complex actions and may require additional training trials and/or more dense training input compared to simpler trajectory actions.

Machine learning device that performs learning using simulation result, machine system, manufacturing system, and machine learning method
10317854 · 2019-06-11

A machine learning device that learns a control command for a machine by machine learning, including a machine learning unit that performs the machine learning to output the control command; a simulator that performs a simulation of a work operation of the machine based on the control command; and a first determination unit that determines the control command based on an execution result of the simulation by the simulator.
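The simulate-then-determine loop might be sketched as follows: candidate commands are scored by a toy simulator and the first determination unit accepts the best one only if it meets a criterion. The quadratic cost, tolerance, and candidate values are all assumptions standing in for the real work-operation simulation.

```python
def simulate(command):
    """Toy stand-in for the simulator: the simulated error of the
    machine's work operation under a scalar command (the quadratic
    cost with optimum at 3.0 is an illustrative assumption)."""
    return (command - 3.0) ** 2

def determine_command(candidates, tolerance=0.5):
    """First determination unit: pick the candidate with the best
    simulation result, and accept it only if the result meets the
    tolerance criterion; otherwise no command is determined."""
    best = min(candidates, key=simulate)
    return best if simulate(best) <= tolerance else None

# hypothetical candidate control commands proposed by the learner
cmd = determine_command([1.0, 2.5, 4.0])
```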