Patent classifications
G05B2219/39289
Predictive robotic controller apparatus and methods
Robotic devices may be trained by a user guiding the robot along a target action trajectory using an input signal. A robotic device may comprise an adaptive controller configured to generate a control signal based on one or more of the user guidance, sensory input, a performance measure, and/or other information. Training may comprise a plurality of trials, wherein for a given context the user and the robot's controller may collaborate to develop an association between the context and the target action. Upon developing the association, the adaptive controller may be capable of generating the control signal and/or an action indication prior to and/or in lieu of user input. The predictive control functionality attained by the controller may enable autonomous operation of robotic devices, obviating the need for continued user guidance.
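As a rough illustration of the trial-based collaboration this abstract describes, the toy Python sketch below keeps a per-context association table, blends the user's teaching signal with its own prediction during training trials, and acts on the prediction alone once trained. The class name, the confidence bookkeeping, and the update rule are illustrative assumptions, not details taken from the patent.

import numpy as np

class PredictiveController:
    """Toy adaptive controller: learns a context -> action association
    from repeated user-guided trials (names and update rule are illustrative)."""

    def __init__(self, n_contexts, n_action_dims, learning_rate=0.3):
        self.weights = np.zeros((n_contexts, n_action_dims))  # learned associations
        self.confidence = np.zeros(n_contexts)                 # per-context training progress
        self.learning_rate = learning_rate

    def control(self, context, user_action=None):
        """Blend the user's teaching signal with the controller's prediction.
        With no user input, the prediction alone drives the actuators."""
        predicted = self.weights[context]
        if user_action is None:
            return predicted                                   # autonomous operation
        # During training trials, move the stored association toward the user's action.
        self.weights[context] += self.learning_rate * (user_action - predicted)
        self.confidence[context] = min(1.0, self.confidence[context] + 0.1)
        # Output a confidence-weighted mixture so user and controller collaborate.
        c = self.confidence[context]
        return c * self.weights[context] + (1.0 - c) * user_action

# Example training loop over several trials for one context.
ctrl = PredictiveController(n_contexts=4, n_action_dims=2)
target = np.array([0.5, -0.2])                                 # user's target action for context 1
for trial in range(10):
    command = ctrl.control(context=1, user_action=target)
print(ctrl.control(context=1))                                 # now close to target without user input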
Trajectory Planning with Droppable Objects
Example implementations may relate to methods and systems for determining a safe trajectory for movement of an object by a robotic system. According to these implementations, the robotic system may determine at least first and second candidate trajectories for moving the object. For at least a first point along the first candidate trajectory, the robotic system may determine a predicted cost of dropping the object at that point, and for at least a second point along the second candidate trajectory, it may determine a predicted cost of dropping the object at that point. Based on these predicted costs, the robotic system may select between the first and second candidate trajectories and then move the object along the selected trajectory.
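A minimal sketch of this selection idea, assuming per-point drop-probability and drop-cost models are available as callables; the point format, probabilities, and costs below are invented purely for illustration.

def expected_drop_cost(trajectory, drop_probability, drop_cost):
    """Sum over sampled points: probability of dropping there times the cost of a drop there.
    Both callbacks are placeholders for whatever models the system actually uses."""
    return sum(drop_probability(p) * drop_cost(p) for p in trajectory)

def select_trajectory(candidates, drop_probability, drop_cost):
    """Pick the candidate trajectory with the lowest predicted drop cost."""
    return min(candidates, key=lambda t: expected_drop_cost(t, drop_probability, drop_cost))

# Illustrative use: points are (x, y, carry_height); dropping from higher up is
# assumed to cost more (purely invented numbers).
traj_a = [(0.0, 0.0, 1.2), (0.5, 0.2, 1.5), (1.0, 0.4, 1.5)]
traj_b = [(0.0, 0.0, 1.2), (0.4, 0.6, 0.6), (1.0, 0.4, 0.6)]
prob = lambda p: 0.01                    # constant drop probability per point
cost = lambda p: p[2] * 10.0             # cost grows with carry height
best = select_trajectory([traj_a, traj_b], prob, cost)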
APPARATUS AND METHODS FOR OPERATING ROBOTIC DEVICES USING SELECTIVE STATE SPACE TRAINING
Apparatus and methods for training and controlling robotic devices. In one implementation, a robot may be utilized to perform a target task characterized by a target trajectory. The robot may be trained by a user using supervised learning. The user may interface with the robot, such as via a control apparatus configured to provide a teaching signal to the robot. The robot may comprise an adaptive controller comprising a neural network, which may be configured to generate actuator control commands based on the user input and the output of the learning process. During one or more learning trials, the controller may be trained to navigate a portion of the target trajectory. Individual trajectory portions may be trained during separate training trials. Some portions may be associated with the robot executing complex actions and may require additional training trials and/or denser training input compared to simpler trajectory actions.
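One way to read the last point is that complex trajectory portions get a larger training budget. The sketch below allocates trials and teaching-input density from an assumed per-portion complexity score; the scoring, trial counts, and rates are illustrative assumptions, not details from the patent.

def plan_training_schedule(portions, base_trials=3, max_trials=12):
    """Assign more trials (and denser teaching input) to the more complex
    trajectory portions. 'complexity' in [0, 1] is an assumed per-portion score."""
    schedule = []
    for name, complexity in portions:
        trials = base_trials + round(complexity * (max_trials - base_trials))
        teaching_rate_hz = 2.0 + complexity * 8.0    # denser teaching for harder segments
        schedule.append({"portion": name, "trials": trials, "teaching_rate_hz": teaching_rate_hz})
    return schedule

# Example: a grasp near obstacles is harder than a straight approach or retract.
portions = [("approach", 0.2), ("grasp_near_obstacle", 0.9), ("retract", 0.3)]
for entry in plan_training_schedule(portions):
    print(entry)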
Data-efficient hierarchical reinforcement learning
Training and/or utilizing a hierarchical reinforcement learning (HRL) model for robotic control. The HRL model can include at least a higher-level policy model and a lower-level policy model. Some implementations relate to technique(s) that enable more efficient off-policy training to be utilized in training of the higher-level policy model and/or the lower-level policy model. Some of those implementations utilize off-policy correction, which re-labels higher-level actions of experience data, generated in the past utilizing a previously trained version of the HRL model, with modified higher-level actions. The modified higher-level actions are then utilized to off-policy train the higher-level policy model. This can enable effective off-policy training despite the lower-level policy model being a different version at training time (relative to the version when the experience data was collected).
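The off-policy correction described here can be pictured as goal relabeling: pick the higher-level action (goal) that best explains the stored lower-level actions under the current lower-level policy. The sketch below uses a squared-error surrogate for action likelihood and Gaussian candidate goals sampled around the original; both choices, and all names, are assumptions for illustration rather than the patented method.

import numpy as np

def relabel_high_level_action(states, low_level_actions, original_goal,
                              low_level_policy, n_candidates=8, noise_scale=0.5,
                              rng=np.random.default_rng(0)):
    """Return the candidate goal that best explains the stored low-level actions
    under the *current* low-level policy (squared error as a likelihood surrogate)."""
    candidates = [original_goal] + [
        original_goal + rng.normal(scale=noise_scale, size=original_goal.shape)
        for _ in range(n_candidates)
    ]
    def fit(goal):
        # Lower is better: how closely the current policy, given this goal,
        # reproduces the actions recorded in the experience data.
        return sum(np.sum((low_level_policy(s, goal) - a) ** 2)
                   for s, a in zip(states, low_level_actions))
    return min(candidates, key=fit)

# Toy low-level policy: moves proportionally toward the goal.
policy = lambda state, goal: 0.5 * (goal - state)
states = [np.array([0.0, 0.0]), np.array([0.4, 0.1])]
actions = [np.array([0.5, 0.25]), np.array([0.3, 0.2])]
new_goal = relabel_high_level_action(states, actions,
                                     original_goal=np.array([1.0, 0.5]),
                                     low_level_policy=policy)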
System and method for controlling a robotic manipulator based on hierarchical reinforcement learning of control policies
A feedback controller for controlling a robotic manipulator to perform a task is provided. The feedback controller comprises a memory configured to store a hierarchical reinforcement learning (HRL) neural network including (i) a nominal control policy and (ii) a recovery control policy. The recovery control policy is trained based on a recovery policy reward to select a switch-to-nominal action that transfers control to the nominal control policy to perform the task. The recovery policy reward is dependent on the nominal policy reward used for training the nominal control policy. The feedback controller further comprises a processor configured to iteratively execute the HRL neural network to select at least one of a nominal action based on the nominal control policy and a recovery action based on the recovery control policy, and to control the robotic manipulator to perform the task based on the selected action.
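A rough sketch of how such a controller might alternate between the two policies, with the recovery policy handing control back by emitting a switch-to-nominal action. The environment interface, the safety predicate, and the sentinel value are placeholders, not the patented design.

SWITCH_TO_NOMINAL = "switch_to_nominal"

def run_episode(env, nominal_policy, recovery_policy, is_safe, max_steps=200):
    """Alternate between policies: the recovery policy takes over when the state
    is unsafe and hands control back by selecting its switch-to-nominal action.
    The policies, env, and is_safe predicate are placeholders."""
    state = env.reset()
    active = "nominal"
    for _ in range(max_steps):
        if active == "nominal" and not is_safe(state):
            active = "recovery"
        if active == "recovery":
            action = recovery_policy(state)
            if action is SWITCH_TO_NOMINAL:          # trained hand-back action
                active = "nominal"
                action = nominal_policy(state)
        else:
            action = nominal_policy(state)
        state, _, done = env.step(action)            # assumed (state, reward, done) interface
        if done:
            break
    return state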