Patent classifications
G05B2219/33056
Automatic control artificial intelligence device and method for update control function
Disclosed herein is an automatic control artificial intelligence device including a collection unit configured to acquire an output value according to control of a control system; and an artificial intelligence unit operably coupled to the collection unit and configured to: communicate with the collection unit; set at least one of one or more base lines and a reward based on a gap between the one or more base lines and the output value, according to a plurality of operation goals of the control system; and update a control function for providing a control value to the control system by performing reinforcement learning based on the gap between the one or more base lines and the output value.
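The reward described in this abstract is derived from the gap between one or more base lines and the observed output value. A minimal sketch of that idea, assuming the function and variable names (`compute_reward`, `base_lines`, `output_value`), which are illustrative and not taken from the patent:

```python
def compute_reward(base_lines, output_value):
    """Reward grows (toward zero) as the output value approaches
    the nearest base line; larger gaps are penalized more."""
    gap = min(abs(b - output_value) for b in base_lines)
    return -gap

# Example: output 12.0 against base lines 10.0 and 20.0 -> gap 2.0
r = compute_reward([10.0, 20.0], 12.0)
```

A reinforcement learner maximizing this reward is driven to keep the control system's output close to at least one of the base lines set for the operation goals.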
Tuning of axis control of multi-axis machines
A system for tuning of axis control of a multi-axis machine and a method of operating the same are provided. The system includes a knowledge base for acquiring and maintaining factual knowledge associated with the tuning of the axis control. The factual knowledge has a uniform ontology and a uniform data representation, and includes known input facts associated with known output facts. The system further includes an inference unit for automatically inferring new output facts associated with given new input facts in accordance with the factual knowledge.
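A minimal sketch of the inference step: known input facts are associated with known output facts, and new output facts are inferred for new input facts. The subset-matching rule and all fact names below are assumptions for illustration; the patent does not specify the inference mechanism.

```python
def infer_outputs(factual_knowledge, new_inputs):
    """Return every known output fact whose associated input facts
    are all present among the new input facts.

    factual_knowledge: dict mapping frozensets of input facts
    to sets of output facts."""
    inferred = set()
    for input_facts, output_facts in factual_knowledge.items():
        if input_facts <= new_inputs:
            inferred |= output_facts
    return inferred

# Hypothetical tuning facts (illustrative only)
kb = {
    frozenset({"overshoot_high"}): {"reduce_gain"},
    frozenset({"response_slow"}): {"increase_gain"},
}
result = infer_outputs(kb, {"overshoot_high", "axis_x"})
```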
Machine learning device, control device and machine learning method
A machine learning device that performs reinforcement learning for a servo control device and optimizes a coefficient of a filter for attenuating a specific frequency component provided in the servo control device includes a state information acquisition unit which acquires state information that includes the result of calculation of at least one of an input/output gain of the servo control device and a phase delay of input and output, the coefficient of the filter and conditions, and an action information output unit which outputs, to the filter, action information including adjustment information of the coefficient. A reward output unit determines evaluation values under the conditions based on the result of the calculation to output, as a reward, the value of a sum of the evaluation values. A value function updating unit updates an action value function based on the value of the reward, the state information and the action information.
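The abstract names a reward output unit (the reward is the sum of per-condition evaluation values) and a value-function updating unit. The summation can be sketched directly; for the update, a standard tabular Q-learning rule is assumed below as one concrete instance — the patent does not specify this exact rule, and all names are illustrative.

```python
def reward_from_evaluations(evaluations):
    """Sum per-condition evaluation values into a scalar reward."""
    return sum(evaluations.values())

def update_q(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """One Q-learning step: move Q(s, a) toward
    reward + gamma * max over a' of Q(s', a')."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q
```

Here the state would encode the measured input/output gain and phase delay plus the current filter coefficient, and the actions would be coefficient adjustments.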
Sensor use and analysis for dynamic update of interaction in a social robot
A method of optimizing social interaction between a robot and a human. The method comprises generating and then executing a robot motion script for interaction with a human by a robot based on a characteristic detected by at least one of a plurality of sensors on the robot. The method further comprises detecting, by at least one sensor of the robot, a reaction of the human during a first period. The robot then analyzes the reaction of the human and assigns a positive or negative classification to the reaction based on a pre-defined mapping stored in the memory of the robot. The method further comprises modifying the robot motion script to incorporate a pre-defined modification based on the determination of a negative classification of the human reaction. The method further comprises executing the modified robot motion script during a second period to obtain an improved interaction with the human.
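A sketch of the classify-then-modify step described in the abstract. The reaction labels, the mapping contents, and the modification below are illustrative assumptions; the patent only says a pre-defined mapping in robot memory assigns a positive or negative class.

```python
# Hypothetical pre-defined mapping from detected reactions to classes
PREDEFINED_MAPPING = {"smile": "positive", "frown": "negative"}

def adapt_script(script, reaction, predefined_modification):
    """Incorporate the pre-defined modification only when the
    human reaction is classified as negative."""
    if PREDEFINED_MAPPING.get(reaction) == "negative":
        return script + [predefined_modification]
    return script

# A negative reaction during the first period triggers the change
modified = adapt_script(["wave", "speak"], "frown", "slow_down")
```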
TOOLPATH GENERATION BY REINFORCEMENT LEARNING FOR COMPUTER AIDED MANUFACTURING
Methods, systems, and apparatus, including medium-encoded computer program products, for computer aided design and manufacture of physical structures using toolpaths generated by reinforcement learning for use with subtractive manufacturing systems and techniques, include: obtaining, in a computer aided design or manufacturing program, a three dimensional model of a manufacturable object; generating toolpaths that are usable by a computer-controlled manufacturing system to manufacture at least a portion of the manufacturable object by providing at least a portion of the three dimensional model to a machine learning algorithm that employs reinforcement learning, wherein the machine learning algorithm includes one or more scoring functions that include rewards that correlate with desired toolpath characteristics comprising toolpath smoothness, toolpath length, and avoiding collision with the three dimensional model; and providing the toolpaths to the computer-controlled manufacturing system to manufacture at least the portion of the manufacturable object.
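One way to combine the three named toolpath characteristics into a scalar scoring function. The linear form, the weights, and the penalty value are assumptions; the patent only states that the rewards correlate with smoothness, length, and collision avoidance.

```python
def toolpath_score(smoothness, length, collided,
                   w_smooth=1.0, w_length=0.5, collision_penalty=100.0):
    """Higher smoothness raises the score, greater toolpath length
    lowers it, and any collision with the 3D model incurs a large
    penalty."""
    score = w_smooth * smoothness - w_length * length
    if collided:
        score -= collision_penalty
    return score
```

A reinforcement learner maximizing this score is pushed toward short, smooth, collision-free toolpaths, matching the desired characteristics listed in the abstract.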
METHOD AND DEVICE FOR OPERATING A ROBOT
Device and method for operating a robot. As a function of a first state of the robot and/or its surroundings and as a function of an output of a first model, a first part of a manipulated variable for activating the robot for a transition from the first state into a second state of the robot is determined. A second part of the manipulated variable is determined as a function of the first state and regardless of the first model. A quality measure is determined as a function of the first state and of the output of the first model using a second model. A parameter of the first model is determined as a function of the quality measure. A parameter of the second model is determined as a function of the quality measure and a setpoint value. The setpoint value is determined as a function of a reward.
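The manipulated variable described above has two parts: one produced by the first (learned) model and one determined from the state independently of that model. A minimal sketch, assuming an additive combination and the example control laws (both assumptions, not from the patent):

```python
def manipulated_variable(state, learned_model, model_free_part):
    """Sum the model-dependent part (from the first model) and the
    model-independent part (e.g., a fixed feedback law)."""
    return learned_model(state) + model_free_part(state)

# Illustrative: a small learned correction on top of a fixed law
u = manipulated_variable(2.0, lambda s: 0.1 * s, lambda s: -0.5 * s)
```

The second model in the abstract plays the role of a quality (critic) measure used to adapt the parameters of both parts.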
DEVICE AND METHOD FOR TRAINING A CONTROL STRATEGY WITH THE AID OF REINFORCEMENT LEARNING
A method for training a control strategy with the aid of reinforcement learning. The method includes carrying out passes in which, in each pass, an action to be carried out is selected for each state of a sequence of states of an agent. For at least some of the states, the particular action is selected by: specifying a planning horizon that predefines a number of states; ascertaining multiple sequences of states, reachable from the particular state, using the predefined number of states, by applying an answer set programming solver to an answer set programming program that models the relationship between actions and the successor states reached by the actions; selecting the sequence that delivers the maximum return; and selecting, as the action for the particular state, an action via which the first state of the selected sequence may be reached, starting from the particular state.
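The per-state selection step can be sketched once the solver has enumerated the candidate sequences: among the sequences reachable within the planning horizon, pick the one with the maximum return and take the action that leads to its first state. Representing each candidate as a `(first_action, sequence_return)` pair is an assumption; the ASP-solver enumeration itself is not modeled here.

```python
def select_action(candidates):
    """Return the first action of the candidate sequence with the
    maximum return.

    candidates: iterable of (first_action, sequence_return) pairs,
    one per sequence reachable within the planning horizon."""
    best_action, _ = max(candidates, key=lambda pair: pair[1])
    return best_action

# Three candidate sequences found by the solver (illustrative)
action = select_action([("left", 1.0), ("right", 3.0), ("wait", 2.0)])
```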
METHOD, SYSTEM, AND APPARATUS FOR FORMING A WORKPIECE
A system for straightening a workpiece includes a fabricating machine; a dimensional measurement system; a machine learning module; and a controller. The controller determines a plurality of design dimensions for the workpiece, and determines, via the dimensional measurement system, a plurality of initial dimensional parameters for the workpiece. A plurality of settings for the fabricating machine are determined, via the machine learning module, based upon the plurality of initial dimensional parameters for the workpiece and the plurality of design dimensions for the workpiece. The workpiece is secured in a fixture, and the fabricating machine is arranged employing the plurality of settings. The fabricating machine executes a plurality of operations on the workpiece employing the plurality of settings for the fabricating machine, and the dimensional measurement system verifies that the workpiece exhibits the plurality of design dimensions.
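A hypothetical sketch of how settings could be derived from the deviation between the initial measured dimensions and the design dimensions. The proportional correction and its gain are assumptions for illustration; the patent leaves this mapping to the machine learning module.

```python
def fabricating_settings(initial_dims, design_dims, gain=0.5):
    """Per-dimension setting proportional to the measured deviation
    from the design dimension (gain is an assumed constant)."""
    return {name: gain * (design_dims[name] - initial_dims[name])
            for name in design_dims}

# Illustrative: one dimension under by 1.0 unit, one already on spec
settings = fabricating_settings({"bow_x": 9.0, "bow_y": 5.0},
                                {"bow_x": 10.0, "bow_y": 5.0})
```

After the operations run with these settings, the measurement system re-checks the workpiece against the design dimensions, closing the loop.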
USER FEEDBACK FOR ROBOTIC DEMONSTRATION LEARNING
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for providing user feedback for robotic demonstration learning. One of the methods includes initiating a local demonstration learning process to collect respective local demonstration data for each of one or more demonstration subtasks defined by a skill template to be executed by a robot. Local demonstration data is repeatedly collected for each of the one or more demonstration subtasks of the skill template while a user manipulates a robot to perform each of the one or more demonstration subtasks defined by the skill template. A respective progress value for each of the one or more demonstration subtasks defined by the skill template is maintained. A user interface presentation is generated that presents a suggested demonstration to be performed by the user based on a respective progress value for each demonstration subtask.