Systems and Methods for Joint Design of Actuators and Control for Robots
20250303567 · 2025-10-02
Assignee
Inventors
- Yebin Wang (Cambridge, MA)
- Adrian Stein (Buffalo, NY, US)
- Jui-Te Lin (San Diego, CA, US)
- Zehui Lu (West Lafayette, IN, US)
Cpc classification
B25J9/1656
PERFORMING OPERATIONS; TRANSPORTING
G05B2219/40099
PHYSICS
B25J9/161
PERFORMING OPERATIONS; TRANSPORTING
B25J9/1664
PERFORMING OPERATIONS; TRANSPORTING
B25J9/163
PERFORMING OPERATIONS; TRANSPORTING
International classification
Abstract
An engineering system comprises a memory having instructions stored thereon and at least one processor configured to execute the instructions to cause the system to collect a plurality of tasks for a manipulator actuated by a motor. Structural parameters of the motor, a plurality of reference trajectories of the motor for actuating the manipulator to perform the plurality of tasks, and parameters of a feedback control policy for the manipulator are jointly determined to increase overlap between a probability distribution of values of operational data of the motor operating according to different real trajectories from a plurality of real trajectories and an efficiency map of the motor defined in a domain of the operational data of the motor. The structural parameters of the motor, the plurality of reference trajectories, and the feedback control policy are output for performing the plurality of tasks.
Claims
1. An engineering system, comprising: at least one processor; and a memory having instructions stored thereon that, when executed by at least one processor, cause the system to: collect a plurality of tasks for a manipulator actuated by one or multiple actuators including a motor; determine, jointly and in interdependence on each other, structural parameters of the motor, a plurality of reference trajectories of the motor for actuating the manipulator to perform the plurality of tasks, and parameters of a feedback control policy for the manipulator, to increase overlap between a probability distribution of values of operational data of the motor operating according to different real trajectories from a plurality of real trajectories and an efficiency map of the motor defined in a domain of the operational data of the motor; and output the structural parameters of the motor, the feedback control policy, and the plurality of reference trajectories for performing the plurality of tasks.
2. The engineering system of claim 1, wherein the processor is further configured to: generate a design of the motor according to the structural parameters; compute the plurality of reference trajectories for the motor to actuate the manipulator for performing the plurality of tasks; and determine parameters of the feedback control policy which commands the motor such that the manipulator follows the plurality of reference trajectories.
3. The engineering system of claim 1, wherein the domain of the operational data of the motor is a two-dimensional space defined by speed and torque of the motor.
4. The engineering system of claim 1, wherein the structural parameters of the motor define one or a combination of a permanent magnet thickness of the motor, a tooth width of the motor, a tooth height of the motor, and a slot opening of the motor.
5. The engineering system of claim 1, wherein each of the plurality of reference trajectories of the motor is defined by one or a combination of a state of the motor as a function of time, a control command to the motor as a function of time or a state of the manipulator as a function of time.
6. The engineering system of claim 1, wherein the processor is configured to determine the structural parameters of the motor, the parameters of the feedback control policy, and the plurality of reference trajectories of the motor jointly and in interdependence on each other as optimization parameters of an alternative optimization.
7. The engineering system of claim 1, wherein the processor is configured to iteratively determine the structural parameters, the parameters of the feedback control policy, and the plurality of reference trajectories of the motor until a termination condition is met, wherein, to perform a current iteration, the processor is configured to: determine current reference trajectories for the plurality of tasks that optimize a cost function based on an ideal differentiable simulator characterizing the motor and manipulator dynamics, wherein in the current iteration, the motor has values of the structural parameters determined during a previous iteration, and the motor and manipulator dynamic model parameters are updated based on values of the structural parameters determined during the previous iteration; train the feedback control policy to optimize a reward function indicating tracking performance of a manipulator control system during execution of the current reference trajectories, wherein the manipulator control system produces the plurality of real trajectories, and wherein the manipulator control system comprises at least the feedback control policy to be trained, a trajectory tracking controller, and a non-ideal simulator characterizing the motor and manipulator dynamics subject to uncertainties; determine a current probability distribution of values of operational data of the motor operating according to the real trajectories; and update the values of the structural parameters of the motor for the current iteration to increase the overlap of the efficiency map of the motor with the updated values of the structural parameters and the current probability distribution of values of the operational data of the motor.
8. The engineering system of claim 7, wherein the termination condition includes a condition that an error between structural parameters in the current iteration and the structural parameters determined during a previous iteration is below a threshold.
9. The engineering system of claim 8, wherein the threshold is defined as a sum of squares of the error.
10. The engineering system of claim 1, wherein to determine the reference trajectories for the plurality of tasks that optimize a cost function, the processor is configured to solve a motion planning problem using an ideal differentiable simulator where the dynamical models of the motor and manipulator are updated according to the latest structural parameters of the motor.
11. The engineering system of claim 1, wherein the processor determines the efficiency map of the motor according to design parametrization of 2D geometry of the motor and one or more operational constraints of the motor.
12. The engineering system of claim 1, wherein the processor is configured to determine the parameters of the feedback control policy for controlling the motor jointly and interdependently with the structural parameters of the motor to track the plurality of reference trajectories of the motor.
13. The engineering system of claim 12, wherein the feedback control policy includes a combination of a soft-actor-critic neural network and a classic position trajectory controller, wherein parameters of the soft-actor-critic neural network are updated according to the reference trajectories and the real trajectories to optimize a reward function indicating a degree of overlap between the real trajectories and the reference trajectories, and wherein the classic position trajectory controller comprises at least a feedforward controller and a proportional, integral and derivative (PID) controller.
14. The engineering system of claim 12, wherein the parameters of the soft-actor-critic neural network are updated by: fetching a reference trajectory of the plurality of reference trajectories; updating the parameters of the soft-actor-critic neural network by simulating a manipulator control system to track the fetched reference trajectory until the parameters converge; and repeating the fetching and the updating until all reference trajectories have been used to update the parameters of the soft-actor-critic neural network.
15. The engineering system of claim 12, wherein the parameters of the soft-actor-critic neural network are updated by: fetching a reference trajectory of the plurality of reference trajectories; updating the parameters of the soft-actor-critic neural network by simulating a manipulator control system to track the fetched reference trajectory; and repeating the fetching and the updating until the parameters of the soft-actor-critic neural network converge.
16. A computer-implemented method for jointly designing actuators and control for a robotic manipulator, the method comprising: collecting a plurality of tasks for the manipulator actuated by one or multiple actuators including a motor; determining, jointly and in interdependence on each other, structural parameters of the motor, a plurality of reference trajectories of the motor for actuating the manipulator to perform the plurality of tasks and parameters of a feedback control policy for the manipulator, to increase overlap between a probability distribution of values of operational data of the motor operating according to different real trajectories from a plurality of real trajectories and an efficiency map of the motor defined in a domain of the operational data of the motor; and outputting the structural parameters of the motor, the feedback control policy, and the plurality of reference trajectories for performing the plurality of tasks.
17. The method of claim 16, wherein the structural parameters of the motor define one or a combination of a permanent magnet thickness of the motor, a tooth width of the motor, a tooth height of the motor, and a slot opening of the motor.
18. The method of claim 16, wherein each of the plurality of reference trajectories of the motor is defined by one or a combination of a state of the motor as a function of time, a control command to the motor as a function of time, or a state of the manipulator as a function of time.
19. The method of claim 16, wherein the structural parameters of the motor, the parameters of the feedback control policy, and the plurality of reference trajectories of the motor are determined jointly and in interdependence on each other as optimization parameters of an alternative optimization.
20. The method of claim 16, wherein the structural parameters of the motor, the parameters of the feedback control policy, and the plurality of reference trajectories of the motor are determined iteratively until a termination condition is met, wherein a current iteration includes: determining current reference trajectories for the plurality of tasks that optimize a cost function based on an ideal differentiable simulator characterizing the motor and manipulator dynamics, wherein in the current iteration, the motor has values of the structural parameters determined during a previous iteration, and the motor and manipulator dynamic model parameters are updated based on values of the structural parameters determined during the previous iteration; training the feedback control policy to optimize a reward function indicating tracking performance of a manipulator control system during execution of the current reference trajectories, wherein the manipulator control system produces the plurality of real trajectories, and wherein the manipulator control system comprises at least the feedback control policy to be trained, a trajectory tracking controller, and a non-ideal simulator characterizing the motor and manipulator dynamics subject to uncertainties; determining a current probability distribution of values of operational data of the motor operating according to the real trajectories; and updating the values of the structural parameters of the motor for the current iteration to increase the overlap of the efficiency map of the motor with the updated values of the structural parameters and the current probability distribution of values of the operational data of the motor.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0020] The presently disclosed embodiments will be further explained with reference to the following drawings. The drawings shown are not necessarily to scale, with emphasis instead generally being placed upon illustrating the principles of the presently disclosed embodiments.
[0021]
[0022]
[0023]
[0024]
[0025]
[0026]
[0027]
[0028]
[0029]
[0030]
[0031]
[0032]
[0033]
[0034]
[0035]
[0036]
[0037]
[0038]
[0039]
[0040]
[0041]
[0042] While the above-identified drawings set forth presently disclosed embodiments, other embodiments are also contemplated, as noted in the discussion. This disclosure presents illustrative embodiments by way of representation and not limitation. Numerous other modifications and embodiments can be devised by those skilled in the art which fall within the scope and spirit of the principles of the presently disclosed embodiments.
DETAILED DESCRIPTION
[0043] The following description provides exemplary embodiments only, and is not intended to limit the scope, applicability, or configuration of the disclosure. Rather, the following description of the exemplary embodiments will provide those skilled in the art with an enabling description for implementing one or more exemplary embodiments. Contemplated are various changes that may be made in the function and arrangement of elements without departing from the spirit and scope of the subject matter disclosed as set forth in the appended claims.
[0044] Specific details are given in the following description to provide a thorough understanding of the embodiments. However, one of ordinary skill in the art will understand that the embodiments may be practiced without these specific details. For example, systems, processes, and other elements in the subject matter disclosed may be shown as components in block diagram form in order not to obscure the embodiments in unnecessary detail. In other instances, well-known processes, structures, and techniques may be shown without unnecessary detail in order to avoid obscuring the embodiments. Further, like reference numbers and designations in the various drawings may indicate like elements.
[0045] Also, individual embodiments may be described as a process which is depicted as a flowchart, a flow diagram, a data flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged. A process may be terminated when its operations are completed but may have additional steps not discussed or included in a figure. Furthermore, not all operations in any particularly described process may occur in all embodiments. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc. When a process corresponds to a function, the function's termination can correspond to a return of the function to the calling function or the main function.
[0046] Furthermore, embodiments of the subject matter disclosed may be implemented, at least in part, either manually or automatically. Manual or automatic implementations may be executed, or at least assisted, through the use of machines, hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. When implemented in software, firmware, middleware, or microcode, the program code or code segments to perform the necessary tasks may be stored in a machine-readable medium. One or more processors may perform the necessary tasks.
[0047]
[0048] Some example embodiments provide systems and methods for the joint design and control of a class of n-degree-of-freedom (DOF) open-chain manipulators, such as the manipulator 100 illustrated in
[0049] In some embodiments, a joint of the manipulator 100 may be of any suitable type including but not limited to: revolute, prismatic, helical etc. The movements of the joints of the manipulator 100 may be controlled by one or more actuators coupled to the joints such that the manipulator 100 can be moved in accordance with one or more control inputs to effectuate manipulation of the payload 107 along any dimension.
[0050] According to some embodiments, one or more joints, such as the first-axis joint 101a, may be of the revolute type.
[0051] In some embodiments, the electric motor is a surface-mounted permanent magnet synchronous motor (SPMSM), in which the electromagnetic interaction between the stator and the rotor is produced by supplying voltages to the stator windings, whereas the rotor may comprise multiple permanent magnet pieces.
[0052] According to some embodiments, the manipulator 100 may be a robot arm, which is a combination of joints, links and gearboxes, and multiple motors. Such a robot arm may be configured to move objects or payloads from an initial position to a desired final position.
[0053]
[0054] In some embodiments, the final position 110b may be predefined or specified by a user/operator and pre-coded in software, and thus may be independent of the sensing and perception module 120 and the environment 155. The drive unit 140 may be realized in the form of an inverter, a power electronics device that converts DC power into AC power according to the command input 136. The motors 145 may be of the surface-mounted permanent magnet synchronous motor (SPMSM) type, each containing three-phase windings in its stator. The three-phase SPMSM is supplied with three-phase voltages from the drive unit 140.
[0055] In some embodiments, the reference motion trajectory 131 of the robot arm may be represented by the trajectories of angles of all joints.
[0056] In some embodiments, the sensing and perception module 120 measures the currents flowing through the motors 145 at all joints, the rotor angles of all motors 145, etc., and thus the signal 121 comprises motor currents and angular positions of motor rotors. The sensing and perception module 120 may also sense the environment using suitable sensors, such as but not limited to a camera, LiDAR, or microphone, to detect the locations of static obstacles or the movements of dynamic obstacles. Hence, the signal 121 may also include the locations, sizes, shapes, and bounding boxes of objects in the environment that might critically impact the motion generation and/or its execution toward accomplishment of a certain task.
[0057] In some embodiments, the control command 136 provided to the drives 140 may be in the form of reference torques to be produced by the motors 145. In some embodiments, the drive unit 140 determines and outputs the three-phase voltages needed by the motors 145.
[0058] In some embodiments, it may be an objective of the manipulator 100 to complete a task as fast as possible (i.e., within a threshold time period). In some other embodiments, it may be an objective to complete a task within a fixed amount of time while minimizing the energy consumption (i.e., with a threshold energy expenditure). Accordingly, some embodiments are directed towards solutions that jointly design the motors 145, the tracking controller 135, and the motion planner 130 toward accomplishment of one or multiple tasks.
[0059] According to some embodiments, the one or more tasks may constitute a given application that may be specified by operators or other machines coupled to the manipulator 100. In some embodiments, the joint design of motors, tracking controller, and motion planner for a given application may be formulated as a constrained optimization problem and solved efficiently. The constrained optimization problem involves characterizing user specification in a mathematically rigorous way to be absorbed into the optimization problem, modeling and simulation of the manipulator, devising an optimization process, control policy synthesis, and optimization algorithms.
[0060]
[0061] The output 231 from the synthesizing step 230 includes the measured signals 121 of the closed-loop robot control system, which combines the feedback control policy and the non-ideal differentiable simulator 225. The output 231 is used to update 235 the motor design. At 240, if the convergence criteria are met, the optimization process stops and outputs the last motor design 236 updated at step 235; otherwise, the last motor design 236 is fed back into module 220, and the steps of determining 220, synthesizing 230, and updating 235 are repeated. Both the ideal differentiable simulator 210 and the non-ideal differentiable simulator 225 are initialized with the initial motor design but are updated in each iteration according to the last motor design produced at step 235.
[0062] In some embodiments, the convergence criteria reflect the difference between the latest motor design and the previous motor design. For instance, if the motor design is parameterized by a real vector, then the criterion can be defined on the sum of squares of the difference between the design vector at iteration k (the latest motor design) and the design vector at iteration k-1 (the previous motor design).
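The iteration of steps 220, 230, 235, and 240 can be sketched as follows. This is a minimal illustration only: the function names `plan`, `synthesize`, and `update`, the tolerance `tol`, and the toy update rule in the demonstration are hypothetical stand-ins, not the disclosed implementation. Only the loop structure and the sum-of-squares stopping test follow the description above.

```python
def sum_of_squares(current, previous):
    """Sum of squared differences between two design vectors."""
    return sum((c - p) ** 2 for c, p in zip(current, previous))


def codesign_loop(initial_design, plan, synthesize, update, tol=1e-6, max_iter=50):
    """Alternate planning, policy synthesis, and design update until convergence.

    plan, synthesize, and update are caller-supplied callables (hypothetical
    placeholders for steps 220, 230, and 235, respectively).
    """
    design = list(initial_design)
    for iteration in range(1, max_iter + 1):
        references = plan(design)                  # step 220: reference trajectories
        rollout = synthesize(design, references)   # step 230: closed-loop signals
        new_design = update(design, rollout)       # step 235: motor design update
        if sum_of_squares(new_design, design) < tol:   # step 240: convergence test
            return new_design, iteration
        design = new_design
    return design, max_iter


# Toy demonstration: the "update" moves the design halfway toward a fixed target.
target = [2.0, 4.0]
final_design, iterations = codesign_loop(
    [0.0, 0.0],
    plan=lambda design: design,
    synthesize=lambda design, references: references,
    update=lambda design, rollout: [x + 0.5 * (t - x) for x, t in zip(design, target)],
)
```

In the toy run the design contracts toward `target`, so the loop stops once two successive designs differ by less than `tol` in the sum-of-squares sense.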
[0063] Notations used in motor modeling are summarized in Table 1.
TABLE 1: Notations Used in the SPMSM Model
Notation | Description
 | rotor speed
.sub.m | permanent magnet flux
i.sub.d, i.sub.q | current in d- and q-axis
u.sub.d, u.sub.q | voltage in d- and q-axis
L.sub.d, L.sub.q | inductance in d- and q-axis
P | number of pole pairs
R | winding resistance
J | rotor inertia
 | load torque
[0064] Some embodiments are directed towards differentiable modeling for both motors and robot arm to analytically capture the relationship between decision variables and all equations and cost functions involved in the determination of reference motion trajectories 220, the synthesis of feedback control policy 230, and the update of motor design 235.
[0065]
[0066]
[0078] In some embodiments, the motor may be assumed to have zero skew, with a slot/pole ratio of 12/8 and a winding factor of k.sub.w1=0.866. Table 2 lists an example of the nominal values, units, and bounds of the motor geometry design variables.
TABLE 2: Motor Geometry Design Variables
Parameter | Nominal Value | Unit | Bounds [lower, upper]
L | 20 | mm | [20, 100]
R.sub.ro | 18 | mm | [10, 100]
R.sub.so | 30 | mm | [10, 100]
h.sub.m | 3 | mm | [1, 5]
h.sub.sy | 5 | mm | [5, 10]
w.sub.tooth | 7 | mm | [5, 20]
b.sub.0 | 2 | mm | [1, 10]
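As an illustration of how the bounds in Table 2 might be used during a design update, the sketch below stores the nominal values and bounds and provides a box-constraint check and a clamping projection. The values are copied from Table 2. The dictionary keys, the function names, and the use of clamping are assumptions made for this example and are not stated in the disclosure.

```python
# Nominal values and bounds copied from Table 2 (all in mm).
NOMINAL = {"L": 20.0, "R_ro": 18.0, "R_so": 30.0, "h_m": 3.0,
           "h_sy": 5.0, "w_tooth": 7.0, "b_0": 2.0}
BOUNDS = {"L": (20.0, 100.0), "R_ro": (10.0, 100.0), "R_so": (10.0, 100.0),
          "h_m": (1.0, 5.0), "h_sy": (5.0, 10.0), "w_tooth": (5.0, 20.0),
          "b_0": (1.0, 10.0)}


def violated_bounds(design):
    """Return the names of design variables lying outside their bounds."""
    return [name for name, value in design.items()
            if not BOUNDS[name][0] <= value <= BOUNDS[name][1]]


def project_to_bounds(design):
    """Clamp every design variable into its [lower, upper] interval."""
    return {name: min(max(value, BOUNDS[name][0]), BOUNDS[name][1])
            for name, value in design.items()}


# A candidate design that steps outside two of the intervals.
candidate = dict(NOMINAL, h_m=6.5, b_0=0.5)
projected = project_to_bounds(candidate)
```

Clamping is only one way to keep an update inside the box. A constrained solver would typically handle these bounds directly alongside the design constraints 411.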
[0079] Referring back to
where:
The stator weight is thus given by:
[0081] In some embodiments, the motor design constraint modeling 410 leads to the following design constraints 411 on material cost and compatibility of dimension:
[0083] According to some embodiments, magnetic equivalent circuit modeling 415 is performed as follows, based on the design parameters 406. The resistance of phase winding may be calculated as
where
is the number of slots per phase (in this work, m=3),
is the resistance per tooth, and $L_{coil}$ is the coil length. The coil length is $L_{coil} = 2(L + L_{end,av})$, where
is the average length of the end-winding of the coil and
is the arc span per slot.
[0084] Given Carter's coefficient
where
the magnetic flux density across the air-gap may be expressed as
[0085] The flux density of the first harmonic is $B_{g,1} = 4B_g/\pi$, and the flux per tooth per single turn is
Without skewness, the flux linkage is given by
where $k_w = k_p k_d$ is the winding factor with
is the number of slots per pole per phase and gcd() denotes the greatest common divisor.
[0086] The inductance is given by
where $L_{turn} = p_g + 3p_{so} + 3p_{tt}$ is the inductance per turn and per tooth. Here $p_g$, $p_{so}$, and $p_{tt}$, denoting the permeance of the magnetic path across the air gap, the permeance of the magnetic path across the slot opening, and the permeance of the curved magnetic path from tip to tip, respectively, are given by:
[0087] In some embodiments, the magnetic equivalent circuit modeling 415 based on design parameters 406 leads to the motor dynamic model 416 as follows:
[0090] The differentiable modeling 305 illustrated in
[0091] In some embodiments, a motor is subject to the following operational constraints 417 on currents and voltages:
[0093] In some embodiments, the operation point 418 of a motor may be represented by (, .sub.e, u.sub.ub, i.sub.ub). Referring to
[0095]
[0096] In some embodiments, the analytical formulas of $P_{mech}$, $P_{cu}$, $P_{hyst}$, and $P_{eddy}$ may be derived as follows. The maximum voltage $V_{emf,max}$ available to overcome the back electromotive force (back-EMF) is first derived as $V_{emf,max} = u_{ub} - R_{max} i_{ub}$, where $R_{max} i_{ub}$ is the maximum voltage drop induced by the maximum resistance and maximum current of the phase winding. In some embodiments, the max resistance 486 is approximated 485 by
[0098] Given the operation point 418, the current minimization control strategy is applied 460 to determine the steady-state motor currents 461 by solving the following problem:
[0100] In some embodiments, the hysteresis and eddy losses for a given motor design are derived as part of step 470 as
[0103] Given the iron losses (6), the mechanical power $P_{mech}$, and the approximate copper losses $P_{cu,ap} = R_0\left((i_d^*)^2 + (i_q^*)^2\right)$, where $(i_d^*, i_q^*)$ are the steady-state motor currents 461, the approximate input power 476 is calculated as $P_{in,ap} = P_{hyst} + P_{eddy} + P_{mech} + P_{cu,ap}$.
[0104] Next, $P_{in,ap}$ is used to determine 485 the temperature-dependent winding resistance as $R = R_0(1 + \alpha(T - T_0))$ with
where $P_{rated}$ is the rated power of the motor. Finally, the copper loss model is derived as $P_{cu} = R\left((i_d^*)^2 + (i_q^*)^2\right)$.
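The chain from approximate copper loss to the temperature-corrected copper loss can be sketched numerically as below. This is an illustrative sketch under stated assumptions: the winding temperature is passed in directly, because the expression relating it to the approximate input power is not reproduced in this text; the name `alpha` for the temperature coefficient of resistance and all numeric values in the example are hypothetical.

```python
def approximate_copper_loss(r0, i_d, i_q):
    """P_cu,ap = R_0 * (i_d^2 + i_q^2), using the nominal resistance R_0."""
    return r0 * (i_d ** 2 + i_q ** 2)


def approximate_input_power(p_hyst, p_eddy, p_mech, p_cu_ap):
    """P_in,ap = P_hyst + P_eddy + P_mech + P_cu,ap."""
    return p_hyst + p_eddy + p_mech + p_cu_ap


def winding_resistance(r0, alpha, temperature, t0):
    """R = R_0 * (1 + alpha * (T - T_0)); alpha is an assumed symbol name."""
    return r0 * (1.0 + alpha * (temperature - t0))


def copper_loss(resistance, i_d, i_q):
    """P_cu = R * (i_d^2 + i_q^2), using the temperature-corrected resistance."""
    return resistance * (i_d ** 2 + i_q ** 2)


# Hypothetical operating point: values chosen only to exercise the formulas.
p_cu_ap = approximate_copper_loss(r0=0.5, i_d=0.0, i_q=10.0)
p_in_ap = approximate_input_power(p_hyst=5.0, p_eddy=3.0, p_mech=400.0, p_cu_ap=p_cu_ap)
r_hot = winding_resistance(r0=0.5, alpha=0.0039, temperature=80.0, t0=20.0)
p_cu = copper_loss(r_hot, i_d=0.0, i_q=10.0)
```

Because the resistance grows with temperature, `p_cu` exceeds `p_cu_ap` whenever the winding runs hotter than the reference temperature.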
[0105] The dynamic model of an n-degree of freedom (DoF) manipulator 100 can be written as follows:
, {dot over ()}, and {umlaut over ()} are the angles, velocities, accelerations of links, with the units of rad, rad/s, rad/s.sup.2, respectively; subscript L.sub.k represents the link at the kth axis; .sub.L=[.sub.L.sub.
[0108] In some embodiments, the gearbox at the k+1.sup.th axis has a transmission ratio Z.sub.k+1. At the k.sup.th axis, the motor velocity .sub.k is proportional to the link velocity {dot over ()}.sub.L.sub.
[0109] In some embodiments, the design parameters of the motor at the kth axis are collected in a per-motor design vector, and the design parameters of all n motors are stacked into a single design vector. Let $i_{d,k}$, $i_{q,k}$, $u_{d,k}$, $u_{q,k}$, and the rotor speed of the kth motor be quantities associated with the motor at the kth axis (interchangeably, the kth motor). Denote the vectors of currents and control inputs of the kth motor as $i_k = [i_{d,k}, i_{q,k}]^\top$ and $u_k = [u_{d,k}, u_{q,k}]^\top$, respectively. Let $i = [i_1^\top, \ldots, i_n^\top]^\top$ and $u = [u_1^\top, \ldots, u_n^\top]^\top$ encapsulate the currents and control inputs of all motors, respectively. The state $x$ of the system including the motors and the robot manipulator stacks the motor currents $i$, the link angles, and the link velocities.
[0110]
The inertia of the stator of the motor can be derived similarly; the derivation is straightforward to those skilled in the art and is thus omitted. The inertia matrix functions 506 are used to update 510 the robot link inertia matrices as analytical functions of the motor design parameters. In particular, the inertia matrix function of the stator of the motor at the ith axis is incorporated into the inertia matrix function of the ith-axis link. With the updated inertia matrix functions of all links 511, the inertia matrix functions of the rotors and stators of all motors 506, and the motor dynamic model 416, an ideal differentiable simulator 210 can be implemented 515 by the algorithm 550 jointly illustrated in
[0111] The ideal differentiable simulator 210 implements the forward dynamics of the manipulator 100, including all motor dynamic models 416 and the forward dynamics of the robot arm, expressed as an analytical function of the motor design parameters. In the algorithm 550, p, A, M, G, $F_{tip}$, and V represent the biased force, screw axis, homogeneous transformation matrix, spatial inertia matrix, wrench at the tip, and twist, respectively. Other notations follow the book Modern Robotics: Mechanics, Planning, and Control by Kevin M. Lynch and Frank C. Park, the contents of which are incorporated in their entirety.
[0112] The forward dynamics of a robot arm is essentially a state-space representation of the system dynamics (8): $\dot{x}_{robot} = f(x_{robot}, u_{robot})$, where the state $x_{robot}$ of the robot arm stacks the link angles and link velocities, and the control input $u_{robot}$ of the robot arm is the vector of torques applied to the links. The notion of forward dynamics is well understood by those skilled in the art, and the forward dynamics can be derived in accordance with known methods.
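To make the state-space form concrete, the sketch below writes the forward dynamics of a single-link arm (a pendulum) as a function mapping state and input to the state derivative, together with one explicit Euler integration step. This is a deliberately simplified stand-in and not the algorithm 550: the single-link model, the parameter values, and the function names are assumptions chosen for illustration.

```python
import math


def gravity_torque(theta, mass=1.0, length=0.3, gravity=9.81):
    """Gravity torque on a single link whose angle is measured from the downward vertical."""
    return mass * gravity * length * math.sin(theta)


def forward_dynamics(x, u, inertia=0.05):
    """Return the state derivative (theta_dot, theta_ddot) for state x and torque u."""
    theta, theta_dot = x
    theta_ddot = (u - gravity_torque(theta)) / inertia
    return (theta_dot, theta_ddot)


def euler_step(x, u, dt):
    """Advance the state by one explicit Euler step of size dt."""
    dx = forward_dynamics(x, u)
    return (x[0] + dt * dx[0], x[1] + dt * dx[1])


# At rest, applying exactly the gravity torque keeps the link in place.
x_rest = (0.5, 0.0)
u_hold = gravity_torque(x_rest[0])
x_next = euler_step(x_rest, u_hold, dt=0.01)
```

The holding torque computed here is the single-link analogue of the balance torques obtained from inverse dynamics when constructing initial and final states for a task.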
[0113] In some embodiments, a non-ideal differentiable simulator 225 may implement the algorithm 550 of
[0114] In some embodiments, given differentiable models, any software tool supporting auto-differentiation, e.g., CasADi, may be employed to implement differentiable simulators to expedite computationally efficient optimization algorithms.
[0115] In some embodiments, the k.sup.th link inertia G.sub.L.sub.
[0116] In some embodiments, an application of interest to customers/operators may be characterized by a set of $n_t$ tasks 215a. Each task is distinctively described by a tuple consisting of the initial configuration, the final configuration, and the inertia matrix of the payload $M_p$. The initial and final velocities are typically zero. For the d-axis and q-axis motor currents corresponding to the initial and final configurations, because the initial and final velocities are zero, the d-axis current is chosen to be zero: $i_d(0) = i_d(t_f) = 0$. However, the q-axis currents $i_q(0)$ and $i_q(t_f)$, which reflect the torques necessary to maintain balance at the corresponding configurations, might not be zero. The necessary torques may be determined based on the inverse dynamics of the manipulator 100, and the q-axis currents then follow by solving the nonlinear algebraic equations induced by setting the right-hand side of the motor dynamic model 416 to 0. Hence, from the initial and final configurations, the initial state $x_0$ and final state $x_f$ may be readily obtained. The application of the inverse dynamics of a robot arm is straightforward for those skilled in the art and is thus omitted from this description.
[0117] In some embodiments, the initial position 110a and final position 110b for a given task (application data) 215a are represented by the initial configuration and final configuration, respectively, which can be used to infer the initial state and final state. In another embodiment, the initial position 110a and final position 110b for a given task 215a are represented by the initial state $x_0$ and final state $x_f$ of the system, respectively.
[0118] In some embodiments, the initial position 110a and final position 110b for a given task 215a are represented by the initial pose and final pose of the end-effector. The initial configuration and final configuration can then be inferred by solving an inverse kinematics problem, which is straightforward to those skilled in the art.
[0119] 215a where the design of all motors might be specified as an initial motor design 215b or an updated motor design 236. First, motor design 215b or 236 is passed into 605 to update the ideal differentiable simulator 210. The reference motion trajectories 611 for all tasks are obtained 610 based on the set of tasks 215a and the ideal differentiable simulator 210. In some embodiments, reference motion trajectories 611 for all n.sub.t tasks are given by ({circumflex over (x)}.sub.1*, . . . , {circumflex over (x)}.sub.n.sub.
[0120]
[0121] In some embodiments, the reference motion trajectory for each task is determined 625 by solving a motion planning problem posed as an optimal control problem (OCP). A fixed-final-time open-loop OCP can be formulated as follows. Given the dynamic models and constraints (1), (2), (3), (8), (9), the initial state $x_0$, the final state $x_f$, and the final time $t_f$, determine the optimal control $u^*$ that minimizes a certain cost function $J(x, u)$, defined as the integral of a running cost over the time interval $[0, t_f]$. In one embodiment, the cost can be written as follows:
[0123] According to some embodiments, one approach to solving the open-loop OCP is to transcribe it into a nonlinear optimization problem (NLP) by discretizing it over the given time interval [0, t.sub.f]. The corresponding nonlinear optimization problem is formulated as follows:
(X.sub.k, U.sub.k) is the discretized counterpart of J(x, u); f(⋅,⋅) is the discretized state dynamics induced from (2) and (8); c.sub.I(⋅,⋅) and c.sub.e(⋅,⋅) are the discretized constraints induced from (3) and (9). The nonlinear optimization problem can be solved by established numerical optimization algorithms, for instance an interior-point algorithm. The formulation of the open-loop OCP and its transcription into the corresponding nonlinear optimization problem, given a task characterized by its initial state x.sub.0, final state x.sub.f, and load M.sub.p, may be carried out using techniques known in the art.
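As a minimal sketch of such a transcription, consider a toy double integrator (position and velocity driven by an acceleration command) in place of the manipulator and motor models (2) and (8), with a quadratic control cost and no inequality constraints. Under these assumptions the discretized problem is an equality-constrained least-squares problem that can be solved in closed form, so the interior-point algorithm mentioned above is not used here. All names are hypothetical.

```python
def transcribe_and_solve(x0, xf, tf, n):
    """Minimize sum(u_k**2) * dt subject to forward-Euler double-integrator
    dynamics p' = v, v' = u and the terminal condition x_n = xf.

    Returns the control sequence u_0..u_{n-1} and the step dt.
    """
    if n < 2:
        raise ValueError("need at least two intervals")
    dt = tf / n
    # After eliminating the states, the terminal state is affine in the controls:
    #   p_n = p0 + tf * v0 + sum_k m_p[k] * u_k,   v_n = v0 + sum_k m_v[k] * u_k
    m_p = [dt * dt * (n - 1 - k) for k in range(n)]
    m_v = [dt] * n
    r_p = xf[0] - (x0[0] + tf * x0[1])
    r_v = xf[1] - x0[1]
    # Minimum-norm solution u = M^T (M M^T)^-1 r with a 2x2 Gram matrix.
    g11 = sum(a * a for a in m_p)
    g12 = sum(a * b for a, b in zip(m_p, m_v))
    g22 = sum(b * b for b in m_v)
    det = g11 * g22 - g12 * g12
    lam_p = (g22 * r_p - g12 * r_v) / det
    lam_v = (g11 * r_v - g12 * r_p) / det
    u = [a * lam_p + b * lam_v for a, b in zip(m_p, m_v)]
    return u, dt


def rollout(x0, u, dt):
    """Simulate the same forward-Euler dynamics to recover the state trajectory."""
    p, v = x0
    trajectory = [(p, v)]
    for u_k in u:
        p, v = p + dt * v, v + dt * u_k
        trajectory.append((p, v))
    return trajectory


controls, step = transcribe_and_solve(x0=(0.0, 0.0), xf=(1.0, 0.0), tf=1.0, n=50)
reference = rollout((0.0, 0.0), controls, step)
```

For the full problem, with nonlinear dynamics and the inequality constraints induced from (3) and (9), the states and controls would typically both be kept as decision variables and passed to a general NLP solver.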
[0125]
[0126]
[0127] In some embodiments, the measured signal 121 comprises the motor current, the speed, and the angles of all links. The reference trajectory 131 comprises the angle trajectories of all links of the robot arm. The pre-processing 730 outputs auxiliary signals consisting of the reference trajectory 131, the derivative of the reference trajectory 131, an error trajectory between 131 and 121, and the derivative of that error trajectory.
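The pre-processing described above may be sketched, for a single link sampled at a fixed period, as follows. The use of a forward finite difference for the derivatives and the dictionary layout are assumptions of this sketch; the function names are hypothetical.

```python
def finite_difference(samples, dt):
    """Forward-difference derivative; the last value is repeated so lengths match."""
    rates = [(b - a) / dt for a, b in zip(samples, samples[1:])]
    rates.append(rates[-1] if rates else 0.0)
    return rates


def preprocess(reference, measured, dt):
    """Build the four auxiliary signals from reference and measured link angles."""
    if len(reference) != len(measured):
        raise ValueError("reference and measured trajectories must have equal length")
    error = [r - m for r, m in zip(reference, measured)]  # reference minus measured
    return {
        "reference": list(reference),
        "reference_rate": finite_difference(reference, dt),
        "error": error,
        "error_rate": finite_difference(error, dt),
    }


signals = preprocess(reference=[0.0, 0.1, 0.2], measured=[0.0, 0.05, 0.2], dt=0.1)
```

For an arm with several links, the same computation would be applied to each link's angle trajectory.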
[0128] In some embodiments, the convergence criteria in 720 include a condition that the difference between the parameters of the latest control policy 716 and those of the control policy synthesized in the previous iteration is below a certain threshold. The threshold may be a user-defined parameter, chosen such that the time required for the training iterations is inversely proportional to the threshold. Alternatively, the convergence criterion in 720 is that the difference between the reward of the latest control policy 716 and the reward of the control policy synthesized in the previous iteration (the reward prediction error) is below a certain threshold.
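The two convergence tests may be sketched as follows, assuming that the policy parameters are available as a flat list of numbers and that the largest absolute change is the measure of parameter error. Both are assumptions of this sketch, and the function names are hypothetical.

```python
def parameters_converged(previous, latest, threshold):
    """True when the largest absolute parameter change is below the threshold."""
    if len(previous) != len(latest):
        raise ValueError("parameter vectors must have the same length")
    change = max((abs(a - b) for a, b in zip(previous, latest)), default=0.0)
    return change < threshold


def reward_converged(previous_reward, latest_reward, threshold):
    """True when the reward changed by less than the threshold between iterations."""
    return abs(latest_reward - previous_reward) < threshold


stop_on_parameters = parameters_converged([0.50, -1.20], [0.5004, -1.2001], threshold=1e-3)
stop_on_reward = reward_converged(-12.40, -12.05, threshold=0.1)
```

A smaller threshold makes either test harder to satisfy and therefore lengthens training, consistent with the relationship stated above.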
[0129] In some embodiments, a reinforcement learning (RL) framework may be exploited to train the learning-based controller 735 towards convergence. The RL problem centers on an agent that actively engages with its environment, monitoring states to determine actions that yield higher rewards. The RL agent interacts with the environment, which is implemented in simulation as the non-ideal differentiable simulator 225. The learning-based policy 735 is denoted as π. The RL algorithm tries to learn the policy π by evaluating the observation/measured data (auxiliary signal 731) against a certain reward function. The observation data are given by:
x=[θ*, {dot over (θ)}*, e*, {dot over (e)}*],
[0130] where θ* and {dot over (θ)}* are the reference motion trajectory 131 and its time derivative, respectively, e*=θ*−θ is the error between the reference trajectory θ* and the measured trajectory θ of the angles of all links, and {dot over (e)}* is the time derivative of e*. In some embodiments, the RL goal may be to train a control policy that can track any reference trajectory under bounded disturbances, and the reward function R may be designed as follows:
[0132]
[0133]
[0134] According to some embodiments, for the k.sup.th motor, its operational distribution S.sub.k(ω, τ.sub.e) over the speed-torque plane based on measured motion trajectories 711 or 121 (u.sub.1, . . . , u.sub.n.sub.
[0135] Given the distributions S.sub.k(ω, τ.sub.e) for 1≤k≤n, the motor design is updated by maximizing the overlap between each distribution S.sub.k(ω, τ.sub.e) and the efficiency map of the corresponding motor, for all 1≤k≤n. This is achieved by solving the following optimization problem 811:
∫∫S.sub.k(ω, τ.sub.e)(1−η.sub.k)dτ.sub.e dω, where η.sub.k, a function of the speed ω, the torque τ.sub.e, and the design parameters of the k.sup.th motor that admits the formula (4), represents the efficiency of the k.sup.th motor, and the integration is taken over the support of S.sub.k(ω, τ.sub.e) in the speed-torque plane. Note that for the k.sup.th motor, the design and operational constraints, (3) and (1), are imposed on every operating point over this support, which induces an infinite number of constraints and is thus impractical.
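As an illustration of how an operational distribution over the speed-torque plane may be approximated from sampled operating points, the following sketch bins (speed, torque) pairs into a rectangular grid and normalizes the counts. The grid bounds, grid size, and function name are hypothetical, and samples outside the bounds are simply dropped in this sketch.

```python
def operational_histogram(points, speed_range, torque_range, n_speed, n_torque):
    """Fraction of operating points falling in each cell of an n_speed x n_torque grid."""
    s_min, s_max = speed_range
    t_min, t_max = torque_range
    counts = [[0] * n_torque for _ in range(n_speed)]
    kept = 0
    for speed, torque in points:
        if not (s_min <= speed <= s_max and t_min <= torque <= t_max):
            continue  # outside the gridded region
        i = min(int((speed - s_min) / (s_max - s_min) * n_speed), n_speed - 1)
        j = min(int((torque - t_min) / (t_max - t_min) * n_torque), n_torque - 1)
        counts[i][j] += 1
        kept += 1
    if kept == 0:
        raise ValueError("no operating point lies inside the grid")
    return [[c / kept for c in row] for row in counts]


samples = [(10.0, 1.1), (12.0, 1.3), (90.0, 4.5)]
hist = operational_histogram(samples, (0.0, 100.0), (0.0, 5.0), n_speed=4, n_torque=5)
```

Replacing the continuous distribution by such cell fractions turns the integral cost into a finite sum and limits the constraints to a finite set of operating points.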
[0138] The motor design optimization problem (13), or 811 in
[0139] One embodiment of discretization 815 of the motor design problem 811 is given as follows. The domain of the k.sup.th motor is discretized into a mesh, where the ij.sup.th mesh cell covers a region of that domain and has an associated number of operational points lying in the region.
=20.
[0141] The efficiency model η.sub.k of the k.sup.th motor for the mesh is defined 855 as
Hence the mesh-cost function 851 is derived as follows:
[0142] The summation of all mesh-cost functions over all meshes and motors, equivalently the discretized counterpart of the cost function in (13), can be written as follows:
[0143] Note that in (14), the weight derived from the operational distribution is a numerical value, whereas the efficiency is a differentiable function of the motor design parameters for a given operational point (ω, τ.sub.e).
[0144] In some embodiments, the motor design problem determines the motor design by minimizing the summed mesh cost in (14).
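A minimal sketch of minimizing a mesh cost of this kind is given below. It assumes a hypothetical smooth efficiency map with a single peak whose location (a speed and a torque) plays the role of the motor design parameters; this is not the efficiency formula (4). Plain gradient descent with per-parameter step sizes is used as the optimizer, and the design and operational constraints are omitted. All names and numerical values are assumptions.

```python
import math

SPEED_WIDTH = 50.0   # hypothetical width of the efficiency peak along speed
TORQUE_WIDTH = 2.0   # hypothetical width of the efficiency peak along torque


def efficiency(speed, torque, design):
    """Hypothetical efficiency map with a peak of 0.95 at design = (speed, torque)."""
    ds = (speed - design[0]) / SPEED_WIDTH
    dt = (torque - design[1]) / TORQUE_WIDTH
    return 0.95 * math.exp(-ds * ds - dt * dt)


def mesh_cost(cells, design):
    """Sum of weight * (1 - efficiency) over cells given as (weight, speed, torque)."""
    return sum(w * (1.0 - efficiency(s, t, design)) for w, s, t in cells)


def mesh_cost_gradient(cells, design):
    """Analytic gradient of mesh_cost with respect to the two design parameters."""
    g_speed = g_torque = 0.0
    for w, s, t in cells:
        eta = efficiency(s, t, design)
        g_speed -= w * eta * 2.0 * (s - design[0]) / SPEED_WIDTH ** 2
        g_torque -= w * eta * 2.0 * (t - design[1]) / TORQUE_WIDTH ** 2
    return g_speed, g_torque


def minimize_mesh_cost(cells, design, iterations=200):
    """Gradient descent with step sizes scaled to each parameter's units."""
    steps = (0.5 * SPEED_WIDTH ** 2, 0.5 * TORQUE_WIDTH ** 2)
    d = list(design)
    for _ in range(iterations):
        g = mesh_cost_gradient(cells, d)
        d = [d[0] - steps[0] * g[0], d[1] - steps[1] * g[1]]
    return tuple(d)


cells = [(0.6, 100.0, 3.0), (0.4, 120.0, 3.5)]  # (weight, cell speed, cell torque)
initial_design = (60.0, 1.0)
updated_design = minimize_mesh_cost(cells, initial_design)
```

In this toy setting the efficiency peak moves toward the cells that carry most of the operational weight, which is the sense in which the overlap between the operational distribution and the efficiency map is increased.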
[0146]
[0147] According to some embodiments, the modules described with reference to
[0148] The above description provides exemplary embodiments only, and is not intended to limit the scope, applicability, or configuration of the disclosure. Rather, the above description of the exemplary embodiments will provide those skilled in the art with an enabling description for implementing one or more exemplary embodiments. Contemplated are various changes that may be made in the function and arrangement of elements without departing from the spirit and scope of the subject matter disclosed as set forth in the appended claims.
[0149] Specific details are given in the above description to provide a thorough understanding of the embodiments. However, one of ordinary skill in the art will understand that the embodiments may be practiced without these specific details. For example, systems, processes, and other elements in the subject matter disclosed may be shown as components in block diagram form in order not to obscure the embodiments in unnecessary detail. In other instances, well-known processes, structures, and techniques may be shown without unnecessary detail in order to avoid obscuring the embodiments. Further, like reference numbers and designations in the various drawings indicate like elements.
[0150] Also, individual embodiments may be described as a process which is depicted as a flowchart, a flow diagram, a data flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged. A process may be terminated when its operations are completed but may have additional steps not discussed or included in a figure. Furthermore, not all operations in any particularly described process may occur in all embodiments. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc. When a process corresponds to a function, the function's termination can correspond to a return of the function to the calling function or the main function.
[0151] Furthermore, embodiments of the subject matter disclosed may be implemented, at least in part, either manually or automatically. Manual or automatic implementations may be executed, or at least assisted, through the use of machines, hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. When implemented in software, firmware, middleware or microcode, the program code or code segments to perform the necessary tasks may be stored in a machine readable medium. A processor(s) may perform the necessary tasks.
[0152] Various methods or processes outlined herein may be coded as software that is executable on one or more processors that employ any one of a variety of operating systems or platforms. Additionally, such software may be written using any of a number of suitable programming languages and/or programming or scripting tools, and also may be compiled as executable machine language code or intermediate code that is executed on a framework or virtual machine. Typically, the functionality of the program modules may be combined or distributed as desired in various embodiments.
[0153] Embodiments of the present disclosure may be embodied as a method, of which an example has been provided. The acts performed as part of the method may be ordered in any suitable way. Accordingly, embodiments may be constructed in which acts are performed in an order different than illustrated, which may include performing some acts concurrently, even though shown as sequential acts in illustrative embodiments. Although the present disclosure has been described with reference to certain preferred embodiments, it is to be understood that various other adaptations and modifications can be made within the spirit and scope of the present disclosure. Therefore, it is the aspect of the appended claims to cover all such variations and modifications as come within the true spirit and scope of the present disclosure.