Projection exposure apparatus with optimized adjustment possibility
10054860 · 2018-08-21
Cpc classification
G03F7/70266
PHYSICS
G03F7/70191
PHYSICS
G03F7/705
PHYSICS
G03F7/70525
PHYSICS
G03F7/70308
PHYSICS
Abstract
A projection apparatus for microlithography for imaging an object field includes an objective, one or a plurality of manipulators for manipulating one or a plurality of optical elements of the objective, a control unit for regulating or controlling the one or the plurality of manipulators, a determining device for determining at least one or a plurality of image aberrations of the objective, a memory comprising upper bounds for one or a plurality of specifications of the objective, including upper bounds for image aberrations and/or movements for the manipulators, wherein when determining an overshooting of one of the upper bounds by one of the image aberrations and/or an overshooting of one of the upper bounds by one of the manipulator movements by regulation or control of at least one manipulator within at most 30000 ms, or 10000 ms, or 5000 ms, or 1000 ms, or 200 ms, or 20 ms, or 5 ms, or 1 ms, an undershooting of the upper bounds can be effected.
Claims
1. A system, comprising: an objective, comprising: an optical element; and a manipulator configured to manipulate the optical element; and a control unit configured to control the manipulator, the control unit comprising: a first device configured to control the manipulation of the optical element by the manipulator; a memory comprising a bound for a range of the manipulation of the optical element by the manipulator; and a second device configured to calculate a value of a merit function based on at least one error and configured to minimize the merit function subordinate to the bound for the range of the manipulation of the optical element by the manipulator, wherein the merit function comprises a regularization parameter and the objective is a microlithography projection objective.
2. The system of claim 1, wherein the objective comprises a plurality of optical elements and a plurality of manipulators, and each optical element has a corresponding manipulator.
3. The system of claim 2, wherein the first device is configured to control movement of each manipulator.
4. The system of claim 1, wherein the at least one error is selected from the group consisting of scale error, telecentricity error, overlay error, depth of focus error, best focus errors and errors due to image aberrations produced by integration of a plurality of field points.
5. The system of claim 1, wherein the at least one error comprises at least one image aberration.
6. The system of claim 1, wherein the merit function comprises a parameter describing a sensitivity matrix.
7. The system of claim 1, wherein the objective comprises first, second and third optical elements, the first optical element is near a pupil plane of the objective, the second optical element is near a field plane of the objective, and the third optical element is not near a pupil plane of the objective or a field plane of the objective.
8. The system of claim 1, wherein the control unit is configured to control the manipulator in real time within 15000 ms.
9. The system of claim 1, wherein the control unit is configured to control the manipulator in real time within 200 ms.
10. The system of claim 1, wherein the control unit is configured to control the manipulator in real time within 20 ms.
11. The system of claim 1, wherein the merit function comprises a linear function of a degree of freedom of the manipulation of the optical element by the manipulator.
12. The system of claim 1, wherein the merit function comprises a quadratic function of a degree of freedom of the manipulation of the optical element by the manipulator.
13. The system of claim 1, wherein the second device is configured to minimize the merit function using a linear programming.
14. The system of claim 1, wherein the second device is configured to minimize the merit function using a quadratic programming.
15. The system of claim 1, wherein calculating the value of the merit function comprises generating the merit function.
16. The system of claim 15, wherein generating the merit function comprises determining a parameter of the merit function based on a statistical distribution.
17. The system of claim 15, wherein generating the merit function comprises determining a parameter of the merit function based on a look-up table.
18. The system of claim 1, wherein calculating the value of the merit function comprises adjusting weighting coefficients of the merit function.
19. The system of claim 18, wherein the weighting coefficients represent the weighting of different degrees of freedom of the manipulation.
20. The system of claim 18, wherein the weighting coefficients represent the weighting of different errors.
21. The system of claim 1, wherein the manipulation of the optical element by the manipulator comprises at least one element selected from the group consisting of shifting the optical element, rotating the optical element and deforming the optical element.
22. The system of claim 1, wherein the manipulation of the optical element by the manipulator comprises exchanging the optical element.
23. The system of claim 1, wherein the manipulation of the optical element by the manipulator comprises at least one element selected from the group consisting of heating the optical element and cooling the optical element.
24. The system of claim 1, wherein the optical element is a reflective optical element.
25. The system of claim 1, wherein the optical element is a reflective optical element in the vicinity of a pupil plane.
26. The system of claim 1, wherein: the objective has a folded design comprising first, second and third objective parts; the optical element comprises a reflective optical element in the second objective part; the first objective part comprises refractive optical elements; and the third objective part comprises refractive optical elements.
27. The system of claim 1, wherein: the objective has a folded design comprising first, second and third objective parts; the optical element comprises a reflective optical element in the vicinity of a pupil plane in the second objective part; the first objective part comprises refractive optical elements; and the third objective part comprises refractive optical elements.
28. The system of claim 1, wherein the optical element comprises a refractive optical element.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) The disclosure is explained below on the basis of the exemplary embodiments illustrated in the figures.
DETAILED DESCRIPTION
(15)
(16) The objective 110 contains optical elements such as lenses 111, mirror 112 and a plane plate 113. A manipulator 121 acts on one of the lenses, which manipulator can displace, bend, heat and/or cool the lens. A second manipulator 122 acts on the mirror 112 in the same way or in a different way than manipulator 121, and a third manipulator 123 serves for exchanging the plane plate 113 for a further plane plate (not illustrated here), which is aspherized.
(17) Given a predefined aperture, maximum light beams delimited by the aperture emerge from the two field points 103 and 104. The outermost rays of said light beams are illustrated here as interrupted lines. These outermost rays delimit the wavefronts respectively associated with the field points 103 and 104. For illustration purposes, said wavefronts are assumed to be spherical. A wavefront sensor and/or further sensors and/or a prediction model forms a determining unit 150, which supplies information about image aberrations on the basis of the measurement of the wavefronts after the passage thereof through the objective 110. Said further sensors are, for example, air pressure sensors, sensors for measuring the temperature inside of the objective 110 or sensors which measure the temperature on lenses such as lens 111 or on the rear side of mirrors such as mirror 112.
(18) The manipulators 121,122,123 are controlled by a control unit 130. The control unit can also be embodied as a regulating unit.
(19) The control unit 130 obtains upper bounds for image aberrations and manipulator ranges in the form of specifications from a memory 140 and also information about the measured image aberrations or wavefronts from the determining unit 150.
(20) The control unit 130 contains an adjustment algorithm, which, upon determination of an overshooting of one of the upper bounds by one of the image aberrations at one of the field points by regulation and associated manipulation of the one or of the plurality of optical elements 111,112,113 within 30000 ms, or 10000 ms, or 5000 ms, or 1000 ms, or 200 ms, or 20 ms, or 5 ms, or 1 ms, effects an undershooting of the upper bounds for the one or the plurality of specifications. The different time intervals above result from the different applications of adjustment to the projection exposure apparatus. In particular, the time periods 30000 ms, or 10000 ms, or 5000 ms, or 1000 ms, are advantageous for the initial adjustment. The time periods 30000 ms, or 10000 ms, or 5000 ms, or 1000 ms, or 200 ms, or 20 ms, are advantageous for the repair adjustment. Finally, the time periods 200 ms, or 20 ms, or 5 ms, or 1 ms, are advantageous for the fine adjustment.
(21)
(22)
(23) The wavefronts and/or aerial images are measured with respect to a plurality of field points. These are arranged on a rectangular grid, for example, and correspond to a matrix having $m \times n$ field points $p_{ij}$. Typical numbers of field points are 5 × 7, 3 × 13, 5 × 13 or 7 × 13. Other possible grid arrangements are rhomboidal grids or spoke-shaped grids which follow a curved field profile. The field points of each of these grid forms can be arranged in a matrix.
(24) The measurement data thus obtained are optionally freed of numerical noise by filtering. The wavefronts $\omega(p_{ij})$ associated with the individual field points $p_{ij}$ (in the context of this application, this means the wavefronts of the aberrations, that is to say the deviations from the ideal spherical form) are numerically decomposed into Zernike polynomials $Z_l$ or into a, preferably orthonormal, function system up to a predefined order $n$:
\[ \omega(p_{ij}) = \sum_{l=1}^{n} \alpha_{ijl} Z_l . \]
(25) The order n of this expansion is generally 36, 49, 64 or 100. For the definition of the Zernike polynomials, cf. e.g. DE102004035595A1 or table 11-1 from Handbook of Optical Systems, Herbert Gross, ed., Vol. 1: Fundamentals of Technical Optics. The Zernike polynomials presented therein follow the fringe numbering
\[
\begin{aligned}
Z_1(r,\varphi) &= 1 \\
Z_2(r,\varphi) &= r\cos\varphi \\
Z_3(r,\varphi) &= r\sin\varphi \\
Z_4(r,\varphi) &= 2r^2 - 1 \\
Z_5(r,\varphi) &= r^2\cos 2\varphi \\
Z_6(r,\varphi) &= r^2\sin 2\varphi \\
Z_7(r,\varphi) &= (3r^3 - 2r)\cos\varphi \\
Z_8(r,\varphi) &= (3r^3 - 2r)\sin\varphi \\
Z_9(r,\varphi) &= 6r^4 - 6r^2 + 1,
\end{aligned}
\]
which is listed here up to the order n=9. The highest occurring exponent of $r$ determines the radial order of the Zernike polynomial Z, and the highest occurring exponent of $\varphi$ determines the azimuthal order of the Zernike polynomial Z. The Zernike polynomials are orthogonal with respect to the scalar product
(26)
and have the norm
(27)
where k=2 if Z has a radial order greater than 0, and k=1 if Z has radial order 0, and q denotes the azimuthal order.
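The fringe polynomials listed above can be evaluated and checked numerically. The following is a minimal sketch, not part of the disclosure: the names `FRINGE_ZERNIKE`, `disk_integral` and `wavefront` are hypothetical, and the integral is taken with the plain area element $r\,dr\,d\varphi$ without any normalization factor, since the scalar product and norm formulas are not reproduced in this text.

```python
import math

# Fringe Zernike polynomials Z_1..Z_9 as listed above
# (r in [0, 1], phi in radians).
FRINGE_ZERNIKE = {
    1: lambda r, phi: 1.0,
    2: lambda r, phi: r * math.cos(phi),
    3: lambda r, phi: r * math.sin(phi),
    4: lambda r, phi: 2 * r**2 - 1,
    5: lambda r, phi: r**2 * math.cos(2 * phi),
    6: lambda r, phi: r**2 * math.sin(2 * phi),
    7: lambda r, phi: (3 * r**3 - 2 * r) * math.cos(phi),
    8: lambda r, phi: (3 * r**3 - 2 * r) * math.sin(phi),
    9: lambda r, phi: 6 * r**4 - 6 * r**2 + 1,
}


def disk_integral(f, g, n_r=100, n_phi=100):
    """Midpoint-rule approximation of the integral of f*g over the unit disk.

    Assumption: unnormalized area element r dr dphi.
    """
    dr, dphi = 1.0 / n_r, 2 * math.pi / n_phi
    total = 0.0
    for a in range(n_r):
        r = (a + 0.5) * dr
        for b in range(n_phi):
            phi = (b + 0.5) * dphi
            total += f(r, phi) * g(r, phi) * r
    return total * dr * dphi


def wavefront(coeffs, r, phi):
    """Evaluate sum_l alpha_l * Z_l(r, phi) for a dict {l: alpha_l}."""
    return sum(a * FRINGE_ZERNIKE[l](r, phi) for l, a in coeffs.items())
```

Integrating the product of two different polynomials, for example `disk_integral(FRINGE_ZERNIKE[4], FRINGE_ZERNIKE[9])`, should give a value close to zero, which illustrates the orthogonality stated above.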
(28) On the basis of the coefficients $\alpha_{ijl}$, image aberrations such as scale error, telecentricity error, overlay and depth of focus, best focus and further image aberrations produced by integration over a plurality of field points are determined: the latter are, for example, the rms (root mean square) and also grouped rms such as, for example, $\mathrm{rms}_{\mathrm{spherical}}$, $\mathrm{rms}_{\mathrm{coma}\,x}$, $\mathrm{rms}_{\mathrm{coma}\,y}$, $\mathrm{rms}_{\mathrm{coma}}$, $\mathrm{rms}_{\mathrm{ast}\,90}$, $\mathrm{rms}_{\mathrm{ast}\,45}$ and $\mathrm{rms}_{\mathrm{ast}}$, $\mathrm{rms}_{\mathrm{3foil}\,x}$, $\mathrm{rms}_{\mathrm{3foil}\,y}$ and $\mathrm{rms}_{\mathrm{3foil}}$, residual rms, and also fading.
(29) The rms at a field point $p_{ij}$ is given by
\[ \mathrm{rms}^2(p_{ij}) = \sum_{l=5}^{n} \alpha_{ijl}^2, \]
where n=36 or n=49 or n=100 holds true. The centered $\mathrm{rms}_z$ at a field point $p_{ij}$ is given by
(30)
(31) The residual $\mathrm{rms}_{\mathrm{res}}$ at a field point $p_{ij}$ is given by
(32)
where the values 50 or 101 are also used instead of 37. The grouped rms at a field point $p_{ij}$ are given by
\[
\begin{aligned}
\mathrm{rms}_{\mathrm{spherical}}(p_{ij})^2 &= \alpha_{ij9}^2 + \alpha_{ij16}^2 + \alpha_{ij25}^2 + \ldots, \\
\mathrm{rms}_{\mathrm{coma}\,x}(p_{ij})^2 &= \alpha_{ij7}^2 + \alpha_{ij14}^2 + \alpha_{ij23}^2 + \ldots, \\
\mathrm{rms}_{\mathrm{coma}\,y}(p_{ij})^2 &= \alpha_{ij8}^2 + \alpha_{ij15}^2 + \alpha_{ij24}^2 + \ldots, \\
\mathrm{rms}_{\mathrm{coma}}(p_{ij}) &= \max\{\mathrm{rms}_{\mathrm{coma}\,x}(p_{ij}),\ \mathrm{rms}_{\mathrm{coma}\,y}(p_{ij})\}, \\
\mathrm{rms}_{\mathrm{ast}\,90}(p_{ij})^2 &= \alpha_{ij12}^2 + \alpha_{ij21}^2 + \alpha_{ij32}^2 + \ldots, \\
\mathrm{rms}_{\mathrm{ast}\,45}(p_{ij})^2 &= \alpha_{ij13}^2 + \alpha_{ij22}^2 + \alpha_{ij33}^2 + \ldots, \\
\mathrm{rms}_{\mathrm{ast}}(p_{ij}) &= \max\{\mathrm{rms}_{\mathrm{ast}\,90}(p_{ij}),\ \mathrm{rms}_{\mathrm{ast}\,45}(p_{ij})\}, \\
\mathrm{rms}_{\mathrm{3foil}\,x}(p_{ij})^2 &= \alpha_{ij10}^2 + \alpha_{ij19}^2 + \alpha_{ij30}^2 + \ldots, \\
\mathrm{rms}_{\mathrm{3foil}\,y}(p_{ij})^2 &= \alpha_{ij11}^2 + \alpha_{ij20}^2 + \alpha_{ij31}^2 + \ldots, \\
\mathrm{rms}_{\mathrm{3foil}}(p_{ij}) &= \max\{\mathrm{rms}_{\mathrm{3foil}\,x}(p_{ij}),\ \mathrm{rms}_{\mathrm{3foil}\,y}(p_{ij})\}.
\end{aligned}
\]
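The rms and grouped rms at a single field point can be sketched as follows. This is an illustration only: the function names and the dictionary layout (`alpha` maps a fringe index to the coefficient at one field point) are assumptions, and each grouped series is truncated to the three indices that the text lists explicitly before the ellipsis.

```python
import math

# Fringe indices exactly as listed in the text. The series continue beyond
# these terms; this sketch truncates them (assumption).
GROUPS = {
    "spherical": (9, 16, 25),
    "coma_x": (7, 14, 23),
    "coma_y": (8, 15, 24),
    "ast_90": (12, 21, 32),
    "ast_45": (13, 22, 33),
    "3foil_x": (10, 19, 30),
    "3foil_y": (11, 20, 31),
}


def rms(alpha, n=36):
    """rms at one field point: square root of sum_{l=5}^{n} alpha_l^2."""
    return math.sqrt(sum(alpha.get(l, 0.0) ** 2 for l in range(5, n + 1)))


def grouped_rms(alpha):
    """Grouped rms values at one field point, keyed by group name."""
    out = {
        name: math.sqrt(sum(alpha.get(l, 0.0) ** 2 for l in indices))
        for name, indices in GROUPS.items()
    }
    # The combined groups are the maximum of their two components.
    out["coma"] = max(out["coma_x"], out["coma_y"])
    out["ast"] = max(out["ast_90"], out["ast_45"])
    out["3foil"] = max(out["3foil_x"], out["3foil_y"])
    return out
```

Coefficients with index below 5 (for example defocus, index 4) do not contribute to `rms`, in line with the lower summation limit above.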
(33) The fading, $\mathrm{FAD}_x$ and $\mathrm{FAD}_y$, in the x and y direction, respectively, is a scan-integrated image aberration and a measure of the field-dependent distortion of a structure to be imaged. During the operation of the projection exposure apparatus, the position of the structure to be imaged varies on account of the field-dependent distortion in the x and y directions. The structure is therefore imaged with a reduced contrast at the averaged position. The fading intensity is characterized by a mean standard deviation of the distortion and is calculated, for example in the x direction for a projection optical unit that scans in the y direction, as follows:
(34) Firstly, a distinction is made between the so-called core structure and the peripheral structure. When using an x-dipole illumination (see, e.g., Handbook of optical systems, vol. 2, W. Singer, M. Totzeck, H. Gross, pp. 257, which shall hereby be incorporated by reference fully in this application), vertically oriented parallel lines, for example, represent the core structure since they have to be imaged with a higher resolution. Horizontal structures, such as horizontally oriented parallel lines, for example, are referred to in this case as a peripheral structure.
(35) Besides the distinction between core structure and peripheral structure, the field-point-dependent structure offset $\Delta x_{ij}$, for a field point $x_{ij}$ with indices $ij$, generally also depends on the distance between the structures to be imaged, referred to here as pitch $\vartheta_v$. In general, an interval of pitches is considered: $\vartheta_1 = 2 \cdot$ structure width to $\vartheta_N = 10 \cdot$ structure width, where the structure width is 60 nm, 45 nm, 32 nm, or only 22 nm. The interval is subdivided equidistantly with a sufficiently fine step size of $\Delta\vartheta$ = 15 nm or 10 nm or 5 nm or 1 nm. For each field point $p_{ij}$, where for example $i = 1, \ldots, 13$ in the x direction and $j = 1, \ldots, 7$ in the y direction, and for each pitch $\vartheta_v$, firstly the generally field-point-dependent and structure-dependent offset $\Delta x_{ij}$ in the x direction is determined. This structure offset $\Delta x_{ij}$ can generally be measured directly or else be derived using linear factors and an associated wavefront measurement. A simulation or a hybrid method formed from measurement and simulation can also be employed instead of a measurement.
(36) The scanner-weighted and pitch-dependent mean value of the structure offset in the x direction is defined for core structure and peripheral structure, depending on the horizontal field point index i, in each case as
(37)
(38) Only the core structure is considered below, for the sake of clarity. All the formulae are correspondingly applicable to the peripheral structure. Furthermore, a distinction is made here only in the x direction on account of the scan integration in the y direction for the field points $p_{ij}$. The $g_j$ are scanner weights resulting from the intensity distribution of the underlying illumination. In general, $p_{ij}$ is evaluated at $j = 1, \ldots, 7$ field points in the y direction, and the $g_j$ correspond to a ramp function, for example, i.e.
\[
g_j = \frac{j}{k_1},\ 1 \le j \le k_1; \qquad
g_j = 1,\ k_1 < j < k_2; \qquad
g_j = 1 - \frac{j - k_2}{k - k_2 + 1},\ k_2 \le j \le k
\]
with $k$, $k_1$ and $k_2$ chosen in accordance with the illumination intensity. As an alternative, the $g_j$ can also follow some other density function, such as a Gaussian function, for example. In this case, the density functions can each be normalized to one. Functions similar to ramp or Gaussian functions are furthermore alternatively employed as well. In this case, the similarity of a function to a predefined function $f$ should be understood to mean a quantified deviation with respect to the predefined function $f$. Said deviation is measured by a likewise predefined norm $\|\cdot\|$. The norm used is primarily the maximum norm $\|f\|_{\max} = \max_x |f(x)|$. The predefined deviation used is a percentage deviation with a factor of 1.1 or 1.5.
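The ramp-shaped scanner weights described above can be written out as a short sketch. The function names are hypothetical, and the normalization step reflects only the remark that the density functions can each be normalized to one.

```python
def ramp_weights(k, k1, k2):
    """Return [g_1, ..., g_k] for the ramp function with 1 <= k1 < k2 <= k."""
    weights = []
    for j in range(1, k + 1):
        if j <= k1:
            g = j / k1                          # rising edge
        elif j < k2:
            g = 1.0                             # plateau
        else:
            g = 1.0 - (j - k2) / (k - k2 + 1)   # falling edge
        weights.append(g)
    return weights


def normalized(weights):
    """Scale the weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]
```

For the seven field points in the scanning direction mentioned above, `ramp_weights(7, 2, 6)` yields a profile that rises over the first two points, stays at one, and falls at the last point; the choice `k1 = 2`, `k2 = 6` is purely illustrative.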
(39) In this case, the underlying illumination can be coherent, partly coherent, a dipole illumination, a quadrupole illumination, an annular illumination or some other freely defined illumination setting.
(40) To finish the definition of fading, the scanner-weighted variance $\sigma_i^2$ is then calculated as:
(41)
(42) The mean standard deviation is thus
\[ \mathrm{MSD}_i^x(\vartheta_v) = \sqrt{\sigma_i^2(\vartheta_v)} \]
(43) The maximum value over the field points and over all the pitches $\vartheta_v$,
(44)
is then designated as x-fading $\mathrm{FAD}_x$.
(45) Analogous relationships are applicable to the fading in the y direction $\mathrm{FAD}_y$, except that a variable $\Delta y_{ij}$ designating the structure-dependent offset in the y direction is used instead of $\Delta x_{ij}$.
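The chain from structure offsets to the x-fading can be sketched as follows. Because the formulas for the scanner-weighted mean value and variance are not reproduced in this text, the sketch assumes the common form of a weighted mean and weighted variance normalized by the sum of the weights; that normalization, the function names and the nested-list layout `offsets[v][i][j]` are all assumptions.

```python
import math


def weighted_mean(values, weights):
    """Weighted mean, normalized by the sum of the weights (assumption)."""
    return sum(g * v for g, v in zip(weights, values)) / sum(weights)


def msd(offsets_j, weights):
    """Mean standard deviation for one field column i and one pitch.

    offsets_j holds the structure offsets along the scanning direction j.
    """
    mean = weighted_mean(offsets_j, weights)
    variance = weighted_mean([(v - mean) ** 2 for v in offsets_j], weights)
    return math.sqrt(variance)


def fading_x(offsets, weights):
    """Maximum MSD over all pitches v and all field columns i.

    offsets[v][i][j]: offset in nm for pitch v, column i, scan row j.
    """
    return max(msd(column, weights) for pitch in offsets for column in pitch)
```

A column whose offsets are identical along the scanning direction contributes an MSD of zero, so only the variation of the distortion during the scan enters the fading value.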
(46)
(47) The image aberration overlay, OVL, is likewise dependent on core structure, peripheral structure and pitch and is a measure of the scanner-averaged distortion. As previously defined separately for each structure orientation, for a predefined, finite sequence of pitches $\vartheta_v$, $v = 1, \ldots, N$ and field points $x_i$, $i = 1, \ldots, 13$ in the x direction, the offset, or synonymously centroid, is defined by
(48)
(49) The offset defines the expected value, once again only the core structure being considered below. For each, now fixed, pitch $\vartheta_v$, there is then precisely one maximum overlay value $\mathrm{OVL}_{x,v}$
(50)
(51) The maximum over all the pitches is finally designated as the overlay error $\mathrm{OVL}_x$ in the x direction for the given structure orientation
(52)
(53) Analogous relationships are applicable to the overlay error in the y direction $\mathrm{OVL}_y$, except that the scanner-integrated variable
(54)
(55) The image aberration best focus, BF, is a measure of the scanner-integrated focus error and likewise depends on the pitch considered. In accordance with the above definitions for the overlay, for each field point and for each pitch, firstly the structure-dependent average focus position (synonymous with the pitch-dependent mean value of the structure offset) is determined by measurement or simulation or a hybrid method. A distinction is made between the core structure and the peripheral structure in this case, too. There is, of course, no longer a distinction according to x and y. The centroid, synonymously offset, is then determined over all the pitches and over all the field points in the x direction, in a manner analogous to that in the definition of the overlay. It defines the expected focus position. From the latter, the maximum deviation of the scanner-averaged focus position is subsequently determined for each pitch. The maximum deviation over all the pitches is finally designated as the best focus error.
(56)
(57) Besides those integrated aberrations, the individual coefficients themselves, as determined above, are also appropriate as image aberrations.
(58) For all the image aberrations or at least those image aberrations which are relevant to the imaging performance of the objective with regard to the imaging performance currently desired, such as, e.g., overlay, best focus, fading both for core and for peripheral structures, and individual Zernike coefficients, and also rms, grouped rms and residual rms, upper bounds are read from a memory. Appropriate upper bounds include for example for overlay 5 nm, 2 nm, 1 nm, 0.5 nm or 0.1 nm. Best focus can be specified with 50 nm, 20 nm, 10 nm, 5 nm or 1 nm. For fading, 10 nm, 5 nm, 2 nm or 1 nm can constitute upper bounds. 1.0 nm or 2.0 nm, by way of example, is appropriate for individual Zernike coefficients. These upper bounds are generally not permitted to be overshot by the corresponding image aberrations.
(59) The aim of ensuring that all the relevant image aberrations are below their respective upper bounds after the setting of the manipulators produces a series of first boundary conditions, namely complying with upper bounds, hereinafter referred to generally as spec, for
1) Zernike specs: $\mathrm{spec}_M$, for example 2.0 nm for $Z_i$, $i \le 6$, and 1.5 nm for $Z_i$, $6 < i \le 36$
2) RMS specs: $\mathrm{spec}_R$, for example 3.0 nm for rms, 1.0 nm for $\mathrm{rms}_z$ for Zernikes specified in greater detail, such as $Z_i$, $i = 5, \ldots, 49$, and 2.0 nm for $\mathrm{rms}_{\mathrm{res}}$ for the residual rms
3) grouped RMS specs: $\mathrm{spec}_G$, for example 0.8 nm for $\mathrm{rms}_{\mathrm{ast}}$, $\mathrm{rms}_{\mathrm{coma}}$ and $\mathrm{rms}_{\mathrm{3foil}}$
4) Fading specs: $\mathrm{spec}_F$, for example 5.0 nm (core and periphery)
5) OVL specs: for example 2.0 nm core, 5.0 nm periphery
6) Best Focus specs: for example 20.0 nm core, 50.0 nm periphery
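The comparison of determined image aberrations against their upper bounds can be sketched as a simple lookup. The bound values below are the example values quoted above; the dictionary keys and the function name `overshoots` are hypothetical and chosen only for this illustration.

```python
# Example upper bounds in nm, taken from the example values quoted above.
# The key names are hypothetical.
SPECS_NM = {
    "rms": 3.0,
    "rms_z": 1.0,
    "rms_res": 2.0,
    "rms_ast": 0.8,
    "rms_coma": 0.8,
    "rms_3foil": 0.8,
    "fading_core": 5.0,
    "fading_periphery": 5.0,
    "ovl_core": 2.0,
    "ovl_periphery": 5.0,
    "bf_core": 20.0,
    "bf_periphery": 50.0,
}


def overshoots(measured_nm, specs_nm=SPECS_NM):
    """Return {name: (value, bound)} for each aberration above its bound."""
    return {
        name: (value, specs_nm[name])
        for name, value in measured_nm.items()
        if name in specs_nm and abs(value) > specs_nm[name]
    }
```

An empty result means that all listed aberrations are within spec; a non-empty result would be the trigger for calculating new manipulator movements.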
(60) In addition, specifications for at least some of the manipulators are read from a further or the same memory 140. They include the maximum movements of the manipulators. Appropriate maximum movements include the following, by way of example:
1. maximum movement of a manipulator which displaces a lens in the direction of the optical axis: 100 micrometers,
2. maximum movement of a manipulator which displaces a lens orthogonally thereto: 20 micrometers, and
3. maximum movement of a manipulator which tilts a lens about an axis orthogonal to the optical axis: 300 microrad.
(61) In the case of a mirror, the corresponding values are 40 micrometers, 35 micrometers, and 140 microrad, respectively. A manipulator which bends a lens can for example be maximally moved to an extent such that the positional alteration of each point of each of the two lens surfaces is at most 1 micrometer in the direction of the optical axis. Depending on the lens form and the positions of the deforming force inputs and/or torques, upper bounds thus arise indirectly for said forces and/or torques. In the case of a manipulator which applies heat and/or cold to an optical element, the following upper bounds are applicable, by way of example:
4. maximum temperature change ±0.65 K. A maximum temperature change that is not symmetrical with respect to zero, such as −0.5 K to +0.75 K, for example, is also used,
5. maximum power input ±150 W/m². A maximum power input that is not symmetrical with respect to zero, such as −120 W/m² to +200 W/m², for example, is also used here.
(62) There then follows the time-critical step (II) of calculating the optimum manipulator movements, after which, in a further step, the manipulators are set in accordance with the movement respectively determined for them.
(63) The image aberrations resulting from the movement of an individual degree of freedom of a manipulator and their expansion into Zernike polynomials can be determined a priori. This is generally done by simulation or measurement in the case of a standard movement assigned to the manipulator and to one of its degrees of freedom.
(64) This is explained below on the basis of a manipulator which displaces a lens of the objective in a defined direction. This manipulator has one degree of freedom. Its effect on the individual wavefronts at a predefined selection of field points $p_{ij}$ is determined by measuring or simulating the wavefronts of the objective in the case of a predefined standard movement x, generally one micrometer, of the manipulator and subtracting therefrom the wavefronts of the objective in the case of an unmoved manipulator. This subtraction is realized by the expansion of the respective wavefronts into Zernike polynomials and subtraction of the coefficients of the two expansions. The expansions into the Zernike polynomials are performed up to an order n.
(65) By way of example, the values i=7, j=13 and n=36 or 49 are used. In this case, a total of $i \cdot j \cdot n = 7 \cdot 13 \cdot 36 = 3276$ image aberrations are determined in accordance with the expansion of the wavefronts at all the given field points $p_{ij}$.
(66) Besides the coefficients of the Zernike polynomials, use is also made of other image aberrations, such as, for example, the residual $\mathrm{rms}_{\mathrm{res}}$ defined above. These can either be calculated using the wavefronts determined as explained above, or they are measured, or they are simulated.
(67) The difference thus obtained is designated as the sensitivity a of the manipulator. It defines the optical effect of the manipulator in the case of its standard movement x. For small movements, said optical effect is proportional to the sensitivity a.
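The determination of a sensitivity as a difference of two Zernike expansions, and its linear use for small movements, can be illustrated as follows. This is a sketch with hypothetical function names; the coefficient lists stand for the expansion coefficients at one field point, and movements are expressed in the same unit as the standard movement.

```python
def sensitivity(coeffs_moved, coeffs_unmoved):
    """Sensitivity a: coefficients at standard movement minus those at rest."""
    return [m - u for m, u in zip(coeffs_moved, coeffs_unmoved)]


def predicted_effect(a, movement, standard_movement=1.0):
    """Linear prediction of the optical effect, valid for small movements."""
    scale = movement / standard_movement
    return [scale * a_l for a_l in a]
```

With a standard movement of one micrometer, `predicted_effect(a, 2.0)` simply doubles every entry of the sensitivity, which is the proportionality stated above.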
(68) If the manipulator has more than one degree of freedom, then the sensitivities are calculated separately for each of its degrees of freedom, its standard movement being predefined for each degree of freedom. This holds true for all the manipulators of the projection apparatus. A matrix A of sensitivities is obtained:
\[ A = (a_{mn})_{m = 1, \ldots, i \cdot j \cdot k,\ n = 1, \ldots, l}, \]
where j is the number of field points in the scanning direction, i is the number of field points orthogonally to the scanning direction, k is the summed number of all the degrees of freedom of all the manipulators of the projection apparatus, and l is the number of image aberrations calculated.
(69) The standard movements are given for example by
1 micrometer for a manipulator which displaces a lens perpendicular to the optical axis of the projection objective,
1 micrometer for a manipulator which displaces a lens in the direction of the optical axis of the projection objective,
1 Watt/cm² power for each heating zone of a heating manipulator,
1 bar pressure in the case of the bending of a lens element,
1 millimeter for the relative displacement of a pair of Alvarez plates.
(70) A manipulator which exhibits a linear behavior, such as, for example, manipulators which slightly shift the position of optical elements, can be used as continuously movable on account of its effect proportional to the displacement, since its effect can be calculated for all movements on the basis of its effect in the case of its standard movement. In order to be able to continuously move manipulators which do not exhibit a linear behavior, such as, for example, manipulators which apply a high degree of heat to a lens or Alvarez plates with a great relative displacement of a number of millimeters, their effect at different movements is determined, in particular measured or simulated, and their effect is interpolated with the aid of the data thus obtained.
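For a manipulator without linear behavior, the interpolation between sampled movements can be sketched as below. Piecewise linear interpolation is an assumption made for this illustration; the function name and the `(movement, effect)` sample layout are hypothetical.

```python
from bisect import bisect_right


def interpolate_effect(samples, movement):
    """Piecewise linear interpolation of a measured or simulated effect.

    samples: list of (movement, effect) pairs sorted by movement,
             with at least two entries.
    """
    xs = [m for m, _ in samples]
    if not xs[0] <= movement <= xs[-1]:
        raise ValueError("movement outside the sampled range")
    # Index of the right-hand sample of the enclosing interval.
    hi = min(bisect_right(xs, movement), len(xs) - 1)
    lo = hi - 1
    (x0, y0), (x1, y1) = samples[lo], samples[hi]
    t = (movement - x0) / (x1 - x0)
    return y0 + t * (y1 - y0)
```

The sketch refuses to extrapolate, on the assumption that the effect of a nonlinear manipulator is only trusted inside the range that was actually measured or simulated.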
(71) In the case of aberrations that leave the linearity range, linear interpolation is likewise effected in accordance with the methods specified above. Such aberrations can be for example the lithographic system parameters such as overlay core structure, overlay peripheral structure, best focus core structure or best focus peripheral structure.
(72) The total number of sensitivities can be more than n=10, 20, 50, 100, 500 or 1000.
(73) The sensitivities A thus defined are also referred to as static sensitivities since these are determined individually for each field point p.sub.ij. If the projection exposure apparatus is operated in scanning operation, then scan-integrated sensitivities are also employed, which are defined as follows.
(74) Given a density function such as the above-described ramp functions or a Gaussian function, the scan-integrated sensitivities are obtained from the static sensitivities A by the respective image aberrations or coefficients of the Zernike polynomials being scan-integrated in the case of their standard movements, that is to say the coefficients, weighted with the given density in the scanning direction, that is to say in the direction of the movement of the reticle, are added and subsequently divided by the number of summands. In the case of scan-integrated sensitivities, therefore, the matrix of the sensitivities acquires the form
that is to say that it has fewer rows than the matrix of the static sensitivities by a factor j.
(75) The image aberration fading FAD.sub.x, FAD.sub.y is already by definition scan-integrated.
(76) Besides the image aberration fading, fading sensitivities $\tilde{A}$ are also defined. These are obtained from the static sensitivities by subtracting the scan-integrated sensitivities $\bar{A}$ from the static sensitivities $A$: $\tilde{A} = A - \bar{A}$. In this case, the scan-integrated sensitivities are assumed to be constant in the scanning direction.
(77) The sensitivities are always designated by $A$ hereinafter, which can be taken to mean both static and scan-integrated or fading sensitivities. If reference is explicitly made to scan or fading sensitivities, then these are again designated by $\bar{A}$ or $\tilde{A}$, respectively.
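The relationship between static, scan-integrated and fading sensitivities can be sketched for a single image aberration and a single degree of freedom. The function names and the layout `static[i][j]` (column `i` perpendicular to the scan, row `j` along the scan) are assumptions; the division by the number of summands follows the description of the scan integration above.

```python
def scan_integrated(static, weights):
    """Weighted sum along the scanning direction, divided by the number
    of summands. Returns one value per field column i."""
    n = len(weights)
    return [sum(g * a for g, a in zip(weights, row)) / n for row in static]


def fading_sensitivities(static, weights):
    """Static sensitivity minus the scan-integrated value, which is taken
    as constant along the scanning direction."""
    integrated = scan_integrated(static, weights)
    return [[a - a_bar for a in row] for row, a_bar in zip(static, integrated)]
```

The scan-integrated result has one entry per column instead of one per field point, which corresponds to the reduction in the number of rows by the factor j described above.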
(78) The sensitivities of all the manipulators of the projection apparatus span a vector space.
\[ V = \{ x_1 a_1 + \ldots + x_n a_n : x_1, \ldots, x_n \ \text{real} \} \]
(79) The latter is also referred to as a (mathematical) adjustment space.
(80) On account of the movement distance restrictions given, V is generally not a vector space but rather a convex set, more precisely a polyhedron, the (mathematical) adjustment polyhedron already defined above.
(81) If an image aberration b is then intended to be set or compensated for by the manipulators of the objective, it is necessary to solve the linear equation system
\[ x_1 a_1 + \ldots + x_n a_n = b \]
or for short
Ax=b
(82) A is a matrix containing the sensitivities of the respective manipulators with regard to their degrees of freedom, and x is a vector describing the unknown movements of the respective manipulators. It should be taken into consideration here that the dimension of the matrix A is smaller by the factor j, the number of field points in the scanning direction, if scan-integrated sensitivities are involved.
(83) In this case, the image aberration b is generally not an element of V and so the above equation generally does not have a solution. The case where a plurality of such solutions x exist can likewise occur.
(84) Therefore, instead of said equation, the following optimization problem is solved:
\[ \min \|Ax - b\|_2^2, \qquad \text{(a)} \]
with Euclidean norm $\|\cdot\|_2$, which can be found by solving the normal equation
\[ A^t A x = A^t b \qquad \text{(a')} \]
(85) Instead of a minimum problem like (a), an equivalent maximum problem can obviously be stated as $\max\, -(Ax - b, Ax - b)$. In this case, it is possible to use direct methods such as the Gaussian elimination method or the Moore-Penrose inverse, or alternatively iterative methods such as the quasi-Newton method or else the cg method (conjugate gradient method). In this respect, cf. Angelika Bunse-Gerstner, Wolfgang Bunse, Numerische Lineare Algebra, Teubner Studienbücher, 1985, in particular the algorithm 3.7.10, pp. 153-156, and for stabilization purposes the algorithm 3.7.11, pp. 157-158. Iterative methods can be terminated on the basis of an a priori error estimation, an a posteriori error estimation or after a fixedly predefined number of iteration steps, such as, for example, 500, 200, 100, 50, 20, 5, or 1 iteration steps.
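The direct solution of the least-squares problem via the normal equation and Gaussian elimination can be sketched in a few lines. This is an illustration under stated assumptions, not the algorithm of the cited reference: the function names are hypothetical, the rows of `a` are taken to index image aberrations and the columns to index manipulator degrees of freedom so that the product of `a` and the movement vector is defined, and `a` is assumed to have full column rank.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def solve(m, rhs):
    """Gaussian elimination with partial pivoting for a square system."""
    n = len(m)
    aug = [row[:] + [r] for row, r in zip(m, rhs)]
    for c in range(n):
        # Bring the largest remaining entry of column c onto the diagonal.
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    # Back substitution.
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def least_squares(a, b):
    """Minimize ||a x - b||_2^2 by solving a^t a x = a^t b."""
    at = transpose(a)
    ata = matmul(at, a)
    atb = [sum(x * y for x, y in zip(row, b)) for row in at]
    return solve(ata, atb)
```

Note that this sketch ignores the movement bounds of the manipulators; it only illustrates the unconstrained problem (a).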
(86) The Euclidean norm $\|\cdot\|_2$ is always used hereinafter, but it can also be provided with weights $d = (d_i)$:
(87)
(88) This embodiment is not explicitly mentioned below; however, for each exemplary embodiment, a weighted Euclidean norm constitutes an alternative to the Euclidean norm.
(89) Such a weighting d of the Euclidean norm is employed, for example, when the image aberration b corresponds to the coefficient of a Zernike polynomial. In accordance with the fringe numbering, cf. Handbook of Optical Systems, Herbert Gross, ed., Vol. 1: Fundamentals of Technical Optics, the following is set in this case:
(90)
(91) In other words, the image aberration corresponding to a Zernike coefficient is additionally weighted with the norm of the corresponding Zernike.
(92) Furthermore, such a weighting of the Euclidean norm is employed in particular when the image aberration exhibits a production-dictated field profile. Thus, the use of aspherical lenses that are near the field and have a highly pronounced asphericity exhibits the effect of excessively increased image aberrations at the edge of the object field. In order to prevent the influence of such an excessive increase from becoming overly great, field points are weighted to a lesser extent at the edge of the object field than in the field center. By way of example, the following procedure is adopted in order to solve this problem.
(93) If x denotes the field coordinate perpendicular to the scanning direction, 0 is assumed as the center of this coordinate, and $-x_{max}$ and $x_{max}$ denote the minimum and maximum field coordinate, respectively, in this coordinate, then it is found that the scan-integrated Zernikes or the scan-integrated image aberrations in part tend to assume high values at the field edge $-x_{max}$ and $x_{max}$ and in its vicinity; cf. for example the best focus from
(94) Since a field profile of the scan-integrating image aberrations that is as constant as possible perpendicular to the scanning direction is desirable, however, the Euclidean norm is weighted such that, in relative terms, more weight is given to the field edge than to the field center. As such a weighting, the following weighting, formulated here for the image aberration best focus BF, has proved to be advantageous:
(95)
(96) Corresponding weightings are also employed for the other scan-integrating image aberrations such as fading FAD.sub.x and FAD.sub.y, overlay OVL.sub.x and OVL.sub.y, rms and residual rms.
(97) Weightings are also employed in combined fashion, such as, for example, in the case of a common optimization with regard to static and scan-integrated image aberrations.
(98) Further embodiments of the solution to the problem (II) or (a) now follow.
(99) The above iterative methods, e.g., the quasi-Newton method or the cg method, for solving (a) can be terminated upon an a priori time bound being overshot, such that the solution is determined only approximately. This is advantageous particularly for real-time optimizations as in the case described. The following cases alternatively occur for such an only approximate solution:
(i) If an a priori or an a posteriori error estimation exists for the iterative method, it is possible to determine whether the approximate solution can be used, since the image aberrations are uniformly continuously dependent on the manipulator movements.
(ii) A check is made to determine whether the solution to the inverse problem, and the manipulator setting that is thus optimum according to the predefined criteria, can be shifted into the next regulation interval. This is done by calculation or simulation of the resulting image aberrations if the manipulators were moved in accordance with the approximate solution.
(iii) If neither (i) nor (ii) is possible, then a solution to an alternative inverse problem is generated with the aid of a rapidly convergent method, for example a Tikhonov regularization. For this alternative inverse problem, only those manipulators whose movement range is noncritical are taken into account for a reduced adjustment polyhedron. These are, for example, displacements of optical elements whose movement range is generally sufficiently large.
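The termination upon an a priori time bound can be sketched as follows. This is an illustrative conjugate gradient iteration applied to the normal equation; the function name, the tolerance, and the returned convergence flag are assumptions made for this example and are not prescribed by the text.

```python
import time
import numpy as np

def cg_normal_equations(A, b, time_budget_s, tol=1e-10, max_iter=500):
    """Conjugate gradients on A^t A x = A^t b, stopped on tolerance,
    on an iteration limit, or on an a priori time bound.
    Returns (x, converged)."""
    start = time.perf_counter()
    x = np.zeros(A.shape[1])
    r = A.T @ (b - A @ x)            # residual of the normal equation
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) <= tol:
            return x, True
        if time.perf_counter() - start > time_budget_s:
            return x, False          # only an approximate solution is available
        Ap = A.T @ (A @ p)
        alpha = rs / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x, bool(np.sqrt(rs) <= tol)

rng = np.random.default_rng(1)
A = rng.normal(size=(8, 4))          # hypothetical sensitivities
b = rng.normal(size=8)
x_full, done = cg_normal_equations(A, b, time_budget_s=1.0)
x_cut, done_cut = cg_normal_equations(A, b, time_budget_s=-1.0)  # budget already used up
```

The flag tells the caller whether the result is a converged solution or an approximation that still has to be assessed along the lines of cases (i) to (iii).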
(100) In general, the problem (a) is ill-conditioned, that is to say that the condition number
(101)
of the problem (a) is generally very high and can overshoot values of 1.0E6, 1.0E8 or even 1.0E12. The condition of an individual manipulator can also reach values of up to 1.0E3. This leads to an instability of the numerical methods mentioned above. This can have the effect that the calculated solution impairs the problem, or that the algorithm regards the optimization problem as insoluble. This is the case particularly for the traditional simplex method.
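The effect of ill-conditioning can be illustrated numerically. The sketch below uses a small hypothetical sensitivity matrix with two nearly parallel columns and evaluates the 2-norm condition number of A and of $A^t A$; it relies on the known fact that forming the normal equation squares the condition number.

```python
import numpy as np

# Hypothetical sensitivities of two manipulators with almost identical effect.
A = np.array([[1.0, 1.0],
              [1.0, 1.0001],
              [1.0, 0.9999]])

cond_A = np.linalg.cond(A)            # ratio of largest to smallest singular value
cond_normal = np.linalg.cond(A.T @ A) # condition of the normal equation matrix

print(f"cond(A)     = {cond_A:.3e}")
print(f"cond(A^t A) = {cond_normal:.3e}")
```

Two manipulators with nearly the same sensitivity pattern are thus enough to push the condition of the normal equation into the range of 1.0E8 mentioned above.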
(102) Methods for regularizing such high-dimensional, ill-conditioned (synonymously: ill-posed) problems include Singular-Value-Decomposition (SVD decomposition) with singular value truncation, and also Tikhonov regularization. In the case of SVD decomposition, the matrix A is diagonalized with regard to its eigenvectors, and the resulting new degrees of freedom of the manipulator system are sorted according to the absolute value of their eigenvalues. Degrees of freedom which correspond to eigenvectors with eigenvalues of an absolute value of less than 1.0E-7, for example, are not utilized in the adjustment. In this respect, cf. Andreas Rieder, Keine Probleme mit Inversen Problemen, Vieweg, 2003, chapter 2.3 Spektraltheorie kompakter Operatoren: Die Singulärwertzerlegung [Spectral theory of compact operators: the singular value decomposition], pp. 28-35, in particular definition 2.3.6, and also Angelika Bunse-Gerstner, Wolfgang Bunse, Numerische Lineare Algebra, Teubner Studienbücher, 1985, pp. 26-27, formulation 1.6.10, for numerical calculation pp. 288-296, in particular algorithm 4.8.1. These contents shall hereby be incorporated within their full scope in this application. A further possibility of regularization is preconditioning. Cf. Angelika Bunse-Gerstner, Wolfgang Bunse, Numerische Lineare Algebra, Teubner Studienbücher, 1985, in particular section Vorkonditionierung [preconditioning], pp. 156-160, which shall hereby be incorporated by reference within its full scope in this application.
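Singular value truncation can be sketched as follows. The example reads the "eigenvalues" of the text as the singular values of A, which is an assumption of this sketch; the function name and the test matrix are hypothetical, and the threshold 1.0E-7 is the example value given above.

```python
import numpy as np

def truncated_svd_solve(A, b, threshold=1e-7):
    """Least-squares solution that ignores singular directions below `threshold`.
    Returns the movements x and the number of degrees of freedom actually used."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    keep = s >= threshold
    coeffs = (U[:, keep].T @ b) / s[keep]
    return Vt[keep].T @ coeffs, int(keep.sum())

# Hypothetical sensitivities: the two columns differ only at the 1e-9 level.
A = np.array([[1.0, 1.0],
              [1.0, 1.0 + 1e-9],
              [1.0, 1.0 - 1e-9]])
b = np.array([1.0, 2.0, 0.0])

x_tsvd, n_used = truncated_svd_solve(A, b)
```

Only one combined degree of freedom survives the truncation here, and the resulting movements stay small instead of being amplified by the tiny singular value.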
(103) Tikhonov regularization involves solving the minimization problem
$\min \|Ax - b\|_2^2 + \|Gx\|_2^2$ (a'')
with a suitably chosen matrix G instead of the problem (a).
(104) In this case a multiple of the unit matrix is especially appropriate as a choice for G. In this case, the above minimization problem is presented as
$\min \|Ax - b\|_2^2 + \gamma \|x\|_2^2$ (a''')
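A minimal sketch of the Tikhonov step: setting the gradient of $\|Ax - b\|_2^2 + \|Gx\|_2^2$ to zero gives the linear system $(A^t A + G^t G)x = A^t b$. The symbol gamma for the scalar regularizing parameter, the function names, and the random data are assumptions of this example.

```python
import numpy as np

def tikhonov(A, b, G):
    """Minimize ||Ax - b||_2^2 + ||Gx||_2^2 via (A^t A + G^t G) x = A^t b."""
    return np.linalg.solve(A.T @ A + G.T @ G, A.T @ b)

def tikhonov_scalar(A, b, gamma):
    """Special case in which G is a multiple of the unit matrix, G = sqrt(gamma) * I."""
    return tikhonov(A, b, np.sqrt(gamma) * np.eye(A.shape[1]))

rng = np.random.default_rng(3)
A = rng.normal(size=(6, 3))   # hypothetical sensitivities
b = rng.normal(size=6)

x_weak = tikhonov_scalar(A, b, 1e-3)    # little regularization
x_strong = tikhonov_scalar(A, b, 10.0)  # strong regularization, smaller movements
```

A larger gamma trades a larger residual aberration for smaller manipulator movements.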
(105) Alternatively, a correlation matrix is used as the matrix G. The latter is formed by treating the individual degrees of freedom $x_i$ as random variables. The distribution thereof over their respective interval of possible movements is determined statistically. In this case, as a first possibility, a starting distribution is taken as a basis, the parameters of said starting distribution being estimated using statistical methods during the operation of the projection exposure apparatus (a posteriori), or, as a second possibility, a look-up table is used (a priori). In the case of the first possibility, Gaussian distributions, in particular, are used as starting distributions with the expected values $E(x_i)$ and the variances $\sigma(x_i)$ as parameters to be estimated.
(106) Tikhonov regularization seeks a compromise between minimum error $\|Ax - b\|_2^2$ and, in the case (a'''), minimum movements $\|x\|$ of the manipulators. The question with regard to an optimum $\gamma$ is preferably answered using the L-curve method. In the case of the latter, the movements obtained in the case of the minimization are plotted as a function of $\gamma$. The $\gamma$ which has the gradient having the largest absolute value is then selected as the optimum $\gamma$. For further details in this respect, cf. Analysis of discrete ill-posed problems using the L-curve, P. C. Hansen, SIAM Review, 34, 4, 1992, pp. 561-580, which shall hereby be incorporated within its full scope in this application. The generalization specified in chapter 3.6 Heuristic (epsilon-free) parameter strategies, pp. 82-90, of Andreas Rieder, Keine Probleme mit Inversen Problemen [No problems with inverse problems], Vieweg, 2003, shall likewise be incorporated within its full scope in this application.
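The selection rule described above can be sketched numerically. The example evaluates the movement norm for a grid of regularizing parameters and picks the value with the steepest slope. The logarithmic grid, the finite differences with respect to log10 of the parameter, and all names are assumptions of this sketch; Hansen's L-curve criterion in the cited reference is formulated differently (via curvature).

```python
import numpy as np

def movement_norms(A, b, gammas):
    """Norm of the Tikhonov solution for each regularizing parameter in `gammas`."""
    n = A.shape[1]
    AtA, Atb = A.T @ A, A.T @ b
    return np.array([np.linalg.norm(np.linalg.solve(AtA + g * np.eye(n), Atb))
                     for g in gammas])

def pick_gamma(A, b, gammas):
    """Pick the parameter where the movement norm changes fastest (assumed rule)."""
    norms = movement_norms(A, b, gammas)
    slope = np.gradient(norms, np.log10(gammas))
    return gammas[np.argmax(np.abs(slope))], norms

# Hypothetical ill-conditioned sensitivities: columns scaled over three decades.
rng = np.random.default_rng(4)
A = rng.normal(size=(10, 4)) * np.array([1.0, 1e-1, 1e-2, 1e-3])
b = rng.normal(size=10)

gammas = np.logspace(-8, 2, 41)
gamma_opt, norms = pick_gamma(A, b, gammas)
```

The movement norm decreases as the parameter grows, so the selected value marks the region where additional regularization reduces the movements most strongly.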
(107) On account of the different types of manipulators, the regularizing parameter $\gamma$ is alternatively configured in vector-valued fashion, $\gamma = (\gamma_i)$, where i indicates the degrees of freedom of all the manipulators. G in (a'') is then a diagonal matrix and then acquires the form
(108)
(109) The minimization problem $(a^v)$ can be formulated both for static sensitivities A and for scan-integrated sensitivities. In the latter case, the dimension of the Euclidean norm decreases by the factor j of the number of field points $p_{ij}$ in the scanning direction. Correspondingly, the image aberration vector b is also replaced by the scan-integrated image aberration vector.
(110) In lithographic applications wherein it is desired to achieve a specific predefined ratio of scan-integrated to static image aberrations, a combination of scan-integrated and static sensitivities is also used for formulating the minimization problem:
(111)
(112) The weights d are occasionally set to 0 since not all the Zernike coefficients influence the fading. In particular, the spherical coefficients, that is to say those whose Zernike polynomials are rotationally symmetrical, are weighted with 0.
(113) In this case, the image aberration vector {tilde over (b)} is defined by {tilde over (b)}=b
(114) What can prove to be problematic in the case of the above Tikhonov regularizations is that the regularizing parameter only affects the square of the movements of the individual manipulators and thus ignores the sign of the movements. This can result in the formation of a drift of the affected manipulator in a preferred direction, which ultimately entails the risk of an overshooting of the manipulator range. This is combated in two ways:
(115) A posteriori: (a) is replaced by
$\min \|Ax - b\|^2 + \gamma_1 \|x\|^2 + \gamma_2 \|x - x'\|^2 + \gamma_3 x + \gamma_4 (x - x')$ $(a^v)$
(116) In this case, the parameters $\gamma_1$ and $\gamma_2$ are scalars, and $\gamma_3$ and $\gamma_4$ are vectors having dimensions corresponding to the degrees of freedom of the manipulators. $x'$ are the movements at which the individual manipulators are situated at the current point in time. $x$ are the notified movements that are to be assessed. $\gamma_1$ is the proportionality constant used to weight the total movement $x$ of the manipulators, independently of their instantaneous movement state $x'$ and their movement direction. It defines the extent to which an excessively high total movement of the manipulators is intended to be penalized. $\gamma_2$ is the proportionality constant used to weight the additional movement $x - x'$ necessary to attain the movement $x$ from the movement $x'$, independently of its direction. It defines the extent to which an instantaneous movement of the manipulators is penalized. $\gamma_3$ is a vector. Its direction predefines the direction in which the total movement $x$ of the manipulators is unfavorable, and its absolute value defines the extent to which an excessively high total movement of the manipulators in this direction is penalized. $\gamma_4$ is a vector. Its direction predefines the direction in which an additional movement $x - x'$ of the manipulators is unfavorable, and its absolute value defines the extent to which an excessively high additional movement of the manipulators in this direction is penalized.
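The functional with four penalty terms is still quadratic, so its minimizer can be written down in closed form. The sketch below assumes the reading $\|Ax - b\|^2 + \gamma_1\|x\|^2 + \gamma_2\|x - x'\|^2 + \gamma_3 \cdot x + \gamma_4 \cdot (x - x')$, with $x'$ (here `x_cur`) the current movements; setting the gradient to zero gives $(A^t A + (\gamma_1 + \gamma_2) I)\,x = A^t b + \gamma_2 x' - (\gamma_3 + \gamma_4)/2$. Function names and numbers are hypothetical.

```python
import numpy as np

def penalized_movement(A, b, x_cur, g1, g2, g3, g4):
    """Closed-form minimizer of the assumed functional (g1, g2 scalars; g3, g4 vectors)."""
    n = A.shape[1]
    lhs = A.T @ A + (g1 + g2) * np.eye(n)
    rhs = A.T @ b + g2 * x_cur - 0.5 * (g3 + g4)
    return np.linalg.solve(lhs, rhs)

def merit(A, b, x, x_cur, g1, g2, g3, g4):
    """Value of the assumed functional at x."""
    return (np.sum((A @ x - b) ** 2) + g1 * (x @ x)
            + g2 * np.sum((x - x_cur) ** 2) + g3 @ x + g4 @ (x - x_cur))

rng = np.random.default_rng(5)
A = rng.normal(size=(6, 3))              # hypothetical sensitivities
b = rng.normal(size=6)
x_cur = np.array([0.5, -0.2, 0.0])       # current manipulator movements x'
g1, g2 = 0.1, 0.2
g3 = np.array([0.1, 0.0, 0.0])           # discourages positive total movement of dof 0
g4 = np.array([0.0, 0.05, 0.0])          # discourages positive additional movement of dof 1

x_new = penalized_movement(A, b, x_cur, g1, g2, g3, g4)
```

The linear terms are what distinguish a preferred direction; the quadratic terms alone would treat both signs of a movement alike.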
(117)
(118) A priori: In (a), $\gamma$ is replaced by a suitable function such as, for example,
(119)
where $[x_{i,min}, x_{i,max}]$ in each case describes the individual range of the i-th degree of freedom of the manipulator. Cf. in addition the interior point method, which is explained further below.
(120) What is disadvantageous about these Tikhonov weightings with parameters $\gamma_i$ is that, for a given movement of the manipulators $x = (x_i)$, the residual image aberrations $Ax - b$ and the manipulator movements $x_i$ couple only indirectly, that is to say by way of the merit function (a)-$(a^v)$. In this case, the type of coupling is independent of the image aberration level that can be attained and also independent of the manipulator movements necessary therefor.
(121) One variant of the Tikhonov regularization mentioned above consists in calculating a first manipulation prescription $x_1$ with a regularizing, vector-valued parameter $\gamma_1$. This manipulation prescription $x_1 = (x_{1i})$ is then examined as to the extent to which individual degrees of freedom i of manipulators are moved little or not at all. In the case of such degrees of freedom, the associated regularizing parameter is decreased, $\gamma_{2i} < \gamma_{1i}$, such that the associated degree of freedom of the manipulator contributes less to the merit function (a)-$(a^v)$. Conversely, all degrees of freedom whose movements, in the case of the calculated prescription, are close to their range limits are allocated a higher regularizing parameter $\gamma_{2i} > \gamma_{1i}$. These increases and decreases can be chosen in each case by 10%, or 20%, or 50% or 100%. With this new parameter $\gamma_2 = (\gamma_{2i})$, a second manipulation prescription $x_2$ is calculated and compared with $x_1$. That manipulation prescription $x_{1,2}$ which has the smaller residual image aberration $\|Ax_{1,2} - b\|$ is preferred to the other manipulation prescription if it comprises no range overshooting. Alternatively, others of the preferences mentioned above are appropriate in the selection of the manipulator prescriptions to be implemented, in particular that of the stability of the solution x.
(122) As an alternative thereto, this method is carried out in multistage fashion with manipulation prescriptions $x_1, x_2, \dots, x_n$ and residual image aberrations $\|Ax_1 - b\|, \|Ax_2 - b\|, \dots, \|Ax_n - b\|$, where the manipulation prescription which comprises no range overshooting and achieves the smallest residual image aberration is finally selected. In this case, "moved little" is understood to mean a movement of less than 10%, or less than 20% or less than 50% of the available range for the relevant degree of freedom, and "close to the range limit" is understood to mean a movement of more than 50%, or more than 80% or more than 90% of the available range for the relevant degree of freedom. In addition, these gradations 50%, 20%, 10% can also be varied during this multistage method.
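One stage of this adaptation can be sketched as follows. The example assumes symmetric ranges $[-r_i, r_i]$, thresholds of 10% and 90% of the range, and a change of the regularizing parameters by 50%; these are choices from the value lists given above, while the function names and data are hypothetical.

```python
import numpy as np

def tikhonov_vec(A, b, gamma):
    """Minimize ||Ax - b||^2 + sum_i gamma_i * x_i^2 (vector-valued parameter)."""
    return np.linalg.solve(A.T @ A + np.diag(gamma), A.T @ b)

def adapt_gamma(x, gamma, x_range, little=0.1, close=0.9, change=0.5):
    """Lower gamma_i where a degree of freedom is moved little, raise it near the limit."""
    used = np.abs(x) / x_range               # fraction of the available range in use
    gamma2 = gamma.copy()
    gamma2[used < little] *= 1.0 - change
    gamma2[used > close] *= 1.0 + change
    return gamma2

def two_stage(A, b, gamma1, x_range):
    """Return the in-range prescription with the smaller residual, or None."""
    x1 = tikhonov_vec(A, b, gamma1)
    x2 = tikhonov_vec(A, b, adapt_gamma(x1, gamma1, x_range))
    in_range = [x for x in (x1, x2) if np.all(np.abs(x) <= x_range)]
    if not in_range:
        return None
    return min(in_range, key=lambda x: np.linalg.norm(A @ x - b))

rng = np.random.default_rng(6)
A = rng.normal(size=(6, 3))                       # hypothetical sensitivities
b = np.array([1.0, -0.5, 0.3, 0.8, -1.2, 0.4])    # hypothetical aberrations
gamma1 = np.full(3, 0.1)
x_range = np.full(3, 10.0)

chosen = two_stage(A, b, gamma1, x_range)
```

Repeating the adaptation step in a loop gives the multistage variant; the selection rule at the end stays the same.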
(123) As an alternative thereto, the individual image aberration weightings can also be carried out in multistage fashion. This is preferably performed with the aid of a weighted Euclidean norm $\|\cdot\|_{2,d}$, the weighting $d = (d_i)$ of which is varied. (a) then has the following form in the i-th method step
$\min \|Ax_i - b\|_{2,d_i}^2$
where, for i = 0, $d_0$ as start value is set to 1 and then $d_i$ for $i \ge 1$ is functionally dependent on the specifications $\mathrm{spec}_i$ of the individual image aberrations $b_i$ and the residual image aberration $b_i = Ax_i - b$ that can be achieved in the previous method stage. It has proved to be expedient to predefine the following functional relationship:
(124)
if i is an even number and
(125)
if i is an odd number.
(126) This weighting thus defined is to be performed individually for each specification spec of an image aberration. This can involve weightings of individual Zernikes at selected field points, scan-integrated image aberrations such as fading FAD.sub.x, FAD.sub.y or fully integrating image aberrations such as rms.
(127) These two multistage methods can also be combined.
(128) Methods which implicitly regularize weakly, such as, e.g., the cg method (conjugate gradient method), are alternatively employed. Details and numerous further methods can be gathered from, e.g., Rieder, A., Keine Probleme mit inversen Problemen, Vieweg, 2003, which is hereby incorporated by reference within its full scope in this application.
(129) Methods of the type "Ruin and Recreate" are furthermore alternatively used. In this case, an already determined solution $x^1_1, \dots, x^1_n$ to the inverse problem is taken as a basis, which solution is intended to be improved. A portion of the manipulators, e.g. $i = m, m+1, \dots, n$, is then shut down, that is to say their degrees of freedom are not used. A solution $x^2_1, \dots, x^2_{m-1}$ with this reduced set of manipulators is then determined. The solution thus determined is naturally worse than the previous solution (Ruin). The shut-down manipulators are then activated again, although $x^2_1, \dots, x^2_{m-1}$ is no longer altered (Recreate), that is to say that $x_m, x_{m+1}, \dots, x_n$ are available as degrees of freedom. Overall, a solution $x^2_1, \dots, x^2_n$ is thus generated, which is compared with the solution $x^1_1, \dots, x^1_n$.
(130) Ruin and Recreate is used particularly in iterative methods as an intermediate step which is intended to prevent the iterative method from getting stuck at a suboptimum solution.
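A single Ruin and Recreate step can be sketched as follows. The inner solver is plain least squares, which is an assumption made only to keep the example short; any of the regularized solvers described above could take its place. Note that the code uses zero-based indexing, so `m` is the number of manipulator degrees of freedom that remain active during the ruin phase.

```python
import numpy as np

def ruin_and_recreate(A, b, x1, m):
    """One Ruin-and-Recreate step starting from the known solution x1.
    Returns the better of the old and the new solution with its residual norm."""
    head, tail = A[:, :m], A[:, m:]
    # Ruin: solve with the reduced manipulator set only.
    x_head, *_ = np.linalg.lstsq(head, b, rcond=None)
    # Recreate: keep x_head fixed and let the reactivated manipulators
    # correct the remaining aberration.
    x_tail, *_ = np.linalg.lstsq(tail, b - head @ x_head, rcond=None)
    x2 = np.concatenate([x_head, x_tail])
    r1 = np.linalg.norm(A @ x1 - b)
    r2 = np.linalg.norm(A @ x2 - b)
    return (x2, r2) if r2 < r1 else (x1, r1)

rng = np.random.default_rng(7)
A = rng.normal(size=(6, 4))     # hypothetical sensitivities
b = rng.normal(size=6)
# Hypothetical starting solution: a strongly regularized Tikhonov solution.
x1 = np.linalg.solve(A.T @ A + 1.0 * np.eye(4), A.T @ b)

x_best, r_best = ruin_and_recreate(A, b, x1, m=2)
```

Because the old solution is kept whenever the new one is not better, the step can be inserted into an iterative method without risk of worsening the current residual.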
(131) In general, however, there is no intention at all to reduce a specific image aberration to zero. The objective only has to ensure an imaging performance suitable for the lithography. Said imaging performance is generally ensured by upper bounds for those image aberrations which are critical for the imaging performance of the objective. They include for example scale error, telecentricity error, overlay and depth of focus, and also image aberrations arising as a result of integration of a plurality of field points, such as rms, grouped rms, Fading, and also lithographic requirements and further wavefront dimension figures. Cf. $\mathrm{spec}_M$, $\mathrm{spec}_R$, $\mathrm{spec}_G$ and $\mathrm{spec}_F$ from 1)-6) as indicated above.
(132) For the further description, initially no distinction is made with regard to different image aberrations and the relevant upper bound is always designated by spec.
(133) Instead of equation (a), now a solution to the inequality
$|Ax - b| \le \mathrm{spec}$
or equivalently
$Ax - b \le \mathrm{spec}$
$-(Ax - b) \le \mathrm{spec}$ (b)
is sought, which always has a solution given a suitable spec predefinition. The inequality sign should in this case be interpreted in vector-valued fashion. This affords the possibility of interpreting the inequality (b) as a side condition of the minimization problem
$\min c^t x$ (b')
where the latter with a selectable weight vector c affords the possibility of influencing the relative manipulator movements $x = (x_1, \dots, x_n)$.
(134) Linear programming methods are used as an algorithm for solving (b), (b'). Besides the Simplex method (see Jarre, F., Stoer, J., Optimierung [Optimization], Springer, 2004) or more general active set methods of linear programming, the interior point method is used (see Fiacco, A. V., McCormick, G. P., Nonlinear Programming: Sequential Unconstrained Minimization Techniques, John Wiley & Sons, 1968; Karmarkar, N., A new polynomial-time algorithm for linear programming, Combinatorica 4 (1984), no. 4, 373-395; or Mehrotra, S., On the implementation of a primal-dual interior point method, SIAM J. Optim. 2 (1992), no. 4, 575-601). The latter guarantees polynomial convergence, in contrast to the Simplex method. These sources shall hereby be incorporated by reference within their full scope in this application.
(135) In the case of the interior point method, (b') is replaced by
(136)
while the side conditions (b) are maintained. (c) is solved with the aid of the Newton method, in which case the barrier parameter tends to 0 in the course of the (iterative) method depending on the results of the Newton method.
(137) As a further alternative to the minimization problem (b), (b') and the linear programming used for solving said problem, quadratic programming is used. This involves solving, instead of (b'), the problem
(138)
under the side conditions
$Ax - b \le \mathrm{spec}$
$-(Ax - b) \le \mathrm{spec}$ (d')
(139) The matrix H is again chosen in a suitable manner, e.g. the identity matrix.
(140) The methods from Dantzig-Wolfe and alternatively from Hildreth-d'Esopo or Quadprog are used for solving (d), (d'). Cf. C. Hildreth, A quadratic programming procedure, Naval Res. Logistics Quart. 4 (1957) 79-85 (Erratum, ibid., p. 361). D. A. D'Esopo, A convex programming procedure, Naval Res. Logistics Quart. 6 (1959) 33-42. Philip Wolfe, The Simplex Method for Quadratic Programming, Econometrica, Vol. 27, No. 3 (July, 1959), pp. 382-398. Gill, P. E., W. Murray, and M. H. Wright, Practical Optimization, Academic Press, London, UK, 1981. These sources shall hereby be incorporated by reference within their full scope in this application.
(141) In contrast to the problem (b), (b'), the condition of the problem (d) and (d') incorporates not only the matrix of the side conditions A, but also the condition of the matrix H. The following orders of magnitude are usually found:
(142) Condition of the matrix H in (d): 3.8E12
(143) Condition of the entire side conditions in (d): 3.2E5
(144) A further method used for solving (d) and (d') is the Downhill Simplex Method, cf. Nelder, J. A., R. Mead, A Simplex Method for Function Minimization, Computer J. 7 (1965), pp. 308-313, which shall hereby be incorporated by reference within its full scope in this application. This method is a derivative-free method which generally has linear convergence and is numerically robust. However, as a result of manipulator restrictions, predefined edges of the adjustment polyhedron can be implemented only inadequately.
(145) Besides the methods mentioned, further methods are used, such as simulated annealing, cf. Dueck et al., Threshold Accepting: A General Purpose Optimization Algorithm Appearing Superior to Simulated Annealing, Journal of Comp. Physics, Vol. 9, pp. 161-165, 199, which shall hereby be incorporated by reference within its full scope in this application, or evolutionary (e.g., genetic) algorithms. The disadvantages of these methods are, firstly, that they are generally stochastic in nature and, secondly, that convergence toward the global minimum is not necessarily provided.
(146) Hitherto, the problem area has remained open as to how the spec predefinitions should be chosen in order that a solution to the inverse problem can be ensured in the first place.
(147) For this purpose, it is possible to modify the problem (b) and (b') to
$\min t, \quad |Ax - b| \le t \cdot \mathrm{spec}$ (e)
spec accordingly becomes a variable spec: $t \cdot \mathrm{spec}$. This problem can be solved with the aid of linear programming. A disadvantage of this method is that it constitutes a purely linear problem and does not permit simple regularization (see Gembicki, F. W., Vector Optimization for Control with Performance and Parameter Sensitivity Indices, Ph.D. Thesis, Case Western Reserve Univ., Cleveland, Ohio, 1974 and U.S. Pat. No. 7,301,615).
(148) Methods of nonlinear programming are also alternatively used for solving the problems (a)-(e). In this respect, see Gill, P. E., W. Murray, M. H. Wright, Numerical Linear Algebra and Optimization, Vol. 1, Addison Wesley, 1991 and K. Schittkowski (1981): The nonlinear programming method of Wilson, Han and Powell. Part 1: Convergence analysis, Numerische Mathematik, Vol. 38, 83-114, Part 2: An efficient implementation with linear least squares subproblems, Numerische Mathematik, Vol. 38, 115-127 which shall hereby be incorporated by reference within their full scope in this application.
(149) The following method, called "active constraints method" hereinafter, is also used as a further variant: the matrix $\bar{A}_k$ of the active constraints is formed iteratively from the set of the above side conditions. This is carried out inductively as follows:
(150) Induction Basis:
(151) $x_0$ is the start state if the latter meets the side conditions. Should this not be the case, then the side conditions are weakened until this is the case. The adjustment polyhedron is, as it were, "inflated". This inflation is possible since the specifications are specified with the aid of the Gembicki variables t, i.e. it is possible to relax t such that the side conditions are fulfilled. For the purpose of generating a start state, alternatively it is possible to use, e.g., Tikhonov regularization with stronger movement distance restriction. Alternatively, it is even possible to take the unoptimized state as a start state, in which case a higher number of iterations will be necessary in order to attain the vicinity of the optimum.
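The "inflation" of the adjustment polyhedron can be illustrated for a single scalar Gembicki variable t: for a given state x, the smallest t for which $|Ax - b| \le t \cdot \mathrm{spec}$ holds is the largest ratio of residual to spec. The function name and the numbers below are hypothetical.

```python
import numpy as np

def inflation_factor(A, x, b, spec):
    """Smallest t such that |Ax - b| <= t * spec holds componentwise."""
    return float(np.max(np.abs(A @ x - b) / spec))

A = np.eye(2)                      # hypothetical sensitivities
b = np.array([2.0, 0.5])           # hypothetical image aberrations
spec = np.array([1.0, 1.0])        # upper bounds
x_start = np.zeros(2)              # unoptimized state as start state

t = inflation_factor(A, x_start, b, spec)
feasible = bool(np.all(np.abs(A @ x_start - b) <= t * spec))
```

A value t above 1 means the start state violates the original specs and is admissible only for the inflated polyhedron; the subsequent iterations then aim at bringing t back down.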
(152) Induction Step:
(153) Suppose then that the state $x_k$ meets all the side conditions. Some side conditions are met almost exactly, that is to say that in (d') "=" holds true apart from a small deviation $\varepsilon$, e.g., $\varepsilon$ < 1E-8. These are the active constraints. The space that is orthogonal to the active constraints with respect to the Euclidean scalar product is then formed and the minimization problem (d) is solved in said space. If actuating distances of the manipulators that are obtained in this way are not within the permitted range, then they are suitably trimmed at the edge of the permitted range, whereby the state $x_{k+1}$ that is permissible with respect to the actuating distances of the manipulators is attained.
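The building blocks of this induction step can be sketched as follows: stacking the two-sided side conditions into the form $Cx \le h$, detecting the active rows within a tolerance, forming a basis of the space orthogonal to them, and trimming movements to the permitted range. All function names, the stacking convention, and the example numbers are assumptions of this sketch; solving the quadratic problem within the reduced space is omitted.

```python
import numpy as np

def stack_constraints(A, b, spec):
    """Write Ax - b <= spec and -(Ax - b) <= spec as C x <= h."""
    return np.vstack([A, -A]), np.concatenate([spec + b, spec - b])

def active_rows(C, h, x, eps=1e-8):
    """Indices of side conditions that hold with equality up to eps."""
    return np.flatnonzero(h - C @ x <= eps)

def orthogonal_basis(C_active, tol=1e-12):
    """Orthonormal basis of the space orthogonal to the active constraint rows."""
    n = C_active.shape[1]
    if C_active.shape[0] == 0:
        return np.eye(n)
    _, s, Vt = np.linalg.svd(C_active)
    rank = int(np.sum(s > tol))
    return Vt[rank:].T

A = np.eye(2)                       # hypothetical sensitivities
b = np.zeros(2)
spec = np.array([1.0, 1.0])
x_k = np.array([1.0, 0.3])          # first side condition is met exactly

C, h = stack_constraints(A, b, spec)
active = active_rows(C, h, x_k)
Z = orthogonal_basis(C[active])     # admissible search directions
x_trimmed = np.clip(np.array([1.5, -0.2]), -1.0, 1.0)  # trimming to the range
```

A search restricted to the columns of Z leaves the active side conditions unchanged, which is what keeps the iterates on the boundary they have already reached.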
(154) Details concerning this induction step can be gathered e.g. from W. Alt, Nichtlineare Optimierung [Nonlinear optimization], Vieweg 2002. This source shall hereby be incorporated by reference within its full scope in this application.
(155) Consequently, a sequence (x.sub.k) is constructed which converges toward the global optimum x. In this respect, also cf. (Gill, P. E., W. Murray, M. H. Wright, Numerical Linear Algebra and Optimization, Vol. 1, Addison Wesley, 1991). Besides the traditional QPSOL, a Fortran program for quadratic programming, LSSOL, in particular, shall be mentioned here (cf. the program package http://www.sbsi-sol-optimize.com/asp/sol_product_lssol.htm from Stanford Business Software Inc.).
(156) As an alternative, instead of the start state, the result of the calculation of a solution using linear programming, as is proposed as a standard method in the literature, could also be utilized as an induction basis. This is even absolutely necessary if Gembicki variables are not used. On account of the unsatisfactory and poorly estimable convergence behavior of linear programming, this is disadvantageous for real-time optimizations.
(157) Genetic or rather, generally, evolutionary algorithms can also be used for solving the inverse problem. They are characterized in that they iteratively run through the phases of initialization, evaluation, selection, recombination and mutation until a suitable termination criterion is fulfilled.
(158) The numerical methods specified above are not only used in pure form, rather they can also be changed for each necessary solution to an inverse problem. In particular, this change, in the case of iterative methods, can also be performed in the form of approximations to the solution which would bring no a priori undershooting of the overshot upper bounds, if, in the case of such an approximation, a change to an alternative method promises a better convergence.
(159) The optimization methods already listed have strengths, but also weaknesses. These are specifically:
Quadratic optimization without side conditions (a), (a'), (a''), (a'''): owing to a lack of side conditions, only implicit access to a large portion of the variables to be optimized (e.g., Zernike specs), and also the risk of violation of the movement distance restrictions for the manipulator movements
Linear programming (b), (c): quadratic optimization terms cannot be taken into account
Quadratic programming (d): nonlinear side conditions cannot be taken into account; the question regarding optimum spec predefinitions is open
(160) Therefore, there is the additional problem of how the necessary regularization is suitably integrated into the non-trivial optimization method to be chosen.
(161) What is disadvantageous about the idea of F. Gembicki is that, on the one hand, it is an extension of linear programming, without any direct possibility for regularization, and, on the other hand, it only optimizes an individual, global spec. On the one hand, this can lead to very large, virtually impracticable movement distances; on the other hand, all the specs are deliberately exhausted to the permitted limit. This can have the effect that, for the purpose of a minimum global spec improvement, some specs are exhausted considerably further.
(162) The disclosure uses a further algorithm, which combines the positive properties of the algorithms mentioned above whilst avoiding their disadvantages. It is outlined as follows (the term "multivariable specs" is defined further below):
1. Reduction of a quadratic optimization problem under linear and quadratic side conditions to quadratic programming
2. Simplification of the start value finding by use of variable and/or multivariable specs
3. Tikhonov regularization with quadratic programming
4. Adaptation of the "active constraints method" to multivariable specs and application to quadratic programming
(163) Specifically, besides the matrix A, comprising the sensitivities of the manipulators, the upper bounds, referred to as specs, are defined in the following manner:
(164) Side conditions of linear type:
1) Zernike specs, characterized by a vector with spec predefinition $\mathrm{spec}_A$
2) determined (measured and/or (partly) extrapolated) error, characterized by a vector with spec predefinition b
3) maximum movements of manipulators, characterized by a vector with spec predefinition $\mathrm{spec}_V$ and a current movement distance state $v_b$. It can happen here that the actual maximum movements are to be calculated from the actuating distances with the aid of a matrix V (e.g. in the case of heat and temperature side conditions)
4) lithographic system variables such as e.g. overlay or best focus, characterized by a matrix L and a vector with spec predefinition $\mathrm{spec}_L$
5) further linear optimization side conditions, characterized by a matrix M and a vector with spec predefinition $\mathrm{spec}_M$
(165) Side conditions of nonlinear type:
6) Fading specs, characterized by the positive definite Hermitian matrix F with spec predefinition $\mathrm{spec}_F$
7) RMS specs, characterized by the positive definite Hermitian matrix R with spec predefinition $\mathrm{spec}_R$
8) grouped RMS spec, characterized by the positive definite Hermitian matrix G with spec predefinition $\mathrm{spec}_G$
9) further quadratic optimization side conditions, characterized by a matrix Q and a vector with spec predefinition $\mathrm{spec}_Q$
(166) The optimization problem to be considered therefore has the following maximum possible side conditions:
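$Ax - b \le \mathrm{spec}_A$
$-Ax + b \le \mathrm{spec}_A$
$L(Ax - b) \le \mathrm{spec}_L$
$-L(Ax - b) \le \mathrm{spec}_L$
$M(Ax - b) \le \mathrm{spec}_M$
$-M(Ax - b) \le \mathrm{spec}_M$
$V(Ax - v_b) \le \mathrm{spec}_V$
$-V(Ax - v_b) \le \mathrm{spec}_V$
$x^t F x - 2 b^t F x + b^t b \le \mathrm{spec}_F$
$x^t R x - 2 b^t R x + b^t b \le \mathrm{spec}_R$
$x^t G x - 2 b^t G x + b^t b \le \mathrm{spec}_G$
$x^t Q x - 2 b^t Q x + b^t b \le \mathrm{spec}_Q$ (f)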
(167) The upper and lower limits do not have to be symmetrical, for example. One-sided limits may also be necessary.
(168) Firstly, a suitable minimization function is additionally chosen freely. However, this can have the effect that the calculated solution x has a very large norm and thus has very large manipulator movements. This has the effect that large changes to the manipulator actuating distances have to be made in the case of small changes in the objective state. This can greatly impair practical implementability.
(169) The use of Tikhonov regularization with a suitably chosen weight matrix W.sub.Tikh solves this problem. The weight matrix W.sub.Tikh is preferably generated by identical weighting of different degrees of freedom of identical type. The minimization problem, with the above side conditions being maintained, then reads
$\min x^t W_{Tikh}^t W_{Tikh} x$ (f')
(170) Linear programming cannot be applied to this statement of the problem owing to the nonlinear optimization term in (f'). On account of the nonlinear side conditions from (f) (the last four lines thereof), the advantages of quadratic programming (good and stable convergence toward the minimum, selection from various fast algorithms) likewise cannot initially be utilized.
(171) Linear and nonlinear side conditions are present, then, in (f). Besides the two sources for nonlinear programming as specified above, the problem (f) can be solved with the aid of SQP, sequential quadratic programming. For details see W. Alt, Nichtlineare Optimierung, Vieweg 2002, which is incorporated by reference within its full scope. The method of sequential quadratic programming is based on iteratively locally linearizing the optimization problem and applying the above-described linear programming to this linearization in order thus to obtain a new start point for the linearization. The method of SQP additionally advantageously permits side conditions (f) and merit function (f) to be formulated with the aid of arbitrary functions.
(172) As an alternative, the quadratic side conditions in (f) can be replaced as follows by a multiplicity of linear side conditions, such that the problem resulting therefrom can be solved using the quadratic programming described further above. Each quadratic side condition spans an ellipse, which can be approximated with any desired accuracy by the intersection of a finite number of half-spaces, each bounded by a hyperplane and given by a respective linear side condition.
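The replacement of a quadratic side condition by linear ones can be sketched in two dimensions. The example assumes a diagonal matrix with entries `q1`, `q2` and the condition q1*x^2 + q2*y^2 <= s, and builds `n` tangent half-planes around the ellipse. All names are hypothetical. The resulting polygon contains the ellipse, so it slightly relaxes the spec; the worst-case overshoot at a polygon vertex is 1/cos^2(pi/n) and tends to 1 as `n` grows.

```python
import math

def tangent_planes(q1, q2, s, n):
    """Return n half-planes (normal, rhs) tangent to q1*x^2 + q2*y^2 = s."""
    planes = []
    for k in range(n):
        th = 2.0 * math.pi * k / n
        # Normal is Q*p for the boundary point p at parameter angle th.
        normal = (math.sqrt(s * q1) * math.cos(th),
                  math.sqrt(s * q2) * math.sin(th))
        planes.append((normal, s))
    return planes

def satisfies(point, planes, tol=1e-12):
    """True if the point fulfils every linear side condition normal.x <= rhs."""
    return all(nx * point[0] + ny * point[1] <= rhs + tol
               for (nx, ny), rhs in planes)

def worst_overshoot(n):
    """Factor by which a polygon vertex exceeds the quadratic spec s (n >= 3)."""
    return 1.0 / math.cos(math.pi / n) ** 2
```

For example, with 8 half-planes a vertex can exceed the quadratic spec by about 17%, with 64 half-planes by well under 1%, which gives a feel for how many linear side conditions a given accuracy costs.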
(173) A further method for resolving the side conditions (f) is explained below, this method being very advantageous with regard to the computational speed in comparison with SQP.
(174) With the aid of Lagrange multipliers (see Gill, P. E., W. Murray, M. H. Wright, Numerical Linear Algebra and Optimization, Vol. 1, Addison Wesley, 1991), the problem (f), (f) is reformulated in a canonical manner as follows:
(175) Solve
min x.sup.t(W.sub.Tikh.sup.tW.sub.Tikh+W.sub.F.sup.tFW.sub.F+W.sub.R.sup.tRW.sub.R+W.sub.G.sup.tGW.sub.G+W.sub.Q.sup.tQW.sub.Q)x+2(w.sub.R.sub.R+w.sub.G.sub.G+w.sub.F.sub.F+w.sub.Q.sub.Q).sup.tx(f)
under the side conditions
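$$
\begin{aligned}
Ax - p &\le \mathrm{spec}_A \\
-Ax + p &\le \mathrm{spec}_A \\
L(Ax - b) &\le \mathrm{spec}_L \\
-L(Ax - b) &\le \mathrm{spec}_L \\
M(Ax - b) &\le \mathrm{spec}_M \\
-M(Ax - b) &\le \mathrm{spec}_M \\
V(Ax - v_b) &\le \mathrm{spec}_V \\
-V(Ax - v_b) &\le \mathrm{spec}_V
\end{aligned}
\qquad (f)
$$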
(176) In this case, W.sub.F, W.sub.R and W.sub.G are suitable weight matrices for the quadratic component. These additional weight matrices can preferably also be multiplicative multiples of the unit matrix. Suitable additional weight matrices for the linear component are designated by w.sub.F, w.sub.R and w.sub.G.
(177) Preferably, the text below additionally provides a solution to the problem that on the one hand the predefined spec values in the side conditions are utilized up to the limit, but on the other hand no solution can be found in the case of excessively hard, that is to say non-relaxable, spec predefinitions, since the convex set spanned by the side conditions is empty. The following procedure will be referred to as multivariable specs.
(178) In this respect, let {tilde over (x)}:=(x,t).sup.t. In this case, the vector t can be formed from a high-dimensional space such as one having for instance more than 10 dimensions, or more than 100 dimensions or even more than 1000 dimensions. As above, the manipulator actuating distances to be optimized are designated by x and the Gembicki variables to be optimized are designated by t. A suitable spec matrix adapted to the Gembicki variables is designated by spe{tilde over (c)}_; it emerges from the vector spec_. For a suitable weight matrix W.sub.Gemb for the Gembicki variables, the optimization problem consists in minimizing
min {tilde over (x)}.sup.t(W.sub.Gemb.sup.tW.sub.Gemb+W.sub.Tikh.sup.tW.sub.Tikh+W.sub.F.sup.tFW.sub.F+W.sub.R.sup.tRW.sub.R+W.sub.G.sup.tGW.sub.G){tilde over (x)}+2(w.sub.R.sub.R+w.sub.G.sub.G+w.sub.F.sub.F).sup.t{tilde over (x)}(f.sup.v)
under the linear side condition
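$$
\begin{aligned}
Ax - p &\le \widetilde{\mathrm{spec}}_A\, t \\
-Ax + p &\le \widetilde{\mathrm{spec}}_A\, t \\
L(Ax - b) &\le \widetilde{\mathrm{spec}}_L\, t \\
-L(Ax - b) &\le \widetilde{\mathrm{spec}}_L\, t \\
M(Ax - b) &\le \widetilde{\mathrm{spec}}_M\, t \\
-M(Ax - b) &\le \widetilde{\mathrm{spec}}_M\, t \\
V(Ax - v_b) &\le \widetilde{\mathrm{spec}}_V\, t \\
-V(Ax - v_b) &\le \widetilde{\mathrm{spec}}_V\, t
\end{aligned}
\qquad (f^{v})
$$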
(179) The matrices occurring therein are suitably adapted, in comparison with the embodiment above, in accordance with the variable extension now carried out. It is likewise possible to provide a portion of the specified side conditions with Gembicki variables t, while the other side conditions are provided with hard spec limits, that is to say spec limits not multiplied by the parameter t. It is likewise possible to provide some, a plurality or all of the Gembicki variables multiplicatively with an additional Gembicki variable, which regulates the size of said Gembicki variables. This can be continued iteratively in nested fashion. It is likewise conceivable to provide specifications both with a hard spec and using an additional specification with a Gembicki variable. This advantageously brings about the economical spec exhaustion in combination with adherence to the hard spec within predefined limits. An inflation of the adjustment polyhedron is now possible in different ways: either add an additional, overall Gembicki variable and relax it until all side conditions are fulfilled, or relax only those already used Gembicki variables whose side conditions are not fulfilled. The latter method can be used provided that all hard side conditions are fulfilled.
(180) The formulation of the inverse problem with the aid of the Gembicki variables, as in the case of (e), or the multivariable specs, as in the case of (f.sup.v) in conjunction with (f.sup.v), has the additional advantage that this can ideally be combined with Quadprog for solving the problem. The algorithm of Quadprog presupposes that a start state which meets the side conditions can be specified. If said start state is determined using linear programming, this can already exceed the permissible computation time before a genuine minimization of the functional actually occurs at all. With the aid of the Gembicki variables, said state can be attained by the inflation of the adjustment polyhedron as already mentioned. The start point for Quadprog is thus attained by a softening of the specs and not by an alteration of the manipulator setting. An additionally afforded advantage of this combination of Quadprog and the formulation of the inverse problem with the aid of the Gembicki variables is a numerical consistency of the manipulator movements that is inherent to the algorithm. To put it more precisely: since the instantaneous manipulator movements are identified as good enough for the start value of Quadprog for the current inverse problem, the solution when using Quadprog numerically initially also does not depart from the optimum solution of (e), or (f.sup.v) in conjunction with (f.sup.v), that is finally determined by Quadprog. This saves computational time in addition to the above statements.
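The inflation of the adjustment polyhedron can be sketched as follows. The example assumes the simplest variant mentioned above, a single overall Gembicki factor `t` applied to two-sided linear side conditions |Ax - b|_i <= spec_i * t, and returns the smallest `t` for which the current manipulator setting `x0` is feasible. The pair (x0, t) can then serve as a feasible start state for a quadratic programming solver. The function name and the data layout (lists of rows) are assumptions made for illustration.

```python
def inflation_factor(A, b, spec, x0):
    """Smallest common factor t with |A x0 - b|_i <= spec_i * t for all rows i.

    A    : sensitivity matrix as a list of rows
    b    : image aberrations to be compensated
    spec : positive spec limits, one per row
    x0   : current manipulator setting, left unchanged
    """
    t = 0.0
    for row, bi, si in zip(A, b, spec):
        residual = sum(aij * xj for aij, xj in zip(row, x0)) - bi
        # Each row demands at least |residual| / spec_i; keep the largest demand.
        t = max(t, abs(residual) / si)
    return t
```

A value of `t` above 1 means the specs had to be softened to make the current setting admissible; the subsequent minimization then tries to bring `t` back down without first having to search for a feasible manipulator setting.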
(181) The inverse problems to be solved generally occur in such close succession temporally that the case can arise that a new inverse problem is to be solved before the manipulators attain the movements determined in accordance with the solution to the preceding inverse problem. In accordance with the above method, the manipulators would be stopped and the current position would be used as a start value, as described above. However, since it can be assumed that the image aberrations to be compensated for change continuously in small steps, the successive inverse problems to be solved also have neighboring solutions. Therefore, as an alternative, instead of the current positions of the manipulators, the movements determined in accordance with the preceding inverse problem or the movements determined in accordance with a preceding calculation step, in case of existence, are used as the start value for Quadprog for solving the new inverse problem.
(182) This overlap can also concern more than two inverse problems: the manipulators are moved in the direction of the solution x.sub.n to the n-th inverse problem. Before all the manipulators attain their end positions in accordance with said solution x.sub.n, the subsequent inverse problems n+1, . . . , n+m accumulate. For the start value of Quadprog for solving the (n+j)-th inverse problem, the solution to the (n+j-1)-th problem, as described above, using polyhedron inflation is then used. This procedure will be referred to as cascading Quadprog.
(183) In the case of an iterative solution to the inverse problem, the following property of the fine adjustment is also advantageously utilized: in the case of heating of the objective, the image aberrations initially vary greatly. With increasing heat absorption, however, a saturation state is established which only varies slightly from die to die. Furthermore, the solution to the adjustment problem is continuously dependent on the varying boundary conditions such as the heat input, for example. This has the consequence that even in the case of an only approximate numerical solution to the inverse problem and thus suboptimal regulation of the manipulator system, these ensure compliance with the upper bounds of the specification.
(184) This inertia of the heat absorption also has disadvantages, however, in the fine adjustment. If, during operation of the projection exposure apparatus, the illumination setting or the reticle is changed or if a new batch is begun, then a virtually discontinuous variation of the image aberration profile occurs. Therefore, the solution to the inverse problem no longer necessarily lies in the vicinity of the previous manipulator movements. In such a case, a relatively sluggish manipulator, such as a manipulator which applies heat to an optical element, for example, can require an unreasonable time period to attain the movement resulting from the new solution to the inverse problem. This problem can be solved in two ways:
1. The sluggish manipulator is specified with regard to its movements between expected discontinuities in the image aberration profiles in such a way that the maximum movements to be used amount, on average, to less than 100% of its maximum possible movements. Values of 80%, or 50%, or 20%, or 10% of the maximum possible movements are advantageous here.
2. The movement distance to be expected in the future and its future direction are determined for the sluggish manipulator using a prediction model, a short-term impairment of the present image aberration level is tolerated by increasing the specifications, for example by 50%, or 20%, or 15%, or 10%, or 5%, or 2%, or 1%, and the sluggish manipulator is moved in its future direction as far as is permitted by the relaxed specifications. In this case, short-term should be understood to mean a time interval which extends into the future and which amounts for example to 60000 ms, or 20000 ms, or 10000 ms, or 5000 ms, or 1000 ms, or 200 ms, or 20 ms. Within this time interval, the relaxed specification has to be guaranteed by the prediction model. In this case, the movement of the sluggish manipulator in its future direction is generally accompanied by the movement of less sluggish manipulators. Particularly advantageous here are the pairings (50%, 60000 ms), (20%, 20000 ms), (15%, 10000 ms), (10%, 5000 ms), (2%, 1000 ms), (2%, 200 ms), (1%, 20 ms) for increasing the specifications in association with the time interval reaching into the future, for which these have to be ensured.
(185) What are appropriate for the image aberration profile, in the case of point 2. above, are in particular the image aberrations overlay, best focus, fading both for core structures and for peripheral structures, and individual Zernike coefficients, and also rms, grouped rms and residual rms, and also any desired subsets thereof.
(186) Previously, individual methods for solving (II) were presented. The manipulator actuating distances were subdivided merely using a distinction between fast and slow manipulators. However, the above methods can also be combined. This procedure is referred to as toggling.
(187) In the case of two algorithms, toggling is manifested as follows:
(188) With a first algorithm, Alg.sub.1, a first manipulator adjusting distance x.sub.1 is identified which solves the minimization problem (a)-(a.sup.v) at high speed, but only yields an a priori suboptimal solution, i.e. min ||Ax - b|| is not necessarily attained by x = x.sub.1 and the residual image aberration b.sub.1 = Ax.sub.1 - b is not minimal with regard to the norm used. However, Alg.sub.1 ensures that a solution which does not entail any overshootings of the individual manipulator ranges is determined in the available time. By way of example, Tikhonov regularizations in accordance with (a)-(a.sup.v) are employed as Alg.sub.1. In this case, the parameter for this Tikhonov regularization is chosen in such a way that no overshootings of the respective manipulator ranges occur.
(189) With a second algorithm, Alg.sub.2, a rather sluggish, not precisely predictable convergence behavior of the minimization problem (a)-(a.sup.v) is accepted for its solution x.sub.2, where the main emphasis is on the optimality of the solution x.sub.2, that is to say that a smaller norm of the residual image aberration b.sub.2 can generally be expected for b.sub.2 = Ax.sub.2 - b. By way of example, the active constraints method or the Gembicki algorithm is used as Alg.sub.2.
(190) These two algorithms are then toggled, that is to say that Alg.sub.1 and Alg.sub.2 are used in parallel and the results of Alg.sub.2 are employed precisely when b.sub.2 is less than b.sub.1 with regard to its norm and at the same time its solution x.sub.2 does not leave the manipulator range. If this is not the case, it is possible to have recourse to the solution x.sub.1 of Alg.sub.1.
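The selection rule of the toggling can be sketched as follows. The example assumes symmetric manipulator ranges |x_i| <= limit_i and the Euclidean norm for the residual image aberration; both are illustrative assumptions, as are all function names. A result of `None` for `x2` stands for the case that the slow algorithm has not delivered a result in the available time.

```python
import math

def residual(A, x, b):
    """Residual image aberration A x - b for a matrix given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) - bi for row, bi in zip(A, b)]

def norm(v):
    """Euclidean norm (an assumed choice of norm)."""
    return math.sqrt(sum(c * c for c in v))

def within_range(x, limits):
    """True if every manipulator movement stays inside its symmetric range."""
    return all(abs(xi) <= lim for xi, lim in zip(x, limits))

def toggle(A, b, x1, x2, limits):
    """Use x2 only if it is available, in range, and better than x1; else x1."""
    if x2 is not None and within_range(x2, limits):
        if norm(residual(A, x2, b)) < norm(residual(A, x1, b)):
            return x2
    return x1
```

Because `x1` is always available and in range by construction of the fast algorithm, the rule never returns a setting that violates the manipulator ranges.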
(191) If the decision is made in favor of Alg.sub.2, then after a new image aberration measurement the difference between image aberration determined and a newly measured image aberration can additionally be subsequently optimized in the remaining time and with the aid of Alg.sub.1.
(192) This method is refined further if Alg.sub.1 is applied iteratively until Alg.sub.2 has its result. Then, as an alternative, an image aberration prediction is made in each iteration step and the fast algorithm Alg.sub.1 is applied to the new image aberrations.
(193) This parallel solution of inverse problems, or parallelization of the minimum search, is preferably used in the context of a plurality of computing processors. A plurality of such processors can also serve to ensure that, within an individual algorithm, too, a numerical parallelization is carried out in the case of the matrix multiplications necessary there. In particular, the Strassen algorithm is used; cf. Volker Strassen, Gaussian Elimination is not Optimal, Numer. Math. 13, pp. 354-356, 1969, which is hereby incorporated by reference within its full scope in this application.
(194)
$$
b_{11} = A x_{11} - b, \qquad b_{12} = A x_{12} - b_{11}, \qquad b_{13} = A x_{13} - b_{12}
$$
and
$$
b_{2} = A x_{2} - b.
$$
(195) Afterwards, the results x.sub.11, x.sub.11+x.sub.12 and x.sub.11+x.sub.12+x.sub.13 and x.sub.2 are compared with regard to range overshootings and residual image aberrations b.sub.11, b.sub.12, b.sub.13 and b.sub.2 and the optimum solution, that is to say the one which results in a minimum residual image aberration without range violations, is output as manipulator movement distance x.
(196) Optionally, the results x.sub.11 and x.sub.11+x.sub.12 can be reached by the manipulators in the meantime of solving b.sub.12 = Ax.sub.12 - b.sub.11 and b.sub.13 = Ax.sub.13 - b.sub.12, respectively, in order to ensure continuously small image aberrations. This is illustrated in
(197) In particular, Tikhonov regularizations in accordance with (a)-(a.sup.v) with a varying parameter =(i) are used as algorithms
(198) If an algorithm for solving the inverse problem does not attain the predefined specifications spec, then the latter are relaxed, as already indicated. Besides a relaxation by a predetermined percentage such as 10%, 50% or 100%, a so-called joker regulation is used. In the case of the latter, individual image aberrations are combined in groups, which are then relaxed jointly in the sense of a summation. Quantitatively, the same gradations 10%, 50% or 100% are used in this case. Appropriate groups include, in particular, Zernike coefficients having an identical azimuthal behavior. By way of example, all the Zernike coefficients a.sub.i with respect to the Zernike polynomials {Z.sub.5, Z.sub.12, Z.sub.21, . . . } which are of the type P(ρ)cos(2φ), with P a polynomial, that is to say which behave azimuthally like cos(2φ), form such a group. The relaxation for this group is then given as above by a percentaged relaxation of
(199)
The background for the choice of exactly such groups is based on the fact that the intention primarily is to prevent image aberrations that have an identical azimuthal behavior from cumulating.
(200)
(201) TABLE 1: Manipulator distribution with regard to the exemplary embodiment from FIG. 7.
3.121.1 | XY
3.121.2 | Z
3.121.3 | XY
3.121.4 | XYZ tilt
3.123.5 | Exchange/Aspherization
3.121.5 | Heating/Cooling
3.121.6 | Z
3.121.7 | XY
3.121.8 | Exchange/Aspherization
(202) In this case:
Z is understood as displacement in the direction of the optical axis of the objective (one degree of freedom).
XY is understood as displacements in the directions perpendicular to the optical axis of the objective (two degrees of freedom).
XYZ tilt is understood as displacement in the direction of the optical axis of the objective, in the directions perpendicular to the optical axis of the objective and as a tilt about two axes perpendicular to the optical axis of the objective (five degrees of freedom).
Exchange/Aspherization are 36 or 49 or 100 or more degrees of freedom since a freeform surface calculated from such a number of basis functions is generally used for the aspherization; in addition, these functionalities can be combined. This is the case for example with a pair of Alvarez plates configured in exchangeable fashion: in this case, two aspherized plane plates are displaced relative to one another. In this respect, also cf. EP851304A2. The above number of degrees of freedom follows the square numbers and follows the orthonormal system of Zernike polynomials which is suitable not only for describing wavefront deformations but also for describing aspheres. Besides the Zernike polynomials, splines or wavelets are also used for describing aspheres and give rise to different numbers of degrees of freedom. In a manner not illustrated here, in the case of the initial adjustment, this aspherization takes place not only on one or both optically active surfaces of plane plates but also on one or both optically active surfaces of some of the optical elements, preferably lenses or mirrors.
Heating/Cooling can be interpreted as p = n × m degrees of freedom, depending on how many locations are used for heating and/or cooling. Use is normally made of n=4=m, n=7=m, n=10=m, n=15=m, or n=20=m.
(203) The manipulators 3.121.5 and 3.123.5 can be used alternatively or in combination at the plane plate.
(204) Between 85 and 313 degrees of freedom are accordingly obtained, wherein the manipulators of the type Z, XY, XYZ, XYZ tilt and also the manipulators of types Heating/Cooling and Deforming have to be driven and regulated in real time and the inverse problem (II) is solved using the techniques described above.
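The quoted range can be checked with a short tally. The sketch below is one possible reading, not a statement of how the figures were derived: it assumes that the lower bound uses 36 degrees of freedom per Exchange/Aspherization manipulator with the Heating/Cooling manipulator not in use, and that the upper bound uses 100 degrees of freedom per Exchange/Aspherization manipulator together with a 10 × 10 heating grid. Under these assumptions the manipulator list of Table 1 gives exactly 85 and 313. The names `DOF`, `TABLE_1` and `total_dof` are hypothetical.

```python
# Degrees of freedom of the purely mechanical manipulator types.
DOF = {"Z": 1, "XY": 2, "XYZ tilt": 5}

# Manipulator types in the order listed in Table 1.
TABLE_1 = ["XY", "Z", "XY", "XYZ tilt", "Exchange/Aspherization",
           "Heating/Cooling", "Z", "XY", "Exchange/Aspherization"]

def total_dof(manipulators, aspherization_dof, heating_grid):
    """Sum the degrees of freedom; heating_grid=None means heating is unused."""
    total = 0
    for kind in manipulators:
        if kind == "Exchange/Aspherization":
            total += aspherization_dof
        elif kind == "Heating/Cooling":
            if heating_grid is not None:
                n, m = heating_grid
                total += n * m  # p = n x m heating/cooling locations
        else:
            total += DOF[kind]
    return total

lower = total_dof(TABLE_1, 36, None)       # assumed minimum configuration
upper = total_dof(TABLE_1, 100, (10, 10))  # assumed maximum configuration
```

The mechanical manipulators contribute 13 degrees of freedom in both cases; the remainder comes from the two aspherization manipulators and, for the upper bound, from the heating grid.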
(205) Finally, a distinction is made between manipulators which are provided for the initial, repair and fine adjustment. By way of example, individual Z and XYZ tilt manipulators can be provided for the initial, repair and fine adjustment and some different individual XY manipulators can be provided just for the initial adjustment.
(206) Table 2 lists the design data of the exemplary embodiment from FIG. 7.
(207) TABLE 2: Design data concerning the exemplary embodiment from FIG. 7.
NA: 0.9 | Wavelength: 193.37 nm | 2Y: 28.04 | beta: 0.25

FN | Radius | Thickness/Distance | Medium | Refractive index at 193.37 nm | free diameter
0 | 0.000000 | 0.000000 | AIR | 1.00030168 | 56.080
1 | 0.000000 | 40.078816 | AIR | 1.00030168 | 56.080
2 | 6478.659586 | 10.843586 | SIO2 | 1.5607857 | 65.807
3 | 1354.203087 | 2.423172 | N2 | 1.00029966 | 66.705
4 | 1087.803717 | 9.621961 | SIO2 | 1.5607857 | 67.029
5 | 183.366809 | 2.746191 | N2 | 1.00029966 | 70.249
6 | 206.367009 | 8.085674 | SIO2 | 1.5607857 | 71.462
7 | 193.387116 | 36.794321 | N2 | 1.00029966 | 72.483
8 | 140.799170 | 50.095072 | SIO2 | 1.5607857 | 73.484
9 | 373.463518 | 1.000056 | N2 | 1.00029966 | 103.736
10 | 561.452806 | 22.561579 | SIO2 | 1.5607857 | 107.508
11 | 263.612680 | 1.000757 | N2 | 1.00029966 | 111.562
12 | 49392.564837 | 53.841314 | SIO2 | 1.5607857 | 124.515
13 | 266.359005 | 15.247581 | N2 | 1.00029966 | 130.728
14 | 840.618795 | 29.011390 | SIO2 | 1.5607857 | 141.816
15 | 926.722503 | 1.005611 | N2 | 1.00029966 | 142.120
16 | 2732.904696 | 38.725042 | SIO2 | 1.5607857 | 141.999
17 | 356.203262 | 2.005496 | N2 | 1.00029966 | 141.858
18 | 318.151930 | 16.617316 | SIO2 | 1.5607857 | 124.740
19 | 513.819497 | 1.562498 | N2 | 1.00029966 | 122.663
20 | 171.455701 | 30.277694 | SIO2 | 1.5607857 | 111.385
21 | 154.841383 | 1.064446 | N2 | 1.00029966 | 98.077
22 | 127.756842 | 43.191495 | SIO2 | 1.5607857 | 94.695
23 | 104.271940 | 52.476004 | N2 | 1.00029966 | 74.378
24 | 283.692700 | 8.000000 | SIO2 | 1.5607857 | 68.565
25 | 242.925344 | 39.949820 | N2 | 1.00029966 | 64.404
26 | 117.414779 | 8.181192 | SIO2 | 1.5607857 | 63.037
27 | 197.144513 | 26.431530 | N2 | 1.00029966 | 69.190
28 | 244.477950 | 44.225451 | SIO2 | 1.5607857 | 71.085
29 | 230.356430 | 1.409104 | N2 | 1.00029966 | 88.427
30 | 1472.096761 | 21.137737 | SIO2 | 1.5607857 | 99.340
31 | 450.715283 | 1.259334 | N2 | 1.00029966 | 101.126
32 | 3573.378947 | 8.391191 | SIO2 | 1.5607857 | 105.206
33 | 7695.066698 | 1.258010 | N2 | 1.00029966 | 106.474
34 | 1029.326175 | 8.390466 | SIO2 | 1.5607857 | 108.186
35 | 243.058844 | 29.823514 | N2 | 1.00029966 | 112.152
36 | 29057.985214 | 38.911793 | SIO2 | 1.5607857 | 114.058
37 | 232.205631 | 1.000000 | N2 | 1.00029966 | 116.928
38 | 270.144711 | 55.850950 | SIO2 | 1.5607857 | 139.162
39 | 1183.955772 | 20.935175 | N2 | 1.00029966 | 138.048
40 | 0.000000 | 2.958031 | N2 | 1.00029966 | 138.244
41 | 368.838237 | 22.472410 | SIO2 | 1.5607857 | 141.049
42 | 220.058627 | 26.974362 | N2 | 1.00029966 | 137.707
43 | 355.728536 | 58.022036 | SIO2 | 1.5607857 | 140.923
44 | 861.478061 | 4.104304 | N2 | 1.00029966 | 142.103
45 | 420.713002 | 55.049896 | SIO2 | 1.5607857 | 142.502
46 | 478.998238 | 1.000000 | N2 | 1.00029966 | 141.431
47 | 122.579575 | 48.569396 | SIO2 | 1.5607857 | 106.623
48 | 223.612364 | 1.000000 | N2 | 1.00029966 | 99.428
49 | 132.028747 | 49.487311 | SIO2 | 1.5607857 | 88.176
50 | 247.223694 | 10.595002 | N2 | 1.00029966 | 65.249
51 | 712.954951 | 8.355490 | SIO2 | 1.5607857 | 57.430
52 | 163.735059 | 3.094307 | N2 | 1.00029966 | 47.446
53 | 154.368613 | 19.294967 | SIO2 | 1.5607857 | 44.361
54 | 677.158668 | 2.851896 | N2 | 1.00029966 | 33.956
55 | 0.000000 | 10.000000 | SIO2 | 1.5607857 | 29.686
56 | 0.000000 | 4.000000 | AIR | 1.00030168 | 22.559
57 | 0.000000 | 0.000000 | AIR | 1.00030168 | 14.020

Aspherical constants
FN | 2 | 6 | 12 | 17 | 30
K | 0 | 0 | 0 | 0 | 0
C1 | 1.38277367E-07 | 1.02654080E-08 | 3.36870323E-09 | 2.29017476E-10 | 1.51349530E-08
C2 | 1.88982133E-11 | 1.22477004E-11 | 1.77350477E-13 | 4.92394931E-14 | 9.73999326E-13
C3 | 1.94899866E-15 | 1.70638250E-15 | 1.19052376E-19 | 2.34180010E-19 | 8.62745113E-18
C4 | 3.04512613E-19 | 2.48526394E-19 | 1.17127296E-22 | 2.74433865E-23 | 5.94720340E-22
C5 | 3.31424645E-23 | 2.38582445E-23 | 9.25382522E-27 | 8.02938234E-29 | 4.71903409E-26
C6 | 2.70316185E-27 | 1.51451580E-27 | 4.88058037E-31 | 1.05282366E-32 | 2.87654316E-31
C7 | 1.30470314E-31 | 6.30610228E-32 | 1.32782815E-35 | 1.44319713E-38 | 4.40822786E-35
C8 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00
C9 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00

FN | 39 | 44 | 48 | 51
K | 0 | 0 | 0 | 0
C1 | 5.16807805E-09 | 3.74086200E-09 | 2.07951112E-09 | 6.57065732E-09
C2 | 6.52986543E-14 | 9.09495287E-14 | 3.24793684E-14 | 2.35659016E-12
C3 | 6.91577796E-19 | 9.58269360E-19 | 4.06763809E-18 | 1.23585829E-16
C4 | 3.61532300E-24 | 2.46215375E-23 | 4.85274422E-22 | 5.34294269E-20
C5 | 1.38222518E-27 | 8.23397865E-28 | 2.39376432E-27 | 1.12897797E-23
C6 | 1.06689880E-31 | 1.33400957E-32 | 2.44680800E-30 | 1.37710849E-27
C7 | 1.65303231E-36 | 5.95002910E-37 | 5.62502628E-35 | 1.15055048E-31
C8 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00
C9 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00
(208)
(209) TABLE 3: Manipulator distribution concerning the exemplary embodiment from FIG. 8.
4.121.1 | XY
4.121.2 | Z
4.121.3 | XY
4.121.4 | Z
4.121.5 | Z
4.122.6 | Deforming
4.122.7 | Heating/Cooling
4.121.8 | XY
4.121.9 | Z
4.121.10 | Heating/Cooling
4.123.11 | Exchange/Aspherization
4.121.12 | Z
4.121.13 | Heating/Cooling
4.121.14 | XY
4.121.15 | XY
(210) In this case: Deforming is understood as applying forces and/or torques to an optical element, specifically a mirror in this case, such that the latter changes its form. 36 or 49 or 100 degrees of freedom are available here since the optical element to be deformed generally models its form on the Zernike polynomials.
(211) The manipulators 4.122.6 and 4.122.7 and also 4.121.12 and 4.121.13 can be used alternatively or in combination at the plane plate. Between 79 and 509 degrees of freedom are accordingly obtained.
(212) Table 4 lists the design data of the exemplary embodiment from FIG. 8.
(213) TABLE 4: Design data concerning the exemplary embodiment from FIG. 8.
NA: 1.25 | Wavelength: 193.3 nm | 26 mm × 4 mm | beta: 0.25

FN | Radius | Thickness/Distance | Medium | Refractive index at 193.37 nm | free diameter
0 | 0.000000 | 81.909100 | | 1.0000000 | 60.033
1 | 2634.494170 | 21.250400 | SIO2 | 1.5603261 | 84.607
2 | 395.771680 | 1.000000 | | 1.0000000 | 86.438
3 | 150.000000 | 50.000000 | SIO2 | 1.5603261 | 93.055
4 | 369.687330 | 54.915200 | | 1.0000000 | 87.911
5 | 179.714460 | 34.086800 | SIO2 | 1.5603261 | 79.061
6 | 477.803632 | 6.693200 | | 1.0000000 | 75.808
7 | 88.938160 | 50.000000 | SIO2 | 1.5603261 | 61.395
8 | 91.869190 | 23.605900 | | 1.0000000 | 41.199
9 | 98.632420 | 50.000000 | SIO2 | 1.5603261 | 38.263
10 | 88.506390 | 12.049500 | | 1.0000000 | 54.125
11 | 76.470080 | 38.657300 | SIO2 | 1.5603261 | 55.652
12 | 344.460330 | 15.702800 | | 1.0000000 | 81.919
13 | 334.926670 | 50.066100 | SIO2 | 1.5603261 | 90.780
14 | 117.238730 | 1.000000 | | 1.0000000 | 96.774
15 | 395.286603 | 43.871600 | SIO2 | 1.5603261 | 102.141
16 | 181.497120 | 1.000000 | | 1.0000000 | 106.823
17 | 289.196280 | 27.848300 | SIO2 | 1.5603261 | 102.338
18 | 5892.122010 | 12.151700 | | 1.0000000 | 100.491
19 | 227.013620 | 27.157000 | SIO2 | 1.5603261 | 91.787
20 | 3443.763345 | 69.000000 | | 1.0000000 | 88.482
21 | 0.000000 | 236.511600 | | 1.0000000 | 93.010
22 | 107.026046 | 12.500000 | SIO2 | 1.5603261 | 77.379
23 | 1144.459840 | 50.132600 | | 1.0000000 | 93.528
24 | 110.859760 | 12.500000 | SIO2 | 1.5603261 | 94.408
25 | 213.248200 | 26.158800 | | 1.0000000 | 121.413
26 | 155.158660 | 26.158800 | | 1.0000000 | 124.079
27 | 213.248200 | 12.500000 | SIO2 | 1.5603261 | 121.279
28 | 110.859760 | 50.132600 | | 1.0000000 | 94.366
29 | 1144.459840 | 12.500000 | SIO2 | 1.5603261 | 93.590
30 | 107.026046 | 236.511600 | | 1.0000000 | 78.711
31 | 0.000000 | 64.048900 | | 1.0000000 | 80.845
32 | 3037.951580 | 22.331200 | SIO2 | 1.5603261 | 81.395
33 | 259.310450 | 1.000000 | | 1.0000000 | 84.258
34 | 470.923230 | 24.545000 | SIO2 | 1.5603261 | 91.158
35 | 700.750920 | 1.000000 | | 1.0000000 | 92.143
36 | 228.288980 | 45.979800 | SIO2 | 1.5603261 | 94.586
37 | 4362.499070 | 1.000000 | | 1.0000000 | 91.793
38 | 147.001560 | 50.000000 | SIO2 | 1.5603261 | 87.420
39 | 505.438519 | 13.175800 | | 1.0000000 | 77.709
40 | 810.594260 | 12.500000 | SIO2 | 1.5603261 | 76.617
41 | 96.147375 | 40.925200 | | 1.0000000 | 67.165
42 | 2113.410760 | 12.500000 | SIO2 | 1.5603261 | 70.138
43 | 144.960906 | 16.180300 | | 1.0000000 | 73.606
44 | 562.313340 | 30.687700 | SIO2 | 1.5603261 | 75.291
45 | 1126.648250 | 80.233900 | | 1.0000000 | 81.957
46 | 3405.414609 | 22.658500 | SIO2 | 1.5603261 | 119.099
47 | 586.423270 | 1.000000 | | 1.0000000 | 121.813
48 | 361.039350 | 33.153400 | SIO2 | 1.5603261 | 134.636
49 | 3170.027570 | 1.000000 | | 1.0000000 | 135.165
50 | 310.029270 | 49.249300 | SIO2 | 1.5603261 | 138.460
51 | 809.565830 | 9.868200 | | 1.0000000 | 137.458
52 | 0.000000 | 5.372200 | | 1.0000000 | 134.639
53 | 777.317070 | 35.882400 | SIO2 | 1.5603261 | 133.952
54 | 1312.612220 | 1.000700 | | 1.0000000 | 131.798
55 | 319.735750 | 35.943900 | SIO2 | 1.5603261 | 123.507
56 | 3225.490720 | 1.000000 | | 1.0000000 | 120.740
57 | 130.495300 | 28.495000 | SIO2 | 1.5603261 | 95.630
58 | 196.7895749 | 1.000000 | | 1.0000000 | 88.921
59 | 95.22134 | 34.303600 | SIO2 | 1.5603261 | 76.079
60 | 216.9390336 | 1.000000 | | 1.0000000 | 66.955
61 | 61.85167 | 50.000000 | SIO2 | 1.5603261 | 49.647
62 | 0 | 1.000000 | H2O | 1.4368163 | 16.616
63 | 0 | 0.000000 | H2O | 1.4368163 | 15.010

Aspherical constants
FN | 6 | 15 | 20 | 22 | 30
K | 0 | 0 | 0 | 0 | 0
C1 | 7.81812000E-08 | 1.14607000E-08 | 1.29530000E-08 | 8.88014000E-08 | 8.88014000E-08
C2 | 6.03387000E-13 | 4.60861000E-13 | 2.79320000E-13 | 3.40911000E-12 | 3.40911000E-12
C3 | 3.16794000E-16 | 1.61766000E-17 | 1.95862000E-17 | 1.98985000E-16 | 1.98985000E-16
C4 | 3.45599000E-20 | 5.41414000E-24 | 6.49032000E-22 | 1.45801000E-20 | 1.45801000E-20
C5 | 1.67268000E-24 | 5.36076000E-27 | 1.02409000E-26 | 9.23066000E-26 | 9.23066000E-26
C6 | 0.00000000E+00 | 1.16131000E-31 | 4.06450000E-32 | 1.30730000E-28 | 1.30730000E-28
C7 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00
C8 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00
C9 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00

FN | 39 | 41 | 43 | 46 | 51
K | 0 | 0 | 0 | 0 | 0
C1 | 3.21829000E-08 | 1.40846000E-08 | 3.76564000E-08 | 1.54429000E-08 | 9.78469000E-09
C2 | 4.08976000E-13 | 3.73235000E-12 | 2.04565000E-12 | 1.52631000E-13 | 2.15545000E-14
C3 | 9.46190000E-17 | 5.78170000E-17 | 6.72661000E-17 | 1.17235000E-17 | 2.66488000E-17
C4 | 1.12686000E-20 | 4.02044000E-20 | 3.35779000E-21 | 3.02626000E-22 | 1.19902000E-21
C5 | 1.09349000E-24 | 1.81116000E-24 | 5.51576000E-25 | 2.05070000E-28 | 2.50321000E-26
C6 | 2.30304000E-29 | 3.46502000E-28 | 2.95829000E-28 | 3.61487000E-31 | 2.10016000E-31
C7 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00
C8 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00
C9 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00

FN | 58 | 60
K | 0 | 0
C1 | 2.76215000E-09 | 1.08228000E-07
C2 | 4.06793000E-12 | 9.51194000E-12
C3 | 4.51389000E-16 | 1.14605000E-15
C4 | 5.07074000E-20 | 1.27400000E-19
C5 | 1.83976000E-24 | 1.59438000E-23
C6 | 6.22513000E-29 | 5.73173000E-28
C7 | 0.00000000E+00 | 0.00000000E+00
C8 | 0.00000000E+00 | 0.00000000E+00
C9 | 0.00000000E+00 | 0.00000000E+00
(214)
(215) TABLE 5: Manipulator distribution concerning the exemplary embodiment from FIG. 9.
5.121.1 | XY
5.121.2 | XYZ tilt
5.123.3 | Exchange/Aspherization
5.121.3 | Heating/Cooling
5.121.4 | Z
5.122.5 | XYZ tilt
5.122.6 | Heating/Cooling
5.122.7 | Deforming
5.121.8 | XY
5.121.9 | Deforming
5.121.10 | XY
5.121.11 | XY
5.121.12 | XYZ tilt
(216) The manipulators 5.123.3 and 5.121.3 and 5.122.5, 5.122.6 and 5.122.7 can be used alternatively or in combination at the plane plate. Between 79 and 509 degrees of freedom are accordingly obtained.
(217) Table 6 lists the design data of the exemplary embodiment from FIG. 9.
(218) TABLE 6: Design data concerning the exemplary embodiment from FIG. 9.
NA: 1.2 | Wavelength: 193.37 nm | 2Y: 33.0 mm | beta: 0.25

FN | Radius | Thickness/Distance | Medium | Refractive index at 193.37 nm | free diameter
0 | 0.000000 | 0.000000 | AIR | 1.0003096 | 66.000
1 | 0.000000 | 29.975639 | AIR | 1.0003096 | 66.000
2 | 585.070331 | 17.118596 | SIO2 | 1.5607857 | 76.447
3 | 766.901651 | 0.890161 | HELIUM | 1.0000329 | 78.252
4 | 145.560665 | 45.675278 | SIO2 | 1.5607857 | 85.645
5 | 2818.543789 | 40.269525 | HELIUM | 1.0000329 | 83.237
6 | 469.396236 | 29.972759 | SIO2 | 1.5607857 | 75.894
7 | 193.297708 | 21.997025 | HELIUM | 1.0000329 | 73.716
8 | 222.509238 | 27.666963 | SIO2 | 1.5607857 | 57.818
9 | 274.231957 | 16.483375 | HELIUM | 1.0000329 | 52.595
10 | 0.000000 | 10.117766 | SIO2 | 1.5607857 | 36.873
11 | 0.000000 | 30.361487 | HELIUM | 1.0000329 | 39.808
12 | 26971.109898 | 14.803554 | SIO2 | 1.5607857 | 54.127
13 | 562.070426 | 45.416373 | HELIUM | 1.0000329 | 58.058
14 | 510.104298 | 35.926312 | SIO2 | 1.5607857 | 76.585
15 | 118.683707 | 36.432152 | HELIUM | 1.0000329 | 80.636
16 | 0.000000 | 199.241665 | HELIUM | 1.0000329 | 86.561
17 | 181.080772 | 199.241665 | HELIUM | 1.0000329 | 147.683
18 | 153.434246 | 199.241665 | HELIUM | 1.0000329 | 102.596
19 | 0.000000 | 36.432584 | HELIUM | 1.0000329 | 105.850
20 | 408.244008 | 54.279598 | SIO2 | 1.5607857 | 118.052
21 | 296.362521 | 34.669451 | HELIUM | 1.0000329 | 118.397
22 | 1378.452784 | 22.782283 | SIO2 | 1.5607857 | 106.566
23 | 533.252331 | 0.892985 | HELIUM | 1.0000329 | 105.292
24 | 247.380841 | 9.992727 | SIO2 | 1.5607857 | 92.481
25 | 103.088603 | 45.957039 | HELIUM | 1.0000329 | 80.536
26 | 1832.351074 | 9.992069 | SIO2 | 1.5607857 | 80.563
27 | 151.452362 | 28.883857 | HELIUM | 1.0000329 | 81.238
28 | 693.739003 | 11.559320 | SIO2 | 1.5607857 | 86.714
29 | 303.301679 | 15.104783 | HELIUM | 1.0000329 | 91.779
30 | 1016.426625 | 30.905849 | SIO2 | 1.5607857 | 95.900
31 | 258.080954 | 10.647394 | HELIUM | 1.0000329 | 99.790
32 | 1386.614747 | 24.903261 | SIO2 | 1.5607857 | 108.140
33 | 305.810572 | 14.249112 | HELIUM | 1.0000329 | 112.465
34 | 11755.656826 | 32.472684 | SIO2 | 1.5607857 | 124.075
35 | 359.229865 | 16.650084 | HELIUM | 1.0000329 | 126.831
36 | 1581.896158 | 51.095339 | SIO2 | 1.5607857 | 135.151
37 | 290.829022 | 5.686977 | HELIUM | 1.0000329 | 136.116
38 | 0.000000 | 0.000000 | HELIUM | 1.0000329 | 131.224
39 | 0.000000 | 28.354383 | HELIUM | 1.0000329 | 131.224
40 | 524.037274 | 45.835992 | SIO2 | 1.5607857 | 130.144
41 | 348.286331 | 0.878010 | HELIUM | 1.0000329 | 129.553
42 | 184.730622 | 45.614622 | SIO2 | 1.5607857 | 108.838
43 | 2501.302312 | 0.854125 | HELIUM | 1.0000329 | 103.388
44 | 89.832394 | 38.416586 | SIO2 | 1.5607857 | 73.676
45 | 209.429378 | 0.697559 | HELIUM | 1.0000329 | 63.921
46 | 83.525032 | 37.916651 | CAF2 | 1.5017542 | 50.040
47 | 0.000000 | 0.300000 | SIO2 | 1.5607857 | 21.479
48 | 0.000000 | 0.000000 | SIO2 | 1.5607857 | 21.115
49 | 0.000000 | 3.000000 | H2O | 1.4364132 | 21.115
50 | 0.000000 | 0.000000 | H2O | 1.4364132 | 16.505

Aspherical constants
FN | 2 | 5 | 7 | 12 | 14
K | 0 | 0 | 0 | 0 | 0
C1 | 5.72012211E-08 | 4.71048005E-08 | 1.75086747E-07 | 8.29030145E-08 | 4.34813024E-08
C2 | 2.97210914E-13 | 7.03645794E-12 | 1.17024854E-11 | 1.87068852E-13 | 1.58782568E-12
C3 | 1.03373633E-18 | 1.09436502E-16 | 1.34272775E-15 | 7.03882158E-16 | 6.81156672E-17
C4 | 2.75620768E-20 | 2.90375326E-20 | 5.44275165E-20 | 6.64851833E-20 | 5.02561613E-21
C5 | 1.51222259E-24 | 1.55397282E-27 | 1.81522008E-24 | 1.33132348E-23 | 1.68149079E-29
C6 | 1.03524191E-30 | 5.61276612E-29 | 2.56002395E-28 | 2.45514238E-27 | 2.36033151E-29
C7 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00
C8 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00
C9 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00

FN | 17 | 18 | 23 | 31 | 32
K | 1.9785 | 2.0405 | 0 | 0 | 0
C1 | 2.94495560E-08 | 5.77041586E-08 | 7.05738830E-08 | 3.41405490E-08 | 4.84935278E-08
C2 | 2.62639190E-13 | 5.00405031E-13 | 4.10958857E-12 | 4.06789648E-14 | 9.87851350E-13
C3 | 6.10861502E-18 | 2.67421248E-17 | 1.18483664E-16 | 8.09527811E-17 | 7.36716691E-17
C4 | 1.10681541E-22 | 5.69249001E-22 | 2.92033013E-21 | 4.34256348E-21 | 6.56379364E-21
C5 | 2.00600333E-27 | 1.89054849E-26 | 3.23306884E-26 | 7.59470229E-25 | 6.53011342E-25
C6 | 2.08120710E-32 | 1.48621356E-31 | 2.18022642E-31 | 3.40748705E-29 | 2.88019310E-29
C7 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00
C8 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00
C9 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00

FN | 34 | 40 | 43
K | 0 | 0 | 0
C1 | 1.58884127E-08 | 4.10094031E-08 | 3.89229775E-08
C2 | 1.51417786E-12 | 3.03513679E-13 | 4.76248499E-12
C3 | 6.61629402E-19 | 5.71449385E-17 | 2.23473391E-16
C4 | 1.71961448E-21 | 1.72291437E-21 | 8.89371535E-21
C5 | 9.35857585E-26 | 9.60153088E-28 | 2.41148420E-25
C6 | 2.35940587E-30 | 3.81030848E-31 | 3.42843475E-30
C7 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00
C8 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00
C9 | 0.00000000E+00 | 0.00000000E+00 | 0.00000000E+00
(220) After passing through the mask, which is generally defined as a binary chrome or phase-shifting mask, the illumination light reaches the projection apparatus and the objective 110 therein. Said objective is operated with a diaphragm position corresponding to a sigma setting that is optimum for the imaging of the reticle currently being used. The sigma setting is defined as the quotient of output-side aperture of the illumination system and input-side aperture of the objective.
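Written as a formula, the definition of the sigma setting reads as follows. The symbols are labels assumed here for illustration: NA_ill,out stands for the output-side aperture of the illumination system and NA_obj,in for the input-side aperture of the objective, both taken as numerical apertures.

```latex
\sigma = \frac{\mathrm{NA}_{\mathrm{ill,out}}}{\mathrm{NA}_{\mathrm{obj,in}}}
```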
(221) During the exposure of a die, or upon a change from die to die, from wafer to wafer, from reticle to reticle, or from batch to batch, the projection optical assembly is brought back to specification by regulation or control of the manipulators as soon as an overshooting of an upper bound for a specified image aberration is determined. The same holds true if, as an alternative or in addition, an overshooting of an upper bound for a specification of a manipulator is ascertained. This regulation is effected within a time period of 30000 ms, preferably 10000 ms, very preferably 5000 ms, extremely preferably 1000 ms, most preferably 200 ms, ideally 20 ms, very ideally 5 ms, extremely ideally 1 ms.
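The trigger condition described above can be illustrated by a minimal Python sketch. It is not taken from the specification: the class `Specification`, the function names, and the choice to compare the magnitude of each value against its upper bound are assumptions made for illustration only. The list of time limits is copied from the paragraph above.

```python
from dataclasses import dataclass

# Time limits for the regulation as listed in the text, in ms, loosest first.
REGULATION_LIMITS_MS = (30000, 10000, 5000, 1000, 200, 20, 5, 1)


@dataclass(frozen=True)
class Specification:
    name: str           # hypothetical label, e.g. "Z9" or "m1 travel"
    value: float        # currently determined value
    upper_bound: float  # upper bound held in the memory


def overshot(specs):
    """Return the names of all specifications whose magnitude exceeds the bound."""
    return [s.name for s in specs if abs(s.value) > s.upper_bound]


def needs_regulation(aberrations, manipulator_movements):
    """Regulation is due if any aberration or any manipulator movement overshoots."""
    return bool(overshot(aberrations) or overshot(manipulator_movements))


def tightest_limit_met(elapsed_ms):
    """Return the tightest listed time limit that a regulation of this duration meets."""
    met = [limit for limit in REGULATION_LIMITS_MS if elapsed_ms <= limit]
    return min(met) if met else None
```

In this sketch, a regulation that takes 150 ms would meet the 200 ms limit but not the 20 ms limit, and one that takes longer than 30000 ms would meet none of the listed limits.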
(222) This fine adjustment can also be effected regularly in time intervals of 30000 ms, preferably 10000 ms, very preferably 5000 ms, extremely preferably 1000 ms, most preferably 200 ms, ideally 20 ms, very ideally 5 ms, extremely ideally 1 ms.
(223) All three forms of adjustment differ in the steps (i) determining the manipulator movements by solving the inverse problem, and (ii) moving the manipulators to the new movements determined in accordance with the solution to the inverse problem.
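Steps (i) and (ii) can be sketched in Python under stated assumptions. The sketch assumes a linear model in which a sensitivity matrix `A` maps manipulator movements `x` to aberration changes, and `b` is the aberration change to be achieved. It minimizes a merit function with a regularization parameter `gamma`, namely ||A x - b||^2 + gamma ||x||^2, subject to symmetric bounds on each movement. Projected gradient descent is chosen here only because it is short and dependency-free; it is not presented as the solver used in the apparatus, and all function and parameter names are hypothetical.

```python
def solve_inverse_problem(A, b, bounds, gamma=1e-3, iterations=2000):
    """Step (i): minimize ||A x - b||^2 + gamma * ||x||^2 with |x[j]| <= bounds[j]."""
    rows, cols = len(A), len(A[0])
    # The squared Frobenius norm bounds the largest eigenvalue of A^T A,
    # so this step size cannot overshoot for the quadratic merit function.
    frobenius_sq = sum(a * a for row in A for a in row)
    step = 1.0 / (2.0 * (frobenius_sq + gamma))
    x = [0.0] * cols
    for _ in range(iterations):
        residual = [sum(a * xj for a, xj in zip(row, x)) - bi
                    for row, bi in zip(A, b)]
        gradient = [2.0 * (sum(A[i][j] * residual[i] for i in range(rows))
                           + gamma * x[j])
                    for j in range(cols)]
        # Gradient step followed by projection onto the admissible range.
        x = [max(-bounds[j], min(bounds[j], x[j] - step * gradient[j]))
             for j in range(cols)]
    return x


def move_manipulators(current, target):
    """Step (ii): hypothetical stand-in that returns the travel per manipulator."""
    return [t - c for c, t in zip(current, target)]
```

With a diagonal `A = [[1, 0], [0, 2]]`, `b = [1, 4]` and bounds `[5, 1]`, the unconstrained optimum for the second manipulator would be close to 2, so the sketch returns it pinned at its bound of 1, while the first manipulator settles near 1.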
(224) The above time intervals for realizing steps (i) and (ii) are advantageously approximately halved in each case: 15000 ms, preferably 5000 ms, very preferably 2000 ms, extremely preferably 500 ms, most preferably 100 ms, ideally 10 ms, very ideally 2 ms, extremely ideally 0.5 ms. In the case of a relatively sluggish manipulator it is also possible to employ other ratios, for example: 1500 ms, preferably 500 ms, very preferably 200 ms, extremely preferably 50 ms, most preferably 10 ms, ideally 1 ms, very ideally 0.2 ms, extremely ideally 0.05 ms.
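The arithmetic behind these time intervals can be checked with a short Python sketch. The three tuples are copied from the text; the function names are hypothetical, and the sketch deliberately does not assume which of the steps (i) and (ii) receives which share of the total time, since the text leaves this open.

```python
# Total regulation times and the two sets of partial times from the text, in ms.
TOTAL_MS = (30000, 10000, 5000, 1000, 200, 20, 5, 1)
HALVED_MS = (15000, 5000, 2000, 500, 100, 10, 2, 0.5)
SLUGGISH_MS = (1500, 500, 200, 50, 10, 1, 0.2, 0.05)


def ratios(parts, totals=TOTAL_MS):
    """Fraction of each total time taken up by the corresponding partial time."""
    return [round(p / t, 2) for p, t in zip(parts, totals)]


def remainder(parts, totals=TOTAL_MS):
    """Time left over for the other step once the partial time is spent."""
    return [t - p for p, t in zip(parts, totals)]
```

For the first set, the ratio is 0.5 except for the 5000 ms and 5 ms entries, where it is 0.4, which is why the halving is only approximate. For the second set, the ratio is 0.05 except for the same two entries, where it is 0.04, leaving about 95% of the total time for the other step.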