METHOD FOR MEASURING CHARACTERISTIC OF THIN FILM

Abstract

A method for measuring a characteristic of a thin film is disclosed. The method includes a) obtaining a measured spectrum from a target region on the substrate by using a spectroscopic ellipsometer, b) obtaining a physical model capable of obtaining an estimated parameter value related to the characteristic of the thin film through regression analysis of the measured spectrum, c) obtaining a machine learning model capable of obtaining a reference parameter value related to the characteristic of the thin film by using the measured spectrum, and d) obtaining an integrated model which uses an integrated error function capable of considering both of a first error function and a second error function, and obtaining an optimum parameter value through regression analysis of the integrated model.

Claims

1. A method for measuring a characteristic of a thin film formed on a substrate, the method comprising: a) obtaining a measured spectrum (S.sub.E) from a target region on the substrate by using a spectroscopic ellipsometer, b) obtaining a physical model (M.sub.P) capable of obtaining an estimated parameter value (P.sub.P) related to the characteristic of the thin film through a regression analysis of the measured spectrum (S.sub.E), c) obtaining a machine learning model (M.sub.ML) capable of obtaining a reference parameter value (P.sub.ML) related to the characteristic of the thin film by using the measured spectrum (S.sub.E), and d) obtaining an integrated model which uses an integrated error function (f) capable of considering both of a first error function (f.sub.1) between the measured spectrum (S.sub.E) and a calculated spectrum (S.sub.P) by the physical model (M.sub.P) and a second error function (f.sub.2) between the estimated parameter value (P.sub.P) input into the physical model (M.sub.P) in order to obtain the calculated spectrum (S.sub.P) and the reference parameter value (P.sub.ML), and obtaining an optimum parameter value (P.sub.BEST) through a regression analysis of the integrated model.

2. The method of claim 1, wherein the first error function (f.sub.1) is expressed with equation 1 below: $\begin{matrix} f_{1} = \frac{W_{1}}{N - M} \sum_{n = 1}^{N} \left[ \frac{(S_{E, n} - S_{P, n})^{2}}{\sigma_{E, n}^{2}} \right], & \text{[Equation 1]} \end{matrix}$ wherein N indicates the number of wavelength points of the measured spectrum (S.sub.E), M indicates the number of variables of the measured spectrum (S.sub.E), W.sub.1 indicates a weighted value of the first error function (f.sub.1), and σ.sub.E indicates a standard deviation of values of the measured spectrum (S.sub.E) at a corresponding wavelength point.

3. The method of claim 1, wherein the obtaining of the machine learning model at (c) is performed by obtaining the machine learning model (M.sub.ML) through machine training using both of the measured spectrum (S.sub.E) and the calculated spectrum (S.sub.P) generated using the physical model (M.sub.P).

4. The method of claim 1, wherein the second error function (f.sub.2) is expressed with equation 2 below: $\begin{matrix} f_{2} = W_{2} \sum_{m = 1}^{M} \left[ \frac{(P_{ML, m} - P_{P, m})^{2}}{\sigma_{P, m}^{2}} \right], & \text{[Equation 2]} \end{matrix}$ wherein M indicates the number of variables, W.sub.2 indicates a weighted value of the second error function, and σ.sub.P indicates a standard deviation of values of the estimated parameter value (P.sub.P).

5. The method of claim 1, wherein the integrated error function (f) is expressed with equation 3 below: $\begin{matrix} f = f_{1} + f_{2} = \frac{W_{1}}{N - M} \sum_{n = 1}^{N} \left[ \frac{(S_{E, n} - S_{P, n})^{2}}{\sigma_{E, n}^{2}} \right] + W_{2} \sum_{m = 1}^{M} \left[ \frac{(P_{ML, m} - P_{P, m})^{2}}{\sigma_{P, m}^{2}} \right] & \text{[Equation 3]} \end{matrix}$ wherein N indicates the number of wavelength points of the measured spectrum (S.sub.E), M indicates the number of variables, W.sub.1 indicates a weighted value of the first error function, σ.sub.E indicates a standard deviation of values of the measured spectra (S.sub.E) at a corresponding wavelength point, W.sub.2 indicates a weighted value of the second error function (f.sub.2), and σ.sub.P indicates a standard deviation of values of the estimated parameter value (P.sub.P).

6. The method of claim 1, wherein in the obtaining of the integrated model at (d), by using equation 4 below which is obtained by partial differentiating the integrated error function (f) by the estimated parameter value (P.sub.P), a size or direction of the estimated parameter value (P.sub.P) is adjusted during a process of the regression analysis, the equation 4 is expressed as: $\begin{matrix} \frac{\partial f}{\partial P_{P, m}} = \frac{W_{1}}{N - M} \sum_{n = 1}^{N} \left[ \frac{(S_{E, n} - S_{P, n})}{\sigma_{E, n}^{2}} \left( \frac{\partial S_{P, n}}{\partial P_{P, m}} \right) \right] + 2 W_{2} \frac{(P_{ML, m} - P_{P, m})}{\sigma_{P, m}^{2}}, & \text{[Equation 4]} \end{matrix}$ wherein N indicates the number of wavelength points of the measured spectrum (S.sub.E), M indicates the number of variables, W.sub.1 indicates a weighted value of the first error function (f.sub.1), σ.sub.E indicates a standard deviation of values of the measured spectrum (S.sub.E) at a corresponding wavelength point, W.sub.2 indicates a weighted value of the second error function (f.sub.2), and σ.sub.P indicates a standard deviation of values of the estimated parameter value (P.sub.P).

Description

BRIEF DESCRIPTION OF DRAWINGS

[0045] FIG. 1 is a flow chart of a method for measuring a characteristic of a thin film according to an embodiment of the present disclosure.

[0046] FIG. 2 is a flow chart of obtaining a machine learning model in FIG. 1.

[0047] FIG. 3 is a flow chart of obtaining an integrated model and an optimum parameter value in FIG. 1.

DETAILED DESCRIPTION

[0048] Hereinbelow, exemplary embodiments of the present disclosure will be described with reference to accompanying drawings. However, various changes to the embodiments of the present disclosure are possible and the scope of the present disclosure is not limited to the following embodiments. The embodiments of the present disclosure are presented to provide a complete understanding of the present disclosure and to help a person of ordinary skill in the art best understand the present disclosure. Therefore, it should be understood that the shape and size of the elements shown in the drawings may be exaggerated to provide an easily understood description, and the same reference numerals will be used throughout the drawings and the description to refer to the same or like elements or parts.

[0049] FIG. 1 is a flow chart of a method for measuring a characteristic of a thin film according to an embodiment of the present disclosure. According to an exemplary embodiment, the characteristic of a thin film includes not only the thickness of the thin film, but also arbitrary parameters input into the physical model such as a refractive index (n), an extinction coefficient (k), etc. The method for measuring a characteristic of a thin film of the exemplary embodiment may be used to measure at least one parameter of the above-mentioned parameters.

[0050] As shown in FIG. 1, according to the embodiment of the present disclosure, the method for measuring a characteristic of a thin film starts with obtaining a measured spectrum S.sub.E from a target region on a substrate by using a spectroscopic ellipsometer, at S1.

[0051] In this stage, a spectrum is obtained from a single-layered or multi-layered thin film deposited on the substrate by using the spectroscopic ellipsometer. The substrate may be a metal substrate or a semiconductor substrate. The thin film layer may be a layer constituting a semiconductor device or an electronic device such as a display device, a solar cell, etc. The thin film layer may be a semiconductor or a metal layer. The substrate may be in a fixed state to a stage or a table of a deposition chamber forming the thin film.

[0052] The spectroscopic ellipsometer is a device radiating polarized light to the target region and then measuring variation of polarization of light returned from the target region. The spectroscopic ellipsometer may include a lighting system and a spectrometer.

[0053] The lighting system may emit the polarized light on the target region. The polarization may be linear polarization. The light may be light having a predetermined wavelength band.

[0054] The spectrometer may measure a polarized state of a light reflected from the target region after being incident to the target region by the lighting system. The reflected light may be changed in the polarized state while being reflected. For example, the reflected light may have an elliptically polarized state.

[0055] The spectroscopic ellipsometer may obtain a spectrum indicating variation of an α value and a β value according to photon energy or wavelength, or a spectrum indicating variation of a Ψ value and a Δ value. The α value is a cosine Fourier coefficient according to continuous rotation of a polarizer, an analyzer, or a compensator, and the β value is a sine Fourier coefficient. The Δ value is the phase difference that a p wave and an s wave, incident on the target region with the same phase, acquire after being reflected, and the Ψ value is the angle whose tangent (tan Ψ) is the ratio of the reflection coefficient amplitudes of the p wave and the s wave of the reflected light.
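The relation between the reflection coefficients and the (Ψ, Δ) pair can be illustrated with a minimal sketch. It assumes the standard ellipsometry relation ρ = r.sub.p / r.sub.s = tan Ψ · exp(iΔ) and a single bare interface described by Fresnel coefficients in one common sign convention. The function names and the sample refractive index are hypothetical and are not taken from the disclosure; a real multi-layered film would need a full thin-film calculation.

```python
import cmath
import math

def fresnel_rp_rs(n0, n1, theta0_deg):
    """Fresnel reflection coefficients (p, s) for one interface n0 -> n1.

    Assumption: one common sign convention in which r_p = -r_s at normal incidence.
    """
    theta0 = math.radians(theta0_deg)
    cos0 = math.cos(theta0)
    sin1 = n0 * math.sin(theta0) / n1      # Snell's law
    cos1 = cmath.sqrt(1 - sin1 * sin1)     # complex sqrt also covers absorbing media
    rp = (n1 * cos0 - n0 * cos1) / (n1 * cos0 + n0 * cos1)
    rs = (n0 * cos0 - n1 * cos1) / (n0 * cos0 + n1 * cos1)
    return rp, rs

def psi_delta(rp, rs):
    """Return (Psi, Delta) in degrees from rho = rp / rs = tan(Psi) * exp(i * Delta)."""
    rho = rp / rs
    psi = math.degrees(math.atan(abs(rho)))
    delta = math.degrees(cmath.phase(rho))
    return psi, delta

# Hypothetical example: air (n = 1.0) onto a transparent material with n = 1.5.
psi_normal, delta_normal = psi_delta(*fresnel_rp_rs(1.0, 1.5, 0.0))
brewster_deg = math.degrees(math.atan(1.5))
psi_brewster, _ = psi_delta(*fresnel_rp_rs(1.0, 1.5, brewster_deg))
```

At normal incidence the two polarizations are reflected with equal amplitude, so Ψ is 45 degrees, while at the Brewster angle the p wave is not reflected and Ψ drops to approximately zero.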

[0056] The measured spectrum contains information on the thin film of each layer on the substrate, but the characteristic of each thin film cannot be obtained from the measured spectrum by direct conversion; instead, it may be obtained by regression analysis using modeling.

[0057] Next, through the regression analysis, a physical model M.sub.P that may obtain parameter values related to the thin film characteristic such as a film thickness value is obtained, at S2.

[0058] As described above, since the measured spectrum provides only indirect information about the thickness and physical properties of each thin film, the thickness of each thin film cannot be calculated directly from the measured spectrum.

[0059] In order to calculate the thickness, etc. of the film, the physical model for interpreting the measured spectrum may be obtained. In this stage, a physical model M.sub.P which may obtain an estimated parameter value P.sub.P related to the characteristic of the thin film such as the thickness of the film is obtained through the regression analysis of the measured spectrum S.sub.E.

[0060] According to an exemplary embodiment, the physical model mainly refers to a multi-layered thin film model consisting of the thickness and the optical constant of each thin film of a target sample, but herein also includes ‘an error function’ referred to when comparing a measured spectrum with a model value. As the optical constant, a value mainly expressed with a refractive index, an extinction coefficient, or a complex dielectric function may be used, and a constant related to a characteristic of the optical system of the measuring equipment may be included in the optical constant. Fixed constant values or values given by an optical dispersion model may be used as the wavelength-dependent optical constant, and as the dispersion model, depending on the optical characteristic of the substance, the Lorentz harmonic oscillator model, the Drude free-electron model, the Cauchy model, the Sellmeier model, the Forouhi-Bloomer model, the Tauc-Lorentz model, etc. may be used.
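As an illustration of how a dispersion model supplies a wavelength-dependent optical constant, the following sketch evaluates the Cauchy model in its usual form n(λ) = A + B/λ² + C/λ⁴ with λ in micrometres. The coefficient values are hypothetical placeholders chosen only for the example and do not describe any particular material in the disclosure.

```python
def cauchy_index(wavelength_um, a, b, c=0.0):
    """Cauchy dispersion: n(lambda) = A + B / lambda^2 + C / lambda^4 (lambda in micrometres)."""
    if wavelength_um <= 0:
        raise ValueError("wavelength must be positive")
    return a + b / wavelength_um ** 2 + c / wavelength_um ** 4

# Hypothetical coefficients for a transparent film.
A, B = 1.45, 0.004
n_blue = cauchy_index(0.4, A, B)   # refractive index at 400 nm
n_red = cauchy_index(0.8, A, B)    # refractive index at 800 nm
```

With positive B the index falls as the wavelength grows, which is the normal-dispersion behaviour the Cauchy model is typically used for in transparent spectral regions.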

[0061] Parameter values in the initial physical model before optimization may be taken from a basic value existing in a process plan or from a value measured by another reference device. When the substance is a new substance rather than one already in use, its optical physical property (optical constant) is obtained separately and used as an initial value. The physical model may be optimized through the process below.

[0062] When fitting is performed with respect to unknown parameters including the thickness, the optical constant (complex refractive index), etc. of the thin film included in the physical model M.sub.P by using a nonlinear regression analysis algorithm, the estimated parameter value P.sub.P related to the thin film characteristic such as the thin film thickness by the physical model M.sub.P may be obtained.

[0063] The fitting refers to the regression analysis process of finding the combination of target parameters that minimizes an error function between the measured spectrums S.sub.E and the spectrums calculated by the physical model M.sub.P. The parameter value obtained by the fitting is the estimated parameter value P.sub.P by the physical model M.sub.P. In general, as described above, the estimated parameter value P.sub.P obtained through the process of minimizing the error function of the measured spectrums and the spectrums by the physical model M.sub.P may be used as a measured value of the spectroscopic ellipsometer.
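The fitting described above can be sketched with a toy example. The sketch assumes a deliberately simplified physical model, a single interference term cos(4πnd/λ) for a film of thickness d and constant refractive index n, and a damped Gauss-Newton update in the style of Levenberg-Marquardt for one parameter. The model, the constants, and all function names are hypothetical stand-ins, not the physical model of the disclosure.

```python
import math

N_FILM = 1.46  # assumed constant refractive index (hypothetical value)

def model(wavelength_nm, d_nm):
    """Toy calculated spectrum S_P: one interference term for thickness d."""
    return math.cos(4 * math.pi * N_FILM * d_nm / wavelength_nm)

def dmodel_dd(wavelength_nm, d_nm):
    """Analytic derivative of the toy spectrum with respect to thickness."""
    k = 4 * math.pi * N_FILM / wavelength_nm
    return -k * math.sin(k * d_nm)

def sum_sq(wavelengths, measured, d_nm):
    """Plain sum of squared residuals between measured and calculated spectra."""
    return sum((s - model(wl, d_nm)) ** 2 for wl, s in zip(wavelengths, measured))

def fit_thickness(wavelengths, measured, d0_nm, iterations=50, damping=1e-3):
    """Damped Gauss-Newton (Levenberg-Marquardt style) fit of one parameter."""
    d = d0_nm
    cost = sum_sq(wavelengths, measured, d)
    for _ in range(iterations):
        residuals = [s - model(wl, d) for wl, s in zip(wavelengths, measured)]
        jac = [dmodel_dd(wl, d) for wl in wavelengths]
        jtr = sum(j * r for j, r in zip(jac, residuals))
        jtj = sum(j * j for j in jac)
        if jtj == 0.0:
            break
        step = jtr / (jtj * (1.0 + damping))
        trial_cost = sum_sq(wavelengths, measured, d + step)
        if trial_cost < cost:      # accept the step and trust the model more
            d, cost = d + step, trial_cost
            damping /= 10.0
        else:                      # reject the step and damp more strongly
            damping *= 10.0
    return d

# Synthetic "measured" spectrum generated at a true thickness of 100 nm.
wavelengths = [400.0 + 10.0 * i for i in range(41)]
measured = [model(wl, 100.0) for wl in wavelengths]
d_fit = fit_thickness(wavelengths, measured, d0_nm=95.0)
```

In this noise-free setting the fit recovers the thickness used to generate the spectrum because the initial value lies close to it; an initial value far from the true thickness is not guaranteed to reach the same minimum.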

[0064] However, when referring only to the error function, there is a limit to evaluating and optimizing ‘key performance indicators’ of the model such as accuracy and precision of the model for calculating a state of a super-precision micro process such as semiconductor, display process, etc. or a change of the state.

[0065] Therefore, in an actual process, data repeatability with respect to spatial and temporal variation of measurement and matching with reference data are also required, and to this end, the sensitivity of the parameters to be analyzed and the effect of correlation between them need to be considered together. A process of testing and evaluating whether or not ‘the key performance indicators’ are satisfied is equally required in the estimation of each model of S2, S3, and S4.

[0066] As described in the prior part of [Background art], a model considering effects of ‘constancy of spectrum measuring device’, ‘sensitivity to spectrum of measurement parameter’, ‘correlation to spectral change between measurement parameters’ should be designed and optimization of the model should be performed.

[0067] When improvement of sensitivity and correlation of a parameter is difficult through improvement in the measured spectrum, in the analysis method, relationship between parameters and error functions should be variously re-defined to optimize the model. For the optimization, determination based on a lot of experience in the relevant application field is required, and the time required for optimization is very long.

[0068] According to an embodiment, although complex relationships between a plurality of parameters can be determined empirically as analytical numerical relationships or conclusively obtained, accuracy and reproducibility are very low, and a lot of trial and error is required for optimization.

[0069] Therefore, in exemplary embodiments, a machine learning method is adopted as described below so as to overcome the problem and the limitation described above. To this end, in the next stage, a machine learning model M.sub.ML which can obtain a reference parameter value P.sub.ML by using an arbitrary measured spectrum S.sub.E is obtained, at S3.

[0070] FIG. 2 is a flow chart of obtaining a machine learning model in FIG. 1. As shown in FIG. 2, the obtaining of the machine learning model includes generating training data for machine learning by labeling a plurality of measured spectrums S.sub.E with the actual parameter values corresponding to the spectrums, at S31, learning the training data, at S33, verifying and testing, at S34, and evaluating whether or not ‘the key performance indicators’ are satisfied, at S35.

[0071] The actual parameter values may be values measured by other measuring devices, for example, a transmission electron microscopy, a cross-sectional transmission electron microscopy, a spectroscopic reflectometry, an imaging reflectometry, or a measuring device that is standard referenced in the existing process.

[0072] Furthermore, exemplary embodiments may include generating training data by using the physical model, at S32. When the data of the actual parameter values is not sufficient, the estimated parameter value P.sub.P, such as the thickness, corresponding to the calculated spectrum S.sub.P from the physical model M.sub.P obtained in S2 may additionally be used as the training data for the machine learning.
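Generating supplementary training data from a physical model can be sketched as follows. The sketch assumes a toy one-parameter spectrum model in place of the physical model M.sub.P, a uniform sampling range for the thickness, and Gaussian noise added to mimic measurement scatter. All names, ranges, and noise levels are hypothetical choices made for the illustration.

```python
import math
import random

def toy_spectrum(thickness_nm, wavelengths_nm, n_film=1.46):
    """Toy calculated spectrum S_P (stand-in for the physical model)."""
    return [math.cos(4 * math.pi * n_film * thickness_nm / wl)
            for wl in wavelengths_nm]

def make_training_set(n_samples, wavelengths_nm, d_range=(50.0, 150.0),
                      noise_sigma=0.01, seed=0):
    """Return a list of (spectrum, thickness) pairs labelled by the model input."""
    rng = random.Random(seed)  # fixed seed keeps the data set reproducible
    data = []
    for _ in range(n_samples):
        d = rng.uniform(*d_range)
        spectrum = [s + rng.gauss(0.0, noise_sigma)
                    for s in toy_spectrum(d, wavelengths_nm)]
        data.append((spectrum, d))
    return data

wavelengths = [400.0 + 10.0 * i for i in range(41)]
training_set = make_training_set(200, wavelengths)
```

Each pair uses the parameter value fed into the model as the label for the spectrum it produced, so the labels are exact even when few reference measurements are available.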

[0073] When the arbitrary measured spectrum S.sub.E is input into the machine learning model M.sub.ML generated in this stage, the reference parameter value P.sub.ML by the machine learning model M.sub.ML may be obtained. The reference parameter value P.sub.ML is a significant value having statistically high accuracy and repeatability, and is expected to be close to the optimum parameter value P.sub.BEST that brings the error function between the measured spectrum S.sub.E and the physical model M.sub.P to its global minimum. Therefore, the reference parameter value P.sub.ML is low in sensitivity of a spectrum and high in correlation with other parameters. Accordingly, when the thin film thickness has very high uncertainty or the range of a parameter is wide relative to the degree of change of the spectrum, the reference parameter value P.sub.ML may be used to mitigate the problem in which, during fitting, a parameter value fails to find the global minimum and instead approaches a local minimum.
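The mapping from a measured spectrum to a reference parameter value can be illustrated with a deliberately simple stand-in. The sketch below uses a nearest-neighbour regressor trained on spectra from a toy one-parameter model; it is not the machine learning model of the disclosure, whose type is not specified here, and the class name, the toy spectrum, and the training grid are all hypothetical.

```python
import math

N_FILM = 1.46  # assumed constant refractive index (hypothetical value)
WAVELENGTHS = [400.0 + 10.0 * i for i in range(41)]

def toy_spectrum(d_nm):
    """Toy spectrum used both for training data and for the query."""
    return [math.cos(4 * math.pi * N_FILM * d_nm / wl) for wl in WAVELENGTHS]

class NearestNeighbourModel:
    """Minimal k-nearest-neighbour regressor standing in for M_ML."""

    def __init__(self, k=1):
        self.k = k
        self.samples = []

    def fit(self, spectra, labels):
        self.samples = list(zip(spectra, labels))
        return self

    def predict(self, spectrum):
        if not self.samples:
            raise ValueError("model has not been fitted")

        def squared_distance(sample):
            return sum((a - b) ** 2 for a, b in zip(sample[0], spectrum))

        nearest = sorted(self.samples, key=squared_distance)[: self.k]
        return sum(label for _, label in nearest) / len(nearest)

# Train on thicknesses 50..150 nm in 1 nm steps, then query an unseen thickness.
train_d = [float(d) for d in range(50, 151)]
m_ml = NearestNeighbourModel(k=1).fit([toy_spectrum(d) for d in train_d], train_d)
p_ml = m_ml.predict(toy_spectrum(100.3))
```

The prediction lands on the nearest training thickness rather than on the exact value, which mirrors the role of P.sub.ML as a value expected to be close to, but not identical with, the optimum parameter value.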

[0074] Hereinbelow, a stage of obtaining an integrated model, i.e. a new physical model with a method of combining the machine learning model to the physical model, and obtaining the optimum parameter value P.sub.BEST from the integrated model, at S4, will be described.

[0075] FIG. 3 is a flow chart of obtaining the integrated model and the optimum parameter value in FIG. 1, at S4. As shown in FIG. 3, this stage includes obtaining the reference parameter value P.sub.ML by the machine learning model at S41, obtaining the integrated model at S42, and performing a regression analysis by the integrated model at S43. Through this stage, an optimized integrated model M.sub.BEST and the optimum parameter value P.sub.BEST can be obtained.

[0076] Furthermore, when the optimized integrated model M.sub.BEST is obtained, the regression analysis of the optimized integrated model M.sub.BEST is performed with the arbitrary measured spectrum S.sub.E so as to directly obtain the optimum parameter value P.sub.BEST.

[0077] In the obtaining of the reference parameter value P.sub.ML at S41, the measured spectrum is input into the machine learning model to obtain the reference parameter value P.sub.ML.

[0078] In the obtaining of the integrated model at S42, the machine learning model M.sub.ML obtained from the process in FIG. 2 is combined with the physical model M.sub.P obtained from the process in FIG. 1. In other words, the machine learning model M.sub.ML obtained through the process in FIG. 2 is combined with the existing physical model M.sub.P generated in FIG. 1 in such a manner that the physical model refers to the reference parameter value P.sub.ML (film thickness, film optical constant (complex refractive index), incident angle, wavelength, etc.) obtained by inputting the measured spectrum data S.sub.E into the machine learning model M.sub.ML.

[0079] According to an exemplary embodiment, a model algorithm is configured to refer to the reference parameter value P.sub.ML by applying the reference parameter value P.sub.ML to a section such as the fitting algorithm or mean square error, as described above.

[0080] Furthermore, in the performing of the regression analysis of the integrated model at S43, a parameter value is obtained from the measured spectrum S.sub.E through the regression analysis of the integrated model, and the finally desired optimized integrated model M.sub.BEST may be obtained by evaluating and optimizing the mean square error and the parameter value on the basis of ‘the key performance indicators’ of the model. Thereafter, the optimized integrated model M.sub.BEST is used in the thin film characteristic analysis to obtain the optimum parameter value P.sub.BEST.

[0081] In order to perform optimization for the model to satisfy ‘the key performance indicators’, various parameters of the model need to be adjusted. For example, a wavelength range of spectrum and a weighted value for each range, types and number of parameters for which a value is to be found, etc. are appropriately selected. The optimization of the parameters should be performed in the direction that satisfies ‘the key performance indicators’ through ‘the regression analysis’ in common. According to an exemplary embodiment, a part that tries to apply a value of machine learning of the parameters to the model is an algorithmic formula part of ‘the regression analysis’, and the part allows determination of parameters with optimized high precision.

[0082] To this end, the stage uses the integrated error function (equation 3), which is obtained by adding the parameter error function (equation 2), defined between the estimated parameter value P.sub.P used in the spectral calculation by the physical model M.sub.P and the reference parameter value P.sub.ML obtained through the machine learning model M.sub.ML, to the basic error function (equation 1), defined by the difference between the measured spectrum S.sub.E and the spectrum S.sub.P calculated by the physical model M.sub.P in the regression analysis.

[0083] As in equation 4, a method is proposed of using the reference parameter value P.sub.ML derived from the machine learning model M.sub.ML when calculating the degree of change of the error function with respect to a change of parameter values in a regression analysis algorithm such as Levenberg-Marquardt. According to an exemplary embodiment, the reference parameter values are values given through the machine learning model M.sub.ML.

[0084] In order to perform optimization of the model by sufficiently using the statistical advantage of machine learning, the estimated parameter values P.sub.P, such as the thin film thickness, are obtained by performing the fitting in a direction that minimizes the sum of differences between the reference parameter value P.sub.ML obtained through the machine learning model M.sub.ML and the estimated parameter value P.sub.P input into the physical model M.sub.P. On the other hand, the size or direction of change in a parameter inside a regression analysis algorithm such as Levenberg-Marquardt may be controlled within a weighted value or a range with reference to the reference parameter value P.sub.ML.

[0085] The integrated error function used herein consists mainly of two sections, as follows: the integrated error function (f) = ① the first error function (f.sub.1) + ② the second error function (f.sub.2).

[0086] The first error function relates to the difference between the measured spectrum S.sub.E measured by a device and the spectrum S.sub.P theoretically calculated by using the physical model M.sub.P, and various error functions, such as mean squared error (MSE), root mean squared error, mean absolute error, mean absolute percentage error, mean percentage error, etc., may be used as evaluation indicators of the physical model.

[0087] For example, the MSE value of Equation 1 below may be used as the first error function. A small value of the first error function means a small difference (high agreement) between the measured spectrum S.sub.E and the spectrum S.sub.P obtained using the physical model M.sub.P.

[00005] $\begin{matrix} f_{1} = \frac{W_{1}}{N - M} \sum_{n = 1}^{N} \left[ \frac{(S_{E, n} - S_{P, n})^{2}}{\sigma_{E, n}^{2}} \right] & \text{[Equation 1]} \end{matrix}$

[0088] Here, N indicates the number of wavelength points of the measured spectrum (S.sub.E), M indicates the number of variables of the measured spectrum (S.sub.E), W.sub.1 indicates a weighted value of the first error function (f.sub.1), and σ.sub.E indicates a standard deviation of values of the measured spectrum (S.sub.E) at a corresponding wavelength point and acts as a weighted value for each wavelength. According to an exemplary embodiment, when the first error function is used in a form added to the second error function, W.sub.1 may be used as the relative weighted value of the first error function.
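Equation 1 can be evaluated with a short function. The sketch below is a direct transcription of the formula under the assumption that the spectra and standard deviations are given as equal-length sequences; the function and argument names are hypothetical.

```python
def first_error(measured, calculated, sigma_e, n_params, w1=1.0):
    """Equation 1: f1 = W1 / (N - M) * sum(((S_E - S_P) / sigma_E) ** 2)."""
    n_points = len(measured)
    if not (len(calculated) == len(sigma_e) == n_points):
        raise ValueError("all inputs must have the same number of wavelength points")
    if n_points <= n_params:
        raise ValueError("need more wavelength points (N) than variables (M)")
    total = sum(((s_e - s_p) / s) ** 2
                for s_e, s_p, s in zip(measured, calculated, sigma_e))
    return w1 * total / (n_points - n_params)

# Three wavelength points, one fitted variable, residuals of 1, 1 and 2 sigma.
f1_value = first_error([1.0, 2.0, 3.0], [1.1, 1.9, 3.2], [0.1, 0.1, 0.1], n_params=1)
```

Dividing each residual by σ.sub.E expresses the misfit in units of measurement noise, so wavelength points with larger scatter contribute less to the total.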

[0089] The second error function in Equation 2 is a parameter error function related to a difference between the reference parameter value P.sub.ML and the estimated parameter value P.sub.P used in the physical model M.sub.P.

[00006] $\begin{matrix} f_{2} = W_{2} \sum_{m = 1}^{M} \left[ \frac{(P_{ML, m} - P_{P, m})^{2}}{\sigma_{P, m}^{2}} \right] & \text{[Equation 2]} \end{matrix}$

[0090] Here, W.sub.2 is used to assign a relative weighted value with respect to the second error function, as a weighted value of the parameter error function. σ.sub.P indicates a standard deviation of the estimated parameter value P.sub.P. According to an exemplary embodiment, the reference parameter value P.sub.ML is the parameter value given through the machine learning model M.sub.ML. In some cases, an arbitrarily given value may be used.

[0091] A small value of the second error function means a small difference (high agreement) between the estimated parameter value P.sub.P used in the process of obtaining the theoretical spectrum S.sub.P and the reference parameter value P.sub.ML obtained through the machine learning model M.sub.ML.
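Equation 2 can be transcribed in the same way. The sketch assumes the reference values, the estimated values, and their standard deviations are given as equal-length sequences, one entry per variable; the names and the numerical values are hypothetical.

```python
def second_error(p_ml, p_p, sigma_p, w2=1.0):
    """Equation 2: f2 = W2 * sum(((P_ML - P_P) / sigma_P) ** 2)."""
    if not (len(p_ml) == len(p_p) == len(sigma_p)):
        raise ValueError("all inputs must have one entry per variable")
    total = sum(((ml - p) / s) ** 2 for ml, p, s in zip(p_ml, p_p, sigma_p))
    return w2 * total

# Two variables (thickness in nm, refractive index) that are 2 and 1 sigma
# away from their reference values, respectively.
f2_value = second_error([100.0, 1.46], [104.0, 1.47], [2.0, 0.01], w2=0.5)
```

Scaling by σ.sub.P puts variables with different units, such as a thickness and a refractive index, on a common footing before they are summed.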

[0092] Equation 3 described below is the sum of the first error function and the second error function, and the sum is the integrated error function that considers at the same time not only the difference in spectrum but also the difference between the estimated parameter value P.sub.P and the reference parameter value P.sub.ML. Here, W.sub.1 and W.sub.2 are relative weighted values, and relative weighted values may be assigned by setting the sum of W.sub.1 and W.sub.2 to 1.

[00007] $\begin{matrix} f = f_{1} + f_{2} = \frac{W_{1}}{N - M} \sum_{n = 1}^{N} \left[ \frac{(S_{E, n} - S_{P, n})^{2}}{\sigma_{E, n}^{2}} \right] + W_{2} \sum_{m = 1}^{M} \left[ \frac{(P_{ML, m} - P_{P, m})^{2}}{\sigma_{P, m}^{2}} \right] & \text{[Equation 3]} \end{matrix}$

[0093] Equation 4, the partial derivative of the integrated error function (f) with respect to any parameter, may be used when the size or direction of change of the estimated parameter values P.sub.P inside the regression analysis algorithm is determined.

[00008] $\begin{matrix} \frac{\partial f}{\partial P_{P, m}} = \frac{W_{1}}{N - M} \sum_{n = 1}^{N} \left[ \frac{(S_{E, n} - S_{P, n})}{\sigma_{E, n}^{2}} \left( \frac{\partial S_{P, n}}{\partial P_{P, m}} \right) \right] + 2 W_{2} \frac{(P_{ML, m} - P_{P, m})}{\sigma_{P, m}^{2}} & \text{[Equation 4]} \end{matrix}$
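How the gradient of the integrated error function enters a regression step can be checked numerically. The sketch below assumes a toy one-parameter spectrum model and uses the gradient obtained by differentiating Equation 3 directly, so its constant factors and signs follow from that differentiation and may differ from the convention printed in Equation 4. The model, the constants, and the function names are hypothetical.

```python
import math

N_FILM = 1.46  # assumed constant refractive index (hypothetical value)

def model(wl, d):
    """Toy calculated spectrum S_P for thickness d."""
    return math.cos(4 * math.pi * N_FILM * d / wl)

def dmodel_dd(wl, d):
    """Analytic derivative of the toy spectrum with respect to thickness."""
    k = 4 * math.pi * N_FILM / wl
    return -k * math.sin(k * d)

def integrated_error(d, wls, measured, sigma_e, p_ml, sigma_p, w1, w2):
    """Equation 3 for a single variable (M = 1)."""
    n, m = len(wls), 1
    f1 = w1 / (n - m) * sum(((s - model(wl, d)) / sigma_e) ** 2
                            for wl, s in zip(wls, measured))
    f2 = w2 * ((p_ml - d) / sigma_p) ** 2
    return f1 + f2

def integrated_gradient(d, wls, measured, sigma_e, p_ml, sigma_p, w1, w2):
    """Derivative of Equation 3 with respect to d, by direct differentiation."""
    n, m = len(wls), 1
    g1 = -2.0 * w1 / (n - m) * sum((s - model(wl, d)) / sigma_e ** 2 * dmodel_dd(wl, d)
                                   for wl, s in zip(wls, measured))
    g2 = -2.0 * w2 * (p_ml - d) / sigma_p ** 2
    return g1 + g2

wls = [400.0 + 10.0 * i for i in range(41)]
measured = [model(wl, 100.0) for wl in wls]
args = (wls, measured, 0.01, 101.0, 2.0, 0.5, 0.5)  # sigma_E, P_ML, sigma_P, W1, W2

d_test, h = 103.0, 1e-5
analytic = integrated_gradient(d_test, *args)
numeric = (integrated_error(d_test + h, *args)
           - integrated_error(d_test - h, *args)) / (2 * h)
```

Agreement between the analytic value and the central finite difference is a quick sanity check worth running before a gradient is handed to a regression algorithm.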

[0094] In this stage, the optimum parameter value P.sub.BEST is obtained through the fitting process of changing the estimated parameter values P.sub.P and finding the optimized value (mostly, the minimized value) of the integrated error function.
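The effect of the second error function on an ambiguous fit can be illustrated with a contrived example. The sketch assumes a toy spectrum sampled at three wavelengths chosen so that thicknesses 200 nm apart give identical spectra, and it replaces the regression analysis with a brute-force scan over a 1 nm grid so that the location of the minimum is easy to verify. The model, the weights, the standard deviations, and the reference value are hypothetical.

```python
import math

N_FILM = 1.5                          # assumed refractive index (hypothetical)
WAVELENGTHS = [600.0, 300.0, 200.0]   # chosen so the toy spectrum repeats every 200 nm

def model(wl, d):
    """Toy calculated spectrum S_P for thickness d."""
    return math.cos(4 * math.pi * N_FILM * d / wl)

def f1(d, measured, sigma_e=0.01, w1=0.5, n_params=1):
    """First error function (Equation 1) for the toy model."""
    n = len(WAVELENGTHS)
    return w1 / (n - n_params) * sum(((s - model(wl, d)) / sigma_e) ** 2
                                     for wl, s in zip(WAVELENGTHS, measured))

def f2(d, p_ml, sigma_p=5.0, w2=0.5):
    """Second error function (Equation 2) for a single variable."""
    return w2 * ((p_ml - d) / sigma_p) ** 2

def best_on_grid(cost, grid):
    """Brute-force stand-in for the regression: pick the grid point of lowest cost."""
    return min(grid, key=cost)

true_d = 120.0
measured = [model(wl, true_d) for wl in WAVELENGTHS]
p_ml = 118.0                          # hypothetical reference value, slightly off
grid = [float(d) for d in range(0, 501)]

d_best = best_on_grid(lambda d: f1(d, measured) + f2(d, p_ml), grid)
ambiguous_f1 = f1(320.0, measured)    # spectrum-only error at the aliased thickness
```

The spectrum-only error is essentially zero at both 120 nm and 320 nm, so it cannot distinguish them, whereas the integrated error function has its minimum on the grid at 120 nm because the parameter term penalizes the solution far from P.sub.ML.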

[0095] Although the embodiment of the present disclosure has been disclosed for illustrative purposes, and the present disclosure is not limited to the embodiment disclosed in the detailed description, those skilled in the art will appreciate that various modifications, additions and substitutions are possible, without departing from the scope and spirit of the invention as disclosed in the accompanying claims, and also various alternatives, modifications, equivalents and other embodiments that may be included within the spirit and scope of the present disclosure.


CPC classification: G01N21/211; G01N2201/126; G01N2021/213 (PHYSICS)

International classification: G01B11/28; G01N21/21 (PHYSICS)
