SMART SKIP TESTING METHOD FOR SEMICONDUCTOR MANUFACTURING
20220122864 · 2022-04-21
International classification
H01L21/67
ELECTRICITY
Abstract
Provided is a method for predicting and classifying yield to determine downstream testing steps. The method comprises obtaining and preprocessing historical input data from a semiconductor fabrication process, setting a yield threshold for a yield classification, and training a model using the historical input data as a training dataset. The model is configured to determine from a set of input data whether any of the wafers or lots have a higher yield than the yield threshold and can skip the next testing step. The yield threshold is optimized during the model training to identify an optimal yield threshold at which the total cost of wafer sorting, die assembly, and final test is minimal. The trained model is deployed and used for yield prediction and classification using real-time input data from semiconductor manufacturing, resulting in substantial savings in cost and test time and effectively increasing test capacity.
Claims
1. A method for predicting and classifying yield to determine downstream testing steps, comprising: obtaining and preprocessing historical input and labeling data from a semiconductor fabrication process, setting a unit yield threshold for unit yield classification, training of at least one unit yield model using historical input and labeling data as a training dataset, the unit yield model trained to determine from the training dataset whether any of the plurality of units have yield higher than the unit yield threshold and can skip next testing step, deploying of at least one unit yield model to evaluate the input of a current production unit, predicting, by employing the trained unit yield model and current input data, whether the unit has higher yield than the unit yield threshold, recommending whether the unit can skip next testing step.
2. The method of claim 1, wherein the input data is selected from a data group consisting of in-process data, and process control monitoring data from a semiconductor fabrication process and labeling data is the unit yield acquired from historical wafer sort data.
3. The method of claim 1, further comprising: acquiring an ensemble of cost data from wafer sorting, die assembly, and final test process steps, setting the unit yield threshold for unit yield classification, wherein the cost data is used for the unit yield threshold calculation.
4. The method of claim 3, further comprising: optimizing the unit yield threshold during unit yield model training to identify an optimal unit yield threshold at which the total cost of wafer sorting, die assembly, and final test is minimal with respect to the model's characteristics.
5. The method of claim 4, further comprising: predicting by employing the trained unit yield model using the current input data, whether a unit has higher yield than the optimal unit yield threshold.
6. The method of claim 1, wherein the unit corresponds to a wafer and the unit yield model is a wafer yield model.
7. The method of claim 1, wherein the unit corresponds to a lot and the unit yield model is a lot yield model.
8. The method of claim 7, wherein the lot yield model is trained using the wafer yield model.
9. The method of claim 1, wherein the next testing step is the wafer sort testing.
10. The method of claim 1, wherein the current input data is a stream of real time input data.
11. The method of claim 1, wherein the recommending step further comprises a step of automatically implementing the recommended decision in a manufacturing execution system.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0022]
[0023]
[0024]
[0025]
[0026]
[0027]
DETAILED DESCRIPTION
1. Overview
[0028] A method is described for using semiconductor in-process data, process control monitoring (PCM) data, and wafer sort (WS) data to adaptively modify production testing steps. In general, the embodiments described herein may be referred to as the Smart Skip Testing method. Wafer-level testing can include, but is not limited to, functional and reliability testing of individual dice, or specific tests that employ high temperature, high voltage, or the like.
[0029] It is an object of the present invention to reduce the test cost of wafer-level testing of ICs, diodes, and transistors and, in general, of any device manufactured on a semiconductor wafer.
[0030] It is a further object of the present invention to make significantly better use of the available test capacity for wafer-level testing of semiconductor devices, which in turn increases the effective probe capacity.
[0031] Another object is to reduce the production cycle time, since a wafer or lot that skips the WS can be sent directly to the next process step, such as assembly, once the wafer maps have been generated.
[0032] The above objects as well as further objects that will become apparent from an ensuing description are accomplished by a method according to the present invention.
2. Machine Learning
[0033] Machine Learning (ML) is a subset of Artificial intelligence (AI) and is located at the intersection of Statistics and Computer Science. ML/AI can be used to analyze large volumes of data and to generate predictions that help make pass/fail recommendations. Such recommendations can increase in accuracy with the volume of data analyzed. Moreover, by retraining the ML models, the recommendations can adapt to the latest distribution of the data. ML/AI is particularly useful for analysis of modern chip manufacturing and test data due to its ability to uncover nonlinear interactions in highly multidimensional data, as well as identify features that are important in predicting eventual die outcomes. The ML/AI approach is sensitive to the quality of the data and can be computationally intensive. Careful attention must therefore be paid to the statistical robustness of the results and to effective parallelization of the computational process.
[0034] ML/AI methods open up the possibility of using data analysis to adjust the process flow in order to reduce cost and optimize processes, including inventory management and control. For example, if yield excursions can be predicted for a given wafer population, those wafers can be routed for complete testing. Conversely, if yield is predicted to be good, that wafer population can undergo much lighter testing down the line.
[0035] Predictive analytics and ML/AI algorithms can thus be used to address many of the challenges facing the semiconductor industry. Increased process efficiencies can result from drilling deeper into the details of semiconductor manufacturing and applying predictive analytics to detect and resolve process issues faster and to tighten and target the specifications of individual manufacturing steps. Known ML algorithms include, but are not limited to, tree-based algorithms, neural-network-based algorithms, and neighborhood-based approaches.
3. Machine Learning Method for Smart Skip Testing
[0036] Any machine learning model is created in a process called training (including validation and testing), and then applied to accomplish prediction. The model is trained for a unit yield classification, where the unit may correspond to a wafer or group of wafers such as a lot.
[0037] In one embodiment the training process 200 of the wafer yield classification model is illustrated in
[0038] The training process involves providing data from various stages of the semiconductor process for the model to learn from. The training data may comprise some or all of the following types of historical input data: in-process data; PCM data; cost data for WS, assembly (ASSY), and final test (FT); and labeling data, which can be the wafer yield acquired from the WS data.
[0039] The data is preprocessed in step 210 of
[0040] In step 230, an appropriate training algorithm is run on the preprocessed data to train the wafer yield classification model. Not all data listed above are required. The output of the training step 230 is a trained model 240.
[0041]
[0042] In another embodiment the process can be implemented to provide a lot classification. In this embodiment the training process 400 of the lot yield classification model is illustrated in
4. Method Modules
[0043] In one embodiment, the machine learning method disclosed herein includes the following modules: (A) input and labeling data; (B) data preprocessing; (C) yield threshold for classification; (D) model training; (E) prediction and classification.
[0044] A. Input and Labeling Data
[0045] The training step can require some or all of the following types of historical input data: in-process data comprising measurement and defect data; PCM data; WS, ASSY and FT cost data; and labeling data that is wafer or lot yield acquired from the WS data.
[0046] The in-process data is obtained in step 110 of
[0047] In step 120, the PCM data is taken directly from the various test structures placed on at least a few predefined test sites per wafer (e.g. 5, 9, 13, etc.) or on all test sites. A large number of electrical parameters are measured from the test structures. These measurements may include, but are not limited to, MOS transistor threshold voltage, gate width, current gain, breakdown voltage, contact and via chain resistance, film resistor properties, interconnect integrity, and interconnect resistance.
[0048] The WS data results from step 130, in which a plurality of tests, generally electrical, are performed on the individual integrated circuits formed on the wafers. These tests verify the functionality of the finished circuits. The labeling or bin data (die pass/fail) is acquired from the WS data and is used as the response for yield prediction and classification during the training process.
[0049] The cost data 220 is collected as the wafer sort (step 130) cost per wafer and the die assembly (step 140) and final test (step 150) costs per device or package. The cost data can be updated on a regular basis or whenever a systematic change is made.
[0050] The prediction step can require the same data as the training step, except for the labeling data.
[0051] Data inputs can be made available through a variety of methods, including but not limited to download from a relational or NoSQL database and direct parsing of CSV or XML files in a specified file location (e.g. for cost data). Data download can be operated in both interactive and automated modes.
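As an illustration of direct parsing from a CSV file, the following minimal sketch reads cost data into a dictionary. The file layout, the column names (step, cost, unit), the function name, and all numeric values are assumptions made up for this example and do not come from the described method.

```python
import csv
import io

# Hypothetical cost file contents; column names and values are illustrative only.
SAMPLE_COST_CSV = """\
step,cost,unit
WS,120.0,wafer
ASSY,0.35,device
FT,0.15,device
"""


def load_cost_data(text):
    """Parse cost rows into a dict keyed by process step (WS, ASSY, FT)."""
    reader = csv.DictReader(io.StringIO(text))
    costs = {}
    for row in reader:
        costs[row["step"]] = {"cost": float(row["cost"]), "unit": row["unit"]}
    return costs


costs = load_cost_data(SAMPLE_COST_CSV)
c_ws = costs["WS"]["cost"]                            # WS cost per wafer
c_aft = costs["ASSY"]["cost"] + costs["FT"]["cost"]   # assembly + FT cost per device
```

In an automated mode, the same function could be fed the contents of a file read from the specified location instead of the inline sample string.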
[0052] B. Data Preprocessing
[0053] In one embodiment, the downloaded data is converted into a form suitable for input to the ML models. The data preprocessing may be performed differently according to data types and characteristics.
[0054] Specifically, the following are some examples of data validations that can be included in the disclosed method (non-exhaustive list): comparison of common statistical quantities against predefined limits; use of minimum, maximum, average, standard deviation, percentiles, correlations; uniqueness checks for categorical and string data types (e.g. lot ids, wafer ids); check for date and time-stamp validity.
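A few of the validations listed above can be sketched as follows. The record fields (wafer_id, timestamp, vth), the timestamp format, and the limit values used in the usage note are hypothetical and serve only to illustrate a uniqueness check, a time-stamp validity check, and a comparison of a mean against predefined limits.

```python
from datetime import datetime


def validate_records(records, limits):
    """Return a list of human-readable issues found in the records."""
    issues = []
    seen = set()
    for rec in records:
        wafer_id = rec["wafer_id"]
        # Uniqueness check for the identifier.
        if wafer_id in seen:
            issues.append(f"duplicate wafer_id: {wafer_id}")
        seen.add(wafer_id)
        # Time-stamp validity check (format is an assumption).
        try:
            datetime.strptime(rec["timestamp"], "%Y-%m-%d %H:%M:%S")
        except ValueError:
            issues.append(f"invalid timestamp for {wafer_id}: {rec['timestamp']}")
    # Compare the mean of each numeric parameter against predefined limits.
    for name, (low, high) in limits.items():
        values = [rec[name] for rec in records]
        mean = sum(values) / len(values)
        if not low <= mean <= high:
            issues.append(f"mean of {name} = {mean:.3f} outside [{low}, {high}]")
    return issues
```

For example, two records sharing the same wafer_id, one of which carries a month value of 13, with a mean vth outside the limits, would yield three issues.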
[0055] In addition, data anomalies such as outliers can be identified with different techniques, including but not limited to ML algorithms such as local outlier factor, isolation forests, and DBSCAN, as well as a statistical method that uses the interquartile range (IQR) to calculate the boundaries of what constitutes an outlier. A data point is an outlier if it lies outside the range from the first quartile minus a multiple of the IQR to the third quartile plus the same multiple of the IQR. A common value for the IQR multiple is in the range of 1.5 to 6, but it can also have a higher value.
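The IQR rule can be sketched in a few lines. The function names and the choice of the "inclusive" quartile method from the Python standard library are assumptions for illustration; other quartile conventions give slightly different boundaries.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) outlier boundaries for IQR multiple k."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, k=1.5):
    """Return the values lying outside the IQR-based boundaries."""
    lower, upper = iqr_bounds(values, k)
    return [v for v in values if v < lower or v > upper]
```

A larger multiple k widens the boundaries, so fewer points are flagged as outliers.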
[0056] In addition, missing data points, including removed outliers, can be replaced with values such as an upper value, a lower value, or the median value, as well as with values predicted by an algorithm such as k nearest neighbors (k-NN) or multivariate imputation by chained equations (MICE).
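The simplest of these options, replacement with the median, can be sketched as below. Representing a missing value as None and the function name are assumptions of this example; k-NN or MICE imputation would replace the single fill value with a per-point prediction.

```python
import statistics


def impute_median(column):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in column if v is not None]
    fill = statistics.median(observed)
    return [fill if v is None else v for v in column]
```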
[0057] In addition, if features in the datasets have different ranges, normalization can be used to rescale the input and output variables to values between 0 and 1 before training the models.
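A minimal sketch of such min-max normalization follows. Splitting it into a fit step on the training data and an apply step is a design assumption of this example, intended to let the same scaling be reused on new data at prediction time; the function names are hypothetical.

```python
def fit_min_max(train_values):
    """Learn the scaling range from the training data."""
    return min(train_values), max(train_values)


def apply_min_max(values, low, high):
    """Rescale values so that the training range maps onto [0, 1]."""
    if high == low:
        # A constant feature carries no information; map it to 0.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]
```

Values in new data that fall outside the training range map outside [0, 1], which can itself serve as a signal of unusual input.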
[0058] In addition, dataset balancing with respect to the response might be necessary.
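One possible balancing technique is random oversampling of the minority class, sketched below. The text does not name a specific balancing method, so this choice, the function name, and the class labels used in the usage note are assumptions for illustration only.

```python
import random


def oversample_minority(rows, labels, seed=0):
    """Duplicate randomly chosen rows until every class has equal count."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Draw extra samples with replacement from the same class.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels
```

For example, a training set with three "skip" wafers and one "test" wafer would be expanded to three of each.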
[0059] In addition, some or all of the following data inputs are integrated into a single environment: in-process measurement and defect data, PCM data, WS data, and cost data of WS, ASSY and FT.
[0060] C. Yield Threshold for Classification
[0061] If a set of wafers with a certain yield distribution is split into wafers with lower yield than a yield threshold (Y_T) that are tested at the WS and the remaining wafers skip the WS then the total costs (TC) of WS in step 130 in
where N_WS is the number of wafers tested at WS, C_WS is the cost of WS per wafer, N_Skip is the number of wafers that skipped the WS, PDPW stands for Potential Die Per Wafer, which is the total number of dice on a wafer, Y_i is the yield of a particular wafer that skipped the WS, and C_AFT is the cost of Assembly and Final Testing per device or package.
[0062] The choice of yield threshold Y_T affects the split of the wafer set into N_WS and N_Skip.
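The effect of the threshold on the split can be illustrated with a small sketch. It assumes a total cost of the form TC = N_WS × C_WS + sum over the skipped wafers of PDPW × (1 − Y_i) × C_AFT, that is, tested wafers incur the WS cost and skipped wafers incur assembly and final test cost on their bad dice. This form is built only from the symbols defined above and is an assumption for illustration, as are the function name and the handling of a yield exactly equal to the threshold.

```python
def total_cost(yields, y_t, c_ws, c_aft, pdpw):
    """Total cost for a wafer set split at yield threshold y_t (assumed form)."""
    tested = [y for y in yields if y < y_t]     # N_WS wafers go to wafer sort
    skipped = [y for y in yields if y >= y_t]   # N_Skip wafers skip wafer sort
    ws_cost = len(tested) * c_ws
    # Bad dice on skipped wafers are assembled and final-tested anyway.
    bad_die_cost = sum(pdpw * (1.0 - y) * c_aft for y in skipped)
    return ws_cost + bad_die_cost
```

Under this assumed form, a threshold of 0 sends every wafer past the WS, a threshold above 1 sends every wafer to the WS, and intermediate values trade WS cost against the cost of assembling bad dice.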
[0063] If the yield can be accurately predicted after step 120 of
[0064] If there are any further steps in the process flow in the classification process 300 between WS in step 130 and FT in step 150, the related costs can be added to the above-mentioned expression.
[0065] The yield threshold can also be determined in a different way, depending on which process or cost characteristic is most important to the user.
[0066] D. Model Training
[0067] As described earlier, the inputs to the model training step 230 of
[0068] The model or ensemble model that best meets the desired goals represented by the objective function is selected. A standard ML metric, such as mean square error, mean absolute error, or mean square error with an additional penalty function that penalizes missing a desired specification, can be used as the objective function. The objective function can be customized depending on which metric is of the most importance to the user.
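An objective function of the last kind can be sketched as follows. The specific penalty used here, the fraction of wafers predicted to clear the threshold whose true yield falls below it, is an assumption chosen for illustration; the text does not define the penalty function, and the names and the penalty weight are hypothetical.

```python
def mse(y_true, y_pred):
    """Mean square error between true and predicted yields."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def penalized_mse(y_true, y_pred, y_t, penalty=1.0):
    """MSE plus a penalty for wafers wrongly predicted to clear threshold y_t."""
    misses = sum(1 for t, p in zip(y_true, y_pred) if p >= y_t and t < y_t)
    return mse(y_true, y_pred) + penalty * misses / len(y_true)
```

Candidate models can then be ranked by this value, with the lowest score selected.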
[0069] The yield threshold can be optimized during the yield model training to identify the optimal yield threshold Y_TO at which the total cost of wafer sorting, die assembly, and final test is minimal with respect to the model's characteristics.
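One way to picture this optimization is a sweep over candidate thresholds on held-out data, where the skip decision uses the predicted yield while the cost is evaluated with the actual yield. The cost form (WS cost per tested wafer, assembly and final test cost on the bad dice of skipped wafers), the function names, and the grid of candidates are assumptions of this sketch rather than the described procedure.

```python
def cost_for_threshold(predicted, actual, y_t, c_ws, c_aft, pdpw):
    """Cost when wafers with predicted yield >= y_t skip wafer sort."""
    cost = 0.0
    for pred, act in zip(predicted, actual):
        if pred >= y_t:
            # Skipped wafer: its bad dice incur assembly and final test cost.
            cost += pdpw * (1.0 - act) * c_aft
        else:
            # Tested wafer: incurs the wafer sort cost.
            cost += c_ws
    return cost


def optimal_threshold(predicted, actual, candidates, c_ws, c_aft, pdpw):
    """Return the candidate threshold with the lowest total cost."""
    return min(
        candidates,
        key=lambda y_t: cost_for_threshold(predicted, actual, y_t, c_ws, c_aft, pdpw),
    )
```

Because the decision depends on predicted rather than actual yield, a model that overestimates the yield of poor wafers pushes the optimum toward a higher threshold, which is one reading of the optimum being minimal with respect to the model's characteristics.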
[0070] The model is trained to provide a unit classification. In one embodiment in
[0071] Method 1: Wafer Level
[0072] The wafer yield classification model is trained in step 230 of
[0073] Method 2: Lot Level
[0074] An alternative approach to the wafer yield classification model would be a lot yield classification model. In this method, the first step 410 of
[0075] The output of the model training step can be an executable trained model 240 of
[0076] E. Prediction and Classification
[0077] In one embodiment in
[0078] Method 1: Wafer Level
[0079]
[0080] Method 2: Lot Level
[0081]
[0082] Thus, yield prediction and classification are useful in determining how to proceed with wafer or lot processing in a cost-effective manner. The yield prediction and classification are not necessarily the only variables in deciding how to continue processing a product.
[0083] A wafer map is also generated for wafers that skip the WS before they are sent to assembly. In this case, the map contains information about the excluded dice.
[0084] The method can also include actions that increase its robustness. For example, if a new input data issue, such as missing parameters or unusual behavior, is detected during automated preprocessing, the method can be configured to mark all wafers as requiring testing or to trigger a model retraining request. Additionally, a warning can be issued to inform the user when the number of wafers tested at the WS has changed beyond certain preset values.
5. Conclusion
[0085] The Smart Skip Testing method aims to identify incoming wafers or lots that are likely to pass tests with a yield above the optimal yield threshold, and recommends that the identified wafers or lots skip the test altogether, resulting in substantial cost savings, increased effective testing capacity, and reduced production cycle time. The disclosed method is a real-time method that is able to make cost-effective decisions without human intervention.
[0086] The foregoing written description is intended to enable one of ordinary skill to make and use the techniques described herein, but those of ordinary skill will understand that the description is not limiting and will also appreciate the existence of variations, combinations, and equivalents of the specific embodiments, methods, and examples described herein.
[0087] The data and the machine learning techniques mentioned in the above-described embodiments are merely examples, and may be replaced with others.