Control of Processing Equipment
20230245872 · 2023-08-03
Inventors
- Gregory Austin Daly (Bristol, GB)
- Gavin Randal Tabor (Exeter, GB)
- Jonathan Edward Fieldsend (Exeter, GB)
Cpc classification
Y02P90/02
GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
G05B2219/32187
PHYSICS
G05B2219/32181
PHYSICS
H01J37/32935
ELECTRICITY
G05B2219/32188
PHYSICS
International classification
H01L21/67
ELECTRICITY
Abstract
Broadly speaking, the present techniques provide a method and system for controlling a wafer production process in real-time using a trained machine learning, ML, model. Advantageously, the ML model uses multiple sensed parameters to determine a state of a plasma used in the wafer production process, and this can be used to adjust at least one control parameter of a plasma reactor used in the wafer production process to reduce process variability.
Claims
1-18. (canceled)
19. A computer-implemented method for controlling a wafer production process in real-time using a trained machine learning, ML, model, the method comprising: receiving sensor data from a plurality of sensors monitoring the wafer production process in real-time; inputting the sensor data from the plurality of sensors into a neural network of the trained ML model; generating, using the trained ML model, a latent representation of a state of a plasma used in the wafer production process; and adjusting in real-time, using the generated latent representation, at least one control parameter of a plasma reactor used in the wafer production process.
20. The method as claimed in claim 19, wherein receiving sensor data comprises receiving: at least one image of the plasma used in the wafer production process, and at least one optical emission spectrograph of the plasma.
21. The method as claimed in claim 19, wherein receiving sensor data comprises receiving at least one of: RF power applied to the plasma reactor, temperature of chamber furniture inside the plasma reactor, pressure inside the plasma reactor, gas flow rate into the plasma reactor, plasma impedance, and plasma electron density.
22. The method as claimed in claim 19 wherein generating a latent representation of a state of a plasma used in the wafer production process comprises: combining, using the neural network, the sensor data to generate a latent representation in real-time of the state of the plasma.
23. The method as claimed in claim 19 wherein the neural network comprises an autoencoder.
24. The method as claimed in claim 19 further comprising: comparing the generated latent representation of the state of the plasma with a desired latent representation of an ideal state of the plasma; and identifying any difference between the generated and desired latent representations.
25. The method as claimed in claim 24 wherein adjusting at least one control parameter of a plasma reactor used in the wafer production process comprises: determining at least one parameter of the wafer production process to adjust to minimize any identified difference between the generated latent representation and the desired latent representation; and adjusting the determined at least one parameter.
26. The method as claimed in claim 24 further comprising: outputting an alert to an operator of the plasma reactor when the identified difference between the generated and desired latent representations exceeds a threshold value or cannot be minimized by adjusting at least one parameter.
27. The method as claimed in claim 22, wherein combining the sensor data comprises combining sensor data having different spatial and/or temporal dimensionality.
28. A computer-implemented method for training a machine learning, ML, model for controlling a wafer production process in real-time, the method comprising: receiving training data comprising sensor data from a plurality of sensors monitoring a wafer production process; inputting the training data into a neural network of the ML model; and training the neural network of the ML model to generate a latent representation of a state of a plasma in a plasma reactor used in the wafer production process.
29. The method as claimed in claim 28 wherein receiving training data comprises receiving a plurality of sets of data items, wherein each set of data items comprises an image of the plasma and an optical emission spectrograph of the plasma, and wherein for each set of data items the data items are collected at the same point in time.
30. The method as claimed in claim 29 wherein each set of data items further comprises at least one of: RF power applied to the plasma reactor, temperature inside the plasma reactor, pressure inside the plasma reactor, gas flow rate into the plasma reactor, plasma impedance, and plasma electron density.
31. The method as claimed in claim 29 wherein training the neural network comprises training an encoder of the neural network to: combine each set of data items to generate a latent representation of the state of the plasma at a particular point in time.
32. The method as claimed in claim 31 wherein training the neural network further comprises training a decoder of the neural network to: reconstruct, from the generated latent representation, a set of data items corresponding to the generated latent representation; and minimize, using backpropagation, a difference between the set of data items and the reconstructed set of data items.
33. The method as claimed in claim 28 wherein training the neural network further comprises: inputting, into the neural network, a desired latent representation of an ideal state of the plasma; training the neural network to identify any difference between each generated latent representation and the desired latent representation; and determining at least one parameter of the wafer production process to adjust to minimize any identified difference between each generated latent representation and the desired latent representation.
34. A system for wafer production, the system comprising: a plasma reactor; a plurality of sensors for monitoring a wafer production process; and a control unit, comprising at least one processor coupled to memory and comprising a trained machine learning, ML, model, wherein the control unit is arranged to: receive, in real-time, sensor data from the plurality of sensors monitoring the wafer production process; input the sensor data from the plurality of sensors into a neural network of the trained ML model; generate, using the trained ML model, a latent representation of a state of a plasma used in the wafer production process; and adjust in real-time, using the generated latent representation, at least one control parameter of a plasma reactor used in the wafer production process.
35. The system as claimed in claim 34 wherein the plurality of sensors comprises any one or more of: a temperature sensor, a pressure sensor, an imaging device, in situ wafer metrology equipment, a spectrometer, optical emission spectroscopy equipment, a radio-frequency sensor, a photodiode, a microwave probe, a flow rate sensor.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0043] The invention will further be described, by way of example, with reference to the accompanying drawings, in which:
[0044]
[0045]
[0046]
[0047]
[0048]
[0049]
DETAILED DESCRIPTION OF THE DRAWINGS
[0050] Broadly speaking, the present techniques provide a method and system for controlling a wafer production process in real-time using a trained machine learning, ML, model. Advantageously, the ML model uses multiple sensed parameters to determine a state of a plasma used in the wafer production process, and this can be used to adjust at least one control parameter of a plasma reactor used in the wafer production process to reduce process variability.
[0051]
[0052] The system 10 may comprise a number of sensors 13 associated with the processing chamber 12. Outputs 13A of the sensors are supplied to a control unit 20, for example in the form of a suitably programmed computer. Whilst a suitably programmed computer is described as constituting the control unit 20, it will be appreciated the control unit 20 could take other forms, and may comprise a device specifically designed for use in the control of the processing equipment 10. The control unit 20 may comprise at least one processor coupled to memory. The at least one processor may comprise one or more of: a microprocessor, a microcontroller, and an integrated circuit. The memory may comprise volatile memory, such as random access memory (RAM), for use as temporary memory, and/or non-volatile memory such as Flash, read only memory (ROM), or electrically erasable programmable ROM (EEPROM), for storing data, programs, or instructions, for example.
[0053] The sensors 13 are sensitive to a number of parameters associated with the processing chamber 12. The sensors 13 may comprise any one or more of: temperature and pressure sensors 22 sensitive to temperature and pressure conditions within the processing chamber 12, an optical camera 24 positioned to allow monitoring of the appearance of the wafer, in situ wafer metrology equipment 26, a spectrometer 28, and other optical monitors or sensors 30. In addition, the control unit 20 is supplied with flow rate information from the process gas control metering and valve arrangement 16, and with impedance, phase and voltage information.
[0054] The sensors 13 may comprise a number of sensors for measuring properties of the plasma. The sensors may comprise an imaging device (e.g. camera or RGB camera) for imaging the plasma within the processing chamber 12, optical emission spectrography equipment, radio-frequency sensors, a photodiode, and/or microwave probes.
[0055] The sensors 13 may comprise one or more in-situ metrology sensors for determining properties of the designated metrology wafer in a batch of wafers. The metrology sensor may be a full wafer interferometer and/or a spectral reflectometer.
[0056] The sensors 13 may comprise one or more sensors for measuring properties of the processing chamber, such as pressure, voltage, temperature, and so on.
[0057] Preferably, to generate an accurate latent representation of the plasma at a given point in time, the sensor data is collected simultaneously from multiple sensors. Sensor data may be collected at regular time intervals, or after particular processing steps have been performed, for example.
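By way of illustration only, the grouping of simultaneously collected sensor data might be sketched as follows. The sketch assumes each sensor delivers a time-sorted list of (timestamp, value) readings; the function names `nearest` and `align`, the `tolerance` parameter, and the choice of one sensor as a timing reference are assumptions made for the example and are not taken from the description.

```python
from bisect import bisect_left


def nearest(readings, t):
    """Return the (timestamp, value) reading closest in time to t.

    `readings` must be non-empty and sorted by timestamp.
    """
    times = [r[0] for r in readings]
    i = bisect_left(times, t)
    # Only the readings just before and just after t can be the closest.
    candidates = readings[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda r: abs(r[0] - t))


def align(streams, reference, tolerance):
    """Build sets of data items around the reference sensor's timestamps.

    A set is kept only if every other sensor has a reading within
    `tolerance` seconds of the reference timestamp (hypothetical rule).
    """
    sets = []
    for t, ref_value in streams[reference]:
        item = {reference: ref_value}
        for name, readings in streams.items():
            if name == reference:
                continue
            rt, value = nearest(readings, t)
            if abs(rt - t) > tolerance:
                break  # this sensor has no reading close enough in time
            item[name] = value
        else:
            sets.append((t, item))
    return sets
```

In this sketch, a reference timestamp for which any sensor lacks a sufficiently close reading is simply skipped, so that each retained set describes the plasma at a single point in time.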
[0058] It will be appreciated that some of the sensor outputs 13A, such as temperature and pressure, may be of relatively simple form, but that others, such as spectrometer outputs and optical camera outputs, may be of highly complex, data-rich form.
[0059] The control unit 20 is operable to control the control parameters of the processing equipment 10, such as the operation of the coil 18 and the control metering and valve arrangement 16 (and, if desired, other control parameters associated with the processing equipment 10), in response to the received sensor information, as set out below.
[0060] Thus, the system 10 comprises: a plasma reactor 12; a plurality of sensors 13 for monitoring a wafer production process; and a control unit 20, comprising at least one processor coupled to memory. The control unit 20 further comprises a trained machine learning, ML, model (not shown). The control unit 20 is arranged to: receive, in real-time, sensor data from the plurality of sensors 13 monitoring the wafer production process; input the sensor data from the plurality of sensors into a neural network of the trained ML model; generate, using the trained ML model, a latent representation of a state of a plasma used in the wafer production process; and adjust in real-time, using the generated latent representation, at least one control parameter of the plasma reactor 12 used in the wafer production process.
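By way of illustration only, one iteration of the arrangement of the control unit 20 might be sketched as follows. The callables `read_sensors`, `encode`, `choose_adjustment` and `apply_adjustment` are hypothetical stand-ins for the sensors 13, the trained ML model, the control logic and the reactor interface respectively; none of these names, nor the dictionary and list types used, are taken from the description.

```python
from typing import Callable, Dict, List

SensorData = Dict[str, float]
Latent = List[float]
Adjustment = Dict[str, float]


def control_step(
    read_sensors: Callable[[], SensorData],
    encode: Callable[[SensorData], Latent],
    choose_adjustment: Callable[[Latent], Adjustment],
    apply_adjustment: Callable[[Adjustment], None],
) -> Latent:
    """Run one iteration: receive sensor data, generate a latent
    representation, and adjust control parameters based on it."""
    data = read_sensors()            # receive sensor data
    latent = encode(data)            # generate latent representation
    adjustment = choose_adjustment(latent)
    if adjustment:                   # an empty dict means "no change"
        apply_adjustment(adjustment)
    return latent
```

Passing the four stages in as callables keeps the sketch independent of any particular sensor hardware or model; in a real system the step would be repeated at whatever rate the sensors and the reactor permit.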
[0061] As shown in
[0062] The autoencoder may combine the sensor outputs in any suitable manner, and so data with different temporal or spatial dimensionality may be combined, if desired.
[0063]
[0064] The method may comprise inputting the sensor data from the plurality of sensors into a neural network of the trained ML model (step S102).
[0065] The method may comprise generating, using the trained ML model, a latent representation of a state of a plasma used in the wafer production process (step S104). The step of generating a latent representation of a state of a plasma used in the wafer production process may comprise: combining, using the neural network, the sensor data to generate a latent representation in real-time of the state of the plasma.
[0066] The method may further comprise: comparing the generated latent representation of the state of the plasma with a desired latent representation of an ideal state of the plasma; and identifying any difference between the generated and desired latent representations.
[0067] The method may comprise adjusting in real-time, using the generated latent representation, at least one control parameter of a plasma reactor used in the wafer production process (step S106). Preferably, adjusting at least one control parameter of a plasma reactor used in the wafer production process may comprise: determining at least one parameter of the wafer production process to adjust to minimise any identified difference between the generated latent representation and the desired latent representation; and adjusting the determined at least one parameter.
[0068] Optionally, the method may further comprise: outputting an alert to an operator of the plasma reactor when the identified difference between the generated and desired latent representations exceeds a threshold value or cannot be minimised by adjusting at least one parameter (step S108).
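By way of illustration only, the comparison, adjustment and alert steps might be sketched as follows. The sketch assumes that the latent representations are plain vectors compared by Euclidean distance, and that a hypothetical function `predict` estimates the latent representation that would result from a candidate adjustment; the distance measure, the candidate search and all names are assumptions made for the example.

```python
import math


def latent_difference(generated, desired):
    """Euclidean distance between two latent vectors of equal length."""
    return math.sqrt(sum((g - d) ** 2 for g, d in zip(generated, desired)))


def decide(generated, desired, threshold, candidates, predict):
    """Choose an adjustment and decide whether to alert the operator.

    Returns (best_adjustment, current_difference, alert). The alert is
    raised if the current difference exceeds `threshold`, or if no
    candidate adjustment is predicted to reduce a non-zero difference.
    `predict(adjustment)` is a hypothetical model of the resulting latent.
    """
    current = latent_difference(generated, desired)
    best, best_diff = None, current
    for adjustment in candidates:
        diff = latent_difference(predict(adjustment), desired)
        if diff < best_diff:
            best, best_diff = adjustment, diff
    alert = current > threshold or (current > 0 and best is None)
    return best, current, alert
```

The returned adjustment is the candidate predicted to bring the plasma closest to the desired state; when no candidate helps, `best` is `None` and the alert flag is set.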
[0069]
[0070] A computer-implemented method for training a machine learning, ML, model for controlling a wafer production process in real-time, may comprise: receiving training data comprising sensor data from a plurality of sensors monitoring a wafer production process; inputting the training data into a neural network of the ML model; and training the neural network of the ML model to generate a latent representation of a state of a plasma in a plasma reactor used in the wafer production process.
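By way of illustration only, the training of an encoder and a decoder to minimise a reconstruction difference by backpropagation might be sketched as follows. The sketch is a deliberately tiny linear autoencoder with a one-dimensional latent representation, written without any ML library and trained on synthetic three-element vectors; the network size, the learning rate, the synthetic data and all names are assumptions made for the example and do not represent the neural network of the description.

```python
def make_data(n=50):
    """Synthetic 'sensor' vectors lying on a 1-D line in 3-D space."""
    data = []
    for k in range(n):
        s = -1.0 + 2.0 * k / (n - 1)
        data.append([s, 2.0 * s, -s])
    return data


def encode(w_enc, x):
    """Encoder: combine the inputs into a single latent value."""
    return sum(w * xi for w, xi in zip(w_enc, x))


def decode(w_dec, z):
    """Decoder: reconstruct the inputs from the latent value."""
    return [w * z for w in w_dec]


def mean_loss(w_enc, w_dec, data):
    """Mean squared reconstruction error over the data set."""
    total = 0.0
    for x in data:
        x_hat = decode(w_dec, encode(w_enc, x))
        total += sum((h - xi) ** 2 for h, xi in zip(x_hat, x))
    return total / len(data)


def train(data, epochs=200, lr=0.02):
    """Train encoder and decoder weights by per-sample gradient descent."""
    w_enc = [0.1, 0.1, 0.1]
    w_dec = [0.1, 0.1, 0.1]
    for _ in range(epochs):
        for x in data:
            z = encode(w_enc, x)
            x_hat = decode(w_dec, z)
            err = [h - xi for h, xi in zip(x_hat, x)]
            # Backpropagate the squared reconstruction error.
            grad_z = sum(2.0 * e * w for e, w in zip(err, w_dec))
            grad_dec = [2.0 * e * z for e in err]
            grad_enc = [grad_z * xi for xi in x]
            w_dec = [w - lr * g for w, g in zip(w_dec, grad_dec)]
            w_enc = [w - lr * g for w, g in zip(w_enc, grad_enc)]
    return w_enc, w_dec
```

Because the synthetic vectors vary along a single direction, one latent value is sufficient to reconstruct them; real sensor data would call for a deeper, non-linear network and a larger latent representation.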
[0071] As shown in
[0072] Collecting data from the sensors to form the training data may comprise running the system 10 for many days using different plasma conditions in order to collect hundreds of thousands of data points. In particular, image and spectra pairs at a plurality of time points may be collected. The different plasma conditions represent samples of conditions across a parameter space with high dimensionality (2 electrode powers, pressure, 3 temperatures (table, wall, liners), 6-10 process gases with many possible mixtures).
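By way of illustration only, drawing plasma conditions from across such a parameter space might be sketched as follows. The sketch samples every quantity uniformly in normalised units between 0 and 1, because no physical ranges are given in the description; the uniform sampling strategy, the normalisation and the dictionary layout are all assumptions made for the example.

```python
import random


def sample_condition(rng, n_gases):
    """Draw one plasma condition in normalised units (0 to 1).

    Real ranges and units depend on the reactor and are not modelled.
    """
    # Random gas mixture: fractions of the total flow that sum to 1.
    weights = [rng.random() for _ in range(n_gases)]
    total = sum(weights)
    return {
        "electrode_powers": [rng.random(), rng.random()],
        "pressure": rng.random(),
        "temperatures": {k: rng.random() for k in ("table", "wall", "liners")},
        "gas_fractions": [w / total for w in weights],
    }


def sample_conditions(n_conditions, n_gases, seed=0):
    """Draw a reproducible list of conditions to run on the system."""
    rng = random.Random(seed)
    return [sample_condition(rng, n_gases) for _ in range(n_conditions)]
```

With six process gases, each sampled condition has twelve components (two powers, one pressure, three temperatures and six gas fractions), which gives a sense of why many runs are needed to cover the space.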
[0073]
[0074] As shown in
[0075] For data of different temporal resolution, two techniques may be used to combine the data. For example, if the input sensor data is obtained from an in-situ wafer metrology method/sensor that provides the average etch or deposition rate over tens of seconds (such as may be obtained from a full wafer interferometer), the data could be combined with all the spectra collected over that time by first passing the time-averaged metrology data through its own branch in the ML model to the deep encoder, and then applying one of the following techniques. One technique comprises passing each spectrum through the convolutional branch to extract features, passing those features through a time series network such as a long short-term memory (LSTM) network, and then passing the output of the LSTM network to the deep encoder. Another technique comprises stacking the optical emission spectra together to create a 2D spectrograph and passing this through a branch, similar to the image branch, to the deep encoder. Both of these techniques work similarly in higher or lower dimensions.
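By way of illustration only, the second technique (stacking the spectra collected over a metrology window into a 2D spectrograph) might be sketched as follows. The sketch represents each spectrum as a list of intensities and does not include the neural network branches themselves; the function names and the dictionary returned by `window_input` are assumptions made for the example.

```python
def stack_spectra(spectra):
    """Stack equal-length 1-D spectra (one per time step) into a 2-D
    spectrograph with shape (time steps, wavelength bins)."""
    if not spectra:
        raise ValueError("at least one spectrum is required")
    width = len(spectra[0])
    if any(len(s) != width for s in spectra):
        raise ValueError("all spectra must have the same number of bins")
    return [list(s) for s in spectra]


def window_input(spectra, averaged_metrology):
    """Pair a time-averaged metrology value with the spectrograph built
    from every spectrum collected over the same time window."""
    return {
        "spectrograph": stack_spectra(spectra),  # for an image-like branch
        "metrology": averaged_metrology,         # for its own branch
    }
```

The resulting spectrograph has one row per spectrum, so it can be treated in the same way as an image by a convolutional branch, while the single averaged metrology value is kept separate for its own branch.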
[0076] A root mean squared error (RMSE) is calculated between the output of the sensor deep decoder in the whole encoder and the same output of the individually pre-trained sensor encoder. This helps to guide the neural network to form a similar representation from each sensor while training, but allows the deep encoder, latent representation and deep decoder enough freedom in training to find a good representation that achieves a lower overall loss.
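By way of illustration only, the root mean squared error between corresponding outputs, and one possible way of using it during training, might be sketched as follows. The description does not state how the error enters the training objective; adding it to the reconstruction loss as a weighted penalty, the `weight` value and all names are assumptions made for the example.

```python
import math


def rmse(a, b):
    """Root mean squared error between two equal-length vectors."""
    if len(a) != len(b) or not a:
        raise ValueError("vectors must be non-empty and of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def guided_loss(reconstruction_loss, joint_outputs, pretrained_outputs,
                weight=0.1):
    """Add a per-sensor RMSE penalty to the reconstruction loss.

    `joint_outputs` and `pretrained_outputs` map a sensor name to the
    output vector from the whole network and from the individually
    pre-trained network respectively (hypothetical layout).
    """
    penalty = sum(
        rmse(joint_outputs[name], pretrained_outputs[name])
        for name in joint_outputs
    )
    return reconstruction_loss + weight * penalty
```

A small weight leaves the network free to depart from the pre-trained representations where doing so lowers the overall loss, while a larger weight holds it closer to them.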
[0077]
[0078] It can be seen from
[0079] Similarly,
[0080] Training the neural network may further comprise: inputting, into the neural network, a desired latent representation of an ideal state of the plasma; training the neural network to identify any difference between each generated latent representation and the desired latent representation; and determining at least one parameter of the wafer production process to adjust to minimise any identified difference between each generated latent representation and the desired latent representation.
[0081] The comparison of the single meaningful representation with the target representation is preferably undertaken using a reinforcement learning technique in which a reinforcement learning agent/module receives a continuous reward signal indicative of a difference between the single meaningful representation and a target representation and, whilst training, learns how adjustments to the control parameters impact the reward signal. Once trained, the reinforcement learning agent exploits its knowledge to maintain the processing equipment in a stable condition in which the products produced thereby are at an acceptable level of quality. During production, the reward signal can still be used to achieve additional training, and adjustments can be made to the control parameters to compensate for slow changes in behaviour. If a sudden change in behaviour is noted, identified by a sudden change in the reward signal, the operator may be notified and the processing equipment 10 shut down.
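By way of illustration only, a continuous reward signal and a check for a sudden change in that signal might be sketched as follows. The sketch does not implement a reinforcement learning agent; the use of negative Euclidean distance as the reward, the moving-window comparison, and the `window` and `max_drop` parameters are assumptions made for the example.

```python
import math


def reward(representation, target):
    """Continuous reward: negative Euclidean distance to the target, so
    that a representation closer to the target earns a higher reward."""
    return -math.sqrt(
        sum((r - t) ** 2 for r, t in zip(representation, target))
    )


def sudden_change(rewards, window=5, max_drop=1.0):
    """Return True if the latest reward is more than `max_drop` below
    the mean of the `window` rewards that preceded it."""
    if len(rewards) <= window:
        return False  # not enough history to judge
    recent = rewards[-window - 1:-1]
    baseline = sum(recent) / window
    return baseline - rewards[-1] > max_drop
```

Slow drift lowers the reward gradually and so leaves the check silent, which corresponds to the case handled by continued training, whereas an abrupt fall relative to recent history trips the check and could be used to notify the operator.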
[0082] It will be appreciated that, in accordance with the invention, a large number of sensor outputs may be used, substantially in real time, in controlling the operation of the processing equipment. Accordingly, variations in processing of the wafers may be quickly addressed, leading to enhanced product uniformity. Closed loop control, using the outputs of a number of sensors sensitive to a wide range of parameters or characteristics, may be achieved.
[0083] Further example embodiments and features are described in the numbered paragraphs below:
[0084] Example 1: A control method for use in controlling the processing equipment used in the processing of a wafer, the method comprising receiving sensor information from a plurality of sensors sensitive to product and/or processing characteristics, inputting the sensor information into an unsupervised machine learning or deep learning model, and using the output of the model, substantially in real time, in adjusting at least one control parameter of the processing equipment.
[0085] Example 2: The method of Example 1, wherein the processing characteristics monitored by the sensors include at least one of RF power, temperature, pressure, gas flow rate, and characteristics such as electron density, the appearance of the wafer as detected by an optical camera, and optical emission spectroscopy outputs.
[0086] Example 3: The method of Example 1, wherein the unsupervised machine learning or deep learning model comprises a neural network.
[0087] Example 4: The method of Example 3, wherein the neural network includes an autoencoder operable to merge a plurality of sensor outputs into a single meaningful representation, and to extract from that representation outputs (or adjusted inputs) suitable for use in adjusting control parameters of the processing equipment.
[0088] Example 5: The method of Example 4, wherein the autoencoder combines data with different spatial and/or temporal dimensionality.
[0089] Example 6: The method of Example 4 or Example 5, wherein certain of the autoencoder inputs are themselves the outputs from neural networks or the like.
[0090] Example 7: Processing equipment comprising a processing chamber, a plurality of sensors sensitive to product and/or processing characteristics, a control unit to which sensor information from the sensors is supplied, the control unit comprising an unsupervised machine learning or deep learning model operable to produce an output which, substantially in real time, is used to control at least one control parameter of the processing equipment.
[0091] Those skilled in the art will appreciate that while the foregoing has described what is considered to be the best mode and, where appropriate, other modes of performing the present techniques, the present techniques should not be limited to the specific configurations and methods disclosed in this description of the preferred embodiment. Those skilled in the art will recognise that the present techniques have a broad range of applications, and that the embodiments may take a wide range of modifications without departing from any inventive concept as defined in the appended claims.