Control of Processing Equipment

Abstract

Broadly speaking, the present techniques provide a method and system for controlling a wafer production process in real-time using a trained machine learning, ML, model. Advantageously, the ML model uses multiple sensed parameters to determine a state of a plasma used in the wafer production process, and this can be used to adjust at least one control parameter of a plasma reactor used in the wafer production process to reduce process variability.

Claims

1-18. (canceled)

19. A computer-implemented method for controlling a wafer production process in real-time using a trained machine learning, ML, model, the method comprising: receiving sensor data from a plurality of sensors monitoring the wafer production process in real-time; inputting the sensor data from the plurality of sensors into a neural network of the trained ML model; generating, using the trained ML model, a latent representation of a state of a plasma used in the wafer production process; and adjusting in real-time, using the generated latent representation, at least one control parameter of a plasma reactor used in the wafer production process.

20. The method as claimed in claim 19, wherein receiving sensor data comprises receiving: at least one image of the plasma used in the wafer production process, and at least one optical emission spectrograph of the plasma.

21. The method as claimed in claim 19, wherein receiving sensor data comprises receiving at least one of: RF power applied to the plasma reactor, temperature of chamber furniture inside the plasma reactor, pressure inside the plasma reactor, gas flow rate into the plasma reactor, plasma impedance, and plasma electron density.

22. The method as claimed in claim 19 wherein generating a latent representation of a state of a plasma used in the wafer production process comprises: combining, using the neural network, the sensor data to generate a latent representation in real-time of the state of the plasma.

23. The method as claimed in claim 19 wherein the neural network comprises an autoencoder.

24. The method as claimed in claim 19 further comprising: comparing the generated latent representation of the state of the plasma with a desired latent representation of an ideal state of the plasma; and identifying any difference between the generated and desired latent representations.

25. The method as claimed in claim 24 wherein adjusting at least one control parameter of a plasma reactor used in the wafer production process comprises: determining at least one parameter of the wafer production process to adjust to minimize any identified difference between the generated latent representation and the desired latent representation; and adjusting the determined at least one parameter.

26. The method as claimed in claim 24 further comprising: outputting an alert to an operator of the plasma reactor when the identified difference between the generated and desired latent representations exceeds a threshold value or cannot be minimized by adjusting at least one parameter.

27. The method as claimed in claim 22, wherein combining the sensor data comprises combining sensor data having different spatial and/or temporal dimensionality.

28. A computer-implemented method for training a machine learning, ML, model for controlling a wafer production process in real-time, the method comprising: receiving training data comprising sensor data from a plurality of sensors monitoring a wafer production process; inputting the training data into a neural network of the ML model; and training the neural network of the ML model to generate a latent representation of a state of a plasma in a plasma reactor used in the wafer production process.

29. The method as claimed in claim 28 wherein receiving training data comprises receiving a plurality of sets of data items, wherein each set of data items comprises an image of the plasma and an optical emission spectrograph of the plasma, and wherein for each set of data items the data items are collected at the same point in time.

30. The method as claimed in claim 29 wherein each set of data items further comprises at least one of: RF power applied to the plasma reactor, temperature inside the plasma reactor, pressure inside the plasma reactor, gas flow rate into the plasma reactor, plasma impedance, and plasma electron density.

31. The method as claimed in claim 29 wherein training the neural network comprises training an encoder of the neural network to: combine each set of data items to generate a latent representation of the state of the plasma at a particular point in time.

32. The method as claimed in claim 31 wherein training the neural network further comprises training a decoder of the neural network to: reconstruct, from the generated latent representation, a set of data items corresponding to the generated latent representation; and minimize, using backpropagation, a difference between the set of data items and the reconstructed set of data items.

33. The method as claimed in claim 28 wherein training the neural network further comprises: inputting, into the neural network, a desired latent representation of an ideal state of the plasma; training the neural network to identify any difference between each generated latent representation and the desired latent representation; and determining at least one parameter of the wafer production process to adjust to minimize any identified difference between each generated latent representation and the desired latent representation.

34. A system for wafer production, the system comprising: a plasma reactor; a plurality of sensors for monitoring a wafer production process; and a control unit, comprising at least one processor coupled to memory and comprising a trained machine learning, ML, model, wherein the control unit is arranged to: receive, in real-time, sensor data from the plurality of sensors monitoring the wafer production process; input the sensor data from the plurality of sensors into a neural network of the trained ML model; generate, using the trained ML model, a latent representation of a state of a plasma used in the wafer production process; and adjust in real-time, using the generated latent representation, at least one control parameter of a plasma reactor used in the wafer production process.

35. The system as claimed in claim 34 wherein the plurality of sensors comprises any one or more of: a temperature sensor, a pressure sensor, an imaging device, in situ wafer metrology equipment, a spectrometer, optical emission spectroscopy equipment, a radio-frequency sensor, a photodiode, a microwave probe, a flow rate sensor.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0043] The invention will further be described, by way of example, with reference to the accompanying drawings, in which:

[0044] FIG. 1 is a schematic block diagram of a system for wafer production;

[0045] FIG. 2 is a flow chart illustrating example steps to control a wafer production process in real-time using a trained machine learning model;

[0046] FIG. 3 is a diagrammatic representation of part of the control arrangement;

[0047] FIG. 4A is a schematic diagram illustrating an example machine learning model for use in controlling a wafer production process in real-time;

[0048] FIG. 4B is a schematic diagram illustrating a further example machine learning model for use in controlling a wafer production process in real-time; and

[0049] FIG. 5 shows an experimental data sweep pattern used to collect data for training the machine learning model.

DETAILED DESCRIPTION OF THE DRAWINGS

[0050] Broadly speaking, the present techniques provide a method and system for controlling a wafer production process in real-time using a trained machine learning, ML, model. Advantageously, the ML model uses multiple sensed parameters to determine a state of a plasma used in the wafer production process, and this can be used to adjust at least one control parameter of a plasma reactor used in the wafer production process to reduce process variability.

[0051] FIG. 1 is a schematic block diagram of a system 10 for wafer production (also referred to herein as “wafer processing equipment”). The system 10 comprises a processing chamber or plasma reactor 12 within which a wafer to be processed is located, in use. The terms “processing chamber” and “plasma reactor” are used interchangeably herein. A process gas is supplied to the processing chamber 12 from a source 14. A control metering and valve arrangement 16 is operable to control and monitor the rate at which the process gas is supplied to the processing chamber 12. An excitation coil 18 surrounds the processing chamber 12. It will be appreciated that by applying a suitable varying signal to the excitation coil 18, whilst delivering controlled pulses of the process gas to the processing chamber 12, plasma etching of the wafer located within the processing chamber 12 or plasma deposition may be achieved in a controlled manner. Plasma etching and/or deposition in this manner is well known and so is not described herein in further detail.

[0052] The system 10 may comprise a number of sensors 13 associated with the processing chamber 12. Outputs 13a of the sensors are supplied to a control unit 20, for example in the form of a suitably programmed computer. Whilst a suitably programmed computer is described as constituting the control unit 20, it will be appreciated that the control unit 20 could take other forms, and may comprise a device specifically designed for use in the control of the processing equipment 10. The control unit 20 may comprise at least one processor coupled to memory. The at least one processor may comprise one or more of: a microprocessor, a microcontroller, and an integrated circuit. The memory may comprise volatile memory, such as random access memory (RAM), for use as temporary memory, and/or non-volatile memory such as Flash, read only memory (ROM), or electrically erasable programmable ROM (EEPROM), for storing data, programs, or instructions, for example.

[0053] The sensors 13 are sensitive to a number of parameters associated with the processing chamber 12. The sensors 13 may comprise any one or more of: temperature and pressure sensors 22 sensitive to temperature and pressure conditions within the processing chamber 12, an optical camera 24 positioned to allow monitoring of the appearance of the wafer, in situ wafer metrology equipment 26, a spectrometer 28, and other optical monitors or sensors 30. In addition, the control unit 20 is supplied with flow rate information from the process gas control metering and valve arrangement 16, and impedance, phase and voltage information.

[0054] The sensors 13 may comprise a number of sensors for measuring properties of the plasma. The sensors may comprise an imaging device (e.g. camera or RGB camera) for imaging the plasma within the processing chamber 12, optical emission spectrography equipment, radio-frequency sensors, a photodiode, and/or microwave probes.

[0055] The sensors 13 may comprise one or more in-situ metrology sensors for determining properties of the designated metrology wafer in a batch of wafers. The metrology sensor may be a full wafer interferometer and/or a spectral reflectometer.

[0056] The sensors 13 may comprise one or more sensors for measuring properties of the processing chamber, such as pressure, voltage, temperature, and so on.

[0057] Preferably, to generate an accurate latent representation of the plasma at a given point in time, the sensor data is collected simultaneously from multiple sensors. Sensor data may be collected at regular time intervals, or after particular processing steps have been performed, for example.

[0058] It will be appreciated that some of the sensor outputs 13a such as temperature and pressure may be of relatively simple form, but that others such as spectrometer outputs and optical camera outputs may be of highly complex, data rich form.

[0059] The control unit 20 is operable, as described below, to control the control parameters of the processing equipment 10, such as the operation of the coil 18 and the control metering and valve arrangement 16 (and if desired, other control parameters associated with the processing equipment 10) in response to the received sensor information as set out below.

[0060] Thus, the system 10 comprises: a plasma reactor 12; a plurality of sensors 13 for monitoring a wafer production process; and a control unit 20, comprising at least one processor coupled to memory. The control unit 20 further comprises a trained machine learning, ML, model (not shown). The control unit 20 is arranged to: receive, in real-time, sensor data from the plurality of sensors 13 monitoring the wafer production process; input the sensor data from the plurality of sensors into a neural network of the trained ML model; generate, using the trained ML model, a latent representation of a state of a plasma used in the wafer production process; and adjust in real-time, using the generated latent representation, at least one control parameter of the plasma reactor 12 used in the wafer production process.

[0061] As shown in FIG. 3, the machine learning model of the control unit 20 may be an unsupervised machine learning model or deep learning model. The neural network of the ML model may comprise an autoencoder 32 defining an encoder 34 in which the various sensor outputs 13a are combined with one another to form a single meaningful representation (i.e. the latent representation of the state of the plasma) which can be compared with an ideal, desired or target representation, and a decoder 36. The decoder 36 tries to reconstruct the inputs from the generated latent representation during training of the ML model. The decoder is therefore used to reduce an error between the reconstructed inputs and the original input data used to generate the latent representation, as part of the model training process. After the ML model has been trained, the control unit 20 uses the generated latent representation produced by the encoder 34 to control or adjust the control parameters of the processing equipment, such as the gas flow rate as controlled by the control metering and valve arrangement 16. In this manner, it will be appreciated that wafer processing can be controlled, substantially in real time, to compensate for variations in the manner in which the equipment is operating and variations in the wafers being processed, to achieve a good level of product uniformity and to reduce the level of waste produced through the processing equipment producing products of unacceptable quality.
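Purely by way of illustration, the encoder/decoder arrangement described above may be sketched as a toy linear autoencoder trained by gradient descent on the reconstruction error. This sketch is not the described embodiment: the function names (`matvec`, `train_autoencoder`, `encode`), the linear layers, the learning rate and the flat list of sensor features are all assumptions made for the example, whereas the described model uses convolutional and deep encoder branches.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def train_autoencoder(samples, latent_dim, epochs=300, lr=0.05, seed=0):
    """Train a toy linear autoencoder; returns (encoder, decoder, losses).

    Hypothetical helper: each sample is a flat list of sensor features.
    """
    rng = random.Random(seed)
    n_in = len(samples[0])
    enc = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(latent_dim)]
    dec = [[rng.uniform(-0.5, 0.5) for _ in range(latent_dim)] for _ in range(n_in)]
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x in samples:
            z = matvec(enc, x)          # encoder: inputs -> latent representation
            x_hat = matvec(dec, z)      # decoder: latent -> reconstructed inputs
            err = [xh - xi for xh, xi in zip(x_hat, x)]
            total += sum(e * e for e in err)
            # Backpropagate the reconstruction error through the decoder.
            grad_z = [sum(dec[i][j] * err[i] for i in range(n_in))
                      for j in range(latent_dim)]
            for i in range(n_in):
                for j in range(latent_dim):
                    dec[i][j] -= lr * err[i] * z[j]
            for j in range(latent_dim):
                for k in range(n_in):
                    enc[j][k] -= lr * grad_z[j] * x[k]
        losses.append(total / len(samples))  # mean squared reconstruction error
    return enc, dec, losses


def encode(enc, x):
    """At run time only the encoder is needed to obtain the latent vector."""
    return matvec(enc, x)
```

In this sketch the decoder is used only while training, mirroring the description above, and `encode` alone would be called at inference.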

[0062] The autoencoder may combine the sensor outputs in any suitable manner, and so data with different temporal or spatial dimensionality may be combined, if desired.

[0063] FIG. 2 is a flow chart illustrating example steps to control a wafer production process in real-time using a trained machine learning model. The computer-implemented method comprises: receiving sensor data from a plurality of sensors monitoring the wafer production process in real-time (step S100). Receiving sensor data may comprise receiving: at least one image of the plasma used in the wafer production process, and at least one optical emission spectrograph of the plasma. Additionally or alternatively, the step of receiving sensor data may comprise receiving at least one of: RF power applied to the plasma reactor, temperature inside the plasma reactor, pressure inside the plasma reactor, gas flow rate into the plasma reactor, plasma impedance, and plasma electron density.

[0064] The method may comprise inputting the sensor data from the plurality of sensors into a neural network of the trained ML model (step S102).

[0065] The method may comprise generating, using the trained ML model, a latent representation of a state of a plasma used in the wafer production process (step S104). The step of generating a latent representation of a state of a plasma used in the wafer production process may comprise: combining, using the neural network, the sensor data to generate a latent representation in real-time of the state of the plasma.

[0066] The method may further comprise: comparing the generated latent representation of the state of the plasma with a desired latent representation of an ideal state of the plasma; and identifying any difference between the generated and desired latent representations.

[0067] The method may comprise adjusting in real-time, using the generated latent representation, at least one control parameter of a plasma reactor used in the wafer production process (step S106). Preferably, adjusting at least one control parameter of a plasma reactor used in the wafer production process may comprise: determining at least one parameter of the wafer production process to adjust to minimise any identified difference between the generated latent representation and the desired latent representation; and adjusting the determined at least one parameter.

[0068] Optionally, the method may further comprise: outputting an alert to an operator of the plasma reactor when the identified difference between the generated and desired latent representations exceeds a threshold value or cannot be minimised by adjusting at least one parameter (step S108).
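As a minimal sketch of the comparison and alert steps above, the difference between the generated and desired latent representations may be expressed as a distance and checked against thresholds. The Euclidean distance, the function names and the `tolerance` and `alert_threshold` parameters are assumptions for illustration only; the description does not prescribe a particular distance measure, and the case where the difference cannot be minimised is not covered here.

```python
import math


def latent_distance(generated, desired):
    """Euclidean distance between two latent vectors of equal length."""
    if len(generated) != len(desired):
        raise ValueError("latent vectors must have the same length")
    return math.sqrt(sum((g - d) ** 2 for g, d in zip(generated, desired)))


def control_decision(generated, desired, alert_threshold, tolerance):
    """Return ('alert' | 'adjust' | 'hold', distance).

    Hypothetical policy: alert the operator above alert_threshold, adjust a
    control parameter above tolerance, otherwise leave the process alone.
    """
    distance = latent_distance(generated, desired)
    if distance > alert_threshold:
        return "alert", distance
    if distance > tolerance:
        return "adjust", distance
    return "hold", distance
```

For example, `control_decision([0.0, 0.0], [3.0, 4.0], alert_threshold=10.0, tolerance=1.0)` yields an "adjust" decision with a distance of 5.0.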

[0069] FIG. 4A is a schematic diagram illustrating an example machine learning model for use in controlling a wafer production process in real-time. In this example, images and spectra (e.g. optical emission spectra) are input into the model to determine the latent representation of the state of the plasma, both during training of the model and during inference. Only the left-hand side of the model is used during inference (i.e. during run-time). The left-hand side shows an encoder portion of the neural network of the ML model, which is used to generate the latent representation. The right-hand side shows a decoder portion of the neural network, which is used during training of the model.

[0070] A computer-implemented method for training a machine learning, ML, model for controlling a wafer production process in real-time, may comprise: receiving training data comprising sensor data from a plurality of sensors monitoring a wafer production process; inputting the training data into a neural network of the ML model; and training the neural network of the ML model to generate a latent representation of a state of a plasma in a plasma reactor used in the wafer production process.

[0071] As shown in FIG. 4A, receiving training data may comprise receiving a plurality of sets of data items, wherein each set of data items comprises an image of the plasma and an optical emission spectrograph of the plasma. For each set of data items the data items are collected at the same point in time. This enables a more accurate representation of the state of a plasma at a given point in time to be generated.

[0072] Collecting data from the sensors to form the training data may comprise running the system 10 for many days using different plasma conditions in order to collect hundreds of thousands of data points. In particular, image and spectra pairs at a plurality of time points may be collected. The different plasma conditions represent samples of conditions across a parameter space with high dimensionality (2 electrode powers, pressure, 3 temperatures (table, wall, liners), 6-10 process gases with many possible mixtures). FIG. 5 shows an experimental data sweep pattern used to collect data for training the machine learning model. A Sobol sequence may be used to generate a quasi-random sequence of data points to sample across the space efficiently and then sweep across the parameter space (as per the sweep plot shown in FIG. 5) to collect data. The sweep may be performed every 8 seconds, for example, where the parameters are changed at the same frequency.
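To illustrate the quasi-random sampling mentioned above, the following sketch generates an unscrambled two-dimensional Sobol sequence in Gray-code order and scales it to parameter ranges. It is restricted to two dimensions for brevity, whereas the parameter space described above has many more; the function names and the example ranges are hypothetical.

```python
BITS = 30  # resolution of the fixed-point Sobol state


def sobol_2d(n_points):
    """Return the first n_points of a 2D Sobol sequence in the unit square."""
    # Direction numbers are v_k = m_k * 2**(BITS - k) for k = 1..BITS.
    # Dimension 1: m_k = 1 (van der Corput sequence in base 2).
    # Dimension 2: primitive polynomial x + 1, giving
    #              m_k = (2 * m_{k-1}) XOR m_{k-1} with m_1 = 1.
    m1 = [1] * BITS
    m2 = [1]
    for _ in range(BITS - 1):
        m2.append((2 * m2[-1]) ^ m2[-1])
    v = [[m << (BITS - k) for k, m in enumerate(ms, start=1)] for ms in (m1, m2)]

    state = [0, 0]
    points = []
    for n in range(n_points):
        points.append(tuple(s / 2 ** BITS for s in state))
        # Gray-code update: use the index of the rightmost zero bit of n.
        c = 0
        while (n >> c) & 1:
            c += 1
        state = [s ^ v[d][c] for d, s in enumerate(state)]
    return points


def scale(points, bounds):
    """Map unit-square points onto (low, high) ranges, one range per dimension."""
    return [tuple(lo + u * (hi - lo) for u, (lo, hi) in zip(p, bounds))
            for p in points]


# Hypothetical ranges for two process parameters (units are illustrative only).
set_points = scale(sobol_2d(4), [(100.0, 500.0), (1.0, 9.0)])
```

Each resulting set point could then be held for one sweep interval while the sensor data are recorded.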

[0073] FIG. 4A shows the connections in the autoencoder. It has been found that training the whole model simultaneously does not work, because one branch may train and dominate all other parts and branches of the model. Therefore, it has been determined that each sensor branch in FIG. 4A may need to be trained individually. Neural network weights determined after each sensor branch has been trained may then be transferred to the complete autoencoder.

[0074] As shown in FIG. 4A, each input sensor data item is first dealt with separately by the encoder of the ML model. For example, the image data may be an RGB image that has low spectral resolution and high spatial resolution, and the spectral data may be a spectrum that is a spatial average with high spectral resolution. A convolutional encoder of the ML model may branch to learn to extract features from each data item separately, as shown by the branches in FIG. 4A, and a deep encoder of the ML model may learn to combine the extracted features. Any suitable techniques may be used to perform feature extraction.

[0075] For data with different temporal resolutions, two techniques may be used to combine the data. For example, if the input sensor data is obtained from an in-situ wafer metrology method/sensor that provides the average etch or deposition rate over tens of seconds (such as what may be obtained from a full wafer interferometer), the data could be combined with all the spectra collected over that time by first passing the time averaged metrology data through its own branch in the ML model to the deep encoder, and then applying one of the following techniques. One technique comprises passing each spectrum through the convolutional branch to extract features, passing those features through a time series network like a long short-term memory (LSTM) network, and then passing the output of the LSTM network to the deep encoder. Another technique comprises stacking the optical emission spectra together to create a 2D spectrograph and passing this through a branch, similar to the image branch, to the deep encoder. Both of these techniques work similarly in higher or lower dimensions.
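The second technique above, stacking the spectra collected during one metrology averaging window into a 2D spectrograph, may be sketched as follows. The function name `build_samples` and the tuple layouts for the time-stamped spectra and metrology readings are assumptions made for the example.

```python
def build_samples(spectra, metrology):
    """Pair each metrology reading with a 2D spectrograph of its time window.

    spectra:   list of (timestamp_s, [intensity per wavelength bin])
    metrology: list of (start_s, end_s, average_rate)
    Returns a list of (spectrograph, average_rate), where the spectrograph is
    a list of rows (time steps) by columns (wavelength bins).
    """
    ordered = sorted(spectra, key=lambda item: item[0])
    samples = []
    for start, end, rate in metrology:
        # Stack every spectrum whose timestamp falls inside this window.
        spectrograph = [list(values) for t, values in ordered if start <= t < end]
        if spectrograph:
            samples.append((spectrograph, rate))
    return samples
```

Each spectrograph could then be passed through an image-like branch, with the paired rate passed through its own branch.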

[0076] A Root Mean Squared Error of the output of the sensor deep decoder in the whole autoencoder is calculated and compared to the same output of the individually pre-trained sensor encoder. This helps to guide the neural network to form a similar representation from each sensor while training, but allows the deep encoder, latent representation and deep decoder enough freedom in training to find a good representation that reaches an overall lower loss.
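One possible reading of the above is that the root mean squared error acts as an additional term that keeps each branch of the complete model close to its pre-trained counterpart. The sketch below follows that reading; the additive combination, the `weight` parameter and the dictionary-of-branches layout are assumptions and are not stated in the description.

```python
import math


def rmse(a, b):
    """Root mean squared error between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def total_loss(reconstruction_loss, full_outputs, pretrained_outputs, weight=0.1):
    """Reconstruction loss plus a weighted per-branch consistency term.

    full_outputs / pretrained_outputs: hypothetical dicts mapping a branch
    name (e.g. 'image', 'spectrum') to that branch's output vector.
    """
    consistency = sum(rmse(full_outputs[name], pretrained_outputs[name])
                      for name in full_outputs)
    return reconstruction_loss + weight * consistency
```

A small `weight` leaves the deep encoder and decoder free to depart from the pre-trained branch outputs where that lowers the overall loss.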

[0077] FIG. 4B is a schematic diagram illustrating a further example machine learning model for use in controlling a wafer production process in real-time. This shows how additional sensor data may be used to generate the latent representation, both during training and inference. Therefore, each set of data items used to train the model (and at inference time) may further comprise at least one of: RF power applied to the plasma reactor, temperature inside the plasma reactor, pressure inside the plasma reactor, gas flow rate into the plasma reactor, plasma impedance, and plasma electron density.

[0078] It can be seen from FIGS. 4A and 4B that training the neural network may comprise training an encoder of the neural network to: combine each set of data items to generate a latent representation of the state of the plasma at a particular point in time.

[0079] Similarly, FIGS. 4A and 4B show how training the neural network may further comprise training a decoder of the neural network to: reconstruct, from the generated latent representation, a set of data items corresponding to the generated latent representation; and minimise, using backpropagation, a difference between the set of data items and the reconstructed set of data items.

[0080] Training the neural network may further comprise: inputting, into the neural network, a desired latent representation of an ideal state of the plasma; training the neural network to identify any difference between each generated latent representation and the desired latent representation; and determining at least one parameter of the wafer production process to adjust to minimise any identified difference between each generated latent representation and the desired latent representation.

[0081] The comparison of the single meaningful representation with the target representation is preferably undertaken using a reinforcement learning technique in which a reinforcement learning agent/module receives a continuous reward signal indicative of a difference between the single meaningful representation and a target representation and, whilst training, learns how adjustments to the control parameters impact upon the reward signal. Once trained, the reinforcement learning agent exploits its knowledge to maintain the processing equipment in a stable condition in which the products produced thereby are at an acceptable level of quality. During production, the reward signal can still be used to achieve additional training, and adjustments made to the control parameters to adjust for slow changes in behaviour. If sudden changes in behaviour are noted, identified by a sudden change in the reward signal, the operator may be notified and the processing equipment 10 shut down.
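By way of a hedged example, a continuous reward signal of the kind described above could be taken as the negative distance between the generated and target representations, with a sudden change flagged when the reward departs sharply from its recent average. The class name `RewardMonitor`, the window length and the `max_change` threshold are hypothetical, and the reinforcement learning agent itself is not shown.

```python
import math
from collections import deque


class RewardMonitor:
    """Tracks a reward signal and flags sudden changes (illustrative only)."""

    def __init__(self, window=5, max_change=1.0):
        self.history = deque(maxlen=window)  # most recent rewards
        self.max_change = max_change         # assumed threshold for "sudden"

    def update(self, generated, target):
        """Return (reward, sudden_change) for the latest latent vector."""
        # Reward is higher (closer to zero) when the plasma state is nearer
        # to the target representation.
        reward = -math.sqrt(sum((g - t) ** 2 for g, t in zip(generated, target)))
        sudden = False
        if self.history:
            baseline = sum(self.history) / len(self.history)
            sudden = abs(reward - baseline) > self.max_change
        self.history.append(reward)
        return reward, sudden
```

A `True` flag from `update` would correspond to the case in which the operator is notified.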

[0082] It will be appreciated that, in accordance with the invention, a large number of sensor outputs may be used, substantially in real time, in controlling the operation of the processing equipment. Accordingly, variations in processing of the wafers may be quickly addressed, leading to enhanced product uniformity. Closed loop control, using the outputs of a number of sensors sensitive to a wide range of parameters or characteristics may be achieved.

[0083] Further example embodiments and features are described in the numbered paragraphs below:

[0084] Example 1: A control method for use in controlling the processing equipment used in the processing of a wafer, the method comprising receiving sensor information from a plurality of sensors sensitive to product and/or processing characteristics, inputting the sensor information into an unsupervised machine learning or deep learning model, and using the output of the model, substantially in real time, in adjusting at least one control parameter of the processing equipment.

[0085] Example 2: The method of Example 1, wherein the processing characteristics monitored by the sensors include at least one of RF power, temperature, pressure, gas flow rate, and characteristics such as electron density, the appearance of the wafer as detected by an optical camera, and optical emission spectroscopy outputs.

[0086] Example 3: The method of Example 1, wherein the unsupervised machine learning or deep learning model comprises a neural network.

[0087] Example 4: The method of Example 3, wherein the neural network includes an autoencoder operable to merge a plurality of sensor outputs into a single meaningful representation, and to extract from that representation outputs (or adjusted inputs) suitable for use in adjusting control parameters of the processing equipment.

[0088] Example 5: The method of Example 4, wherein the autoencoder combines data with different spatial and/or temporal dimensionality.

[0089] Example 6: The method of Example 4 or Example 5, wherein certain of the autoencoder inputs are themselves the outputs from neural networks or the like.

[0090] Example 7: Processing equipment comprising a processing chamber, a plurality of sensors sensitive to product and/or processing characteristics, a control unit to which sensor information from the sensors is supplied, the control unit comprising an unsupervised machine learning or deep learning model operable to produce an output which, substantially in real time, is used to control at least one control parameter of the processing equipment.

[0091] Those skilled in the art will appreciate that while the foregoing has described what is considered to be the best mode and where appropriate other modes of performing present techniques, the present techniques should not be limited to the specific configurations and methods disclosed in this description of the preferred embodiment. Those skilled in the art will recognise that present techniques have a broad range of applications, and that the embodiments may take a wide range of modifications without departing from any inventive concept as defined in the appended claims.

CPC classification

Y02P90/02 (General tagging of new technological developments; general tagging of cross-sectional technologies spanning over several sections of the IPC; technical subjects covered by former USPC cross-reference art collections [XRACs] and digests)
G05B2219/32187 (Physics)
G05B2219/32181 (Physics)
H01J37/32935 (Electricity)
H01J37/32926 (Electricity)
G05B19/41875 (Physics)
G05B2219/32188 (Physics)
H01J37/3299 (Electricity)
H01L21/67069 (Electricity)
H01J37/32917 (Electricity)
G05B2219/45031 (Physics)

International classification

H01J37/32 (Electricity)
H01L21/67 (Electricity)
G05B19/418 (Physics)
