METHOD AND SYSTEM PERFORMING PATTERN CLUSTERING
20230230348 · 2023-07-20
Inventors
- In Huh (Seoul, KR)
- Younggu KIM (Hwaseong-si, KR)
- Changwook Jeong (Hwaseong-si, KR)
- Jaemyung Choe (Seoul, KR)
CPC classification
G06V10/762
PHYSICS
G06V10/774
PHYSICS
International classification
G06V10/762
PHYSICS
G06V10/77
PHYSICS
Abstract
A method of clustering patterns of an integrated circuit includes: providing a pattern image and numeric data, as input data corresponding to a first pattern, to a first model, wherein the first model is trained by a plurality of sample images and a plurality of sample values; obtaining a content latent variable using the first model; and grouping a plurality of content latent variables corresponding to a plurality of patterns into a plurality of clusters based on a Euclidean distance, wherein the numeric data represents at least one attribute of the first pattern.
Claims
1. A method of clustering patterns of an integrated circuit, the method comprising: for each pattern among a plurality of patterns, providing a pattern image to a first model, wherein the first model is trained using a plurality of sample images; and generating a content latent variable and a pose latent variable using the first model in response to the pattern image, wherein the pose latent variable has a value corresponding to a Euclidean transform of each pattern; and thereafter, grouping a plurality of content latent variables, each respectively corresponding to a pattern among the plurality of patterns, into a plurality of clusters corresponding to the plurality of patterns based on a Euclidean distance.
2. The method of claim 1, wherein the pose latent variable includes a vector including eight real numbers respectively corresponding to eight transforms included in a dihedral group.
3. The method of claim 1, further comprising: providing the content latent variable to a second model, wherein the second model is trained using a plurality of sample content latent variables; generating a test pattern image using the second model; and training at least one of the first model and the second model using the test pattern image.
4. The method of claim 3, further comprising: generating a coordinate in response to the pattern image; identifying a Euclidean transform of the pattern image by sampling a probability distribution of the pose latent variable; transforming the coordinate in response to the Euclidean transform of the pattern image to generate a transform coordinate; and providing the transform coordinate to the second model, and the generating of the test pattern image using the second model is performed in response to the transform coordinate.
5. The method of claim 1, further comprising: providing numeric data to the first model, wherein the numeric data represents at least one attribute of a pattern among the plurality of patterns, and the generating of the content latent variable and the pose latent variable using the first model is performed in response to the pattern image and the numeric data.
6. (canceled)
7. The method of claim 1, further comprising: extracting at least one feature of the pattern image; and extracting at least one main feature from the at least one feature in response to the content latent variable.
8. The method of claim 7, wherein the extracting of the at least one main feature includes: calculating first correlation coefficients between the at least one feature and the plurality of clusters; and extracting the at least one main feature distinguishing the plurality of clusters in response to the first correlation coefficients.
9. The method of claim 7, wherein the extracting of the at least one main feature includes: calculating second correlation coefficients between the at least one feature and content latent variables included in one cluster among the plurality of clusters; and extracting the at least one main feature distinguishing the content latent variables included in the one cluster in response to the second correlation coefficients.
10. The method of claim 1, further comprising: obtaining verification data associated with the pattern image; and calculating correlation coefficients between the verification data and the plurality of clusters.
11. The method of claim 10, further comprising: visualizing the plurality of clusters and the verification data.
12. A method of clustering patterns of an integrated circuit, the method comprising: providing a pattern image and numeric data, as input data corresponding to a first pattern to a first model, wherein the first model is trained by a plurality of sample images and a plurality of sample values; obtaining a content latent variable using the first model; and grouping a plurality of content latent variables corresponding to a plurality of patterns into a plurality of clusters based on a Euclidean distance, wherein the numeric data represents at least one attribute of the first pattern.
13. The method of claim 12, further comprising: extracting a plurality of features in response to at least one of the pattern image and the numeric data; and extracting at least one main feature from the plurality of features and the plurality of clusters.
14. The method of claim 13, wherein the extracting of the at least one main feature comprises: calculating first correlation coefficients between the plurality of features and the plurality of clusters; and extracting the at least one main feature distinguishing the plurality of clusters in response to the first correlation coefficients.
15. The method of claim 14, wherein the extracting of the at least one main feature further comprises: calculating second correlation coefficients between the plurality of features and content latent variables included in one cluster among the plurality of clusters; and extracting the at least one main feature distinguishing the content latent variables included in the one cluster in response to the second correlation coefficients.
16. The method of claim 13, further comprising: visualizing the plurality of clusters and the at least one main feature.
17. A system comprising: at least one processor; and a non-transitory storage medium storing instructions, which when executed by the at least one processor, allows the at least one processor to cluster patterns of an integrated circuit, wherein clustering of patterns comprises: providing a pattern image corresponding to a first pattern to a first model, wherein the first model is trained by a plurality of sample images; obtaining a first content latent variable and a first pose latent variable from the first model; and grouping a plurality of content latent variables corresponding to a plurality of patterns into a plurality of clusters based on a Euclidean distance, wherein the first pose latent variable represents a Euclidean transform of the first pattern with respect to a representative pattern of a first cluster including the first content latent variable.
18. The system of claim 17, wherein the clustering of patterns further comprises: providing the first content latent variable to a second model, wherein the second model is trained by a plurality of sample content latent variables; obtaining a first test pattern image from the second model; and training the first model and the second model based on the pattern image and the first test pattern image.
19. The system of claim 18, wherein the second model is trained by the plurality of sample content latent variables and a plurality of sample coordinates, and the clustering of patterns further comprises: generating a first coordinate corresponding to the pattern image; identifying the Euclidean transform of the first pattern by sampling a probability distribution of the first pose latent variable; transforming the first coordinate based on the Euclidean transform of the first pattern to generate a transformed first coordinate; and providing the transformed first coordinate to the second model.
20. The system of claim 17, wherein the first model is trained using the plurality of sample images and a plurality of sample values, and the clustering of patterns further comprises providing first numeric data representing at least one attribute of the first pattern to the first model.
21. The system of claim 17, wherein the clustering of patterns further comprises: obtaining verification data of the first pattern; and calculating correlation coefficients between the verification data and the plurality of clusters.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0008] Advantages, benefits, and features, as well as the making and use of the inventive concept may be more clearly understood upon consideration of the following detailed description together with the accompanying drawings, in which:
DETAILED DESCRIPTION
[0021] Throughout the written description and drawings, like reference numbers and labels are used to denote like or similar elements, components, features and/or method steps.
[0022] Figure (
[0023] In some embodiments, pattern clustering, like the example illustrated in
[0024] As noted above, as semiconductor processes continue to develop and/or evolve, it has become impractical to fully analyze the great number of patterns associated with an integrated circuit having a high degree of integration. Accordingly, the integrated circuit may be analyzed by clustering (or grouping) patterns associated with the integrated circuit based on (or in relation to) attribute(s) distinguishing respective clusters, a dispersion or tendency of attribute(s) extracted from patterns included in a cluster, etc.
[0025] For example, clustering of patterns may be accomplished in relation to a distance or similarity identified between patterns. That is, patterns having a relatively short distance from an arbitrary point or a high degree of attribute similarity may be grouped together in a cluster. In this regard, patterns corresponding to latent variables close to each other in latent space may be deemed to have similar attributes. As will be described hereafter in some additional detail, the pattern clustering system 10 may effectively “map” a pattern to one point in latent space (e.g., a representation space). That is, the pattern clustering system 10 may map a pattern to a latent variable, and then “group” patterns having a relatively low Euclidean distance in latent space into a cluster. In this manner, patterns associated with an integrated circuit may be effectively “clustered,” wherein each resulting cluster may exhibit or be characterized by at least one attribute. Thereafter, various patterns may be more readily analyzed based on clusters, thereby allowing even very densely integrated circuits to be efficiently analyzed and verified.
[0026] Referring to
[0027] Using this nomenclature, a layer or an image requiring analysis may be extracted from data defining the layout of the integrated circuit. In this regard, the pattern image x.sub.pattern may be obtained by sub-dividing the extracted layer using a window of predetermined size. Thus, the pattern image x.sub.pattern may include two-dimensional (2D) geometric information. In some embodiments, when a pattern includes structures formed in relation to a number of layers, the pattern image x.sub.pattern may include multiple 2D arrays, wherein each of the 2D arrays corresponds to a respective layer. For example, assuming a 2D array is arranged in a matrix of 50×50 pixels, corresponding pattern image(s) x.sub.pattern for each layer may have a size of 50×50.
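By way of a non-limiting illustration, the sub-division of an extracted layer into pattern images using a fixed-size window may be sketched as follows (Python with NumPy; the function name, the 50×50 window, and the non-overlapping stride are illustrative assumptions, not requirements of the embodiments):

```python
import numpy as np

def extract_pattern_images(layer: np.ndarray, window: int = 50, stride: int = 50):
    """Sub-divide a 2D layer array into window x window pattern images."""
    h, w = layer.shape
    patterns = []
    for r in range(0, h - window + 1, stride):
        for c in range(0, w - window + 1, stride):
            patterns.append(layer[r:r + window, c:c + window])
    return np.stack(patterns)

layer = np.zeros((200, 200))           # a single extracted layer (illustrative size)
x_pattern = extract_pattern_images(layer)
print(x_pattern.shape)                 # (16, 50, 50): sixteen 50x50 pattern images
```

For a multi-layer pattern, the same windowing would be applied per layer and the resulting 2D arrays stacked along an additional axis.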
[0028] Referring to
[0029] As conceptually illustrated in
[0030] In some embodiments, the first model 11 may include mutually independent models. For example, the first model 11 may include one model trained to receive the pattern image x.sub.pattern and generate (or “output”) a first content latent variable and a pose latent variable z.sub.pose, and another model trained to receive numeric data x.sub.numeric and output a second content latent variable. Accordingly, the first model 11 may generate a content latent variable z.sub.content based on operations using the first content latent variable and the second content latent variable. In this regard, the first model 11 may generate a content latent variable z.sub.content by summing, multiplying, and/or concatenating the first content latent variable and/or the second content latent variable. One possible example of the first model 11 will be described hereafter in some additional detail with reference to
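The summing, multiplying, and concatenating operations mentioned above may be illustrated minimally as follows (Python with NumPy; the latent-variable values are arbitrary illustrative numbers):

```python
import numpy as np

# Illustrative values only: z1 from an image-branch model, z2 from a numeric-branch model.
z1 = np.array([0.5, -1.0, 2.0])
z2 = np.array([1.5, 1.0, 0.0])

z_sum = z1 + z2                          # summing
z_mul = z1 * z2                          # element-wise multiplying
z_concat = np.concatenate([z1, z2])      # concatenating

print(z_sum.tolist())      # [2.0, 0.0, 2.0]
print(z_concat.shape)      # (6,)
```

Note that summing and multiplying preserve the latent dimensionality, while concatenation doubles it; which combination is used is a design choice of the first model 11.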
[0031] Conceptually, the content latent variable z.sub.content may correspond to a single point in latent space, and may have values respectively corresponding to attributes of a pattern defined by the pattern image x.sub.pattern and the numeric data x.sub.numeric. For example, the content latent variable z.sub.content may have a value corresponding to a type, a shape, an area, etc., of a structure included in the pattern. As noted above, content latent variables respectively corresponding to patterns having similar attributes may have similar values, and may be disposed relatively close to each other in latent space.
[0032] The pose latent variable z.sub.pose may have a value corresponding to an Euclidean transform of a pattern. In this regard, the Euclidean transform may be understood as a rigid transform or a coordinate transform, and may be referred to hereafter as a geometric transform in which Euclidean distances between all point pairs are maintained. For example, the Euclidean transform may include at least one of (e.g., a sequence of) rotation, translation and reflection. With regard to the layout of the integrated circuit, a second pattern that has been Euclidean-transformed from a first pattern may have substantially similar attributes as the first pattern. Accordingly, patterns having the Euclidean transform relationship may be deemed suitable for grouping together in a cluster.
[0033] Referring to
[0034] Here, it should be noted that a machine learning model may have an arbitrary structure capable of being trained through the use of sample data (or training data). The machine learning model may include at least one of, for example; an artificial neural network, a decision tree, a support vector machine, a Bayesian network, a genetic algorithm, etc. In the illustrated embodiments that follow, a machine learning model is assumed to be an artificial neural network, but the scope of the inventive concept is not limited thereto. In this regard, the artificial neural network may include at least one of, for example; a convolution neural network (CNN), a region (R)-based CNN (R-CNN), a region proposal network (RPN), a recurrent neural network (RNN), a stacking (S)-based deep neural network (DNN) (S-DNN), a state (S)-space (S) DNN (S-SDNN), deconvolution network, a deep belief network (DBN), a fully convolutional network, a long short-term memory (LSTM) network, a classification network, etc. Herein, the machine learning model may be referred to as a “model.”
[0035] The coordinate generator 13 may generate a coordinate x.sub.cord for a pattern. For example, when a pattern image x.sub.pattern includes one or more pixels, the coordinate x.sub.cord may be used to determine a size of a pixel in the pattern. Thus, in some embodiments, the coordinate x.sub.cord may correspond to one pixel, and a distance between adjacent coordinates (e.g., the size of the one pixel) may be proportional to a critical dimension associated with (or defined by) a semiconductor process used to manufacture the integrated circuit. In this regard, the coordinate generator 13 may generate the coordinate x.sub.cord corresponding to the pattern image x.sub.pattern and, as described hereafter in some additional detail, the coordinate x.sub.cord may be used to identify (or provide) the size and coordinate information related to the pattern.
[0036] The coordinate transformer 14 may receive the pose latent variable z.sub.pose and the coordinate x.sub.cord, and generate a transformed coordinate z.sub.cord. As noted above, the pose latent variable z.sub.pose may include information regarding the Euclidean transform of the pattern. The coordinate transformer 14 may generate the transformed coordinate z.sub.cord, in which the Euclidean transform has been reflected in the coordinate x.sub.cord generated by the coordinate generator 13. Accordingly, one point in the transformed coordinate z.sub.cord may include information regarding the Euclidean transform of the corresponding point in the pattern image x.sub.pattern. Coordinate information and the Euclidean transform of the pattern may be considered independent of the content latent variable z.sub.content, and as a result, the content latent variable z.sub.content may represent only attribute(s) required for analysis of the pattern. One example of the coordinate transformer 14 will be described hereafter in some additional detail with reference to
[0037] The second model 12 may receive the content latent variable z.sub.content and the transformed coordinate z.sub.cord, and may output a test pattern image x′.sub.pattern and test numeric data x′.sub.numeric. The second model 12 may correspond to a machine learning model trained in relation to a plurality of sample content latent variables, a plurality of sample pose latent variables, and a plurality of transformed coordinates. The second model 12 may output the test pattern image x′.sub.pattern and the test numeric data x′.sub.numeric, as derived from the content latent variable z.sub.content and the transformed coordinate z.sub.cord, which may then be used to restore the pattern image x.sub.pattern and the numeric data x.sub.numeric.
[0038] In some embodiments, the second model 12 may be used to train the first model 11.
[0039] In some embodiments, the second model 12 may be used to generate the pattern image x.sub.pattern and the numeric data x.sub.numeric from the content latent variable z.sub.content and the transformed coordinate z.sub.cord, which are independent of the pattern image x.sub.pattern and the numeric data x.sub.numeric.
[0040] In some embodiments, the second model 12 may include mutually independent models. For example, the second model 12 may include one model generating the test pattern image x′.sub.pattern from the content latent variable z.sub.content and the transformed coordinate z.sub.cord, and another model generating the test numeric data x′.sub.numeric from the content latent variable z.sub.content and the transformed coordinate z.sub.cord.
[0041] Referring to
[0042] The cluster generator 15 may collect a plurality of content latent variables respectively corresponding to the plurality of patterns generated by the first model 11, and may cluster the collected content latent variables. In some embodiments, the cluster generator 15 may cluster the plurality of content latent variables based on the Euclidean distances between the content latent variables. The cluster generator 15 may perform clustering based on at least one of, for example; k-center, k-medoid, k-means, etc. Accordingly, in the illustrated example of
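As a non-limiting illustration of the clustering performed by the cluster generator 15, one of the options named above (k-means) may be sketched with a minimal Lloyd's-algorithm implementation (Python with NumPy; function names, the empty-cluster guard, and the test data are illustrative assumptions):

```python
import numpy as np

def kmeans(z, n_clusters, n_iters=50, seed=0):
    """Group latent variables by Euclidean distance (a minimal Lloyd's algorithm)."""
    rng = np.random.default_rng(seed)
    centers = z[rng.choice(len(z), n_clusters, replace=False)]
    for _ in range(n_iters):
        # assign each latent variable to its nearest center (Euclidean distance)
        d = np.linalg.norm(z[:, None, :] - centers[None, :, :], axis=-1)
        labels = d.argmin(axis=1)
        # recompute centers, keeping a center in place if its cluster is empty
        centers = np.stack([z[labels == k].mean(axis=0) if np.any(labels == k)
                            else centers[k] for k in range(n_clusters)])
    return labels

# two well-separated groups of content latent variables
rng = np.random.default_rng(42)
z = np.vstack([rng.normal(0.0, 0.1, (10, 2)), rng.normal(5.0, 0.1, (10, 2))])
labels = kmeans(z, n_clusters=2)
print(len(set(labels[:10].tolist())), len(set(labels[10:].tolist())))  # 1 1
```

The k-center and k-medoid variants named above differ only in how the representative point of each cluster is chosen.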
[0043] The encoder 21 may be used to encode (or compress) high-dimensional data x into a low-dimensional latent variable z, and the decoder 22 may be used to decode the low-dimensional latent variable z in order to generate restored high-dimensional data x′. When the encoder 21 compresses the high-dimensional data x, and the decoder 22 thereafter accurately generates the restored high-dimensional data x′ (e.g., x=x′), the latent space associated with the latent variable(s) may be evaluated as accurately representing the high-dimensional data without material data loss. Further, the Euclidean distance in latent space may be assumed to be a meaningful metric. Accordingly, similarity between latent variables may be expressed as a Gaussian kernel, which means that a probabilistic density function representing extractability of the latent variable has a Gaussian distribution.
[0044] The VAE 20 may be trained in accordance with a loss function expressed below in Equation 1, wherein latent space defined by the encoder 21 represents high-dimensional data as much as possible without loss, and the extractability of the latent variable satisfies the Gaussian distribution:
ℒ(θ,φ)=−E.sub.q.sub.φ.sub.(z|x)[log p.sub.θ(x|z)]+D.sub.KL(q.sub.φ(z|x)∥N(0,1)), [Equation 1]
wherein x represents high-dimensional data (or an input image), z represents the latent variable, q.sub.φ represents the encoder 21, p.sub.θ represents the decoder 22, and N(0, 1) represents the Gaussian distribution which is a prior distribution of z. Referring to Equation 1, the first term may correspond to a loss rate upon restoration after compression, and the second term may correspond to the difference between the probability density function corresponding to the latent variable and the Gaussian distribution.
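As a non-limiting illustration, the loss of Equation 1 may be computed numerically as follows, assuming a Gaussian decoder so that the reconstruction term reduces to a squared error (up to constants), and using the closed-form KL divergence between N(μ, σ²) and N(0, 1) (Python with NumPy; the data values are arbitrary illustrative numbers):

```python
import numpy as np

def vae_loss(x, x_recon, mu, log_var):
    """Equation 1 with a Gaussian decoder: squared-error reconstruction term
    plus the closed-form KL divergence from N(mu, sigma^2) to N(0, 1)."""
    recon = np.sum((x - x_recon) ** 2)                       # -E[log p(x|z)] up to constants
    kl = 0.5 * np.sum(mu ** 2 + np.exp(log_var) - log_var - 1.0)
    return recon + kl

x = np.array([1.0, 0.0, 1.0])
x_recon = np.array([0.9, 0.1, 1.0])
mu = np.zeros(2)
log_var = np.zeros(2)                        # sigma^2 = 1, so the KL term is exactly 0
print(round(vae_loss(x, x_recon, mu, log_var), 6))  # 0.02 (reconstruction error only)
```

When the posterior equals the prior (μ=0, σ²=1) the KL term vanishes, so training pressure on the latent variable comes entirely from reconstruction.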
[0045] Hereinafter, illustrated embodiments of the inventive concept assume the use of VAE 20, but the scope of the inventive concept is not limited thereto. For example, the pattern clustering system 10 of
[0046]
[0047] Under the foregoing assumptions a Gaussian distribution may be expressed by Equation 2 that follows:
q(z.sub.content|x.sub.pattern,x.sub.numeric)=N(μ.sub.content,σ.sup.2.sub.content). [Equation 2]
[0048] In this regard, the encoder 31 may output an expected center value μ.sub.content and an expected variance σ.sup.2.sub.content from the pattern image x.sub.pattern and the numeric data x.sub.numeric. The sampler 32, in response to the expected center value μ.sub.content and the expected variance σ.sup.2.sub.content received from the encoder 31, may output the content latent variable z.sub.content by performing stochastic sampling on N(μ.sub.content, σ.sup.2.sub.content). In some embodiments, the sampler 32 may perform stochastic sampling based on a Gaussian re-parameterization trick.
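The Gaussian re-parameterization trick mentioned above may be sketched as follows (Python with NumPy; the function name and the use of a log-variance parameterization are illustrative assumptions):

```python
import numpy as np

def sample_content(mu, log_var, rng):
    """Gaussian re-parameterization trick: z = mu + sigma * eps with eps ~ N(0, 1),
    so the stochasticity is isolated in eps and gradients can flow through mu, sigma."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0])           # expected center value
log_var = np.array([0.0, 0.0])       # log of the expected variance (unit variance here)
z = np.array([sample_content(mu, log_var, rng) for _ in range(5000)])
print(bool(np.allclose(z.mean(axis=0), mu, atol=0.1)))  # True: sample mean recovers mu
```

Sampling this way rather than drawing from N(μ, σ²) directly is what makes the encoder trainable by gradient descent.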
[0049] Additionally, the encoder 31 may output the pose latent variable z.sub.pose. And as noted above in relation to some embodiments, the encoder 31 may include mutually independent models, such as one model generating the expected center value μ.sub.content and the expected variance σ.sup.2.sub.content and another model generating the pose latent variable z.sub.pose.
[0050]
[0051] In some embodiments, the pose latent variable z.sub.pose may include a real number vector having a dimension corresponding to the number of discrete Euclidean transforms. For example, as illustrated in
[0052] Accordingly, a first matrix m.sub.1 represents a conversion to a first image 51 substantially the same as an input image; a second matrix m.sub.2 represents a conversion to a second image 52 in which the input image has been rotated clockwise by about 90 degrees; a third matrix m.sub.3 represents a conversion to a third image 53 in which the input image has been rotated clockwise by about 180 degrees; and a fourth matrix m.sub.4 represents a conversion to a fourth image 54 in which the input image has been rotated clockwise by about 270 degrees. Additionally, a fifth matrix m.sub.5 represents a conversion to a fifth image 55 in which the input image is reflected; a sixth matrix m.sub.6 represents a conversion to a sixth image 56 in which the fifth image 55 has been rotated clockwise by about 90 degrees; a seventh matrix m.sub.7 represents a conversion to a seventh image 57 in which the fifth image 55 has been rotated clockwise by about 180 degrees; and an eighth matrix m.sub.8 represents a conversion to an eighth image 58 in which the fifth image 55 has been rotated clockwise by about 270 degrees.
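The eight images 51 through 58 described above may be generated as follows (Python with NumPy; a non-limiting sketch in which the rotation direction is a convention choice, since np.rot90 rotates counter-clockwise):

```python
import numpy as np

def dihedral_transforms(img):
    """The eight transforms of the dihedral group: four rotations of the input
    image and four rotations of its reflection."""
    flipped = np.fliplr(img)                        # reflected (fifth) image
    return [np.rot90(img, k) for k in range(4)] + \
           [np.rot90(flipped, k) for k in range(4)]

img = np.arange(9).reshape(3, 3)            # an asymmetric test pattern
imgs = dihedral_transforms(img)
print(len(imgs))                            # 8
print(bool(np.array_equal(imgs[0], img)))   # True: m1 leaves the input unchanged
```

For an asymmetric pattern the eight results are all distinct; for a symmetric pattern some of them coincide, which is consistent with the attributes of the pattern being invariant under the transform.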
[0053] Despite the Euclidean transform, such as transforms included in the dihedral group, the attributes of the pattern may be substantially maintained. For example, results of a process simulation or process emulation representing attribute(s) of the pattern may be substantially invariant despite the Euclidean transform. Hereinafter, illustrated embodiments assume that the pose latent variable z.sub.pose is an eight-dimensional real number vector corresponding to a dihedral group, but the scope of the inventive concept is not limited thereto. For example, the pose latent variable z.sub.pose may be derived using a continuous coordinate transform (e.g., a continuous probability distribution).
[0054] Similar to the content latent variable z.sub.content described above with reference to
q(o.sub.pose|x.sub.pattern)=softmax(z.sub.pose), [Equation 3]
wherein o.sub.pose may be an eight-dimensional one-hot vector, and may represent one of the transforms of the dihedral group. For example, o.sub.pose=[0, 1, 0, 0, 0, 0, 0, 0] may represent a transform corresponding to the second matrix m.sub.2 of
[0055] The second sampler 41 may generate the o.sub.pose by performing a stochastic sampling in softmax (z.sub.pose). In some embodiments, the second sampler 41 may perform the stochastic sampling based on Gumbel softmax re-parameterization. The second sampler 41 may identify a matrix m.sub.pose corresponding to a value of the o.sub.pose among the plurality of matrices m.sub.1 through m.sub.8 of
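As a non-limiting illustration of the stochastic sampling performed by the second sampler 41, a Gumbel-softmax draw from softmax(z.sub.pose) may be sketched as follows (Python with NumPy; the function name, temperature value, and logit values are illustrative assumptions):

```python
import numpy as np

def gumbel_softmax_sample(z_pose, tau, rng):
    """Differentiable (soft) sample from softmax(z_pose) via Gumbel re-parameterization."""
    u = rng.uniform(1e-12, 1.0, size=z_pose.shape)
    g = -np.log(-np.log(u))                 # Gumbel(0, 1) noise
    y = (z_pose + g) / tau
    y = np.exp(y - y.max())
    return y / y.sum()                      # soft one-hot over the eight transforms

rng = np.random.default_rng(0)
z_pose = np.array([0.1, 3.0, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1])  # pose logits
o_soft = gumbel_softmax_sample(z_pose, tau=0.1, rng=rng)
o_pose = np.eye(8)[o_soft.argmax()]         # hard one-hot used to select m_pose
print(int(o_pose.sum()))                    # 1
```

A low temperature tau pushes the soft sample toward a one-hot vector, so the argmax used to select the matrix m.sub.pose agrees with the soft sample with high probability.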
[0056] The multiplier 42 may multiply the coordinate x.sub.cord by the matrix m.sub.pose to output the transformed coordinate z.sub.cord. As described above with reference to
[0057]
[0058] Referring to
[0059] While the content latent variable z.sub.content includes information about intrinsic attributes of a pattern, it may not be easy to interpret the physical meaning of the information. In addition, it may be easy to interpret physical meanings from the feature f.sub.pattern, but it may not be easy to define a pattern only with the feature f.sub.pattern. For example, patterns of different shapes may also have features of similar values, or patterns having similar attributes may also have features of different values.
[0060] The latent feature interpreter 62 may receive the feature f.sub.pattern and the content latent variable z.sub.content, and may output a main feature f.sup.k.sub.pattern representing k attributes, wherein k is a positive integer. The latent feature interpreter 62 may identify at least one important attribute in clustering, among the attributes included in the feature f.sub.pattern provided by the feature extractor 61, and output the main feature f.sup.k.sub.pattern including at least one identified attribute. To this end, the latent feature interpreter 62 may calculate a correlation coefficient between the feature f.sub.pattern and the content latent variable z.sub.content. Accordingly, the cluster may be readily analyzed using attributes having high correlation coefficients with the content latent variable z.sub.content. In some embodiments, the latent feature interpreter 62 may identify at least one attribute corresponding to a correlation coefficient greater than a predefined threshold, and output the main feature f.sup.k.sub.pattern including at least one identified attribute.
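A non-limiting sketch of the correlation-based selection performed by the latent feature interpreter 62 follows (Python with NumPy; the function name, the 0.5 threshold, and the synthetic data are illustrative assumptions):

```python
import numpy as np

def select_main_features(features, z_content, threshold=0.5):
    """Keep features whose absolute correlation with any latent dimension
    exceeds a threshold (the 0.5 value is an illustrative choice)."""
    n_feat, n_lat = features.shape[1], z_content.shape[1]
    corr = np.zeros((n_feat, n_lat))
    for i in range(n_feat):
        for j in range(n_lat):
            corr[i, j] = np.corrcoef(features[:, i], z_content[:, j])[0, 1]
    keep = np.abs(corr).max(axis=1) > threshold
    return features[:, keep], keep

rng = np.random.default_rng(0)
z_content = rng.normal(size=(100, 2))                               # latent variables
f_signal = 2.0 * z_content[:, 0] + rng.normal(scale=0.1, size=100)  # correlated feature
f_noise = rng.normal(size=100)                                      # uncorrelated feature
features = np.stack([f_signal, f_noise], axis=1)
f_main, keep = select_main_features(features, z_content)
print(keep.tolist())   # [True, False]: only the correlated feature is a main feature
```

The retained columns form the main feature f.sup.k.sub.pattern; attributes with low correlation to the content latent variable carry little clustering-relevant information and are dropped.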
[0061]
[0062] Referring to
[0063] The verification data y.sub.pattern may be referred to as data generated by verifying a pattern. For example, the verification data y.sub.pattern may also be obtained using a process simulation and/or an emulation, and may also be obtained by actually measuring a pattern. In some embodiments, the verification data y.sub.pattern may include a value representing the reliability of a pattern, such as a defect rate of the pattern and robustness. More accurate pattern analysis may be possible by utilizing the verification data y.sub.pattern as well as information derived from input data, that is, the pattern image x.sub.pattern and the numeric data x.sub.numeric.
[0064] In some embodiments, the verification data y.sub.pattern may be estimated. The above-described methods of obtaining the verification data y.sub.pattern may require a lot of cost (for example, time, computing resources, or the like), and accordingly, it may not be easy to obtain the verification data y.sub.pattern corresponding to all pattern images x.sub.pattern. As described above with reference to the figures, because patterns included in each of the n clusters X.sub.1 through X.sub.n may have the same or similar attributes, the verification data y.sub.pattern of a pattern may be easily derived from the verification data y.sub.pattern of the other pattern included in the same cluster as the pattern.
[0065] The pattern analyzer 71 may identify features or verification data for distinguishing clusters, and the feature and verification data may be respectively referred to as the global feature f.sub.global and the global verification data y.sub.global. In addition, the pattern analyzer 71 may identify features or verification data distinguishing patterns included in one cluster, and the features or verification data may be respectively referred to as the local feature f.sub.local and the local verification data y.sub.local. Exemplary operations that may be performed by the pattern analyzer 71 in order to generate the global feature f.sub.global, the global verification data y.sub.global, the local feature f.sub.local, and the local verification data y.sub.local, will be described hereafter with reference to
[0066] The latent visualizer 72 may receive the main feature f.sup.k.sub.pattern, the content latent variable z.sub.content, the verification data y.sub.pattern, and the n clusters X.sub.1 through X.sub.n, which are provided to the pattern analyzer 71, and may receive the global feature f.sub.global, the global verification data y.sub.global, the local feature f.sub.local, and the local verification data y.sub.local from the pattern analyzer 71. Although latent space has a lower dimension than the pattern image x.sub.pattern, it may still be complicated for the user to recognize latent space. Accordingly, the latent visualizer 72 may visualize latent space so that the user may recognize latent space. For example, the latent visualizer 72 may visualize latent space based on a self-organizing map, principal component analysis, or t-distributed stochastic neighbor embedding (t-SNE). The latent visualizer 72 may generate a latent space map (MAP), wherein the latent space map may represent the content latent variable z.sub.content and the n clusters X.sub.1 through X.sub.n, and represent features or verification data on the content latent variable z.sub.content or the n clusters X.sub.1 through X.sub.n. Accordingly, the user may readily analyze patterns in relation to the latent space map. Examples of possible latent space maps will be described hereafter with reference to
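As a non-limiting illustration of one of the visualization options named above, a principal-component-analysis projection of the content latent variables to two dimensions may be sketched as follows (Python with NumPy; the function name and latent dimensionality are illustrative assumptions):

```python
import numpy as np

def pca_2d(z_content):
    """Project content latent variables onto their first two principal
    components to obtain coordinates for a 2D latent space map."""
    centered = z_content - z_content.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

rng = np.random.default_rng(0)
z_content = rng.normal(size=(50, 8))   # 8-dimensional content latent variables
xy = pca_2d(z_content)
print(xy.shape)    # (50, 2)
```

The resulting 2D points could then be colored by cluster membership, main-feature values, or verification data to form the latent space map described above.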
[0067]
[0068] Referring to
[0069] Referring to
[0070] Referring to
[0071] Referring to
[0072]
[0073] Referring to
[0074] The content latent variable z.sub.content and the pose latent variable z.sub.pose may be obtained (S14). For example, the first model 11 may correspond to a machine learning model trained by a plurality of sample pattern images and a plurality of sample values, and may output the content latent variable z.sub.content and the pose latent variable z.sub.pose in response to the pattern image x.sub.pattern and the numeric data x.sub.numeric. The content latent variable z.sub.content may correspond to a single point in latent space, and the pose latent variable z.sub.pose may represent the Euclidean transform of a pattern in the pattern image x.sub.pattern. Due to the pose latent variable z.sub.pose, the content latent variable z.sub.content may represent attributes of a pattern independent of the Euclidean transform.
[0075] A determination is now made as to whether there is an additional pattern (S16). In this regard, so long as there is another pattern to be provided to the first model 11 from among the patterns subject to pattern clustering in relation to the integrated circuit (S16=YES), the foregoing method steps (e.g., S12, S14, and S16) may be repeated.
[0076] However, upon determining that there is no additional pattern (S16=NO), a plurality of latent variables may be grouped into a plurality of clusters (S18). For example, the cluster generator 15 may collect the plurality of latent variables from the first model 11, and may cluster the plurality of latent variables based on the Euclidean distances between the collected plurality of latent variables. Accordingly, the n clusters (e.g., X.sub.1 through X.sub.n) may be generated, wherein the latent variables included in each cluster correspond to patterns having similar attribute(s).
[0077]
[0078] Referring to
[0079] A test pattern image x′.sub.pattern may be obtained (S24). For example, the second model 12 may generate and output the test pattern image x′.sub.pattern in response to the content latent variable z.sub.content and the transformed coordinate z.sub.cord. Here, the test pattern image x′.sub.pattern may correspond to a pattern image restored from the content latent variable z.sub.content and the transformed coordinate z.sub.cord, and the first model 11 and the second model 12 may be trained such that the test pattern image x′.sub.pattern becomes more similar to the pattern image x.sub.pattern provided to the first model 11.
[0080] The first model 11 and the second model 12 may be trained (S26). In some embodiments, the first model 11 and the second model 12 may be trained in accordance with the foregoing description of Equation 1 to minimize a loss function expressed by Equation 4 that follows:
L(θ,φ)=−E.sub.q.sub.φ[log p.sub.θ(x.sub.pattern|z.sub.content,o.sub.pose)]+D.sub.KL(q.sub.φ(z.sub.content|x.sub.pattern)∥N(0,1))+D.sub.KL(q.sub.φ(o.sub.pose|x.sub.pattern)∥U(K)),
wherein E.sub.q.sub.φ represents an expectation with respect to the posterior distribution q.sub.φ, N(0,1) represents a standard normal distribution which is a prior distribution of z.sub.content, and U(K) represents a discrete uniform distribution which is a prior distribution of o.sub.pose. In this case, K may be a number of possible transforms. For example, as described above with reference to
[0081] Here, it should be noted that reduced dimensionality of latent space is advantageous. For example, the lower the dimensionality of latent space, the less information the content latent variable z.sub.content may include. Accordingly, the content latent variable z.sub.content may include only essential information, and as a result, all information related to the Euclidean transform may be included in the pose latent variable z.sub.pose. However, it should be further noted that, if the dimensionality of latent space becomes too small, the content latent variable z.sub.content may omit certain essential information. Accordingly, it is important to carefully determine the dimensionality of latent space. In some embodiments, as the dimensionality of latent space is reduced, an absolute value of the first term of Equation 4 (e.g., an absolute value corresponding to a decoder) may increase. Accordingly, the dimensionality of latent space may be determined by comparing the absolute value of the first term of Equation 4 to a threshold.
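The two Kullback-Leibler terms of Equation 4 can be sketched numerically as follows. This is a hedged reconstruction using the standard closed forms for a diagonal Gaussian posterior against N(0,1) and a softmax posterior against U(K); the reconstruction (decoder) term is omitted because it requires the second model 12.

```python
import numpy as np

def kl_gaussian_std_normal(mu, log_var):
    """D_KL(q(z_content|x) || N(0,1)) for a diagonal Gaussian posterior."""
    return 0.5 * np.sum(mu**2 + np.exp(log_var) - log_var - 1.0)

def kl_categorical_uniform(logits):
    """D_KL(q(o_pose|x) || U(K)) for the discrete pose posterior."""
    p = np.exp(logits - logits.max())
    p /= p.sum()                       # softmax over the K pose transforms
    K = len(logits)
    return float(np.sum(p * np.log(p + 1e-12)) + np.log(K))

# A posterior exactly matching the priors yields (near-)zero KL terms.
print(kl_gaussian_std_normal(np.zeros(16), np.zeros(16)))  # 0.0
print(round(kl_categorical_uniform(np.zeros(8)), 6))       # ~0 (up to the 1e-12 epsilon)
```

Minimizing these terms pushes the content posterior toward the standard normal prior and the pose posterior toward the uniform prior, matching the regularization described for Equation 4.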
[0082]
[0083] Referring to
[0084] The Euclidean transform of the pattern may be identified (S34). For example, the coordinate transformer 14 may receive the pose latent variable z.sub.pose from the first model 11. When the Euclidean transform is limited to the transforms included in the dihedral group, the pose latent variable z.sub.pose may be a vector of logits of an eight-dimensional discrete probability distribution. The pose latent variable z.sub.pose may be transformed into a probability distribution using a softmax function, and one of the transforms included in the dihedral group may be identified by sampling from the probability distribution.
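A minimal sketch of step S34, assuming the pose latent variable is an eight-dimensional logit vector as described; the helper name and the example logits are illustrative.

```python
import numpy as np

def sample_transform(z_pose: np.ndarray, rng: np.random.Generator) -> int:
    """Turn the 8-dimensional pose logits into a probability distribution
    (softmax) and sample the index of one dihedral-group transform."""
    p = np.exp(z_pose - z_pose.max())  # subtract max for numerical stability
    p /= p.sum()
    return int(rng.choice(len(z_pose), p=p))

rng = np.random.default_rng(0)
z_pose = np.array([5.0, -5, -5, -5, -5, -5, -5, -5])  # strongly favors transform 0
print(sample_transform(z_pose, rng))  # index in 0..7; 0 with probability ≈ 0.9997
```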
[0085] The coordinate x.sub.cord may be transformed (S36). For example, the coordinate transformer 14 may obtain a matrix corresponding to the identified Euclidean transform, and may generate the transformed coordinate z.sub.cord by multiplying the obtained matrix by the coordinate x.sub.cord. The transformed coordinate z.sub.cord may then be provided to the second model 12 (S38). For example, the coordinate transformer 14 may provide the transformed coordinate z.sub.cord to the second model 12. In this regard, the second model 12 may receive the transformed coordinate z.sub.cord in order to restore a pattern image from the content latent variable z.sub.content.
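Step S36 can be sketched as follows. The indexing convention for the eight transforms (four rotations, then four reflections formed by a mirror about the x-axis followed by rotation) is an illustrative assumption, as the disclosure does not fix an ordering.

```python
import numpy as np

def dihedral_matrix(index: int) -> np.ndarray:
    """Return the 2x2 matrix of one of the eight dihedral-group transforms:
    rotations by 0°, 90°, 180°, 270° (indices 0-3) and the four reflections
    (indices 4-7, a mirror about the x-axis followed by the rotation)."""
    theta = (index % 4) * np.pi / 2
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    if index < 4:
        return rot
    flip = np.array([[1.0, 0.0], [0.0, -1.0]])  # mirror about the x-axis
    return rot @ flip

def transform_coordinate(x_cord: np.ndarray, index: int) -> np.ndarray:
    """Generate z_cord by multiplying the transform matrix by x_cord."""
    return dihedral_matrix(index) @ x_cord

x_cord = np.array([1.0, 2.0])
print(transform_coordinate(x_cord, 1))  # 90° rotation: [-2.  1.]
```

The resulting z_cord would then be provided to the second model 12 so that the restored pattern image reflects the identified Euclidean transform.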
[0086]
[0087] Referring to
[0088] Referring to
[0089] Referring to
[0090] Referring to
[0091]
[0092] The computer system 130 may be implemented using a general purpose computing system or a special purpose computing system. For example, the computer system 130 may be implemented using a personal computer (PC), a server, a laptop PC, a home appliance, an automobile, etc. As illustrated in
[0093] The at least one processor 131 may execute a program module including computer system-executable instructions. The program module may include routines, programs, objects, components, logic, data structures, or the like that perform a particular task or implement a particular abstract data type. The memory 132 may include a computer system readable medium in the form of a volatile memory, such as random access memory (RAM). The at least one processor 131 may access the memory 132 and execute instructions loaded in the memory 132. The storage system 133 may non-transitorily store information, and may include at least one program product including a program module configured, in some embodiments, to perform training of the machine learning models for the pattern clustering described above with reference to the diagrams. The program may include, as non-limiting examples, an operating system, at least one application, other program modules, and program data.
[0094] The network adapter 134 may provide access to a local area network (LAN), a wide area network (WAN), and/or a public network (for example, the Internet). The I/O interface 135 may provide a communication channel with a peripheral device, such as a keyboard, a pointing device, an audio system, etc. The display 136 may output various pieces of information so that the user may identify various pieces of information.
[0095] In some embodiments, training of the machine learning models for the pattern clustering consistent with embodiments of the inventive concept may be implemented as a computer program product. The computer program product may include a non-transitory computer-readable medium (or storage medium) including computer-readable program instructions allowing the at least one processor 131 to perform image processing and/or training of models. Computer-readable instructions may include, as non-limiting examples, assembler instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, micro-code, firmware instructions, state setting data, or source code or object code written in at least one programming language.
[0096] The computer-readable medium may include any type of medium capable of non-transitorily holding and storing instructions executed by the at least one processor 131 or any instruction-executable device. The computer-readable medium may include an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any combination thereof, but is not limited thereto. For example, the computer-readable medium may include at least one of, for example: a portable computer diskette, a hard disk, a random access memory (RAM) such as a dynamic RAM (DRAM) or a static RAM (SRAM), a read-only memory (ROM), an electrically erasable programmable ROM (EEPROM), flash memory, a compact disc (CD), a digital versatile disc (DVD), a memory stick, a floppy disk, etc.
[0097]
[0098] Referring to
[0099] The at least one processor 141 may execute instructions. For example, the at least one processor 141 may also execute an operating system by executing instructions stored in the memory 143 or may also execute applications running on the operating system. In some embodiments, the at least one processor 141 may instruct tasks of the AI accelerator 145 and/or the hardware accelerator 147 by executing instructions, and may also obtain a result of performing the task from the AI accelerator 145 and/or the hardware accelerator 147. In some embodiments, the at least one processor 141 may include an application specific instruction set processor (ASIP) customized for a specific use, and may also support a dedicated instruction set.
[0100] The memory 143 may have an arbitrary structure for storing data. For example, the memory 143 may also include a volatile memory device, such as a DRAM or SRAM, and/or a non-volatile memory device, such as flash memory and RRAM. The at least one processor 141, the AI accelerator 145, and the hardware accelerator 147 may store data in the memory 143, or read the data from the memory 143.
[0101] The AI accelerator 145 may be referred to as hardware designed for AI applications. In some embodiments, the AI accelerator 145 may include a neural processing unit (NPU) for implementing a neuromorphic structure, may generate output data by processing input data provided by the at least one processor 141 and/or the hardware accelerator 147, and may provide output data to the at least one processor 141 and/or the hardware accelerator 147. In some embodiments, the AI accelerator 145 may be programmable, and may be programmed by the at least one processor 141 and/or the hardware accelerator 147.
[0102] The hardware accelerator 147 may be referred to as hardware designed to perform a specific task at a high speed. For example, the hardware accelerator 147 may be designed to perform data transforms at a high speed, such as demodulation, modulation, encoding, and decoding. The hardware accelerator 147 may be programmable, and may be programmed by the at least one processor 141 and/or the AI accelerator 145.
[0103] In some embodiments, the AI accelerator 145 may execute machine learning models described above with reference to diagrams. For example, the AI accelerator 145 may execute each of the above-described layers. The AI accelerator 145 may generate an output including useful information by processing input parameters, feature maps, etc. In addition, in some embodiments, at least some of the models executed by AI accelerator 145 may be executed by the at least one processor 141 and/or the hardware accelerator 147.
[0104] While the inventive concept has been particularly shown and described with reference to embodiments thereof, it will be understood that various changes in form and details may be made therein without departing from the scope of the following claims.