Method and system for quantifying semantic variance between neural network representations

12283082 · 2025-04-22

Abstract

A method and system for quantifying semantic variance between neural network representations is provided. The two neural network representations to be compared are first extracted; the weight of each filter in an intermediate layer corresponding to each semantic concept is then learned on a reference dataset using the Net2Vec method; the set IoU of each representation for all semantic concepts in the reference dataset is calculated; and finally the variances between the set IoU of the two representations over all semantic concepts are integrated to obtain the semantic variance between the two neural network representations. The method addresses the lack of an accurate measure of the variance between neural network representations at the level of semantic information, and provides an accurate measurement.

Claims

1. A method for quantifying semantic variance between neural network representations, comprising the following steps: S1: extracting representations: predicting, on a reference dataset, with the two neural networks to be compared, and, in the prediction process, retaining an intermediate layer output of each neural network when predicting each sample; S2: learning weights: learning a weight of each filter in the intermediate layer corresponding to each semantic concept on the reference dataset using a Net2Vec method; S3: calculating set IoU: linearly superimposing, using the weights learned in step S2, activation values of the filters in the intermediate layer output retained in step S1 to obtain a total activation value corresponding to each semantic concept, then binarizing the total activation value to obtain a mask of each sample corresponding to each semantic concept, and calculating the set IoU of each representation corresponding to each semantic concept; and S4: integrating variance: calculating a variance between the set IoU of the representations of the two neural networks for each semantic concept, and integrating the variances to obtain the semantic variance between the two neural network representations.

2. The method for quantifying semantic variance between neural network representations according to claim 1, wherein the Net2Vec method in step S2 is specifically as follows: calculating a predicted segmentation mask M(x; w) using a Sigmoid function σ according to the following equation: M(x; w) = σ(∑_k w_k · A_k(x)), wherein w ∈ ℝ^K is a weight to be learned, K is a total number of filters in the intermediate layer, and A_k(x) is an activation map of a k-th filter for input x; and learning the weight w for a concept c by minimizing a binary cross entropy loss function of the following equation: ℓ_1 = −(1/N_c) ∑_{x∈X_c} [α · L_c(x) · log M(x; w) + (1 − α) · (1 − L_c(x)) · log(1 − M(x; w))], wherein N_c is a number of samples containing the concept c, x ∈ X_c represents the samples containing the concept c, and α = 1 − (1/N_c) ∑_{x∈X_c} |L_c(x)|/S, wherein |L_c(x)| is a number of foreground pixels of the concept c in a ground truth mask of the sample x, and S = h_s · w_s is a total number of pixels in the ground truth mask.

3. The method for quantifying semantic variance between neural network representations according to claim 2, wherein in step S3, the set IoU for the concept c is: IoU_set(c; M) = ∑_{x∈X_c} |M(x) ∩ L_c(x)| / ∑_{x∈X_c} |M(x) ∪ L_c(x)|, wherein L_c is a ground truth segmentation mask.

4. The method for quantifying semantic variance between neural network representations according to claim 3, wherein in step S4, the variances between the set IoU of the representations of the two neural networks for all semantic concepts are integrated as follows: SVar(R₂; R₁) = ∑_{j=1}^{C} (λ · SVar_j¹ + (1 − λ) · β · SVar_j²) / (∑_{x∈X_R} |L_j(x)| / (|X_R| · S)), wherein λ = I(min(IoU_set(c_j; R₂), IoU_set(c_j; R₁)) > 0) determines whether the concept c_j is a common concept between the two representations, λ = 1 if c_j is a common concept, otherwise λ = 0, I(·) is an indicator function, X_R is the reference dataset, |L_j(x)| is a number of foreground pixels of the concept c_j in the ground truth mask of the sample x, |X_R| is a total number of samples in the reference dataset, S = h_s · w_s is a total number of pixels in the ground truth mask, SVar_j¹ is a semantic variance of a common semantic concept between representations R₁ and R₂ of the two neural networks, SVar_j² is a semantic variance of a non-common semantic concept between the two representations, and β is a weight for a concept that newly appears or disappears in representation R₂ compared to representation R₁.

5. The method for quantifying semantic variance between neural network representations according to claim 2, wherein in step S4, the variances between the set IoU of the representations of the two neural networks for all semantic concepts are integrated as follows: SVar(R₂; R₁) = ∑_{j=1}^{C} (λ · SVar_j¹ + (1 − λ) · β · SVar_j²) / (∑_{x∈X_R} |L_j(x)| / (|X_R| · S)), wherein λ = I(min(IoU_set(c_j; R₂), IoU_set(c_j; R₁)) > 0) determines whether the concept c_j is a common concept between the two representations, λ = 1 if c_j is a common concept, otherwise λ = 0, I(·) is an indicator function, X_R is the reference dataset, |L_j(x)| is a number of foreground pixels of the concept c_j in the ground truth mask of the sample x, |X_R| is a total number of samples in the reference dataset, S = h_s · w_s is a total number of pixels in the ground truth mask, SVar_j¹ is a semantic variance of a common semantic concept between representations R₁ and R₂ of the two neural networks, SVar_j² is a semantic variance of a non-common semantic concept between the two representations, and β is a weight for a concept that newly appears or disappears in representation R₂ compared to representation R₁.

6. The method for quantifying semantic variance between neural network representations according to claim 5, wherein the semantic variance of a common semantic concept between representations R₁ and R₂ of the two neural networks is calculated according to the following equation to obtain a semantic variance of R₂ relative to R₁: SVar_j¹ = (IoU_set(c_j; R₂) − IoU_set(c_j; R₁)) / max(IoU_set(c_j; R₂), IoU_set(c_j; R₁)), wherein IoU_set(c_j; R_i), j = 1 … C, is the set IoU of the representation R_i corresponding to each semantic concept, and C is a total number of concepts; and the semantic variance of a non-common semantic concept between the two representations is calculated according to the following equation to obtain a semantic variance of R₂ relative to R₁: SVar_j² = (IoU_set(c_j; R₂) − IoU_set(c_j; R₁)) / ((1/C) ∑_{k=1}^{C} IoU_set(c_k; R₁)).

7. The method for quantifying semantic variance between neural network representations according to claim 6, wherein a value of β is 2.

8. A system for quantifying semantic variance between neural network representations, comprising a computer program, wherein the steps of any one of the above methods are implemented when the computer program is executed by a processor.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

(1) FIGURE is a flowchart of method steps for quantifying semantic variance between neural network representations according to the present invention.

DETAILED DESCRIPTION OF THE EMBODIMENTS

(2) The present invention will be further illustrated below in conjunction with the accompanying drawings and specific embodiments. It should be understood that the following specific embodiments are only used to illustrate the present invention and not to limit the scope of the present invention.

Embodiment 1

(3) As shown in FIGURE, a method for quantifying semantic variance between neural network representations is implemented as follows:

(4) Step S1. Extract representations of the two neural networks (such as ResNet50) to be compared: run the neural networks on a reference dataset (such as the BRODEN dataset) to perform prediction, and in the prediction process, retain the intermediate layer output of each neural network when predicting each sample.
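As a framework-agnostic sketch of step S1, the loop below runs a toy stand-in "network" and retains the intermediate layer output for every sample. The two-matrix network and all names here are hypothetical illustrations; with a real CNN such as ResNet50 one would instead capture the chosen layer with the framework's hook mechanism (for example, a PyTorch forward hook).

```python
import numpy as np

def make_toy_network(rng):
    """Hypothetical stand-in for a CNN: two linear 'layers' on flat inputs."""
    w1 = rng.standard_normal((8, 4))   # 'intermediate layer' with 4 filters
    w2 = rng.standard_normal((4, 2))   # output head
    return w1, w2

def predict_with_retention(samples, w1, w2):
    """Predict every sample while retaining its intermediate layer output."""
    retained = []                       # step S1: one activation per sample
    outputs = []
    for x in samples:
        a = np.maximum(x @ w1, 0.0)     # intermediate activations A_k(x)
        retained.append(a)
        outputs.append(a @ w2)          # prediction continues past the layer
    return np.stack(outputs), np.stack(retained)

rng = np.random.default_rng(0)
w1, w2 = make_toy_network(rng)
samples = rng.standard_normal((5, 8))   # a miniature 'reference dataset'
outputs, acts = predict_with_retention(samples, w1, w2)
```

The retained activations `acts` play the role of the intermediate layer output that steps S2 and S3 consume.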

(5) Step S2. Learn the weight of each filter in an intermediate layer corresponding to each semantic concept on the reference dataset using the Net2Vec method:

(6) First, the weight to be learned is represented by w ∈ ℝ^K, where K is the total number of filters in the intermediate layer, and a predicted segmentation mask M(x; w) is calculated using a Sigmoid function σ according to the following equation:

(7) M(x; w) = σ(∑_k w_k · A_k(x)), where A_k(x) is the activation map of the k-th filter for input x;

(8) Then, the weight w for the concept c is learned by minimizing the binary cross entropy loss function of the following equation:

(9) ℓ_1 = −(1/N_c) ∑_{x∈X_c} [α · L_c(x) · log M(x; w) + (1 − α) · (1 − L_c(x)) · log(1 − M(x; w))], where N_c is the number of samples containing the concept c, x ∈ X_c represents the samples containing the concept c, and α = 1 − (1/N_c) ∑_{x∈X_c} |L_c(x)|/S, where |L_c(x)| is the number of foreground pixels of the concept c in the ground truth mask of the sample x, and S = h_s · w_s is the total number of pixels in the ground truth mask;

(10) Then, on the reference dataset, training is run for 30 epochs using an SGD optimizer with a learning rate of 0.0001, a momentum of 0.9, and a batch size of 64, to obtain the weight corresponding to each semantic concept.
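A minimal NumPy sketch of the weight learning in paragraphs (6) through (10), using plain gradient descent in place of a framework SGD optimizer with momentum; the helper name `net2vec_step` and the small array shapes are illustrative assumptions, not the patent's implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def net2vec_step(w, acts, masks, alpha, lr=0.1):
    """One gradient step on the weighted binary cross entropy of step (9).

    w:     (K,)          filter weights being learned
    acts:  (N, K, H, W)  activation maps A_k(x) of the K filters
    masks: (N, H, W)     ground truth masks L_c(x) with values in {0, 1}
    alpha: scalar        foreground/background balancing weight
    """
    N, K, H, W = acts.shape
    logits = np.einsum('k,nkhw->nhw', w, acts)   # sum_k w_k * A_k(x)
    m = sigmoid(logits)                          # predicted mask M(x; w)
    eps = 1e-12                                  # numerical safety in the logs
    loss = -np.mean(alpha * masks * np.log(m + eps)
                    + (1 - alpha) * (1 - masks) * np.log(1 - m + eps))
    # Analytic gradient of the weighted BCE with respect to w
    dlogits = alpha * masks * (m - 1) + (1 - alpha) * (1 - masks) * m
    grad = np.einsum('nhw,nkhw->k', dlogits, acts) / (N * H * W)
    return w - lr * grad, loss

# Toy usage: random activations and masks, alpha from the foreground ratio
rng = np.random.default_rng(0)
acts = rng.standard_normal((4, 3, 4, 4))
masks = (rng.random((4, 4, 4)) > 0.7).astype(float)
alpha = 1.0 - masks.mean()          # 1 - (1/N_c) sum |L_c(x)| / S
w = np.zeros(3)
for _ in range(100):
    w, loss = net2vec_step(w, acts, masks, alpha)
```

Because the loss is convex in w (a logistic regression over the filter weights), gradient descent with a small step size decreases it monotonically.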

(11) Step S3. Calculate the set IoU of each representation corresponding to each semantic concept: using the weights learned in step S2, linearly superimpose the activation values of the filters in the intermediate layer output retained in step S1 to obtain a total activation value corresponding to each semantic concept; binarize the total activation value to obtain a mask of each sample corresponding to each semantic concept; and finally calculate the set IoU of each representation corresponding to each semantic concept.

(12) Specific steps are as follows:

(13) The set IoU for the concept c is calculated according to the following equation:

(14) IoU_set(c; M) = ∑_{x∈X_c} |M(x) ∩ L_c(x)| / ∑_{x∈X_c} |M(x) ∪ L_c(x)|

(15) For the weight w of the concept c, this equation calculates the IoU (Jaccard coefficient) between the segmentation mask M, obtained by binarizing the total activation value produced by linearly superimposing the filter activation values, and the ground truth segmentation mask L_c;

(16) The corresponding set IoU is calculated for each semantic concept.
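The binarization and set-IoU computation of step S3 can be sketched as follows; the 0.5 threshold on the Sigmoid output and the array shapes are illustrative assumptions.

```python
import numpy as np

def concept_masks(w, acts, threshold=0.5):
    """Linearly superimpose filter maps with the learned weights, apply the
    Sigmoid, and binarize to get one predicted mask per sample."""
    total = 1.0 / (1.0 + np.exp(-np.einsum('k,nkhw->nhw', w, acts)))
    return total > threshold

def set_iou(pred_masks, gt_masks):
    """Set IoU of equation (14): intersections and unions are summed over
    the whole reference set before dividing (not averaged per sample)."""
    inter = np.logical_and(pred_masks, gt_masks).sum()
    union = np.logical_or(pred_masks, gt_masks).sum()
    return inter / union if union > 0 else 0.0
```

For example, a predicted mask with two foreground pixels, one of which overlaps a single-pixel ground truth mask, gives an intersection of 1 and a union of 2, hence a set IoU of 0.5.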

(17) Step S4. Integrate variance between the set IoU of two representations for all semantic concepts;

(18) The step of integrating variance between the set IoU of two representations for all semantic concepts is as follows:

(19) First, R₁ and R₂ denote the representations of the two neural networks, and IoU_set(c_j; R_i), j = 1 … C, denotes the set IoU of the representation R_i corresponding to each semantic concept, where C is the total number of concepts. The semantic variance of common semantic concepts between the two representations is calculated according to the following equation to obtain the semantic variance of R₂ relative to R₁:

(20) SVar_j¹ = (IoU_set(c_j; R₂) − IoU_set(c_j; R₁)) / max(IoU_set(c_j; R₂), IoU_set(c_j; R₁))

(21) Then, the semantic variance of non-common semantic concepts between two representations is calculated according to the following equation to obtain the semantic variance of R.sub.2 relative to R.sub.1:

(22) SVar_j² = (IoU_set(c_j; R₂) − IoU_set(c_j; R₁)) / ((1/C) ∑_{k=1}^{C} IoU_set(c_k; R₁))

(23) Finally, the semantic variances of all the semantic concepts are integrated from the above two equations according to the following equation to calculate the final semantic variance between the two neural network representations:

(24) SVar(R₂; R₁) = ∑_{j=1}^{C} (λ · SVar_j¹ + (1 − λ) · β · SVar_j²) / (∑_{x∈X_R} |L_j(x)| / (|X_R| · S)), where λ = I(min(IoU_set(c_j; R₂), IoU_set(c_j; R₁)) > 0) determines whether the concept c_j is a common concept between the two representations, λ = 1 if c_j is a common concept, otherwise λ = 0, I(·) is an indicator function, X_R is the reference dataset, |L_j(x)| is the number of foreground pixels of the concept c_j in the ground truth mask of the sample x, |X_R| is the total number of samples in the reference dataset, and S = h_s · w_s is the total number of pixels in the ground truth mask.

(25) β = 2 is set to emphasize the semantic variance caused by non-common concepts between the two representations, namely, semantic concepts of R₂ that newly appear or disappear relative to R₁.

(26) The term ∑_{x∈X_R} |L_j(x)| / (|X_R| · S) is the pixel ratio of the concept c_j over the entire reference dataset, and the semantic variance of each semantic concept is divided by this term to eliminate deviations caused by the differing proportions of the semantic concepts in the reference dataset.

(27) When the semantic variance S.Math.Var(R.sub.2; R.sub.1) is positive, it indicates that R.sub.2 has richer semantic information than R.sub.1, and vice versa.
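Steps (19) through (27) can be combined into one sketch. Two points here are assumptions rather than facts from the published text: the helper name `semantic_variance`, and the reading of the garbled equation (24) as a division by the per-concept pixel ratio, which follows the wording of paragraph (26).

```python
import numpy as np

def semantic_variance(iou_r1, iou_r2, pixel_ratio, beta=2.0):
    """Integrate per-concept set IoU differences into SVar(R2; R1).

    iou_r1, iou_r2: (C,) set IoU of each concept under the two representations
    pixel_ratio:    (C,) pixel ratio of each concept in the reference dataset
    beta:           weight for concepts appearing/disappearing in R2 (here 2)
    """
    mean_iou_r1 = iou_r1.mean()          # (1/C) * sum_k IoU_set(c_k; R1)
    svar = 0.0
    for j in range(len(iou_r1)):
        diff = iou_r2[j] - iou_r1[j]
        if min(iou_r1[j], iou_r2[j]) > 0:            # lambda = 1: common concept
            term = diff / max(iou_r1[j], iou_r2[j])  # SVar_j^1
        else:                                        # lambda = 0: non-common concept
            term = beta * diff / mean_iou_r1         # beta * SVar_j^2
        svar += term / pixel_ratio[j]    # divide by the pixel ratio, per (26)
    return svar
```

A positive result indicates that R₂ carries richer semantic information than R₁, matching paragraph (27).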

(28) It should be noted that the above content only explains the technical idea of the present invention and cannot limit the scope of protection of the present invention thereby. For those of ordinary skill in the art, many improvements and modifications can be made without departing from the principle of the present invention, and these improvements and modifications fall within the scope of protection of the claims of the present invention.