Facial beauty prediction method and device based on multi-task migration

Abstract

Disclosed are a facial beauty prediction method and device based on multi-task migration. The method includes: performing similarity measurement based on a graph structure on a plurality of tasks to obtain an optimal combination of the plurality of tasks; constructing a facial beauty prediction model including a feature sharing layer based on the optimal combination; migrating feature parameters of an existing large-scale facial image network to the feature sharing layer of the facial beauty prediction model; inputting facial images for training to pre-train the facial beauty prediction model; and inputting a facial image to be tested to the trained facial beauty prediction model to obtain facial recognition results.

Claims

1. A facial beauty prediction method based on multi-task migration, comprising the following steps: performing similarity measurement based on a graph structure on a plurality of tasks to obtain an optimal combination of the plurality of tasks, wherein the plurality of tasks comprise a main task for facial beauty prediction and several auxiliary tasks for recognition of facial beauty factors; constructing a facial beauty prediction model based on the optimal combination, wherein the facial beauty prediction model comprises a feature sharing layer for extracting shared image features of the plurality of tasks; migrating feature parameters of an existing large-scale facial image network to the feature sharing layer of the facial beauty prediction model; inputting facial images for training to pre-train the facial beauty prediction model; and inputting a facial image to be tested to the trained facial beauty prediction model to obtain facial recognition results.

2. The facial beauty prediction method based on multi-task migration of claim 1, wherein the performing similarity measurement based on a graph structure on a plurality of tasks to obtain an optimal combination of the plurality of tasks comprises the following steps: constructing a specific network for each of the plurality of tasks and training the specific network to obtain a feature expression $E_s(I)$ of each task; matching vertices and paths of a plurality of the specific networks to construct a migration network among the plurality of tasks; measuring a task tightness among the plurality of the tasks, wherein the task tightness is calculated according to a formula: $D_{s \to t} := \arg\min_{\theta} \mathbb{E}_{I \in D} [L_{t}(D_{\theta}(E_{s}(I)), f_{t}(I))],$ where I is an input, D is a data set, $f_t(I)$ is a true value of the t-th input I, $L_t$ is a loss between the true value and a predicted value, and $\mathbb{E}_{I \in D}$ is an expected value; calculating a correlation matrix among the plurality of tasks, wherein each element in the correlation matrix is: $w'_{i,j} = \frac{\mathbb{E}_{I \in D_{test}} [D_{s_i \to t}(I) > D_{s_j \to t}(I)]}{\mathbb{E}_{I \in D_{test}} [D_{s_i \to t}(I) < D_{s_j \to t}(I)]};$ constructing a directed acyclic graph structure by using each task as a node of the graph structure and the correlation matrix as a supervision value of the graph structure, and searching for an optimal path by minimizing a supervision budget to obtain the optimal combination of the plurality of tasks.

3. The facial beauty prediction method based on multi-task migration of claim 2, wherein the facial beauty prediction model further comprises a pre-processing layer for pre-processing the facial images for training and the facial image to be tested, independent feature extraction layers for extracting independent features of the plurality of tasks, feature fusion layers for fusing the independent features with geometric features and texture features corresponding to each task, and classification layers.

4. The facial beauty prediction method based on multi-task migration of claim 3, wherein the migrating feature parameters to the feature sharing layer comprises: comparing the feature parameters with basic parameters which are configured by the feature sharing layer according to the optimal combination; and migrating the feature parameters corresponding to the basic parameters.

5. The facial beauty prediction method based on multi-task migration of claim 1, wherein the auxiliary tasks comprise expression recognition, gender recognition and age recognition, and the facial recognition results comprise a facial beauty prediction result corresponding to the main task, and an expression recognition result, a gender recognition result and an age recognition result corresponding to the auxiliary tasks.

6. A facial beauty prediction device based on multi-task migration, comprising: a similarity measurement module, configured to perform similarity measurement based on a graph structure on a plurality of tasks to obtain an optimal combination of the plurality of tasks, wherein the plurality of tasks comprise a main task for facial beauty prediction and several auxiliary tasks for recognition of facial beauty factors; a model construction module, configured to construct a facial beauty prediction model based on the optimal combination, wherein the facial beauty prediction model comprises a feature sharing layer for extracting shared image features of the plurality of tasks; a feature parameter migration module, configured to migrate feature parameters of an existing large-scale facial image network to the feature sharing layer of the facial beauty prediction model; a pre-training module, configured to input facial images for training to pre-train the facial beauty prediction model; and a result obtaining module, configured to input a facial image to be tested to the trained facial beauty prediction model to obtain facial recognition results.

7. The facial beauty prediction device based on multi-task migration of claim 6, wherein the similarity measurement module comprises: a feature expression obtaining module, configured to construct a specific network for each of the plurality of tasks and train the specific network to obtain a feature expression $E_s(I)$ of each task; a migration network construction module, configured to match vertices and paths of the plurality of specific networks to construct a migration network among the plurality of tasks; a tightness measurement module, configured to measure a task tightness among the plurality of tasks, wherein the task tightness is calculated according to a formula: $D_{s \to t} := \arg\min_{\theta} \mathbb{E}_{I \in D} [L_{t}(D_{\theta}(E_{s}(I)), f_{t}(I))],$ where I is an input, D is a data set, $f_t(I)$ is a true value of the t-th input I, $L_t$ is a loss between the true value and a predicted value, and $\mathbb{E}_{I \in D}$ is an expected value; a correlation processing module, configured to calculate a correlation matrix among the plurality of tasks, wherein each element in the correlation matrix is: $w'_{i,j} = \frac{\mathbb{E}_{I \in D_{test}} [D_{s_i \to t}(I) > D_{s_j \to t}(I)]}{\mathbb{E}_{I \in D_{test}} [D_{s_i \to t}(I) < D_{s_j \to t}(I)]};$ a graph structure construction module, configured to construct a directed acyclic graph structure by using each task as a node of the graph structure and the correlation matrix as a supervision value of the graph structure, and search for an optimal path by minimizing a supervision budget to obtain the optimal combination of the plurality of tasks.

8. The facial beauty prediction device based on multi-task migration of claim 7, wherein the facial beauty prediction model further comprises a pre-processing layer for pre-processing the facial images for training and the facial image to be tested, independent feature extraction layers for extracting independent features of the plurality of tasks, feature fusion layers for fusing the independent features with geometric features and texture features corresponding to each task, and classification layers.

9. The facial beauty prediction device based on multi-task migration of claim 8, wherein the feature sharing layer comprises a parameter configurator, which is used to configure basic parameters according to the optimal combination; and the feature parameter migration module comprises a matching module, which is configured to compare the feature parameters with the basic parameters and migrate the feature parameters corresponding to the basic parameters.

10. The facial beauty prediction device based on multi-task migration of claim 6, wherein the auxiliary tasks comprise expression recognition, gender recognition and age recognition, and the facial recognition results comprise a facial beauty prediction result corresponding to the main task, and an expression recognition result, a gender recognition result and an age recognition result corresponding to the auxiliary tasks.

Description

BRIEF DESCRIPTION OF DRAWINGS

(1) The present disclosure will be further explained below with reference to the accompanying drawings and examples.

(2) FIG. 1 is a diagram of steps of a facial beauty prediction method based on multi-task migration according to an embodiment of the present disclosure;

(3) FIG. 2 is a diagram of sub-steps of step S100;

(4) FIG. 3 is a network structure diagram of a facial beauty prediction model; and

(5) FIG. 4 shows a structure of a facial beauty prediction device based on multi-task migration according to an embodiment of the present disclosure.

DETAILED DESCRIPTION

(6) Specific embodiments of the present disclosure are described in detail in this section, and the preferred embodiments are illustrated in the accompanying drawings. The drawings supplement the written description so that each technical feature and the overall technical solution of the present disclosure can be understood intuitively; they are not to be construed as limiting the scope of the present disclosure.

(7) In the description of the present disclosure, unless otherwise specified, terms such as “provided”, “mounted” and “connected” should be understood in a broad sense, and the specific meanings of the terms in the present disclosure can be reasonably determined by those skilled in the art in light of the specific contents of the technical solutions.

(8) Referring to FIG. 1, an embodiment of the present disclosure provides a facial beauty prediction method based on multi-task migration, including the following steps S100-S500.

(9) At step S100, similarity measurement based on a graph structure is performed on a plurality of tasks to obtain an optimal combination of the plurality of tasks. The plurality of tasks include a main task for facial beauty prediction and several auxiliary tasks for recognition of facial beauty factors.

(10) At step S200, a facial beauty prediction model is constructed based on the optimal combination. The facial beauty prediction model includes a feature sharing layer 20 for extracting shared image features of the plurality of tasks.

(11) At step S300, feature parameters of an existing large-scale facial image network are migrated to the feature sharing layer 20 of the facial beauty prediction model.

(12) At step S400, facial images for training are inputted to pre-train the facial beauty prediction model.

(13) At step S500, a facial image to be tested is inputted to the trained facial beauty prediction model to obtain facial recognition results.

(14) N tasks can be paired in N*(N−1)/2 ways, so training and fusing every combination would greatly increase the amount of training and generate redundant or useless data that degrades the precision of classification and recognition. In this embodiment, the similarity measurement based on the graph structure is performed on the plurality of tasks to find the correlation between the tasks and obtain the optimal combination of the plurality of tasks, and the optimal combination is built into the facial beauty prediction model, thereby simplifying the facial beauty prediction model, reducing the redundancy of deep learning tasks, easing the burden of network training, and improving the efficiency and precision of network classification and recognition.
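As a small illustration of the count stated above, the following Python sketch enumerates the unordered task pairs for the four tasks named in this disclosure. The function name and the task labels are illustrative assumptions, not part of the disclosed method.

```python
from itertools import combinations

def pairwise_task_combinations(tasks):
    """Return every unordered pair of tasks; N tasks give N*(N-1)/2 pairs."""
    return list(combinations(tasks, 2))

# Illustrative labels for the main task and the three auxiliary tasks.
tasks = ["beauty", "expression", "gender", "age"]
pairs = pairwise_task_combinations(tasks)
print(len(pairs))  # 4 * 3 / 2 = 6
```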

(15) Migration learning (transfer learning) improves the learning of new tasks by transferring knowledge from related tasks that have already been learned. The feature parameters of the existing large-scale facial image network are migrated for learning, which further reduces the cost of training the facial beauty prediction model and can also improve the precision of classification and recognition results.

(16) The facial beauty prediction model is refined by combining the optimal combination of multiple tasks obtained after the similarity measurement and migrating the feature parameters for learning, which simplifies the structure of the facial beauty prediction model, reduces the amount of data the model must process to improve processing efficiency, and also avoids weakly correlated data to improve the precision of classification and recognition.

(17) Referring to FIG. 2, further, step S100 specifically includes the following steps S110-S150.

(18) At step S110, a specific network is constructed for each of the plurality of tasks and trained to obtain a feature expression $E_s(I)$ of each task, where each specific network has an encoder and a decoder, all encoders have the same ResNet50 structure, and each decoder corresponds to a different task.

(19) At step S120, vertices and paths of the plurality of specific networks are matched to construct a migration network among the plurality of tasks.

(20) At step S130, a task tightness among the plurality of tasks is measured, wherein the task tightness is calculated according to a formula:

(21) $D_{s \to t} := \arg\min_{\theta} \mathbb{E}_{I \in D} [L_{t}(D_{\theta}(E_{s}(I)), f_{t}(I))],$
where I is an input, D is a data set, $f_t(I)$ is a true value of the t-th input I, $L_t$ is a loss between the true value and a predicted value, and $\mathbb{E}_{I \in D}$ is an expected value; the migration network is a directed graph, each node of the directed graph corresponds to a task, and the weight between nodes is the task tightness.
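The minimization in the formula above can be illustrated with a deliberately simplified Python sketch. It assumes scalar source features, a linear readout D_θ(x) = a·x + b, and a squared loss for L_t; none of these choices is specified in this disclosure, and the function names are hypothetical.

```python
def fit_readout(source_features, targets):
    """Fit D_theta(x) = a*x + b by minimizing the mean squared loss.

    source_features plays the role of E_s(I) and targets the role of f_t(I).
    The linear form and the squared loss are assumptions for illustration.
    """
    n = len(source_features)
    mean_x = sum(source_features) / n
    mean_y = sum(targets) / n
    var_x = sum((x - mean_x) ** 2 for x in source_features)
    cov_xy = sum((x - mean_x) * (y - mean_y)
                 for x, y in zip(source_features, targets))
    a = cov_xy / var_x if var_x > 0 else 0.0
    b = mean_y - a * mean_x
    return a, b

def expected_loss(theta, source_features, targets):
    """Empirical mean of the squared loss over the data set."""
    a, b = theta
    return sum((a * x + b - y) ** 2
               for x, y in zip(source_features, targets)) / len(targets)

# Toy data: the target is an exact linear function of the source feature.
features = [0.0, 1.0, 2.0, 3.0]
labels = [1.0, 3.0, 5.0, 7.0]
theta = fit_readout(features, labels)
loss = expected_loss(theta, features, labels)
```

In this toy setting a low residual loss would indicate that the source representation transfers well to the target task; in practice the readout would be a trained network rather than a closed-form fit.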

(22) At step S140, a correlation matrix among the plurality of tasks is calculated; specifically, for each task pair (i, j) in which a source task points to a target task, a test set is taken out by a hold-out method after migration; a matrix $W_t$ is constructed for each task, the output result of the matrix $W_t$ is controlled within a range [0.001, 0.999] by means of a Laplace smoothing method, and then the correlation matrix is obtained by transformation, wherein the correlation matrix reflects a similarity probability among the tasks. Each element $w'_{i,j}$ in $W'_t$ is calculated as follows:

(23) $w'_{i,j} = \frac{\mathbb{E}_{I \in D_{test}} [D_{s_i \to t}(I) > D_{s_j \to t}(I)]}{\mathbb{E}_{I \in D_{test}} [D_{s_i \to t}(I) < D_{s_j \to t}(I)]};$
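A minimal Python sketch of the ratio above is given here, assuming that the per-image values D_{s_i→t}(I) on the held-out test set are already available as a list of lists. Clipping to [0.001, 0.999] is used as a simple stand-in for the Laplace smoothing mentioned in step S140; the function name and the input layout are assumptions.

```python
def correlation_matrix(d, low=0.001, high=0.999):
    """Compute w'[i][j] for one target task.

    d[i][k] is the value D_{s_i -> t}(I_k) for source task i on held-out
    image k. w[i][j] is the fraction of images on which source i's value
    exceeds source j's, clipped to [low, high]; w'[i][j] = w[i][j] / w[j][i].
    """
    n = len(d)
    num_images = len(d[0])
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            wins = sum(1 for k in range(num_images) if d[i][k] > d[j][k])
            w[i][j] = min(max(wins / num_images, low), high)
    return [[w[i][j] / w[j][i] for j in range(n)] for i in range(n)]

# Two source tasks evaluated on four held-out images (toy values).
values = [
    [1.0, 1.0, 1.0, 1.0],
    [2.0, 2.0, 2.0, 0.0],
]
w_prime = correlation_matrix(values)
```

The clipping keeps every denominator strictly positive, so the ratio is defined even when one source task exceeds another on every test image.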

(24) At step S150, a directed acyclic graph structure is constructed by using each task as a node of the graph structure and the correlation matrix as a supervision value of the graph structure, and an optimal path is searched for by minimizing a supervision budget to obtain the optimal combination of the plurality of tasks, that is, a problem of subgraph selection is solved based on the correlation matrix.
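The subgraph selection in step S150 can be sketched as a small brute-force search. The sketch below assumes that each auxiliary task has already been given a scalar affinity to the main task (for example, derived from the correlation matrix) and a supervision cost, and it picks the subset with the highest total affinity whose cost fits the budget. The affinity values, the costs, and the additive objective are all hypothetical; this disclosure does not specify the search procedure at this level of detail.

```python
from itertools import combinations

def select_auxiliary_tasks(affinity, cost, budget):
    """Return the subset of tasks with maximal total affinity within budget.

    affinity and cost map task name -> number. Exhaustive search is only
    practical for a handful of tasks, which is the case illustrated here.
    """
    names = sorted(affinity)
    best_subset, best_score = (), 0.0
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            total_cost = sum(cost[name] for name in subset)
            if total_cost > budget:
                continue  # exceeds the supervision budget
            score = sum(affinity[name] for name in subset)
            if score > best_score:
                best_subset, best_score = subset, score
    return best_subset, best_score

# Hypothetical affinities and costs for the three auxiliary tasks.
affinity = {"expression": 0.6, "gender": 0.5, "age": 0.4}
cost = {"expression": 2, "gender": 1, "age": 1}
chosen, score = select_auxiliary_tasks(affinity, cost, budget=2)
```

With these toy numbers the search prefers the two cheaper tasks over the single most related but more expensive one, which is the kind of trade-off a supervision budget introduces.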

(25) It should be noted that searching for the optimal vertex matching between two graphs means finding an optimal mapping function that establishes a bijection between the vertex sets of the two graphs, so that the difference between corresponding vertices is minimized. The optimal path matching between the two graphs is found by calculating a common path matrix, which is preferably calculated by a Kasa-algorithm, a variant of the Floyd-Warshall algorithm. After the vertex and path matching, the migration network among the plurality of tasks is obtained.
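For orientation, the standard Floyd-Warshall all-pairs shortest-path algorithm is sketched below in Python. This is the textbook algorithm only; the Kasa-algorithm variant and the construction of the common path matrix are not detailed in this disclosure and are not reproduced here. The example graph is an assumption.

```python
import math

def floyd_warshall(weights):
    """Return all-pairs shortest path lengths for a weighted directed graph.

    weights[i][j] is the edge weight from vertex i to vertex j, math.inf
    where no edge exists, and 0 on the diagonal.
    """
    n = len(weights)
    dist = [row[:] for row in weights]  # copy so the input is not modified
    for k in range(n):          # allow vertex k as an intermediate stop
        for i in range(n):
            for j in range(n):
                through_k = dist[i][k] + dist[k][j]
                if through_k < dist[i][j]:
                    dist[i][j] = through_k
    return dist

INF = math.inf
graph = [
    [0, 3, INF],
    [INF, 0, 1],
    [2, INF, 0],
]
shortest = floyd_warshall(graph)
```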

(26) Referring to FIG. 3, the facial beauty prediction model further includes a pre-processing layer 10 for pre-processing the facial images for training and the facial image to be tested, independent feature extraction layers 30 for extracting independent features of the plurality of tasks, feature fusion layers 40 for fusing the independent features with geometric features and texture features corresponding to each task, and classification layers 50. The independent feature extraction layer 30 includes a convolutional layer, a BN layer, an activation function layer, and a pooling layer that are sequentially connected, and the classification layer 50 includes two fully connected layers.
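The layer arrangement described above can be summarized as a plain data structure. The following Python sketch only records which layers each task branch contains; it performs no computation, and the class names, field names, and layer labels are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class TaskBranch:
    """One per-task branch: layers 30, 40, and 50 in the description."""
    task: str
    # Independent feature extraction layer 30: four sequential sub-layers.
    independent_extraction: tuple = ("conv", "batch_norm", "activation", "pooling")
    # Feature fusion layer 40.
    fusion: str = "feature_fusion"
    # Classification layer 50: two fully connected layers.
    classifier: tuple = ("fully_connected", "fully_connected")

@dataclass
class ModelSpec:
    pre_processing: str   # pre-processing layer 10
    feature_sharing: str  # feature sharing layer 20
    branches: list        # one TaskBranch per task

def build_model_spec(tasks):
    """Create one shared trunk followed by one branch per task."""
    return ModelSpec(
        pre_processing="pre_processing",
        feature_sharing="feature_sharing",
        branches=[TaskBranch(task) for task in tasks],
    )

spec = build_model_spec(["beauty", "expression", "gender", "age"])
```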

(27) Specifically, in the pre-training step, the facial images for training are inputted to the facial beauty prediction model and are pre-processed by the pre-processing layer 10, for example by gray processing and pixel normalization. The facial images for training then enter the feature sharing layer 20 and are processed by the feature sharing layer 20 to obtain a feature map with shared image features; the feature sharing layer 20 configures basic parameters according to the optimal combination of the plurality of tasks; and in the step of migrating the feature parameters to the feature sharing layer 20, the feature parameters are compared with the basic parameters, and the feature parameters corresponding to the basic parameters are migrated to eliminate unnecessary feature parameters and simplify the step of extracting shared image features. There are different independent feature extraction layers 30 corresponding to different tasks, the inputs of the different independent feature extraction layers 30 are the feature maps with shared image features, and the different independent feature extraction layers 30 extract independent features corresponding to the different tasks. Specifically, the main task includes facial beauty prediction, and the auxiliary tasks include expression recognition, gender recognition, and age recognition. There are 4 independent feature extraction layers 30, which correspond to facial beauty prediction, expression recognition, gender recognition, and age recognition, respectively. Each independent feature extraction layer 30 is connected to a feature fusion layer 40, and the feature fusion layer 40 is connected to a classification layer 50. The feature fusion layer 40 fuses the independent features with geometric features and texture features according to $F_{fusion} = [F_{CNN}, G, H]$ to obtain fusion features, where $F_{fusion}$ represents the fusion features, $F_{CNN}$ represents the independent features, G represents the geometric features, and H represents the texture features. The geometric features and the texture features are provided by existing image libraries or other feature extraction networks. The classification layer 50 obtains facial recognition results according to the fusion features. A large number of facial images for training are inputted to refine the facial beauty prediction model. Specifically, the facial recognition results include a facial beauty prediction result corresponding to the main task, and an expression recognition result, a gender recognition result and an age recognition result corresponding to the auxiliary tasks.
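Two of the operations named above, pre-processing and feature fusion, can be sketched in a few lines of Python. The sketch assumes that gray processing uses the common ITU-R BT.601 luma weights, that pixel normalization divides by 255, and that fusion is a plain concatenation of the three feature vectors; this disclosure does not fix any of these details, and the function names are hypothetical.

```python
def to_gray(pixel):
    """Convert an (r, g, b) pixel to a gray value.

    Assumption: ITU-R BT.601 luma weights; the source does not name a formula.
    """
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b

def preprocess(image):
    """Gray processing followed by normalization of 0..255 values to [0, 1]."""
    return [[to_gray(pixel) / 255.0 for pixel in row] for row in image]

def fuse(f_cnn, geometric, texture):
    """Fuse features by concatenation (an assumed reading of the fusion step)."""
    return list(f_cnn) + list(geometric) + list(texture)

# A 1x2 toy image: one white pixel and one black pixel.
image = [[(255, 255, 255), (0, 0, 0)]]
gray = preprocess(image)

# Toy feature vectors for one task branch.
fused = fuse([0.1, 0.2], [0.3], [0.4, 0.5])
```

The fused vector would then be passed to the classification layer of the corresponding task branch.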

(28) Similarly, the facial image to be tested is processed by the trained facial beauty prediction model based on the above steps. The facial beauty prediction model outputs the facial recognition results corresponding to the facial image to be tested to complete the facial beauty prediction.

(29) The above-mentioned facial beauty prediction method performs similarity measurement on the plurality of tasks to search for the correlation between the plurality of tasks and obtain the optimal combination of the plurality of tasks, and combines the optimal combination into the deep learning model, thereby reducing the redundancy of deep learning tasks, easing the burden of network training, and improving the efficiency and precision of network classification and recognition; and the feature parameters of the existing facial image network are migrated for learning to further reduce the cost of network training.

(30) Referring to FIG. 4, another embodiment of the present disclosure provides a facial beauty prediction device based on multi-task migration using the above-mentioned facial beauty prediction method, including:

(31) a similarity measurement module 100, configured to perform similarity measurement based on a graph structure on a plurality of tasks to obtain an optimal combination of the plurality of tasks, where the plurality of tasks include a main task for facial beauty prediction and several auxiliary tasks for recognition of facial beauty factors;

(32) a model construction module 200, configured to construct a facial beauty prediction model based on the optimal combination, where the facial beauty prediction model includes a feature sharing layer 20 for extracting shared image features of the plurality of tasks;

(33) a feature parameter migration module 300, configured to migrate feature parameters of an existing large-scale facial image network to the feature sharing layer 20 of the facial beauty prediction model;

(34) a pre-training module 400, configured to input facial images for training to pre-train the facial beauty prediction model; and

(35) a result obtaining module 500, configured to input a facial image to be tested to the trained facial beauty prediction model to obtain facial recognition results.

(36) Further, the similarity measurement module 100 includes:

(37) a feature expression obtaining module 110, configured to construct a specific network for each of the plurality of tasks and train the same to obtain a feature expression $E_s(I)$ of each task;

(38) a migration network construction module 120, configured to match vertices and paths of the plurality of specific networks to construct a migration network among the plurality of tasks;

(39) a tightness measurement module 130, configured to measure a task tightness among the plurality of tasks, where the task tightness is calculated according to a formula:

(40) $D_{s \to t} := \arg\min_{\theta} \mathbb{E}_{I \in D} [L_{t}(D_{\theta}(E_{s}(I)), f_{t}(I))],$
where I is an input, D is a data set, $f_t(I)$ is a true value of the t-th input I, $L_t$ is a loss between the true value and a predicted value, and $\mathbb{E}_{I \in D}$ is an expected value;

(41) a correlation processing module 140, configured to calculate a correlation matrix among the plurality of tasks, where each element in the correlation matrix is:

(42) $w'_{i,j} = \frac{\mathbb{E}_{I \in D_{test}} [D_{s_i \to t}(I) > D_{s_j \to t}(I)]}{\mathbb{E}_{I \in D_{test}} [D_{s_i \to t}(I) < D_{s_j \to t}(I)]};$

(43) a graph structure construction module 150, configured to construct a directed acyclic graph structure by using each task as a node of the graph structure and the correlation matrix as a supervision value of the graph structure, and search for an optimal path by minimizing a supervision budget to obtain the optimal combination of the plurality of tasks.

(44) Referring to FIG. 3, the facial beauty prediction model further includes a pre-processing layer 10 for pre-processing the facial images for training and the facial image to be tested, independent feature extraction layers 30 for extracting independent features of the plurality of tasks, feature fusion layers 40 for fusing the independent features with geometric features and texture features corresponding to each task, and classification layers 50.

(45) Further, the feature sharing layer 20 includes a parameter configurator, which is used to configure basic parameters according to the optimal combination; the feature parameter migration module includes a matching module, which is configured to compare the feature parameters with the basic parameters and migrate the feature parameters corresponding to the basic parameters.
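The comparison performed by the matching module can be sketched as a filter over named parameters. The Python sketch below assumes that both the pre-trained feature parameters and the basic parameters are dictionaries mapping a parameter name to a flat list of values, and that two entries "correspond" when the name and the length agree. These representations and the matching rule are assumptions made for illustration.

```python
def migrate_parameters(pretrained, basic):
    """Copy pre-trained values into the slots defined by the basic parameters.

    Only parameters whose name exists in `basic` and whose length matches are
    migrated; other pre-trained parameters are dropped, and basic parameters
    without a match keep their configured values.
    """
    migrated = {name: list(values) for name, values in basic.items()}
    for name, values in pretrained.items():
        if name in basic and len(values) == len(basic[name]):
            migrated[name] = list(values)
    return migrated

# Hypothetical parameter sets.
pretrained = {"conv1": [0.1, 0.2], "conv2": [0.3], "fc": [9.9]}
basic = {"conv1": [0.0, 0.0], "conv2": [0.0, 0.0]}
shared_layer_params = migrate_parameters(pretrained, basic)
```

Here "conv1" is migrated, "conv2" keeps its configured values because the sizes differ, and "fc" is discarded because the feature sharing layer has no such slot.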

(46) Further, the auxiliary tasks include expression recognition, gender recognition and age recognition, and the facial recognition results include a facial beauty prediction result corresponding to the main task, and an expression recognition result, a gender recognition result and an age recognition result corresponding to the auxiliary tasks.

(47) The above-mentioned facial beauty prediction device performs similarity measurement based on a graph structure on the plurality of tasks to search for the correlation between the plurality of tasks and obtain the optimal combination of the plurality of tasks, and combines the optimal combination into the deep learning model, thereby reducing the redundancy of deep learning tasks, easing the burden of network training, and improving the efficiency and precision of network classification and recognition; and the feature parameters of the existing facial image network are migrated for learning to further reduce the cost of network training.

(48) Another embodiment of the present disclosure provides a storage medium storing executable instructions, the executable instructions enabling a processor connected to the storage medium to process facial images according to the above-mentioned facial beauty prediction method to obtain facial beauty recognition results.

(49) Described above are only preferred embodiments of the present disclosure, and the present disclosure is not limited to the above-mentioned embodiments. As long as the embodiments achieve the technical effects of the present disclosure by the same means, they shall fall within the protection scope of the present disclosure.

CPC classification

G06F18/214, G06V40/178, G06V40/174, G06F18/253, G06V40/168, G06F18/22 (all in section G, PHYSICS)

International classification

G06V40/16, G06F18/22, G06F18/25 (all in section G, PHYSICS)
