IMAGE HAZE REMOVAL METHOD AND APPARATUS, AND DEVICE

Abstract

The present disclosure discloses an image haze removal method and apparatus, and a device. The method includes: acquiring a hazy image to be processed; and obtaining a haze-free image corresponding to the hazy image by inputting the hazy image into a pre-trained haze removal model. The present disclosure uses the residual dual attention fusion modules as basic modules of the neural network, so that each feature map can obtain pixel features while enhancing the global dependence, thus improving the image dehazing effect.

Claims

1. An image haze removal method, comprising: acquiring a hazy image to be processed; and obtaining a haze-free image corresponding to the hazy image to be processed by inputting the hazy image to be processed into a pre-trained haze removal model, wherein the pre-trained haze removal model comprises a plurality of residual groups, each of the residual groups comprises a plurality of residual dual attention fusion modules connected in series, each of the residual dual attention fusion modules comprises a residual block, a first convolutional layer, a channel attention module, a pixel attention module, and a second convolutional layer, an output of the residual block is connected to inputs of the channel attention module and the pixel attention module via the first convolutional layer, and outputs of the channel attention module and the pixel attention module are fused for output processing, such that pixel features are obtained while global dependency of each feature map is enhanced.

2. The image haze removal method according to claim 1, wherein the haze removal model comprises three residual groups, and the three residual groups are in in-channel connection according to outputs from back to front.

3. The image haze removal method according to claim 1, wherein each of the residual groups comprises three residual dual attention fusion modules.

4. The image haze removal method according to claim 1, wherein the outputs of the residual dual attention fusion modules are obtained by inputting the outputs of the channel attention module and the pixel attention module and an input of the residual block into the second convolutional layer for fusion after element-by-element summation.

5. The image haze removal method according to claim 1, wherein the haze removal model further comprises a feature extraction convolutional layer, a channel attention module, a pixel attention module, and an output convolutional layer, the hazy image to be processed enters the residual groups after being subjected to feature extraction by the feature extraction convolutional layer, and enters the channel attention module, the pixel attention module and the output convolutional layer in sequence for processing after being processed by the residual groups, so as to obtain output features, and the haze-free image is obtained by performing element-by-element summation on the output features and the hazy image to be processed.

6. The image haze removal method according to claim 1, wherein the haze removal model is trained by: acquiring an RESIDE dataset, and constructing a training sample set by randomly selecting 6000 pairs of hazy images and haze-free images from the RESIDE dataset; and training a pre-established neural network with the training sample set.

7. The image haze removal method according to claim 6, wherein a loss function L of the neural network is expressed as: $\begin{matrix} L = \frac{1}{N} {.Math.}_{i = 1}^{N} {.Math. {\hat{J}}_{i} - J_{i}^{gt} .Math.}_{1} & (1) \end{matrix}$ where N is the number of training samples, J.sub.i.sup.gt is a real clear image of an ith training sample, and Ĵ.sub.i is a haze-free image estimated by the neural network for the ith training sample.

8. An image haze removal apparatus, comprising: an image acquiring module, configured to acquire a hazy image to be processed; and an image haze removal module, configured to input the hazy image to be processed into a haze removal model for processing, and output a haze-free image corresponding to the hazy image to be processed, wherein the haze removal model comprises a plurality of residual groups, each of the residual groups comprises a plurality of residual dual attention fusion modules connected in series, each of the residual dual attention fusion modules comprises a residual block, a first convolutional layer, a channel attention module, a pixel attention module, and a second convolutional layer, an output of the residual block is connected to inputs of the channel attention module and the pixel attention module via the first convolutional layer, and outputs of the channel attention module and the pixel attention module are fused, such that pixel features are obtained while global dependency of each feature map is enhanced.

9. The image haze removal apparatus according to claim 8, wherein the outputs of the residual dual attention fusion modules are obtained by inputting the outputs of the channel attention module and the pixel attention module and an input of the residual block into the second convolutional layer for fusion after element-by-element summation.

10. A device, comprising a memory, a processor, and a computer program stored on the memory and capable of running on the processor, wherein the processor, when executing the computer program, implements the image haze removal method according to claim 1.

Description

BRIEF DESCRIPTION OF DRAWINGS

[0028] FIG. 1 is a flow chart of an image haze removal method according to an embodiment of the present disclosure.

[0029] FIG. 2 is a schematic structural diagram of a haze removal model according to an embodiment of the present disclosure.

[0030] FIG. 3 is a schematic diagram of a channel attention module according to an embodiment of the present disclosure.

[0031] FIG. 4 is a schematic diagram of a pixel attention module according to an embodiment of the present disclosure.

[0032] FIG. 5 is a schematic diagram of a residual group according to an embodiment of the present disclosure.

[0033] FIG. 6 is a schematic diagram of a residual dual attention fusion module according to an embodiment of the present disclosure.

[0034] FIG. 7 is a comparison diagram of haze removal effects of an image haze removal method according to an embodiment of the present disclosure and other methods.

DETAILED DESCRIPTION OF THE EMBODIMENTS

[0035] The present disclosure is further illustrated below in conjunction with the accompanying drawings. The following examples are provided merely to more clearly illustrate the technical solution of the present disclosure, and are not intended to limit the scope of the present disclosure.

Example 1

[0036] As shown in FIG. 1, the present disclosure provides an image haze removal method. The method establishes an image haze removal model based on a neural network, but improves a convolutional neural network with a fixed receptive field, and uses residual dual attention fusion modules as basic modules. Each residual dual attention fusion module is formed by fusion of a residual block, a channel attention module, and a pixel attention module.

[0037] Specifically, as shown in FIG. 2, the haze removal model includes a feature extraction convolutional layer, three residual groups, a channel attention module, a pixel attention module, and two output convolutional layers. The three residual groups are in in-channel connection according to outputs from back to front. A hazy image to be processed enters the residual groups after being subjected to feature extraction by the feature extraction convolutional layer, and enters the channel attention module, the pixel attention module and the two output convolutional layers in sequence for processing after being processed by the residual groups, so as to obtain output features. A haze-free image is obtained by performing element-by-element summation on the output features and the hazy image to be processed.

[0038] As shown in FIG. 3, in the channel attention module, input features are input into global average pooling, a convolutional layer, an ReLU activation function, a convolutional layer, and a Sigmoid activation function. An output of the channel attention module is obtained by element-by-element multiplication of an obtained output weight and the input features, with expressions as follows:

[00002] $\begin{matrix} G^{c} = \frac{1}{H \times W} {.Math.}_{x = 1}^{H} {.Math.}_{y = 1}^{W} Z^{c} (x, y) & (1) \end{matrix}$ $\begin{matrix} F_{CAB}^{c} = σ (Conv (δ (Conv (G_{c})))) .Math. F^{c} & (2) \end{matrix}$ $\begin{matrix} F_{CAB}^{c} = H_{CAB} (F^{c}) & (3) \end{matrix}$

[0039] where Z.sup.c(x, y) represents a pixel value of an input Z.sup.c of a c-th channel at a position (x,y), and c∈{R, G, B}; after global average pooling, the dimension of a feature map is changed from C×H×W to C×1×1; δ represents the ReLU activation function, σ represents the Sigmoid activation function, .Math. represents element-by-element multiplication; and a mapping function from an input F.sup.c of the channel attention module to the output F.sub.CAB.sup.c of the channel attention module is H.sub.CAB.

[0040] The first convolutional layer of the channel attention module uses 8 convolutional kernels with the size of 1*1, and the second convolutional layer uses 64 convolutional kernels with the size of 1*1.

[0041] As shown in FIG. 4, in the pixel attention module, the input features are input into a convolutional layer, an ReLU activation function, a convolutional layer, and a Sigmoid activation function. An output of the pixel attention module is obtained by element-by-element multiplication of an obtained output weight and the input features, with expressions as follows:

F.sub.PA=σ(Conv(δ(Conv(F)))) (4)

F.sub.PAB=F.sub.PA.Math.F (5)

F.sub.PAB=H.sub.PAB(F) (6)

[0042] where F.sub.PA represents the feature weight of the output, the dimension is changed from C×H×W to 1×H×W, and a mapping function from an input F of the pixel attention module to the output F.sub.PAB of the pixel attention module is H.sub.PAB.

[0043] The first convolutional layer of the pixel attention module uses 8 convolutional kernels with the size of 1*1, and the second convolutional layer uses 1 convolutional kernel with the size of 1*1. Other convolutional layers use 64 convolutional kernels with the size of 3*3.

[0044] As shown in FIG. 5, each residual group includes three residual dual attention fusion modules connected in series and a convolutional layer. The input features are input into the residual dual attention fusion modules and the convolutional layer, and an output of the residual group is obtained by element-by-element summation of an output result and the input features. Expressions of the output of the residual group are as follows:

F.sub.g,m=H.sub.RDAFM(F.sub.g,m-1) (7)

F.sub.g=Conv(F.sub.g,3)⊕F.sub.g,0 (8)

F.sub.g=H.sub.RG(F.sub.g,0) (9)

[0045] where F.sub.g,m-1 and F.sub.g,m represent an input and an output of an mth residual dual attention fusion module in a gth residual group, respectively, g=1, 2, 3, and m=1, 2, 3; a mapping function from an input F.sub.g,m-1 of the residual dual attention fusion module to an output F.sub.g,m of the residual dual attention fusion module is H.sub.RDAFM; and a mapping function from an input F.sub.g,0 of the residual group to the output F.sub.g of the residual group is H.sub.RG.

[0046] As shown in FIG. 6, the residual dual attention fusion module includes a residual block, a first convolutional layer, a channel attention module, a pixel attention module, and a second convolutional layer. The residual block includes a convolutional layer and an ReLU activation function. An output of the residual block is connected to inputs of the channel attention module and the pixel attention module via the first convolutional layer, and outputs of the channel attention module and the pixel attention module are fused, such that the output of the residual dual attention fusion module is obtained by inputting the outputs of the channel attention module and the pixel attention module and an input of the residual block into the second convolutional layer for fusion after element-by-element summation. Thus, pixel features are obtained while the global dependency of the output of the residual dual attention fusion module in each feature map is enhanced. Expressions of the output of the residual dual attention fusion module are as follows:

F.sub.RB=δ(Conv(F))⊕F (10)

F*=Conv(F.sub.RB) (11)

F.sub.RDAFM=Conv(F.sub.CAB(F*)⊕F.sub.PAB(F*)⊕F) (12)

F.sub.RDAFM=H.sub.RDAFM(F) (13)

[0047] where ⊕ represents element-by-element summation, F.sub.RB represents the output of the residual block, F* represents the inputs of the attention modules, and a mapping function from the input F of the residual dual attention fusion module to the output F.sub.RDAFM of the residual dual attention fusion module is H.sub.RDAFM.

[0048] The haze removal model is trained by the following steps: acquire an RESIDE dataset, and construct a training sample set by randomly selecting 6000 pairs of hazy images and haze-free images from the RESIDE dataset; and train the neural network with the training sample set to obtain the haze removal model. During use, the hazy image to be processed is acquired and input into the haze removal model to obtain the haze-free image.

[0049] A loss function L of the neural network is expressed as:

[00003] $\begin{matrix} L = \frac{1}{N} {.Math.}_{i = 1}^{N} {.Math. {\hat{J}}_{i} - J_{i}^{gt} .Math.}_{1} & (14) \end{matrix}$

[0050] where N is the number of training samples, J.sub.i.sup.gt is a real clear image of an ith training sample, and Ĵ.sub.i is a haze-free image estimated by the neural network for the ith training sample.

[0051] In the neural network, weight parameters of the network are initialized with an Adam optimizer, where default values of β.sub.1 and β.sub.2 are 0.9 and 0.999, respectively. An initial learning rate α is set as 1×10.sup.−4. The learning rate is updated using a cosine annealing strategy, and is adjusted from the initial value to 0:

[00004] $α_{t} = \frac{1}{2} (1 + \cos (\frac{t π}{T})) α$

[0052] where T is the total number of batches, α is the initial learning rate, t is a current batch, and α.sub.t is an adaptively updated learning rate.

[0053] For each sample image input into the training set of the haze removal network model, the total loss of the difference between a real clear image and a haze-removed image restored by the network is first obtained using forward propagation, and then weight parameters are updated based on the Adam optimizer. The total number of training steps is 1×10.sup.5, and every 200 steps is a batch, for a total of 500 batches. The above steps are repeated until the set maximum step length is reached, so as to obtain the trained haze removal network model, with expressions as follows:

F.sub.0=Conv(I) (16)

F.sub.g=H.sub.RG(F.sub.g-1) (17)

F= custom-character {F.sub.3,F.sub.2,F.sub.1} (18)

Ĵ=Conv(Conv(H.sub.PAB(H.sub.CAB(F))))⊕I (19)

[0054] where I represents the input hazy image, F.sub.g-1 and F.sub.g represent an input and an output of the gth residual group, respectively, g=1, 2, 3, custom-character {⋅} represents the operation of in-channel connection, and Ĵ represent a restored output image.

Example 2

[0055] In this example, an image haze removal apparatus is further provided. The apparatus includes:

[0056] an image acquiring module, configured to acquire a hazy image to be processed; and

[0057] an image haze removal module, configured to input the hazy image to be processed into a haze removal model for processing, and output a haze-free image corresponding to the hazy image to be processed.

[0058] The haze removal model includes a plurality of residual groups. Each of the residual groups includes a plurality of residual dual attention fusion modules connected in series. Each of the residual dual attention fusion modules includes a residual block, a first convolutional layer, a channel attention module, a pixel attention module, and a second convolutional layer. An output of the residual block is connected to inputs of the channel attention module and the pixel attention module via the first convolutional layer. Outputs of the residual dual attention fusion modules are obtained by inputting outputs of the channel attention module and the pixel attention module and an input of the residual block into the second convolutional layer for fusion after element-by-element summation, such that pixel features are obtained while global dependency of each feature map is enhanced.

Example 3

[0059] In this example, a device is further provided. The device includes a memory, a processor, and a computer program stored on the memory and capable of running on the processor. The processor, when executing the computer program, implements the image haze removal method according to Example 1.

[0060] Those skilled in the art will appreciate that the embodiments of the present application may be provided as methods, systems, or computer program products. Therefore, the present application may take the form of a full hardware embodiment, a full software embodiment, or an embodiment combining software and hardware. Besides, the present application may adopt the form of a computer program product implemented on one or more computer available storage media (including but not limited to a disk memory, a CD-ROM, an optical memory and the like) including computer available program codes.

[0061] The present application is described with reference to the flow diagram and/or block diagram of the method, device (system), and computer program product according to the embodiments of the present application. It should be understood that each flow and/or block in the flow diagram and/or block diagram and the combination of flows and/or blocks in the flow diagram and/or block diagram may be implemented by computer program instructions. These computer program instructions may be provided to processors of a general-purpose computer, a special-purpose computer, an embedded processor or other programmable data processing devices to generate a machine, such that instructions executed by processors of a computer or other programmable data processing devices generate an apparatus for implementing the functions specified in one or more flows of the flow diagram and/or one or more blocks of the block diagram.

[0062] These computer program instructions may also be stored in a computer-readable memory capable of guiding a computer or other programmable data processing devices to work in a specific manner, such that instructions stored in the computer-readable memory generate a manufactured product including an instruction apparatus, and the instruction apparatus implements the functions specified in one or more flows of the flow diagram and/or one or more blocks of the block diagram.

[0063] These computer program instructions may also be loaded on a computer or other programmable data processing devices, such that a series of operation steps are executed on the computer or other programmable devices to produce computer-implemented processing, and thus, the instructions executed on the computer or other programmable devices provide steps for implementing the functions specified in one or more flows of the flow diagram and/or one or more blocks of the block diagram.

[0064] The above are only preferred implementations of the present disclosure. It should be noted that those of ordinary skill in the art can also make several improvements and transformation without departing from the technical principle of the present disclosure, and these improvements and transformation shall also fall within the scope of the present disclosure.

IMAGE HAZE REMOVAL METHOD AND APPARATUS, AND DEVICE

Assignee

Inventors

Cpc classification

Classification Explorer

G06T5/003

PHYSICS

Classification Explorer

G06T5/006

PHYSICS

Classification Explorer

G06T2207/20084

PHYSICS

Classification Explorer

G06N3/00

PHYSICS

International classification

Classification Explorer

G06T5/00

PHYSICS

Abstract

Claims

Description