Method for restoring video data of pipe based on computer vision

11620735 · 2023-04-04

Assignee

CHINA UNIVERSITY OF MINING & TECHNOLOGY, BEIJING (Beijing, CN)

Abstract

A method for restoring video data of a pipe based on computer vision is provided. The method includes: performing gray stretching on pipe image/video collected by a pipe robot; processing noise interference by smoothing filtering; extracting an iron chain from the center of a video image as a template for location; performing target recognition on the center of video data by an SIFT corner detection algorithm; detecting ropes on left and right sides of a target by Hough transform; performing gray covering on the iron chain at the center of the video image and the ropes on two sides; and restoring data by an FMM image restoration algorithm.

Claims

1. A method for restoring video data of a pipe based on computer vision, comprising:

step (1): collecting image/video information of a pipe by a pipe robot with a high-definition camera entering the pipe, and performing gray stretching on the collected pipe image/video; enhancing contrast of the pipe image to make light and shade contrast of the pipe image more distinct and features more obvious; wherein a gray value ƒ(x,y) of each pixel (x,y) in an input image is an independent variable of a function, H denotes a transform operation performed on ƒ(x,y) in a spatial domain to increase or reduce the gray value, to obtain a dependent variable as a gray value g(x,y) in an output image, and equation (1) is as follows:

g(x,y)=H[ƒ(x,y)] (1)

performing spatial smoothing filtering enhancement on a gray image by using an adjacent averaging method of a spatial domain method, wherein the weight of each pixel is equal in the adjacent averaging method, that is, importance of each pixel is assumed to be the same, and equation (2) is as follows:

$\begin{matrix} g (x, y) = \frac{1}{M} \sum_{(i, j) \in s} f (i, j) & (2) \end{matrix}$

wherein s is a set of pixel coordinates in a neighborhood of (x,y), (i,j) are coordinates of a pixel in the neighborhood and M is a number of pixels in the set s;

step (2): extracting an iron chain from a center of data as a template for target recognition, and intercepting an image of the iron chain in a center of the video data image after image preprocessing;

step (3): performing target detection on all data by using a scale-invariant feature transform (SIFT) corner detection algorithm to locate the iron chain at the center; wherein a Gaussian kernel function is used to perform filtering when constructing a scale space; and L(x,y,σ) is defined as a convolution operation of the original image I(x,y) and a scale-variable two-dimensional Gaussian function G(x,y,σ), and equations (3) and (4) are as follows:

$\begin{matrix} G (x, y, σ) = \frac{1}{2 π σ^{2}} \exp (- \frac{{(x - m / 2)}^{2} + {(y - n / 2)}^{2}}{2 σ^{2}}) & (3) \end{matrix}$

$\begin{matrix} L (x, y, σ) = G (x, y, σ) * I (x, y) & (4) \end{matrix}$

wherein * represents the convolution operation, (x,y) represents a pixel position in an image; m, n represent a center of a Gaussian template; and σ represents a scale space factor;

for key points detected in a difference of Gaussian (DOG) pyramid, gradient and direction distribution features of pixels in an adjacent window of a layer of a Gaussian pyramid image to which the key points correspond are collected, wherein a module value m(x,y) and a direction θ(x,y) of the gradient are as shown in equations (5) and (6), a function L(x,y), with the same meaning as in equation (4), is defined as the convolution operation of a point (x,y) in the original image and the scale-variable two-dimensional Gaussian function, σ is omitted in the equations (5) and (6) as it is a fixed value and will not change arbitrarily, and the equations (5) and (6) are as follows:

$\begin{matrix} m (x, y) = \sqrt{{(L (x + 1, y) - L (x - 1, y))}^{2} + {(L (x, y + 1) - L (x, y - 1))}^{2}} & (5) \end{matrix}$

$\begin{matrix} θ (x, y) = \tan^{- 1} \left( \frac{L (x, y + 1) - L (x, y - 1)}{L (x + 1, y) - L (x - 1, y)} \right) & (6) \end{matrix}$

wherein a gradient histogram statistical method is used to count image pixels in a particular area with a key point as the origin to determine a direction of the key point; after completing the gradient calculation of the key points, a histogram is used to show the gradients and directions of pixels in the neighborhood, with a peak direction of the histogram representing a main direction of the key points, a peak of a direction histogram representing a direction of a neighborhood gradient at this feature point and a maximum value in the histogram being taken as the main direction of the key points; only directions in which peaks are greater than 80% of the peak of the main direction are kept as auxiliary directions of the key points; and corner detection is used to perform target detection on the data;

step (4): detecting positions of ropes by using Hough transform after a position of the center of the data is obtained, wherein a main principle is as follows: all straight lines ƒ(x)=kx+b that pass through each pixel point $(x_{0}, y_{0})$ at an edge are mapped into a Hough space, and then appropriate positions are selected, wherein k represents the slope of the straight line and b represents the y-intercept; a straight line perpendicular to the x-axis, which does not have a slope and cannot be expressed based on the slope, is expressed by a parametric equation r=x*cos(θ)+y*sin(θ), wherein (x,y) represents a pixel point at an edge, r represents a distance between a straight line passing through this point and the origin, and θ represents an included angle between r and the positive x-axis; and voting in the Hough space after mapping of each edge point, and adding 1 to a pixel value of point (r,θ) every time a straight line equation satisfies this point;

step (5): covering positions of the iron chain and the ropes obtained through steps (3) and (4) with pixels having a gray value of 0; and

step (6): restoring the data by using Telea's fast marching method (FMM) image restoration algorithm, wherein a fast marching restoration algorithm is a fast time-sensitive image restoration method, and is adopted to start restoration from edge pixels of an area to be restored, march to pixels within the area to be restored and finally complete a restoration process; several parameters are defined first as follows: Ω is defined as the area to be restored of the image, and ∂Ω is defined as a boundary where the area to be restored is in contact with an undamaged area; nature of fast marching is to obtain distances T between all pixel points in the area to be restored Ω and the boundary ∂Ω, wherein the distances are positive values when pixel points are in the area to be restored, and the distances are negative values when pixel points are outside the area to be restored; a sequence of restoration is determined according to a magnitude of T, and the restoration is continued until all pixels within Ω are processed; for a damaged point p in ∂Ω, an area $B_{ε}(p)$ with a width of ε on two sides of the boundary is created, and a gray value of a pixel point p in this area is calculated based on gray values of all known pixel points q according to the following equation:

$\begin{matrix} R_{q} (p) = R (q) + \nabla R (q) (p - q) & (7) \end{matrix}$

wherein R(q) represents the gray value of the known pixel point q and ∇R(q) represents a gradient value of the known pixel point q; the undamaged pixel points in the area $B_{ε}(p)$ have different weights in the whole operation process, and the respective weights are obtained by using a weighting calculation equation (8):

$\begin{matrix} R (p) = \frac{\sum_{q \in B_{ε} (p)} w (p, q) [R (q) + \nabla R (q) (p - q)]}{\sum_{q \in B_{ε} (p)} w (p, q)} & (8) \end{matrix}$

wherein w(p,q) represents a weight function for a pixel which is used to determine a contribution of each pixel in the domain $B_{ε}(p)$; w(p,q) refers to an iso-illuminance parameter of the damaged point p and is related to a geometric distance parameter between two points; this processing method preserves continuous texture of the data of a local area of the image to a certain extent during updating and calculation of parameters of the damaged point p; and the function is defined as equation (9):

w(p,q)=dir(p,q)*dst(p,q)*lev(p,q) (9)

wherein * represents multiplication, dir(p,q) represents a texture direction constraint, dst(p,q) represents a geometric distance constraint and lev(p,q) represents a level set constraint; dir(p,q) reflects correlation between point p and point q in the texture direction; dst(p,q) reflects correlation of a geometric distance between point p and point q, and a smaller distance means that the weight is greater; and lev(p,q) ensures that known pixel points closer to the contour of the area to be repaired at the point p contribute more to the point p; the three constraint conditions are as shown in equations (10):

$\begin{matrix} dir (p, q) = \frac{p - q}{\| p - q \|} \cdot N (p) & (10) \end{matrix}$

$dst (p, q) = \frac{d_{0}^{2}}{{\| p - q \|}^{2}}$

$lev (p, q) = \frac{T_{0}}{1 + | T (p) - T (q) |}$

wherein $d_{0}$ and $T_{0}$, as a distance constraint parameter and a level set constraint parameter respectively, are generally set to be 1; N=∇T, and N(p) represents the texture direction of point p; wherein a direction of an iso-illuminance curve of the FMM algorithm is updated according to a calculation of a domain T; a distance domain T on two sides of the initial boundary ∂Ω is calculated; the gray value of pixel point p as described above is calculated based on the known pixels in the domain $B_{ε}(p)$, and then a set $-T_{out}$ of points outside the boundary is calculated outside the boundary area ∂Ω within a range of T≤ε, and similarly, a set $T_{in}$ of internal points relative to the boundary is calculated inside the boundary area ∂Ω, thereby defining the whole T domain and guaranteeing that the restoration calculation of FMM is performed on a narrow edge with a width of ε on two sides of the boundary ∂Ω; and the T domain of the whole image is then defined as:

$\begin{matrix} T (p) = \begin{cases} T_{in} (p), & p \in Ω \\ - T_{out} (p), & p \notin Ω \end{cases} & (11) \end{matrix}$

for the value of ε of the selected domain $B_{ε}(p)$, 3-10 pixel points are selected for a good effect.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

(1) The present disclosure will be further described below in conjunction with the accompanying drawings and embodiments.

(2) FIG. 1 is a flow chart according to the present disclosure.

(3) FIG. 2 shows an image showing gray stretching and smoothing filtering according to the present disclosure.

(4) FIG. 3 shows an image of an iron chain extracted from video data according to the present disclosure.

(5) FIG. 4 shows a view of target detection by using an SIFT algorithm according to the present disclosure.

(6) FIG. 5 shows an effect image of ropes detected by using Hough transform according to the present disclosure.

(7) FIG. 6 shows an effect image with an iron chain and ropes covered according to the present disclosure.

(8) FIG. 7 shows a basic principle of an FMM algorithm according to the present disclosure.

(9) FIG. 8 shows effect images of restored data according to the present disclosure.

DETAILED DESCRIPTION OF THE EMBODIMENTS

(10) To make the objective, technical solutions and effects of the present disclosure clearer and more comprehensible, the present disclosure will be described in more detail with reference to the accompanying drawings and embodiments, but these embodiments should not be construed as a limitation to the present disclosure.

(11) As shown in FIG. 1, a method for restoring video data of a pipe based on computer vision provided in the present disclosure is specifically implemented by the following steps:

(12) S1010: a pipe robot with a high-definition camera enters a pipe to collect image/video information of a pipe, and gray stretching is performed on the collected pipe image/video. Contrast of the pipe image is enhanced to make light and shade contrast of the pipe image more distinct and features more obvious. A gray value ƒ(x,y) of each pixel (x,y) in an input image is used as an independent variable of a function, and H denotes a transform operation performed on ƒ(x,y) in the spatial domain to increase or reduce the gray value thereof, and thus a dependent variable is obtained as a gray value g(x,y) in an output image. Equation (1) is specifically as follows:
g(x,y)=H[ƒ(x,y)] (1)
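By way of non-limiting illustration, equation (1) may be sketched in Python with H chosen as a linear min-max stretch. The disclosure does not fix a particular H, so the linear form, the function name `stretch_gray`, and the output range 0 to 255 are assumptions of this sketch.

```python
def stretch_gray(image, out_min=0, out_max=255):
    """Apply g(x,y) = H[f(x,y)] with H a linear min-max stretch (assumed form of H)."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A constant image has no contrast to stretch; return an unchanged copy.
        return [row[:] for row in image]
    scale = (out_max - out_min) / (hi - lo)
    return [[round(out_min + (v - lo) * scale) for v in row] for row in image]


# A dim, low-contrast patch is spread over the full gray range.
dim_patch = [[100, 110], [120, 150]]
stretched = stretch_gray(dim_patch)
```

The darkest input value maps to `out_min` and the brightest to `out_max`, so the ordering of gray values is preserved while their spread increases.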

(13) Spatial smoothing filtering enhancement is performed on a gray image by using an adjacent averaging method of a spatial domain method, thereby eliminating jagged contours due to uneven light, local highlighting, and metal reflection caused by a point light source in a real pipe environment. The weight of each pixel is equal in the adjacent averaging method, that is, importance of each pixel is assumed to be the same, and equation (2) is specifically as follows:

(14) $\begin{matrix} g (x, y) = \frac{1}{M} \sum_{(i, j) \in s} f (i, j) & (2) \end{matrix}$

(15) where s is the set of pixel coordinates in a neighborhood of (x,y), (i,j) are the coordinates of a pixel in the neighborhood, and M is the number of pixels in the set s. The resulting preprocessed image is shown in FIG. 2.
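By way of non-limiting illustration, the adjacent averaging of equation (2) may be sketched in Python as follows. The square neighborhood of half-width `radius` and the truncation of the neighborhood at the image border are assumptions of this sketch; the disclosure does not specify either.

```python
def neighborhood_average(image, radius=1):
    """Replace each pixel by the equal-weight mean of its neighborhood s, as in equation (2)."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for x in range(rows):
        for y in range(cols):
            # s: all pixels within `radius` of (x, y) that lie inside the image.
            values = [
                image[i][j]
                for i in range(max(0, x - radius), min(rows, x + radius + 1))
                for j in range(max(0, y - radius), min(cols, y + radius + 1))
            ]
            # M = len(values); every pixel in s has the same weight 1/M.
            out[x][y] = sum(values) / len(values)
    return out


# A single bright reflection (100) on a dark wall (10) is spread over its neighbors.
noisy = [[10, 10, 10], [10, 100, 10], [10, 10, 10]]
smoothed = neighborhood_average(noisy)
```

In this example the isolated highlight at the center drops from 100 to 20, which is the kind of local-highlight suppression the smoothing step is intended to provide.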

(16) S1110: an iron chain is extracted from the center of the video data as a template for target recognition. After the image is preprocessed, an image of the iron chain is cropped from the center of the video data image, as shown in FIG. 3.

(17) S1120: target detection is performed on all data by using an SIFT corner detection algorithm, and the iron chain at the center is found and located.

(18) SIFT is short for scale-invariant feature transform, a local feature descriptor used in the field of image processing. The descriptor is scale-invariant and allows key points in an image to be detected. SIFT has good stability and invariance: it is adaptable to rotation, scaling, and variable brightness, and it tolerates changes in viewing angle, affine transformation, and noise to a certain extent.

(19) The SIFT algorithm uses a Gaussian kernel function to perform filtering when constructing a scale space, so that an original image preserves the most detail features, and the detail features are gradually reduced after Gaussian filtering to simulate feature representation in a large scale situation. L(x,y,σ) is defined as a convolution operation of the original image I(x,y) and a scale-variable two-dimensional Gaussian function G(x,y,σ).

(20) $\begin{matrix} G (x, y, σ) = \frac{1}{2 π σ^{2}} \exp (- \frac{{(x - m / 2)}^{2} + {(y - n / 2)}^{2}}{2 σ^{2}}) & (3) \end{matrix}$ $\begin{matrix} L (x, y, σ) = G (x, y, σ) * I (x, y) & (4) \end{matrix}$

(21) As shown in equations (3) and (4), (x,y) represents a pixel position in the image, and m, n represent a center of a Gaussian template; σ represents a scale space factor, and the smaller the value of the scale space factor, the less the image is smoothed, and the smaller the corresponding scale; a large scale corresponds to overview features of the image, while a small scale corresponds to detail features of the image; and * represents the convolution operation.
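By way of non-limiting illustration, equations (3) and (4) may be sketched in Python as follows. The square template of odd `size`, the placement of the template center at `(size - 1) / 2`, the normalization of the kernel to unit sum, and the replication of border pixels are assumptions of this sketch. Because the Gaussian kernel is symmetric, the sliding weighted sum below coincides with the convolution of equation (4).

```python
import math


def gaussian_kernel(size, sigma):
    """Sample G(x, y, sigma) of equation (3) on a size x size template, normalized to sum 1."""
    c = (size - 1) / 2  # assumed template center
    kernel = [
        [
            math.exp(-((x - c) ** 2 + (y - c) ** 2) / (2 * sigma ** 2))
            / (2 * math.pi * sigma ** 2)
            for y in range(size)
        ]
        for x in range(size)
    ]
    total = sum(map(sum, kernel))
    return [[v / total for v in row] for row in kernel]


def convolve(image, kernel):
    """Compute L = G * I of equation (4); border pixels are replicated (assumption)."""
    rows, cols = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for x in range(rows):
        for y in range(cols):
            acc = 0.0
            for u, kernel_row in enumerate(kernel):
                for v, w in enumerate(kernel_row):
                    i = min(max(x + u - half, 0), rows - 1)
                    j = min(max(y + v - half, 0), cols - 1)
                    acc += w * image[i][j]
            out[x][y] = acc
    return out


# A small sigma keeps detail (small scale); a large sigma smooths more (large scale).
fine = gaussian_kernel(5, 0.8)
coarse = gaussian_kernel(5, 2.0)
```

The center weight of `fine` exceeds that of `coarse`, which reflects the statement that a smaller scale space factor smooths the image less.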

(22) Extreme points are found out based on scale invariance, and a reference direction needs to be assigned to each key point based on local features of the image, so that the descriptor is invariant to rotation of the image. For key points detected in a difference of Gaussian (DOG) pyramid, gradient and direction distribution features of pixels in an adjacent window of a layer of Gaussian pyramid image to which such key points correspond are collected. A module value m(x,y) and a direction θ(x,y) of the gradient are as shown in equations (5) and (6):

(23) $\begin{matrix} m (x, y) = \sqrt{{(L (x + 1, y) - L (x - 1, y))}^{2} + {(L (x, y + 1) - L (x, y - 1))}^{2}} & (5) \end{matrix}$ $\begin{matrix} θ (x, y) = \tan^{- 1} \left( \frac{L (x, y + 1) - L (x, y - 1)}{L (x + 1, y) - L (x - 1, y)} \right) & (6) \end{matrix}$

(24) This algorithm uses a gradient histogram statistical method to count image pixels in a particular area with a key point as the origin to determine a direction of the key point. After the gradient calculation of the key points is completed, a histogram is used to show the gradients and directions of pixels in the neighborhood. The peak of the direction histogram represents the dominant direction of the neighborhood gradient at the feature point, and the direction of this maximum is taken as the main direction of the key point. To enhance robustness of matching, only directions in which peaks are greater than 80% of the peak of the main direction are kept as auxiliary directions of the key points.
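By way of non-limiting illustration, equations (5) and (6) and the 80% rule may be sketched in Python as follows. The use of `atan2` to resolve the quadrant of θ, the 36 histogram bins, the square window of half-width `radius`, and the omission of any Gaussian weighting of the window are assumptions of this sketch.

```python
import math


def gradient(L, x, y):
    """Return (m, theta in degrees) at (x, y) per equations (5) and (6)."""
    dx = L[x + 1][y] - L[x - 1][y]
    dy = L[x][y + 1] - L[x][y - 1]
    # atan2 is used instead of a plain arctangent so that dx == 0 is handled.
    return math.hypot(dx, dy), math.degrees(math.atan2(dy, dx)) % 360.0


def keypoint_orientations(L, cx, cy, radius=2, bins=36, keep_ratio=0.8):
    """Return (main direction, auxiliary directions) in degrees for the key point (cx, cy)."""
    bin_width = 360.0 / bins
    hist = [0.0] * bins
    for x in range(cx - radius, cx + radius + 1):
        for y in range(cy - radius, cy + radius + 1):
            m, theta = gradient(L, x, y)
            # Each pixel votes for its direction bin with its gradient magnitude.
            hist[int(theta // bin_width) % bins] += m
    peak = max(hist)
    main_bin = hist.index(peak)
    # Keep only directions whose histogram value exceeds 80% of the main peak.
    auxiliary = [
        b * bin_width
        for b, h in enumerate(hist)
        if b != main_bin and h > keep_ratio * peak
    ]
    return main_bin * bin_width, auxiliary


# An image whose brightness rises only along the first coordinate has a single direction.
ramp = [[2 * x for _ in range(7)] for x in range(7)]
main_direction, auxiliary_directions = keypoint_orientations(ramp, 3, 3)
```

The window must lie at least one pixel inside the image, since the central differences read the pixels on either side of each window position.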

(25) The SIFT corner detection is used to perform target detection on the data, with an effect view of target detection as shown in FIG. 4.

(26) S1130: positions of ropes are detected by using Hough transform after a position of the center of the data is obtained. The main principle is as follows: all straight lines ƒ(x)=kx+b (k representing the slope of the line and b representing the y-intercept) that possibly pass through each pixel point $(x_{0}, y_{0})$ at an edge are mapped into a Hough space, and then appropriate positions are selected. As a straight line perpendicular to the x-axis does not have a slope, it cannot be expressed based on the slope, and thus is expressed by a parametric equation r=x*cos(θ)+y*sin(θ), where (x,y) represents a pixel point at an edge, r represents the distance between a straight line passing through this point and the origin, and θ represents the included angle between r and the positive x-axis. Voting is performed in the Hough space after mapping of each edge point, and 1 is added to the value of the point (r,θ) in the Hough space every time a straight line through an edge point (x,y) satisfies this (r,θ).
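By way of non-limiting illustration, the voting described above may be sketched in Python as follows. The angular resolution of one degree, the rounding of r to the nearest integer, and the function names are assumptions of this sketch.

```python
import math


def hough_accumulator(edge_points, theta_steps=180):
    """Vote in (r, theta) space: each edge point adds 1 to every (r, theta) it satisfies."""
    votes = {}
    for x, y in edge_points:
        for t in range(theta_steps):
            theta = math.pi * t / theta_steps
            # Parametric form r = x*cos(theta) + y*sin(theta), quantized to whole pixels.
            r = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(r, t)] = votes.get((r, t), 0) + 1
    return votes


def strongest_line(edge_points, theta_steps=180):
    """Return (r, theta in degrees, vote count) of the accumulator cell with the most votes."""
    votes = hough_accumulator(edge_points, theta_steps)
    (r, t), count = max(votes.items(), key=lambda item: item[1])
    return r, 180.0 * t / theta_steps, count


# A line perpendicular to the x-axis (x = 5) has no slope but is found at theta = 0.
vertical_rope = [(5, y) for y in range(60)]
r, theta_deg, count = strongest_line(vertical_rope)
```

All 60 edge points agree on the cell r = 5, θ = 0, which is how a line that cannot be written as ƒ(x)=kx+b is still recovered from the parametric form.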

(27) After the ropes are detected by using the Hough transform, the resulting image is as shown in FIG. 5.

(28) S1140: the positions of the central iron chain and the ropes on two sides are obtained after the above two steps. The positions of the iron chain and the ropes are then covered with pixels having a gray value of 0, thereby getting ready for restoration. A resulting image is as shown in FIG. 6.
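By way of non-limiting illustration, the covering step may be sketched in Python as follows. Representing the located iron chain and ropes as axis-aligned boxes `(top, left, bottom, right)`, and returning a Boolean mask alongside the covered image, are assumptions of this sketch.

```python
def cover_regions(image, boxes):
    """Set every pixel inside the given boxes to gray value 0 and record it in a mask."""
    rows, cols = len(image), len(image[0])
    covered = [row[:] for row in image]
    mask = [[False] * cols for _ in range(rows)]
    for top, left, bottom, right in boxes:
        # Boxes are half-open: rows top..bottom-1 and columns left..right-1.
        for x in range(max(top, 0), min(bottom, rows)):
            for y in range(max(left, 0), min(right, cols)):
                covered[x][y] = 0
                mask[x][y] = True
    return covered, mask


# One hypothetical box standing in for the located iron chain.
frame = [[200] * 4 for _ in range(4)]
covered_frame, restore_mask = cover_regions(frame, [(1, 1, 3, 3)])
```

The mask marks the area to be restored, so that pixels that are legitimately black in the pipe image are not confused with covered pixels during restoration.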

(29) S1150: the data is restored by Telea's FMM (fast marching method) image restoration algorithm.

(30) The fast marching restoration algorithm is a fast time-sensitive image restoration method. The basic idea of this algorithm is to start restoration from the edge pixels of an area to be restored, gradually march to the pixels within the area to be restored and finally complete the whole restoration process. Several parameters are defined first: Ω is defined as the area to be restored of the image, and ∂Ω is defined as a boundary where the area to be restored is in contact with an undamaged area. The nature of fast marching is to obtain distances T between all pixel points in the area Ω and the boundary ∂Ω, wherein the distances are positive values when pixel points are in the area to be restored, and the distances are negative values when pixel points are outside the area to be restored; a sequence of marching is determined according to the magnitude of T, and restoration is continued until all pixels within Ω are processed. The basic principle of the FMM algorithm is as shown in FIG. 7.
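By way of non-limiting illustration, the marching order described above may be sketched in Python as follows. This sketch is a simplified stand-in and not Telea's algorithm itself: the arrival order uses a unit-step layer count in place of the distance T, and each pixel is filled with the plain mean of its already-known 4-neighbors in place of the weighted estimate.

```python
import heapq


def inpaint_by_marching(image, mask):
    """Fill masked pixels from the boundary inward, nearest layers first."""
    rows, cols = len(image), len(image[0])
    out = [[float(v) for v in row] for row in image]
    known = [[not m for m in row] for row in mask]
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))

    def inside(i, j):
        return 0 <= i < rows and 0 <= j < cols

    # Seed the queue with masked pixels that touch a known pixel (the boundary).
    heap = [
        (1, x, y)
        for x in range(rows)
        for y in range(cols)
        if mask[x][y]
        and any(inside(x + dx, y + dy) and known[x + dx][y + dy] for dx, dy in steps)
    ]
    heapq.heapify(heap)
    while heap:
        t, x, y = heapq.heappop(heap)  # smallest arrival value first
        if known[x][y]:
            continue
        values = [
            out[x + dx][y + dy]
            for dx, dy in steps
            if inside(x + dx, y + dy) and known[x + dx][y + dy]
        ]
        out[x][y] = sum(values) / len(values)
        known[x][y] = True
        # March inward: unknown neighbors arrive one step later.
        for dx, dy in steps:
            i, j = x + dx, y + dy
            if inside(i, j) and not known[i][j]:
                heapq.heappush(heap, (t + 1, i, j))
    return out


# A covered horizontal stripe in a uniform wall is filled with the wall's gray value.
wall = [[50] * 5 for _ in range(5)]
stripe = [[x == 2 and 1 <= y <= 3 for y in range(5)] for x in range(5)]
for y in range(1, 4):
    wall[2][y] = 0
restored = inpaint_by_marching(wall, stripe)
```

Where OpenCV is available, `cv2.inpaint` called with the `cv2.INPAINT_TELEA` flag provides an implementation of Telea's method and would normally be preferred over a hand-written loop.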

(31) For a damaged point p in ∂Ω, an area $B_{ε}(p)$ with width ε on an outside of the boundary is created, and a gray value of pixel point p in this area is calculated based on the gray values of all known pixel points q according to the following equation:
$\begin{matrix} R_{q} (p) = R (q) + \nabla R (q) (p - q) & (7) \end{matrix}$

(32) where R(q) and ∇R(q) represent the gray value and the gradient value of the known pixel point q, respectively; and obviously, the gray value of point p needs to be calculated through substitutions of parameters of all undamaged points in the area $B_{ε}(p)$. These undamaged pixel points in the area have different weights in the whole operation process, and the respective weights are obtained by using a weighting calculation equation (8):

(33) $\begin{matrix} R (p) = \frac{\sum_{q \in B_{ε} (p)} w (p, q) [R (q) + \nabla R (q) (p - q)]}{\sum_{q \in B_{ε} (p)} w (p, q)} & (8) \end{matrix}$

(34) where w(p,q) represents a weight function for a pixel which is used to determine a contribution of each pixel in the domain $B_{ε}(p)$. w(p,q) refers to an iso-illuminance parameter of the damaged point p and is related to a geometric distance parameter between two points. This processing approach retains an extension of regional image structure data to a certain extent during updating and calculation of parameters of the damaged point p. The function is defined as equation (9):
w(p,q)=dir(p,q)*dst(p,q)*lev(p,q) (9)

(35) where * represents multiplication of the three scalar terms, dir(p,q) represents a texture direction constraint, dst(p,q) represents a geometric distance constraint and lev(p,q) represents a level set constraint. dir(p,q) reflects correlation between point p and point q in the texture direction, and the more similar the two points are in texture direction, the greater the weight. dst(p,q) reflects correlation of a geometric distance between point p and point q, and the smaller the distance, the greater the weight. lev(p,q) reflects an influence of information arrival, and the weight is greater for points closer to known information.
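By way of non-limiting illustration, the first-order estimate of equation (7) and the weighted average of equation (8) may be sketched in Python as follows. The weights are passed in as given numbers, and the dictionary layout and function name are assumptions of this sketch.

```python
def estimate_gray(p, known, weights):
    """Estimate R(p) from known neighbors q per equations (7) and (8).

    known maps q -> (R(q), (gx, gy)), where (gx, gy) is the gradient at q.
    weights maps q -> w(p, q).
    """
    numerator = 0.0
    denominator = 0.0
    for q, (value, (gx, gy)) in known.items():
        # Equation (7): extrapolate from q to p along the gradient at q.
        first_order = value + gx * (p[0] - q[0]) + gy * (p[1] - q[1])
        numerator += weights[q] * first_order
        denominator += weights[q]
    # Equation (8): normalized weighted sum over the neighborhood.
    return numerator / denominator


# Two known neighbors of the damaged point p = (2, 2), with hypothetical values.
neighbors = {
    (1, 2): (100.0, (10.0, 0.0)),  # predicts 100 + 10 * 1 = 110 at p
    (2, 1): (120.0, (0.0, 0.0)),   # predicts 120 at p
}
gray_p = estimate_gray((2, 2), neighbors, {(1, 2): 3.0, (2, 1): 1.0})
```

Here the result is (3 × 110 + 1 × 120) / 4 = 112.5, so the neighbor with the larger weight pulls the estimate toward its own prediction.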

(36) The three constraint conditions are as shown in equations (10):

(37) $\begin{matrix} dir (p, q) = \frac{p - q}{\| p - q \|} \cdot N (p) & (10) \end{matrix}$ $dst (p, q) = \frac{d_{0}^{2}}{{\| p - q \|}^{2}}$ $lev (p, q) = \frac{T_{0}}{1 + | T (p) - T (q) |}$

(38) where $d_{0}$ and $T_{0}$, the distance constraint parameter and the level set constraint parameter respectively, are generally set to 1. dir(p,q) ensures a greater contribution from known pixel points lying closer to the normal direction N=∇T, where N(p) represents the texture direction of point p. dst(p,q) ensures a greater weight for known points closer to the damaged point p when its gray value is updated and calculated. lev(p,q) ensures a greater contribution from known points closer to the contour of the area to be restored at point p.
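By way of non-limiting illustration, the weight of equations (9) and (10) may be sketched in Python as follows. The dictionary `T` of distance values, the unit normal `N_p` supplied by the caller, and the function name are assumptions of this sketch; the terms follow equations (10) as written.

```python
import math


def weight(p, q, T, N_p, d0=1.0, T0=1.0):
    """Return w(p, q) = dir * dst * lev for a damaged point p and a known point q (q != p)."""
    dx, dy = p[0] - q[0], p[1] - q[1]
    dist = math.hypot(dx, dy)
    # dir: projection of the unit vector from q to p onto the normal N(p).
    dir_term = (dx * N_p[0] + dy * N_p[1]) / dist
    # dst: decreases with the squared geometric distance between p and q.
    dst_term = d0 ** 2 / dist ** 2
    # lev: decreases as the distance values T(p) and T(q) differ more.
    lev_term = T0 / (1 + abs(T[p] - T[q]))
    return dir_term * dst_term * lev_term


# Hypothetical configuration: the normal at p points along the first axis.
p = (0, 0)
T = {p: 1.0, (-1, 0): 0.0, (-2, 0): -1.0, (0, -1): 1.0}
w_near = weight(p, (-1, 0), T, (1.0, 0.0))
w_far = weight(p, (-2, 0), T, (1.0, 0.0))
w_side = weight(p, (0, -1), T, (1.0, 0.0))
```

In this configuration the nearer point along the normal receives the largest weight, the farther point a smaller one, and the point lying perpendicular to the normal receives zero weight from the direction term.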

(39) A direction of an iso-illuminance curve of the FMM algorithm is updated according to a calculation of a domain T. To ensure that the restoration is started from an initial boundary ∂Ω and to eliminate interference of a large number of irrelevant internal pixels far away from the boundary, a distance domain T needs to be calculated on two sides of the initial boundary ∂Ω. As described above, the gray value of pixel point p is calculated based on the known pixels in the domain $B_{ε}(p)$, and then, a set $-T_{out}$ of points outside the boundary is calculated outside the boundary area ∂Ω within a restricted range of T≤ε, and similarly, a set $T_{in}$ of internal points relative to the boundary is calculated inside the boundary area ∂Ω, thereby defining the whole T domain and guaranteeing that the restoration calculation of FMM is performed on a narrow edge with a width of ε on an outside of the boundary ∂Ω. The T domain of the whole image is then defined as:

(40) $\begin{matrix} T (p) = \begin{cases} T_{in} (p), & p \in Ω \\ - T_{out} (p), & p \notin Ω \end{cases} & (11) \end{matrix}$

(41) For the value of ε of the selected domain $B_{ε}(p)$, 3-10 pixel points are usually selected for a good effect, thereby achieving balance between a restoration rate and a restoration effect.
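By way of non-limiting illustration, the signed T domain of equation (11) and the restriction of the outside band to a width of ε may be sketched in Python as follows. This sketch replaces fast marching with a brute-force Euclidean distance to the nearest boundary pixel, and it takes ∂Ω to be the masked pixels that have at least one unmasked 4-neighbor; both choices are assumptions made for brevity.

```python
import math


def signed_distance(mask, eps):
    """Return {pixel: T}, with T >= 0 inside the masked area and T < 0 outside, within eps."""
    rows, cols = len(mask), len(mask[0])
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))

    def on_boundary(x, y):
        # A masked pixel is on the boundary if any 4-neighbor is unmasked.
        return mask[x][y] and any(
            0 <= x + dx < rows and 0 <= y + dy < cols and not mask[x + dx][y + dy]
            for dx, dy in steps
        )

    boundary = [(x, y) for x in range(rows) for y in range(cols) if on_boundary(x, y)]
    T = {}
    if not boundary:
        return T
    for x in range(rows):
        for y in range(cols):
            d = min(math.hypot(x - i, y - j) for i, j in boundary)
            if mask[x][y]:
                T[(x, y)] = d        # T_in(p) for p inside the area to be restored
            elif d <= eps:
                T[(x, y)] = -d       # -T_out(p), kept only in the narrow band
    return T


# A 3 x 3 area to be restored in the middle of a 7 x 7 image, with a band of width 2.
area = [[2 <= x <= 4 and 2 <= y <= 4 for y in range(7)] for x in range(7)]
T_field = signed_distance(area, eps=2)
```

Pixels farther than ε outside the boundary are absent from the result, which corresponds to excluding irrelevant pixels far away from the boundary.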

(42) After the image is restored by using the FMM algorithm, the resulting restored image is as shown in FIG. 8.

(43) The specific embodiments described herein are merely intended to illustrate the spirit of the present disclosure by way of example. A person skilled in the art can make various modifications or supplements to the specific embodiments described, or replace them in a similar manner, without departing from the spirit of the present disclosure or the scope defined by the appended claims.

Cpc classification

G06T7/0004 · PHYSICS
G06T2207/10016 · PHYSICS
G06T7/62 · PHYSICS
G06T5/20 · PHYSICS
G06T7/13 · PHYSICS
G06T2207/30108 · PHYSICS
G06V2201/07 · PHYSICS
G06T2207/20061 · PHYSICS
G06V10/48 · PHYSICS
G06V10/54 · PHYSICS
F16L55/26 · MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
G06T7/44 · PHYSICS
G06V10/449 · PHYSICS
G06T3/60 · PHYSICS
G06V10/36 · PHYSICS
G06T5/40 · PHYSICS
F16L2101/30 · MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
G06T5/008 · PHYSICS
G06T5/002 · PHYSICS
G06T7/70 · PHYSICS
G06T3/40 · PHYSICS
G06V10/462 · PHYSICS
G06V2201/06 · PHYSICS
G06F18/10 · PHYSICS

International classification

G06T5/00 · PHYSICS