METHOD OF AUTOMATIC IMAGE FREEZING OF DIGESTIVE ENDOSCOPY
20220006981 · 2022-01-06
Inventors
Cpc classification
H04N7/188
ELECTRICITY
G06V20/46
PHYSICS
International classification
H04N7/18
ELECTRICITY
A61B1/273
HUMAN NECESSITIES
G06T3/40
PHYSICS
Abstract
A method of automatic image freezing of digestive endoscopy based on a perceptual hash algorithm includes: analyzing a video streaming of digestive endoscopy acquired by digestive endoscopy imaging system into image data; calculating a similarity between an image at t point in time and images of first n frames, to obtain a weighted similarity k of the image; and comparing the weighted similarity k of the image at t point in time with a freezing boundary l, and triggering an instruction of image freezing when the k reaches l to obtain the clear images with the best visual field from the video streaming of digestive endoscopy.
Claims
1. A method, comprising: 1) analyzing a video streaming of digestive endoscopy acquired by a digestive endoscopy imaging system into image data; 2) calculating a similarity between an image at t point in time and images of first n frames, to obtain a weighted similarity k of the image; and 3) comparing the weighted similarity k of the image at t point in time with a freezing boundary l, and triggering an instruction of image freezing when the k reaches l to obtain clear images with a best visual field from the video streaming of digestive endoscopy.
2. The method of claim 1, wherein in 1), the method further comprises removing fuzzy invalid frame images, cropping the clear images, reducing a size of cropped images, retaining image structure information, and converting the cropped images into gray scale images.
3. The method of claim 2, wherein in 1), bicubic interpolation is adopted to reduce the size of the cropped images.
4. The method of claim 2, wherein in 1), a calculation formula of converting the cropped images into the gray scale images is as follows:
Gray=0.30*R+0.59*G+0.11*B; where R, G and B respectively represent information values of red light, green light and blue light.
5. The method of claim 2, wherein in 1), Gray-scale value of adjacent pixels in each line of a gray image are compared; if a Gray-scale value of a previous pixel is greater than that of a latter pixel, a dHash value is set to “1”, if not, the dHash value is set to “0”.
6. The method of claim 1, wherein in 2), the similarity between different images is calculated by calculating a Hamming distance between different images.
7. The method of claim 6, wherein in 2), the Hamming distance between different images refers to a number of digits required to change dHash values corresponding to a first image to dHash values corresponding to a second image.
8. The method of claim 7, wherein in 2), a formula for calculating the similarity between a current image and the first n frames is as follows:
Sim=100*(64−d(x,y))/64; where d (x, y) is the Hamming distance between different images, d (x, y)=Σx⊕y, x and y are the dHash values corresponding to different images, and ⊕ is exclusive OR.
9. The method of claim 1, wherein in 3), the freezing boundary l is obtained by analyzing a video of manually freezing image by an endoscopist during the digestive endoscopy.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0019]
[0020]
[0021]
[0022]
DETAILED DESCRIPTION
[0023] To further illustrate the disclosure, embodiments detailing a method of automatic image freezing of digestive endoscopy based on a perceptual hash algorithm are described below. It should be noted that the following embodiments are intended to describe and not to limit the disclosure.
[0024] Image structure information: refers to the hue change and position arrangement of each pixel in the image.
[0025] Gray-scale value: the color of an image is represented in tones of black, with pixel brightness divided into 256 grades from 0 to 255. The gray-scale value is a number from 0 to 255, where 0 represents black and 255 represents white.
[0026] Gray scale image: an image in which every pixel is represented by a gray-scale value.
[0027] As shown in
[0028] S1. analyzing a video streaming of digestive endoscopy acquired by digestive endoscopy imaging system into image data;
[0029] S2. calculating a similarity between an image at t point in time and images of first n frames, to obtain a weighted similarity k of the image; and
[0030] S3. comparing the weighted similarity k of the image at t point in time with a freezing boundary l, and triggering an instruction of image freezing when the k reaches l to obtain the clear images with the best visual field from the video streaming of digestive endoscopy.
Example 1
[0031] S1. Obtaining the video streaming of digestive endoscopy through the digestive endoscopy imaging system, and analyzing the video streaming into images (30 frames per second). Then removing fuzzy invalid frame images and taking 10 of the images;
[0032] S2. Cropping the valid frame images to 360*360 pixels, further reducing the size of the cropped images, and retaining only the structural information of the images;
[0033] A 360*360-pixel image has more than 100,000 pixels and contains a huge amount of information, with many details to process. The image therefore needs to be scaled to a very small size. The purpose is to remove the details of the image, retain only basic information such as structure and light and shade, and discard the differences caused by different sizes and proportions.
[0034] Bicubic interpolation is adopted to scale the image. Although it is computationally expensive, the scaled image is of high quality and is not easily distorted. According to
$$F(i',j')=\sum_{\mathrm{row}=-1}^{2}\ \sum_{\mathrm{col}=-1}^{2} f(i+\mathrm{row},\ j+\mathrm{col})\,S(\mathrm{row}-v)\,S(\mathrm{col}-u)$$
where v represents the deviation in the number of rows and u represents the deviation in the number of columns; row represents a row index and col represents a column index; S(x) represents the interpolation expression. Common choices for S(x) include expressions based on trigonometric functions, the Bell distribution, and B-spline curves, and one can be selected according to different needs. The Bell distribution expression is selected in the embodiment of the disclosure.
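The double sum above can be sketched in a few lines of Python. This is a minimal illustration only: the source does not give the exact form of S(x), so the quadratic "bell" kernel below is an assumption, as are the function names `bell`, `sample` and `resize`, the edge clamping, and the pixel-center mapping used to pick sample positions.

```python
import math


def bell(x: float) -> float:
    """Assumed quadratic 'bell' kernel for S(x); the source does not define S."""
    x = abs(x)
    if x < 0.5:
        return 0.75 - x * x
    if x < 1.5:
        return 0.5 * (x - 1.5) ** 2
    return 0.0


def sample(img, y: float, x: float) -> float:
    """Evaluate the 4x4 weighted sum at fractional position (y, x).

    img is a list of rows of gray or single-channel values.
    """
    h, w = len(img), len(img[0])
    i, j = math.floor(y), math.floor(x)
    v, u = y - i, x - j  # fractional row and column offsets
    total = 0.0
    for row in range(-1, 3):
        for col in range(-1, 3):
            r = min(max(i + row, 0), h - 1)  # clamp at the borders (assumption)
            c = min(max(j + col, 0), w - 1)
            total += img[r][c] * bell(row - v) * bell(col - u)
    return total


def resize(img, out_h: int, out_w: int):
    """Resample img to out_h rows by out_w columns."""
    h, w = len(img), len(img[0])
    return [
        [
            sample(img, (oi + 0.5) * h / out_h - 0.5, (oj + 0.5) * w / out_w - 0.5)
            for oj in range(out_w)
        ]
        for oi in range(out_h)
    ]
```

With this kernel the four weights along each axis sum to 1, so a uniform image stays uniform after scaling. For a very large reduction such as 360*360 to 9*8, each output pixel only sees a 4*4 neighborhood of the input, so a practical implementation would probably smooth or reduce the image in stages first.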
[0035] In order to better calculate the dHash value of the converted images, the embodiment of the disclosure reduces the images to 9*8, a total of 72 pixels.
[0036] Converting the images to gray scale images;
[0037] The reduced images are color images and consist of RGB values represented as (R, G, B), where R, G and B are the information values of red light, green light and blue light respectively. The larger the value, the brighter the color; the smaller the value, the darker the color. For example, white is represented by (255,255,255) and black by (0,0,0). In general, image similarity has little to do with color. Therefore, the image is converted into a gray scale image to reduce the complexity of later calculation, referring to the final obtained gray scale image with 9*8 pixels in
[0038] The weighted average method is adopted: because human eyes differ in sensitivity to red, green and blue, different weights are given to the R, G and B values of each pixel to calculate the gray value. The formula is as follows:
Gray=0.30*R+0.59*G+0.11*B
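As a small illustration of the weighted average formula, the following Python sketch converts RGB pixels to gray values. The function names `to_gray` and `gray_image` and the list-of-rows image layout are assumptions made for the example.

```python
def to_gray(pixel):
    """Apply Gray = 0.30*R + 0.59*G + 0.11*B to one (R, G, B) pixel."""
    r, g, b = pixel
    return 0.30 * r + 0.59 * g + 0.11 * b


def gray_image(rgb_rows):
    """Convert an image given as rows of (R, G, B) tuples to rows of gray values."""
    return [[to_gray(p) for p in row] for row in rgb_rows]
```

Because the three weights sum to 1, white (255,255,255) maps to approximately 255 and black (0,0,0) maps to 0, so the result stays within the 0 to 255 gray-scale range.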
[0039] Comparing the gray values of adjacent pixels of the gray scale images, calculating the difference values, and generating the dHash values of the images.
[0040] The gray scale images have 9 pixels per row and 8 rows in total. Each pair of adjacent pixels in a row is compared, so each row generates eight difference values. If the gray value of the previous pixel is greater than that of the latter pixel, the difference value is set to “1”; if not, the difference value is set to “0”. The difference values are then spliced in order, from top to bottom and from left to right, into a 64-bit binary string, which is the dHash value of the image.
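The comparison of adjacent pixels can be sketched as follows. This is a minimal example that assumes the gray scale image is already available as 8 rows of 9 gray values; the function name `dhash` and the bit-string return type are choices made for illustration.

```python
def dhash(gray):
    """Build a 64-character bit string from an 8-row by 9-column gray image."""
    if len(gray) != 8 or any(len(row) != 9 for row in gray):
        raise ValueError("expected 8 rows of 9 gray values")
    bits = []
    for row in gray:  # rows from top to bottom
        for previous, latter in zip(row, row[1:]):  # pairs from left to right
            bits.append("1" if previous > latter else "0")
    return "".join(bits)
```

A row whose gray values fall from left to right contributes eight "1" bits, and a flat or rising row contributes eight "0" bits.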
Example 2
[0041] This example is basically the same as Example 1 except as described below.
[0042] In S2, the similarity between different images is calculated by calculating the Hamming distance between different images. The Hamming distance between different images represents the number of digits required to change dHash values corresponding to image A to dHash values corresponding to image B. The formula to calculate the similarity between the current image and the first n frames is:
Sim=100*(64−d(x,y))/64;
where the d (x, y) is the Hamming distance between different images, d (x, y)=Σx⊕y, x and y are the dHash values corresponding to different images, and ⊕ is exclusive OR.
[0043] Calculating the Hamming Distance Between Different Images;
[0044] The Hamming distance is the number of positions at which two equal-length strings differ. For dHash, the binary dHash values of two images are combined by exclusive OR and the number of “1” digits in the result is counted, that is, the number of digits at which the two binary dHash values differ. The Hamming distance between the strings x and y is defined as d (x, y):
d(x,y)=Σx⊕y
⊕ is exclusive OR; x and y are the dHash values corresponding to different images.
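The Hamming distance and the similarity formula Sim=100*(64−d(x,y))/64 can be sketched in Python as below. The example assumes each dHash value is given as a 64-character string of "0" and "1"; the function names `hamming` and `similarity` are illustrative.

```python
def hamming(x: str, y: str) -> int:
    """Count differing bits: XOR the two values and count the '1' digits."""
    if len(x) != len(y):
        raise ValueError("dHash values must have equal length")
    return bin(int(x, 2) ^ int(y, 2)).count("1")


def similarity(x: str, y: str) -> float:
    """Sim = 100 * (64 - d(x, y)) / 64 for two 64-bit dHash values."""
    return 100 * (64 - hamming(x, y)) / 64
```

Identical hashes give a similarity of 100, hashes that differ in every bit give 0, and hashes that differ in 16 of the 64 bits give 75.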
[0045] S6. Comparing the dHash value of the image at time point t with the dHash values of the preceding 9 frames to obtain the overlap rate, namely the similarity, between the current image and each of the preceding 9 frames. The similarity Sim of two images is calculated as Sim=100*(64−d (x, y))/64. The weighted similarity of the image at time point t is then obtained,
where Sim_i represents the similarity between the image at time point t and the i-th of the preceding frames (i ranges from 1 to 9).
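The weighting formula itself is not reproduced in this text, so the following Python sketch only shows the general shape of a weighted combination of the nine similarities. The function name `weighted_similarity`, the normalization by the sum of the weights, and the default of equal weights are all assumptions and should not be read as the weights used in the disclosure.

```python
def weighted_similarity(sims, weights=None):
    """Combine per-frame similarities Sim_i into one value k.

    sims: similarities between the current image and each preceding frame.
    weights: hypothetical per-frame weights; equal weights when omitted.
    """
    if not sims:
        raise ValueError("at least one similarity is required")
    if weights is None:
        weights = [1.0] * len(sims)
    if len(weights) != len(sims):
        raise ValueError("sims and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * s for w, s in zip(weights, sims)) / total
```

With normalized weights the result stays on the same 0 to 100 scale as the individual similarities, which keeps it directly comparable with a freezing boundary expressed on that scale.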
Example 3
[0046] This example is basically the same as Example 2 except as described below.
[0047] The freezing boundary l of the weighted similarity is set by analyzing videos in which endoscopists manually freeze images during digestive endoscopy.
[0048] The weighted similarity
[0049] This technical scheme replaces the operation of manually freezing the image, which not only effectively obtains the clear image with the best visual field but also reduces the workload of the endoscopist. The core is how to trigger the instruction of image freezing. Based on the habits of human operation, when endoscopists want to capture static images by freezing, they try their best to keep the endoscope body and the examination area relatively static, so the similarity of consecutive frames in the output video is very high. The perceptual hash algorithm (hereinafter referred to as PHA) is the general name of a class of hash algorithms mainly used to search for similar images. Its function is to generate a “fingerprint” string for each image and compare the fingerprints of different images to judge their similarity. The closer the fingerprints are, the more similar the images are. PHA comprises average hash (aHash), perceptual hash (pHash) and difference hash (dHash).
[0050] From the above analysis, PHA is used to calculate the similarity of adjacent frames per unit time in the digestive endoscopic images. The higher the similarity, the more likely it is that the image should be frozen. When the similarity reaches the preset boundary, a freezing operation is considered to be intended, and the freezing instruction is issued automatically to complete the subsequent process.
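The triggering logic can be sketched as a loop over per-frame dHash values. This is a hedged illustration, not the implementation of the disclosure: the function name `freeze_indices`, the representation of hashes as 64-bit integers, the equal weighting of the preceding frames, and the boundary value used in the usage note are all assumptions.

```python
from collections import deque


def similarity(x: int, y: int) -> float:
    """Sim = 100 * (64 - d(x, y)) / 64 for two 64-bit dHash integers."""
    return 100 * (64 - bin(x ^ y).count("1")) / 64


def freeze_indices(hashes, boundary: float, n: int = 9):
    """Return the frame indices at which a freeze would be triggered.

    hashes: per-frame dHash values in time order.
    boundary: freezing boundary l on the 0 to 100 similarity scale.
    n: number of preceding frames compared with the current frame.
    """
    history = deque(maxlen=n)  # dHash values of the preceding n frames
    frozen = []
    for t, current in enumerate(hashes):
        if len(history) == n:
            # Equal weights are assumed here in place of the weighting formula.
            k = sum(similarity(current, past) for past in history) / n
            if k >= boundary:
                frozen.append(t)
        history.append(current)
    return frozen
```

For example, with ten alternating hashes followed by twelve identical ones and an assumed boundary of 95, the trigger first fires only once all nine preceding frames match the current frame. A real system would likely issue the freezing instruction once at the first trigger rather than on every qualifying frame.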
[0051] With the disclosure, endoscopists only need to stop the movement of the endoscope body and keep the visual field unchanged when they want to carefully examine the image of a certain visual field. The images are then automatically determined to be frozen images. The endoscopists do not need to manually operate the “freeze” button, which reduces their workload. Because the system executes the freezing instruction automatically, it avoids the deviation of visual field or loss of effective information in frozen images caused by slow reaction or unskilled operation, so that clear images with the best visual field are effectively obtained.
[0052] It will be obvious to those skilled in the art that changes and modifications may be made, and therefore, the aim in the appended claims is to cover all such changes and modifications.