TEXT IMAGE CORRECTION METHOD AND APPARATUS

20240161523 · 2024-05-16

Assignee

CIENET TECHNOLOGIES (BEIJING) CO., LTD. (Beijing, CN)

Inventors

Yuance DENG (Beijing, CN)

Abstract

A text image correction method and a corresponding text image correction apparatus are provided. The method uses frequency information of a row-direction cumulative curve, which is sensitive to the error between the compensation angle for a tilt angle and the real tilt angle, and the method is therefore robust. The method can accurately estimate the compensation angle for a tilt angle and correct a tilted text image. The method and apparatus can be applied to scenarios such as image pre-processing, automatic angle compensation for scanned text images, and automatic compensation for tilt angles of mobile phone photos.

Claims

1. A text image correction method, comprising the following steps: preprocessing to-be-corrected images into binary images; sequentially rotating the binary images in the same direction with a predetermined step size, recording a cumulative rotation angle upon each rotation and calculating a row cumulant of a current binary image until the current binary image is rotated to a threshold angle; extracting a frequency satisfying preset conditions for the row cumulant of each binary image; and correcting the to-be-corrected images by using the cumulative rotation angle corresponding to the maximum frequency among the frequencies of the binary images satisfying the preset conditions as a compensation angle.

2. The text image correction method according to claim 1, wherein the preprocessing to-be-corrected images into binary images further comprises: converting the to-be-corrected images into grayscale images; and converting the grayscale images into the binary images based on a maximum between-class variance method.

3. The text image correction method according to claim 1, wherein the calculating a row cumulant of the current binary image further comprises: calculating a row cumulant of each row for the current binary image; and sequentially constructing the row cumulant of each row into a column vector S.sup.t as the row cumulant of the current binary image, a calculation formula for the row cumulant of each row being:
$s_{i}^{t} = \sum_{j} d_{(i, j)}^{t}$ wherein s.sub.i.sup.t is a row cumulant of an i.sup.th row of a current binary image D.sup.t, and d.sub.(i,j).sup.t is an element in the i.sup.th row and j.sup.th column of the current binary image D.sup.t.

4. The text image correction method according to claim 1, wherein the extracting a frequency satisfying preset conditions for the row cumulant of each binary image further comprises: performing sliding window smoothing filtering on the row cumulant of the current binary image to obtain a corresponding filtering sequence Q.sup.t; performing mean subtraction processing on the filtering sequence Q.sup.t to obtain a corresponding mean-subtracted sequence H.sup.t; performing spectral analysis on the mean-subtracted sequence H.sup.t to obtain a discrete sequence P.sup.t; and extracting a frequency f.sup.t satisfying the preset conditions from the discrete sequence P.sup.t.

5. The text image correction method according to claim 4, wherein a calculation formula for the filtering sequence Q.sup.t is: $q_{i}^{t} = \frac{1}{L} \sum_{j = 0}^{L} s_{i + j}^{t}$ wherein q.sub.i.sup.t is an i.sup.th element of Q.sup.t, L is a width of a sliding window, and s.sub.i+j.sup.t is a row cumulant of an (i+j).sup.th row in the current binary image D.sup.t.

6. The text image correction method according to claim 5, wherein a calculation formula for the mean-subtracted sequence H.sup.t is:
$h_{i}^{t} = q_{i}^{t} - M(Q^{t})$ wherein h.sub.i.sup.t is an i.sup.th element of the mean-subtracted sequence H.sup.t, and M(*) represents calculation of a mean of an input sequence.

7. The text image correction method according to claim 6, wherein a calculation formula for the discrete sequence P.sup.t is: $p_{k}^{t} = \sum_{j = 0}^{N - 1} h_{j}^{t} * e^{- \frac{2 \pi i}{N} kj}, \quad k = 0, \ldots, N - 1$ wherein h.sub.j.sup.t is the j.sup.th element of the mean-subtracted sequence H.sup.t, p.sub.k.sup.t is a k.sup.th element of the discrete sequence P.sup.t, and N is a minimum power of 2 greater than a length of the sequence H.sup.t.

8. The text image correction method according to claim 7, wherein a calculation formula for the frequency f.sup.t satisfying the preset conditions is: $f^{t} = \frac{F_{s}}{2 N} k \quad \text{s.t.} \quad p_{k}^{t} \geq \alpha * M (P^{t})$ wherein Fs is a sampling rate of the row cumulant of the current binary image, and α is a configurable system parameter.

9. The text image correction method according to claim 1, wherein a calculation formula for sequentially rotating the binary images in the same direction is: $\left\{ \begin{matrix} d_{(x, y)}^{t} = d_{(i, j)}^{b} \\ x = \lfloor \cos \theta_{t} * i + \sin \theta_{t} * j \rceil \\ y = \lfloor - \sin \theta_{t} * i + \cos \theta_{t} * j \rceil \end{matrix} \right.$ wherein d.sub.(x,y).sup.t is an element of a rotated binary image in an x.sup.th row and a y.sup.th column, θ.sub.t is a rotation angle of the rotated binary image, d.sub.(i,j).sup.b is an element of an unrotated binary image in an i.sup.th row and a j.sup.th column, and ⌊·⌉ represents rounding a result.

10. A text image correction apparatus, comprising: a processor and a memory, wherein the processor reads a computer program in the memory to perform the following operations: preprocessing to-be-corrected images into binary images; sequentially rotating the binary images in the same direction with a predetermined step size, recording a cumulative rotation angle upon each rotation, and calculating a row cumulant of a current binary image until the current binary image is rotated to a threshold angle; extracting a frequency satisfying preset conditions for the row cumulant of each binary image; and correcting the to-be-corrected images by using the cumulative rotation angle corresponding to the maximum frequency among the frequencies of the binary images satisfying the preset conditions as a compensation angle.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0041] FIG. 1 is a flowchart of a text image correction method according to an embodiment of the present disclosure.

[0042] FIG. 2 is a schematic flowchart of an image correction process in an embodiment of the present disclosure.

[0043] FIG. 3 is a schematic flowchart of extracting a frequency of a row cumulant of a binary image in an embodiment of the present disclosure.

[0044] FIG. 4(a) is an example diagram of a binary image rotated by 0.5 degrees in an embodiment of the present disclosure.

[0045] FIG. 4(b) shows a frequency analysis curve of a row cumulative variable of a binary image rotated by 0.5 degrees in an embodiment of the present disclosure.

[0046] FIG. 4(c) shows a variation curve of a row cumulative variable of a binary image rotated by 0.5 degrees in an embodiment of the present disclosure.

[0047] FIG. 5(a) is an example diagram of a binary image rotated by 8.5 degrees in an embodiment of the present disclosure.

[0048] FIG. 5(b) shows a frequency analysis curve of a row cumulative variable of a binary image rotated by 8.5 degrees in an embodiment of the present disclosure.

[0049] FIG. 5(c) shows a variation curve of a row cumulative variable of a binary image rotated by 8.5 degrees in an embodiment of the present disclosure.

[0050] FIG. 6(a) is an example diagram of a binary image rotated by 69.5 degrees in an embodiment of the present disclosure.

[0051] FIG. 6(b) shows a frequency analysis curve of a row cumulative variable of a binary image rotated by 69.5 degrees in an embodiment of the present disclosure.

[0052] FIG. 6(c) shows a variation curve of a row cumulative variable of a binary image rotated by 69.5 degrees in an embodiment of the present disclosure.

[0053] FIG. 7 shows a relationship curve between a cumulative rotation angle and a corresponding row cumulant of a binary image in an embodiment of the present disclosure.

[0054] FIG. 8 is a schematic structure diagram of a text image correction apparatus according to an embodiment of the present disclosure.

DETAILED DESCRIPTION

[0055] The technical content of the present disclosure is described in detail below with reference to the accompanying drawings and specific embodiments.

[0056] As shown in FIG. 1, a text image correction method provided in this embodiment of the present disclosure mainly includes the following steps:

[0057] 101: Preprocess to-be-corrected images into binary images. Specifically:

[0058] 1011: Convert the to-be-corrected images into grayscale images.

[0059] As shown in FIG. 2, a to-be-corrected image (original text image) D.sup.o is read. The data matrix D.sup.o corresponding to the digital image has dimensions m×n×3, where m and n are the numbers of pixels along the height and width of the to-be-corrected image D.sup.o, respectively.

[0060] In general, the to-be-corrected image D.sup.o is a color image, represented by three color channels r/g/b, each having dimensions of m×n.

[0061] The to-be-corrected image D.sup.o is converted into a grayscale image D.sup.g according to the following formula:

$\begin{matrix} D^{g} = 0.2989 * R + 0.5870 * G + 0.1140 * B & (1) \end{matrix}$

[0062] In Formula (1), R/G/B represent red, green, and blue components in the original color image D.sup.o, respectively, and relationships between values of elements of a matrix and the original color image are as follows:

[00006] $\begin{matrix} \left\{ \begin{matrix} r_{(i, j)} = D^{o} (i, j, 1) \\ g_{(i, j)} = D^{o} (i, j, 2) \\ b_{(i, j)} = D^{o} (i, j, 3) \end{matrix} \right. & (2) \end{matrix}$

[0063] In Formula (2), r.sub.(i,j), g.sub.(i,j), and b.sub.(i,j) represent elements of an i.sup.th row and a j.sup.th column in the R/G/B matrix, respectively.
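The grayscale conversion of Formulas (1) and (2) can be sketched in a few lines of Python. This is only an illustration: the function name `to_grayscale`, the nested-list image layout, and the sample pixels are assumptions made for the example and are not part of the disclosure.

```python
def to_grayscale(image):
    """Convert an m x n image of (r, g, b) tuples into an m x n grayscale matrix.

    Applies Formula (1): D^g = 0.2989*R + 0.5870*G + 0.1140*B, element by element.
    """
    return [
        [0.2989 * r + 0.5870 * g + 0.1140 * b for (r, g, b) in row]
        for row in image
    ]


# Hypothetical 1 x 3 image: one white, one black, and one pure-red pixel.
sample = [[(255, 255, 255), (0, 0, 0), (255, 0, 0)]]
gray = to_grayscale(sample)
```

Because the three weights sum to 0.9999, a pure white pixel maps to slightly less than 255.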

[0064] 1012: Convert the grayscale images into the binary images based on a maximum between-class variance method.

[0065] After obtaining the grayscale image D.sup.g, in order to eliminate the influence of background light of a photographing environment on the determination of a text area, a suitable threshold T is calculated using a maximum between-class variance method, and the grayscale image is then converted into a binary image D.sup.b according to the following formula:

[00007] $\begin{matrix} d_{(i, j)}^{b} = \left\{ \begin{matrix} 0 & d_{(i, j)}^{g} \leq T \\ 1 & d_{(i, j)}^{g} > T \end{matrix} \right. & (3) \end{matrix}$

[0066] In Formula (3), d.sub.(i,j).sup.g and d.sub.(i,j).sup.b represent elements of an i.sup.th row and a j.sup.th column of D.sup.g and D.sup.b, respectively.

[0067] The maximum between-class variance method, also referred to as the Otsu method, is a classical and commonly used threshold selection algorithm. The method automatically selects a global threshold T from the histogram statistics of the whole image, and includes the following algorithm steps:

[0068] Step 1: Calculate a histogram of an image: statistically obtaining the number of pixels falling on each bin among 256 bins (0-255) of all pixel points of the image.

[0069] Step 2: Normalize the histogram: dividing the number of pixels in each bin by the total number of pixels.

[0070] Step 3: Start iteration from 0, where i represents a threshold of classification, namely a gray level.

[0071] Step 4: From the normalized histogram, obtain the ratio w0 of pixels with gray levels of 0 to i (referred to as foreground pixels) to the whole image and the average gray level u0 of the foreground pixels; likewise, obtain the ratio w1 of pixels with gray levels above i, up to 255 (referred to as background pixels), to the whole image and the average gray level u1 of the background pixels.

[0072] Step 5: Calculate the between-class variance of the foreground pixels and the background pixels: g=w0*w1*(u0−u1)*(u0−u1).

[0073] Step 6: Increment i and return to Step 4; end the iteration when i reaches 256.

[0074] Step 7: Take the i value corresponding to the maximum g as the global threshold of the image.
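Steps 1 to 7 above, followed by the binarization of Formula (3), can be sketched as follows. The function names `otsu_threshold` and `binarize` are hypothetical, and the sketch assumes the foreground class covers gray levels 0 to i and the background class covers levels i+1 to 255, so that the two classes do not overlap. Candidate thresholds that leave one class empty are skipped.

```python
def otsu_threshold(gray):
    """Return the gray level in 0..255 that maximizes the between-class variance g."""
    pixels = [min(max(int(round(v)), 0), 255) for row in gray for v in row]
    total = len(pixels)

    # Steps 1-2: histogram over 256 bins (kept as counts; ratios are formed below).
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    best_t, best_g = 0, -1.0
    for i in range(256):  # Steps 3 and 6: iterate the candidate threshold i
        n0 = sum(hist[: i + 1])  # foreground pixel count (levels 0..i)
        n1 = total - n0          # background pixel count (levels i+1..255)
        if n0 == 0 or n1 == 0:
            continue  # one class is empty, so g is not defined for this i
        # Step 4: class ratios and mean gray levels.
        w0, w1 = n0 / total, n1 / total
        u0 = sum(level * hist[level] for level in range(i + 1)) / n0
        u1 = sum(level * hist[level] for level in range(i + 1, 256)) / n1
        # Step 5: between-class variance.
        g = w0 * w1 * (u0 - u1) * (u0 - u1)
        if g > best_g:  # Step 7: remember the level with the largest g
            best_t, best_g = i, g
    return best_t


def binarize(gray, threshold):
    """Formula (3): 0 where the gray level is <= T, 1 where it is > T."""
    return [[0 if v <= threshold else 1 for v in row] for row in gray]


# Hypothetical two-level image: dark text pixels (10) and bright background (200).
toy = [[10, 10, 200, 200]]
T = otsu_threshold(toy)
binary = binarize(toy, T)
```

With this polarity, dark text pixels become 0 and bright background pixels become 1, so rows crossing text lines have a smaller row sum than blank rows.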

[0075] 102: Sequentially rotate the binary images in the same direction with a predetermined step size, record a cumulative rotation angle upon each rotation, and calculate a row cumulant of a current binary image until the current binary image is rotated to a threshold angle.

[0076] In order to extract distribution information of image text and blank space under different rotation angles, the binary image D.sup.b is rotated through a series of angles. At each angle, the two-dimensional information is reduced to one dimension by calculating a row cumulant, and a plurality of strong frequency components are then extracted by performing spectral analysis on the row cumulant and stored for subsequent analysis.

[0077] As shown in FIG. 3, the binary image D.sup.b is sequentially rotated in the same direction with an angle θ.sub.Δ as a step size, whereby a cumulative rotation angle θ.sub.t of the binary image D.sup.b is increased from 0 degrees to θ.sub.max. θ.sub.Δ and θ.sub.max are configurable algorithm parameters. In this embodiment, θ.sub.Δ is 0.5 degrees, and θ.sub.max is 180 degrees.

[0078] In one embodiment of the present disclosure, the same direction is clockwise or counterclockwise. Regardless of whether the to-be-corrected image D.sup.o is skewed clockwise or counterclockwise, there will always be a unique angle at which the text in the image is horizontal in the process of sequentially rotating in one direction through 180 degrees.

[0079] In one embodiment of the present disclosure, the step size θ.sub.Δ determines the accuracy of skew angle estimation and the size of system errors, and also affects the computational complexity and delay of the whole algorithm. Assuming that the cumulative rotation angle of the binary image D.sup.b is θ.sub.t (increased from 0 degrees to θ.sub.t by 0.5 degrees each time) and the current rotated image is D.sup.t, the rotation process is:

[00008] $\begin{matrix} \left\{ \begin{matrix} d_{(x, y)}^{t} = d_{(i, j)}^{b} \\ x = \lfloor \cos \theta_{t} * i + \sin \theta_{t} * j \rceil \\ y = \lfloor - \sin \theta_{t} * i + \cos \theta_{t} * j \rceil \end{matrix} \right. & (4) \end{matrix}$

[0080] In Formula (4), d.sub.(x,y).sup.t is an element of a rotated binary image in an x.sup.th row and a y.sup.th column, θ.sub.t is a rotation angle of the rotated binary image, d.sub.(i,j).sup.b is an element of an unrotated binary image in an i.sup.th row and j.sup.th column, and ⌊·⌉ represents rounding a result.
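A minimal Python sketch of the coordinate mapping in Formula (4) is given below. The disclosure does not state the rotation origin or how pixels that land outside the canvas are handled, so this sketch assumes rotation about the image center, an output canvas of the same size, and a caller-chosen `fill` value for unassigned cells. The function name `rotate_binary` is hypothetical.

```python
import math


def rotate_binary(image, angle_deg, fill=0):
    """Forward-map every pixel (i, j) to (x, y) as in Formula (4).

    Assumptions (not from the disclosure): coordinates are taken relative to
    the image center, and pixels mapped outside the canvas are dropped.
    """
    rows, cols = len(image), len(image[0])
    ci, cj = (rows - 1) / 2.0, (cols - 1) / 2.0
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    out = [[fill] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            di, dj = i - ci, j - cj
            # Python's round() rounds exact halves to the nearest even integer.
            x = int(round(c * di + s * dj + ci))
            y = int(round(-s * di + c * dj + cj))
            if 0 <= x < rows and 0 <= y < cols:
                out[x][y] = image[i][j]
    return out


# Hypothetical 3 x 3 image with a single set pixel in the top-left corner.
corner = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
same = rotate_binary(corner, 0.0)
quarter = rotate_binary(corner, 90.0)
```

Forward mapping with rounding can leave small holes at intermediate angles because several source pixels may round to the same target cell. An implementation that needs a hole-free result could instead iterate over target cells and apply the inverse mapping.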

[0081] After the rotated binary image is obtained, a row cumulant of the image may be calculated as S.sup.t.

[0082] 1021: Calculate a row cumulant of each row for the current binary image.

[0083] A calculation formula for the row cumulant of each row is:

$\begin{matrix} s_{i}^{t} = \sum_{j} d_{(i, j)}^{t} & (5) \end{matrix}$

[0084] In Formula (5), s.sub.i.sup.t is a row cumulant of an i.sup.th row of a current binary image D.sup.t, and d.sub.(i,j).sup.t is an element in the i.sup.th row and j.sup.th column of the current binary image D.sup.t.

[0085] 1022: Sequentially construct the row cumulant of each row into a column vector S.sup.t as the row cumulant of the current binary image.
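Steps 1021 and 1022 amount to summing each row of the current binary image and collecting the sums in order. A short Python sketch follows; the function name `row_cumulant` and the toy image are illustrative assumptions.

```python
def row_cumulant(binary):
    """Formula (5): s_i = sum over j of d_(i, j), collected row by row into S."""
    return [sum(row) for row in binary]


# Hypothetical 3 x 3 binary image.
toy = [
    [1, 1, 0],
    [0, 0, 0],
    [1, 0, 1],
]
S = row_cumulant(toy)
```

For a page of horizontal text, consecutive entries of S alternate between the values of text rows and blank rows, which is the periodic pattern the later spectral analysis looks for.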

[0086] It is assumed that in one embodiment of the present disclosure, the to-be-corrected image D.sup.o is skewed clockwise by 8.5 degrees. When the corresponding binary image D.sup.b is rotated counterclockwise by 0.5 degrees, that is, θ.sub.t=0.5 degrees, a variation curve of a row cumulative variable of the binary image D.sup.b is shown in FIG. 4(c). When θ.sub.t=8.5 degrees, the variation curve of the row cumulative variable of the binary image D.sup.b is shown in FIG. 5(c). When θ.sub.t=69.5 degrees, the variation curve of the row cumulative variable of the binary image D.sup.b is shown in FIG. 6(c). For θ.sub.t=0.5 degrees, θ.sub.t=8.5 degrees, and θ.sub.t=69.5 degrees, the corresponding S.sup.t is obtained by column sampling of the binary image D.sup.b at the same sampling rate F.sub.s. It can be seen from the corresponding variation curves that the row cumulative variable of the binary image changes markedly under different rotation angles. When the cumulative rotation angle θ.sub.t is equal to or close to the true skew angle of 8.5 degrees, the variation law of the row cumulative variable of the binary image is uniformly distributed, and the corresponding frequency value increases markedly. Therefore, it is possible to determine whether the text is horizontal from the law of the variation curve of the row cumulative variable of the binary image.

[0087] 103: Extract a frequency satisfying preset conditions for the row cumulant of each binary image. Specifically:

[0088] 1031: Perform sliding window smoothing filtering on the row cumulant of the current binary image to obtain a corresponding filtering sequence Q.sup.t.

[0089] As shown in FIG. 3, in order to prevent the influence of image noise, the row cumulative variable S.sup.t of the current binary image D.sup.t is subjected to sliding window smoothing filtering to remove spikes and obtain the corresponding filtering sequence Q.sup.t. A calculation formula for the filtering sequence Q.sup.t is:

[00009] $\begin{matrix} q_{i}^{t} = \frac{1}{L} \sum_{j = 0}^{L} s_{i + j}^{t} & (6) \end{matrix}$

[0090] In Formula (6), q.sub.i.sup.t is an i.sup.th element of Q.sup.t, L is a width of a sliding window, and s.sub.i+j.sup.t is a row cumulant of an (i+j).sup.th row in the current binary image D.sup.t.

[0091] In one embodiment of the present disclosure, when the sliding window exceeds an actual length of S.sup.t, an invalid element in the window defaults to 0.
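The smoothing step, including the zero default for elements beyond the end of S.sup.t, can be sketched as below. One point is an assumption: the sketch averages exactly L elements (offsets 0 to L−1) so that the result is a true window mean, whereas Formula (6) as printed runs the offset up to L. The function name `smooth` is hypothetical.

```python
def smooth(s, window):
    """Forward-looking moving average of width `window` over the sequence s.

    Elements past the end of s count as 0, as described for the sliding window.
    Assumption: the window spans offsets 0..window-1.
    """
    n = len(s)
    q = []
    for i in range(n):
        acc = 0.0
        for j in range(window):
            if i + j < n:  # out-of-range elements default to 0
                acc += s[i + j]
        q.append(acc / window)
    return q


# Hypothetical constant row cumulant; only the tail is affected by zero padding.
Q = smooth([3, 3, 3, 3], 2)
```

The zero default pulls the last few values of Q toward zero, which is visible in the final element of this example.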

[0092] 1032: Perform mean subtraction processing on the filtering sequence Q.sup.t to obtain a corresponding mean-subtracted sequence H.sup.t.

[0093] In order to prevent the influence of a direct current signal on the subsequent frequency analysis, mean subtraction processing is performed on the filtering sequence Q.sup.t. A calculation formula for the mean-subtracted sequence H.sup.t is:

$\begin{matrix} h_{i}^{t} = q_{i}^{t} - M(Q^{t}) & (7) \end{matrix}$

[0094] In Formula (7), h.sub.i.sup.t is an i.sup.th element of the mean-subtracted sequence H.sup.t, and M(*) represents calculation of a mean of an input sequence.

[0095] 1033: Perform spectral analysis on the mean-subtracted sequence H.sup.t to obtain a discrete sequence P.sup.t.

[0096] DFT spectral analysis is performed on the mean-subtracted sequence H.sup.t. A calculation formula for the discrete sequence P.sup.t is:

[00010] $\begin{matrix} p_{k}^{t} = \sum_{j = 0}^{N - 1} h_{j}^{t} * e^{- \frac{2 \pi i}{N} kj}, \quad k = 0, \ldots, N - 1 & (8) \end{matrix}$

[0097] In Formula (8), h.sub.j.sup.t is the j.sup.th element of the mean-subtracted sequence H.sup.t, p.sub.k.sup.t is a k.sup.th element of the discrete sequence P.sup.t, and N is a minimum power of 2 greater than a length of the sequence H.sup.t.
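Formulas (7) and (8) can be sketched together in Python using only the standard library. Two choices here are assumptions rather than statements from the disclosure: the sequence is extended to length N by appending zeros, and the magnitude of each complex DFT coefficient is returned so that the values can later be compared against a threshold. The function names are hypothetical.

```python
import cmath


def mean_subtract(q):
    """Formula (7): subtract the mean of Q from every element."""
    m = sum(q) / len(q)
    return [v - m for v in q]


def next_power_of_two_above(n):
    """Smallest power of 2 strictly greater than n."""
    p = 1
    while p <= n:
        p *= 2
    return p


def dft_magnitudes(h):
    """Formula (8) evaluated directly, returning |p_k| for k = 0..N-1.

    Assumption: h is zero-padded to length N before the transform.
    """
    n = next_power_of_two_above(len(h))
    padded = list(h) + [0.0] * (n - len(h))
    spectrum = []
    for k in range(n):
        acc = 0j
        for j, value in enumerate(padded):
            acc += value * cmath.exp(-2j * cmath.pi * k * j / n)
        spectrum.append(abs(acc))
    return spectrum


# Hypothetical filtered sequence that alternates every row.
H = mean_subtract([2.0, 0.0, 2.0, 0.0, 2.0, 0.0])
P = dft_magnitudes(H)
```

The direct double loop costs O(N²) operations per rotation angle. Choosing N as a power of 2 is what allows a radix-2 fast Fourier transform to be substituted in practice.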

[0098] In one embodiment of the present disclosure, the discrete sequence P.sup.t describes the magnitude of contribution of different frequency components to the mean-subtracted sequence H.sup.t, corresponding to the magnitude of the probability of regular distribution between text and blank space in the current binary image D.sup.t. Then, by frequency analysis on S.sup.t, variation law information of text and blank space in the binary image D.sup.t in the row direction with the variation of the cumulative rotation angle may be obtained, as shown in FIG. 4(c) to FIG. 6(c).

[0099] 1034: Extract a frequency f.sup.t satisfying the preset conditions from the discrete sequence P.sup.t.

[0100] The main frequency components in P.sup.t are extracted. As shown in Formula (9), each k satisfying the conditions is found and converted into the corresponding frequency f.sup.t, and the frequencies f.sup.t satisfying the conditions form an output vector F.sup.t.

[0101] A calculation formula for the frequency f.sup.t satisfying the preset conditions is:

[00011] $\begin{matrix} f^{t} = \frac{F_{s}}{2 N} k \quad \text{s.t.} \quad p_{k}^{t} \geq \alpha * M (P^{t}) & (9) \end{matrix}$

[0102] In Formula (9), Fs is a sampling rate of the row cumulant of the current binary image, and α is a configurable system parameter. In one embodiment of the present disclosure, the sampling rate Fs is an algorithm parameter with a value of 50 Hz, and α, which sets the standard for judging the magnitude of frequency contribution, has a value of 15. If α is too small, more frequency components will be misjudged as significant. Conversely, if α is too large, informative frequency components may be missed. Ideally, one to two frequency components with clearly greater contribution are selected in each rotation.
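The selection rule of Formula (9) can be sketched as follows, with the default values Fs = 50 and a threshold factor of 15 taken from the embodiment. The function name `extract_frequencies` and the parameter name `alpha` are hypothetical. The sketch also assumes that `p` holds coefficient magnitudes and that only the first half of the bins is scanned, since the DFT magnitudes of a real sequence are mirror-symmetric and the mirrored bins would otherwise be reported as spurious high frequencies.

```python
def extract_frequencies(p, fs=50.0, alpha=15.0):
    """Return the frequencies fs / (2N) * k whose magnitude p[k] >= alpha * mean(p).

    Assumption: only bins k < N / 2 are scanned.
    """
    n = len(p)
    mean_p = sum(p) / n
    if mean_p == 0:
        return []  # a flat, all-zero spectrum has no dominant component
    return [fs / (2.0 * n) * k for k in range(n // 2) if p[k] >= alpha * mean_p]


# Hypothetical 64-bin magnitude spectrum with one dominant bin and its mirror.
spectrum = [0.0] * 64
spectrum[5] = 100.0
spectrum[59] = 100.0
F = extract_frequencies(spectrum)
```

Because the threshold is a multiple of the spectrum mean, at most N divided by the threshold factor bins can pass it, which keeps the output vector short.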

[0103] After extracting the frequency of the current binary image D.sup.t satisfying the preset conditions, the frequency satisfying the preset conditions is associated with the correspondingly recorded cumulative rotation angle θ.sub.t. Then, θ.sub.t is updated according to the step size θ.sub.Δ:

$\begin{matrix} \theta_{t} := \theta_{t} + \theta_{\Delta} & (10) \end{matrix}$

[0104] The extraction and calculation process for the next binary image D.sup.t is then entered, and the process ends when θ.sub.t=180 degrees.

[0105] 104: Correct the to-be-corrected images by using the cumulative rotation angle corresponding to the maximum frequency among the frequencies of the binary images satisfying the preset conditions as a compensation angle.

[0106] Ideally, if the text in the binary image is not skewed, the corresponding row cumulative curve of the image should oscillate according to the law of FIG. 5(c) and has good frequency characteristics. On the contrary, if the frequency information of the row cumulative curve of the image may be extracted under different rotation angles, the skew angle of the text may also be estimated.

[0107] A frequency component sequence ℱ is formed by splicing the output vector F.sup.t of each binary image end to end. The highest frequency is searched for among the elements of ℱ. Then, the cumulative rotation angle corresponding to the highest frequency is obtained from the recorded cumulative rotation angles as a compensation angle:

[00012] $\begin{matrix} \tilde{\theta} = \underset{\theta_{t}}{\arg\max} (\tilde{f}) \quad \text{s.t.} \quad \tilde{f} = \max (\mathcal{F}) & (11) \end{matrix}$
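The selection in Formula (11) reduces to finding the largest extracted frequency and returning the angle recorded with it. A minimal Python sketch follows; the function name `compensation_angle` and the list-of-pairs layout are assumptions, while the three sample records reuse the angles and peak frequencies reported in the application example of this disclosure.

```python
def compensation_angle(frequencies_by_angle):
    """Return (angle, frequency) for the highest frequency over all rotations.

    frequencies_by_angle is a list of (theta_t, F_t) pairs, where F_t is the
    list of frequencies extracted at cumulative rotation angle theta_t.
    """
    best_angle, best_freq = None, float("-inf")
    for angle, freqs in frequencies_by_angle:
        for f in freqs:
            if f > best_freq:  # ties keep the first (smallest) angle
                best_angle, best_freq = angle, f
    return best_angle, best_freq


# Angles in degrees and peak frequencies in Hz from the application example.
records = [(0.5, [0.073242]), (8.5, [1.001]), (69.5, [0.24414])]
angle, freq = compensation_angle(records)  # -> (8.5, 1.001)
```

Angles whose output vector is empty simply contribute nothing to the search.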

[0108] After obtaining the compensation angle {tilde over (θ)}, the binary image D.sup.b before rotation is rotated by {tilde over (θ)} to output a corrected binary image {tilde over (D)}.sup.b:

[00013] $\begin{matrix} \left\{ \begin{matrix} {\tilde{d}}_{(x, y)}^{b} = d_{(i, j)}^{b} \\ x = \lfloor \cos \tilde{\theta} * i + \sin \tilde{\theta} * j \rceil \\ y = \lfloor - \sin \tilde{\theta} * i + \cos \tilde{\theta} * j \rceil \end{matrix} \right. & (12) \end{matrix}$

[0109] In Formula (12), d.sub.(i,j).sup.b represents an element of an i.sup.th row and a j.sup.th column of the image matrix D.sup.b, and {tilde over (d)}.sub.(x,y).sup.b represents an element of an x.sup.th row and a y.sup.th column of the corrected image matrix {tilde over (D)}.sup.b.

[0110] In one embodiment of the present disclosure, the to-be-corrected image D.sup.o is skewed clockwise by 8.5 degrees, and the to-be-corrected image D.sup.o may also be rotated counterclockwise by 8.5 degrees after obtaining the compensation angle {tilde over (θ)} of 8.5 degrees.

[0111] The above technical solution is described in detail in combination with application examples as follows:

[0112] A to-be-corrected image D.sup.o is obtained and converted into a binary image D.sup.b. The binary image D.sup.b is skewed clockwise by 8.5 degrees. The binary image D.sup.b is rotated counterclockwise through 180 degrees with a step size of 0.5 degrees:

[0113] When the binary image D.sup.b is rotated counterclockwise by 0.5 degrees, an example diagram of the current binary image D.sup.t is shown in FIG. 4(a), and text in the current binary image D.sup.t is still skewed. A corresponding row cumulant S.sup.t is calculated for the binary image D.sup.t in which θ.sub.t=0.5 degrees. Then, a frequency component is extracted for the row cumulant S.sup.t, as shown in FIG. 4(c). Then, a corresponding output vector F.sup.t is obtained, as shown in FIG. 4(b). It can be seen that when θ.sub.t=0.5 degrees, the highest frequency is 0.073242 Hz.

[0114] When the binary image D.sup.b is continuously rotated counterclockwise to 8.5 degrees, an example diagram of the current binary image D.sup.t is shown in FIG. 5(a), and the text in the current binary image D.sup.t is horizontal. The corresponding row cumulant S.sup.t is calculated for the binary image D.sup.t in which θ.sub.t=8.5 degrees. Then, the frequency component is extracted for the row cumulant S.sup.t, as shown in FIG. 5(c). Then, the corresponding output vector F.sup.t is obtained, as shown in FIG. 5(b). It can be seen that when θ.sub.t=8.5 degrees, the highest frequency is 1.001 Hz, which is greater than 0.073242 Hz.

[0115] When the binary image D.sup.b is continuously rotated counterclockwise to 69.5 degrees, an example diagram of the current binary image D.sup.t is shown in FIG. 6(a), and the text in the current binary image D.sup.t is seriously skewed. The corresponding row cumulant S.sup.t is calculated for the binary image D.sup.t in which θ.sub.t=69.5 degrees. Then, the frequency component is extracted for the row cumulant S.sup.t, as shown in FIG. 6(c). Then, the corresponding output vector F.sup.t is obtained, as shown in FIG. 6(b). It can be seen that when θ.sub.t=69.5 degrees, the highest frequency is 0.24414 Hz, which is less than 1.001 Hz.

[0116] As shown in FIG. 7, the highest point represents that θ.sub.t=8.5 degrees and {tilde over (f)}=1.001 Hz. Therefore, when the rotation angle is close to or equal to a true skew compensation angle, the main contribution frequency of the row cumulative curve of the binary image is significantly increased.

[0117] It can be seen that the corresponding highest frequency is the highest among all the frequencies generated in the rotation process when the text is horizontal in the binary image D.sup.t, thus verifying the rationality of the text image correction method.
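As a consistency check, the three peak frequencies reported in this example agree with Formula (9) at Fs = 50 Hz if the transform length is taken to be N = 1024. Both N and the bin indices k below are inferred from the reported values and are not stated in the disclosure.

```latex
\begin{aligned}
f &= \frac{F_s}{2N}\,k = \frac{50}{2 \cdot 1024}\,k \\
k = 3  &: \quad f = 0.0732421875 \approx 0.073242\ \text{Hz} \quad (\theta_t = 0.5^\circ) \\
k = 41 &: \quad f = 1.0009765625 \approx 1.001\ \text{Hz} \quad (\theta_t = 8.5^\circ) \\
k = 10 &: \quad f = 0.244140625 \approx 0.24414\ \text{Hz} \quad (\theta_t = 69.5^\circ)
\end{aligned}
```

Under this reading, the frequency resolution is 50/2048 ≈ 0.0244 Hz per bin, and the horizontal-text case peaks at a much higher bin than either skewed case.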

[0118] In order to realize the text image correction method provided by the present disclosure, the present disclosure also provides a text image correction apparatus. As shown in FIG. 8, the text image correction apparatus includes a processor 82 and a memory 81, and may further include a communication assembly, a sensor assembly, a power assembly, a multimedia assembly, and an input/output interface as required. The memory, the communication assembly, the sensor assembly, the power assembly, the multimedia assembly, and the input/output interface are all connected to the processor 82.

[0119] In the text image correction apparatus, the processor 82 reads a computer program in the memory 81 to perform the following operations:

[0120] preprocessing to-be-corrected images into binary images;

[0121] sequentially rotating the binary images in the same direction with a predetermined step size, recording a cumulative rotation angle upon each rotation, and calculating a row cumulant of a current binary image until the current binary image is rotated to a threshold angle;

[0122] extracting a frequency satisfying preset conditions for the row cumulant of each binary image; and

[0123] correcting the to-be-corrected images by using the cumulative rotation angle corresponding to the maximum frequency among the frequencies of the binary images satisfying the preset conditions as a compensation angle.

[0124] The text image correction method and the text image correction apparatus provided by the present disclosure may estimate the compensation angle of the skew angle more accurately, and may correct skewed text and pictures. Frequency information of a row cumulative curve used is sensitive to an error between a compensation angle of a skew angle and a true skew angle, and therefore, the robustness is excellent. The text image correction method and the text image correction apparatus may be applied to image preprocessing, automatic angle compensation for scanned text and images, automatic compensation for phone camera skew angles, and other scenarios.

[0125] The text image correction method and apparatus provided by the present disclosure are described in detail above. Any obvious change made to the present disclosure without departing from the essence of the present disclosure will constitute infringement of the patent right of the present disclosure, and those of ordinary skill in the art will bear corresponding legal responsibilities.

Cpc classification

G06V30/1478 (PHYSICS)

G06V10/242 (PHYSICS)

International classification

G06V30/146 (PHYSICS)

G06V10/24 (PHYSICS)
