System for analyzing a document and corresponding method
11057536 · 2021-07-06
Assignee
Inventors
- Tobias Schoen (Nuremberg, DE)
- Wolfgang Holub (Fuerth, DE)
- Daniel Stromer (Hoechstadt, DE)
- Andreas Maier (Erlangen, DE)
- Gisela Anton (Erlangen, DE)
- Michel Thilo (Nuremberg, DE)
- Martin Vossiek (Fuerth, DE)
- Jan Schuer (Langensendelbach, DE)
Cpc classification
H04N1/00827
ELECTRICITY
G01N23/041
PHYSICS
H04N1/04
ELECTRICITY
International classification
G01N23/041
PHYSICS
H04N1/203
ELECTRICITY
Abstract
The invention refers to a system for analyzing a document. Each of two measurement arrangements includes two components being a radiation source and a radiation detector, respectively. The two measurement arrangements provide measurement data and differ from each other with regard to a measurement principle, a kind of radiation source, a kind of radiation detector, a kind of relative movement between the document and a component, a kind of relative arrangement of the two components to each other, a kind of emitted radiation, a kind of received radiation or a kind of processing information about radiation emitted by the respective radiation source and/or about radiation received by the respective radiation detector. An evaluator provides data based on the measurement data. The invention also refers to a corresponding method.
Claims
1. A system for analyzing at least one document, comprising at least two measurement arrangements and an evaluator, wherein each of the at least two measurement arrangements comprises at least two components, where one component of the at least two components is a radiation source and another component of the at least two components is a radiation detector, wherein each of the at least two measurement arrangements provides measurement data based on radiation emitted by the respective radiation source and radiation received by the respective radiation detector, wherein the at least two measurement arrangements differ from each other with regard to a measurement principle, and wherein the evaluator provides data concerning the at least one document based on the measurement data provided by the at least two measurement arrangements.
2. The system of claim 1, wherein the radiation emitted by at least one radiation source and received by at least one radiation detector of the at least two measurement arrangements are X-rays.
3. The system of claim 1, wherein at least one measurement arrangement of the at least two measurement arrangements provides the measurement data based on Computed Tomography.
4. The system of claim 1, wherein at least one measurement arrangement of the at least two measurement arrangements provides the measurement data based on Terahertz time-domain spectroscopy.
5. The system of claim 1, wherein at least one measurement arrangement of the at least two measurement arrangements provides the measurement data based on phase contrast images.
6. The system of claim 1, wherein at least one measurement arrangement of the at least two measurement arrangements provides the measurement data based on dark-field imaging.
7. The system of claim 1, wherein at least one measurement arrangement of the at least two measurement arrangements provides the measurement data based on absorption imaging.
8. The system of claim 1, wherein at least one measurement arrangement of the at least two measurement arrangements provides the measurement data based on a Talbot-Lau method.
9. The system of claim 1, wherein at least one measurement arrangement of the at least two measurement arrangements provides the measurement data based on a relative rotation of the at least one document and at least one component of the at least two components around each other with an axis of rotation being orthogonal to a plane of the at least one document.
10. The system of claim 9, wherein the measurement arrangement is configured to rotate at least one component of the at least two components around the at least one document.
11. The system of claim 1, wherein the radiation source of at least one measurement arrangement of the at least two measurement arrangements emits radiation towards a second plane perpendicular to a first plane of the at least one document, where the at least one document has a larger extension within the first plane than in the second plane.
12. The system of claim 1, wherein the radiation source of one measurement arrangement of the at least two measurement arrangements emits radiation towards a plane of the at least one document, where the document has a largest extension within the plane.
13. The system of claim 1, wherein at least one measurement arrangement of the at least two measurement arrangements provides the measurement data based on a relative movement between the at least one document and at least one component of the at least two components along a movement axis within a plane parallel to a plane of the at least one document and wherein the document has a largest extension within the plane.
14. The system of claim 1, wherein the evaluator provides data concerning a text written within the at least one document.
15. The system of claim 1, wherein the evaluator provides data concerning a structural variation and/or a density fluctuation and/or a thickness of at least one page of the at least one document.
16. The system of claim 1, wherein the evaluator provides data concerning a thickness of a distance between two pages of the at least one document.
17. The system of claim 1, wherein the system further comprises a holder configured for holding at least two documents.
18. The system of claim 1, wherein a first measurement arrangement of the at least two measurement arrangements provides the measurement data based on Terahertz time-domain spectroscopy and a second measurement arrangement of the at least two measurement arrangements provides the measurement data based on Computed Tomography or based on phase contrast images.
19. The system of claim 1, wherein a first measurement arrangement of the at least two measurement arrangements provides the measurement data based on phase contrast images and a second measurement arrangement of the at least two measurement arrangements provides the measurement data based on dark-field imaging.
20. A method for analyzing at least one document, comprising: performing at least two measurements using at least two measurement arrangements, wherein each of the at least two measurement arrangements comprises at least two components, where one component of the at least two components is a radiation source and another component of the at least two components is a radiation detector, wherein the at least two measurement arrangements differ from each other with regard to a realized measurement principle, providing measurement data based on radiation emitted by the radiation sources and radiation received by the respective radiation detectors, and providing data concerning the at least one document based on the measurement data.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) Embodiments of the present invention will be detailed subsequently referring to the appended drawings, in which:
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
(11)
(12)
(13)
DETAILED DESCRIPTION OF THE INVENTION
(14)
(15) Each of the two measurement arrangements 10, 20 comprises two components 11, 12, 21, 22. In the shown embodiment, the connecting lines between the two associated components 11, 12, 21, 22 are perpendicular to each other.
(16) Each measurement arrangement 10, 20 has one component that serves as a radiation source 11, 21 and one component that is configured as a radiation detector 12, 22. The respective measurement data 13, 23 are based on the radiation emitted by the radiation sources 11, 21 and the radiation received by the radiation detectors 12, 22. The measurement data 13, 23 are in an embodiment raw data concerning the radiation. In a different embodiment, the information about the radiation is at least pre-processed.
(17) The measurement arrangements 10, 20 differ from each other. The possible differences refer, for example, to: the measurement principles that are realized by the measurement arrangements 10, 20, the specific radiation sources that are comprised by the measurement arrangements 10, 20, the kind of radiation detectors 12, 22 used by the measurement arrangements 10, 20, a relative movement between the document 100 and at least one component 11, 12, 21, 22 of the measurement arrangements 10, 20, the way in which the components 11, 12, 21, 22 belonging to the same measurement arrangement 10, 20 are arranged relative to each other, the emitted radiation, the received radiation, the kind of information relevant for the processing and obtaining of the measurement data concerning the emitted radiation and/or received radiation.
(18) In some embodiments, the same components are used but in a different way or with a different relative location and/or movement with regard to the document.
(19) In the shown embodiment, the measurement arrangements 10, 20 are differently arranged with respect to the document 100. Further, the radiation detector 12 of the first measurement arrangement 10 is connected with the radiation source 11 that provides the measurement data 13. Contrary to this, the radiation source 21 of the second measurement arrangement 20 is connected with the radiation detector 22 that is configured to provide based on the emitted and received radiation the measurement data 23.
(20) Thus, the measurement arrangements 10, 20 allow different measurement data 13, 23 to be obtained due to the measurements as such and/or due to different steps performed during the evaluation process for providing the respective measurement data.
(21) The different measurement data 13, 23 increase the information about the document 100 which is used by the evaluator 50 for generating the analysis data 101.
(22) The requirements for the measurement arrangements and the way of merging the measurement data are based in an embodiment on the following optimization.
(23) The modalities refer to the different kinds of measurements realized by the measurement arrangements.
(24) If N.sub.m modalities m are used to reconstruct a document, then the following optimization problem is generated:
(25) $$\min_{x} \; \sum_{m=1}^{N_m} \lambda_m \, \| A_m x - p_m \|_2^2 \; + \; \sum_{r=1}^{N_r} \lambda_r \, R_r(x)$$
(26) With the following variables: The variable m describes a modality. The variable N.sub.m describes the total number of modalities used for the reconstruction. The variable A.sub.m describes a system of linear equations of a modality m. The variable x describes an intensity value of a voxel in a reconstruction volume. The variable p.sub.m describes a projection of a modality m. The variable N.sub.r describes the total number of regularizers r. The variable r describes a regularizer. The variable R.sub.r (x) describes a regularizing function of a regularizer r. The variables λ.sub.m, λ.sub.r describe Lagrange multipliers.
(27) The complete optimization problem is split into N.sub.m independent problems with N.sub.r additional regularization problems.
(28) The individual minimization problems for the reconstructions (∥A.sub.mx−p.sub.m∥.sub.2.sup.2) are in one embodiment solved by using common iterative solution methods (e.g. gradient-descent approaches) and for each modality the best fitting regularizers are chosen. The generated modality projections p.sub.m have a certain resolution and image contrast, limited by the specific modality. However, the overall resolution of the reconstruction system is only limited by the resolution of the modality with the highest resolution.
(29) As each modality has its own strengths, a smart combination of the modalities results in an enhanced reconstruction that combines the strengths and compensates for the weaknesses of each modality.
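The iterative solution of one individual data term can be sketched as follows. This is a minimal sketch assuming plain gradient descent with a fixed step size; the function name `solve_modality`, the random system matrix standing in for A.sub.m and the iteration count are illustrative assumptions and are not taken from the described embodiment.

```python
import numpy as np

def solve_modality(A, p, n_iter=2000):
    """Minimize ||A x - p||_2^2 by gradient descent (illustrative sketch)."""
    # Fixed step 1/L, where L = 2 * sigma_max(A)^2 is the Lipschitz
    # constant of the gradient 2 * A^T (A x - p).
    step = 1.0 / (2.0 * np.linalg.norm(A, 2) ** 2)
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        gradient = 2.0 * A.T @ (A @ x - p)
        x = x - step * gradient
    return x

# Hypothetical, noise-free toy data: a random matrix stands in for the
# system of linear equations A_m of one modality.
rng = np.random.default_rng(0)
A_m = rng.normal(size=(20, 5))
x_true = np.array([0.0, 1.0, 1.0, 0.5, 0.0])
p_m = A_m @ x_true
x_hat = solve_modality(A_m, p_m)
```

In practice the step size and the stopping criterion would be chosen per modality, together with the regularizers that fit it best.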
(30) In an embodiment, a scanning system consisting of three modalities is used:
(31) 1. X-Ray CT (X-Ray)
(32) 2. Phase-Contrast system (PC)
(33) 3. Terahertz Scanner (THz)
(34) The following optimization problem results:
$$\min_{x} \; \lambda_{\mathrm{X\text{-}Ray}} \cdot \| A_{\mathrm{X\text{-}Ray}}\, x - p_{\mathrm{X\text{-}Ray}} \|_2^2 + \lambda_{\mathrm{PC}} \cdot \| A_{\mathrm{PC}}\, x - p_{\mathrm{PC}} \|_2^2 + \lambda_{\mathrm{THz}} \cdot \| A_{\mathrm{THz}}\, x - p_{\mathrm{THz}} \|_2^2 + \lambda_{\mathrm{TV}} \cdot \| \Psi x \|_1$$
(35) In an embodiment, a Total Variation Regularization is chosen. This is based on the assumption that the measured signals are piecewise constant. Ψ is a sparsifying transformation (such as the gradient) applied on the reconstructions x [15].
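The three-modality problem with a Total Variation term can be illustrated on a one-dimensional, piecewise constant signal. The following is a sketch under stated assumptions, not the implementation of the embodiment: the system matrices are random stand-ins, the weights are set to arbitrary values, the sparsifying transformation is taken to be the finite-difference gradient, and the 1-norm is replaced by a smoothed approximation (parameter `eps`) so that plain gradient descent can be applied. All function and variable names are hypothetical.

```python
import numpy as np

def tv_smooth(x, eps):
    """Smoothed 1-norm of the finite-difference gradient of x."""
    d = np.diff(x)
    return np.sum(np.sqrt(d * d + eps * eps))

def tv_smooth_grad(x, eps):
    """Gradient of tv_smooth with respect to x."""
    d = np.diff(x)
    w = d / np.sqrt(d * d + eps * eps)
    g = np.zeros_like(x)
    g[:-1] -= w
    g[1:] += w
    return g

def objective(x, systems, projections, weights, lam_tv, eps=1e-3):
    data = sum(w * np.sum((A @ x - p) ** 2)
               for A, p, w in zip(systems, projections, weights))
    return data + lam_tv * tv_smooth(x, eps)

def reconstruct(systems, projections, weights, lam_tv, n_iter=5000, eps=1e-3):
    x = np.zeros(systems[0].shape[1])
    # Upper bound on the Lipschitz constant of the full gradient:
    # data terms plus smoothed TV term (||D||^2 <= 4, curvature <= 1/eps).
    lipschitz = sum(2.0 * w * np.linalg.norm(A, 2) ** 2
                    for A, w in zip(systems, weights)) + 4.0 * lam_tv / eps
    step = 1.0 / lipschitz
    for _ in range(n_iter):
        g = sum(2.0 * w * (A.T @ (A @ x - p))
                for A, p, w in zip(systems, projections, weights))
        x = x - step * (g + lam_tv * tv_smooth_grad(x, eps))
    return x

# Hypothetical toy data: one random system matrix per modality.
rng = np.random.default_rng(0)
x_true = np.concatenate([np.full(15, 1.0), np.full(10, 3.0), np.full(15, 2.0)])
names = ["X-Ray", "PC", "THz"]
systems = [rng.normal(size=(30, 40)) for _ in names]
projections = [A @ x_true + 0.01 * rng.normal(size=30) for A in systems]
weights = [1.0, 1.0, 1.0]   # assumed values for the data-term weights
lam_tv = 0.1                # assumed value for the TV weight
x_hat = reconstruct(systems, projections, weights, lam_tv)
```

Each stand-in matrix alone is underdetermined (30 equations for 40 unknowns); only the sum of the three data terms determines the signal, which mirrors the idea of combining modalities.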
(36) The three acquired projections have different resolutions and the image intensities/contrasts derive from different physical phenomena:
(37) The modalities and the corresponding physical phenomena and resolutions are: For p.sub.X-Ray, these are X-Ray absorption and detector resolution. For p.sub.PC, these are micro dispersions and compression. For p.sub.THz, these are transmission and camera resolution.
(38) The addition of the modalities results in a final enhanced reconstruction where the strengths of the specific modalities are combined.
(39) In
(40) Shown is a THz-TDS system in which the THz source 11 emits a THz pulse which is given by the lower graph. This pulse is reflected by the document 100 which is here a book. The upper graph shows the reflected pulse. The evaluation of the emitted and the reflected pulse allows the different pages 105, 105 to be scanned and the air gap 106 between the pages 105, 105 to be evaluated.
(41) A difference in reflection factors between unwritten paper and paper written on with ink can be detected by THz imaging. In the case of a paper stack, the reflected THz waves can be assigned to the respective sheet by time-domain windowing. The composition of ink and paper influences the reflection factor and thus the quality of the image.
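The assignment of reflections to individual sheets by time-domain windowing can be sketched as follows. This is a minimal sketch on a synthetic trace; the Gaussian echo shape, the sampling interval, the echo delays and the window edges are hypothetical values chosen for illustration, and `window_peaks` is not a function of any THz-TDS software.

```python
import numpy as np

def window_peaks(trace, dt, window_edges):
    """Return (delay, amplitude) of the strongest echo in each time window."""
    peaks = []
    for t0, t1 in zip(window_edges[:-1], window_edges[1:]):
        i0, i1 = int(round(t0 / dt)), int(round(t1 / dt))
        segment = trace[i0:i1]
        k = int(np.argmax(np.abs(segment)))
        peaks.append(((i0 + k) * dt, float(segment[k])))
    return peaks

# Synthetic reflected signal: one echo per sheet (assumed delays in ps).
dt = 0.01
t = np.arange(600) * dt
echo_delays = [1.0, 3.0, 5.0]
echo_amplitudes = [1.0, 0.6, 0.35]
trace = sum(a * np.exp(-((t - d) / 0.1) ** 2)
            for a, d in zip(echo_amplitudes, echo_delays))

# One window per sheet; each echo is evaluated only within its window.
peaks = window_peaks(trace, dt, [0.0, 2.0, 4.0, 6.0])
```

Repeating this for every lateral scan position would yield one amplitude image per sheet, in which inked and unwritten regions differ by their reflection factor.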
(42) Compared to computed tomography and optical imaging methods, THz imaging systems have a relatively low lateral resolution due to the comparatively large wavelength. However, due to the permeability of some materials, there are advantages over optical methods. Furthermore, electromagnetic waves in the frequency range used are non-ionizing and therefore harmless to health.
(43) In the embodiment of
(44) For the measurements, it is taken into account that paper mainly consists of cellulose. A widely used ink since the third century before Christ is iron gall ink. There are many historical recipes to produce iron gall ink [7], but all of them are based on the same ingredients: iron salt, tannic acid and gum arabic. One can remove the ink from the paper with simple methods, but particles that penetrated deeper layers of the paper will still be present. The fact that the ink has metallic particles leads to the presumption that X-Ray radiation should be able to image those particles such that the ink can be differentiated from the paper through higher absorption of the metal compared to cellulose. This allows even erased parts of a writing to be reconstructed.
(45) In the shown embodiment, the rotation plane for the full circle scan is placed in the plane of the pages such that the axis of rotation is orthogonal to the book's front cover and a scan is performed around the document 100. In the embodiment, an FDK method (named after the authors Feldkamp, Davis, and Kress) is used for the 3-D reconstruction; it consists of a cosine weighting, a ramp filtering and a back projection step [14]. The voxel size for the reconstruction was set to 68.14 μm × 68.14 μm × 68.14 μm.
(46) In the shown embodiment, two imaging modalities which are based on X-rays are used: phase contrast images and dark-field images. In phase contrast images, the change of phase when X-rays enter and leave the material can be visualized. This change of phase causes a deflection of the X-ray wave front. While the phase signal is a measure of the large scale variation of the wave front, the dark-field images describe the small scale irregularity of the wave front which can be caused by objects which are smaller than the pixel size. Thus, the dark-field gives information on structural variations and density fluctuation. The exploitation of phase and dark-field signals for imaging has been enabled by the so-called Talbot-Lau method, which has only recently become applicable for X-rays.
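The first two FDK steps named above, cosine weighting and ramp filtering, can be sketched for a single flat-detector projection. This is an illustrative sketch only: the detector pixel pitch and the source-detector distance are hypothetical values, the ramp filter is applied without zero padding or apodization, and the back projection step is omitted.

```python
import numpy as np

def cosine_weights(n_rows, n_cols, pixel_pitch, source_detector_dist):
    """Cosine of the angle between each ray and the central ray."""
    v = (np.arange(n_rows) - (n_rows - 1) / 2.0) * pixel_pitch
    u = (np.arange(n_cols) - (n_cols - 1) / 2.0) * pixel_pitch
    uu, vv = np.meshgrid(u, v)  # shape (n_rows, n_cols)
    return source_detector_dist / np.sqrt(
        source_detector_dist ** 2 + uu ** 2 + vv ** 2)

def ramp_filter_rows(projection, pixel_pitch):
    """Multiply each detector row by |f| in the frequency domain."""
    freqs = np.fft.fftfreq(projection.shape[1], d=pixel_pitch)
    spectrum = np.fft.fft(projection, axis=1) * np.abs(freqs)
    return np.real(np.fft.ifft(spectrum, axis=1))

# Hypothetical geometry (units: mm) and a flat test projection.
projection = np.ones((5, 7))
weights = cosine_weights(5, 7, pixel_pitch=0.05, source_detector_dist=1000.0)
filtered = ramp_filter_rows(weights * projection, pixel_pitch=0.05)

# The ramp filter removes the constant part of an unweighted flat row.
flat_filtered = ramp_filter_rows(projection, pixel_pitch=0.05)
```

The filtered projections of all rotation angles would then be back projected into the voxel grid to obtain the 3-D volume.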
(47) Hence, due to the different imaging modalities the same components are used for two different measurement arrangements.
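How one set of components can yield several signals by evaluation alone can be illustrated with the phase-stepping evaluation commonly used with grating interferometers. The sketch below assumes that one grating is stepped over exactly one period and that a reference scan without the document is available; these assumptions, the synthetic intensity values and all function names are illustrative and not taken from the described embodiment.

```python
import numpy as np

def stepping_coefficients(curve):
    """Mean, modulation amplitude and phase of a phase-stepping curve."""
    n = len(curve)
    spectrum = np.fft.fft(curve)
    mean = spectrum[0].real / n
    amplitude = 2.0 * np.abs(spectrum[1]) / n
    phase = np.angle(spectrum[1])
    return mean, amplitude, phase

def retrieve_signals(sample_curve, reference_curve):
    """Absorption, differential phase and dark-field for one pixel."""
    a0_s, a1_s, phi_s = stepping_coefficients(sample_curve)
    a0_r, a1_r, phi_r = stepping_coefficients(reference_curve)
    transmission = a0_s / a0_r                    # absorption signal
    dark_field = (a1_s / a0_s) / (a1_r / a0_r)    # visibility ratio
    diff_phase = np.angle(np.exp(1j * (phi_s - phi_r)))  # wrapped to (-pi, pi]
    return transmission, diff_phase, dark_field

# Synthetic stepping curves for one detector pixel (8 grating steps).
steps = 2.0 * np.pi * np.arange(8) / 8
reference = 100.0 * (1.0 + 0.5 * np.cos(steps))
sample = 80.0 * (1.0 + 0.3 * np.cos(steps + 0.4))
transmission, diff_phase, dark_field = retrieve_signals(sample, reference)
```

All three values come from the same stepping curves, so the measurement arrangements differ only in which quantity is evaluated.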
(48)
(49) A first measurement arrangement 10 allows an X-ray-measurement of the document 100. The radiation source 11 and the radiation detector 12 are arranged so that the book as an example of the document 100 lies between them and that the radiation is directed towards a second plane 116 which is perpendicular to the planes of the individual pages of the book. The radiation source 11 and the radiation detector 12 rotate around the document 100.
(50) The second measurement arrangement 20 comprises one unit serving as radiation source 21 and radiation detector 22, where the radiation consists of THz pulses directed towards the first plane 115 and here also towards the written surface of the pages of the document 100. The THz pulses are emitted and the reflected signals are detected. For the measurement, the radiation source 21 and radiation detector 22 are moved along a movement axis 25 that is here parallel to the first plane 115. This measurement provides depth information.
(51) The measurement data of both measurement arrangements 10, 20 are combined to obtain the desired analysis data. In an embodiment, the THz radiation settings are calculated using the information about the thickness of the pages and of the air gaps between them.
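One way in which thickness information could feed into the THz settings is the prediction of the round-trip delay at which each page interface is expected to reflect. The sketch below assumes normal incidence and a round-trip delay of 2·n·d/c per layer; the refractive index of 1.5 for paper and the layer thicknesses are assumed example values, and `interface_delays` is a hypothetical helper.

```python
C_MM_PER_PS = 0.299792458  # speed of light in vacuum, mm per picosecond

def interface_delays(layers):
    """Round-trip delay (ps) to each interface of a layer stack.

    layers: list of (thickness_mm, refractive_index), front to back.
    The front surface is taken as delay 0.
    """
    delays = [0.0]
    for thickness_mm, refractive_index in layers:
        round_trip = 2.0 * refractive_index * thickness_mm / C_MM_PER_PS
        delays.append(delays[-1] + round_trip)
    return delays

# Page, air gap, page; thicknesses as they might be taken from the X-ray data.
stack = [(0.10, 1.5), (0.05, 1.0), (0.10, 1.5)]
delays = interface_delays(stack)
```

The predicted delays could then serve as edges for time windows that separate the echoes of the individual pages.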
(52) In an embodiment, X-ray detectors with a very small pixel size of 50 microns are used. Given that usual paper with a weight of 80 g/m.sup.2 has a thickness of about 100 microns, these detectors allow the reconstruction of two slices per page. The investigated methods are also applicable for documents in the form of scrolls.
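The slice count follows from dividing the page thickness by the sampling distance. The short sketch below assumes that the reconstructed slice spacing equals the detector pixel size, i.e. that geometric magnification is neglected; the function name is hypothetical.

```python
def slices_per_page(page_thickness_um, slice_spacing_um):
    """Number of reconstructed slices that fall within one page."""
    return page_thickness_um / slice_spacing_um

# 100 micron page sampled with 50 micron detector pixels.
slices_50 = slices_per_page(100.0, 50.0)

# For comparison: the 68.14 micron voxel size of the FDK reconstruction.
slices_68 = slices_per_page(100.0, 68.14)
```

With the coarser voxel size fewer than two slices fall within one page, which is why the small pixel size matters for separating pages.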
(53) In an embodiment with three measurement arrangements X-ray absorption, phase contrast, and dark-field imaging are performed.
(54) Spatial resolution in X-ray imaging is influenced by two important factors: the size of the detector elements and the size of the focal spot that emits the X-ray radiation. Both have to be small in order to create images of high spatial resolution.
(55) In an embodiment, the local deformation which stems from the pressure caused by a pen to the paper is visualized. In a different embodiment, the kind of used paper is analyzed. This is based on the fact that parchment is quite different from paper. Manufacturing paper causes the fibers to be distributed quite evenly across the sheet. Therefore, a very homogeneous impression even in the dark-field image is given. Most likely, parchment is similar, but the analysis refers to the manufacturing process of the parchment.
(56) In these embodiments, the components of one measurement arrangement are used for two different measurements by evaluating two different variables of the respective measurements. Due to this, the same components belong to two different measurement arrangements which differ only with regard to the evaluation of the data.
(57) In the arrangement shown in
(58) Within the arrangement of
(59) In the embodiment shown in
(60) The embodiment of
(61) The embodiment of
(62) The foregoing embodiments are e.g. performed using a 360° scan or a 180° scan (so-called short scan). The latter embodiment is based on the fact that the second half of the scan allows only the measurement of redundant rays.
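The redundancy of the second half of a full scan can be illustrated for an idealized parallel-beam geometry, which is an assumption made here for simplicity; for divergent beams the short-scan range is usually somewhat larger than 180°. In the sketch, the projection at 180° equals the mirrored projection at 0°. The helper `parallel_projection` and the toy image are hypothetical.

```python
import numpy as np

def parallel_projection(image, half_turns):
    """Line integrals along the columns after rotating by half_turns * 180 deg."""
    rotated = np.rot90(image, k=2 * half_turns)
    return rotated.sum(axis=0)

# Toy object with integer values so that the comparison is exact.
image = np.arange(12).reshape(3, 4)
p_0 = parallel_projection(image, 0)
p_180 = parallel_projection(image, 1)

# The rays at 180 degrees are the rays at 0 degrees in reverse detector order.
is_redundant = np.array_equal(p_180, p_0[::-1])
```

Since every ray of the second half-turn repeats a ray of the first, a short scan carries the same line integrals in this idealized case.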
(63) A principle of laminography is the measurement of the document 100 under different angles but emitting the radiation just from the same side. Usually, two of the three elements (radiation source 11, document 100, and radiation detector 12) perform coordinated movements. In the embodiment shown in
(64) In the embodiment of
(65) The first measurement arrangement 10 performs a rotation around the axis of rotation 15 parallel to the cover of the document 100, whereas the second measurement arrangement 20 is located at a 45° angle to the cover. The document 100 is, depending on the embodiment, moved and/or rotated. A simultaneous movement and rotation allows the helical scan indicated by the arrow. The third measurement arrangement 30, performing e.g. ultrasound or phase contrast X-ray measurement, is also located at a 45° angle to the cover of the document 100 and here also to the axis of rotation 15. The data of the three measurement arrangements 10, 20, 30 are combined by the evaluator 50 in order to obtain a three dimensional representation.
(66) Although some aspects have been described in the context of an apparatus, it is clear that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. Analogously, aspects described in the context of a method step also represent a description of a corresponding block or item or feature of a corresponding apparatus. Some or all of the method steps may be executed by (or using) a hardware apparatus, like for example, a microprocessor, a programmable computer or an electronic circuit. In some embodiments, one or more of the most important method steps may be executed by such an apparatus.
(67) Generally, embodiments of the present invention can be implemented as a computer program product with a program code, the program code being operative for performing one of the methods when the computer program product runs on a computer. The program code may for example be stored on a machine readable carrier. Other embodiments comprise the computer program for performing one of the methods described herein, stored on a machine readable carrier. In other words, an embodiment of the inventive method is, therefore, a computer program having a program code for performing one of the methods described herein, when the computer program runs on a computer.
(68) While this invention has been described in terms of several advantageous embodiments, there are alterations, permutations, and equivalents which fall within the scope of this invention. It should also be noted that there are many alternative ways of implementing the methods and compositions of the present invention. It is therefore intended that the following appended claims be interpreted as including all such alterations, permutations, and equivalents as fall within the true spirit and scope of the present invention.
REFERENCES
(69) [1] Pernerstorfer, Matthias J. Von der Digitalisierungsidee zur Digitalen Bibliothek. Wege für Museen, Bibliotheken und Archive in die Europeana. AKMB-news: Informationen zu Kunst, Museum und Bibliothek 18.2 (2013): 16-18.
[2] V. Mocella, E. Brun, C. Ferrero, and D. Delattre, Revealing letters in rolled herculaneum papyri by x-ray phase-contrast imaging, Nature communications, vol. 6, 2015.
[3] L. Glaser and D. Deckers, The Basics of Fast-scanning XRF Element Mapping for Iron-gall Ink Palimpsests, Manuscript cultures, vol. 7, no. PUBDB-2015-06320, pp. 104-112, 2014.
[4] Sokolov, Arseniĭ Aleksandrovich, and Igor' Mikhaĭlovich Ternov. Synchrotron radiation. Akademia Nauk SSSR, Moskovskoie Obshchestvo Ispytatelei prirody. Sektsia Fiziki. Sinkhrotron Radiation, Nauka Eds., Moscow, 1966, 228 pp. 1 (1966).
[5] Weitkamp, Timm, et al. X-ray phase imaging with a grating interferometer. Optics express 13.16 (2005): 6296-6304.
[6] G. L. Zeng, Medical image reconstruction. Springer, 2010.
[7] A. Stijnman, Historical iron-gall ink recipes: art technological source research for inkcor, Papier Restaurierung, vol. 5, no. 3, pp. 14-17, 2004.
[8] Tonouchi, Masayoshi. Cutting-edge terahertz technology. Nature photonics 1.2 (2007): 97-105.
[9] Siegel, Peter H. Terahertz technology. IEEE Transactions on microwave theory and techniques 50.3 (2002): 910-928.
[10] Redo-Sanchez, Albert, et al. Terahertz time-gated spectral imaging for content extraction through layered structures. Nature Communications 7 (2016): 12665.
[11] Xu, Minghua, and Lihong V. Wang. Photoacoustic imaging in biomedicine. Review of scientific instruments 77.4 (2006): 041101.
[12] Hoelen, C. G. A., et al. Three-dimensional photoacoustic imaging of blood vessels in tissue. Optics letters 23.8 (1998): 648-650.
[13] Arridge, Simon R. Optical tomography in medical imaging. Inverse problems 15.2 (1999): R41.
[14] L. Feldkamp, L. Davis, and J. Kress, Practical cone-beam algorithm, JOSA A, vol. 1, no. 6, pp. 612-619, 1984.
[15] M. Amrehn et al., Portability of TV-Regularized Reconstruction Parameters to Varying Data Sets. Bildverarbeitung für die Medizin 2015, Springer Berlin Heidelberg, 2015. 131-136.