METHOD OF STORING RECORD INFORMATION

20180081601 · 2018-03-22

Abstract

In a method of storing record information in a document in which a record structure of the information is encoded in metadata, the document including attributes that specify how the information is to be rendered when the document is reproduced, the metadata are encoded in the form of printable and machine-readable objects with attributes that prevent these objects from being rendered.

Claims

1. A method of storing record information in a document in which a record structure of the information is encoded in metadata, the document including attributes that specify how the information is to be rendered when the document is reproduced, the method comprising the step of: encoding the metadata in the form of printable and machine-readable objects with attributes that prevent these objects from being rendered.

2. The method according to claim 1, wherein the document comprises a plurality of pages, and the printable objects encoding the metadata are placed on each page to which the respective metadata pertain.

3. The method according to claim 1, wherein the attributes comprise an object color and a background color, and the objects encoding the metadata are prevented from being rendered by setting the object color attribute to be equal to the background color.

4. The method according to claim 1, wherein the attributes comprise an object transparency, and the printable objects encoding the metadata are prevented from being rendered by setting the object transparency to 100%.

5. The method according to claim 1, wherein the attributes comprise a position attribute determining the position of the printable object on a page, and the objects encoding the metadata are prevented from being rendered by assigning a position attribute that places them outside of a printable domain.

6. A method of converting a document in which a record structure of information in the document is encoded in the form of non-printable metadata into a reformatted document, the method comprising the step of: using the method according to claim 1 for storing the information in the reformatted document.

7. The method according to claim 6, wherein the document to be converted is a VDP document, and the metadata comprise information that defines at least one record and a record comprises one or more pages of the VDP document.

8. An apparatus for converting a document in which a record structure of information in the document is encoded in the form of non-printable metadata into a reformatted document, the apparatus being arranged to implement the method according to claim 6.

9. A computer program product comprising program code on a machine readable non-transitory medium, the program code, when loaded into a computer for document processing, causing the computer to perform the method according to claim 1.

10. An apparatus for expanding a source document obtained by the method according to claim 6 into an expanded document in a printable format, the apparatus being configured to retrieve the metadata from the printable and machine-readable objects in the source document.

11. The method according to claim 2, wherein the attributes comprise an object color and a background color, and the objects encoding the metadata are prevented from being rendered by setting the object color attribute to be equal to the background color.

12. The method according to claim 2, wherein the attributes comprise an object transparency, and the printable objects encoding the metadata are prevented from being rendered by setting the object transparency to 100%.

13. The method according to claim 2, wherein the attributes comprise a position attribute determining the position of the printable object on a page, and the objects encoding the metadata are prevented from being rendered by assigning a position attribute that places them outside of a printable domain.

14. A method of converting a document in which a record structure of information in the document is encoded in the form of non-printable metadata into a reformatted document, the method comprising the step of: using the method according to claim 2 for storing the information in the reformatted document.

15. A method of converting a document in which a record structure of information in the document is encoded in the form of non-printable metadata into a reformatted document, the method comprising the step of: using the method according to claim 3 for storing the information in the reformatted document.

16. A method of converting a document in which a record structure of information in the document is encoded in the form of non-printable metadata into a reformatted document, the method comprising the step of: using the method according to claim 4 for storing the information in the reformatted document.

17. A method of converting a document in which a record structure of information in the document is encoded in the form of non-printable metadata into a reformatted document, the method comprising the step of: using the method according to claim 5 for storing the information in the reformatted document.

18. An apparatus for converting a document in which a record structure of information in the document is encoded in the form of non-printable metadata into a reformatted document, the apparatus being arranged to implement the method according to claim 7.

19. A computer program product comprising program code on a machine readable non-transitory medium, the program code, when loaded into a computer for document processing, causing the computer to perform the method according to claim 2.

20. A computer program product comprising program code on a machine readable non-transitory medium, the program code, when loaded into a computer for document processing, causing the computer to perform the method according to claim 3.

Description

[0016] Embodiment examples will now be described in conjunction with the drawings, wherein:

[0017] FIG. 1 is a block diagram of a printing system employing a method according to the invention;

[0018] FIG. 2 is a diagrammatic representation of a conventional VDP document;

[0019] FIG. 3 is a diagrammatic representation of the document shown in FIG. 2, but reformatted in accordance with the invention;

[0020] FIG. 4 is an enlarged view of a single page in the reformatted document shown in FIG. 3;

[0021] FIG. 5 is a diagrammatic representation of a document obtained by extracting pages from the VDP document in accordance with the present invention.

[0022] FIG. 1 schematically shows a printing system that is specifically adapted for VDP printing and comprises a document converter 10, a document expander 12, a print engine 14, a quality control section 16 and a document splitter 18. The document converter 10 has a memory 20 for receiving and storing documents 22a in a standard VDP format such as PDF/VT. The converter 10 further has program code for converting the document 22a into a reformatted document 22b which will then be sent to and stored in a memory 24 of the document expander 12.

[0023] The document expander 12 has program code for expanding the reformatted document 22b into an expanded document 22c which will then be sent to and stored in a memory 26 of the print engine 14.

[0024] It will be understood that the document converter 10 and the document expander 12 may be implemented in a print preprocessor or a print server or may form part of a print controller of the print engine 14.

[0025] The document 22a, and accordingly also the documents 22b and 22c, contain information to be printed, and this information is grouped into certain units, e.g. into pages each of which is to be printed on a sheet of a recording medium in the print engine 14.

[0026] In accordance with the PDF/VT standard, the document 22a specifies a plurality of records to be printed and includes both variable content (e.g. variable text), which varies from record to record, and static content, e.g. in the form of so-called reusable objects, which is typically replicated in each record. In order to specify the record structure that is constituted by the variable content and the static content, the document 22a includes not only printable objects 28, which will actually appear on the printed copies, but also metadata 30, which specify a structure of the document 22a, for example by defining a sequence of records wherein each record corresponds to a set of sheets to be mailed to an individual customer. The metadata may comprise further information that supports the selection of records or of pages belonging to records, such as a customer identifier, a mailing address, or some other identifier.

[0027] The document expander 12 expands the document into a format in which each copy is composed of a number of pages in the form in which they are to be printed, i.e. with a replica of the static content included in each copy. In the print engine 14, each page of the document 22c will be converted into a bitmap by raster image processing and will be printed on a recording sheet.

[0028] The document 22b has been converted by the converter 10 into a format in which, as in the case of the document 22a, the static content is included only once in the entire document. However, unlike the document 22a, the reformatted document 22b carries the metadata 30 in a form in which they are included in the printable objects 28, as will be explained in detail below. Encoding the metadata 30 into printable objects 28 does not preclude the metadata 30 from also being included in the document 22b in their original format.

[0029] Note that the functionality of the converter 10 may alternatively be embedded in the expander 12, the expander 12 doing the conversion of metadata 30 during the expansion process.

[0030] When the hard copies have been printed from the expanded document 22c, they are inspected in the quality control section 16. If any printing or finishing errors have occurred which result in an unacceptable quality of some of the printed pages, these pages (and preferably only these pages) have to be reprinted.

[0031] For that purpose, the document splitter 18 has access to the expanded document 22c. Based on instructions received from the quality control section 16 or input manually by a user, the document splitter 18 extracts from the document 22c those pages which have to be reprinted, and thereby composes a new document 32c which has the same format as the document 22c but includes only the pages that need to be reprinted, including the metadata that are pertinent to these pages. The document 32c is then sent back to the print engine 14 where it is reprinted.

[0032] Document 32c will be in an expanded format, but will be significantly smaller than document 22c because it comprises only the pages that need to be reprinted. Submitting this smaller document to the print engine 14 reduces the network bandwidth needed for sending it and the storage space needed for storing it in the print engine 14, and it also reduces the load on the raster image processor and the storage space occupied by the raster images.

[0033] Note that although FIG. 1 appears to show documents 22a and 22b with a smaller number of pages than document 22c (or document 32c), it is not uncommon that the VDP format used for documents 22a and 22b has the same number of pages as document 22c. For example, it is typical for modern VDP formats to explicitly define every single page to be printed. However, static pages will only contain references to reusable objects, while dynamic pages will typically contain page specific printable objects. (Note that it is possible to form unique, dynamic pages with references to reusable objects.)

[0034] The structure of the document 22a is shown in greater detail in FIG. 2. As described, it includes the printable objects 28 and the metadata 30. The printable objects 28 are grouped into a sequence of pages 34 which are labeled as page 1, . . . , page n in this example. It will be observed that the metadata 30 are neither grouped into pages nor divided into data items that could be mapped onto the pages, because there is not necessarily a one-to-one mapping between metadata items and pages. This is the reason why the format of the document 22a would be difficult to handle in the document splitter 18.

[0035] In the example shown, page 1 and page 2 constitute a first record. Similarly, page 3 and page 4 form a second record, and page 5 and page 6 form a third record. Page 1, page 3, and page 5 may for example constitute static content by all referencing the same reusable objects, which are thus to be replicated for each record. Page 2, page 4 and page 6 may constitute variable content, which means that at least some of the printable objects 28 on these pages differ from record to record.

[0036] In the format used for the document 22a, the definitions of the records, which specify which pages belong to which records, form part of the metadata.

[0037] When the document converter 10 converts the document 22a into the reformatted document 22b, it parses the metadata 30 read from the document 22a, translates them into printable but invisible objects 36, and places these objects onto the pages that constitute the respective records, as shown in FIG. 3.
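The translation step described in paragraph [0037] can be illustrated by the following minimal Python sketch. It is not the converter's actual implementation. The `Record` class, the function `metadata_to_page_comments`, the "end record" wording and the sample tag are assumptions made for illustration only; the "*" mark and the "start record" and "tags" lines follow FIG. 4.

```python
from dataclasses import dataclass, field

MARK = "*"  # mark 38 of FIG. 4: identifies a text line as metadata


@dataclass
class Record:
    """Hypothetical in-memory form of one record definition from metadata 30."""
    number: int
    first_page: int  # 1-based, inclusive
    last_page: int   # 1-based, inclusive
    tags: dict = field(default_factory=dict)


def metadata_to_page_comments(records, page_count):
    """Map each page number to the comment lines to be placed on that page."""
    comments = {page: [] for page in range(1, page_count + 1)}
    for rec in records:
        comments[rec.first_page].append(f"{MARK} start record {rec.number}")
        if rec.tags:
            pairs = ", ".join(f"{name}, {value}" for name, value in rec.tags.items())
            comments[rec.first_page].append(f"{MARK} tags ({pairs})")
        # Assumed wording for the end marker; the source only shows the start line.
        comments[rec.last_page].append(f"{MARK} end record {rec.number}")
    return comments


# Three two-page records, as in the example of FIG. 2 and FIG. 3.
records = [Record(1, 1, 2, {"customer": "A17"}), Record(2, 3, 4), Record(3, 5, 6)]
comments = metadata_to_page_comments(records, page_count=6)
```

In this sketch the returned lines would still have to be written onto the pages as text objects with attributes that prevent rendering; that step depends on the target format and is not shown.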

[0038] FIG. 4 is an enlarged representation of page 1 in FIG. 3. As shown in FIG. 4, the extra printable object 36 takes the form of an invisible comment that is inserted on the top margin of the page 34, whereas the other (visible) content of the page is constituted by the printable object 28 that is taken from the document 22a.

[0039] In the example shown, the invisible comment comprises two text lines, each of which starts with a specific mark 38 (*) which identifies the text line as part of the metadata. The first text line, "start record 1", indicates that the first record starts at this position, i.e. at the top of page 1. The second text line, "tags (name, value, . . . )", includes an identifier and other parameters of the first record.

[0040] Each text item on the page 34 has attributes which specify how the text is to be rendered. These attributes have been indicated on the right side in FIG. 4. The printable object 28 that constitutes the contents of the page has the attribute white for background color and the attribute black for the text color, so that the text will be rendered as black letters on a white background. In case of the invisible printable object 36, the text color has been set to white, i.e. the same color as the background, which makes the text invisible.
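The role of the rendering attributes can be sketched as follows. The sketch models the color rule of paragraph [0040] together with the transparency and position variants recited in claims 4 and 5. The `TextObject` class, its field names, the vertical-only position check and the default page height are assumptions for illustration and do not correspond to any particular document format.

```python
from dataclasses import dataclass


@dataclass
class TextObject:
    """Hypothetical text item with the attributes discussed in the text."""
    text: str
    text_color: str = "black"
    background_color: str = "white"
    transparency: float = 0.0  # 0.0 = opaque, 1.0 = 100% transparent
    y: float = 100.0           # vertical position on the page


def is_visible(obj, page_height=842.0):
    """Return False if any attribute prevents the object from being rendered."""
    if obj.text_color == obj.background_color:
        return False  # object color equals background color
    if obj.transparency >= 1.0:
        return False  # fully transparent
    if not (0.0 <= obj.y <= page_height):
        return False  # placed outside the (assumed) printable domain
    return True


content = TextObject("Dear customer")
hidden = TextObject("* start record 1", text_color="white")
```

Here `is_visible(content)` is true and `is_visible(hidden)` is false, while the text of `hidden` remains available to any tool that reads the page objects.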

[0041] Returning to FIG. 3, it will be understood that the invisible printable object 36 on page 2 marks the end of the first record in a similar way, and the invisible printable objects 36 on the further pages identify these pages as further starts and ends of records.

[0042] The reformatted document 22b may be a normal PDF document, for example. In that case, the invisible text in the printable objects 36 may be encoded in a text object in an identifiable format that can readily be interpreted by the document splitter 18. Thus, by interpreting the invisible text, the document splitter 18 can retrieve all the metadata, and in particular the record structure, that are needed for compiling the document 32c. Note that the presence of the metadata in the form of the printable objects 36 in the document 22b does not preclude their presence in the form of the metadata 30, as in the document 22a. The expander 12 may use the original metadata 30, if still present in the document 22b, or may alternatively interpret the printable objects 36 to derive the metadata.
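How the invisible text might be interpreted to recover the record structure is sketched below. The sketch assumes that the text lines of each page have already been extracted by some PDF tool (not shown), and that the markers read "* start record N" and "* end record N". The "end record" wording and the function names are assumptions.

```python
import re

# Lines beginning with the mark "*" are treated as metadata (see FIG. 4).
_MARKER = re.compile(r"^\*\s*(start|end) record (\d+)$")


def parse_page_text(lines):
    """Return (kind, record_number) tuples found among a page's text lines."""
    events = []
    for line in lines:
        match = _MARKER.match(line.strip())
        if match:
            events.append((match.group(1), int(match.group(2))))
    return events


def record_page_ranges(pages):
    """pages: list of lists of text lines, page 1 first.

    Returns {record_number: (first_page, last_page)} for complete records.
    """
    starts, ranges = {}, {}
    for page_no, lines in enumerate(pages, start=1):
        for kind, rec in parse_page_text(lines):
            if kind == "start":
                starts[rec] = page_no
            elif rec in starts:
                ranges[rec] = (starts[rec], page_no)
    return ranges


pages = [
    ["* start record 1", "Dear customer"],
    ["Invoice", "* end record 1"],
    ["* start record 2"],
    ["* end record 2"],
]
ranges = record_page_ranges(pages)
```

Ordinary page content such as "Dear customer" does not begin with the mark and is ignored by the parser.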

[0043] In another embodiment, the reformatted document 22b may be in a multi-layer TIFF format, for example. In that case, the printable objects 36 may be comprised in a layer that encodes the metadata in its pixel data, and this pixel data may be prevented from being rendered, for example, by another layer that lies on top of the metadata layer and obscures the pixels encoding the metadata.

[0044] FIG. 5 shows the document 32c that is obtained by extracting two pages from the expanded document 22c. In this example, the extracted pages are the two pages that constitute the first record. The information that these pages constitute a record is encoded in the invisible printable objects 36 on these pages. Consequently, when the document 32c is sent to the print engine 14 for reprinting, the print engine 14 receives a document consisting only of page 1 and page 2, and only these pages (record 1) are reprinted. The network bandwidth needed for submitting the document 32c is therefore substantially lower than that needed for resubmitting the document 22c. Furthermore, less storage space is needed for storing the document 32c and for the RIPped bitmaps, and the CPU load resulting from the RIP is reduced.
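The extraction of FIG. 5 can be sketched as a plain page filter. Because the metadata travel with the page objects, no separate metadata handling is needed. The document model used here (a list of dictionaries with the keys "number" and "objects"), the function `extract_pages` and the "end record" wording are assumptions for illustration.

```python
def extract_pages(document, page_numbers):
    """Return a new document containing only the requested pages.

    Each page is kept with all of its objects, including the invisible ones.
    """
    wanted = set(page_numbers)
    return [page for page in document if page["number"] in wanted]


# Hypothetical expanded document with two records of two pages each.
expanded = [
    {"number": 1, "objects": ["* start record 1", "Dear customer"]},
    {"number": 2, "objects": ["Invoice", "* end record 1"]},
    {"number": 3, "objects": ["* start record 2", "Dear customer"]},
    {"number": 4, "objects": ["Invoice", "* end record 2"]},
]

# Pages 1 and 2 (record 1) were flagged for reprinting.
reprint = extract_pages(expanded, [1, 2])
```

The resulting `reprint` document holds half as many pages as `expanded` in this example, and the markers of record 1 are still present on its pages.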

[0045] The major advantage of the present invention is that the metadata are tied directly to the pages. Typically, the expanded document 22c no longer contains the metadata 30, and any record structure or other information stored in the metadata may no longer be apparent. Because the metadata are encoded in the printable objects 36, they remain available, and they are even still available in the document 32c after the document splitter 18 has extracted individual pages from the expanded document 22c. In fact, the invention allows all kinds of document processing tools to be deployed for intermediate processing without the risk of discarding the metadata, as long as these tools do not delete or alter the content of printable objects.

[0046] It will be understood that the pages that are extracted for being reprinted do not have to constitute a complete record but might comprise only part of a record and, on the other hand, might also include pages or combinations of pages from several records.

[0047] When a record extends to three or more consecutive pages, it may be preferable that each individual page of the record has an invisible printable object identifying that page as part of the record. In that case, it would even be possible to extract only the first few pages of a record in the document splitter 18 in order to reprint only these pages, because all necessary information would be available even though the page marking the end of the record has been clipped away.
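The per-page marking suggested in paragraph [0047] can be sketched as follows. The marker wording "* record N page I of T" is purely hypothetical; the source does not specify how a page is identified as part of a record. The function name `group_by_record` is likewise an assumption.

```python
import re
from collections import defaultdict

# Hypothetical per-page marker, e.g. "* record 7 page 2 of 3".
_PAGE_MARK = re.compile(r"^\*\s*record (\d+) page (\d+) of (\d+)$")


def group_by_record(pages):
    """pages: list of lists of text lines.

    Returns {record_number: [(page_in_record, pages_in_record), ...]}.
    """
    found = defaultdict(list)
    for lines in pages:
        for line in lines:
            match = _PAGE_MARK.match(line.strip())
            if match:
                rec, index, total = map(int, match.groups())
                found[rec].append((index, total))
    return dict(found)


# Only the first two pages of a three-page record were extracted.
partial = [
    ["* record 7 page 1 of 3", "Dear customer"],
    ["* record 7 page 2 of 3", "Invoice"],
]
membership = group_by_record(partial)
```

Even though the page that closes the record is missing, each remaining page still states which record it belongs to and where it sits within that record.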