Methods for dynamic document generation

Abstract

Dynamic web page generation is optimized by reducing the processing overhead required to parse the web page HTML code for tokens and insert dynamic content. Using the invention, an HTML file for a dynamic web page need be read and parsed only once throughout the life of the server. A software object parses the HTML, decomposes the page into constituent pieces and saves them to data structures as byte streams, which are cached, along with the software object, rendering multiple disk accesses unnecessary when the page is reconstituted. For subsequent requests, the dynamic page is created from the cached version, which is shareable across users and across requests. The optimization reduces server resource usage for dynamic page generation to near zero. The invention is also applicable to other documents combining static and dynamic content that require composition tools for editing.

Claims

1. A computer-implemented method comprising: generating, by one or more servers including at least one data storage device, a template for a web page, wherein the template includes at least one static content portion and at least one placeholder for at least one dynamic content portion comprising content based on information received from a client device; storing, by the one or more servers, the at least one static content portion; storing, by the one or more servers, the at least one placeholder for the at least one dynamic content portion in association with the stored at least one static content portion; receiving, from the client device by the one or more servers, content corresponding to the at least one dynamic content portion; and storing, by the one or more servers, the received content corresponding to the at least one dynamic content portion; and mapping, by the one or more servers, the stored at least one placeholder to the received content corresponding to the at least one dynamic content portion.

2. The computer-implemented method of claim 1, wherein: storing the at least one static content portion comprises storing the at least one static content portion in a first data structure; storing the at least one placeholder for the at least one dynamic content portion comprises storing the at least one placeholder for the at least one dynamic content portion in a second data structure; and storing the received content corresponding to the at least one dynamic content portion comprises storing the received content corresponding to the at least one dynamic content portion in a third data structure; and mapping the stored at least one placeholder to the received content comprises mapping the stored at least one placeholder using the third data structure, wherein the third structure maps the stored at least one placeholder to the received content corresponding to the at least one dynamic content portion.

3. The computer-implemented method of claim 2, wherein: the first data structure comprises a first array of immutable content containing at least one static content portion and at least one integer object associated with the at least one placeholder; the second data structure comprises a second array of immutable content mapping the at least one integer object to at least one placeholder replacement object, the at least one placeholder replacement object comprising a raw placeholder name for the at least one placeholder; and the third data structure maps the raw placeholder name to at least one replacement value based on the received content.

4. The computer-implemented method of claim 3, wherein the at least one dynamic content portion comprises the at least one placeholder and a text surrounding the at least one placeholder.

5. The computer-implemented method of claim 2, wherein the one or more servers includes a cache, and wherein the method further comprises storing the first data structure, the second data structure, and the third data structure in the cache of the one or more servers.

6. The computer-implemented method of claim 1, further comprising sending, by the one or more servers, data corresponding to the template for the web page to the client device.

7. The computer-implemented method of claim 1, wherein the at least one placeholder comprises a plurality of placeholders and the at least one dynamic content portion comprises a plurality of dynamic content portions.

8. A system comprising: at least one processor; and at least one non-transitory computer readable storage medium storing instructions that, when executed by the at least one processor, cause the system to: generate a template for a web page, wherein the template includes at least one static content portion and at least one placeholder for at least one dynamic content portion comprising content based on information received from a client device; store the at least one static content portion; store the at least one placeholder for the at least one dynamic content portion in association with the stored at least one static content portion; receive, from the client device, content corresponding to the at least one dynamic content portion; store the received content corresponding to the at least one dynamic content portion; and map the stored at least one placeholder to the received content corresponding to the at least one dynamic content portion.

9. The system of claim 8, wherein: storing the at least one static content portion comprises storing the at least one static content portion in a first data structure; storing the at least one placeholder for the at least one dynamic content portion comprises storing the at least one placeholder for the at least one dynamic content portion in a second data structure; and storing the received content corresponding to the at least one dynamic content portion comprises storing the received content corresponding to the at least one dynamic content portion in a third data structure; and mapping the stored at least one placeholder to the received content comprises mapping the stored at least one placeholder using the third data structure, wherein the third structure maps the stored at least one placeholder to the received content corresponding to the at least one dynamic content portion.

10. The system of claim 9, wherein: the first data structure comprises a first array of immutable content containing at least one static content portion and at least one integer object associated with the at least one placeholder; the second data structure comprises a second array of immutable content mapping the at least one integer object to at least one placeholder replacement object, the at least one placeholder replacement object comprising a raw placeholder name for the at least one placeholder; and the third data structure maps the raw placeholder name to at least one replacement value based on the received content.

11. The system of claim 10, wherein the at least one dynamic content portion comprises the at least one placeholder and a text surrounding the at least one placeholder.

12. The system of claim 8, further comprising instructions that, when executed by the at least one processor, cause the system to send data corresponding to the template for the web page to the client device.

13. The system of claim 8, wherein the at least one placeholder comprises a plurality of placeholders and the at least one dynamic content portion comprises a plurality of dynamic content portions.

14. A non-transitory computer-readable medium storing instructions that, when executed by at least one processor, cause a computer system to: generate a template for a web page, wherein the template includes at least one static content portion and at least one placeholder for at least one dynamic content portion comprising content based on information received from a client device; store the at least one static content portion; store the at least one placeholder for the at least one dynamic content portion in association with the stored at least one static content portion; receive, from the client device, content corresponding to the at least one dynamic content portion; and store the received content corresponding to the at least one dynamic content portion; and map the stored at least one placeholder to the received content corresponding to the at least one dynamic content portion.

15. The non-transitory computer-readable medium of claim 14, wherein: storing the at least one static content portion comprises storing the at least one static content portion in a first data structure; storing the at least one placeholder for the at least one dynamic content portion comprises storing the at least one placeholder for the at least one dynamic content portion in a second data structure; and storing the received content corresponding to the at least one dynamic content portion comprises storing the received content corresponding to the at least one dynamic content portion in a third data structure; and mapping the stored at least one placeholder to the received content comprises mapping the stored at least one placeholder using the third data structure, wherein the third structure maps the stored at least one placeholder to the received content corresponding to the at least one dynamic content portion.

16. The non-transitory computer-readable medium of claim 15, wherein: the first data structure comprises a first array of immutable content containing at least one static content portion and at least one integer object associated with the at least one placeholder; the second data structure comprises a second array of immutable content mapping the at least one integer object to at least one placeholder replacement object, the at least one placeholder replacement object comprising a raw placeholder name for the at least one placeholder; and the third data structure maps the raw placeholder name to at least one replacement value based on the received content.

17. The non-transitory computer-readable medium of claim 15, wherein the computer system includes a cache, and further comprising instructions that, when executed by the at least one processor, cause the computer system to store the first data structure, the second data structure, and the third data structure in the cache of the computer system.

18. The non-transitory computer-readable medium of claim 14, further comprising instructions that, when executed by the at least one processor, cause the computer system to send data corresponding to the template for the web page to the client device.

19. The non-transitory computer-readable medium of claim 14, wherein the at least one placeholder comprises a plurality of placeholders and the at least one dynamic content portion comprises a plurality of dynamic content portions.

20. A process for optimizing generation of a computer readable document incorporating static and dynamic content, comprising the steps of: providing, by at least one processor, a template file of said document, said file resident on a non-transitory mass storage device of a first computer; reading said template into memory; creating, by the at least one processor, a content composer, said content composer comprising a first software object; parsing, by the at least one processor, said template by said content composer, said template including one or more portions of static content and one or more placeholders corresponding to one or more portions of dynamic content, said parsing comprising deconstructing the template in order to separate the one or more portions of static content and the one or more placeholders; decomposing said template into separate page components by said content composer; converting, by the at least one processor, said components into strings of computer readable code by said content composer; storing said strings to one or more data structures; and caching said data structures containing said page components.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

(1) FIG. 1 provides a top-level block diagram of a process for optimizing generation of a computer readable document incorporating static and dynamic content, according to the invention;

(2) FIG. 2 provides a block diagram of a sub-process for “freeze-drying” raw content from a document template, according to the invention;

(3) FIG. 3 provides a block diagram of a plurality of data structures for storing the freeze-dried content of FIG. 2, according to the invention; and

(4) FIG. 4 provides a block diagram of a sub-process for composing a document from the freeze-dried content of FIG. 3, according to the invention.

DETAILED DESCRIPTION

(5) Overview

(6) A description of the details and function of the present invention is provided below. The source code listed in APPENDIX A, written in JAVA, details the implementation of a preferred embodiment of the invention. The patentee has no objection to the reproduction of the source code or other information for the purpose of obtaining and maintaining a valid patent. However, the patentee otherwise reserves all copyright interests.

(7) The invention is embodied as both a process to be executed on a computer, typically a web server, and a computer program product providing computer readable program code means for executing the various steps of the process. The computer readable program code is embodied on a computer readable medium. The computer readable medium may be either fixed, such as a mass storage device or a memory, or it may be removable, such as a CD or a diskette. The invention is implemented through the use of conventional computer programming techniques well known to those skilled in the art. While the source code provided in the attached appendix is written in JAVA, other programming languages would also be suitable for programming the invention. While the invention is preferably programmed in an object-oriented language such as JAVA or C++, other embodiments, consistent with the spirit and scope of the invention, programmed in procedural languages or scripted languages, are also possible.

(8) Referring now to FIG. 1, the invention provides a process for optimizing generation of a computer readable document incorporating static and dynamic content 10, particularly web pages being served up to a client in response to a request from a user. As previously mentioned, one of the most common ways of generating web pages having dynamic content is to start with a page template. Typically, the page template is a file of HTML code containing placeholders where the dynamic content is to be inserted. The placeholders usually consist of tokens. For example, “@Username@” might be typically used as a placeholder for a user's name. When the template is created, or after it is edited, it is saved to disk, typically on a web server. Thereafter, the HTML file is read from the disk and parsed to locate the “live” or dynamic sections, which have been set off or reserved by the tokens. The invention provides a process in which the HTML file needs to be read from disk and parsed only once, unlike prior art methods, which require that the file be read and parsed every time a client requests the page.

(9) File Reads

(10) In the current embodiment of the invention, the HTML file is read from the disk 11 by means of a helper software object tasked with various utility file operations, such as reading in files, getting file lists and so on. Reading pages of “static” content is performed by a “getContent( )” method embodied in the helper object. The “getContent( )” method of the helper object retrieves the raw HTML file and stores the raw content to the cache as a string. More detailed descriptions of the operation of the helper object and the “getContent( )” method are to be found by referring to the documentation provided in the enclosed Appendix.

(11) Content Composer

(12) When parsing the HTML file for caching and token replacement purposes, the goal is to separate the HTML file into its component static pieces, dynamic pieces, and replaceable token pieces. A common term of art for this process is “freeze-drying” 12. The invention provides a ContentComposer class that is the sole parser and manager of this freeze-dried content. Each HTML file has a separate instance of the ContentComposer object associated with it. In keeping with conventional methods of object-oriented programming, in which an object includes both instructions and the associated data, the ContentComposer object for a particular page includes the implementation logic and the raw content string. When a file is loaded, the helper object checks to see if a ContentComposer object exists for the file. If the file has no associated ContentComposer object, the helper object creates one 20. A global HashMap, held in the cache, provides storage for ContentComposer objects. Thus, following creation of the ContentComposer, the new ContentComposer object is stored to the global HashMap. In this way, the deconstructed file data is effectively cached, so that it may be used on subsequent invocations 21.
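The check-then-create step described above can be sketched as follows. This is a minimal illustration, not the code of Appendix A: the class name ComposerCacheSketch, the nested Composer stand-in, the reader function, and the reads counter are all hypothetical, and ConcurrentHashMap is used in place of the plain HashMap named in the text on the assumption that several requests may arrive at once.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical get-or-create cache keyed by template path. */
final class ComposerCacheSketch {
    /** Stand-in for the ContentComposer; here it only holds the raw content string. */
    static final class Composer {
        final String rawContent;

        Composer(String rawContent) {
            this.rawContent = rawContent;
        }
    }

    // Plays the role of the global HashMap held in the cache.
    private final Map<String, Composer> composers = new ConcurrentHashMap<>();
    // Stand-in for the helper object's file read.
    private final Function<String, String> reader;
    // Demonstration only: counts how often the template was actually read.
    int reads = 0;

    ComposerCacheSketch(Function<String, String> reader) {
        this.reader = reader;
    }

    /** Returns the cached composer, reading the template on the first request only. */
    Composer composerFor(String path) {
        return composers.computeIfAbsent(path, p -> {
            reads++;
            return new Composer(reader.apply(p));
        });
    }
}
```

Calling composerFor( ) twice with the same path returns the same object and triggers a single read, which is the behavior the paragraph attributes to the cached ContentComposer.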

(13) After it is cached, the ContentComposer parses the HTML file by “decomposing” the raw code string, separating it into its various components 22. Each component is one of three types: blocks of immutable content containing no tokens; lines of immutable content that surround tokens; and token replacement values.

(14) According to a preferred embodiment of the invention, a token comprises a string that starts and ends with the “@” characters and contains no embedded white space, newline characters, colons, semi-colons, or commas. However, the delimiting characters are a mere matter of choice, dictated in this case by the conventional manner of creating tokenized HTML code.

(15) In some cases, only the token is replaced; in other cases, the entire line containing the token is replaced. For example, the method allows calling processes to replace the whole line of text that the token was on, which is a frequent operation for callers replacing <li> or <select> items.

(16) As previously described, the helper object provides a raw code string to the ContentComposer for parsing. A setContents( ) method within the ContentComposer provides most of the parsing logic for the invention. The setContents( ) method parses the raw content string to locate delimiting characters. Upon locating a delimiting character, the parsing engine evaluates the string for the presence of the previously indicated illegal characters—white space, newline characters, colons, semi-colons, or commas. The presence of any illegal characters indicates that the delimiting character is not associated with a valid token. “@foo bar@” or “keith@iamaze.com” are examples of such invalid strings. As the various page components are identified, they are stored to one of several data objects that are also associated with the ContentComposer. After the page components are identified, the page is decomposed by saving the separate components to a plurality of data structures 23. These data structures are described in greater detail below. It should be noted that the process of separating the page into components and storing them in the data structures constitutes the process commonly known as “freeze-drying.” While, for the purpose of description, the data and the data structures are described separately from the logic and instructions, they are, in fact, all associated within a single ContentComposer object, which is held in the cache. Thus, as with the raw code string, the data structures containing the page components are effectively cached, eliminating the necessity of any further disk accesses when the HTML file is composed.
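The token test described above can be illustrated with a short scanner. This is a sketch under stated assumptions rather than the setContents( ) logic of the Appendix: the class and method names are hypothetical, Character.isWhitespace( ) is assumed to cover both white space and newline characters, and an empty pair of delimiters (“@@”) is assumed not to be a token.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical scanner that finds "@...@" tokens and rejects delimiters that are not tokens. */
final class TokenScannerSketch {
    /** Characters that may not appear between the two '@' delimiters. */
    private static boolean isIllegal(char c) {
        return Character.isWhitespace(c) || c == ':' || c == ';' || c == ',';
    }

    /** Returns every valid token in raw, delimiters included, in order of appearance. */
    static List<String> findTokens(String raw) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < raw.length()) {
            int start = raw.indexOf('@', i);
            if (start < 0) {
                break; // no more delimiters
            }
            // Scan forward until a closing '@', an illegal character, or end of input.
            int end = start + 1;
            while (end < raw.length() && raw.charAt(end) != '@' && !isIllegal(raw.charAt(end))) {
                end++;
            }
            boolean closed = end < raw.length() && raw.charAt(end) == '@';
            if (closed && end > start + 1) { // assumption: "@@" is not a token
                tokens.add(raw.substring(start, end + 1));
                i = end + 1;
            } else {
                // Not a token: resume where the scan stopped, so a later '@' can still open one.
                i = Math.max(end, start + 1);
            }
        }
        return tokens;
    }
}
```

With this sketch, an input containing “@UserName@”, “keith@iamaze.com,” and “@foo bar@” yields only “@UserName@”, because the other two delimiters run into a comma or white space before a closing “@” is found.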

(17) After the page components are cached, calling processes can ask the ContentComposer to perform token replacement, which it can do very quickly, in O(1) time, because the tokens are stored in a HashMap as described below. The final part of the SXContentComposer's lifecycle begins when the caller asks the ContentComposer to “compose( )” itself, thus creating a page for download to a client 13. The compose( ) method itself provides additional important performance gains. Recomposing the HTML into a string and passing that string to the calling process would be extremely wasteful of memory and processor time; instead, the ContentComposer walks through the data structures and writes the data to an output stream as it goes 14.

(18) This implementation holds three primary data structures. It is necessary to hold this parsed data in three disparate, but linked, data structures because the data must be accessed from a number of different “angles”, and for a number of different purposes. The composer will need access to all the original static text, plus some way to gather the token replacement values. The caller will need to replace token values (by specifying the token name), or the whole line the token appears on. The caller may also want to inspect the line a token appears on.

(19) Data Structures

(20) The three primary data structures are as follows:

(21) The first is an array of immutable content broken up into “chunks” 30. Each chunk is either a text block with no “@foo@” tokens or an integer object pointing to the index of a token replacement object (SXTokenLine), which will supply the values (string) for that chunk.

(22) The second data structure is also an array of immutable content: an array of the token-replacement-objects mentioned above 31, and pointed to by the chunks array. These token-replacement-objects are of type Token Line and they hold the static text that immediately precedes and follows a token. They also hold the raw token name itself (e.g. “@FooBar@”) as well as a pointer to an object stored within the third data structure, a structure that holds the replacement line or replacement value associated with this token. This final object is of type Token. While the names assigned to the various page component types in the current embodiment are descriptive of their content, they are primarily a matter of choice.

(23) The third data structure is a HashMap with all the tokens from the raw content as keys and all the replacement values set by the calling process as the values 32. These replacement values are of type Token Object, which can hold a replacement line or a replacement value for a token.

(24) Note that the immutable text chunks never change throughout the life of this object, while the values stored in the tokens and replacement values HashMap are likely to change every time content is created, since tokens and replacement values represent the dynamic portion of the content.

(25) Furthermore, to reduce the overhead of future writes to streams, and to reduce the excessive creation of string objects, the static data in both the immutable text chunks array and the immutable token lines array is stored as byte[ ] rather than string.
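One way to picture the three linked structures is the following Java sketch. It is an illustration only: the class FreezeDriedSketch, its field layout, the bytes( ) helper, and the choice of UTF-8 are assumptions, and the structures are filled in by hand for a one-token template rather than by a parser.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical layout of the three linked structures; names loosely follow the text. */
final class FreezeDriedSketch {
    /** Mutable replacement slot, the value type of the HashMap. */
    static final class Token {
        String replacementForToken;     // replaces only the token
        String replacementForTokenLine; // replaces the whole line
    }

    /** Immutable text around one token, plus a pointer to that token's slot. */
    static final class TokenLine {
        final byte[] prefix;
        final byte[] suffix;
        final String rawTokenName;
        final Token token;

        TokenLine(String prefix, String rawTokenName, String suffix, Token token) {
            this.prefix = bytes(prefix);
            this.rawTokenName = rawTokenName;
            this.suffix = bytes(suffix);
            this.token = token;
        }
    }

    // First structure: each entry is a byte[] text block or an Integer index into the token lines.
    final Object[] immutableTextChunks;
    // Second structure: one entry per token occurrence.
    final TokenLine[] immutableTokenLines;
    // Third structure: token name -> replacement slot, the only part that changes per request.
    final Map<String, Token> tokensAndReplacementValues = new HashMap<>();

    /** Hand-built for "<p>Hello, @UserName@!</p>"; a real parser would fill these in. */
    FreezeDriedSketch() {
        Token userName = new Token();
        tokensAndReplacementValues.put("@UserName@", userName);
        immutableTokenLines = new TokenLine[] {
            new TokenLine("Hello, ", "@UserName@", "!", userName)
        };
        immutableTextChunks = new Object[] { bytes("<p>"), 0, bytes("</p>") };
    }

    /** Static text is converted to bytes once, at freeze-dry time (UTF-8 is an assumption). */
    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }
}
```

In this layout the token line and the HashMap entry share one Token object, so a value set through the map is immediately visible to whoever walks the chunks array.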

(26) Compose( ) Method

(27) The Compose( ) method of the ContentComposer writes each text chunk and token replacement value to an output stream in sequential order, creating a single, coherent, token-replaced text stream.

(28) As the ContentComposer walks the immutable text chunks array 40, if it encounters an array entry that is a token rather than a chunk of text, instead of concatenating the actual token, it concatenates the value for the token found in the tokens and replacement values HashMap 41.
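The walk just described might look like the sketch below. It is not the Compose( ) method of the Appendix: the class ComposeSketch, the field names value and line, and the demo( ) helper are hypothetical, and two behaviors are assumptions, namely that a replaced line takes precedence over a replaced token and that a token with no replacement contributes nothing between its prefix and suffix.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical compose step: stream the chunks in order, substituting token values. */
final class ComposeSketch {
    static final class Token {
        String value; // replacement for the token only
        String line;  // replacement for the whole line
    }

    static final class TokenLine {
        final byte[] prefix;
        final byte[] suffix;
        final Token token;

        TokenLine(String prefix, String suffix, Token token) {
            this.prefix = bytes(prefix);
            this.suffix = bytes(suffix);
            this.token = token;
        }
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Walks the chunks in order and writes each piece straight to out. */
    static void compose(Object[] chunks, TokenLine[] lines, OutputStream out) throws IOException {
        for (Object chunk : chunks) {
            if (chunk instanceof byte[]) {
                out.write((byte[]) chunk);         // static block: copy as-is
                continue;
            }
            TokenLine tl = lines[(Integer) chunk]; // Integer entry: look up the token line
            if (tl.token.line != null) {
                out.write(bytes(tl.token.line));   // the whole line was replaced
            } else {
                out.write(tl.prefix);
                if (tl.token.value != null) {      // assumption: unset tokens emit nothing
                    out.write(bytes(tl.token.value));
                }
                out.write(tl.suffix);
            }
        }
    }

    /** Builds a two-token page by hand, sets both replacements, and composes it. */
    static String demo() {
        Token user = new Token();
        Token last = new Token();
        TokenLine[] lines = {
            new TokenLine("<h1>Welcome to iAmaze, ", "!</h1>", user),
            new TokenLine("Last presentation: ", "?", last)
        };
        Object[] chunks = { bytes("<html>"), 0, 1, bytes("</html>") };
        user.line = "<h1>Welcome to work, Keith!</h1>";
        last.value = "1999 Harleys";
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            compose(chunks, lines, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Because the static pieces are already byte arrays, the loop copies them to the stream without building any intermediate string; only the replacement values are converted at write time.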

(29) The specific process and data structures used by the ContentComposer are described in greater detail in the example provided below.

Example

(30) Sample Raw Content:

(31) TABLE-US-00001
<html>
<title> iAmaze Presentation Tool </title>
<h1> Welcome to iAmaze, @UserName@! </h1>
Would you like to work on the presentation you last worked on, named @LastPresentation@? If so, click here.
</html>
Sample Raw Data Structures Created from Raw Content:
Immutable Text Chunks Array:

(32) TABLE-US-00002
immutableTextChunksArray[0] = "<html> <title> iAmaze Presentation Tool </title>"
immutableTextChunksArray[1] = new Integer(0) (used to look up, at index 0, this token's pre- and post-token line text in the SXTokenLine object held in the "immutableTokenLines" array)
immutableTextChunksArray[2] = "! </h1>"
immutableTextChunksArray[3] = new Integer(1) (index into the "immutableTokenLines" array)
immutableTextChunksArray[4] = "? If so, click here. </html>"
Immutable Token Lines Array:

(33) TABLE-US-00003
ImmutableTokenLinesArray[0] = SXTokenLine{prefix="<h1> Welcome to iAmaze, ", suffix="! </h1>", pointer to SXToken object in tokensAndReplacementValues}
ImmutableTokenLinesArray[1] = SXTokenLine{prefix="Would you like to work on the presentation you last worked on, named ", suffix=" ", pointer to SXToken object in tokensAndReplacementValues}
Tokens and Replacement Values HashMap:

(34) TABLE-US-00004
tokensAndReplacementValues = {
{"@UserName@", SXToken{replacementForToken=null, replacementForTokenLine=null}},
{"@LastPresentation@", SXToken{replacementForToken=null, replacementForTokenLine=null}}}

(35) Thus, the data structures for the example page appear as shown above immediately after the parsing or “freeze-dry” process. After the calling process supplies values, for example in response to a request from a user, two separate methods are called to replace the tokens with the new content:

(36) After Calls to:

(37) TABLE-US-00005
anSXContentComposer.replaceLineContainingToken("@UserName@", "<h1> Welcome to work, Keith! </h1>");
anSXContentComposer.replaceToken("@LastPresentation@", "1999 Harleys");
The Tokens and Replacement Values HashMap Looks as Follows:

(38) TABLE-US-00006
tokensAndReplacementValues = {
{"@UserName@", SXToken{replacementForToken=null, replacementForTokenLine="<h1> Welcome to work, Keith! </h1>"}},
{"@LastPresentation@", SXToken{replacementForToken="1999 Harleys", replacementForTokenLine=null}}}

(39) The first call replaces the entire line containing the token. The second call replaces only the token. The immutable text chunks and the immutable token lines arrays remain the same, since they contain immutable data.
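The effect of the two calls on the HashMap can be sketched as follows. The method names replaceToken( ) and replaceLineContainingToken( ) come from the example above; everything else, including the class ReplacementSketch, the boolean return values, and the lookup( ) accessor, is a hypothetical stand-in for the implementation in the Appendix.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical view of the replacement calls: each one is a single HashMap lookup. */
final class ReplacementSketch {
    static final class Token {
        String replacementForToken;
        String replacementForTokenLine;
    }

    private final Map<String, Token> tokensAndReplacementValues = new HashMap<>();

    /** Registers one empty slot per token name, as the parser would after freeze-drying. */
    ReplacementSketch(String... tokenNames) {
        for (String name : tokenNames) {
            tokensAndReplacementValues.put(name, new Token());
        }
    }

    /** Sets the value for the token only; returns false if the template has no such token. */
    boolean replaceToken(String tokenName, String value) {
        Token t = tokensAndReplacementValues.get(tokenName);
        if (t == null) {
            return false;
        }
        t.replacementForToken = value;
        return true;
    }

    /** Sets a replacement for the whole line the token appears on. */
    boolean replaceLineContainingToken(String tokenName, String line) {
        Token t = tokensAndReplacementValues.get(tokenName);
        if (t == null) {
            return false;
        }
        t.replacementForTokenLine = line;
        return true;
    }

    /** Demonstration-only accessor for inspecting a slot. */
    Token lookup(String tokenName) {
        return tokensAndReplacementValues.get(tokenName);
    }
}
```

Neither call touches the immutable arrays; each one only fills a field of the slot stored under the token name, which is why the cost does not depend on the size of the page.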

(40) A call to SXContentComposer's Compose( ) or toString( ) methods generates the following:

(41) TABLE-US-00007
<html>
<title> iAmaze Presentation Tool </title>
<h1> Welcome to work, Keith! </h1>
Would you like to work on the presentation you last worked on, named 1999 Harleys? If so, click here.
</html>

(42) The toString( ) method outputs a string to the output stream in a fashion similar to the Compose( ) method. More detailed description of the toString( ) method as well as the replaceLineContainingToken( ), and replaceToken( ) methods is to be found below.

(43) Although the invention has been described herein with reference to certain preferred embodiments, one skilled in the art will readily appreciate that other applications may be substituted without departing from the spirit and scope of the present invention. Accordingly, the invention should only be limited by the Claims included below.
