Systems And Methods For Partial Information Retrieval Using Data Provenance Techniques
20250190463 · 2025-06-12
Inventors
- Ursula Caroline Wolz (Bennington, VT, US)
- Christopher Dunne (Plainfield, NJ, US)
- James Mangione (Phoenixville, PA, US)
Abstract
Systems and methods for partial information retrieval using data provenance techniques are disclosed. The system includes a partial information retrieval processor that executes an event trigger software agent which identifies an event, a query listener agent which generates a query in response to the identified event, and a partial information retrieval agent which processes the query in accordance with one or more modular domain heuristic data structures and generates response data that includes provenance information. The system can include a knowledge base updating agent which updates a knowledge base using the response data, as well as a response generator agent. The system allows for the generation of natural language answers to questions in circumstances where only partial information is available, such as partially-identified question types or question contexts. A visualization user interface is also provided, which allows for visualization of partial information retrieval outcomes.
Claims
1. A system for partial information retrieval comprising: a partial information retrieval processor in communication with a data source; an event trigger agent executed by the processor, the event trigger software agent identifying an event; a query listener agent executed by the processor and generating a query in response to the identified event; and a partial information retrieval agent executed by the processor and processing the query in accordance with a modular domain heuristic data structure and generating response data that includes provenance information.
2. The system of claim 1, further comprising a knowledge base updating agent executed by the processor, the knowledge base updating agent updating a knowledge base using the response data.
3. The system of claim 2, wherein the knowledge base updating agent generates the modular domain heuristic data structure, and the modular domain heuristic data structure includes domain knowledge, a domain heuristic, and at least one partial information retrieval heuristic.
4. The system of claim 3, wherein the domain heuristic comprises a domain-specific heuristic.
5. The system of claim 4, wherein the domain-specific heuristic comprises a food-specific domain heuristic including a food quantity heuristic, a food type heuristic, a food source heuristic, a data origin indicator, an amount origin indicator, and a type origin indicator.
6. The system of claim 2, further comprising a response generator agent executed by the processor, the response generator agent generating at least one human-readable response based on the response data.
7. The system of claim 6, wherein the response generator agent causes the event trigger agent to trigger an event.
8. The system of claim 1, wherein the response comprises a natural language answer.
9. The system of claim 1, wherein the response comprises a response data structure having at least one data provenance chain.
10. The system of claim 1, wherein the query listener agent generates a query data structure including a question type, a question context, and provenance data.
11. The system of claim 1, further comprising a visualization interface generated by the system, the visualization interface displaying at least one visualization piece that visualizes the response data.
12. The system of claim 11, wherein the at least one visualization piece includes a first section which graphically illustrates an actual value, a second section which graphically illustrates a recommended value, and a difference section which graphically illustrates a difference between the actual value and the recommended value.
13. A method for partial information retrieval comprising: providing a partial information retrieval processor in communication with a data source; identifying the occurrence of an event using an event trigger agent executed by the processor; generating a query by a query listener agent executed by the processor in response to the identified event; and processing the query in accordance with a modular domain heuristic data structure using a partial information retrieval agent executed by the processor; and generating response data that includes provenance information.
14. The method of claim 13, further comprising updating a knowledge base by a knowledge base updating agent executed by the processor and using the response data.
15. The method of claim 13, wherein the knowledge base updating agent generates the modular domain heuristic data structure, and the modular domain heuristic data structure includes domain knowledge, a domain heuristic, and at least one partial information retrieval heuristic.
16. The method of claim 15, wherein the domain heuristic comprises a domain-specific heuristic.
17. The method of claim 16, wherein the domain-specific heuristic comprises a food-specific domain heuristic including a food quantity heuristic, a food type heuristic, a food source heuristic, a data origin indicator, an amount origin indicator, and a type origin indicator.
18. The method of claim 14, further comprising generating at least one human-readable response based on the response data using a response generator agent executed by the processor.
19. The method of claim 18, wherein the response generator agent causes the event trigger agent to trigger an event.
20. The method of claim 13, wherein the response comprises a natural language answer.
21. The method of claim 13, wherein the response comprises a response data structure having at least one data provenance chain.
22. The method of claim 13, wherein the query listener agent generates a query data structure including a question type, a question context, and provenance data.
23. The method of claim 13, further comprising generating and displaying a visualization interface, the visualization interface displaying at least one visualization piece that visualizes the response data.
24. The method of claim 23, wherein the at least one visualization piece includes a first section which graphically illustrates an actual value, a second section which graphically illustrates a recommended value, and a difference section which graphically illustrates a difference between the actual value and the recommended value.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0009] The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
[0010] The foregoing features of the invention will be apparent from the following Detailed Description of the Invention, taken in connection with the accompanying drawings, in which:
[0011]
[0012]
[0013]
[0014]
[0015]
[0016]
[0017]
DETAILED DESCRIPTION
[0018] The present disclosure relates to systems and methods for partial information retrieval using data provenance techniques, as discussed in detail below in connection with
[0019]
[0020] The processor 12 could include one or more suitable computer systems programmed in accordance with the present invention, such as a personal computer, server, cloud computing platform, or other suitable processor. The processor 12 could be programmed using any suitable high- or low-level programming language, including, but not limited to, C, C++, Java, JavaScript, Python, or other suitable programming language. Additionally, the processing steps disclosed herein could be embodied as computer-readable instructions stored in a non-transitory, computer-readable medium provided in, or in communication with, the processor 12. The network 16 could be any suitable computer network including, but not limited to, a local area network, wide area network, an intranet, the Internet, a cellular data communications network, or any other suitable computer network. The end-user computing device 18 could include, but is not limited to, a personal computer, a tablet computer, a laptop computer, a mobile computing device, a smart cellular telephone, or any other suitable computing device. Additionally, it is noted that the functions performed by the processor 12 could be performed by the end-user computing device 18. Still further, the end-user computing device 18 could include a smart watch, and/or the queries generated by the computing device 18 could alternatively (or, additionally) be generated by one or more sensors, sensor agents that ask questions (e.g., in a form that is not intuitive to a human being), an Internet-of-Things (IoT) device, an artificial intelligence (AI) platform and/or chip, or any other device/platform/service that can generate a question.
[0021]
[0022] The event trigger agent 22 monitors for an event that is suitable for a question-and-answer cycle. Such triggers could include, but are not limited to, information generated by a sensor (such as an electro-mechanical sensor, an audio input, a gesture input, an environmental sensor, etc.), a query generated by a user (e.g., using one of the end-user computing devices 18 of
[0023] The query listener agent 24 can be instantiated by a recognized event trigger (e.g., by the event trigger agent 22), and is a listener software object that has baseline information about its triggers and can also include data provenance information. For example, if the agent 24 is instantiated in response to an event generated by a sensor (and detected by the trigger agent 22), it can keep track of information relating to the sensor, such as sensor type, manufacturer name, model number, serial number, etc. If the agent 24 is instantiated in response to a cloud-based event detected by the agent 22, it can keep track of information such as a Uniform Resource Locator (URL) link associated with the cloud-based event, an indication of whether a website corresponding to the URL link is a government database, or is associated with a non-profit or commercial entity. The agent 24 records the time of the event detected by the agent 22 and parses event data into the query data structure 26.
[0024] The query data structure 26 includes data relating to a question type, a question context, and provenance data (either created by the agent 24 or added to a provenance map that the agent 24 may have received). The question type is domain-dependent and can be as specific or broad as an application domain may require. It provides the query instruction type for accessing domain knowledge structures. For example, the question type could be as specific as a formal Structured Query Language (SQL) or more general (e.g., a text stream), depending on the domain. The question context provides the focus of the query, and can be represented as an attribute of an attribute value pair, where the result returned by the system is the value. Similar to the question type, the question context can be a text stream to focus the response of a large language model. The provenance data can be a map of the processes through which the input data (e.g., event data generated by the event agent 22) is transformed into output (e.g., response structure), including the input into each process in the map. The map can be organized by granularity (much like a physical map, where high-level countries are shown, followed by states, cities, etc.). The provenance data is utilized by the partial information retrieval agent 28 to produce a response. Simplified provenance maps can be single bundles of event data, originating sensor information, and time information. The granularity of the map can be dependent upon the requirements of the domain database search techniques utilized by the system. Additionally, the provenance map can be fully or partially encrypted to prevent tampering, and a key can also be included that only allows for decryption of the map by processes that require the map for decision-making purposes.
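By way of illustration only, the three parts of the query data structure 26 (question type, question context, and a provenance map recording each process and its input) could be sketched in Python as follows. The class names, field names, and the toy parsing rule are assumptions made for this sketch and do not appear in the disclosure.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List


@dataclass
class ProvenanceStep:
    """One process in the provenance map, plus the input it received."""
    process: str
    input_data: Any
    detail: Dict[str, Any] = field(default_factory=dict)


@dataclass
class QueryStructure:
    """Hypothetical stand-in for the query data structure 26."""
    question_type: str      # could be formal SQL or a free-text stream
    question_context: str   # the attribute whose value is being sought
    provenance: List[ProvenanceStep] = field(default_factory=list)


def build_query(event_data: str, source_info: Dict[str, str]) -> QueryStructure:
    """Parse raw event data into a query and start its provenance map."""
    trigger = ProvenanceStep(
        process="event_trigger",
        input_data=event_data,
        detail={"source": source_info,
                "time": datetime.now(timezone.utc).isoformat()},
    )
    # Assumed toy parsing rule: text shaped like "<context> in <subject>".
    context, _, subject = event_data.partition(" in ")
    parse = ProvenanceStep(
        process="query_listener.parse",
        input_data=event_data,
        detail={"subject": subject.strip()},
    )
    return QueryStructure(
        question_type="How many Y in X",
        question_context=context.strip(),
        provenance=[trigger, parse],
    )


query = build_query("carbs in apples", {"type": "keyboard"})
print(query.question_context, [step.process for step in query.provenance])
```

In this sketch, a "simplified provenance map" of the kind mentioned above would correspond to a list holding only the first step (event data, originating source, and time).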
[0025] The partial information retrieval agent 28 receives and processes the query data structure 26 and creates the response data structure 32, and its behavior is controlled by the modular, domain-specific heuristics data structure 30. A detailed description of the processes carried out by the agent 28 is provided below in connection with
[0026] The modular domain heuristic data structure 30 includes information relating to domain knowledge, domain heuristics, and partial information retrieval heuristics. These components are defined at levels of specificity necessary to determine the sufficiency of a query structure, and can be embodied as knowledge objects in software. Additionally, the domain knowledge, domain heuristics, and partial information retrieval heuristics can be implemented as processes ranging from fixed selection statements (if . . . then conditional logic) in software code, to machine learning architectures. It is noted that the data structure 30 is modular in nature, in that, depending on the domain being processed by the system, one or more customized data structures 30 could be utilized and/or substituted by the agent 28. For example, if the agent 28 is processing a query in the domain of food data, a specific data structure 30 with its own customized heuristics germane to food data could be utilized by the agent 28 to develop a query response (the heuristics of the data structure 30 controlling operation of the agent 28), whereas if the agent 28 needs to process a future query in a completely different domain (e.g., well data, cemetery data, library data, etc.), then a different, customized (modular) data structure 30 could be utilized by the agent 28 with its own set of heuristics in order to control operation of the agent 28 and development of a response tailored to that domain.
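The modular, swappable nature of the data structure 30 could be sketched as a registry of per-domain heuristic modules that the agent selects at query time. This is a minimal Python sketch under stated assumptions: the `RetrievalAgent` class, the module functions, and the library-domain entry are hypothetical placeholders, not part of the disclosure.

```python
from typing import Callable, Dict, Optional

# A heuristic module maps a question context to a response, or None.
HeuristicModule = Callable[[str], Optional[str]]


def food_module(context: str) -> Optional[str]:
    """Food-domain module; the single entry mirrors 'apple, carbs, 34'."""
    knowledge = {"carbs": "apple, carbs, 34"}
    return knowledge.get(context)


def library_module(context: str) -> Optional[str]:
    """A second, unrelated domain with a placeholder entry."""
    knowledge = {"loan period": "placeholder answer"}
    return knowledge.get(context)


class RetrievalAgent:
    """Hypothetical agent whose behavior is set by the module it is given."""

    def __init__(self) -> None:
        self._modules: Dict[str, HeuristicModule] = {}

    def register(self, domain: str, module: HeuristicModule) -> None:
        self._modules[domain] = module

    def process(self, domain: str, context: str) -> Optional[str]:
        module = self._modules.get(domain)
        if module is None:
            return None  # no heuristic module available for this domain
        return module(context)


agent = RetrievalAgent()
agent.register("food", food_module)
agent.register("library", library_module)
print(agent.process("food", "carbs"))
```

Because the agent only holds references to modules, a new domain can be supported by registering another module without changing the agent itself.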
[0027] The domain knowledge can be encoded in a form appropriate for a domain, from a simple lookup to a complex primitive to composite knowledge structures. The information encoded in the data structure 30 can be incomplete or extraneous, thereby mitigating the need to clean or enhance the information. For example, a data object representing an apple might include detailed information about the nutritional value of a medium apple. The provenance of the data may be from a commercial website for weight management. Alternatively, an entry for apple might include both the medium apple as well as the full details for 100 grams of apple from the FDA Food Source database. Alternatively, the domain knowledge could be as simple as apple, carbs, 34.
[0028] The domain heuristics of the data structure 30 comprise rules or inferences that can be applied to the domain structure to determine whether there is sufficient or insufficient data available to produce a response to a query. For example, a heuristic may be encoded that states that if there is only one entry, such as apple, carbs, 34, then the system should use such entry and note the provenance associated with the entry (e.g., indicating that the entry came from a user's journal entry, for example), but if there is an FDA Food Source also available, then the system should also use that source in developing a response and include both sources in the provenance for later disambiguation.
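The example heuristic above (use a lone entry and note its provenance; if a second source is also available, use both and keep both in the provenance) could be sketched as follows. The `Entry` type, the source labels, and the second numeric value are illustrative assumptions only and are not real nutrition data.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Entry:
    food: str
    nutrient: str
    value: float
    source: str  # provenance label for this entry


def apply_domain_heuristic(entries: List[Entry], food: str,
                           nutrient: str) -> Dict[str, object]:
    """Decide sufficiency and keep every contributing source."""
    matches = [e for e in entries
               if e.food == food and e.nutrient == nutrient]
    if not matches:
        # Insufficient data: nothing in the knowledge base applies.
        return {"sufficient": False, "values": [], "provenance": []}
    # One entry or several: use all of them and record each source so
    # that later processing can disambiguate.
    return {
        "sufficient": True,
        "values": [e.value for e in matches],
        "provenance": [e.source for e in matches],
    }


journal_only = [Entry("apple", "carbs", 34.0, "user journal")]
# The second value below is a made-up placeholder, not real reference data.
with_reference = journal_only + [
    Entry("apple", "carbs", 14.0, "reference database")
]

print(apply_domain_heuristic(journal_only, "apple", "carbs"))
print(apply_domain_heuristic(with_reference, "apple", "carbs"))
```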
[0029] The partial information retrieval heuristics of the data structure 30 comprise a set of rules that can be tailored to a domain, and control operation of the agent 28 to generate a response to a query generated by the query listener agent 24. They provide general principles for deciding whether the question type and question context are recognized, and the set of possible results returned can be bundled into either a successful response or an unsuccessful response. Processing steps carried out by the agent 28 are described in more detail below in connection with
[0030] The response data structure 32 bundles all of the information collected by the agent 28 into a domain-specific data structure, and is processed by both the knowledge base update agent 34 and the response generator agent 36. Included in the structure 32 is an indication of whether the response is successful or unsuccessful in answering the query generated by the query listener agent 24.
[0031] The knowledge base update agent 34 is an optional component of the system that can update one or more persistent stores (e.g., update one or more of the modular domain heuristic data structures 30), either in real time or through a deferred batch process, and with or without human supervision. Using this agent 34, the system can develop an up-to-date repository of information relating to domain-specific queries and associated heuristics.
[0032] The response generator agent 36 generates a natural language response to the query generated by the agent 24, which can be generated and transmitted to the user (e.g., to one or more of the devices 18 of
[0033]
[0034] In step 50, a determination is made as to whether a query response failure (which could be generated in steps 46 or 48) matters. If so, step 52 occurs, wherein the system generates and returns a response. Otherwise, step 54 occurs, wherein the system adds the response to the response structure. Then, in step 56, a determination is made as to whether there are more questions in the query to be answered. If not, step 52 occurs. Otherwise, step 58 occurs, wherein the system performs further partial information processing on the remaining questions by recursively calling the processing steps 40 of
[0035]
[0036] In the event that only a partial question type is determined to be identified in step 60, then step 72 occurs, wherein a determination is made as to whether the question context can be identified. If so, step 74 occurs, wherein the system attempts to match the partial question type with the question context. Then, in step 76, a determination is made as to whether the partial question type and question context match. If so, the full bundle is returned in step 86, a success notification S3 is generated, and processing ends. Otherwise, step 78 occurs, wherein the system attempts to match the partial question type with the partial question context. Then, step 80 occurs, wherein a determination is made as to whether the partial question type matches the partial question context. If so, step 86 occurs, wherein the full bundle is returned, a success notification S3 is generated, and processing ends. In such circumstances, the question "carbs in apples" could have generated a question type such as "How many Y in X" where the intent was "Does X have Y." The candidate types inferred to be sufficient are collected into an unordered collection (perhaps represented as a list) of partial question types. The question context is then evaluated as sufficient to be identified. If a single perfect match occurs, then it is used with each item in the partial question type list to produce a list of potential matches. If the potential list contains at least one match, then the results are bundled and returned as a successful result.
[0037] In the event that a negative determination is made in step 80, step 82 occurs, wherein the input data is bundled, a failure notification F2 is generated, and processing ends. In such circumstances, there is still value in understanding and returning the failed result. For example, the question "Antioxidants in a honey crisp" may lead to failure if the knowledge representation cannot identify this kind of apple and there is no knowledge of antioxidants, nor of how the two are related. Returning the heuristic (e.g., that the context is not in the knowledge base) can lead to more refined questions. Finally, in the event that a negative determination is made in step 72 (i.e., the question context cannot be identified), step 84 occurs, wherein the system bundles the input data and generates a failure notification F3. It is noted that, in all of the failure paths (failure notifications F1, F2, and F3), the input data provenance is augmented with a bundle of heuristics that produced the unsuccessful result. Importantly, the failure of one question within the framework of a composite question may still allow success at a higher level (with an optional qualifier).
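The partial-question-type path (steps 72 through 86) could be sketched in Python as shown below. This is only one possible reading: the knowledge base is modeled as a dictionary keyed by (question type, context), and "partial context" is approximated by substring overlap. Both choices, along with the function name and the wording of the heuristic notes, are assumptions of this sketch.

```python
from typing import Dict, List, Tuple

Knowledge = Dict[Tuple[str, str], str]


def match_partial(candidate_types: List[str], context: str,
                  knowledge: Knowledge) -> Dict[str, object]:
    """Return a bundle tagged S3 (success) or F2/F3 (failure)."""
    bundle = {"input": (list(candidate_types), context),
              "matches": [], "heuristics": []}
    known = sorted({ctx for _, ctx in knowledge})
    related = [c for c in known
               if context and (context in c or c in context)]
    # Step 72: can the question context be identified at all?
    if not related:
        bundle["heuristics"].append("context not in knowledge base")
        return {**bundle, "status": "F3"}
    # Steps 74-76: pair each candidate type with the full context.
    exact = [(t, context) for t in candidate_types
             if (t, context) in knowledge]
    if exact:
        bundle["heuristics"].append("partial type matched full context")
        return {**bundle, "status": "S3",
                "matches": [knowledge[k] for k in exact]}
    # Steps 78-80: pair each candidate type with a partial context.
    partial = [(t, c) for t in candidate_types for c in related
               if (t, c) in knowledge]
    if partial:
        bundle["heuristics"].append("partial type matched partial context")
        return {**bundle, "status": "S3",
                "matches": [knowledge[k] for k in partial]}
    # Step 82: nothing matched; keep the heuristics with the failure.
    bundle["heuristics"].append("no type/context pairing found")
    return {**bundle, "status": "F2"}


kb: Knowledge = {("How many Y in X", "carbs"): "apple, carbs, 34"}
types = ["How many Y in X", "Does X have Y"]
print(match_partial(types, "carbs", kb)["status"])
```

Each failure bundle retains the heuristic notes that led to it, mirroring the statement above that failure paths augment the input provenance with the heuristics that produced the unsuccessful result.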
[0038] The processing steps 48 of
[0039] Additionally, it is noted that the various bundles generated by the processing steps of
[0040] The domain heuristics of the data structure 30 discussed in connection with
[0041]
[0042] It is noted that the systems and methods discussed herein in connection with
[0043] In the natural (non-digital) world, information is only as accurate and reliable as the methodology employed by the sentient reporter and, more critically, by the media through which that information is presented. Historians assert that at every stage from observation to recording to reporting, the quality of the initial observation is modified via cultural, social, and psychological norms, biases, and assumptions. Information can be trusted only when its origins and transformations can be examined or referenced. Attribution is essential to reliable information retrieval. Information provenance is key to good decision making. In computing information systems, informative responses to user queries via visualization-based user interfaces must provide a means for interactively examining the provenance of that information, often referred to as "drilling down." This is especially critical to summation techniques within conversational AI systems. The customized visualization interface described in
[0044] Current digital technologies provide the illusion that quantified data is precise, and that current data retrieval, analysis, and reporting techniques maintain that precision. This is a significant problem when inappropriate visualization techniques are applied to data that may be incomplete or aggregated without attention to its provenance chain. It is not sufficient to provide links to sources; instead, it is necessary to allow a user to examine the aggregation process directly. This is particularly problematic for individuals making personal decisions based on incomplete information sourced both from their personal observations and external information sources (e.g., media, the Internet, libraries, domain professionals (both through computer-based media and direct communication), etc.). Standard visualization techniques, ranging from lists and charts to graphs, animations, and video recordings, may be useful for demonstrating phenomena, but are equally able to obfuscate and mislead. The visualization interface of
[0045]
[0046] To articulate the novelty of this approach, the problem of representing weight loss goals is presented herein, but of course, other types of visualizations are possible. Weight loss, despite the overwhelming information available on the Internet, is a simple calculation for an individual: the calories consumed in a day must be consistently less than an individualized calculated maintenance amount. Diet programs have promoted this concept as far back as the 18th century. Complex models, methods, and visualizations are available for assisting individuals in mapping their personal consumption to goals determined by generalized (and often conflicting) assumptions and calculations.
[0047] At their core, existing visualizations represent the calorie consumption goals as precise quantitative data. Data aggregation without provenance hides that precise target behind rules prescribing good, neutral, and bad foods, and rarely if ever provides transparency for these categorizations. The information is incomplete but is represented visually as complete and absolute, as a goal to achieve. Similarly, actual calories consumed are assumed to be precisely measured, when the figure is always an estimate, as the USDA nutrition website asserts. For example, a Honey Crisp Apple listed on the USDA FoodData website is calibrated to 100 g based on a well-established methodology. But at 60 calories/100 grams, a precise calorie count for an actual apple could vary significantly. In a typical day with a target of 2000 calories (however well disguised by the weight loss industry), the accumulation of those small differences can statistically lead to a percentage of error that can indeed impact calorie deficiency goals. Forgetting to record a candy bar purchased during a commute home can similarly thwart precise comparison of targets and actual values.
[0048] Visualizing incomplete or inaccurate data as precise data points leads to misinterpretation that, in turn, can lead to failed expectations. For example, if an established recommended calorie intake is 2000 calories, and the recommended daily deficit is 200 calories, the prediction is that an individual will lose about a pound per week in a healthy manner. If weight is not lost at that rate, it may very well be that the data recording is not sufficiently accurate for these precise targets and the visualizations are unable to demonstrate that inaccuracy. Visualizations that support drilling down into the provenance of data can provide explanations for failed outcomes that don't blame the victim (e.g., not enough willpower, didn't properly take the medication, didn't follow an exercise regimen, etc.). The visualization interface described herein removes the value judgment and provides a visualization that shows, at a glance, the relationship between, and individual provenance of, two aggregate and potentially inaccurate values. Depending on the immediate needs of the user, the visualization interface can produce multiple viewpoints of a data set.
[0049]
[0050]
[0051]
[0052] As noted above, the systems and methods (and their visualizations) could be applied to a variety of domains, including any domain in which target outcomes and actual recording are compared. Health care domains include, but are not limited to, protein versus calories, nutrition balance, physical therapy, and habit formation. Energy consumption domains include, but are not limited to, total energy use versus energy produced by solar panels, car maintenance and fuel consumption, etc. Education domains include, but are not limited to, normative student outcomes versus actual performance.
[0053] Still another domain to which the systems and methods of the present disclosure are applicable is personal or small business finance, e.g., making financial information easier to understand for someone who is not familiar or comfortable with traditional financial reports. Examples include:
[0054] 1. The big picture/zooming out: the system could convey a person's overall financial health (using the visualization patterns disclosed herein), perhaps giving them new insights into areas they have never thought of before.
[0055] 2. For visualizing net worth, a visualization block (e.g., the triple of pieces referred to) could represent:
[0056]   a. A single asset (stock, fund, 401k, real estate, vehicle, boat, plane, etc.)
[0057]     i. The block represents a position in that investment.
[0058]       1. Cost Basis
[0059]       2. Current Value
[0060]       3. Difference is represented as total gain/loss
[0061]   b. A single liability (mortgage, credit card, other loans, etc.)
[0062]     i. The block represents an obligation for that liability, or past due amount, or delinquency.
[0063]       1. Total borrowed or owed
[0064]       2. Cumulative payments made
[0065]       3. Difference is represented as the outstanding balance
[0066] 3. For income statements, each block could represent:
[0067]   a. A single income source (salary, royalty, rental income)
[0068]     i. The block represents an estimated income for that source and time period
[0069]       1. Total estimated income
[0070]       2. Actual income
[0071]       3. Difference indicates whether an individual is below or above the estimated income
[0072]   b. A single expense (loan payments, household expenses, vacations)
[0073]     i. The block represents the planned expense item (although loan payments are mostly fixed, things like housing expenses would be variable)
[0074]       1. Planned amount
[0075]       2. Amount spent
[0076]       3. Difference indicates whether an individual is over or under that particular planned expense
[0077] The systems and methods herein can perform data comparison where a difference provides essential insight into the question or problem and can be visualized. For example, student progress ranging from a curricular activity to standardized test results can inform comparison of a student's performance to a normative target. A numeric value out of context provides little insight into what knowledge or skills a student has gained. Providing a difference measure between a norm and a student's performance is enhanced with the potential to drill down from an aggregate score or rubric assignment to examine the component performances to directly identify causes for concern or potential for enrichment.
[0078] The visualization components of the systems and methods of the present invention include a triple of pieces: two data objects that are retrieved from one or more resources, and a difference object that is defined as the quantitative difference between the two data objects. Provenance of the data objects as well as that of the difference object is maintained throughout, to support methods/algorithms that drill down into the data analysis behind each of the data objects, build up new pieces from previously defined data and difference objects, or combine pieces into rows and blocks to form more complex representations. Finally, a data piece summary object can be generated from rows and blocks of pieces. Enhanced visualization ability comes from domain-specific heuristics for choosing the order of the two data pieces (top/bottom or left/right), the position of the difference piece relative to the smaller data object (before or after), color (or image), opacity, relative depth of the piece, and piece orientation (horizontal/vertical).
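The triple of pieces, with provenance carried on each object to support drill-down, could be sketched as follows. The class names, the `drill_down` method, the provenance strings, and the sample calorie values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DataObject:
    label: str
    value: float
    provenance: List[str]  # where the value came from


@dataclass
class DifferenceObject:
    value: float
    provenance: List[str]  # how the difference was derived


@dataclass
class PieceTriple:
    first: DataObject
    second: DataObject
    difference: DifferenceObject

    def drill_down(self) -> Dict[str, List[str]]:
        """Expose the provenance behind every member of the triple."""
        return {
            self.first.label: self.first.provenance,
            self.second.label: self.second.provenance,
            "difference": self.difference.provenance,
        }


def make_triple(first: DataObject, second: DataObject) -> PieceTriple:
    # The difference keeps its own provenance: the operation applied
    # and the labels of the two objects it was computed from.
    diff = DifferenceObject(
        value=first.value - second.value,
        provenance=[f"computed as {first.label} - {second.label}"],
    )
    return PieceTriple(first, second, diff)


actual = DataObject("actual", 1850.0, ["user journal"])
recommended = DataObject("recommended", 2000.0, ["assumed daily target"])
triple = make_triple(actual, recommended)
print(triple.difference.value, triple.drill_down())
```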
[0079]
[0080]
[0081] A unique aspect of the definition is that each visual attribute includes its provenance, which can be used to visually examine its origins. The surrounding system can use this to drill down within any piece providing dashboard capability to provide multi-modal visualizations as well as interaction to allow users to construct sophisticated visualizations that combine visualization pieces with other informative visual modalities such as traditional graphs, text boxes, and animations. The attributes and value pairs are intentionally implemented as natural language-based tokens so that accessibility functions can be employed to render in appropriate modalities for the visually impaired.
[0082] The subclasses 178-182 include:
[0083] 1. A primitive piece is a shape (illustrated as a rectangle), where the length is defined as a scaled unit of measure of an input quantity (such as calories in a ratio of 5 calories per pixel). Remaining dimensions, fill, and opacity are dependent upon domain heuristics. A single datum has a context that is illustrated through the remaining dimensions via the domain heuristics. Each of those heuristic choices can be examined through the input data provenance via the interaction interface. Consequently, the origin or rationale for reducing complex information to a single data point can be examined and not simply taken as absolute.
[0084] 2. A difference piece is constructed from three primitive pieces. Placement of the three pieces relative to each other, as well as their remaining attributes, is determined by domain heuristic rules. This piece is the core construct of the system, and includes the following:
[0085] Two measured quantities (M1 and M2) with the same unit of measure, where each input quantity defines the length of its component piece (e.g., actual and recommended calories).
[0086] A difference piece whose length is calculated as the absolute value of the difference between the two measured quantities.
[0087] 3. A row is a sequence of pieces with identical widths (determined from domain heuristics). The pieces are oriented relative to the row and are stacked sequentially so that the sum of their lengths defines the length of the row, and the width of the row is defined as the standard width of all of the pieces. A row can be used to represent one measured quantity in a difference piece. The widths need to be identical to support block construction and maintain the integrity of the overall visualization of the differences.
[0088] 4. A block is a sequence of rows in a dimension different from the rows from which it is constructed. The width of the block is defined as the sum of the widths of its component pieces. The length of the block is the longest length of the rows. The lengths of the rows in a block can be ragged.
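By way of a non-limiting illustration, the piece geometry described above could be sketched in Python as follows. The class names, the default width of 20 pixels, and the `provenance` string field are hypothetical and are not taken from the disclosure; only the ratio of 5 units per pixel comes from the example above. The sketch also assumes that a block's length is that of its longest row and that its width is the sum of the widths of its rows.

```python
from dataclasses import dataclass, field
from typing import List

UNITS_PER_PIXEL = 5.0  # the text's example ratio: 5 calories per pixel


@dataclass
class Primitive:
    quantity: float              # input quantity, e.g. calories
    width: float = 20.0          # hypothetical default; set by domain heuristics in the text
    provenance: str = "unknown"  # hypothetical field recording where the datum came from

    @property
    def length(self) -> float:
        # Length is the scaled input quantity.
        return self.quantity / UNITS_PER_PIXEL


@dataclass
class Row:
    pieces: List[Primitive] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Rows require identical widths so that blocks can be constructed.
        if len({p.width for p in self.pieces}) > 1:
            raise ValueError("all pieces in a row must share one width")

    @property
    def length(self) -> float:
        # Pieces are stacked end to end along the row.
        return sum(p.length for p in self.pieces)

    @property
    def width(self) -> float:
        return self.pieces[0].width if self.pieces else 0.0


@dataclass
class Block:
    rows: List[Row] = field(default_factory=list)

    @property
    def length(self) -> float:
        # Rows may be ragged; the block is as long as its longest row.
        return max((r.length for r in self.rows), default=0.0)

    @property
    def width(self) -> float:
        # Rows are stacked in the other dimension (assumed reading).
        return sum(r.width for r in self.rows)
```

In this sketch, a row is rejected at construction time if its pieces differ in width, which mirrors the stated requirement that widths be identical to support block construction.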
[0089]
[0090] The following example is given in connection with a visualization of a meal. When the meal is breakfast, the actual datum is blue and the recommended is aqua. For any meal:
[0091] If the actual is less than the recommended and the objective is to gain weight, then the difference piece is red.
[0092] The difference piece is green if:
[0093] the actual is less than the recommended and the objective is to lose weight, OR
[0094] the actual is greater than the recommended and the objective is to gain weight.
[0095] The difference piece is red if:
[0096] the actual is less than the recommended and the objective is to gain weight, OR
[0097] the actual is greater than the recommended and the objective is to lose weight.
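As one possible illustration, the color rules of the meal example could be expressed as a small Python function. The function name, the `"gain"`/`"lose"` objective strings, and the choice to return `None` when the two quantities are equal are assumptions made for this sketch and are not specified in the disclosure.

```python
from typing import Optional

# Datum colors given in the example for breakfast.
BREAKFAST_COLORS = {"actual": "blue", "recommended": "aqua"}


def difference_color(actual: float, recommended: float, objective: str) -> Optional[str]:
    """Return the difference-piece color for the meal example.

    objective is "gain" or "lose" (hypothetical encoding of the weight goal).
    """
    if objective not in ("gain", "lose"):
        raise ValueError("objective must be 'gain' or 'lose'")
    if actual == recommended:
        return None  # assumption: no difference piece to color when equal
    under = actual < recommended
    # Green: eating less while losing weight, or eating more while gaining.
    on_track = (under and objective == "lose") or (not under and objective == "gain")
    return "green" if on_track else "red"
```

For instance, an actual intake of 400 calories against a recommendation of 500 yields green when the objective is to lose weight and red when the objective is to gain weight.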
[0098] All of the other attributes are hard-coded to values for this illustration, but could be determined by rules such as:
[0099] If the provenance of the datum is a nutrition label or the USDA Food Data website, then the opacity is solid; otherwise, the opacity is 50%.
[0100] If the piece represents a single meal, then show the piece horizontally with the actual above the recommended.
[0101] If the piece represents a single day, then show the piece vertically with the actual to the left of the recommended.
[0102] If the user-specified focus is on the actual, then make the recommended half the depth.
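The example rules above could, for illustration only, be encoded as follows. The names `RenderAttributes` and `render_attributes`, the string values used for scope and focus, and the decision to reject scopes other than a meal or a day are hypothetical choices made for this sketch.

```python
from dataclasses import dataclass

# Sources named in the example rule as warranting solid opacity.
TRUSTED_SOURCES = {"nutrition label", "USDA Food Data website"}


@dataclass
class RenderAttributes:
    opacity: float                  # 1.0 is solid, 0.5 is 50%
    orientation: str                # "horizontal" or "vertical"
    recommended_depth_scale: float  # 0.5 halves the depth of the recommended piece


def render_attributes(provenance: str, scope: str, focus: str = "") -> RenderAttributes:
    opacity = 1.0 if provenance in TRUSTED_SOURCES else 0.5
    if scope == "meal":
        orientation = "horizontal"  # actual above recommended
    elif scope == "day":
        orientation = "vertical"    # actual to the left of recommended
    else:
        # The example gives no rule for other scopes, so the sketch refuses them.
        raise ValueError(f"no orientation rule for scope {scope!r}")
    depth_scale = 0.5 if focus == "actual" else 1.0
    return RenderAttributes(opacity, orientation, depth_scale)
```

Because each rule reads only the provenance, scope, or focus passed in, the same function could be evaluated once ahead of time or again whenever the user changes the focus.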
[0103] The visualization object attributes can be determined by static (compile-time) or run-time (user-initiated) rules established in the domain heuristic object.
[0104]
[0105] Importantly, the interface supports multiple layers of pieces. For example, selecting breakfast from a day composite piece will instantiate a new window with all of the functionality of the parent window. Parent-child relationships are maintained behind the scenes, but free motion within any window supports user-initiated organizations. The interaction objects also support corralling visualization pieces into a row that in turn can be combined into a block. The pieces attribute of a piece represents a row or block depending on the structure of the pieces.
[0106]
[0107] In step 312, an event trigger occurs, which specifies two data items to be compared (data1 and data2) and also includes (as input) provenance that impacts domain heuristics. The build difference piece method 314 is then instantiated, and in step 316, the system creates a new difference piece. Additionally, a first datum piece (datum1) is created with a length of data1, and a second datum piece (datum2) is created with a length of data2. Next, in step 318, a determination is made as to whether data1 is greater than data2. If so, step 320 occurs, wherein the length of the difference piece is set to the length of datum1. Then, in step 322, the dominant piece is set as datum1, and the subordinate piece is set as datum2. Then, step 324, discussed below, occurs.
[0108] In the event that a negative determination is made in step 318, step 330 occurs, wherein a determination is made as to whether data1 is less than data2. If not (implying that data1 and data2 are equal and there is no difference), step 336 occurs, wherein the system applies rendering heuristics to datum1 and datum2. Otherwise, step 332 occurs, wherein the length of the difference piece is set to the length of datum2. Then, in step 334, the dominant piece is set to datum2, and the subordinate piece is set to datum1.
[0109] In step 324, the system creates a difference piece having a length equal to the difference in lengths between the dominant piece and the subordinate piece. In step 326, the system applies rendering heuristics to the dominant piece, the subordinate piece, and the difference piece. Finally, in step 328, the system returns the new difference piece.
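The flow of steps 312 through 336 could be sketched, under stated assumptions, as the Python function below. The class names, the `apply_heuristics` callback signature, and the `attrs` dictionary are hypothetical. The disclosure does not state what is returned when the two data items are equal, so the sketch assumes the composite piece is returned without a difference component.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Piece:
    length: float
    attrs: Dict[str, str] = field(default_factory=dict)


@dataclass
class DifferencePiece:
    datum1: Piece
    datum2: Piece
    length: float = 0.0
    dominant: Optional[Piece] = None
    subordinate: Optional[Piece] = None
    difference: Optional[Piece] = None


def build_difference_piece(
    data1: float,
    data2: float,
    provenance: str,
    apply_heuristics: Callable[[List[Piece], str], None],
) -> DifferencePiece:
    # Step 316: create the new difference piece and the two datum pieces.
    datum1, datum2 = Piece(length=data1), Piece(length=data2)
    piece = DifferencePiece(datum1, datum2)

    if data1 > data2:                                        # step 318
        piece.length = datum1.length                         # step 320
        piece.dominant, piece.subordinate = datum1, datum2   # step 322
    elif data1 < data2:                                      # step 330
        piece.length = datum2.length                         # step 332
        piece.dominant, piece.subordinate = datum2, datum1   # step 334
    else:
        # Step 336: equal inputs, so only the two datum pieces are rendered.
        apply_heuristics([datum1, datum2], provenance)
        piece.length = datum1.length  # assumption: not stated in the text
        return piece                  # assumption: returned without a difference component

    # Step 324: the difference component spans the gap between the two pieces.
    piece.difference = Piece(length=piece.dominant.length - piece.subordinate.length)
    # Step 326: render all three component pieces.
    apply_heuristics([piece.dominant, piece.subordinate, piece.difference], provenance)
    return piece                                             # step 328
```

Passing the rendering heuristics in as a callback keeps the build method independent of any particular domain, which is consistent with the role the domain heuristic object plays elsewhere in the description.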
[0110]
[0111]
[0112]
[0113]
[0114] The domain heuristic object is an API for the system that mediates between an external decision-making system and the visualization system. Consequently, the API consists of methods to retrieve required attribute values for a piece as summaries. When a piece is being built, it calls the relevant domain heuristic method, whose implementation is defined by the surrounding system. Information querying is two-way: the surrounding system can both request and provide information, so that visualization decisions are not static but can be updated dynamically in real time.
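One way such an API might look is sketched below in Python. The interface name `DomainHeuristics`, its single `attribute` method, the `provide` method, and the dictionary representation of a piece are all hypothetical; the disclosure describes the role of the API but not its signatures.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Sequence


class DomainHeuristics(ABC):
    """Interface that the surrounding system implements (hypothetical)."""

    @abstractmethod
    def attribute(self, name: str, context: Dict[str, Any]) -> Any:
        """Return the value of one visual attribute for a piece being built."""


class MealHeuristics(DomainHeuristics):
    def __init__(self) -> None:
        self._provided: Dict[str, Any] = {}

    def provide(self, name: str, value: Any) -> None:
        # The surrounding system pushes information at run time.
        self._provided[name] = value

    def attribute(self, name: str, context: Dict[str, Any]) -> Any:
        if name in self._provided:
            return self._provided[name]
        if name == "opacity":
            trusted = context.get("provenance") == "nutrition label"
            return 1.0 if trusted else 0.5
        raise KeyError(name)


def build_piece(
    length: float,
    context: Dict[str, Any],
    heuristics: DomainHeuristics,
    required: Sequence[str] = ("opacity",),
) -> Dict[str, Any]:
    # The build method asks the heuristics object for each required attribute.
    piece: Dict[str, Any] = {"length": length}
    for name in required:
        piece[name] = heuristics.attribute(name, context)
    return piece
```

In this sketch, calling `provide` between two builds changes the attribute values of the second build, which illustrates how visualization decisions could be updated dynamically rather than fixed in advance.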
[0115] Like the Domain Heuristics Object, the Interaction Object is an API for the system. The abstract object is a generic display that has a position in a visual rendering system, such as a web window, and an interaction API definition through which to manipulate both the position and size of an instantiated display and the internal contents of the display. Each piece has a provenance, which is either its input data (which itself has a provenance) or the originating instantiation of the piece (via build methods). If the surrounding system maintains a provenance discipline (e.g., provides a provenance with raw input data), then an appropriate display hierarchy can be constructed by the surrounding system.
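The recursive nature of provenance described here could be illustrated with a simple linked record, as in the following Python sketch. The `Provenance` class, its `source` field, and the `trace` method are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Provenance:
    description: str
    source: Optional["Provenance"] = None  # provenance of this provenance, if any

    def trace(self) -> List[str]:
        """Walk from this record back to the original source."""
        chain: List[str] = []
        node: Optional[Provenance] = self
        while node is not None:
            chain.append(node.description)
            node = node.source
        return chain


# A piece built from an input datum that was itself read from a nutrition label.
label = Provenance("nutrition label")
datum = Provenance("input datum: 450 calories", source=label)
piece = Provenance("built by build method", source=datum)
```

A display hierarchy of the kind mentioned above could then be derived by walking `piece.trace()` and opening one level of detail per entry, provided the surrounding system supplies a provenance with its raw input data.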
[0116] The visualization system described herein illustrates the difference between two data items with the same unit of measure. The power of the system is the recursive inclusion of data provenance. The developer of the surrounding system can provide displays and interaction that appropriately constrain or allow examination and reorganization of the input data without either overwhelming the user or misrepresenting the data and its origins.
[0117] Having thus described the system and method in detail, it is to be understood that the foregoing description is not intended to limit the spirit or scope thereof. It will be understood that the embodiments of the present disclosure described herein are merely exemplary and that a person skilled in the art can make variations and modifications without departing from the spirit and scope of the disclosure. All such variations and modifications, including those discussed above, are intended to be included within the scope of the disclosure. What is desired to be protected by Letters Patent is set forth in the following claims.