REGISTRY DRIVEN INTEROPERABILITY AND EXCHANGE OF DOCUMENTS
20170364498 · 2017-12-21
Assignee
Inventors
- Christopher Todd Ingersoll (Berkeley, CA)
- Jayaram Rajan Kasi (San Jose, CA)
- Alexander Holmes (San Jose, CA)
- Michael Clark (Los Gatos, CA)
- Ashok Aletty (Saratoga, CA)
- Sathish Babu K. Senathi (Fremont, CA)
- Helen S. Yuen (Oakland, CA)
Cpc classification
G06F40/143
PHYSICS
G06F40/154
PHYSICS
Y10S707/99934
GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
International classification
Abstract
The present invention relates to systems and methods for registry driven transformation of a document exchanged between businesses or applications. More particularly, it relates to systems and protocols for using one or more commonly accessible registries to transform electronic commerce documents among dissimilar interfaces, preferably XML documents. Particular aspects of the present invention are described in the claims, specification and drawings.
Claims
1. A method including: maintaining a document family registry data structure for a particular document family including document family members, the document family registry data structure including: document family versions, transforms including program instructions that transform data between document family versions, and associations of the transforms with source document family members and target document family members; and responding to requests for transforms to convert a document from a source version to a target version by: traversing the document family registry data structure as a directed graph, and providing one or more sequences of transforms determined by the traversing.
2. The method of claim 1, wherein the document family versions in the document family registry data structure are identified by library, document identifier, release version and schema type.
3. The method of claim 1, wherein the one or more sequences of transforms includes one or more of Contivo maps, XSLT maps, XST maps, XSD-SOX classes, and Java maps.
4. The method of claim 1, wherein the one or more sequences of transforms is stored in a local repository and is checked against a non-local repository for currency before application.
5. The method of claim 1, wherein: the maintaining of the document family registry data structure further includes maintaining transform success scores; and the traversing of the document family registry data structure further includes calculating composite transform success scores for the one or more sequences of transforms.
6. The method of claim 5, wherein the transform success scores correspond to a fraction of fields translated verbatim from a field of the source document to a field of the target document.
7. The method of claim 5, wherein the transform success scores correspond to a fraction of text in fields of the source document found verbatim in fields of the target document.
8. The method of claim 1, wherein: the maintaining of the document family registry data structure further includes maintaining special rules for transforms to be applied corresponding to one or more participants in a document exchange; and the providing further includes referring to privately maintained transforms, maintained separately from the document family registry data structure.
9. The method of claim 1, wherein the providing further includes referring to privately maintained transforms, maintained separately from the document family registry data structure.
10. A method including: maintaining a document family registry data structure for a particular document family including document family members, the document family registry data structure including: document family versions; and transforms including program instructions that transform data between document family versions; and responding to requests, which identify a first document family version and a second document family version, by providing a transform to be executed remotely and that transforms the data from the first document family version to the second document family version.
11. The method of claim 10, wherein the document family versions in the document family registry data structure are identified by library, document identifier, release version and schema type.
12. The method of claim 10, wherein the one or more sequences of transforms includes one or more of Contivo maps, XSLT maps, XST maps, XSD-SOX classes, and Java maps.
13. The method of claim 10, wherein the one or more sequences of transforms is stored in a local repository and is checked against a non-local repository for currency before application.
14. The method of claim 10, further including traversing the document family registry data structure and identifying one or more sequences of multiple transforms that collectively transform from the first document family version of a source document into the second document family version of a target document.
15. The method of claim 14, wherein: the maintaining of the transforms further includes maintaining transform success scores; and the responding to requests further includes calculating composite transform success scores for the one or more sequences of multiple transforms.
16. The method of claim 15, wherein the transform success scores correspond to a fraction of fields translated verbatim from a field of the source document to a field of the target document.
17. The method of claim 15, wherein the transform success scores correspond to a fraction of text in fields of the source document found verbatim in fields of the target document.
18. The method of claim 10, wherein the maintaining of the document family registry data structure further includes maintaining special rules for transforms to be applied corresponding to one or more participants in a document exchange.
19. The method of claim 10, wherein the transform to be executed remotely is a privately maintained transform that is maintained separately from the document family registry data structure.
20. The method of claim 10, wherein: the maintaining of the document family registry data structure further includes maintaining special rules for transforms to be applied corresponding to one or more participants in a document exchange; and the responding further includes referring to a private transform that is maintained securely within the document family registry data structure.
21. A non-transitory computer readable storage medium impressed with computer program instructions that, when executed on a processor, implement a method comprising: maintaining a document family registry data structure for a particular document family including document family members, the document family registry data structure including: document family versions, transforms including program instructions that transform data between document family versions, and associations of the transforms with source document family members and target document family members; and responding to requests for transforms to convert a document from a source version to a target version by: traversing the document family registry data structure as a directed graph, and providing one or more sequences of transforms determined by the traversing.
22. A non-transitory computer readable storage medium impressed with computer program instructions that, when executed on a processor, implement a method comprising: maintaining a document family registry data structure for a particular document family including document family members, the document family registry data structure including: document family versions, and transforms including program instructions that transform data between document family versions; and responding to requests, which identify a first document family version and a second document family version, by providing a transform to be executed remotely and that transforms the data from the first document family version to the second document family version.
23. A system including one or more processors coupled to memory, the memory loaded with computer instructions that, when executed on the processors, implement actions comprising: maintaining a document family registry data structure for a particular document family including document family members, the document family registry data structure including: document family versions, transforms including program instructions that transform data between document family versions, and associations of the transforms with source document family members and target document family members; and responding to requests for transforms to convert a document from a source version to a target version by: traversing the document family registry data structure as a directed graph, and providing one or more sequences of transforms determined by the traversing.
24. A system including one or more processors coupled to memory, the memory loaded with computer instructions that, when executed on the processors, implement actions comprising: maintaining a document family registry data structure for a particular document family including document family members, the document family registry data structure including: document family versions, and transforms including program instructions that transform data between document family versions; and responding to requests, which identify a first document family version and a second document family version, by providing a transform to be executed remotely and that transforms the data from the first document family version to the second document family version.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0010]
[0011]
[0012]
[0013]
[0014]
[0015]
[0016]
[0017]
[0018]
[0019]
[0020]
[0021]
DETAILED DESCRIPTION
[0022] The following detailed description is made with reference to the figures. Preferred embodiments are described to illustrate the present invention, not to limit its scope, which is defined by the claims. Those of ordinary skill in the art will recognize a variety of equivalent variations on the description that follows.
[0023]
[0024] The web services engine 211 has access to a variety of transforms 213, including transforms using the common syntactic base. These transforms may be reusable. More than one transform may be invoked to convert a document from source semantics to target semantics. It may be desirable to utilize a common semantic base for transformations, for instance, transforming incoming documents to a well-understood document schema, such as the xCBL schema for electronic commerce documents 212. By transforming incoming documents to a common semantic base, the need for point-to-point transforms is minimized. The transforms may be chained and may be reusable. The transforms may be isomorphic or homomorphic. That is, the transforms need not be perfectly reversible. The transforms typically will be rated, either a priori or by comparing source and target semantics before and after transformation, to estimate the degree of loss resulting from the transform. A transform success score can be used to select among alternate sequences of transforms from source to target semantics. Loss resulting from transforms can be compensated for by including in the target document one or more fields that capture imperfectly translated information from the source document. These fields may be user viewable, so that a user associated with the source, the target or an intermediary service provider can respond to imperfections in the computer-implemented transformation service. Alternatively, the source document and target document can be sent to the target, with references to parts of the source document that have been imperfectly transformed or that are suspected of having been imperfectly transformed. These references can be part of the target document or a separate document, such as an error document. They can be a string, a pointer or some other form of reference. References can be provided to one or more sections of the target document where the imperfectly transformed information belongs. 
The references to the target document may be to an element or subsection of the target document or to a specific location within an element or subsection. In yet another embodiment, the target document and excerpts of the source document can be sent to the target, with references to the excerpts of the source document and, optionally, to the target document.
[0025] A commonly accessible registry, partially illustrated in
[0026] A commonly accessible registry can provide a so-called semantic hub. The commonly accessible registry may maintain service descriptions for the applications that provide services, such as electronic commerce services. Inbound and outbound document interfaces are registered as part of the service descriptions, preferably in the form of XSD definitions. A service is free to register multiple interfaces, for instance to support multiple versions of an electronic commerce document standard (e.g., xCBL 2.0, xCBL 3.0, or xCBL 3.5) or to support multiple document standards (e.g., xCBL, IDOC, OAG, or BAPI). The introduction of document family concepts provides a way to manage schemas and document types across document standards and standards versions, as well as custom systems. Document families group document types that represent the same business events. Transformation maps or transforms manage standard and custom logic to convert among document family members. A cost of using a particular transform may reflect imperfect translation of the document. Again, a transform success score can be associated with the transform either a priori, based on prior experience, or by dynamically comparing the semantic content of the document before and after application of the transform.
[0027] Maintaining transforms using XML as a common syntactic base is preferred, but not necessary. XML is a rich, self-describing data representation that facilitates declarative mapping logic. Several semantic bases, such as the xCBL component model, provide a consistent semantic base to harness XML's powerful semantics. Modeling of XML documents to a semantic registry facilitates reuse and rapid development of new transforms, thereby enhancing the value of existing transforms. Focusing on semantic mapping, with a common syntactic base and even a common semantic base, reduces the complexity of developing new transforms. Business analysts, instead of programmers, may be able to use transform-authoring tools to define XML-to-XML semantic conversions.
[0028] A document family, as illustrated in
[0029] A registry may subdivide schemas into namespaces, as illustrated in
[0030]
[0031]
[0032]
[0033] The namespace is linked to documents and document families, in this embodiment, through the document ID class 812. The document ID 812 may actually have two types of links to a namespace, one of which is the root namespace it belongs to, and the other of which is used for extension namespaces. This supports major versions and minor versions. A major version document ID may be a brand new version of a document that does not extend a previous version of a document. A minor version document ID may extend either a major or minor version document ID. A major version doc ID will only have a single namespace relationship, which references the namespace within which the root element is defined. A minor version doc ID references the super parent (major version) doc ID's namespace, along with any other namespaces within which any extensions exist. The document ID 812 may be associated with the document family 804, an external ID 805, document rule 813, a transformation map 823 and an XML document ID 822. Attributes of a document ID may include a name, a URI and a primary alternate URI. A URI is automatically generated for a doc ID using three components: namespace URI, doc ID name and doc ID version. This doc ID URI is used to refer to this doc ID. If a user desires a custom doc ID naming scheme, they may enter their own URI, and this is set in the primaryAltId relationship. Users may also have more than one naming scheme, in which case the otherIds relationship models these names. All these names should be unique. Attributes of a document ID may further include a display name, a description and a document version. All of these attributes may be maintained as strings. A specialization of document ID is XML document ID 822, for XML documents. Attributes of the specialization may include an XML element name, a version type, a bean class name and major and minor versions. 
As characteristic of XML, a relationship loop indicates that XML document IDs may represent nested elements. An external ID 805 may be associated with the document ID 812. The external ID 805 may be a registry key or an alias for a URI. Both a primary, default link and one or more user supplied aliases may link the document ID and external ID.
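The three-component URI generation described above might be sketched as follows. This is only an illustration: the source does not say how the namespace URI, doc ID name and doc ID version are combined, so the slash-separated layout, the function name `make_doc_id_uri` and the example namespace are all assumptions.

```python
from urllib.parse import quote


def make_doc_id_uri(namespace_uri: str, doc_id_name: str, doc_id_version: str) -> str:
    """Combine the three components into one doc ID URI.

    The slash-joined layout is an assumption made for illustration;
    the source only names the three components.
    """
    base = namespace_uri.rstrip("/")
    # Percent-encode name and version so they stay single path segments.
    name = quote(doc_id_name, safe="")
    version = quote(doc_id_version, safe="")
    return f"{base}/{name}/{version}"


# Hypothetical namespace; a real registry would supply its own.
uri = make_doc_id_uri("http://example.com/ns/po/", "PurchaseOrder", "3.5")
print(uri)
```

A user-supplied alternate name (the primaryAltId relationship) would simply bypass this function, provided the registry still checks that every name is unique.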
[0034] Document ID rules 813 may be sufficiently generalized to support transforms, validations, and display maps. Transforms 823, sometimes called transformation maps, are a specialization of the document ID rule 813. Logic implementing the transform is linked to a document ID rule 813 through a set of transform components 825. A transform component, in turn, is linked to an external file 827. Attributes of the transformation map 823 may include a cost or transform success score, a transformation URI and a location URI. The transformation URI uniquely identifies a transformation map within a registry. A location URI is an optional identifier that indicates where the transformation should take place. For example, if only one host within a network is capable of performing the transformation, its URI is assigned to the location URI attribute and the transformation/router will send the transformation to this host to be performed. Attributes of the transformation component 825 may include a transformation component URI, a name, description, component type, implementation file, package name and execution order. Transformation components 825 are linked as a set to the document ID rule 813. The execution order attribute confirms the sequence in which transforms are applied, if more than one transform is required. In this embodiment, transform logic may include one or more of an XSLT map, an XST map, a Java component, or a Contivo map. Transform components are linked to a set of configuration elements 826. Attributes of the configuration element may include a name and a value. Document ID rules 813 are also linked to a set of map context strings 814. These strings associate the document ID rule 813 with a particular trading party, either a sending/source or receiving/target party, or with a particular service or action, as described above in the context of
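The role of the execution order attribute can be illustrated with a small sketch. The class and function names are hypothetical, and each component's logic is reduced to a plain Python callable standing in for an XSLT, XST, Java or Contivo implementation.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TransformComponent:
    """Hypothetical stand-in for a transformation component 825."""
    name: str
    component_type: str          # e.g. "XSLT", "XST", "Java", "Contivo"
    execution_order: int
    apply: Callable[[str], str]  # placeholder for the real transform logic


def run_components(document: str, components: List[TransformComponent]) -> str:
    """Apply the components of one rule in ascending execution order."""
    for component in sorted(components, key=lambda c: c.execution_order):
        document = component.apply(document)
    return document


# The list is deliberately out of order; execution_order decides the sequence.
components = [
    TransformComponent("add-suffix", "Java", 2, lambda d: d + "!"),
    TransformComponent("upper", "XSLT", 1, str.upper),
]
result = run_components("po", components)
print(result)
```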
[0035] Logic to retrieve and execute transforms may conveniently be accessed through an XML transformation module (XTM), as illustrated in
[0036] The transformation may be identified in the inbound message 901, which may but preferably does not include the details of which transforms should be applied to accomplish the transformation. In
[0037] An ICD contained in the same envelope 901 as the message to be transformed may use the following schema to identify a required transformation:
TABLE-US-00001
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" attributeFormDefault="unqualified">
  <xs:element name="TransformationContract">
    <xs:annotation>
      <xs:documentation>Transformation Instructions</xs:documentation>
    </xs:annotation>
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Attachment" type="xs:boolean" minOccurs="0"/>
        <xs:element name="Transformation" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="Connector" type="xs:anyURI"/>
              <xs:element name="StartDocTypeName" type="xs:QName"/>
              <xs:element name="StartDocVersion" type="xs:string"/>
              <xs:element name="EndDocTypeName" type="xs:QName"/>
              <xs:element name="EndDocVersion" type="xs:string"/>
              <xs:element name="CommunityID" type="xs:string" minOccurs="0"/>
              <xs:element name="ComponentID" type="xs:string" maxOccurs="unbounded"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
[0038] An example of transformation instructions, according to the schema above, is:
TABLE-US-00002
<xs:element name="Transformation" minOccurs="0" maxOccurs="unbounded">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="Connector" type="xs:anyURI"/>
      <xs:element name="StartDocTypeName" type="xs:QName"/>
      <xs:element name="StartDocVersionID" type="xs:string"/>
      <xs:element name="EndDocTypeName" type="xs:QName"/>
      <xs:element name="EndDocVersionID" type="xs:string"/>
      <xs:element name="CommunityID" type="xs:string" minOccurs="0"/>
      <xs:element name="ComponentID" type="xs:string" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
In this example, the source document type is identified by StartDocTypeName and StartDocVersion. The StartDocTypeName should be a fully qualified document type, a QName in XML terms, including a namespace and local name of the root element for the document type. Alternatively, a unique naming convention could be used, with appropriate administrative provisions to enforce uniqueness within a relevant scope. A version identifier should be supplied to distinguish among variations of the same document. A customer may extend an address element within a purchase order, for instance, and the extensions will have a different minor version ID than the major version. EndDocTypeName and EndDocVersion identify the target document resulting from the transform. Community ID specifies the community where the transform is registered. Component ID is used to look up the transform logic, for instance via the transformation component 825.
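To make the fields concrete, the following sketch parses a hypothetical instance of a TransformationContract. The instance values (connector URN, `po` prefix, community and component IDs) are invented for illustration, and the element names follow the full schema listing (StartDocVersion, EndDocVersion). The parser uses only Python's standard `xml.etree.ElementTree` and performs no schema validation.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance document; all values are illustrative assumptions.
SAMPLE_CONTRACT = """<TransformationContract xmlns:po="urn:example:po">
  <Attachment>true</Attachment>
  <Transformation>
    <Connector>urn:example:connector:1</Connector>
    <StartDocTypeName>po:PurchaseOrder</StartDocTypeName>
    <StartDocVersion>3.0</StartDocVersion>
    <EndDocTypeName>po:PurchaseOrder</EndDocTypeName>
    <EndDocVersion>3.5</EndDocVersion>
    <CommunityID>example-community</CommunityID>
    <ComponentID>component-1</ComponentID>
    <ComponentID>component-2</ComponentID>
  </Transformation>
</TransformationContract>"""


def parse_contract(xml_text):
    """Return (attach_source, transformations) from a contract instance."""
    root = ET.fromstring(xml_text)
    attach_text = root.findtext("Attachment")
    # Attachment is optional; None means "use the default policy".
    attach = None if attach_text is None else attach_text.strip() in ("true", "1")
    transformations = []
    for t in root.findall("Transformation"):
        transformations.append({
            "connector": t.findtext("Connector"),
            "start": (t.findtext("StartDocTypeName"), t.findtext("StartDocVersion")),
            "end": (t.findtext("EndDocTypeName"), t.findtext("EndDocVersion")),
            "community_id": t.findtext("CommunityID"),  # optional, may be None
            "component_ids": [c.text for c in t.findall("ComponentID")],
        })
    return attach, transformations


attach, transformations = parse_contract(SAMPLE_CONTRACT)
print(attach, transformations[0]["start"], transformations[0]["end"])
```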
[0039] One implementation of an ICD specifying the target's preference to receive (or not) the original, source document in addition to the transformed target document is expressed in the following schema excerpt:
TABLE-US-00003
<xs:element name="TransformationContract">
  <xs:annotation>
    <xs:documentation>Transformation Instructions</xs:documentation>
  </xs:annotation>
  <xs:complexType>
    <xs:sequence>
      <xs:element name="Attachment" type="xs:boolean" minOccurs="0"/>
The attachment tag will indicate whether the original, source document should be attached or not. A default, in the absence of this element, may either be to attach the document or not to attach it.
[0040]
[0041] More detail regarding computation of transform sequences using both source and target registries of transforms is provided in flowchart
[0042]
[0043] Referring to the flow chart in
[0044] The calculation of alternative transform sequences and preferred transform sequences may operate in different environments. The following use cases illustrate some of these environments. In the first use case, no transformation is required. The module for determining a transform sequence is invoked, but the source and target documents are the same type, so no transformation is required. In the second use case, no transformation is available between source and target. This may be the case when no transform sequence can be calculated between differing source and target documents, or when the transformation policy permits no transforms and the source and target documents differ, or when only a lossless transformation is accepted but all calculated transform sequences are lossy, as indicated by their transform success scores. An operating exception occurs. In the third use case, the source and target are in the same community, so only one transform registry is queried and a valid path exists. One or more transform sequences are determined. A preferred sequence is determined. In a fourth use case, the source and target are in separate communities and a valid path exists. Two transform registries are queried. As in the third case, one or more transform sequences are determined and a preferred sequence is determined.
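The use cases above can be illustrated with a minimal sketch that treats the registry as a directed graph of document versions. The graph contents, the names `find_sequences` and `preferred_sequence`, the depth-first enumeration and the multiplicative composite score are all assumptions chosen for illustration; the source does not prescribe a particular search algorithm.

```python
from typing import Dict, List, Tuple

# node -> list of (target version, transform name, success score in [0, 1])
Graph = Dict[str, List[Tuple[str, str, float]]]


class NoTransformAvailable(Exception):
    """Raised for the second use case: no acceptable sequence exists."""


def find_sequences(graph: Graph, source: str, target: str):
    """Enumerate all cycle-free transform sequences from source to target."""
    results = []

    def walk(node, visited, path):
        if node == target:
            results.append(list(path))
            return
        for nxt, name, score in graph.get(node, []):
            if nxt not in visited:
                path.append((name, score))
                walk(nxt, visited | {nxt}, path)
                path.pop()

    walk(source, {source}, [])
    return results


def composite(sequence) -> float:
    """Multiplicative composite success score of one sequence."""
    total = 1.0
    for _, score in sequence:
        total *= score
    return total


def preferred_sequence(graph: Graph, source: str, target: str, lossless_only=False):
    if source == target:
        return []  # first use case: nothing to transform
    candidates = find_sequences(graph, source, target)
    if lossless_only:
        # Assumes a lossless transform carries a score of exactly 1.0.
        candidates = [s for s in candidates if composite(s) == 1.0]
    if not candidates:
        raise NoTransformAvailable(f"{source} -> {target}")
    return max(candidates, key=composite)


# Hypothetical registry contents.
graph = {
    "xCBL-2.0": [("xCBL-3.0", "t_20_30", 0.9), ("xCBL-3.5", "t_20_35", 0.7)],
    "xCBL-3.0": [("xCBL-3.5", "t_30_35", 1.0)],
}
best = preferred_sequence(graph, "xCBL-2.0", "xCBL-3.5")
print([name for name, _ in best])
```

For the fourth use case, one possibility (again an assumption) is to merge the edges returned by the two registries into a single graph before running the same search.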
[0045] Transform success scores, as described above, can be determined a priori, by experience or dynamically, or, more generally, by any metric of a lossy semantic transform. An a priori score is assigned to a transform based on some combination of analysis and tests. The score does not change with experience. An experience-based score may begin with an a priori score or a default score, and be adjusted with experience. For instance, methods of dynamically computing success, explained below, can be applied for selected transforms that are used, and the corresponding transform success score updated, for instance as a weighted or moving average, either discarding an oldest historical success score or assigning relative weights to past and present success scores. One approach to dynamically determining success scores is to apply a transform to the candidate document and analyze the transformed document. The transform is applied to the source or intermediate source document, producing a target or intermediate target document. The content of elements (in an XML or similar document) is listed for source and target documents, for instance in a frequency table. Discrepancies between the source and target frequencies reduce the transform success score, regardless of whether the difference is positive or negative. The discrepancies optionally are reported. The success score can depend on exact matches between element contents, or may be weighted by degree. The following example helps illustrate this approach to dynamic scoring. The source document fragment is:
TABLE-US-00004
<NameAddress>
  <Name>Pikachu Pokemon</Name>
  <Address1>125 Henderson Drive</Address1>
  <City>Pleasanton</City>
  <State>CA</State>
</NameAddress>
The transformed target document fragment is:
TABLE-US-00005
<NameAddress>
  <Name>Pikachu Pokemon</Name>
  <Street>Henderson Drive</Street>
  <HouseNumber>125</HouseNumber>
  <City>Pleasanton</City>
  <State>CA</State>
</NameAddress>
[0046] A frequency comparison, based on elements of the source document fragment and keyed to exact matches would be:
TABLE-US-00006
Content               Source Doc frequencies   Target Doc frequencies
Pikachu Pokemon       1                        1
125 Henderson Drive   1                        0
Pleasanton            1                        1
CA                    1                        1
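The frequency comparison above can be reproduced with a short sketch. The scoring rule used here (one minus the summed absolute frequency differences, keyed to source contents and divided by the number of source fields) is one assumed reading of "discrepancies reduce the score regardless of sign"; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

SOURCE = """<NameAddress>
  <Name>Pikachu Pokemon</Name>
  <Address1>125 Henderson Drive</Address1>
  <City>Pleasanton</City>
  <State>CA</State>
</NameAddress>"""

TARGET = """<NameAddress>
  <Name>Pikachu Pokemon</Name>
  <Street>Henderson Drive</Street>
  <HouseNumber>125</HouseNumber>
  <City>Pleasanton</City>
  <State>CA</State>
</NameAddress>"""


def content_frequencies(xml_text: str) -> Counter:
    """Count how often each non-empty element content occurs."""
    root = ET.fromstring(xml_text)
    return Counter(
        el.text.strip() for el in root.iter() if el.text and el.text.strip()
    )


def verbatim_success_score(source_xml: str, target_xml: str) -> float:
    """Score keyed to source contents; any frequency mismatch is penalized."""
    src = content_frequencies(source_xml)
    tgt = content_frequencies(target_xml)
    total = sum(src.values())
    if total == 0:
        return 1.0  # nothing to lose in an empty source
    penalty = sum(abs(count - tgt[content]) for content, count in src.items())
    return max(0.0, 1.0 - penalty / total)


score = verbatim_success_score(SOURCE, TARGET)
print(score)  # 0.75 for the fragments above
```

Only "125 Henderson Drive" is missing from the target, so one of four source fields is penalized. A partial-match variant would instead compare tokens within fields.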
[0047] A dynamic transform success score corresponding to the fraction of fields in the source document that appear verbatim as fields in the target document could be assigned to this example, expressed either as a success of 75 percent or as a cost of 25 percent. A different score would be assigned if partial matches counted, as the house number element of the target document matches one token of the address1 element of the source document. Alternatively, the success score could correspond to the fraction of the text in fields of the source document that appears verbatim in fields of the target document. Application of a sequence of transforms requires calculation, for some purposes, of an aggregate success score. When individual scores are combined into an aggregate transform success score, the combination may be additive, averaged or multiplicative. The method of constructing an aggregate transform success score may take into account the number of transforms in sequence, as in the multiplicative combination of success scores, or may accumulate (without compounding) the errors, as in the additive combination of costs. For instance, in the multiplicative combination, if the transforms are T1, T2 and T3, loss percentages can be calculated for each of the three and combined as (1 - T1) * (1 - T2) * (1 - T3), where each term Tn stands for the loss of the corresponding transform. More generally, an aggregate transform success score may be any metric of a sequence of transforms resulting in a lossy transformation from source to target document.
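The three ways of combining scores can be compared in a brief sketch. The loss values are invented for illustration, and the function names are hypothetical; each loss is expressed as a fraction between 0 and 1.

```python
from math import prod
from typing import Sequence


def multiplicative_success(losses: Sequence[float]) -> float:
    """Compound the per-transform successes: (1 - T1) * (1 - T2) * ..."""
    return prod(1.0 - loss for loss in losses)


def additive_cost(losses: Sequence[float]) -> float:
    """Accumulate the per-transform costs without compounding."""
    return sum(losses)


def averaged_success(losses: Sequence[float]) -> float:
    """Average the per-transform success scores."""
    if not losses:
        return 1.0  # an empty sequence loses nothing
    return sum(1.0 - loss for loss in losses) / len(losses)


# Hypothetical losses for transforms T1, T2 and T3.
losses = [0.25, 0.0, 0.5]
print(multiplicative_success(losses))  # 0.75 * 1.0 * 0.5 = 0.375
print(additive_cost(losses))           # 0.25 + 0.0 + 0.5 = 0.75
print(averaged_success(losses))        # (0.75 + 1.0 + 0.5) / 3 = 0.75
```

The multiplicative form penalizes longer sequences more heavily, since every additional lossy step shrinks the product, whereas the averaged form does not depend on sequence length.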
[0048] User interfaces for administering document family information and for searching for transforms are illustrated in
[0049] While the present invention is disclosed by reference to the preferred embodiments and examples detailed above, it is understood that these examples are intended in an illustrative rather than in a limiting sense. Computer-assisted processing is implicated in the described embodiments. Accordingly, the present invention may be embodied in methods for computer-assisted processing, systems including logic to carry out transform processing, media impressed with logic to carry out transform processing, data streams impressed with logic to carry out transform processing, or computer-accessible transform processing services. It is contemplated that modifications and combinations will readily occur to those skilled in the art, which modifications and combinations will be within the spirit of the invention and the scope of the following claims.