Computer implemented method for quantifying the relevance of documents
11709871 · 2023-07-25
Abstract
A computer system comprising a processor, graphical output means and a computer readable storage medium storing instructions that when executed by the processor cause the processor to perform a method for quantifying and aggregating the relevance of documents.
Claims
1. A computer implemented method comprising: assigning documents to one or more document families, each document family comprising one or more documents, wherein each document is selected from a group consisting of a patent document and a patent application document; calculating, for each document family, a document family coverage score DFCS, the document family coverage score being indicative of the validity of the document family in a category, whereby the validity is calculated from one or more first properties of each document belonging to said document family; calculating, for each document family, a document family linkage score DFLS, said document family linkage score being calculated by finding one or more document links, each document link connecting a source document to a destination document, each destination document belonging to said document family, each source document belonging to another document family; finding one or more document family links, whereby each document family link connects a source document family with said document family, said document family acting as destination document family, whereby the existence of each document family link is derived from the one or more found document links and wherein the DFLS is derived from the existence and weight of the one or more found document family links; and calculating, for each document family, a document family combined relevance score DFCR by multiplying the document family coverage score DFCS and the document family linkage score DFLS having been calculated for each document family; grouping document families into one or more portfolios, each portfolio comprising one or more document families; and densely displaying, for each document portfolio, an aggregated view in which a plurality of data values are displayed in a summarized form on a graphical user interface with the summarized form providing a visualization of relationships between all documents in the document portfolio, the aggregated view comprising or being derived from one or more aggregated score values, the one or more aggregated score values being calculated by applying an aggregating function on the DFCR, the DFLS, or the DFCS value of the one or more document families of said portfolio.
2. The computer implemented method according to claim 1, wherein the categories are geographic territories and the first properties are countries.
3. The computer implemented method according to claim 2, wherein the DFCS of each document family is calculated by summing up weights assigned to each document of the document family, each weight w.sub.c being indicative of a significance of the country c.
4. The computer implemented method according to claim 1, wherein each weight w.sub.c is multiplied with a value being indicative of a significance of the country c and the DFCS of each document family b at a sheet date is calculated as DFCS (document family b)=Σ([w.sub.c*GNI.sub.c]/GNI.sub.REF), wherein w.sub.c is a country specific weight of country c, country c having been assigned to the document; wherein Σ indicates the sum over all documents of a document family and for all countries c considered; wherein GNI.sub.c is a parameter being indicative of a significance of country c; and wherein GNI.sub.REF is a reference parameter being indicative of a significance of a reference country REF.
5. The computer implemented method according to claim 1, wherein each value being indicative of a significance of a country can be replaced by a user-specific value, and wherein a reference parameter GNI.sub.REF can be selected or specified by the user via the graphical user interface, wherein the weight w.sub.c is indicative of a legal status of the document, wherein said legal status is selected from the group consisting of a valid patent status, an expired status and a pending legal status, wherein a patent document has valid patent status in a country if the granting date of the patent <= sheet date < date of expiration of the patent, wherein the document has pending legal status in a country if: the date of filing the document is <= sheet date, and if sheet date is < date of expiration of the document; and if the granting date > sheet date; wherein the document has expired status in a country if sheet date >= expiration date or wherein sheet date < date of filing of the document; and wherein the weight w.sub.c for pending status is a score value indicating the probability that a patent will be granted for the document.
6. The computer implemented method according to claim 1, where the document links are weighted and are indicative of citations of prior art patent documents, the method further comprising the steps: calculating, for each document link, a document linkage weight a, the document linkage weight being a quality measure of the document link; calculating, for each document family link, a document family linkage weight β, the document family linkage weight β being a derivative of the document linkage weights a of all document links connecting source documents of one source document family with destination documents of one destination document family; calculating, for each destination document family, an aggregate value y as a derivative of all document family linkage weights β of all document family links connecting a source document family with the destination document family; and returning the calculated aggregate value y as DFLS value.
7. The computer implemented method according to claim 6, wherein the document linkage weight a is selected from the group comprising: a patent office specific quality value, said patent office specific quality value being indicative of the quality of the citations issued by the patent office, wherein the document link quality value is inversely proportional to the average number of cited documents of said patent office; a patent examiner specific quality value, said patent examiner specific quality value being indicative of the quality of the citations issued by the patent examiner, wherein the document link quality value is inversely proportional to the average number of cited documents of said patent examiner; a citing authority specific quality value, said citing authority specific quality value being indicative of the authority having cited a particular document, said authority being in particular an inventor, an examiner or a 3rd party; a citation category of the destination document; a property of the destination document, said property being indicative of the relevance of said destination document to the user; a property of the source document, said property being indicative of the relevance of said source document to the user; a quality value being derived from the technology field of the source document, said quality value being inversely proportional to the average number of documents cited by a document having assigned said technology field; and a quality value being derived from the technology field of the source document and the technology field of the destination document, said quality value being derived from a predefined or dynamically calculated similarity score, the similarity score being indicative of the similarity of the technology field of the source document and the technology field of the destination document.
8. The computer implemented method according to claim 6, wherein each document family linkage weight β.sub.DFSource,DFDest is equal to the maximum document linkage weight MAX(a.sub.ALL); the average document linkage weight AVG(a.sub.ALL); the median document linkage weight MEDIAN(a.sub.ALL); the summed-up document linkage weight SUM(a.sub.ALL); or the logarithmic document linkage weight being calculated as ln(N+a.sub.AGG) or log(N+a.sub.AGG), wherein N is a natural integer >0, wherein a.sub.ALL represents all document linkage weights of all document links connecting source documents belonging to the document family DF.sub.Source with destination documents belonging to the destination document family DF.sub.Dest and wherein a.sub.AGG represents a data value having been calculated by aggregating all of said document linkage weights a.sub.ALL.
9. The computer implemented method according to claim 6, wherein the documents are patent documents, wherein the document links are citations, and wherein the document linkage weight a.sub.d1,d2 is determined for each document link by: determining the average number of prior art citations CS.sub.o,z issued by a patent office o per patent document and per time period z; calculating for each document link the document linkage weight a.sub.d1,d2 as a.sub.d1,d2=1/CS.sub.o,z, wherein o indicates the patent office issuing the citation, the citation corresponding to the document link to be weighted, and wherein z indicates the time period z in which the citation was issued by the patent office.
10. The computer implemented method according to claim 6, wherein the step of calculating the aggregate value y.sub.DFDest comprises in addition the execution of a normalization step, the normalization step comprising: calculating, for each time period z of a set of time periods z.sub.l, . . . z.sub.k an intermediate value X1.sub.z, the intermediate value X1.sub.z being the arithmetic mean of the aggregate value y of all document families whose status depends on a date lying within the time period z, wherein the date is selected from the group comprising the publication date of the earliest published document belonging to the document family; the priority date of the patent family; the filing date of the earliest filed patent document belonging to the document family; and the earliest date of receiving patent protection for any of the patent documents belonging to the document family; determining a normalized aggregated value δ.sub.DFDest of each document family DF.sub.Dest whose status depends on a date lying within the time period z, wherein δ.sub.DFDest=y.sub.DFDest/X1.sub.z; returning δ.sub.DFDest as DFLS value of document family DF.sub.Dest.
11. The computer implemented method according to claim 10, wherein the normalization is executed in addition in respect to at least one field f, the method further comprising the steps: determining one or more fields fl, . . . , fv having been assigned to the one or more document families; calculating, for each field fl, . . . , fv and for each time period zl, . . . zk an intermediate X2TF.sub.f,z value, the intermediate X2TF.sub.f,z value being calculated as the average of all normalized aggregate values δ.sub.DFDest,f,z of all document families DF.sub.Dest,f,z having been assigned to field f and whose status depends on the same kind of date, the date lying within the time period z; calculating, for each destination document family DF.sub.Dest, an intermediate value X2DF.sub.Dest, wherein X2DF.sub.Dest=ø(X2TF.sub.fl,z, . . . , X2TF.sub.fm,z), whereby the intermediate values X2TF.sub.fl,z, . . . , X2TF.sub.fm,z are intermediate values having been calculated for each field fl, . . . , fm, the fields fl, . . . , fm each having been assigned to the document family DF.sub.Dest; calculating the DFLS value for each document family DF.sub.Dest by dividing δDF.sub.Dest by X2.sub.DFDest.
12. The computer implemented method according to claim 6, further comprising the steps: determining one or more fields f.sub.l, . . . , f.sub.v having been assigned to the one or more document families; calculating, for each field f.sub.l, . . . , f.sub.v and for each time period zl, . . . zk an intermediate X2BTF.sub.f,z value, the intermediate X2BTF.sub.f,z value being calculated as the average of all aggregate values y.sub.DFDest,f,z of all document families DF.sub.Dest,f,z having assigned the field f and whose status depends on the same kind of date, the date lying within the time period z; calculating, for each destination document family DF.sub.Dest, an intermediate value X2BDF.sub.Dest, wherein X2BDF.sub.Dest=ø(X2BTF.sub.fl,z, . . . , X2BTF.sub.fm,z), whereby the intermediate values X2BTF.sub.fl,z, . . . , X2BTF.sub.fm,z are intermediate values having been calculated for each field fl, . . . , fm, the fields fl, . . . , fm each having been assigned to the document family DF.sub.Dest; calculating the DFLS value for each document family DF.sub.Dest by dividing y.sub.DFDest by X2BDF.sub.Dest.
13. The computer implemented method according to claim 1, wherein the aggregated score value is selected from a group comprising: a field share value FSH, the field share value being calculated for said portfolio for one field f, whereby a field is a property of a document family and wherein each document family has assigned at least one field, the field share value FSH being calculated for said field f by: calculating a first sum as the sum of all DFCR values of all document families having assigned said field f and belonging to said portfolio; calculating a second sum as the sum of all DFCR values of all document families having assigned said field f and belonging to a superset of document families, said superset of document families comprising said portfolio; calculating the ratio of the first and the second sum and using said ratio as field share value FSH; a portfolio size PSI, wherein the portfolio size of each portfolio is calculated as the number of document families within the portfolio having a DFCS value larger than 0; a portfolio strength PST, wherein the portfolio strength of each portfolio is calculated as the sum of the DFCR score values of all document families within the portfolio; a portfolio linkage score PLS, wherein the portfolio linkage score is calculated for each portfolio as the average of the DFLS values of all document families within the portfolio having a document family coverage score value larger than 0; a portfolio coverage score PCS, wherein the portfolio coverage is calculated for each portfolio as the average of the document family coverage scores of all document families within the portfolio having a document family coverage score value larger than 0.
14. The computer implemented method according to claim 1, wherein document families sharing one or more first or second property values or value ranges are grouped into the same portfolio, said first or second properties being selected from the group comprising: a technology field; a business field; a company owning the document; a document type; a document kind code; an organizational subunit of a company owning or creating the document; a branch of a company owning the document; a geographic region of origin or validity of the document; a status of the document; a patent office; a publisher or journal; the topic of the text of the document; a patent examiner; a time period; an IPC-class or sub-class; a bibliographic feature such as the name of an author or an inventor; and a feature having been determined by a clustering algorithm applied on the document data objects, wherein via each of said first or second properties one or more document portfolios can be specified upon which the aggregating function can be applied.
15. The computer implemented method according to claim 14, wherein the document families within each of the one or more document portfolios are iteratively grouped into second-, third-, fourth- or nth-order document-family sub-sets, thereby building a hierarchy of document-family sub-sets; wherein the first or second property shared by the document families within each document-family sub-set is different in each level of the hierarchy of document-family sub-sets; and wherein an aggregated score value is calculated for any document family sub-set of the document family sub-set hierarchy.
16. The computer implemented method according to claim 15, wherein the step of displaying, for each document portfolio, an aggregated score value further comprises the steps: providing the user with means to select a document-family sub-set at an arbitrary level of the hierarchy of document family sub-sets; and displaying, via the graphical user interface, the document families or documents contained within the selected sub-set of document families, the displayed documents or document families being ranked according to any of the document family score values DFCR, DFLS, DFCS or derivatives thereof.
17. The computer implemented method according to claim 1, where the document links are weighted and are indicative of citations of prior art patent documents.
18. A computer implemented method comprising: assigning documents to one or more document families, each document family comprising one or more documents; calculating, for each document family, a document family coverage score DFCS, the document family coverage score being indicative of the validity of the document family in a category, wherein the validity is calculated from one or more first properties of each document belonging to said document family; calculating, for each document family, a document family linkage score DFLS, said document family linkage score being calculated by finding one or more document links, each document link connecting a source document to a destination document, each destination document belonging to said document family, each source document belonging to another document family; finding one or more document family links, wherein each document family link connects a source document family with said document family, said document family acting as destination document family, wherein the existence of each document family link is derived from the one or more found document links and wherein the DFLS is derived from the existence and weight of the one or more found document family links; calculating, for each document family, a document family combined relevance score DFCR by multiplying the document family coverage score DFCS and the document family linkage score DFLS having been calculated for each document family; grouping document families into one or more portfolios each comprising one or more document families; densely displaying, for each document portfolio, an aggregated view in which a plurality of data values are displayed in a summarized form on a graphical user interface with the summarized form providing a visualization of relationships between all documents in the document portfolio, the aggregated view comprising or being derived from one or more aggregated score values, the one or more aggregated score values being calculated by applying an aggregating function on the DFCR, the DFLS, or the DFCS value of the one or more document families of said portfolio, wherein the documents are patent documents or patent applications, and wherein the calculation of the relevance score for each document family further comprises calculating the DFLS value of a first document family whose status depends on a date lying within a time period zx, the time period zx being younger than a threshold time value, by calculating an average DFLS value of all DFLS values having been calculated for one or more second document families of the same portfolio.
19. A non-transitory computer readable storage medium containing instructions that when executed by a processor cause the processor to perform operations comprising: assigning documents to one or more document families, each document family comprising one or more documents, wherein each document is selected from a group consisting of a patent document and a patent application document; calculating, for each document family, a document family coverage score DFCS, the document family coverage score being indicative of the validity of the document family in a category, whereby the validity is calculated from one or more first properties of each document belonging to said document family; calculating, for each document family, a document family linkage score DFLS, said document family linkage score being calculated by finding one or more document links, each document link connecting a source document to a destination document, each destination document belonging to said document family, each source document belonging to another document family; finding one or more document family links, whereby each document family link connects a source document family with said document family, said document family acting as destination document family, whereby the existence of each document family link is derived from the one or more found document links and wherein the DFLS is derived from the existence and weight of the one or more found document family links; and calculating, for each document family, a document family combined relevance score DFCR by multiplying the document family coverage score DFCS and the document family linkage score DFLS having been calculated for each document family; grouping document families into one or more portfolios, each portfolio comprising one or more document families; and densely displaying, for each document portfolio, an aggregated view in which a plurality of data values are displayed in a summarized form on a graphical user interface with 
the summarized form providing a visualization of relationships between all documents in the document portfolio, the aggregated view comprising or being derived from one or more aggregated score values, the one or more aggregated score values being calculated by applying an aggregating function on the DFCR, the DFLS, or the DFCS value of the one or more document families of said portfolio.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) In the following, embodiments of the invention are described by way of example only, with reference to the drawings.
(13) After the portfolio benchmarking method has been started in step 100, document families are defined in step 101 by assigning documents that have one or more properties in common, e.g. that refer to the same invention, to the same document family. On the data object level, this step implies connecting document data objects of the same document family to each other, e.g. by adapting the values of document data object attributes or by creating entries in association tables of databases. The criterion according to which documents are assigned to document families depends on the type of documents. In case the documents are patent documents, the patent documents referring to the same third patent document as priority document or referring to each other as priority documents are grouped into one document family, here called patent family. All patent documents of a patent family represent the same invention. In step 104, document families whose documents share a particular property are grouped into portfolios. For example, if all documents are patent documents and all document families are patent families, document families may be grouped into portfolios if they share the same owner, here referred to as patent holder, usually a company. The owner may be a person, a company or any other institution and is, according to a preferred embodiment, derived from properties of the document data objects assigned to the document family. In case the documents are patents, each document may comprise information on the applicant, usually a company, holding the patent. In step 102, the validity of each document is examined. This step comprises testing whether the meta-information of the document data object comprises sufficient and consistent data, e.g. on the legal status of a document in a country or other pieces of data which may be of relevance in succeeding processing steps. According to a preferred embodiment of the invention, documents that are not patents or patent applications in the strict meaning of the word, e.g. utility patents and utility patent applications, are filtered out in this step. In addition, patent documents issued by patent offices providing only insufficient data on the legal status may be filtered out here.
(14) In step 103, the DFCS value is calculated for each document family (DF) which will be explained in greater detail by
(15) In step 113, one or multiple aggregate relevance scores, e.g. the portfolio size PSI, the portfolio strength PST, the field share FSH, the portfolio linkage score PLS or the portfolio coverage score PCS, are calculated on the DFCS, DFLS and DFCR score values of all document families within a portfolio. A portfolio may comprise the totality of document families available and managing to pass the validity check in step 102 or any document family sub-set thereof. According to preferred embodiments of the invention, each portfolio comprises all document families being owned by the same person or company.
(16) According to embodiments of the invention, one or multiple of the following aggregate score values are calculated: The portfolio size PSI is calculated in step 107 and represents the total number of document families of a portfolio having a DFCS value greater than 0. The portfolio strength PST is calculated in step 108 as the sum of the DFCR values of all document families of a portfolio. The field share FSH is calculated in step 109 as the ratio of the sum of the DFCR score values of all document families of a portfolio and the sum of the DFCR score values of a superset of document families, whereby only document families having assigned a particular field of interest are considered. According to said embodiment, the field share FSH measures what share of the proprietary technology of the industry is owned by a certain company. It can be calculated as the share of the Patent Portfolio Strength of a company in the total Patent Portfolio Strength of all companies in the industry. Depending on the embodiment, the FSH value can also be calculated as the share of the Patent Portfolio Strength of a company in a particular technology field in relation to the total Patent Portfolio Strength of all patent families in that technology field. It can also be calculated as a share of a PST value of an arbitrary sub-portfolio derived by grouping patent families according to e.g. some criteria A and B compared to a total PST value of a portfolio derived by grouping patent families according to e.g. criteria A. The portfolio linkage score PLS is calculated in step 110 as the average of the DFLS values of all document families of the portfolio with a DFCS value greater than 0. The portfolio linkage score is indicative of the relevance of a portfolio. The portfolio coverage score PCS is calculated in step 111 as the average of all DFCS values of all document families of the portfolio with a DFCS value greater than 0.
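By way of illustration only, the aggregate score values PSI, PST, PLS, PCS and FSH described above may be sketched as follows. The `Family` record, its attribute names and the function names are assumptions made for this sketch and do not appear in the described embodiments; returning 0.0 for an empty selection or an empty superset is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Family:
    """Hypothetical record holding the scores of one document family."""
    family_id: str
    dfcs: float                      # document family coverage score
    dfls: float                      # document family linkage score
    fields: frozenset = frozenset()  # fields assigned to the family

    @property
    def dfcr(self) -> float:
        # Combined relevance score: product of coverage and linkage score.
        return self.dfcs * self.dfls


def _mean(values):
    values = list(values)
    # Assumption: an empty selection yields 0.0 instead of raising an error.
    return sum(values) / len(values) if values else 0.0


def portfolio_size(portfolio):
    # PSI: number of families with a DFCS value greater than 0.
    return sum(1 for f in portfolio if f.dfcs > 0)


def portfolio_strength(portfolio):
    # PST: sum of the DFCR values of all families of the portfolio.
    return sum(f.dfcr for f in portfolio)


def portfolio_linkage_score(portfolio):
    # PLS: average DFLS over families with a DFCS value greater than 0.
    return _mean(f.dfls for f in portfolio if f.dfcs > 0)


def portfolio_coverage_score(portfolio):
    # PCS: average DFCS over families with a DFCS value greater than 0.
    return _mean(f.dfcs for f in portfolio if f.dfcs > 0)


def field_share(portfolio, superset, field):
    # FSH: DFCR sum of the portfolio divided by the DFCR sum of the
    # superset, considering only families having assigned the given field.
    own = sum(f.dfcr for f in portfolio if field in f.fields)
    total = sum(f.dfcr for f in superset if field in f.fields)
    return own / total if total else 0.0
```

In this sketch the superset passed to `field_share` is expected to contain the portfolio itself, so that the resulting share lies between 0 and 1.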
(17) Finally, the end of the benchmarking method is reached in step 112.
(18)
(19) After starting the definition of patent families in step 200, a list of documents describing the same invention (patent documents, according to the described embodiment) is created in step 201. Two documents describe the same invention and are assigned to one patent family if a) both documents share at least one priority document, which is checked by testing whether the ID and the date of priority of the priority document referred to by both documents are identical, or b) one document cites the other document as priority document.
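By way of illustration only, the two assignment rules a) and b) may be sketched with a union-find structure. The input layout (a mapping from document ID to a set of priority references) and the function name are assumptions of this sketch. The sketch also assumes that family membership is transitive, i.e. that documents connected through a chain of shared priorities end up in one family, which the described embodiment does not state explicitly.

```python
from collections import defaultdict


def build_families(priorities):
    """Group documents into families.

    `priorities` maps each document ID to a set of
    (priority_doc_id, priority_date) tuples (hypothetical layout).
    """
    parent = {doc: doc for doc in priorities}

    def find(doc):
        while parent[doc] != doc:
            parent[doc] = parent[parent[doc]]  # path halving
            doc = parent[doc]
        return doc

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_claimant = {}  # (priority ID, priority date) -> first document seen
    for doc, refs in priorities.items():
        for prio_id, prio_date in refs:
            # Rule b): the document names another examined document
            # as its priority document.
            if prio_id in parent and prio_id != doc:
                union(doc, prio_id)
            # Rule a): two documents refer to a priority document with
            # identical ID and identical priority date.
            key = (prio_id, prio_date)
            if key in first_claimant:
                union(doc, first_claimant[key])
            else:
                first_claimant[key] = doc

    families = defaultdict(set)
    for doc in priorities:
        families[find(doc)].add(doc)
    # Sorted output keeps the result deterministic.
    return sorted(sorted(members) for members in families.values())
```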
(20) In step 202, the document families are filtered and only those patent families are kept which comprise at least one patent document meeting a list of quality criteria. Said at least one patent document must: a) represent a patent document in the narrow sense of the word, including patents and patent applications but excluding utility patents and utility patent applications; and b) have been published not earlier than Jan. 1, 1970.
(21) According to a preferred embodiment of the invention, all documents of the resulting filtered patent families remain in the database irrespective of whether the documents individually meet the quality criteria.
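By way of illustration only, the family filter of step 202 may be sketched as follows. The dictionary keys `kind` and `published` and the kind labels are hypothetical names chosen for this sketch. A family is kept as soon as one of its documents meets both criteria, and a kept family retains all of its documents.

```python
from datetime import date

EARLIEST_PUBLICATION = date(1970, 1, 1)
# Hypothetical labels for patent documents in the narrow sense.
ACCEPTED_KINDS = {"patent", "patent_application"}


def meets_quality_criteria(doc):
    # Criterion a): patent or patent application, no utility patents.
    # Criterion b): published not earlier than Jan. 1, 1970.
    return (doc["kind"] in ACCEPTED_KINDS
            and doc["published"] >= EARLIEST_PUBLICATION)


def filter_families(families):
    """Keep each family that has at least one qualifying document.

    `families` maps a family ID to a list of document dictionaries.
    Kept families are returned with all their documents, including
    those that individually fail the criteria.
    """
    return {
        family_id: docs
        for family_id, docs in families.items()
        if any(meets_quality_criteria(doc) for doc in docs)
    }
```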
(22) The definition of document families, here described for the case of patent families, ends with step 203.
(23)
(24) In step 301, the validity of all documents DOC of the document family b, here a patent family, is determined for all countries c for all sheet dates of interest according to the following rules:
(25) In case the first date of filing DOC in a country c happened earlier than sheet date and if sheet date is earlier than the date of expiration of the patent in country c, then a document DOC is considered as valid in country c. As a result, document family b comprising DOC is also considered as valid in country c.
(26) A list of sheet dates of interest may, for example, comprise December 31 of each of the years 1998 to 2003.
(27) Each country c is assigned a weighting factor w.sub.c, for each document DOC of document family b which is calculated as follows: w.sub.c is 0, if the sheet date is later or identical to the date of expiration of the patent which was granted in country c based on DOC. w.sub.c is 0, if the sheet date is earlier than the first date of filing DOC. w.sub.c is 0.7, if the first date of filing DOC is earlier than or equal to sheet date and sheet date is earlier than the date of expiration of the property right based on DOC in country c and sheet date is earlier than the day the patent is granted. w.sub.c is 1, if grant date of DOC is earlier than or equal to sheet date and sheet date is earlier than the expiration date of the patent granted on DOC in country c.
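By way of illustration only, the four rules for the weighting factor w.sub.c may be sketched as a single function. The function and parameter names are assumptions of this sketch, as is the convention that a missing grant date (`None`) denotes a document for which no patent has been granted yet.

```python
from datetime import date

PENDING_WEIGHT = 0.7   # filed, not yet granted, not expired
GRANTED_WEIGHT = 1.0   # granted and not expired


def status_weight(sheet_date, filing_date, expiration_date, grant_date=None):
    """Return w_c for one document in one country at the given sheet date."""
    # w_c is 0 before the first filing and from the expiration date onwards.
    if sheet_date < filing_date or sheet_date >= expiration_date:
        return 0.0
    # w_c is 1 once the grant date is earlier than or equal to the sheet date.
    if grant_date is not None and grant_date <= sheet_date:
        return GRANTED_WEIGHT
    # Otherwise the application is pending at the sheet date.
    return PENDING_WEIGHT
```

Evaluating the function for each sheet date of interest yields the time series of weights on which the subsequent country weighting operates.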
(28) In the next step 303, the weighting factors w.sub.c of each country c and document DOC are further weighted according to the impact of this country. According to a preferred embodiment of the invention, this weighting is done by multiplying the weighting factor obtained for a particular country c in the previous step, which is either 0, 0.7 or 1, by a country specific weight indicating the significance of the country, e.g. its gross national income GNI. The obtained value is divided by the GNI of a reference country, e.g. the GNI of the USA, to obtain a relative, country specific weight of the impact of the invention in a particular country c in relation to a patent filed or granted in the USA:
wp.sub.c=[w.sub.c*GNI.sub.c]/GNI.sub.USA
(29) The GNI figures represent external data and are derived according to preferred embodiments of the invention on an annual basis from the World Bank. According to further embodiments of the invention, said global economic key figures are replaced by figures which better represent the economic impact of a country in respect to a particular business or technology field, e.g. sales figures of the pharmaceutical industries or of automobile manufacturers.
(30) The final DFCS value for patent family b is calculated by summing up for all countries c the weighted factors wp.sub.c obtained on the documents DOC of the document family:
DFCS.sub.b=Σ.sub.c wp.sub.c.
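By way of illustration only, the calculation of wp.sub.c and of DFCS.sub.b may be sketched as follows. The function name, the country codes and the GNI figures in the usage example are assumptions of this sketch; the figures are invented for demonstration and are not World Bank data.

```python
def coverage_score(status_weights, gni, reference="US"):
    """Return the DFCS of one document family at one sheet date.

    `status_weights` maps a country code to its weighting factor w_c
    (0, 0.7 or 1); `gni` maps a country code to its significance value.
    """
    gni_ref = gni[reference]
    # wp_c = (w_c * GNI_c) / GNI_ref, summed over all countries c.
    return sum(w_c * gni[c] / gni_ref for c, w_c in status_weights.items())


# Invented significance values, for demonstration only.
example_gni = {"US": 100.0, "DE": 25.0, "JP": 50.0}
example_weights = {"US": 1.0, "DE": 0.7, "JP": 0.0}
example_dfcs = coverage_score(example_weights, example_gni)
```

With these invented figures, a family granted in the reference country and pending in one country a quarter of its size scores slightly above 1, since the reference country alone contributes exactly 1.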
(31) To further improve the accuracy of the relevance quantification, further embodiments of the invention consider PCT and EP patent applications according to the following rules:
(32) Pending EP-applications are treated as patent applications in all EPC states until either the patent is granted or the application is abandoned, depending on which of the two options takes place earlier.
(33) WO-applications are considered as equivalent to patent applications in all PCT states within the first 40 months after the first date of filing.
(34) If a national patent application exists in addition to a PCT or an EP application, the respective country is not considered twice.
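By way of illustration only, the three rules for EP and WO applications may be sketched as follows. The state sets are deliberately small, illustrative subsets and not the actual lists of EPC or PCT states. The sketch further assumes that a pending regional application contributes the pending weight 0.7, that the 40 month window excludes its end date, and that a country covered by several applications keeps the highest of its weights so that it is counted only once. None of these assumptions is stated in the described embodiments.

```python
import calendar
from datetime import date

EPC_STATES = frozenset({"DE", "FR", "GB"})              # illustrative subset
PCT_STATES = frozenset({"DE", "FR", "GB", "JP", "US"})  # illustrative subset
PENDING_WEIGHT = 0.7
WO_WINDOW_MONTHS = 40


def add_months(start, months):
    """Return `start` shifted by whole months, clamping the day if needed."""
    index = start.year * 12 + (start.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def country_weights(sheet_date, national, ep_pending=False,
                    wo_first_filing=None):
    """Merge national weights with those implied by EP and WO applications.

    `national` maps a country code to the weight w_c already obtained
    from national documents of the family.
    """
    weights = dict(national)
    covered = set()
    if ep_pending:
        covered |= EPC_STATES
    if (wo_first_filing is not None and
            wo_first_filing <= sheet_date
            < add_months(wo_first_filing, WO_WINDOW_MONTHS)):
        covered |= PCT_STATES
    for country in covered:
        # Each country appears once; the highest weight wins (assumption).
        weights[country] = max(weights.get(country, 0.0), PENDING_WEIGHT)
    return weights
```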
(35)
(36) In step 402, statistics are compiled for every patent office about which sufficient data is available. In this step, the average number of patent documents cited as prior art documents by a patent office o for a patent application per year y is determined. The value obtained is referred to as CS.sub.o,y wherein o is indicative of the patent office and y of the year.
(37) In step 403, all document links connecting documents contained in the totality of documents to be examined are determined and a document linkage weight α is assigned to every document link. A document link is a link connecting a source document with a destination document. According to a preferred embodiment of the invention, each prior art citation issued for a patent document by a patent office is considered as a document link. A database table is created comprising all document links, each in association with its corresponding source document, destination document and document linkage weight α. The document linkage weight α depends on the citation quality of the patent office issuing each link. The higher the number of citations issued by a patent office per patent document, the lower the relevance and quality of the citation in respect to a particular patent document. The value α is therefore determined for each document link based on the patent office issuing the link as α=1/CS.sub.o,y. The determination and weighting of document links is depicted graphically in greater detail in
(38) In step 404, all weighted document family links within the total set of examined document families are determined. A database table is created comprising all document family links. Each document family link entry of that table also comprises its corresponding source document family DF.sub.Source, its destination document family DF.sub.Dest. and its document family linkage weight β. A document family acts as source document family being connected with a destination document family via a document family link if the source document family comprises at least one document linking to a document belonging to the destination document family. According to a preferred embodiment, the document family linkage weight β is calculated as the MAXIMUM value of all document linkage values α connecting documents of the source document family with documents of the destination document family.
β.sub.DFSource,DFDest.=MAXIMUM(α.sub.1,α.sub.2, . . . α.sub.n).
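Steps 402 to 404 can be sketched as below. The data layout (tuples of source document, destination document, office and year), all identifiers and the sample citations are hypothetical. As a simplifying assumption, CS.sub.o,y is averaged only over documents that received at least one citation, because documents without citations do not appear in the link list.

```python
from collections import defaultdict


def citation_statistics(citations):
    """CS[(o, y)]: average number of citations issued per citing document."""
    per_doc = defaultdict(int)
    for src, _dst, office, year in citations:
        per_doc[(office, year, src)] += 1
    totals, counts = defaultdict(int), defaultdict(int)
    for (office, year, _src), n in per_doc.items():
        totals[(office, year)] += n
        counts[(office, year)] += 1
    return {key: totals[key] / counts[key] for key in totals}


def family_link_weights(citations, family_of, cs):
    """beta[(source family, destination family)] = maximum alpha of the links."""
    beta = {}
    for src, dst, office, year in citations:
        key = (family_of[src], family_of[dst])
        if key[0] == key[1]:
            continue  # a source document must belong to another family
        alpha = 1.0 / cs[(office, year)]  # alpha = 1 / CS_o,y
        beta[key] = max(beta.get(key, 0.0), alpha)
    return beta


# Hypothetical documents, families and citations.
family_of = {"d1": "DF1", "d3": "DF1", "d8": "DF2",
             "d5": "DFDest", "d6": "DFDest", "d9": "DFOther"}
citations = [
    ("d1", "d5", "EP", 2010), ("d1", "d9", "EP", 2010),
    ("d8", "d5", "EP", 2010), ("d8", "d9", "EP", 2010),
    ("d3", "d5", "US", 2010), ("d3", "d6", "US", 2010),
    ("d3", "d9", "US", 2010), ("d3", "d8", "US", 2010),
]
cs = citation_statistics(citations)
beta = family_link_weights(citations, family_of, cs)
```

In this example the office with two citations per document yields α=0.5 per link and the office with four citations per document yields α=0.25, so the link from DF1 to DFDest receives β=0.5, the maximum of its three document links.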
In step 405, the value γ is calculated for every document family DF.sub.Dest. The value γ is calculated as the sum of the document family linkage weights of all document family links connecting a source document family i with document family DF.sub.Dest.
γ.sub.DFDest.=Σ.sub.i β.sub.DFSource_i,DFDest.
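A minimal sketch of step 405, assuming the document family linkage weights β are available as a mapping from (source family, destination family) pairs to weights. The family names and values are hypothetical.

```python
from collections import defaultdict


def gamma_scores(beta):
    """gamma[dest] = sum of beta over all source families linking to dest."""
    gamma = defaultdict(float)
    for (_source_family, dest_family), weight in beta.items():
        gamma[dest_family] += weight
    return dict(gamma)


# Hypothetical document family linkage weights.
beta = {("DF1", "DFDest"): 0.5, ("DF2", "DFDest"): 0.25, ("DF1", "DF2"): 0.25}
gamma = gamma_scores(beta)
```

Document families that are never a destination do not appear in the result of this sketch; a caller would have to treat them as having γ=0.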
(39) The calculation of γ.sub.DFDest. is depicted graphically in greater detail in
(40) In step 406, a citation statistic for all years of first publication z is created. This task comprises the calculation of the average γ of all document families having the same year of first publication z. Every document family is characterized by a year of first publication z which represents, for patent documents, the first year wherein any of the documents belonging to a document family was published. An intermediate value X1 is calculated for each year of first publication z and all γ of all document families having a year of first publication z:
X1.sub.z=ø(γ.sub.DFDest.).
(41) According to the depicted embodiment of the invention, the document links are based on citations. The document family links are derived from the document links and are therefore also based on citations. Citation based relevance scores of documents have a strong bias towards older documents, as older documents had a greater chance of becoming cited than recently published documents. Therefore, according to some embodiments, the intermediate value X1 is corrected for the last two years before the sheet date. To calculate X1 for the last two years, the average γ of the document families DF.sub.Dest. of the third year before the sheet date is used for the calculation. A ‘year’ in this context is a time period of 12 months determined in relation to the current date, not a calendar year.
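The calculation of X1 and its correction for recent years may be sketched as follows. The sketch is based on assumptions: years of first publication are represented as an age index counted in 12-month periods back from the sheet date (0 for the most recent period), and the "third year" is read as age index 2. All names and values are hypothetical.

```python
from statistics import mean


def x1_by_age(gamma, age_years):
    """Average gamma per age index, with the two most recent years corrected."""
    by_age = {}
    for family, g in gamma.items():
        by_age.setdefault(age_years[family], []).append(g)
    x1 = {age: mean(values) for age, values in by_age.items()}
    if 2 in x1:
        # Assumed reading: reuse the third year's average for the last two years.
        x1[0] = x1[1] = x1[2]
    return x1


# Hypothetical gamma values; uncited families must be included with gamma 0.0.
gamma = {"A": 0.0, "B": 0.5, "C": 1.0, "D": 2.0, "E": 4.0}
age_years = {"A": 0, "B": 1, "C": 2, "D": 2, "E": 5}
x1 = x1_by_age(gamma, age_years)
```

Without the correction, the two youngest periods in this example would have averages of 0.0 and 0.5, which illustrates the bias against recently published documents that the correction is meant to reduce.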
(42) In step 407, the value δ is calculated for every document family DF.sub.Dest. The value δ.sub.DFDest. is calculated as the ratio of the γ.sub.DFDest. value and the average γ of all t document families having the same year of first publication z:
δ.sub.DFDest.=γ.sub.DFDest./ø(γ.sub.DF1,γ.sub.DF2, . . . ,γ.sub.DFt)
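Step 407 can be illustrated by the following sketch. The names and data are hypothetical, and returning 0.0 when the yearly average is zero is an assumption made here to avoid a division by zero; the source does not address that case.

```python
from statistics import mean


def delta_scores(gamma, pub_year):
    """delta = gamma of a family / average gamma of families with the same year z."""
    by_year = {}
    for family, g in gamma.items():
        by_year.setdefault(pub_year[family], []).append(g)
    average = {z: mean(values) for z, values in by_year.items()}
    return {
        family: (g / average[pub_year[family]] if average[pub_year[family]] else 0.0)
        for family, g in gamma.items()
    }


# Hypothetical gamma values and years of first publication.
gamma = {"A": 1.0, "B": 3.0, "C": 2.0}
pub_year = {"A": 2010, "B": 2010, "C": 2011}
delta = delta_scores(gamma, pub_year)
```

A δ above 1 indicates a document family that is linked to more strongly than the average family of the same year of first publication.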
(43) In step 408, a citation statistic is calculated for all technology fields f considered. The average δ of all document families having a year of first publication z per technology field f is calculated. The technology fields are defined by the first four digits of the IPC classification (IPC subclasses). Every document family having been assigned to an IPC subclass (irrespective of the assigning patent office) is considered.
(44) An intermediate value X2TF.sub.f,z is calculated for each year of first publication z considered, e.g. the last 50 years from the current date, and for all technology fields f of interest. X2TF.sub.f,z is calculated as the average δ of all document families having a year of first publication z and having been assigned to the technology field f (a document family can be assigned to one or multiple technology fields).
X2TF.sub.f,z=ø(δ.sub.f,z)
(45) In cases where fewer than 200 document families exist for a particular technology field, the calculation of X2TF.sub.f,z is not based on an average value derived from the year of first publication z but rather on an average value based on multiple years.
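A sketch of step 408 including the fallback to multiple years is given below. The source does not state which years are pooled, so the symmetric window that grows around z, the `max_window` limit and the application of the 200-family threshold per field and year are all assumptions of this sketch, as are the names and sample values.

```python
from statistics import mean


def x2tf(deltas_by_field_year, field, z, min_families=200, max_window=5):
    """Average delta for (field, z); pools neighbouring years for small samples."""
    pooled = list(deltas_by_field_year.get((field, z), []))
    k = 0
    while len(pooled) < min_families and k < max_window:
        k += 1  # widen the window by one year on each side (assumed strategy)
        pooled += deltas_by_field_year.get((field, z - k), [])
        pooled += deltas_by_field_year.get((field, z + k), [])
    return mean(pooled) if pooled else None


# Hypothetical delta values keyed by (IPC subclass, year of first publication).
deltas = {("H04L", 2010): [1.0, 2.0], ("H04L", 2009): [3.0]}
```

For demonstration, calling `x2tf(deltas, "H04L", 2010, min_families=3)` pools the year 2009 into the average, whereas `min_families=2` uses the year 2010 alone.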
(46) In step 409, the document family linkage score DFLS is calculated for every document family DF.sub.Dest. The step comprises two sub-steps. At first, the one or multiple technology fields f to which DF.sub.Dest. has been assigned are determined. The average of all X2TF.sub.f,z values corresponding to the technology fields assigned to document family DF.sub.Dest. and having the same year of first priority is calculated and referred to as intermediate value X2.
X2.sub.DFDest.=ø(X2TF.sub.f1_DFDest,z_DFDest.,X2TF.sub.f2_DFDest,z_DFDest., . . . ,X2TF.sub.fn_DFDest.,z_DFDest.)
(47) The X2TF.sub.f1_DFDest.,z_DFDest., X2TF.sub.f2_DFDest.,z_DFDest. values do not have to be calculated de novo in step 409, as said values have already been calculated for each technology field f and each year of first publication z in step 408. It is only required to retrieve the appropriate X2TF values for the technology fields and the year of first publication of the document family DF.sub.Dest. whose DFLS is to be calculated.
(48) In the next sub-step, the DFLS value of the document family DF.sub.Dest. is calculated as the ratio of δ.sub.DFDest. and X2.sub.DFDest.
DFLS.sub.DFDest.=δ.sub.DFDest./X2.sub.DFDest.
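The two sub-steps of step 409 may be sketched as follows, assuming the X2TF values of step 408 are available in a lookup table keyed by technology field and year. The table contents and all names are hypothetical.

```python
from statistics import mean


def dfls(delta, fields, z, x2tf_table):
    """DFLS = delta / X2, where X2 averages the X2TF values of the family's fields."""
    x2 = mean(x2tf_table[(field, z)] for field in fields)  # values are only retrieved
    return delta / x2


# Hypothetical X2TF values from step 408.
x2tf_table = {("H04L", 2010): 2.0, ("G06F", 2010): 1.0}
score = dfls(3.0, ["H04L", "G06F"], 2010, x2tf_table)  # X2 = 1.5
```

The lookup reflects the remark above that the X2TF values are retrieved rather than recalculated for each document family.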
(49) In decision 410 it is determined whether the benchmarking method is executed for a company or not. According to an embodiment of the invention, the user is provided with means, e.g. a GUI, to select between the two options ‘Yes: portfolio benchmarking for a company’ and ‘No’. In case the option ‘Yes’ is selected, a further step 411 is executed, adapting the DFLS value calculated in step 409 for patent documents younger than 24 months. Patent documents younger than 24 months are assigned a predefined or calculated other value. Said other value is, for example, the average DFLS value calculated for document families which are held by the company for which the portfolio benchmarking is executed and whose age is between e.g. 24 and 48 months, the age of a patent document being calculated based on the filing date. In case the second option ‘No’ is selected, the calculation of the DFLS value of document family DF.sub.Dest. is terminated in step 412. The ‘No’ option may be preferentially selected if the portfolio benchmarking is executed for instances other than companies or for companies which do not own patent documents older than 24 months.
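Step 411 can be illustrated by the sketch below, which uses the calculated variant of the replacement value (the average DFLS of the company's families aged 24 to 48 months). The names, the inclusive age bounds and the behaviour when no reference families exist are assumptions of this sketch.

```python
from statistics import mean


def adapt_young_dfls(dfls_by_family, age_months, young_limit=24, ref_range=(24, 48)):
    """Replace the DFLS of families younger than young_limit by a reference average."""
    reference = [
        score for family, score in dfls_by_family.items()
        if ref_range[0] <= age_months[family] <= ref_range[1]
    ]
    if not reference:
        return dict(dfls_by_family)  # nothing to derive a replacement from
    fallback = mean(reference)
    return {
        family: (fallback if age_months[family] < young_limit else score)
        for family, score in dfls_by_family.items()
    }


# Hypothetical DFLS values and ages (in months since filing) of one company.
scores = {"A": 0.0, "B": 1.0, "C": 3.0, "D": 5.0}
ages = {"A": 6, "B": 30, "C": 40, "D": 90}
adapted = adapt_young_dfls(scores, ages)
```

Here family A, which is six months old, receives the average of families B and C instead of its own score; the scores of older families remain unchanged.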
(50) The determination and weighting of document family links is depicted graphically in greater detail in
(51)
(52) According to further embodiments of the invention, the document linkage weight α is not calculated based on the citation quality of the patent office but rather on the citation quality of a patent examiner working at a patent office. Again, the higher the average number of prior art citations issued by a patent examiner per patent document, the lower the quality and relevance of a single citation issued by said examiner. α is calculated analogously to the patent office based weighting, but patent examiner specific scores are used for the weighting instead of patent office specific scores.
(53) Analogously, according to further embodiments of the invention, document links are weighted based on the average number of prior art patent document citations assigned to a patent document in a particular technology field. The higher said average, the lower the quality of each single citation is considered to be, and the lower the weight of each single document link connecting documents of a particular technology field.
(54) The weight β1 of a single document family link, indicated in
(55) The right box of
γ.sub.DFDest=Σ(β1,β2)
(56)
(57) While the machine-readable medium 602 is shown in an exemplary embodiment to be a single medium, the term “machine-readable medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more sets of instructions. The term “machine-readable medium” shall also be taken to include any medium that is capable of storing or encoding a set of instructions 603 for execution by the machine and that cause the machine to perform any one or more of the methods of the present invention. The term “machine-readable medium” shall accordingly be taken to include, but not be limited to, solid-state memories, optical and magnetic media, and the like. The set of instructions may also reside, completely or at least partially, within the main memory and/or within the processor during their execution by the computer system 600, the main memory 606 and the processor 601 also constituting machine-readable media. The calculated aggregate score values and/or their visual representations may be displayed on a display 607 being part of the computer system, e.g. a screen, or be transmitted to the remote display 604 over a network 605 via the network interface 608 utilizing any one of a number of well-known transfer protocols (e.g., HTTP).
(58) The computer-implemented method described herein requires physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like. It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the discussion herein, it is appreciated that throughout the description, discussions utilizing terms such as “processing” or “computing” or “calculating” or “determining” or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.
(59) The computer-readable instructions may be stored in a computer readable storage medium 602, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, and magnetic-optical disks, read-only memories (ROMs), random access memories (RAMs) such as dynamic RAM (DRAM), EPROMs, EEPROMs, magnetic or optical cards, or any type of media suitable for storing electronic instructions, and each coupled to a computer system bus.
(60) The present invention is not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the invention as described herein.
(61)
(62)
(63)
(64)
(65) Although the invention herein has been described with reference to particular embodiments, it is to be understood that these embodiments are merely illustrative of the principles and applications of the present invention. It is therefore to be understood that numerous modifications may be made to the illustrative embodiments and that other arrangements may be devised without departing from the spirit and scope of the present invention as defined by the appended claims.
Abbreviations
(66) GNI: Gross National Income
OLAP: online analytical processing
DFCS: document family coverage score
DFLS: document family linkage score
DFCR: document family combined relevance score
FSH: field share
PST: portfolio strength
PSI: portfolio size
PLS: portfolio linkage score
PCS: portfolio coverage score
DOCDB: EPO patent information resource
INPADOC-PRS: INternational PAtent DOCumentation
LIST OF REFERENCE NUMERALS
(67) 100-112: steps
200-203: steps
300-304: steps
305: GNI figures of World Bank
400-412: steps
500: document family DF.sub.Dest.
501: document family DF1.sub.Source
503: document family DF2.sub.Source
504: document family link
505: document link from d3 to d6
506: document link from d8 to d5
507: document family link
600: computer system
601: processor
602: storage medium
603: instructions
604: remote display means
605: network
606: main memory
607: display means
608: network interface
702: list of document family scores
703-704: steps
800: bar chart: field share
900: line chart: avg. DFCR
1000: table comprising mult. aggreg. scores
1001: company column
1002: FSH column
1003: PST column
1004: PSI column
1005: avg. DFCR column
1006: avg. DFLS column
1007: avg. DFCR column
1008: avg. age column