Artificially-intelligent, continuously-updating, centralized-database-identifier repository system
11556515 · 2023-01-17
Assignee
Inventors
- Matthew E. Carroll (Charlotte, NC, US)
- Manu Kurian (Dallas, TX, US)
- Aaron E. Russell (Virginia Beach, VA, US)
Cpc classification
G06F16/254
PHYSICS
International classification
G06F16/25
PHYSICS
Abstract
A centralized database identifier repository may identify databases using a unique identifier, or key tag, for each database. Each identified database may include data relating to one or more specific data elements. The repository may include a variety of data elements. Each data element may be associated with one or more database keys. The repository may be a repository of reference pointers. The repository may facilitate data viewing and data retrieval. A requestor may search for a data element using the centralized repository. The repository may retrieve data relating to a specific data element, from all databases identified by unique identifiers, that include data relating to the data element. The databases' unique identifiers may be encrypted tokens.
Claims
1. A method for data consolidation of an artificially-intelligent centralized key data repository, the method comprising: reviewing a plurality of databases, each database included in the plurality of databases comprising one or more data elements; based on the reviewing, determining: duplicate data elements within the plurality of databases; comparable data elements within the plurality of databases; a utilization metric for each data element included within the plurality of databases; ranking the data elements included in the plurality of databases based on the utilization metric, wherein more frequently used data elements receive higher ranking and less frequently used data elements receive lower ranking; determining and assigning memory locations for each data element, the memory locations including a plurality of memory locations with shorter than a threshold response time and a plurality of locations with greater than the threshold response time, wherein data elements with greater than a threshold frequency are assigned to memory locations with shorter than the threshold response time and data elements with lower than the threshold frequency are assigned to memory locations with greater than the threshold response time; identifying one or more recommendations for database synchronization and/or database usage optimization; displaying the recommendations to operator; and executing the recommendations upon receipt of operator confirmation.
2. The method of claim 1, further comprising: re-reviewing the plurality of databases; and identifying one or more recommendations after a predetermined time period.
3. The method of claim 1, further comprising: re-ranking the data elements included in the plurality of databases; and redetermining and reassigning memory locations based on the re-ranking.
4. The method of claim 1, further comprising: deactivating data elements that are determined to be utilized less than a second threshold frequency.
5. The method of claim 1, further comprising archiving data elements that are determined to be utilized less than a second threshold frequency.
6. The method of claim 1, further comprising combining two or more data elements upon determining that the contents of the two or more data elements contains more than a predetermined amount of overlapping data.
7. The method of claim 1, further comprising combining two or more databases upon determining that the contents of the two or more databases contains more than a predetermined amount of overlapping data.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) The objects and advantages of the invention will be apparent upon consideration of the following detailed description, taken in conjunction with the accompanying drawings, in which like reference characters refer to like parts throughout, and in which:
(2)
(3)
(4)
(5)
(6)
DETAILED DESCRIPTION
(7) A centralized database identifier repository is provided. The repository may include a plurality of database identifiers. Each database identifier identifies a database included within a plurality of databases. The repository may also include a plurality of data elements. Each data element may be associated with data included in one or more of the plurality of databases.
(8) The repository may also include a linkage between each data element and one or more database identifiers. Each of the one or more database identifiers may identify a linking database. The linking database may include data associated with the data element. The database may be included in the plurality of databases. An example may be a first data element that identifies a person named James Smith. Data pertaining to James Smith may be found in databases A, G and H. The repository may include the data element James Smith linked to database identifiers that identify databases A, G and H.
(9) The repository may be operable to receive a request from a user. The request may include one or more data elements. The repository may be operable to respond to the user. The response may include the database identifiers associated with the received one or more data elements.
(10) The repository may be operable to receive a request from a user. The request may include one or more data elements. The repository may be operable to determine one or more database identifiers associated with the request. The repository may be operable to transmit a second, or subsequent, request to each of the databases identified by the one or more database identifiers. The second, or subsequent, request may include the one or more data elements. The repository may receive the data, associated with the one or more data elements, from each of the databases. The repository may transmit the received data associated with the one or more data elements to the user.
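The two request modes described above (returning identifiers only, or fanning out to each identified database and returning the data) can be sketched as follows. This is a minimal illustration, not the patented implementation: the in-memory dictionaries, the function names `lookup_identifiers` and `retrieve`, and the placeholder record values are all assumptions; only the James Smith example and databases A, G and H come from the description.

```python
from typing import Dict, List

# Hypothetical in-memory stand-ins for the underlying databases.
# Each "database" maps a data element to the data it holds for that element.
DATABASES: Dict[str, Dict[str, str]] = {
    "DB A": {"James Smith": "record held in database A"},
    "DB G": {"James Smith": "record held in database G"},
    "DB H": {"James Smith": "record held in database H"},
}

# The repository stores only reference pointers:
# data element -> identifiers of the databases that hold data for it.
LINKAGE: Dict[str, List[str]] = {"James Smith": ["DB A", "DB G", "DB H"]}


def lookup_identifiers(data_element: str) -> List[str]:
    """First mode: respond with the linked database identifiers only."""
    return list(LINKAGE.get(data_element, []))


def retrieve(data_element: str) -> Dict[str, str]:
    """Second mode: query each linked database and collect the returned data."""
    results: Dict[str, str] = {}
    for db_id in lookup_identifiers(data_element):
        record = DATABASES[db_id].get(data_element)
        if record is not None:
            results[db_id] = record
    return results
```

In this sketch the repository never stores the records themselves; it resolves identifiers first and only then issues the subsequent requests.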
(11) In some embodiments, the request may include user entitlement data. The repository may determine whether a user identified by the user entitlement data is permitted to access the data from each of the databases prior to transmitting the received data to the user.
(12) In some embodiments, based on a user's entitlement, there may be different levels of access to the data included in the databases. In one example, a user may be entitled to know whether data is included in the database but may not be entitled to view the data. In another example, a user may be entitled to read access to the databases but may not be able to retrieve the data. In another example, a user may be able to read and retrieve the data.
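One way to picture the three access levels is as an ordered enumeration, where each level includes the capabilities of the levels below it. The ordering, the `Access` names, and the `build_response` helper are assumptions made for illustration; the description only gives the three examples.

```python
from enum import IntEnum
from typing import Optional


class Access(IntEnum):
    NONE = 0
    EXISTENCE = 1  # may learn whether data is included, but not view it
    READ = 2       # may view the data, but not retrieve it
    RETRIEVE = 3   # may read and retrieve the data


def build_response(level: Access, record: Optional[dict]) -> dict:
    """Return only the parts of the answer the entitlement level allows."""
    response: dict = {}
    if level >= Access.EXISTENCE:
        response["exists"] = record is not None
    if level >= Access.READ:
        response["view"] = record
    if level >= Access.RETRIEVE:
        # A copy stands in for data the user may take away.
        response["export"] = dict(record) if record is not None else None
    return response
```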
(13) In some embodiments, the request may include a reason for the request. The repository may transmit the received data upon receipt of an acceptable reason for the request. An acceptable reason for a request may be a reason selected from a predefined list of acceptable reasons. An example of an acceptable reason may be performing a transaction associated with a person identified by a data element.
(14) In some embodiments, a database identifier may include a communication link with the associated database. The database identifier may communicate between the centralized repository and the identified database in order to retrieve the data from the identified database.
(15) In some embodiments, each database identifier is an encrypted token. Each token may be processed by a validation layer prior to communicating with an associated or underlying database. The validation may be based on the requestor's entitlements and/or the requestor's purpose for the data retrieval. Upon validation, the database identifier token may communicate with the underlying database to retrieve the requested data.
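The validation layer can be sketched as a component that resolves a token to its underlying database only after the checks pass. Several simplifications are assumed here: an opaque random token from Python's `secrets` module stands in for an encrypted token (no actual encryption is performed), both the entitlement check and the reason check are required even though the description says "and/or", and the class name, method names and the reason string `perform_transaction` are hypothetical.

```python
import secrets
from typing import Dict, Optional, Set

# Hypothetical predefined list of acceptable reasons for a request.
ACCEPTABLE_REASONS = {"perform_transaction"}


class ValidationLayer:
    def __init__(self) -> None:
        self._token_to_db: Dict[str, str] = {}   # token -> database name
        self._entitled: Dict[str, Set[str]] = {}  # user -> databases allowed

    def issue_token(self, database_name: str) -> str:
        """Create an opaque token that hides the database name."""
        token = secrets.token_hex(16)
        self._token_to_db[token] = database_name
        return token

    def grant(self, user: str, database_name: str) -> None:
        """Record that a user is entitled to a database."""
        self._entitled.setdefault(user, set()).add(database_name)

    def resolve(self, token: str, user: str, reason: str) -> Optional[str]:
        """Return the database name only if the token, reason and entitlement are valid."""
        database_name = self._token_to_db.get(token)
        if database_name is None:
            return None
        if reason not in ACCEPTABLE_REASONS:
            return None
        if database_name not in self._entitled.get(user, set()):
            return None
        return database_name
```

A caller that receives `None` learns nothing about which database the token refers to, which is the point of placing validation in front of the lookup.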
(16) In some embodiments, the system may combine two or more tables or databases upon determining that the contents of the two or more tables or databases contain more than a predetermined amount of overlapping data.
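The overlap test can be illustrated with a simple set comparison. The description does not say how overlap is measured, so this sketch assumes rows are hashable tuples, measures overlap as the shared rows divided by the size of the smaller table, and uses a hypothetical threshold of 0.8.

```python
from typing import List, Optional, Sequence, Tuple

Row = Tuple  # assumption: each row is a hashable tuple of values


def overlap_fraction(a: Sequence[Row], b: Sequence[Row]) -> float:
    """Share of the smaller table's distinct rows that also appear in the other table."""
    set_a, set_b = set(a), set(b)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / min(len(set_a), len(set_b))


def maybe_combine(a: Sequence[Row], b: Sequence[Row],
                  threshold: float = 0.8) -> Optional[List[Row]]:
    """Return the de-duplicated union if overlap exceeds the threshold, else None."""
    if overlap_fraction(a, b) > threshold:
        # dict.fromkeys keeps first-seen order while dropping duplicates.
        return list(dict.fromkeys(list(a) + list(b)))
    return None
```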
(17) Apparatus and methods described herein are illustrative. Apparatus and methods in accordance with this disclosure will now be described in connection with the figures, which form a part hereof. The figures show illustrative features of apparatus and method steps in accordance with the principles of this disclosure. It is to be understood that other embodiments may be utilized and that structural, functional and procedural modifications may be made without departing from the scope and spirit of the present disclosure.
(18) The steps of methods may be performed in an order other than the order shown or described herein. Embodiments may omit steps shown or described in connection with illustrative methods. Embodiments may include steps that are neither shown nor described in connection with illustrative methods.
(19) Illustrative method steps may be combined. For example, an illustrative method may include steps shown in connection with another illustrative method.
(20) Apparatus may omit features shown or described in connection with illustrative apparatus. Embodiments may include features that are neither shown nor described in connection with the illustrative apparatus. Features of illustrative apparatus may be combined. For example, an illustrative embodiment may include features shown in connection with another illustrative embodiment.
(21)
(22) Each data element may be found in one or more databases of a system. In order to couple together the duplicate data elements found in each database, the repository may store the data element and each database location in which the data element is found. Such a repository may provide a centralized source to locate each data element.
(23) Data element A, shown at 102, may be located in databases AG, GH and SD, identified by database identifiers DB AG, DB GH and DB SD. Data element B, shown at 104, may be located in databases GH, AH and SW, identified by database identifiers DB GH, DB AH and DB SW. Data element C, shown at 106, may be located in databases AH and GH, identified by database identifiers DB AH and DB GH. Data element D, shown at 108, may be located in databases AG, GH, AH and SW, identified by database identifiers DB AG, DB GH, DB AH and DB SW.
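The mapping just described can be written down directly as a dictionary of reference pointers. The inverse index shown here (which data elements each database holds) is not mentioned in the description; it is added as an illustrative assumption to show what a centralized mapping makes easy to derive.

```python
from collections import defaultdict
from typing import Dict, List

# Data element -> database identifiers, as listed in the description.
ELEMENT_TO_DBS: Dict[str, List[str]] = {
    "A": ["DB AG", "DB GH", "DB SD"],
    "B": ["DB GH", "DB AH", "DB SW"],
    "C": ["DB AH", "DB GH"],
    "D": ["DB AG", "DB GH", "DB AH", "DB SW"],
}


def invert(mapping: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Build database identifier -> data elements from element -> identifiers."""
    inverse: Dict[str, List[str]] = defaultdict(list)
    for element, db_ids in mapping.items():
        for db_id in db_ids:
            inverse[db_id].append(element)
    return dict(inverse)


DB_TO_ELEMENTS = invert(ELEMENT_TO_DBS)
```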
(24)
(25)
(26) Database GH may include a record relating to John Doe, as shown at 306. Record 306 may include data relating to John Doe. Record 306 may include a name, street address, phone number and last updated timestamp.
(27) Database SD may include a record relating to John Doe, as shown at 308. Record 308 may include data relating to John Doe. Record 308 may include a name, street address, cell phone number, home phone number and last updated timestamp.
(28) It should be appreciated that the street addresses on records 304 and 306 match, while the street address on record 308 differs from that on records 304 and 306. It should also be appreciated that records 304 and 306 each include one phone number and record 308 includes two phone numbers. Even though records 304, 306 and 308 are not identical, the system may determine that the records identify the same person. The determination may be made because the records include greater than a threshold percentage of identical data.
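The threshold comparison can be sketched by flattening each record into (field, value) pairs and measuring how many pairs two records share. The description does not specify the comparison rule, so the following are assumptions: the timestamp is ignored, phone numbers are compared as a list under one `phones` field, overlap is measured against the smaller record, and the threshold is 0.6. All field values are invented placeholders.

```python
from typing import Dict, Set, Tuple


def facts(record: Dict[str, object]) -> Set[Tuple[str, str]]:
    """Flatten a record into normalized (field, value) pairs, ignoring the timestamp."""
    pairs: Set[Tuple[str, str]] = set()
    for field, value in record.items():
        if field == "last_updated":
            continue
        values = value if isinstance(value, list) else [value]
        for v in values:
            pairs.add((field, str(v).strip().lower()))
    return pairs


def same_entity(r1: Dict[str, object], r2: Dict[str, object],
                threshold: float = 0.6) -> bool:
    """True if the shared fraction of identical data exceeds the threshold."""
    f1, f2 = facts(r1), facts(r2)
    if not f1 or not f2:
        return False
    shared = len(f1 & f2) / min(len(f1), len(f2))
    return shared > threshold


# Placeholder records shaped like records 306 and 308.
record_306 = {"name": "John Doe", "street_address": "1 Main St",
              "phones": ["555-0101"], "last_updated": "2020-01-01"}
record_308 = {"name": "John Doe", "street_address": "9 Oak Ave",
              "phones": ["555-0101", "555-0199"], "last_updated": "2021-06-30"}
```

With these placeholder values, records 306 and 308 share the name and one phone number, which is two of the three facts in the smaller record, so `same_entity(record_306, record_308)` clears the assumed 0.6 threshold despite the differing street address.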
(29) It should also be appreciated that an artificial intelligence bot, as shown in
(30)
(31) Steps 408 and 410 may be based on steps 402, 404 and 406. Step 408 shows recommending database synchronization. The recommendation may be based on the identified duplicate records, the identified similar records and the identified utilization of each table within the database.
(32) Step 412 may be based on step 406. Step 412 may include ranking tables based on utilization. Tables that are utilized more times per time period may be ranked higher than tables that are utilized fewer times per time period.
(33) Step 414 may include determining and assigning memory locations for tables based on usage frequency. Tables and/or databases that are ranked higher may be assigned memory locations with a shorter response time than tables and/or databases that are ranked lower.
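Steps 412 and 414 can be sketched as a sort followed by a threshold split, mirroring the two classes of memory location recited in claim 1. The function names, the table names `t1` to `t3`, the "fast" and "slow" labels, and the numeric values in the usage note are hypothetical.

```python
from typing import Dict, List


def rank_by_utilization(access_counts: Dict[str, int]) -> List[str]:
    """Order tables from most to least frequently used (step 412)."""
    return sorted(access_counts, key=lambda table: access_counts[table], reverse=True)


def assign_tier(access_counts: Dict[str, int],
                threshold_frequency: int) -> Dict[str, str]:
    """Place tables above the threshold frequency in fast memory, the rest in slow (step 414)."""
    return {
        table: ("fast" if rate > threshold_frequency else "slow")
        for table, rate in access_counts.items()
    }
```

For example, with access counts `{"t1": 120, "t2": 3, "t3": 40}` and a threshold of 30, the ranking is `t1`, `t3`, `t2`, and only `t2` lands in the slow tier.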
(34) Step 416 shows displaying the recommendations to an operator. At times, displaying the recommendations may include presenting two similar records to the operator so that the operator can identify which record is more accurate.
(35) Step 418 shows executing recommendations in response to operator confirmation. In these embodiments, the system may execute the recommendations upon operator confirmation. In other embodiments, the system may execute recommendations that have been determined to be accurate at greater than a predetermined confidence threshold independent of operator confirmation.
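The two execution modes (operator-confirmed, or automatic above a confidence threshold) can be combined in one small routine. This is a sketch under stated assumptions: the `Recommendation` class, the `confirm` callback that stands in for the operator, and the example threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Recommendation:
    description: str
    confidence: float  # assumed to be in the range 0.0 to 1.0


def execute_recommendations(
    recommendations: List[Recommendation],
    confirm: Callable[[Recommendation], bool],
    auto_threshold: Optional[float] = None,
) -> List[str]:
    """Return descriptions of the recommendations that would be executed."""
    executed: List[str] = []
    for rec in recommendations:
        # With a threshold set, high-confidence items skip operator confirmation.
        automatic = auto_threshold is not None and rec.confidence > auto_threshold
        if automatic or confirm(rec):
            executed.append(rec.description)
    return executed
```

Leaving `auto_threshold` as `None` gives the first embodiment, where nothing runs without confirmation; passing a value such as 0.9 gives the second.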
(36)
(37) Table CZ, included in database GH, shown at 506, may be accessed at a rate of 50× per minute. Table QW, shown at 508, may be accessed at a rate of 5× per minute. AI bot 502 may have determined the access rate for each of tables 504, 506 and 508.
(38) AI bot 502 may determine that table CZ is accessed at the highest rate. Therefore, table CZ may be placed into the memory location with the shortest response time, as shown at 512. AI bot 502 may determine that table A is accessed at the second-highest rate. Therefore, table A may be placed into the memory location with the second-shortest response time, as shown at 514. It should be appreciated that memory location 516 may be vacant, because it may have been reserved or AI bot 502 may be waiting to place an appropriate table within memory location 516. AI bot 502 may determine that table QW is accessed at the lowest rate. Therefore, table QW may be placed into the memory location with the longest response time, as shown at 518.
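The placement just described can be reproduced with a short routine that ranks tables by access rate and fills the unreserved memory locations from fastest to slowest. The rates for tables CZ and QW come from the description; the rate for table A does not appear in the surviving text, so the value 20 is a hypothetical placeholder chosen to fall between the other two. Treating location 516 as reserved is one of the two explanations the description offers for it being vacant.

```python
from typing import Dict, FrozenSet, List

# Accesses per minute. The value for table A is a hypothetical placeholder.
ACCESS_RATE: Dict[str, int] = {"CZ": 50, "A": 20, "QW": 5}

# Memory locations ordered from shortest to longest response time.
LOCATIONS: List[int] = [512, 514, 516, 518]


def place(access_rate: Dict[str, int], locations: List[int],
          reserved: FrozenSet[int] = frozenset()) -> Dict[str, int]:
    """Assign the most-accessed tables to the fastest unreserved locations."""
    free = [loc for loc in locations if loc not in reserved]
    ranked = sorted(access_rate, key=lambda table: access_rate[table], reverse=True)
    # zip stops at the shorter list, so surplus tables are left unplaced.
    return dict(zip(ranked, free))


placement = place(ACCESS_RATE, LOCATIONS, reserved=frozenset({516}))
```

Under these assumptions, `placement` maps CZ to 512, A to 514 and QW to 518, leaving 516 vacant.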
(39) Thus, an artificially-intelligent, continuously-updating centralized database identifier repository system is provided. Persons skilled in the art will appreciate that the present invention can be practiced by other than the described embodiments, which are presented for purposes of illustration rather than of limitation. The present invention is limited only by the claims that follow.