IMPUTING AN OUTCOME ATTRIBUTE TO A PERS RECORD MISSING AN OUTCOME ATTRIBUTE USING A STRUCTURED SITUATION STRING OR UNSTRUCTURED CASE NOTE TEXT ASSOCIATED WITH THE RECORD
20200372982 · 2020-11-26
Inventors
- Mariana Nikolova-Simons (Eindhoven, NL)
- Jorn Op Den Buijs (Eindhoven, NL)
- Linda Schertzer (Framingham, MA, US)
- Debra Alpert (Framingham, MA, US)
CPC classification
G16H50/70
PHYSICS
G16H10/60
PHYSICS
G16H50/30
PHYSICS
International classification
G16H10/60
PHYSICS
H04M3/537
ELECTRICITY
G16H50/30
PHYSICS
Abstract
A computing system (204) includes a memory device (208) configured to store missing outcome instructions (218) and a processor (206) configured to execute the missing outcome instructions. The instructions cause the processor to: evaluate records stored on a storage device (114) of a call center (106) of a system (102), wherein the storage device is configured to store electronic records of calls from a user site (104) to the call center, and each record includes at least a situation attribute field and an outcome attribute field; identify records of the stored records which include at least one of a value in the situation attribute field or case note text in a case note database, but no value in the outcome attribute field; and impute an outcome to the outcome attribute field of each identified record using at least one of the situation attribute and the case note text to predict the missing outcome.
Claims
1. A computing system, comprising: a memory device configured to store missing outcome instructions; and a processor configured to execute the missing outcome instructions, which cause the processor to: evaluate records stored on a storage device of a call center of a system, wherein the storage device is configured to store electronic records of calls from a user site to the call center, and each record includes at least a situation attribute field and an outcome attribute field; identify records of the stored records which include at least one of a value in the situation attribute field or case note text in a case note database, but no value in the outcome attribute field; impute an outcome to the outcome attribute field of each identified record of the stored electronic records using at least one of the situation attribute and the case note text to predict the missing outcome; and store the identified electronic records that include the imputed outcomes in the storage device.
2. The system of claim 1, wherein the situation attribute is selected from a group consisting of a structured situation string and a reference to a case note, and the processor is further configured to identify whether the situation attribute is the structured situation string or the reference to the case note.
3. The system of claim 2, wherein, in response to identifying the situation attribute as the structured situation string and no case note being available, the processor is further configured to determine the outcome to impute to the record using a look-up table which maps structured situation strings to imputed outcomes.
4. The system of claim 2, wherein, in response to at least one of identifying the situation attribute as the reference to the case note and identifying the database has the case note text, the processor is further configured to determine a distribution of a first set of outcome attributes of reference records with both an outcome and a case note, and determine whether the distribution is balanced across outcome attributes based on a predetermined threshold level.
5. The system of claim 4, wherein, in response to determining the distribution is balanced, the processor creates a set of only one new predictive model for the balanced distribution of outcome attributes.
6. The system of claim 4, wherein, in response to determining the distribution is not balanced and the first set can be decomposed into a second smaller set of outcome attributes, the processor decomposes the outcomes of the first set into the second smaller set and creates a set of multiple new predictive models, one for each of the outcomes in the second smaller set.
7. The system of claim 4, wherein, in response to determining the distribution is not balanced and the first set cannot be decomposed into a second smaller set of outcome attributes, the processor groups the outcome attributes into a third set of balanced outcome attributes, which is smaller than the first set, and creates a set of only one new predictive model for the grouped outcome attributes.
8. The system of claim 5, wherein the processor is further configured to train the set of new predictive models with a training set of case notes with a known outcome.
9. The system of claim 8, wherein the processor is further configured to test the trained set of new predictive models with a testing set of case notes with a known outcome.
10. The system of claim 9, wherein the processor is further configured to employ the trained and tested set of new predictive models to impute outcome attributes for each record with the situation attribute or a case note but no outcome attribute.
11. The system of claim 10, wherein the processor is further configured to blend the outcome attribute parts into a single outcome attribute in response to the set of new predictive models including multiple outcome attributes.
12. The system of claim 10, wherein the processor is further configured to confirm at least one of the imputed outcomes.
13. The system of claim 1, wherein the memory device is further configured to store predictive models, including the trained and tested set of predictive models, and the processor is further configured to execute the stored predictive models to predict, for a user of the system, a risk of emergency transport based on the predictive models.
14. The system of claim 13, wherein the processor is further configured to generate and transmit a notification to a clinical site in response to the predicted outcome indicating a user is at risk of transportation to a healthcare facility within a specified time period.
15. A method, comprising: identifying stored records, which include at least a situation attribute field and an outcome attribute field, that have at least one of a situation attribute and case note text in a case note database, and no outcome attribute; determining whether the situation attribute field includes a structured situation string or a reference to a case note in response to the situation attribute field including a situation attribute; performing a structured situation string analysis to impute an outcome to one or more of the records missing an outcome attribute in response to the situation attribute field including the structured situation string; performing a case note analysis to impute an outcome to one or more of the records missing an outcome attribute in response to at least one of the situation attribute field including the reference to the case note and the case note database including the case note text; employing records with outcome attributes and imputed outcome attributes to generate predictive models; employing the predictive models to predict a health state of a user; imputing the predicted health state of the user to a record of the identified records; and storing the record with the imputed predicted health state of the user.
16. (canceled)
17. (canceled)
18. (canceled)
19. (canceled)
20. (canceled)
Description
BRIEF DESCRIPTION OF THE DRAWINGS
DETAILED DESCRIPTION OF EMBODIMENTS
[0038] The type attribute 300 identifies a type of a call, which can be welcome 302, test 304, maintenance 306, check-in 308, accidental 310, incident 312, etc. The situation attribute 400 either identifies a reason for the call via a structured situation string or specifies that there is a case note with unstructured narrative text for the call.
[0039] Returning to
[0040] The memory 208 stores data 214, such as records, and computer readable instructions 216. The processor 206 is configured to execute the computer readable instructions 216. The illustrated computer readable instructions 216 include a missing outcome instruction set or algorithm 218 and a predictive analytics instruction set or algorithm 220. As described in greater detail below, the missing outcome instruction set or algorithm 218 includes instructions which, when executed by the processor 206, cause the processor 206 to at least identify records with a situation attribute or an associated case note but no outcome attribute, and impute an outcome attribute to the outcome field of each identified record using either the situation attribute or the case note, if available.
[0041] The predictive analytics instruction set or algorithm 220 includes instructions that cause the processor 206 to predict users at risk of emergency transport based on predictive models, which are trained and tested on the outcomes in the records, and to generate a notification, when needed, indicating a user is at risk for emergency transport within a specified time period (e.g., 30 days). An example product that performs such an analysis is the CareSage predictive analytics engine, a product of Koninklijke Philips N.V., a company headquartered in the Netherlands. By also using imputed outcomes, the predictive analytics instruction set or algorithm 220 uses additional information that was not previously available, which can reduce patient risk, lower costs and/or improve patient outcomes.
[0043] At 602, stored call records are processed to identify, at least, records with a situation attribute (e.g., a string, a reference to a case note, and/or an associated case note text) but no outcome attribute. Act 602 is omitted where such records have already been identified. A more detailed example is provided below.
[0044] At 604, it is determined whether each of the identified records includes a structured situation string or a reference to a case note, and whether a case note database (e.g., in the storage device 114) has a case note corresponding to each record. For the sake of brevity and explanatory purposes, the term case note herein refers to: 1) the case note referenced in the situation attribute; 2) the case note in the case note database; or 3) both the case note referenced in the situation attribute and the case note in the case note database.
[0045] Returning to
[0046] Returning to
[0047] At 608, one or more new predictive models are employed to predict the value for the missing outcome attribute based on the case notes. A more detailed example is provided below.
[0048] At 610, it is determined whether there is a single predictive model or multiple predictive models utilized to impute the missing outcome attribute.
[0049] If there is more than one predicted value, then at 612, the predicted values are combined or blended according to predetermined assessment rules to produce a single imputed outcome for the file. A more detailed example is provided below.
[0050] If there is only one predictive model, at 614, the model output value is the single imputed outcome for the record.
[0051] At 616, it is determined whether the imputed outcome (from act 606, 612 or 614) is to be confirmed. For example, this information can be in the LUT of
[0052] Returning to
[0053] If the imputed value is not to be confirmed or after confirmation, at 620, the record is populated with the single imputed outcome and stored.
[0054] At 622, it is determined whether there is another identified record.
[0055] If there is another record, then acts 604-622 are repeated for the record.
[0056] If there is no other identified record to process, at 624, the records with imputed outcomes can be stored, further processed, etc. For example, in one non-limiting instance, the records with imputed outcomes can be used, along with the records that include outcomes, in predictive analytics to predict a health state of a user, e.g., users at risk of emergency transport, and to notify the clinical site 118, if needed, that a user(s) is at risk for emergency transport based on the predicted outcome.
[0058] At 802, the stored records are processed to identify files of a predetermined case type (e.g., incident). For example, the computing system 204 receives a user input, via the input device 212, which indicates a case type of interest, and the processor 206 reads the value of the type attribute of each file and identifies the files that include a type attribute equal to the case type of interest.
[0059] At 804, each identified record is processed to determine whether the record includes an outcome attribute. For example, the processor 206 reads the value of the outcome attribute of a record and identifies whether the record includes an outcome attribute. This is repeated for the other identified records.
[0060] At 806, each of the records with no outcome attribute is processed to determine whether the record includes a structured situation string or a case note. For example, the processor 206 reads the value of the situation attribute of a record and searches the case note database, and identifies whether the record includes a structured situation string or a case note. For the latter, the processor 206 extracts the case identification from the record and searches the case note database (e.g., a table thereof) for the case identification. If the case identification is found in the case note database, the processor retrieves the corresponding case note text.
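By way of non-limiting illustration, acts 804-806 can be sketched as follows. The field names ("outcome", "situation", "case_id"), the "CASE:" reference convention, and the dict-based case note database are hypothetical simplifications, not part of the disclosed system.

```python
# Non-limiting sketch of acts 804-806. All field names, the "CASE:" prefix
# convention, and the dict-based case note database are hypothetical.
def triage_record(record, case_note_db):
    """Classify a record and, when its situation refers to a case note,
    retrieve the corresponding case note text from the database."""
    if record.get("outcome") is not None:
        return "has_outcome", None                # act 804: outcome present
    situation = record.get("situation")
    if situation and not situation.startswith("CASE:"):
        return "structured_string", None          # impute via look-up table (act 606)
    note = case_note_db.get(record.get("case_id"))
    if note is not None:
        return "case_note", note                  # impute via predictive models (act 608)
    return "unusable", None                       # no string and no note: flag record

case_note_db = {"C-17": "user fell in bathroom amb disp to hospital"}
record = {"case_id": "C-17", "situation": "CASE:C-17", "outcome": None}
kind, text = triage_record(record, case_note_db)  # kind == "case_note"
```

The "unusable" branch corresponds to flagging records that are missing an outcome, a structured situation string and a case note, as noted in paragraph [0062].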
[0061] Returning to
[0062] In one instance, records that include an outcome attribute are flagged as records that can be used for predictive analytics. Additionally, or alternatively, records that are missing an outcome attribute, a structured situation string and a case note are flagged as records that cannot be used for predictive analytics.
[0064] At 902, records with an outcome attribute and a case note are processed to determine a distribution of the l different outcomes, i.e., the l levels of the categorical variable. For example, using the six (6) outcome possibilities (i.e., l=6) from
[0065] At 904, it is determined whether the distribution of the l levels is well balanced. In one instance, a well-balanced distribution is one in which a difference between a greatest occurrence of an outcome and a least occurrence of an outcome satisfies a predetermined threshold. For example, where there are at least two levels (l≥2), the greatest occurrence is Y for the EMER ASSIST-TRANS level, the least occurrence is Z for the SUB-ASSISTED SELF level, and the predetermined threshold is T, the distribution is well balanced if |Y-Z|<T, and not well balanced otherwise.
[0066] If the l levels are well balanced, at 906, a number of new predictive models is set to one (i.e. N=1).
[0067] If the l levels are not well balanced, at 908, it is determined whether the l levels can be decomposed into n binary pairs (n<l). For example, the six (6) outcome possibilities from
[0068] If the l levels can be decomposed as such, at 910, the number of new predictive models is set to n (i.e. N=n).
[0069] If the l levels cannot be decomposed as such, at 912, the l levels are grouped into L (L<l) well balanced levels, and, at 914, the number of new predictive models is set to one (i.e. N=1).
[0070] At 916, the N predictive models are trained with a training set of case notes with a known outcome.
[0071] At 918, the trained N predictive models are tested with a test set of case notes with a known outcome.
[0072] At 920, the trained and tested N predictive models are used to predict values for the missing outcomes.
[0074] At 1002, blending rules are obtained.
[0075] Returning to
[0076] If P1 is equal to VALUE0 at 1004, at 1006, it is determined whether P2 is equal to VALUE0.
[0077] If P2 is equal to VALUE0 at 1006, at 1008, P1_VALUE0 and P2_VALUE0 are blended to create the single value P1_VALUE0-P2_VALUE0.
[0078] If P2 is not equal to VALUE0 at 1006, at 1010, P1_VALUE0 and P2_VALUE1 are blended to create the single value P1_VALUE0-P2_VALUE1.
[0079] If P1 is not equal to VALUE0 at 1004, at 1012, it is determined whether P2 is equal to VALUE0.
[0080] If P2 is equal to VALUE0 at 1012, at 1014, P1_VALUE1 and P2_VALUE0 are blended to create the single value P1_VALUE1-P2_VALUE0.
[0081] If P2 is not equal to VALUE0 at 1012, at 1016, P1_VALUE1 and P2_VALUE1 are blended to create the single value P1_VALUE1-P2_VALUE1.
[0082] At 1018, the blended value is imputed to the missing outcome.
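The four blending branches of acts 1004-1016 can be sketched as a single function. The labels P1/P2 and VALUE0/VALUE1 follow the placeholders in the text; the predetermined assessment rules obtained at act 1002 would replace the simple hyphen-concatenation assumed here.

```python
# Non-limiting sketch of acts 1004-1018, using the placeholder labels from
# the text. Real assessment rules from act 1002 would replace the simple
# concatenation used here.
def blend(p1, p2, value0="VALUE0", value1="VALUE1"):
    """Blend two model predictions into a single imputed outcome string."""
    part1 = "P1_" + (value0 if p1 == value0 else value1)   # branch at 1004
    part2 = "P2_" + (value0 if p2 == value0 else value1)   # branches at 1006/1012
    return part1 + "-" + part2                             # acts 1008/1010/1014/1016
```

For instance, blend("VALUE0", "VALUE1") produces the single value P1_VALUE0-P2_VALUE1, which is then imputed to the missing outcome (act 1018).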
[0084] At 1202, it is determined, for each user with a record with an imputed outcome, if there is an electronic health record (EHR). In one example, this can be achieved through a PERS-EHR integration interface.
[0085] If there is an EHR for a user, at 1204, the imputed outcome is compared with the outcome in the EHR. In one instance, this includes analyzing the EHR to extract emergency room (ER) and/or hospital admissions information about the outcome and comparing the extracted information with the predicted outcome to confirm the predicted outcome.
[0086] If there is no EHR for the user, at 1206, a call script template is retrieved from a set of predefined templates. The call script retrieved depends on the case type and the predicted case outcome.
[0087] Returning to
[0088] At 1210, the imputed outcome is compared with the actual information.
[0089] At 1212, it is determined if the imputed outcome is the same as the retrieved outcome.
[0090] If the predicted outcome is not the same as the actual outcome, at 1214, the record is updated with the actual outcome.
[0091] If the predicted outcome is the same as the actual outcome or after updating the imputed outcome, at 1216 the record is stored with the confirmed outcome.
[0092] A non-limiting example of using the approach described herein is described next in connection with
[0093] For this example, there are 5,329 records of type incident. The output of act 602 (
[0094] Of the remaining 720 (13.5%), 397 (7.5%) have a situation attribute but no case note, and 323 (6%) have a situation attribute that refers to a case note.
[0097] For this example, the blending rules shown in
[0098] The computing system 204 (
[0099] The following illustrates a non-limiting example that shows a benefit of imputing an outcome attribute as described herein. For this example, a predictive model is trained/validated with historic data and an outcome window, and performance is tested on a cohort without imputation and a cohort with imputation. From
[0100] In another example, the approach described herein is applied to records of the type incident with a partly missing outcome. An example of this category is all incident cases with the assigned outcome Emer Assist-No Status. No Status means that the call center 106 was unable to get the user's transport status upon calling back the user and/or responder. This may occur when the user or responder does not answer the call center follow-up call, the hospital does not confirm the user's admission, or the EMS dispatch center cannot provide transport information. In this example, of the 3,258 subscribers in the previous example, there are 904 (17%) incident cases with the outcome Emer Assist-No Status, which is shown in
[0101] In
[0102] The following describes an example to classify case outcomes based on the case notes (e.g., from a record and/or the case note database). In this example, first, separate case note fragments belonging to a single case are joined. These separate fragments result from a single case often involving different interactions with the user, even by different call centers. Case notes are then converted to lower case and stripped of commas and dots. Words are extracted from the case notes by splitting the case notes on white space. In one instance, the call center can use a standard list of abbreviations, e.g., disp means dispatch and amb means ambulance.
[0103] A table of the frequency of each word is generated and sorted in order of decreasing frequency. For each of the N most frequently occurring words, one-hot encoding can be used to indicate if this word was present in the case note. Thus, a large sparse matrix can be generated with one row per case and N columns for the words, where each cell is 1 if the corresponding word is present in the case note and 0 otherwise. This approach is also known as bag-of-words. In one instance, the number of words N can be varied from 5 to 500 to determine the optimal N required for classification of the case notes.
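A minimal, non-limiting sketch of the preprocessing and bag-of-words encoding described in the two preceding paragraphs follows. The two-entry abbreviation map stands in for the call center's standard list, which is not reproduced in the text.

```python
# Non-limiting sketch of the case note preprocessing and bag-of-words
# encoding. The abbreviation map is an assumed two-entry example.
from collections import Counter

ABBREV = {"disp": "dispatch", "amb": "ambulance"}  # assumed example entries

def preprocess(fragments):
    """Join the fragments of one case, lowercase, strip commas and dots,
    split on white space, and expand known abbreviations."""
    text = " ".join(fragments).lower().replace(",", " ").replace(".", " ")
    return [ABBREV.get(word, word) for word in text.split()]

def bag_of_words(cases, n_words):
    """Build one 0/1 row per case over the n_words most frequent words
    (one-hot presence encoding)."""
    tokenized = [preprocess(fragments) for fragments in cases]
    frequency = Counter(word for tokens in tokenized for word in tokens)
    vocab = [word for word, _ in frequency.most_common(n_words)]
    matrix = [[1 if word in tokens else 0 for word in vocab]
              for tokens in tokenized]
    return vocab, matrix
```

In practice the resulting matrix would be stored sparsely, since most of the N columns are 0 for any given case.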
[0104] The set of case notes is randomly split into a training set and a test set using a 0.75/0.25 ratio. The training set is used to develop the case outcome classifier. The classifier(s) is developed using a machine learning technology, e.g., a boosted regression trees approach referred to as extreme gradient boosting. Due to its tree-based nature, this methodology allows for automatic selection of interactions between variables. Variable importance is determined according to a gain, which is a measure of the relative contribution of the corresponding variable to the predictive model, calculated by taking the improvement in accuracy brought by a variable to the branches it is on. The boosted regression model involves tuning hyperparameters of the learning algorithm, such as the number of trees, the maximum depth of the trees (defining the degree of interaction between variables), and the learning rate. This optimization can be achieved using 5-fold cross-validation on the training set, with the optimization metric determined by the AUC in the test fold.
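The 0.75/0.25 split can be sketched as follows; the seed and helper name are illustrative, and the extreme gradient boosting classifier itself (e.g., fitted with the xgboost library and tuned by 5-fold cross-validation on AUC, as described above) is outside this sketch.

```python
# Non-limiting sketch of the 0.75/0.25 random train/test split. The seed
# is illustrative; the boosted-trees classifier is not shown here.
import random

def split_cases(labeled_cases, train_fraction=0.75, seed=42):
    """Randomly partition labeled case notes into training and test sets."""
    shuffled = list(labeled_cases)
    random.Random(seed).shuffle(shuffled)      # reproducible shuffle
    cut = int(len(shuffled) * train_fraction)  # 75% boundary
    return shuffled[:cut], shuffled[cut:]
```

A fixed seed makes the partition reproducible across runs, which helps when comparing classifier variants on the same held-out test set.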
[0105] The separate test set is used to validate the performance of the classifiers, which is depicted in
[0106] In another embodiment, the approach described herein is applied to pre-populate case attributes while call center personnel are in a conversation with a subscriber and type a case note. In this embodiment, the predictive models analyze the case note text as the text is entered (i.e., in real time) and predict the values of different case attributes. These values are pre-filled in the record for the personnel, who can confirm them and proceed with closing out the case. In one instance, this will lead to improved call center efficiency and quality of recorded data.
[0107] In another embodiment, the approach described herein is applied to analyze case notes typed by a sales representative and predict the probability of finalizing an inbound call with a new subscriber and no need for a call back.
[0108] The method(s) described herein may be implemented by way of computer readable instructions, encoded or embedded on computer readable storage medium (which excludes transitory medium), which, when executed by a computer processor(s) (e.g., CPU, microprocessor, etc.), cause the processor(s) to carry out acts described herein. Additionally, or alternatively, at least one of the computer readable instructions is carried by a signal, carrier wave or other transitory medium, which is not computer readable storage medium.
[0109] While the invention has been illustrated and described in detail in the drawings and foregoing description, such illustration and description are to be considered illustrative or exemplary and not restrictive; the invention is not limited to the disclosed embodiments. Other variations to the disclosed embodiments can be understood and effected by those skilled in the art in practicing the claimed invention, from a study of the drawings, the disclosure, and the appended claims.
[0110] In the claims, the word comprising does not exclude other elements or steps, and the indefinite article a or an does not exclude a plurality. A single processor or other unit may fulfill the functions of several items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
[0111] A computer program may be stored/distributed on a suitable medium, such as an optical storage medium or a solid-state medium supplied together with or as part of other hardware, but may also be distributed in other forms, such as via the Internet or other wired or wireless telecommunication systems. Any reference signs in the claims should not be construed as limiting the scope.