TOOL FOR PREDICTING HEALTH AND DRUG ABUSE CRISIS
20210104333 · 2021-04-08
Assignee
Inventors
Cpc classification
G16H50/20
PHYSICS
G16H10/60
PHYSICS
G16H20/10
PHYSICS
G16H50/30
PHYSICS
A61B5/4845
HUMAN NECESSITIES
G16H10/40
PHYSICS
G16H40/20
PHYSICS
G16H50/70
PHYSICS
G06N5/01
PHYSICS
International classification
G16H50/80
PHYSICS
A61B5/00
HUMAN NECESSITIES
G06Q10/06
PHYSICS
G16H10/40
PHYSICS
G16H10/60
PHYSICS
G16H20/10
PHYSICS
G16H40/20
PHYSICS
G16H50/20
PHYSICS
G16H50/30
PHYSICS
Abstract
Systems and methods are provided for understanding, forecasting, managing, and mitigating healthcare crises. A real-time health crisis forecast system and method may include predictor variable data sets such as urine drug testing (UDT) data and demographic data for selected regional populations during selected timeframes and dependent variable data such as mortality rates for selected regional populations during selected timeframes. A health forecast model describing the relationship between the predictor variable and dependent variable data may be generated using selected statistical methods. A model may be used to generate a real-time health crisis forecast for a selected population during a selected timeframe based on inputs of updated predictor variable data. A dashboard presenting graphical representations of a real-time health crisis forecast may provide relevant organizations with a resource allocation and deployment plan, enabling a proactive response.
Claims
1. A health forecasting system comprising: a health forecasting logical circuit and a graphical user interface, the health forecasting logical circuit comprising a processor; and a non-transient memory with computer executable instructions embedded thereon, the computer executable instructions configured to cause the processor to: obtain a first data set from a first data source, wherein the first data set is selected from a group consisting of: positive drug test rate for one or more controlled substance, crime lab seizure data, emergency room visitation data, prescription rates, and demographic data for a regional population; obtain a second data set from a second data source, wherein the second data set comprises mortality data for a regional population; and train, with a crisis prediction logical circuit, a health forecasting model, wherein the health forecasting model describes a relationship between the second data set and the first data set by a dual validation approach including temporally offset data sets.
2. The system of claim 1, wherein the health forecasting model comprises a logistic regression, a gradient-boosted decision tree, or a cognitive neural network.
3. The system of claim 1, wherein the computer executable instructions further cause the processor to: update the first data set from the first data source on a selected time interval; apply the health forecasting model to the updated first data set; generate a real-time drug crisis forecast based on the application of the health forecasting model to the first data set for the selected time interval; and generate one or more graphical data representations based on the generated drug crisis forecast, wherein the graphical data representations are based on user-selected model data structures.
4. The system of claim 3, wherein the computer executable instructions further cause the processor to: obtain a third data set from a third data source comprising available drug crisis response resources for a regional population; and generate a resource deployment plan based on the drug crisis forecast and the available drug crisis response resources in the selected geographical region.
5. The system of claim 4, wherein the system generates comparative drug crisis forecasts for multiple selected geographic regions such that a comparative risk assessment may be performed and a resource allocation plan for the selected geographic regions may be generated based on the comparative risk assessment.
6. The system of claim 1, wherein the first data set comprises urine drug testing (“UDT”) data.
7. The system of claim 1, wherein the first data set comprises demographic data selected from a group consisting of: unemployment rates, education rates, poverty rates, and insurance rates.
8. The system of claim 6, wherein the UDT data is collected at the county level for a regional population and is updated on a monthly timeframe.
9. The system of claim 7 wherein the demographic data is collected at the county level for a regional population and is updated on a monthly timeframe.
10. The system of claim 1, wherein the processor trains the health forecasting model to describe the relationship between the first and second data sets using at least one of the following regression methods: Poisson regression, negative binomial regression, logistic regression, regression trees, random forest, regularized regression, and non-linear prediction.
11. The system of claim 3, wherein the user-selected model data structures provide a comparative risk assessment and include at least one of the following: a choropleth map and a table ranking counties by determined risk level.
12. A method for mitigating the localized impact of a health crisis, the method comprising: obtaining, with a graphical user interface, a first data set from a first data source, wherein the first data set is selected from a group consisting of: positive drug test rate for one or more controlled substance, crime lab seizure data, emergency room visitation data, prescription rates, and demographic data for a regional population; obtaining with a graphical user interface a second data set from a second data source, wherein the second data set comprises mortality data for a regional population; training, with a crisis prediction logical circuit, a health forecasting model, wherein the health forecasting model describes a relationship between the second data set and the first data set, by a dual validation approach including temporally offset data sets; updating the first data set from the first data source on a selected time interval; applying the health forecast model to the updated first data set; generating a real-time health crisis forecast based on the application of the health crisis model to the updated first data set for the selected time interval; generating one or more graphical data representations based on the generated health crisis forecast, wherein the graphical data representations are based on user-selected model data structures.
13. The method of claim 12, wherein the health forecast model comprises a logistic regression model, a gradient-boosted decision tree, or a cognitive neural network.
14. The method of claim 12, wherein the first data set comprises UDT data.
15. The method of claim 12, wherein the first data set comprises demographic data selected from a group consisting of: unemployment rates, education rates, poverty rates, and insurance rates.
16. The method of claim 14, wherein the UDT data is collected at the county level for a regional population and is updated on a monthly timeframe.
17. The method of claim 15 wherein the demographic data is collected at the county level for a regional population and is updated on a monthly timeframe.
18. The method of claim 12, wherein the processor trains the health forecasting model to describe the relationship between the first and second data sets using at least one of the following regression methods: Poisson regression, negative binomial regression, logistic regression, regression trees, random forest, regularized regression, and non-linear prediction.
19. The method of claim 12, wherein the user-selected model data structures provide a comparative risk assessment and include at least one of the following: a choropleth map and a table ranking counties by determined risk level.
20. A health forecasting system comprising: a health forecasting logical circuit and a graphical user interface, the health forecasting logical circuit comprising a processor; and a non-transient memory with computer executable instructions embedded thereon, the computer executable instructions configured to cause the processor to: obtain a first data set from a first data source, wherein the first data set is selected from a group consisting of: positive drug test rate for one or more controlled substance, crime lab seizure data, emergency room visitation data, prescription rates, and demographic data for a regional population; obtain a second data set from a second data source, wherein the second data set comprises mortality data for a regional population; and train, with a crisis prediction logical circuit, a health forecasting model, wherein the health forecasting model describes a relationship between the second data set and the first data set; wherein the health forecasting model comprises a logistic regression model, a gradient-boosted decision tree, or a cognitive neural network; update the first data set from the first data source on a selected time interval; apply the health forecast model to the updated first data set; generate a real-time health crisis forecast based on the application of the health forecasting model to the first data set for the selected time interval; generate one or more graphical data representations based on the generated health crisis forecast, wherein the graphical data representations are based on user-selected model data structures.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0028] The present disclosure, in accordance with one or more various embodiments, is described in detail with reference to the following figures. The figures are provided for purposes of illustration only and merely depict typical or example embodiments.
[0044] The figures are not exhaustive and do not limit the present disclosure to the precise form disclosed.
DETAILED DESCRIPTION
[0045] Some embodiments of the disclosure provide a method for generating a real-time health crisis forecast based on careful selection of predictive data and modeling to empirically develop a predictive model describing the relationship between predictive data and mortality data. For example, Urine Drug Testing (UDT) data may be selected for a specific time frame to create a predictive model, such as a period of about six years from 2013 to 2018. UDT results of patient specimens submitted for testing by health care professionals as part of routine care may be used. Specimens may be collected for the entire country or for a geographic subregion, such as a specific state or county. To remove repeated measurements for the same patient from the analysis, a single specimen for each patient may be selected based on the earliest specimen collection date and used for downstream analysis.
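The de-duplication step described above can be sketched in a few lines. This is a minimal illustration, not the disclosed implementation; the record fields (patient_id, collected, county) and the sample values are hypothetical.

```python
from datetime import date

def earliest_specimen_per_patient(specimens):
    """Keep one specimen per patient: the one with the earliest collection date."""
    earliest = {}
    for spec in specimens:
        pid = spec["patient_id"]
        # Replace the stored specimen only when this one was collected earlier.
        if pid not in earliest or spec["collected"] < earliest[pid]["collected"]:
            earliest[pid] = spec
    return list(earliest.values())

# Hypothetical records; field names are assumptions made for illustration.
specimens = [
    {"patient_id": "P1", "collected": date(2016, 5, 2), "county": "06037"},
    {"patient_id": "P1", "collected": date(2015, 1, 9), "county": "06037"},
    {"patient_id": "P2", "collected": date(2017, 3, 14), "county": "06073"},
]
unique = earliest_specimen_per_patient(specimens)
```

Each patient then contributes exactly one observation to any downstream positivity rate.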
[0046] While the prediction model is described herein with reference, for example, to forecasting and understanding drug use, it should be understood that the same model may also be applied to forecasting disease spread across local regions to manage a disease and/or infection crisis, and/or mitigate the localized impact of an epidemic, such as the COVID-19 epidemic.
[0047] Because UDT data may be collected during routine medical care, a large sample size may be used to create an accurate predictive model. For instance, in one example embodiment, a sample size exceeding 1 million randomly sampled patient specimens may be used. Samples may be collected for adult patients. The UDT tests selected for inclusion may test for several classes of drugs including methamphetamine, heroin, fentanyl and prescription opioids. The UDT tests may employ a liquid chromatography-tandem mass spectrometry method to detect the presence of drugs in selected drug classes. The liquid chromatography-tandem mass spectrometry testing method is a laboratory-developed test with performance characteristics determined by Millennium Health, San Diego, Calif., which is certified by the Clinical Laboratory Improvement Amendments and accredited by the College of American Pathologists for high-complexity testing.
[0048] UDT tests may be used to identify at least the following classes of drugs: methamphetamine, cocaine (benzoylecgonine), fentanyl (fentanyl and norfentanyl), heroin (6-MAM) and prescription opioids (codeine, hydrocodone, norhydrocodone, hydromorphone, morphine, oxycodone, noroxycodone and oxymorphone). A UDT test result may be considered positive if any parent analyte or metabolite within a drug class is detected. In some embodiments, in the prescription opioid class, all analytes and/or metabolites may need to be ordered and a valid testing result of all analytes and/or metabolites may be necessary for each specimen in order to accurately confirm an affirmative finding that drugs in the opioid class were detected. Health care professionals may report a patient's prescribed medications and disclose that information along with a UDT test. In some embodiments, UDT results analyzed may only include non-prescription drugs.
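The class-level positivity rule above (a class is positive if any parent analyte or metabolite is detected) might be sketched as follows. The analyte lists come from the paragraph above; the function names, the specimen layout, and the `require_all_valid` flag, which models the stricter handling described for the prescription opioid class, are assumptions made for illustration.

```python
# Analytes per drug class, as listed in the paragraph above.
DRUG_CLASSES = {
    "methamphetamine": ["methamphetamine"],
    "cocaine": ["benzoylecgonine"],
    "fentanyl": ["fentanyl", "norfentanyl"],
    "heroin": ["6-MAM"],
    "prescription_opioids": [
        "codeine", "hydrocodone", "norhydrocodone", "hydromorphone",
        "morphine", "oxycodone", "noroxycodone", "oxymorphone",
    ],
}

def class_result(specimen, drug_class, require_all_valid=False):
    """Return True/False for a class, or None if the specimen is excluded.

    `specimen` maps analyte name -> True/False; a missing analyte means it
    was not ordered or had no valid result (hypothetical layout).
    """
    results = [specimen.get(analyte) for analyte in DRUG_CLASSES[drug_class]]
    if require_all_valid and any(r is None for r in results):
        return None  # excluded: not every analyte has a valid result
    return any(r is True for r in results)

def positivity_rate(specimens, drug_class, require_all_valid=False):
    """Percent of non-excluded specimens that are positive for the class."""
    results = [class_result(s, drug_class, require_all_valid) for s in specimens]
    valid = [r for r in results if r is not None]
    if not valid:
        return None
    return 100.0 * sum(valid) / len(valid)

specimens = [
    {"fentanyl": False, "norfentanyl": True},   # positive via the metabolite
    {"fentanyl": False, "norfentanyl": False},  # negative
]
fentanyl_rate = positivity_rate(specimens, "fentanyl")
```

With `require_all_valid=False`, a missing analyte is simply treated as not detected; a real pipeline would need to decide how untested specimens are handled.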
[0049] An embodiment including the study of UDT data may follow a study protocol approved by an appropriate body, such as the Aspire Independent Review Board. For example, consistent with best practices, the study of UDT data may include a waiver of consent for the use of deidentified patient data and may conform to study guidelines set forth in the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guideline.
[0050] In some embodiments, mortality data indicating drug overdose deaths may be identified in the National Vital Statistics System multiple cause-of-death mortality files using International Classification of Diseases, Tenth Revision (ICD-10) codes, or in another appropriate data source. In some embodiments, underlying cause-of-death (UCD) codes including X40-X44 (unintentional), X60-X64 (suicide), X85 (homicide), and Y10-Y14 (undetermined intent) may be considered and collected to estimate drug overdose mortalities. In some embodiments, deaths having drug overdose identified as the underlying cause of death may be included in the mortality data set used to create a model if the following ICD-10 multiple cause-of-death (MCD) codes were indicated: T40.1 (heroin), T40.2 (natural/semisynthetic opioids), T40.4 (other synthetic narcotics), T40.5 (cocaine), and T43.6 (psychostimulants with abuse potential). Mortality rates may be collected at a selected regional level. For instance, in some embodiments, mortality rates may be collected at the state level. In another embodiment, mortality rates may be collected at the county level. In another embodiment, mortality rates may be collected at both the state and county level and/or for some other regional population. Mortality rates may be collected for a selected time interval. In some embodiments, mortality rates may be collected over a recent five-year interval.
[0051] In some embodiments, demographic data for regional populations may be collected from the American Community Survey (ACS 2018 Data Release) and/or from another appropriate data source. Regional population features having predictive value may be selected from the demographic data. Demographic data may be collected for a selected time interval. In some embodiments, demographic data may be collected over a recent interval of about five years. In some embodiments, demographic data may be collected at the state, county, or city level, or for any other selected geographic region or regions of interest. In some embodiments, selected data may include social, economic, housing, and/or demographic data obtained from the ACS data profiles.
[0052] In some embodiments, other data, such as annual U.S. opioid prescription rates, may be collected for inclusion in the model. This data may be collected from the Centers for Disease Control and Prevention (CDC) or from another appropriate data source. This data may be collected on a selected time interval. For example, the data may be collected for a specific year, such as the year 2018. The data may be collected for selected regional populations, such as at the state and county level. In some instances, data may be missing or insufficient for a selected regional population, such as a particular county, for a selected time interval, such as during a given year. In these instances, the missing or insufficient rates may be imputed using a mean level for an encompassing geographic region, for instance at the state level, for the selected year. For example, Table 1, below, shows demographic features which may be selected for inclusion in a model and shows an imputed prescription rate feature for a selected year:
TABLE-US-00001 TABLE 1 Embodiment Showing Selected Demographic Features for Inclusion in a Predictive Model

ACS variable ID | Description | Model ID
--- | --- | ---
DP05_0001E | Estimate!!SEX AND AGE!!Total population | total_population (used as Poisson offset in model)
DP05_0002PE | Percent!!SEX AND AGE!!Male | pct_male
DP05_0017E | Estimate!!SEX AND AGE!!Median age (years) | median_age
DP05_0059PE | Percent!!RACE!!White | pct_white
DP03_0009PE | Percent!!EMPLOYMENT STATUS!!Percent Unemployed | pct_unemployed
DP03_0062E | Estimate!!INCOME AND BENEFITS (IN 2012 INFLATION-ADJUSTED DOLLARS)!!Median household income (dollars) | median_householdIncome
DP03_0099PE | Percent!!HEALTH INSURANCE COVERAGE!!No health insurance coverage | pct_noHealthInsurance
DP03_0119PE | Percent!!PERCENTAGE OF FAMILIES AND PEOPLE WHOSE INCOME IN THE PAST 12 MONTHS IS BELOW THE POVERTY LEVEL!!All families | pct_familyBelowPoverty
DP02_0067PE | Percent!!EDUCATIONAL ATTAINMENT!!Percent bachelor's degree or higher | pct_higherEdGrad
DP02_0069PE | Percent!!VETERAN STATUS!!Civilian veterans | pct_veterans
DP02_0071PE | Percent!!DISABILITY STATUS OF THE CIVILIAN NONINSTITUTIONALIZED POPULATION!!With a disability | pct_disability
DP02_0009PE + DP02_0007PE | Percent single male + single female head of household | pct_singleHousehold
NA | County level opioid prescription rate (missing county data imputed with the state average for a given year) | mean_imputed_Prescribing_Rate
NA | UDT positivity rate in percentage | Methamphetamine Positivity (%)
NA | UDT positivity rate in percentage | Cocaine Positivity (%)
NA | UDT positivity rate in percentage | Heroin Positivity (%)
NA | UDT positivity rate in percentage | Fentanyl Positivity (%)
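The state-average imputation behind the mean_imputed_Prescribing_Rate feature might look like the sketch below. It assumes the state average is the mean of the counties in that state that do report a rate for the year; the disclosure does not specify how the state mean is computed, and the field names and values are hypothetical.

```python
from statistics import mean

def impute_with_state_mean(county_rates):
    """Fill missing county rates (None) with the mean of reporting counties in the same state."""
    observed_by_state = {}
    for row in county_rates:
        if row["rate"] is not None:
            observed_by_state.setdefault(row["state"], []).append(row["rate"])
    state_means = {state: mean(rates) for state, rates in observed_by_state.items()}

    imputed = []
    for row in county_rates:
        rate = row["rate"]
        if rate is None:
            # Stays None when no county in the state reported a rate.
            rate = state_means.get(row["state"])
        imputed.append({**row, "rate": rate, "imputed": row["rate"] is None})
    return imputed

# Hypothetical prescribing rates for a single year.
rates = [
    {"state": "CA", "county": "A", "rate": 40.0},
    {"state": "CA", "county": "B", "rate": 60.0},
    {"state": "CA", "county": "C", "rate": None},
]
filled = impute_with_state_mean(rates)
```

Keeping an `imputed` flag alongside each value makes it possible to check later whether imputed counties behave differently in the model.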
[0053] A statistical method or methods may be selected to create a predictive model. In some embodiments, a regression model, such as a Poisson regression, may be used to generate a predictive model. The dependent variable in an example embodiment using a Poisson regression may be the number of drug overdose deaths occurring in a selected geographic region, such as a county, during a selected time interval, such as a given year. Predictor variables in an example embodiment using a Poisson regression may be UDT positive rates and/or data comprising selected demographic features for a selected geographic region, such as the county or state level, during a selected time interval. Table 1, above, shows selected predictor variables for inclusion in a model in an example embodiment.
[0054] In some embodiments, an initial regression model may be used in which county and state are treated as random effects in a random intercept model, with county nested within state. Fixed-effect regression coefficients may be converted to incidence rate ratios (IRR) to allow for easier interpretation of the modeled relationship. Mortality and incidence rates may be determined as deaths per 100,000 members of a population. Statistical models may be determined on estimates for a selected geographic region, such as the county level, and for a selected time interval, such as over a recent five-year period.
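The two conversions mentioned above can be illustrated briefly. In a Poisson model with a log link, a coefficient on the log scale is converted to an incidence rate ratio by exponentiation. The Wald-style 95% interval (z = 1.96) and the function names are assumptions made for illustration, not details taken from the disclosure.

```python
import math

def crude_rate_per_100k(deaths, population):
    """Deaths per 100,000 members of a population."""
    return 100_000 * deaths / population

def to_irr(beta, se, z=1.96):
    """Convert a log-scale coefficient and its standard error to (IRR, lower, upper).

    The interval is a Wald interval computed on the log scale and then
    exponentiated (an assumed choice for this sketch).
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

rate = crude_rate_per_100k(deaths=25, population=500_000)  # 5.0 deaths per 100,000
irr, lower, upper = to_irr(beta=0.0, se=0.1)               # IRR of 1.0 means no effect
```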
[0055] In some embodiments, validations may be performed to evaluate the predictive value of the generated model. In some embodiments, a first validation may train Poisson model parameters with county-level data over a two-year period and test them against a dataset from the year following the two-year training period. A second validation may repeat the procedure with the data shifted forward an additional year. Performance metrics may be used to assess the accuracy and predictive value of the model. These may include a Pearson correlation of observed and predicted mortality rates, a Mean Square Error (MSE), a Mean Absolute Deviation (MAD), and/or a Mean Absolute Percent Error (MAPE). Statistical software, such as R statistical software version 4.0.2 (R Project for Statistical Computing) or another appropriate program or method, may be used for data analysis. In some embodiments using R for data analysis, the glmer() function from the lme4 v1.1-23 package may be used for estimating Poisson mixed models. In other embodiments, other appropriate estimation methods may be used. Table 2, below, shows an example embodiment of pairwise univariate Pearson correlations for a mortality rate and other predictor variables determined for a regression model for a selected year:
TABLE-US-00002 TABLE 2 Embodiment Showing Univariate Correlation of Mortality and Predictor Variables Determined for a Given Year at the County Level

(Part 1 of 4)
Pearson Correlation | pct_singleHousehold | pct_higherEdgrad | pct_veterans | pct_disability | pct_unemployed
--- | --- | --- | --- | --- | ---
pct_singleHousehold | 1.0000 | −0.4748 | 0.0084 | 0.2047 | 0.5901
pct_higherEdgrad | −0.4748 | 1.0000 | −0.3608 | −0.7316 | −0.5256
pct_veterans | 0.0084 | −0.3608 | 1.0000 | 0.4189 | 0.0699
pct_disability | 0.2047 | −0.7316 | 0.4189 | 1.0000 | 0.4875
pct_unemployed | 0.5901 | −0.5256 | 0.0699 | 0.4875 | 1.0000
pct_householdincome | −0.5172 | 0.7299 | −0.2062 | −0.7475 | −0.5630
pct_noHealthinsurance | 0.4488 | −0.3022 | 0.0184 | 0.1257 | 0.3769
pct_familyBelowPoverty | 0.7466 | −0.5422 | −0.0632 | 0.5024 | 0.7273
pct_male | −0.1109 | −0.1869 | 0.2428 | 0.0270 | −0.0318
median_age | −0.4989 | −0.1320 | 0.1955 | 0.3738 | −0.0931
pct_white | −0.4971 | −0.1998 | 0.2166 | 0.2775 | −0.3691
mean_imputed_Prescribing_Rate | 0.1005 | −0.5638 | 0.4276 | 0.6984 | 0.2042
methamphetamine_positivity | 0.0349 | −0.1096 | 0.0673 | 0.0839 | −0.0624
heroin_positivity | −0.0303 | 0.0530 | −0.0636 | −0.0199 | −0.0594
cocaine_positivity | 0.0407 | 0.0665 | −0.0703 | 0.0187 | 0.0230
fentanyl_positivity | −0.0512 | −0.0058 | −0.0205 | 0.0280 | −0.0807
opioids_positivity | −0.0450 | −0.1799 | 0.0325 | 0.1861 | 0.0604
Crude.Rate | 0.0159 | −0.3266 | 0.1538 | 0.4644 | 0.1188

(Part 2 of 4)
Pearson Correlation | pct_householdincome | pct_noHealthinsurance | pct_familyBelowPoverty | pct_male
--- | --- | --- | --- | ---
pct_singleHousehold | −0.5172 | 0.4488 | 0.7446 | −0.1109
pct_higherEdgrad | 0.7299 | −0.3022 | −0.5422 | −0.1869
pct_veterans | −0.2062 | 0.0184 | −0.0632 | 0.2428
pct_disability | −0.7475 | 0.1257 | 0.5024 | 0.0270
pct_unemployed | −0.5630 | 0.3769 | 0.7273 | −0.0318
pct_householdincome | 1.0000 | −0.4068 | −0.7702 | 0.0992
pct_noHealthinsurance | −0.4068 | 1.0000 | 0.5700 | −0.0075
pct_familyBelowPoverty | −0.7702 | 0.5700 | 1.0000 | −0.1305
pct_male | 0.0992 | −0.0075 | −0.1305 | 1.0000
median_age | 0.0054 | −0.2887 | −0.3020 | −0.1708
pct_white | −0.0461 | −0.3250 | −0.3502 | 0.3077
mean_imputed_Prescribing_Rate | −0.5496 | 0.0957 | 0.2407 | −0.0510
methamphetamine_positivity | −0.0850 | 0.0943 | 0.0535 | 0.2288
heroin_positivity | 0.0910 | −0.2132 | −0.0588 | −0.0134
cocaine_positivity | −0.0187 | −0.1774 | 0.0229 | −0.2536
fentanyl_positivity | 0.0435 | −0.3256 | −0.0932 | −0.1401
opioids_positivity | −0.0384 | −0.0662 | 0.0223 | −0.0025
Crude.Rate | −0.2663 | −0.1936 | 0.1361 | −0.0412

(Part 3 of 4)
Pearson Correlation | median_age | pct_white | mean_imputed_Prescribing_Rate | methamphetamine_positivity | heroin_positivity
--- | --- | --- | --- | --- | ---
pct_singleHousehold | −0.4989 | −0.4971 | 0.1005 | 0.0349 | −0.0303
pct_higherEdgrad | −0.1320 | −0.1998 | −0.5638 | −0.1096 | 0.0530
pct_veterans | 0.1955 | 0.2166 | 0.4276 | 0.0673 | −0.0636
pct_disability | 0.3738 | 0.2775 | 0.6984 | 0.0839 | −0.0199
pct_unemployed | −0.0931 | −0.3691 | 0.2042 | −0.0624 | −0.0594
pct_householdincome | 0.0054 | −0.0461 | −0.5496 | −0.0850 | 0.0910
pct_noHealthinsurance | −0.2887 | −0.3250 | 0.0957 | 0.0943 | −0.2132
pct_familyBelowPoverty | −0.3020 | −0.3502 | 0.2407 | 0.0535 | −0.0588
pct_male | −0.1708 | 0.3077 | −0.0510 | 0.2288 | −0.0134
median_age | 1.0000 | 0.4671 | 0.2727 | −0.1885 | 0.0579
pct_white | 0.4671 | 1.0000 | 0.3065 | 0.1200 | 0.0649
mean_imputed_Prescribing_Rate | 0.2727 | 0.3065 | 1.0000 | 0.1395 | −0.0345
methamphetamine_positivity | −0.1885 | 0.1200 | 0.1395 | 1.0000 | 0.1349
heroin_positivity | 0.0579 | 0.0649 | −0.0345 | 0.1349 | 1.0000
cocaine_positivity | 0.1546 | −0.0742 | −0.0445 | −0.1620 | 0.5483
fentanyl_positivity | 0.1722 | 0.1418 | −0.0201 | −0.0628 | 0.6295
opioids_positivity | 0.1922 | 0.1641 | 0.2179 | 0.0330 | 0.4482
Crude.Rate | 0.3148 | 0.2430 | 0.3088 | −0.0907 | 0.3814

(Part 4 of 4)
Pearson Correlation | cocaine_positivity | fentanyl_positivity | opioids_positivity | Crude.Rate
--- | --- | --- | --- | ---
pct_singleHousehold | 0.0407 | −0.0512 | −0.0450 | 0.0159
pct_higherEdgrad | 0.665 | −0.0058 | −0.1799 | −0.3266
pct_veterans | −0.0703 | −0.0205 | 0.0325 | 0.1538
pct_disability | 0.0187 | 0.0280 | 0.1961 | 0.4644
pct_unemployed | 0.0230 | −0.0807 | 0.0604 | 0.1188
pct_householdincome | −0.0187 | 0.0435 | −0.0384 | −0.2663
pct_noHealthinsurance | −0.1774 | −0.3256 | −0.0662 | −0.1936
pct_familyBelowPoverty | 0.0229 | −0.0932 | 0.0223 | 0.1361
pct_male | −0.2536 | −0.1401 | −0.0025 | −0.0412
median_age | 0.1546 | 0.1722 | 0.1922 | 0.3148
pct_white | −0.0742 | 0.1418 | 0.1641 | 0.2430
mean_imputed_Prescribing_Rate | −0.0445 | −0.0201 | 0.2179 | 0.3088
methamphetamine_positivity | −0.1620 | −0.0628 | 0.0330 | −0.0907
heroin_positivity | 0.5483 | 0.6295 | 0.4482 | 0.3814
cocaine_positivity | 1.0000 | 0.5471 | 0.3066 | 0.3987
fentanyl_positivity | 0.5471 | 1.0000 | 0.3954 | 0.4463
opioids_positivity | 0.3066 | 0.3954 | 1.0000 | 0.2748
Crude.Rate | 0.3987 | 0.4463 | 0.2748 | 1.0000
[0056] In an embodiment of the present disclosure, a processor may empirically determine a health forecasting model by obtaining training data sets and performing regression modeling. The health forecasting model may be trained by empirically determining the relationship between two training data sets. For example, the training data sets may include a predictor variable data set, including, for example, UDT data and demographic data, and a dependent variable data set, including, for example, mortality data. A regression model may be used to express the relationship describing the dependent variable data, such as mortality data, as a function of the predictor variable data, such as the UDT and demographic data. Coefficients for the regression model may be empirically determined based on the training data sets. Then, a health forecasting model including the empirically determined coefficients can be generated to describe the relationship between predictor and dependent variable data.
[0057] In an embodiment, regression models may be used. For example, a linear regression model, given by the function:
$$F(x) = B_0 + B_1 x_1 + B_2 x_2 + \cdots + B_k x_k$$
[0058] may be applied to express the dependent variable data as a function of the predictor variable data set. The coefficients $B_0, B_1, B_2, \ldots, B_k$ may be empirically determined and used to generate a health forecasting model.
[0059] In another embodiment, a Poisson regression model, given by the expression:
$$\ln(F(x)) = B_0 + B_1 x_1 + B_2 x_2 + \cdots + B_k x_k$$
[0060] may be applied, for example, to express a mortality rate as a function of a UDT positivity rate and other demographic rates, such as education rates, insurance rates, unemployment rates, and poverty rates. As described above with respect to the linear regression model, coefficients may be empirically determined by applying the Poisson regression model to training data sets to generate a health forecasting model including the determined coefficients.
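Table 1 notes that total_population is used as a Poisson offset. Under that reading, the Poisson expression above can be written for a county $i$ with population $N_i$ and expected overdose deaths $\mu_i$ as sketched below. The indexing by county and the symbols $\mu_i$ and $N_i$ are introduced here for illustration, and the random intercepts for state and county are omitted for brevity.

```latex
% Poisson regression with a population offset (illustrative notation)
\ln \mu_i = \ln N_i + B_0 + B_1 x_{i1} + B_2 x_{i2} + \cdots + B_k x_{ik}

% Equivalently, the model is linear in the log of the mortality rate:
\ln\!\left(\frac{\mu_i}{N_i}\right) = B_0 + B_1 x_{i1} + B_2 x_{i2} + \cdots + B_k x_{ik}

% A one-unit increase in predictor x_j multiplies the expected rate by e^{B_j},
% which is the incidence rate ratio for that predictor:
\frac{\mu_i / N_i \,\big|\, x_{ij} + 1}{\mu_i / N_i \,\big|\, x_{ij}} = e^{B_j}
```

In this form, $F(x)$ in the expression above corresponds to the expected mortality rate $\mu_i / N_i$.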
[0061] Those having skill in the art will appreciate that these functions are merely example functions and other statistical methods may be available. Additionally, correction coefficients and other known modeling concepts may be applied in conjunction with the above regression models, or other models, to accurately derive a health forecasting model.
[0062] As shown below in Table 3, Poisson regression coefficients, expressed as incidence rate ratios (IRR), may be determined for a selected test data time interval. For example, the test data time interval may be from about the years 2013 to 2018. A coefficient may represent the contribution of a specific class of drug detected by UDT tests. For example, as shown below, fentanyl positivity has an incidence rate ratio of 1.091, which indicates the relationship between this factor and predicted mortality: for every one-unit increase in fentanyl positivity, deaths in the county that the health forecasting model was generated to describe may increase by 9.1% in a given year. The coefficients represent the effects of fentanyl positive rates, unemployment rates, disability rates, methamphetamine rates, education rates, opioid rates, poverty rates, heroin rates, prescription rates, and insurance rates, as well as the effects of other predictive factors.
TABLE-US-00003 TABLE 3 Embodiment Showing Example Poisson Regression Coefficients

Poisson Regression Coefficients (Incidence Rate Ratios)
Variable | Est | LL | UL | pval
--- | --- | --- | --- | ---
(Intercept) | 0.000155 | 0.000138 | 0.000173 | 0.00E+00
pct_unemployed | 0.694148 | 0.676398 | 0.712364 | 5.91E−168
fentanyl_positivity | 1.09099 | 1.082016 | 1.100038 | 7.14E−95
pct_familyBelowPoverty | 1.302187 | 1.237623 | 1.370119 | 2.51E−24
pct_disability | 1.242781 | 1.190094 | 1.297801 | 8.02E−23
pct_male | 1.2057 | 1.157682 | 1.25571 | 1.86E−19
methamphetamine_positivity | 0.960348 | 0.950655 | 0.970141 | 5.43E−15
median_age | 1.19612 | 1.135697 | 1.259758 | 1.28E−11
pct_higherEdGrad | 1.170343 | 1.103433 | 1.24131 | 1.63E−07
pct_veterans | 0.889224 | 0.850691 | 0.929501 | 2.05E−07
opioids_positivity | 0.97725 | 0.968712 | 0.985862 | 2.74E−07
median_householdIncome | 0.911468 | 0.867975 | 0.95714 | 2.02E−04
heroin_positivity | 0.989904 | 0.981699 | 0.998177 | 1.69E−02
cocaine_positivity | 1.010967 | 1.00157 | 1.020453 | 2.21E−02
pct_singleHousehold | 1.033515 | 0.993647 | 1.074982 | 1.00E−01
mean_imputed_Prescribing_Rate | 1.014594 | 0.989921 | 1.039883 | 2.49E−01
pct_white | 1.011499 | 0.961653 | 1.063928 | 6.57E−01
pct_noHealthInsurance | 0.998021 | 0.969163 | 1.027738 | 8.95E−01
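The incidence rate ratios in Table 3 can be read as percent changes in expected deaths, as in this short sketch. The helper name is hypothetical; the IRR value is the fentanyl_positivity estimate from Table 3. Because the model is multiplicative, a change of several units compounds rather than adds.

```python
def percent_change(irr, units=1.0):
    """Percent change in expected deaths for a `units` increase in a predictor."""
    return (irr ** units - 1.0) * 100.0

fentanyl_irr = 1.09099  # fentanyl_positivity estimate from Table 3

one_unit = percent_change(fentanyl_irr)            # about 9.1 percent
two_units = percent_change(fentanyl_irr, units=2)  # compounds: IRR squared, minus 1
print(round(one_unit, 1))  # 9.1
```

An IRR below 1, such as the methamphetamine_positivity estimate, yields a negative percent change under the same formula.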
[0063] In alternative embodiments, other modeling methods may be used to generate a health forecasting model by empirically determining the relationship between training data sets. For example, machine learning techniques, such as a gradient-boosted decision tree, or a cognitive neural network, may be applied to training data sets.
Example Embodiment for Training Health Forecasting Model
[0064] With reference now to
[0065] With reference now to
[0066] A processor may then obtain user input selecting the collection interval 104 for the first data set 100. The collection interval 104 may be at an annual, monthly, weekly, or daily level. For example, a user may select data for about a two-year period. In an example embodiment, a user may select data for the years 2016-2018. A processor may further obtain user input selecting the collection region 106 for the first data set 100. The selected region(s) 106 may be at the country, state, county, city, or individual zip code level. For example, a user may select UDT data for a specific county. In an example embodiment, the UDT data may be selected for Los Angeles County. In an example embodiment, a user may select a first data set 100 comprising both UDT and demographic data. The UDT data may be collected on a monthly time interval 104. The UDT data may be collected for all counties 106 within a state. Other embodiments exist. Some data types may only be available on an annual time interval 104. Some data types may also not be available for every selected county 106 within the selected collection interval 104. In these situations, an embodiment may include imputation of data based on mean values measured for an encompassing collection region. For instance, in one embodiment, where data is not available for a selected county in a given year, data may be imputed based on a state level mean.
[0067] In some embodiments, imputation methods may improve performance and generalizability. Imputation methods may include UDT imputation (spatio-temporal smoothing and prediction) to improve coverage and mortality (death) imputation via multiple imputation to impute values onto counties or regions without data.
[0068] Referring back to
[0069] With reference now to
[0070] Referring back to
[0071] With reference now to
[0072] In an embodiment, a health forecasting model may be trained using a dual validation method. For example, two simple validations may be performed. A first validation may train a health forecasting model by applying a Poisson regression model to two training data sets, including a predictor variable training data set and a dependent variable training data set, at the county level for a period of about two years. A test data set from the year following the later year of the two-year period may be selected. The first validation may involve assessing a prediction generated by the health forecasting model for the same year as the test data set against the test data set. Another validation may be performed for an offset time interval of about one year further in the future. Performance metrics such as Pearson correlation of measured and predicted mortality rates, Mean Square Error (MSE), Mean Absolute Deviation (MAD), and Mean Absolute Percent Error (MAPE) may be used in the validations to train the health forecasting model.
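One possible reading of the dual validation is sketched below in Python. It rests on several assumptions that the application does not state: a single predictor (a UDT positivity rate), a population offset, a hand-written Newton-Raphson fit in place of a statistical package, and MAD as the only score. The function names, record keys, and data are hypothetical. The data are synthetic and generated from known coefficients so that the fit can be checked. The model is fit on 2016-2017, then scored against the following year and against one year further out.

```python
import math

def predict(coefficients, x, population):
    """Expected deaths under log(E[deaths]) = b0 + b1*x + log(population)."""
    b0, b1 = coefficients
    return [p * math.exp(b0 + b1 * xi) for xi, p in zip(x, population)]

def fit_poisson(x, deaths, population, iterations=50):
    """Newton-Raphson fit of a one-predictor Poisson regression.
    Assumes x is not constant; otherwise the 2x2 system is singular."""
    b0 = math.log(sum(deaths) / sum(population))  # start at the pooled rate
    b1 = 0.0
    for _ in range(iterations):
        mu = predict((b0, b1), x, population)
        # Gradient of the log-likelihood.
        g0 = sum(d - m for d, m in zip(deaths, mu))
        g1 = sum((d - m) * xi for d, m, xi in zip(deaths, mu, x))
        # Fisher information matrix (2x2, symmetric).
        h00 = sum(mu)
        h01 = sum(m * xi for m, xi in zip(mu, x))
        h11 = sum(m * xi * xi for m, xi in zip(mu, x))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def dual_validation(rows, train_years, offsets=(1, 2)):
    """Fit on train_years, then score one and two years past the last one."""
    train = [r for r in rows if r["year"] in train_years]
    coefficients = fit_poisson(
        [r["udt_rate"] for r in train],
        [r["deaths"] for r in train],
        [r["population"] for r in train],
    )
    scores = {}
    for offset in offsets:
        test_year = max(train_years) + offset
        test = [r for r in rows if r["year"] == test_year]
        predicted = predict(
            coefficients,
            [r["udt_rate"] for r in test],
            [r["population"] for r in test],
        )
        errors = [abs(r["deaths"] - p) for r, p in zip(test, predicted)]
        scores[test_year] = sum(errors) / len(errors)  # MAD for that year
    return coefficients, scores

# Synthetic county-year data built from b0 = -9.0 and b1 = 4.0.
populations = [200_000, 500_000, 1_000_000, 350_000]
rows = []
for year in range(2016, 2020):
    for county, population in enumerate(populations):
        udt_rate = 0.05 + 0.05 * county + 0.02 * (year - 2016)
        deaths = round(population * math.exp(-9.0 + 4.0 * udt_rate))
        rows.append({"year": year, "county": county, "udt_rate": udt_rate,
                     "population": population, "deaths": deaths})

coefficients, scores = dual_validation(rows, train_years=(2016, 2017))
```

Scoring the same fitted coefficients against two successive later years indicates whether accuracy degrades as the forecast horizon grows.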
Example Embodiment for Generating Real-Time Drug Crisis Forecast
[0073] With reference now to
[0074] The processor may then perform a subprocess in which it performs a validation check 134 to test the accuracy and predictive value of the generated real-time drug crisis forecast 132. To perform the validation check, the processor may generate a real-time drug crisis forecast 132 which predicts mortality rates for a past period for a given region in which a second data set 108 comprising mortality data for that period and region is already available. Then, the processor may generate user-selected performance metrics to measure the accuracy and predictive value of the model. For instance, these metrics may include a Pearson correlation of observed and predicted mortality rates, a Mean Square Error (MSE), a Mean Absolute Deviation (MAD), and/or a Mean Absolute Percent Error (MAPE). Statistical software, such as R statistical software version 4.0.2 (R Project for Statistical Computing) or another appropriate program or method may be used for data analysis. In some embodiments using R for data analysis, the glmer( ) function from the lme4 v1.1-23 package may be used for estimating Poisson mixed models. In other embodiments, other appropriate estimation methods may be used.
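The four performance metrics named above can be written out as a short Python sketch. The application names R for the analysis; Python is used here only to make the definitions concrete. Two points are assumptions: MAD is taken to mean the mean absolute difference between observed and predicted values, and MAPE skips regions with an observed value of zero, where the percentage is undefined. The function names and sample values are hypothetical.

```python
import math

def pearson(observed, predicted):
    """Pearson correlation of observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)

def mse(observed, predicted):
    """Mean Square Error."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)

def mad(observed, predicted):
    """Mean Absolute Deviation of predictions from observations."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def mape(observed, predicted):
    """Mean Absolute Percent Error, skipping zero observations."""
    terms = [abs(o - p) / abs(o) for o, p in zip(observed, predicted) if o != 0]
    return 100.0 * sum(terms) / len(terms)

# Illustrative mortality rates for three regions, not real data.
observed = [10.0, 20.0, 40.0]
predicted = [12.0, 18.0, 44.0]
report = {
    "pearson": pearson(observed, predicted),
    "mse": mse(observed, predicted),
    "mad": mad(observed, predicted),
    "mape": mape(observed, predicted),
}
```

For these sample values the MSE is 8.0, the MAD is about 2.67, and the MAPE is about 13.3 percent.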
[0075] In a graphical representation generation step 160, the processor may then generate one or more graphical representations 136 of the real-time drug crisis forecast 132. The graphical representations may take the form of a mapping, for instance showing regions of increasing drug use, a choropleth mapping, or a table showing comparative risk data. The processor may present the graphical representations 136 to a user in a dashboard. A user may select different parameters and filters to adjust the graphical representations 136 to view desired data and predictions specific to particular regional areas and time intervals.
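A choropleth mapping or comparative risk table typically needs each region assigned to a shading class. The Python sketch below shows one simple way to do this, assuming equal-sized classes based on rank. The application does not specify a classification scheme, so the function name, class labels, and forecast values are hypothetical.

```python
def classify_for_choropleth(forecast, labels=("low", "moderate", "high")):
    """Rank regions by predicted rate and split them into equal-sized classes."""
    ranked = sorted(forecast, key=forecast.get)
    classes = {}
    for position, region in enumerate(ranked):
        # Map the rank position onto one of the label indices.
        index = min(position * len(labels) // len(ranked), len(labels) - 1)
        classes[region] = labels[index]
    return classes

# Illustrative predicted mortality rates per 100,000, not real data.
forecast = {
    "County A": 12.4, "County B": 31.0, "County C": 18.7,
    "County D": 45.2, "County E": 9.8, "County F": 27.5,
}
shading = classify_for_choropleth(forecast)
```

Rank-based classes keep the map readable when a few regions have extreme values, at the cost of hiding the absolute size of the gaps between classes.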
[0076] With reference now to
[0077] With reference now to
[0078] Referring back to
[0079] Referring now to
[0080] Referring back to
Example Embodiment for Generating Resource Deployment Plan
[0081] With reference now to
[0082] With reference now to
[0083] Next, the processor may obtain user indications of collection regions and collection intervals for the third data set 138. Collection regions may be at the national, state, county, city, or zip code level. Collection intervals may be at the annual, quarterly, monthly, weekly, or even daily levels. For instance, in an example embodiment, a user may select a third data set 138 comprising the availability of fire department resources in Los Angeles County on a monthly basis. For example, a user may select an interval of about one month, such as data for Los Angeles County for the month of October in the year 2020.
[0084] Referring back to
[0085] With reference now to
[0086] The processor may then perform a comparative risk assessment 148 for the user selected subregions. For example, in some embodiments, a user may be a state government agency. The agency may be interested in a comparative risk assessment for counties within the state. The agency may select several counties as subregions of interest. The processor may then obtain resource availability data at the county level. The processor may then perform a comparative risk assessment by considering both the predicted mortality rate for the selected subregions during a selected time interval and the available response resources. The processor may return a drug crisis forecast indicating that predicted mortality for a selected county during a selected time period is high. The processor may return an indication, based on the resource availability data, that the high-risk county also lacks the resources needed to respond to a crisis. The processor may then continue the comparative risk assessment by generating a real-time drug crisis forecast for another county within the state. The processor may return a forecast indicating that the second county presents a comparatively low risk of predicted mortality within the selected time frame. The processor may then return an indication, based on the resource availability data, that the second county has a surplus of available response resources.
[0087] The processor may then perform an efficient resource allocation 150 based on the comparative risk assessment 148. For example, in the embodiment described above, the processor may return to the state agency an efficient resource allocation plan based on the comparative risk for the two selected subregion counties. The processor may return a resource allocation that proposes a sharing of resources between the selected counties because the first county lacks resources and anticipates a high predicted mortality rate, while the second county has an abundance of resources but does not anticipate a high mortality rate.
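The two-county example above can be sketched in Python as a simple surplus-to-shortfall transfer. The application does not define how resource need is derived from predicted mortality. The units_per_death ratio, the greedy matching rule, the function name, and all figures below are therefore assumptions made for illustration only.

```python
def allocation_plan(predicted_deaths, available_units, units_per_death=1.0):
    """Propose transfers from regions with surplus response units to
    regions whose predicted need exceeds what they have available."""
    # Positive balance means surplus; negative balance means shortfall.
    balance = {
        region: available_units[region] - predicted_deaths[region] * units_per_death
        for region in predicted_deaths
    }
    donors = sorted((r for r in balance if balance[r] > 0),
                    key=lambda r: -balance[r])
    recipients = sorted((r for r in balance if balance[r] < 0),
                        key=lambda r: balance[r])  # largest shortfall first

    transfers = []
    for recipient in recipients:
        shortfall = -balance[recipient]
        for donor in donors:
            if shortfall <= 0:
                break
            give = min(shortfall, balance[donor])
            if give > 0:
                transfers.append((donor, recipient, give))
                balance[donor] -= give
                shortfall -= give
    return transfers

# Illustrative figures: County 1 is high risk and under-resourced,
# County 2 is low risk with a surplus.
predicted = {"County 1": 40, "County 2": 5}
available = {"County 1": 10, "County 2": 30}
plan = allocation_plan(predicted, available, units_per_death=0.5)
```

With these figures the plan proposes moving 10 units from County 2 to County 1, which covers County 1's shortfall while leaving County 2 above its own predicted need.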
[0088] Referring back to
Example Embodiments for Real Time Drug Crisis Prediction System
[0089] With reference now to
[0090] A database 202 may include several data sets. In an example embodiment, as shown in
[0091] With reference now to
[0092] Referring back to
[0093] The health crisis prediction system may further comprise a health crisis forecast generation module 212 which may include several applied coefficients 222, 224, 226 which are included in a health forecasting model 116. The applied coefficients may be included in the health forecasting model applied to the updated data sets from the updated database 210 to form a health crisis forecast as described in the method of
[0094] The health crisis prediction system may further comprise a graphical user interface 244, which may include one or more graphical representations 228, 230, and 232 of the health crisis forecast. The graphical representations may be generated according to the method of
[0095] With reference now to
Alternative Embodiments
[0096] In addition to the above embodiments, alternative embodiments may also be beneficial under certain circumstances.
[0097] For example, in an alternative embodiment, machine learning methods may be used to train model parameters and develop models. Machine learning methods may include support vector machines and neural networks.
[0098] In another embodiment, it may be desirable to study trends and generate a prediction for a specific geographic region having specific needs. Additional predictor variable data, including demographic data, may be obtained for a specific region to generate a customized model.
[0099] In some embodiments, time series modeling techniques may be desirable. Time series modeling techniques may include ARIMA models, spatio-temporal modeling, GEE methods, and other time series modeling techniques.
[0100] In some embodiments, additional UDT analytes may be desirable. Additional UDT analytes may include fentanyl analogs and benzodiazepines.
Software Elements for Use in Health Forecasting Logical Circuit
[0101] Where components of the application are implemented in whole or in part using software, in one embodiment, these software elements can be implemented to operate with a computing or processing component capable of carrying out the functionality described with respect thereto. One such example computing component is shown in
[0102] Referring now to
[0103] Computing component 700 might include, for example, one or more processors, controllers, control components, or other processing devices, such as a processor 704. Processor 704 might be implemented using a general-purpose or special-purpose processing engine such as, for example, a microprocessor, controller, or other control logic. In the illustrated example, processor 704 is connected to a bus 702, although any communication medium can be used to facilitate interaction with other components of computing component 700 or to communicate externally.
[0104] Computing component 700 might also include one or more memory components, simply referred to herein as main memory 708. For example, preferably random access memory (RAM) or other dynamic memory might be used for storing information and instructions to be executed by processor 704. Main memory 708 might also be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 704. Computing component 700 might likewise include a read only memory (“ROM”) or other static storage device coupled to bus 702 for storing static information and instructions for processor 704.
[0105] The computing component 700 might also include one or more various forms of storage device 710, which might include, for example, a media drive 712 and a storage unit interface 720. The media drive 712 might include a drive or other mechanism to support fixed or removable storage media 714. For example, a hard disk drive, a solid state drive, a magnetic tape drive, an optical disk drive, a compact disc (CD) or digital video disc (DVD) drive (R or RW), or other removable or fixed media drive might be provided. Accordingly, storage media 714 might include, for example, a hard disk, an integrated circuit assembly, magnetic tape, cartridge, optical disk, a CD or DVD, or other fixed or removable medium that is read by, written to or accessed by media drive 712. As these examples illustrate, the storage media 714 can include a computer usable storage medium having stored therein computer software or data.
[0106] In alternative embodiments, storage device 710 might include other similar instrumentalities for allowing computer programs or other instructions or data to be loaded into computing component 700. Such instrumentalities might include, for example, a fixed or removable storage unit 722 and an interface 720. Examples of such storage units 722 and interfaces 720 can include a program cartridge and cartridge interface, a removable memory (for example, a flash memory or other removable memory component) and memory slot, a PCMCIA slot and card, and other fixed or removable storage units 722 and interfaces 720 that allow software and data to be transferred from the storage unit 722 to computing component 700.
[0107] Computing component 700 might also include a communications interface 724. Communications interface 724 might be used to allow software and data to be transferred between computing component 700 and external devices. Examples of communications interface 724 might include a modem or softmodem, a network interface (such as an Ethernet, network interface card, WiMedia, IEEE 802.XX or other interface), a communications port (such as, for example, a USB port, IR port, RS232 port, Bluetooth® interface, or other port), or other communications interface. Software and data transferred via communications interface 724 might typically be carried on signals, which can be electronic, electromagnetic (which includes optical) or other signals capable of being exchanged by a given communications interface 724. These signals might be provided to communications interface 724 via a channel 728. This channel 728 might carry signals and might be implemented using a wired or wireless communication medium. Some examples of a channel might include a phone line, a cellular link, an RF link, an optical link, a network interface, a local or wide area network, and other wired or wireless communications channels.
[0108] In this document, the terms “computer program medium” and “computer usable medium” are used to generally refer to transitory or non-transitory media such as, for example, memory 708, storage unit 722, media 714, and channel 728. These and other various forms of computer program media or computer usable media may be involved in carrying one or more sequences of one or more instructions to a processing device for execution. Such instructions embodied on the medium are generally referred to as “computer program code” or a “computer program product” (which may be grouped in the form of computer programs or other groupings). When executed, such instructions might enable the computing component 700 to perform features or functions of the present application as discussed herein.
[0109] It should be understood that the various features, aspects and functionality described in one or more of the individual embodiments are not limited in their applicability to the particular embodiment with which they are described. Instead, they can be applied, alone or in various combinations, to one or more other embodiments, whether or not such embodiments are described and whether or not such features are presented as being a part of a described embodiment. Thus, the breadth and scope of the present application should not be limited by any of the above-described exemplary embodiments.
[0110] Terms and phrases used in this document, and variations thereof, unless otherwise expressly stated, should be construed as open ended as opposed to limiting. As examples of the foregoing, the term “including” should be read as meaning “including, without limitation” or the like. The term “example” is used to provide exemplary instances of the item in discussion, not an exhaustive or limiting list thereof. The terms “a” or “an” should be read as meaning “at least one,” “one or more” or the like. Adjectives such as “conventional,” “traditional,” “normal,” “standard,” “known,” and terms of similar meaning should not be construed as limiting the item described to a given time period or to an item available as of a given time. Instead, they should be read to encompass conventional, traditional, normal, or standard technologies that may be available or known now or at any time in the future. Where this document refers to technologies that would be apparent or known to one of ordinary skill in the art, such technologies encompass those apparent or known to the skilled artisan now or at any time in the future.
[0111] The presence of broadening words and phrases such as “one or more,” “at least,” “but not limited to” or other like phrases in some instances shall not be read to mean that the narrower case is intended or required in instances where such broadening phrases may be absent. The use of the term “component” does not imply that the aspects or functionality described or claimed as part of the component are all configured in a common package. Indeed, any or all of the various aspects of a component, whether control logic or other components, can be combined in a single package or separately maintained and can further be distributed in multiple groupings or packages or across multiple locations.
[0112] Additionally, the various embodiments set forth herein are described in terms of exemplary block diagrams, flow charts and other illustrations. As will become apparent to one of ordinary skill in the art after reading this document, the illustrated embodiments and their various alternatives can be implemented without confinement to the illustrated examples. For example, block diagrams and their accompanying description should not be construed as mandating a particular architecture or configuration.