SYSTEM, METHOD AND APPLICATION TO CONVERT TRANSDERMAL ALCOHOL CONCENTRATION TO BLOOD OR BREATH ALCOHOL CONCENTRATION
20240188886 ยท 2024-06-13
Assignee
Inventors
- Gary Rosen (Los Angeles, CA, US)
- Susan Luczak (Los Angeles, CA, US)
- Chunming Wang (Los Angeles, CA, US)
- Jay Bartroff (Los Angeles, CA, US)
- Larry Goldstein (Los Angeles, CA, US)
Cpc classification
A61B5/6801
HUMAN NECESSITIES
A61B5/082
HUMAN NECESSITIES
G16H50/20
PHYSICS
A61B5/7264
HUMAN NECESSITIES
G16H50/30
PHYSICS
A61B5/4845
HUMAN NECESSITIES
G16H50/70
PHYSICS
G16H15/00
PHYSICS
International classification
Abstract
System, method and application that obtains, consolidates, and integrates multiple sources of data including Transdermal Alcohol Concentration (TAC) along with drinking diary, photo/video, breath analyzer, other biological data (e.g., heart rate, skin conductance. blood flow, person-level biometrics), and environmental data (e.g., ambient temperature, humidity, GPS) and uses models described herein to convert TAC obtained from a wearable biosensor into estimated Blood Alcohol Concentration (BAC) or Breath Alcohol Concentration (BrAC).
Claims
1. A method for converting transdermal alcohol concentration (TAC) to blood or breath alcohol concentration (BAC/BrAC), the method comprising: measuring, using a biosensor, the TAC of a human; receiving, by a processor, data corresponding to one or more drinking curves for a population of humans; receiving, by the processor, data corresponding to at least one of (i) static characteristics of the human, (ii) physiological characteristics of the human, and (iii) current environmental conditions; and converting, using the processor, the TAC to BAC/BrAC using the data from one or more drinking curves, and the at least one of (i) the static characteristics of the human, (ii) the physiological characteristics of the human, and (iii) the current environmental conditions.
2. The method of claim 1, wherein the data corresponding to the one or more drinking curves includes a measurement of TAC and a measurement of at least one of BAC and BrAC.
3. The method of claim 1, wherein the data corresponding to the one or more drinking curves includes a time sequence of measurements of TAC and a time sequence of measurements of BAC or BrAC, and wherein the method is performed in real time.
4. The method of claim 1, wherein the data corresponding to the static characteristics includes a measurement of at least one of age, sex, ethnicity, height, weight, body fat and muscle, skin color, skin thickness, and skin tortuosity, wherein the data corresponding to the physiological characteristics includes a measurement of at least one of sweat, skin conductance, skin hydration, exercise, heart rate, blood pressure, blood flow, and stomach content, and wherein the data corresponding to the current environmental conditions includes a measurement of at least one of ambient temperature, humidity, pressure, GPS, weather, and climate.
5. The method of claim 1, wherein the converting is performed using a deterministic or stochastic finite dimensional autoregressive moving average with exogenous input (ARMAX) input/output model.
6. The method of claim 1, wherein the converting is performed using a blind or Bayesian deconvolution scheme.
7. The method of claim 1, wherein the converting is performed using a lattice filter-based recursive identification scheme.
8. The method of claim 1, wherein the converting is performed using an artificial neural network (ANN) by the processor, wherein the processor is remote from the biosensor and connected to the biosensor by a network.
9. The system of claim 1, wherein the converting is performed using a hidden Markov model (HMM) or a physics-informed hidden Markov model (PIHMM) by the processor.
10. The system of claim 1, wherein the converting is performed using a deconvolution filter based on output feedback linear quadratic Gaussian tracking gain computed by the processor.
11. The system of claim 1, wherein the converting is performed using first principles physics-based forward model with random parameters having distributions fit to population BrAC/TAC data and wherein the fitting the distributions is based on a na?ve pooled or mixed effects statistical model using either maximum likelihood, method of moments, or Bayesian techniques by the processor.
12. A system for converting transdermal alcohol concentration (TAC) to blood or breath alcohol concentration (BAC/BrAC), wherein the converting is in real-time with progressive forecasting and modeling techniques and recursive updating methods, the system comprising: a biosensor for measuring the TAC of a human; and a processor configured to: receive data from one or more drinking curves from a population of humans; receive data corresponding to at least one of (i) static characteristics of the human, (ii) physiological characteristics of the human, and (iii) the current environmental conditions; and convert, by the processor, in real-time the TAC to BAC/BrAC using the data from one or more drinking curves and the at least one of (i) the static characteristics of the human, (ii) the physiological characteristics of the human, and (iii) the current environmental conditions.
13. The system of claim 10, wherein the processor is remote from the biosensor and is connected to the biosensor via a network.
14. The system of claim 10, further comprising a remote database containing the one or more drinking curves from the population of humans connected to the processor via a network.
15. The system of claim 10, wherein the system comprises a plurality of further biosensors connected to the processor via a network, wherein the processor coverts, in real-time the TAC to BAC/BrAC for each of the plurality of further biosensors.
16. The system of claim 10, wherein the data corresponding to the one or more drinking curves includes a measurement of TAC and a measurement of at least one of BAC and BrAC.
17. The system of claim 10, wherein the data corresponding to the static characteristics includes a measurement of at least one of age, sex, ethnicity, height, weight, body fat and muscle, skin color, thickness, and tortuosity, wherein the data corresponding to the physiological characteristics includes a measurement of at least one of sweat, skin conductance, skin hydration, exercise, heart rate, blood pressure, blood flow, and stomach content, and wherein the data corresponding to the current environmental conditions includes a measurement of at least one of ambient temperature, humidity, pressure, GPS location data, weather, and climate.
18. The system of claim 10, wherein the converting is performed in real-time using a deterministic or stochastic finite dimensional autoregressive moving average with exogenous input (ARMAX) input/output model.
19. The system of claim 10, wherein the converting is performed using an artificial neural network (ANN) or a physics-informed neural network (PINN) by the processor.
20. A biosensor device for converting transdermal alcohol concentration (TAC) to blood or breath alcohol concentration (BAC/BrAC), the device comprising: a wearable sensor contactable to a human skin to measure the TAC of the human; a processor connected to the wearable sensor and connectable to a network, the processor configured to receive, via the network, data corresponding to one or more drinking curves for a population of humans; the processor configured to convert TAC to BAC/BrAC using (i) the data from one or more drinking curves and (ii) the measured TAC.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0012] Other systems, methods, features, and advantages of the present invention will be or will become apparent to one of ordinary skill in the art upon examination of the following figures and detailed description.
[0013]
[0014]
[0015]
[0016]
[0017]
[0018]
[0019]
[0020]
[0021]
[0022]
[0023]
[0024]
[0025]
[0026]
[0027]
[0028]
[0029]
[0030]
[0031]
[0032]
[0033]
[0034]
DETAILED DESCRIPTION
[0035] A system, method and/or a mobile application (app) that converts Transdermal Alcohol Concentration (TAC) to estimated Blood or Breath Alcohol Concentration (BAC/BrAC) in real-time and post-drinking, by using a novel collection of data from biosensors, self-report, and the environment.
[0036] With the goal of well-founded statistical inference on an individual's blood alcohol level based on noisy measurements of their skin alcohol content, this disclosure develops M-estimation methodology in a general setting. Discussions herein then apply it to a diffusion equation-based model for the blood/skin alcohol relationship thereby establishing existence, consistency, and asymptotic normality of the nonlinear least squares estimator of the diffusion model's parameter and the resulting estimated blood alcohol curve. Simulation studies show agreement between the performance of these estimators and their asymptotic distributions, and the results are applied to a real skin alcohol data set collected via biosensor.
[0037] A goal is to model and estimate a human subject's alcohol concentration in the blood (BAC) or breath (BrAC) as a function of the alcohol level measured at the skin, i.e., the transdermal alcohol concentration (TAC), via a biosensor. Approximately 1% of the alcohol ingested in the human body is metabolized through the skin. For decades it has been recognized that the levels of TAC are connected to those of BAC/BrAC, but also that there are challenges in modeling this relationship. Because alcohol has to pass from the blood through the skin to be captured by a TAC sensor placed on the surface of the skin, it is subject to variation across individuals (e.g., skin layer thickness, porosity, tortuosity, etc.) and drinking episodes (e.g., ambient temperature, humidity, subject activity level, skin hydration, vasodilation, etc.). These effects result in a TAC-BAC/BrAC relationship that can be highly variable. Thus, TAC devices to date have typically been primarily used only in legal and research settings as abstinence monitors (e.g., in court mandated monitoring of DUI offenders) because of difficulties researchers have found translating raw TAC to the quantity of alcohol in the blood.
[0038] Still, TAC measured by a wearable biosensor device has great potential as a tool to improve personal and public health. It provides a passive, unobtrusive way to collect naturalistic data for extended periods of time. The same is not true about BrAC, which typically must be measured by trained research staff in the laboratory under controlled conditions using a breath analyzer, and thus is less practical for capturing alcohol levels in the field under real-world conditions. Moreover, the breath analyzer requires a user to be compliant, potentially interferes with naturalistic drinking patterns, and is subject to inaccuracy (e.g., readings too high due to mouth alcohol, or too low due to not properly taking a deep lung breath for a reading). Thus, creating a system that reliably converts TAC data into estimates of BAC (or BrAC) would greatly benefit the alcohol research and clinical communities who, along with public health institutes, have been quite interested in such models. Such a tool would dramatically improve the accuracy of field data and the validity of naturalistic studies of alcohol-related health outcomes, disease progression, treatment efficacy, and recovery. A wearable alcohol monitoring device could have consumer appeal as well, helping individuals monitor their own alcohol levels and make better health choices.
[0039] Work on the TAC-BAC/BrAC relationship begins with deterministic models for the forward process of the propagation of alcohol from the blood, through the skin, and its measurement by the sensor. Other approaches reverse the forward process to estimate BrAC based on the TAC. These efforts show unaccounted for variation in the TAC-BAC/BrAC relationship and subsequent work began to incorporate uncertainty into the models via a random diffusion equation. Other statistical modeling approaches include a regression model for peak BrAC using peak TAC, time of peak TAC, and gender using controlled laboratory data. Other efforts examine time delays from peak BrAC to peak TAC. Further efforts use physics-based statistical models for the TAC-BAC/BrAC relationship.
[0040] In this disclosure, systems, methods, and devices are presented to meet this challenge by using a physics-based statistical model that allows individual, device, and drinking episode level variation by treating the data from each person/device/episode triple as resulting from its own model parameters. Discussions herein determine the large sample behavior of estimates of these parameters and give conditions under which these estimates are consistent and have a limiting normal distribution. These discussions then use those results to give a statistically rigorous asymptotic characterization of the properties of the BrAC/BAC curve estimates obtained from measured TAC, including information on estimation error. As these estimates are made on an individualized basis, they will not be adversely affected when used in a study of a population whose characteristics vary widely. On the other hand, these estimatesin some embodimentsrequire individualized calibration over subject, device and environmental conditions
[0041] While the discussion includes calibration aspects, in further embodiments, the key model parameters depend on measurable subject and environmental covariates which may be measured, and which eliminates some or all calibration.
[0042] It may, in some embodiments be desirable to quantitatively estimate BAC/BrAC from TAC to within the desired degree of accuracy after first calibrating the underlying models to each individual subject and situation, thus accounting for confounding person-level, environmental, and physiological factors that differ across the population of subjects and across situations. The forward and inversion models included in the app are sophisticated mathematical systems that include deterministic and population models and supervised learning algorithms.
[0043] First, the forward model captures the dynamics of the transport of ethanol molecules from the blood through the skin and its measurement(s) by the biosensor. The app includes the option to calibrate the forward model based on individualized data obtained from a real-time drink diary, retrospective drink diary, or pre-set drinking paradigm, or based on population-based models alone or combined with individualized personal data (e.g., age, sex, ethnicity, skin, height, weight, body fat, etc.)
[0044] Then, in the inversion process the model is used to deconvolve estimated BAC/BrAC in future drinking episodes from measured TAC and all other available data. This means the BAC/BrAC in subsequent drinking episodes is estimated from the TAC provided by the biosensor without any further action by the user.
[0045] The real-time deconvolution scheme to estimate BAC/BrAC uses novel models that incorporate adaptive real-time data driven model refinement/learning, autoregressive moving average with exogenous input (ARMAX), and lattice filter-based recursive identification schemes to produce estimates in real-time, and which can be continuously updated with new data. An additional approach to recovering BAC/BrAC from TAC includes a real-time deconvolution scheme based on a technique from linear control and estimation theory. Farther mechanisms are also discussed.
[0046] Once drinking is complete for an episode and TAC has returned to or established a baseline, the app uses the full set of data to update the model BAC/BrAC estimates using the entire set of data for the episode. In addition, over time individuals can update their personalized model fits with data obtained through the app and paired biosensors in additional drinking sessions. As data accumulates for an individual subject, Bayesian techniques are used to improve the accuracy of the estimated BAC/BrAC. Finally, the app also includes components for capturing subjective responses to alcohol (e.g., feeling flushed, intoxicated) and drinking context (e.g., vis photos, video, GPS location) beyond alcohol consumption, using automated reminders, random prompts, and/or self-timed diary entries options, and this data can then be paired to estimated BAC/BrAC and other biosensor measurements.
[0047] The output includes TAC and estimated BAC/BrAC curves with credible bands, additional biosensor data and subjective ratings of alcohol response displayed alone and in conjunction with estimated BAC/BrAC, and summary scores of drinking events along with correlations with subjective ratings of alcohol response and drinking contexts. Summary scores will also be retained and displayed in a calendar format, which will also allow for retrospective recording of drinking sessions when not wearing a TAC biosensor. This multi-faceted app provides comprehensive assessment and result options for capturing drinking in real-time and consolidating this data into meaningful metrics. This multifaceted app provides a comprehensive system that incorporates all available data, utilizes self-report through a novel web application, and includes real-time forecasting of estimated BAC/BrAC curves and scores. This app is the first effective tool for non-experts to produce quantitative estimates of BAC/BrAC from TAC and other data.
[0048] A wearable biosensor (e.g., a digital watch, fuel cell, Fitbit?) may be used to measure or sense ethanol molecules from the blood via the skin. The system is based on a fit forward model in the form of a partial differential (diffusion) equation that captures the dynamics of the transport of ethanol molecules from the blood through the skin and its measurement by the biosensor. The system then uses the estimated model to deconvolve estimated BAC/BrAC from the biosensor measured TAC. The accuracy of the estimated BAC/BrAC is significantly improved by correcting for environmental and physiological factors that differ across the population of subjects and situations. Therefore, it is important that the underlying models be, in some form, calibrated to each subject, device, and situation The system utilizes sophisticated mathematical population models and supervised learning algorithms together with the capability to optionally enter drinking diary, breathalyzer, and other biosensor data to tune the underlying models to the physiological characteristics of the person wearing the device and the current environmental conditions. The BAC/BrAC for all drinking episodes can then be estimated from the TAC passively provided by the biosensor without any active participation by the user.
[0049] This invention extends the scope of application of TAC to BAC/BrAC conversion software and adjusts for variations (i) between subjects, (ii) within subjects, (iii) in environmental conditions, (iv) across hardware devices, and/or (v) in repeated measurements over time, when applying the diffusion model, and (vi) can be fit in real-time. In particular, the invention utilizes statistical models for the low dimensional input parameters to the diffusion equations that depend on covariate information that describe characteristics of the subjects and their environment. The end result is personalized, real-time BAC/BrAC estimates with accompanying statistical accuracy measures, such as credible intervals and margins of error. In addition, the invention provides a theoretical, asymptotic analysis of the performance of the new estimation methods that result upon embedding the models in the underlying diffusion equation.
[0050] The invention utilizes adaptive real-time data driven model refinement/learning For example, the invention has the ability to incorporate real-time drink diary data into one or more of the underlying physics-based models described earlier to construct an adaptive/recursive data assimilation, estimation, and prediction system. The models are continuously updated with newly available real-time individual-level data to produce revised/estimated BAC/BrAC based on TAC in real-time. Even though the underlying state equation that forms the basis of this invention is, in general, infinite dimensional, end-to-end, it is a single input/single output linear time invariant system. Thus, BAC/BrAC can be approximated using a deterministic or stochastic finite dimensional autoregressive moving average with exogenous input (ARMAX) input/output model. The invention further includes lattice filter-based recursive identification schemes, which allow for the efficient modification of both the order of the model and the parameters when new data is introduced into the system. The invention takes advantage of the wealth of real-time adaptive parameter estimation, filtering, prediction, and deconvolution schemes available for systems described by these types of models. The invention accounts for the introduction of nonlinearities into these schemes through the use of artificial neural networks (ANNs) and trains them using a variant of back propagation. This scheme yields a somewhat delayed estimated BAC/BrAC, which can then be augmented by a prediction scheme to yield preliminary real-time estimated BAC/BrAC, and afterwards update estimated BAC/BrAC for the entire episode.
[0051] The invention incorporates new innovations that serve to improve the efficiency and accuracy of the estimated BAC/BrAC. In particular, two approaches to deconvolving the BAC/BrAC signal from the TAC signal have been included. One approach used to recover BAC/BrAC from TAC is based directly on our physiological model for the diffusion of ethanol through the epidermal layer of the skin. While this approach provides an effective low rank parameterization of the relationship between BAC/BrAC and TAC when there was extremely limited experimental data, a more empirical model can offer more flexibility when a relatively rich pool of laboratory collected contemporaneous matched BAC/BrAC-TAC data is available.
[0052] In an empirical linear model, we assume that the measured TAC is the convolution of two random signals, the convolution kernel K(s; ?) and the measurement error ?(s; ?). That is,
y.sub.TAC(t; ?)=?.sub.??.sup.tK(t?s; ?)v.sub.BrAC(s)ds+?(s; ?).
[0053] The process of training the model given in the above equation consists of identify reliable distributions for the functions K and ? based on available matched BAC/BrAC-TAC pairs. Since both functions belong to an infinite-dimensional space of random functions, effective parameterization of these function spaces is crucial to ensure stability of the training process. Inspired by the physiological model, a family of cubic spline functions defined on a strategically selected non-uniform grid is chosen. Analysis of the optimally determined kernel functions from a set of BAC/BrAC-TAC pairs exhibited an encouraging level of consistency among test subjects and data from different sessions for the same test subject.
[0054] Using the resulting distributions for K and ? obtained through analysis of data for an appropriate cohort or population, the retrieval of BAC/BrAC from TAC is done in bear real-time by calculating statistically consistent and efficient estimators for BAC/BrAC. One example of such an estimator is given by
where
[0055] Another approach to recovering BAC/BrAC from TAC includes a real-time deconvolution scheme based on a linear control and estimation theory technique. By formulating the deconvolution problem as a linear quadratic Gaussian tracking problem, the estimated BAC/BrAC signal is obtained in the form of a linear output feedback law. More precisely, the estimated BAC/BrAC signal is given as a real-time linear function of the measured TAC signal. Undesirable non-physical oscillations in the estimates which result from the underlying ill-posedness of the filtering problem being solved to determine the BAC/BrAC signal are mitigated by including an appropriate penalty term in the quadratic performance index. This approach also yields credible bands and error bars along with the estimated BAC/BrAC signal.
[0056] Beyond the mathematical models, this software invention includes real-time and retrospective self-report data collection mobile app for recording drinking diary, breathalyzer, other biosensor data, drinking context, and other factors that vary over a drinking episode (e.g., stomach contents, mood, behavior). The app includes the option to add calibration data from individualized data obtained from a real-time drink diary, retrospective drink diary, pre-set drinking paradigm, or based on population-based models combined with individualized personal data (e.g., age, sex, ethnicity, skin, height, weight, body fat, etc.). The app also includes components for capturing subjective responses to alcohol (e.g., feeling flushed, intoxicated) and drinking context (e.g., via photos, video, GPS location) beyond alcohol consumption, using automated reminders, random prompts, and/or self-timed diary entries options, and these data can then be paired to estimated BAC/BrAC and other biosensor measurements. Summary scores of drinking events along with correlations with subjective ratings of alcohol response and drinking contexts will be calculated and displayed in episode-level figures and charts These summary scores will also be retained and displayed for multiple drinking episodes in a calendar format, which also will allow for retrospective recording of drinking sessions.
[0057] The invention is implemented using a combination of hardware and software. The hardware includes the TAC biosensor, processors, memories, displays, and environmental sensors. The software includes computer code that can run on the hardware. The invention allows the user the option to select which method(s) they would like to use through a set of menus, based on what the user prioritizes to optimize, similar to factor analyses options in commercially available statistical software where the user can select different matrix rotations or fit indices to emphasize in the model runs. The invention produces both estimated BAC/BrAC curves, credible bands, and summary scores such as maximum estimated BAC/BrAC, time of maximum BAC/BrAC and area under the BAC/BrAC curve.
[0058] With reference to
[0059] The biosensor 6 may include a sensor 10. The sensor 10 may include an element configured to measure a TAC. For instance, a fuel cell device may process ethanol present on a user's skin or in a user's sweat to generate electricity, which may be measured. Because the voltage, current, power, and/or other measurable aspect of the generated electricity may be quantified, the corresponding amount of ethanol responsible for generating the electricity may be quantified. Various references to sensor 10 elsewhere herein provide example sensors for various embodiments.
[0060] The biosensor 6 may include a processor 20. The processor may be a computer, of a microcontroller, or a low power embedded microprocessor, or a single-board computer, an application-specific integrated circuit (ASIC) or any other electronic data processing device as desired. In various embodiments, the processor 20 is connected to a memory 80. The memory 80 may be a working memory, providing for data storage during calculation by the processor 20 of BAC/BrAC from TAC. The memory 80 may be a storage memory, such as for storage of data corresponding to TAC prior to transmission of this data to a backend control system 4. The memory maybe both a storage memory and a working memory.
[0061] The biosensor 6 may have a local display terminal connected to the processor 20. The local display terminal 30 may be a human-readable interface. For instance, the local display terminal 30 may be one or more LED, audio annunciator, tactile feedback device, LCD or other text or graphic display, or any other apparatus configured to provide information in human-readable form. In various embodiments, the local display terminal provides menu structures and other interface elements of an application as described herein. In various embodiments, the local display terminal displays a TAC measurement. In further embodiments, the local display terminal displays a calculated BAC/BrAC measurement calculated by the biosensor 6, the backend control system 4, or a combination of the biosensor 6 and the backend control system 4 that is calculated from a measured TAC.
[0062] The biosensor 6 may be connectable to a network 70. The backend control system 4 may also be connectable to the network 70. The network 70 may permit electronic communication between the biosensor 6 and the network 70. In various embodiments, the network 70 comprises the internet. In further embodiments, the network 70 may be a private network, or a virtual private network, or an RF data link, or an optical data link, or a wired link, or any electronic connection. The network 70 may include wireless aspects, such as cellular connections, or Wi-Fi connections or other aspects.
[0063] Having discussed the biosensor 6 and a network 70, attention is now directed to a backend control system 4. The backend control system 4 may comprise a server, or a cloud computing resource, or any other computing system as desired. In various instances, the backend control system 4 provides greater processing power than the biosensor 6 and facilitates calculation of BAC/BrAC from TAC by remotely handling calculations and other processing tasks. In various instances, the backend control system 4 collects and aggregates data from the biosensor 6 with data from other resources, such as user inputs, stored or laboratory research data, previously collected data such as prior TAC data, user-specific data such as weight, height, and other aspects, training data, and/or the like. In various instances, the backend control system 4 collects and aggregates data from multiple different biosensors 6. Various data, factors, and relevant variables are discussed throughout, each of which may be processed. stored, or otherwise received by the backend control system 4 and/or the biosensor 6.
[0064] The backend control system 4 may include a remote database 50. The remote database 50 may store the aforementioned data, TAC calculations, BAC/BrAC calculations and/or the like. The remote database 50 may provide both working memory and/or storage memory.
[0065] The backend control system may include a remote processor connected to the remote database 50 and the network 70. The remote processor may a computer, or a microcontroller, or a low power embedded microprocessor, or a single-board computer, an application-specific integrated circuit (ASIC) or any other electronic data processing device as desired. The remote processor may be a distributed or cloud computing resource.
[0066] The backend control system 4 may have a remote display terminal 40 connected to the remote processor 60. The remote display terminal 40 may be a human-readable interface. For instance, the remote display terminal 40 may be one or more LED, audio annunciator. tactile feedback device, LCD or other text or graphic display, or any other apparatus configured to provide information in human-readable form. In various embodiments, the remote display terminal 40 provides menu structures and other interface elements of an application as described herein. In various embodiments, the remote display terminal 40 displays a TAC measurement. In further embodiments, the remote display terminal 40 displays a calculated BAC/BrAC measurement calculated by the biosensor 6, the backend control system 4, or a combination of the biosensor 6 and the backend control system 4 that is calculated from a measured TAC. The remote display terminal 40 may be separate from the backend control system 4 and connected to the network 70. The remote display terminal 40 may be browser session of a user accessing the backend control system 4, such as via a website login interface on an internet browser running on a commodity personal computer.
[0067] Previously, it was mentioned that the backend control system 4 may collect and aggregate data from multiple different biosensors 6. In addition, the backend control system 4 may provide processing resources to multiple different biosensors for calculating a BAC/BrAC from a measured TAC. With reference to
[0068] Thus, in various instances, the system 2 for converting transdermal alcohol concentration (TAC) to blood or breath alcohol concentration (BAC/BrAC) in real-time may include a biosensor 6 for measuring the TAC of a human. The system may include a processor. The processor may be a processor 20, a remote processor 60, or a combination of the processor 20 and remote processor 60 such that certain processes are conducted on processor 20 and other processes are conducted on remote processor 60. As such, one or more of the processors may receive data from one or more drinking curves from a population of humans. One or more of the processors may receive data corresponding to at least one of (i) static characteristics of the human, (ii) physiological characteristics of the human, and (iii) the current environmental conditions. One or more of the processors may convert in real-time the TAC to BAC/BrAC using the data from one or more drinking curves and the at least one of (i) the static characteristics of the human, (ii) the physiological characteristics of the human, and (iii) the current environmental conditions.
[0069] Moreover, the biosensor 6 for converting transdermal alcohol concentration (TAC) to blood or breath alcohol concentration (BAC/BrAC) may include a wearable sensor 10 contactable to a human skin to measure the TAC of the human and a processor (processor 20, remote processor 60, and/or a combination of processor 20 and processor 60) connected to the wearable sensor 10 and connectable to a network 70, the processor configured to receive, via the network 70, data corresponding to one or more drinking curves for a population of humans. One or more of the processor may be configured to convert TAC to BAC/BrAC using (i) the data from one or more drinking curves and (ii) the measured TAC.
[0070] Turning now to
[0071] The method may include receiving, by a processor, data corresponding to at least one of (i) static characteristics of the human, (ii) physiological characteristics of the human, and (iii) current environmental conditions (block 104). Again, the processor may be a processor local to the biosensor 6 (
[0072] Finally the method may include converting, using a processor, the TAC to BAC/BrAC using the data from one or more drinking curves, and the at least one of (i) the static characteristics of the human, (ii) the physiological characteristics of the human, and (iii) the current environmental conditions (block 106). This processor may be the processor 20 (
[0073] Various methods for such converting are discussed at length throughout this disclosure. For instance, the converting may be performed using a deterministic or stochastic finite dimensional autoregressive moving average with exogenous input (ARMAX) input/output model. The converting may be performed using a blind or Bayesian deconvolution scheme. The converting may be performed using a lattice filter-based recursive identification scheme. The converting may be performed using an artificial neural network (ANN). The converting may be performed using a hidden Markov model (HMM) or a physics-informed bidden Markov model (PIHMM). The converting may be performed using a deconvolution filter based on output feedback linear quadratic gaussian tracking gain. Moreover, the converting may be performed using first principle physics-based forward models with random parameters having distributions fit to population BrAC/TAC data. The fitting the distributions may be based on a naive pooled or mixed effects statistical model using either maximum likelihood, method of moments, or Bayesian techniques. The converting may be performed in many different ways. The converting may be performed in real-time with progressive forecasting and modeling techniques and recursive updating.
[0074] Thus, one may appreciate that the method may have various additional aspects. For instance, the data corresponding to the one or more drinking curves may be different types of data. The data may be a measurement of TAC. The data may be a measurement of BAC. The data may be a measurement of BrAC. The data may include comparisons of TAC to BAC and/or BrAC.
[0075] The data that corresponds to the static characteristics may include a variety of different measurements. For instance, the measurements may relate to aspects of a specific human for whom TAC is being measured. The measurements may include a measurement of at least one of age, sex, ethnicity, height, weight, body fat and muscle, skin color, skin thickness, and skin tortuosity.
[0076] The data that corresponds to the one or more physiological characteristics may include a variety of different measurements. For example, the measurements may relate to aspects of a specific human for whom TAC is being measured but which may be dynamic. For instance, the measurements may include a measurement of at least one of sweat, skin conductance, skin hydration, exercise, heart rate, blood pressure, blood flow, and stomach content.
[0077] The data that corresponds to the current environmental conditions may include a variety of different measurements. For example, the measurements may relate to aspects of an environment that the human for whom TAC is being measured is exposed to. The measurements may include a measurement of at least one of ambient temperature, humidity, pressure, GPS, weather, and climate.
[0078] Having provided an overview of the system, method, and device above, attention is now directed to a discussion of a diffusion model to characterize ethanol diffusion across skin so that correspondingly the subject's BAC/BrAC may be model as a function of TAC.
[0079] The following discussion will include various types of models. For instance, a partial differential equation diffusion model may characterize alcohol transfusion across the skin. A least squares approach may be provided for estimating an unknown vector. M-estimation is provided and basic examples of its use, as well as an application of M-estimation to the mentioned model. Yet further, the application of the M-estimation to the partial differential equation diffusion model may be implemented to obtain results on the performance of resulting BrAC curves estimated from TAC. The discussion will also include an evaluation of theoretical results in simulations and an illustration using BrAC/TAC relationships measured experimentally.
[0080] Diffusion Model (Section 1). Although a goal is to model a human subject's BAC/BrAC as a function of TAC, the ethanol molecules themselves move in the other direction: from the blood, through the skin, to ultimately be measured by the sensor on the surface of the skin. Thus the relevant physics describe the TAC as a function of BAC/BrAC. Consider a specific model for this transport based on Fick's law of diffusion which depends on an unknown, 2-dimensional parameter q=(q.sub.1, q.sub.2). The result is TAC expressed as a convolution of BAC/BrAC with a kernel or filter, and as a function of the unknown q which may then be estimated via nonlinear least squares as described and whose properties are considered in Section 3. These properties determine the inferential consequences for BAC/BrAC estimation, and in particular have a large impact on the accuracy of the estimated BrAC curve, as studied in Section 3.
[0081] Let x(t, ?) denote the concentration of ethanol at time t?0 and depth ?? [0,1] from the skin surface through epidermis, choosing units so that ?(t)=x(t, 1), t?0 is the BAC at time t. A Fick's law-based model has been developed and used successfully to model data of this type. The model specifies x(t, ?) as the solution to the partial differential equation, with boundary condition
depending on the parameter q=(q.sub.1, q.sub.2). The TAC at skin level is then x(t, 0). When we want to emphasize dependency on the parameter q we will write, for instance, ?(t; q).
[0082] The system with its boundary conditions can be solved in continuous time in terms of unbounded linear operators, with solution
x(t)=e.sup.A(q)tx(0)+?.sub.0.sup.te.sup.A(q)(t?s)B(q)?(s)ds.(2)
[0083] In cases we consider, x(0) will be the zero function, that is, observation begins at, or before, the time of first intake of alcohol. By taking a discretization of the distance ? from skin level into k steps for some k sufficiently large, the operators in (2) can be approximated by k dimensional linear operators (i.e., matrices) yielding the approximation to the solution given by
x.sup.(k)(t)=?.sub.0.sup.te.sup.A(k).sup.
[0084] Now fixing, and suppressing in the notation, the level of discretization k, an observation taken at time t can be represented as the linear function of x(t) given by
ds,(4)
plus an additive error term. For observations taken at skin level, the vector C will have a one in its first component, and zeros elsewhere.
[0085] The matrices in (4) depend on the unknown parameter q as
A(q)=q.sub.1D+E and B(q)=q.sub.2F,(5)
where C, D, E, and F are known matrices that result from making the finite-dimensional approximation discussed. More precise assumptions and properties of these matrices, and the domain of q, will be specified in Section 3.
[0086] Non-Linear Least Squares Estimation (Section 1.2). To estimate the parameter q, we assume that TAC data {y.sub.ij, 1?i?n, 1?j?m.sub.i} is collected on a single individual over n different drinking episodes at the my times 0?t.sub.i,1< . . . <t.sub.i,m.sub.
where ?.sub.?.sub.
[0087] In Section 2 below, we consider the existence, consistency, and limiting distribution of our least squares estimators in a general M-estimation context, and present some examples. In Section 3 we apply the results in Section 2 to the diffusion model of Section 1, and present Theorems 3.1 and 3.3, which contain our main results on inference for the main parameter q of interest, and also for the error variance ?.sup.2. In Section 4 we apply the results of Section 3 for making inference on the BrAC curve, and in particular for the construction of uniform error bounds on the resulting curve estimate. We validate our theoretical work via simulation and real data analysis in Section 5.
[0088] M-Estimation (Section 2)Existence, Consistency, and Limiting Distribution. In this section we consider M-estimation in a general setting that contains what we will require to handle the diffusion model we consider. Prior discussions of M-estimation tend to focus on the case of a univariate parameter, whereas ours covers the multivariate case. Prior efforts cover only least squares estimation whereas our results apply to the more general estimating equation (7). Also, previous results only apply to approximate normality and require i.i.d. error terms, whereas our Theorem 2.2 can be applied to other limiting distributions and relaxed conditions on the error terms, although our main application is to limiting normality. Finally, previous results are more restrictive in terms of a number of technical conditions, such as compactness of the parameter space ? which our results do not require, and the existence of tail products of vectors of observation means and error terms, which our results eschew in favor of more conventional regularity conditions on the score type function U.sub.n.
[0089] After establishing the notation and setup in Section 2.1, we state our main results in Section 2.2. In Section 2.3 we provide some general examples of the applications of our results to least squares and maximum likelihood estimation.
[0090] Set Up and Summary of Results (Section 2.1). For n?1, observed data X.sup.(n) in a space ?.sup.(n), a parameter space ??.sup.p having non-empty interior, and a function U.sub.n:???.sup.(n).fwdarw.
.sup.p, consider the estimating equation
U.sub.n(?)=0, ???,(7)
where the dependence of .sub.n on the data is suppressed. In our examples ?.sup.(n) will a Euclidean space endowed with a family of densities p.sub.n(x.sup.(n); ?), ??? which generate the data from this family with ?=?.sub.0. Two important situations in which the solutions of such equations arise are maximum likelihood and least squares estimation.
[0091] For maximum likelihood, under smoothness conditions on the densities, the maximizer of the log likelihood L.sub.n(?)=log p.sub.n(x.sup.(n); ?) is given as a solution to (7) with
.sub.n(?)=?.sub.?L.sub.n(?; X.sup.(n)),(8)
where ?.sub.? denotes taking derivative with respect to ?, resulting in a column vector of partial derivatives when ? itself is a vector. When the data X.sup.(n) consists of n independent random vectors X.sub.1, . . . , X.sub.n in .sup.d, each with distribution p(x; ?.sub.0), the space ?.sup.(n) can be identified with
.sup.d?n, and p.sub.n(x.sup.(n), ?) is the product of the marginal densities p(x.sub.i?) for i=1, . . . , n.
[0092] To introduce least squares estimation, suppose that pairs (x.sub.i, y.sub.i)?.sup.d?
, i=1, . . . , n, are observed with distribution depending on ? for which
.sub.?[y.sub.i|x.sub.i]=?.sub.i(x.sub.i; ?) for ?.sub.i(x; ?) in some parametric class of functions. With x.sup.(n)=(x.sub.1, . . . , x.sub.n), the least squares estimate of ? is given as the minimizer of
which under smoothness conditions can be obtained via (7) with
[0093] The aim of the estimating equation U.sub.n(?)=0 is to provide a value close to the one where the function U.sub.n(?) takes the value of 0 in some expected, or asymptotic, sense. In particular, in Theorem 2.1 we will show, under that when U.sub.n(?.sub.0) is, under an appropriate scaling, close to zero as n.fwdarw.?, then the sequence of estimates obtained via the estimating equations will be consistent for the true parameter.
[0094] In Theorem 2.2, we will also provide a corresponding limiting distribution for solutions to the estimating equation (7). Let U.sub.n(, ?) have components
U.sub.n(?)=(U.sub.n,j(?)).sub.1?j?p where U.sub.n,j:.sup.n??.fwdarw.
.
[0095] In the case of maximum likelihood estimation, where we have (8), under the assumption of the existence and continuity of second derivatives of L.sub.n for ???, writing U.sub.n/(?) as short for the observed information matrix ?.sub.?U.sub.n.sup.T(?)?.sup.p?p, its k, j.sup.th component is given by
[0096] And in this case, the third condition in (11) below is equivalent to the condition that the limiting information matrix l is positive definite. Tolerating a slight abuse of notation, we may also write ?.sub.j rather than ?.sub.?.sub.
[0097] Over each coordinate j=1, . . . , p, under second order differentiabilty conditions, we will make use of the second order Taylor expansion of U.sub.n,j(?) around some ?.sub.0??,
where each 0*.sub.n,j lies on the line segment connecting ? and ?.sub.0. In the following, we let ??? denote the Euclidean norm of a vector, the operator norm of a matrix, and the supremum norm of a function.
[0098] Estimating equations, consistency, and asymptotic normality (Section 2.2). We now present results that provide conditions for the consistency and existence of a non-trivial limiting distribution for a properly centered and scaled sequence of estimating equation solutions. We also include results on the consistent estimation of parameters on which the asymptotic distribution of our estimate may depend.
[0099] Theorem 2.1 Suppose that U.sub.n:???.sup.(n).fwdarw..sup.p is twice continuously differentiable in an open set ?.sub.0?? containing ?.sub.0, and that there exist a sequence of real members a.sub.n, a matrix ??
.sup.p?p and ?>0 such that
[0100] Suppose further that for any ??(0,1), that there exists a K such that for all n sufficiently large,
P(|a.sub.n?.sub.k,lU.sub.n,j(?)|?K, 1?k,l,j?p, ???.sub.0)?1??.(12)
[0101] Then for any given ?>0 and ??(0,1), for all n sufficiently large, with probability at least 1?? there exists {circumflex over (?)}.sub.n?? satisfying U.sub.n({circumflex over (?)}.sub.n=0 and ?{circumflex over (?)}.sub.n??.sub.0???, that is, a sequence of roots to the estimating equation (7) consistent for ?.sub.0.
[0102] In addition, for any sequence {circumflex over (?)}.sub.n.fwdarw..sub.p?.sub.0, we have
a.sub.nU.sub.n({circumflex over (?)}.sub.n).fwdarw..sub.p?,(13)
that is, ? can be consistently estimated by a.sub.nU.sub.n({circumflex over (?)}.sub.n) from any sequence consistent for ?.sub.0.
[0103] Proof: By replacing U.sub.n by a.sub.nU.sub.n and ? by ???.sub.0, we may assume that the conditions of Theorem 2.1 hold with a.sub.n=1 and ?.sub.0=0. For ?>0 let
B.sub.?={?:?????}.
[0104] For the given ??(0,1), let K and n.sub.0 be such that (12) holds with ? replaced by ?/2 for n?n.sub.0. For the given ?>0, take ??(0, ?) such that B.sub.???.sub.0 and C?<? where
[0105] By (11) there exists n.sub.1?n.sub.0 such that for n?n.sub.1
P(?U.sub.n(0)?<?.sup.2)?1??/3 P(?U.sub.n(0)??<?)?1??3,(14)
and also taken large enough so that (12) holds with ? replaced by ?/3. By the union bound, all three events hold with probability at least 1??. For ??B.sub.? and ?*.sub.n,j given by (10), the components of R.sub.n(?)=(R.sub.n,1(?), . . . , R.sub.n,p(?)).sup.T as defined by
[0106] Then, for n?n.sub.1, with probability at least 1??, from (10), (14) and (12),
[0107] Assume for the sake of contradiction that U.sub.n(?) does not have a root in B.sub.?. Then for ??B.sub.?, the function ?(?)=??U.sub.n(?)/?U.sub.n(?)? continuously maps B.sub.? to itself. By the Brouwer fixed point theorem, there exists ??B.sub.?, with ?(?)=?. Since ??(?)?=? for all ??B.sub.?, we have ??(?)?=???=?, contradicting (15) via ?.sup.2=???.sup.2=?.sup.T?=?.sup.T?(?)<0. Hence U.sub.n(?) has a root within ? of 0, and since ?<?, therefore within ?, with probability at least 1??, as required.
[0108] To prove (13), taking {circumflex over (?)}.sub.n to be any consistent sequence for ?.sub.0, a first order Talyor expansion yields, for all 1?j, k?p,
where ?*.sub.n,j lies along the line segment connecting {circumflex over (?)}.sub.n and 0. Writing this identity in matrix notation, we have
U.sub.n({circumflex over (?)}.sub.n)?U.sub.n(0)=Q.sub.n({circumflex over (?)}.sub.n) where (Q.sub.n(?)).sub.k,j=Q.sub.n,k,j.sup.T?.
[0109] Let ??(0,1) and ?>0 be given, choose ??(0, ?/K ?{square root over (p)}) so that B.sub.???.sub.0, and let n.sub.2 be such that for all n?n.sub.2, with probability at least 1??, |?.sub.k,lU.sub.n(?)|?K for all 1?k,l?p and ?{circumflex over (?)}.sub.n???.
[0110] Then, for n?n.sub.2 with probability at least 1?? we have
where ?A?.sub.?=max.sub.i,j|A.sub.i,j| for A?.sup.p?p. The claim follows, since ? and ? are arbitrary, and U.sub.n(0).fwdarw..sub.p? by assumption. Our next result provides conditions under which a consistent estimator sequence, properly centered and scaled, converges in distribution.
[0111] Theorem 2.2 Suppose the sequence of solutions {circumflex over (?)}.sub.n, n?1 to (7) is consistent for ?.sub.0, that (12) and the second condition of (11) hold for some sequence a.sub.n, n?1 of real numbers, that the matrix ? in (11) is non-singular and that U.sub.n(?) is twice continuously differentiable in an open set ?.sub.0?? containing ?.sub.0. Further, let b.sub.n be a sequence of real numbers such that for some random variable Y,
[0112] Proof: As in the proof of Theorem 2.1, by replacing a.sub.n.sub.n by
.sub.n we may without loss of generality take a.sub.n=1, and also as done there, take ?.sub.0=0. Since a limit in distribution does not depend on events of vanishingly small probability, by the consistency of {circumflex over (?)}.sub.n and (12) we may assume that for each n, sufficiently large, that {circumflex over (?)}.sub.n??.sub.0, and for some K that |?.sub.k,jU.sub.n(?)|?K for all 1?j, k?p and ???.sub.0. For such n the expansion (10) holds, and substituting {circumflex over (?)}.sub.n for ? and using U.sub.n({circumflex over (?)}.sub.n)=0 yields
[0113] By the Cauchy-Schwarz inequality,
[0114] Hence ?.sub.n.fwdarw..sub.p so that ?.sub.n.sup.?1 exists with probability tending to 1, and converges in probability to ?.sup.?1. Now using (16), Slutsky's theorem, on an event of probability tending to one as n tends to infinity,
b.sub.n{circumflex over (?)}.sub.n=?.sub.n.sup.?1(b.sub.n?.sub.n{circumflex over (?)}.sub.n)=??.sub.n.sup.?1(b.sub.nU.sub.n(0)).fwdarw..sub.d??.sup.?1Y.
[0115] In the most common case the distributional convergence in (16) is to the normal, and shown by applying the Central Limit Theorem to a sum of independent random vectors. This situation is illustrated in the following lemma, in which we include distributional limits that may have covariance matrices of less than full rank. For a given vector ? and non-negative definite matrix ?,
we say X?N(?, ?) when E[e.sup.t.sup.
[0116] In particular, in one dimension N(?, 0) is unit mass at ?.
[0117] Lemma 2.1 Let =1,2, . . . be a sequence of arbitrary index sets satisfying |
|.fwdarw.? as
.fwdarw.?, and let {
, a?
} be a collection of
.sup.d valued independent, mean zero random vectors such that for some matrix ? and some ?>0
[0118] Proof: We first prove the result in . By the Lindeberg theorem, (e.g. Theorem 3.4.5, [Durrett, 2019]) if for all
?1 the random variables {
, a?
.sub.i} are independent, mean zero, and satisfy
then .fwdarw..sub.dN(0, ?.sup.2) where
=
X.sub.n,i. In
the second condition in (17) implies the second condition in (18), as for any ?>0, with p=1+?/2 and q=1+2/?, using H?lder's inequality followed by Markov's,
[0119] Hence, the claim holds in when the limiting variance is positive. When this limit is zero, Chebyshev's inequality yields that
.fwdarw..sub.p0, and hence
converges as well to zero in distribution, which is the normal distribution with mean and variance 0. Hence the conclusion of the lemma holds for d=1.
[0120] In general, given a collection of random vectors satisfying the given hypotheses, taking v to be of norm 1, the variables =v.sup.T
for a?
are independent and mean zero for each
, and satisfy the first condition of (17) with ? replaced by v.sup.T?v, and the second condition of (17) by virtue of this condition holding by assumption for the vector array
. and that |
|.sup.2+?=|v.sup.T
|.sup.2+???
?.sup.2+?. As the claim holds in d=1 for linear combinations given by any v of norm 1, the general result follows by the Cramer-Wold device.
[0121] Examples (Section 2.3). In the section we demonstrate the scope of the results in Section 2.2 by presenting two applications, one to least squares and the other to maximum likelihood.
[0122] The following lemma, a direct application of the dominated convergence theorem, is used to handle the technical matter of interchanges between integration and differentiation with respect to ????.sup.p.
[0123] Lemma 2.2 Let ?:.sup.m??.fwdarw.
be differentiable with respect to ? in an open set ?.sub.0??, and suppose that there exists g:
.sup.m.fwdarw.
such that
[0124] Then for all ???.sub.0,
[0125] Example 2.1 Least squares estimation. Suppose we observe
y.sub.i=?(x.sub.i, ?.sub.0)+?.sub.i i=1, . . . , n
where ?(x.sub.i, ?), ???? is some specified parametric family of functions; we take a one dimensional parameter here for simplicity. We estimate ?.sub.0 via least squares, minimizing
[0126] We assume that ?(x, ?) has three continuous derivatives with respect to ? that are uniformly bounded, say by K, over some open subset ?.sub.0 of ? that contains ?.sub.0, and that ?.sub.1, ?.sub.2, . . . are independent random variable distributed as ?, a mean zero, variance ?.sup.2 random variable E|?|.sup.2+?=?.sup.2+?<? for some ?>0.
[0127] Taking one derivative with respect to ?, we obtain the estimating equation .sub.n(?)=0 where
[0128] The first condition of (11) of Theorem 2.1 is satisfied with a.sub.n=1, as the errors have zero mean, are uncorrelated and have uniformly bounded variances, implying that E.sub.?.sub..sub.n(?.sub.0)]=0 and Var.sub.?.sub.
.sub.n(?.sub.0)].fwdarw.0. Regarding the second condition of (11) taking another derivative, we obtain
[0129] Arguing as for (20), the second sum tends to zero in probability. If we now take x.sub.i, i=1,2, . . . to be independent random vectors distributed as some x, then the law of large numbers yields that
showing the second condition of (11), and this limit will be positive when ?.sub.??(x, ?.sub.0) is a non-degenerate random variable, thus verifying the final condition in (11) in that case.
[0130] It is easy to see that taking another derivative in (21) yields an average of functions that are bounded over ?.sub.0, plus a weighted average of the error variables, each one multiplied by some bounded function. As the second weighted average can be seen to be bounded in probability by applying reasoning similar to that used for the score .sub.n(?.sub.0), condition (12) holds.
[0131] The only remaining verification needed to invoke Theorem 2.2 is to show the properly scaled score at ?.sub.0 has a limiting distribution. Taking b.sub.n=?{square root over (n)}, we have
by (22), and in addition using the representation of .sub.n(?.sub.0) from (20),
[0132] Hence, invoking Lemma 2.1, for any consistent sequence of roots,
?{square root over (n)}({circumflex over (?)}.sub.0??.sub.0).fwdarw..sub.dN(0, ?.sup.2?.sup.?1).
[0133] Example 2.2 Maximum likelihood. Let p(x, ?), ???.sub.0 be a family of density functions for ??.sup.p, and for some ?.sub.0??, let X.sub.1, . . . , X.sub.n be independent random vectors with density p(x, ?.sub.0). Let p(x, ?) be three times continuously differentiable in ? with the first two derivatives of p(x, ?), and the third derivative of q(x,?)=log p(x,?), dominated by an integrable function in some neighborhood ?.sub.0 of ?.sub.0. Assume further that the Fisher information matrix at ?.sub.0 is positive definite.
[0134] The maximum likelihood estimate of ?.sub.0 is obtained by maximizing the log likelihood of the data, and hence given by a solution to the estimating equation (7) with
[0135] By Lemma 2.2, for ???.sub.0 we have
.sub.?[?.sub.? log p(X, ?)]=
?.sub.?p(x, ?)dx=?.sub.?
p(x, ?)dx=0,(23)
and likewise that the Fisher information I(?) satisfies
l(?=?.sub.?[?.sub.?.sup.2log p(X, ?)]=Var.sub.?(?.sub.?log p(X, ?)).
[0136] Hence, by the law of large numbers the first two conditions of (11) are satisfied with a.sub.n=1 and ?=(?.sub.0), and the last holds by our assumption on the Fisher information. Next we show (12) is satisfied. Writing ?.sub.j short for ?.sub.?.sub.
[0137] Condition (12) can be verified by invoking the following uniform strong law of large numbers with h(x, ?) applied to the components ?.sub.k,i,jq(x, ?).
[0138] Theorem 2.3 Let ? be a compact metric space and ? a space on which a probability distribution F is defined. Let h(x, ?) be measurable in x for each ??? and continuous in ? for almost every x. Assume there exists K(x) such that E[K(X)]<? and |h(x, ?)|?K(x) for all x and ?. Then, with m(?)=E[h(X, ?)].
where X.sub.1, X.sub.2, . . . are independent with distribution F.
[0139] Lastly, under the given assumptions, the classical central limit theorem yields
?{square root over (n)}.sub.n(?.sub.0).fwdarw..sub.d
(0, I(?.sub.0))
so that, via Theorem 2.2.
?{square root over (n)}({circumflex over (?)}.sub.0??.sub.0).fwdarw..sub.d(0, I(?.sub.0).sup.?1).
[0140] For the exponential family
p(x; ?)=h(x)exp(?(?)T(x)?A(?)) we have q(x; ?) =log h(x)+?(?)T(x)?A(?).
Hence, the needed conditions are satisfied if A(?) and ?(?) have three bounded derivatives in some neighborhood of ?.sub.0, and E.sub.?.sub.
[0141] Application to a diffusion equation model (Section 3). To more fully specify the output function of the diffusion model arising from PDE as described herein, consider the parameter space
={(q.sub.1, q.sub.2)?
.sup.2:q.sub.2>0},(24)
and for given matrices D, E?.sup.k?k, a vector F?
.sup.k and q?
, recall from (5) that
A=A(q)=q.sub.1D+E, B=B(q)=q.sub.2F,(25)
and that the TAC at time t is given by
?.sub.?(t; q)=?.sub.0.sup.tCe.sup.A(t?s)B?(s)ds,(26)
where C.sup.T?.sup.k, and ?(s) is the BrAC/BAC at time s. Though our methods work in the given generality, in the physics based model the matrix A will have eigenvalues with negative real parts, and q.sub.1 will be strictly positive. The dependence of ? on A, B, C, ? or q may be dropped in the following for ease of notation, or included to emphasize some particular feature of interest.
[0142] Consider an individual whose data has been collected over i=1, . . . , n drinking sessions, where the BrAC curve ?.sub.i for episode i is integrable on [0, T.sub.i], and for some q.sub.0? and m.sub.i observations of TAC plus a mean zero error
y.sub.ij=?.sub.?.sub.
are taken at the times 0?t.sub.i,1? . . . ?t.sub.i,j.sub.
[0143] For asymptotics, we consider a sequence of experiments indexed by =1,2, . . . , where n and m=(m.sub.1, . . . , m.sub.n) may depend on
, and hence we may index using
in place of n, m, though this dependence may at times be suppressed in the notation. In the case of a single drinking episode, that is, when n=1, we let
=m. For consistency and asymptotic normality, we require that
?.sub.i=1.sup.nm.sub.i.fwdarw.? as .fwdarw.?.(28)
[0144] In the special case where the number of observations m.sub.i for each n equals a constant m, the requirement (28) becomes nm.fwdarw.?, and in the sub-case of a single drinking episode. that m.fwdarw.?.
[0145] Recall that a sequence of measures {v.sub.k, k?1} on is said to converge weakly to a measure v if
[0146] Any sequence {v.sub.k, k?1} of probability measures whose supports are contained in a bounded set is tight, and hence when the weak limit v exists it will also be a probability measure, and its support also so contained.
[0147] There are two special cases of note for the sequence of measures {v.sub.k, k?1}. One is where the distances between consecutive observation times on [0, T] are constant; in this case, the weak limit is the uniform probability measure on [0, T]. A second case is when the observation times are chosen independently according to the probability measure v supported on [0, T]; in this case, the weak limit in probability is ?.
[0148] Let the gradient of (q) be denoted
[0149] We apply the methods developed herein to the least squares estimator achieved as a solution to
(q)=0 where
(q)=?.sub.q
(q),(30)
where (q) is given by the sum of squares in (6). For i?{1,2}, we continue to let ?.sub.i denote taking the partial derivative with respect to q.sub.i; this notation will extend in the natural way to denote higher order, and mixed partial derivatives. Theorem 3.1 below gives conditions under which the least squares estimate is consistent and has a limiting, asymptotically normal distribution, and as well provides the form of the limiting covariance matrix. Theorem 3.1 is an immediate consequence of Theorems 3.2 and 3.3. that verify the conditions of Theorems 2.1 and 2.2 in the previous section.
[0150] To set the stage for the statements and proofs of our results, we note that when v.sub.n, m?1 is the discrete probability measure giving equal weight to the times t.sub.m,1, . . . , t.sub.m,m in [0, T], then for any continuous function h:[0, T].fwdarw., when v.sub.m converges weakly to v, we have
[0151] By considering components, the same relations hold when h continuously maps [0, T] to the space of matrices of some fixed dimension. For a given BrAC curve ?, of particular interest is the .sup.2?2 valued matrix function g.sub.? in (29) that determines, via
(q), the limiting covariance matrix ? of our q parameter estimate.
[0152] We consider two special cases where the existence of the limit ? is guaranteed. For a single drinking episode, that is, when n=1, when v.sub.m converges weakly to v, due to the continuity of elements of g.sub.?(u) as shown in Lemma 3.3, we have, as m.fwdarw.?,
In particular, v will be the uniform probability measure on [0, T] when the number m of sampling times tend to infinity, and the consecutive distances between them are equal.
[0153] For another case, consider a situation where the data from n drinking episodes are independent and identically distributed from replicates of the error distribution and canonical M, T, ?, v, where M is the distribution of m.sub.i, 1?i?n, making the summands in (33) i.i.d. When T<?a.s. Lemma 3.3 shows that the integrals in (33) are uniformly bounded, and one can show that as n.fwdarw.?.
where the expectation is taken over M, T, ? and v, whenever the expectation on the right hand side exists. We now present our main result regarding the least squares estimator for the diffusion model.
[0154] Theorem 3.1 Suppose the errors ?.sub.i,j, 1?i?n. 1?j?m.sub.i in model (27) are mean zero, uncorrelated and have constant positive variance ?.sup.2. With ?.sub.i and v the BrAC curve and the empirical measure of the observation times for episode i=1, . . . , n, we assume the existence of the limit
that ? is positive definite, and that (28) holds. Then there exists a consistent sequence of solutions to the estimating equation
(q)=0.
[0155] If in addition the errors ?.sub.i,j are i.i.d., and for some ?>0 satisfy E|?.sub.i,j|.sup.2+?=?.sup.2+?<?, then, along any such consistent sequence ,
[0156] When the errors ?.sub.i,j in (27) are Gaussian, then the least squares estimate that minimizes the sum of squares (6) is also maximum likelihood. In this case the contribution to the Fisher information from the single observation in (27) is obtained by taking the covariance matrix of the gradient of the log of the density of the observation,
[0157] Summing over the observation times yields m/?.sup.2 as in (33), hence taking the limit and comparing with the asymptotic variance obtained we see that for normal errors the least squares estimate of q achieves the lower bound of the information inequality in an asymptotic sense.
[0158] Before proceeding, we must verify the smoothness of the derivatives of ?.sub.?(t; q) in (26) with respect to q=(q.sub.1, q.sub.2). Because of the form of the dependence of the matrix A on q.sub.1 as given in (25), to differentiate ? with respect to q.sub.1 we will need to consider directional derivatives of matrix exponentials. For square matrices W and V of the same dimension and u?, define the first derivative of e.sup.uW in direction V by
[0159] We define higher order derivatives .sub.V.sup.k(u, W), k?0 in the natural way, with k=0 returning e.sup.uW. Now with A=q.sub.1D+E as in (25), we may represent the partial derivative with respect to q.sub.1 of e.sup.uA as
and extending to higher order derivatives we obtain
?.sub.1.sup.n(e.sup.(q.sup..sub.D.sup.n(u, A).(35)
[0160] For any n?0, letting B.sub.n be the (n+1)?(n+1) block matrix given by
[0161] A known theorem provides that
[0162] We now apply (37) to obtain bounds on higher order derivatives of the matrix exponential e.sup.uA with respect to q.sub.1.
[0163] Lemma 3.1 Let W and V be square matrices of the same dimension. Then for all n?0 the directional derivative .sub.V.sup.n(u, W) is analytic in u on
and satisfies the bound
?.sub.V.sup.n(u, W)??n!?e.sup.uB.sup.
,(38)
where B.sub.n is given by (36). For all n?0, q.sub.1? and A=q,.sub.1D+E, the partial derivative ?.sub.1.sup.ne.sup.Au exists, is analytic in q.sub.1 and satisfies ??.sub.1.sup.ne.sup.uA??n!e.sup.u?B.sup.
[0164] Proof: As the left hand side e.sup.uB.sup.
as any value over which the first supremum is taken can be achieved in the second by padding x and y with zeros in coordinates that are not in i and j, respectively. Hence, inequality (38) now follows from (37). The remaining claims now follow in light of (35).
[0165] We require the following result to handle the derivatives of matrix products. For k?0, Q as in (24), we say a matrix M depending on (q, u)?Q? is k-smooth if for any 0?j.sub.1, j.sub.2?k, the mixed partials ?.sub.1.sup.j.sup.
, and for any bounded subsets
?
and I?
,
We say M is smooth if it is k-smooth for all k?0.
[0166] Lemma 3.2 Let M.sub.i, i=1, . . . , d be matrices having dimensions such that we may form the product
If M.sub.1, . . . , M.sub.d are k-smooth then so is M.
[0167] Proof: The proof follows directly from the multivariate Leibniz rule that expresses the derivative ?.sub.1.sup.j.sup.
[0168] The next lemma provides us with additional smoothness estimates, and the forms of derivatives that later appear.
[0169] Lemma 3.3 For all u? the matrix function e.sup.AuB is smooth in q. If ?(?) is integrable on [0, T], then ?.sub.?(t; q) as in (26) is smooth in q, continuous for t?[0, T] and satisfies
|?.sub.?(t; q)|?q.sub.2e.sup.T(q.sup.
[0170] For q?, and
?.sub.1(e.sup.AuB)=?.sub.1(e.sup.Au)B and ?.sub.2(e.sup.AuB)=q.sub.2.sup.?1e.sup.AuB.(39)
[0171] Proof: That e.sup.Au is smooth follows from Lemma 3.1, and one easily verifies the smoothness of B directly from (25); hence, the product is smooth by Lemma 3.2. Differentiation under the integral is then justified by the dominated convergence theorem, from which the smoothness of ?.sub.?(t; q) in q then follows; continuity for t?[0, T] follows immediately from the integral representation (26). The claims in (39) follow by recalling that B=q.sub.2F.
[0172] We now begin to verify the conditions of Theorems 2.1 and 2.2.
[0173] Theorem 3.2 Suppose the errors ?.sub.i,j, 1?i?n, 1?j?m.sub.i in model (27) are mean zero, uncorrelated and have constant positive variance ?.sup.2. Assume in addition that the limit ? in (33) exists and is positive definite, and that (28) holds. Then with given by (??) and (6), the hypotheses of Theorem 2. 1 are satisfied with ? as in (33), a.sub.n=1 and any bounded neighborhood ?.sub.0?
of q.sub.0.
[0174] Proof: Let ?.sub.0 be any bounded neighborhood of q.sub.0. By Lemma 3.3, the partial derivatives of ?.sub.ij(q):=?.sub.?(t.sub.i,j; q) of (26) of all orders exist, and are continuous and uniformly bounded over ?.sub.0. Hence is twice continuously differentiable, with uniformly bounded derivatives, over ?.sub.0.
[0175] Now write the score function as
[0176] In particular,
[0177] Differentiating (40) and evaluating at q.sub.0, we find
(q.sub.0)=
(q.sub.0)?
(q.sub.0),(42)
where, using (33),
[0178] To show the first condition in (11), note that [
.sub.n(q.sub.0)]=0 as the error variables have mean zero. Next, using that the error variables are uncorrelated and have constant variance yields that the covariance matrix
=Var(
(q.sub.0)) is given by
[0179] The claim follows by noting that ?.sub.2?.sub.?(q)=(1/q.sub.2)?.sub.?(q) by Lemma 3.3, and that ??.sub.n?.fwdarw.0 as ?.sub.n.fwdarw.? and ?.sub.im.sub.i.fwdarw.? by assumption. For the second condition in (11), we recognize that .sub.n,1(q.sub.0)=?.sub.n, and can show that the components of
.sub.n,2(q.sub.0) have mean zero and variance converging to zero, so that the sum of these two matrices tends to ? in probability as n.fwdarw.?. The matrix ? is positive definite by assumption, so the last condition in (11) holds.
[0180] Lastly, we show that inequality (12) is satisfied. From the decomposition (40) we see that we may write ?.sub.k,lU.sub.n,r(q) as a difference of the form
for some functions g.sub.p,ij,p=1,2, where, by Lemma 3.3 and the product rule, for some K.sub.1
[0181] Hence, for the first component,
while for the second component,
[0182] Hence, for any ??(0,1), by Chebyshev's inequality, we may pick K.sub.2 such that P(|S.sub.n|?K.sub.2)??/8 for all n?1. Thus, setting K=K.sub.1+K.sub.2, we obtain, for all n?1,
[0183] The claim now follows by taking a union bound over the eight choices for k, l and r.
[0184] Theorem 3.3 Assume the errors ?.sub.ij, 1?i?n, 1?j?m.sub.i are i.i.d with mean zero, variance ?.sup.2 and for some ?>0 we have E|?.sub.ij|.sup.2+?=?.sup.2+?<?. Assume that (28) holds and that the limit ? as given in (33) exists. Then for .sub.n(q),
[0185] Proof: We verify (17) of Lemma 2.1. For the first condition there,
we see that the mean of b.sub.n.sub.n(q.sub.0) is zero, and by (33)
[0186] For the second condition of (17), write
[0187] By the assumption |?.sub.ij|.sup.2+???.sup.2+? and Lemma 3.3 there exists C such that
which tends to zero by (28).
[0188] We conclude this section with: Proof of Theorem 3.1: Theorems 3.2 and 3.3 show that the hypotheses of Theorems 2.1 and 2.2 are satisfied, yielding the claims for consistency and asymptotic normality. It remains to prove the claims on the consistency of the variance estimator. By (34), and letting m=?.sub.i=1.sup.nm.sub.i, we have
[0189] The first term tends to ?.sup.2 in probability by the weak law of large numbers. To handle the second term, letting
[0190] With B.sub.1 the unit ball centered at q.sub.0, Lemma 3.3 shows that the first derivatives of ?.sub.j(q) are uniformly bounded for (q, t)?B.sub.1?[0, T], that is, there exists some K>0 such that over this set
|?.sub.i j(q.sub.0)??.sub.i j(q)|?K?q.sub.0?q?.(45)
[0191] Let ??(0, K) be arbitrary, F={|q.sub.0?{circumflex over (q)}.sub.n|??/K} and note that
[0192] By Markov's inequality, for any ?>0,
using the non-negativity of R in the fourth inequality, and the consistency of q.sub.n when taking the limit. As ? can be made arbitrarily small we conclude that P(S??).fwdarw.0, and as t is arbitrary, that S.fwdarw..sub.p0.
[0193] Similarly decomposing the third term on the good event where q.sub.n is in B.sub.1, and the complentary event which tends in probability to zero, on the good event, applying the inequality (45), this last term is bounded as
which tends to zero in probability in view of the consistency of {circumflex over (q)}.sub.n.
[0194] Inference on the BrAC curve. In this section we obtain confidence bounds on a BrAC curve generated by a drinking episode of a subject in the field, and estimated using n TAC observations and an estimate q.sub.m computed from m measurements in a previous calibration experiment. Our notation here differs from that used in previous sections, the previous parameter n now being absorbed in the number m of total observations for calibration. Our uniform confidence bounds for the reconstructed BrAC curve are obtained by applying a variation on the standard multivariate delta method, using the properties provided by Theorem 3.1 on q.sub.m, and the assumed properties of the TAC measurement error.
[0195] We begin by specifying in detail how we obtain our estimate of the BrAC curve. Independently of q.sub.m, n TAC observations y.sub.n,1, . . . , y.sub.n,n are collected from a drinking episode at the increasing times 0?s.sub.n,1< . . . <s.sub.n,n?S, given by
y.sub.j=?.sub.?(s.sub.n, j; q.sub.0)+?.sub.n, j where ?.sub.?(s; q)=?.sub.0.sup.sCe.sup.A(s?u)B?(u)du(46)
as in (26), where ? is the unknown BrAC curve to be estimated, the matrices A and B depend on q as in (25), and C.sup.T is a given fixed vector.
[0196] To start, we assume only that the cross ?.sub.n,1, . . . , ?.sub.n,n are uncorrelated and have mean zero. We will allow for the possibility that the device used in the field may have different characteristics than the one used for calibration, and for now only impose the condition that the field noise variances are uniformly bounded above by
[0197] Assume v.sub.n, n?1, the empirical probability measure of the TAC observation times, has weak limit v.sub.0. For a given resolution level p?n we select a basis of p integrable functions {?.sub.k, 1?k?p} on [0, S]. The finite basis approximation of ? of order p at time s?[0, S], with coefficient vector ??.sup.p, is given by
{circumflex over (?)}(s; ?)=?(s).sup.T? where ?(s)=[?.sub.1(s), . . . , ?.sub.p(s)].sup.T?.sup.p.(47)
[0198] Substitution of ? by the approximation {circumflex over (?)}(s; ?) into the integral in (46) yields the predicted TAC values given at ?[0, S] by
?.sub.{circumflex over (?)}(s; q)=(?.sub.0.sup.sCe.sup.A(s?u)B?(u).sup.Tdu)?;=?(s; q).sup.T? where ?(s; q)?.sup.p.(48)
[0199] Now dropping the double index notation for simplicity, letting s.sub.n=(s.sub.1, . . , s.sub.n).sup.T, for a given q and a sequence of symmetric, non-negative definite matrices M.sub.n?R.sup.p?p, we take ?=?(s.sub.n, q)?.sup.p to minimize the objective function
[0200] By standard results in least squares estimation, when W(s.sub.n; q).sup.TW(s.sub.n; q) is full rank, the unique minimizer of J(?) is given by
[0201] We may also write these equations in a somewhat more convenient form. For n?1 let
and let these same formulas hold for n=0 upon setting ?.sub.0=0. Then, using the alternative notation ?.sub.n(q) for ?(s.sub.n, q), we recover (50) from
?.sub.n(q)=(G.sub.n(q)+M.sub.n).sup.?1(Z.sub.n(q)+?.sub.n(q))n?0,(52)
where, now assuming that the sequence of matrices M.sub.n has limit M.sub.0, we also define ?.sub.0(q) by (52) applying the stated convention that ?.sub.0=0. We note G.sub.n(q)+M.sub.n will be invertible whenever M.sub.n is positive definite. For notational simplicity in what follows, let
H.sub.n(q)=G.sub.n(q)+M.sub.n for n?0.(53)
When basing inference on the estimate q.sub.m obtained from a calibration session, as in (47), the estimated BrAC curve is given by ?.sub.n(s; q.sub.m), where
?.sub.n(s; q)=?(s).sup.T?.sub.n(q)n?0, s?[0,S].(54)
[0202] Next, define the Lipschitz (semi)norm of a real valued function g with domain ?
by
In order to control the variation in the estimate ?.sub.n(q) caused by that in q.sub.m, we introduce the following assumption.
[0203] Assumption 3.1 For a given sequence of empirical probability measures v.sub.n, n?1 with weak limit v.sub.0, all supported on [0, S], there exist a constant C and a sequence r.sub.n, n?1 of real numbers tending to zero as n.fwdarw.? such that
[0204] As v.sub.n([0, S])=v.sub.0([0, S]) the difference over which the supremum is taken in (55) is unchanged by replacing g(x) by g(x)+c for any constant c, and hence we may assume that g(0)=0. In particular, for x?[0,S] we then have that
[0205] When v.sub.n is the empirical probability measure of the n equally spaced observations made at times s.sub.i=Si/n, i=1, . . . , n, then the limit measure v.sub.0 is the uniform probability measure over [0, S]. Now,
and Assumption 3.1 holds with C=S.sup.2+S, and r.sub.n=1/n.
[0206] Alternatively, when v.sub.n is the empirical measure of times X.sub.1, . . . , X.sub.n, independent with common distribution v.sub.0 supported on [0, S], then Assumption 3.1 holds with r.sub.n=1/?{square root over (n)} with high probability. In particular, there exists a constant C such that
[0207] For S.sub.n=?{square root over (n)}W.sub.n, we have
P(S.sub.n?E[S.sub.n]+?{square root over ((4?g?E[S.sub.n]+2n?g?.sup.2)x)}+?g?x/3)?e.sup.?x for all x?0.
Using the bound (57), and recalling (56), with C being a constant not necessarily the same at each occurrence, we obtain
E[S.sub.n]+?{square root over (4?g?E[S.sub.n]+2n?g?.sup.2)x)}+?g?x/3 ?LS(C?{square root over (n)}+?{square root over ((C?{square root over (n)}+2n)x)}+x/3)?LSC(?{square root over (n)}+?{square root over (nx)}+x),
implying that, with r.sub.n=1/?{square root over (n)}
Hence, given ??(0,1), inequality (55) in Assumption 3.1 holds with probability at least 1?? for some constant C depending only on ? and r.sub.n=1/?{square root over (n)}.
[0208] We next pause to prove some technical results that will be invoked in Theorems 3.4 and 3.5; the partials inside the integral in (58) can be computed applying (39) of Lemma 3.3.
[0209] Lemma 3.4 The partial derivatives of ?(s, q) in (48) with respect to the i.sup.th component of q for i=1,2 exist and are given by
?.sub.i?(s, q).sup.T=?.sub.0.sup.sC(?.sub.ie.sup.A(s?u)B)?(u).sup.Tdu,(58)
are bounded and continuous as a function of s?[0,S] and continuous in q on as given in (24), and there exists a finite constant L such that on [0, S]
[0210] Further, at any q?. G.sub.n(q) and Z.sub.n(q) given in (51) are continuous and
?.sub.iZ.sub.0(q)=?.sub.0.sup.S?.sub.i?(s, q)?.sub.?(s, q.sub.0)dv.sub.0 and
?.sub.iG.sub.0(q)=?.sub.0.sup.S(?.sub.i?(s, q)?(s, q).sup.T+?(s, q)?.sub.i?(s, q).sup.T)dv.sub.0(60)
exist and are continuous. For ? in (52), the partials
?.sub.i?.sub.0(q)=H.sub.0(q).sup.?1?.sub.iZ.sub.0(q)?H.sub.0(q).sup.?1(?.sub.iG.sub.0)H.sub.0(q).sup.?1Z.sub.0(q)(61)
exist and are continuous at any q? for which H.sub.0(q).sup.?1 exists.
[0211] Proof: The claims for ?(s, q) and its partial derivatives follow directly from Lemma 3.3, and the integrability of ?.sub.k(u), k=1, . . . , p on [0,S]. The claims on the partials of G.sub.n(q) and Z.sub.n(q), that imply the continuity of these functions, follow from the continuity of ?.sub.?(s, q.sub.0) over s?[0, S] as provided by Lemma 3.3, the demonstrated properties of ?(s, q) and the dominated convergence theorem. The well known formula for differentiating matrix inverses yields (61) and the final claim, noting that the map taking a matrix to its inverse is continuous.
[0212] Lemma 3.5 Let G.sub.n(q) and Z.sub.n(q) be as in (51) for n?0. and suppose H.sub.0(q).sup.?1 exists for some q.sub.0?. Then there exists a compact set
?
with non-empty interior containing q.sub.0 such that H.sub.0(q).sup.?1 exists for all q?
, and if Assumption 3.1 holds then there exists a constant C such that for all n sufficiently large
and for all n sufficiently large and n=0.
When the field error variables ?.sub.1, . . . , ?.sub.n have mean zero, and are uncorrelated with variances uniformly dominated by
[0213] Proof: Denoting the i.sup.th largest eigenvalue of a symmetric matrix by ?.sub.i(?), by Weyl's theorem (see Theorem 4.3.1 of [Horn and Johnson, 2012]), for N.sub.1 and N.sub.2 symmetric matrices in .sup.p?p,
|?.sub.i(N.sub.1)??.sub.i(N.sub.2))|??N.sub.1?N.sub.2? for i=1, . . . , p.(65)
[0214] The matrix G.sub.0(q) is continuous in q by Lemma 3.4, hence H.sub.0(q) is likewise continuous. As G.sub.0(q) and M.sub.0 are non-negative definite for all q and H.sub.0(q) is invertible at q.sub.0 by assumption, the continuity of ?.sub.1(?) provided by (65) yields the existence of ?>0 such that ?.sub.1(H.sub.0(q))>? for all q in some bounded open subset of containing q.sub.0, and again by continuity this same inequality holds in the non-strict sense over the closure; the first claim in (63) follows.
[0215] By Lemma 3.4 and Assumption 3.1
for some constant C, thus proving the first claim of (62), and the final claim of (64). Since r.sub.n.fwdarw.0 as n.fwdarw.?, for all n sufficiently large CLr.sub.n<?/2, implying, by (65), that in ?.sub.1(H.sub.n(q))>?/2. Hence, for such n,
where the penultimate inequality follows from (65) and by noting that H.sub.n?H.sub.0=G.sub.n?G.sub.0. and the final inequality from (66). The proof of the the second claim in (62) is complete.
[0216] As the first claim in (63) holds for n=0, it holds for all n sufficiently large by the triangle inequality and the first claim in (62). Arguing as for the first claim in (62) and using the smoothness and continuity properties of ?.sub.?(s, q.sub.0) provided by Lemma 3.3, the second follows similarly as a consequence of Assumption 3.1; the second claim of (63) follows by the triangle inequality, as did the first. The final claim (64) follows directly from the definition (51) and the stated assumptions on the error terms.
[0217] Moving to the properties of ?.sub.n(q.sub.m), note the decomposition
?.sub.n(q.sub.m)??.sub.0(q.sub.0)=(?.sub.n(q.sub.m)??.sub.0(q.sub.m))+(?.sub.0(q.sub.m)??.sub.0(q.sub.0)),(67)
and for the first term, applying (52) we may write
We prove the following distributional limit theorem for the final term in (68).
[0218] Lemma 3.6 Assume H.sub.0(q).sup.?1 exists for q.sub.0?. In addition, let the errors ?.sub.ii?1 be independent, mean zero with common variance of ?.sub.?.sup.2?(0, ?), and suppose for some ?>0 and K>0 that E|?.sub.i|.sup.2+??K for all i?1. If ?{square root over (m)}(q.sub.m?q.sub.0)=O.sub.p(1). Assumption 3.1 holds. and that n, m.fwdarw.? so that ?{square root over (m)}r.sub.n.fwdarw.0, then
?{square root over (m)}H.sub.n(q.sub.m).sup.?1?.sub.n(q.sub.m)=?{square root over (m)}H.sub.0(q.sub.0).sup.?1?.sub.n(q.sub.0)+o.sub.p(1)(69)
and
?{square root over (n)}H.sub.n(q.sub.m).sup.?1?.sub.n(q.sub.m).fwdarw..sub.dY?(0, ?.sub.?.sup.2H.sub.0(q.sub.0).sup.?1G.sub.0(q.sub.0)H.sub.0(q.sub.0).sup.?1).(70)
The condition that ?{square root over (m)}(q.sub.m?q.sub.0)=O.sub.p(1) is implied by the conditions of Theorem 3.1, as they provide the stronger conclusion that this quantity converges in distribution.
[0219] Proof: By the consistency of q.sub.m for q.sub.0, we may assume without loss of generality that q.sub.m is contained in given in Lemma 3.5. Writing
magentaH.sub.n(q.sub.m).sup.?1?.sub.n(q.sub.m)
magenta=(H.sub.n(q.sub.m).sup.?1?H.sub.0(q.sub.m).sup.?1)?.sub.n(q.sub.m)+H.sub.0(q.sub.m).sup.?1(?.sub.n(q.sub.m)??.sub.n(q.sub.0))
magenta+(H.sub.0(q.sub.m).sup.?1?H.sub.0(q.sub.0).sup.?1)?.sub.n(q.sub.0)+H.sub.0(q.sub.0).sup.?1?.sub.n(q.sub.0),(71)
we show (69) by demonstrating that the first three terms tend to zero in probability after scaling by ?{square root over (m)}. We see the claim is true for the first term by virtue of the second inequality of (62) and (64) of Lemma 3.5.
[0220] For the second term, define
Let ?>0 be given and be as in Lemma 3.5. Since ?{square root over (m)}(q.sub.m?q.sub.0)=O.sub.p(1), there exists M such that
By the independence between q.sub.m and ?.sub.j, j=1, . . . , n,
E(A.sub.n,m1.sub.?.sub.
Hence, via the conditional variance formula, and that ?.sub.M,m is measurable with respect to q.sub.m, we obtain
where we used Lemma 3.4 in the first inequality. Hence, now invoking (72) and the ?>0 is arbitrary, the second term is o.sub.p(1) via (63).
[0221] For the third term we use the consistency of q.sub.m for q.sub.0, the continuity of H.sub.0(q) in q, and in addition (64). The proof of (69) is complete.
[0222] By (69) it suffices to show that
We apply Lemma 2.1, noting that the first condition in (17), that G.sub.m(q.sub.0) converges to G.sub.0(q.sub.0), holds by Lemma 3.5.
[0223] It remains to verify the second condition in (17). The vector ?(s, q) is uniformly bounded over s?[0, S] and q? by (59) of Lemma 3.4. Hence, for some constant C, as n.fwdarw.?,
thus completing the proof of the lemma.
[0224] We now prove a consistency result for ?.sub.n(q.sub.m), and apply it to show that the BrAC curve estimate converges uniformly in probability to ?.sub.0(s; q.sub.0), the unique function in the L.sup.2 space spanned by the first p basis elements that is closest to the true BrAC curve.
[0225] Theorem 3.4 Suppose that q.sub.m is consistent for q.sub.0 as m.fwdarw.?, that H.sub.0(q.sub.0) is invertible, and that the error variables ?.sub.1, . . . , ?.sub.n have mean zero, and are uncorrelated with variances dominated by
?.sub.n(q.sub.m).fwdarw..sub.p?.sub.0(q.sub.0),
and the reconstructed BrAC curve obeys ??.sub.n(?; q.sub.m)??.sub.0(?; q.sub.0)?.fwdarw..sub.p0.
[0226] Proof: Let ?
be given by Lemma 3.5. By the consistency of q.sub.m, for any given ? there exists m.sub.0 such that for all m?m.sub.0 the probability of E.sub.m={q.sub.m?
} is at least 1??. By the triangle inequality, to show ?.sub.n(q.sub.m) is consistent, it suffices to verify that the two terms on the right side of (67), with q=q.sub.m, both tend to zero in probability. The first term converges to zero in probability on E.sub.m by virtue of (68), Lemma 3.5 and Assumption 3.1. The second term tends to zero by virtue of the consistency of q.sub.m and the continuity of the function ?.sub.0(?).
[0227] The last claim follows from the first, using that ?(s) is bounded on [0, S], and from (47), yielding
[0228] Lastly, we determine the asymptotic distribution of the estimated BrAC curve, properly scaled, and show in (75) how uniform confidence bounds can be constructed asymptotically; an expression for the partials required for the computation of K in (73) is provided by (61) and (60).
[0229] Theorem 3.5 Suppose that
?{square root over (m)}(q.sub.m?q.sub.0).fwdarw..sub.d(0, ?.sup.2?.sup.?1)asm.fwdarw.?
for some invertible matrix ?, that G.sub.0(q.sub.0) is invertible, that Assumption 3.1 holds for r.sub.n ?{square root over (m)}r.sub.n.fwdarw.0, that ?.sub.i, i=1, . . . , n are independent mean zero random variables with variance ?.sub.?.sup.2 and uniformly bounded 2+? moments for some ?>0 and that sup.sub.k?1??.sub.k?<?.
If m/n.fwdarw.??[0, ?),
?{square root over (m)}(?.sub.n(q.sub.m)??.sub.0(q.sub.0)).fwdarw..sub.d(0, ?.sup.2K.sup.T?.sup.?1K+??.sub.?.sup.2G.sub.0.sup.?1(q.sub.0))
where K=?.sub.q?.sub.0(q.sub.0).sup.T,(73)
and
W.sub.m(s)=?{square root over (m)}(?.sub.n(s; q.sub.m)??.sub.0(s; q.sub.0)).fwdarw..sub.dW.sub.?, ?(s)
where W.sub.?, ?(s)=?(s).sup.T(?K.sup.T?.sup.?1/2Z.sub.1+?{square root over (?)}?.sub.?G.sub.0.sup.?1/2(q.sub.0)Z.sub.2)(74)
as processes on the space C[0, S] of continuous functions on [0, S], endowed with the supremum norm, where the ?.sup.?1/2 and G.sub.0.sup.?1/2(q.sub.0) are the the unique positive definite square roots of ?.sup.?1 and G.sub.0.sup.?1(q.sub.0) respectively, and Z.sub.1?(0, I) and Z.sub.2?
(0, I), are standard two dimensional Gaussian random vectors. In addition,
If m/n.fwdarw.?=?, then (73), (74) and (75) hold with the scaling ?{square root over (m)} replaced by ?{square root over (n)} and the parameters of the limiting distributions in those displays set to (?, ?)=(0,1).
[0230] Remark 3.1 The boundary case ?=0 corresponds to the situation where the number of observations taken in the field is so large that the variability of the resulting BrAC estimate depends only on the uncertainty in the parameter estimate q.sub.0. hence asymptotically equivalent to the situation where the field observations are taken without noise. At the other extreme, the case ?=? reflects the situation where the number of observations taken in the calibration experiment in the lab is so large that for the purposes of BrAC estimation, the parameter q.sub.0 is, in a practical sense, known.
[0231] Proof: By the delta method using that ?.sub.q?(q) is continuous in a neighborhood of q.sub.0 by Lemma 3.4, w obtain
?{square root over (m)}(?.sub.0(q.sub.m)??.sub.0(q.sub.0)).fwdarw..sub.d?U?(0,?.sup.2K.sup.T?.sup.?1K).(76)
Next we note that by the triangle inequality the first two terms in (68) tend to zero by the consistency of q.sub.m for q.sub.0, Lemma 3.5 and the condition ?{square root over (m)}r.sub.n.fwdarw.0. Now suppose that m/n.fwdarw.??[0, ?) by Lemma 3.6, and adopting the notation in (70).
Using (69) of Lemma 3.6 we see that Y is the distributional limit of a quantity not depending on q.sub.m, plus a term that tends to zero in probability, thus showing that U and Y are asymptotically independent. Hence
?{square root over (m)}(?.sub.n(q.sub.m)??.sub.0(q.sub.0)).fwdarw..sub.d?U+?{square root over (?)}Y,(78)
completing the proof of (73)
[0232] Letting ?(n, m)=?{square root over (m)}(?.sub.n(q.sub.m)??.sub.0(q.sub.0)), by the definition of W.sub.m in (74), ?.sub.n in (54), and the convergence in (78), for d?1 the finite dimensional distributions of W.sub.m at the times points 0?s.sub.1< . . . <s.sub.d?S converge to those of W.sub.0, as
[W.sub.m(s.sub.1), . . . , W.sub.m(s.sub.d)].sup.T=[?(s.sub.1), . . . , ?(s.sub.d)].sup.T?(n, m) .fwdarw..sub.d[?(s.sub.1), . . . , ?(s.sub.d)].sup.T(?U+?{square root over (?)}Y)=.sub.d[W.sub.?, ?(s.sub.1), . . . , W.sub.?, ?(s.sub.d)].sup.T.(79)
[0233] Define the modulus of continuity of a continuous function ?(s) on [0, S] by
[0234] The proof will be complete upon showing following two properties that together imply {W.sub.m, m?1} is tight: for every positive ?, there exists a such that
P(|W.sub.m(0)|>a)?? for all m?1,(80)
and for every positive ? and ?, there exists ?>0 and an integer m.sub.0 such that
P(?.sub.W.sub.
Condition (80) follows from (79) with d=1 and s.sub.1=0; as W.sub.m(0) converges in distribution, the sequence W.sub.m(0) is tight.
[0235] Let positive ? and ? be given. As ?(n, m) converges in distribution there exists C such that P(??(n, m)??C)?1?? for all m?1. Let
?=inf{?:?.sub.?.sub.
this quantity will be positive for all ?>0 as each basis function ?.sub.k, k=1,2, . . . , p is continuous on [0, S], and therefore uniformly continuous. Thus, with probability at least 1??, for |s?t|<?, 0?s, t?S and all m?1,
|W.sub.m(s)?W.sub.m(t)|=|(?(s)??(t)).sup.T?(n, m)|???(s)??(t)???(n, m)??C??(s)??(t)??C?.sub.i=k.sup.p|?.sub.k(s)??.sub.k(t)|?C?.sub.i=k.sup.p?.sub.?.sub.
from which (81) follows with m.sub.0=1.
[0236] Lastly, consider the case m/n.fwdarw.?. Scaling now by ?{square root over (n)} rather than ?{square root over (m)}, we see that
as the first term tends to zero and the second one converges in distribution. Hence, the only term contributing is (70), and the argument for the previous case carries through with essentially no modification.
[0237] As K and ? in (75) depend on the unknown q.sub.0, in practice these quantities can be estimated by their values at along a sequence of consistent estimates q.sub.m, m?1. As K and ? are continuous at q.sub.0, these resulting estimates will likewise be consistent. Similar remarks apply as to the estimation of ?.sub.?.sup.2 and G.sub.0(q.sub.0).
[0238] Remark 3.2 The regularization matrix M.sub.n in the objective function (49) is used to avoid numerical instability; details on the relevant choice of M.sub.n used here can be found in Section 4. However, regularization can induce bias. To illustrate, assume for some p that ?(s)?span{?.sub.1(s), . . . , ?.sub.p(s)}, that is,
?(s)=?(s).sup.T?.sub.* for some ?.sub.*?.sup.p.(82)
In the limiting case n=0 of (52) for q=q.sub.0, in light of (47), (48) and (82), we obtain B.sub.0(q.sub.0)=(G.sub.0(q.sub.0)+M.sub.0).sup.?1Z.sub.0(q.sub.0)=(G.sub.0(q.sub.0)+M.sub.0).sup.?1G.sub.0(q.sub.0)?.sub.*. In particular, the limiting coefficient vector ?.sub.0(q.sub.0) ? be biased for the true ?.sub.8 unless M.sub.0=0.
[0239] Regularization Details (Section 4). For u in the span of a basis {?.sub.i, i=0, . . . , p} of differentiable functions on [0, T], there exists a unique vector ?=[?.sub.0, . . . , ?.sub.p].sup.T?.sup.p+1 such that
where ?(t)=[?.sub.0(t), . . . , ?.sub.p(t)].sup.T?.sup.p+1. In this case, we the express the L.sup.2 norm of ? and its derivative as a quadratic form involving matrices R and S given by
?.sub.0.sup.T?.sup.2(t)dt=?.sub.0.sup.T(?(t).sup.T?).sup.T?(t).sup.T?dt=?.sup.T[?.sub.0.sup.T?(t)?(t).sup.Tdt]?=:?.sup.TR?,
and
?.sub.0.sup.T[?(t)].sup.2dt=?.sub.0.sup.T(?(t).sup.T?).sup.T?(t).sup.T?dt=?.sup.T[?.sub.0.sup.T?(t)?(t).sup.Tdt]?=:?.sup.TS?,
[0240] We regularize using a linear combination of these penalty terms, as
M=??.sub.0.sup.T?.sup.2(t)dt+??.sub.0.sup.T[?(t)].sup.2dt=?R+?S.
We specialize now to the chapeau functions given by
and letting r=T/p, we claim
We note that the result of zero when |i?j|?2 follows immediately from the fact that ?.sub.i and ?.sub.j have disjoint support in this case.
[0241] Defining
?.sub.?(t)=(1+t)1.sub.[?1,0] and ?.sub.+(t)=(1?t)1.sub.[0,1],
we may write
?.sub.j(t)=?.sub.?(t/r?j)1.sub.j?1+?.sub.+(t/r?j)1.sub.j?p?1.
We note that
?.sub.0.sup.1?.sub.+.sup.2(t)dt=?.sub.0.sup.1?.sub.?.sup.2(t)dt=?
implying ?.sub.0.sup.T?.sub.+.sup.2(t/r?j)dt=?.sub.0.sup.T?.sub.?.sup.2(t/r?j)dt=r/3,(84)
and that
?.sub.0.sup.1?.sub.?(t?1)?.sub.+(t)dt=?.sub.0.sup.1t(1?t)dt=?
implying ?.sub.0.sup.T?.sub.?(t/r?(j?1))?.sub.+(t/r?j)dt=r/6.(85)
[0242] Further, as ?.sub.?(t)=1.sub.[0,1] and ?.sub.+(t)=?1.sub.[0,1], we have
?.sub.j(t)=r.sup.?1(1.sub.[(j?1)r,jr]1.sub.j?1?1.sub.[jr,(j+1)r]1.sub.j?p?1).(86)
For i=j, by (84) we obtain
?.sub.i, ?.sub.i
=?.sub.0.sup.T?.sub.i.sup.2(t)dt
[0243] For the remaining case |i?j|=1, by relabelling we may assume i=j?1, j=1, . . . , p. In this case, by (85) we have
?.sub.j?1, ?.sub.j
=?.sub.0.sup.T?.sub.j?1(t)?.sub.j(t)dt=?.sub.0.sup.T?.sub.?1(t/r?(j?1))?.sub.+(t/r?j)dt=r/6.
Likewise, using (86),
?.sub.i, ?.sub.i
=?.sub.0.sup.T[?.sub.i(t)].sup.2dt=r.sup.?2?.sub.0.sup.T(1.sub.[(j?1)r,jr]1.sub.j?1+1.sub.[jr,(j+1)r]1.sub.j?p . . . 1)dt=r.sup.?1(h1.sub.j?1+1.sub.j?p?1)
and for j=1, . . . , p, noting that
(1.sub.[(j?2)r,(j?1)r]1.sub.j?2?1.sub.[(j?1)r,jr]1.sub.j?p)?(1.sub.[(j?1)r,jr]1.sub.j?1?1.sub.[jr,(j+1)r]1.sub.j?p?1)=?1.sub.[(j?1)r,jr]1.sub.1?j?p,
we obtain
?.sub.j?1, ?.sub.j
=?r.sup.?2?.sub.0.sup.T1.sub.[(j?1)r,jr]1.sub.i?j?pdt=?r.sup.?11.sub.1?j?p
Transdermal Blood Alcohol Monitoring: Simulations and Data Analysis (Section 5)
[0244] In both the simulation and real data study presented below we investigate the case where data are collected from a single drinking episode. The computations were carried out in MATLAB and the optimization producing the estimate of the parameter q was solved using the Optimization Toolbox routine FMINCON.
[0245] Simulation Studies (Section 5.1) Firstly, our simulation study aims to validate our theoretical results on the consistency and asymptotic normality of the parameter estimate given in Theorem 3.1, and to also illustrate the practical impact of the number of observations on its behavior.
[0246] To reflect a simple real-world situation, BrAC was simulated using a small but realistic drinking diary that consists of a single drink 6 minutes after the beginning of the drinking session. BrAC was computed using the Michaelis-Menten approach that models the metabolic effects of the ethanol specific enzymes ADH and ALDH typically found in the liver, and also known to be present in trace amounts in the skin.
[0247] For simplicity, we set q.sub.0=(1,1) to be the true value of the parameter q and T=1 hour to be the duration of the drinking session. Also for simplicity we consider the following choice of vectors and matrices in (4) and (5), D=I.sub.2, E=O.sub.2, C=(1,0) and F=(1,0).sup.T. Then, equally spaced TAC measurement were calculated after adding independent error terms each distributed as (0, ?.sup.2) with ?=0.01 to the expression given by (26).
[0248] Calculating the theoretical limiting covariance matrix in Theorem 3.1 we obtain
[0249] A comparison between ? and the scaled sample covariance matrices of {circumflex over (q)} is shown in Table 1, validating the theoretical results, and showing that, for the current set of parameters, 60 observations gives a reasonably close estimate to the true values.
TABLE-US-00001 TABLE 1 Number of TAC Mean Parameter Scaled Sample observations Estimate Covariance Matrix 20
[0250]
[0251] In a second experiment, our simulation study aims to validate the results of Theorem 3.5 and more specifically to provide confidence bounds for the reconstructed BrAC curve using the result in (75). We use ?(s; ?)=?0.2s(s?1) to generate the true BrAC curve and choose the orthonormal polynomial basis ?(s)=[?{square root over (3)}s, ?{square root over (80)}(s.sup.2?0.75s)].sup.T?.sup.2. Further, according to Theorem 3.5 we let q.sub.m to be generated from
(0, ?.sup.2?.sup.?1) where for simplicity we take ?.sup.2=1 and ? to be the identity matrix.
[0252] The running time of these experiments may be long due to the computation in (4) of the matrix exponential of A, which in general is not symmetric. For that reason, its worth noting that speed can be improved using the following diagonalization procedure. Letting ?.sub.i, i=1, . . . , n be the basis for the finite dimensional approximation discussed in xxx, define matrices K.sub.1, K.sub.2 and M by
K.sub.1,ij=?.sub.0.sup.1?.sub.i(u)?.sub.j(u)du, K.sub.2,ij=?.sub.i(0)?.sub.j(0) and M.sub.ij=?.sub.0.sup.1?.sub.i(u)?.sub.j(u)du.
Then
[0253]
D=?M.sup.?1K.sub.1, E=?M.sup.?1K.sub.2 and A=q.sub.1D+E=?M.sup.?1(q.sub.1K.sub.1+K.sub.2).(87)
We note that the matrices M and q.sub.1D+E are symmetric. Multiplying the final expression for M in (87) on the left and right by M.sup.1/2 and M.sup.?1/2 respectively we obtain
M.sup.1/2AM.sup.?1/2=?M.sup.?1/2(q.sub.1K.sub.1+K.sub.2)M.sup.?1/2.(88)
Note that the right hand side of (88) is symmetric and therefore we may apply the spectral theorem and write
M.sup.1/2AM.sup.?1/2=S.sub.q.sub. M.sup.1/2tAM.sup.?1/2=S.sub.q.sub.
where S.sub.q.sub.
M.sup.1/2(tA).sup.kM.sup.?1/2=(M.sup.1/2(tA)M.sup.?1/2).sup.k=S.sub.q.sub.
which implies
[0254] Real Data Analysis (Section 5.2). This data set was collected by a SCRAM (Secure Continuous Remote Alcohol Monitor by Alcohol Monitoring Systems, Inc.) alcohol biosensor worn by a subject, which, by using fuel-cell technology, measures TAC in terms of local ethanol vapor concentration over the skin surface. Measurements were taken and recorded at non-equally spaced times. In addition, non equally spaced breath measurements were collected, at times that may not have coincided with those of the TAC.
[0255] The data consists of 70 TAC and 28 BrAC observations collected during a single drinking session. The observations were taken over 6.3 hours and both TAC and BrAC observations were taken approximately every 10 minutes. BrAC was measured and recorded at the start of the drinking session and continued until it returned to 0.000. TAC was first measured 67 minutes after the first BrAC measurement and continued until it returned to 0.000. The TAC measurements provided by the sensor are in units of milligrams per deciliter (mg/dl), and the BrAC measurements are in units of percent alcohol.
[0256] For the data analysis, we used k=4 in (26) and computed the matrices C, D, E and F as outlined there. We discretized the given time interval into 300 equal length sub-intervals. over each of which the BrAC is approximated as a constant value determined by interpolating to known BrAC values closest to the endpoints. Minimizing (6) resulted in the estimator {circumflex over (q)}=(0.5577,0.7550)
[0257] Further, we estimate the matrix ? defined in (32) using {circumflex over (q)} in place of q in Lemma 3.3 to take the inner derivatives, and a Riemann sum approximation on the outer integral. Using the q and ? estimates so obtained, and choosing an orthonormal basis in (47) to be such that the reconstructed BrAC curve returns the value 0 at the start of the drinking episode, we conclude via cross validation that a degree p=7 polynomial, computed using (47), provides the best fit to the BrAC curve. Lastly, ?.sub.n(q.sub.m) and the estimated BrAC curve were calculated as in (52) and (54) respectively.
[0258] Uncertainty Quantification in Estimating Blood Alcohol Concentration from Transdermal Alcohol Concentration with Physics-Informed Neural Networks. Having discussed M-estimation in a diffusion model with applications for biosensor transdermal blood alcohol monitoring, the discussion will now shift to uncertainty quantification for the estimation of blood alcohol concentration using physics-informed neural networks. This model may be implemented in one or more embodiment of
[0259] Producing meaningful quantitative measures of alcohol consumption in naturalistic settings is a challenging task for researchers and clinicians. Typically, they rely on the use of a breath alcohol analyzer or a drinker's self report. Unfortunately, both methods have their shortcomings: Obtaining deep lung samples (alveolar air) needed for accurate results can be difficult, and often alcohol contained in the mouth after drinking contaminates the results. Also, the procedure does not allow for continuous measurements. Self reports on the other band might be inaccurate as well, especially since it is known that alcohol directly affects the memory function of the brain.
[0260] Measuring the transdermal alcohol concentration (TAC) creates a possible alternative for the tracking of alcohol consumed. In recent years, biosensor devices for this purpose have been developed. The availability of TAC measuring devices allows for near-continuous measurements of alcohol consumption and helps researchers to gain insight into alcohol metabolism and drinking behavior. In addition. TAC sensors collect the data passively, i.e. contrary to self reports or breath alcohol analyzers no active participation of the subject is required
[0261] However, researchers interested in drinking behavior and alcohol consumption typically base their studies on breath alcohol concentration (BrAC) or blood alcohol concentration (BAC), and it was shown that BAC and BrAC reasonably agree. Hence, to make TAC sensors useful for alcohol research, the need to convert TAC signals to BAC/BrAC signals arises. Unfortunately, the direct conversion from TAC to BAC/BrAC proves to be difficult due to many confounding factors. Differences between devices and varying environmental conditions such as temperature and humidity lead to variations in the measurements. Intra- and inter-individual variations are another source that poses a challenge in the direct conversion from TAC to BAC/BrAC. The porosity and thickness of an individual's skin, the subject's drinking behavior, hydration and vasodilation are important factors in the functional relationship between BAC/BrAC and TAC.
[0262] There may be different approaches to overcome these difficulties. Some approaches use deterministic models for the relationship between BAC/BrAC and TAC. Some are based on regression models, while others model the transport of the alcohol from the blood through the epidermis by a one-dimensional diffusion equation with unknown parameters. Those parameters are then fit to an individual drinking session, known as an alcohol challenge. This method had two major caveats. First, this method required an alcohol challenge for each individual before the device is applied in the field and secondly, it did not account for the presence of natural variation and uncertainty. Indeed, parameters calibrated via an alcohol challenge could yield inaccurate results in a more naturalistic drinking setting. One data-driven, machine learning-based approach uses random forest-like, Extra-Trees.
[0263] Some approaches consider the unknown parameters as random variables and estimate their distribution by fitting a population model to a range of training data across varying subjects, devices and environmental conditions. Using the estimated distribution of the parameters it is not only possible to deconvolve the BAC/BrAC signal using the most likely parameter values, but conservative error bands can be obtained to measure and quantify the corresponding uncertainty. One work on this approach uses a least squares estimator based on naive pooled data, while another uses a Bayesian approach to find a posterior distribution of the parameters.
[0264] In various embodiments, work is disclosed herein below that relates to these approaches in that it yields a nonparametric distribution of the unknown parameters and conservative error bands for the deconvolved BAC/BrAC signal based on developments in the field of neural networks. Generative adversarial networks (GANs) are a class of neural networks that is able of generating artificial data with the same statistics as the training set. In this data-driven approach, large amounts of data are required to train the model. In clinical alcohol research, such data is typically not available. Indeed, as a result of the above mentioned difficulties, the acquisition of drinking session data is labor intensive and expensive. Moreover. by the very nature of the problem, only blood/breath alcohol and the transdermal alcohol can be measured. Data in the domain between blood vessels and skin is clearly unobservable.
[0265] To account for this situation, some treatments penalize the loss function of deep neural networks to incorporate physical knowledge of the problem into the training process. In some instances, a class of physics-informed neural networks (PINNs) was established. A framework for uncertainty propagation in physical systems may only allow for small training sets, but where prior information is available in the form of governing physical laws. In the present work, we aim to further develop this framework for the conversion of TAC to BrAC. Using a one-dimensional diffusion equation as a model for the alcohol transport through the epidermal layer of the skin, we train a physics-informed generative adversarial network with available drinking session data to yield estimates for the distribution of the unknown parameters. Then, in a second step, we employ a simple PINN for the deconvolution of the BAC/BrAC signal.
[0266] An outline of the remainder of the paper is as follows. We present our underlying mathematical model (Section 6) for alcohol transport through the epidermal layer of the skin. Then, we describe the probabilistic formulation and the generative adversarial network in detail (Section 7). We propose a physics-informed network for the deconvolution of the BAC/BrAC signal (Section 8). Then we demonstrate the efficacy and evaluate the performance of our approach through numerical studies using human subject drinking session data (Section 9).
[0267] Mathematical Model (Section 6) A family of first-principles, physics-based models have been proposed for the transport of ethanol through the epidermal layer of the skin. Common to all of these treatments is that fundamentally, they all rely on Fickian-diffusion as the underlying mechanism by which ethanol molecules propagate from the blood vessels in the dermal layer of the skin to the outer surface of the skin. Where the models do, on occasion, differ is in how they treat boundary phenomena. We note that modifying our general approach to accommodate any of the models would be straight forward.
[0268] In the following sections, the numbering of equations will begin again with (1). The corresponding system of equations, once the spatial and temporal variables have been transformed to be dimensionless, is given by
[0269] Here, t is the temporal variable and ? is the spatial variable, where ?=0 is at the surface of the skin and ?=1 is at the boundary between the epidermal and dermal layers of the skin. Note that dermal layer cells have an active blood supply, while epidermal cells do not. The alcohol concentration in the epidermis at time t and depth ? is denoted by x(t, ?), u(t) is the BrAC/BAC level and y(t) denotes the TAC level at the skin surface. Note that x(t, ?) is inherently unobservable for ??0. The parameter q.sub.1 represents the normalized diffusivity of the epidermal layer and the parameter q.sub.2 describes the flux gain from the blood alcohol. These parameters are unknown and, as described above, they vary between individuals and drinking episodes In the following, we assume (q.sub.1, q.sub.2) to be random and we aim to estimate their joint distribution.
[0270] Physics-Informed Adversarial Learning (Section 7). A probabilistic formulation is available for propagating uncertainty through physics-informed neural networks using latent variable models of the form x=x(t, ?, z), z?d, s.t. x.sub.t+.sub.?x=0.
[0271] Here, z is the latent variable that has distribution d. We will assume d to be the standard normal distribution, but other continuous distributions are possible as well. Since z is a random variable, x(t, ?, z) is a random field and we will write p(x|t, ?, z) for the conditional density of x, knowing that t and ? are deterministic. However, given data, t and ? have some distribution in that data and so in this sense they can be assumed to be random and it is also possible to sample from those empirical distributions for t and ?. Further, .sub.? is a general differential operator. The random field x is approximated as x(t, ?, z)?x.sub.?(t, ?, z) by a deep neural network with the parameter set ?.
[0272] The main idea behind this approach is to combine all random effects and uncertainty into a single (possibly multidimensional) latent variable. That way, one can sample from the distribution of the latent variable z and propagate this through the neural network to yield samples of the random field x that reflect the uncertainty. To this end, the use of a generative adversarial net is proposed. Fundamentally, a GAN consists of two competing neural nets: The generator net tries to produce new data that is distributed as the training data. This new data is presented to the discriminator net that classifies the sample either as an actual sample or as a generated sample. Hence, the generator aims to fool the discriminator and the discriminator tries not to be fooled.
[0273] Kullerback-Leibler Based Training (Section 7.1). We use a learning mechanism for the generator that tries to match the joint distribution of the observed data q(t, ?, x) with the joint distribution of the generated data p.sub.?(t, ?, x) (the subscript ? denotes the parameters of the generator net). Such a matching can be achieved by minimizing the reverse Kullback-Leibler divergence of p.sub.?(t, ?, x) and q(t, ?,x). The Kullback-Leibler divergence is a measure of how different two distributions are, and by minimizing this divergence, we encourage the generator to produce samples that are distributed as the training data. The (reverse) Kullback-Leibler divergence is given by
This can further be written as
(p.sub.?(t, ?, x)?q(t, ?x))=?H(p.sub.?(t, ?, x))?
.sub.p.sub.
where H(p.sub.?(t, ?, x))=.sub.p.sub.
.sub.p.sub.
[0274] When minimizing (6) with respect to the generator parameters ?, we face the issue that we only have samples from the p.sub.? and the q distribution; the distributions themselves remain unknown. A general technique to approximate the density ratio of two distributions given only samples is based on a discriminator network T that acts as a binary classifier. Given N data points drawn from p.sub.?(t, ?, x) labeled y=+1 and N data points drawn from q(t, ?, x) labeled y=?1, the probabilities can be written as conditionals
p.sub.?(t,?,x)=?(t, ?, x|y=1), q(t, ?, x)=?(t, ?, x|y=?1).
[0275] Then, the discriminator T is defined by T(t, ?, x)=?(y=1|t,?,x) and using Bayes rule, the density ratio can be computed by
[0276] Another problem that arises when minimizing (7) is the computation of H(p.sub.?(t, ?, x)) due to the fact that p.sub.?(t, ?, x) is unknown a priori. Hence a computable lower bound for the entropy term is derived. Introducing a variational distribution q(z|t, ?, x) represented by an encoder net q.sub.?(z|t, ?, x) (the subscript ? denotes the parameters of encoder net), this bound reads
H(p.sub.?(t, ?, x))?H(z)+.sub.p.sub.
[0277] Here, the variational distribution q(z|t, ?, x) can be understood as a posterior distribution over the latent variable z, conditioned on t, ? and x. We will return to this in section 3.4
[0278] Using this entropy bound and the method for estimating the density ratio based on samples, the following loss functions for minimization of the reverse Kullback-Leibler divergence can be defined:
.sub.D(?)=
.sub.q(t,?)d(z)(log(?(T.sub.?(t, ?, x.sub.?(t, ?, z))))) +
.sub.q(t,?,x)(log(1??(T.sub.?(t, ?, x)))(9)
.sub.G(?, ?)=
.sub.q(t,?)d(z)(T.sub.?(t, ?, x.sub.?(t, ?z))) +(1??)log(q.sub.?(z|t, ?, x.sub.?(t, ?, z))).(10)
[0279] Here, the subscript D denotes the discriminator loss, the subscript G denotes the generator loss, ?(y)=1/(1+e.sup.?y) is the logistic sigmoidal function, and the subscript ? denotes the parameters of the discriminator network. The subscripts in the expectation denote the corresponding distributions. That is, the subscript q(t, ?)d(z) means that t and ? are to be sampled from the empirical data distribution and z should be sampled from its prior d(z). It is clear that the generator aims to reduce the Kullback-Leibler divergence as much as possible, i.e. it strives for a minimum. The discriminator, on the other hand, tries to maximize its ability to correctly classify data samples and generated samples. This can be well seen in the discriminator loss. On the generated data samples, (t, ?, x.sub.?(t, ?, z)), the discriminator, T.sub.?, should be large so that log(?(T.sup.?)) becomes large, and on the empirical data samples. (t, ?, x), the discriminator should be low so that log(1??(T.sup.?)) becomes large. Typically, such a model is trained by alternating between a minimization of the generator loss over the parameters ? and ? and a maximization of the discriminator loss over the parameters ?.
[0280] Integration of the Physical Model (Section 7.2). Up to this point, the proposed method resembles a adversarial neural network. Typically, those networks are trained with large amounts of data. In our case, however, due to the expense of data collection and the unobservability of data in the regime ??0, we only have a small training data set available. Thus, the pure data-driven approach of GANs will no longer work. We therefore resort to the idea of augmenting the above loss functions by information obtained from the physics of the problem. This is where the model from above proves to be of high value: The strong prior knowledge about the problem in form of a partial differential equation can be used to train the network. That way, a hybrid between pure data-driven approaches and physics-driven methods is created, a physics-informed neural network.
[0281] As a first step, we introduce two additional neural nets q.sub.1.sub.
[0282] Note that in this formulation, we treat the residual as a deterministic value, i.e. we set x.sub.?(t, ?, z)=x.sub.?(t, ?) as well as q.sub.1.sub.
[0283] Now, we augment the generator loss with a scaled version of the PDE loss as
.sub.G(?, ?)+?
.sub.PDE(?, ?, v).(12)
That way, for ?>0, the introduced PDE residual acts as a regularization term that leads the generator to create samples that approximately satisfy the diffusion equation model (1)-(5). The precise choice of ? is a tuning between the dominance of the data on the one hand, and the dominance of the physics on the other. As shown herein, the value of ? influences the result, so it has to be chosen experimentally to yield a good balance between data and physics.
[0284] We also want to emphasize that the augmentation of the generator loss with the physics information is the core element in the estimation of q.sub.1 and q.sub.2. By minimizing the combined loss, the network parameters ? and v are also adjusted such that the obtained estimates q.sub.1.sub.
[0285] Combining all of this together and using the loss functions, we ultimately obtain the following minimax problem for the generator and the discriminator
[0286] To see how the observable data for TAC and BAC/BrAC enter the training process, note that in the diffusion model (1)-(5), the TAC data acts as a Dirichlet output. We can thus directly use the TAC data as training data for the generator. The handling of the BAC/BrAC data, however, is more involved. The BAC/BrAC is a Neumann-type input of (1)-(5) and so it is not represented by x.sub.?(t, ?, z) for some values of t and ?. Hence, we only incorporate the BAC/BrAC data using the PDE loss. So the model is encouraged to match the distribution of the TAC data by minimizing the Kullback-Leibler divergence and it is encouraged to match the BAC/BrAC data and to obey the physical model by minimizing the residuals of the equations. The interplay between those two objectives is governed by the tuning parameter ?.
[0287] Estimating the Parameter Distribution (Section 7.3). After training the generative model. it remains to estimate the resulting distribution of (q.sub.1, q.sub.2) By design, q.sub.1.sub.
[0288] Posterior Distribution of the Latent Variable (Section 7.4). As mentioned, the entropic regularization requires the introduction of an additional encoder network q.sub.?(z|t, ?, x). While this might seem like a complication to the model which in and of itself is not all that useful, in fact, the encoder offers a remarkable advantage. During the training process, the encoder network learns the best, i.e. most likely, latent variables given the data. So, based on the TAC and BAC/BrAC data, the encoder network yields a posterior distribution over the latent variable conditioned on the training data. Moreover, since the encoder network is involved in the training of the generator which is physics-informed, the posterior for the latent variable will also be physics-informed. Thus, as a byproduct, we obtain a posterior distribution of the latent variable that is both data- and physics-informed. In the context of our given problem, this is very appealing. Instead of a direct sampling from the prior for the latent variable and the subsequent passing the samples through the parameter networks q.sub.2.sub.
[0289] Network Design (Section 7.5). The accuracy of the model is highly dependent on the architecture of the network. The formulation of the given problem only allows for observable data at ?=0. i.e. the TAC data, and additionally the BAC/BrAC data. Points inside the domain are inherently unobservable, hence they cannot be used as training data Consequently, the training data is very sparse. A formulation allowing for only a relatively few training data favors the discriminator network, i.e. it is easy to classify samples into generated samples and actual samples. However, when the discriminator network is too strong, the generator gains little information from the discriminator and the ability of the generator network to learn is impaired.
[0290] To account for this, we use a discriminator network with a low capacity compared to the other networks. This can be achieved in two different ways: First, we can decrease the capacity of the discriminator by choosing a network design that involves fewer hidden layers and neurons per layer. Secondly, we can strengthen the generator by allowing more learning steps in the alternating learning process. That way, we enable the generator to train a certain number of steps for a given discriminator before the discriminator improves further.
[0291] Deconvolution of the Input Signal (Section 8). After estimating the distribution of (q.sub.1, q.sub.2), the next step is to deconvolve the BAC/BrAC signal from the TAC signal. Here, we want to employ a simple physics-informed neural network for the deconvolution process. Given the TAC signal, the output of the network is x.sub.?(t, ?, q.sub.1, q.sub.2), the (unobservable) alcohol level in the epidermal layer. To this end, we use the only available training data, the TAC signal consisting of N.sub.T data points, and set up the TAC-related loss
This way, the network is encouraged to match the provided TAC signal at ?=0.
[0292] Using the penalty approach for the incorporation of the PDE described above, we augment the loss by a PDE-related loss
[0293] The complete loss is now given as =
.sub.r+?.Math.
.sub.PDE. Once the network is trained for the specific drinking episode using the TAC signal, the BAC/BrAC signal can then be estimated using equation (3) and automatic differentiation. Note that q.sub.1 and q.sub.2 are inputs of the network and are included in the training process. So, in order to obtain BAC/BrAC estimates for varying values of q.sub.1 and q.sub.2, only a simple forward pass through the network is required. This enables us to directly use the available sample of the joint distribution for (q.sub.1, q.sub.2) to produce BAC/BrAC estimates based on this sample. Hence, it is easy and time-efficient to come up with conservative error bands. In our discussions below, in the interest of brevity, we will refer to these conservative error regions simply as error regions.
[0294] Numerical Results (Section 9). The computations we report on here were based on a set of 150 recorded drinking episodes gathered in the laboratory. In each drinking episode, the BAC/BrAC signal was recorded as well as a TAC signal from two different biosensors. Some of those drinking sessions were recorded using a different test protocol, i.e. the TAC sensor was worn on a leg instead of an arm. We removed those sessions, so that we are left with a set of 126 drinking episodes as the basis for our numerical studies. All algorithms were implemented in Tensorflow and the corresponding computations were executed on a NVIDIA Tesla T4 GPU card.
[0295] The GAN model was trained for 50,000 iterations using the Adam optimizer. The learning rate was set to 10.sup.?4 and the ratio for the generator and discriminator updates was set to 10. For the entropic regularization we used the value ?=1.5 which was found to be suitable in prior works. If not stated explicitly, the penalty parameter ?=1 was used. In some examples however, we used ?=4 to reflect the fact that the problem is more physics-driven than data-driven. The dimension of the latent variable space was chosen to be 1. The network topology for the generator and the encoder consisted of four hidden layers with 50 neurons each, whereas the discriminator network only had one hidden layer with 20 neurons. As was indicated, this accounts for the small number of available training data sets. The networks for q.sub.1 and q.sub.2 each have two hidden layers with 50 neurons. We used N.sub.b=126,252 boundary training data together with N.sub.i=20,000 initial training data, i.e. N.sub.u=146,252, and N.sub.r.sub.
[0296] The deconvolution network was trained for 30,000 iterations using the Adam optimizer. This network had five bidden layers with 50 neurons each. We used N.sub.r.sub.
[0297]
[0298]
[0299] One of the main goals of this work is to estimate the distribution of the random parameters (q.sub.1, q.sub.2).
[0300] The histogram of the posterior latent variable is given.
TABLE-US-00002 TABLE XX Estimated Parameter Statistics latent Mean Mean Radius of m variable ? value q.sub.1 value q.sub.2 error circle 88 prior 1 1.0935 0.8528 0.5148 88 posterior 1 1.0639 0.8527 0.3256 126 prior 1 0.9573 0.8083 0.5766 126 posterior 1 0.9430 0.7915 0.4678 126 prior 4 1.1764 1.1582 0.6620
[0301] Using the estimated joint distribution for (q.sub.1, q.sub.2), we can use the deconvolution network described above to recover the BAC/BrAC signal and to find error bands. By sampling the joint distribution, we compute the mean parameter values to get the mean predicted BAC/BrAC signal. We also take a radius around this mean such that 90 per cent of the samples fall into this circle. Then, we use these samples to find the BAC/BrAC predictions corresponding to the parameter values. Note that after training the deconvolution network, this process only requires forward passes through the network. At each time, we use the maximal and the minimal value of these predicted signals to form conservative error bands.
[0302]
[0303] In this work, we have proposed a stochastic physics-informed generative adversarial network for the estimation of an unknown parameter distribution in the context of an input/output model for the transport of alcohol through the epidermal layers of the skin. Based on these estimated distributions, we designed a simple physics-informed network for the deconvolution of the BAC/BrAC signal from the TAC signal. Our approach using physics-informed learning techniques is novel in the realm of this application. The stochasticity of this approach further allowed us to obtain error bands for the estimated signal. Moreover, we employed an encoder network, introduced as an entropy regularization, to gain a posterior distribution over the latent variable which provides a means to sharpen the error region. Finally, we demonstrated the performance of this method with a range of numerical examples using a human subject data set consisting of 126 drinking episodes.
[0304] Discrete-Time Linear Quadratic Gaussian Control and Estimation Compensator. Continuing the discussion of determining blood alcohol concentration based on TAC, attention now moves to a discrete-time linear quadratic gaussian (LQG) control and estimation compensator for random abstract parabolic systems. This compensator may be implemented in one or more embodiment of
[0305] A finite-dimensional approximation and convergence theory for the closed-loop linear quadratic control and estimation of abstract parabolic systems with random parameters is developed. The motivation for this effort is the development of a real-time control scheme for intravenous infused alcohol studies based on a population model for the transdermal transport of alcohol and a transdermal alcohol biosensor that measures the ethanol content in perspiration. We apply Galerkin-based approximation to a weak formulation of the underlying random parabolic system in appropriately constructed Bochner spaces wherein the random parameters are treated as additional spatial variables. Our LQG optimization, approximation, and convergence results are argued using results from linear semigroup theory. An example and results from some of our numerical studies are included.
[0306] We develop a finite-dimensional approximation and convergence theory for the discrete-time linear quadratic Gaussian (LQG) control and estimation of abstract parabolic systems with random parameters. There are two primary motivations for this study. The first is the development of real-time closed-loop feedback for human subject laboratory studies involving the intravenous infusion of alcohol based on transdermal sensing, and the second is the development of an efficient, real-time, deconvolution scheme for a population model for the transdermal transport and measurement of alcohol. In both instances the underlying dynamical model takes the form of an abstract semi-linear, parabolic partial/ordinary differential equation (PDE/ODE) hybrid system describing the transport of ethanol from the blood through the skin, its excretion via perspiration, and finally its measurement on the surface of the skin by an electro-chemical biosensor (in actuality, a fuel cell) worn on the ankle or the wrist. In the first application, the control input to the model is the intravenously infused alcohol and in the second it is either blood or breath alcohol concentration (BAC/BrAC). The output is transdermal alcohol concentration (TAC). The goal in the control problem is to clamp the blood alcohol concentration at a predetermined (typically) constant level, while the goal of the deconvolution problem is to estimate BAC/BrAC from the biosensor measured TAC. Although the model captures the underlying physics quite well, the parameters can vary with the individual wearing the sensor, the particular sensor being worn, and environmental factors such as ambient temperature and humidity. This variation is dealt with by allowing the model parameters to be random with either known or estimated distribution, the result being a population model. In this paper we focus on the control problem and formulate it as an LQ regulator coupled with an LQG estimator or observer which together are known as an LQG compensator. We formulate the deconvolution problem as an LQG tracking problem and will report on our results for it in a subsequent paper.
[0307] The approximation theory for the continuous-time LQR problem in Hilbert space was developed and specifically for abstract parabolic systems For discrete-time LQR problems in Hilbert space, LQR approximation results can be found. The finite-dimensional approximation and convergence theory for the discrete-time LQG compensator in Hilbert space was developed. Here we investigate the application of these results into abstract parabolic systems with random parameters by exploiting some more recent results on systems of this type. In these treatments, the underlying parabolic systems are considered in weak form in appropriately constructed Bochner spaces wherein the random parameters are effectively treated as additional spatial variables. In this way their LQ control and estimation can be formulated in Hilbert space and their finite-dimensional approximation can be facilitated via a Galerkin approach. The closed-loop linear state feedback solution to the resulting LQG compensator problem and convergence results for the finite-dimensional approximations can be argued with the aid of linear semigroup theory.
[0308] An outline of the remainder of the discussion is as follows. In Sections 10, 11 and 12 we briefly outline the optimization, approximation, and convergence theory for the discrete-time LQR and LQG compensator problems in Hilbert space. In Section 13 we discuss the weak formulation of abstract parabolic systems with random parameters. In Section 14 we show how the LQR results in Sections 10 and 11 can be applied to systems of the form discussed in Section 13. In Section 15 we treat the control problem for the intravenous infusion of ethanol involving the transdermal alcohol biosensor and present the results of some of our numerical studies followed by some discussion and a few concluding remarks.
[0309] The Discrete-Time Linear Quadratic (Section 10). Let X, Y and U be separable Hilbert spaces with inner products ?, ?
.sub.X,
?, ?
.sub.Y and
?, ?
.sub.U, respectively. Let ??
(X, X), {circumflex over (B)}?
(U, X) and ??
(X, Y). Let {circumflex over (Q)}?
(X, X) and ??
(X, X) be positive semi-definite self-adjoint and let {circumflex over (R)}?
(U, U) be positive definite self-adjoint. Let {circumflex over (B)}.sub.1?
(
.sup.?, X), ?.sub.1?
(
.sup.v, Y) and consider a discrete-time linear dynamical system given by:
x.sub.k+1=?x.sub.k+{circumflex over (B)}u.sub.k+{circumflex over (B)}.sub.1?.sub.k, k?k.sub.0, x.sub.k.sub.
y.sub.k=?x.sub.k+?.sub.1?.sub.k
together with a quadratic performance index on the finite-time horizon [k.sub.0, k.sub.1]:
[0310] In the system above {?.sub.k} and {?.sub.k} denote respectively .sup.? and
.sup.v valued uncorrelated, zero-mean, stationary, Gaussian white noise processes with each component having common variance ?.sub.Q.sup.2 (state) and ?.sub.K.sup.2 (output) that corrupt the state and measurement through the operators {circumflex over (B)}.sub.1 and ?.sub.1. We interpret the Hilbert space valued stochastic perturbations to the state and output equations in the usual sense with respect to an orthonormal basis yielding the state and output covariance operators ?.sub.Q.sup.2{circumflex over (B)}.sub.1{circumflex over (B)}*.sub.1 and of ?.sub.K.sup.2?.sub.1?*.sub.1, respectively.
[0311] The deterministic time-invariant finite-horizon discrete-time linear quadratic regulator control problem is given by: [0312] (P1) Choose an input ??l.sup.2(k.sub.0, k.sub.1?1; U) for which the criterion (2) is minimized.
[0313] A control sequence u?l.sup.2(k.sub.0, ?; U) is defined to be an admissible control for the initial condition x.sub.0 if ?(u)<?. Then consider a quadratic performance index given by:
[0314] The deterministic time-invariant infinite-borizon discrete-time linear quadratic regulator control problem is given by: [0315] (P2) Choose an input ??l.sup.2(k.sub.0, ?; U) for which the criterion 3 is minimized, if an admissible control exists for the initial condition x.sub.0
[0316] The closed-loop solutions to these discrete-time LQR control problems in linear state feedback form are given. For every initial value x.sub.0, the optimal input for the problem (P1) is unique and generated by the linear control law ?.sub.k=?F.sub.k
F.sub.k={{circumflex over (R)}+{circumflex over (B)}*{circumflex over (?)}.sub.k+1{circumflex over (B)}}.sup.?1{circumflex over (B)}*{circumflex over (?)}.sub.k+1?
[0317] The operators {circumflex over (?)}.sub.k, k=k.sub.0, k.sub.0+1, . . . , k.sub.1?1, are the unique self-adjoint positive semi-definite operators satisfying the following Riccati difference equation
{circumflex over (?)}.sub.k=?*[{circumflex over (?)}.sub.k+1?{circumflex over (?)}.sub.k+1{circumflex over (B)}({circumflex over (R)}+{circumflex over (B)}*{circumflex over (?)}.sub.k+1{circumflex over (B)}).sup.?1{circumflex over (B)}*{circumflex over (?)}.sub.k+1 ]?+{circumflex over (Q)}
k=k.sub.0, k.sub.0+1, . . . , k.sub.1?1, with {circumflex over (?)}.sub.k.sub.
[0318] Moreover, it follows that ?(?)={circumflex over (?)}.sub.k.sub.
.sub.X and that the optimal trajectory {
(X, X) is a solution to the algebraic Riccati equation (ARE) if
{circumflex over (?)}=?*[{circumflex over (?)}?{circumflex over (?)}{circumflex over (B)}({circumflex over (R)}+{circumflex over (B)}*{circumflex over (?)}{circumflex over (B)}).sup.?1{circumflex over (B)}*{circumflex over (?)}]?+{circumflex over (Q)}.
[0319] The existence of a positive semi-definite self-adjoint solution to the ARE is equivalent to the existence of an admissible control for any initial condition x.sub.0. As in the case of finite-dimensional systems, the existence of an admissible control is equivalent to saying that the system is stabilizable. On the other hand, if the operators ?, {circumflex over (B)}, {circumflex over (Q)} and {circumflex over (R)} are such that if x.sub.0?X and u is an admissible control for x.sub.0, then lim.sub.k.fwdarw.??x.sub.k?.sub.X=0, then the system 10 is said to be detectable (we borrow the concept from finite-dimensional case) and the uniqueness of the solution to the ARE 6 is guaranteed.
[0320] Consequently, if we assume that the system 1 is both stabilizable and detectable, then there exists a unique solution {circumflex over (?)} to the ARE 6 and a unique optimal control ? for the problem (P2) for the initial value x.sub.0. It follows that ?(?)={circumflex over (?)}x.sub.0, x.sub.0
.sub.X where ?.sub.k=?F
[0321] We note that it is often the case in both the finite and infinite horizon problems, there is an additional separable Hilbert space, Z, an operator D?(X, Z), and a quantity to be controlled or regulated, z.sub.k={circumflex over (D)}x.sub.k, k=k.sub.0, k.sub.0+1, k.sub.0?2, . . . , in which case the positive semi-definite operator {circumflex over (Q)}?
(X, X) is given by {circumflex over (Q)}={circumflex over (D)}*{circumflex over (D)}.
[0322] Finite-Dimensional Approximation (Section 11). Let X.sup.N, N=1,2, . . . , be a sequence of finite-dimensional linear subspaces of a Hilbert space X and .sup.N:X.fwdarw.X.sup.N be the canonical orthogonal projections satisfying
.sup.Nx.fwdarw.x for any x?X. SHere we have an observation space Y, and a control space U that are potentially infinite-dimensional.
[0323] Assumption 1: There exist operators ?.sup.N:X.sup.N.fwdarw.X.sup.N, {circumflex over (B)}.sup.N:U.fwdarw.X.sup.N, {circumflex over (Q)}.sup.N:X.sup.N.fwdarw.X.sup.N, and ?.sup.N:X.sup.N.fwdarw.X.sup.N which satisfy ?.sup.N.sup.Nx.fwdarw.?x, (?.sup.N)*
.sup.Nx.fwdarw.?*x, x?X, {circumflex over (B)}.sup.Nu.fwdarw.{circumflex over (B)}u, u?U, ({circumflex over (B)}.sup.N)*
.sup.Nx.fwdarw.{circumflex over (B)}*x, x?X, {circumflex over (Q)}.sup.N=
.sup.N{circumflex over (Q)}=
.sup.N{circumflex over (Q)}
.sup.N, and ?.sup.N=
.sup.N?=
.sup.N?
.sup.N, as N.fwdarw.?.
[0324] Consider a sequence of approximating discrete-time LQR problems on the finite-time horizon [k.sub.0, k.sub.1]: [0325] (P1.sup.N) Choose ?.sup.N?l.sup.2(k.sub.0, k.sub.1?1; U) to minimize
[0326] The results concerning the existence and uniqueness of the solution to the discrete-time LQR problem on the finite-time horizon in a general Hilbert space, (P1), outlined in the previous section can be applied to each of the approximating finite-dimensional problems (P1.sup.N). The formulas characterizing the solution to problem (P1.sup.N) have the same form as those for problem (P1).
[0327] The fundamental convergence result is given by the following theorem.
[0328] Theorem 1: Let ?.sup.N and ? be the unique solutions to the approximation problem (P1.sup.N) and the original problem (P1), respectively.
lim.sub.N.fwdarw.?|?.sup.N??|.sub.l.sub.
lim.sub.N.fwdarw.?|
lim.sub.N.fwdarw.?|?.sup.N(?.sup.N)?{circumflex over (J)}(
lim.sub.N.fwdarw.?|{circumflex over (?)}.sub.k.sup.N.sup.Nx?{circumflex over (?)}.sub.kx|.sub.X=0, x?X, k.sub.0?k?k.sub.1,(iv)
lim.sub.N.fwdarw.?|F.sub.k.sup.N.sup.Nx?F.sub.kx|=0, x?X, k.sub.0?k?k.sub.1?1,(v)
where the l.sup.2 inner product and corresponding norm is defined by x, y
.sub.l.sub.
k.sub.k, y.sub.k
.sub.H for any x and y in l.sup.2 (k.sub.0, k.sub.1; H), and any Hilbert space H.
[0329] Note that if U is m-dimensional (i.e. U=.sup.m) and F?
(X, U), then by the Riesz Representation Theorem there exists ??x.sub.i=1.sup.mX, the so-called functional gains corresponding to F, such that u=Fx=[
?.sub.1, x
, . . . ,
?.sub.m,x
].sup.T, with
?, ?
denoting the X inner product.
[0330] Remark 1: If in addition we have that the control or input space, U, is finite-dimensional with dimension m, it then follows that F.sub.k.sup.N.sup.N.fwdarw.F.sub.k, k.sub.0?k?k.sub.1?1, in norm (i.e. in
(X, U)) and the m-dimensional functional gains corresponding to F.sup.N
.sup.N and F.sub.k, k.sub.0?k?k.sub.1?1, ?.sup.N and ?, respectively, satisfy ?.sup.N.fwdarw.? in l.sup.2(k.sub.0, k.sub.1?1; x.sub.i=1.sup.mX).
[0331] Remark 2: If the control or input space is finite-dimensional with dimension m and Assumption 1 holds with the exception that (?.sup.N)*.sup.Nx.fwdarw.?*x, x?X (i.e. only weak rather than strong convergence of the adjoint), it then follows that {circumflex over (?)}.sub.k.sup.N
.sup.Nx.fwdarw.{circumflex over (?)}.sub.kx, x?X, k.sub.0?k?k.sub.1, F.sub.k.sup.N
.sup.Nx.fwdarw.F.sub.kx, x?X, k.sub.0?k?K.sub.1?1, and ?.sup.N.fwdarw.? in l.sup.2(k.sub.0, k.sub.1?1; x.sub.i=1.sup.mX)
[0332] Now consider a sequence of approximating discrete-time LQR problems on the infinite-time horizon [k.sub.0, ?): [0333] (P2.sup.N) Choose ?.sup.N?l.sup.2(k.sub.0, ?; U) to minimize
for the same system 7).
[0334] To guarantee the solvability of (P2.sup.N), we need to assume the solvability of the approximating finite-dimensional AREs, i.e. for each N, there exists exactly one positive semi-definite self-adjoint solution to the approximation ARE.
As in the infinite-dimensional case, let F.sup.N=({circumflex over (R)}+({circumflex over (B)}.sup.N)*{circumflex over (?)}.sup.N{circumflex over (B)}.sup.N).sup.?1({circumflex over (B)}.sup.N)*{circumflex over (?)}.sup.N?.sup.N, ?.sup.N=?.sup.N?{circumflex over (B)}.sup.NF.sup.N
where {circumflex over (?)}.sup.N is the unique positive semi-definite self-adjoint solution to the approximating ARE assumed to exist. We then have the following convergence theorem.
[0335] Theorem 2: Under Assumption 1 if {circumflex over (?)}.sup.N.sup.N converges strongly to some bounded linear operator {circumflex over (?)}, then {circumflex over (?)} is a positive semi-definite self-adjoint solution to the original ARE 6, F.sup.N
.sup.N converges strongly to F and ?.sup.N
.sup.N converges strongly to ?, where F is defined in the original infinite-dimensional problem (P2) and S=??{circumflex over (B)}F.
[0336] Modifications to Theorem 1 analogous to those given in Remark 1 and Remark 2 apply to Theorem 2 as well. We have the following result.
[0337] Theorem 3: Under Assumption 1 suppose that there exists positive constants M and r, independent of N, with r<1, such that
{circumflex over (?)}.sup.N?M, N=1,2, . . . ,
|(?.sup.N).sup.t|?Mr.sup.t, t=1,2, . . . , N=1,2, . . . ,
where {circumflex over (?)}.sup.N is the unique positive semi-definite self-adjoint solution to the approximating ARE assumed to exist. Then a positive semidefinite self-adjoint solution {circumflex over (?)} to 6 exists, and {circumflex over (?)}.sup.N.sup.N.fwdarw.{circumflex over (?)} strongly as N.fwdarw.?. If there exists a positive m, independent of N, such that |{circumflex over (Q)}.sup.N?m, N=1,2, . . . , then this implies the existence of an r less than one and independent of N for which the above equation holds.
[0338] Finally we note that it is also possible to fully discretize the problems (P1) and (P2) with the introduction of a sequence of finite-dimensional approximating subspaces, {U.sup.M} of the in general infinite-dimensional input or control Hilbert space U and obtain a doubling indexed sequence of approximating LQR problems on either the finite or infinite time horizon Straight forward extensions of the theorems presented above can be proven which establish analogous convergence results as N, M.fwdarw.?.
[0339] The LQG Observer and Compensator (Section 12). The LQG compensator is based on combining the LQR theory described above with a Kalman filter state estimator or observer. The general theory for discrete-time systems in Hilbert space together with a finite-dimensional approximation and convergence results can be found. The observer or state estimator takes the form
{tilde over (x)}.sub.k+1=?x.sub.k+{circumflex over (B)}u.sub.k+{tilde over (L)}.sub.k(y.sub.k??{tilde over (x)}.sub.k), {tilde over (x)}.sub.k.sub.
where x.sub.0?X is arbitrary, the operators {tilde over (L)}.sub.k?(Y, X) are given by {tilde over (L)}.sub.k=?{tilde over (?)}.sub.k?*{{tilde over (R)}+?{tilde over (?)}.sub.k?*}, with the operators {tilde over (?)}.sub.k, k=k.sub.0, k.sub.0+1, . . . given by the recurrence
{tilde over (?)}.sub.k+1=?[{tilde over (?)}.sub.k?{tilde over (?)}.sub.k{circumflex over (C)}*({tilde over (R)}+?{tilde over (?)}.sub.k?*).sup.?1?{tilde over (?)}.sub.k]?*+{tilde over (Q)},(10)
[0340] k=k.sub.0, k.sub.0+1, . . . , with {tilde over (?)}.sub.k.sub.
[0341] The steady state form is given by L.sub.k=L where {tilde over (L)}?(Y, X) is given by: {tilde over (L)}=?{tilde over (?)}?*{{tilde over (R)}+?{tilde over (?)}?*}.sup.?1, with the operator {tilde over (?)} a positive semi-definite self-adjoint solution, if it exists, to the ARE given by
{tilde over (?)}=?[{tilde over (?)}?{tilde over (?)}?*({tilde over (R)}+?{tilde over (?)}?*).sup.?1?{tilde over (?)}]?*+{tilde over (Q)}
[0342] Note that if the output space Y is m-dimensional, then the optimal observer gains {tilde over (L)}.sub.k or {tilde over (L)} can be represented by an m-dimensional row vector ?.sub.k or ? of elements in X. These are referred to as the optimal functional observer gains.
[0343] In light of the duality between the LQR control and the LQG observer problems, existence and uniqueness results for solutions to the ARE are analogous to those given for the LQR ARE. Finite-dimensional approximation and convergence results for the observer/compensator are also analogous to the LQR theory presented above. Indeed, if in addition to Assumption 1 we have that there exist operators ?.sup.N?(X.sup.N, Y) and positive semi-definite self-adjoint operators {tilde over (Q)}.sup.N?
(X.sup.N, X.sup.N) such that ?.sup.N
.sup.Nx.fwdarw.?x, x?X, (?.sup.N)*y.fwdarw.?*y, y?Y and {tilde over (Q)}.sup.N
.sup.Nx.fwdarw.{tilde over (Q)}x, x?X, as N.fwdarw.?, we have that the solutions to the finite-dimensional approximating observer Riccati equations converge strongly to the solutions to the infinite-dimensional Riccati equations, and that the approximating optimal observer gain operators converge strongly to their infinite-dimensional counterparts. In the case that the output space is finite-dimensional, the approximating optimal functional observer gains converge in norm as well.
[0344] We note that in the steady state case, the state transition operator for the closed loop plant/compensator system is given
with the (closed-loop) spectrum of S given by ?(S)=?(??{circumflex over (B)}F)??(??{tilde over (L)}?).
Abstract Parabolic Systems With Random Parameters (Section 13).
[0345] Abstract Parabolic Systems (Section 13. A). Let V and H be Hilbert spaces with VH, i.e. V is continuously and densely embedded in H. Then the Gelfand triple V
H
V* is obtained by identifying H with its dual H*. Define the inner product on H by
?, ?
.sub.H and the norms on H and V by |?|.sub.H, ???.sub.V, respectively. Define a sesquilinear form a(?, ?):V?V.fwdarw.
which satisfies the following properties.
[0346] Assumption 2: (Boundedness) There exists a constant ?.sub.0>0 such that for each ?, ??V, we have
|a(?, ?)|??.sub.0???.sub.V???.sub.V.
[0347] Assumption 3: (Coercivity) There exist constants ?.sub.0? and ?.sub.0>0 such that for each ??V, we have
a(?, ?)+?.sub.0|?|.sub.H.sup.2??.sub.0???.sub.V.sup.2.
[0348] Now we consider the following parabolic system written in the weak form:
{dot over (x)}, ?
.sub.V*,V+a(x, ?)=
Bu, ?
.sub.V*,V, ?, ?V, x(0)=x.sub.0
where x.sub.0?H, u?L.sup.2([0, T], U) is an input to the system, and B:U.fwdarw.V* is a bounded linear operator.
[0349] It can be shown that the equation has a unique solution in the set:
{?:??L.sup.2([0, T], V), {dot over (?)}?L.sup.2([0, T], V*)}.Math.C([0, T], H).
[0350] Under these assumptions, a(?, ?) defines a bounded linear operator A:V.fwdarw.V* such that
?a(?, ?)=A?, ?
.sub.V*,V
where ?, ??V.
[0351] Furthermore, it can be shown that A restricted on the set
Dom (A)={??V:A??H}
is the infinitesimal generator of a holomorphic or analytic semigroup of bounded linear operators on H, {T(t):t?0}. Moreover, this semigroup can be restricted to be a holomorphic semigroup on V and extended to be a holomorphic semigroup on V* by appropriately restricting or extending the domain. Dom (A), of the operator A.
[0352] It follows that the system can be rewritten in state space form as the evolution system with time-invariant operators A and B:
{dot over (x)}(t)=Ax(t)+Bu(t), x(0)=x.sub.0.
[0353] Systems with Random Parameters (Section 13. B). Now we summarize the key idea from the framework and consider an abstract parabolic system with random parameters satisfying some known distribution. Assume q?Q, where the set of admissible parameters, Q is a compact subset of the finite-dimensional Euclidean space whose dimension is p, and is compact with respect to some metric d.sub.Q. For each q?Q, we require that besides satisfying Assumption 2 and 3, the sesquilinear form a(q; ??):V?V.fwdarw. also satisfies:
[0354] Assumption 4: (Continuity) For q, {tilde over (q)}?Q, we have for all ?, ??V,
|a(q; ?, ?)?a({tilde over (q)}; ?, ?)|?d.sub.Q(q, {tilde over (q)})??????,
where d.sub.Q(?, ?) denotes any p-metric on .sup.p. It is assumed that all of the constants, ?.sub.0, ?.sub.0, and ?.sub.0 do not depend on q, for q?Q.
[0355] In addition, it may also sometimes be required that the inner product on the space H depend upon q?Q. Let this space be denoted by H.sub.q={H, ?, ?
.sub.q, |?|.sub.q} and that the following assumption be satisfied.
[0356] Assumption 5: (H-Continuity) For q, {tilde over (q)}?Q, we have for all ?, ??H,
?, ?
.sub.q?
?, ?
.sub.{tilde over (q)}|?d.sub.Q(q, {tilde over (q)})|?|.sub.H|?|.sub.H
and that the identity map from V into H.sub.q be uniformly bounded for q?Q.
[0357] We formulate a population model by treating the parameter q as a random vector q, and assume that its support is x.sub.i=1.sup.p[a.sub.i, b.sub.i], where a.sub.i, b.sub.i are real numbers since we have assumed that the set of admissible parameters Q is a compact subset of a finite Euclidean space, i.e. ??<a.sub.i<b.sub.i<? on for all i=1,2, . . . , p. Typically, the distribution of q will depend on parameters in some parameter set ??.sup.r for some r which is closed and bounded. That is, we assume that the distribution of q is given by a known measure ?=?(?)=?({right arrow over (a)}, {right arrow over (b)}, {right arrow over (?)}), where ?=({right arrow over (a)}, {right arrow over (b)}, {right arrow over (?)}) with {right arrow over (a)}=[a.sub.i].sub.i?1.sup.p, {right arrow over (b)}=[b.sub.i].sub.i=1.sup.p, and {right arrow over (a)}??. It will typically be the case that the population model is determined by fitting the parameters ?=({right arrow over (a)}, {right arrow over (b)}, {right arrow over (?)}) to population data.
[0358] Define the Bochner spaces =L.sub.?.sup.2(Q; V),
=L.sub.?.sup.2(Q; H.sub.q) corresponding to the measure ?. Then the assumptions on the spaces V and H guarantee that the spaces
and
form the Gelfand triple
. Here
has been identified with its dual
=L.sub.?.sup.2(Q; H*.sub.q).
[0359] We then define the i-averaged sesquilinear form a(?; ?):?
.fwdarw.
by
a(?, ?)=?.sub.Qa(q; ?(q), ?(q))d?(q)=.sub.?[a(q; ?(q), ?(q))]
where ?, ??. Assumptions 23 and 4 guarantee that this integral is well defined on q.
[0360] We can easily check the boundedness and coercivity of a(?, ?) by using Assumption 24 and the Cauchy-Schwartz Inequality.
[0361] Therefore, we can use this sesquilinear form to define a bounded linear map .fwdarw.
by
?, ?
=?a(?, ?), ?, ??
which when appropriately restricted or extended is the infinitesimal generator of analytic semigroups of bounded linear operators (t), t?0 on
and
.
[0362] We next consider a nonhomogeneous parabolic system with random parameters written in the weak form:
{dot over (x)}, ?
.sub.V*,V+a(q; x, ?)=
B(q)u, ?
.sub.V*,V??V
where B(q):U.fwdarw.V* is a bounded linear operator defined on a Hilbert space of feasible inputs, U, and u?L.sup.2([0, ?), L.sub.?.sup.2(Q; U)). Then let be a closed subspace of the Bochner space L.sub.?.sup.2(Q; U) and define the operator
.fwdarw.
by
u, ?
=?.sub.Q
B(q)u(q), ?(q)
.sub.V*,Vd?(q)
where u?, ??
.
[0363] In light of 13 and 16 , as in the deterministic case, we can write the population model corresponding to (15) in weak (with respect to both ??(0,1) and q?Q) form as
{dot over (x)}, ?
+a(x, ?)=
u, ?
, ??
,
and then, in state space form as
{dot over (x)}(t)=x(t)+
u(t), x(0)=x.sub.0,
where u?L.sup.2([0, ?), ) It is shown that the solutions agree almost surely or ??a, eq?Q. It is interesting to note that in this way, the random parameters are treated like additional spatial variables and in particular the resulting weak form does not involve any derivatives with respect to these variables. More to the point, the resulting dynamical system is now effectively deterministic and abstract parabolic and thus amenable to the treatment for standard abstract parabolic systems discussed previously in subsection V?A
[0364] From linear semigroup theory, the mild solution is then given by:
x(t)=(t)x.sub.0+?.sub.0.sup.t
(t?s)
u(s)ds, t?0
[0365] Let ? denote the length of the sampling interval, and consider zeroorder hold inputs of the form u(t)=u.sub.k, for t?[k?, (k+1)?), k=0,1,2, . . . . . If we then define x.sub.k=x(k?), k=0,1,2, . . . and ?
), and
?
respectively by
=
(?) and
=?.sub.0.sup.?
(s)
ds, We obtain the discrete-time dynamical system given by:
x.sub.k+1=x.sub.k+
u.sub.kx(0)=x.sub.0.
[0366] We note that if the operator given in ?{square root over (14 )} is invertible (for example if ?.sub.0=0 in Assumption 3, it follows that
=?.sub.0.sup.?(s)
ds=(
?I)
.sup.?1
=
.sup.?1(
?I)
.
[0367] Finally we note that if there is an output or observation operator C(q)?(H.sub.q,
) (or L(V,
)), where
denotes the observation Hilbert space, let
=
and define
?
(
) (or
(
)) by
v=?.sub.QC(q)v(q)d?(q). Then the system 20 can be augmented with the output equation y.sub.k=
x.sub.k, k=0,1,2, . . .
[0368] Finite-Dimensional Approximation and Convergence (Section 13. C.). We consider a Galerkin approximation based on the weak form). Let N be a positive integer. For each N, let .sup.N be a finite-dimensional subspace of
, satisfying
.sup.Nx.fwdarw.x, for x?
, where
.sup.N is the orthogonal projection of
onto
.sup.N.
[0369] Now we define the operators .sup.N on
.sup.N by essentially restricting the form a to the subspace
.sup.N?
.sup.N of the space
?
. To be more specific, we have
where ?.sup.N, ?.sup.N?.sup.N
[0370] Obviously, since .sup.N is a linear operator on a finite-dimensional space, it is the infinitesimal generator of a uniformly continuous semigroup
.sup.N(t)=
for t?0. So we can define
.sup.N?
(
.sup.N,
.sup.N) by ?.sup.N=
.sup.N(?)=
?
[0371] We use a variational corollary of Trotter-Kato theorem to obtain the convergence of semigroup. Thereby, we obtain the convergence of operator ?.sup.N.
[0372] Theorem 4: Assume that the Assumptions 24 are satisfied. Then for each x?,
.sup.N(t)
.sup.Nx.fwdarw.
(t)x in the
norm for t>0 uniformly in ton compact sub intervals.
[0373] Remark 3: The Trotter-Kato theorem requires the following assumption: For each x?, there exists x.sup.N?
.sup.N such that ?x?x.sup.N
.fwdarw.0. However, we note that this assumption is actually equivalent to the strong convergence of orthogonal projection
.sup.N of
onto
.sup.N, i.e. for any x?
,
.sup.Nx.fwdarw.x as N.fwdarw.?. Indeed, for any x?
satisfying the assumption in [4], we have ?
.sup.Nx?x
??x?x.sup.N
.fwdarw.0. On the other hand, for any x?
satisfying
.sup.Nx?
.sup.Nthen by
.sup.N.fwdarw.I strongly, we have ?x?x.sup.N?
=?x?
.sup.Nx?
.fwdarw.0.
[0374] From the definition =
(?) and
.sup.N=
.sup.N(?), we immediately get that (
.sup.N
.sup.N.fwdarw.
strongly in
as N.fwdarw.?.
[0375] Weak convergence of the adjoint of .sup.N, (
.sup.N)*, which is also a bounded operator on
.sup.N, then immediately follows.
[0376] Corollary 4.1: Let (.sup.N)*, ?* and
.sup.N be defined as before. Then (
.sup.N)*
.sup.N.fwdarw.
* (i.e. weakly) in
.
[0377] Proof. For any ?, ? in , we have
where the inner product in the above calculation is the inner product. Since
.sup.N
.sup.N.fwdarw.{circumflex over (d)} strongly in
as N.fwdarw.?, we have
(
.sup.N)*
.sup.N?, ?
.fwdarw.
?,
?
=
*?, ?
.
[0378] We note that the Trotter-Kato theorem can also be used to argue that .sup.N(t)
.sup.Nx.fwdarw.
.sup.N(t)x in the
norm for t>0, uniformly in t on compact sub intervals. Moreover, since the operator d is regularly dissipative, the same arguments can also be used to argue that
.sup.N(t)*
.sup.Nx.fwdarw.
(t)*x in the
nom for t>0. uniformly in t on compact sub intervals and consequently that both
.sup.N and (
.sup.N)* converge to
and (
)*, respectively, strongly in
. However, it is worth noting that for some problems of interest the observation operator is bounded in
but unbounded in
, and in such a case, one may want to apply the LQR theory developed earlier in Sections II and III in
rather than in
. In this event arguing strong convergence of the adjoint may be difficult or simply not true. in which case, the weaker convergence results for the approximating solutions to the LQR problem will have to suffice.
An Example: A Random Parabolic ODE/PDF Hybrid System with Coupling on the Boundary of the Spatial Domain (Section 14)
[0379] We consider the design of an LQG control or regulator for a clamping experiment involving the intravenous infusion of ethanol with observations provided by a transdermal alcohol biosensor. The dynamical model takes the form of a hybrid, semi-linear, ODE/PDE reaction diffusion equation. The transdermal transport of ethanol through the epidermal layer of the skin is modeled by a one-dimensional diffusion equation which is coupled via Dirichlet boundary conditions to two well-mixed compartments, one representing the blood and the other the transdermal alcohol biosensor. The inflow to the two compartments is proportional to the flux at the boundary of the epidermal layer of the skin. Aside from the relatively small amount of ethanol excreted from the body through urine, tears, breast milk, sweat and perspiration, the primary mechanism by which ethanol is processed out of the body is via a reaction that takes place in the liver and which is catalyzed by a group of enzymes known as alcohol dehydrogenase (ADH). In the transdermal alcohol biosensor, the ethanol is consumed in an oxidation-reduction reaction wherein each molecule of ethanol produces four electrons. The resulting current is measured with the measurement being bench calibrated with a source of ethanol vapor with known concentration. The enzyme catalyzed reaction in the blood compartment (liver) is modeled Michaelis-Menten term which exhibits first-order kinetics at low concentrations and zero-order kinetics at higher concentrations once saturation is achieved. In addition, since the values of the parameters which appear in the model for an individual subject will in all likelihood be unknown and un-measurable, we will consider the parameters to be random with distribution that has previously been fit to cohort from an appropriately stratified population. Consequently, the resulting control problem is one in which the process is to be regulated for an individual based on a population model.
[0380] Problem Formulation (Section 11. A.). The underlying dynamical system as described in the previous paragraph takes the following form:
with boundary conditions, controlled variable and observation:
{tilde over (x)}(t, 0)={tilde over (w)}(t), {tilde over (x)}(t, 1)={tilde over (v)}(t), t>0,
{tilde over (z)}(t)={tilde over (v)}(t), {tilde over (y)}(t)={tilde over (w)}(t)+?(t), t>0,
respectively, and initial conditions:
{tilde over (v)}(0, ?)=?.sub.0(?), ??(0, 1), {tilde over (w)}(0)=?.sub.0, {tilde over (v)}(0)=?.sub.0,
where the parameters appearing in the model equations 21-23, ?, ?, ?, ?, M, K, and b are all assumed to be positive, and the initial conditions ?.sub.0, ?.sub.0, and ?.sub.0 are all assumed to be nonnegative. In the above system
[0381] If the desired clamped blood alcohol level is {tilde over (v)}(t)={tilde over (v)}.sub.0, then an equilibrium solution to the system 21 is given by
To formulate the regulator problem, we linearize about a clamped operating regime, {tilde over (x)}.sub.0, {tilde over (w)}.sub.0, {tilde over (v)}.sub.0, and ?.sub.0, by writing {tilde over (x)}={tilde over (x)}.sub.0+x, {tilde over (w)}={tilde over (w)}.sub.0+w, {tilde over (v)}={tilde over (v)}.sub.0+v, and ?=?.sub.0+u and obtain the linearized system for x, w, v, and u given by:
with boundary conditions, controlled variable and observation
x(t,0)=w(t), x(t, 1)=v(t), t>0,
z(t)=v(t), y(t)=w(t)+?(t), t>0,
respectively, where in the equations the parameters q.sub.1=?, q.sub.2=b, q.sub.3=?, q.sub.4=?, q.sub.5=?, and
are all positive. The state of the system is given by the triple (w, v, x) and the control objective is to determine an output feedback law for u that drives v to zero based on the observation of w, with the caveat that we only know the distribution of the parameters q=(q.sub.1, q.sub.2, q.sub.3, q.sub.4, q.sub.5, q.sub.6) in the subject cohort of interest.
[0382] We reformulate the system as an abstract parabolic system in a Gelfand triple of Hilbert spaces. Let Q be a compact subset of the positive orthant of .sup.6, let H=
.sup.2?L.sup.2(0, 1) be endowed with the standard inner product and norm and for q?Q let H.sub.q=
.sup.2?L.sup.2(0, 1) with the inner product
Let V be the Hilbert space
where ?, ?
.sub.H.sub.
H.sub.q
V* and that Assumption 5 is satisfied. Define the bilinear form a(q; ?, ?):V?V.fwdarw.
by
[0383] Our continuity and compactness assumptions and standard and straight forward calculations yield that the form a(q; ?, ?) satisfies Assumptions 25 and 4 with all relevant constants appearing in those assumptions independent of q?Q.
[0384] Let A(q):Dom (A(q)?H.fwdarw.H be given by:
A(q){circumflex over (?)}, {circumflex over (?)}
.sub.V*,V=?a(q; {circumflex over (?)}, {circumflex over (?)})
for {circumflex over (?)}?Dom (A(q)), and {circumflex over (?)}?V, where
Dom (A(q))={{circumflex over (?)}=(?(0), ?(1), ?)?V:??H.sup.2(0,1)}
is independent of q?Q. Moreover, it is not difficult to show that for {circumflex over (?)}=(?(0), ?(1), ?)?Dom (A(q)) we have
and that the operator A(q) is densely defined on H.sub.q, regularly dissipative and self-adjoint. Consequently A(q) is the infinitesimal generator of a uniformly exponentially stable, self-adjoint, analytic semigroup of bounded linear operators, {T(q; t):t?0}, on H.sub.q. The state variable to be controlled or regulated is v and consequently we define the controlled variable operator D?(H.sub.q,
) by D(?, ?, ?)=?. The observed state variable is w and therefore the observation or output operator C?
(H.sub.q,
) is given by C(?, ?, ?)=?. For q?Q, the input operator B(q)?
(
, H.sub.q) is given by B(Q)u=(0, q.sub.2u, 0), and the random noise influence operator B.sub.1?
(
.sup.2, H.sub.q) by B.sub.1?=(?.sub.1, ?.sub.2, 0). In this example, we have U=Y=Z=
, all of which are clearly finite dimensional. For the finite-time horizon problem if a terminal penalty is to be included, the operator G?
(H.sub.q, H.sub.q) would most likely be chosen to be G=?D*D, for some nonnegative weight ?. We assume zero-order hold input and random noise with sampling time ?>0, and we consider the quadratic performance index
where {circumflex over (r)}>0, k.sub.1 can be either finite or infinite (in the latter case, ?=0), and x.sub.k is given by the recurrence x.sub.k+1=?(q)x.sub.k+{circumflex over (B)}(q) u.sub.k+{circumflex over (B)}.sub.1(q)?(k?), x.sub.0=(w(0), v(0), x(0, ?)), with ?(t)=[?.sub.1(t), ?.sub.2(t)].sup.T, ?(q)=T(q; ?)? (H.sub.q, H.sub.q) and, recalling that A(q) is coercive, that {circumflex over (B)}(q)=A(q).sup.?1(?(q)?I)B(q)?
(
, H.sub.q)=H.sub.q with {circumflex over (B)}(q)?Dom (A(q)), and {circumflex over (B)}.sub.1(q)=A(q).sup.?1(?(q)?I)B.sub.1?
(
.sup.2, H.sub.q)=H.sub.q?H.sub.q with {circumflex over (B)}.sub.1(q)?Dom (A(q)?Dom (A(q). Furthermore, it follows that C=C and {circumflex over (D)}=D, {circumflex over (Q)}={circumflex over (D)}*{circumflex over (D)}?
(H.sub.q, H.sub.q), ?=?{circumflex over (D)}*{circumflex over (D)}?
(H.sub.q, H.sub.q), and {circumflex over (R)}={circumflex over (r)}.
[0385] In the observer or estimator, the state covariance operator and output covariance matrix are given by {tilde over (Q)}(q)={circumflex over (B)}.sub.1(q)?{circumflex over (B)}.sub.1(q)*?(H.sub.q, H.sub.q) where ?=diag (?.sub.1.sup.2, ?.sub.2.sup.2)?
.sup.2?2, and {tilde over (R)}=?.sup.2?
, respectively.
[0386] Now let q be a random vector with support Q and distribution described by the probability measure ? with all functions involving q?-measurable. Let be the Bochner space
=L.sub.?.sup.2(Q; V) and let
* be its dual. Let
=L.sub.?.sup.2(Q; H.sub.q), and identify the Hilbert space H
with its dual to obtain the Gelfand triple
. Define the bilinear form a(?, ?) on
?
and the operator
?
(
,
*) by:
for {circumflex over (?)}, {circumflex over (?)}?. As in the deterministic setting, the operator d is regularly dissipative and self-adjoint and can be restricted to Dom (
)={??
:
??
} as the infinitesimal generator of a uniformly exponentially stable, analytic semigroup {
(t):t?0} of bounded, self-adjoint, linear operators on
.
[0387] Define the operators ?
(
,
),
.sub.1?
(
.sup.2,
), and
,
?
(
,
) by
u=
.sub.?{B(q)}u, u?
,
.sub.1?=
.sub.?{B.sub.1}?=B.sub.1?, ??
.sup.2,
{circumflex over (?)}=
.sub.?{C{circumflex over (?)}}, and
{circumflex over (?)}=
.sub.?{D{circumflex over (?)}}, {circumflex over (?)}?
, respectively Then set
=
(?)?
(
,
), {circumflex over (B)}=
.sup.?1(
?
)
?
(
,
),
.sub.1=
.sup.?1(
?
)
.sub.1?
(
.sup.2,
),
=
?
(
,
), and {circumflex over (D)}=
?
(
,
), where
denotes the identity operator on
.
[0388] With these definitions, we then consider the discrete-time linear quadratic regulator problem in for the quadratic performance index
subject to the discrete-time linear system
x.sub.k+1=x.sub.k+
u.sub.k+
.sub.1?(k?), x.sub.0={circumflex over (?)}.sub.0,
y.sub.k=x.sub.k+?(k?), k=k.sub.0, k.sub.0+1, k.sub.0+2, . . . ,
where {circumflex over (Q)}, ?
(
,
) are given by {circumflex over (Q)}=
, and
=?
, respectively and {circumflex over (?)}.sub.0?
. Note that in light of our definitions, the quadratic performance index is the same.
[0389] In what follows we will only concern ourselves with the infinite horizon problem (i.e. when k.sub.1=? and ?=0); the results for the finite horizon problem are analogous. The uniform exponential stability of the semigroup {(t):t?0} and therefore of
as well guarantee that there exists a unique solution, and consequently that an admissible control exists for any initial value. Moreover, we have for any admissible control, lim.sub.k.fwdarw.??x.sub.k
=0. It follows that there exists a unique positive semi-definite self-adjoint solution to the ARE
{circumflex over (?)}=*[{circumflex over (?)}?{circumflex over (?)}
(
+
*{circumflex over (?)}
).sup.?1
*{circumflex over (?)}]
+
the optimal input in closed-loop linear state feedback form is given by:
?.sub.k=?,
, k=k.sub.0, k.sub.0+1, . . . ,
where
={{circumflex over (r)}+
*{circumflex over (?)}
}.sup.?1
*{circumflex over (?)}
,
{circumflex over (?)}=* is the corresponding functional gain,
(?)=
{circumflex over (?)}{circumflex over (?)}.sub.0, {circumflex over (?)}.sub.0
, and that the optimal trajectory {
?
)
[0390] To construct the compensator, the observer takes the form
{tilde over (x)}.sub.k+1={tilde over (x)}.sub.k+
u.sub.k+
(y(k?)?
{tilde over (x)}.sub.k), {tilde over (x)}.sub.k.sub.
k? k.sub.0, where {tilde over (?)}.sub.0? is arbitrary and the operator observer gain
?
(
,
) is given by:
=
{tilde over (?)}
*{?.sup.2+
{tilde over (?)}
*}.sup.?1
with the operator {tilde over (?)} the unique positive semi-definite self-adjoint solution guaranteed to exist to the ARE given by:
{tilde over (?)}=[{tilde over (?)}?{tilde over (?)}
*(?.sup.2+
{tilde over (?)}
*).sup.?1
{tilde over (?)}]
*+
where {tilde over (Q)}.sub.1?
*.sub.1?
(
,
). The optimal LQG compensator is then given by ?.sub.k=?
{tilde over (x)}.sub.k=?
{circumflex over (?)}, {tilde over (x)}.sub.k
, k=k.sub.0, k.sub.0+1, k.sub.0+2, . . . , where the feedback operator
and functional control gains {circumflex over (?)} are given by: 31 and 30, respectively. Note that since
?
(
,
), it follows that in fact
={tilde over ()}?
. The element
in
is the optimal functional observer gain. Finally, we note that the spectrum for the closed-loop compensator system is given by:
from which it is not difficult to argue that in fact ?()=?(
)?
)??(
?
).
[0391] Approximation and Convergence (Section 14. B). The theory presented in Sections above tells us how to proceed here. We need only describe (1) how to construct a sequence of finite-dimensional approximating subspaces of .sup.N, whose corresponding sequence of orthogonal projections converges strongly to the identity in
, and (2) how to define appropriately converging sequences of approximating operators to
, and {tilde over (Q)}.
[0392] Let N represent the multi-index N=(n, m.sub.1, m.sub.2, . . . , m.sub.6) and we write N.fwdarw.? we mean n.fwdarw.? and m.sub.i.fwdarw.?, i=1,2, . . . . 6. We assume that the random parameter q.sub.i has support [a.sub.i, b.sub.i], i=1,2, . . . ,6, all assumed to be bounded. Let Q be the compact subset of .sup.6 given by Q=x.sub.i=1.sup.6[a.sub.i, b.sub.i]. For i=1,2, . . . , 6, partition [a.sub.i, b.sub.i] into m.sub.i equal subintervals, and let ?.sub.j.sup.m.sup.
and set {circumflex over (?)}.sub.j.sup.n=(?.sub.j.sup.n(0), ?.sub.j.sup.n(1), ?.sub.j.sup.n)?V. Let J denote a multi-index of the form J=(j.sub.0, j.sub.1, . . . j.sub.6) where j.sub.0?{0,1,2, . . . , n} and j.sub.i?, i=1,2, . . . ,6. Then set ?.sub.J.sup.N={circumflex over (?)}.sub.j.sub..sup.N=span.sub.J{?.sub.J.sup.N}, let
.sup.N:
.fwdarw.
.sup.N denote the orthogonal projection of
onto
.sup.N. Standard arguments from the theory of splines and piecewise constant approximation in L.sup.2 can be used to argue that
.sup.N converges strongly to the identity in both
and
.
[0393] Since .sup.N is a subspace of
(and
) we simply set
.sup.N=
.sup.N for all multi-indices N. We obtain
.sup.N?
(
.sup.N,
.sup.N) via a Galerkin approach as described above and set
.sup.N=exp(
.sup.N?)?
(
.sup.N,
.sup.N). We then set
.sup.N=(
.sup.N).sup.?1(
.sup.N?
.sup.N)
.sup.N
?
(
,
.sup.N). Similarly we set
.sub.1.sup.N=(
.sup.N).sup.?1(
.sup.N?
.sup.N)
.sup.N
.sub.1?
(
.sup.2,
.sup.N), and then set {tilde over (Q)}.sup.N=
.sub.1.sup.N?(
.sub.1.sup.N)*?
(
.sup.N,
.sup.N). We note also that {circumflex over (Q)}.sup.N=
.sup.N{circumflex over (Q)}
.sup.N=(
.sup.N)*
.sup.N, where {circumflex over (D)}.sup.N=
.sup.N. Arguments from functional analysis (specifically, linear semigroup theory) can then be used to argue the requisite convergence.
[0394] We then consider the sequence of finite-dimensional approximating LQR/LQG compensator problems on the infinite-time horizon to minimize
subject to
x.sub.k+1.sup.N=.sup.Nx.sub.k.sup.N+
.sup.Nu.sub.k.sup.N+
.sub.1.sup.N?(k?), x.sub.0.sup.N=
.sup.N{circumflex over (?)}.sub.0.
y.sub.k.sup.N=.sup.Nx.sub.k.sup.N+?(k?), k=k.sub.0k.sub.0+1, k.sub.0+2, . . .
[0395] As in the infinite-dimensional case, the unique solution to this problem is given in closed-loop linear state feedback form by [10] ?.sub.k.sup.N=.sup.N
{circumflex over (?)}.sup.N,
, k=k.sub.0, k.sub.0+1, . . . , where
.sup.N={{circumflex over (r)}+(
.sup.N)*{circumflex over (?)}.sup.N)
.sup.N}.sup.?1(
.sup.N)*{circumflex over (?)}.sup.N
.sup.N
and {circumflex over (?)}.sup.N is the unique positive semi-definite, symmetric solution to the approximating ARE,
{circumflex over (?)}.sup.N=(.sup.N)*[{circumflex over (?)}.sup.N?{circumflex over (?)}.sup.N
.sup.N({circumflex over (r)}+(
.sup.N)*{circumflex over (?)}.sup.N
.sup.N).sup.?1(
.sup.N)*{circumflex over (?)}.sup.N]
.sup.N+(
.sup.N)*
.sup.N
where .sup.N(?.sup.N)=
{circumflex over (?)}.sup.N
.sup.N{circumflex over (?)}.sub.0,
.sup.N{circumflex over (?)}.sub.0
and that the optimal trajectory is given by
.sup.N?
.sup.N
.sup.N)
.sup.N{circumflex over (?)}.sub.0. We note that in actual practice, the control applied would be ?.sub.k.sup.N=?
{circumflex over (?)}.sup.N,
, k=k.sub.0k.sub.0+1, . . . , where
{tilde over (x)}.sub.k+1.sup.N=.sup.N{tilde over (x)}.sub.k.sup.N+
.sup.Nu.sub.k.sup.N+
.sup.N(y(k?)?
.sup.N{tilde over (x)}.sub.k.sup.N)
{tilde over (x)}.sub.0.sup.N=.sup.N{tilde over (?)}.sub.0
where .sup.N?
(
,
.sup.N) is given by:
.sup.N=
.sup.N{tilde over (?)}.sup.N
*{?.sup.2+
.sup.N{tilde over (?)}.sup.N(
.sup.N)*}.sup.?1
with ?.sup.N the unique positive semi-definite symmetric solution to the
{tilde over (?)}.sup.N=.sup.N[{tilde over (?)}.sup.N?{tilde over (?)}.sup.N(
.sup.N)*(?.sup.2+
.sup.N{tilde over (?)}.sup.N(
.sup.N)*).sup.?1
.sup.N{tilde over (?)}.sup.N](
.sup.N)*+
.sub.1.sup.N?(
.sub.1.sup.N)*.
[0396] The equations are operator equations, albeit finite dimensional ones. In order to actually carry out computations (i.e. by using standard ARE solvers) these equations must be converted to equivalent matrix equations. Since the basis we have chosen for W.sup.N is not orthonormal, some care must be exercised in making this conversion so as to obtain a standard symmetric matrix ARE.
[0397] The approximating compensator is then given by
where in the above expression we have used the following notational convention {circumflex over (?)}.sup.N=({circumflex over (?)}.sub.1.sup.N, {circumflex over (?)}.sub.2.sup.N, {circumflex over (?)}.sub.3.sup.N)?.sup.N and {tilde over (x)}.sub.k.sup.N=({tilde over (x)}.sub.1,k.sup.N, {tilde over (x)}.sub.2,k.sup.N, {tilde over (x)}.sub.3,k.sup.N)?
.sup.N. We note that because the generator
.sup.N of the approximating semigroup {exp(
.sup.Nt):t?0}, was constructed using a Galerkin approach, we are guaranteed the existence of unique positive semi-definite symmetric solutions to the AREs for the same reasons that this is true in the infinite-dimensional case stated in the previous sub-section. In addition, the convergence results given herein apply and finally we note that approximating closed loop eigenvalues can be obtained as
[0398] Numerical Results (Section 15). We consider a system of the general form of the one given previously. In particular we let q.sub.1=0.2, q.sub.2=0.5, q.sub.3=0.5, q.sub.4=0.5, q.sub.5=0.5, q.sub.6=0.5, ?.sub.1=0.05, ?.sub.2=0.05, ?=0.05 k.sub.0=0, k.sub.1=?, ?=0, and {circumflex over (r)}=0.1. We assume further that we do not actually know the precise value of q.sub.1, but rather only that it is random with q.sub.1?Beta (?, ?) with ?=3 and ?=2. We take the sampling interval to be ?=0.1 and the discretization level of ??[0,1] and q.sub.1?[0,1] to be given by the multi-index N=(n, m).
[0399] In
TABLE-US-00003 TABLE I m = n 4 8 12 16 20 24 28 Norm 18.00 10.00 5.18 2.61 1.25 0.54 0.17 (?10.sup.?4)
TABLE-US-00004 TABLE II m = n 4 8 12 16 20 24 28 Norm 12.62 7.39 4.81 3.21 2.09 1.24 0.56 (?10.sup.?4)
[0400] In Table III we show the optimal functional control gains {circumflex over (?)}.sub.1 and {circumflex over (?)}.sub.2 and iwe have plotted the optimal functional control gains {circumflex over (?)}.sub.3 for the full state feedback controller when q.sub.1=0.1j,j=1,2, . . . ,9,10 all computed with n=32. In the same table and figure we have also tabulated and plotted the expected value of the optimal functional control gains, .sub.?[{circumflex over (?)}] computed using our approach with n=m=16 and q.sub.1?Beta (?, ?) with ?=3 and ?=2. In addition, since our scheme yields the approximating optimal control (and observer) gains as a function of q.sub.1, we can readily compute 90% credible intervals and bands for the optimal control gains computed with our method. In
[0401] In Table IV we show the values of the performance index, J(u), when the system was simulated with different approximating optimal controllers/compensators. We took x(0, ?)=1.0,0???1, w(0)=1.0, v(0)=1.0 and computed the approximating controllers with either n=32 or n=m=32. We took the plant parameter values to be q=[0.2,0.5,0.5,0.5.0.5,0.5].sup.T, the final time to be T=10.0, and the length of the sampling interval to be ?=0.1. The standard deviations of the noise processes were taken to be ?.sub.1=?.sub.2=?=0.05 and the control penalty weight was r{circumflex over ()}=0.1. We set the seed in Matlab's random number generator to be equal to one in all of the simulations. We simulated the linearized plant using our spline model with n=64 and for our scheme we assumed that q.sub.1=q.sub.1 was random with q.sub.1?Beta(3, 2).
TABLE-US-00005 TABLE III q.sub.1 0.1 0.3 0.5 0.7 0.9 E?[q.sub.1] f.sub.1 0.2630 0.2128 0.1740 0.1462 0.1258 0.1603 f.sub.2 3.9239 1.8792 1.2699 0.9654 0.7807 1.2339
TABLE-US-00006 TABLE IV Con/Comp 1 2 3 6 J(u) 9.07 5.08 7.78 5.60
TABLE-US-00007 TABLE V Con/Comp 1 2 3 4 5 6 J(u) 21.99 10.88 10.89 11.92 11.70 11.57
[0402] In Table IV Controller/Compensator 1 was no control (i.e. u.sub.k=0, k=0,1,2, . . . ,99), Controller/Compensator 2 was the optimal infinite-dimensional (n=64) full state feedback controller computed with the plant's value for q.sub.1 to be q.sub.1=0.2, Controller/Compensator 3 was the optimal finite-dimensional (n=32) output feedback compensator computed with the plant's value for q.sub.1 to be the plant value of q.sub.1, q.sub.1=0.2. Controller/Compensator 4 was the optimal finite-dimensional (n=32) output feedback compensator but computed with the incorrect value for q.sub.1, q.sub.1=0.8, Controller/Compensator 5 was the optimal finite-dimensional (n=32) output feedback compensator but computed with q.sub.1=[q.sub.1]=0.6, and finally Controller/Compensator 6 was the optimal finite-dimensional (n=32, m=32) output feedback compensator computed using the approach we developed.
[0403] Finally, in Table V we show results of simulating controller/compensator 1, 2, and 3 along with compensator 6, the one developed here, for the case where q.sub.1=q.sub.1,k?Beta (?, ?) with ?=3 and ?=2, k=0,1,2, . . .
[0404] We have demonstrated the optimality and convergence of approximating finite-dimensional compensators for a plant of the form herein. As can be seen from the numerical studies in the previous section, we have also demonstrated that our finite-dimensional compensators perform well in both the case where the plant system parameters are fixed but unknown (with known distribution) and where they take on a different random value in each sampling interval. However, the rigorous analysis of the performance of the actual closed loop system (e.g. could the finite-dimensional compensator destabilize the infinite dimensional plant or is the compensator in any sense optimal, etc.) in each of these cases, at present, remains open.
[0405] An extension of our results for the LQG compensator problem for random parabolic systems developed here may be contemplated to the LQG tracking problem for random parabolic systems. As was the case with the results presented here, this effort is again motivated by problems involving transdermal alcohol transport and sensing. Specifically, there are two problems of particular interest to us; one is a control problem and the other is an estimation or filtering problem. The first problem is the natural extension of the results presented here for the control of the alcohol clamping studies to experiments whose aim is to have the subject's BAC track or follow a pre-specified trajectory. Once again sensing would be based on observations of transdermal alcohol level. As in the case of the clamping studies, the resulting control problem is complicated by the fact that the underlying model is population-based with only the distribution of the model parameters known.
[0406] The second problem of interest to us involves the estimation of BAC or BrAC from TAC measurements. The technology to measure TAC is relatively new. Consequently, researchers and clinicians working in the area of alcohol use disorders have traditionally based their studies and diagnoses almost exclusively on observations of BAC or BrAC. In addition, BAC and BrAC are the preferred measure of intoxication in the consumer (i.e. wearable technology) and forensic (e.g. DUI) communities. Observations of BAC and BrAC are difficult or impossible to collect in a naturalistic setting in the field, while through the use of this new technology, TAC can be. Thus, a reliable means to convert TAC into equivalent BAC/BrAC is desired. The approach we are looking at is to formulate the TAC to BAC/BrAC conversion as an LQG tracking problem wherein the input (i.e. the BAC or BrAC) that forces the model (rather than the plant!) to track the biosensor measured TAC is determined. The underlying diffusion and transport model is augmented with actuator dynamics so that the input penalty term in the quadratic performance index can serve as regularization to mitigate over-fitting. Again, the underlying dynamics are in the form of a population model with only the distributions rather than the actual values of the parameters known.