Multicomponent model parameterisation
11113362 · 2021-09-07
Assignee
Inventors
- Seyi Latunde-Dada (Malvern, GB)
- Oksana Iryna Leszczyszyn (Malvern, GB)
- Karl Hampton (Malvern, GB)
- Rachel Bott (Malvern, GB)
Cpc classification
G01N13/00
PHYSICS
G06F17/18
PHYSICS
International classification
G06F17/18
PHYSICS
G01N13/00
PHYSICS
Abstract
A method of estimating a parameter for fitting a multi-component Taylorgram model to Taylorgram data g(t) is disclosed. The data comprises a multi-component Taylorgram peak or front at t=t.sub.r. The method comprises: evaluating a value of an integration or differential of the data; determining the parameter, based on an analytical expression that includes the value of the integral or differential of the data, the parameter corresponding with a physical property of a sample from which the Taylorgram data was obtained.
Claims
1. A method of estimating a diffusion coefficient or hydrodynamic radius of a sample by: performing a Taylor dispersion analysis, wherein performing the Taylor dispersion analysis comprises: flowing a carrier solution through a capillary; injecting a sample fluid into the capillary to form a sample plug or a sample front; determining Taylorgram data by measuring absorbance over time at a detector as the sample plug or sample front flows past the detector; and using a computer to fit a multi-component Taylorgram model to the Taylorgram data g(t) obtained from the sample, the Taylorgram data comprising a multi-component Taylorgram peak or front at t=t.sub.r; the method comprising: evaluating a value of an integration or a differential of the data; determining the diffusion coefficient or hydrodynamic radius of a component of the multi-component Taylorgram model, based on an analytical expression that includes the value of the integral or differential of the data, the diffusion coefficient or hydrodynamic radius corresponding with a physical property of a component of the sample from which the Taylorgram data was obtained; determining a maximum value, w, of a differential of the data,
2. The method of claim 1, wherein evaluating a value of an integration or a differential of the data comprises evaluating at least one of: a first differential of the data, a second differential of the data, a third differential of the data, a first integral of the data, a second integral of the data and a third integral of the data.
3. The method of claim 1, comprising: evaluating, based on the data, at least two values selected from: x, a value of the data at t=t.sub.r; y, a value of a second differential of the data,
4. The method of claim 1, wherein the simultaneous equations comprise at least two of:
u=Σ.sub.i=1.sup.nA.sub.iσ.sub.i.sup.2.
5. The method of claim 4, wherein the model comprises a two-component Taylorgram model, and wherein at least one of the following conditions are met: none of A.sub.1, A.sub.2, σ.sub.1, σ.sub.2 are known before the diffusion coefficient or hydrodynamic radius is estimated; one of σ.sub.1 and σ.sub.2 are known, and none of the remaining parameters A.sub.1, A.sub.2, σ.sub.1, σ.sub.2 are known before the diffusion coefficient or hydrodynamic radius is estimated; the second component has a negative amplitude A.sub.2, and the amplitude of the second component A.sub.2, is smaller than the amplitude of the first component A.sub.1.
6. The method of claim 4, wherein: i) the model comprises a two or three component Taylorgram model, and wherein a relationship between the values of σ.sub.i for the sample is known, and wherein the absolute values of A.sub.i and σ.sub.i for the sample are not known, and the ratio between the values of A.sub.i for the sample is not known; or ii) the model comprises a two, three or four component Taylorgram model, and wherein the values σ.sub.i for the sample are each known but the values of A.sub.i for the biopharmaceutical formulation are not known.
7. The method of claim 1, wherein the Taylorgram model is of the general form:
8. The method of claim 7, wherein the diffusion coefficient or hydrodynamic radius is determined based on at least one of the expressions:
9. The method of claim 8, wherein a further parameter of the model is determined based on at least one of the expressions:
10. An apparatus comprising a processor, configured to perform the method of claim 1.
11. The apparatus of claim 10, further comprising an instrument for performing a Taylor dispersion analysis, so as to obtain the data.
12. The method of claim 1, further comprising performing a measurement on the sample to obtain the Taylorgram data.
13. A method of estimating a diffusion coefficient or hydrodynamic radius of a sample by: performing a Taylor dispersion analysis, wherein performing the Taylor dispersion analysis comprises: flowing a carrier solution through a capillary; injecting a sample fluid into the capillary to form a sample plug or a sample front; determining Taylorgram data by measuring absorbance over time at a detector as the sample plug or sample front flows past the detector; and using a computer to fit a two-component Taylorgram model to the Taylorgram data g(t), the data being a Taylorgram obtained from the sample comprising a multi-component Taylorgram peak or front at t=t.sub.r, the method comprising: performing a least squares fit of a single component Taylorgram model to the data g(t); determining the diffusion coefficient or hydrodynamic radius by finding a root of a cubic equation, the cubic equation derived from: an analytical expression for the integrated residual squared error, R.sup.2, resulting from a fit of a one-component Taylorgram model to a two-component Taylorgram distribution; and a priori knowledge of the standard deviation, α, of a dominant Taylorgram component of the data g(t); and, wherein the two-component Taylorgram model is of the general form:
a.Math.s.sub.2.sup.3+b.Math.s.sub.2+c=0, where
14. A non-transitory computer readable medium, containing a set of instructions that are operable to cause a computer to perform the method of claim 13.
15. The method of claim 13, further comprising performing a measurement on the sample to obtain the Taylorgram data.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) Embodiments will be described in more detail, purely by way of example, with reference to the accompanying drawings.
DETAILED DESCRIPTION OF THE INVENTION
(23) In the long time limit of Taylor dispersion, the resultant Taylorgram obtained from a pulse Taylorgram comprising a mixture of n non-interacting components can be approximated by:
(24)
where A.sub.i and σ.sub.i are the respective amplitudes and Taylorgram widths of components i=1 to n, t.sub.r is the residence time of the mixture at the observation point and t is the time.
(25) The value of the Taylorgram x at t=t.sub.r (i.e. the time of the peak in
x=Σ.sub.i=1.sup.nA.sub.i (Equation 8)
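As an illustration only, the following sketch assumes that Equation 7 (not reproduced in this text) is a sum of Gaussians centred on t.sub.r, which is consistent with Equations 8 and 11. The function name `taylorgram` and all numerical values are hypothetical. Evaluating the assumed model at t=t.sub.r returns the sum of the amplitudes, as stated by Equation 8.

```python
import math

def taylorgram(t, amplitudes, widths, t_r):
    """Assumed model form: a sum of Gaussians centred on the residence time t_r."""
    return sum(
        a * math.exp(-((t - t_r) ** 2) / (2.0 * s ** 2))
        for a, s in zip(amplitudes, widths)
    )

# Hypothetical two-component mixture (illustrative values only).
amplitudes = [1.0, 0.4]
widths = [5.0, 12.0]  # seconds
t_r = 100.0           # seconds

# Value of the model at the peak; Equation 8 says this equals sum(amplitudes).
x = taylorgram(t_r, amplitudes, widths, t_r)
```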
(26) Referring to
(27)
(28) The value of the second differential 105 of a multicomponent Gaussian of the form given in Equation 7 can be evaluated at t=t.sub.r to give y, as:
(29)
(30)
(31)
(32)
u=I.sub.2=Σ.sub.i=1.sup.nA.sub.iσ.sub.i.sup.2 (Equation 11)
(33) The four equations (Equations 8-11) provide simultaneous equations in the amplitudes, A, and widths, σ, which can be solved for mixtures with differing numbers of components.
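The quantities y and u can be estimated numerically from sampled data. The sketch below is illustrative and rests on two assumptions that are not confirmed by the text reproduced here: that the model is a sum of Gaussians centred on t.sub.r, and that I.sub.2 is the twice-iterated integral of the data from t.sub.r onwards. Under these assumptions the second differential at t.sub.r equals −Σ A.sub.i/σ.sub.i.sup.2, and the iterated integral equals Σ A.sub.iσ.sub.i.sup.2, in agreement with Equation 11. All function names and values are hypothetical.

```python
import math

def taylorgram(t, amplitudes, widths, t_r):
    """Assumed model form: a sum of Gaussians centred on t_r."""
    return sum(
        a * math.exp(-((t - t_r) ** 2) / (2.0 * s ** 2))
        for a, s in zip(amplitudes, widths)
    )

def second_differential_at(f, t0, h=1e-2):
    """Central-difference estimate of the second differential of f at t0."""
    return (f(t0 + h) - 2.0 * f(t0) + f(t0 - h)) / h ** 2

def iterated_integral_from(f, t0, span, n=20000):
    """Integrate f from t to t0+span, then integrate that result over t from t0."""
    dt = span / n
    values = [f(t0 + k * dt) for k in range(n + 1)]
    # inner[k] holds the trapezoidal integral of f from t0 + k*dt to t0 + span.
    inner = [0.0] * (n + 1)
    for k in range(n - 1, -1, -1):
        inner[k] = inner[k + 1] + 0.5 * (values[k] + values[k + 1]) * dt
    # Outer trapezoidal integral of inner(t) over the same range.
    return sum(0.5 * (inner[k] + inner[k + 1]) * dt for k in range(n))

# Hypothetical two-component mixture (illustrative values only).
amplitudes = [1.0, 0.4]
widths = [5.0, 12.0]
t_r = 100.0

def model(t):
    return taylorgram(t, amplitudes, widths, t_r)

y = second_differential_at(model, t_r)
u = iterated_integral_from(model, t_r, span=200.0)

# Analytical values for the assumed Gaussian sum, for comparison.
y_expected = -sum(a / s ** 2 for a, s in zip(amplitudes, widths))
u_expected = sum(a * s ** 2 for a, s in zip(amplitudes, widths))
```

With real data, the second differential would normally be taken after smoothing, since differentiation amplifies noise.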
(34) From the widths of the individual distributions, the hydrodynamic radii can be determined for the individual components. Furthermore, with knowledge of the extinction coefficient of each component (relating the reading at a concentration sensor to a concentration of the component in question), the proportion of each component in the mixture can be estimated by computing the area under each individual Taylorgram component.
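A minimal sketch of the conversion between a Taylorgram width and a hydrodynamic radius follows. Equation 1 is not reproduced in this text, so the sketch assumes the standard Taylor dispersion relation D = r.sup.2t.sub.r/(24σ.sup.2) for a capillary of radius r, combined with the Stokes-Einstein relation R.sub.h = k.sub.BT/(6πηD). The function names and run conditions are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K

def radius_from_width(sigma, t_r, capillary_radius, temperature, viscosity):
    """Hydrodynamic radius (m) from a Taylorgram width sigma (s)."""
    # Taylor dispersion relation (assumed form of the width/diffusion link).
    diffusion = capillary_radius ** 2 * t_r / (24.0 * sigma ** 2)
    # Stokes-Einstein relation.
    return K_B * temperature / (6.0 * math.pi * viscosity * diffusion)

def width_from_radius(r_h, t_r, capillary_radius, temperature, viscosity):
    """Inverse of radius_from_width: Taylorgram width (s) for a known radius (m)."""
    diffusion = K_B * temperature / (6.0 * math.pi * viscosity * r_h)
    return math.sqrt(capillary_radius ** 2 * t_r / (24.0 * diffusion))

# Hypothetical run conditions (illustrative values only).
conditions = dict(
    t_r=100.0,                 # residence time, s
    capillary_radius=37.5e-6,  # m
    temperature=298.15,        # K
    viscosity=8.9e-4,          # Pa s
)

sigma = width_from_radius(3.8e-9, **conditions)
r_h = radius_from_width(sigma, **conditions)
```

The inverse function is the rearrangement needed when the size of a component is known a priori and its width is required as an input.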
(35) Although this example has dealt with the case where the sample plug is injected as a pulse (with short duration), the skilled person will appreciate that a similar set of simultaneous equations can be written for the case where the sample plug is injected as a slug (with longer duration). Equations analogous to Equations 9-11 can be written, based on Equation 3 (which describes a slug Taylorgram), rather than on Equation 2 (which describes a pulse Taylorgram).
(36) Referring to
(37) Illustrative methods of estimating parameters for fitting multi-component models to Taylorgram data will be disclosed that address each of the cases mentioned in the background of the invention.
(38) Each of these methods can be used to analytically calculate estimates for model parameters that fit the Taylorgram data. The parameter estimates may be used as starting points for further regression analysis, intended to tweak the parameters so as to minimise an error function determined with reference to the model and the data. For instance, the initial parameter estimates may be provided as an input to a least squares regression that adjusts the parameters to minimise the sum of the squares of the error. In some circumstances, the parameter estimates may be sufficiently accurate to be used without further regression analysis to determine properties of the sample from which the data was obtained (e.g. hydrodynamic radius or concentration).
(39) In some embodiments, pre-processing of Taylorgram data before the parameter estimates are generated may be desirable. For example, a detector of the instrument used to perform the analysis may be subject to baseline drift. This may be corrected, for example by fitting a linear function to the baseline data, and correcting the raw Taylorgram based on the linear function. In addition or alternatively, the raw Taylorgram may be subjected to a filtering or smoothing operation to remove or attenuate high frequency components. For example a moving average filter may be applied (e.g. a Savitzky-Golay filter), or a spline fit made to the data, or some other smoothing/filtering technique applied.
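The pre-processing steps described above can be sketched as follows. This is an illustrative example only: the function names, the choice of baseline regions and the window length are assumptions, and a simple centred moving average stands in for whichever smoothing technique is preferred.

```python
import math

def fit_line(ts, ys):
    """Ordinary least squares fit of y = slope * t + intercept."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in ts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_t

def correct_baseline(ts, ys, is_baseline):
    """Subtract a linear function fitted to the points flagged as baseline."""
    base_t = [t for t, flag in zip(ts, is_baseline) if flag]
    base_y = [y for y, flag in zip(ys, is_baseline) if flag]
    slope, intercept = fit_line(base_t, base_y)
    return [y - (slope * t + intercept) for t, y in zip(ts, ys)]

def moving_average(ys, window):
    """Centred moving average; the window shrinks at the ends of the data."""
    half = window // 2
    out = []
    for i in range(len(ys)):
        lo = max(0, i - half)
        hi = min(len(ys), i + half + 1)
        out.append(sum(ys[lo:hi]) / (hi - lo))
    return out

# Synthetic Taylorgram: one Gaussian peak plus a linearly drifting baseline.
ts = [float(i) for i in range(201)]
raw = [math.exp(-((t - 100.0) ** 2) / (2.0 * 8.0 ** 2)) + 0.05 + 0.001 * t for t in ts]

# Assumed baseline regions: well away from the peak on either side.
is_baseline = [t < 50.0 or t > 150.0 for t in ts]

corrected = correct_baseline(ts, raw, is_baseline)
smoothed = moving_average(corrected, window=5)
```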
(40) Two Component Mixture of Unrelated Components
(41) In this case, none of A.sub.1, A.sub.2, σ.sub.1, σ.sub.2 are known a priori. The four equations (Equations 8 to 11) reduce to:
(42)
(43) Solving the four equations (12 to 15) simultaneously gives:
(44)
where
(45)
(46) Using these equations, the four unknown parameters can be estimated. In cases where the two widths σ.sub.1, σ.sub.2 are similar in magnitude, it is possible to obtain unphysical solutions to these equations. To alleviate this, the values of w and k can be varied (for instance by pseudorandom amounts) until physical solutions are obtained.
(47) Two Component Mixture Where the Size of One of the Components is Known a Priori
(48) In one embodiment, the width σ.sub.1 of the known component can be determined from the size by rearranging Equation 1. In doing so, one of the four unknown parameters has been determined a priori, and any three of the four equations (Equations 12-15) presented above can be used to obtain initial estimates for the remaining parameters. There are four such combinations of three equations which can be used.
(49) In an alternative embodiment, a least squares fit is used. A least squares fitting method finds the best fit to a given set of data points by minimizing the sum of the squares of the offsets (residuals) of the data points from the fit. For example, for the set of data y.sub.i, the best fit function ƒ(t, a.sub.1 . . . a.sub.n), which varies with time t and parameters a.sub.1 . . . a.sub.n, is the one which minimizes the sum of the squared residuals, R.sup.2, given by:
R.sup.2=Σ[y.sub.i−ƒ(t,a.sub.1, . . . ,a.sub.n)].sup.2 (Equation 22)
(50) A least squares fit of a single Gaussian distribution to a two-component pulse Taylorgram (Equation 6) therefore finds the solution which minimizes the residuals R.sup.2 given by:
(51)
where A.sub.s and σ.sub.s are the amplitude and the width of the fitted Gaussian respectively.
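The following sketch illustrates a least squares fit of a single Gaussian to two-component data in the sense of Equation 22. It is not the analytical route of the embodiment: the width σ.sub.s is found by a simple grid search, and for each trial width the amplitude A.sub.s is obtained in closed form because the model is linear in the amplitude. The two-component data are assumed to be a sum of Gaussians centred on t.sub.r, and all names and values are hypothetical.

```python
import math

def gaussian(t, t_r, sigma):
    """Unit-amplitude Gaussian centred on t_r."""
    return math.exp(-((t - t_r) ** 2) / (2.0 * sigma ** 2))

def fit_single_gaussian(ts, ys, t_r, sigma_grid):
    """Return (R2, A_s, sigma_s) minimising the sum of squared residuals."""
    best = None
    for sigma in sigma_grid:
        phi = [gaussian(t, t_r, sigma) for t in ts]
        # Optimal amplitude for this width (linear least squares).
        amp = sum(y * p for y, p in zip(ys, phi)) / sum(p * p for p in phi)
        r2 = sum((y - amp * p) ** 2 for y, p in zip(ys, phi))
        if best is None or r2 < best[0]:
            best = (r2, amp, sigma)
    return best

# Hypothetical two-component data (illustrative values only).
t_r = 100.0
a1, sigma1 = 1.0, 5.0
a2, sigma2 = 0.5, 15.0
ts = [0.5 * i for i in range(401)]
ys = [a1 * gaussian(t, t_r, sigma1) + a2 * gaussian(t, t_r, sigma2) for t in ts]

sigma_grid = [1.0 + 0.1 * k for k in range(291)]
r2, amp_s, sigma_s = fit_single_gaussian(ts, ys, t_r, sigma_grid)
```

For positive amplitudes the fitted width lies between the two component widths, which is why the single-component fit carries information about the unknown second width.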
(52) This sum can be estimated as an integral over time to give:
(53)
(54) To find the minimum, Equation 24 may be differentiated with respect to A.sub.s and σ.sub.s and equated to zero to give the following simultaneous equations:
(55)
(56) Given a priori knowledge of the hydrodynamic radius of component 1 (and hence σ.sub.1) and since the peak height of the Taylorgram, A.sub.peak, is equal to the sum of A.sub.1 and A.sub.2, equations 25 and 26 can be solved simultaneously to give σ.sub.2 (and hence the second hydrodynamic radius) as the solution of the cubic equation:
a.Math.s.sub.2.sup.3+b.Math.s.sub.2+c=0 (Equation 27)
where
(57)
(58) A.sub.1 and A.sub.2 (and hence the relative proportions of the two components in the sample from which the Taylorgram has been obtained) can then be determined by substitution into Equations 25 and 27.
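A cubic of the form of Equation 27 has no quadratic term, and its positive real roots can be located numerically. The sketch below scans for sign changes and refines each by bisection. The definitions of the coefficients a, b and c are not reproduced in this text, so the values used here are purely hypothetical, as are the function names.

```python
def cubic(s, a, b, c):
    """Left-hand side of a*s**3 + b*s + c = 0."""
    return a * s ** 3 + b * s + c

def positive_roots(a, b, c, s_max, steps=10000):
    """Find roots in (0, s_max] by scanning for sign changes, then bisecting."""
    roots = []
    ds = s_max / steps
    lo, f_lo = 0.0, cubic(0.0, a, b, c)
    for k in range(1, steps + 1):
        hi = k * ds
        f_hi = cubic(hi, a, b, c)
        if f_hi == 0.0:
            roots.append(hi)  # grid point lands exactly on a root
        elif f_lo * f_hi < 0.0:
            left, right, f_left = lo, hi, f_lo
            for _ in range(60):  # fixed number of bisection steps
                mid = 0.5 * (left + right)
                f_mid = cubic(mid, a, b, c)
                if f_left * f_mid <= 0.0:
                    right = mid
                else:
                    left, f_left = mid, f_mid
            roots.append(0.5 * (left + right))
        lo, f_lo = hi, f_hi
    return roots

# Hypothetical coefficients: s**3 - 7*s - 6 = (s - 3)(s + 1)(s + 2).
roots = positive_roots(1.0, -7.0, -6.0, s_max=100.0)
```

Only positive roots are physically meaningful for a width. A root at which the curve touches zero without changing sign would be missed by this scan.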
(59) Although the example above has again been illustrated with respect to a pulse Taylorgram, the skilled person will appreciate that a similar approach can be applied to the case where the Taylorgram mode corresponds with a slug Taylorgram (as described by Equation 3).
(60) Referring to
(61) Two Component Mixture Where the Second Component Has a Smaller Amplitude Which Differs in Sign to the Amplitude of the First Component
(62) Such a scenario can arise where there is a mismatch between the sample and run buffers, for example as a result of evaporation, sample mishandling, or changes in sample/buffer components during storage. The effect of a mismatch between the sample and run buffers on the resulting Taylorgram data is illustrated in
(63) In this case, we are considering a mixture of two components that are unrelated i.e. the sample and the sample buffer. The application of a two-component model is required so that the true width of the sample can be extracted.
(64)
(65) As for any mixture of two components, there are four unknown parameters: A.sub.1, A.sub.2, σ.sub.1 and σ.sub.2 and thus four equations are required in order to estimate the initial guesses for the fitting algorithm. The first equation can be obtained from the value of the absorbance x at the dip in the trace i.e. at t˜t.sub.r.
x=A.sub.1+A.sub.2 (Equation 12)
(66) The second and third equations are obtained from the maximum absolute value w of the differential of the profile. For a single component fit, this has an absolute value of:
(67)
(68) For a profile displaying a buffer mismatch, w is dominated by the sample component and hence can be approximated by:
(69)
(70) The location of the maximum absolute value of the single component fit is given by:
t′=t.sub.r±σ.sub.1 (Equation 33)
(71) Since the first component is dominant, the value of t′ from Equation 33 approximates the time of the maximum value of w for the two component Taylorgram data.
(72) At the dip in the trace (corresponding with t=t.sub.r), the second differential 105 is non-zero and positive as shown in
(73)
(74) Both A.sub.1 and σ.sub.1 are determinable using only Equation 30 and Equation 31. Once estimates for these parameters have been obtained, it is straightforward to obtain estimates for A.sub.2 and σ.sub.2 from Equation 12 and Equation 13 by direct substitution.
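The estimation sequence for the buffer mismatch case can be sketched as follows. The expressions for w and for the second differential are not reproduced in this text, so the sketch assumes Gaussian components: a single Gaussian of amplitude A.sub.1 and width σ.sub.1 has a maximum absolute slope of A.sub.1e.sup.−1/2/σ.sub.1 located at t.sub.r±σ.sub.1 (in agreement with Equation 33), and the second differential of a two-component Gaussian sum at t.sub.r is −A.sub.1/σ.sub.1.sup.2−A.sub.2/σ.sub.2.sup.2. The function name and the numerical inputs are hypothetical.

```python
import math

def buffer_mismatch_estimates(x, y, w, t_prime, t_r):
    """Estimate (A1, sigma1, A2, sigma2) for a dominant sample component
    plus a smaller buffer-mismatch component of opposite sign.

    x: value of the data at t_r
    y: second differential of the data at t_r
    w: maximum absolute value of the first differential of the data
    t_prime: time at which that maximum occurs
    """
    sigma1 = abs(t_prime - t_r)           # location of steepest slope
    a1 = w * sigma1 * math.exp(0.5)       # invert w = A1 * exp(-1/2) / sigma1
    a2 = x - a1                           # Equation 12: x = A1 + A2
    denom = y + a1 / sigma1 ** 2          # equals -A2 / sigma2**2
    ratio = -a2 / denom
    if ratio <= 0.0:
        raise ValueError("no physical solution for sigma2 with these inputs")
    return a1, sigma1, a2, math.sqrt(ratio)

# Hypothetical inputs built from A1 = 1, sigma1 = 10, A2 = -0.1, sigma2 = 2.
t_r = 100.0
w = 1.0 * math.exp(-0.5) / 10.0
estimates = buffer_mismatch_estimates(x=0.9, y=0.015, w=w, t_prime=110.0, t_r=t_r)
```

In practice w and t′ are measured on the two-component data, so the recovered values are approximate and serve as starting points for regression.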
(75) Referring to
(76) Two or Three Components with Known Size Ratios
(77) In this case, there is only one unknown width since the other one (or two in the case of three components) is related by a ratio known a priori. For the case of three components, there are three unknown amplitudes and one unknown size or width. Hence, with the corresponding four equations (cf. Equations 8-11), the four unknowns (A.sub.1, A.sub.2, A.sub.3 and σ.sub.1) can be solved for:
(78)
where a and b are the known size ratios of the second and third components to the first component respectively.
(79) These simultaneous equations can be reduced to a single quartic equation in σ.sub.1 which can be solved using well known methods. From this solution, estimates can then be made for A.sub.1, A.sub.2 and A.sub.3 by substitution. A similar solution can be obtained for a mixture of two components by solving any three of the four corresponding equations.
(80) Two, Three or Four Components with Known Sizes
(81) In this case, since the sizes (and hence the widths) are known, the only unknowns are the amplitudes. Hence, for the general case, the four equations (8-11) can be solved for the n unknowns (A.sub.n) by reducing them to a matrix equation. The following is the equation applicable to n=4 components:
(82)
where σ.sub.1, σ.sub.2, σ.sub.3 and σ.sub.4 are known a priori (determined from Equation 4). This can then be solved by well-known matrix methods to obtain initial estimates for A.sub.1, A.sub.2, A.sub.3, and A.sub.4. Note that for two- (and three-) component mixtures a 2 by 2 (and 3 by 3) matrix constructed from any two (and three) of the four equations can be used to obtain initial estimates.
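For a two-component mixture, a 2 by 2 system can be built from any two of the four equations. The sketch below chooses Equation 8 (x=A.sub.1+A.sub.2) and Equation 11 (u=A.sub.1σ.sub.1.sup.2+A.sub.2σ.sub.2.sup.2) and solves the system by Cramer's rule. The choice of this particular pair, the function name and the numerical values are illustrative assumptions.

```python
def amplitudes_from_known_widths(x, u, sigma1, sigma2):
    """Solve [[1, 1], [sigma1**2, sigma2**2]] @ [A1, A2] = [x, u]."""
    det = sigma2 ** 2 - sigma1 ** 2
    if det == 0.0:
        raise ValueError("the two widths must differ")
    a1 = (x * sigma2 ** 2 - u) / det
    a2 = (u - x * sigma1 ** 2) / det
    return a1, a2

# Hypothetical measured values generated from A1 = 0.9, A2 = 0.1.
sigma1, sigma2 = 5.0, 8.0
x = 0.9 + 0.1
u = 0.9 * sigma1 ** 2 + 0.1 * sigma2 ** 2

a1, a2 = amplitudes_from_known_widths(x, u, sigma1, sigma2)
```

When the two widths are close together the determinant is small, so the amplitude estimates become sensitive to noise in x and u.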
(83) Apparatus
(84) Referring to
(85) Referring to
(86) Windows W1, W2 are spaced apart along the length of the capillary 2 between the containers V1, V2. The capillary 2 may be formed in a loop so that both windows W1, W2 may be imaged using a single optical assembly, for instance by arranging for them to be adjacent to one another in an area imaged by the pixel array of an area imaging detector 6. In other embodiments, a single window may be used, or the detector 6 may comprise a single element, rather than a pixel array.
(87) To inject a plug of the sample A into the capillary 2, the third container V3 may be connected to the capillary 2 and then disconnected after a suitable volume of the sample A has been injected under pressure. The second container V2 is connected to the capillary 2 when the third container V3 is disconnected from the capillary 2. The detector 6 captures a frame sequence comprising measures of the received light intensity at the detector 6 as the pulse of sample solution 4 or the flow front passes through each window W1, W2. The detector output thereby provides data on absorbance versus time: a Taylorgram.
(88)
EXAMPLES
(89) Two-Component Mixtures
(90) Three different combinations of two-component mixtures were prepared and analysed with the Malvern Instruments® Viscosizer®. They were prepared from caffeine (R.sub.h˜0.332 nm), BSA (R.sub.h˜3.8 nm), Myoglobin (R.sub.h˜2.1 nm) and IgG (R.sub.h˜5.8 nm) dissolved in a PBS buffer solution. Multi-component models were fitted to the Taylorgram data obtained from the analyses, according to: i) the prior art, using least squares regression analysis based on pseudo-random initial parameters; and ii) using least squares regression analysis, starting from parameter estimates generated in accordance with an embodiment. In each case, the models were generated without a priori knowledge of: the relative concentrations of the two components in the sample, and the hydrodynamic radius of the two components in the sample.
(91) The mixtures were: 1. Caffeine and BSA (shown in
(92) The instrument includes two measurement locations, which have two different corresponding residence times, and hence two sets of Taylorgram data per analysis are generated. Models can be fitted independently to the output from either sensor location, or based on minimising the errors from both sets of data based on common model parameters.
(93)
(94)
(95)
(96) Table 1 below illustrates the accuracy obtained by using parameter estimates in accordance with an embodiment.
(97) TABLE 1 Results for the hydrodynamic radii obtained from the two-component fits
Mixture (3 measurements of each) | Average Rh of 1st component/nm | Average Rh of 2nd component/nm
Caffeine (0.33 nm), BSA (~3.6 nm) | 0.34 | 3.6
BSA (~3.6 nm), Myoglobin (~2.1 nm) | 3.7 | 2.1
Myoglobin (~2.1 nm), IgG (~5.7 nm) | 2.0 | 5.8
(98) The hydrodynamic radii determined based on the parameter estimates are in good agreement with the known hydrodynamic radii of the mixtures.
(99) Buffer Mismatch with Unknown Sample
(100)
(101) Model parameters were estimated in accordance with an embodiment, and used as a starting point for a least squares regression analysis to fit model parameters to the data. The first component 603, 605 of each fitted model (corresponding with the IgG component of the sample), the second component 604, 606 of each fitting model (corresponding with the buffer mismatch) and the combined models 607, 608 are shown in
(102) TABLE 2 Results for the hydrodynamic radii obtained from fits to buffer mismatches
Sample and buffer (4 measurements) | Average Rh (IgG)/nm | Average Rh (Buffer)/nm
IgG (~5.7 nm) in PBS buffer | 5.8 | 0.24
(103) The analysis was performed without a priori knowledge of the IgG component or the buffer mismatch.
(104) A Mixture of Two Components with Known Sizes
(105) When dissolved in a PBS buffer solution, the hydrodynamic radii of immunoglobulin (IgG) monomers and dimers are 5.2 nm and 7 nm respectively. In order to determine the relative proportions of the oligomers in a sample of IgG, three Taylorgrams were obtained and fitted with initial parameter estimates generated by fixing the component sizes, and hence their pseudo-Gaussian widths, in accordance with an embodiment. The areas under each component fit were then calculated to obtain the relative proportions of the monomer and dimer in the sample.
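The area calculation can be sketched as follows, assuming each fitted component is a Gaussian of amplitude A.sub.i and width σ.sub.i, so that its area is √(2π)A.sub.iσ.sub.i. The optional division by an extinction coefficient reflects the conversion from detector reading to concentration described earlier. The function name and the numerical values are hypothetical and are not the fitted values from this example.

```python
import math

def component_proportions(amplitudes, widths, extinction=None):
    """Percentage of each component, from the area under its Gaussian."""
    if extinction is None:
        # Assume equal extinction coefficients for all components.
        extinction = [1.0] * len(amplitudes)
    areas = [
        math.sqrt(2.0 * math.pi) * a * s / e
        for a, s, e in zip(amplitudes, widths, extinction)
    ]
    total = sum(areas)
    return [100.0 * area / total for area in areas]

# Illustrative values only: areas are in the ratio 6 : 2.
proportions = component_proportions([2.0, 1.0], [3.0, 2.0])
```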
(106)
(107) IgG (3 measurements) | Average monomer proportion/% | Average dimer proportion/%
Monomer (5.2 nm), dimer (7 nm) | 89 | 11
(108) For comparison, a fit obtained with no predetermined estimates (i.e. pseudo-random) is shown in
(109) Embodiments of the invention have been described that solve a number of important problems in sample characterisation. Embodiments of the invention make Taylor dispersion analysis a more viable option for characterisation of aggregates in formulations (e.g. biopharmaceutical formulations).
(110) Although the examples have generally been illustrated with respect to a pulse Taylorgram, the skilled person will appreciate that a similar approach can be applied to the case where the Taylorgram mode corresponds with a slug Taylorgram (as described by Equation 3). The applicable equations for a slug Taylorgram are more lengthy, but are straightforward to generate, for instance using software such as Wolfram Mathematica.
(111) A number of variations are possible, within the scope of the invention, as defined by the appended claims.