Methods and computer program products for compression of sequencing data
09864846 · 2018-01-09
Assignee
Inventors
- Charles Sugnet (San Francisco, CA, US)
- Simon Cawley (Oakland, CA)
- Mohit Gupta (San Mateo, CA, US)
- Iztok Marjanovic (Sunnyvale, CA, US)
- Mark BEAUCHEMIN (S. Glastonbury, CT, US)
- Todd Rearick (Cheshire, CT, US)
Cpc classification
H03M7/30
ELECTRICITY
G16C20/10
PHYSICS
International classification
H03M7/30
ELECTRICITY
Abstract
A compression method includes: measuring a waveform associated with a chemical event occurring on a sensor array, wherein the waveform comprises a plurality of measured values and the chemical event is indicative of a number of nucleotide incorporations in a genetic sequencing reaction; applying a first compression process to the waveform, the first compression process including a truncating of data corresponding to a portion of the waveform that is not related to nucleotide incorporations in the genetic sequencing reaction; and applying a second compression process to the waveform, the second compression process including a data substitution process that replaces at least a portion of the waveform with a plurality of coefficients representative of the portion of the waveform.
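The two stages named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, the use of NumPy's SVD to obtain principal component vectors, and the choice to discard the samples after a given cut-off index are all assumptions made for illustration. The default of six components follows the "5 or 6" range given in the claims.

```python
import numpy as np

def compress_traces(traces, cutoff, n_components=6):
    """Two-stage sketch: truncate, then substitute PCA coefficients.

    traces: array of shape (n_wells, n_samples), one waveform per sensor.
    cutoff: sample index; samples at or beyond it are discarded
            (assumed here to be the part unrelated to incorporation).
    """
    kept = traces[:, :cutoff]                 # stage 1: truncation
    mean = kept.mean(axis=0)
    centered = kept - mean
    # Rows of vt are the principal component vectors of the kept samples.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]
    coeffs = centered @ basis.T               # stage 2: data substitution
    return coeffs, basis, mean

def reconstruct(coeffs, basis, mean):
    """Approximate the truncated waveforms from the stored coefficients."""
    return coeffs @ basis + mean
```

In this sketch each sensor stores `n_components` coefficients in place of `cutoff` samples, while the basis and mean are shared by all sensors in the group.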
Claims
1. A compression method, comprising: measuring a waveform associated with a chemical event occurring on a sensor array, wherein the waveform comprises a plurality of measured values and the chemical event is indicative of a number of nucleotide incorporations in a genetic sequencing reaction; and applying a first compression process to the waveform using a processor, the first compression process including a truncating of data corresponding to a portion of the waveform that is not related to nucleotide incorporations in the genetic sequencing reaction thereby forming a compressed data structure and storing the compressed data structure in a memory.
2. The method of claim 1, wherein the truncating of data comprises determining, for each of a plurality of sensors in the sensor array, a cut-off time point for the waveform for that sensor defining a data range to be truncated.
3. The method of claim 2, wherein each cut-off time point is determined by mining a plurality of past analysis runs for a given sensor array geometry.
4. The method of claim 2, wherein each cut-off time point is determined prior to every run or during a calibration procedure.
5. The method of claim 2, wherein each cut-off time point is factory pre-determined for a given sensor array geometry.
6. The method of claim 2, wherein the sensors are arranged in a plurality of regions in the sensor array, and wherein each cut-off time point for sensors in a given region is determined to have a common cut-off time point determined for sensors for that region.
7. The method of claim 6, wherein each common cut-off time point is determined by finding a best fit to a linear hinge model on a median trace for a region.
8. The method of claim 6, wherein each common cut-off time point is determined empirically and depends on a position of that region relative to other regions along a fluidic flow of nucleotides onto the sensor array.
9. The method of claim 8, wherein the cut-off time point of sensors in a region substantially near a fluidic inlet is different from the cut-off time point of sensors in a region substantially near a fluidic outlet.
10. The method of claim 1, further comprising: applying a second compression process to the waveform using the processor, the second compression process including a data substitution process that replaces at least a second portion of the waveform with a plurality of coefficients representative of the second portion of the waveform, wherein the second portion is related to nucleotide incorporations in the genetic sequencing reaction.
11. The method of claim 10, wherein the second compression process includes replacing the second portion of the waveform with a plurality of coefficients of a linear combination of one or more principal component vectors representative of the second portion of the waveform.
12. The method of claim 11, wherein the second compression process includes storing the plurality of coefficients compactly in the memory by dynamically truncating the coefficients to a lower precision and encoding the truncated coefficients using a Huffman code.
13. The method of claim 11, wherein the second compression process includes replacing the second portion of the waveform with a plurality of coefficients of a linear combination of between about 5 and about 10 principal component vectors representative of the second portion of the waveform.
14. The method of claim 11, wherein the second compression process includes replacing the second portion of the waveform with a plurality of coefficients of a linear combination of 5 or 6 principal component vectors representative of the second portion of the waveform.
15. The method of claim 1, wherein the measuring the waveform comprises measuring the waveform of a dynamic response of an ion-sensitive field effect transistor (ISFET) array to a change in ionic strength of an analyte solution in fluid contact with the ISFET array, wherein the measuring the waveform of the dynamic response of the ISFET array comprises associating a portion of the waveform to a stepwise increase in ion concentration in the analyte solution and associating another portion of the waveform to at least one portion of the dynamic response outside of the stepwise increase in ion concentration.
16. A computer program product comprising a non-transitory computer-usable medium having computer program logic recorded thereon that, when executed by one or more processors, samples and compresses data from a sensor array, the computer program logic comprising: first computer readable program code that enables a processor to measure a waveform associated with a chemical event occurring on a sensor array, wherein the waveform comprises a plurality of measured values and the chemical event is indicative of a number of nucleotide incorporations in a genetic sequencing reaction; and second computer readable program code that enables a processor to apply a first compression process to the waveform, the first compression process including a truncating of data corresponding to a portion of the waveform that is not related to nucleotide incorporations in the genetic sequencing reaction thereby forming a compressed data structure and storing the compressed data structure in a memory.
17. The computer program product of claim 16, further comprising third computer readable program code that enables a processor to apply a second compression process to the waveform, the second compression process including replacing at least a second portion of the waveform with a plurality of coefficients representative of the second portion of the waveform, wherein the second portion of the waveform is related to nucleotide incorporations in the genetic sequencing reaction.
18. A method for compressing nucleic acid sequencing data, comprising: obtaining raw data from a semiconductor-based sequencing sensor array comprising a plurality of sensors during a data acquisition time period, the raw data comprising at least a non-informative portion corresponding to a subinterval of the data acquisition time period having a location within the data acquisition time period that varies for different sensors according to a position of the sensor in the sensor array; and transforming the raw data into compressed data using a lossy compression process including a data truncation process, the data truncation process being related for each sensor to the position of the sensor in the sensor array and configured to discard the non-informative portion of the raw data thereby forming a compressed data structure and storing the compressed data structure in a memory.
19. The method of claim 18, wherein the lossy compression process further comprises a data substitution process adapted to replace at least a portion of the raw data for each sensor with a plurality of coefficients of a linear combination of one or more principal component vectors representative of the portion of the raw data for each sensor.
20. The method of claim 19, wherein the data substitution process comprises storing the plurality of coefficients compactly in the memory by dynamically truncating the coefficients to a lower precision and encoding the truncated coefficients using a Huffman code.
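Two steps recited in the claims can be sketched in Python: finding a common cut-off by fitting a hinge to a region's median trace (claim 7), and storing coefficients at reduced precision with a Huffman code (claims 12 and 20). This is an illustrative sketch only. The hinge form y = a + b * max(0, t - t0), the brute-force search over breakpoints, the fixed quantization step (the claims say the truncation is dynamic but do not specify how), and all function names are assumptions.

```python
import heapq
from collections import Counter
import numpy as np

def hinge_cutoff(median_trace):
    """Breakpoint t0 minimising squared error of y ~ a + b*max(0, t - t0)."""
    y = np.asarray(median_trace, dtype=float)
    t = np.arange(len(y))
    best_t0, best_err = 0, np.inf
    for t0 in range(1, len(y) - 1):
        design = np.column_stack([np.ones_like(y), np.maximum(0.0, t - t0)])
        sol, *_ = np.linalg.lstsq(design, y, rcond=None)
        err = float(np.sum((design @ sol - y) ** 2))
        if err < best_err:
            best_t0, best_err = t0, err
    return best_t0

def quantize(coeffs, step):
    """Reduce precision by rounding to integer multiples of `step`."""
    return np.round(np.asarray(coeffs, dtype=float) / step).astype(int)

def huffman_code(symbols):
    """Build a prefix code {symbol: bitstring} from symbol frequencies."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: partial code}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w0, _, c0 = heapq.heappop(heap)
        w1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (w0 + w1, tie, merged))
        tie += 1
    return heap[0][2]

def encode(symbols, code):
    """Concatenate the code words for a sequence of symbols."""
    return "".join(code[s] for s in symbols)
```

Frequent quantized values receive short code words, so coarser quantization (fewer distinct values) generally shortens the encoded stream at the cost of reconstruction accuracy.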
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) In the drawings, like reference characters generally refer to the same parts throughout the different views. Also, the drawings are not necessarily to scale, emphasis instead being placed upon generally illustrating the various concepts discussed herein.
(57) FIG. 42F1 is a diagrammatic illustration of an example of a ceiling baffle arrangement for a flow cell in which fluid is introduced at one corner of the chip and exits at a diagonal corner, the baffle arrangement facilitating a desired fluid flow across the array.
(58) FIGS. 42F2-42F8 comprise a set of illustrations of an exemplary flow cell member that may be manufactured by injection molding and may incorporate baffles to facilitate fluid flow, as well as a metalized surface for serving as a reference electrode, including an illustration of said member mounted to a sensor array package over a sensor array, to form a flow chamber thereover.
DETAILED DESCRIPTION
(130) Following below are more detailed descriptions of various concepts related to, and embodiments of, inventive methods and apparatus relating to large scale chemFET arrays for analyte detection and/or measurement. It should be appreciated that various concepts introduced above and discussed in greater detail below may be implemented in any of numerous ways, as the disclosed concepts are not limited to any particular manner of implementation. Examples of specific implementations and applications are provided primarily for illustrative purposes.
(131) Various inventive embodiments according to the present disclosure are directed at least in part to a semiconductor-based/microfluidic hybrid system that combines the power of microelectronics with the biocompatibility of a microfluidic system. In some examples below, the microelectronics portion of the hybrid system is implemented in CMOS technology for purposes of illustration. It should be appreciated, however, that the disclosure is not intended to be limiting in this respect, as other semiconductor-based technologies may be utilized to implement various aspects of the microelectronics portion of the systems discussed herein.
(132) One embodiment disclosed herein is directed to a large sensor array (e.g., a two-dimensional array) of chemically-sensitive field effect transistors (chemFETs). In related embodiments, the individual chemFET sensor elements or pixels of the array are configured to detect analyte presence (or absence), analyte levels (or amounts), and/or analyte concentration in a sample such as an unmanipulated sample, or as a result of chemical and/or biological processes (e.g., chemical reactions, cell cultures, neural activity, nucleic acid sequencing reactions, etc.) occurring in proximity to the array. Examples of chemFETs contemplated by various embodiments discussed in greater detail below include, but are not limited to, ion-sensitive field effect transistors (ISFETs) and enzyme-sensitive field effect transistors (EnFETs). In one exemplary implementation, one or more microfluidic structures is/are fabricated above the chemFET sensor array to provide for containment and/or confinement of a biological or chemical reaction in which an analyte of interest may be captured, produced, or consumed, as the case may be. For example, in one implementation, the microfluidic structure(s) may be configured as one or more wells (or microwells, or reaction chambers, or reaction wells as the terms are used interchangeably herein) disposed above one or more sensors of the array, such that the one or more sensors over which a given well is disposed detect and measure analyte presence, level, and/or concentration in the given well. Preferably, there is a 1:1 correspondence of chemFET sensors and reaction wells.
(133) In another exemplary implementation, the invention encompasses a system comprising at least one two-dimensional array of reaction chambers, wherein each reaction chamber is coupled to a chemically-sensitive field effect transistor (chemFET) and each reaction chamber is no greater than 10³ μm³ (i.e., 1 pL) in volume. Preferably, each reaction chamber is no greater than 0.34 pL, and more preferably no greater than 0.096 pL or even 0.012 pL in volume. A reaction chamber can optionally be 2², 3², 4², 5², 6², 7², 8², 9² or 10² square microns in cross-sectional area at the top. Preferably, the array has at least 10², 10³, 10⁴, 10⁵, 10⁶, 10⁷, 10⁸, 10⁹, or more reaction chambers. The reaction chambers may be capacitively coupled to the chemFETs, and preferably are capacitively coupled to the chemFETs. Such systems may be used for high-throughput sequencing of nucleic acids.
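The volume figures above rest on the unit identity 1 pL = 1,000 cubic microns. The short Python sketch below works through that arithmetic for a straight-walled well. The function name and the example depth are hypothetical, and real wells need not have vertical walls.

```python
UM3_PER_PL = 1_000.0  # 1 pL = 1e-12 L = 1e-15 m^3 = 1e3 cubic microns

def well_volume_pl(top_area_um2, depth_um):
    """Volume in picolitres of an idealised straight-walled well."""
    return top_area_um2 * depth_um / UM3_PER_PL

# A 10 x 10 micron opening with an assumed 10 micron depth holds 1 pL.
print(well_volume_pl(top_area_um2=10**2, depth_um=10))
```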
(134) As used herein, an array is a planar arrangement of elements such as sensors or wells. The array may be one or two dimensional. A one dimensional array is an array having one column (or row) of elements in the first dimension and a plurality of columns (or rows) in the second dimension. An example of a one dimensional array is a 1×5 array. A two dimensional array is an array having a plurality of columns (or rows) in both the first and the second dimensions. The number of columns (or rows) in the first and second dimensions may or may not be the same. An example of a two dimensional array is a 5×10 array.
(135) In some embodiments, such a chemFET array/microfluidics hybrid structure may be used to analyze solution(s)/material(s) of interest containing nucleic acids. For example, such structures may be employed to sequence nucleic acids. Sequencing of nucleic acids may be performed to determine partial or complete nucleotide sequence of a nucleic acid, to detect the presence and in some instances nature of a mutation such as but not limited to a single nucleotide polymorphism in a nucleic acid, to identify source of a cell(s) or nucleic acid for example for forensic purposes, to detect abnormal cells such as cancer cells in the body optionally in the absence of detectable tumor masses, to identify pathogens in a sample such as a bodily sample for example for diagnostic and/or therapeutic purposes, to identify antibiotic resistant strains of pathogens in order to avoid unnecessary (and ineffective) therapeutic regimens, to determine what therapeutic regimen will be most effective to treat a subject having a particular condition as can be determined by the subject's genetic make-up (e.g., personalized medicine), to determine and compare nucleic acid expression profiles of two or more states (e.g., comparing expression profiles of diseased and normal tissue, or comparing expression profiles of untreated tissue and tissue treated with drug, enzymes, radiation or chemical treatment), to haplotype a sample (e.g., comparing genes or variations in genes on each of the two alleles present in a human subject), to karyotype a sample (e.g., analyzing chromosomal make-up of a cell or a tissue such as an embryo, to detect gross chromosomal or other genomic abnormalities), and to genotype (e.g., analyzing one or more genetic loci to determine for example carrier status and/or species-genus relationships).
(136) The systems described herein can be utilized to sequence the nucleic acids of an entire genome, or any portion thereof. Genomes that can be sequenced include mammalian genomes, and preferably human genomes. Other genomes that can be sequenced include bacterial, viral, fungal and parasitic genomes. Such sequencing may lead to the identification of mutations that give rise to drug resistance, or general evolutionary drift from known species. This latter aspect is useful in determining for example whether a prior therapeutic (such as a vaccine) may be effective against current infecting strains. A specific example is the detection of new influenza strains and a determination of whether a prior year's vaccine cocktail will be effective against a new flu outbreak.
(137) Thus the methods of the invention may be embraced by methods for detecting a nucleic acid in a sample. The nucleic acid may be a marker of its source such as a pathogen including but not limited to a virus, or a cancer or tumor in an individual. In the latter aspect, a sample such as a blood sample may be harvested from a subject and screened for the presence of an occult cancer cell such as one that has extravasated from its original tumor site. In yet another aspect, the methods may be used for forensic purposes in which samples are screened for the presence of a known nucleic acid (e.g., from a suspect or from a law enforcement DNA bank). In related aspects, a sample may be analyzed for nucleic acid heterogeneity in order to determine whether a sample is derived from one source (e.g., a single subject) or more than one source (e.g., a contaminated sample). The methods described herein may also be used to detect the presence of a nucleotide mutation such as but not limited to a single nucleotide polymorphism. Such mutation analysis or screening is typically performed in prenatal or postnatal diagnostics. The nature of the mutation can then be used to determine the most suitable course of therapy, in some instances. Thus the invention intends that any of the methods provided herein can be used in one or more diagnostic, forensic and/or therapeutic methods.
(138) Various aspects of the invention employ a sequencing-by-synthesis approach for sequencing nucleic acids. This approach involves the synthesis of a new nucleic acid strand using a template nucleic acid. The template strand may be primed intermolecularly by hybridizing a sequencing primer to it at one end, or intramolecularly by folding over on itself at one end. The template strand may also be primed by introducing a break or a nick in one strand of a double-stranded nucleic acid, preferably but not exclusively near an end, as described in greater detail herein. In these embodiments, known nucleotides are incorporated into the primer based on complementarity with the template. The method requires that nucleotides be contacted with the primer (and thus template) (in the presence of polymerase and any other factors required for incorporation) in a selective manner.
(139) In many embodiments, each nucleotide type is individually contacted with the primer and/or template. In other embodiments, combinations of two or three types of nucleotides may be contacted with the primer and/or template simultaneously. Since the identity of the nucleotides in contact with the primer and/or template at any given time is known, the identity of the incorporated nucleotides (if incorporated) is also known. And based on the necessary complementarity with the template, the sequence of the template can also be deduced. Using different types of nucleotides separately (e.g., using dATP, dCTP, dGTP or dTTP separately from each other), a high resolution sequence can be obtained. Using combinations (or mixtures) of nucleotides (e.g., using dATP, dCTP and dGTP together and separately from dTTP), a lower resolution sequence can be obtained that is nevertheless valuable for certain applications (e.g., ordering and aligning various higher resolution sequences). With regard to the latter embodiment, it will be understood that the invention contemplates using mixtures of any three nucleotides, and in some cases any two nucleotides, and not just the specific combinations recited above.
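The deduction described above, for the case where each nucleotide type is flowed separately, can be sketched in Python as follows. The function name, the example flow order, and the assumption that the number of incorporations per flow is already known are illustrative and not taken from the disclosure. Both strands are written 5' to 3'.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def decode_flows(flow_order, incorporations):
    """Return (synthesized strand, template region) from per-flow counts.

    flow_order: the nucleotide delivered in each flow, e.g. "TACGTACG".
    incorporations: number of bases incorporated in each flow (0 if none).
    """
    if len(flow_order) != len(incorporations):
        raise ValueError("one incorporation count is needed per flow")
    synthesized = "".join(b * n for b, n in zip(flow_order, incorporations))
    # The template is the reverse complement of the newly made strand.
    template = "".join(COMPLEMENT[b] for b in reversed(synthesized))
    return synthesized, template

print(decode_flows("TACGTACG", [1, 0, 2, 1, 0, 1, 0, 0]))
```

A count greater than one corresponds to a run of identical bases incorporated during a single flow.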
(140) The nucleotides (or nucleotide triphosphates or deoxyribonucleotides or dNTPs, as they are referred to herein interchangeably) need not be and typically are not extrinsically labeled. Thus, naturally occurring nucleotides (i.e., nucleotides identical to those that exist in vivo naturally) or their synthetic counterparts are suitable for use in the methods of the invention. Such nucleotides may be referred to herein as being unlabeled.
(141) Preferably, the nucleotides are delivered at substantially the same time to each template. Polymerase(s) are preferably already present, although they also may be introduced along with the nucleotides. The polymerases may be immobilized or may be free flowing. Once the nucleotides are incorporated (if complementarity exists) and any associated signal is detected, an enzyme, such as apyrase, is typically delivered to degrade any unused nucleotides, followed by a washing step to remove substantially all of the enzyme as well as any other remaining and undesirable components. The reaction may occur in a reaction chamber in some embodiments, while in others it may occur in the absence of reaction chambers. In these latter embodiments, the sensor surface may be continuous without any physical divider between sensors.
(142) In important embodiments, the sequencing reaction is performed simultaneously on a plurality of identical templates in a reaction chamber, and optionally in a plurality of reaction chambers. Sequencing a different template in each reaction chamber allows a greater amount of sequence data to be obtained in any given run. Thus, using as many reaction chambers (and sensors) as possible in a given run also maximizes the amount of sequence data that can be obtained in any given run. In important embodiments, the templates in a reaction well are immobilized (e.g., covalently or non-covalently) onto and/or in a bead, referred to herein as a capture bead, or onto a solid support such as the chemFET surface.
(143) It is to be understood that in this and other embodiments and aspects of the invention, a plurality may represent a subset of elements rather than the entirety of all elements. As an example, in the above embodiment, the plurality of templates in the reaction chamber that are sequenced may represent a subset or all of the templates in the reaction chamber. Thus this particular embodiment requires that at least two templates be sequenced, and it does not require that all the templates present in the reaction chamber be sequenced.
(144) As described extensively herein, in some embodiments, nucleotide incorporation is detected through byproducts of the incorporation or by changes in charge to the newly synthesized nucleic acid, especially where it is immobilized on a chemFET surface, rather than by detecting the incorporated nucleotide itself. More specifically, some embodiments exploit the release of inorganic pyrophosphate (PPi), inorganic phosphate (Pi), and hydrogen ions (all of which are considered sequencing reaction byproducts) that occurs following incorporation of a nucleotide into a nucleic acid (such as a primer, for example). In some embodiments of the invention, the method detects the released hydrogen ions as an indication of nucleotide incorporation. The chemFETs (and chemFET arrays) described herein are suited to the detection of these ions as well as other sequencing reaction byproducts. It is to be understood that the aspects and embodiments described herein related to chemFETs equally contemplate and embrace ISFETs unless otherwise stated.
(145) The invention includes methods for improving detection of the hydrogen ions by the chemFET. These methods include generating and/or detecting more hydrogen ions in a given sequencing reaction. This can be done by increasing the number of templates per reaction chamber, increasing the number of templates attached to each capture bead, increasing the number of templates being sequenced per reaction chamber, increasing the number of templates bound to the sensor surface, increasing the stability of the primer/template hybrid, increasing the processivity of the polymerase, and/or combining nucleotide incorporation with nucleotide excision (e.g., performing the sequencing-by-synthesis reaction in the context of a nick translation reaction), among other things. Another alternative or additional approach is to increase the number of released hydrogen ions that are actually detected by the chemFET. This can be done by preventing the released hydrogen ions from interacting with other components in the reaction well including any components with buffering potential. These embodiments include using buffering inhibitors (as described more fully herein) to saturate components that might otherwise sequester released hydrogen ions. Buffering inhibitors may be short RNA oligomers that bind to single stranded regions of the templates, or chemical compounds that interact with the materials comprised in the reaction chambers and/or chemFETs themselves.
(146) Some aspects and embodiments presented herein involve dense chemFET arrays and reaction chamber arrays. It will be apparent that as arrays become more dense, area and/or volume of individual elements (e.g., sensor surfaces and reaction chambers) will typically become smaller in order to accommodate a greater number of sensors or reaction chambers without a concomitant (or significant) increase in total array area. However, it has been determined in accordance with an aspect of the invention that as volume of a reaction chamber decreases, the signal to noise ratio can actually increase due to an increased nucleic acid concentration. For example, it has been determined that a roughly 2.3 fold decrease in reaction chamber volume can yield about a 1.5 fold increase in signal to noise ratio. This increase can occur even if the total number of nucleic acids being sequenced is reduced. Thus, in some instances rather than losing signal by moving to more dense arrays, the invention contemplates a greater signal due to an increased concentration of nucleic acids in the smaller volume reaction chambers.
(147) The invention also contemplates sequencing-by-synthesis methods that detect nucleotide incorporation events based on changes in charge at the chemFET surface due to a change in charge of a moiety attached to the surface, such as a nucleic acid or a nucleic acid complex (e.g., a template/primer hybrid). Such methods include those that use or extend nucleic acids that are immobilized (e.g., covalently) to the surface of a chemFET. Nucleotide incorporation into a nucleic acid that is bound to a chemFET surface typically results in an increase in the negative charge of the bound nucleic acid or the complex in which it is present (e.g., a template/primer hybrid). In some instances, the primer will be bound to the chemFET surface while in other instances the template will be bound to the chemFET surface. In such instances, a plurality of identical, typically physically separate, nucleic acids are immobilized to individual chemFET surfaces and sequencing-by-synthesis reactions are performed on the plurality simultaneously and synchronously. In some embodiments, the nucleic acids are not concatemers and rather each will include only a single copy of the nucleic acid to be sequenced.
(148) It will be understood that the sequencing methods provided herein can be used to sequence a genome or part thereof. As an example, such a method may include delivering fragmented nucleic acids from the genome or part thereof to a system for high-throughput sequencing comprising at least one array of reaction chambers, wherein each reaction chamber is coupled to a chemFET, and detecting a sequencing reaction in a reaction chamber via a signal from the chemFET coupled with the reaction chamber. Alternatively, the method may include delivering fragmented nucleic acids from the genome or part thereof to a sequencing apparatus comprising an array of reaction chambers, wherein each of the reaction chambers is disposed in a sensing relationship with an individual associated chemFET, and detecting a sequencing reaction in a reaction chamber via a signal from its associated chemFET. Typically, all four nucleotides are flowed into the same reaction chamber, either individually (or separately) or as some mixture of less than all four nucleotides, in an ordered and known manner.
(149) The methods provided herein may allow for at least 10³, preferably at least 10⁴, more preferably at least 10⁵, and even more preferably at least 10⁶ bases to be determined (or sequenced) per hour. In even more preferred embodiments, at least 10⁷ bases, at least 10⁸ bases, at least 10⁹ bases, or at least 10¹⁰ bases are sequenced per hour using the methods and arrays discussed herein. Thus, the methods may be used to sequence an entire human genome within about 24 hours, more preferably within about 20 hours, even more preferably within about 15 hours, even more preferably within about 10 hours, even more preferably within about 5 hours, and most preferably within about 1 hour.
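The relationship between the hourly throughput figures and the genome sequencing times can be checked with a short Python calculation. The genome size of roughly 3.1 billion bases and the single-pass coverage are assumptions added for illustration; neither figure is stated in the disclosure, and practical sequencing typically reads each position more than once.

```python
HUMAN_GENOME_BASES = 3.1e9  # approximate haploid human genome size (assumed)

def hours_to_sequence(bases_per_hour, genome_bases=HUMAN_GENOME_BASES,
                      coverage=1.0):
    """Hours needed to read `coverage` passes over the genome."""
    return genome_bases * coverage / bases_per_hour

def required_rate(hours, genome_bases=HUMAN_GENOME_BASES, coverage=1.0):
    """Bases per hour needed to finish within `hours`."""
    return genome_bases * coverage / hours

for rate in (1e8, 1e9, 1e10):
    print(f"{rate:.0e} bases/hour: {hours_to_sequence(rate):.2f} hours")
print(f"24-hour target: {required_rate(24):.2e} bases/hour")
```

Under these assumptions, the 24-hour figure calls for somewhat more than 10⁸ bases per hour, and the 1-hour figure calls for a rate between 10⁹ and 10¹⁰ bases per hour.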
(150) It should be appreciated, however, that while some illustrative examples of the concepts disclosed herein focus on nucleic acid sequencing, the invention contemplates a broader application of these methods and is not intended to be limited to these examples.
(152) The system 1000 includes a semiconductor/microfluidics hybrid structure 300 comprising an ISFET sensor array 100 and a microfluidics flow cell 200. In one aspect, the flow cell 200 may comprise a number of wells (not shown in
(153) As illustrated in
(154) The flow cell 200 in the system of
(155) In the system 1000 of
(156) Various embodiments of the present invention may relate to monitoring/measurement techniques that involve the static and/or dynamic responses of an ISFET. It is to be understood that although the particular example of a nucleic acid synthesis or sequencing reaction is provided to illustrate the transient or dynamic response of a chemFET such as an ISFET, the transient or dynamic response of a chemFET such as an ISFET as discussed below may be exploited for monitoring/sensing other types of chemical and/or biological activity beyond the specific example of a nucleic acid synthesis or sequencing reaction.
(157) As noted above, the ISFET may be employed to measure steady state pH values, since in some embodiments pH change is proportional to the number of nucleotides incorporated into the newly synthesized nucleic acid strand. In other embodiments discussed in greater detail below, the FET sensor array may be particularly configured for sensitivity to other analytes that may provide relevant information about the chemical reactions of interest. An example of such a modification or configuration is the use of analyte-specific receptors to bind the analytes of interest, as discussed in greater detail herein.
(158) Via an array controller 250 (also under operation of the computer 260), the ISFET array may be controlled so as to acquire data (e.g., output signals of respective ISFETs of the array) relating to analyte detection and/or measurements, and collected data may be processed by the computer 260 to yield meaningful information associated with the processing (including sequencing) of the template nucleic acid.
(159) With respect to the ISFET array 100 of the system 1000 shown in
(160) The ISFET array 100 is not limited to any particular size, as one- or two-dimensional arrays, including but not limited to as few as two to 256 pixels (e.g., 16 by 16 pixels in a two-dimensional implementation) or as many as 54 mega-pixels (e.g., 7400 by 7400 pixels in a two-dimensional implementation) or even greater may be fabricated and employed for various chemical/biological analysis purposes pursuant to the concepts disclosed herein. In one embodiment of the exemplary system shown in
(161) More generally, a chemFET array according to various embodiments of the present disclosure may be configured for sensitivity to any one or more of a variety of analytes. In one embodiment, one or more chemFETs of an array may be particularly configured for sensitivity to one or more analytes and/or one or more binding events, and in other embodiments different chemFETs of a given array may be configured for sensitivity to different analytes. For example, in one embodiment, one or more sensors (pixels) of the array may include a first type of chemFET configured to be sensitive to a first analyte, and one or more other sensors of the array may include a second type of chemFET configured to be sensitive to a second analyte different from the first analyte. In one exemplary implementation, both a first and a second analyte may indicate a particular reaction such as for example nucleotide incorporation in a sequencing-by-synthesis method. Of course, it should be appreciated that more than two different types of chemFETs may be employed in any given array to detect and/or measure different types of analytes and/or other reactions. In general, it should be appreciated in any of the embodiments of sensor arrays discussed herein that a given sensor array may be homogeneous and include chemFETs of substantially similar or identical types to detect and/or measure a same type of analyte (e.g., hydrogen ions), or a sensor array may be heterogeneous and include chemFETs of different types to detect and/or measure different analytes. For simplicity of discussion, again the example of an ISFET is discussed below in various embodiments of sensor arrays, but the present disclosure is not limited in this respect, and several other options for analyte sensitivity are discussed in further detail below (e.g., in connection with
(162) The chemFET arrays configured for sensitivity to any one or more of a variety of analytes may be disposed in electronic chips, and each chip may be configured to perform one or more different biological reactions. The electronic chips can be connected to the portions of the above-described system which read the array output by means of pins coded in a manner such that the pins convey information to the system as to characteristics of the array and/or what kind of biological reaction(s) is(are) to be performed on the particular chip.
(163) In one embodiment, the invention encompasses an electronic chip configured for conducting biological reactions thereon, comprising one or more pins for delivering information to a circuitry identifying a characteristic of the chip and/or a type of reaction to be performed on the chip. Such reactions or applications may include, but are not limited to, nucleotide polymorphism detection, short tandem repeat detection, or general sequencing.
(164) In another embodiment, the invention encompasses a system adapted to perform more than one biological reaction on a chip, the system comprising a chip receiving module adapted for receiving the chip, and a receiver for detecting information from the electronic chip, wherein the information determines a biological reaction to be performed on the chip. Typically, the system further comprises one or more reagents to perform the selected biological reaction.
(165) In another embodiment, the invention encompasses an apparatus for sequencing a polymer template comprising at least one integrated circuit that is configured to relay information about spatial location of a reaction chamber, the type of monomer added to the spatial location, and the time required to complete reaction of a reagent comprising a plurality of the monomers with an elongating polymer.
(166) In exemplary implementations based on 0.35 micrometer CMOS processing techniques (or CMOS processing techniques capable of smaller feature sizes), each pixel of the ISFET array 100 may include an ISFET and accompanying enable/select components, and may occupy an area on a surface of the array of approximately ten micrometers by ten micrometers (i.e., 100 micrometers.sup.2) or less; stated differently, arrays having a pitch (center of pixel-to-center of pixel spacing) on the order of 10 micrometers or less may be realized. An array pitch on the order of 10 micrometers or less using a 0.35 micrometer CMOS processing technique constitutes a significant improvement in terms of size reduction with respect to prior attempts to fabricate ISFET arrays, which resulted in pixel sizes on the order of at least 12 micrometers or greater.
(167) More specifically, in some embodiments discussed further below based on the inventive concepts disclosed herein, an array pitch of approximately nine (9) micrometers allows an ISFET array including over 256,000 pixels (e.g., a 512 by 512 array), together with associated row and column select and bias/readout electronics, to be fabricated on a 7 millimeter by 7 millimeter semiconductor die, and a similar sensor array including over four million pixels (e.g., a 2048 by 2048 array) to be fabricated on a 21 millimeter by 21 millimeter die. In other examples, an array pitch of approximately 5 micrometers allows an ISFET array including approximately 1.55 Mega-pixels (i.e., a 1348 by 1152 array) and associated electronics to be fabricated on a 9 millimeter by 9 millimeter die, and an ISFET sensor array including over 14 Mega-pixels and associated electronics on a 22 millimeter by 20 millimeter die. In yet other implementations, using a CMOS fabrication process in which feature sizes of less than 0.35 micrometers are possible (e.g., 0.18 micrometer CMOS processing techniques), ISFET sensor arrays with a pitch significantly below 5 micrometers may be fabricated (e.g., array pitch of 2.6 micrometers or pixel area of less than 8 or 9 micrometers.sup.2), providing for significantly dense ISFET arrays.
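The pixel counts and die sizes quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming square pixels laid out on a uniform pitch; the helper name `array_footprint` is hypothetical and is not part of the disclosure.

```python
# Back-of-envelope check of the array sizes quoted in the text.
# Assumption (not from the source): pixels are square and tile the
# active area on a uniform pitch, so side length = count * pitch.

def array_footprint(cols, rows, pitch_um):
    """Return (pixel_count, width_mm, height_mm) of the active pixel area."""
    pixels = cols * rows
    width_mm = cols * pitch_um / 1000.0
    height_mm = rows * pitch_um / 1000.0
    return pixels, width_mm, height_mm

examples = [
    ("512 x 512 at 9 um pitch", 512, 512, 9.0),
    ("2048 x 2048 at 9 um pitch", 2048, 2048, 9.0),
    ("1348 x 1152 at 5 um pitch", 1348, 1152, 5.0),
]

for label, cols, rows, pitch in examples:
    pixels, w, h = array_footprint(cols, rows, pitch)
    print(f"{label}: {pixels} pixels, active area {w:.2f} mm x {h:.2f} mm")
```

Under these assumptions the 512 by 512 array has 262,144 pixels and an active area about 4.6 millimeters on a side, which is consistent with a 7 millimeter die that must also hold the select and bias/readout electronics.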
(168) As will be understood by those of skill in the art, the ability to miniaturize sequencing reactions reduces the time, cost and labor involved in sequencing of large genomes (such as the human genome). Of course, it should be appreciated that pixel sizes greater than 10 micrometers (e.g., on the order of approximately 20, 50, 100 micrometers or greater) may be implemented in various embodiments of chemFET arrays according to the present disclosure also.
(169) In other aspects of the system shown in
(170) In general, data may be removed from the array in serial or parallel or some combination thereof. On-chip controllers (or sense amplifiers) can control the entire chip or some portion of the chip. Thus, the chip controllers or signal amplifiers may be replicated as necessary according to the demands of the application. The array may, but need not be, uniform. For instance, if signal processing or some other constraint calls for multiple smaller arrays instead of one large array, each with its own sense amplifiers or controller logic, that is quite feasible.
(171) Having provided a general overview of the role of a chemFET (e.g., ISFET) array 100 in an exemplary system 1000 for measuring one or more analytes, following below are more detailed descriptions of exemplary chemFET arrays according to various inventive embodiments of the present disclosure that may be employed in a variety of applications. Again, for purposes of illustration, chemFET arrays according to the present disclosure are discussed below using the particular example of an ISFET array, but other types of chemFETs may be employed in alternative embodiments. Also, again for purposes of illustration, chemFET arrays are discussed in the context of nucleic acid sequencing applications; however, the invention is not so limited and rather contemplates a variety of applications for the chemFET arrays described herein.
(172) As noted above, various inventive embodiments disclosed herein specifically improve upon the ISFET array design of Milgrew et al. discussed above in connection with
(173) To this end,
(174) In one aspect of the embodiment shown in
(175) As illustrated in
(176) By employing the diode-connected MOSFET Q6 in the bias/readout circuitry 110.sub.j of
(177) In
(178)
(179) In another aspect of the embodiment shown in
(180) In yet another aspect of the embodiment shown in
(181) By not tying the body connection of each ISFET to its source, the possibility of some non-zero source-to-body voltage V.sub.SB may give rise to the body effect, as discussed above in connection with
(182)
(183) In the top view of
(184) With reference now to the cross-sectional view of
(185) In the composite cross-sectional view of
(186) Above the substrate, gate oxide, and polysilicon layers shown in
(187) As indicated above,
(188) At least in some applications, pixel capacitance may be a salient parameter for some type of analyte measurements. Accordingly, in another embodiment related to pixel layout and design, various via and metal layers may be reconfigured so as to at least partially mitigate the potential for parasitic capacitances to arise during pixel operation. For example, in one such embodiment, pixels are designed such that there is a greater vertical distance between the signal lines 112.sub.1, 114.sub.1, 116.sub.1 and 118.sub.1, and the topmost metal layer 304 constituting the floating gate structure 170.
(189) In the embodiment described immediately above, with reference again to
(190) To this end, in another embodiment some via and metal layers are reconfigured such that the signal lines 112.sub.1, 114.sub.1, 116.sub.1 and 118.sub.1 are implemented in the Metal1 and Metal2 layers, and the Metal3 layer is used only as a jumper between the Metal2 layer component of the floating gate structure 170 and the topmost metal layer 304, thereby ensuring a greater distance between the signal lines and the metal layer 304.
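As a rough first-order picture of why the added vertical distance helps, the coupling between a signal line and the topmost metal layer 304 can be approximated as a parallel-plate capacitor. The symbols below (overlap area A, dielectric permittivity ε, vertical separation d) are introduced here for illustration only and ignore fringing fields:

```latex
C_{\mathrm{par}} \approx \frac{\varepsilon A}{d}
```

Under this approximation, doubling the separation d between the signal lines and the metal layer 304 roughly halves the parasitic capacitance for a fixed overlap area.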
(191) In
(192) With reference now to the cross-sectional view of
(193) More specifically, as in the embodiment of
(194) Accordingly, by consolidating the signal lines 112.sub.1, 114.sub.1, 116.sub.1 and 118.sub.1 to the Metal1 and Metal2 layers and thereby increasing the distance between these signal lines and the topmost layer 304 of the floating gate structure 170 in the Metal4 layer, parasitic capacitances in the ISFET may be at least partially mitigated. It should be appreciated that this general concept (e.g., including one or more intervening metal layers between signal lines and topmost layer of the floating gate structure) may be implemented in other fabrication processes involving greater numbers of metal layers. For example, distance between pixel signal lines and the topmost metal layer may be increased by adding additional metal layers (more than four total metal layers) in which only jumpers to the topmost metal layer are formed in the additional metal layers. In particular, a six-metal-layer fabrication process may be employed, in which the signal lines are fabricated using the Metal1 and Metal2 layers, the topmost metal layer of the floating gate structure is formed in the Metal6 layer, and jumpers to the topmost metal layer are formed in the Metal3, Metal4 and Metal5 layers, respectively (with associated vias between the metal layers). In another exemplary implementation based on a six-metal-layer fabrication process, the general pixel configuration shown in
(195) In yet another aspect relating to reduced capacitance, a dimension f of the topmost metal layer 304 (and thus the ISFET sensitive area 178) may be reduced so as to reduce cross-capacitance between neighboring pixels. As may be observed in
(196) Thus, the pixel chip layout designs respectively shown in
(197) In one exemplary implementation, the gate oxide 165 for the ISFET may be fabricated to have a thickness on the order of approximately 75 Angstroms, giving rise to a gate oxide capacitance per unit area C.sub.ox of 4.5 fF/μm.sup.2. Additionally, the polysilicon gate 164 may be fabricated with dimensions corresponding to a channel width W of 1.2 μm and a channel length L of from 0.35 to 0.6 μm (i.e., W/L ranging from approximately 2 to 3.5), and the doping of the region 160 may be selected such that the carrier mobility for the p-channel is 190 cm.sup.2/V·s (i.e., 1.9E10 μm.sup.2/V·s). From Eq. (2) above, this results in an ISFET transconductance parameter on the order of approximately 170 to 300 μA/V.sup.2. In other aspects of this exemplary implementation, the analog supply voltage VDDA is 3.3 Volts, and VB1 and VB2 are biased so as to provide a constant ISFET drain current I.sub.Dj on the order of 5 μA (in some implementations, VB1 and VB2 may be adjusted to provide drain currents from approximately 1 μA to 20 μA). Additionally, the MOSFET Q6 (see bias/readout circuitry 110.sub.j in
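The transconductance range quoted in this implementation can be reproduced numerically. The sketch below assumes that Eq. (2) has the standard form of mobility times oxide capacitance per unit area times W/L, and that lengths are in micrometers and the capacitance is in femtofarads per square micrometer; the function and constant names are hypothetical.

```python
# Reproduce the quoted transconductance parameter range.
# Assumption (not confirmed by this section): Eq. (2) is the standard
# beta = mobility * C_ox * (W / L).

MOBILITY_UM2_PER_VS = 190 * 1e8   # 190 cm^2/(V*s) expressed in um^2/(V*s)
C_OX_F_PER_UM2 = 4.5e-15          # 4.5 fF/um^2 expressed in F/um^2

def beta_uA_per_V2(width_um, length_um):
    """Transconductance parameter in microamps per volt squared."""
    beta_A_per_V2 = MOBILITY_UM2_PER_VS * C_OX_F_PER_UM2 * width_um / length_um
    return beta_A_per_V2 * 1e6

# Channel width 1.2 um, channel length from 0.6 um down to 0.35 um.
for length_um in (0.6, 0.35):
    print(f"L = {length_um} um -> beta = {beta_uA_per_V2(1.2, length_um):.0f} uA/V^2")
```

With these inputs the longest channel gives about 171 and the shortest about 293, consistent with the stated range of approximately 170 to 300.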
(198) With respect to the analyte-sensitive passivation layer 172 shown in
(199) For CMOS processes involving aluminum as the metal (which has a melting point of approximately 650 degrees Celsius), a silicon nitride and/or silicon oxynitride passivation layer generally is formed via plasma-enhanced chemical vapor deposition (PECVD), in which a glow discharge at 250-350 degrees Celsius ionizes the constituent gases that form silicon nitride or silicon oxynitride, creating active species that react at the wafer surface to form a laminate of the respective materials. In one exemplary process, a passivation layer having a thickness on the order of approximately 1.0 to 1.5 μm may be formed by an initial deposition of a thin layer of silicon oxynitride (on the order of 0.2 to 0.4 μm) followed by a slightly thicker deposition of silicon oxynitride (on the order of 0.5 μm) and a final deposition of silicon nitride (on the order of 0.5 μm). Because of the low deposition temperature involved in the PECVD process, the aluminum metallization is not adversely affected.
(200) However, while a low-temperature PECVD process provides adequate passivation for conventional CMOS devices, the low-temperature process results in a generally low-density and somewhat porous passivation layer, which in some cases may adversely affect ISFET threshold voltage stability. In particular, during ISFET device operation, a low-density porous passivation layer over time may absorb and become saturated with ions from the solution, which may in turn cause an undesirable time-varying drift in the ISFET's threshold voltage V.sub.TH, making accurate measurements challenging.
(201) In view of the foregoing, in one embodiment a CMOS process that uses tungsten metal instead of aluminum may be employed to fabricate ISFET arrays according to the present disclosure. The high melting temperature of tungsten (above 3400 degrees Celsius) permits the use of a higher temperature low pressure chemical vapor deposition (LPCVD) process (e.g., approximately 700 to 800 degrees Celsius) for a silicon nitride or silicon oxynitride passivation layer. The LPCVD process typically results in significantly more dense and less porous films for the passivation layer, thereby mitigating the potentially adverse effects of ion absorption from the analyte solution leading to ISFET threshold voltage drift.
(202) In yet another embodiment in which an aluminum-based CMOS process is employed to fabricate ISFET arrays according to the present disclosure, the passivation layer 172 shown in
(203) Examples of materials suitable for the second portion 172B (or other additional portions) of the passivation layer 172 include, but are not limited to, silicon nitride, silicon oxynitride, aluminum oxide (Al.sub.2O.sub.3), tantalum oxide (Ta.sub.2O.sub.5), tin oxide (SnO.sub.2) and silicon dioxide (SiO.sub.2). In one aspect, the second portion 172B (or other additional portions) may be deposited via a variety of relatively low temperature processes including, but not limited to, RF sputtering, DC magnetron sputtering, thermal or e-beam evaporation, and ion-assisted depositions. In another aspect, a pre-sputtering etch process may be employed, prior to deposition of the second portion 172B, to remove any native oxide residing on the first portion 172A (alternatively, a reducing environment, such as an elevated temperature hydrogen environment, may be employed to remove native oxide residing on the first portion 172A). In yet another aspect, a thickness of the second portion 172B may be on the order of approximately 0.04 μm to 0.06 μm (400 to 600 Angstroms) and a thickness of the first portion may be on the order of 1.0 to 1.5 μm, as discussed above. In some exemplary implementations, the first portion 172A may include multiple layers of silicon oxynitride and silicon nitride having a combined thickness of 1.0 to 1.5 μm, and the second portion 172B may include a single layer of either aluminum oxide or tantalum oxide having a thickness of approximately 400 to 600 Angstroms. Again, it should be appreciated that the foregoing exemplary thicknesses are provided primarily for purposes of illustration, and that the disclosure is not limited in these respects.
(204) Thus it is to be understood that the chemFET arrays described herein may be used to detect and/or measure various analytes and, by doing so, may monitor a variety of reactions and/or interactions. It is also to be understood that the discussion herein relating to hydrogen ion detection (in the form of a pH change) is for the sake of convenience and brevity and that static or dynamic levels/concentrations of other analytes (including other ions) can be substituted for hydrogen in these descriptions. In particular, sufficiently fast concentration changes of any one or more of various ion species present in the analyte may be detected via the transient or dynamic response of a chemFET, as discussed above in connection with
(205) The chemFETs, including ISFETs, described herein are capable of detecting any analyte that is itself capable of inducing a change in electric field when in contact with or otherwise sensed or detected by the chemFET surface. The analyte need not be charged in order to be detected by the sensor. For example, depending on the embodiment, the analyte may be positively charged (i.e., a cation), negatively charged (i.e., an anion), zwitterionic (i.e., capable of having two equal and opposite charges but being neutral overall), or polar yet neutral. This list is not intended to be exhaustive, as other analyte classes, as well as species within each class, will be readily contemplated by those of ordinary skill in the art based on the disclosure provided herein.
(206) In the broadest sense of the invention, the passivation layer may or may not be coated and the analyte may or may not interact directly with the passivation layer.
(207) Passivation Layer Specificity
(208) In some embodiments, the passivation layer and/or the layers and/or molecules coated thereon dictate the analyte specificity of the array readout.
(209) Detection of hydrogen ions, and other analytes as determined by the invention, can be carried out using a passivation layer made of silicon nitride (Si.sub.3N.sub.4), silicon oxynitride (Si.sub.2N.sub.2O), silicon oxide (SiO.sub.2), aluminum oxide (Al.sub.2O.sub.3), tantalum pentoxide (Ta.sub.2O.sub.5), tin oxide or stannic oxide (SnO.sub.2), and the like.
(210) The passivation layer can also detect other ion species directly including but not limited to calcium, potassium, sodium, iodide, magnesium, chloride, lithium, lead, silver, cadmium, nitrate, phosphate, dihydrogen phosphate, and the like.
(211) In some embodiments, the passivation layer is coated with a receptor for the analyte of interest. Preferably, the receptor binds selectively to the analyte of interest or in some instances to a class of agents to which the analyte belongs. As used herein, a receptor that binds selectively to an analyte is a molecule that binds preferentially to that analyte (i.e., its binding affinity for that analyte is greater than its binding affinity for any other analyte). Its binding affinity for the analyte of interest may be 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 15-fold, 20-fold, 25-fold, 30-fold, 40-fold, 50-fold, 100-fold or more than its binding affinity for any other analyte. In addition to its relative binding affinity, the receptor must also have an absolute binding affinity that is sufficiently high to efficiently bind the analyte of interest (i.e., it must have a sufficient sensitivity). Receptors having binding affinities in the picomolar to micromolar range are suitable. Preferably such interactions are reversible.
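The two criteria in the paragraph above (relative selectivity and sufficient absolute affinity) can be expressed as a small check. This is a sketch under stated assumptions: affinity is represented by a dissociation constant K_d in molar units, where a lower value means tighter binding; the picomolar-to-micromolar bounds are taken as 1e-12 M to 1e-6 M; and the function names and example values are hypothetical, not measured data.

```python
# Illustrative selectivity check for a candidate receptor.
# Assumptions (not from the source): affinities are given as dissociation
# constants K_d in mol/L, and the "picomolar to micromolar" window is
# interpreted as 1e-12 M <= K_d <= 1e-6 M.

def selectivity_fold(kd_target, kd_competitors):
    """Fold preference for the target over its strongest competitor."""
    if kd_target <= 0 or any(kd <= 0 for kd in kd_competitors):
        raise ValueError("dissociation constants must be positive")
    # The strongest competitor is the one with the smallest K_d.
    return min(kd_competitors) / kd_target

def in_suitable_affinity_range(kd, low=1e-12, high=1e-6):
    """True if K_d falls in the assumed picomolar-to-micromolar window."""
    return low <= kd <= high

# Hypothetical numbers, for illustration only.
kd_target = 2e-9                 # 2 nM for the analyte of interest
kd_competitors = [4e-7, 1e-7]    # 400 nM and 100 nM for other analytes

fold = selectivity_fold(kd_target, kd_competitors)
print(f"selectivity: about {fold:.0f}-fold")
print("affinity in suitable range:", in_suitable_affinity_range(kd_target))
```

Comparing against the strongest competitor, rather than an average, matches the requirement that affinity for the analyte of interest exceed the affinity for any other analyte.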
(212) The receptor may be of any nature (e.g., chemical, nucleic acid, peptide, lipid, combinations thereof and the like). In such embodiments, the analyte too may be of any nature provided there exists a receptor that binds to it selectively and in some instances specifically. It is to be understood however that the invention further contemplates detection of analytes in the absence of a receptor. An example of this is the detection of PPi and Pi by the passivation layer in the absence of PPi or Pi receptors.
(213) In one aspect, the invention contemplates receptors that are ionophores. As used herein, an ionophore is a molecule that binds selectively to an ionic species, whether anion or cation. In the context of the invention, the ionophore is the receptor and the ion to which it binds is the analyte. Ionophores of the invention include art-recognized carrier ionophores (i.e., small lipid-soluble molecules that bind to a particular ion) derived from microorganisms. Various ionophores are commercially available from sources such as Calbiochem.
(214) Detection of some ions can be accomplished through the use of the passivation layer itself or through the use of receptors coated onto the passivation layer. For example, potassium can be detected selectively using polysiloxane, valinomycin, or salinomycin; sodium can be detected selectively using monensin, nystatin, or SQI-Pr; calcium can be detected selectively using ionomycin, calcimycin (A23187), or CA 1001 (ETH 1001).
(215) Receptors able to bind more than one ion can also be used in some instances. For example, beauvericin can be used to detect calcium and/or barium ions, nigericin can be used to detect potassium, hydrogen and/or lead ions, and gramicidin can be used to detect hydrogen, sodium and/or potassium ions. One of ordinary skill in the art will recognize that these compounds can be used in applications in which single ion specificity is not required or in which it is unlikely (or impossible) that other ions which the compounds bind will be present or generated. Similarly, receptors that bind multiple species of a particular genus may also be useful in some embodiments including those in which only one species within the genus will be present or in which the method does not require distinction between species.
(216) As another example, receptors for neurotoxins are described in Simonian Electroanalysis 2004, 16: 1896-1906.
(217) Passivation Layer and PPi Receptors
(218) In other embodiments, including but not limited to nucleic acid sequencing applications, receptors that bind selectively to PPi can be used. Examples of PPi receptors include those compounds shown in
(219) Passivation Layer/Receptor Binding
(220) Receptors may be attached to the passivation layer covalently or non-covalently. Covalent attachment of a receptor to the passivation layer may be direct or indirect (e.g., through a linker).
(221) A bifunctional linker is a compound having at least two reactive groups to which two entities may be bound. In some instances, the reactive groups are located at opposite ends of the linker. In some embodiments, the bifunctional linker is a universal bifunctional linker such as that shown in
(222) The bifunctional linker may be a homo-bifunctional linker or a hetero-bifunctional linker, depending upon the nature of the molecules to be conjugated. Homo-bifunctional linkers have two identical reactive groups. Hetero-bifunctional linkers have two different reactive groups. Various types of commercially available linkers are reactive with one or more of the following groups: primary amines, secondary amines, sulfhydryls, carboxyls, carbonyls and carbohydrates. Examples of amine-specific linkers are bis(sulfosuccinimidyl) suberate, bis[2-(succinimidooxycarbonyloxy)ethyl]sulfone, disuccinimidyl suberate, disuccinimidyl tartarate, dimethyl adipimate·2 HCl, dimethyl pimelimidate·2 HCl, dimethyl suberimidate·2 HCl, and ethylene glycolbis-[succinimidyl-[succinate]]. Linkers reactive with sulfhydryl groups include bismaleimidohexane, 1,4-di-[3-(2-pyridyldithio)-propionamido)]butane, 1-[p-azidosalicylamido]-4-[iodoacetamido]butane, and N-[4-(p-azidosalicylamido) butyl]-3-[2-pyridyldithio]propionamide. Linkers preferentially reactive with carbohydrates include azidobenzoyl hydrazine. Linkers preferentially reactive with carboxyl groups include 4-[p-azidosalicylamido]butylamine.
(223) Heterobifunctional linkers that react with amines and sulfhydryls include N-succinimidyl-3-[2-pyridyldithio]propionate, succinimidyl[4-iodoacetyl]aminobenzoate, succinimidyl 4-[N-maleimidomethyl]cyclohexane-1-carboxylate, m-maleimidobenzoyl-N-hydroxysuccinimide ester, sulfosuccinimidyl 6-[3-[2-pyridyldithio]propionamido]hexanoate, and sulfosuccinimidyl 4-[N-maleimidomethyl]cyclohexane-1-carboxylate. Heterobifunctional linkers that react with carboxyl and amine groups include 1-ethyl-3-[3-dimethylaminopropyl]-carbodiimide hydrochloride. Heterobifunctional linkers that react with carbohydrates and sulfhydryls include 4-[N-maleimidomethyl]-cyclohexane-1-carboxylhydrazide·2 HCl, 4-(4-N-maleimidophenyl)-butyric acid hydrazide·2 HCl, and 3-[2-pyridyldithio]propionyl hydrazide.
(224) Alternatively, receptors may be non-covalently coated onto the passivation layer. Non-covalent deposition of the receptor onto the passivation layer may involve the use of a polymer matrix. The polymer may be naturally occurring or non-naturally occurring and may be of any type including but not limited to nucleic acid (e.g., DNA, RNA, PNA, LNA, and the like, or mimics, derivatives, or combinations thereof), amino acid (e.g., peptides, proteins (native or denatured), and the like, or mimics, derivatives, or combinations thereof), lipids, polysaccharides, and functionalized block copolymers. The receptor may be adsorbed onto and/or entrapped within the polymer matrix. The nature of the polymer will depend on the nature of the receptor being used and/or analyte being detected.
(225) Alternatively, the receptor may be covalently conjugated or crosslinked to the polymer (e.g., it may be grafted onto a functionalized polymer).
(226) An example of a suitable peptide polymer is poly-lysine (e.g., poly-L-lysine). Examples of other polymers include block copolymers that comprise polyethylene glycol (PEG), polyamides, polycarbonates, polyalkylenes, polyalkylene glycols, polyalkylene oxides, polyalkylene terephthalates, polyvinyl alcohols, polyvinyl ethers, polyvinyl esters, polyvinyl halides, polyvinylpyrrolidone, polyglycolides, polysiloxanes, polyurethanes, alkyl cellulose, hydroxyalkyl celluloses, cellulose ethers, cellulose esters, nitrocelluloses, polymers of acrylic and methacrylic esters, methyl cellulose, ethyl cellulose, hydroxypropyl cellulose, hydroxypropyl methyl cellulose, hydroxybutyl methyl cellulose, cellulose acetate, cellulose propionate, cellulose acetate butyrate, cellulose acetate phthalate, carboxylethyl cellulose, cellulose triacetate, cellulose sulphate sodium salt, poly(methyl methacrylate), poly(ethyl methacrylate), poly(butylmethacrylate), poly(isobutyl methacrylate), poly(hexylmethacrylate), poly(isodecyl methacrylate), poly(lauryl methacrylate), poly(phenyl methacrylate), poly(methyl acrylate), poly(isopropyl acrylate), poly(isobutyl acrylate), poly(octadecyl acrylate), polyethylene, polypropylene, poly(ethylene glycol), poly(ethylene oxide), poly(ethylene terephthalate), poly(vinyl alcohols), polyvinyl acetate, polyvinyl chloride, polystyrene, polyhyaluronic acids, casein, gelatin, glutin, polyanhydrides, polyacrylic acid, alginate, chitosan, poly(methyl methacrylates), poly(ethyl methacrylates), poly(butylmethacrylate), poly(isobutyl methacrylate), poly(hexylmethacrylate), poly(isodecyl methacrylate), poly(lauryl methacrylate), poly(phenyl methacrylate), poly(methyl acrylate), poly(isopropyl acrylate), poly(isobutyl acrylate), and poly(octadecyl acrylate), poly(lactide-glycolide), copolyoxalates, polycaprolactones, polyesteramides, polyorthoesters, polyhydroxybutyric acid, polyanhydrides, poly(styrene-b-isobutylene-b-styrene) (SIBS) block copolymer, ethylene vinyl acetate, poly(meth)acrylic acid, polymers of lactic acid and glycolic acid, polyanhydrides, poly(ortho)esters, polyurethanes, poly(butic acid), poly(valeric acid), and poly(lactide-cocaprolactone), and natural polymers such as alginate and other polysaccharides including dextran and cellulose, collagen, albumin and other hydrophilic proteins, zein and other prolamines and hydrophobic proteins, copolymers and mixtures thereof, and chemical derivatives thereof including substitutions and/or additions of chemical groups, for example, alkyl, alkylene, hydroxylations, oxidations, and other modifications routinely made by those skilled in the art.
(227) Trapped Charge
(228) Another issue that relates to ISFET threshold voltage stability and/or predictability involves trapped charge that may accumulate (especially) on metal layers of CMOS-fabricated devices as a result of various processing activities during or following array fabrication (e.g., back-end-of-line processing such as plasma metal etching, wafer cleaning, dicing, packaging, handling, etc.). In particular, with reference to
(229) One opportunity for trapped charge to accumulate is plasma etching of the topmost metal layer 304. Other opportunities for charge to accumulate on one or more conductors of the floating gate structure or other portions of the FETs include wafer dicing, during which the abrasive process of a dicing saw cutting through a wafer generates static electricity, and/or various post-processing wafer handling/packaging steps, such as die-to-package wire bonding, where in some cases automated machinery that handles/transports wafers may be a source of electrostatic discharge (ESD) to conductors of the floating gate structure. If there is no connection to the silicon substrate (or other semiconductor substrate) to provide an electrical path to bleed off such charge accumulation, charge may build up to the point of causing undesirable changes or damage to the gate oxide 165 (e.g., charge injection into the oxide, or low-level oxide breakdown to the underlying substrate). Trapped charge in the gate oxide or at the gate oxide-semiconductor interface in turn can cause undesirable and/or unpredictable variations in ISFET operation and performance, such as fluctuations in threshold voltage.
(230) In view of the foregoing, other inventive embodiments of the present disclosure are directed to methods and apparatus for improving ISFET performance by reducing trapped charge or mitigating the antenna effect. In one embodiment, trapped charge may be reduced after a sensor array has been fabricated, while in other embodiments the fabrication process itself may be modified to reduce trapped charge that could be induced by some conventional process steps. In yet other embodiments, both during fabrication and post fabrication techniques may be employed in combination to reduce trapped charge and thereby improve ISFET performance.
(231) With respect to alterations to the fabrication process itself to reduce trapped charge, in one embodiment the thickness of the gate oxide 165 shown in
(232) In another embodiment, the topmost metal layer 304 of the ISFET's floating gate structure 170 shown in
(233) In yet another embodiment, the metal etch process for the topmost metal layer 304 may be modified to include wet chemistry or ion-beam milling rather than plasma etching. For example, the metal layer 304 could be etched using an aqueous chemistry selective to the underlying dielectric (e.g., see website for Transene relating to aluminum, which is hereby incorporated herein by reference). Another alternative approach employs ion-milling rather than plasma etching for the metal layer 304. Ion-milling is commonly used to etch materials that cannot be readily removed using conventional plasma or wet chemistries. The ion-milling process does not employ an oscillating electric field as does a plasma, so that charge build-up does not occur in the metal layer(s). Yet another metal etch alternative involves optimizing the plasma conditions so as to reduce the etch rate (i.e., using a lower power density).
(234) In yet another embodiment, architecture changes may be made to the metal layer to facilitate complete electrical isolation during definition of the floating gate. In one aspect, designing the metal stack-up so that the large area ISFET floating gate is not connected to anything during its final definition may require a subsequent metal layer serving as a jumper to realize the electrical connection to the floating gate of the transistor. This jumper connection scheme prevents charge flow from the large floating gate to the transistor. This method may be implemented as follows (M=metal layer): i) M1 contacting Poly gate electrode; ii) M2 contacting M1; iii) M3 defines floating gate and separately connects to M2 with isolated island; iv) M4 jumper, having very small area being etched over the isolated islands and connections to floating gate M3, connects the M3 floating gate to the M1/M2/M3 stack connected to the Poly gate immediately over the transistor active area; and v) M3 to M4 interlayer dielectric is removed only over the floating gate so as to expose the bare M3 floating gate. In the method outlined immediately above, step v) need not be done, as the ISFET architecture according to some embodiments discussed above leaves the M4 passivation in place over the M4 floating gate. In one aspect, removal may nonetheless improve ISFET performance in other ways (i.e. sensitivity). In any case, the final sensitive passivation layer may be a thin sputter-deposited ion-sensitive metal-oxide layer. It should be appreciated that the over-layer jumpered architecture discussed above may be implemented in the standard CMOS fabrication flow to allow any of the first three metal layers to be used as the floating gates (i.e. M1, M2 or M3).
(235) With respect to post-fabrication processes to reduce trapped charge, in one embodiment a forming gas anneal may be employed as a post-fabrication process to mitigate potentially adverse effects of trapped charge. In a forming gas anneal, CMOS-fabricated ISFET devices are heated in a hydrogen and nitrogen gas mixture. The hydrogen gas in the mixture diffuses into the gate oxide 165 and neutralizes certain forms of trapped charges. In one aspect, the forming gas anneal need not necessarily remove all gate oxide damage that may result from trapped charges; rather, in some cases, a partial neutralization of some trapped charge is sufficient to significantly improve ISFET performance. In exemplary annealing processes according to the present disclosure, ISFETs may be heated for approximately 30 to 60 minutes at approximately 400 to 425 degrees Celsius in a hydrogen/nitrogen mixture that includes 10% to 15% hydrogen. In one particular implementation, annealing at 425 degrees Celsius for 30 minutes in a hydrogen/nitrogen mixture that includes 10% hydrogen is observed to be particularly effective at improving ISFET performance. For aluminum CMOS processes, the temperature of the anneal should be kept at or below 450 degrees Celsius to avoid damaging the aluminum metallurgy. In another aspect of an annealing process according to the present disclosure, the forming gas anneal is performed after wafers of fabricated ISFET arrays are diced, so as to ensure that damage due to trapped charge induced by the dicing process itself, and/or other pre-dicing processing steps (e.g., plasma etching of metals) may be effectively ameliorated. In yet another aspect, the forming gas anneal may be performed after die-to-package wirebonding to similarly ameliorate damage due to trapped charge. 
At this point in the assembly process, a diced array chip is typically in a heat and chemical resistant ceramic package, and low-tolerance wirebonding procedures as well as heat-resistant die-to-package adhesives may be employed to withstand the annealing procedure. Thus, in one exemplary embodiment, the invention encompasses a method for manufacturing an array of FETs, each having or coupled to a floating gate having a trapped charge of zero or substantially zero comprising: fabricating a plurality of FETs in a common semiconductor substrate, each of a plurality of which is coupled to a floating gate; applying a forming gas anneal to the semiconductor prior to a dicing step; dicing the semiconductor; and applying a forming gas anneal to the semiconductor after the dicing step. Preferably, the semiconductor substrate comprises at least 100,000 FETs. Preferably, the plurality of FETs are chemFETs. The method may further comprise depositing a passivation layer on the semiconductor, depositing a polymeric, glass, ion-reactively etchable or photodefineable material layer on the passivation layer and etching the polymeric, glass, ion-reactively etchable or photodefineable material to form an array of reaction chambers in the glass layer.
(236) In yet other processes for mitigating potentially adverse effects of trapped charge according to embodiments of the present disclosure, a variety of electrostatic discharge (ESD)-sensitive protocols may be adopted during any of a variety of wafer post-fabrication handling/packaging steps. For example, in one exemplary process, anti-static dicing tape may be employed to hold wafer substrates in place (e.g., during the dicing process). Also, although high-resistivity (e.g., 10 MΩ·cm) deionized water conventionally is employed in connection with cooling of dicing saws, according to one embodiment of the present disclosure less resistive/more conductive water may be employed for this purpose to facilitate charge conduction via the water; for example, deionized water may be treated with carbon dioxide to lower resistivity and improve conduction of charge arising from the dicing process. Furthermore, conductive and grounded die-ejection tools may be used during various wafer dicing/handling/packaging steps, again to provide effective conduction paths for charge generated during any of these steps, and thereby reduce opportunities for charge to accumulate on one or more conductors of the floating gate structure of respective ISFETs of an array.
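The exemplary anneal conditions quoted above (approximately 30 to 60 minutes, approximately 400 to 425 degrees Celsius, 10% to 15% hydrogen, and a 450 degree Celsius ceiling for aluminum CMOS processes) can be captured as a simple recipe check. The following is only an illustrative sketch: the names `AnnealRecipe` and `check_recipe` are hypothetical, and the approximate ranges from the text are treated as hard bounds purely for convenience.

```python
from dataclasses import dataclass
from typing import List

# Approximate ranges quoted in the text for the exemplary forming gas anneal.
MINUTES_RANGE = (30.0, 60.0)        # anneal duration, minutes
TEMP_C_RANGE = (400.0, 425.0)       # anneal temperature, degrees Celsius
H2_PERCENT_RANGE = (10.0, 15.0)     # hydrogen share of the H2/N2 mixture, percent
ALUMINUM_MAX_TEMP_C = 450.0         # ceiling stated for aluminum CMOS processes


@dataclass(frozen=True)
class AnnealRecipe:
    """Hypothetical container for one anneal run (not a structure from the disclosure)."""
    minutes: float
    temp_c: float
    h2_percent: float


def _outside(value: float, bounds: tuple) -> bool:
    low, high = bounds
    return not (low <= value <= high)


def check_recipe(recipe: AnnealRecipe, aluminum_process: bool = True) -> List[str]:
    """Return a list of deviations from the quoted ranges; an empty list means none."""
    problems: List[str] = []
    if _outside(recipe.minutes, MINUTES_RANGE):
        problems.append("duration outside the quoted 30-60 minute range")
    if _outside(recipe.temp_c, TEMP_C_RANGE):
        problems.append("temperature outside the quoted 400-425 C range")
    if _outside(recipe.h2_percent, H2_PERCENT_RANGE):
        problems.append("hydrogen fraction outside the quoted 10-15% range")
    if aluminum_process and recipe.temp_c > ALUMINUM_MAX_TEMP_C:
        problems.append("temperature above the 450 C ceiling for aluminum CMOS")
    return problems


if __name__ == "__main__":
    # The particular implementation called out in the text: 425 C, 30 min, 10% H2.
    print(check_recipe(AnnealRecipe(minutes=30, temp_c=425, h2_percent=10)))
    # A recipe that is too hot for an aluminum process.
    print(check_recipe(AnnealRecipe(minutes=30, temp_c=460, h2_percent=10)))
```

The particular implementation described in the text (425 degrees Celsius, 30 minutes, 10% hydrogen) passes this check with no reported deviations.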
(237) In yet another embodiment involving a post-fabrication process to reduce trapped charge, the gate oxide region of an ISFET may be irradiated with UV radiation. With reference again to
(238) To facilitate a UV irradiation process to reduce trapped charge, materials other than silicon nitride and silicon oxynitride generally need to be employed in the passivation layer 172 shown in
(239) In another aspect of an embodiment involving UV irradiation, each ISFET of a sensor array must be appropriately biased during a UV irradiation process to facilitate reduction of trapped charge. In particular, high energy photons from the UV irradiation, impinging upon the bulk silicon region 160 in which the ISFET conducting channel is formed, create electron-hole pairs which facilitate neutralization of trapped charge in the gate oxide as current flows through the ISFET's conducting channel. To this end, an array controller, discussed further below in connection with
(240) Utilizing at least one of the above-described techniques for reducing trapped charge, we have been able to fabricate FET floating gates having a trapped charge of zero or substantially zero. Thus, in some embodiments, an aspect of the invention encompasses a floating gate having a surface area of about 4 μm.sup.2 to about 50 μm.sup.2 having baseline threshold voltage and preferably a trapped charge of zero or substantially zero. Preferably the FETs are chemFETs. The trapped charge should be kept to a level that does not cause appreciable variations from FET to FET across the array, as that would limit the dynamic range of the devices, consistency of measurements, and otherwise adversely affect performance.
(241) Array and Chip Design
(242)
(243) Also, as discussed above, it should be appreciated that arrays according to various embodiments of the present invention may be fabricated according to conventional CMOS fabrication techniques, as well as modified CMOS fabrication techniques (e.g., to facilitate realization of various functional aspects of the chemFET arrays discussed herein, such as additional deposition of passivation materials, process steps to mitigate trapped charge, etc.) and other semiconductor fabrication techniques beyond those conventionally employed in CMOS fabrication. Additionally, various lithography techniques may be employed as part of an array fabrication process. For example, in one exemplary implementation, a lithography technique may be employed in which appropriately designed blocks are stitched together by overlapping the edges of step-and-repeat lithography exposures on a wafer substrate by approximately 0.2 micrometers. In a single exposure, the maximum die size typically is approximately 21 millimeters by 21 millimeters. By selectively exposing different blocks (sides, top and bottom, core, etc.), very large chips can be defined on a wafer (up to a maximum, in the extreme, of one chip per wafer, commonly referred to as wafer scale integration).
(244) In one aspect of the array 100 shown in
(245) In yet another implementation of an array similar to that shown in
(246) In
(247)
(248) Regarding the column select shift registers 194.sub.1 and 194.sub.2, these are implemented in a manner similar to that of the row select shift registers, with each column select shift register comprising 256 series-connected flip-flops and responsible for enabling readout from either the odd columns of the array or the even columns of the array. For example,
(249) With reference again for the moment to
(250) In the embodiment of
(251) In one exemplary implementation, the switches of both the even and odd output drivers 198.sub.1 and 198.sub.2 (e.g., the switches 191.sub.2, 191.sub.4, . . . 191.sub.512 shown in
(252) The ability of the bus 175 to settle quickly following enabling of successive switches in turn facilitates rapid data acquisition from the array. To this end, in some embodiments the switches 191 of the output drivers 198.sub.1 and 198.sub.2 are particularly configured to significantly reduce the settling time of the bus 175. Both the n-channel and the p-channel MOSFETs of a given switch add to the capacitance of the bus 175; however, n-channel MOSFETs generally have better frequency response and current drive capabilities than their p-channel counterparts. In view of the foregoing, some of the superior characteristics of n-channel MOSFETs may be exploited to improve settling time of the bus 175 by implementing asymmetric switches in which respective sizes for the n-channel MOSFET and p-channel MOSFET of a given switch are different.
(253) For example, in one embodiment, with reference to
(254) While the example above describes asymmetric switches 191 for the output drivers 198.sub.1 and 198.sub.2 in which the n-channel MOSFET is larger than the p-channel MOSFET, it should be appreciated that in another embodiment, the converse may be implemented, namely, asymmetric switches in which the p-channel MOSFET is larger than the n-channel MOSFET. In one aspect of this embodiment, with reference again to
(255) In yet another embodiment directed to facilitating rapid settling of the bus 175 shown in
(256) For purposes of illustration, the bus 175 may have a capacitance in the range of approximately 5 pF to 20 pF in any of the embodiments discussed immediately above (e.g. symmetric switches, asymmetric switches, greater numbers of output drivers, etc.). Of course, it should be appreciated that the capacitance of the bus 175 is not limited to these exemplary values, and that other capacitance values are possible in different implementations of an array according to the present disclosure.
(257) In one aspect of the array design discussed above in connection with
(258)
(259) Generally, the array controller 250 provides various supply voltages and bias voltages to the array 100, as well as various signals relating to row and column selection, sampling of pixel outputs and data acquisition. In particular, the array controller 250 reads one or more analog output signals (e.g., Vout1 and Vout2) including multiplexed respective pixel voltage signals from the array 100 and then digitizes these respective pixel signals to provide measurement data to the computer 260, which in turn may store and/or process the data. In some implementations, the array controller 250 also may be configured to perform or facilitate various array calibration and diagnostic functions, and an optional array UV irradiation treatment as discussed above in connection with
(260) As illustrated in
(261) In another aspect, the power supply 258 includes one or more digital-to-analog converters (DACs) that may be controlled by the computer 260 to allow any or all of the bias voltages, reference voltage, and supply voltages to be changed under software control (i.e., programmable bias settings). For example, a power supply 258 responsive to computer control (e.g., via software execution) may facilitate adjustment of one or more of the supply voltages (e.g., switching between 3.3 Volts and 1.8 Volts depending on chip type as represented by an identification code), and/or adjustment of one or more of the bias voltages VB1 and VB2 for pixel drain current, VB3 for column bus drive, VB4 for column amplifier bandwidth, and VBO0 for column output buffer current drive. In some aspects, one or more bias voltages may be adjusted to optimize settling times of signals from enabled pixels. Additionally, the common body voltage V.sub.BODY for all ISFETs of the array may be grounded during an optional post-fabrication UV irradiation treatment to reduce trapped charge, and then coupled to a higher voltage (e.g., VDDA) during diagnostic analysis, calibration, and normal operation of the array for measurement/data acquisition. Likewise, the reference voltage VREF may be varied to facilitate a variety of diagnostic and calibration functions.
(262) As also shown in
(263) Regarding data acquisition from the array 100, in one embodiment the array controller 250 of
(264) The array controller 250 of
(265) In the embodiment of
(266)
(267) In the example of
(268) For example, with respect to the method for detecting nucleotide incorporation, appropriate frame rates may be chosen to sufficiently sample the ISFET's output signal. In some exemplary implementations, a hydrogen ion signal may have a full-width at half-maximum (FWHM) on the order of approximately 1 second to approximately 2.5 seconds, depending on the number of nucleotide incorporation events. Given these exemplary values, a frame rate (or pixel sampling rate) of 20 Hz is sufficient to reliably resolve the signals in a given pixel's output signal. Again, the frame rates given in this example are provided primarily for purposes of illustration, and different frame rates may be involved in other implementations.
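The relationship between the exemplary frame rate and signal width can be checked with simple arithmetic. The sketch below uses only the figures quoted above (a 20 Hz frame rate and a FWHM of approximately 1 to 2.5 seconds); the function name `samples_across_fwhm` is hypothetical and is introduced only for illustration.

```python
def samples_across_fwhm(frame_rate_hz: float, fwhm_s: float) -> float:
    """Number of frames that fall within the signal's full-width at half-maximum."""
    return frame_rate_hz * fwhm_s


if __name__ == "__main__":
    frame_rate_hz = 20.0                      # exemplary frame rate from the text
    print(f"frame period: {1.0 / frame_rate_hz:.3f} s")
    for fwhm_s in (1.0, 2.5):                 # exemplary FWHM bounds from the text
        n = samples_across_fwhm(frame_rate_hz, fwhm_s)
        print(f"FWHM {fwhm_s} s -> {n:.0f} samples across the peak")
```

At 20 Hz, even the narrowest exemplary signal is covered by about 20 samples, which is consistent with the statement that this rate suffices to resolve the signal.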
(269) In one implementation, the array controller 250 controls the array 100 to enable rows successively, one at a time. For example, with reference again for the moment to
(270)
(271) In
(272)
(273) As discussed above in connection with
(274) In one embodiment, once pixel values are sampled and digitized by the ADC(s) 254, the computer 260 may be programmed to process pixel data obtained from the array 100 and the array controller 250 so as to facilitate high data acquisition rates that in some cases may exceed a sufficient settling time for pixel voltages represented in a given array output signal. A flow chart illustrating an exemplary method according to one embodiment of the present invention that may be implemented by the computer 260 for processing and correction of array data acquired at high acquisition rates is illustrated in
(275) Regarding pixel settling time, with reference again to
$$V_{PIX}(t) = A\left(1 - e^{-t/k}\right), \qquad \text{(PP)}$$
where $A$ is the difference ($V_{COLj} - V_{COLj-1}$) between two pixel voltage values and $k$ is a time constant associated with a capacitance of the bus 175.
(276) For purposes of the present discussion, pixel settling time $t_{settle}$ is defined as the time $t$ at which $V_{PIX}(t)$ attains a value that differs from its final value by an amount that is equal to the peak noise level of the array output signal. If the peak noise level of the array output signal is denoted as $n_p$, then the voltage at the settling time $t_{settle}$ is given by $V_{PIX}(t_{settle}) = A\left[1 - (n_p/A)\right]$. Substituting in Eq. (PP) and solving for $t_{settle}$ yields
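The resulting expression is not reproduced in this text, but it follows directly from Eq. (PP) and the definition of the settling time given above. A worked derivation, using only the symbols already defined ($A$, $k$, $n_p$, $t_{settle}$) and assuming $0 < n_p < A$, is:

```latex
\begin{align*}
V_{PIX}(t) &= A\left(1 - e^{-t/k}\right) && \text{(Eq. PP)}\\
A\left(1 - e^{-t_{settle}/k}\right) &= A\left[1 - \frac{n_p}{A}\right] && \text{(definition of } t_{settle}\text{)}\\
e^{-t_{settle}/k} &= \frac{n_p}{A}\\
t_{settle} &= -k \ln\!\left(\frac{n_p}{A}\right) = k \ln\!\left(\frac{A}{n_p}\right)
\end{align*}
```

The settling time therefore grows only logarithmically with the ratio of the voltage step to the peak noise level, and linearly with the time constant $k$.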
(277)
(278)
(279) As indicated above, in one embodiment pixel data may be acquired from the array at data rates that exceed those dictated by the pixel settling time.
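To illustrate what sampling faster than the settling time implies, the sketch below evaluates Eq. (PP) numerically. It is not the correction method of the disclosure; it only shows the residual error, $A e^{-t/k}$, that remains when a pixel is sampled at time $t$. All numeric values (time constant, voltage step, noise level) are hypothetical and chosen only for illustration, and the function names are likewise assumptions.

```python
import math


def settling_time(k: float, a: float, n_p: float) -> float:
    """Time for the residual of Eq. (PP) to fall to the peak noise level n_p."""
    return k * math.log(abs(a) / n_p)


def residual_error(t: float, k: float, a: float) -> float:
    """Difference between the final value A and V_PIX(t) = A(1 - exp(-t/k))."""
    return a * math.exp(-t / k)


if __name__ == "__main__":
    k = 1.0        # time constant, arbitrary time units (hypothetical)
    a = 100.0      # voltage step between successive pixels, mV (hypothetical)
    n_p = 1.0      # peak noise level, mV (hypothetical)

    t_settle = settling_time(k, a, n_p)
    print(f"t_settle = {t_settle:.3f} time constants")

    # Sampling at half the settling time leaves an error well above the noise level.
    t_fast = 0.5 * t_settle
    print(f"residual at t_settle/2 = {residual_error(t_fast, k, a):.3f} mV")
```

With these illustrative numbers, sampling at half the settling time leaves a residual of about 10 mV against a 1 mV noise level, which indicates why data acquired at such rates requires subsequent processing.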
(280) Subsequently, in block 506 of
(281) In block 510 of
(282) In addition to controlling the sensor array and ADCs, the timing generator 256 may be configured to facilitate various array calibration and diagnostic functions, as well as an optional UV irradiation treatment. To this end, the timing generator may utilize the signal LSTV indicating the selection of the last row of the array and the signal LSTH to indicate the selection of the last column of the array. The timing generator 256 also may be responsible for generating the CAL signal which applies the reference voltage VREF to the column buffer amplifiers, and generating the UV signal which grounds the drains of all ISFETs in the array during a UV irradiation process (see
(283) Having discussed several aspects of an exemplary ISFET array,
(284)
(285) In one aspect of the embodiment shown in
(286) In particular,
(287) For each of the first and second groups of rows, the array 100A of
(288) In one exemplary implementation of the array 100A of
(289) Like the array 100 of
(290)
(291) As noted in
(292)
(293) While the exemplary arrays discussed above in connection with
(294) The array 100D of
(295) The array 100E of
(296) Thus, in various examples of ISFET arrays based on the inventive concepts disclosed herein, an array pitch of approximately nine (9) micrometers (e.g., a sensor surface area of less than ten micrometers by ten micrometers) allows an ISFET array including over 256,000 pixels (i.e., a 512 by 512 array), together with associated row and column select and bias/readout electronics, to be fabricated on a 7 millimeter by 7 millimeter semiconductor die, and a similar sensor array including over four million pixels (i.e., a 2048 by 2048 array, over 4 Mega-pixels) to be fabricated on a 21 millimeter by 21 millimeter die. In other examples, an array pitch of approximately 5 micrometers allows an ISFET array including approximately 1.55 Mega-pixels (i.e., a 1348 by 1152 array) and associated electronics to be fabricated on a 9 millimeter by 9 millimeter die, and an ISFET sensor array including over 14 Mega-pixels and associated electronics on a 22 millimeter by 20 millimeter die. In yet other implementations, using a CMOS fabrication process in which feature sizes of less than 0.35 micrometers are possible (e.g., 0.18 micrometer CMOS processing techniques), ISFET sensor arrays with a pixel size/pitch significantly below 5 micrometers may be fabricated (e.g., array pitch of 2.6 micrometers or pixel/sensor area of less than 8 or 9 micrometers.sup.2), providing for significantly dense ISFET arrays.
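The pixel counts and die sizes quoted above can be cross-checked with a few lines of arithmetic. The sketch below computes the pixel count and the linear extent of the sensing area for the array dimensions and pitches given in the text; the helper names are hypothetical, and the extent covers the pixel area only, with the remainder of each die available for the select and readout electronics.

```python
def pixel_count(n_a: int, n_b: int) -> int:
    """Total number of pixels in an n_a-by-n_b array."""
    return n_a * n_b


def extent_mm(n_pixels: int, pitch_um: float) -> float:
    """Linear extent, in millimeters, of n_pixels laid out at the given pitch."""
    return n_pixels * pitch_um / 1000.0


if __name__ == "__main__":
    # (array dimensions, pitch in micrometers) as quoted in the text
    examples = [((512, 512), 9.0), ((2048, 2048), 9.0), ((1348, 1152), 5.0)]
    for (n_a, n_b), pitch_um in examples:
        print(
            f"{n_a} x {n_b} at {pitch_um} um pitch: "
            f"{pixel_count(n_a, n_b):,} pixels, "
            f"{extent_mm(n_a, pitch_um):.3f} mm x {extent_mm(n_b, pitch_um):.3f} mm"
        )
```

For example, the 512 by 512 array at a 9 micrometer pitch spans about 4.6 millimeters on a side, which fits within the stated 7 millimeter by 7 millimeter die.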
(297) In the embodiments of ISFET arrays discussed above, array pixels employ a p-channel ISFET, as discussed above in connection with
(298) For example,
(299) One of the primary differences between the n-channel ISFET pixel design of
(300) In addition to the pixel designs shown in
(301)
(302) In
(303) In
(304)
(305)
(306) Computer Hardware and Software
(307) With respect to the computer interface 252 of the array controller 250, in one exemplary implementation the interface is configured to facilitate a data rate of approximately 200 MB/sec to the computer 260, and may include local storage of up to 400 MB or greater. The computer 260 is configured to accept data at a rate of 200 MB/sec, and process the data so as to reconstruct an image of the pixels (e.g., which may be displayed in false-color on a monitor). For example, the computer may be configured to execute a general-purpose program with routines written in C++ or Visual Basic to manipulate the data and display it as desired.
(308) The systems described herein, when used for sequencing, typically involve a chemFET array supporting reaction chambers, the chemFETs being coupled to an interface capable of executing logic that converts the signals from the chemFETs into sequencing information.
(309) The sequencing information obtained from the system may be delivered to a handheld computing device, such as a personal digital assistant. Thus, in one embodiment, the invention encompasses logic for displaying a complete genome of an organism on a handheld computing device. The invention also encompasses logic adapted for sending data from a chemFET array to a handheld computing device. Any of such logic may be computer-implemented.
(310) Microfluidics and Microwell Arrays
(311) Turning from the sensor discussion, we will now be addressing the combining of the ISFET array with a microwell array and the attendant fluidics. As most of the drawings of the microwell array structure are presented only in cross-section or showing that array as only a block in a simplified diagram,
(312) Fluidic System: Apparatus and Method for Use with High Density Electronic Sensor Arrays
(313) For many uses, to complete a system for sensing chemical reactions or chemical agents using the above-explained high density electronic arrays, techniques and apparatus are required for delivering, to the array elements (called pixels), fluids containing chemical or biochemical components for sensing. In this section, exemplary techniques and methods that are useful for such purposes, and that have desirable characteristics, will be illustrated.
(314) As high speed operation of the system may be desired, it is preferred that the fluid delivery system, insofar as possible, not limit the speed of operation of the overall system.
(315) Accordingly, needs exist not only for high-speed, high-density arrays of ISFETs or other elements sensitive to ion concentrations or other chemical attributes, or changes in chemical attributes, but also for related mechanisms and techniques for supplying to the array elements the samples to be evaluated, in sufficiently small reaction volumes as to substantially advance the speed and quality of detection of the variable to be sensed.
(316) There are two and sometimes three components or subsystems, and related methods, involved in delivery of the subject chemical samples to the array elements: (1) a macrofluidic system of reagent and wash fluid supplies and appropriate valving and ancillary apparatus, (2) a flow cell and (3) in many applications, a microwell array. Each of these subsystems will be discussed, though in reverse order.
(317) Microwell Array
(318) As discussed elsewhere, for many uses, such as in DNA sequencing, it is desirable to provide over the array of semiconductor sensors a corresponding array of microwells, each microwell being small enough preferably to receive only one DNA-loaded bead, in connection with which an underlying pixel in the array will provide a corresponding output signal.
(319) The use of such a microwell array involves three stages of fabrication and preparation, each of which is discussed separately: (1) creating the array of microwells to result in a chip having a coat comprising a microwell array layer; (2) mounting of the coated chip to a fluidic interface; and in the case of DNA sequencing, (3) loading DNA-loaded bead or beads into the wells. It will be understood, of course, that in other applications, beads may be unnecessary or beads having different characteristics may be employed.
(320) The systems described herein can include an array of microfluidic reaction chambers integrated with a semiconductor comprising an array of chemFETs. In some embodiments, the invention encompasses such an array. The reaction chambers may, for example, be formed in a glass, dielectric, photodefineable or etchable material. The glass material may be silicon dioxide.
(321) Preferably, the array comprises at least 100,000 chambers. Preferably, each reaction chamber has a horizontal width and a vertical depth that has an aspect ratio of about 1:1 or less. Preferably, the pitch between the reaction chambers is no more than about 10 microns.
(322) The above-described array can also be provided in a kit for sequencing. Thus, in some embodiments, the invention encompasses a kit comprising an array of microfluidic reaction chambers integrated with an array of chemFETs, and one or more amplification reagents.
(323) In some embodiments, the invention encompasses a sequencing apparatus comprising a dielectric layer overlying a chemFET, the dielectric layer having a recess laterally centered atop the chemFET. Preferably, the dielectric layer is formed of silicon dioxide.
(324) Microwell Array Fabrication
(325) Microwell fabrication may be accomplished in a number of ways. The actual details of fabrication may require some experimentation and vary with the processing capabilities that are available.
(326) In general, fabrication of a high density array of microwells involves photo-lithographically patterning the well array configuration on a layer or layers of material, such as photoresist (organic or inorganic) or a dielectric, using an etching process. The patterning may be done with the material on the sensor array, or it may be done separately and then transferred onto the sensor array chip, or some combination of the two. However, techniques other than photolithography are not to be excluded if they provide acceptable results.
(327) One example of a method for forming a microwell array is now discussed, starting with reference to
(328) After the semiconductor structures, as shown, are formed, the microwell structure is applied to the die. That is, the microwell structure can be formed right on the die or it may be formed separately and then mounted onto the die, either approach being acceptable. To form the microwell structure on the die, various processes may be used. For example, the entire die may be spin-coated with, for example, a negative photoresist such as Microchem's SU-8 2015 or a positive resist/polyimide such as HD Microsystems HD8820, to the desired height of the microwells. The desired height of the wells (e.g., about 4-12 μm in the example of one pixel per well, though not so limited as a general matter) in the photoresist layer(s) can be achieved by spinning the appropriate resist at predetermined rates (which can be found by reference to the literature and manufacturer specifications, or empirically), in one or more layers. (Well height typically may be selected in correspondence with the lateral dimension of the sensor pixel, preferably for a nominal 1:1-1.5:1 aspect ratio, height:width or diameter. Based on signal-to-noise considerations, there is a relationship between dimensions and the required data sampling rates to achieve a desired level of performance. Thus there are a number of factors that will go into selecting optimum parameters for a given application.) Alternatively, multiple layers of different photoresists may be applied or another form of dielectric material may be deposited. Various types of chemical vapor deposition may also be used to build up a layer of materials suitable for microwell formation therein.
(329) Once the photoresist layer (the singular form "layer" is used to encompass multiple layers in the aggregate, as well) is in place, the individual wells (typically mapped to have either one or four ISFET sensors per well) may be generated by placing a mask (e.g., of chromium) over the resist-coated die and exposing the resist to cross-linking (typically UV) radiation. All resist exposed to the radiation (i.e., where the mask does not block the radiation) becomes cross-linked and as a result will form a permanent plastic layer bonded to the surface of the chip (die). Unreacted resist (i.e., resist in areas which are not exposed, due to the mask blocking the light from reaching the resist and preventing cross-linking) is removed by washing the chip in a suitable solvent (i.e., developer) such as propylene glycol methyl ether acetate (PGMEA) or other appropriate solvent. The resultant structure defines the walls of the microwell array.
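The stated preference for a nominal 1:1 to 1.5:1 height-to-width aspect ratio translates a lateral well dimension into a target height range. The following is a minimal sketch of that arithmetic; the function name and the example widths are hypothetical, and other factors noted above (such as signal-to-noise and sampling rate) are not modeled.

```python
from typing import Tuple


def well_height_range_um(
    width_um: float, min_ratio: float = 1.0, max_ratio: float = 1.5
) -> Tuple[float, float]:
    """Well height range (micrometers) for a height:width ratio of 1:1 to 1.5:1."""
    return (min_ratio * width_um, max_ratio * width_um)


if __name__ == "__main__":
    for width_um in (4.0, 8.0):   # hypothetical lateral well dimensions
        low, high = well_height_range_um(width_um)
        print(f"width {width_um} um -> height {low}-{high} um")
```

For lateral dimensions of 4 to 8 micrometers, this gives heights between 4 and 12 micrometers, in line with the range mentioned for the one-pixel-per-well example.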
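The exposure and development logic for a negative-acting resist described above can be pictured with a toy grid model: wherever the mask is opaque the resist stays unexposed and is washed away (forming a well), and wherever the mask is clear the resist cross-links and remains (forming a wall). This is a conceptual sketch only; the grid representation and the function name are assumptions, and no optical or chemical effects are modeled.

```python
from typing import List


def develop_negative_resist(mask_opaque: List[List[bool]]) -> List[List[str]]:
    """Map a mask to the developed resist pattern for a negative-acting resist.

    mask_opaque[r][c] is True where the mask blocks the UV radiation.
    Returns '#' where cross-linked resist remains (wall) and '.' where
    unexposed resist is removed by the developer (well).
    """
    return [["." if opaque else "#" for opaque in row] for row in mask_opaque]


if __name__ == "__main__":
    # Hypothetical 3 x 5 mask with two opaque spots, one per intended well.
    mask = [
        [False, False, False, False, False],
        [False, True, False, True, False],
        [False, False, False, False, False],
    ]
    for row in develop_negative_resist(mask):
        print("".join(row))
```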
(330)
(331)
(332) After exposure of the die/resist to the UV radiation, a second layer of resist may be coated on the surface of the chip. This layer of resist may be relatively thick, such as about 400-450 μm thick, typically. A second mask 3210 (
(333) Other photolithographic approaches may be used for formation of the microwell array, of course, the foregoing being only one example.
(334) For example, contact lithography of various resolutions and with various etchants and developers may be employed. Both organic and inorganic materials may be used for the layer(s) in which the microwells are formed. The layer(s) may be etched on a chip having a dielectric layer over the pixel structures in the sensor array, such as a passivation layer, or the layer(s) may be formed separately and then applied over the sensor array. The specific choice of processes will depend on factors such as array size, well size, the fabrication facility that is available, acceptable costs, and the like.
(335) Among the various organic materials which may be used in some embodiments to form the microwell layer(s) are the above-mentioned SU-8 type of negative-acting photoresist, a conventional positive-acting photoresist and a positive-acting photodefineable polyimide. Each has its virtues and its drawbacks, well known to those familiar with the photolithographic art.
(336) Naturally, in a production environment, modifications will be appropriate.
(337) Contact lithography has its limitations and it may not be the production method of choice to produce the highest densities of wells, i.e., it may impose a higher than desired minimum pitch limit in the lateral directions. Other techniques, such as a deep UV step-and-repeat process, are capable of providing higher resolution lithography and can be used to produce small pitches and possibly smaller well diameters. Of course, for different desired specifications (e.g., numbers of sensors and wells per chip), different techniques may prove optimal. And pragmatic factors, such as the fabrication processes available to a manufacturer, may motivate the use of a specific fabrication method. While novel methods are discussed, various aspects of the invention are limited to use of these novel methods.
(338) Preferably the CMOS wafer with the ISFET array will be planarized after the final metallization process. A chemical mechanical dielectric planarization prior to the silicon nitride passivation is suitable. This will allow subsequent lithographic steps to be done on very flat surfaces which are free of back-end CMOS topography.
(339) By utilizing deep-UV step-and-repeat lithography systems, it is possible to resolve small features with superior resolution, registration, and repeatability. However, the high resolution and large numerical aperture (NA) of these systems preclude their having a large depth of focus. As such, it may be necessary, when using such a fabrication system, to use thinner photodefinable spin-on layers (i.e., resists on the order of 1-2 μm rather than the thicker layers used in contact lithography) to pattern transfer and then etch microwell features to underlying layer or layers. High resolution lithography can then be used to pattern the microwell features and conventional SiO.sub.2 etch chemistries can be used (one each for the bondpad areas and then the microwell areas) having selective etch stops; the etch stops then can be on aluminum bondpads and silicon nitride passivation (or the like), respectively. Alternatively, other suitable substitute pattern transfer and etch processes can be employed to render microwells of inorganic materials.
(340) Another approach is to form the microwell structure in an organic material. For example, a dual-resist soft-mask process may be employed, whereby a thin high-resolution deep-UV resist is used on top of a thicker organic material (e.g., cured polyimide or opposite-acting resist). The top resist layer is patterned. The pattern can be transferred using an oxygen plasma reactive ion etch process. This process sequence is sometimes referred to as the portable conformable mask (PCM) technique. See B. J. Lin et al., Practicing the Novolac deep-UV portable conformable masking technique, Journal of Vacuum Science and Technology 19, No. 4, 1313-1319 (1981); and A. Cooper et al, Optimization of a photosensitive spin-on dielectric process for copper inductor coil and interconnect protection in RF SoC devices.
(341) Alternatively a drill-focusing technique may be employed, whereby several sequential step-and-repeat exposures are done at different focal depths to compensate for the limited depth of focus (DOF) of high-resolution steppers when patterning thick resist layers. This technique depends on the stepper NA and DOF as well as the contrast properties of the resist material.
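The dependence of drill-focusing on stepper NA and DOF may be illustrated with the Rayleigh scaling DOF ≈ k2·λ/NA², a standard optics estimate that is not stated in this description. The following is a minimal sketch only: the 248 nm wavelength, the process factor k2 = 1.0, the NA values and the function names are all assumptions chosen for illustration, and resist contrast is ignored.

```python
import math

def depth_of_focus_um(wavelength_nm: float, na: float, k2: float = 1.0) -> float:
    """Rayleigh estimate DOF = k2 * wavelength / NA**2, returned in microns.

    k2 is a process-dependent factor; 1.0 is an assumed placeholder.
    """
    return k2 * (wavelength_nm / 1000.0) / na ** 2

def focal_steps(resist_thickness_um: float, dof_um: float) -> int:
    """Number of exposures at distinct focal depths needed to span the resist."""
    return max(1, math.ceil(resist_thickness_um / dof_um))

if __name__ == "__main__":
    # Assumed deep-UV wavelength (248 nm) and a hypothetical 6 um thick layer.
    for na in (0.5, 0.6, 0.7):
        dof = depth_of_focus_um(248.0, na)
        steps = focal_steps(6.0, dof)
        print(f"NA={na:.1f}  DOF~{dof:.2f} um  exposures={steps}")
```

Under these assumptions, raising the NA shrinks the depth of focus quadratically, so a thicker layer requires more sequential exposures.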
(342) Another PCM technique may be adapted to these purposes, such as that shown in U.S. patent application publication no. 2006/0073422 by Edwards et al. This is a three-layer PCM process and it is illustrated in
(343) In a first step, 3320, a layer of high contrast negative-acting photoresist such as type Shipley InterVia Photodielectric Material 8021 (IV8021) 3322 is spun on the surface of a wafer, which we shall assume to be the wafer providing the substrate 3312 of
(344) Although as shown above, the wells bottom out (i.e. terminate) on the top passivation layer of the ISFETs, it is believed that an improvement in ISFET sensor performance (i.e. such as signal-to-noise ratio) can be obtained if the active bead(s) is(are) kept slightly elevated from the ISFET passivation layer. One way to do so is to place a spacer bump within the boundary of the pixel microwell. An example of how this could be rendered would be not etching away a portion of the layer-or-layers used to form the microwell structure (i.e. two lithographic steps to form the microwells: one to etch part way done, the other to pattern the bump and finish the etch to bottom out), by depositing and lithographically defining and etching a separate layer to form the bump, by using a permanent photo-definable material for the bump once the microwells are complete, or by forming the bump prior to forming the microwell. The bump feature is shown as 3350 in
(345) Using a 6 micron thick layer of tetraethyl orthosilicate (TEOS) as a SiO.sub.2-like layer for microwell formation,
(346) In the orthogonal cross-sectional view (i.e., looking down from the top), the wells may be formed in either round or square shape. Round wells may improve bead capture and may obviate the need for packing beads at the bottom or top of the wells.
(347) The tapered slopes to the sides of the microwells also may be used to advantage. Referring to
(348) Thus, microwells can be fabricated by any high aspect ratio photo-definable or etchable thin-film process that can provide requisite thickness (e.g., about 4-10 μm). Among the materials believed to be suitable are photosensitive polymers, deposited silicon dioxide, non-photosensitive polymer which can be etched using, for example, plasma etching processes, etc. In the silicon dioxide family, TEOS and silane nitrous oxide (SILOX) appear suitable. The final structures are similar but the various materials present differing surface compositions that may cause the target biology or chemistry to react differently.
(349) When the microwell layer is formed, it may be necessary to provide an etch stop layer so that the etching process does not proceed further than desired. For example, there may be an underlying layer to be preserved, such as a low-K dielectric. The etch stop material should be selected according to the application. SiC and SiN materials may be suitable, but that is not meant to indicate that other materials may not be employed instead. These etch-stop materials can also serve to enhance the surface chemistry which drives the ISFET sensor sensitivity, by choosing the etch-stop material to have an appropriate point of zero charge (PZC). Various metal oxides may be suitable in addition to silicon dioxide and silicon nitride.
(350) The PZCs for various metal oxides may be found in various texts, such as Metal Oxides-Chemistry and Applications by J. Fierro. We have learned that Ta.sub.2O.sub.5 may be preferred as an etch stop over Al.sub.2O.sub.3 because the PZC of Al.sub.2O.sub.3 is right at the pH being used (i.e., about 8.8) and, hence, right at the point of zero charge. In addition Ta.sub.2O.sub.5 has a higher sensitivity to pH (i.e., mV/pH), another important factor in the sensor performance. Optimizing these parameters may require judicious selection of passivation surface materials.
(351) Using thin metal oxide materials for this purpose (i.e., as an etch stop layer) is difficult due to the fact of their being so thinly deposited (typically 200-500 Å). A post-microwell fabrication metal oxide deposition technique may allow placement of appropriate PZC metal oxide films at the bottom of the high aspect ratio microwells.
(352) Electron-beam depositions of (a) reactively sputtered tantalum oxide, (b) non-reactive stoichiometric tantalum oxide, (c) tungsten oxide, or (d) vanadium oxide may prove to have superior down-in-well coverage due to the superior directionality of the deposition process.
(353) The array typically comprises at least 100 microfluidic wells, each of which is coupled to one or more chemFET sensors. Preferably, the wells are formed in at least one of a glass (e.g., SiO.sub.2), a polymeric material, a photodefinable material or a reactively ion etchable thin film material. Preferably, the wells have a width to height ratio less than about 1:1. Preferably the sensor is a field effect transistor, and more preferably a chemFET. The chemFET may optionally be coupled to a PPi receptor. Preferably, each of the chemFETs occupies an area of the array that is 10.sup.2 microns or less.
(354) In some embodiments, the invention encompasses a sequencing device comprising a semiconductor wafer device coupled to a dielectric layer such as a glass (e.g., SiO.sub.2), polymeric, photodefinable or reactive ion etchable material in which reaction chambers are formed. Typically, the glass, dielectric, polymeric, photodefinable or reactive ion etchable material is integrated with the semiconductor wafer layer. In some instances, the glass, polymeric, photodefinable or reactive ion etchable layer is non-crystalline. In some instances, the glass may be SiO.sub.2. The device can optionally further comprise a fluid delivery module of a suitable material such as a polymeric material, preferably an injection moldable material. More preferably, the polymeric layer is polycarbonate.
(355) In some embodiments, the invention encompasses a method for manufacturing a sequencing device comprising: using photolithography, generating wells in a glass, dielectric, photodefinable or reactively ion etchable material on top of an array of transistors.
(356) Yet another alternative when a CMOS or similar fabrication process is used for array fabrication is to form the microwells directly using the CMOS materials. That is, the CMOS top metallization layer forming the floating gates of the ISFET array usually is coated with a passivation layer that is about 1.3 μm thick. Microwells 1.3 μm deep can be formed by etching away the passivation material. For example, microwells having a 1:1 aspect ratio may be formed, 1.3 μm deep and 1.3 μm across at their tops. Modeling indicates that as the well size is reduced, in fact, the DNA concentration, and hence the SNR, increases. So, other factors being equal, such small wells may prove desirable.
(357) Mounting the Flow Cell (Fluidic Interface) to the Sensor Chip
(358) The process of using the assembly of an array of sensors on a chip combined with an array of microwells to sequence the DNA in a sample is referred to as an experiment. Executing an experiment requires loading the wells with the DNA-bound beads and the flowing of several different fluid solutions (i.e., reagents and washes) across the wells. A fluid delivery system (e.g., valves, conduits, pressure source(s), etc.) coupled with a fluidic interface is needed which flows the various solutions across the wells in a controlled even flow with acceptably small dead volumes and small cross contamination between sequential solutions. Ideally, the fluidic interface to the chip (sometimes referred to as a flow cell) would cause the fluid to reach all microwells at the same time. To maximize array speed, it is necessary that the array outputs be available at as close to the same time as possible. The ideal clearly is not possible, but it is desirable to minimize the differentials, or skews, of the arrival times of an introduced fluid, at the various wells, in order to maximize the overall speed of acquisition of all the signals from the array.
(359) Flow cell designs of many configurations are possible; thus the system and methods presented herein are not dependent on use of a specific flow cell configuration. It is desirable, though, that a suitable flow cell substantially conform to the following set of objectives: have connections suitable for interconnecting with a fluidics delivery system, e.g., via appropriately-sized tubing; have appropriate head space above wells; minimize dead volumes encountered by fluids; minimize small spaces in contact with liquid but not quickly swept clean by flow of a wash fluid through the flow cell (to minimize cross contamination); be configured to achieve uniform transit time of the flow over the array; generate or propagate minimal bubbles in the flow over the wells; be adaptable to placement of a removable reference electrode inside or as close to the flow chamber as possible; facilitate easy loading of beads; be manufacturable at acceptable cost; and be easily assembled and attached to the chip package.
(360) Satisfaction of these criteria so far as possible will contribute to system performance positively. For example, minimization of bubbles is important so that signals from the array truly indicate the reaction in a well rather than being spurious noise.
(361) Each of several example designs will be discussed, meeting these criteria in differing ways and degrees. In each instance, one typically may choose to implement the design in one of two ways: either by attaching the flow cell to a frame and gluing the frame (or otherwise attaching it) to the chip or by integrating the frame into the flow cell structure and attaching this unified assembly to the chip. Further, designs may be categorized by the way the reference electrode is integrated into the arrangement. Depending on the design, the reference electrode may be integrated into the flow cell (e.g., form part of the ceiling of the flow chamber) or be in the flow path (typically to the outlet or downstream side of the flow path, after the sensor array).
(362) A first example of a suitable experiment apparatus 3410 incorporating such a fluidic interface is shown in
(363) The apparatus comprises a semiconductor chip 3412 (indicated generally, though hidden) on or in which the arrays of wells and sensors are formed, and a fluidics assembly 3414 on top of the chip and delivering the sample to the chip for reading. The fluidics assembly includes a portion 3416 for introducing fluid containing the sample, a portion 3418 for allowing the fluid to be piped out, and a flow chamber portion 3420 for allowing the fluid to flow from inlet to outlet and along the way interact with the material in the wells. Those three portions are unified by an interface comprising a glass slide 3422 (e.g., Erie Microarray Cat #C22-5128-M20 from Erie Scientific Company, Portsmouth, N.H., cut in thirds, each to be of size about 25 mm × 25 mm).
(364) Mounted on the top face of the glass slide are two fittings, 3424 and 3426, such as nanoport fittings Part #N-333 from Upchurch Scientific of Oak Harbor, Wash. One port (e.g., 3424) serves as an inlet delivering liquids from the pumping/valving system described below but not shown here. The second port (e.g., 3426) is the outlet which pipes the liquids to waste. Each port connects to a conduit 3428, 3432 such as flexible tubing of appropriate inner diameter. The nanoports are mounted such that the tubing can penetrate corresponding holes in the glass slide. The tube apertures should be flush with the bottom surface of the slide.
(365) On the bottom of the glass slide, flow chamber 3420 may comprise various structures for promoting a substantially laminar flow across the microwell array. For example, a series of microfluidic channels fanning out from the inlet pipe to the edge of the flow chamber may be patterned by contact lithography using negative-acting photoresists such as SU-8 photoresist from MicroChem Corp. of Newton, Mass. Other structures will be discussed below.
(366) The chip 3412 will in turn be mounted to a carrier 3430, for packaging and connection to connector pins 3432.
(367) For ease of description, to discuss fabrication starting with
(368) A layer of photoresist 3810 is applied to the top of the slide (which will become the bottom side when the slide and its additional layers are turned over and mounted to the sensor assembly of ISFET array with microwell array on it). Layer 3810 may be about 150 μm thick in this example, and it will form the primary fluid carrying layer from the end of the tubing in the nanoports to the edge of the sensor array chip. Layer 3810 is patterned using a mask such as the mask 3910 of
(369) A second layer of photoresist is formed quite separately, not on the resist 3810 or slide 3422. Preferably it is formed on a flat, flexible surface (not shown), to create a peel-off, patterned plastic layer. As shown in
(370) The other alignment mark or set of marks produced by pattern 4022 is used for alignment with a subsequent layer to be discussed.
(371) The second layer is preferably about 150 μm deep and it will cover the fluid-carrying channel with the exception of a slit about 150 μm long at each respective edge of the sensor array chip, under slit-forming regions 4014 and 4016.
(372) Once the second layer of photoresist is disposed on the first layer, a third patterned layer of photoresist is formed over the second layer, using a mask such as mask 4110, shown in
(373)
(374) The fluidics assembly may be secured to the sensor array chip assembly by applying an adhesive to parts of mating surfaces of those two assemblies, and pressing them together, in alignment.
(375) Though not illustrated in
(376) Another way to introduce the reference electrode is shown in
(377) Achieving a uniform flow front and eliminating problematic flow path areas is desirable for a number of reasons. One reason is that very fast transition of fluid interfaces within the system's flow cell is desired for many applications, particularly gene sequencing. In other words, an incoming fluid must completely displace the previous fluid in a short period of time. Uneven fluid velocities and diffusion within the flow cell, as well as problematic flow paths, can compete with this requirement. Simple flow through a conduit of rectangular cross section can exhibit considerable disparity of fluid velocity from regions near the center of the flow volume to those adjacent the sidewalls, one sidewall being the top surface of the microwell layer and the fluid in the wells. Such disparity leads to spatially and temporally large concentration gradients between the two traveling fluids. Further, bubbles are likely to be trapped or created in stagnant areas like sharp corners interior the flow cell. (The surface energy (hydrophilic vs. hydrophobic) can significantly affect bubble retention. Avoidance of surface contamination during processing and use of a surface treatment to create a more hydrophilic surface should be considered if the as-molded surface is too hydrophobic.) Of course, the physical arrangement of the flow chamber is probably the factor which most influences the degree of uniformity achievable for the flow front.
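The disparity of fluid velocity between the center of the flow volume and the regions adjacent the walls can be illustrated with the textbook parabolic profile for steady laminar flow between two parallel plates. This is a minimal sketch under stated assumptions: it idealizes the flow chamber as a wide, shallow slot, ignores the microwells and diffusion, and uses hypothetical function names and dimensions that do not come from this description.

```python
def velocity(y: float, height: float, u_mean: float) -> float:
    """Plane Poiseuille profile: zero at both walls, 1.5 * u_mean at mid-height."""
    s = 2.0 * y / height - 1.0  # -1 at the bottom wall, +1 at the top wall
    return 1.5 * u_mean * (1.0 - s * s)

def transit_time(y: float, height: float, u_mean: float, length: float) -> float:
    """Time for fluid at height y to travel the chamber length by advection alone."""
    u = velocity(y, height, u_mean)
    return float("inf") if u <= 0.0 else length / u

if __name__ == "__main__":
    # Hypothetical numbers: 0.3 mm tall chamber, 10 mm long, 20 mm/s mean velocity.
    h, length, u_mean = 0.3e-3, 10e-3, 20e-3
    for frac in (0.5, 0.25, 0.05):
        t = transit_time(frac * h, h, u_mean, length)
        print(f"y/h={frac:.2f}  transit={t:.2f} s")
```

In this idealization the fluid at mid-height moves at 1.5 times the mean velocity while fluid near a wall barely moves, which is one way to picture why an incoming fluid does not displace the previous fluid as a sharp front.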
(378) One approach is to configure the flow cross section of the flow chamber to achieve flow rates that vary across the array width so that the transit times are uniform across the array. For example, the cross section of the diffuser (i.e., flow expansion chamber) section 3416, 3610 may be made as shown at 4204A in
(379) Another configuration, shown in
(380) FIGS. 42F2-42F8 illustrate an example of a single-piece, injection-molded (preferably of polycarbonate) flow cell member 42F200 which may be used to provide baffles 4220F, a ceiling to the flow chamber, fluid inlet and outlet ports and even the reference electrode. FIG. 42F7 shows an enlarged view of the baffles on the bottom of member 42F200 and the baffles are shown as part of the underside of member 42F200 in FIG. 42F6. As it is difficult to form rectangular features in small dimensions by injection molding, the particular instance of these baffles, shown as 4220F, are triangular in cross section.
(381) In FIG. 42F2, there is a top, isometric view of member 42F200 mounted onto a sensor array package 42F300, with a seal 42F202 formed between them and contact pins 42F204 depending from the sensor array chip package. FIGS. 42F3 and 42F4 show sections, respectively, through section lines H-H and I-I of FIG. 42F5, permitting one to see in relationship the sensor array chip 42F250, the baffles 4220F and fluid flow paths via inlet 42F260 and outlet 42F270 ports.
(382) By applying a metallization to bottom 42F280 of member 42F200, the reference electrode may be formed.
(383) Various other locations and approaches may be used for introducing fluid flow into the flow chamber, as well. In addition to embodiments in which fluid may be introduced across the width of an edge of the chip assembly 42F1, as in
(384)
(385) A variation on this idea is depicted in
(386) In all cases, attention should be given to assuring a thorough washing of the entire flow chamber, along with the microwells, between reagent cycles. Flow disturbances may exacerbate the challenge of fully cleaning out the flow chamber.
(387) Flow disturbances may also induce or multiply bubbles in the fluid. A bubble may prevent the fluid from reaching a microwell, or delay its introduction to the microwell, introducing error into the microwell reading or making the output from that microwell useless in the processing of outputs from the array. Thus, care should be taken in selecting configurations and dimensions for the flow disruptor elements to manage these potential adverse factors. For example, a tradeoff may be made between the heights of the disruptor elements and the velocity profile change that is desired.
(388)
(389) In the illustrated embodiment, the reference electrode is introduced to the top of the flow chamber via a bore 4325 in the member 4320. The placement of the removable reference electrode is facilitated by a silicone sleeve 4360 and an epoxy stop ring 4370 (see the blow-up of
(390)
(391)
(392) Yet another alternative for a fluidics assembly, as shown in
(393) Some of the foregoing alternative embodiments also may be implemented in a hybrid plastic/PDMS configuration. For example, as shown in
(394) The fluidic structure may also be made from glass as discussed above, such as photo-definable (PD) glass. Such a glass may have an enhanced etch rate in hydrofluoric acid once selectively exposed to UV light and features may thereby be micromachined on the top-side and back-side, which when glued together can form a three-dimensional low aspect ratio fluidic cell.
(395) An example is shown in
(396) Nanoports may be secured over the nanoport fluidic holes to facilitate connection of input and output tubing.
(397) A central bore 5550 may be etched through the glass layers for receiving a reference electrode, 5560. The electrode may be secured and sealed in place with a silicone collar 5570 or like structure; or the electrode may be equipped integrally with a suitable washer for effecting the same purpose.
(398) By using glass materials for the two-layer fluidic cell, the reference electrode may also be a conductive layer or pattern deposited on the bottom surface of the second glass layer (not shown). Or, as shown in
(399) Another alternative is to integrate the reference electrode to the sequencing chip/flow cell by using a metalized surface on the ceiling of the flow chamber, i.e., on the underside of the member forming the ceiling of the fluidic cell. An electrical connection to the metalized surface may be made in any of a variety of ways, including, but not limited to, by means of applying a conductive epoxy to the ceramic package seal ring that, in turn, may be electrically connected through a via in the ceramic substrate to a spare pin at the bottom of the chip package. Doing this would allow system-level control of the reference potential in the fluid cell by means of inputs through the chip socket mount to the chip's control electronics.
(400) By contrast, an externally inserted electrode requires extra fluid tubing to the inlet port, which requires additional fluid flow between cycles.
(401) Ceramic pin grid array (PGA) packaging may be used for the ISFET array, allowing customized electrical connections between various surfaces on the front face with pins on the back.
(402) The flow cell can be thought of as a lid to the ISFET chip and its PGA. The flow cell, as stated elsewhere, may be fabricated of many different materials. Injection molded polycarbonate appears to be quite suitable. A conductive metal (e.g., gold) may be deposited using an adhesion layer (e.g., chrome) to the underside of the flow cell roof (the ceiling of the flow chamber). Appropriate low-temperature thin-film deposition techniques preferably are employed in the deposition of the metal reference electrode due to the materials (e.g., polycarbonate) and large step coverage topography at the bottom-side of the fluidic cell (i.e., the frame surround of ISFET array). One possible approach would be to use electron-beam evaporation in a planetary system.
(403) The active electrode area is confined to the central flow chamber inside the frame surround of the ISFET array, as that is the only metalized surface that would be in contact with the ionic fluid during sequencing.
(404) Once assembly is complete, with conductive epoxy (e.g., Epo-Tek H20E or similar) dispensed on the seal ring and the flow cell aligned, placed, pressed and cured, the ISFET flow cell is ready for operation with the reference potential being applied to the assigned pin of the package.
(405) The resulting fluidic system connections to the ISFET device thus incorporate shortened input and output fluidic lines, which is desirable.
(406) Still another example embodiment for a fluidic assembly is shown in
(407) Still further examples of flow cell structures are shown in
(408) Whether glass or plastic or other material is used to form the flow cell, it may be desirable, especially with larger arrays, to include in the inlet chamber of the flow cell, between the inlet conduit and the front edge of the array, not just a gradually expanding (fanning out) space, but also some structure to facilitate the flow across the array being suitably laminar. Using the bottom layer 5990 of an injection molded flow cell as an example, one example type of structure for this purpose, shown in
(409) The above-described systems typically utilize a laminar fluid flow system. In part, the fluid flow system preferably includes a flow chamber formed by the sensor chip and a single piece, injection molded member comprising inlet and outlet ports and mountable over the chip to establish the flow chamber. The surface of such member interior to the chamber is preferably formed to facilitate a desired expedient fluid flow, as described herein.
(410) In some embodiments, the invention encompasses an apparatus for detection of pH comprising a laminar fluid flow system. Preferably, the apparatus is used for sequencing a plurality of nucleic acid templates present in an array.
(411) The apparatus typically includes a fluidics assembly comprising a member comprising one or more apertures for non-mechanically directing a fluid to flow to an array of at least 100K (100 thousand), 500K (500 thousand), or 1M (1 million) microfluidic reaction chambers such that the fluid reaches all of the microfluidic reaction chambers at the same time or substantially the same time. Typically, the fluid flow is parallel to the sensor surface. Typically, the assembly has a Reynolds number of less than 1000, 500, 200, 100, 50, 20, or 10. Preferably, the member further comprises a first aperture for directing fluid towards the sensor array and a second aperture for directing fluid away from the sensor array.
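To make the Reynolds number bounds concrete, the following minimal sketch estimates the Reynolds number of flow through a rectangular flow chamber using the standard definition Re = ρ·v·D_h/μ with hydraulic diameter D_h = 2wh/(w + h). The chamber dimensions, flow rate and water-like fluid properties are assumed values for illustration only, and the function names are hypothetical.

```python
BOUNDS = (1000, 500, 200, 100, 50, 20, 10)  # bounds recited in paragraph (411)

def hydraulic_diameter(width_m: float, height_m: float) -> float:
    """Hydraulic diameter of a rectangular duct: 4 * area / wetted perimeter."""
    return 2.0 * width_m * height_m / (width_m + height_m)

def reynolds_number(flow_rate_m3_s: float, width_m: float, height_m: float,
                    density: float = 998.0, viscosity: float = 1.0e-3) -> float:
    """Re = density * mean velocity * hydraulic diameter / dynamic viscosity.

    Defaults approximate water near room temperature (assumed, SI units).
    """
    mean_velocity = flow_rate_m3_s / (width_m * height_m)
    return density * mean_velocity * hydraulic_diameter(width_m, height_m) / viscosity

def tightest_bound(re: float):
    """Smallest recited bound that the given Reynolds number is still below."""
    satisfied = [b for b in BOUNDS if re < b]
    return min(satisfied) if satisfied else None

if __name__ == "__main__":
    # Hypothetical chamber: 10 mm wide, 0.3 mm tall, 50 microliters per second.
    re = reynolds_number(50e-9, 10e-3, 0.3e-3)
    print(f"Re ~ {re:.1f}, below bound: {tightest_bound(re)}")
```

For a wide, shallow chamber the result depends mainly on the flow rate per unit width, so the same volumetric flow spread over a wider array yields a lower Reynolds number.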
(412) In some embodiments, the invention encompasses a method for directing a fluid to a sensor array comprising: providing a fluidics assembly comprising an aperture fluidly coupling a fluid source to the sensor array; and non-mechanically directing a fluid to the sensor array. By non-mechanically it is meant that the fluid is moved under pressure from a gaseous pressure source, as opposed to a mechanical pump.
(413) In some embodiments, the invention encompasses an array of wells, each of which is coupled to a lid having an inlet port and an outlet port and a fluid delivery system for delivering and removing fluid from said inlet and outlet ports non-mechanically.
(414) In some embodiments, the invention encompasses a method for sequencing a biological polymer such as a nucleic acid utilizing the above-described apparatus, comprising: directing a fluid comprising a monomer to an array of reaction chambers wherein the fluid has a fluid flow Reynolds number of at most 2000, 1000, 200, 100, 50, or 20. The method may optionally further comprise detecting a pH or a change in pH from each said reaction chamber. This is typically detected by ion diffusion to the sensor surface. There are various other ways of providing a fluidics assembly for delivering an appropriate fluid flow across the microwell and sensor array assembly, and the foregoing examples are thus not intended to be exhaustive.
(415) Reference Electrode
(416) Commercial flow-type fluidic electrodes, such as silver chloride proton-permeable electrodes, may be inserted in series in a fluidic line and are generally designed to provide a stable electrical potential along the fluidic line for various electrochemical purposes. In the above-discussed system, however, such a potential must be maintained at the fluidic volume in contact with the microwell ISFET chip. With conventional silver chloride electrodes, it has been found difficult, due to an electrically long fluidic path between the chip surface and the electrode (through small channels in the flow cell), to achieve a stable potential. This led to reception of noise in the chip's electronics. Additionally, the large volume within the flow cavity of the electrode tended to trap and accumulate gas bubbles that degraded the electrical connection to the fluid. With reference to
(417) Fluidics System
(418) A complete system for using the sensor array will include suitable fluid sources, valving and a controller for operating the valving to flow reagents and washes over the microarray or sensor array, depending on the application. These elements are readily assembled from off-the-shelf components, and the controller may readily be programmed to perform a desired experiment.
(419) It should be understood that the readout at the chemFET may be current or voltage (and change thereof) and that any particular reference to either readout is intended for simplicity and not to the exclusion of the other readout. Therefore any reference in the following text to either current or voltage detection at the chemFET should be understood to contemplate and apply equally to the other readout as well. In important embodiments, the readout reflects a rapid, transient change in concentration of an analyte. The concentration of more than one analyte may be detected at different times. In some instances, such measurements are to be contrasted with methods that focus on steady state concentration measurements.
(420) Biological and Chemical Reactions
(421) As already discussed, the apparatus, systems and methods of the invention can be used to detect and/or monitor interactions between various entities. These interactions include biological and chemical reactions and may involve enzymatic reactions and/or non-enzymatic interactions such as but not limited to binding events. As an example, the invention contemplates monitoring enzymatic reactions in which substrates and/or reagents are consumed and/or reaction intermediates, byproducts and/or products are generated. An example of a reaction that can be monitored according to the invention is a nucleic acid synthesis method such as one that provides information regarding nucleic acid sequence. This reaction will be discussed in greater detail herein.
(422) Nucleic Acid Sequencing
(423) In the context of a sequencing reaction, the apparatus and system provided herein is able to detect nucleotide incorporation based on changes in the chemFET current and/or voltage, as those latter parameters are interrelated. Current changes may be the result of one or more of the following events either singly or some combination thereof: generation of hydrogen (and concomitant changes in pH for example in the presence of low strength buffer or no buffer), generation of PPi, generation of Pi (e.g., in the presence of pyrophosphatase), increased charge of nucleic acids attached to the chemFET surface, and the like.
(424) As discussed herein, the invention contemplates methods for determining the nucleotide sequence of a nucleic acid. Such methods involve the synthesis of a new nucleic acid (e.g., using a primer that is hybridized to a template nucleic acid or a self-priming template, as will be appreciated by those of ordinary skill), based on the sequence of a template nucleic acid. That is, the sequence of the newly synthesized nucleic acid is complementary to the sequence of the template nucleic acid and therefore knowledge of sequence of the newly synthesized nucleic acid yields information about the sequence of the template nucleic acid.
(425) More specifically, knowledge of the sequence of the newly synthesized nucleic acid is obtained by determining whether a known nucleotide has been incorporated into the newly synthesized nucleic acid and, if so, how many of such known nucleotides have been incorporated. Importantly, the order in which the known nucleotides are added to the reaction mixture is known and thus the order of incorporated nucleotides (if any) is also known. In an illustrative embodiment, a template hybridized to a primer is contacted with a first pool of identical known nucleotides (e.g., dATP) in the presence of polymerase. If the next available position on the template is a thymidine residue, then the dATP is incorporated into the primed nucleic acid strand and a signal is detected for example based on hydrogen release. If the next available position is not a thymidine residue, then the dATP will not incorporate and no signal will be detected because no hydrogen will be released. If the next available position and one or more contiguous positions thereafter are thymidine residues, then a corresponding number of dATP will be incorporated and a signal commensurate with the number of nucleotides incorporated will be detected. The reaction well or chamber is then washed to remove unincorporated nucleotides and released hydrogen, following which another pool of identical known nucleotides (e.g., dCTP) is added. The process is repeated until all four nucleotides are separately added to the reaction well (i.e., one cycle), and then the cycles are repeated. The cycles may be repeated for 50 times, 100 times, 200 times, 300 times, 400 times, 500 times, 750 times, or more, depending on the length of sequence information desired.
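The flow-by-flow logic of paragraph (425) can be sketched as a small simulation. This is an idealized illustration rather than the signal processing of the system: it assumes error-free, complete incorporation, represents the signal as an integer count of incorporated nucleotides per flow, and uses a hypothetical flow order and hypothetical function names. The template string is written in the order in which the polymerase encounters its residues.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def simulate_flows(template: str, flow_order: str = "ACGT", cycles: int = 1):
    """Return one (nucleotide, incorporation count) pair per nucleotide flow."""
    pos = 0          # next available position on the template
    signals = []
    for _ in range(cycles):
        for nuc in flow_order:
            count = 0
            # Incorporate while the next template residue pairs with this nucleotide;
            # a homopolymer run yields a count greater than one.
            while pos < len(template) and COMPLEMENT[template[pos]] == nuc:
                count += 1
                pos += 1
            signals.append((nuc, count))  # count 0 means no signal for this flow
    return signals

def synthesized_strand(signals) -> str:
    """Rebuild the newly synthesized strand from the known flow order and counts."""
    return "".join(nuc * count for nuc, count in signals)

def inferred_template(signals) -> str:
    """The template is the complement of the synthesized strand, base by base."""
    return "".join(COMPLEMENT[b] for b in synthesized_strand(signals))

if __name__ == "__main__":
    flows = simulate_flows("TTGCA")
    print(flows)
    print(synthesized_strand(flows), inferred_template(flows))
```

Because the order of nucleotide flows is known, the per-flow counts alone suffice to reconstruct the synthesized strand and hence the template.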
(426) Nucleotide incorporation can be monitored in a number of ways, including the production of products such as PPi, Pi and/or H.sup.+. The incorporation of a dNTP into the nucleic acid strand releases PPi which can then be hydrolyzed to two orthophosphates (Pi) and one hydrogen ion (
(427) Alternatively, when templates or primers are attached to the sensor surface, nucleotide incorporation is detected based on an increase in charge (typically, negative charge) of the template, primer or template/primer complex. Templates may be bound to the chemFET surface or they may be hybridized to primers that are bound to the chemFET surface. Primers hybridized to the templates can be extended in the presence of polymerase and one or a combination of known nucleotides. Nucleotide incorporation is detected by increases in charge at the chemFET surface that result from the addition of phosphodiester backbone linkages that carry negative charges. Thus, with each successive addition of a nucleotide, the negative charge of the immobilized nucleic acid increases, and this increase can be detected by the chemFET. The number of nucleotide incorporations that can be detected in this manner may be at least 10, at least 20, at least 30, at least 40, at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, or more. The invention contemplates that in this instance nucleotide incorporation can be detected by measuring change in charge at the chemFET surface as well as released hydrogen ions that come into contact with the chemFET.
(428) Any and all of these events (and more as described herein) may be detected at the chemFET thereby causing a current change that correlates with nucleotide incorporation.
(429) The systems described herein can be used for sequencing nucleic acids without optical detection. Preferably, at least 10.sup.6 base pairs are sequenced per hour, more preferably at least 10.sup.7 base pairs are sequenced per hour, and most preferably at least 10.sup.8 base pairs are sequenced per hour using the above-described method. Thus, the method may be used to sequence an entire human genome within about 24 hours, more preferably within about 20 hours, even more preferably within about 15 hours, even more preferably within about 10 hours, even more preferably within about 5 hours, and most preferably within about 1 hour. These rates may be achieved using multiple ISFET arrays as shown herein, and processing their outputs in parallel.
(430) pH Based Nucleic Acid Sequencing
(431) Reduced Buffering
(432) Certain aspects of the invention therefore relate to detecting hydrogen ions released as a function of nucleotide incorporation and in some embodiments as a function of nucleotide excision. It is important in these and various other aspects to detect as many released hydrogen ions as possible in order to achieve as high a signal (and/or a signal to noise ratio) as possible. Strategies for increasing the number of released protons that are ultimately detected by the chemFET surface include without limitation limiting interaction of released protons with reactive groups in the well, choosing a material from which to manufacture the well in the first instance that is relatively inert to protons, preventing released protons from exiting the well prior to detection at the chemFET, and increasing the copy number of templates per well (in order to amplify the signal from each nucleotide incorporation), among others.
(433) Some instances of the invention employ an environment, including a reaction solution, that is minimally buffered, if at all. Buffering can be contributed by the components of the solution or by the solid supports in contact with such solution. A solution having no or low buffering capacity (or activity) is one in which changes in hydrogen ion concentration on the order of at least about +/-0.005 pH units, at least about +/-0.01, at least about +/-0.015, at least about +/-0.02, at least about +/-0.03, at least about +/-0.04, at least about +/-0.05, at least about +/-0.10, at least about +/-0.15, at least about +/-0.20, at least about +/-0.25, at least about +/-0.30, at least about +/-0.35, at least about +/-0.45, at least about +/-0.50, or more are detectable (e.g., using the chemFET sensors described herein). In some embodiments, the pH change per nucleotide incorporation is on the order of about 0.005. In some embodiments, the pH change per nucleotide incorporation is a decrease in pH. Reaction solutions that have no or low buffering capacity may contain no or very low concentrations of buffer, or may use weak buffers.
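To give a sense of scale for the pH changes listed above, the following Python sketch converts a pH decrease into the corresponding fractional increase in hydrogen ion concentration, using only the definition pH = -log10[H+]. The function name is hypothetical, and the calculation treats activity as equal to concentration, which is an idealization.

```python
# Worked arithmetic from the definition pH = -log10([H+]).
# Assumption (not from the source): activity equals concentration.

def relative_h_increase(ph_decrease):
    """Fractional increase in [H+] caused by a pH decrease of `ph_decrease`."""
    return 10 ** ph_decrease - 1


if __name__ == "__main__":
    for delta in (0.005, 0.05, 0.5):
        print(f"pH decrease {delta}: [H+] rises by {relative_h_increase(delta):.2%}")
```

Under these assumptions, a pH decrease of 0.005 corresponds to a rise of roughly 1.2% in hydrogen ion concentration, which illustrates why even weak buffering can mask the signal from a single incorporation.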
(434) A buffer is an ionic molecule (or a solution comprising an ionic molecule) that resists, to varying extents, changes in pH. Buffers include without limitation Tris, tricine, phosphate, boric acid, borate, acetate, morpholine, citric acid, carbonic acid, and phosphoric acid. The strength of a buffer is a relative term since it depends on the nature, strength and concentration of the acid or base added to or generated in the solution of interest. A weak buffer is a buffer that allows detection (and therefore is not able to control or mask) pH changes on the order of those listed above.
(435) The reaction solution may have a buffer concentration equal to or less than 1 mM, equal to or less than 0.9 mM, equal to or less than 0.8 mM, equal to or less than 0.7 mM, equal to or less than 0.6 mM, equal to or less than 0.5 mM, equal to or less than 0.4 mM, equal to or less than 0.3 mM, equal to or less than 0.2 mM, equal to or less than 0.1 mM, or less including zero. The buffer concentration may be 50-100 μM. A non-limiting example of a weak buffer suitable for the sequencing reactions described herein wherein pH change is the readout is 0.1 mM Tris or Tricine.
(436) In some aspects, in addition to or instead of using reduced buffering solutions, nucleotide incorporation (and optionally excision) is carried out in the presence of additional agents which serve to shield potential buffering events that may occur in solution. These agents are referred to herein as buffering inhibitors since they inhibit the ability of components within a solution or a solid support in contact with the solution to sequester and/or otherwise interfere with released hydrogen ions prior to their detection by the chemFET surface. In the absence of such inhibitors, released hydrogen ions may interact with or be sequestered by reactive groups in the solution or on solid supports in contact with the solution. These hydrogen ions are less likely to reach and be detected by the chemFET surface, leading to a weaker signal than is otherwise possible. In the presence of such inhibitors however there will be fewer reactive groups available for interaction with or sequestration of hydrogen ions. As a result, a greater proportion of released hydrogen ions will reach and be detected by the chemFET surface, leading to stronger signals. Reactive groups that can interfere with released hydrogen ions include without limitation reactive groups such as free bases on single stranded nucleic acids and SiOH groups that may be present in the passivation layer. Some suitable buffering inhibitors demonstrate little or no buffering capacity in the pH range of 5-9, meaning that pH changes on the order of 0.005, 0.01, 0.02, 0.03, 0.04, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5 or more pH units are detectable (e.g., by using an ISFET) in the presence of such inhibitors.
(437) There are various types of buffering inhibitors. One example of a buffering inhibitor is an agent that binds to single stranded nucleic acids (or single stranded nucleic acid regions, as may occur in a template nucleic acid) thereby shielding reactive groups such as free bases. These agents may be RNA oligonucleotides (or RNA oligomers, or oligoribonucleotides, as they are referred to herein interchangeably) having complementary sequences to the afore-mentioned single stranded regions of template nucleic acids. RNA oligonucleotides are useful because, unlike DNA oligonucleotides, they are not able to serve as primers for a sequencing reaction. In order to bind to (or shield the effects of) as much of a single stranded nucleic acid as possible, a plurality (or set, or mixture) of RNA oligonucleotides can be used. As an example, a set of RNA oligonucleotides that are 2, 3, 4, 5, 6, or more nucleotides in length can be used together with single stranded templates. The short length of these RNA oligonucleotides allows them to be displaced by the polymerase as it progresses along the length of the nucleic acid template. Such displacement does not require exonuclease activity from the polymerase. Typically, the RNA oligonucleotides are of random sequence. In some embodiments, this is preferred as no prior knowledge of the sequence of the single stranded region of the template is required.
(438) Another example of a class of buffering inhibitors is phospholipids. The phospholipids may be naturally occurring or non-naturally occurring phospholipids. Examples of phospholipids that may be used as buffering inhibitors include but are not limited to phosphatidylcholine, phosphatidylethanolamine, phosphatidylglycerol, and phosphatidylserine.
(439) Another example of a buffering inhibitor is sulfonic acid based surfactants such as poly(ethylene glycol) 4-nonylphenyl 3-sulfopropyl ether (PNSE), the potassium salt of which is shown in
(440) Another example of a buffering inhibitor is polyanionic electrolytes such as poly(styrenesulfonic acid), the sodium salt of which is shown in
(441) Another example of a buffering inhibitor is polycationic electrolytes such as poly(diallyldimethylammonium), the chloride salt of which is shown in
(442) Another example of a buffering inhibitor is tetramethyl ammonium, the chloride salt of which is shown in
(443) These various inhibitors may be present throughout a reaction by being included in nucleotide solutions, wash solutions, and the like. Alternatively, they may be flowed through the chamber at set times relative to the flow through of nucleotides and/or other reaction reagents. In still other embodiments, they may be coated on the chemFET surface (or reaction chamber surface). Such coating may be covalent or non-covalent.
(444) Another way of reducing the buffering capacity in the reaction well is to covalently attach nucleic acids to capture beads, in embodiments in which capture beads are used. Such covalent attachment is in contrast to non-covalent methods described herein that include for example biotin-streptavidin interactions. In these latter embodiments, biotinylated primers can be attached to streptavidin coated beads, followed by hybridization to template. However, streptavidin, like other proteins, is capable of buffering, and therefore its presence would interfere with the detection of hydrogen ions released as a consequence of nucleotide incorporation. Thus, the invention also contemplates in some instances approaches that do not rely on streptavidin in the attachment mechanism. One such alternative involves covalently coupling primers to beads (and/or other solid supports such as the chemFET surface). Covalently coupling primers to such solid supports serves at least two purposes. First, it eliminates the need for proteins, such as streptavidin, that comprise functional side groups (such as primary, secondary or tertiary amines and carboxylic acids) that can buffer pH changes in the range of pH 5-9. Second, it serves to increase the number of templates that can be conjugated to the solid support, such as a single bead, by reducing steric hindrance effects that may exist when using bulky proteins such as streptavidin. In still other embodiments, templates may be directly conjugated covalently to solid supports such as beads.
(445) Primers can be covalently coupled to beads in any number of ways, several of which are shown in
(446) Increasing the number of templates or primers (i.e., copy number) results in a greater number of nucleotide incorporations per sensor and/or per reaction chamber, thereby leading to a higher signal and thus signal to noise ratio. Copy number can be increased for example by using templates that are concatemers (i.e., nucleic acids comprising multiple, tandemly arranged, copies of the nucleic acid to be sequenced), by increasing the number of nucleic acids on or in beads up to and including saturating such beads, and by attaching templates or primers to beads or to the sensor surface in ways that reduce steric hindrance and/or ensure template attachment (e.g., by covalently attaching templates), among other things. Concatemer templates may be immobilized on or in beads or on other solid supports such as the sensor surface, although in some embodiments concatemer templates may be present in a reaction chamber without immobilization. For example, the templates (or complexes comprising templates and primers) may be covalently or non-covalently attached to the chemFET surface and their sequencing may involve detection of released hydrogen ions and/or addition of negative charge to the chemFET surface upon a nucleotide incorporation event. The latter detection scheme may be performed in a buffered environment or solution (i.e., any changes in pH will not be detected by the chemFET and thus such changes will not interfere with detection of negative charge addition to the chemFET surface).
(447) For some aspects described herein, it is important that buffering capacity not be affected in the process of increasing copy number. Thus, various methods are provided for increasing copy number using strategies and/or linkers that do not impact the buffering capacity of the environment. In some instances, the functional groups, linkers and/or polymers themselves have no or limited buffering capacity, and their use does not obscure the detection of hydrogen ions released as a result of nucleotide incorporation or excision, as the case may be.
(448) Increasing copy number may also be accomplished by increasing the number of attachment points for primers (or templates). Some of these methodologies are described below.
(449) In one embodiment, the solid support is coated with a polymer such as polyethylene glycol (PEG) which does not comprise functional groups that interact with the primer and its functional groups, except as provided below for initially attaching primer. PEG linkers of varying lengths can be used so that primers can be attached at varying distances from the solid support surface, thereby decreasing the amount of steric hindrance that may otherwise exist between primers and the complexes they ultimately form (e.g., primer/template hybrids). The solid supports can be coated one or more times with a mixture of 2, 3, 4, or more PEG linkers of differing lengths. The end result is an increased distance between ends of PEG linkers attached to the solid support. Attachment of primers to the PEG linkers can be accomplished using any reactive groups known in the art. As an example, click chemistry can be used between azide groups on the ends of PEG linkers and alkyne groups on the primers.
(450) In another embodiment, polymers having preferably more than one functional (or reactive) group are used. Each of the functional groups is available for conjugation with a separate primer. Useful polymers in this regard include those having hydroxyl groups, amine groups, thiol groups, and the like. Examples of suitable polymers include dextran and chitosan. Linear or branched forms of these polymers may be used. An example of a branched polymer with multiple functionalities is branched dextran. It will be apparent to those of ordinary skill in the art that any chimeric polymer or copolymer may also be used provided it has a sufficient number of functional groups for primer attachment.
(451) Yet another embodiment involves the use of dendrimers and preferably higher order dendrimers to bind primer. Dendrimers are three-dimensional complexes that can be made having any functional group. Examples of dendrimers include the PAMAM dendrimers, an example of which is CAS No. 163442-69-1 which has 256 amine groups. Dendrimers are commercially available from sources such as Sigma-Aldrich and Dendritic Nanotechnologies Inc. It will be understood that dendrimers with other functional groups also can be used.
(452) The invention further contemplates the use of any combination of the above embodiments for maximizing the number of primers attached to a solid support. Thus for example the solid support surface may be coated one or more times (e.g., once or twice) with the PEG linkers of varying lengths, and to such linkers may be attached multifunctionality polymers such as dextran or chitosan (in either linear or branched form), followed by attachment of primers. As another example, dendrimers may be attached to the PEG linkers, followed by primer attachment to the dendrimers.
(453) In still another embodiment, the invention contemplates coating the solid support surface with a population of self-assembling monomers some proportion of which are bound to primers. As an example, the monomers may be acrylamide monomers some of which are attached to primers. The end result is a solid support having a polyacrylamide coating with interspersed primers. The density of primers bound to the solid support can be manipulated by changing the ratio of monomers that have primers and monomers that lack primers. This strategy has been reported by Rehman et al. Nucleic Acids Research, 1999, 27(2):649-655.
(454) Still other methods for attaching nucleic acids to beads are taught by Lund et al., Nucleic Acids Research, 1988, 16(22):10861-10880, Joos et al. Anal Biochem, 1997, 247:96-101, Steinberg et al. Biopolymers, 2004, 73:597-605, and Steinberg-Tatman et al. Bioconjugate Chem 2006 17:841-848.
(455) Beads can be made of any material including but not limited to cellulose, cellulose derivatives, gelatin, acrylic resins, glass, silica gels, polyvinylpyrrolidone (PVP), co-polymers of vinyl and acrylamide, polystyrene, polystyrene cross-linked with divinylbenzene or the like (see, Merrifield Biochemistry 1964, 3, 1385-1390), polyacrylamides, latex gels, dextran, crosslinked dextrans (e.g., Sephadex), rubber, silicon, plastics, nitrocellulose, natural sponges, metal, and agarose gel (Sepharose). In one embodiment, the beads are streptavidin-coated beads.
(456) Beads suitable for covalent attachment may be magnetic or non-magnetic in nature. They may have a polymer core with a polymer surface, a polymer core with a silica surface, or a silica core with a silica surface. The bead core may be hollow, porous, or solid, as described below.
(457) The bead diameter will depend on the density of the chemFET and microwell arrays used, with larger arrays (and thus smaller sized wells) requiring smaller beads. Generally the bead size may be about 1-10 microns, and more preferably 2-6 microns. In some embodiments, the beads are about 5.9 microns while in other embodiments the beads are about 2.8 microns. In still other embodiments, the beads are about 1.5 microns, or about 1 micron in diameter. In some embodiments, beads having a diameter that ranges from about 3.3 to 3.5 microns may be used for reaction well arrays having a pitch on the order of about 5.1 microns. In other embodiments, beads having a diameter that ranges from about 5 to 6.5 microns may be used for reaction well arrays having a pitch on the order of about 9 microns. It is to be understood that the beads may or may not be perfectly spherical in shape. It is also to be understood that other beads may be used and other mechanisms for attaching the nucleic acid to the beads may be used. In some instances the capture beads (i.e., the beads on which the sequencing reaction occurs) are the same as the template preparation beads including the amplification beads. In some instances, even where non-covalent attachment is contemplated, a spacer is used to distance the template nucleic acid (and in particular the target nucleic acid sequence comprised therein) from a solid support such as a bead. This facilitates sequencing of the end of the target closest to the bead, for instance. Examples of suitable linkers are known in the art (see Diehl et al. Nature Methods, 2006, 3(7):551-559) and include but are not limited to carbon-carbon linkers such as but not limited to iSp18. Beads can be purchased from commercial suppliers such as Bangs, Dynal and Micromod. Additional spacers and nucleic acid attachment mechanisms are discussed above.
(458) As stated above, some beads may be solid while others may be porous or hollow. These beads will have a porous surface such that reagents from the reaction solution may move into and out of the bead. These may have empty channels or hollow cores that comprise at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, or at least 90% of the bead volume. These beads will be referred to herein as porous beads, porous microparticles, or capsules in view of their non-solid cores, and these terms are intended to embrace porous as well as hollow beads regardless of diameter or volume. They may or may not be spherical.
(459) The invention contemplates the use of such porous beads in the various sequencing methods described herein. More specifically, the invention contemplates sequencing nucleic acids that are present in porous beads. Porous beads may be generated by methods known in the art. See for example Mak et al. Adv. Funct. Mater. 2008 18:2930-2937; Morimoto et al. MEMS 2008 Tucson Ariz. USA Jan. 13-17, 2008 Poster Abstract 304-307; Lee et al. Adv. Mater. 2008 20:3498-3503; Martin-Banderas et al. Small. 2005 1(7):688-92; and published PCT application WO03/078659.
(460) Porous microparticles may be initially generated to contain a single template nucleic acid which is later amplified with all amplified copies of the nucleic acid being retained in the microparticle. Amplification may occur before or while the bead is in contact with a chemFET array, and/or optionally in a reaction chamber. If performed before contact with the chemFET array, beads that have successfully undergone amplification can be selected and thereby enriched. As an example, beads having amplified nucleic acids can be separated from other beads based on density. Amplification may be isothermal or PCR amplification, or other means of amplification, as the invention is not to be limited in this regard. The beads may contain at least two types of enzymes such as two types of polymerases. For example, the beads may contain one type of polymerase that is suitable for amplification of the nucleic acid and a second type of polymerase that is suitable for sequencing the amplified nucleic acids. The beads preferably contain a plurality of both types of polymerases and preferably the number of each polymerase will be in excess of a saturating amount so as not to create a polymerase-limited environment. Once amplification is completed, the amplification polymerase may be inactivated, while maintaining the activity of the sequencing polymerase. Typically, the enzymes and nucleic acids will be retained in the bead while smaller compounds, such as dNTPs and other nucleic acid synthesis reagents and cofactors, are allowed to diffuse into and out of the bead. Importantly for the invention, synthesis byproducts such as PPi and hydrogen ions will also diffuse out of the beads, in order to be detected by the chemFET.
(461) Nick Translation
(462) The invention provides, in various other aspects, other modes for analyzing, including for example sequencing, nucleic acids using reactions that involve interdependent nucleotide incorporation and nucleotide excision. As used herein, interdependent nucleotide incorporation and nucleotide excision means that both reactions occur on the same nucleic acid molecule at contiguous sites on the nucleic acid, and one reaction facilitates the other.
(463) An example of such a reaction is a nick translation reaction. A nick translation reaction, as used herein, refers to a reaction catalyzed by a polymerase enzyme having 5' to 3' exonuclease activity, that involves incorporation of a nucleotide onto the free 3' end of a nicked region of double stranded DNA and excision of a nucleotide located at the free 5' end of the nicked region of the double stranded DNA. Nick translation therefore refers to the movement of the nicked site along the length of the nicked strand of DNA in a 5' to 3' direction. As will be recognized by those of ordinary skill in the art, the nick translation reaction includes a sequencing-by-synthesis reaction based on the intact strand of the double stranded DNA. This strand acts as the template from which the new strand is synthesized. The method does not require the use of a primer because the double stranded DNA can prime the reaction independently. These aspects of the invention will refer specifically to nick translation for the sake of brevity, but it is to be understood that any other combined reaction of nucleotide excision and incorporation will be equally and fully intended in the following discussion.
(464) The nick translation approach has two features that make it well suited to the detection methods provided herein. First, the nick translation reaction results in the release of two hydrogen ions for each combined excision/incorporation step, thereby providing a more robust signal at the chemFET each time a nucleotide is incorporated into a newly synthesized strand. A sequencing-by-synthesis method, in the absence of nucleotide excision, releases one hydrogen ion per nucleotide incorporation. In contrast, nick translation releases a first hydrogen ion upon incorporation of a nucleotide and a second hydrogen ion upon excision of another nucleotide. This increases the signal that can be sensed at the chemFET, thereby increasing signal to noise ratio and providing a more definitive readout of nucleotide incorporation.
(465) Second, the use of a double stranded DNA template (rather than a single stranded DNA template) results in less interference of the template with released ions and a better signal at the chemFET. A single stranded DNA has exposed groups that are able to interfere with (for example, sequester) hydrogen ions. These reactive groups are shielded in a double stranded DNA where they are hydrogen bonded to complementary groups. By being so shielded, these groups do not substantially impact hydrogen ion level or concentration. As a result, signal resulting from hydrogen ion release is greater in the presence of double stranded as compared to single stranded templates, as will be signal to noise ratio, thereby further contributing to a more definitive readout of nucleotide incorporation.
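The stoichiometry stated in the first point above (one hydrogen ion per incorporation for sequencing-by-synthesis alone, two per combined excision/incorporation step for nick translation) can be written out as a short Python sketch. The function name and the copy number used in the example are hypothetical, and the model is idealized: it ignores buffering, diffusion losses, and incomplete extension.

```python
# Idealized hydrogen ion count per nucleotide flow.
# Assumptions (not from the source): every template copy extends fully,
# and no released ions are sequestered or lost before detection.

def ions_released(incorporations, template_copies, nick_translation=False):
    """Hydrogen ions released in one flow across all template copies."""
    ions_per_step = 2 if nick_translation else 1
    return incorporations * template_copies * ions_per_step


if __name__ == "__main__":
    copies = 100_000  # hypothetical copy number per well
    print(ions_released(3, copies))                         # synthesis only
    print(ions_released(3, copies, nick_translation=True))  # nick translation
```

In this simplified picture the nick translation signal is twice the synthesis-only signal for the same number of incorporations and the same copy number.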
(466) Templates suitable for nick translation typically are completely or partially double stranded. Such templates comprise an opening (or a nick) which acts as an entry point for a polymerase. Such openings can be introduced into the template in a controlled manner as described below and known in the art.
(467) As will be appreciated by one of ordinary skill in the art, it is preferable that these openings be present in each of the plurality of identical templates at the same location in the template sequence. Typical molecular biology techniques involving nick translation use randomly created nicks along the double stranded DNA because their aim is to produce a detectably labeled nucleic acid. These prior art methods generate nicks through the use of sequence-independent nicking enzymes such as DNase I. In the methods of the invention however the nick location must be known, non-random and uniform for all templates of identical sequence. There are various ways of achieving this, and some of these are discussed below.
(468) One way of achieving this is to create a population of identical double stranded nucleic acid templates that comprise a uracil residue in a defined location on one strand. The uracil may be present in a primer that is used to generate the double stranded nucleic acid or a probe that is hybridized to a single stranded region of a predominantly double stranded nucleic acid. The population of identical template nucleic acids can be generated by an amplification reaction, for example a PCR reaction. The PCR reaction can be performed using a primer pair, one of which comprises a uracil residue. Alternatively, the PCR reaction can be performed with non-uracil containing primers, followed by denaturation of the double stranded amplified products, and hybridization of one strand to a uracil-containing primer. This latter embodiment requires that the single stranded, primed templates be made double stranded prior to the nick translation reaction. These reactions may be carried out while the nucleic acids are bound to a solid support such as a bead. Alternatively, the double stranded nucleic acid templates may be first generated and then attached to a solid support.
(469) The uracil-containing double stranded nucleic acids are then contacted with uracil DNA glycosylase (UDG). UDG is an enzyme that removes uracil from DNA by cleaving the N-glycosylic bond. In some instances, the nucleic acid is contacted with a second enzyme that removes uracil. The second enzyme may be an AP endonuclease, or a lyase or another enzyme having similar nuclease activity. The nucleic acids may be in the reaction chamber (or well) discussed herein during exposure to these enzymes, or they may be added to the reaction chamber (or well) following enzyme contact. Following contact with one (in some instances) or both enzymes, the double stranded nucleic acid comprises a nick at a specific location. More importantly, all nucleic acids of the same sequence and treated in an identical manner will be nicked at the same location. These nicked nucleic acids can then be used as templates for nucleic acid sequencing or other analysis.
(470) Another way in which double stranded nucleic acids can be uniformly nicked is similar to the method just described with the exception that a nucleotide sequence recognized by a nickase or nicking enzyme is incorporated into the nucleic acid. The nickase cuts on only one strand of the double stranded DNA. Some nickases cut their recognition sequence while others cut at a distance from their recognition sequence (e.g., type II nickases). Nickases with longer recognition sites are preferred because such sites are more infrequent and thus less likely to be present in the target nucleic acid (e.g., the genomic fragment) included in the template nucleic acid. Examples of single stranded sequence specific nucleases (and their respective sequences) include without limitation Nb.BbvCI (CCTCAGC), Nt.BbvCI (CCTCAGC), Nb.BsmI (GAATGC), Nt.SapI (GCTCTTCN), Nb.BsrDI (GCAATG), and Nb.BtsI (GCAGTG), wherein the arrow indicates the site of nicking. Nickases are commercially available from a number of suppliers including NEB. Accordingly, the nucleic acids are prepared having a copy of the nickase recognition sequence in a region of known sequence (e.g., a primer or other artificial sequence in the template nucleic acid). These nucleic acids are then contacted with the corresponding nickase to nick the nucleic acid. As with the uracil embodiment, contact with the nickase can occur before or after the nucleic acids are attached to solid support such as beads, and before or after the nucleic acids are loaded in reaction wells.
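The preference for longer recognition sites can be illustrated with a short Python sketch that (a) estimates how often a site is expected to occur by chance and (b) checks a target sequence for unwanted copies of a site on either strand. The function names are hypothetical, the estimate assumes a random sequence with equal base frequencies, and N is treated as matching any base. The example sites are taken from the list above.

```python
import re

# Assumptions (not from the source): random target with equal base
# frequencies; "N" in a recognition site matches any of A, C, G, T.

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def site_positions(target, site):
    """Sorted 0-based top-strand start positions where `site` occurs on
    either strand of `target` (overlapping matches included)."""
    hits = set()
    for pattern in (site, reverse_complement(site)):
        # Lookahead so that overlapping occurrences are all reported.
        regex = re.compile("(?=" + pattern.replace("N", "[ACGT]") + ")")
        hits.update(match.start() for match in regex.finditer(target))
    return sorted(hits)


def expected_sites(site, target_length):
    """Expected chance occurrences of `site` on one strand of a random target."""
    specified = sum(1 for base in site if base != "N")
    windows = max(target_length - len(site) + 1, 0)
    return windows * 0.25 ** specified


if __name__ == "__main__":
    print(site_positions("AAGAATGCAA", "GAATGC"))
    print(expected_sites("GAATGC", 4101))
    print(expected_sites("CCTCAGC", 4101))
```

Each additional specified base reduces the chance frequency fourfold, so a seven-base site such as CCTCAGC is expected about four times less often in a target than a six-base site such as GAATGC. A target that returns an empty list from `site_positions` would be nicked only at the engineered site in the region of known sequence.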
(471) Still another way in which double stranded nucleic acids may be uniformly nicked is by incorporating ribonucleotides (rather than deoxyribonucleotides) into one strand of the double stranded nucleic acids. This can be accomplished in a manner similar to that described for the generation of uracil-containing nucleic acids. In other words, a double stranded nucleic acid can be generated using primers that contain one or more ribonucleotides at predetermined and thus known positions. The resultant nucleic acids are then contacted with RNase H or other enzyme that degrades the RNA portion of DNA-RNA hybrids. RNase H in particular hydrolyses phosphodiester bonds of RNA in RNA:DNA heteroduplexes, thereby producing 3' OH groups and 5' phosphate groups. If the double stranded nucleic acid is generated with only a single ribonucleotide then only a single abasic site will result, whereas if the double stranded nucleic acid is generated with multiple ribonucleotides then multiple abasic sites will result. In either case, identical nucleic acids can still be analyzed using a nick translation reaction once all but one of the abasic sites are filled by the polymerase. Taq polymerase is preferred in some embodiments involving these RNA-DNA hybrids. Again, as with the other methods described above, contact with RNase H or other similar enzyme can occur before or after the nucleic acids are attached to a solid support such as beads, and before or after the nucleic acids are loaded in reaction wells.
(472) Still another way to prepare double stranded nucleic acids suitable as templates for nucleotide incorporation and excision events is to generate a double stranded nucleic acid having a 3' overhang on one end, and then subsequently hybridize to the 3' overhang a nucleic acid that is shorter than the overhang by at least one nucleotide. Preferably, after hybridization of the two nucleic acids to each other, there will be one unpaired internal nucleotide in the overhang and this will be the site from which nick translation will begin. Again, the sequence of the 3' overhang and the hybridizing nucleic acid will be known and therefore the location of the abasic site will also be known and will be identical for all template nucleic acids. The hybridization can occur before or after the nucleic acids are attached to a solid support such as beads, and before or after the nucleic acids are loaded into reaction wells.
(473) Another example of a suitable nick translation template is a self-priming nucleic acid. The self-priming nucleic acid may comprise a double stranded and a single stranded region that is capable of self-annealing in order to prime a nucleic acid synthesis reaction. The single stranded region is typically a known synthetic sequence ligated to a nucleic acid of interest. Its length can be predetermined and engineered to create an opening following self-annealing, and such opening can act as an entry point for a polymerase.
(474) It is to be understood that, as the term is used herein, a nicked nucleic acid, such as a nicked double stranded nucleic acid, is a nucleic acid having an opening (e.g., a break in its backbone, or having abasic sites, etc.) from which a polymerase can incorporate and optionally excise nucleotides. The term is not limited to nucleic acids that have been acted upon by an enzyme such as a nicking enzyme, nor is it limited simply to breaks in a nucleic acid backbone, as will be clear based on the exemplary methods described herein for creating such nucleic acids.
(475) Once the nicked double stranded nucleic acids are generated, they are then subjected to a nick translation reaction. If the nick translation reaction is performed to sequence the template nucleic acid, the nick translation can be carried out in a manner that parallels the sequencing-by-synthesis methods described herein. More specifically, in some embodiments each of the four nucleotides is separately contacted with the nicked templates in the presence of a polymerase having 5′ to 3′ exonuclease activity. In other embodiments, known combinations of nucleotides are used. Examples of suitable enzymes include DNA polymerase I from E. coli, Bst DNA polymerase, and Taq DNA polymerase. The order of the nucleotides is not important as long as it is known and preferably remains the same throughout a run. After each nucleotide is contacted with the nicked templates, it is washed out followed by the introduction of another nucleotide, just as described herein. In the nick translation embodiments, the wash will also carry the excised nucleotide away from the chemFET.
(476) It should be appreciated that just as with other aspects and embodiments described herein the nucleotides that are incorporated into the nicked region need not be extrinsically labeled since it is a byproduct of their incorporation that is detected as a readout rather than the incorporated nucleotide itself. Thus, the nick translation methods may be referred to as label-free methods, or fluorescence-free methods, since incorporation detection is not dependent on an extrinsic label on the incorporated nucleotide. The nucleotides are typically naturally occurring nucleotides. It should also be recognized that since the methods benefit from the consecutive incorporation of as many nucleotides as possible, the nucleotides are not for example modified versions that lead to premature chain termination, such as those used in some sequencing methods.
(477) Target and Template Nucleic Acids
(478) The nucleic acid being sequenced is referred to herein as the target nucleic acid. Target nucleic acids include but are not limited to DNA such as but not limited to genomic DNA, mitochondrial DNA, cDNA and the like, and RNA such as but not limited to mRNA, miRNA, and other interfering RNA species, and the like. The nucleic acids may be naturally or non-naturally occurring. They may be obtained from any source including naturally occurring sources such as any bodily fluid or tissue that contains DNA, including, but not limited to, blood, saliva, cerebrospinal fluid (CSF), skin, hair, urine, stool, and mucus, or synthetic sources. The nucleic acids may be PCR products, cosmids, plasmids, naturally occurring or synthetic libraries, and the like. The invention is not intended to be limited in this regard. It should therefore be understood that the invention contemplates analysis, including sequencing, of DNA as well as RNA.
(479) With respect to RNA, amplification methods such as the SMART system and NASBA are known in the art and have been reported by van Gelder et al. PNAS, 1990, 87:1663-1667, Chadwick et al. BioTechniques, 1998, 25:818-822, Brink et al. J Clin Microbiol, 1998, 36(11):3164-3169, Voisset et al. BioTechniques, 2000,29:236-240, and Zhu et al. BioTechniques, 2001, 30:892-897. The amplification methods described in these references are incorporated by reference herein.
(480) The starting amounts of nucleic acids to be sequenced determine the minimum sample requirements. For the following bead sizes, assuming an average of 450 bases in the single stranded region of a template and an average molecular weight of 325 g/mol per base, Table 2 shows the corresponding amounts of DNA:
(481) TABLE 2
Bead Size (μm)    DNA (femtograms)
0.2               0.124
0.3               0.279
0.7               1.52
1.05              3.42
2.8               24.3
5.9               108
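The arithmetic behind Table 2 can be sketched from the two stated averages (450 bases per single stranded region and 325 g/mol per base). The sketch below assumes that each table entry is the total template mass carried by one bead; the text does not say so explicitly, and the function names are hypothetical. Under that assumption, dividing a table entry by the mass of one template gives the implied number of template copies per bead.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def template_mass_fg(bases: int = 450, g_per_mol_per_base: float = 325.0) -> float:
    """Mass of one single stranded template region, in femtograms."""
    grams = bases * g_per_mol_per_base / AVOGADRO
    return grams * 1e15  # 1 g = 1e15 fg


def implied_copies(dna_fg: float, bases: int = 450) -> float:
    """Template copies implied by a DNA mass, assuming the mass is per bead."""
    return dna_fg / template_mass_fg(bases)


# Bead size (micrometers) -> femtograms of DNA, as listed in Table 2.
table2 = {0.2: 0.124, 0.3: 0.279, 0.7: 1.52, 1.05: 3.42, 2.8: 24.3, 5.9: 108}
for diameter_um, dna_fg in table2.items():
    print(f"{diameter_um} um bead: about {implied_copies(dna_fg):,.0f} copies")
```

With these inputs the implied copy number grows approximately with the square of the bead diameter, which would be consistent with templates bound to the bead surface, although the text does not state how the table was derived.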
(482) Given the number of beads and microwells contemplated for use in an array, in some embodiments of the invention, it will be apparent that a sample taken from a subject to be tested need only be on the order of 3 μg. Thus, the systems and methods described herein can be utilized to sequence an entire genome of an organism from about 3 μg of DNA or less. As discussed herein, such sequences can be obtained without the use of optics or extrinsic labels.
(483) Target nucleic acids are prepared using any manner known in the art. As an example, genomic DNA may be harvested from a sample according to techniques known in the art (see for example Sambrook et al. Maniatis). Following harvest, the DNA may be fragmented to yield nucleic acids of smaller length. The resulting fragments may be on the order of hundreds, thousands, or tens of thousands of nucleotides in length. In some embodiments, the fragments are 200-1000 base pairs in size, or 300-800 base pairs in size, or about 200, about 300, about 400, about 500, about 600, about 700, about 800, about 900, or about 1000 base pairs in length, although they are not so limited.
(484) Nucleic acids may be fragmented by any means including but not limited to mechanical, enzymatic or chemical means. Examples include shearing, sonication, nebulization, endonuclease (e.g., DNase I) digestion, amplification such as PCR amplification, or any other technique known in the art to produce nucleic acid fragments, preferably of a desired length. As used herein, fragmentation also embraces the use of amplification to generate a population of smaller sized fragments of the target nucleic acid. That is, the target nucleic acids may be melted and then annealed to two (and preferably more) amplification primers and then amplified using for example a thermostable polymerase (such as Taq). An example is a massively parallel PCR-based amplification. Fragmentation can be followed by size selection techniques to enrich or isolate fragments of a particular length or size. Such techniques are also known in the art and include but are not limited to gel electrophoresis or SPRI.
(485) Alternatively, target nucleic acids that are already of sufficiently small size (or length) may be used. Such target nucleic acids include those derived from an exon enrichment process. Thus, rather than fragmenting (randomly or non-randomly) longer target nucleic acids, the targets may be nucleic acids that naturally exist or can be isolated in shorter, useable lengths such as mRNAs, cDNAs, exons, PCR products (as described above), and the like. See Albert et al. Nature Methods 2007 4(11):903-905 (microarray hybridization of exons and locus-specific regions), Porreca et al. Nature Methods 2007 4(11):931-936, and Okou et al. Nature Methods 2007 4(11):907-909 for methods of isolating and/or enriching sequences such as exons prior to sequencing.
(486) The target nucleic acids are typically ligated to adaptor sequences on both the 5′ and 3′ ends. The resulting nucleic acid is referred to herein as a template nucleic acid. The template nucleic acid therefore comprises at least the target nucleic acid and usually comprises nucleotide sequences in addition to the target at both the 5′ and 3′ ends. The template nucleic acids may be engineered such that different templates have identical 5′ ends and identical 3′ ends. The 5′ and 3′ ends in each individual template are preferably different in sequence.
(487) Adaptor sequences may comprise sequences complementary to amplification primer sequences, to be used in amplifying the target nucleic acids. One adaptor sequence may also comprise a sequence complementary to the sequencing primer (i.e., the primer from which sequencing occurs). The opposite adaptor sequence may comprise a moiety that facilitates binding of the nucleic acid to a solid support such as but not limited to a bead. An example of such a moiety is a biotin molecule (or a double biotin moiety, as described by Diehl et al. Nature Methods, 2006, 3(7):551-559) and such a labeled nucleic acid can therefore be bound to a solid support having avidin or streptavidin groups. Another moiety that can be used is the NHS-ester and amine affinity pair. It is to be understood that the invention is not limited in this regard and one of ordinary skill is able to substitute these affinity pairs with other binding pairs. In some embodiments, the solid support is a bead and in others it is a wall of the reaction chamber (or well) such as a bottom wall or a side wall, or both.
(488) In some embodiments, the invention contemplates the use of a plurality of template populations, wherein each member of a given plurality shares the same 3′ end but different template populations differ from each other based on their 3′ end sequences. As an example, the invention contemplates in some instances sequencing nucleic acids from more than one subject or source. Nucleic acids from a first source may have a first 3′ sequence, nucleic acids from a second source may have a second 3′ sequence, and so on, provided that the first, second, and any additional 3′ sequences are different from each other. In this respect, the 3′ end, which is typically a unique sequence, can be used as a barcode or identifier to label (or identify) the source of the particular nucleic acid in a given well. Reference can be made to Meyer et al. Nucleic Acids Research 2007 35(15):e97 for a discussion of labeling nucleic acid with barcodes followed by sequencing.
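The use of a source-specific 3′ end sequence as a barcode can be sketched as a simple lookup. The following is a minimal illustration only: the barcode sequences, source labels, and function names are hypothetical, reads are assumed to be written 5′ to 3′ so that the barcode is the suffix of the string, and only exact matches are considered.

```python
from collections import defaultdict

# Hypothetical 3' barcodes; actual barcodes would be designed per experiment.
BARCODES = {"ACGTAC": "source_1", "TGCATG": "source_2"}


def assign_source(read: str, barcodes: dict = BARCODES):
    """Return the source whose barcode matches the 3' end of the read, if any."""
    for barcode, source in barcodes.items():
        if read.endswith(barcode):
            return source
    return None  # no known barcode at the 3' end


def demultiplex(reads: list) -> dict:
    """Group reads by source; reads without a known barcode are 'unassigned'."""
    by_source = defaultdict(list)
    for read in reads:
        by_source[assign_source(read) or "unassigned"].append(read)
    return dict(by_source)


print(demultiplex(["GGGTTTACGTAC", "AAATGCATG", "CCC"]))
```

A practical implementation would typically tolerate sequencing errors in the barcode, which this exact-match sketch does not attempt.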
(489) Templates disposed onto a chemFET array (and thus over more than one sensor in the array) may share identical primer binding sequences. This facilitates the use of an identical primer across microwells and also ensures that a similar (or identical) degree of primer hybridization occurs across microwells. Once annealed to complementary primers such as sequencing primers, the templates are in a complex referred to herein as a template/primer hybrid. In this hybrid, at least one region of the template is double stranded (i.e., where it is bound to its complementary primer) and in some instances the remaining region of the template is single stranded. It is this single stranded region that acts as the template for the incorporation of nucleotides to the end of the primer and thus it is also this single stranded region which is ultimately sequenced according to the invention. As discussed herein, this single stranded region may be bound by short RNA oligomers, of known or unknown (i.e., random) sequence, and still capable of being sequenced.
(490) In some embodiments, the template nucleic acid is able to self-anneal thereby creating a 3′ end from which to incorporate nucleotide triphosphates. Thus in such instances, there is no need for a separate sequencing primer since the template acts as both template and primer. See Eriksson et al. Electrophoresis 25:20-27, 2004 for a discussion of the use of self-annealing template in a pyrosequencing reaction. In other instances, sequencing primers are hybridized (or annealed, as the terms are used interchangeably herein) to the templates prior to introduction or contact with the chemFET or reaction chamber.
(491) The plurality of templates in each microwell may be introduced into the microwells (e.g., via a nucleic acid loaded bead), or it may be generated in the microwell itself. A plurality is defined herein as at least two, and in the context of template nucleic acids in a microwell or on a nucleic acid loaded bead includes tens, hundreds, thousands, ten thousands, hundred thousands, millions, or more copies of the template nucleic acid. The limit on the number of copies will depend on a number of variables including the number of binding sites for template nucleic acids (e.g., on the beads or on the walls of the microwells), the size of the beads, the length of the template nucleic acid, the extent of the amplification reaction used to generate the plurality, and the like. It is generally preferred to have as many copies of a given template per well as possible in order to maximize the signal to noise ratio, as discussed herein. In some embodiments, the amplification is a representative amplification. A representative amplification is an amplification that does not alter the relative representation of any nucleic acid species.
(492) Thus, the template nucleic acid may be amplified prior to or after placement in the well and/or contact with the sensor. Amplification and conjugation of nucleic acids to solid supports such as beads may be accomplished in a number of ways. For example, in one aspect once a template nucleic acid is loaded into a well of the flow cell 200, amplification may be performed in the well, the resulting amplified product denatured, and sequencing-by-synthesis then performed. In one embodiment, the template is amplified in solution and then hybridized to a single primer that is immobilized on the chemFET surface. The use of only one primer type on the surface ensures that only one of the amplified strands is eventually bound to the surface, and the other strand is removed through wash.
(493) Amplification methods include but are not limited to emulsion PCR (i.e., water in oil emulsion amplification) as described by Margulies et al. Nature 2005 437(15):376-380 and accompanying supplemental materials, bridge amplification, rolling circle amplification (RCA), concatemer chain reaction (CCR), or other strategies using isothermal or non-isothermal amplification techniques.
(494) Bridge amplification can be used to produce a solid support (such as a reaction chamber wall or a bead) having amplified copies of the same template. The method involves contacting template nucleic acids with the chemFET/reaction chamber array at a limiting dilution in order to ensure that reaction chambers contain only a single template. The chemFET surface will typically be coated with two populations of primers. In one embodiment, the chemFET surface is coated with both forward and reverse primers that are complementary to the engineered 5′ and 3′ sequences of the template. The template is bound to the chemFET surface directly and then allowed to hybridize at its free end with a complementary primer on the surface. The primer is extended using unlabeled nucleotides, and the resultant double stranded nucleic acid is then denatured. This results in immobilized copies of the template nucleic acid and its complement in close proximity on the surface. This process is repeated by allowing the template and its complement to hybridize at their free ends to other primers on the surface. The net result is a population of immobilized template and a population of immobilized complement that are interspersed amongst each other. The sequencing-by-synthesis reaction is then carried out using a sequencing primer that binds to one but not both immobilized strands. This effectively selects for one of the strands and ensures that only one strand is sequenced. Either strand can be sequenced since they are complements of each other.
(495) In a related embodiment, the solid support is a bead and the bead is coated with the two primer populations and only a single stranded template nucleic acid, at least initially. This amplification method is described in U.S. Pat. No. 5,641,658 to Adams et al.
(496) In still another embodiment, each solid support surface (whether bead or reaction chamber wall) has bound thereto a specific and unique primer pair that may be but is not limited to a gene specific primer pair. One or both of the primers in the pair select for templates in a library that is applied to the solid support. Due to the unique sequence of the primers, it is expected that only the desired template will hybridize and then be amplified and sequenced, as described above.
(497) RCA or CCR amplification methods generate concatemers of template nucleic acids that comprise tens, hundreds, thousands or more tandemly arranged copies of the template. Such concatemers may still be referred to herein as template nucleic acids, although they may contain multiple copies of starting template nucleic acids. In some embodiments, they may also be referred to as amplified template nucleic acids. Alternatively, they may be referred to herein as comprising multiple copies of target nucleic acid fragment. Concatemers may contain 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 500, 1000, or more copies of the starting nucleic acid. They may contain 10-10², 10²-10³, 10³-10⁴, 10³-10⁵, or more copies of the starting nucleic acid. Concatemers generated using these or other methods (such as for example DNA nanoballs) can be used in the sequencing-by-synthesis methods described herein. The concatemers may be generated in vitro apart from the array and then placed into reaction chambers of the array or they may be generated in the reaction chambers. One or more inside walls of the reaction chamber may be treated to enhance attachment and retention of the concatemers, although this is not required. In some embodiments of the invention, if the concatemers are attached to an inside wall of the reaction chamber, such as the chemFET surface, then nucleotide incorporation at least in the context of a sequencing-by-synthesis reaction may be detected by a change in charge at the chemFET surface, as an alternative to or in addition to the detection of released hydrogen ions as discussed herein. If the concatemers are deposited onto a chemFET surface and/or into a reaction chamber, sequencing-by-synthesis can occur through detection of released hydrogen ions as discussed herein. The invention embraces the use of other approaches for generating concatemerized templates. One such approach is a PCR described by Stemmer et al. in U.S. Pat. No. 5,834,252, and the description of this approach is incorporated by reference herein.
(498) The ability to use template nucleic acids independently of beads and that can be deposited into reaction chambers or onto chemFET surfaces facilitates the use of dense chemFET arrays. As will be understood, denser arrays will typically incorporate more chemFETs and optionally more reaction chambers (where they are used) per array (or chip). In order to accommodate the increased number of chemFETs and optionally reaction chambers, the size of the chemFETs and optionally reaction chambers is reduced. Accordingly, in some instances, it may be preferable to use nucleic acids that are concatemers of the nucleic acid to be sequenced, independently of beads. Such nucleic acids may be allowed to self-assemble onto a treated chemFET surface, or they may settle into the well (for example, by gravity), or they may be pulled in by magnetic or other force. Thus, the invention contemplates the use of such concatemerized template nucleic acids in the pH based sequencing-by-synthesis methods described herein.
(499) As discussed herein, one approach for generating nucleic acids that comprise multiple copies of a nucleic acid to be sequenced involves amplification of a circular template. The resultant amplified product forms a three dimensional structure that may occupy a spherical volume or other three dimensional volume and shape. The occupied volume may vary, depending on the size of the resultant nucleic acid. For example, in some instances the spherical volume may have an average diameter on the order of about 100-300 nm. The generation of these three dimensional structures is described further in published US patent applications US20070072208A1 and US20070099208A1, both to Drmanac et al.
(500) Such nucleic acids may be generated in solution (i.e., amplification occurs in solution) and therefore emulsion based techniques or reaction chambers or wells are not necessary in some instances. As each resultant nucleic acid consists of a clonal amplified population of a starting nucleic acid, there will be no cross contamination of nucleic acids, nor does there have to be any physical separation between individual amplification reactions. Thus, in some aspects, it is contemplated that nucleic acids (such as DNA nanoballs or amplicons) are generated in solution and then deposited onto chemFET surfaces and/or into reaction chambers. Further references that describe amplification methods suitable for the synthesis of these nucleic acids include U.S. Pat. Nos. 4,683,195, 4,965,188, 4,683,202, 4,800,159, 5,210,015, 6,174,670, 5,399,491, 6,287,824, 5,854,033 and published US patent application US20060024711. Linear rolling circle amplification, multiple displacement amplification, and padlock probe rolling circle amplification can all be used to generate clonal amplicons without the need for limiting dilution in order to avoid cross-contamination of nucleic acid templates by each other.
(501) The chemFET surfaces may be treated (or patterned) or untreated (or unpatterned). In some instances, treated (or patterned) surfaces are preferred in order to maximize nucleic acid deposition and/or retention onto a surface. It is further known in the art that these nucleic acids may self-assemble onto the chemFET surface provided the chemFET array surface comprises regions to which the nucleic acids bind and optionally regions to which they do not bind. Additionally, the binding of a nucleic acid to one region on the surface will repel the binding of another nucleic acid, thereby precluding the possibility that two or more nucleic acids of different sequence could co-exist at the same chemFET surface. The chemFET array may have an occupancy on the order of greater than 50%, greater than 60%, greater than 70%, greater than 80%, or 90% or greater (i.e., the number of individual chemFET surfaces onto which a single nucleic acid is deposited). It will be understood that, as used herein, the term deposited refers simply to the placement of the nucleic acid in close proximity and potentially in contact with a chemFET surface (and optionally reaction chamber), but it does not require any particular interaction, whether covalent or non-covalent, between the nucleic acid and the chemFET surface.
(502) The amplified nucleic acids discussed herein may be attached to the chemFET surface through functionalities incorporated into (e.g., during amplification) or added post-synthesis to the nucleic acid. Such functionalities may be located at adaptor regions within the nucleic acid which are not intended for sequencing according to the methods provided herein. For example, a concatemer may be generated from a circular template having two or more adaptor sequences (or nucleic acids) located upstream and downstream of the nucleic acids being sequenced. Alternatively, the starting (or initial) nucleic acid may consist of a single adaptor sequence and a single nucleic acid to be sequenced and in the process of amplification (such as, for example, RCA) the adaptor sequence is used to separate the copies of the nucleic acid to be sequenced from each other. Whether in this embodiment or others described herein, functionalities present in the adaptor sequences may be used to attach and/or retain the resultant amplified nucleic acids on a chemFET surface and optionally a reaction chamber. Exemplary functionalities include but are not limited to amino groups, sulfhydryl groups, carbonyl groups, biotin, streptavidin, avidin, amine allyl labeled nucleotides, NHS-ester interaction, thioether linkages, and the like.
(503) Attachment may be via non-covalent bonds between capture nucleic acids present on the chemFET surface and complementary sequences in the adapter regions, or adsorption to the surface via Van der Waals forces, hydrogen bonding, static charge interactions, ionic and hydrophobic interactions, and the like. Techniques used to attach DNAs to microarrays may also be used to attach the amplified products to the chemFET surface. These techniques include but are not limited to those described by Smirnov, Genes, Chrom & Cancer 40:72-77, 2004 and Beaucage Curr Med Chem 8:1213-1244, 2001.
(504) Deposition and/or retention may also be accomplished using magnetic forces. In these embodiments, magnetic particles may be incorporated into and/or attached post-synthesis to the amplified nucleic acids (e.g., at regions not intended for sequencing). Once the nucleic acids are distributed on a chemFET array and optionally a reaction chamber array, the array is placed in proximity to a magnet in order to move the nucleic acids towards the chemFET surface and optionally into a reaction chamber.
(505) It should also be understood that the methods described herein contemplate the synthesis of the amplified nucleic acids on or in proximity to the chemFET and optionally in a reaction chamber in addition to synthesis in solution followed by deposition onto the chemFET surface. It is expected however that the latter approach will result in a greater degree of occupancy of chemFET surfaces in the array.
(506) Accordingly, provided herein is an array of nucleic acids comprising a plurality of chemFETs each having a surface, and a plurality of nucleic acids, each nucleic acid deposited onto (or attached to) individual chemFET surfaces, wherein each nucleic acid comprises multiple identical copies of an initial nucleic acid to be sequenced. In some instances, the nucleic acid has a random coil state.
(507) Also provided herein is a method for sequencing a nucleic acid present in a reaction chamber of a reaction chamber array, comprising synthesizing a concatemer of a starting nucleic acid, wherein the concatemer has a cross-sectional diameter greater than the diameter of the reaction well, optionally immobilizing (whether covalently or non-covalently) the concatemer in the reaction chamber, and sequencing the concatemer, preferably by sequencing-by-synthesis methods provided herein (e.g., pH based sequencing-by-synthesis methods). It will be understood that if the reaction chamber has a non-circular cross-section then one or more or an average of cross-sectional dimensions can be used (as can a cross-sectional area) in comparing the concatemer and the reaction chamber sizes or dimensions. It should also be understood that the size of the concatemer relative to the reaction chamber will preclude the presence of more than one concatemer per reaction chamber.
(508) Solid Supports and Capture Beads
(509) The solid support to which the template nucleic acids or primers are bound is referred to herein as the capture solid support. The solid support may be a wall of the reaction chamber (or well) including the surface of the chemFET, or a bottom or side wall of the reaction chamber provided such wall is capacitively coupled to the chemFET. If the solid support is a bead, then such bead may be referred to herein as a capture bead. Such beads are generally referred to herein as loaded with or bearing nucleic acid if they have nucleic acids attached to their surface (whether covalently or non-covalently) and/or present in their interior core. Some capture beads comprise a porous surface that allows entry and exit of small compounds such as amplification or sequencing reagents (e.g., dNTPs, co-factors, etc.). This class of beads typically will comprise nucleic acids internally and in this way they function to localize the nucleic acids, optionally without the need to attach the nucleic acids to a solid support. In embodiments in which capture beads are used, preferably each reaction well comprises only a single capture bead.
(510) The degree of saturation of any capture (i.e., sequencing) bead with template nucleic acid to be sequenced may not be 100%. In some embodiments, a saturation level of 10%-100% exists. As used herein, the degree of saturation of a capture bead with a template refers to the proportion of sites on the bead that are conjugated to template. In some instances this may be at least 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or it may be 100%.
(511) Microwell Arrays
(512) Important aspects of the invention contemplate sequencing a plurality of different template nucleic acids simultaneously. This may be accomplished using the sensor arrays described herein. In one embodiment, the sensor arrays are overlaid with (and/or integral with) an array of microwells (or reaction chambers or wells, as those terms are used interchangeably herein), with the proviso that there be at least one sensor per microwell. Present in a plurality of microwells is a population of identical copies of a template nucleic acid. There is no requirement that any two microwells carry identical template nucleic acids, although in some instances such templates may share overlapping sequence. Thus, each microwell comprises a plurality of identical copies of a template nucleic acid, and the templates between microwells may be different.
(513) The microwells may vary in size between arrays. The size of these microwells may be described in terms of a width (or diameter) to height ratio. In some embodiments, this ratio is 1:1 to 1:1.5. The bead to well size ratio (e.g., the ratio of the bead diameter to the well width, diameter, or height) is preferably in the range of 0.6-0.8.
(514) The microwell size may be described in terms of cross section. The cross section may refer to a slice parallel to the depth (or height) of the well, or it may be a slice perpendicular to the depth (or height) of the well. The microwells may be square in cross-section, but they are not so limited. The dimensions at the bottom of a microwell (i.e., in a cross section that is perpendicular to the depth of the well) may be 1.5 m by 1.5 m, or it may be 1.5 m by 2 m. Suitable diameters include but are not limited to at or about 100 m, 95 m, 90 m, 85 m, 80 m, 75 m, 70 m, 65 m, 60 m, 55 m, 50 m, 45 m, 40 m, 35 m, 30 m, 25 m, 20 m, 15 m, 10 m, 9 m, 8 m, 7 m, 6 m, 5 m, 4 m, 3 m, 2 m, 1 m or less. In some particular embodiments, the diameters may be at or about 44 m, 32 m, 8 m, 4 m, or 1.5 m. Suitable heights include but are not limited to at or about 100 m, 95 m, 90 m, 85 m, 80 m, 75 m, 70 m, 65 m, 60 m, 55 m, 50 m, 45 m, 40 m, 35 m, 30 m, 25 m, 20 m, 15 m, 10 m, 9 m, 8 m, 7 m, 6 m, 5 m, 4 m, 3 m, 2 m, 1 m or less. In some particular embodiments, the heights may be at or about 55 m, 48 m, 32 m, 12 m, 8 m, 6 m, 4 m, 2.25 m, 1.5 m, or less. Various embodiments of the invention contemplate the combination of any of these diameters with any of these heights. In still other embodiments, the reaction well dimensions may be (diameter in m by height in m) 44 by 55, 32 by 32, 32 by 48, 8 by 8, 8 by 12, 4 by 4, 4 by 6, 1.5 by 1.5, or 1.5 by 2.25.
(515) The reaction well volume may range (between arrays, and preferably not within a single array) based on the well dimensions. This volume may be at or about 100 picoliter (pL), 90, 80, 70, 60, 50, 40, 30, 20, 10, or fewer pL. In important embodiments, the well volume is less than 1 pL, including equal to or less than 0.5 pL, equal to or less than 0.1 pL, equal to or less than 0.05 pL, equal to or less than 0.01 pL, equal to or less than 0.005 pL, or equal to or less than 0.001 pL. The volume may be 0.001 to 0.9 pL, 0.001 to 0.5 pL, 0.001 to 0.1 pL, 0.001 to 0.05 pL, or 0.005 to 0.05 pL. In particular embodiments, the well volume is 75 pL, 34 pL, 23 pL, 0.54 pL, 0.36 pL, 0.07 pL, 0.045 pL, 0.0024 pL, or 0.004 pL. In some embodiments, each reaction chamber is no greater than about 0.39 pL in volume and about 49 µm.sup.2 surface aperture, and more preferably has an aperture no greater than about 16 µm.sup.2 and volume no greater than about 0.064 pL.
(516) It is to be understood therefore that the invention contemplates a sequencing apparatus for sequencing unlabeled nucleic acids, optionally using unlabeled nucleotides, without optical detection and comprising an array of at least 100 reaction chambers. In some embodiments, the array comprises 10.sup.3, 10.sup.4, 10.sup.5, 10.sup.6, 10.sup.7 or more reaction chambers. The pitch (or center-to-center distance between adjacent reaction chambers) is on the order of about 1-10 microns, including 1-9 microns, 1-8 microns, 1-7 microns, 1-6 microns, 1-5 microns, 1-4 microns, 1-3 microns, or 1-2 microns.
(517) In various aspects and embodiments of the invention, the nucleic acid loaded beads, of which there may be tens, hundreds, thousands, or more, first enter the flow cell and then individual beads enter individual wells. The beads may enter the wells passively or otherwise. For example, the beads may enter the wells through gravity without any applied external force. The beads may enter the wells through an applied external force including but not limited to a magnetic force or a centrifugal force. In some embodiments, if an external force is applied, it is applied in a direction that is parallel to the well height/depth rather than transverse to the well height/depth, with the aim being to capture as many beads as possible. Preferably, the wells (or well arrays) are not agitated, as for example may occur through an applied external force that is perpendicular to the well height/depth. Moreover, once the wells are so loaded, they are not subjected to any other force that could dislodge the beads from the wells.
(518) The Examples provide a brief description of an exemplary bead loading protocol in the context of magnetic beads. It is to be understood that a similar approach could be used to load other bead types. The protocol has been demonstrated to reduce the likelihood and incidence of trapped air in the wells of the flow chamber, uniformly distribute nucleic acid loaded beads in the totality of wells of the flow chamber, and avoid the presence and/or accumulation of excess beads in the flow chamber.
(519) In various instances, the invention contemplates that each well in the flow chamber contain only one nucleic acid loaded bead. This is because the presence of two beads per well will yield unusable sequencing information derived from two different template nucleic acids.
(520) In some embodiments, the microwell array may be analyzed to determine the degree of loading of beads into the microwells, and in some instances to identify those microwells having beads and those lacking beads. The ability to know which microwells lack beads provides another internal control for the sequencing reaction. The presence or absence of a bead in a well can be determined by standard microscopy or by the sensor itself.
(521) It has also been found that in the absence of flow the background signal (i.e., noise) is less than or equal to about 0.25 mV, but that in the presence of DNA-loaded capture beads that signal increases to about 1.0 mV +/- 0.5 mV. This increase is sufficient to allow one to determine wells with beads.
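The bead-detection idea in the preceding paragraph can be sketched as a simple threshold test. This is only an illustration under stated assumptions: the function name `wells_with_beads` is hypothetical, and the 0.375 mV cut-off (the midpoint between the roughly 0.25 mV background ceiling and the 0.5 mV lower bound of the bead signal) is chosen for the example and is not given in the text.

```python
from typing import List

BACKGROUND_MAX_MV = 0.25   # background (noise) level reported in the text
BEAD_SIGNAL_MIN_MV = 0.5   # lower bound of the 1.0 mV +/- 0.5 mV bead signal

# Hypothetical cut-off: midpoint between the two reported levels (assumption).
THRESHOLD_MV = (BACKGROUND_MAX_MV + BEAD_SIGNAL_MIN_MV) / 2


def wells_with_beads(signals_mv: List[float],
                     threshold_mv: float = THRESHOLD_MV) -> List[bool]:
    """Return True for each well whose no-flow signal magnitude exceeds the cut-off."""
    return [abs(s) > threshold_mv for s in signals_mv]


# Example no-flow readings (made-up values, in mV) for five wells.
flags = wells_with_beads([0.10, 0.95, 0.22, 1.40, 0.60])
print(flags)
```

In practice the cut-off would presumably be calibrated per chip; the midpoint is used here only because it separates the two ranges quoted above.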
(522) The percentage of occupied wells in the well array may vary depending on the methods being performed. If the method is aimed at extracting maximum sequence data in the shortest time possible, then higher occupancy is desirable. If speed and throughput are not as critical, then lower occupancy may be tolerated. Therefore depending on the embodiment, suitable occupancy percentages may be at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% of the wells. As used herein, occupancy refers to the presence of one nucleic acid loaded bead in a well and the percentage occupancy refers to the proportion of total wells in an array that are occupied by a single bead. Wells that are occupied by more than one bead typically cannot be used in the analyses contemplated by the invention.
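The occupancy definition above (wells holding exactly one bead, as a proportion of all wells) can be expressed as a short calculation. The function name `percent_occupancy` and the per-well bead counts are hypothetical and serve only to illustrate the definition.

```python
from typing import Sequence


def percent_occupancy(beads_per_well: Sequence[int]) -> float:
    """Percentage of all wells that hold exactly one bead.

    Empty wells and wells with two or more beads both count as unoccupied,
    following the definition in the text.
    """
    if not beads_per_well:
        raise ValueError("array has no wells")
    single = sum(1 for n in beads_per_well if n == 1)
    return 100.0 * single / len(beads_per_well)


# Made-up bead counts for a ten-well array: seven singly occupied wells,
# two empty wells, and one doubly loaded well.
counts = [1, 0, 1, 2, 1, 1, 0, 1, 1, 1]
print(percent_occupancy(counts))
```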
(523) Simultaneous Sequencing Reactions
(524) The invention therefore contemplates performing a plurality of different sequencing reactions simultaneously. A plurality of identical sequencing reactions is occurring in each occupied well simultaneously. It is this simultaneous and identical incorporation of dNTP within each well that increases the signal to noise ratio. By performing sequencing reactions in a plurality of wells simultaneously, a plurality of different nucleic acids are simultaneously sequenced. The methods aim to maximize complete incorporation across all microwells for any given dNTP, reduce or decrease the number of unincorporated dNTPs that remain in the wells after signal detection is complete, and achieve as high a signal to noise ratio as possible.
(525) Before and/or while in the wells, the template nucleic acids are incubated with a sequencing primer that binds to its complementary sequence located on the 3' end of the template nucleic acid (i.e., either in the amplification primer sequence or in another adaptor sequence ligated to the 3' end of the target nucleic acid) and with a polymerase for a time and under conditions that promote hybridization of the primer to its complementary sequence and that promote binding of the polymerase to the template nucleic acid. The primer can be of virtually any sequence provided it is long enough to be unique. The hybridization conditions are such that the primer will hybridize to only its true complement on the 3' end of the template. Suitable conditions are disclosed in Margulies et al. Nature 2005 437(15):376-380 and accompanying supplemental materials.
(526) It will be understood that the amount of sequencing primers and polymerases may be saturating, above saturating level, or in some instances below saturating levels. As used herein, a saturating level of a sequencing primer or a polymerase is a level at which every template nucleic acid is hybridized to a sequencing primer or bound by a polymerase, respectively. Thus the saturating amount is the number of polymerases or primers that is equal to the number of templates on a single bead. In some embodiments, the level is greater than this, including at least 2 fold, 3 fold, 4 fold, 5 fold, 10 fold, or more than the level of the template nucleic acid. In other embodiments, the number of polymerases and/or primers may be 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or up to 100% of the number of templates on a single bead in a single well.
(527) Suitable polymerases include but are not limited to DNA polymerase, RNA polymerase, or a subunit thereof, provided it is capable of synthesizing a new nucleic acid strand based on the template and starting from the hybridized primer. An example of a suitable polymerase subunit for some but not all embodiments of the invention is the exo-minus (exo.sup.-) version of the Klenow fragment of E. coli DNA polymerase I which lacks 3' to 5' exonuclease activity. Other polymerases include T4 exo-, Therminator, and Bst polymerases. In still other embodiments that require excision of nucleotides (e.g., in the process of a nick translation reaction), polymerases with exonuclease activity are preferred. The polymerase may be free in solution (and may be present in wash and dNTP solutions) or it may be bound for example to the beads (or corresponding solid support) or to the walls of the chemFET but preferably not to the ISFET surface itself. The polymerase may be one that is modified to comprise accessory factors including without limitation single or double stranded DNA binding proteins.
(528) Some embodiments of the invention require that the polymerase have sufficient processivity. As used herein, processivity is the ability of a polymerase to remain bound to a single primer/template hybrid. As used herein, it is measured by the number of nucleotides that a polymerase incorporates into a nucleic acid (such as a sequencing primer) prior to dissociation of the polymerase from the primer/template hybrid. In some embodiments, the polymerase has a processivity of at least 100 nucleotides, although in other embodiments it has a processivity of at least 200 nucleotides, at least 300 nucleotides, at least 400 nucleotides, or at least 500 nucleotides. It will be understood by those of ordinary skill in the art that the higher the processivity of the polymerase, the more nucleotides that can be incorporated prior to dissociation, and therefore the longer the sequence that can be obtained. In other words, polymerases having low processivity will provide shorter read-lengths than will polymerases having higher processivity. As an example, a polymerase that dissociates from the hybrid after five incorporations will only provide a sequence of 5 nucleotides in length, while a polymerase that dissociates on average from the hybrid after 500 incorporations will provide sequence of about 500 nucleotides.
(529) The rate at which a polymerase incorporates nucleotides will vary depending on the particular application, although generally faster rates of incorporation are preferable. The rate of sequencing will depend on the number of arrays on chip, the size of the wells, the temperature and conditions at which the reactions are run, etc.
(530) In some embodiments of the invention, the time for a 4 nucleotide cycle may be 50-100 seconds, 60-90 seconds, or about 70 seconds. In other embodiments, this cycle time can be equal to or less than 70 seconds, including equal to or less than 60 seconds, equal to or less than 50 seconds, equal to or less than 40 seconds, or equal to or less than 30 seconds. A read length of about 400 bases may take on the order of 30 minutes, 60 minutes, 1.5 hours, 2 hours, 2.5 hours, 3 hours, 3.5 hours, 4 hours, 4.5 hours, or in some instances 5 or more hours. These times are sufficient for the sequencing of megabases, and more preferably gigabases of sequence, with greater amounts of sequence being attainable through the use of denser arrays (i.e., arrays with greater numbers of reaction wells and FETs) and/or the simultaneous use of multiple arrays.
(531) Table 3 provides estimates for the rates of sequencing based on various array, chip and system configurations contemplated herein. It is to be understood that the invention contemplates even denser arrays than those shown in Table 3. These denser arrays can be characterized as 90 nm CMOS with a pitch of 1.4 µm and a well size of 1 µm which may be used with 0.7 µm beads, or 65 nm CMOS with a pitch of 1 µm and a well size of 0.5 µm which may be used with 0.3 µm beads, or 45 nm CMOS with a pitch of 0.7 µm and a well size of 0.3 µm which can be used with 0.2 µm beads.
(532) TABLE-US-00003 TABLE 3 Reaction Parameters and Read Rates.
chip type | A | B | C | D | E
pixel/CMOS | 2.8 µm/0.18 µm | 5.1 µm/0.35 µm | 5.1 µm/0.35 µm | 9 µm/3.5 µm | 9 µm/0.35 µm
chip size | 17.5 × 17.5 | 17.5 × 17.5 | 12 × 12 | 17.5 × 17.5 | 12 × 12
# possible reads | 27,800,000 | 7,220,000 | 2,950,000 | 2,320,000 | 1,060,000
read length (assumption) | 400 | 400 | 400 | 400 | 400
# chips/board | 4 | 4 | 4 | 4 | 4
bead load efficiency | 0.80 | 0.80 | 0.80 | 0.80 | 0.80
# yielded Gbp* per run | 35.6 | 9.2 | 3.8 | 3.0 | 1.4
# times HG** (3 Gbp/HG) | 11.86 | 3.08 | 1.26 | 0.99 | 0.45
*Gbp is gigabases
**HG is human genome
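The yield rows of Table 3 appear to follow from the other rows by simple multiplication. The sketch below assumes, since the table does not state the formula, that yield per run is possible reads × read length × chips per board × bead load efficiency, and that genome coverage is that yield divided by 3 Gbp. The function name `yield_gbp` is hypothetical. Under these assumptions the printed figures agree with the tabulated values.

```python
GBP_PER_HUMAN_GENOME = 3.0  # "3 Gbp/HG" as stated in Table 3


def yield_gbp(possible_reads: int,
              read_length: int = 400,
              chips_per_board: int = 4,
              bead_load_efficiency: float = 0.80) -> float:
    """Estimated gigabases per run (assumed formula, see the paragraph above)."""
    bases = possible_reads * read_length * chips_per_board * bead_load_efficiency
    return bases / 1e9


# "# possible reads" row of Table 3, keyed by chip type.
possible_reads = {
    "A": 27_800_000,
    "B": 7_220_000,
    "C": 2_950_000,
    "D": 2_320_000,
    "E": 1_060_000,
}

for chip, reads in possible_reads.items():
    gbp = yield_gbp(reads)
    coverage = gbp / GBP_PER_HUMAN_GENOME
    print(f"chip {chip}: {gbp:.1f} Gbp per run, {coverage:.2f} x HG")
```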
(533) The template nucleic acid is also contacted with other reagents and/or cofactors including but not limited to buffer, detergent, reducing agents such as dithiothreitol (DTT, Cleland's reagent), single stranded binding proteins, and the like before and/or while in the well. In one embodiment, the polymerase comprises one or more single stranded binding proteins (e.g., the polymerase may be one that is engineered to include one or more single stranded binding proteins). In one embodiment, the template nucleic acid is contacted with the primer and the polymerase prior to its introduction into the flow chamber and wells thereof.
(534) The primers may be DNA in nature or they may be modified moieties such as PNA or LNA, or they may comprise some other modification such as those described herein, or some combination of the foregoing. It has been found according to the invention that LNA-containing primers bind efficiently to DNA templates under stringent conditions and are still able to mediate a polymerase-mediated extension.
(535) Some reactions may be carried out at a pH equal to or greater than 7.5, equal to or greater than 8, equal to or greater than 8.5, equal to or greater than 9, equal to or greater than 9.5, equal to or greater than 10, or equal to or greater than 11. The polymerase may be one that incorporates nucleotides into a nucleic acid at a pH of 7-11, 7.5-10.5, 8-10, 8.5-9.5, or at about 9.
(536) In some embodiments, the enzyme has high activity in low concentrations of dNTPs. In some embodiments, the dNTP concentration is 50 µM, 40 µM, 30 µM, 20 µM, 10 µM, or 5 µM, and preferably 20 µM or less.
(537) Apyrase is an enzyme that degrades residual unincorporated nucleotides converting them into monophosphate and releasing inorganic phosphate in the process. It is useful for degrading dNTPs that are not incorporated and/or that are in excess. It is important that excess and/or unincorporated dNTP be washed away from all wells after measurements are complete and before introduction of the subsequent dNTP. Accordingly, addition of apyrase between the introduction of different dNTPs is useful to remove unincorporated dNTPs that would otherwise obscure the sequencing data.
(538) Thus, according to some aspects of the invention, a homogeneous population of (or a plurality of identical) template nucleic acids is placed into each of a plurality of wells, each well situated over and thus corresponding to at least one sensor. As discussed above, preferably the well contains at least 10, at least 100, at least 1000, at least 10.sup.4, at least 10.sup.5, at least 10.sup.6, or more copies of an identical template nucleic acid. Identical template nucleic acids means that the templates are identical in sequence. Most and preferably all the template nucleic acids within a well are uniformly hybridized to a primer. Uniform hybridization of the template nucleic acids to the primers means that the primer hybridizes to the template at the same location (i.e., the sequence along the template that is complementary to the primer) as every other template/primer hybrid in the well. The uniform positioning of the primer on every template allows the co-ordinated synthesis of all new nucleic acid strands within a well, thereby resulting in a greater signal-to-noise ratio.
(539) In some embodiments, nucleotides are then added in flow, or by any other suitable method, in sequential order to the flow chamber and thus the wells. The nucleotides can be added in any order provided it is known and for the sake of simplicity kept constant throughout a run.
(540) In some embodiments, the method involves adding ATP to the wash buffer so that dNTPs flowing into a well displace ATP from the well. The ATP matches the ionic strength of the dNTPs entering the wells and it also has a similar diffusion profile as dNTPs. In this way, influx and efflux of dNTPs during the sequencing reaction do not interfere with measurements at the chemFET. The concentration of ATP used is on the order of the concentration of dNTP used.
(541) In some embodiments, the dNTP and/or the polymerase may be pre-incubated with divalent cation such as but not limited to Mg.sup.2+ (for example in the form of MgCl.sub.2) or Mn.sup.2+ (for example in the form of MnCl.sub.2). Other divalent cations can also be used including but not limited to Ca.sup.2+ and Co.sup.2+. This pre-incubation (and thus pre-loading of the dNTP and/or the polymerase) can ensure that the polymerase is exposed to a sufficient amount of divalent cation for proper and necessary functioning even if it is present in a low ionic strength environment. Pre-incubation may occur for 1-60 minutes, 5-45 minutes, or 10-30 minutes, depending on the embodiment, although the invention is not limited to these time ranges.
(542) A sequencing cycle may therefore proceed as follows: washing of the flow chamber (and wells) with wash buffer (optionally containing ATP), introduction of a first dNTP species (e.g., dATP) into the flow chamber (and wells), release and detection of PPi and then unincorporated nucleotides (if incorporation occurred) or detection of solely unincorporated nucleotides (if incorporation did not occur) (by any of the mechanisms described herein), washing of the flow chamber (and wells) with wash buffer, washing of the flow chamber (and wells) with wash buffer containing apyrase (to remove as many of the unincorporated nucleotides as possible prior to the flow through of the next dNTP), washing of the flow chamber (and wells) with wash buffer, and introduction of a second dNTP species. This process is continued until all 4 dNTP (i.e., dATP, dCTP, dGTP and dTTP) have been flowed through the chamber and allowed to incorporate into the newly synthesized strands. This 4-nucleotide cycle may be repeated any number of times including but not limited to 10, 25, 50, 100, 200 or more times. The number of cycles will be governed by the length of the template being sequenced and the need to replenish reaction reagents, in particular the dNTP stocks and wash buffers.
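The order of operations in the cycle just described can be written out as a small generator. This is a sketch for orientation only: the function name `sequencing_steps`, the step wording, and the choice to apply the optional ATP only to the initial wash are assumptions, and the code models the sequence of fluidic steps rather than any instrument control interface.

```python
from typing import Iterator, Sequence


def sequencing_steps(flow_order: Sequence[str] = ("dATP", "dCTP", "dGTP", "dTTP"),
                     cycles: int = 1,
                     atp_in_wash: bool = False) -> Iterator[str]:
    """Yield the fluidic steps of one or more 4-nucleotide cycles, in order."""
    first_wash = "wash buffer + ATP" if atp_in_wash else "wash buffer"
    yield f"wash with {first_wash}"
    for _ in range(cycles):
        for dntp in flow_order:
            yield f"introduce {dntp}"
            yield "detect signal"
            yield "wash with wash buffer"
            # Apyrase degrades dNTPs left over from this flow.
            yield "wash with wash buffer + apyrase"
            # Final wash clears the wells before the next dNTP flow.
            yield "wash with wash buffer"


for number, step in enumerate(sequencing_steps(), start=1):
    print(number, step)
```

Keeping the nucleotide order in a single `flow_order` argument reflects the requirement that the order be known and held constant throughout a run.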
(543) As part of the sequencing reaction, a dNTP will be ligated to (or incorporated into as used herein) the 3' end of the newly synthesized strand (or the 3' end of the sequencing primer in the case of the first incorporated dNTP) if its complementary nucleotide is present at that same location on the template nucleic acid. Incorporation of the introduced dNTP (and concomitant release of PPi) therefore indicates the identity of the corresponding nucleotide in the template nucleic acid. If no dNTP has been incorporated, no hydrogens are released and no signal is detected at the chemFET surface. One can therefore conclude that the complementary nucleotide was not present in the template at that location. If the introduced dNTP has been incorporated into the newly synthesized strand, then the chemFET will detect a signal. The signal intensity and/or area under the curve is a function of the number of nucleotides incorporated (for example, as may occur in a homopolymer stretch in the template). The result is that no sequence information is lost through the sequencing of a homopolymer stretch (e.g., poly A, poly T, poly C, or poly G) in the template.
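The relationship between per-flow signal and homopolymer length can be illustrated with a toy base caller. This is a minimal sketch under assumptions not found in the text: the signals are taken to be already normalized so that 1.0 corresponds to a single incorporation, the number of incorporations is estimated by rounding to the nearest integer, and the function name `call_bases` and the sample values are hypothetical.

```python
from typing import Sequence


def call_bases(flow_order: str, signals: Sequence[float]) -> str:
    """Convert normalized per-flow signals into the synthesized-strand sequence.

    flow_order gives the repeating nucleotide order (for example "TACG");
    signals[i] is the normalized signal measured for the i-th flow.
    """
    called = []
    for i, signal in enumerate(signals):
        base = flow_order[i % len(flow_order)]
        # Rounded signal approximates the number of bases incorporated
        # in this flow; 0 means the flowed nucleotide did not incorporate.
        n_incorporated = max(0, round(signal))
        called.append(base * n_incorporated)
    return "".join(called)


# Five flows in the order T, A, C, G, T with made-up normalized signals.
print(call_bases("TACG", [1.02, 0.05, 2.10, 0.00, 0.90]))
```

The string returned is the newly synthesized strand; the template sequence at those positions is its complement.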
(544) The sequencing reaction can be run at a range of temperatures. Typically, the reaction is run in the range of 30° C. to 70° C., 30° C. to 65° C., 30-60° C., 35-55° C., 40-50° C., or 40-45° C. It is preferable to run the reaction at temperatures that prevent formation of secondary structure in the nucleic acid. However this must be balanced with the binding of the primer (and the newly synthesized strand) to the template nucleic acid and the reduced half-life of apyrase at higher temperatures. The optimum temperature for the polymerase is also important as the closer the reaction is run to that temperature, the higher the nucleotide incorporation rate will be. Bst polymerase has an optimum temperature of about 65° C., while T4 polymerase has an optimum temperature of about 37° C. Thus, the optimum temperature will depend upon the polymerase being used. Some embodiments use a temperature of about 41° C. Other embodiments use a temperature that is higher including for example about 45° C., about 50° C. or about 65° C. The solutions, including the wash buffers and the dNTP solutions, are generally warmed to these temperatures in order not to alter the temperature in the wells. The wash buffer containing apyrase however is preferably maintained at a lower temperature in order to extend the half-life of the enzyme. Typically, this solution is maintained at about 4-15° C., and more preferably 4-10° C.
(545) As will be appreciated all of the foregoing methods may be automated such that the various biological and/or chemical reactions are performed via robotics. In addition, the information obtained via the signal from the chemFET (or chemFET array) may be provided to a personal computer, a personal digital assistant, a cellular phone, a video game system, or a television so that a user can monitor the progress of the sequencing reactions remotely. This process is illustrated, for example, in
(546) Diffusion Control
(547) The nucleotide incorporation reaction can occur very rapidly. As a result, it may be desirable in some instances to slow the reaction down or to slow the diffusion of analytes in the well in order to ensure maximal data capture during the reaction. The diffusion of reagents and/or byproducts can be slowed down in a number of ways including but not limited to addition of packing beads in the wells, and/or the use of polymers such as polyethylene glycol in the wells (e.g., PEG attached to the capture beads and/or to packing beads). The packing beads also tend to increase the concentration of reagents and/or byproducts at the chemFET surface, thereby increasing the potential for signal. The presence of packing beads generally allows a greater time to sample (e.g., by 2- or 4-fold).
(548) Data capture rates can vary and be for example anywhere from 10-100 frames per second and the choice of which rate to use will be dictated at least in part by the well size and the presence of packing beads or other diffusion limiting techniques. Smaller well sizes generally require faster data capture rates.
(549) In some aspects of the invention that are flow-based and where the top face of the well is open and in communication with fluid over the entirety of the chip, it is important to detect the released hydrogen ion prior to its diffusion out of the well. Diffusion of reaction byproducts out of the well will lead to false negatives (because the byproduct is not detected in that well) and potential false positives in adjacent or downstream wells (where the byproduct may be detected), and thus should be avoided. Packing beads and/or polymers such as PEG may also help reduce the degree of diffusion and/or cross-talk between wells.
(550) In addition to the nucleic acid loaded beads, each well may also comprise a plurality of smaller beads, referred to herein as packing beads. The packing beads may be composed of any inert material that does not interact or interfere with analytes, reagents, reaction parameters, and the like, present in the wells. The packing beads may be magnetic (including superparamagnetic) but they are not so limited. In some embodiments the packing beads and the capture beads are made of the same material (e.g., both are magnetic, both are polystyrene, etc.), while in other embodiments they are made of different materials (e.g., the packing beads are polystyrene and the capture beads are magnetic).
(551) The packing beads are generally smaller than the capture beads. The difference in size may vary and may be 5-fold, 10-fold, 15-fold, 20-fold or more. As an example, 0.35 µm diameter packing beads can be used with 5.91 µm capture beads. Such packing beads are commercially available from sources such as Bangs Laboratories.
(552) The placement of the packing beads relative to the capture bead may vary. Packing beads may be positioned between the chemFET surface and the nucleic acid loaded bead, in which case they may be introduced into the wells before the nucleic acid loaded beads. In this way, the packing beads prevent contact and thus interference of the chemFET surface with the template nucleic acids bound to the capture beads. A layer of packing beads that is 0.1-0.5 µm in depth or height would preclude this interaction. The presence of packing beads between the capture bead and the chemFET surface may also slow the diffusion of the sequencing byproducts such as hydrogen ions, thereby facilitating data capture in some embodiments. Alternatively, the packing beads may be positioned all around the nucleic acid loaded beads, in which case they may be added to the wells before, during and/or after the nucleic acid loaded beads. In still other embodiments, the majority of the packing beads may be positioned on top of the nucleic acid loaded beads, in which case they may be added to the wells after the nucleic acid loaded beads. If placed above the nucleic acid loaded beads, the packing beads may act to minimize or prevent altogether dislodgement of nucleic acid loaded beads from wells. In still other embodiments, the reaction wells may comprise packing beads even if nucleic acid loaded beads are not used. It is to be understood that in other embodiments however packing beads are not required as there is no need to slow the diffusion of reaction byproducts such as hydrogen ions.
(553) In some embodiments, diffusion may also be impacted by including in the reaction chambers viscosity increasing agents. An example of such an agent is a polymer that is not a nucleic acid (i.e., a non-nucleic acid polymer). The polymer may be naturally or non-naturally occurring, and it may be of any nature provided it does not interfere with nucleotide incorporation and/or excision and detection thereof except for slowing the diffusion of polymerase, released hydrogen ions, PPi, unincorporated nucleotides, and/or other reaction byproducts or reagents. An example of a suitable polymer is polyethylene glycol (PEG). Other examples include PEO, PEA, dextrans, acrylamides, celluloses (e.g. methyl cellulose), and the like. The polymer may be free in solution (e.g., PEG, DMSO, glycerol, and the like) or it may be immobilized (covalently or non-covalently) to one or more sides of the reaction chamber, to the capture bead (e.g., PEG, PEO, dextrans, and the like), and/or to any packing beads that may be present. Non-covalent attachment may be accomplished via a biotin-avidin interaction.
(554) The invention further contemplates in some embodiments the use of soluble counterions that bind to released hydrogen ions and prevent their exit from the well. Counterions having a pKa that is close to the pH of the reaction are preferred. Examples of counterions with diffusion rates that are slower than that of protons (at both 25° C. and 37° C.) include without limitation Cl.sup.-, H.sub.2PO.sub.4.sup.-, HCO.sub.3.sup.-, acetate, butyrate, histidyl, formate, lactate, and the like. In some embodiments, the counterions are free in solution while in others they are immobilized on a solid support including without limitation reaction chamber walls. One of ordinary skill in the art will be able to select the most appropriate counterion and concentration based on its pKa and the pH at which the reaction is conducted, and the mobility of the counterion. It will be understood that various embodiments of the invention do not require the use of counterions.
(555) Kits
(556) The invention further contemplates kits comprising the various reagents necessary to perform a sequencing reaction and instructions of use according to the methods set forth herein.
(557) One preferred kit comprises one or more containers housing wash buffer, one or more containers each containing one of the following reagents: dATP buffer, dCTP buffer, dGTP buffer or dTTP buffer, dATP, dCTP, dGTP and dTTP stocks, apyrase, SSB, polymerase, packing beads, and optionally pyrophosphatase. Importantly the kits may comprise only naturally occurring dNTPs. The kits may also comprise one or more wash buffers comprising components as described in the Examples, but are not so limited. The kits may also comprise instructions for use including diagrams that demonstrate the methods of the invention.
(558) The following Examples are included for purposes of illustration and are not intended to limit the scope of the invention.
EXAMPLES
(559) The Examples provide a proof of principle demonstration of the sequencing of four templates of known sequence. This artificial model is intended to show that embodiments of the apparatuses and systems described herein are able to read out nucleotide incorporation that correlates to the known sequence of the templates. This is not intended to represent typical use of the method or system in the field. The following is a brief description of these methods.
Example 1. Bead Preparation
(560) Binding of Single-Stranded Oligonucleotides to Streptavidin-Coated Magnetic Beads. Single-stranded DNA oligonucleotide templates with a 5' Dual Biotin tag (HPLC purified), and a 20-base universal primer were ordered from IDT (Integrated DNA Technologies, Coralville, Iowa). Templates were 60 bases in length, and were designed to include 20 bases at the 3' end that were complementary to the 20-base primer (Table 4, italics). The lyophilized and biotinylated templates and primer were re-suspended in TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8) as 40 µM stock solutions and as a 400 µM stock solution, respectively, and stored at -20° C. until use.
(561) For each template, 60 µl of magnetic 5.91 µm (Bangs Laboratories, Inc. Fishers, Ind.) streptavidin-coated beads, stored as an aqueous, buffered suspension (8.57×10.sup.4 beads/µL), at 4° C., were prepared by washing with 120 µl bead wash buffer three times and then incubating with templates 1, 2, 3 and 4 (T1, T2, T3, T4: Table 4) with biotin on the 5' end, respectively.
(562) Due to the strong non-covalent binding affinity of streptavidin for biotin (Kd ~10.sup.-15 M), these magnetic beads are used to immobilize the templates on a solid support, as described below. The reported binding capacity of these beads for free biotin is 0.650 pmol/µL of bead stock solution. For a small (<100 bases) biotinylated ssDNA template, it was conservatively calculated that 9.1×10.sup.5 templates could be bound per bead. The beads are easily concentrated using simple magnets, as with the Dynal Magnetic Particle Concentrator or MPC-s (Invitrogen, Carlsbad, Calif.). The MPC-s was used in the described experiments.
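For context, the per-bead figure can be compared with the free-biotin capacity of the bead stock. The sketch below assumes that the biotin capacity and the bead concentration are expressed per the same volume of stock (so the volume unit cancels) and uses the exact SI value of the Avogadro constant. The function name `biotin_sites_per_bead` is hypothetical, and the text does not say how the conservative template estimate was actually derived.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)

BIOTIN_CAPACITY_PMOL = 0.650   # pmol free biotin per unit volume of bead stock
BEADS_PER_VOLUME = 8.57e4      # beads per the same unit volume of stock
TEMPLATES_PER_BEAD = 9.1e5     # conservative estimate quoted in the text


def biotin_sites_per_bead(capacity_pmol: float, beads: float) -> float:
    """Free-biotin binding sites per bead, from stock-level figures."""
    molecules = capacity_pmol * 1e-12 * AVOGADRO  # pmol -> mol -> molecules
    return molecules / beads


sites = biotin_sites_per_bead(BIOTIN_CAPACITY_PMOL, BEADS_PER_VOLUME)
print(f"free-biotin sites per bead: {sites:.2e}")
print(f"template estimate / biotin sites: {TEMPLATES_PER_BEAD / sites:.2f}")
```

On these figures each bead could bind a few million free biotin molecules, so the quoted template estimate is roughly one fifth of the free-biotin capacity, which is consistent with it being described as conservative for a bulkier ssDNA ligand.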
(563) An MPC-s was used to concentrate the beads for 1 minute between each wash; buffer was then added and the beads were resuspended. Following the third wash the beads were resuspended in 120 µL bead wash buffer plus 1 µl of each template (40 µM). Beads were incubated for 30 minutes with rotation (Labquake Tube Rotator, Barnstead, Dubuque, Iowa). Following the incubation, beads were then washed three times in 120 µL Annealing Buffer (20 mM Tris-HCl, 5 mM magnesium acetate, pH 7.5), and re-suspended in 60 µL of the same buffer.
(564) TABLE-US-00004 TABLE 4 Sequences for Templates 1, 2, 3, and 4
T1: 5'/52Bio/GCAAGTGCCCTTAGGCTTCAGTTCAAAAGTCCTAACTGGGCAAGGCACACAGGGGATAGG-3' (SEQ ID NO: 1)
T2: 5'/52Bio/CCATGTCCCCTTAAGCCCCCCCCATTCCCCCCTGAACCCCCAAGGCACACAGGGGATAGG-3' (SEQ ID NO: 2)
T3: 5'/52Bio/AAGCTCAAAAACGGTAAAAAAAAGCCAAAAAACTGGAAAACAAGGCACACAGGGGATAGG-3' (SEQ ID NO: 3)
T4: 5'/52Bio/TTCGAGTTTTTGCCATTTTTTTTCGGTTTTTTGACCTTTTCAAGGCACACAGGGGATAGG-3' (SEQ ID NO: 4)
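The design constraints stated for the templates (60 bases long, with a shared 20-base 3' end complementary to the universal primer) can be checked directly against the Table 4 sequences. In the sketch below each sequence is entered as the two fragments in which it appears in the table and joined. The primer sequence is not listed in the text, so the value computed here is only the reverse complement implied by the shared 3' end, not a quoted sequence. The names `TEMPLATES` and `reverse_complement` are hypothetical.

```python
# Table 4 sequences, 5' to 3', each entered as the two fragments shown in the
# table (adjacent string literals are concatenated by Python).
TEMPLATES = {
    "T1": "GCAAGTGCCCTTAGGCTTCAGTTCAAAAGTCCTAAC" "TGGGCAAGGCACACAGGGGATAGG",
    "T2": "CCATGTCCCCTTAAGCCCCCCCCATTCCCCCCTGAACCC" "CCAAGGCACACAGGGGATAGG",
    "T3": "AAGCTCAAAAACGGTAAAAAAAAGCCAAAAAACTGG" "AAAACAAGGCACACAGGGGATAGG",
    "T4": "TTCGAGTTTTTGCCATTTTTTTTCGGTTTTTTGACCTTT" "TCAAGGCACACAGGGGATAGG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


lengths = {name: len(seq) for name, seq in TEMPLATES.items()}
shared_tails = {seq[-20:] for seq in TEMPLATES.values()}

print(lengths)        # each template should be 60 bases
print(shared_tails)   # a single entry means all 3' ends are identical

# Primer implied by the shared 3' end (derived here, not quoted from the text).
primer = reverse_complement(next(iter(shared_tails)))
print(primer)
```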
(565) Annealing of Sequencing Primer. The immobilized templates, bound at the 5' end to 5.91 µm magnetic beads, were then annealed to a 20-base primer complementary to the 3' end of the templates (Table 4). A 1.0 µL aliquot of the 400 µM primer stock solution, representing a 20-fold excess of primer to immobilized template, was added, and the beads plus template were incubated with primer for 15 minutes at 95° C., after which the temperature was slowly lowered to room temperature. The beads were then washed 3 times in 120 µL of 25 mM Tricine buffer (25 mM Tricine, 0.4 mg/ml PVP, 0.1% Tween 20, 8.8 mM Magnesium Acetate; pH 7.8) as described above using the MPC-s. Beads were resuspended in 25 mM Tricine buffer.
(566) Incubation of Hybridized Templates/Primer with DNA Polymerase.
(567) Template and primer hybrids are incubated with polymerase essentially as described by Margulies et al. Nature 2005 437(15):376-380 and accompanying supplemental materials.
(568) Loading of Prepared Test Samples onto the ISFET Sensor Array.
(569) The dimensions and density of the ISFET array and the microfluidics positioned thereon may vary depending on the application. A non-limiting example is a 512×512 array. Each grid of such an array (of which there would be 262144) has a single ISFET. Each grid also has a well (interchangeably referred to herein as a microwell) positioned above it. The well (or microwell) may have any shape including columnar, conical, square, rectangular, and the like. In one exemplary conformation, the wells are square wells having dimensions of 7×7×10 µm. The center-to-center distance between wells is referred to herein as the pitch. The pitch may be any distance, although it is preferable to have shorter pitches in order to accommodate as large an array as possible. The pitch may be less than 50 µm, less than 40 µm, less than 30 µm, less than 20 µm, or less than 10 µm. In one embodiment, the pitch is about 9 µm. The entire chamber above the array (within which the wells are situated) may have a volume of equal to or less than about 30 µL, equal to or less than about 20 µL, equal to or less than about 15 µL, or equal to or less than 10 µL. These volumes therefore correspond to the volume of solution within the chamber as well.
(570) Loading of Beads in an Open System.
(571) Beads with templates 1-4 were loaded on the chip (10 µL of each template). Briefly, an aliquot of each template was added onto the chip using an Eppendorf pipette. A magnet was then used to pull the beads into the wells.
(572) Loading of Beads in a Closed System.
(573) Both the capture beads and the packing beads are loaded using flow. Microliter precision of bead solution volume, as well as positioning of the bead solution through the fluidics connections, is achieved as shown in
(574) The chip comprising the ISFET array and flow cell is seated in the ZIF (zero insertion force) socket of the loading fixture, and a stainless steel capillary is then attached to one port of the flow cell and flexible nylon tubing to the other port. Both materials are microfluidic-type fluid paths (e.g., on the order of <0.01 inch inner diameter). The bead loading fitting, consisting of the major and minor reservoirs, is attached to the end of the capillary. A common plastic syringe is filled with buffer solution, then connected to the free end of the nylon tubing. The electrical leads protruding from the bottom of the chip are inserted into a socket on the top of a fixture unit (not shown).
(575) The chip comprising the ISFET array and flow cell is seated in a socket such as a ZIF (zero insertion force) socket of the loading fixture; a stainless steel capillary may then be attached to one port of the flow cell and flexible nylon tubing to the other port. Both materials are microfluidic-type fluid paths (e.g., on the order of <0.01 inch inner diameter). The bead loading fitting, consisting of the major and minor reservoirs, is attached to the end of the capillary. A common plastic syringe is filled with buffer solution, then connected to the free end of the nylon tubing. The electrical leads protruding from the bottom of the chip are inserted into a socket on the top of a fixture unit (not shown).
(576) It will be appreciated that there will be other ways of drawing the beads into the wells of the flow chamber, including centrifugation or gravity. The invention is not limited in this respect.
(577) DNA Sequencing Using the ISFET Sensor Array in an Open System.
(578) An illustrative sequencing reaction can be performed in an open system, i.e., the ISFET chip is placed on the platform of the ISFET apparatus and then each nucleotide (5 µL resulting in 6.5 µM each) is manually added in the following order: dATP, dCTP, dGTP and dTTP (100 mM stock solutions, Pierce, Milwaukee, Wis.), by pipetting the given nucleotide into the liquid already on the surface of the chip and collecting data from the chip at a rate of 2.5 MHz. This can result in data collection over 7.5 seconds at approximately 18 frames/second. Data may then be analyzed using LabVIEW.
(579) Given the sequences of the templates, it is expected that addition of dATP will result in a 4 base extension for template 4. Addition of dCTP will result in a 4 base extension in template 1. Addition of dGTP will cause templates 1, 2 and 4 to extend as indicated in Table 5, and addition of dTTP will result in a run-off (extension of all templates as indicated).
(580) Preferably when the method is performed in a non-automated manner (i.e., in the absence of automated flow and reagent introduction), each well contains apyrase in order to degrade the unincorporated dNTPs, or alternatively apyrase is added into each well following the addition and incorporation of each dNTP (e.g., dATP) and prior to the addition of another dNTP (e.g., dTTP). It is to be understood that apyrase can be substituted, in this embodiment or in any other embodiment discussed herein, with another compound (or enzyme) capable of degrading dNTPs.
(581) TABLE-US-00005 TABLE 5 Set-up of experiment and order of nucleotide addition.
Template   dATP   dCTP            dGTP   dTTP
T1         0      (3:C; 1:A) 4    1      Run-off (25)
T2         0      0               4      Run-off (26)
T3         0      0               0      Run-off (30)
T4         4      0               2      Run-off (24)
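The expected extensions in the first three columns of Table 5 can be reproduced with a short simulation. The sketch below is illustrative only and rests on assumptions that the text does not state explicitly: the primer is taken to cover the last 20 bases (3' end) of each Table 4 template, extension copies the remaining bases starting next to the primer, and each manually added nucleotide is assumed to remain in solution during later additions (which is how the "3:C; 1:A" entry for T1 is read here). The function name `extensions_per_addition` is hypothetical, and the dTTP run-off column is not modeled.

```python
# Complement of each template base, i.e., the nucleotide the polymerase must add.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Template sequences from Table 4, written 5' to 3' without the biotin tag.
TEMPLATES = {
    "T1": "GCAAGTGCCCTTAGGCTTCAGTTCAAAAGTCCTAACTGGGCAAGGCACACAGGGGATAGG",
    "T2": "CCATGTCCCCTTAAGCCCCCCCCATTCCCCCCTGAACCCCCAAGGCACACAGGGGATAGG",
    "T3": "AAGCTCAAAAACGGTAAAAAAAAGCCAAAAAACTGGAAAACAAGGCACACAGGGGATAGG",
    "T4": "TTCGAGTTTTTGCCATTTTTTTTCGGTTTTTTGACCTTTTCAAGGCACACAGGGGATAGG",
}

PRIMER_LENGTH = 20  # assumption: the primer covers the last 20 bases (3' end)


def extensions_per_addition(template, additions, primer_length=PRIMER_LENGTH):
    """Return the number of bases incorporated after each nucleotide addition."""
    # Bases still to be copied, ordered from the primer outward
    # (the template is read 3' to 5').
    to_copy = template[:-primer_length][::-1]
    needed = [COMPLEMENT[base] for base in to_copy]

    available = set()  # assumption: earlier nucleotides remain in solution
    position = 0
    counts = []
    for nucleotide in additions:
        available.add(nucleotide)
        start = position
        # Extend for as long as the next required nucleotide is present.
        while position < len(needed) and needed[position] in available:
            position += 1
        counts.append(position - start)
    return counts


for name, sequence in TEMPLATES.items():
    print(name, extensions_per_addition(sequence, ["A", "C", "G"]))
```

Under these assumptions the per-addition counts for dATP, dCTP and dGTP agree with the corresponding entries of Table 5.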
(582) DNA Sequencing Using Microfluidics on Sensor Chip.
(583) Sequencing in the flow regime is an extension of open application of nucleotide reagents for incorporation into DNA. Rather than add the reagents into a bulk solution on the ISFET chip, the reagents are flowed in a sequential manner across the chip surface, extending a single DNA base(s) at a time. The dNTPs are flowed sequentially, beginning with dTTP, then dATP, dCTP, and dGTP. Due to the laminar flow nature of the fluid movement over the chip, diffusion of the nucleotide into the microwells and finally around the nucleic acid loaded bead is the main mechanism for delivery. The flow regime also ensures that the vast majority of nucleotide solution is washed away between applications. This involves rinsing the chip with buffer solution and apyrase solution following every nucleotide flow. The nucleotides and wash solutions are stored in chemical bottles in the system, and are flowed over the chip using a system of fluidic tubing and automated valves. The ISFET chip is activated for sensing chemical products of the DNA extension during nucleotide flow.
Example 2. On-Chip Polymerase Extension Detected by pH Shift on an ISFET Array
(584) Streptavidin-coated 2.8 micron beads carrying biotinylated synthetic template to which sequencing primers and T4 DNA polymerase are bound were subjected to three sequential flows of each of the four nucleotides. The template sequence downstream of the sequencing primer was G(C).sub.10(A).sub.10 (SEQ ID NO:5). Each nucleotide cycle consisted of flows of dATP, dCTP, dGTP and dTTP, each interspersed with a wash flow of buffer only. Flows from the first cycle are shown in blue, flows from the second cycle in red, and the third cycle in yellow. As shown in
Example 3. Sequencing in a Closed System and Data Manipulation
(585) Sequence has been obtained from a 23-mer synthetic oligonucleotide and a 25-mer PCR product oligonucleotide. The oligonucleotides were attached to beads which were then loaded into individual wells on a chip having 1.55 million sensors in a 1348×1152 array having a 5.1 micron pitch (38400 sensors per mm.sup.2). About 1 million copies of the synthetic oligonucleotide were loaded per bead, and about 300000 to 600000 copies of the PCR product were loaded per bead. A cycle of 4 nucleotides through and over the array was 2 minutes long. Nucleotides were used at a concentration of 50 micromolar each. Polymerase was the only enzyme used in the process. Data were collected at 32 frames per second.
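The array figures quoted above follow from simple geometry. The sketch below checks them, assuming a square grid with one sensor per pitch-by-pitch cell; the function names are hypothetical and are used only for illustration.

```python
def sensor_count(rows, cols):
    """Total number of sensors in a rows x cols array."""
    return rows * cols


def density_per_mm2(pitch_um):
    """Sensors per square millimeter for a square grid with the given pitch.

    Assumes one sensor per pitch x pitch cell.
    """
    pitch_mm = pitch_um / 1000.0
    return 1.0 / (pitch_mm ** 2)


print(sensor_count(512, 512))      # the 512 x 512 example array
print(sensor_count(1348, 1152))    # the array of Example 3
print(round(density_per_mm2(5.1))) # density at a 5.1 micron pitch
```

The computed count for the 1348×1152 array is consistent with the stated 1.55 million sensors, and the computed density is consistent with the stated figure of about 38400 sensors per mm.sup.2.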
(589) Improving Pixel and Array Signal-to-Noise Ratio
(590) The reliability of signal decoding from each ISFET and from the ISFET array as a whole is dependent on the amplitude of the signal output by each ISFET, and its respective signal-to-noise ratio. Some changes to the foregoing fabrication methods, judicious materials selection, and changes to pixel and array design can be employed to increase considerably the output of the ISFETs in the array and decrease various noise sources. That is, these changes result in a more sensitive and more accurate sensor array. The improvements can be implemented individually or in various combinations, and can result in significant performance gains to the signal-to-noise ratio (SNR).
(591) As more fully discussed below, the improvements involve: (1) over-coating (i.e., passivating) the sidewalls (typically formed of TEOS-oxide or another suitable material, as above-described) and sensor surface at the bottom of the microwells with various metal oxide or like materials, to improve their surface chemistry (i.e., make the sidewalls less reactive) and electrical properties; (2) thinning out the coating (deposition material) on the floating gate; (3) increasing the surface area for charge collection at the floating gate; and (4) modifying array and pixel designs to reduce charge injection into the electrolyte and other noise sources.
(592) Floating Gate Deposition Layer Material and Thickness
(593) As illustrated in
(594) It is well known that capacitances in series form a capacitive voltage divider. Consequently, only a fraction of the signal voltage, V.sub.S, generated by or in the analyte, is applied to the gate oxide as the voltage V.sub.G that drives the ISFET. If we define the gate gain as V.sub.G/V.sub.S, one would ideally like to have unity gain, i.e., no signal loss across any of the three capacitances. Of course, unity gain is not achievable, but the actual gate gain can be optimized. The value of C.sub.DL is a function of material properties and is typically on the order of about 10-40 µF/cm.sup.2. The gate oxide capacitance is typically a very small value by comparison. Thus, by making C.sub.FGD much greater than the series combination of C.sub.OX and C.sub.DL (for short, C.sub.FGD>>C.sub.OX), the gate gain can be made to approach unity as closely as is practical.
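The divider relationship can be made concrete with a small calculation. The sketch below assumes an ideal series chain of three capacitors (C.sub.DL, C.sub.FGD and C.sub.OX) carrying a common charge, with V.sub.G taken across C.sub.OX; the function name and the capacitance values are hypothetical and chosen only for illustration.

```python
def gate_gain(c_dl, c_fgd, c_ox):
    """Gate gain V_G / V_S for three ideal capacitors in series.

    The same charge Q sits on each series capacitor, so the voltage across
    capacitor i is Q / C_i. V_G is the drop across C_OX and V_S is the sum
    of all three drops.
    """
    inverse_sum = 1.0 / c_dl + 1.0 / c_fgd + 1.0 / c_ox
    return (1.0 / c_ox) / inverse_sum


# Illustrative values in farads (assumptions, not taken from the text).
c_dl = 1e-12   # double-layer capacitance, large compared with the others
c_ox = 3e-15   # gate oxide capacitance

for c_fgd in (2e-15, 2e-14, 2e-13):
    print(f"C_FGD = {c_fgd:.0e} F -> gate gain = {gate_gain(c_dl, c_fgd, c_ox):.3f}")
```

In this model the gain rises toward unity as C.sub.FGD grows relative to C.sub.OX, which is the behavior the paragraph describes.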
(595) To achieve the relationship C.sub.FGD>>C.sub.OX, one can minimize C.sub.OX, maximize C.sub.FGD, or both. There is not a lot that can be done to alter the gate oxide capacitance much when using standard CMOS foundry techniques to fabricate the ISFETs. That is, for practical reasons one must typically accept the gate oxide capacitance value as a given. Thus, emphasis may be placed on maximizing C.sub.FGD. Such maximization can be achieved by using a thin layer of high dielectric constant material, or by increasing the area of the floating gate metallization. Since increasing floating gate area conflicts with a goal of having a high density sensor array, attention has been focused on the dielectric layer.
(596) Materials exist and may be used that have higher dielectric constants than the customary CMOS gate oxide material, silicon dioxide. So, if in the course of fabrication such a gate oxide material has been deposited onto the floating gate metallization, one may etch away that material, essentially eliminating it, and deposit a suitable high dielectric constant floating gate dielectric layer directly onto the floating gate metallization. Or, one may simply deposit such a floating gate dielectric layer directly onto the floating gate metallization without having to etch first. In either situation, there are then only two series capacitances that matter between the analyte and the ISFET gate, C.sub.DL and C.sub.FGD. Gate gain can then be maximized by making C.sub.FGD>>C.sub.DL. Thus, achieving a large value for C.sub.FGD is desirable, while also satisfying other requirements (e.g., reliable manufacture).
(597) The capacitance C.sub.FGD is essentially formed by a parallel plate capacitor having the floating gate dielectric layer as its dielectric. Consequently, for a given plate (i.e., floating gate metallization) area, the parameters principally available for increasing the value of C.sub.FGD are (1) the thickness of the dielectric layer and (2) the selection of the dielectric material and, hence, its dielectric constant. The capacitance of the floating gate dielectric layer varies directly with its dielectric constant and inversely with its thickness. Thus, a thin, high-dielectric-constant layer would be preferred, to satisfy the objective of obtaining maximum gate gain.
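The dependence on thickness and dielectric constant can be illustrated with the ideal parallel-plate formula C = ε.sub.0 ε.sub.r A/d. The sketch below is a minimal illustration under that idealization (fringing fields ignored); the function name, the plate area, and the relative permittivity of 7.5 are assumptions chosen for the example.

```python
EPSILON_0 = 8.8541878128e-12  # vacuum permittivity in F/m


def parallel_plate_capacitance(rel_permittivity, area_m2, thickness_m):
    """Ideal parallel-plate capacitance in farads (fringing fields ignored)."""
    return EPSILON_0 * rel_permittivity * area_m2 / thickness_m


area = 3.6e-11   # illustrative plate area in m^2 (a 6 um x 6 um plate)
eps_r = 7.5      # assumed relative permittivity of the dielectric

thick = parallel_plate_capacitance(eps_r, area, 1.3e-6)   # 1.3 um layer
thin = parallel_plate_capacitance(eps_r, area, 400e-10)   # 400 Angstrom layer

print(f"thick layer: {thick:.3e} F")
print(f"thin layer:  {thin:.3e} F")
print(f"ratio thin/thick: {thin / thick:.1f}")
```

Because capacitance scales inversely with thickness, thinning the layer by a given factor raises C.sub.FGD by the same factor, and a material with a higher dielectric constant raises it proportionally further.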
(598) One candidate for the floating gate dielectric layer material is the passivation material used by standard CMOS foundry processes. The standard (typically, PECVD nitride or, to be more precise, silicon nitride over silicon oxynitride) passivation layer is relatively thick when formed (e.g., about 1.3 µm), and typical passivation materials have a limited dielectric constant. A first improvement can be achieved by thinning the passivation layer after formation. This can be accomplished by etching back the CMOS passivation layer, such as by using an over-etch step during microwell formation, to etch into and consume much of the nitride passivation layer, leaving a thinner layer, such as a layer only about 200-600 Angstroms thick. While simple, this approach is prone to wafer-to-wafer etch variations, resulting in variability in the final passivation layer thickness and capacitance.
(599) Two approaches have been at least partially evaluated for etching a standard CMOS passivation layer of silicon nitride deposited over silicon oxynitride. We call the first approach the partial etch technique. It involves etching away the silicon nitride layer plus approximately half of the silicon oxynitride layer before depositing the thin-film metal oxide sensing layer. The second approach we call the etch-to-metal technique. It involves etching away all of the silicon nitride and silicon oxynitride layers before depositing the thin-film metal oxide sensing layer. Theoretical modeling indicates that the partial etch approach should lead to an ISFET gate gain of about 0.42. This corresponds to an increase of signal level by about 50% compared with a non-etched passivation layer. With an ALD Ta.sub.2O.sub.5 thin-film sensing layer deposited over a partial etch, ISFET gains from about 0.37 to about 0.43 have been obtained empirically, with sensor sensitivities of about 15.02-17.08 mV/pH.
(600) Theoretical modeling indicates that in the etch-to-metal approach with the same sensing layer, an ISFET gate gain of about 0.94 should be possible. This would correspond to a greater than three-fold increase in signal. With an imperfect etch process that does not produce a uniform etch across the surface of the floating gate, the empirically obtained gain has only been about 0.6, corresponding to a little more than doubling of the signal. With improved etch chemistry/process to obtain a more uniform and flat surface at the bottom of the well, a gain close to the model 0.94 gain should be possible.
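The gain figures quoted for the two etch approaches are mutually consistent, as the short check below suggests. The baseline gain of the non-etched passivation layer is not stated in the text; here it is inferred from the statement that a gain of 0.42 corresponds to an increase of about 50%, so it should be read as an assumption. The function name is hypothetical.

```python
def implied_baseline_gain(gain, fractional_increase):
    """Baseline gain implied by a gain that is `fractional_increase` higher."""
    return gain / (1.0 + fractional_increase)


# Inferred, not stated in the text: gain of the non-etched passivation layer.
baseline = implied_baseline_gain(0.42, 0.50)

model_etch_to_metal = 0.94     # modeled gain, etch-to-metal
measured_etch_to_metal = 0.60  # empirically obtained gain, imperfect etch

print(f"implied baseline gain:       {baseline:.2f}")
print(f"modeled signal increase:     {model_etch_to_metal / baseline:.2f}x")
print(f"measured signal increase:    {measured_etch_to_metal / baseline:.2f}x")
```

With the inferred baseline, the modeled etch-to-metal gain corresponds to somewhat more than a three-fold signal increase and the measured gain to a little more than a doubling, matching the description.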
(601) One promising approach for improving the uniformity and flatness of the etch process is to perform two or more separate etches in series, i.e., to use a multi-step etch process. A first etch step may be performed and the progress of that etch step may be monitored optically, at one or multiple wavelengths. When it is detected that the first step etch has exposed a part of the underlying metal surface, the first etch process can be stopped and a second process or step may be begun, using conditions that will remove the dielectric material without removing (much of) the metal.
(602) An alternative to use of the foregoing etch processes is to simply deposit a thinner layer of dielectric (passivation) material in the first place, such as the indicated 200-600 Angstroms instead of the 1.3 µm of the conventional CMOS passivation process. Even better performance can be achieved with the use of other materials and deposition techniques to form a thin dielectric layer, preferably one of relatively higher dielectric constant. Among the materials believed useful for the floating gate dielectric layer are metal oxides such as tantalum oxide, tungsten oxide, aluminum oxide, and hafnium oxide, though other materials of dielectric constant greater than that of the usual silicon nitride passivation material may be substituted, provided that such material is sensitive to the ion of interest.
(603) The etch-to-metal approach is preferred, with the CMOS process' passivation oxide on the floating gate being etched completely away prior to depositing the floating gate dielectric material layer. That dielectric layer may be applied directly on the metal extended ISFET floating gate electrode. This will help maximize the value of the capacitance C.sub.FGD.
(604) Among the processes which may be used for depositing a thin layer of floating gate dielectric material are reactive or non-reactive sputtering, electron cyclotron resonance (ECR), e-beam evaporation, and atomic layer deposition (ALD), though any suitable technique may be employed. Each of the foregoing processes has well known characteristics. Importantly, however, these processes differ in their abilities to provide conformal and uniform films, which are qualities that may be important for some applications. Thus, all are usable, but they are not necessarily equally desirable. Of the four enumerated techniques, ALD appears to be superior with respect to the particular desired qualities. It is good for depositing layers whose thickness can be controlled precisely so that wafer-to-wafer repeatability is not a problem. Also, it is a low-temperature process that does not threaten the aluminum interconnects that typically already will have been formed on the wafer by the time the floating gate dielectric material layer is applied. ALD, moreover, promises to enable conformal, pinhole-free and crack-free film coverage on the well bottom, which is required; and it is compatible with extending the deposition from the well bottoms onto the high aspect ratio (i.e., steep) well sidewalls. Covering the sidewalls with a passivation or buffering layer will render them more inert to the analyte.
(605) To create such structures, a layer of microwells should be formed on top of the ISFETs wherein the microwells are open at their bottoms. If the structure is to be formed without a floating gate dielectric layer other than a conventional passivation material over the floating gate, then the passivation material preferably should be partially etched down to the desired thinness. This alone increases the floating gate dielectric capacitance C.sub.FGD relative to C.sub.OL, improving gate gain. Optionally, a thin, higher-dielectric constant layer may be deposited or otherwise formed over such passivation material.
(606) The deposited layer should preferably be relatively thin, e.g., only about 200-600 Angstroms thick, possibly even less. As the thin layer of floating gate dielectric material is deposited over the well bottom onto the floating gate or its immediate coating layer, it also may be allowed to deposit conformally over the well sidewalls using, for example, the aforementioned ALD process.
(607) The potential for improvement is considerable. As a starting point, consider one standard CMOS foundry passivation material, silicon nitride, Si.sub.3N.sub.4. This particular material has a sub-Nernstian response to pH. Consequently, the best response we have been able to measure is about 40 mV/pH for an ISFET sensor with a silicon nitride floating gate deposition layer (though some improvement might be obtainable with improved nitride deposition). This is considerably less than the ideal Nernstian response of 59 mV/pH at 25° C. Thus, about one-third of the signal voltage at the interface between the analyte and the floating gate deposition layer is lost due to use of materials with so great a sub-Nernstian response. Indeed, in one example, simulations indicated that a three-fold improvement in gate gain is possible with changes in both floating gate deposition material and floating gate deposition layer thickness, for the gate geometries studied. This was then corroborated empirically with electrical test results on a 400 Angstrom aluminum oxide (Al.sub.2O.sub.3) floating gate deposition material.
(608) From available literature or experimentation, one can determine that in addition to Al.sub.2O.sub.3, there are other metal oxides that can be substituted for silicon nitride at the well bottom, to obtain a closer to Nernstian response. For example, Table 6 compares the pH response of ISFETs with various floating gate deposition oxides (specifically, SiO.sub.2, Si.sub.3N.sub.4, Al.sub.2O.sub.3 and Ta.sub.2O.sub.5), using published data.
(609) TABLE-US-00006 TABLE 6
Characteristic               SiO.sub.2         Si.sub.3N.sub.4   Al.sub.2O.sub.3   Ta.sub.2O.sub.5
pH range                     4-10              1-13              1-13              1-13
Sensitivity (mV/pH)          23-35 (pH > 7)    46-56             53-57             56-57
                             37-48 (pH < 7)
Sensitivity (mV/pX)
  Na+                        30-50             5-20              2                 <1
  K+                         20-30             5-25              2                 <1
Response time
  (95%) (s)                  1                 <0.1              <0.1              <0.1
  (98%) (min)                Undefined         4-10              2                 1
Drift (mV/hr, pH 1-7)        Unstable          1.0               0.1-0.2           0.1-0.2
(610) Of the four materials compared in Table 6, SiO.sub.2 had the lowest sensitivity to pH and no linear dependence on pH. The literature indicated that Si.sub.3N.sub.4 had higher sensitivities (46-56 mV/pH) but experiments have shown its performance to be dependent on the type of deposition technique and oxygen content. The best reported materials were Al.sub.2O.sub.3 and Ta.sub.2O.sub.5, which exhibited higher sensitivity in the ranges of 53-57 and 56-57 mV/pH, respectively. One other study has indicated that tungsten oxide, WO.sub.3, a material with a high dielectric constant (about 300), has a sensitivity of 50 mV/pH.
(611) Consequently, the data indicate that using a floating gate deposition material such as Ta.sub.2O.sub.5, Al.sub.2O.sub.3, HfO.sub.2 or WO.sub.3 will result in a larger signal in response to pH changes. In other words, if it is assumed that the sensitivity of Ta.sub.2O.sub.5 is 56 mV/pH and that the Nernstian gain is defined as the material sensitivity divided by the ideal Nernstian response of 59 mV/pH at 25° C., then the Nernstian gain increases from 0.67 for Si.sub.3N.sub.4 to about 0.95-0.96 for Ta.sub.2O.sub.5. Thus, with Ta.sub.2O.sub.5, only about 4-5% of the signal voltage is lost across the floating gate deposition layer.
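The Nernstian gain defined above can be computed directly. The sketch below derives the ideal slope from the standard expression RT ln(10)/F and then divides a material's sensitivity by it; the function names are hypothetical, and the sensitivities are the figures quoted in the text and in Table 6.

```python
import math

GAS_CONSTANT = 8.314462618      # J/(mol K)
FARADAY_CONSTANT = 96485.33212  # C/mol


def nernst_slope_mv_per_ph(temperature_c=25.0):
    """Ideal Nernstian response in mV per pH unit at the given temperature."""
    temperature_k = temperature_c + 273.15
    return 1000.0 * GAS_CONSTANT * temperature_k * math.log(10) / FARADAY_CONSTANT


def nernstian_gain(sensitivity_mv_per_ph, temperature_c=25.0):
    """Material sensitivity divided by the ideal Nernstian response."""
    return sensitivity_mv_per_ph / nernst_slope_mv_per_ph(temperature_c)


print(f"ideal slope at 25 C: {nernst_slope_mv_per_ph():.2f} mV/pH")
for material, sensitivity in (("Si3N4", 40.0), ("Ta2O5", 56.0), ("Ta2O5", 57.0)):
    gain = nernstian_gain(sensitivity)
    print(f"{material} at {sensitivity:.0f} mV/pH: gain {gain:.2f}, loss {1 - gain:.0%}")
```

The computed slope rounds to the 59 mV/pH figure used in the text, and the resulting gains are close to the quoted values for silicon nitride and tantalum oxide.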
(612) The deposition of a thin film floating gate deposition layer over the side walls of the microwells provides a further benefit. By coating the walls with a material whose pKa value is more conducive to analyte pH conditions than the TEOS oxide sidewall above mentioned, the floating gate deposition material buffers the sidewalls so that surface reactions there capture fewer of the protons that otherwise would be available as signal generators once they reach the gate region.
(613) Thus, the above-taught thin-film floating gate deposition layers provide a three-fold benefit: First, they enhance sensor performance at the sensor surface by providing a more reactive interface between the analyte and the ISFET gate (or, in other words, they are more Nernstian). Second, they serve as a replacement, thinner dielectric between the analyte and the metal ISFET gate (if directly applied to the metal) or between the analyte and the gate oxide (if applied over a gate oxide layer), thereby increasing the coupling capacitance and gate gain. Third, if also used to cover the microwell sidewalls, as would be typical for most deposition processes, they provide buffering by coating the TEOS-oxide sidewalls with a material whose pKa differs more substantially from the analyte pH than that of the sidewall material itself.
(614) There are also materials such as Iridium oxide which provide super-Nernstian responses, which can provide a still further improvement in SNR if used as the thin film floating gate deposition layer. See, e.g., D. O. Wipf et al, Microscopic Measurement of pH with Iridium Oxide Microelectrodes, Anal. Chem. 2000, 72, 4921-4927, and Y. J. Kim et al, Configuration for Micro pH Sensor, Electronics Letters, Vol. 39, No. 21 (Oct. 16, 2003).
(616) TABLE-US-00007 TABLE 7
Gate oxide thickness (m)                  7.70E-09, typ.
ISFET gate length (m)                     6.00E-07
ISFET gate width (m)                      1.20E-06
ISFET gate area (m.sup.2)                 7.20E-13
ISFET gate capacitance (F)                3.23E-15
ISFET sensor plate side length (m)        6.00E-06
ISFET sensor plate area (m.sup.2)         3.60E-11
ISFET sensor plate capacitance (F)        1.84E-15
(617) The conditions for
(618) TABLE-US-00008 TABLE 8
Gate oxide thickness (m)                  7.70E-09, typ.
ISFET gate length (m)                     5.00E-07
ISFET gate width (m)                      1.20E-06
ISFET gate area (m.sup.2)                 6.00E-13
ISFET gate capacitance (F)                2.69E-15
ISFET sensor plate side length (m)        3.50E-06
ISFET sensor plate area (m.sup.2)         1.23E-11
ISFET sensor plate capacitance (F)        6.25E-16
(619) The conditions for
(620) TABLE-US-00009 TABLE 9
Gate oxide thickness (m)                  3.80E-09
ISFET gate length (m)                     4.00E-07
ISFET gate width (m)                      7.00E-07
ISFET gate area (m.sup.2)                 2.80E-13
ISFET gate capacitance (F)                2.54E-15
ISFET sensor plate side length (m)        1.60E-06
ISFET sensor plate area (m.sup.2)         2.56E-12
ISFET sensor plate capacitance (F)        9.71E-17
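The gate entries of Tables 7-9 can be cross-checked against one another. The sketch below recomputes each gate area from its length and width and each gate capacitance from the ideal parallel-plate formula, assuming a relative permittivity of 3.9 for a silicon dioxide gate oxide; that permittivity is a textbook value and an assumption here, since the tables do not state it. The dictionary layout and function name are hypothetical.

```python
EPSILON_0 = 8.8541878128e-12  # vacuum permittivity in F/m
EPS_R_SIO2 = 3.9              # assumed relative permittivity of the gate oxide

# Gate entries transcribed from Tables 7, 8 and 9 (SI units).
TABLES = {
    "Table 7": {"t_ox": 7.70e-9, "length": 6.00e-7, "width": 1.20e-6,
                "area": 7.20e-13, "c_gate": 3.23e-15},
    "Table 8": {"t_ox": 7.70e-9, "length": 5.00e-7, "width": 1.20e-6,
                "area": 6.00e-13, "c_gate": 2.69e-15},
    "Table 9": {"t_ox": 3.80e-9, "length": 4.00e-7, "width": 7.00e-7,
                "area": 2.80e-13, "c_gate": 2.54e-15},
}


def gate_capacitance(area_m2, oxide_thickness_m, eps_r=EPS_R_SIO2):
    """Ideal parallel-plate gate capacitance in farads."""
    return EPSILON_0 * eps_r * area_m2 / oxide_thickness_m


for name, row in TABLES.items():
    area = row["length"] * row["width"]
    c_gate = gate_capacitance(area, row["t_ox"])
    print(f"{name}: area {area:.2e} m^2 (listed {row['area']:.2e}), "
          f"C_gate {c_gate:.2e} F (listed {row['c_gate']:.2e})")
```

Under the stated assumption, the recomputed gate areas and capacitances agree with the listed values to within rounding.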
(621) The deposited ALD thin film layers discussed above, like all deposited thin-films, have an intrinsic stress and stress gradient resulting from material properties and/or deposition conditions. These properties can affect the adhesion of the deposited film to the underlying substrate (the floating gate metallization and microwell sidewalls). In the fabrication examples above, various metal-oxide ceramic materials are to be deposited onto silicon dioxide (i.e., the TEOS material of the microwells), silicon nitride (i.e., the remaining CMOS passivation material that has been etched through but which is still present on the bottom of the sidewalls) and aluminum (i.e., the metal ISFET floating gate electrode).
(622) Some ALD processes involve depositing materials at temperatures below 400° C.; others, at temperatures above 400° C. As described above, an end-of-line forming gas anneal above 400° C. may be employed as part of the CMOS trapped charge neutralization process. The ALD layers deposited at temperatures less than this tend to delaminate or spall off the silicon dioxide sidewalls. It has been found empirically that Ta.sub.2O.sub.5 (deposited at 325° C.) spalls off the well sidewalls and Al.sub.2O.sub.3 (deposited at 460° C.) does not.
(623) Two methods are proposed to correct this problem, as applied to fabricating an optimum microwell passivation/floating gate dielectric (protection) material into a microwell and onto an ISFET sensor gate. In a first method, a laminated film may be used to relieve the stress in the as-deposited metal oxide ceramic. In a second method, a glue layer is first deposited, having superior adhesion onto which a microwell passivation/floating gate dielectric (protection) material of optimum surface chemistry is deposited.
(624) As an example, the laminate layer may be an approximately 400 Angstrom thick structure of alternating layers of Ta.sub.2O.sub.5 and Al.sub.2O.sub.3 (for instance, but not limited to, each about 10-20 Angstroms thick, or of different thicknesses). It is believed to be preferable to start with Al.sub.2O.sub.3 (as it exhibits better adhesion to oxide) and terminate with Ta.sub.2O.sub.5 (for its superior surface chemistry). The ALD process is ideally suited to this, as film thicknesses can be controlled down to the atomic layer (i.e., a few Angstroms) and deposition can be switched easily from one material to another simply by switching the precursor gases introduced into the reactor system.
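The laminate described above can be laid out programmatically. The sketch below is only an illustration: it assumes equal-thickness layers and an even layer count so that the stack starts with Al.sub.2O.sub.3 and ends with Ta.sub.2O.sub.5; the function name and parameters are hypothetical.

```python
def laminate_stack(total_angstrom, layer_angstrom, bottom="Al2O3", top="Ta2O5"):
    """Return the layer sequence, bottom first, for an alternating laminate.

    Assumes equal-thickness layers. The layer count is rounded up to an even
    number so the stack begins with `bottom` and terminates with `top`.
    """
    count = round(total_angstrom / layer_angstrom)
    if count % 2:
        count += 1  # force an even count so the last layer is `top`
    return [bottom if index % 2 == 0 else top for index in range(count)]


stack = laminate_stack(400, 20)
print(len(stack), "layers:", stack[0], "...", stack[-1])
```

With 20 Angstrom layers a 400 Angstrom laminate has 20 layers, and with 10 Angstrom layers it has 40, in both cases beginning on the adhesion material and ending on the sensing material.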
(625) The overall stress of such a laminate layer would be a combination of the intrinsic stresses, compressive and tensile, of the individual layers. More than two materials could be used if, say, a tertiary laminate were required.
(626) The glue layer idea is a more straightforward implementation. First, a very thin (e.g., 50 Angstrom) layer of good adherent material (e.g., high temperature Al.sub.2O.sub.3) may be deposited and then, immediately following that, a thicker (e.g., 400 Angstrom) layer of Ta.sub.2O.sub.5.
(627) Increasing Floating Gate Surface Area
(628) The various metal oxide materials discussed above for improving the surface properties both of the well surface and/or of the sensor surface at the bottom of the well are not electrically conductive. However, one can create an extended floating gate electrode underneath such material, extending the electrically conductive properties of the ISFET gate electrode, by first depositing and planarize-etching a thin conformal metal coating prior to the passivation layer deposition. The removal via CMP (chemical-mechanical polishing) or other etch techniques of the thin-metal from the tops of the microwells would realize discrete electrically isolated wells having passivated gates consisting of substantially the entire interior surface area of the microwell sidewalls. This would increase the available surface area of the ISFET gate several fold. Doing so would virtually eliminate lost protons at microwell walls (i.e., those protons emanating from the sequencing reaction on the bead and otherwise hitting the non-sensing microwell wall).
(629) The extended floating gate dielectric capacitor would be formed (by, e.g., ALD) after microwell etch. Adjustments to the microwell lateral dimensions could be necessary, depending on the thickness of the thin-metal plus passivation layers being deposited, and the bead size.
(631) As an alternate fabrication process, the removal of material 75E4 could be done by patterning an inverse of the microwell pattern as a mask (i.e., opening the areas between the wells) and then using a standard metal etch.
(632) The collection of all charge that reaches the microwell sidewalls, as well as its bottom, renders the pixels more sensitive to the reaction in the wells.
(633) Another way to improve charge collection and sensitivity is to employ for the surfaces contacting the electrolyte a material that has a point of zero charge that matches the operating pH of the analyte.
(634) Improved On-Chip Electronics
(635) There are two key areas where the on-chip electronics can potentially be improved to increase the voltage signal gain and to reduce noise: the pixel circuit and the readout circuit.
(636) Pixel Circuit
(637) In a basic pixel circuit such as is illustrated above in
V.sub.T=V.sub.T0+γ(√(2|φ.sub.F|+V.sub.SB)−√(2|φ.sub.F|))
(638) where V.sub.T0 is the threshold voltage when the source voltage is equal to the bulk potential, 2|φ.sub.F| is the surface potential at threshold, and γ is the body effect coefficient. Consequently, the threshold voltage will vary due to the body effect; and the ISFET source gain, defined as the ratio VS/VG, will be less than the ideal value of unity. Although it cannot be measured directly, it is thought that the ISFET source gain is in the neighborhood of 0.9. In other words, up to 10% of the maximum voltage signal that could be measured may be lost due to the body effect.
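The effect of the body effect on source gain can be estimated numerically. The sketch below evaluates the threshold voltage expression above and a first-order small-signal source gain of 1/(1+η), where η is the derivative of V.sub.T with respect to V.sub.SB. This approximation assumes a source follower biased at constant current with magnitudes treated as positive quantities; it is not taken from the text, and the parameter values and function names are hypothetical.

```python
import math


def threshold_voltage(v_t0, gamma, two_phi_f, v_sb):
    """Threshold voltage including the body effect (all values in volts)."""
    return v_t0 + gamma * (math.sqrt(two_phi_f + v_sb) - math.sqrt(two_phi_f))


def source_gain(gamma, two_phi_f, v_sb):
    """First-order source gain of a constant-current source follower.

    eta = dV_T/dV_SB = gamma / (2 * sqrt(2|phi_F| + V_SB)); gain = 1 / (1 + eta).
    """
    eta = gamma / (2.0 * math.sqrt(two_phi_f + v_sb))
    return 1.0 / (1.0 + eta)


# Illustrative parameters (assumptions, not taken from the text).
V_T0 = 0.7        # volts
GAMMA = 0.3       # body effect coefficient, V^0.5
TWO_PHI_F = 0.7   # surface potential at threshold, volts

for v_sb in (0.0, 0.5, 1.0):
    vt = threshold_voltage(V_T0, GAMMA, TWO_PHI_F, v_sb)
    gain = source_gain(GAMMA, TWO_PHI_F, v_sb)
    print(f"V_SB = {v_sb:.1f} V: V_T = {vt:.3f} V, source gain = {gain:.3f}")
```

With these illustrative parameters the estimated source gain falls near 0.9, in line with the figure suggested in the text, and it becomes exactly unity when the body effect coefficient is zero, which corresponds to tying source and bulk together.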
(639) If each ISFET is placed in its own n-well, and the source and bulk terminals are connected together, then the body effect can be eliminated and an ISFET source gain of unity can be realized. Furthermore, if each ISFET is isolated from the rest of the chip by a reverse-biased diode between the n-well and the substrate, then the device will be less susceptible to substrate noise. In other words, the total ISFET noise should be lower if the device is located in its own n-well.
(640) Reducing Injection of Noise into Electrolyte
(641) A second aspect to improving the SNR is that of reducing noise. A major component of such noise is noise that is coupled into the analyte fluid by the pixels in every column of the array, due to the circuit dynamics. Two noise injection mechanisms have been identified: the drain side column buffer injects noise through each pixel and each row selection pumps charge into the fluid. These mechanisms are focused on the ISFET drain and on the ISFET source.
(642) The ISFET Drain Problem
(643) When a row is selected in the array, the drain terminal voltage shared between all of the ISFETs in a column moves up or down (as a necessary requirement of the source-and-drain follower). This changes the gate-to-drain capacitances of all of the unselected ISFETs in the column. In turn, this change in capacitance couples from the gate of every unselected ISFET into the fluid, ultimately manifesting itself as noise in the fluid (i.e., an incorrect charge, one not due to the chemical reaction being monitored). That is, any change in the shared drain terminal voltage can be regarded as injecting noise into the fluid by each and every unselected ISFET in the column. Hence, if the shared drain terminal voltage of the unselected ISFETs can be kept constant when selecting a row in the array, this mechanism of coupling noise into the fluid can be reduced or even effectively eliminated.
(644) The ISFET Source Problem
(645) When a row is selected in the array, the source terminal voltage of all of the unselected ISFETs in the column also changes. In turn, that changes the gate-to-source capacitance of all of these ISFETs in the column. This change in capacitance couples from the gate of every unselected ISFET into the fluid, again ultimately manifesting itself as noise in the fluid. That is, any change in the source terminal voltage of an unselected ISFET in the column can be regarded as an injection of noise into the fluid. Hence, if the source terminal voltage of the unselected ISFETs can be kept constant when selecting a row in the array, this mechanism of coupling noise into the fluid can be reduced or even effectively eliminated.
(646) A column buffer may be used with some passive pixel designs to alleviate the ISFET drain problem but not the ISFET source problem. Thus, a column buffer most likely is preferable to the above-illustrated source-and-drain follower. With the illustrated three-transistor passive pixels employing a source-and-drain follower arrangement, there are essentially two sense nodes, the ISFET source and drain terminals. By connecting the pixel to a column buffer and grounding the drain terminal of the ISFET, there will be only one sense node: the ISFET source terminal. So the drain problem is eliminated.
(647) Active Pixel Design
(648) All of the above-discussed passive pixel circuits present noise and scalability challenges. That is, increasing the size of the array typically leads to increased bus capacitance and a non-linear increase in power needs. Increasing readout speed comes at the expense of increased readout noise. Replacing passive ISFET pixels with active pixels, each having an active amplifier transistor as an integral element, can reduce noise coupled into the fluid, along with reducing readout noise, low frequency noise and fixed pattern noise. This approach, moreover, appears to provide a low-noise ISFET pixel that successfully eliminates both the ISFET drain problem and the ISFET source problem, the latter because the sense node (i.e., ISFET source terminal) is decoupled from the column bus.
(649) Note that, to make a measurement, the sense node has to be connected to a current source, with current flowing. However, switching the current source off and on introduces a disturbance of its own at the sense node, and thus couples noise into the fluid. To avoid this problem and further improve the signal-to-noise ratio, a single-transistor current source can be introduced into each active ISFET pixel. Current then would be flowing through every pixel in every column of the array all of the time. Of course, there are obvious implications for power consumption, and it would be advisable to operate this current source transistor in sub-threshold mode to minimize power consumption.
(650) Turning to
(651) A second example of a four-transistor active pixel 75G1 is shown in
(652) By sharing some transistors between pixels, the average number of transistors per pixel, and hence pixel size, can be reduced. For example, see the arrangement of
(653) An example of a six-transistor active pixel 75I0 is shown in
(654) By taking two samples per pixel (i.e., using correlated double sampling (CDS)), the six-transistor pixel 75I0 can suppress 1/f noise and fixed pattern noise due to threshold voltage variations.
(655) As shown in
(656) Moving from a passive pixel design to an active amplified pixel design thus improves the scalability of the design and reduces the readout noise. A single-MOSFET current source is required in each ISFET pixel to avoid coupling noise into the fluid. By increasing the number of transistors per pixel, correlated double sampling can be used at the pixel level to reduce flicker noise and fixed pattern noise. Further, the shared pixel concept can be used to reduce the effective number of transistors per pixel to achieve a smaller pixel size.
(657) To reduce power consumption, the FETs (or selected ones of them) can be operated in the so-called weak inversion or sub-threshold mode.
(658) Readout Circuit
(659) The above-described readout circuit, which comprises both sample-and-hold and multiplexer blocks, also has a gain that is less than the ideal value of unity. Furthermore, the sample-and-hold block contributes a significant percentage of the overall chip noise, perhaps more than 25%. From switched-capacitor theory, the sample-and-hold kT/C noise is inversely proportional to capacitance. Hence, by choosing a larger capacitor, the sample-and-hold noise can be reduced. Another approach to reducing noise is to employ Correlated Double Sampling (CDS), where a second sample-and-hold and difference circuit is used to cancel out correlated noise. This approach is discussed at greater length, below.
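The dependence of sample-and-hold noise on capacitor size can be checked with a short calculation. The sketch below evaluates the standard kT/C expression; the 300 K temperature and the capacitor values are illustrative assumptions rather than values from this disclosure. Note that the noise power (mean-square voltage) is inversely proportional to capacitance, so the rms noise voltage falls only as the square root of the capacitance.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant

def ktc_noise_vrms(capacitance_f, temperature_k=300.0):
    """RMS thermal noise voltage sampled onto a capacitor: sqrt(kT/C)."""
    if capacitance_f <= 0:
        raise ValueError("capacitance must be positive")
    return math.sqrt(BOLTZMANN_J_PER_K * temperature_k / capacitance_f)

if __name__ == "__main__":
    # Assumed example capacitor sizes, in farads.
    for c in (0.25e-12, 1e-12, 4e-12):
        print(f"C = {c * 1e12:.2f} pF -> {ktc_noise_vrms(c) * 1e6:.1f} uV rms")
```

Quadrupling the capacitor halves the rms noise, which is why a larger sample-and-hold capacitor reduces noise at the cost of area and settling time.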
(660) Correlated Double Sampling
(661) Correlated Double Sampling (CDS) is a known technique for measuring electrical values such as voltages or currents that allows for removal of an undesired offset. The output of the sensor is measured twice: once in a known condition and once in an unknown condition. The value measured from the known condition is then subtracted from the unknown condition to generate a value with a known relation to the physical quantity being measured. The challenge here is how to be efficient in implementing CDS and how to address both correlated noise and the minimization of noise injection into the analyte fluid.
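The subtraction at the heart of CDS can be illustrated with a small numerical model. This is a minimal sketch, assuming a sensor output that is the sum of the signal, an offset common to both samples, and optional uncorrelated noise; the function name `cds_readout` and all values are hypothetical and are not taken from the circuits described here.

```python
import random

def cds_readout(signal_v, offset_v, noise_sigma_v=0.0, rng=None):
    """Return one correlated-double-sampled measurement in volts.

    The reference sample (known condition) carries only the offset; the
    sensed sample (unknown condition) carries signal plus the same offset.
    Subtracting the two removes the shared offset. Any uncorrelated noise
    (noise_sigma_v) is NOT removed by the subtraction.
    """
    rng = rng or random.Random(0)
    reference = offset_v + rng.gauss(0.0, noise_sigma_v)
    sensed = signal_v + offset_v + rng.gauss(0.0, noise_sigma_v)
    return sensed - reference

if __name__ == "__main__":
    # The same 10 mV signal is recovered regardless of the offset value.
    for offset in (0.0, 0.25, -0.1):
        print(offset, cds_readout(0.010, offset))
```

The model makes the limitation explicit: only the component that is identical in both samples cancels, so the benefit depends on how strongly the noise is correlated between the reference and sensed phases.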
(662) A starting point is the sensor pixel and its readout configuration as expressed in earlier parts of this application. Referring to
(663) As discussed above, the voltage changes on the ISFET source and drain inject noise into the analyte, causing errors in the sensed values. Two constructive modifications can reduce the noise level appreciably, as shown in
(664) The first change is to alter the signals on the ISFET. The feedback loop to the drain of the ISFET is eliminated and the drain is connected to a stable voltage, such as ground. A column buffer 77B is connected to the emitter of transistor.
(665) The second change is to include a circuit to perform CDS on the output of the column buffer. As mentioned above, CDS requires a first, reference value. This is obtained by connecting the input of column buffer 77B1 to a reference voltage via switch 77B2, during a first, or reference phase of a clock, indicated as the SH phase. A combined CDS and sample-and-hold circuit then double samples the output of the column buffer, obtaining a reference sample and a sensed value, performs a subtraction, and supplies a resulting noise-reduced output value, since the same correlated noise appears in the reference sample and in the sensor output.
(666) The operation of the CDS and sample-and-hold circuit is straightforward. The circuit operates on a two-phase clock, the first phase being the SH phase and the second phase being the SHb phase. Typically, the phases will be symmetrical and thus inverted values of each other. The reference sample is obtained in the SH phase and places a charge (and thus a voltage) on capacitor Cin, which is subtracted from the output of the column buffer when the clock phase changes.
(667) An alternative embodiment, still with a passive sensor pixel, is shown in
(668) Digital Pixels and Readouts
(669) As signal-to-noise ratios often can be improved by moving from the analog domain into the digital domain, we have also begun to explore the possibilities for creating digital ISFET pixels and digital pixel readouts.
(670) Consider first the architecture shown in
(671) To achieve higher throughput (i.e., frame rate), one ADC 75L11-75L1n may be used for each column of the array 75K3, as illustrated in
(672) In either of these two cases, parallelism and frame rate can be increased by dividing the array into two groups (75M1, 75M2 in
(673) To go more directly into the digital domain, one has to move from converting an analog array output into generating a digital output directly at each pixel. In general, this requires providing at each pixel some form of analog-to-digital conversion, and memory (at least 1-bit, for each). Converting the analog sensor signal to digital on an in-pixel basis creates an opportunity to achieve the largest possible signal-to-noise ratio (SNR). It also is inherently scalable, allowing high speed, massively parallel readout of digital sensor data, with the frame rate limitations being dominated by array input/output (I/O) transfer speed, owing to the fact that all pixels are converting sensed values to digital form in parallel.
(674) A basic digital pixel architecture 75O1 is as shown in
(675) The concept of sharing circuitry between multiple pixels, to reduce the average chip area per pixel and to reduce the average and total power consumption, can be extended to digital pixel architectures, as well. For example,
(676) With such digital pixels formed into an array, from a readout perspective the array resembles a memory array. Thus, as shown in
(677) The approaches of
(678) Thus, a row and column addressing scheme allows selection of a variably sized sub-region within the array. This facilitates a trade-off of the size of the array being interrogated with the readout speed (i.e. frames per second). A faster sample rate can have a number of potential advantages:
(679) 1.) A faster sample rate when combined with a digital filter can produce better signal-to-noise measurements for the pixels within the sub-region. For example, selecting a sub-region of one-fourth the array size would allow sample rates approximately four times higher for the pixels within the sub-region. A simple filter that averages four consecutive samples together would reduce the final sample rate back down to the nominal whole-array frame rate, but each measurement would only have approximately half the noise content.
(680) 2.) The faster sample rate can be used to examine higher-frequency signals than would otherwise be possible at the nominal whole-array frame rate of the device. For example, selecting a sub-region of one-fourth the array size would allow sample rates approximately four times higher and the bandwidth limit for measured signals would be increased by a factor of four.
(681) 3.) Both cases can be combined to provide both high-frequency response and higher SNR. For example, selecting a sub-region one-sixteenth the whole-array size would allow for both a two-fold increase in SNR and a four-fold increase in bandwidth.
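The arithmetic behind these three cases can be sketched as follows, assuming that the achievable sample rate scales inversely with the fraction of the array being read and that averaging N samples with uncorrelated noise improves SNR by the square root of N. The function name and return convention are illustrative assumptions.

```python
import math

def subregion_tradeoff(region_fraction, samples_averaged):
    """Return (rate_gain, snr_gain, bandwidth_gain) relative to whole-array readout.

    region_fraction  : fraction of the array selected (e.g. 0.25)
    samples_averaged : consecutive samples averaged per output value
    """
    if not 0 < region_fraction <= 1 or samples_averaged < 1:
        raise ValueError("invalid region fraction or averaging count")
    rate_gain = 1.0 / region_fraction             # raw sample-rate multiplier
    snr_gain = math.sqrt(samples_averaged)        # uncorrelated-noise assumption
    bandwidth_gain = rate_gain / samples_averaged # output rate after averaging
    return rate_gain, snr_gain, bandwidth_gain

if __name__ == "__main__":
    print(subregion_tradeoff(1 / 4, 4))   # case 1: SNR only
    print(subregion_tradeoff(1 / 4, 1))   # case 2: bandwidth only
    print(subregion_tradeoff(1 / 16, 4))  # case 3: both
```

A one-fourth sub-region with four-sample averaging doubles the SNR at the nominal frame rate, while a one-sixteenth sub-region with the same averaging gives the two-fold SNR increase together with a four-fold bandwidth increase.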
(682) In some applications, sensitivity and/or signal bandwidth may be more important than the number of active pixels. The availability of variable frame size (which might also be called flexible bandwidth allocation, or perhaps dynamic bandwidth allocation) is valuable in these situations.
(683) With an appropriately segmented array, it would also be possible to perform a rolling sequencing reaction across a large array. One would *slowly* flow dNTP across a large chip. As the wave of dNTP flows across the chip slowly, the sequencing reaction would only be occurring in a small region along the front of the dNTP flow. In theory, it would be possible to synchronize sub-region oversampling with the dNTP front to get very accurate measurements of the entire array.
(684) Protection Diodes
(685) To reduce possible gate oxide degradation during plasma processing (e.g., plasma etch, sputtering, PECVD, etc.), a well diode and/or a substrate diode may be employed, as illustrated in
(686) In
(687) In
(688) As shown in
(689) Reference Electrode Alternatives: The Fluid-Fluid Interface
(690) Overall performance of the instrument also can be enhanced with further attention to the reference electrode. With the above-described simple electrodes involving a metal tube, wire, etc. inserted into the flow cell, it has been observed that the reference potential introduced into the flow cell by such electrodes is not stable. It is sensitive to variations in fluid composition and pH. Accordingly, attention also has been given to devising an improved electrode arrangement, which can introduce a more stable reference potential.
(691) With the simple electrode designs discussed above, the fluid-electrode interface influences the way the reference potential is transmitted into the fluid. That is, the interface potential between the fluid and the electrode fluctuates with the composition of the fluid (which may be somewhat turbulent and inhomogeneous), introducing a voltage offset to the potential of the bulk fluid which varies with time and possibly location, as well.
(692) Considerably greater reference potential stability may be achieved by moving the location of the reference electrode so that it is substantially isolated from changes in fluid composition. This may be accomplished by introducing a conductive solution of a consistent composition over at least part of the surface of the electrode (hereafter the electrode solution), arranging the electrode to avoid it coming into direct contact with the fluid in the flow cell and, instead, arranging the electrode solution (not the electrode) to come into electrical contact with the fluid in the flow cell. The result is a transfer of the reference potential to the flow cell solution (be it a reagent or wash or other solution) that is considerably more stable than is obtained by direct insertion of an electrode into the flow cell solution. We refer to this arrangement as a liquid-liquid or fluid-fluid reference electrode interface.
(693) The fluid-fluid interface may be created downstream from the flow cell, upstream from the flow cell, or in the flow cell. Examples of such alternative embodiments are shown in
(694) Turning first to
(695) Two modes of operation are possible. According to a first mode, the electrode solution may be flowed at a rate that is high enough to avoid backflow or diffusion from the fluid flowing out of the flow cell. According to a second mode, once the electrode solution has filled the electrode and come into contact with the outlet flow from the flow cell, a valve (not shown) may be closed to block further flow of the electrode solution into the electrode and, as the electrode solution is an incompressible liquid, there will be substantially no flow into or out of the electrode, yet the fluid-fluid interface will remain intact. This presumes, of course, an absence of bubbles and other compressible components. For a fluid-fluid interface to take the place of a metal-fluid interface, the tip 76A12 of the electrode 76A8 is positioned to stop within the Tee connector short of the fluid flow out of the flow cell, so that it is the electrode solution, not the electrode itself, that meets the outlet flow from the flow cell, indicated at 76A13, and carries the reference potential from the electrode to the reagent solution exiting the flow cell. The two fluid streams interact in the Tee connector at 76A13 and, if the electrode solution is flowing, it flows out the third port 76A14 of the Tee connector with the reagent flow, as a waste fluid flow, for disposal.
(696) This approach eliminates interfacial potential changes at the electrode surface.
(697) Using a fluid-fluid interface to convey a stable reference potential from a reference electrode to a flow cell, various alternative embodiments are possible.
(698) In one alternative, illustrated in
(699) Alternatively, the manifold can be formed in the substrate of the chip itself by fabricating in the substrate a hollow region which can serve as a conduit allowing fluid passage from an inlet end to an outlet end. An electrode may be inserted therein via a separate inlet port 76B2 or part of the (interior or exterior, as appropriate) surface of the conduit may be metalized during fabrication, to serve as the electrode. The flow path for reagent fluid to exit the flow chamber may include a conduit portion and the electrode conduit/manifold may deliver electrode solution to the reagent fluid outlet conduit, wherein the two fluids come into contact to provide the fluid-fluid interface that applies the reference electrode voltage to the flow cell.
(700) In each instance, the electrode may be hollow and have the electrode solution delivered through its interior, or the electrode solution may be delivered over the exterior of the electrode. For example, as shown in
(701) The electrode assembly thus may be built into the sensor chip itself or into the flow cell or its housing, coupled with a fluid inlet through which electrode solution may be introduced. The flow path for reagent fluid to exit the flow chamber may include a conduit portion 76A4 into which the electrode solution is presented, and wherein the two fluid flows come into contact to provide the fluid-fluid interface. The electrode solution may flow or be static.
(702) As a further alternative embodiment, depicted in
(703) In the foregoing examples, the reference potential is introduced either in or downstream of the flow cell. However, the same approach is possible with the electrode provided upstream of the flow cell, as shown diagrammatically in
(704) Further Developments in Fluidics
(705) The delivery of multiple reagent solutions (and wash solutions) in sequence to a common volume (i.e., flow cell or flow chamber) requires selective switching (i.e., multiplexing) the fluid flows. The multiplexing of fluid flows typically introduces characteristics that are undesirable in that they produce less than ideal results, including potential contamination of reagents, for example, and intervals during which sensor response is unusable or unreliable, reducing potential throughput. The volume of interest, specifically where various reagents must commonly flow to reach the flow cell, is relatively large. This competes with the requirement of cleanliness, as a previously flowing reagent must be completely washed out of the common volume before the next reagent can flow through it to the flow cell. This takes time and consumes wash solution. The characteristic of high volume usually stems from the bulk of valve mechanics that is used to operate the multiplexing action. The presence of valve mechanisms in or near the common volume also competes with the requirement of cleanliness directly, as the valves often present high surface area and/or crevice-type volumes that can retain unwanted reagents. Hence, it would be desirable to provide an improved switching mechanism for reagent flow, to reduce the time required for switching fluids and to minimize cross-contamination.
(706) As exemplified in the embodiment illustrated in
(707) The multiplexer circuit comprises an (optional) housing 78A1 supporting a fluid multiplexer member 78A2 and having reagent input ports 78A3-78A6, a wash input port 78A7, a waste output port 78A8, a chip (flow cell) output port 78A9, a wash solution inlet port 78A10, a multi-use central port 78A11 and a multi-purpose outlet supply port 78A12. (Each reagent is treated in like fashion and the structure of the multiplexer member is the same for each reagent and for the wash solution, so the pertinent structure will be discussed in detail only for one reagent, it being understood that such discussion applies as well to the structures for the other solutions.) Each reagent input feeds into the underside of a corresponding curved (e.g., semi-circular) laminar channel such as channel 78B3, in
(708) On each of the solution feeds, a two-way valve is employed (not shown), upstream of the multiplexer member.
(709) There are two modes of operation for the multiplexer circuit. In a first mode, a reagent is introduced via the multiplexer to the chip. In a second mode, a wash solution clears the multiplexer and the chip.
(710) In the first mode, the upstream valve on the wash solution input is turned off and no wash solution flows into conduit leg 78B18. Selection of a particular reagent is performed by opening its associated upstream valve. Downstream valves (also not shown, to avoid obfuscation) for both the chip and waste outlets are also opened. Two basic processes commence: a) referring to
(711) Between reagent flows, wash solution is fed into the circuit, in the second mode.
(712) The parameters indicated are merely suggested values, and may be adjusted through a large range.
(713) When the reference electrode is placed upstream of the flow cell, the port 78A9 provides a convenient location for its introduction.
(714) The need for only two-way valves is advantageous from a simplicity point of view. Also, the valves can be located very remotely upstream of the multiplexer, and therefore can be placed in almost any location within the supporting instrument. The small physical size of the multiplexer, having no integrated valves or other bulky structures, suggests that it can be located directly at the chip location, greatly reducing the total common volume and providing high spatial and temporal gradients between wash solution and reagent.
(715) Also feasible is a two-dimensional (i.e., thin, disc-like) version of this circuit that would allow tighter packing of reagent inputs, as indicated in
(716) A variation of this two-dimensional structure can be made which does not rely on laminar flow to separate out the diffuse effluent. The reagent inputs are essentially packed in a single circle centered on the chip output. Instead of a free and open two-dimensional channel throughout the circuit, narrow channels connect the reagent inputs to both the chip output and waste ring. When a particular reagent input flows, it enters the central node and both exits to the chip and sweeps past the other reagent inputs on its way to the waste ring. Diffuse effluent from those ports enters the stream to waste, but cannot diffuse upstream toward the chip output. The relative fluidic resistances of the various channels can be adjusted for various performance characteristics (effluent isolation, waste rate minimization, etc.).
EQUIVALENTS
(717) While several inventive embodiments have been described and illustrated herein, those of ordinary skill in the art will readily envision a variety of other means and/or structures for performing the function and/or obtaining the results and/or one or more of the advantages described herein, and each of such variations and/or modifications is deemed to be within the scope of the inventive embodiments described herein. More generally, those skilled in the art will readily appreciate that all parameters, dimensions, materials, and configurations described herein are meant to be exemplary and that the actual parameters, dimensions, materials, and/or configurations will depend upon the specific application or applications for which the inventive teachings is/are used. Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific inventive embodiments described herein. It is, therefore, to be understood that the foregoing embodiments are presented by way of example only and that, within the scope of the appended claims and equivalents thereto, inventive embodiments may be practiced otherwise than as specifically described and claimed. Inventive embodiments of the present disclosure are directed to each individual feature, system, article, material, kit, and/or method described herein. In addition, any combination of two or more such features, systems, articles, materials, kits, and/or methods, if such features, systems, articles, materials, kits, and/or methods are not mutually inconsistent, is included within the inventive scope of the present disclosure.
(718) All definitions, as defined and used herein, should be understood to control over dictionary definitions, definitions in documents incorporated by reference, and/or ordinary meanings of the defined terms.
(719) All references, patents and patent applications disclosed herein are incorporated by reference with respect to the subject matter for which each is cited, which in some cases may encompass the entirety of the document.
(720) The indefinite articles "a" and "an," as used herein in the specification and in the claims, unless clearly indicated to the contrary, should be understood to mean "at least one."
(721) The phrase "and/or," as used herein in the specification and in the claims, should be understood to mean "either or both" of the elements so conjoined, i.e., elements that are conjunctively present in some cases and disjunctively present in other cases. Multiple elements listed with "and/or" should be construed in the same fashion, i.e., "one or more" of the elements so conjoined. Other elements may optionally be present other than the elements specifically identified by the "and/or" clause, whether related or unrelated to those elements specifically identified. Thus, as a non-limiting example, a reference to "A and/or B," when used in conjunction with open-ended language such as "comprising," can refer, in one embodiment, to A only (optionally including elements other than B); in another embodiment, to B only (optionally including elements other than A); in yet another embodiment, to both A and B (optionally including other elements); etc.
(722) As used herein in the specification and in the claims, "or" should be understood to have the same meaning as "and/or" as defined above. For example, when separating items in a list, "or" or "and/or" shall be interpreted as being inclusive, i.e., the inclusion of at least one, but also including more than one, of a number or list of elements, and, optionally, additional unlisted items. Only terms clearly indicated to the contrary, such as "only one of" or "exactly one of," or, when used in the claims, "consisting of," will refer to the inclusion of exactly one element of a number or list of elements. In general, the term "or" as used herein shall only be interpreted as indicating exclusive alternatives (i.e. "one or the other but not both") when preceded by terms of exclusivity, such as "either," "one of," "only one of," or "exactly one of." "Consisting essentially of," when used in the claims, shall have its ordinary meaning as used in the field of patent law.
(723) As used herein in the specification and in the claims, the phrase "at least one," in reference to a list of one or more elements, should be understood to mean at least one element selected from any one or more of the elements in the list of elements, but not necessarily including at least one of each and every element specifically listed within the list of elements and not excluding any combinations of elements in the list of elements. This definition also allows that elements may optionally be present other than the elements specifically identified within the list of elements to which the phrase "at least one" refers, whether related or unrelated to those elements specifically identified. Thus, as a non-limiting example, "at least one of A and B" (or, equivalently, "at least one of A or B," or, equivalently, "at least one of A and/or B") can refer, in one embodiment, to at least one, optionally including more than one, A, with no B present (and optionally including elements other than B); in another embodiment, to at least one, optionally including more than one, B, with no A present (and optionally including elements other than A); in yet another embodiment, to at least one, optionally including more than one, A, and at least one, optionally including more than one, B (and optionally including other elements); etc.
(724) It should also be understood that, unless clearly indicated to the contrary, in any methods claimed herein that include more than one step or act, the order of the steps or acts of the method is not necessarily limited to the order in which the steps or acts of the method are recited.
(725) In the claims, as well as in the specification above, all transitional phrases such as "comprising," "including," "carrying," "having," "containing," "involving," "holding," "composed of," and the like are to be understood to be open-ended, i.e., to mean including but not limited to. Only the transitional phrases "consisting of" and "consisting essentially of" shall be closed or semi-closed transitional phrases, respectively, as set forth in the United States Patent Office Manual of Patent Examining Procedures, Section 2111.03.
(726) Frame Averaging and Compression
(727) In regard to
(728) In an embodiment, the ADC(s) 254 may be controlled by the timing generator 256 via the sampling clock signal CS to sample the output signals Vout1 and Vout2 at a high data rate to provide two or more digitized samples for each pixel measurement, which may then be averaged. In an embodiment, two or more pixel measurements in successive frames can be averaged for each pixel of every frame considered. Here, the output is the average measurement for each pixel of all frames considered. As a result of this frame averaging technique, reduction in noise for each pixel can be achieved.
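Frame averaging of this kind can be sketched in a few lines. The example below is a minimal illustration, assuming each frame is a flat list of per-pixel measurements of equal length; the function name `average_frames` and the data layout are assumptions made for illustration and do not describe the actual firmware or software of the system.

```python
def average_frames(frames):
    """Average successive frames pixel by pixel.

    frames : non-empty list of frames, each a list of per-pixel values
             (all frames must have the same number of pixels).
    Returns one frame whose value at each pixel is the mean over all frames.
    """
    if not frames:
        raise ValueError("at least one frame is required")
    n_pixels = len(frames[0])
    if any(len(frame) != n_pixels for frame in frames):
        raise ValueError("all frames must have the same number of pixels")
    n_frames = len(frames)
    return [sum(frame[i] for frame in frames) / n_frames
            for i in range(n_pixels)]

if __name__ == "__main__":
    # Three frames of a four-pixel array (arbitrary ADC counts).
    frames = [[100, 200, 300, 400],
              [102, 198, 303, 399],
              [ 98, 202, 297, 401]]
    print(average_frames(frames))
```

Averaging N frames stores one frame in place of N, and for uncorrelated noise it reduces the per-pixel noise by roughly the square root of N.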
(729) In regard to
(730) Variable frame rate averaging can also be performed on the data sampled from the sensor array (e.g., array 100 of
(731) With faster sampling data rates and higher densities of sensor arrays, the pixel measurement data can consume a large amount of memory on the computer 260 and/or an external memory storage device. It is thus desirable to reduce memory consumption while maintaining the quality of the pixel measurement data. A goal of method 8100 (discussed in detail below), among others, is to accurately capture data associated with a biological/chemical event, while reducing noise associated with the data. This goal can be achieved by implementing the frame averaging, variable frame rate averaging, and keyframe delta compression techniques described above. As a result, the amount of data stored (e.g., in the computer 260 of
(732)
(733) In step 8110, a waveform associated with a chemical event occurring on a sensor array is measured. The chemical event that can occur on the sensor array can be, for example and without limitation, one or more stepwise changes in the concentration of one or more ionic species in an analyte solution in fluid contact with an ISFET array (e.g., array 100 of
(734)
(735) In regard to the waveform 8200 of
(736) For ease of reference, the ion-step response depicted in the waveform 8200 of
(737) In regard to step 8110 of
(738) In regard to
(739) Also, based on the sampled data associated with the one or more analytes flowing over the sensor array, the time period of the ion-step response portion 8220 of the waveform can be determined, according to an embodiment of the present invention. In an embodiment, the sampled data can indicate a time period in which the ion-step response occurs for all ISFETs in the sensor array. The ion-step response can be the portion 8220 of the waveform that provides the most important information associated with the chemical reaction. The ion-step response portion 8220 of the waveform oftentimes has unpredictable measured values due to the nature of the biological/chemical experiment conducted on the sensor array such as, for example and without limitation, DNA sequencing. With DNA sequencing, the ion-step response 8220 also has unpredictable measured values due to the nature of the DNA itself and the polymerase interactions occurring on the sensor array.
(740) Further, based on the sampled data associated with the one or more analytes flowing over the sensor array, the time period of the decay of the ISFET array response pulse 8230 of the waveform can be determined, according to an embodiment of the present invention. In an embodiment, the sampled data can indicate a time period in which the decay occurs for all ISFETs in the sensor array. The decay 8230 can be considered another region of the waveform 8200 that has expected measured values because this region is after the ion-step response portion 8220 of the waveform.
(741) In steps 8120-8140 of
(742) In step 8120 of
(743) In step 8130, a second frame averaging is applied to the at least one region of the waveform 8200 associated with unpredictable measured values. In an embodiment, the second frame averaging can include a second number of frames that are averaged during the at least one region associated with unpredictable measured values. Here, the second frame averaging is a frame averaging technique, as described above, where the second number of frames is less than the first number of frames of step 8120. Also, in an embodiment, the at least one region associated with the unpredictable measured values can be the portion 8220 of the waveform 8200 (e.g., the ion-step response of the waveform 8200). A benefit, among others, of having the second number of frames less than the first number of frames is that accurate data on the ion-step curve 8220 can be stored. As discussed above, the ion-step curve 8220 can be the portion of the waveform 8200 that provides the most important information associated with the chemical event, according to an embodiment of the present invention.
(744) In another embodiment, the second number of frames in step 8130 can be zero, where no frames are averaged. As would be understood by a person of ordinary skill in the art based on the description herein, a tradeoff exists between the number of frames averaged and the accuracy of the data to be stored. Therefore, based on a particular chemical event and/or the specifications of the sensor system (e.g., sampling clock frequency), the number of frames to be averaged can vary.
(745) In step 8140, a third frame averaging is applied to a second region of the waveform 8200 associated with expected measured values. In an embodiment, the third frame averaging can include a third number of frames that are averaged during the second region associated with the expected measured values. Here, the third frame averaging is a frame averaging technique, as described above, where the third number of frames is higher than the second number of frames. Similar to the first number of frames, the third number of frames can be, for example and without limitation, 4, 6, 8, or 10 frames. Also, in an embodiment, the second region associated with the expected measured values can be the portion 8230 of the waveform 8200. A benefit of having a higher number of frames for step 8140 than the number of frames for step 8130, among others, is a reduction in the amount of data stored over all of the frames considered. This can be advantageous because the decay 8230 can be considered another portion of waveform 8200 with expected measured values. Based on the description herein, a person of ordinary skill in the art will recognize that the third number of frames (of step 8140) can be more or less than 4, 6, 8, or 10 frames.
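By way of illustration only, the three frame averaging steps 8120, 8130, and 8140 can be sketched as follows for the trace of a single pixel. The function names, the region boundaries `step_start` and `step_end`, and the sample values are hypothetical and are not taken from the embodiments; the sketch assumes that the boundaries of the ion-step response portion 8220 have already been determined.

```python
def average_frames(frames, group_size):
    """Average consecutive frames in groups of group_size.

    A group_size of 0 or 1 leaves the frames unaveraged; the last group
    may be shorter than group_size.
    """
    if group_size <= 1:
        return list(frames)
    averaged = []
    for i in range(0, len(frames), group_size):
        group = frames[i:i + group_size]
        averaged.append(sum(group) / len(group))
    return averaged


def variable_frame_average(trace, step_start, step_end,
                           n_first=4, n_second=1, n_third=4):
    """Hypothetical helper: heavier averaging outside the ion-step response.

    trace[:step_start]         -> expected values, averaged n_first at a time
    trace[step_start:step_end] -> ion-step response, averaged n_second at a time
    trace[step_end:]           -> decay, averaged n_third at a time
    """
    first = average_frames(trace[:step_start], n_first)
    second = average_frames(trace[step_start:step_end], n_second)
    third = average_frames(trace[step_end:], n_third)
    return first + second + third


# Made-up single-pixel trace: 8 baseline frames, 4 ion-step frames, 8 decay frames.
trace = [100.0] * 8 + [100, 120, 150, 170] + [168, 166, 164, 162, 160, 158, 156, 154]
compressed = variable_frame_average(trace, step_start=8, step_end=12)
print(len(trace), "frames reduced to", len(compressed))
```

In this sketch the ion-step frames are kept at full resolution while the baseline and decay frames are each reduced by a factor of four, which reflects the tradeoff between data volume and accuracy noted above.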
(746) In step 8150, a keyframe delta compression is applied to results from the first, second, and third frame averaging from steps 8120, 8130, and 8140, respectively. In an embodiment, the keyframe delta compression compares a current frame of frame-averaged data associated with the waveform 8200 to a previous frame of frame-averaged data associated with the waveform 8200.
(747) In particular, the keyframe delta compression refers to a lossless compression algorithm where an initial measurement per pixel is stored in a keyframe using a first predetermined number of bits (e.g., 16 bits). For each subsequent frame measured during the at least one region associated with the chemical reaction, each pixel of the frame is compared to its corresponding keyframe value. The difference or delta between the pixel and its corresponding keyframe value is stored using a second predetermined number of bits (e.g., 8 bits). In an embodiment, the process of storing the delta between a pixel and its corresponding keyframe value continues until the delta cannot be stored using the second predetermined number of bits (e.g., 8 bits). At this point, a new keyframe value using the first predetermined number of bits (e.g., 16 bits) is stored, according to an embodiment of the present invention.
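The keyframe delta scheme described above can be sketched, for a single pixel, as follows. This is a minimal illustration under stated assumptions: the delta is assumed to be a signed value (so that 8 bits cover -128 to 127), the `("key", value)` and `("delta", value)` tuples stand in for whatever marker a real file format would use to distinguish keyframes from deltas, and the function names and sample values are hypothetical.

```python
def keyframe_delta_encode(samples, delta_bits=8):
    """Encode one pixel's samples as keyframes plus deltas from the last keyframe."""
    lo = -(1 << (delta_bits - 1))        # smallest storable signed delta
    hi = (1 << (delta_bits - 1)) - 1     # largest storable signed delta
    encoded = []
    key = None
    for value in samples:
        if key is not None and lo <= value - key <= hi:
            encoded.append(("delta", value - key))
        else:
            # First sample, or the delta no longer fits: store a new keyframe.
            key = value
            encoded.append(("key", value))
    return encoded


def keyframe_delta_decode(encoded):
    """Invert keyframe_delta_encode exactly (the scheme is lossless)."""
    samples = []
    key = None
    for kind, payload in encoded:
        if kind == "key":
            key = payload
            samples.append(key)
        else:
            samples.append(key + payload)
    return samples


# Made-up 16-bit counts for one pixel.
samples = [8000, 8005, 8100, 8127, 8128, 8130, 7990]
encoded = keyframe_delta_encode(samples)
decoded = keyframe_delta_decode(encoded)

# Payload size, ignoring the keyframe/delta marker overhead.
bits = sum(16 if kind == "key" else 8 for kind, _ in encoded)
print(bits, "bits instead of", 16 * len(samples))
```

Because each delta is taken against the most recent keyframe rather than the immediately preceding frame, a slow drift eventually forces a new keyframe, as described in the paragraph above.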
(748) Based on the description herein, a person of ordinary skill in the art will recognize that other types of data compression algorithms can be applied to the embodiments described herein. For instance, data compression algorithms such as, for example and without limitation, lossy compression algorithms (e.g., MP2 and MP4) and lossless compression algorithms can be used in step 8150 of
(749) In summary, method 8100 reduces the amount of data associated with the waveform 8200 by increasing the number of frames that are averaged during portions of the waveform with expected measured values (e.g., portions 8210 and 8230 of
(758) According to an exemplary embodiment, there is provided a method for compressing nucleic acid sequencing data comprising: (1) obtaining raw data from a semiconductor-based sequencing sensor array comprising a plurality of sensors during a data acquisition time period, the raw data comprising at least a non-informative portion corresponding to a subinterval of the data acquisition time period having a location within the data acquisition time period that varies for different sensors according to a position of the sensor in the sensor array; and (2) transforming the raw data into compressed data using both a lossless compression process including a keyframe delta compression process and lossy compression processes including a variable frame averaging process and a data truncation process, the data truncation process being related for each sensor to the position of the sensor in the sensor array and configured to discard the non-informative portion of the raw data. There is also provided a non-transitory machine-readable storage medium comprising instructions which, when executed by a processor, cause the processor to perform such a method and variants thereof. There is also provided a system, including: a machine-readable memory; and a processor configured to execute machine-readable instructions, which, when executed by the processor, cause the system to perform such a method and variants thereof.
(759) A lossless compression process generally refers to a compression process allowing reconstruction of the compressed data in a way that is essentially identical to the data pre-compression. A lossy compression process generally refers to a compression process allowing reconstruction of the compressed data in a way that is not identical to the data pre-compression but close enough (under some suitable measure that may be application/objective specific) to achieve a desired purpose. In various embodiments, in addition to other compression approaches discussed herein, compression may generally be based on taking advantage of (1) situations where data points of interest are different at different locations on a sensor array; (2) situations where measurements may be similar within a region; and (3) situations where an enzyme operates at a particular speed so the data points are within a particular time period. In various embodiments, more important portions of the data may be compressed using lossless processes and other less important portions of the data may be compressed using lossy processes.
(760) According to an exemplary embodiment, there is provided a compression method comprising: (1) measuring a waveform associated with a chemical event occurring on a sensor array, wherein the waveform comprises at least one portion associated with expected measured values and at least one portion associated with unpredictable measured values; (2) applying a first compression process to the waveform, the first compression process including an averaging of one or more frames in one or more portions of the waveform; and (3) applying a second compression process to the waveform, the second compression process including a truncating of data corresponding to a portion of the waveform that is not related to a nucleotide incorporation component of the waveform. There is also provided a non-transitory machine-readable storage medium comprising instructions which, when executed by a processor, cause the processor to perform such a method and variants thereof. There is also provided a system, including: a machine-readable memory; and a processor configured to execute machine-readable instructions, which, when executed by the processor, cause the system to perform such a method and variants thereof.
(761) In the foregoing exemplary embodiments, the truncating of data may comprise determining, for each of a plurality of sensors in the sensor array, a cut-off time point for the waveform for that sensor defining a data range to be truncated. Any suitable waveform analysis or approach for identifying cut-off or transition points in a waveform may be used to identify cut-off points. Each cut-off time point may be determined by mining a plurality of past analysis runs for a given sensor array geometry. In an embodiment, mining may include determining when a nucleotide flow reaches a well as determined by a change in waveform. Other algorithms may include measuring a rate of enzyme incorporation and determining a portion of frames where nucleotides are still incorporating. Each cut-off time point may be determined prior to every run or during a calibration procedure. Each cut-off time point may be factory pre-determined for a given sensor array geometry. The sensors may be arranged in a plurality of regions in the sensor array, and each cut-off time point for sensors in a given region may be determined to have a common cut-off time point determined for sensors for that region. Each common cut-off time point may be determined by finding a best fit to a linear hinge model on a median trace for a region. In an embodiment, a linear hinge model may look for the point where two separate lines best explain a region of a trace, which may include looking for a change point in the data where the nucleotide flow is causing a change in readings of the sensor array compared to a flowed solution. Each common cut-off time point may be determined empirically and may depend on a position of that region relative to other regions along a fluidic flow of nucleotides onto the sensor array. More specifically, fluid may hit different parts of the sensor array at different time points. 
For example, an area near an inlet would experience a new solution being flowed first, as it initially enters or displaces a wash inside the flow cell, and an area near an outlet would experience that new solution later (as seen in
(762) In the foregoing exemplary embodiments, the averaging may comprise applying a first frame averaging to the at least one portion associated with the expected measured values, wherein the first frame averaging may comprise averaging a first number of frames during the at least one portion associated with the expected measured values; and applying a second frame averaging to the at least one portion associated with the unpredictable measured values, wherein the second frame averaging may comprise averaging a second number of frames during the at least one portion associated with the unpredictable measured values, the second number of frames being lower than the first number of frames. The method may further comprise: applying a third frame averaging to another portion associated with expected measured values, and wherein the third frame averaging may comprise averaging a third number of frames that is higher than the second number of frames; and applying a keyframe delta compression to results from the first, second, and third frame averaging, wherein the keyframe delta compression may comprise comparing a current frame of data from frame-averaged data to a previous frame of frame-averaged data associated with the waveform. Measuring the waveform may comprise measuring the waveform of a dynamic response of an ion-sensitive field effect transistor (ISFET) array to a change in ionic strength of an analyte solution in fluid contact with the ISFET array. Measuring the waveform of the dynamic response of the ISFET array may comprise associating the at least one portion with the unpredictable measured values to a stepwise increase in ion concentration in the analyte solution and associating the at least one portion with the expected measured values to at least one portion of the dynamic response outside of the stepwise increase in ion concentration. 
Applying the first frame averaging may comprise averaging the first number of frames during the at least one portion of the dynamic response outside of the stepwise increase in ion concentration. Applying the second frame averaging may comprise averaging the second number of frames during the stepwise increase in ion concentration in the analyte solution.
(763) According to an exemplary embodiment, there is provided a computer program product comprising a computer-usable medium having computer program logic recorded thereon that, when executed by one or more processors, samples and compresses data from a sensor array, the computer program logic comprising: (1) first computer readable program code that enables a processor to measure a waveform associated with a chemical event occurring on a sensor array, wherein the waveform comprises at least one portion associated with expected measured values and at least one portion associated with unpredictable measured values; (2) second computer readable program code that enables a processor to apply a first compression process to the waveform, the first compression process including an averaging of one or more frames in one or more portions of the waveform; and (3) third computer readable program code that enables a processor to apply a second compression process to the waveform, the second compression process including a truncating of data corresponding to a portion of the waveform that is not related to a nucleotide incorporation component of the waveform.
(764) According to an exemplary embodiment, there is provided a system, comprising: (1) a sensor array; (2) an array controller coupled to the sensor array, wherein the array controller is configured to measure a waveform associated with a chemical event occurring on the sensor array, wherein the waveform comprises at least one portion associated with expected measured values and at least one portion associated with unpredictable measured values; and (3) a computer coupled to the array controller, wherein the computer is configured to: (i) apply a first compression process to the waveform, the first compression process including an averaging of one or more frames in one or more portions of the waveform; and (ii) apply a second compression process to the waveform, the second compression process including a truncating of data corresponding to a portion of the waveform that is not related to a nucleotide incorporation component of the waveform.
(765) In various exemplary embodiments, in addition or as an alternative to one or more of other compression approaches described in this application, only data corresponding to the time just around when the nucleotide washes over a well (see, e.g.,
(766) In various exemplary embodiments, a change in pH for one or more wells may be measured as a nucleotide solution is flowed across the chip from an inlet to an outlet. A change in proton concentration may be measured both from the different pH of the nucleotide solution compared to the wash and due to the incorporation of nucleotides in a particular well. As a nucleotide solution is flowing across a chip including wells, for example, the nucleotide solution may displace an existing wash solution in a period of microseconds or milliseconds. Different parts or regions of such a chip may be measuring different solutions as the nucleotide flows across the chip. For example, a region near the fluidic inlet may initially be exposed to the nucleotide solution while another region near the fluidic outlet is still measuring wash solution. Because a polymerase can operate relatively quickly, an incorporation signal may also happen relatively quickly and the nucleotide flow may be viewed as a wave of concentration washing across the chip. In an embodiment, the time during which such a wave is washing over a particular region may be deemed the most important for measuring the incorporation signal. Such a time may be determined empirically. For example, the shape of such a wave may be measured empirically over a chip including wells for 64×64 sized regions, for example, over a number of sites by looking for the change in pH when the nucleotide replaces the wash solution above the well. Regions of other sizes (e.g., N×M where N and M are positive integers) are also possible and may include regions comprising any of the possible subsets of wells that may be in a sensor array. In an embodiment, such a model may look for a change point by finding the best fit to a linear hinge model on the median trace for a region, which may yield results that are very reproducible across runs and sites for standard operating procedure conditions.
(767) In various exemplary embodiments, a compression scheme may exploit the property of nucleotide wash reaching different parts of a chip at different points in time. The frames acquired for any region of the chip before start of the nucleotide flow may be identified as not containing any information regarding the incorporation and may thus only be used to obtain a good estimate of the pH before the nucleotide wash. Therefore, these frames may advantageously not be stored for any region before the nucleotide wash has reached such region to minimize and/or reduce downstream computational burden. Instead, an average of the frames not recorded in this process may be used to better estimate the pH of the region for that time period. The time at which nucleotide starts flowing over a well or region (which may be referred to as t0 (time zero) herein), may be determined for each region of the chip by mining several past analysis runs for a particular chip type. In some cases, a single DAT file produced through this compression scheme may be 10%-15% smaller in size for exemplary ION TORRENT chips of the 31X series, and such a compression performance may be even more significant for larger chips. The benefits of compression become increasingly important as data sizes increase. Indeed, for an Ion Proton I chip, which has about 165 million wells, at 30 frames per second for 300 flows and 6 seconds per acquisition there could be about 165,000,000 wells*30 frames/second*14 bits*6 seconds/acquisition*300 flows/(8*1e6)=15,592,500 megabytes, which is about 15.6 terabytes of data. Moreover, the number of wells and flows are expected to increase, leading to even higher amounts of data and a greater need for improved compression. In an embodiment, this compression scheme may advantageously be used in conjunction with variable frame rate if different parts of the time series can be sampled at different rates. 
For example, one variable frame rate scheme would be the maximum frame rate at time zero during incorporation but sample multiple data points later when the protons are diffusing out of the well at a relatively constant rate. In other embodiments, this compression scheme may advantageously be used in conjunction with both variable frame rate and keyframe delta compression.
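The raw data volume estimate given above for an Ion Proton I chip can be reproduced step by step. The short calculation below uses only the figures stated in the text (about 165 million wells, 30 frames per second, 14 bits per sample, 6 seconds per acquisition, and 300 flows); the variable names are illustrative.

```python
wells = 165_000_000
frames_per_second = 30
bits_per_sample = 14
seconds_per_acquisition = 6
flows = 300

# Total number of bits acquired over the whole run, before any compression.
total_bits = (wells * frames_per_second * bits_per_sample
              * seconds_per_acquisition * flows)

total_bytes = total_bits // 8          # 8 bits per byte
total_megabytes = total_bytes / 1e6    # the 8*1e6 divisor in the text
total_terabytes = total_bytes / 1e12

print(f"{total_megabytes:,.0f} MB, or about {total_terabytes:.1f} TB")
```

The division by 8*1e6 in the text therefore yields a figure in megabytes, and a further division by 1e6 gives the value in terabytes.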
(768) In various exemplary embodiments, a fluid containing a nucleotide solution may hit different portions of a sensor array at different times, and such times may be determined and used in data analysis as representative of a point at which an enzyme can begin to incorporate nucleotides in the nucleotide solution. In some cases, such times may be determined empirically by looking for a change point in the pH as a solution of a different pH is flowed over the sensor array, including the nucleotide solution itself which has a different pH than the wash buffer, as sensors will begin to measure a different voltage when exposed to a solution of a different pH. In various embodiments, one or more change point detection algorithms, see chapter 2 of Jie Chen & A. K. Gupta, Parametric Statistical Change Point Analysis (Oberwolfach Seminars), Birkhauser (2000), which is incorporated herein by reference in its entirety, may be used to detect the change point. One possible algorithm is to compare a fit of a piecewise linear hinge model to a single linear model in a sliding window and look for the point at which fitting two linear models improves the fit the most compared to that of a single line within that window. In the case of a sensor array having multiple wells, if the data are locally linear the point where the best fitting piecewise linear models meet would be the change point, or t0, where the nucleotide starts washing over the well. This model could be fit on a summary of any subset of wells or on a single well at a time.
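A simplified sketch of such a change point search is given below. It is an illustration under stated assumptions rather than the procedure of any particular embodiment: the whole trace is treated as a single window, the two line segments are fit independently by ordinary least squares (they are not constrained to meet), and the function names, the `min_segment` parameter, and the sample trace are hypothetical.

```python
def line_sse(xs, ys):
    """Sum of squared residuals of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def find_change_point(trace, min_segment=3):
    """Return the split index where two lines improve most on a single line.

    The split index k is the first frame of the second segment. Ties are
    resolved in favor of the earliest split. Returns None if no split
    improves on the single-line fit.
    """
    xs = list(range(len(trace)))
    single_line_error = line_sse(xs, trace)
    best_k, best_gain = None, 0.0
    for k in range(min_segment, len(trace) - min_segment + 1):
        two_line_error = (line_sse(xs[:k], trace[:k])
                          + line_sse(xs[k:], trace[k:]))
        gain = single_line_error - two_line_error
        if gain > best_gain:
            best_k, best_gain = k, gain
    return best_k


# Made-up median trace for a region: flat wash baseline through frame 9,
# then a linear rise as the nucleotide solution arrives.
median_trace = [100.0] * 10 + [100.0 + 5.0 * (i + 1) for i in range(10)]
t0 = find_change_point(median_trace)
print("estimated t0 frame:", t0)
```

For a noise-free hinge the earliest best split coincides with the frame at which the two fitted lines meet. Applying the search to a median trace for a region, rather than to each well, is one way to reduce the influence of per-well noise.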
(769) In various exemplary embodiments, a waveform obtained using any of the foregoing exemplary embodiments (whether pre- or post-compression) may be compressed (or further compressed) using a combination of a truncation process (such as described above) and a substitution process as described next. In an embodiment, the truncation process may be a t0 compression process as described above. In an exemplary substitution process, a waveform (or portion thereof) may be replaced with a substitute representation, which may be another waveform or a set of parameters or coefficients that can be stored very compactly while allowing downstream reconstruction of the waveform or otherwise remaining representative of the initial waveform in some way. Such an approach may reduce data storage needs to a minimal amount of data yielding a maximal amount of sequencing accuracy.
(771) In various exemplary embodiments, the waveform substitution process may include determining one or more parameters characterizing the waveform. The one or more parameters may be selected from a larger set of parameters used for downstream analysis. In an example, the one or more parameters may include a tmid_nuc parameter (which may be expressed as time or data frame) characterizing a point in time at which concentration of the nucleotide has risen halfway to the final nucleotide concentration in a nucleotide concentration rise, and a sigma parameter (which may be expressed as a rate or speed) characterizing a rate or speed of the nucleotide concentration rise. These parameters may be determined or estimated based on certain characteristics of a representative nucleotide incorporation curve and existing relationships between the tmid_nuc and sigma parameters and those characteristics. In an embodiment, at least one of these parameters may be related to a t0 (time zero) parameter of the truncation process, such as described above, which may be used in combination with the substitution process. In the case of tmid_nuc, a relationship exists as a start of nucleotides hitting a well is related to a point in time at which concentration of the nucleotide has risen halfway to the final nucleotide concentration. In the case of the sigma parameter, a relationship exists as pH is related to the nucleotide concentration above the well and is related to the speed of the rise of concentration of nucleotide in the well. These parameters may be determined based on the data using any suitable waveform analysis or curve fitting methods or models known in the art. For example, in an embodiment the parameters could be determined using linear regression after a suitable transformation of data as seen in
(775) In various exemplary embodiments, the waveform may be stored in a smaller number of bytes than originally generated but containing exactly the same information using compression schemes, for example by using zip and Huffman codes or other such lossless compression techniques. In some cases, delta compression as described above, which appears to be particularly useful given the generally smooth nature (e.g., with relatively small changes in counts between consecutive time points) of nucleotide incorporation waveforms, may be used. In an example, in addition to delta compression, residuals may be stored using a Huffman code in which multiple wells are packed into bits as small as possible with a prefix specifying that number for writing.
(776) In various exemplary embodiments, the waveform may be stored in a smaller number of bytes than originally generated, while allowing for some difference in the information being stored, for example by using compression schemes such as JPEG or other lossy compression techniques. In an embodiment, the waveform may be represented by a linear combination of the first N principal component vectors (eigenvectors of the covariance matrix, also sometimes referred to as a normalized scatter matrix), as most of the information is contained in the first 5 or 6 principal components, for example, and waveforms from individual wells may advantageously be well represented by only the coefficients for these eigenvectors for compact storage. In various embodiments, these coefficients may in turn be stored compactly by dynamically truncating them to lower precision and storing them compactly using Huffman codes or any other suitable encoding scheme. Although most of the information is contained in the first 5 or 6 principal components here, that is only an example. In general, depending on characteristics of an underlying sequencing system or technology, as well as trade-off between a desire to leave data as close as possible to raw data and a need to increase speed without unduly affecting accuracy, fewer or more principal components could be used. For example, 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10, or more principal components could be used. An overview of principal component analysis may be found in Pearson, K., On Lines and Planes of Closest Fit to Systems of Points in Space, Philosophical Magazine, Vol. 2, No. 11, pp. 559-572 (1901), and Abdi et al., Principal component analysis, Wiley Interdisciplinary Reviews: Computational Statistics, Vol. 2, No. 4, pp. 433-459 (2010).
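The principal component substitution described above can be sketched as follows. This is a minimal illustration that assumes the NumPy library is available; the synthetic waveforms, the choice of three components, and the function names are hypothetical and chosen only so that the example runs on its own. The principal component vectors are obtained here from a singular value decomposition of the mean-centered data, which yields the eigenvectors of the covariance matrix.

```python
import numpy as np


def pca_compress(waveforms, n_components):
    """Replace each waveform (one row per well) with n_components coefficients."""
    mean = waveforms.mean(axis=0)
    centered = waveforms - mean
    # Rows of vt are the principal component vectors, ordered by decreasing
    # singular value (i.e., decreasing explained variance).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]
    coeffs = centered @ basis.T
    return mean, basis, coeffs


def pca_reconstruct(mean, basis, coeffs):
    """Approximate the waveforms as the mean plus a linear combination of the basis."""
    return mean + coeffs @ basis


# Synthetic stand-in for per-well waveforms: 200 wells, 60 frames each,
# built from three underlying shapes plus a small amount of noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 60)
shapes = np.stack([np.exp(-((t - 0.3) / 0.1) ** 2), t, t ** 2])
weights = rng.normal(size=(200, 3))
waveforms = weights @ shapes + 0.01 * rng.normal(size=(200, 60))

mean, basis, coeffs = pca_compress(waveforms, n_components=3)
approx = pca_reconstruct(mean, basis, coeffs)
rel_error = np.linalg.norm(waveforms - approx) / np.linalg.norm(waveforms)
print(coeffs.shape, "coefficients stored; relative error", rel_error)
```

In this sketch each well stores 3 coefficients instead of 60 samples, while the mean and the basis vectors are shared across all wells. The coefficients could then be truncated to lower precision and entropy coded, as described above.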
(781) According to an embodiment, a time zero point and post time zero slope may be obtained or determined for each well in a set of wells. This may be done using any of the approaches described above regarding the t0 compression process and determination of cut-off points or t0 values, or using any other suitable curve fitting and/or analysis method or model. The tmid_nuc and sigma parameters may be determined during a first nucleotide flow based on the time zero point and post time zero slope. For example, in an embodiment the parameters could be determined using linear regression after a suitable transformation of data as seen in
(782) According to an exemplary embodiment, there is provided a compression method, comprising: measuring a waveform associated with a chemical event occurring on a sensor array, wherein the waveform comprises a plurality of measured values and the chemical event is indicative of a number of nucleotide incorporations in a genetic sequencing reaction; applying a first compression process to the waveform, the first compression process including a truncating of data corresponding to a portion of the waveform that is not related to nucleotide incorporations in the genetic sequencing reaction; and applying a second compression process to the waveform, the second compression process including a data substitution process that replaces at least a portion of the waveform with a plurality of coefficients representative of the portion of the waveform.
(783) In such a method, the truncating of data may comprise determining, for each of a plurality of sensors in the sensor array, a cut-off time point for the waveform for that sensor defining a data range to be truncated. Each cut-off time point may be determined by mining a plurality of past analysis runs for a given sensor array geometry. Each cut-off time point may be determined prior to every run or during a calibration procedure. Each cut-off time point may be factory pre-determined for a given sensor array geometry. The sensors may be arranged in a plurality of regions in the sensor array, and each cut-off time point for sensors in a given region may be determined to have a common cut-off time point determined for sensors for that region. Each common cut-off time point may be determined by finding a best fit to a linear hinge model on a median trace for a region. Each common cut-off time point may be determined empirically and may depend on a position of that region relative to other regions along a fluidic flow of nucleotides onto the sensor array. The cut-off time point of sensors in a region substantially near a fluidic inlet may be different from the cut-off time point of sensors in a region substantially near a fluidic outlet.
(784) In such a method, the second compression process may include replacing the waveform with a plurality of coefficients of a linear combination of one or more principal component vectors representative of the portion of the waveform. The second compression process may include storing the plurality of coefficients compactly by dynamically truncating the coefficients to a lower precision and encoding the truncated coefficients using a Huffman code. The second compression process may include replacing the portion of the waveform with a plurality of coefficients of a linear combination of between about 5 and about 10 principal component vectors representative of the portion of the waveform. The second compression process may include replacing the portion of the waveform with a plurality of coefficients of a linear combination of 5 or 6 principal component vectors representative of the portion of the waveform.
(785) Measuring the waveform may comprise measuring the waveform of a dynamic response of an ion-sensitive field effect transistor (ISFET) array to a change in ionic strength of an analyte solution in fluid contact with the ISFET array. Measuring the waveform of the dynamic response of the ISFET array may comprise associating a portion of the waveform to a stepwise increase in ion concentration in the analyte solution and associating another portion of the waveform to at least one portion of the dynamic response outside of the stepwise increase in ion concentration.
(786) According to an exemplary embodiment, there is provided a computer program product comprising a non-transitory computer-usable medium having computer program logic recorded thereon that, when executed by one or more processors, samples and compresses data from a sensor array, the computer program logic comprising: first computer readable program code that enables a processor to measure a waveform associated with a chemical event occurring on a sensor array, wherein the waveform comprises a plurality of measured values and the chemical event is indicative of a number of nucleotide incorporations in a genetic sequencing reaction; second computer readable program code that enables a processor to apply a first compression process to the waveform, the first compression process including a truncating of data corresponding to a portion of the waveform that is not related to nucleotide incorporations in the genetic sequencing reaction; and third computer readable program code that enables a processor to apply a second compression process to the waveform, the second compression process including replacing at least a portion of the waveform with a plurality of coefficients representative of the portion of the waveform.
(787) According to an exemplary embodiment, there is provided a method for compressing nucleic acid sequencing data, comprising: obtaining raw data from a semiconductor-based sequencing sensor array comprising a plurality of sensors during a data acquisition time period, the raw data comprising at least a non-informative portion corresponding to a subinterval of the data acquisition time period having a location within the data acquisition time period that varies for different sensors according to a position of the sensor in the sensor array; and transforming the raw data into compressed data using both (1) a lossless compression process and (2) lossy compression processes including a data truncation process and a data substitution process, the data truncation process being related for each sensor to the position of the sensor in the sensor array and configured to discard the non-informative portion of the raw data, and the data substitution process being adapted to replace the raw data for each sensor with a plurality of coefficients of a linear combination of one or more principal component vectors representative of the raw data for each sensor.
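As a rough sketch only, a position-dependent truncation followed by a lossless stage might look like the following. The linear relationship between sensor column and cut-off frame, the constants `BASE_CUTOFF` and `FRAMES_PER_COLUMN`, the 16-bit sample format, and the use of `zlib` as the lossless coder are all assumptions made for illustration; the embodiments do not prescribe them. The data substitution step is omitted here.

```python
import struct
import zlib
from typing import List

# Hypothetical parameters: the cut-off frame is assumed to grow linearly
# with the sensor's column index in the array.
BASE_CUTOFF = 8          # cut-off frame for a sensor in column 0
FRAMES_PER_COLUMN = 0.5  # additional frames kept per column of offset


def cutoff_frame(col: int, n_frames: int) -> int:
    """Cut-off time point (frame index) for a sensor in column `col`."""
    return min(BASE_CUTOFF + int(col * FRAMES_PER_COLUMN), n_frames)


def truncate(trace: List[int], col: int) -> List[int]:
    """Lossy step: discard the samples at and after the sensor's cut-off."""
    return trace[:cutoff_frame(col, len(trace))]


def pack_lossless(samples: List[int]) -> bytes:
    """Lossless step: pack as little-endian int16 and deflate."""
    raw = struct.pack(f"<{len(samples)}h", *samples)
    return zlib.compress(raw)


def unpack_lossless(blob: bytes) -> List[int]:
    """Inverse of pack_lossless."""
    raw = zlib.decompress(blob)
    return list(struct.unpack(f"<{len(raw) // 2}h", raw))


# The same 20-frame raw trace, as seen by sensors in two different columns.
raw_trace = [100 + i for i in range(20)]
kept_col0 = truncate(raw_trace, col=0)   # 8 frames kept
kept_col8 = truncate(raw_trace, col=8)   # 12 frames kept
restored = unpack_lossless(pack_lossless(kept_col8))
```

The truncation is irreversible by design, whereas the packed bytes decode back to exactly the samples that survived truncation.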
(788) In such a method, the data substitution process may comprise storing the plurality of coefficients compactly by dynamically truncating the coefficients to a lower precision and encoding the truncated coefficients using a Huffman code. The data substitution process may comprise replacing the raw data for each sensor with a plurality of coefficients of a linear combination of between about 5 and about 10 principal component vectors representative of the raw data for each sensor. The data substitution process may comprise replacing the raw data for each sensor with a plurality of coefficients of a linear combination of 5 or 6 principal component vectors representative of the raw data for each sensor.
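The compact storage of coefficients may be illustrated, under stated assumptions, by the sketch below. It quantizes coefficients to integer multiples of a fixed step and then builds a Huffman code over the resulting symbols. The fixed `step`, the function names, and the sample coefficient values are hypothetical; in particular, how the precision would be chosen dynamically is not modeled here.

```python
import heapq
from collections import Counter
from typing import Dict, List


def quantize(coeffs: List[float], step: float) -> List[int]:
    """Reduce precision: represent each coefficient as a multiple of `step`."""
    return [round(c / step) for c in coeffs]


def huffman_code(symbols: List[int]) -> Dict[int, str]:
    """Build a prefix code in which frequent symbols get shorter bit strings."""
    freq = Counter(symbols)
    if len(freq) == 1:                      # degenerate single-symbol input
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(symbols: List[int], code: Dict[int, str]) -> str:
    return "".join(code[s] for s in symbols)


def decode(bits: str, code: Dict[int, str]) -> List[int]:
    inverse = {v: k for k, v in code.items()}
    out, cur = [], ""
    for b in bits:
        cur += b
        if cur in inverse:                  # prefix-free, so first match is right
            out.append(inverse[cur])
            cur = ""
    return out


# Hypothetical coefficients for two sensors, five coefficients each.
coeffs = [4.02, 1.98, 0.01, -0.02, 0.03, 0.0, 4.01, 2.03, -0.01, 0.02]
symbols = quantize(coeffs, step=0.1)
code = huffman_code(symbols)
bits = encode(symbols, code)
```

Because small coefficients collapse onto the same quantized value, the symbol distribution becomes skewed, which is the condition under which a Huffman code saves space relative to a fixed-width encoding.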
(789) Example Computer System
(790) Various aspects of the embodiments described herein may be implemented in software, firmware, hardware, or a combination thereof.
(791) After reading the description herein, it will become apparent to a person skilled in the relevant art how to implement embodiments described herein using other computer systems and/or computer architectures. For instance, in an embodiment, computer system 8300 (or a portion thereof) may be a stand-alone computing system (e.g., array controller 250 of
(792) Computer system 8300 can be any commercially available and well known computer capable of performing the functions described herein, such as computers available from International Business Machines, Apple, Sun, HP, Dell, Compaq, Cray, etc.
(793) Computer system 8300 includes one or more processors, such as processor 8304. Processor 8304 may be a special purpose or a general-purpose processor. Processor 8304 is connected to a communication infrastructure 8306 (e.g., a bus or network).
(794) Computer system 8300 also includes a main memory 8308, preferably random access memory (RAM), and may also include a secondary memory 8310. Main memory 8308 has stored therein a control logic 8309 (computer software) and data. Secondary memory 8310 can include, for example, a hard disk drive 8312, a removable storage drive 8314, and/or a memory stick. Removable storage drive 8314 can comprise a floppy disk drive, a magnetic tape drive, an optical disk drive, a flash memory, or the like. The removable storage drive 8314 reads from and/or writes to a removable storage unit 8317 in a well-known manner. Removable storage unit 8317 can include a floppy disk, magnetic tape, optical disk, etc., which is read by and written to by removable storage drive 8314. As will be appreciated by persons skilled in the relevant art, removable storage unit 8317 includes a computer-usable storage medium 8318 having stored therein a control logic 8319 (e.g., computer software) and/or data.
(795) In alternative implementations, secondary memory 8310 can include other similar devices for allowing computer programs or other instructions to be loaded into computer system 8300. Such devices can include, for example, a removable storage unit 8322 and an interface 8320. Examples of such devices can include a program cartridge and cartridge interface (such as those found in video game devices), a removable memory chip (e.g., EPROM or PROM) and associated socket, and other removable storage units 8322 and interfaces 8320 which allow software and data to be transferred from the removable storage unit 8322 to computer system 8300.
(796) Computer system 8300 also includes a display 8330 that communicates with computer system 8300 via a display interface 8302. Although not shown in computer system 8300 of
(797) Computer system 8300 can also include a communications interface 8324. Communications interface 8324 allows software and data to be transferred between computer system 8300 and external devices. Communications interface 8324 can include a modem, a network interface (such as an Ethernet card), a communications port, a PCMCIA slot and card, or the like. Software and data transferred via communications interface 8324 are in the form of signals, which may be electronic, electromagnetic, optical, or other signals capable of being received by communications interface 8324. These signals are provided to communications interface 8324 via a communications path 8326. Communications path 8326 carries signals and may be implemented using wire or cable, fiber optics, a phone line, a cellular phone link, a RF link or other communications channels.
(798) In this document, the terms "computer program medium" and "computer-usable medium" are used to generally refer to media such as removable storage unit 8317, removable storage unit 8322, and a hard disk installed in hard disk drive 8312. "Computer program medium" and "computer-usable medium" can also refer to memories, such as main memory 8308 and secondary memory 8310, which can be memory semiconductors (e.g., DRAMs, etc.). These computer program products provide software to computer system 8300.
(799) Computer programs (also called computer control logic) are stored on main memory 8308 and/or secondary memory 8310. Computer programs may also be received via communications interface 8324. Such computer programs, when executed, enable computer system 8300 to implement embodiments described herein. In particular, the computer programs, when executed, enable processor 8304 to implement processes described herein, such as the steps in the methods illustrated in the flowchart of
(800) Based on the description herein, a person skilled in the relevant art will recognize that the computer programs, when executed, can enable one or more processors to implement processes described above, such as the steps in the methods illustrated in the flowchart of
(801) Based on the description herein, a person skilled in the relevant art will recognize that the computer programs, when executed, can enable multiple processors to implement processes described above, such as the steps in the methods illustrated in the flowchart of
(802) Embodiments are also directed to computer program products including software stored on any computer-usable medium (e.g., computer-usable medium 8318 and 8331). Such software, when executed in one or more data processing devices, causes the data processing device(s) to operate as described herein. Embodiments employ any computer-usable or computer-readable medium, known now or in the future. Examples of computer-usable media include, but are not limited to, primary storage devices (e.g., any type of random access memory), secondary storage devices (e.g., hard drives, floppy disks, CD-ROMs, ZIP disks, tapes, magnetic storage devices, optical storage devices, MEMS, nanotechnological storage devices, etc.), and communication media (e.g., wired and wireless communications networks, local area networks, wide area networks, intranets, etc.).
(803) Embodiments have been described above with the aid of functional building blocks illustrating the implementation of specified functions and relationships thereof. The boundaries of these functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternate boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed.
(804) According to various embodiments, one or more features of any one or more of the above-discussed teachings and/or embodiments may be performed or implemented using appropriately configured and/or programmed hardware and/or software elements. Determining whether an embodiment is implemented using hardware and/or software elements may be based on any number of factors, such as desired computational rate, power levels, heat tolerances, processing cycle budget, input data rates, output data rates, memory resources, data bus speeds, etc., and other design or performance constraints.
(805) Examples of hardware elements may include processors, microprocessors, input(s) and/or output(s) (I/O) device(s) (or peripherals) that are communicatively coupled via a local interface circuit, circuit elements (e.g., transistors, resistors, capacitors, inductors, and so forth), integrated circuits, application specific integrated circuits (ASIC), programmable logic devices (PLD), digital signal processors (DSP), field programmable gate array (FPGA), logic gates, registers, semiconductor device, chips, microchips, chip sets, and so forth. The local interface may include, for example, one or more buses or other wired or wireless connections, controllers, buffers (caches), drivers, repeaters and receivers, etc., to allow appropriate communications between hardware components. A processor is a hardware device for executing software, particularly software stored in memory. The processor can be any custom made or commercially available processor, a central processing unit (CPU), an auxiliary processor among several processors associated with the computer, a semiconductor based microprocessor (e.g., in the form of a microchip or chip set), a macroprocessor, or generally any device for executing software instructions. A processor can also represent a distributed processing architecture. The I/O devices can include input devices, for example, a keyboard, a mouse, a scanner, a microphone, a touch screen, an interface for various medical devices and/or laboratory instruments, a bar code reader, a stylus, a laser reader, a radio-frequency device reader, etc. Furthermore, the I/O devices also can include output devices, for example, a printer, a bar code printer, a display, etc. Finally, the I/O devices further can include devices that communicate as both inputs and outputs, for example, a modulator/demodulator (modem; for accessing another device, system, or network), a radio frequency (RF) or other transceiver, a telephonic interface, a bridge, a router, etc.
(806) Examples of software may include software components, programs, applications, computer programs, application programs, system programs, machine programs, operating system software, middleware, firmware, software modules, routines, subroutines, functions, methods, procedures, software interfaces, application program interfaces (API), instruction sets, computing code, computer code, code segments, computer code segments, words, values, symbols, or any combination thereof. Software in memory may include one or more separate programs, which may include ordered listings of executable instructions for implementing logical functions. The software in memory may include a system for identifying data streams in accordance with the present teachings and any suitable custom made or commercially available operating system (O/S), which may control the execution of other computer programs such as the system, and may provide scheduling, input-output control, file and data management, memory management, communication control, etc.
(807) According to various embodiments, one or more features of any one or more of the above-discussed teachings and/or embodiments may be performed or implemented using appropriately configured and/or programmed non-transitory machine-readable medium or article that may store an instruction or a set of instructions that, if executed by a machine, may cause the machine to perform a method and/or operations in accordance with the embodiments. Such a machine may include, for example, any suitable processing platform, computing platform, computing device, processing device, computing system, processing system, computer, processor, scientific or laboratory instrument, etc., and may be implemented using any suitable combination of hardware and/or software. The machine-readable medium or article may include, for example, any suitable type of memory unit, memory device, memory article, memory medium, storage device, storage article, storage medium and/or storage unit, for example, memory, removable or non-removable media, erasable or non-erasable media, writeable or re-writeable media, digital or analog media, hard disk, floppy disk, read-only memory compact disc (CD-ROM), recordable compact disc (CD-R), rewriteable compact disc (CD-RW), optical disk, magnetic media, magneto-optical media, removable memory cards or disks, various types of Digital Versatile Disc (DVD), a tape, a cassette, etc., including any medium suitable for use in a computer. Memory can include any one or a combination of volatile memory elements (e.g., random access memory (RAM, such as DRAM, SRAM, SDRAM, etc.)) and nonvolatile memory elements (e.g., ROM, EPROM, EEROM, Flash memory, hard drive, tape, CDROM, etc.). Moreover, memory can incorporate electronic, magnetic, optical, and/or other types of storage media. Memory can have a distributed architecture where various components are situated remote from one another, but are still accessed by the processor. 
The instructions may include any suitable type of code, such as source code, compiled code, interpreted code, executable code, static code, dynamic code, encrypted code, etc., implemented using any suitable high-level, low-level, object-oriented, visual, compiled and/or interpreted programming language.
(808) According to various embodiments, one or more features of any one or more of the above-discussed teachings and/or embodiments may be performed or implemented at least partly using a distributed, clustered, remote, or cloud computing resource.
(809) According to various embodiments, one or more features of any one or more of the above-discussed teachings and/or embodiments may be performed or implemented using a source program, executable program (object code), script, or any other entity comprising a set of instructions to be performed. When the entity is a source program, the program can be translated via a compiler, assembler, interpreter, etc., which may or may not be included within the memory, so as to operate properly in connection with the O/S. The instructions may be written using (a) an object oriented programming language, which has classes of data and methods, or (b) a procedural programming language, which has routines, subroutines, and/or functions. Such languages may include, for example, C, C++, Pascal, Basic, Fortran, Cobol, Perl, Java, and Ada.
(810) While various embodiments have been described above, it should be understood that they have been presented by way of example only, and not limitation. It will be understood by those skilled in the relevant art that various changes in form and details can be made therein without departing from the spirit and scope of the embodiments described herein. It should be understood that this description is not limited to these examples. This description is applicable to any elements operating as described herein. Accordingly, the breadth and scope of this description should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.
(811) Each of the following is incorporated by reference herein in its entirety: U.S. patent application Ser. No. 13/334,747, filed Dec. 22, 2011, titled Methods and Apparatus for Measuring Analytes; U.S. patent application Ser. No. 12/475,311, filed May 29, 2009, titled Methods and Apparatus for Measuring Analytes; U.S. patent application Ser. No. 13/340,490, filed Dec. 29, 2011, titled Methods, Systems, and Computer Readable Media for Nucleic Acid Sequencing; and U.S. Prov. Pat. Appl. No. 61/428,733, filed Dec. 30, 2010, titled Apparatus, Methods, and Software for Performing Electrochemical Reactions.