Sensor

Abstract

A method of diagnosing, staging or monitoring cancer, the method comprising the steps of: (a) providing a sensor array comprising at least two sensors, wherein each sensor comprises a protein barrel that comprises five or more alpha helices arranged as an alpha-helical barrel, and a reporter dye, wherein the protein barrel defines a lumen, the reporter dye is bound to the lumen reversibly; and wherein the protein barrel is different in structure in the at least two sensors; (b) contacting the sensor array with a sample obtained from a patient; and then (c) comparing the sensor array to a predetermined standard.

Claims

1. A method of diagnosing, staging or monitoring cancer, the method comprising the steps of: (a) providing a sensor array comprising at least two sensors, wherein each sensor comprises a protein barrel that comprises five or more alpha helices arranged as an alpha-helical barrel, and a reporter dye, wherein the protein barrel defines a lumen, the reporter dye is bound to the lumen reversibly; and wherein the protein barrel is different in structure in the at least two sensors; (b) contacting the sensor array with a sample obtained from a patient; and then (c) comparing the sensor array to a predetermined standard.

2. The method according to claim 1, wherein the sample is liquid in which tumour or tissue cells from the patient have been cultured.

3. The method according to claim 1, wherein the sample is or is obtained from whole blood, a cell scraping, a biopsy tissue, bone marrow, plasma, serum, cerebrospinal fluid, saliva, semen, sputum, urine or stool.

4. The method according to claim 1, wherein the cancer is breast cancer.

5. The method according to claim 1, wherein the cancer is metastatic breast cancer in the lung.

6. The method according to claim 1, wherein each alpha helix independently comprises a sequence having a repeat unit with sequence abcdefg, wherein 50% or more of the a and d positions are hydrophobic amino acids and wherein 50% or more of the b, c, e, f and g positions are polar amino acids.

7. The method according to claim 6, wherein the repeat unit with sequence abcdefg is selected from the list consisting of: LQKIEfI, LKAIAfE, LKEIAfS, IKEIAfS, LKEIAfA, FKEIAfA, IKEIAfA, IKEVAfA, VKEVAfA, VKEIAfA, MKEIAfA, LKQIEfI, LKEVAfA, VKELAfA, IKELSfA, IKELAfS, LKELAfS, FKEIAfA, LKQIEfI and LKELAfA; wherein f may vary between repeat units.

8. The method according to claim 1, wherein each alpha helix comprises at least three repeat units.

9. The method according to claim 1, wherein the protein barrel comprises a non-natural amino acid.

10. The method according to claim 9, wherein the non-natural amino acid is an amino acid that has been modified by chemically linking a protein substrate.

11. The method according to claim 10, wherein the protein substrate comprises an enzyme substrate, receptor substrate and/or antibody substrate.

12. The method according to claim 1, wherein the protein barrel comprises a single and continuous amino acid backbone.

13. The method according to claim 1, wherein the protein barrel is immobilised on a substrate, preferably wherein the substrate is a solid substrate or is a hydrogel.

14. The method according to claim 1, wherein the protein barrel and reporter dye are in a dry state.

15. The method according to claim 1, wherein the reporter dye provides an optical signal when bound to the lumen.

16. The method according to claim 1, wherein the reporter dye is a compound according to Formula I: ##STR00004## wherein n is 3 or more, preferably n is 3, 4 or 5, more preferably n is 3; and R1 and R2 are independently selected from aryl or heteroaryl, preferably aryl, more preferably phenyl.

17. The method according to claim 1, comprising at least 10 sensors, preferably at least 50 sensors, more preferably at least 100 sensors, yet more preferably at least 300 sensors, wherein the protein barrel is different in each of the at least 10, 50, 100 or 300 sensors respectively.

18. The method according to claim 1, comprising at least one further sensor, wherein the reporter dye is different in the at least one further sensor.

19. The method according to claim 1, wherein the sensor array is incorporated into a microarray chip.

20. The method according to claim 1, wherein step (c) comprises computational pattern recognition.

21. Use of a sensor array comprising at least two sensors, wherein each sensor comprises a protein barrel that comprises five or more alpha helices arranged as an alpha-helical barrel, and a reporter dye, wherein the protein barrel defines a lumen, the reporter dye is bound to the lumen reversibly; and wherein the protein barrel is different in structure in the at least two sensors, to diagnose, stage or monitor cancer.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0075] FIG. 1 is a schematic showing the abcdefg heptad repeat units of two alpha helices in a coiled-coil arrangement;

[0076] FIG. 2 is a schematic showing the abcdefg heptad repeat units of five or more alpha helices in a coiled-coil alpha-helical barrel;

[0077] FIG. 3 is a schematic showing the abcdefg heptad repeat units of a six-helix barrel;

[0078] FIG. 4 shows top-down and side views of x-ray crystal structures of coiled-coil folds comprising 3, 4, 5, 6 and 7 alpha helices, corresponding to PDB IDs 4DZM, 4DZL, 3R4A, 4PN8, 4PN9 and 4PNA, respectively;

[0079] FIG. 5A shows a partial cutaway view of an x-ray crystal structure of an alpha-helical barrel comprising CC-Hex2 with farnesol bound in the alpha-helical barrel lumen;

[0080] FIG. 5B shows a top down view of the x-ray crystal structure of FIG. 5A;

[0081] FIG. 6 shows a sensor array of different alpha-helical barrels;

[0082] FIG. 7 is a schematic view showing how a sensor array is run, wherein protein barrels are added to the array, DPH reporter dye is bound and different analytes produce different displacement patterns or fingerprints;

[0083] FIG. 8 shows displacement patterns for seven different analytes against the alpha-helical barrel array described in FIG. 6;

[0084] FIG. 9 shows replicate displacement patterns for cholesterol;

[0085] FIG. 10 shows a process for analysing a displacement pattern using computational methods;

[0086] FIG. 11 is a chart showing how computational pattern recognition improves with training;

[0087] FIG. 12 shows the DPH displacement fingerprints produced by selected tea samples demonstrating that complex mixtures can be successfully analysed;

[0088] FIG. 13 shows across the top the DPH displacement fingerprints for glucose, galactose and mannose and across the bottom the structure of the epimers, and demonstrates that the invention can be used to distinguish epimers;

[0089] FIG. 14 is a comparison of DPH displacement fingerprints for cholesterol (right), at 1 μM final concentration, using proteinogenic (left) and non-proteinogenic (centre) peptide arrays;

[0090] FIG. 15 shows DPH displacement fingerprints for N-Acetyl-L-aspartic acid (Panel A) and NG,NG-Dimethylarginine (Panel B) using a peptide barrel array including one all D-amino acid peptide (d-(avkeva)) which is represented by the block depicted on the left, second from bottom in each fingerprint;

[0091] FIG. 16 shows the fingerprints generated for all 10 conditioned media samples. This includes media collected from cells of “non-cancerous”, “primary tumour”, and “metastasised tumour” origins as indicated. The conditioned media fingerprints are labelled as follows. A: NMuMg; B: HC11; C: EpH4; D: Yej; E: 113; F: 724; G: Yej-M1; H: Yej-M2; I: 113-M1; J: 113-M2;

[0092] FIG. 17 shows the fingerprints generated for combined “non-cancerous” (panel A), “primary tumour” (B) and metastasised tumour (C) derived conditioned media;

[0093] FIG. 18 is the confusion matrix for 2-way prediction of healthy, non-cancerous cells, and cells originating from tumours;

[0094] FIG. 19 is the confusion matrix for 3-way prediction of healthy, non-cancerous cells, and cells originating from primary tumours and metastatic tumours; and

[0095] FIG. 20 is the confusion matrix for 2-way prediction of cells originating from primary tumours and metastatic tumours.

DESCRIPTION

[0096] The first aspect of the invention involves providing a sensor array comprising at least two sensors. The sensor array can be provided, for example, in a multiwell plate. In this case, the different sensors would be in different wells.

[0097] The sensor array comprises at least two sensors. Two sensors is the minimum number of sensors needed to define an array. A larger number of sensors can be included in the array. For example, the array can comprise at least 10 sensors, preferably at least 50 sensors, more preferably at least 100 sensors, yet more preferably at least 300 sensors. The protein barrel is different in each of the at least two, 10, 50, 100 or 300 sensors respectively.

[0098] The requirement for the protein barrel to be different in structure in the claimed sensors does not preclude that the sensor array can contain yet further sensors that are merely replicate sensors, controls, or make use of the same protein barrel but with a different reporter dye. Indeed, the use of replicate sensors is a common strategy to improve data quality. In other words, the sensor array comprises a number of different sensors with different protein barrels, but there will usually be further sensors in the sensor array with the same protein barrels. These further sensors are usually replicates for data quality, controls, or sensors that use a different reporter dye. However, within the sensor array, there must at least be the claimed number of sensors wherein the protein barrel is different in structure.
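The distinction between structurally different sensors and further replicate or control sensors can be illustrated with a minimal Python sketch. The `Sensor` record, its `role` labels and the example layout are hypothetical and chosen only for illustration; barrels are treated as different in structure when their names differ, which is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sensor:
    barrel: str          # name of the protein barrel, e.g. "CC-Hept"
    dye: str             # reporter dye, e.g. "DPH"
    role: str = "test"   # hypothetical labels: "test", "replicate" or "control"


def count_distinct_barrels(sensors):
    """Count the different barrels present, ignoring control sensors."""
    return len({s.barrel for s in sensors if s.role != "control"})


# Illustrative layout: two different barrels, one replicate,
# one sensor reusing a barrel with a different dye, and one control.
array = [
    Sensor("CC-Hept", "DPH"),
    Sensor("CC-Hept", "DPH", role="replicate"),
    Sensor("CC-Hex2", "DPH"),
    Sensor("CC-Hex2", "prodan"),
    Sensor("none", "DPH", role="control"),
]

n_distinct = count_distinct_barrels(array)
meets_minimum = n_distinct >= 2  # at least two structurally different barrels
```

In this layout the replicate, the second-dye sensor and the control add sensors to the array without adding to the count of structurally different barrels.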

[0099] Each sensor comprises a protein barrel. A protein barrel is a protein that defines a lumen. The protein barrel therefore has a lumen surface and an external surface. A lumen is a tubular cavity within the protein. The tubular cavity is typically elongated, i.e. long and narrow. Usually, the lumen would be open at both ends to allow for displacement of molecules within the lumen. However, in certain embodiments, the lumen may be blocked at one or at both ends to trap specific molecules within the lumen.

[0100] The protein barrel is different in structure in the different sensors. By this, we mean that there is at least one difference by which the protein barrels can be distinguished. This difference could include a point mutation in an amino acid, or an amino acid that has been derivatised or functionalised. This difference could also include a change in length or width of the protein barrel. This difference could also include a change in type of protein barrel.

[0101] Due to the possibility of making very different chemical environments by using a limited number of differences in the protein backbone, in certain embodiments of the invention, the different protein barrels may have similar protein backbones. For example, the different protein barrels may all be of the same type. In one embodiment, the different protein barrels may all be alpha-helical barrels. In another embodiment, it may be that the different protein barrels are within 50% sequence identity, 70% sequence identity or 90% sequence identity.
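As an illustration of the sequence identity figures, the following Python sketch computes a position-by-position identity for two peptides of equal length. Comparing without an alignment step is an assumption that suits the equal-length designed peptides of this description; it is not a general-purpose identity calculation. The three sequences are taken from the peptide table in this description, with the terminal caps omitted.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions carrying the same residue in both sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("this sketch only compares sequences of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


cc_hept = "GEIAQALREIAKALREIAWALREIAQALRG"
cc_hept_i17k = "GEIAQALREIAKALREKAWALREIAQALRG"
cc_hex2 = "GEIAKSLKEIAKSLKEIAWSLKEIAKSLKG"

point_mutant_identity = percent_identity(cc_hept, cc_hept_i17k)
hept_vs_hex2_identity = percent_identity(cc_hept, cc_hex2)
```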

[0102] Alpha-helical barrels are protein barrels that comprise five or more alpha helices. The alpha helices arrange in a pattern where they are substantially aligned with each other, side-by-side, to form a tube-like shape. This is known as a coiled-coil fold (also known as coiled-coil structures or assemblies) and has been well characterised previously. Representative examples include Malashkevich et al., 1996; Koronakis et al., 2000; Zaccai et al., 2011; Fletcher et al., 2012; Meusch et al., 2014; Sun et al., 2014; Thomson et al., 2014; Collie et al., 2015; and Lombardo et al., 2016. Examples of coiled-coil folds comprising different alpha helix numbers can be seen in FIGS. 1-4.

[0103] As can be seen in FIG. 4, coiled-coil folds can occur with 3 and 4 alpha helices. However, it is not until the number of alpha helices reaches 5 that a lumen forms. Coiled-coils with 5 or more alpha helices form a lumen, and therefore constitute alpha-helical barrels.

[0104] Thomson 2014 reports that five alpha-helix barrels have a lumen diameter of about 5.7 Å, six alpha-helical barrels have a lumen diameter of about 6.0 Å or about 7.4 Å, and seven alpha-helical barrels have a lumen diameter of about 7.6 Å, as measured by x-ray crystallography. In certain embodiments, the protein barrels have a lumen diameter of greater than about 5 Å, more preferably more than about 5.5 Å. In certain embodiments, the protein barrels have a lumen diameter of less than about 10 Å, more preferably less than about 8 Å.

[0105] A common structural feature in coiled-coil folds, such as in alpha-helical barrels, is that each alpha helix can independently comprise a sequence having a repeat unit with sequence abcdefg, wherein 50% or more of the a and d positions are hydrophobic amino acids and wherein 50% or more of the b, c, e, f and g positions are polar amino acids. In particular, having hydrophobic amino acids at the e and g positions can encourage alpha helix barrel formation, as can be seen in FIGS. 2 and 3. In one example, all the b, c and f positions can be polar amino acids, while all e and/or all g positions are hydrophobic amino acids.

[0106] In further embodiments, 60% or more, 75% or more, or 90% or more of the a and d positions are hydrophobic amino acids. In yet further embodiments, 60% or more, 75% or more, or 80% or more of the b, c, e, f and g positions are polar amino acids.
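The percentage criteria above can be checked for a given helix with a short Python sketch. The residue classes `HYDROPHOBIC` and `POLAR` are assumptions made for illustration, since this description does not enumerate them. The register string is the one printed above the sequence table in this description, and the example is the 28-residue core of CC-Hept, that is, the sequence without its terminal glycines and caps.

```python
# Assumed residue classes (not defined in this description).
HYDROPHOBIC = set("AVILMFW")
POLAR = set("RKDEQNHSTY")

# Heptad register covering residues 2 to 29 of the 30-residue peptides.
REGISTER = "cdefgabcdefgabcdefgabcdefgab"


def heptad_fractions(core: str, register: str = REGISTER):
    """Return (hydrophobic fraction at a/d, polar fraction at b/c/e/f/g)."""
    if len(core) != len(register):
        raise ValueError("core sequence and register must have equal length")
    a_d = [res for res, pos in zip(core, register) if pos in "ad"]
    others = [res for res, pos in zip(core, register) if pos not in "ad"]
    hydrophobic = sum(res in HYDROPHOBIC for res in a_d) / len(a_d)
    polar = sum(res in POLAR for res in others) / len(others)
    return hydrophobic, polar


cc_hept_core = "EIAQALREIAKALREIAWALREIAQALR"
hydrophobic_fraction, polar_fraction = heptad_fractions(cc_hept_core)
meets_50_percent_rule = hydrophobic_fraction >= 0.5 and polar_fraction >= 0.5
```

Under these assumed classes, the alanines at the e and g positions of CC-Hept count as non-polar, so the polar fraction is well below 100% while still meeting the 50% criterion.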

[0107] In particular examples, the repeat unit with sequence abcdefg can be selected from the list consisting of: LQKIEfI (SEQ ID NO: 1), LKAIAfE (SEQ ID NO: 2), LKEIAfS (SEQ ID NO: 3), IKEIAfS (SEQ ID NO: 4), LKEIAfA (SEQ ID NO: 5), FKEIAfA (SEQ ID NO: 6), IKEIAfA (SEQ ID NO: 7), IKEVAfA (SEQ ID NO: 8), VKEVAfA (SEQ ID NO: 9), VKEIAfA (SEQ ID NO: 10), MKEIAfA (SEQ ID NO: 11), LKQIEfI (SEQ ID NO: 12), LKEVAfA (SEQ ID NO: 13), VKELAfA (SEQ ID NO: 14), IKELSfA (SEQ ID NO: 15), IKELAfS (SEQ ID NO: 16), LKELAfS (SEQ ID NO: 17), FKEIAfA (SEQ ID NO: 18), LKQIEfI and LKELAfA (SEQ ID NO: 19); wherein f may vary between repeat units. While these repeat units represent the basic building block of an alpha helix, there may of course be point mutations such that not every unit is an identical repeat. In any given alpha helix, or in the alpha-helical barrel, up to 40%, preferably 25%, more preferably 10%, of the amino acid residues may deviate from the repeat unit. It can be seen from FIGS. 2 and 3 that position f is directed towards the bulk solvent and plays little role in assembly of the alpha helices with each other. The amino-acid residue at position f is therefore less important, and can vary between repeat units. Position f is therefore usually a polar amino acid to assist with water solubility of the alpha-helical barrel. However, position f is also a good candidate for further functionalisation.

[0108] Each alpha helix can comprise at least three repeat units. Examples of full-length sequences based on the above repeat units include the following.

TABLE-US-00001
Peptide Name    Sequence                                SEQ ID NO
                    cdefgabcdefgabcdefgabcdefgab
CC-Pent         Ac-GKIEQILQKIEKILQKIEWILQKIEQILQG-NH2   (SEQ ID NO: 20)
CC-Hex          Ac-GELKAIAQELKAIAKELKAIAWELKAIAQG-NH2   (SEQ ID NO: 21)
CC-Hex2         Ac-GEIAKSLKEIAKSLKEIAWSLKEIAKSLKG-NH2   (SEQ ID NO: 22)
CC-Hept         Ac-GEIAQALREIAKALREIAWALREIAQALRG-NH2   (SEQ ID NO: 23)
CC-Hex2-I10K    Ac-GEIAKSLKEKAKSLKEIAWSLKEIAKSLKG-NH2   (SEQ ID NO: 24)
CC-Hept-I17K    Ac-GEIAQALREIAKALREKAWALREIAQALRG-NH2   (SEQ ID NO: 25)
CC-Hept-I24D    Ac-GEIAKALREIAKALREIAWALREDAKALRG-NH2   (SEQ ID NO: 26)
CC-Hept-I24K    Ac-GEIAQALREIAKALREIAWALREKAQALRG-NH2   (SEQ ID NO: 27)
CC-Hept-I24E    Ac-GEIAKALREIAKALREIAWALREEAKALRG-NH2   (SEQ ID NO: 28)
AIKEVA          Ac-GEVAQAIKEVAKAIKEVAWAIKEVAQAIKG-NH2   (SEQ ID NO: 29)
AIKEIA          Ac-GEIAQAIKEIAKAIKEIAWAIKEIAQAIKG-NH2   (SEQ ID NO: 30)
AVKEIA          Ac-GEIAQAVKEIAKAVKEIAWAVKEIAQAVKG-NH2   (SEQ ID NO: 31)
AVKEVA          Ac-GEVAQAVKEVAKAVKEVAWAVKEVAQAVKG-NH2   (SEQ ID NO: 32)
ALKEVA          Ac-GEVAQALKEVAKALKEVAWALKEVAQALKG-NH2   (SEQ ID NO: 33)
AVKELA          Ac-GELAQAVKELAKAVKELAWAVKELAQAVKG-NH2   (SEQ ID NO: 34)
SIKELA          Ac-GELAQSIKELAKSIKELAWSIKELAQSIKG-NH2   (SEQ ID NO: 35)
AIKELS          Ac-GELSQAIKELSKAIKELSWAIKELSQAIKG-NH2   (SEQ ID NO: 36)
SIKELA          Ac-GELAQSIKELAKSIKEEAWSIKELAQSIKG-NH2   (SEQ ID NO: 37)
ALKELA          Ac-GELAQALKELAKALKELAWALKELAQALKG-NH2   (SEQ ID NO: 38)
SLKELA          Ac-GELAQSLKELAKSLKELAWSLKELAQSLKG-NH2   (SEQ ID NO: 39)
ALKELA          Ac-GELAQALKELAKALKEQAWALKELAQALKG-NH2   (SEQ ID NO: 40)
ALKELA          Ac-GELAQALKELAKALKEEAWALKELAQALKG-NH2   (SEQ ID NO: 41)
AFKEIA          Ac-GEIAQAFKEIAKAFKEIAWAFKEIAQAFKG-NH2   (SEQ ID NO: 42)
AMKEIA          Ac-GEIAQAMKEIAKAMKEIAWAMKEIAQAMKG-NH2   (SEQ ID NO: 43)
CCHept-I17C     Ac-GEIAQALKEIAKALKECAWALKEIAQALKG-NH2   (SEQ ID NO: 44)
CCPent_var      Ac-GQIEQILKQIEKILKQIEWILKQIEQILKG-NH.sub.2
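The mutant names in the table above (for example CC-Hept-I17K) appear to follow the pattern original residue, position, new residue, with positions counted from the N-terminal glycine and the acetyl cap excluded. That reading is inferred from comparing the parent and mutant rows rather than stated in the text. A minimal Python sketch of the convention, using a hypothetical helper `apply_point_mutation`:

```python
def apply_point_mutation(sequence: str, mutation: str) -> str:
    """Apply a mutation written as <old residue><1-based position><new residue>."""
    old, position, new = mutation[0], int(mutation[1:-1]), mutation[-1]
    index = position - 1
    if sequence[index] != old:
        raise ValueError(
            f"expected {old} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + new + sequence[index + 1:]


# Parent sequences from the table, without the Ac- and -NH2 caps.
cc_hept = "GEIAQALREIAKALREIAWALREIAQALRG"
cc_hex2 = "GEIAKSLKEIAKSLKEIAWSLKEIAKSLKG"

cc_hept_i17k = apply_point_mutation(cc_hept, "I17K")
cc_hex2_i10k = apply_point_mutation(cc_hex2, "I10K")
```

Both results reproduce the corresponding rows of the table, which supports the assumed numbering.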

[0109] CC-Pent, CC-Hex2, CC-Hept and AIKEIA point mutants where the b (or c in CC-Pent) position is either K or R, and the f positions are either QKWQ or KKWK, and the mutation is at the 3, 7, 10, 14, 17, 21, 24, 28 position:

[0110] CC-Pent-Mutants: Ac-GcIEfILQcIEfILQcIEfILQcIEfILQG-NH.sub.2

[0111] CC-Hex2-Mutants: Ac-GEIAfSLbEIAfSLbEIAfSLbEIAfSLbG-NH.sub.2

[0112] CCHept-Mutants: Ac-GEIAfALbEIAfALbEIAfALbEIAfALbG-NH.sub.2

[0113] AIKEIA-Mutants: Ac-GEIAfAIbEIAfAIbEIAfAIbEIAfAIbG-NH.sub.2
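The mutant templates above can be expanded into concrete background sequences as sketched below in Python. The helper `fill_template` is hypothetical. Lower-case letters in a template are treated as placeholders: each b (or c in CC-Pent) is replaced by K or R, and the four f positions are filled in order from QKWQ or KKWK. Terminal caps are omitted.

```python
from itertools import product


def fill_template(template: str, variable_residue: str, f_residues: str) -> str:
    """Fill b/c placeholders with one residue and f placeholders in order."""
    if template.count("f") != len(f_residues):
        raise ValueError("one residue is needed for each f placeholder")
    f_iter = iter(f_residues)
    filled = []
    for char in template:
        if char == "f":
            filled.append(next(f_iter))
        elif char in "bc":
            filled.append(variable_residue)
        else:
            filled.append(char)  # upper-case residues are kept unchanged
    return "".join(filled)


cc_hept_template = "GEIAfALbEIAfALbEIAfALbEIAfALbG"
cc_pent_template = "GcIEfILQcIEfILQcIEfILQcIEfILQG"

# All four combinations of K/R with QKWQ/KKWK for the CC-Hept template.
cc_hept_backgrounds = [
    fill_template(cc_hept_template, residue, f_set)
    for residue, f_set in product("KR", ["QKWQ", "KKWK"])
]

cc_hept_parent = fill_template(cc_hept_template, "R", "QKWQ")
cc_pent_parent = fill_template(cc_pent_template, "K", "QKWQ")
```

Filling the CC-Hept template with R and QKWQ, and the CC-Pent template with K and QKWQ, gives back the CC-Hept and CC-Pent sequences listed in the peptide table.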

[0114] Each alpha helix listed above is not covalently linked to any other alpha helices within the fully formed alpha-helical barrel. Instead, the alpha helices self-assemble. The alpha-helical barrels formed from the peptides listed above comprise identical alpha helices. However, in different embodiments, the alpha helices within an alpha-helical barrel can be non-identical. With non-identical alpha helices that are not covalently linked, attention should be paid to the different permutations of alpha-helical barrels that can self-assemble. Alternatively, the alpha-helical barrel can comprise a single and continuous amino acid backbone. This affords a much greater level of control over the alpha helices that assemble to form the alpha-helical barrel.

[0115] The protein barrel can comprise a non-natural amino acid. This may be an enantiomer of a natural amino acid, a natural amino acid that has been further functionalised, or any other amino acid. The rigid structure of protein barrels generally allows for substitution of a number of amino acids without compromising the fold of the protein barrel.

[0116] For example, the table below shows how 3 non-proteinogenic peptides are incorporated into the array of 15 barrels and a DPH control by replacing 3 proteinogenic peptides.

TABLE-US-00002
Proteinogenic array
DPH control     CC-Pent_var (ILKQIE)
CC-Hept-I17C    CC-Hex (ELKAIA)
AFKEIA          CC-Hex2 (SLKEIA)
AIKEIA          CC-Hept (ALKEIA)
AIKEVA          CC-Hept-I24D
AVKEVA          CC-Hept-I24E
AVKEIA          CC-Hept-I24K
AMKEIA          CC-Hept-I17K

TABLE-US-00003
Non-proteinogenic array
DPH control     CC-Pent_var (ILKQIE)
CC-Hept-I17C    CC-Hex (ELKAIA)
AFKEIA          CC-Hex2 (SLKEIA)
AIKEIA          CC-Hept (ALKEIA)
AIKEVA          CC-Hept-I24D
AVKEVA          CC-Hex-L24E
AVKEIA          CC-Hept-L28Nle
AMKEIA          CC-Hept-dL (AdLKEIA)

[0117] As can be seen, peptides in the standard proteinogenic array are shown on the left and the non-proteinogenic array on the right incorporates 3 peptide sequences with unnatural amino acids. Nle=Norleucine, dL=Dehydroleucine.

TABLE-US-00004
CCHept-L28Nle:   Ac-GEIAQALKEIAKALKEIAWALKEIAQANleKG-NH2
CCHept-dL:       Ac-GEIAQAdLKEIAKAdLKEIAWAdLKEIAQAdLKG-NH2
CCHex-L24Nle:    Ac-GELKAIAQELKAIAKELKAIAWENleKAIAQG-NH2 (SEQ ID NO: 46)

[0118] In one embodiment, the non-natural amino acid is an amino acid that has been modified by chemically linking a protein substrate. Such methods of chemical linkage are well known. The protein substrate would typically be linked to a residue on the external surface of the protein barrel. Where an alpha-helical barrel is used, position f of the heptad repeat on an alpha helix would be a suitable candidate for the anchor for the linker. The protein substrate can comprise an enzyme substrate, receptor substrate and/or antibody substrate. By providing a protein substrate, the target protein can bind to the protein barrel and/or chemically modify the protein substrate. Either the binding of the protein or the chemical modification of the protein substrate can change the configuration of the protein barrel lumen and, in turn, disrupt binding of the reporter dye.

[0119] Each sensor of the sensor array comprises a reporter dye. A dye is a molecule that can provide an optical signal. The optical signal is typically in the ultraviolet and/or visible spectrum. By this, we mean a molecule that can provide a signal in the ultraviolet-visible region of the electromagnetic spectrum. The optical signal may be an absorption or luminescence signal. Preferably, the optical signal is fluorescence.

[0120] In the sensor array, the reporter dye is bound to the lumen reversibly. By this, we mean that the reporter dye is bound entirely, or substantially, within the protein barrel lumen. The binding is reversible, meaning that the reporter dye is free to unbind from the lumen, or to undergo changes in binding within the lumen. This reversible binding is typically mediated by non-covalent interactions. A particularly preferable form of reversible binding is mediated by a hydrophobic reporter dye binding within a hydrophobic lumen. Labile covalent binding may also be used, for example, by means of an imine that can be readily cleaved by nucleophilic substitution.

[0121] To qualify as a reporter dye, the molecule should provide a different signal between being bound to the lumen and when this binding is disrupted. Disruption includes the reporter dye being ejected from the lumen or the reporter dye changing in configuration within the lumen. Ejection may occur when an analyte enters the lumen and displaces the reporter dye, in other words, by competitive binding. Ejection may also occur when an analyte binds to the exterior of a protein barrel such that the lumen changes in configuration to the extent that the reporter dye can no longer bind to the lumen. Alternatively, in this scenario, the change in configuration of the lumen results in a change in configuration of the reporter dye.

[0122] The reporter dye can be free to leave the lumen, for example, when the lumen is open at both ends. In an alternative embodiment, the reporter dye is encapsulated within the lumen. In this embodiment, the sensor relies on an analyte changing the lumen configuration such that the reporter molecule changes in configuration and exhibits a different signal.

[0123] In a preferred embodiment, the reporter dye provides an optical signal when bound to the lumen. For reporter dyes that can provide signals constituting a positive signal or no signal, depending on environment, (for example, a reporter dye that can fluoresce in one environment but cannot fluoresce in a different environment), the positive signal exists when the reporter dye is bound to the lumen. This is in contrast to a reporter dye where the optical signal exists in free solution, but does not exist when bound to the protein lumen.

[0124] The reporter dye can be a compound according to Formula I

##STR00002##

wherein n is 3 or more, preferably n is 3, 4 or 5, more preferably n is 3; and R1 and R2 are independently selected from aryl or heteroaryl, preferably aryl, more preferably phenyl. Reporter dyes in accordance with Formula I are therefore generally hydrophobic and able to adopt an elongate configuration. In a preferred embodiment, the dye is 1,6-diphenyl-1,3,5-hexatriene.

[0125] Alternative dyes may be used, including any naphthalene such as 6-propionyl-2-dimethylaminonaphthalene (prodan).

[0126] The sensor array may comprise at least one further sensor, wherein the reporter dye is different in the at least one further sensor. This allows for a sensor, or series of sensors, where a dye with very different properties is used. This can allow for more diversity to be brought to the sensor array.

[0127] The protein barrel may be immobilised on a substrate. The substrate may be, for example, a surface comprising a glass or plastics material. The protein barrel of any given sensor may be immobilised within the well of a multiwell plate. This would allow for washing and reuse of the protein barrel. The protein barrel of any given sensor may be immobilised on a flat surface, alongside neighbouring immobilised protein barrels from different sensors in the sensor array. This would allow for a single analyte to be readily applied across different sensors, without the protein barrels diffusing and interfering with each other. This would also allow for miniaturisation of the sensor array, allowing for a considerable number of sensors (e.g. at least 500 or at least 1000 sensors) to be present in a small surface area (e.g. less than 5 or even less than 2 square centimetres). Such an array would provide a significant ability to distinguish between different analytes in a convenient and low-cost format. Such arrays are sometimes referred to as microchip arrays.

[0128] Techniques for immobilising protein barrels on a substrate are well known; one example (Pai et al., 2012) discloses immobilisation of peptides in a microarray. Where the protein barrel comprises a number of self-assembled subunits, just one, multiple or all subunits may be individually immobilised. Typically, N- or C-terminal residues are used for immobilisation as this can lower the chance of disrupting the protein fold/3D structure. However, non-terminal residues may instead be used for linking the protein barrels to a substrate. For example, where an alpha-helical barrel is used, an f position amino-acid residue could provide a suitable anchor point for immobilisation. Often, a flexible linker can be used between the protein barrel and the substrate to allow a certain degree of movement of the immobilised protein barrel.

[0129] The reporter dye can also be immobilised. The reporter dye can be immobilised to the substrate, by means of a linker that allows the reporter dye enough freedom of movement to enter and leave the protein barrel lumen. Alternatively, the reporter dye can be immobilised by linking to the protein barrel. Again, a linker should be used that allows the reporter dye enough freedom of movement to enter and leave the protein barrel. A different possibility is that the reporter dye is encapsulated within the lumen. In this possibility, the ends of the lumen would be blocked after the reporter dye has bound to the lumen. Immobilisation of the dye and barrel further allows for a sensor array that is reusable or can be used in-line, without needing to consider that either the protein barrel or dye may wash away.

[0130] The protein barrel and reporter dye can be in a dry state. By this, we mean that the complex of protein barrel and reporter dye has been dried. Drying can be carried out by techniques including air drying and lyophilisation. In the dry state, the sensor array can be stored and transported easily. Prior to use, the sensor array should be rehydrated. Rehydration can be achieved by adding an aqueous solution in advance of applying a test sample, or by adding an aqueous test sample.

[0131] These repeat sequences reflect repeat units of de novo alpha-helical barrels that form five-, six-, seven- and eight-membered alpha-helical barrels.

[0132] While these repeat units represent the basic building block of an alpha helix, there may of course be point mutations such that not every unit is an identical repeat.

[0133] The analyte or complex mixture of analytes to be detected is in the sample obtained from a patient such as a human or animal, and is usually a liquid or in solution. It would also be advantageous to be able to analyse gaseous analytes such as breath. As an alternative to immobilisation on a solid substrate, the protein barrel can be immobilised in or on a hydrogel or 3-dimensional porous scaffold substrate. This has the advantage that the sensor array could be used to detect gaseous analytes, as these can be dissolved in the hydrogel and hence accessible to the barrel. In particular, the barrels can be loaded into hydrogels, or 3-dimensional porous scaffolds, either covalently or non-covalently. Polymers (such as poly(ethylene glycol), polydimethyl siloxane and polyacrylamide), polysaccharides (such as chitosan, alginate and agarose) and peptide hydrogels are examples of materials that could be used to form the hydrogels.

[0134] The invention also provides for a microarray chip comprising a sensor array according to the first aspect of the invention. Microarray chip technology is well known. The microarray chip can be 3D printed. The microarray chip can comprise the sensor array in a dry state, wherein an aqueous test sample is soaked onto the chip. The microarray chip may be analysable by a smartphone.

[0135] The sensor arrays of the invention provide significant amounts of data. It can be very difficult or even impossible for the human eye to detect the differences that distinguish between analytes, or complex mixtures of analytes as will likely be present in the samples. However, these differences are much more amenable to computational approaches. As such, step (c) may comprise the use of computational pattern recognition. Examples of computational pattern recognition used in the art include principal component analysis (PCA), linear discriminant analysis (LDA), hierarchical cluster analysis (HCA) and artificial neural networks (ANN).
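By way of illustration only, the Python sketch below shows how displacement fingerprints might be compared computationally. It uses nearest-centroid classification, which is a deliberately simple stand-in and is not one of the methods listed above (PCA, LDA, HCA, ANN). The fingerprints are synthetic, made-up fractional displacement values for a hypothetical four-sensor array, and the class labels are placeholders.

```python
from math import dist


def fit_centroids(fingerprints, labels):
    """Average the training fingerprints of each class into one centroid."""
    grouped = {}
    for fingerprint, label in zip(fingerprints, labels):
        grouped.setdefault(label, []).append(fingerprint)
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in grouped.items()
    }


def predict(centroids, fingerprint):
    """Return the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], fingerprint))


def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[guess]] += 1
    return matrix


# Synthetic training and test data (not measured values).
train_x = [
    [0.10, 0.80, 0.30, 0.05],
    [0.15, 0.75, 0.35, 0.10],
    [0.60, 0.20, 0.70, 0.40],
    [0.55, 0.25, 0.65, 0.45],
]
train_y = ["non-cancerous", "non-cancerous", "tumour", "tumour"]
test_x = [[0.12, 0.78, 0.28, 0.08], [0.58, 0.22, 0.66, 0.41]]
test_y = ["non-cancerous", "tumour"]

centroids = fit_centroids(train_x, train_y)
predictions = [predict(centroids, fp) for fp in test_x]
matrix = confusion_matrix(test_y, predictions, ["non-cancerous", "tumour"])
```

The confusion matrix tabulates true against predicted classes in the same way as the matrices of FIGS. 18 to 20, although those figures were not necessarily produced with this classifier.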

EXPERIMENTAL

Synthesis of Protein Barrels

[0136] Alpha-helical barrels based on alpha helices with the following sequences (corresponding to the alpha-helical barrels referred to in FIG. 6) were synthesised.

TABLE-US-00005
Peptide                 Number of helices in barrel   Sequence
CC-Hept-I17C            7                             Ac-GEIAQALKEIAKALKECAWALKEIAQALKG-NH.sub.2
AFKEIA                  6                             Ac-GEIAQAFKEIAKAFKEIAWAFKEIAQAFKG-NH.sub.2
AIKEIA                  8                             Ac-GEIAQAIKEIAKAIKEIAWAIKEIAQAIKG-NH.sub.2
AIKEVA                  7                             Ac-GEVAQAIKEVAKAIKEVAWAIKEVAQAIKG-NH.sub.2
AVKEVA                  6                             Ac-GEVAQAVKEVAKAVKEVAWAVKEVAQAVKG-NH.sub.2
AVKEIA                  6                             Ac-GEIAQAVKEIAKAVKEIAWAVKEIAQAVKG-NH.sub.2
AMKEIA                  7                             Ac-GEIAQAMKEIAKAMKEIAWAMKEIAQAMKG-NH.sub.2
CC-Pent_var (ILKQIE)    5                             Ac-GQIEQILKQIEKILKQIEWILKQIEQILKG-NH.sub.2
CC-Hex (ELKAIA)         6                             Ac-GELKAIAQELKAIAKELKAIAWELKAIAQG-NH.sub.2
CC-Hex2 (SLKEIA)        6                             Ac-GEIAKSLKEIAKSLKEIAWSLKEIAKSLKG-NH.sub.2
CC-Hept (ALKEIA)        7                             Ac-GEIAQALREIAKALREIAWALREIAQALRG-NH.sub.2
CC-Hept-I24D            7                             Ac-GEIAKALREIAKALREIAWALREDAKALRG-NH.sub.2
CC-Hept-I24E            7                             Ac-GEIAKALREIAKALREIAWALREEAKALRG-NH.sub.2
CC-Hept-I24K            7                             Ac-GEIAQALREIAKALREIAWALREKAQALRG-NH.sub.2
CC-Hept-I17K            7                             Ac-GEIAQALREIAKALREKAWALREIAQALRG-NH.sub.2

[0137] The peptide sequences were synthesised and characterized using techniques previously described (Thomson et al., 2014).

[0138] Fmoc amino acids, DMF and Cl-HOBt were purchased from AGTC Bioproducts (Hessle, UK). Rink amide ChemMatrix solid support was purchased from PCAS BioMatrix Inc (Saint-Jean-sur-Richelieu, Canada). TMA-DPH and farnesyl pyrophosphate (FPP) were purchased from Sigma-Aldrich (Gillingham, UK). Farnesol was purchased from Alfa Aesar (Heysham, UK). All other chemicals were purchased from Fisher-Scientific (Loughborough, UK). Unless stated otherwise, biophysical measurements were performed in HEPES buffered saline (HBS; 25 mM HEPES, 100 mM NaCl, pH 7.0). Peptide concentration was determined by UV-Vis on a ThermoScientific (Hemel Hempstead, UK) Nanodrop 2000 spectrometer (ε.sub.280=5690 M.sup.−1 cm.sup.−1).
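The conversion from absorbance at 280 nm to peptide concentration can be sketched with the Beer-Lambert relation c = A/(ε·l). The Python example below assumes that the ε.sub.280 value of 5690 is a molar extinction coefficient in M.sup.−1 cm.sup.−1 and that the absorbance is expressed for the stated path length. The function name and the example reading are hypothetical.

```python
def peptide_concentration_uM(absorbance_280: float,
                             epsilon_per_M_cm: float = 5690.0,
                             path_length_cm: float = 1.0) -> float:
    """Beer-Lambert law, c = A / (epsilon * l), returned in micromolar."""
    molar = absorbance_280 / (epsilon_per_M_cm * path_length_cm)
    return molar * 1e6


# Hypothetical reading: A280 = 0.569 for a 1 cm path length.
example_concentration = peptide_concentration_uM(0.569)
```

With these assumed inputs the result is approximately 100 μM of peptide. The concentration of assembled barrel would be lower by the number of helices per barrel.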

[0139] Standard Fmoc solid-phase peptide synthesis was performed on a CEM (Buckingham, UK) Liberty Blue automated peptide synthesis apparatus with inline UV monitoring. Activation was achieved with DIC/Cl-HOBt. Fmoc deprotection was performed with 20% v/v morpholine/DMF. All peptides were produced as the C-terminal amide on Rink amide ChemMatrix solid support and N-terminally acetylated upon addition of acetic anhydride (0.25 mL) and pyridine (0.3 mL) in DMF (5 mL) for 30 minutes at room temperature (rt). Peptides were cleaved from the solid support by addition of trifluoroacetic acid (9.5 mL), triisopropylsilane (0.25 mL) and water (0.25 mL) for 3 hours with shaking at rt. The cleavage solution was reduced to approximately 5 mL under a flow of nitrogen. Crude peptide was precipitated upon addition of diethyl ether (40 mL) and recovered via centrifugation. The resulting precipitate was dissolved in 1:1 acetonitrile and water (≈15 mL) and lyophilised to yield crude peptide as a white solid.

[0140] Peptides were purified by reverse phase HPLC on a Phenomenex (Macclesfield, UK) Luna C18 stationary phase column (150×10 mm, 5 μm particle size, 100 Å pore size). A 20-80% gradient of acetonitrile and water (with 0.1% TFA) was applied over 30 minutes. Fractions containing pure peptide were identified by analytical HPLC and MALDI-TOF MS, and were pooled and lyophilised.

Binding of Dyes to Lumen

[0141] Initial experiments sought to demonstrate that reporter dyes would bind within the lumen of alpha-helical barrels. The dyes 1,6-diphenyl-1,3,5-hexatriene (DPH) and 6-propionyl-2-dimethylaminonaphthalene (prodan) were assayed against a number of alpha-helical barrels to determine their dissociation constants, K.sub.D. DPH or prodan (1 μM) was incubated with varying concentrations of alpha-helical barrel (0.5-500 μM) for up to 2 hours, and the fluorescent signal measured at the corresponding emission wavelength.

TABLE-US-00006
Peptide    DPH K.sub.D (μM)   Prodan K.sub.D (μM)
CC-Pent    22.4 ± 4.3         -
CC-Hex     7.1 ± 1.3          -
CC-Hex2    9.5 ± 1.1          39.2 ± 6.8
CC-Hept    8.9 ± 2.2          40.5 ± 4.0

[0142] It can be seen from the table above that DPH binds to all four alpha-helical barrels, while prodan did not bind to the alpha-helical barrels comprising CC-Pent or CC-Hex. Where prodan did bind (CC-Hex2 and CC-Hept), it bound less tightly than DPH, with dissociation constants roughly four-fold higher.
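By way of illustration only, a dissociation constant could be estimated from a titration of this kind as sketched below. The sketch assumes a simple one-site model with the barrel in excess over the dye, so that the signal follows F = F0 + ΔF·[barrel]/(K.sub.D + [barrel]). The model, the function name fit_kd and the synthetic data are assumptions made for the example; they are not the fitting procedure used to obtain the values in the table above.

```python
def fit_kd(conc, signal, kd_min=0.1, kd_max=1000.0, steps=4000):
    """Fit F = f0 + d_f * c / (kd + c) by scanning kd on a log-spaced grid.

    For each trial kd the model is linear in f0 and d_f, so those two
    parameters are obtained by ordinary least squares.
    """
    n = len(conc)
    mean_y = sum(signal) / n
    best = None
    for i in range(steps + 1):
        kd = kd_min * (kd_max / kd_min) ** (i / steps)
        x = [c / (kd + c) for c in conc]  # fraction of dye bound at each point
        mean_x = sum(x) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, signal))
        d_f = sxy / sxx
        f0 = mean_y - d_f * mean_x
        sse = sum((f0 + d_f * xi - yi) ** 2 for xi, yi in zip(x, signal))
        if best is None or sse < best[0]:
            best = (sse, kd, f0, d_f)
    return best[1], best[2], best[3]


# Synthetic, noise-free titration (hypothetical values, in micromolar).
barrel_um = [0.5, 1, 2, 5, 10, 20, 50, 100, 200, 500]
TRUE_KD = 10.0
fluorescence = [5.0 + 100.0 * c / (TRUE_KD + c) for c in barrel_um]

kd, f0, d_f = fit_kd(barrel_um, fluorescence)
print(f"fitted K_D = {kd:.2f} uM")
```

A grid scan is used here only to keep the example free of external dependencies; a non-linear least-squares routine would serve equally well.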

Dye Displacement by Certain Analytes

[0143] After providing proof of concept that reporter dyes can bind within the lumen of alpha-helical barrels, the next step was to demonstrate that bound reporter dyes can be displaced by analytes. The four analytes below were selected based on having hydrophobic properties and being able to adopt an elongate configuration, as these were postulated to have the best chance of displacing a reporter dye.

##STR00003##

[0144] DPH was used as the reporter dye, and displacement of DPH was recorded using a standard competitive inhibition assay. In other words, the ability of an analyte to inhibit DPH binding was recorded by the inhibition constant K.sub.i. Alpha-helical barrels were incubated with DPH, or its cationic variant 1-(4-trimethylammoniumphenyl)-6-phenyl-1,3,5-hexatriene p-toluenesulfonate (TMA-DPH). Analyte was added (0.05-300 μM) and the fluorescence signal measured.

TABLE-US-00007
Peptide    Palmitic acid K.sub.i (μM)   Retinol K.sub.i (μM)   Farnesol K.sub.i (μM)   β-carotene K.sub.i (μM)
CC-Pent    1.1 ± 0.5                    14.8 ± 4.1             -                       -
CC-Hex     1.0 ± 0.3                    6.4 ± 3.2              23.9 ± 2.4              -
CC-Hex2    1.1 ± 0.3                    4.6 ± 1.9              8.6 ± 1.3               -
CC-Hept    0.9 ± 0.3                    4.0 ± 0.7              0.6 ± 0.2               12.1 ± 5.4

[0145] In all cases where competitive binding was observed, the inhibition constant was in the low micromolar range, similar to the dissociation constant of DPH. This indicates a similar strength of binding and demonstrates that reporter dyes can be displaced by analytes.
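For orientation, one common way of obtaining an inhibition constant from a competitive displacement curve is the Cheng-Prusoff relation, K.sub.i = IC.sub.50/(1 + [dye]/K.sub.D). The sketch below assumes 1:1 competitive binding and a barrel concentration well below the dye K.sub.D; the present description does not state which analysis was applied, so the function name and the example numbers are hypothetical.

```python
def ki_from_ic50(ic50_um, dye_um, dye_kd_um):
    """Convert an IC50 to K_i with the Cheng-Prusoff relation.

    ic50_um   : analyte concentration giving 50% dye displacement (uM)
    dye_um    : reporter dye concentration in the assay (uM)
    dye_kd_um : dissociation constant of the dye for the barrel (uM)
    """
    if min(ic50_um, dye_um, dye_kd_um) <= 0:
        raise ValueError("concentrations and K_D must be positive")
    return ic50_um / (1.0 + dye_um / dye_kd_um)


# Hypothetical example: IC50 of 2 uM measured with 1 uM dye whose K_D is 9.5 uM.
print(round(ki_from_ic50(2.0, 1.0, 9.5), 2))
```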

[0146] Further evidence of analyte binding was provided by an x-ray crystal structure of farnesol bound within the lumen of the CC-Hex2 alpha-helical barrel. This is shown in FIGS. 5A and 5B. To obtain this crystal structure, a lyophilized sample of CC-Hex2 was resuspended in deionized water to a concentration of 5 mg ml.sup.−1. Vapor-diffusion crystallization trials were set up at 19° C. using previously optimized conditions.sup.1 (0.1 M Na HEPES, 4.3 M sodium chloride at pH 7.5) by mixing 1 μl of CC-Hex2 with 1 μl of reservoir solution. Diffraction-quality crystals were obtained in 4 days. A solution of farnesol (2 mM) was prepared in 40% v/v DMSO:H.sub.2O and crystals were soaked for 1, 5, 20, 60 and 120 min. At each time point, the crystals were soaked in the reservoir solution containing 20% glycerol before freezing.

[0147] X-ray diffraction data were collected at the Diamond Light Source (Didcot, UK) on beamline I04-1 at a wavelength of 0.98 Å. Data were processed with MOSFLM (Battye et al., 2011) and AIMLESS (Evans and Murshudov, 2013), as implemented in the CCP4 suite (Winn et al., 2011). Due to high anisotropy in the diffraction data, the resultant mtz file was truncated to 2 Å in the b-axis using the Diffraction Anisotropy Server (Strong et al., 2006).

[0148] The crystal structure was solved by molecular replacement using a poly-alanine model of CC-Hex2 (PDB 4pn8). The structure was obtained after iterative rounds of model building with COOT (Emsley and Cowtan, 2004) and refinement with PHENIX refine (Afonine et al., 2012). Refinement was carried out with translation-libration-screw (TLS) (Zucker, Champ and Merritt, 2010) and non-crystallographic symmetry (NCS) parameters. An omit map was calculated from the final model after removal of the ligand and refinement in Phenix. Ligand structures and geometric restraints were calculated using Phenix eLBOW (Moriarty, Grosse-Kunstleve and Adams, 2009).

[0149] The final refined structure showed good stereochemistry, as analysed by MOLPROBITY (Chen et al., 2010) and Ramachandran plots indicated that no residues fell outside preferred regions of backbone conformational space.

Differential Arrays

[0150] In a proof-of-principle experiment, 15 different alpha-helical barrel designs, as set out in FIG. 6, were arrayed in 96-well plates. The different alpha-helical barrels have a variety of sizes, with between 5 and 7 alpha helices. The different alpha-helical barrels have different charges, with some being neutral, some having negatively charged carboxylate groups in the lumen and some having positively charged ammonium groups in the lumen.

[0151] The reporter dye DPH was added to each well and allowed to bind within the lumens of each alpha-helical barrel. Seven different small and large molecules were then subjected to the sensor assay. The molecules and the optical signal of each sensor in each sensor assay are shown in FIG. 7. This Figure shows a unique binding signature for each of the molecules.

[0152] It is important to realise the significance of the molecules screened. Cholesterol and nervonic acid are largely hydrophobic molecules that might be expected to bind readily within the lumen of an alpha-helical barrel. Furthermore, both can act as biomarkers, cholesterol for cardiovascular disease and nervonic acid for psychoses.

[0153] Dimethylarginine and N-acetyl-L-aspartic acid are highly polar amino acids, bearing multiple charges. It might be expected that these molecules would have little effect on an alpha-helical barrel with an uncharged and hydrophobic lumen; however, a displacement pattern is seen even across such alpha-helical barrels.

[0154] Hexamethylenetetramine is an explosives precursor and again produces a distinct displacement pattern. Triisopropylphosphate is a sterically bulky nerve agent analogue.

[0155] A significant result was the sensor array pattern produced by insulin. Insulin is a peptide that should not be able to fit within the lumen of the alpha-helical barrels used in the assay. However, a unique reporter dye displacement pattern was still produced. This provides evidence that even when analytes interact with the outer surface of an alpha-helical barrel, reporter dye displacement can occur.

[0156] High reproducibility was observed in repeat assays, as can be seen for the replicate data presented in FIG. 9.

[0157] FIG. 10 shows a workflow for applying computational pattern recognition to the sensor array results. The raw data are first normalised and then searched for patterns that uniquely identify the analyte. Applying machine learning to the sensor array patterns for each molecule gave greater than 95% correct predictions.
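The normalise-then-classify workflow can be sketched as follows. The description does not specify the normalisation or the learning algorithm, so the per-fingerprint standardisation and the nearest-centroid classifier below are stand-ins chosen for brevity, and the four-sensor fingerprints and analyte labels are invented for the example.

```python
import math


def normalise(fingerprint):
    """Scale one fingerprint to zero mean and unit variance (assumed scheme)."""
    mean = sum(fingerprint) / len(fingerprint)
    var = sum((v - mean) ** 2 for v in fingerprint) / len(fingerprint)
    sd = math.sqrt(var) or 1.0  # avoid division by zero for flat fingerprints
    return [(v - mean) / sd for v in fingerprint]


def train_centroids(fingerprints, labels):
    """Average the normalised training fingerprints of each labelled analyte."""
    sums, counts = {}, {}
    for fp, label in zip(fingerprints, labels):
        z = normalise(fp)
        acc = sums.setdefault(label, [0.0] * len(z))
        for i, v in enumerate(z):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, fingerprint):
    """Return the label whose centroid is closest in Euclidean distance."""
    z = normalise(fingerprint)
    return min(centroids, key=lambda label: math.dist(z, centroids[label]))


# Invented training data: two replicates each of two hypothetical analytes.
training = [
    [1.00, 0.50, 0.90, 0.20],
    [0.90, 0.60, 1.00, 0.30],
    [0.30, 1.00, 0.40, 0.90],
    [0.20, 0.90, 0.50, 1.00],
]
labels = ["analyte_A", "analyte_A", "analyte_B", "analyte_B"]

centroids = train_centroids(training, labels)
print(predict(centroids, [0.95, 0.55, 0.95, 0.25]))
```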

[0158] FIG. 11 shows how the prediction of analytes from naïve (unseen) data improves as the proportion of the data from known training sets is increased. In this case, by using random selection of just ≈30% of the 150 datasets of array signatures recorded for each of the known compounds, >90% of the predictions from the non-training-sets data are correct.

Analysing Complex Mixtures

[0159] A selection of teas was analysed as a test bed for the analysis of complex mixtures. A total of 9 different boxes of tea bags were purchased from local supermarkets. This comprised three black teas (PG Tips, Yorkshire Tea, and Pukka English Breakfast), three Earl Grey Teas (Twinings The Earl Grey, Pukka Gorgeous Earl Grey, and Clipper Organic Earl Grey), and three Green Teas (Clipper Organic Green Tea, Twinings Pure Green Tea, and Tetley Pure Green Tea).

[0160] Teas were brewed in the laboratory as follows: Firstly, when applicable, strings and labels were removed from tea bags. Next, deionised water was boiled in a newly purchased kettle free of limescale. A single tea bag was placed in a 500 mL Schott bottle with a 50 mm stirrer bar before 250 mL of deionised water was added, and the tea allowed to brew for 5 min with stirring (100 rpm). After this time, 1 mL of the tea solution was removed, and diluted 1:10 with deionised water and the solution snap frozen in liquid nitrogen and then stored at −80° C. Fresh tea samples were prepared for each experimental replicate using an identical protocol.

[0161] Using a suite of 15 barrel-forming peptides, plus a control containing no peptide, tea was analysed by observing DPH displacement to yield fingerprints as depicted in FIG. 12. FIG. 12 shows the DPH displacement fingerprints produced by selected tea samples as follows: Panel A PG Tips; Panel B Pukka English Breakfast; Panel C Yorkshire Tea; Panel D Clipper Organic Earl Grey; Panel E Pukka Gorgeous Earl Grey; Panel F Twinings The Earl Grey; Panel G Clipper Organic Green Tea; Panel H Tetley Pure Green Tea; and Panel I Twinings Pure Green Tea.

[0162] Implementing machine learning techniques, tea could be successfully classified by class (i.e. Black, Earl Grey or Green Tea) with 82.3% accuracy and by specific type with 90.0% accuracy.

Analysing Epimers.

[0163] Glucose, galactose and mannose were analysed in an array of 15 peptides and a DPH control. These three sugars are epimers in that they differ in configuration at a single stereocentre. Solutions of each of the three were prepared at 10 mM concentration in water before being analysed at 1 mM final concentration in the barrel array in which DPH displacement was measured. Each sugar was examined using 24 replicates of each barrel, in each of two 384-well plates on two separate days (i.e. 4 plates for each sugar). The peptide array was able to distinguish between these 3 very similar molecules, as shown by FIG. 13, which depicts the DPH displacement fingerprints for glucose, galactose and mannose across the top and the structure of each of the epimers across the bottom.

Non-Natural Amino Acids.

[0164] To demonstrate the use of non-natural amino acids, 3 non-proteinogenic peptides were incorporated into the array of 15 barrels and a DPH control by replacing 3 proteinogenic peptides.

TABLE-US-00008 Proteinogenic array
DPH control     CC-Pent_var (ILKQIE)
CC-Hept-I17C    CC-Hex (ELKAIA)
AFKEIA          CC-Hex2 (SLKEIA)
AIKEIA          CC-Hept (ALKEIA)
AIKEVA          CC-Hept-I24D
AVKEVA          CC-Hept-I24E
AVKEIA          CC-Hept-I24K
AMKEIA          CC-Hept-I17K

TABLE-US-00009 Non-proteinogenic array
DPH control     CC-Pent_var (ILKQIE)
CC-Hept-I17C    CC-Hex (ELKAIA)
AFKEIA          CC-Hex2 (SLKEIA)
AIKEIA          CC-Hept (ALKEIA)
AIKEVA          CC-Hept-I24D
AVKEVA          CC-Hex-L24Nle
AVKEIA          CC-Hept-L28Nle
AMKEIA          CC-Hept-dL (AdLKEIA)

[0165] As can be seen, peptides in the standard proteinogenic array are shown on the left and the non-proteinogenic array on the right incorporates 3 peptide sequences with unnatural amino acids. Nle=Norleucine, dL=Dehydroleucine.

TABLE-US-00010
CCHept-L28Nle:   Ac-GEIAQALKEIAKALKEIAWALKEIAQANleKG-NH2
CCHept-dL:       Ac-GEIAQAdLKEIAKAdLKEIAWAdLKEIAQAdLKG-NH2
CCHex-L24Nle:    (SEQ ID NO: 46) Ac-GELKAIAQELKAIAKELKAIAWENleKAIAQG-NH2

[0166] Cholesterol was analysed at 1 μM and the DPH displacement fingerprints analysed.

[0167] As can be seen in FIG. 14, a clear difference is observed when the proteinogenic (on the left) and non-proteinogenic (on the right) fingerprints are compared.

D Amino Acid Peptides.

[0168] To demonstrate the use of D-amino acids in the barrel array, an analogue of peptide AVKEVA comprising entirely D-amino acids was prepared (i.e. peptide d-(AVKEVA), below).

TABLE-US-00011 d-(AVKEVA): (SEQ ID NO: 45) Ac-GevaqavkevakavkevawavkevaqakvG-NH.sub.2

[0169] This peptide, which possesses the opposite chirality to peptide AVKEVA at each chiral centre, was substituted into a 15 peptide barrel array (as listed in Example 1) in place of peptide AVKEIA. Using this modified array, two small molecules were analysed for DPH displacement: N-Acetyl-L-aspartic acid and NG,NG-Dimethylarginine. Solutions of each molecule were prepared at 10 μM in water before being examined at 1 μM concentration with 24 replicates in each of three 384-well plates. FIG. 15 shows the DPH displacement signatures for each of these two molecules. In particular, FIG. 15 shows DPH displacement fingerprints for N-Acetyl-L-aspartic acid (Panel A) and NG,NG-Dimethylarginine (Panel B) using a peptide barrel array including one all D-amino acid peptide (d-(AVKEVA)), which is represented by the block depicted on the left, second from bottom in each fingerprint. From these data, machine learning techniques were implemented and the two molecules distinguished with 95.5% accuracy.

Example

[0170] This example demonstrates that the sensor array technology can distinguish between the varying secretome produced by non-cancerous cells, cells derived from primary tumours, and those from secondary tumours.

[0171] A total of 10 cell lines were employed, all of mouse origin: 3 non-cancerous (NMuMg, HC11, and EpH4), 3 of primary mammary tumour origin (Yej, 113, and 724), and 4 of metastasised mammary tumour origin (Yej-M1, Yej-M2, 113-M1, and 113-M2). Table 1 summarises the cell lines used in the current study. It should also be noted that the cell lines Yej, Yej-M1, and Yej-M2 are isogenetic; that is to say, the lines Yej-M1 and Yej-M2 are each derived from secondary tumours produced from the fat pad transplant and growth of a Yej derived tumour in a recipient mouse. In a similar fashion, the lines 113, 113-M1 and 113-M2 are also isogenetic, although in this instance 113-M1 and 113-M2 are derived from lung metastasis following tail vein injection of the 113 primary cell line.

TABLE-US-00012 TABLE 1 Cell lines used in the present study.
Non-Cancerous   Primary Tumour   Metastasised Tumour
NMuMg           Yej              Yej-M1
HC11            113              Yej-M2
EpH4            724              113-M1
                                 113-M2

Preparing the Samples—Cell Lines and Conditioned Media

[0172] NMuMg, EpH4 and HC11 cells are epithelial cells derived from normal glandular mouse tissues (commercially available). Mammary tumour cell lines were made at the CRUK Beatson Institute, Glasgow, from spontaneous tumours arising in the MMTV-PyMT mouse model of breast cancer. In this model, the PyMT oncogene is expressed under the control of the mammary gland specific MMTV-LTR promoter, resulting in well characterised disease progression that recapitulates the key events occurring in human metastatic breast cancer. Tumours measuring a maximum size of 9 mm×9 mm were excised from the mouse, processed to a pâté-like texture using a tissue chopper, and then digested in collagenase/hyaluronidase (15000 U Collagenase/5000 U hyaluronidase) for 1-2 hours at 37° C. with gentle shaking. Samples were then centrifuged for 1 minute at 15 g, and the supernatant collected. Supernatant was then centrifuged at 100 g for 3 minutes, and the consequent supernatant then centrifuged at 400 g for 10 minutes. The supernatant was then discarded, the cell pellet resuspended in full growth media, and then centrifuged at 800 r.p.m. for 3 minutes to wash the cells. This wash step was repeated a further two times, and then cells were resuspended in full growth media and incubated and maintained at 37° C./5% CO.sub.2 for passaging.

[0173] Metastatic variants of the mammary tumour cell lines were made using a fat pad transplantation model. In short, 0.5 million tumour cells were injected into the fourth mammary fat pad of recipient mice, and tumours allowed to grow until 9 mm×9 mm measurable size. Tumours were then surgically removed and the recipients allowed to recover, with weight and general health monitored over time. Recipients were culled upon signs of metastatic disease, including cachexia, weight loss and difficulty breathing. Lungs were harvested and processed as described above, with metastatic tumour cell lines consequently being isolated from the lungs of recipients that had succumbed to lung metastasis.

[0174] Normal mouse mammary epithelial cells, primary mammary tumour cell lines, and metastatic variants of the primary tumour cell lines, were maintained in DMEM supplemented with 10% FBS, 2 mM L-Glutamine, 10 μg/mL Insulin, 20 ng/mL EGF and 100 U/L Penicillin-Streptomycin at 37° C./5% CO.sub.2. Cells were plated at a density of 2×10.sup.6 cells per 10 cm dish in 10 mL total volume, and incubated at 37° C./5% CO.sub.2 for 24 hours. Conditioned media was then collected and subjected to the following differential centrifugation protocol: 300 g for 10 minutes, 2000 g for 10 minutes, and 10000 g for 30 minutes, with all centrifugation steps conducted at 4° C. The resulting cell culture supernatant was then snap frozen and stored at −80° C. before use in the sensor array. Cell counts were also performed at the point of conditioned media collection in order to enable normalisation to final cell number. For each cell line, conditioned media was collected across three separate days to give n=3. Thus, with 10 different cell lines (3 non-cancer, 3 primary, and 4 metastatic) used, and conditioned media collected 3 times, we examined 30 different batches of media.

Contact with Sensor Array

[0175] Before analysis in the sensor array, frozen conditioned media samples were defrosted and diluted relative to the cell count measured at the time media was collected. These cell counts ranged from 1.67×10.sup.5 cells/mL to 6.84×10.sup.5 cells/mL. Final concentration of media in sample ranged from 2.0% (for the conditioned media with the lowest cell count) to 0.49% (for the media with the highest cell count).
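The two end points quoted above are consistent with a dilution that is inversely proportional to cell count, since 2.0% × 1.67/6.84 ≈ 0.49%. The short sketch below reproduces that arithmetic; the inverse-proportional rule and the function name media_percent are inferred from the two quoted values and are not stated explicitly in the description.

```python
def media_percent(cell_count, reference_count=1.67e5, reference_percent=2.0):
    """Final media concentration (%) scaled inversely with cell count.

    The reference point is the sample with the lowest cell count, which
    is used at 2.0% final concentration.
    """
    if cell_count <= 0:
        raise ValueError("cell count must be positive")
    return reference_percent * reference_count / cell_count


for count in (1.67e5, 6.84e5):
    print(f"{count:.2e} cells/mL -> {media_percent(count):.2f}% media")
```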

[0176] The analysis of conditioned media samples was performed as outlined above, using the sensor array described at the beginning of the Experimental section above. Briefly, a set of 15 barrel-forming coiled coil peptides (plus a single no-peptide control) were arrayed (at 10 μM in HEPES buffered saline) with diphenylhexatriene (DPH; 1 μM) on a 384-well plate (i.e. each peptide plus control was deposited in 24 replicates per plate). Next, a given conditioned media analyte was added across columns 1-5, 8-14, and 17-24 of the plate. An equal volume of water was added to columns 6, 7, 15, & 16 to serve as a control. After 1 h, DPH fluorescence was measured (350/450 nm, excitation/emission) and, for each analyte-containing well, normalised to the control well value obtained for that given barrel peptide. Each conditioned media sample was assayed on 4 separate 384-well plates, across 4 different days to give n=4.

Results—Generation of Fingerprints.

[0177] For each sample of conditioned media, normalised DPH fluorescence data from each barrel-forming peptide was averaged across each of the four plates. As described above, colour graduation can be used to represent this average fluorescence from each of the 15 barrels (plus −ve control) as a 16 cell fingerprint.
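The plate normalisation and averaging described above can be sketched as follows. The sketch assumes that each of the 16 plate rows holds one barrel peptide (or the no-peptide control) and that the control value for a row is the mean of its four water-only wells; the description fixes the column assignments but not these two details, and the plate values used here are synthetic.

```python
CONTROL_COLUMNS = (6, 7, 15, 16)  # 1-based columns receiving water only
ANALYTE_COLUMNS = tuple(c for c in range(1, 25) if c not in CONTROL_COLUMNS)


def normalise_plate(plate):
    """Divide each analyte well by the mean control well of its own row."""
    rows = []
    for row in plate:
        if len(row) != 24:
            raise ValueError("expected 24 columns per row")
        control = sum(row[c - 1] for c in CONTROL_COLUMNS) / len(CONTROL_COLUMNS)
        rows.append([row[c - 1] / control for c in ANALYTE_COLUMNS])
    return rows


def fingerprint(plates):
    """Average normalised analyte wells per row across all plates."""
    per_plate = [normalise_plate(p) for p in plates]
    result = []
    for r in range(len(per_plate[0])):
        wells = [v for plate in per_plate for v in plate[r]]
        result.append(sum(wells) / len(wells))
    return result


def synthetic_plate(scale):
    """Hypothetical 16 x 24 plate: row r loses 5% of signal per row index."""
    plate = []
    for r in range(16):
        row = []
        for c in range(1, 25):
            value = 100.0 if c in CONTROL_COLUMNS else 100.0 - 5.0 * r
            row.append(value * scale)  # scale mimics plate-to-plate drift
        plate.append(row)
    return plate


plates = [synthetic_plate(s) for s in (1.0, 1.1, 0.9, 1.2)]
fp = fingerprint(plates)
print([round(v, 2) for v in fp])
```

Normalising within each row before averaging removes plate-to-plate intensity drift, which is why the four synthetic plates above yield the same fingerprint despite their different overall scales.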

[0178] FIG. 16 shows the fingerprints generated for all 10 conditioned media samples. This includes media collected from cells of “non-cancerous”, “primary tumour”, and “metastasised tumour” origins as indicated. Each conditioned media fingerprint is labelled as follows. A: NMuMg; B: HC11; C: EpH4; D: Yej; E: 113; F: 724; G: Yej-M1; H: Yej-M2; I: 113-M1; J: 113-M2.

[0179] FIG. 17 shows the fingerprints generated for combined “non-cancerous” (panel A), “primary tumour” (B) and metastasised tumour (C) derived conditioned media.

Results—Machine Learning Algorithms

[0180] Using machine learning techniques, we were able to categorise the cells as being of cancerous or non-cancerous origin with 65.5% accuracy. Taking this analysis a step further, a 3-way classification of non-cancer vs. primary cancer vs. metastasised cancer returned an accuracy of 47.5% (baseline “guessing” would return only 33%). Finally, focussing exclusively on primary and metastatic tumour-derived samples, the two could be distinguished with an accuracy of 67.1%. It is expected that with a larger dataset and further use of pattern recognition and artificial intelligence the accuracy will greatly improve going forward. Confusion matrices for each of these analyses are shown in FIGS. 18, 19 and 20.

[0181] FIG. 18 is the confusion matrix for 2-way prediction of healthy, non-cancerous cells, and cells originating from tumours.

[0182] FIG. 19 is the confusion matrix for 3-way prediction of healthy, non-cancerous cells, and cells originating from primary tumours and metastatic tumours.

[0183] FIG. 20 is the confusion matrix for 2-way prediction of cells originating from primary tumours and metastatic tumours.
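For readers unfamiliar with the representation, a confusion matrix and the corresponding accuracy can be computed as sketched below. The labels and predictions are invented purely to exercise the code and do not reproduce the data of FIGS. 18 to 20; the function names are likewise illustrative.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows index the true class, columns the predicted class."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[guess]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. correctly classified."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


classes = ["non-cancer", "primary", "metastatic"]
truth = ["non-cancer"] * 3 + ["primary"] * 3 + ["metastatic"] * 3
guess = ["non-cancer", "non-cancer", "primary",
         "primary", "primary", "metastatic",
         "metastatic", "metastatic", "non-cancer"]

cm = confusion_matrix(truth, guess, classes)
for name, row in zip(classes, cm):
    print(f"{name:>11}: {row}")
print(f"accuracy = {accuracy(cm):.3f}, chance = {1 / len(classes):.3f}")
```

Comparing the accuracy with the chance level of 1/(number of classes), as done for the 3-way result above, is strictly valid only when the classes are of similar size.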

Interrogating the Sensor Fingerprint In Vitro

[0184] Fractionation approaches can be used to interrogate the secretome of the primary tumour cells, and their metastatic variants, in order to inform which components are responsible for distinguishing the fingerprint of a non-cancer versus cancerous sample, and primary versus metastatic samples. A variety of approaches can be used to understand whether these distinguishing features are constituents of either exosomes, the water soluble compartment, or the lipid soluble compartment of the samples.

[0185] With respect to the exosome content, centrifugation of the samples at 100,000 g at 4° C. for 70 minutes can be used to isolate the exosomes from the conditioned media of the described cell lines. The sensor array can then be used to fingerprint the exosome-depleted samples, enabling us to understand whether or not the exosomes are a distinguishing factor in this analysis.

[0186] To the same end, we can also deplete secreted proteins from such samples to understand whether or not the secreted proteome is also a contributing factor. In this case, conditioned media are centrifuged at 300 g for 10 minutes at 4° C., supernatant collected and centrifuged at 2000 g for 10 minutes at 4° C., and supernatant then collected and centrifuged at 10,000 g for 30 minutes at 4° C. Consequent supernatant is then acidified to pH 5 with 10% TFA and 10 μL Strataclean (hydroxylated silica) beads added per 1 mL of media. The media/bead slurry is then vortexed for 1 minute and incubated overnight on a rotor wheel at 4° C. The beads, with the secreted proteins bound to them, are then collected by brief centrifugation, leaving the conditioned media depleted of proteins and available for fingerprinting by the sensor array according to the invention.

[0187] We also have the ability to isolate metabolites and lipids from such samples, and therefore to implement these approaches in this analysis. With regard to the metabolomics, metabolites are extracted in a polar solvent (50% methanol, 30% acetonitrile, 20% water) and centrifuged to precipitate and remove any proteins present. These extracts can then be applied to the sensor array to obtain a fingerprint for the non-cancer, primary and metastatic samples, whilst in parallel we use HILIC liquid chromatography (LC) coupled with high resolution Orbitrap mass spectrometry (Thermo Scientific) to profile the polar metabolites in these samples in an untargeted fashion. In reference to the lipid component of the secretome, lipids can be extracted in a two-step procedure by the Folch method. The biological samples are treated with a mixture of chloroform and methanol, forming bi-phasic layers, and the chloroform layer is then evaporated and reconstituted in a compatible organic solvent. We again have the ability to test the lipid extracts on the sensor array, whilst also profiling the contents of those samples in parallel to characterise any differences in the samples. In short, lipids are separated using reversed-phase (RP) liquid chromatography using C18 columns as well as mobile phase modifiers. We use two chromatographic methods to separate lipids:

[0188] The general lipidomics method separates lipid species using a gradient of solvents such as water, acetonitrile, and isopropanol, as well as ammonium formate as modifier. This method allows the identification of more than 20 lipid classes, including the triacylglycerol (TG), phosphatidyl ethanolamine (PE), phosphatidyl choline (PC), and ceramide (Cer) families.

[0189] The polar lipidomics method uses only water and methanol in the chromatographic gradient, and we use ammonia as modifier. This is useful when the intention is to analyse polar lipids that are not detected in the general method, such as lysophosphatidic acid (LPA).

[0190] We can then use high resolution Orbitrap mass spectrometry in separate polarity modes and data-dependent fragmentation acquisition (ddMS2), with lipid identification being dependent on both accurate mass and fragmentation patterns. Both of these methods will enable us to extract, fingerprint and define the metabolite and lipid composition of the samples.

Interrogating the Sensor Fingerprint In Vivo

[0191] The above approaches can also be applied to samples derived from our mouse models of cancer. We can test the sensor array's ability to distinguish between the serum of mice derived from different genetic backgrounds. We can apply the principles described above to whole and fractionated sera from mouse models of cancer, and to sera from healthy volunteers and cancer patients.

REFERENCES

[0192] Adams, M. M.; Anslyn, E. V. Journal of the American Chemical Society 2009, 131, 17068-17069.
[0193] Afonine, P. V.; Grosse-Kunstleve, R. W.; Echols, N.; Headd, J. J.; Moriarty, N. W.; Mustyakimov, M.; Terwilliger, T. C.; Urzhumtsev, A.; Zwart, P. H.; Adams, P. D. Acta Crystallographica Section D-Biological Crystallography 2012, 68, 352.
[0194] Battye, T. G. G.; Kontogiannis, L.; Johnson, O.; Powell, H. R.; Leslie, A. G. W. Acta Crystallographica Section D-Biological Crystallography 2011, 67, 271.
[0195] Collie, G. W.; Pulka-Ziach, K.; Lombardo, C. M.; Fremaux, J.; Rosu, F.; Decossas, M.; Mauran, L.; Lambert, O.; Gabelica, V.; Mackereth, C. D.; Guichard, G. Nature Chemistry 2015, 7, 871-878.
[0196] Chen, V. B.; Arendall, W. B.; Headd, J. J.; Keedy, D. A.; Immormino, R. M.; Kapral, G. J.; Murray, L. W.; Richardson, J. S.; Richardson, D. C. Acta Crystallographica Section D-Biological Crystallography 2010, 66, 12.
[0197] Diehl, K. L.; Ivy, M. A.; Rabidoux, S.; Petry, S. M.; Müller, G.; Anslyn, E. V. Proceedings of the National Academy of Sciences of the USA 2015, 112, E3977-E3986.
[0198] Donadelli M. The cancer secretome and secreted biomarkers. Semin Cell Dev Biol. 2018:78:1-2.
[0199] Emsley, P.; Cowtan, K. Acta Cryst. D 2004, 60, 2126.
[0200] Evans, P. R.; Murshudov, G. N. Acta Crystallographica Section D-Biological Crystallography 2013, 69, 1204.
[0201] Fletcher, J. M. et al. ACS Synthetic Biology 2012, 1, 240-250.
[0202] Ghanem, E.; Afsah, S.; Fallah, P. N.; Lawrence, A.; LeBovidge, E.; Raghunathan, S.; Rago, D.; Ramirez, M. A.; Telles, M.; Winkler, M.; Schumm, B.; Makhnejia, K.; Portillo, D.; Vidal, R. C.; Hall, A.; Yeh, D.; Judkins, H.; Ataide da Silva, A.; Franco, D. W.; Anslyn, E. V. ACS Sensors 2017, 2, 641-647.
[0203] Hanahan D, Weinberg R A. The hallmarks of cancer. Cell. 2000; 100(1):57-70.
[0204] Hanahan D, Weinberg R A. The hallmarks of cancer: the next generation. Cell. 2011:144(5):646-74.
[0205] Ivy, M. A.; Gallagher, L. T.; Ellington, A. D.; Anslyn, E. V. Chemical Science 2012, 3, 1717-2176.
[0206] Koronakis, V.; Sharff, A.; Koronakis, E.; Luisi, B.; Hughes, C. Nature 2000, 405, 914-919.
[0207] Kubarych, C. J.; Adams, M. M.; Anslyn E. V. Organic Letters 2010, 12, 4780-4783.
[0208] Liotta L A, Ferrari M, Petricoin E. Clinical proteomics: written in blood. Nature. 2003; 425:905. Tjalsma H, Bolhuis A, Jongbloed J D, Bron S, van Dijl J M. Signal Peptide-Dependent Protein Transport in Bacillus subtilis: a Genome-Based Survey of the Secretome. Microbiol Mol Biol Rev. 2000; 64:515-547.
[0209] Lombardo, C. M.; Collie, G. W.; Pulka-Ziach, K.; Rosu, F.; Gabelica, V.; Mackereth, C. D.; Guichard, G. Journal of the American Chemical Society 2016, 138, 10522-10530.
[0210] Malashkevich, V. N.; Kammerer, R. A.; Efimov, V. P.; Schulthess, T.; Engel, J. Science 1996, 274, 761-765.
[0211] Meusch, D. et al. Nature 2014, 508, 61-65.
[0212] Moriarty, N. W.; Grosse-Kunstleve, R. W.; Adams, P. D. Acta Crystallographica Section D-Biological Crystallography 2009, 65, 1074.
[0213] Novo D, Heath N, Mitchell L, Caligiuri G, MacFarlane A, Reijmer D, Charlton L, Knight J, Calka M, McGhee E, Dornier E, Sumpton D, Mason S, Echard A, Klinkert K, Secklehner J, Kruiswijk F, Vousden K, Macpherson I R, Blyth K, Bailey P, Yin H, Carlin L, Morton J, Zanivan S, Norman J. Nat Commun. 2018: 9: 5069.
[0214] Pai, J.; Yoon, T.; Kim, N. D.; Lee, I. S.; Yu, J.; Shin, I. Journal of the American Chemical Society 2012, 134, 19287-19296.
[0215] Rhys, G.; Wood, C.; Lang, E.; Mulholland, A.; Brady, R.; Thomson, A.; Woolfson, D. Nature Communications 2018, 9; 4132.
[0216] Strong, M.; Sawaya, M. R.; Wang, S. S.; Phillips, M.; Cascio, D.; Eisenberg, D. Proceedings of the National Academy of Sciences of the United States of America 2006, 103, 8060.
[0217] Sun, L. et al. Nature 2014, 505, 432-435.
[0218] Thomas, F.; Dawson, W.; Lang, E.; Burton, A.; Bartlett, G.; Rhys, G.; Mulholland, A.; Woolfson, D. ACS Synth. Biol. 2018, 7, 1808-1816.
[0219] Thomson, A. R.; Wood, C. W.; Burton, A. J.; Bartlett, G. J.; Sessions, R. B.; Brady, R. L.; Woolfson, D. N. Science 2014, 346, 485-488.
[0220] Umali, A. P.; Anslyn, E. V. Curr. Op. Chem. Biol 2010, 14, 685-692.
[0221] Umali, A. P.; Ghanem, E.; Hopfer, H.; Hussain, A.; Kao, Y.; Zabanal, L. G.; Wilkins, B. J.; Hobza, C.; Quach, D. K.; Fredell, M.; Heymann, H.; Anslyn, E. V. Tetrahedron 2015, 71, 3095-3099.
[0222] Winn, M. D.; Ballard, C. C.; Cowtan, K. D.; Dodson, E. J.; Emsley, P.; Evans, P. R.; Keegan, R. M.; Krissinel, E. B.; Leslie, A. G. W.; McCoy, A.; McNicholas, S. J.; Murshudov, G. N.; Pannu, N. S.; Potterton, E. A.; Powell, H. R.; Read, R. J.; Vagin, A.; Wilson, K. S. Acta Crystallographica Section D-Biological Crystallography 2011, 67, 235.
[0223] You, L.; Zha, D.; Anslyn, E. V. Chemical Reviews 2015, 115, 7840-7892.
[0224] Zaccai, N. R.; Chi, B.; Thomson, A. R.; Boyle, A. L.; Bartlett, G. J.; Bruning, M.; Linden, N.; Sessions, R. B.; Booth, P. J.; Brady, R. L.; Woolfson, D. N. Nature Chemical Biology 2011, 7, 935-941.
[0225] Zucker, F.; Champ, P. C.; Merritt, E. A. Acta Crystallographica Section D-Biological Crystallography 2010, 66, 889.

CPC classification

G01N33/57484 (PHYSICS)
G01N33/68 (PHYSICS)
G01N2800/56 (PHYSICS)
G01N21/6428 (PHYSICS)
G01N2021/6439 (PHYSICS)
G01N2470/12 (PHYSICS)
G01N33/574 (PHYSICS)
C07K7/06 (CHEMISTRY; METALLURGY)

International classification

G01N33/574 (PHYSICS)
G01N21/64 (PHYSICS)
