Biomarkers for detecting microbial infection

Abstract

The present invention provides specific peptide biomarkers and sets of peptide biomarkers for use in methods of detecting or identifying bacterial biomarkers in a sample, wherein said bacterial biomarkers can be used to detect Streptococcus pneumoniae, Staphylococcus aureus, Haemophilus influenzae, Escherichia coli, and/or Moraxella catarrhalis in a sample. Kits and diagnostic methods are also provided.

Claims

1. A non-naturally occurring set of peptides comprising 2-50 different peptides, wherein at least two peptides of the 2-50 different peptides are desalted and wherein the at least two peptides of the 2-50 different peptides consist of an amino acid sequence selected from one of the following groups: (i) SEQ ID NOs:52-71 and 168-197; (ii) SEQ ID NOs:36-51 and 104-135; (iii) SEQ ID NOs:1-17, 136-167, and 248; (iv) SEQ ID NOs:18-35 and 72-103; (v) SEQ ID NOs:198-247; (vi) SEQ ID NOs:52-71; (vii) SEQ ID NOs:36-51; (viii) SEQ ID NOs:1-17; (ix) SEQ ID NOs:18-35; (x) SEQ ID NOs:19, 36, 41, 42, 47, 52, 53, 249-268; or (xi) SEQ ID NOs:52, 60, 61, 171, 174, 181, 185, 269-274.

2. The non-naturally occurring set of peptides according to claim 1, wherein the at least two peptides of the 2-50 different peptides consist of an amino acid sequence selected from one of the following groups: (i) SEQ ID NOs:52-55 and 60; (ii) SEQ ID NOs:36, 37, 41, 42, and 48; (iii) SEQ ID NOs:1-5; or (iv) SEQ ID NOs:18-22.

3. A method of detecting or identifying a bacterial biomarker in a sample, the method comprising: detecting in the sample at least one desalted peptide of the non-naturally occurring set of peptides of claim 1.

4. The method according to claim 3, wherein the method is used to detect or identify one of Streptococcus pneumoniae, Staphylococcus aureus, Haemophilus influenzae, Escherichia coli, and Moraxella catarrhalis.

5. The method of claim 3, wherein said method is used to detect or identify two or more bacteria selected from Streptococcus pneumoniae, Staphylococcus aureus, Haemophilus influenzae, Escherichia coli, and Moraxella catarrhalis in the sample.

6. The method of claim 3, wherein the detecting comprises detecting at least two desalted peptides in the non-naturally occurring set of peptides.

7. The method of claim 3, wherein the detecting is carried out using mass spectrometry and/or an affinity reagent specific for the biomarker.

8. The method of claim 3, further comprising, prior to the detecting step, a cell lysis step and/or a proteolysis step.

9. The method of claim 3, further comprising, responsive to detecting in the sample the at least one desalted peptide, diagnosing bacterial infection.

10. A diagnostic kit for detecting specific biomarker peptides in a sample, wherein the kit comprises the non-naturally occurring set of peptides according to claim 1, wherein the kit further comprises a protease.

11. A non-naturally occurring peptide having an amino acid sequence consisting of one of SEQ ID NOs:1-71, wherein the non-naturally occurring peptide is desalted.

12. The non-naturally occurring peptide of claim 11, wherein the non-naturally occurring peptide has an amino acid sequence consisting of one of SEQ ID NOs:1-5, 18-22, 36, 37, 41, 42, 48, 52-55 and 60.

13. A pharmaceutical composition comprising the non-naturally occurring peptide according to claim 11.

Description

FIGURES

(1) The following drawings are provided to illustrate various aspects of the present inventive concept and are not intended to limit the scope of the present invention unless specified herein.

(2) FIG. 1. Exemplary workflow for generating peptide biomarker database identified by MS, followed by ranking by number of hits to generate a preliminary inclusion list, i.e. a first set of peptide biomarkers.

(3) FIG. 2. Exemplary workflow for experimentally verifying and revising an inclusion list (set of peptide biomarkers) involving spiking of negative samples and ranking of peptides according to sensitivity of detection.

(4) FIG. 3. Exemplary workflow for experimentally verifying and revising an inclusion list. The workflow includes analysing clinical samples, exemplified by nasal swabs, followed by bioinformatics processing using a targeted approach (“inclusion list only” mode or “inclusion list plus pick others” mode) and optionally also an open (TCUP) approach, followed by revision of the inclusion list. The revision of the inclusion list may include verifying any peptide biomarkers of the inclusion list on the basis of their detection in the clinical sample; removing any peptide biomarkers on the basis of their non-detection in the clinical sample; and/or adding new biomarkers on the basis of their detection in the clinical sample.

(5) FIG. 4. Comparison of dilution series for detecting certain peptide species using a targeted (here: “inclusion list plus pick others” mode) and a normal MS analysis approach.

(6) FIG. 5. Identification of species unique peptides in clinical samples using parallel reaction monitoring (PRM) methods and comparison to results from cultures.

(7) FIG. 6. Exemplary workflow of generating an inclusion list (set of peptide biomarkers) using spiked negative samples for a first revision of the preliminary inclusion list and clinical samples for a further revision of the (revised) inclusion list. In the first step, thousands of peptides (more than 15000 peptides) in cultures from the different microbes/bacteria (Sp, Hi, Mc, Sa) are identified. In the second step, negative clinical samples are spiked with the respective microbes/bacteria in order to mimic clinical samples, thus reducing the list to hundreds instead of thousands of candidate peptide biomarkers. Finally, validation of the peptides is performed by analysing real positive clinical samples, resulting in a final inclusion list of peptide biomarkers.

(8) FIG. 7. Direct analyses of clinical respiratory tract samples using PRM, targeting the most promising peptides. The peptide peaks are labeled with numbers corresponding to their sequences from Table 10 (Staphylococcus aureus), Table 11 (Moraxella catarrhalis), Table 12 (Haemophilus influenzae) and Table 13 (Streptococcus pneumoniae).

TABLES

(9) TABLE-US-00001 TABLE 1 Bacterial strains selected from the Culture Collection University of Gothenburg (CCUG). The strains were selected to represent the genetic diversity within each species. Haemophilus Moraxella Streptococcus Staphylococcus influenzae catarrhalis pneumoniae aureus CCUG 23945T CCUG 353T CCUG 28588T CCUG 41582T CCUG 35273 CCUG 34455 CCUG 35272 CCUG 62707 CCUG 4559 CCUG 63408 CCUG 1350 CCUG 68900 CCUG 60440 CCUG 18284 CCUG 11780 CCUG 62271 CCUG 26214 CCUG 56314 CCUG 33774 CCUG 1964 CCUG 23969 CCUG 36757 CCUG 7206 CCUG 39740 CCUG 33775 CCUG 41836 CCUG 35180 CCUG 49245 CCUG 32226 CCUG 18283 CCUG 64138 CCUG 9188 CCUG 27321 CCUG 1979 CCUG 33013 CCUG 69160 CCUG 63533 CCUG 62274 CCUG 1988 CCUG 1914

(10) TABLE-US-00002 TABLE 2 Species discriminatory peptide biomarkers found in dilution series (10^4).
H. influenzae:
GVAADAISATGYGK; SEQ ID NO: 36
AVVYNNEGTNVELGGR; SEQ ID NO: 41
FGQGEAPVVAAPEVVSK; SEQ ID NO: 249
DGQVTGALATLGEPYK; SEQ ID NO: 250
LSVIAEQSNSTR; SEQ ID NO: 47
YDANNIIAGIAYGR; SEQ ID NO: 42
M. catarrhalis:
VGDEIEIIGIKPTAK; SEQ ID NO: 251
TDEQLQAELDNK; SEQ ID NO: 252
GLITNSIENTNNITK; SEQ ID NO: 253
QIVSNAGDEASVIVNEVK; SEQ ID NO: 19
NTIEGENSVAIGSNNTVK; SEQ ID NO: 254
QQTEAIDALNK; SEQ ID NO: 255
S. pneumoniae:
VQYEGGTEDELIR; SEQ ID NO: 256
VSDVAESTGEFTSEQFEK; SEQ ID NO: 52
GLDVTDEEGDDVTNGIFVGAK; SEQ ID NO: 257
SQTEQGEINIER; SEQ ID NO: 258
PAPAPQPAPAPKPEK; SEQ ID NO: 259
AEADKPETEAGKER; SEQ ID NO: 260
GAANGVVSHENTR; SEQ ID NO: 53
AEGVATASETAEAASAAKPEEK; SEQ ID NO: 261
S. aureus (10^3):
ITYTMIGDPSQTITR; SEQ ID NO: 262
AILNNENNVLNVSIQLDGQYGGHK; SEQ ID NO: 263
GLEVGQIVESGAEADIK; SEQ ID NO: 264
TVEVDGYNAIQVGFEDK; SEQ ID NO: 265
SINPADTSQVIANASK; SEQ ID NO: 266
GGLTDTFTNAFSSGNNVTQGVSVEVGEK; SEQ ID NO: 267
NFDVLDEATGLAQR; SEQ ID NO: 268

(11) TABLE-US-00003 TABLE 3 Results of Example 2
Peptide; SEQ ID NO; Present; Putative protein from which peptide is potentially derived
AEKPADQQAEEDYAR; SEQ ID NO: 181; YES; hypothetical protein
AIEAGQTVDFSDLAIK; SEQ ID NO: 185; YES; glycerol facilitator-aquaporin
AIENPFAVEVADVETEK; SEQ ID NO: 269; YES; phosphoglucomutase
AVVVNPESTGVAIEEK; SEQ ID NO: 270; YES; alcohol dehydrogenase
IDTFGTGTVAESQLEK; SEQ ID NO: 171; YES; IS1380-Spn1 transposase
ILDLNEEEGRVSLSIK; SEQ ID NO: 271; YES; 30S ribosomal protein S1
LLAGADPDDGTEVIEAK; SEQ ID NO: 174; YES; pneumococcal surface protein A
LYQNAEEVINK; SEQ ID NO: 272; YES; NADH oxidase
LYQNAEEVINKLSDK; SEQ ID NO: 273; YES; NADH oxidase
NLPVGSDGTFTPEDYVGR; SEQ ID NO: 61; YES; methionyl-tRNA synthetase
NVEIIEDDKQGVIR; SEQ ID NO: 60; YES; 30S ribosomal protein S8
VSDVAESTGEFTSEQFEK; SEQ ID NO: 52; YES; general stress protein
VSDVTTLEEARPATTPSSPNVR; SEQ ID NO: 274; YES; capsular polysaccharide biosynthesis protein Wzd

(12) TABLE-US-00004 TABLE 4 E. coli strains selected from the Culture Collection University of Gothenburg (CCUG). The strains were selected to represent the genetic diversity within each species.
CCUG 24T; Human urine, cystitis; Sweden
CCUG 65156; A; ECOR4; Human faeces, healthy woman; USA (Iowa)
CCUG 65157; B1; ECOR29; Kangaroo faeces, healthy; USA (Nevada)
CCUG 65158; C; ECOR70; Gorilla faeces, healthy; USA (Washington)
CCUG 65159; D; ECOR37; Marmoset faeces, healthy; USA (Washington)
CCUG 65160; E; ECOR47; Sheep faeces, healthy; New Guinea
CCUG 65161; F; ECOR39; Human faeces, healthy woman; Sweden
CCUG 65162; B2; ECOR53; Human faeces, healthy woman; USA (Iowa)
CCUG 65163; B2; ECOR56; Human urine, woman; Sweden
ATCC 35320; A; ECOR1; Human, woman; USA (Iowa)
ATCC 35345; B1; ECOR26; Human, infantis; USA (Massachusetts)
ATCC 35359; D; ECOR40; Human, woman; Sweden
ATCC 35379; B2; ECOR60; Human, woman; Sweden

(13) TABLE-US-00005 TABLE 5 Moraxella catarrhalis biomarker peptides SEQ ID NO SEQ ID NO: 27 1 SQIYQTTASVSGAR SEQ ID NO: 22 2 VDATVDAQNPTK SEQ ID NO: 28 3 LLNETTGQVVPK SEQ ID NO: 20 4 AIAQVGSISANSDATIGELISK SEQ ID NO: 72 5 AQYDITQNAGTER SEQ ID NO: 21 6 ELSNTAAETQPK SEQ ID NO: 19 7 QIVSNAGDEASVIVNEVK SEQ ID NO: 18 8 VVLAGDTVVSDR SEQ ID NO: 26 9 YVVEGANMPLDAQAIDIVR SEQ ID NO: 31 10 GLPVSNSGAPISVPVGQATLGR SEQ ID NO: 29 11 SSENVVVVSVR SEQ ID NO: 25 12 THTSALAEENQQASIPR SEQ ID NO: 24 13 FNATAALGGYGSK SEQ ID NO: 33 14 AVATQQATVSAEYLQK SEQ ID NO: 73 15 TFHVGGAASAASVDNSVSVGNAGSVR SEQ ID NO: 35 16 LGAQEAELVSNSK SEQ ID NO: 23 17 QSDVGQLTGK SEQ ID NO: 32 18 VNYNGDTDTVTLSGVAK SEQ ID NO: 34 19 ADSGLSESEIEEMIR SEQ ID NO: 30 20 AISYGNSADAQPYVGAK SEQ ID NO: 74 21 ANLDTSTEEAR SEQ ID NO: 75 22 ASSENTQNIAK SEQ ID NO: 76 23 DADAVEAGQVIAK SEQ ID NO: 77 24 FAATADAITK SEQ ID NO: 78 25 LNTQGASFDYPVASNATEQGR SEQ ID NO: 79 26 NKADADASFETLTK SEQ ID NO: 80 27 AKLESLTEDMVAR SEQ ID NO: 81 28 IITNNHAIALNLAAEGYGIAK SEQ ID NO: 82 29 ILADIAMHDAAAFTAITEK SEQ ID NO: 83 30 IYRPEIYNANSVAGQIYK SEQ ID NO: 84 31 LDITETTDDSR SEQ ID NO: 85 32 ALESNVEEGLLDLSGR SEQ ID NO: 86 33 EFYAAETLPAESR SEQ ID NO: 87 34 MNIEQTLQSAEDTAR SEQ ID NO: 88 35 QDPANQEVYTK SEQ ID NO: 89 36 SVTATDNTTQATVIK SEQ ID NO: 90 37 VGVMAGPEQAVAEVAGQVAK SEQ ID NO: 91 38 AHIGLAQAQFPEGLASSQVDALAR SEQ ID NO: 92 39 ALATDYSHVVAPATTTGK SEQ ID NO: 93 40 ENTVIVDGAGDKASIEAR SEQ ID NO: 94 41 LPVDKETAPSDDATATTQFSR SEQ ID NO: 95 42 LTYTDGSDPGSYYR SEQ ID NO: 96 43 NLGAAVNEVTANEQSAEAKAPEDQQY SEQ ID NO: 97 44 SDALYVVEDSVK SEQ ID NO: 98 45 LADEGDIDVR SEQ ID NO: 99 46 LTQATAQASAPQGR SEQ ID NO: 100 47 LYPNDPTYQAASEK SEQ ID NO: 101 48 NQADIANNINNIYELAQQQDQHSSDIK SEQ ID NO: 102 49 SEVLDGMNSAYNPVVEDK SEQ ID NO: 103 50 SLENDLGVSLLHR

(14) TABLE-US-00006 TABLE 6 Haemophilus influenzae biomarker peptides SEQ ID NO SEQ ID NO: 36 1 GVAADAISATGYGK SEQ ID NO: 41 2 AVVYNNEGTNVELGGR SEQ ID NO: 37 5 ANLKPQAQATLDSIYGEMSQVK SEQ ID NO: 42 6 YDANNIIAGIAYGR SEQ ID NO: 38 7 ADSVANYFVAK SEQ ID NO: 48 8 SADLTNEVAVGDVVEAK SEQ ID NO: 104 9 FGGNAQQTAQLPR SEQ ID NO: 39 10 GSYEVLDGLDVYGK SEQ ID NO: 40 11 LSQERADSVANYFVAK SEQ ID NO: 51 12 AQYIVEQVIGQAR SEQ ID NO: 43 13 ATHNFGDGFYAQGYLETR SEQ ID NO: 105 14 GLSVGDQIQAGINSPIK SEQ ID NO: 45 15 QQVNGALSTLGYR SEQ ID NO: 50 16 TSPTQNLSLDAFVAR SEQ ID NO: 44 17 AVVYNNEGTKVELGGR SEQ ID NO: 47 18 LSVIAEQSNSTR SEQ ID NO: 49 19 SADLTSEVAVGDVVEAK SEQ ID NO: 46 20 YVPTNGNTVGYTFK SEQ ID NO: 106 21 ATGEINLDGENLLTTK SEQ ID NO: 107 22 ATHNLGDGFYAQGYLETR SEQ ID NO: 108 23 GIASGTEVSFGTYGLK SEQ ID NO: 109 24 GVAAIVTLSSTGR SEQ ID NO: 110 25 NNEGTNVELGGR SEQ ID NO: 111 26 TISDGITSAEDKEYGVLK SEQ ID NO: 112 27 AILPPQEIEQGTVK SEQ ID NO: 113 28 ATNLSAEQLNVTDASEK SEQ ID NO: 114 29 FKQTAPSNNEVENELTNEQLTK SEQ ID NO: 115 30 GIDGLVLGANYLLAQER SEQ ID NO: 116 31 IAEQSNSTIKDQK SEQ ID NO: 117 32 TAQFSTGGVYIDSR SEQ ID NO: 118 33 YAYVTLGNNTFGEVK SEQ ID NO: 119 34 ANLKPQAQATLDSVYGEISQVK SEQ ID NO: 120 35 AQQLSTDVKNK SEQ ID NO: 121 36 EITEDPAIYPSADILK SEQ ID NO: 122 37 GLKVENTNNPIQVPVGTK SEQ ID NO: 123 38 GVITVSAVGDQINPTLAR SEQ ID NO: 124 39 INATEGAATLTAESGK SEQ ID NO: 125 40 LSVIAEQSNSTADDQK SEQ ID NO: 126 41 LSVIAEQSNTTVDDQK SEQ ID NO: 127 42 LVSAQSGTESDNFGHIITK SEQ ID NO: 128 43 NEGTNVELGGR SEQ ID NO: 129 44 RAELEATAAANLAAAQAR SEQ ID NO: 130 45 SADLTSEVAVGDVVDAK SEQ ID NO: 131 46 SIIAEQSNSTIKDQK SEQ ID NO: 132 47 SVDLTSEVAVGDVVEAK SEQ ID NO: 133 48 TIADGITSAEDKEYGVLNNSK SEQ ID NO: 134 49 TIIGANLSQLTQNELSAGK SEQ ID NO: 135 50 TQTSTSIGFNAK

(15) TABLE-US-00007 TABLE 7 Staphylococcus aureus biomarker peptides SEQ ID NO SEQ ID NO: 2 1 QAGVGAAVVAELSER SEQ ID NO: 3 2 ELINNIQSGQR SEQ ID NO: 1 3 TVQPIDVDTIVASVEK SEQ ID NO: 4 4 LGISDGDVEETEDAPK SEQ ID NO: 5 5 ALLNNMVQGVSQGYVK SEQ ID NO: 8 6 ILAESPNLAISSSSR SEQ ID NO: 136 7 NALIIEDTGDNNVVK SEQ ID NO: 6 8 SNVNDATDYSSETPEGK SEQ ID NO: 10 9 ATEATNATNNQSTQVSQATSQPINFQVQK SEQ ID NO: 15 10 AEENGLTVVDAFNFEAPK SEQ ID NO: 16 11 EKANELLKDNAELIASFSR SEQ ID NO: 11 12 IHLVGDEIANGQGIGR SEQ ID NO: 7 13 ANNVATDANHSYTSR SEQ ID NO: 248 14 AQENGLTVVDAFNFEAPK SEQ ID NO: 137 15 DLSFGENYGVVMEELR SEQ ID NO: 17 16 LLGINATIVMPETAPQAK SEQ ID NO: 12 17 NISNNVLVTIDAAQGK SEQ ID NO: 9 18 NVVEIPLNDEEQSK SEQ ID NO: 14 19 SQGVSEEELNESIDR SEQ ID NO: 13 20 TAKPVAEVESQTEVTE SEQ ID NO: 138 21 TPTEQTKPVQPK SEQ ID NO: 139 22 VMGVDYVSNITEAR SEQ ID NO: 140 23 YLGDEEISVSELK SEQ ID NO: 141 24 AEAQANQMVGDAVEK SEQ ID NO: 142 25 ANELLKDNAELIASFSR SEQ ID NO: 143 26 ATDAENVEKEEAITK SEQ ID NO: 144 27 AVAGAAGGADAAAEK SEQ ID NO: 145 28 ELINGVFTDINPYIK SEQ ID NO: 146 29 HIGTPGEVLEPGQQVNVK SEQ ID NO: 147 30 KAQSEQDQAFLSK SEQ ID NO: 148 31 MIAVLIPDDGSGK SEQ ID NO: 149 32 NAGIGSGFSNDMYEKEGAK SEQ ID NO: 150 33 QNLPVLDVPEDVVEEGVR SEQ ID NO: 151 34 VVITAQTINEETEPELYDAEGNLINNSK SEQ ID NO: 152 35 ADSGTVIQAISK SEQ ID NO: 153 36 ATIDGLQNLKNAEDVAK SEQ ID NO: 154 37 DSDIATTATKVELATK SEQ ID NO: 155 38 FIAETYLDDVEQFNTVR SEQ ID NO: 156 39 FIEETPELFDIQPSLDR SEQ ID NO: 157 40 GLWNENKENEVIER SEQ ID NO: 158 41 IFSEVEPNPSTNTVYK SEQ ID NO: 159 42 LAEQKATDAENVEKEEA SEQ ID NO: 160 43 LAVNEMLNAIQNK SEQ ID NO: 161 44 LNDVEQTNTPGSLNPK SEQ ID NO: 162 45 MQEVGVTAISGETIIK SEQ ID NO: 163 46 NLSEQGINEATR SEQ ID NO: 164 47 NMLPEVKPSSEVYGK SEQ ID NO: 165 48 NVKDNAIVLEAISGADVNDSTSAPVDDVDFTSDIGKDIK SEQ ID NO: 166 49 QNLPVLDVPEDVVEEGVRK SEQ ID NO: 167 50 SGADVNDSTSAPVDDVDFTSDIGKDIK

(16) TABLE-US-00008 TABLE 8 Streptococcus pneumoniae biomarker peptides SEQ ID NO SEQ ID NO: 52 1 VSDVAESTGEFTSEQFEK SEQ ID NO: 53 2 GAANGVVSHENTR SEQ ID NO: 54 3 EEAPVASQSK SEQ ID NO: 60 4 NVEIIEDDKQGVIR SEQ ID NO: 55 5 SADQQAEEDYAR SEQ ID NO: 61 6 NLPVGSDGTFTPEDYVGR SEQ ID NO: 65 7 AVAAADAADAGAAK SEQ ID NO: 63 8 DIGLANDGSIVGINYAK SEQ ID NO: 67 9 TLSPEEYAVTQENQTER SEQ ID NO: 56 10 APLQSELDTK SEQ ID NO: 66 11 GQDWVIAAEVVTKPEVK SEQ ID NO: 62 12 TLELEIAESDVK SEQ ID NO: 64 13 IAELEYEVQR SEQ ID NO: 70 14 IGVISVVEDGDEALAK SEQ ID NO: 68 15 KDEAEAAFATIR SEQ ID NO: 57 16 LKEIDESDSEDYVK SEQ ID NO: 69 17 SQPSSETELSGNKQEQER SEQ ID NO: 71 18 VAYFNEIDTYSEVK SEQ ID NO: 58 19 AKLEEAEKKATEAK SEQ ID NO: 59 20 AVNEPEKPAEESENPAPAPK SEQ ID NO: 168 21 DVPENLITAVVQSNK SEQ ID NO: 169 22 EAEANFNTEQAK SEQ ID NO: 170 23 EIDESDSEDYLKEGLR SEQ ID NO: 171 24 IDTFGTGTVAESQLEK SEQ ID NO: 172 25 LKEIDESDSEDYVKEGFR SEQ ID NO: 173 26 LKEIDESDSEDYVKEGLR SEQ ID NO: 174 27 LLAGADPDDGTEVIEAK SEQ ID NO: 175 28 NGNYETAEGSEETSSEVK SEQ ID NO: 176 29 NTLLELGLDESQIK SEQ ID NO: 177 30 VAAGDLLVTADLNAIR SEQ ID NO: 178 31 VIPKETELATTK SEQ ID NO: 179 32 VVPEAEQLAETK SEQ ID NO: 180 33 AEKDYDAAMKNAEDAK SEQ ID NO: 181 34 AEKPADQQAEEDYAR SEQ ID NO: 182 35 AESTGEFTSEQFEK SEQ ID NO: 183 36 AGITYSEGLVFESK SEQ ID NO: 184 37 AGVVVVDNTSYFR SEQ ID NO: 185 38 AIEAGQTVDFSDLAIK SEQ ID NO: 186 39 ALTPEEVQKR SEQ ID NO: 187 40 AQNTESTVVQLNNGDVK SEQ ID NO: 188 41 DAEHAEEVAPQVK SEQ ID NO: 189 42 DIILAQTEENLTR SEQ ID NO: 190 43 DLENVETVIEKEDVETNASNGQR SEQ ID NO: 191 44 EAGDQATYFDEIR SEQ ID NO: 192 45 EGFVKNVEIIEDDKQGVIR SEQ ID NO: 193 46 ELATQIYQVAR SEQ ID NO: 194 47 GSDGKQFYNNYNDAPLK SEQ ID NO: 195 48 GSIESMHNLPVNLAGAR SEQ ID NO: 196 49 IAEATKEVQQAYLAYQQASNESQR SEQ ID NO: 197 50 IGGGYAGQSGAIR

(17) TABLE-US-00009 TABLE 9 Escherichia coli biomarker peptides SEQ ID NO SEQ ID NO: 198 1 AIDDLVKGFEELDTSK SEQ ID NO: 199 2 ANSSTTTAAEPLK SEQ ID NO: 200 3 QVPILQKDDSR SEQ ID NO: 201 4 VFDVNEPLSQINQAK SEQ ID NO: 202 5 VPVFAGDTEDDITAR SEQ ID NO: 203 6 SVQTVTGQPDVDQVVLDEAIKNR SEQ ID NO: 204 7 LIAAAPTAVAPEESGFYAR SEQ ID NO: 205 8 NAEFLQAYGVAIADGPLK SEQ ID NO: 206 9 EIAFEELGSQAR SEQ ID NO: 207 10 AEVPSGTVLAEKQELVR SEQ ID NO: 208 11 APRPAPAPQAPAQNTTPVTK SEQ ID NO: 209 12 RTEPAAPVASTK SEQ ID NO: 210 13 SDTYGWQEDSTYIR SEQ ID NO: 211 14 SYEEELAKDPR SEQ ID NO: 212 15 RTEPAAPVASTKAPAATSTPAPK SEQ ID NO: 213 16 ADGINPEELLGNSSAAAPR SEQ ID NO: 214 17 IVQSPDVIPADSEAGR SEQ ID NO: 215 18 MAERPEVQDALSAEGLK SEQ ID NO: 216 19 NAEFLQAYGVAIADGPLKGLAAR SEQ ID NO: 217 20 QQAEVTEKAR SEQ ID NO: 218 21 APAATSTPAPK SEQ ID NO: 219 22 AFDSQTEDSSPAIGR SEQ ID NO: 220 23 PNELLNSLAAVK SEQ ID NO: 221 24 APAKESAPAAAAPAAQPALAAR SEQ ID NO: 222 25 MNAFDSQTEDSSPAIGR SEQ ID NO: 223 26 SGDLTAFEPELLKEHNAR SEQ ID NO: 224 27 SLSDTLEEVLSSSGEK SEQ ID NO: 225 28 NIPVELHVLLNDDAETPTR SEQ ID NO: 226 29 QAQINGLEMAFLSAEEKR SEQ ID NO: 227 30 QEAAPAAAPAPAAGVK SEQ ID NO: 228 31 SRLPQNITLTEV SEQ ID NO: 229 32 HLAKAPAKESAPAAAAPAAQPALAAR SEQ ID NO: 230 33 LTSSTATAATSKPVTSVASGPR SEQ ID NO: 231 34 NVEYLVVEAAGATR SEQ ID NO: 232 35 SDDMSMGLPSSAGEHGVLR SEQ ID NO: 233 36 VRYEQSVAEEAVVAPVVEETVAAEPIVQEAPAPR SEQ ID NO: 234 37 AVTNSPVVVALDYHNR SEQ ID NO: 235 38 EAPLAVELDHDKVMNMQVK SEQ ID NO: 236 39 IMSGNSETETQEVGFKER SEQ ID NO: 237 40 KRPEQPALATFAMPDVPPAPTPAEPAAPVVAPAPK SEQ ID NO: 238 41 SQPIFNDKQFQEALSR SEQ ID NO: 239 42 ALDLSAEEKAAVR SEQ ID NO: 240 43 ALEKVVGLQTEAPLKR SEQ ID NO: 241 44 EAAIQVSNVAIFNATTGK SEQ ID NO: 242 45 ETATTAPVQTASPAQTTATPAAGGK SEQ ID NO: 243 46 FSAVLEQGAIAAGSDNK SEQ ID NO: 244 47 LHHANDTDSFSATNVH SEQ ID NO: 245 48 NVEYLVVEAAGTTR SEQ ID NO: 246 49 SLEHEVTLVDDTLVR SEQ ID NO: 247 50 TNGSLNAAEATETLR

(18) TABLE-US-00010 TABLE 10 The most prominent species-unique peptides of S. aureus. The corresponding GenBank accession numbers and descriptions of the proteins are shown.
SEQ ID NO; Peptide sequence; Protein
SEQ ID NO: 1; TVQPIDVDTIVASVEK; AKJ16950.1 2-oxoisovalerate dehydrogenase
SEQ ID NO: 2; QAGVGAAVVAELSER
SEQ ID NO: 3; ELINNIQSGQR; AKJ17520.1 preprotein translocase subunit YajC
SEQ ID NO: 4; LGISDGDVEETEDAPK; AKJ17148.1 recombinase RecA
SEQ ID NO: 5; ALLNNMVQGVSQGYVK; AKJ18065.1 50S ribosomal protein L6
SEQ ID NO: 6; SNVNDATDYSSETPEGK; AKJ17216.1 transketolase
SEQ ID NO: 7; ANNVATDANHSYTSR; AKJ17623.1 hypothetical protein
SEQ ID NO: 8; ILAESPNLAISSSSR; AKJ16422.1 HAD family hydrolase
SEQ ID NO: 9; NVVEIPLNDEEQSK; AKJ16109.1 lactate dehydrogenase
SEQ ID NO: 10; ATEATNATNNQSTQVSQATSQPINFQVQK; AKJ16987.1 heme transporter IsdA
SEQ ID NO: 11; IHLVGDEIANGQGIGR; AKJ17576.1 pyruvate kinase
SEQ ID NO: 12; NISNNVLVTIDAAQGK
SEQ ID NO: 13; TAKPVAEVESQTEVTE; AKJ16406.1 DNA-directed RNA polymerase subunit beta′
SEQ ID NO: 14; SQGVSEEELNESIDR; AKJ16022.1 acetaldehyde dehydrogenase
SEQ ID NO: 15; AEENGLTVVDAFNFEAPK; AKJ18079.1 50S ribosomal protein L4
SEQ ID NO: 16; EKANELLKDNAELIASFSR; AKJ18460.1 fructose-1,6-bisphosphate aldolase
SEQ ID NO: 17; LLGINATIVMPETAPQAK; AKJ17317.1 threonine dehydratase

(19) TABLE-US-00011 TABLE 11 The most prominent species-unique peptides of M. catarrhalis. The corresponding GenBank accession numbers and descriptions of the proteins are shown.
SEQ ID NO; Peptide sequence; Protein
SEQ ID NO: 18; VVLAGDTVVSDR; WP_003666427.1 TonB-dependent receptor
SEQ ID NO: 19; QIVSNAGDEASVIVNEVK; WP_063454121.1 chaperonin GroEL
SEQ ID NO: 20; AIAQVGSISANSDATIGELISK
SEQ ID NO: 21; ELSNTAAETQPK; WP_003659702.1 30S ribosomal protein S1
SEQ ID NO: 22; VDATVDAQNPTK; WP_003660336.1 hypothetical protein
SEQ ID NO: 23; QSDVGQLTGK
SEQ ID NO: 24; FNATAALGGYGSK; WP_063454085.1 cell surface protein
SEQ ID NO: 25; THTSALAEENQQASIPR; WP_063454087.1 cell division protein FtsZ
SEQ ID NO: 26; YVVEGANMPLDAQAIDIVR; WP_049156084.1 NADP-specific glutamate dehydrogenase
SEQ ID NO: 27; SQIYQTTASVSGAR; WP_003657351.1 Ohr family peroxiredoxin
SEQ ID NO: 28; LLNETTGQVVPK; WP_003657987.1 DUF4377 domain-containing protein
SEQ ID NO: 29; SSENVVVVSVR; WP_063454071.1 electron transfer flavoprotein subunit beta
SEQ ID NO: 30; AISYGNSADAQPYVGAK; WP_003658939.1 porin family protein
SEQ ID NO: 31; GLPVSNSGAPISVPVGQATLGR; WP_003658974.1 F0F1 ATP synthase subunit beta
SEQ ID NO: 32; VNYNGDTDTVTLSGVAK; WP_003656943.1 peptidoglycan-binding protein LysM
SEQ ID NO: 33; AVATQQATVSAEYLQK; WP_003657125.1 ABC transporter substrate-binding protein
SEQ ID NO: 34; ADSGLSESEIEEMIR; WP_003669031.1 molecular chaperone DnaK
SEQ ID NO: 35; LGAQEAELVSNSK; WP_003660298.1 CTP synthase

(20) TABLE-US-00012 TABLE 12 The most prominent species-unique peptides of H. influenzae. The corresponding GenBank accession numbers and descriptions of the proteins are shown.
SEQ ID NO; Peptide sequence; Protein
SEQ ID NO: 36; GVAADAISATGYGK; WP_038441355.1 porin OmpA
SEQ ID NO: 37; ANLKPQAQATLDSIYGEMSQVK
SEQ ID NO: 38; ADSVANYFVAK
SEQ ID NO: 39; GSYEVLDGLDVYGK
SEQ ID NO: 40; LSQERADSVANYFVAK
SEQ ID NO: 41; AVVYNNEGTNVELGGR; WP_058222193.1 porin
SEQ ID NO: 42; YDANNIIAGIAYGR
SEQ ID NO: 43; ATHNFGDGFYAQGYLETR
SEQ ID NO: 44; AVVYNNEGTKVELGGR
SEQ ID NO: 45; QQVNGALSTLGYR
SEQ ID NO: 46; YVPTNGNTVGYTFK
SEQ ID NO: 47; LSVIAEQSNSTR
SEQ ID NO: 48; SADLTNEVAVGDVVEAK; WP_011272719.1 30S ribosomal protein S1
SEQ ID NO: 49; SADLTSEVAVGDVVEAK
SEQ ID NO: 50; TSPTQNLSLDAFVAR; WP_058222202.1 WP_050846043.1 ShlB/FhaC/HecB family hemolysin secretion/activation protein
SEQ ID NO: 51; AQYIVEQVIGQAR; WP_011272712.1 pyruvate dehydrogenase (acetyl-transferring), homodimeric type

(21) TABLE-US-00013 TABLE 13 The most prominent species-unique peptides of S. pneumoniae. The corresponding GenBank accession numbers and descriptions of the proteins are shown.
SEQ ID NO; Peptide sequence; Protein
SEQ ID NO: 52; VSDVAESTGEFTSEQFEK; WP_000064115.1 Asp23/Gls24 family envelope stress response protein
SEQ ID NO: 53; GAANGVVSHENTR
SEQ ID NO: 54; EEAPVASQSK; WP_001035310.1 hypothetical protein
SEQ ID NO: 55; SADQQAEEDYAR
SEQ ID NO: 56; APLQSELDTK
SEQ ID NO: 57; LKEIDESDSEDYVK
SEQ ID NO: 58; AKLEEAEKKATEAK
SEQ ID NO: 59; AVNEPEKPAEESENPAPAPK
SEQ ID NO: 60; NVEIIEDDKQGVIR; WP_000245505.1 30S ribosomal protein S8
SEQ ID NO: 61; NLPVGSDGTFTPEDYVGR; WP_001291372.1 methionine--tRNA ligase
SEQ ID NO: 62; TLELEIAESDVK; WP_000458177.1 hypothetical protein
SEQ ID NO: 63; DIGLANDGSIVGINYAK; WP_000927809.1 sugar ABC transporter substrate-binding protein
SEQ ID NO: 64; IAELEYEVQR; WP_001008677.1 Asp-tRNA(Asn)/Glu-tRNA(Gln) amidotransferase subunit GatB
SEQ ID NO: 65; AVAAADAADAGAAK; WP_001196960.1 50S ribosomal protein L7/L12
SEQ ID NO: 66; GQDWVIAAEVVTKPEVK; WP_000116461.1 trigger factor
SEQ ID NO: 67; TLSPEEYAVTQENQTER; WP_000998307.1 peptide-methionine (R)-S-oxide reductase
SEQ ID NO: 68; KDEAEAAFATIR; WP_001284361.1 thiol-activated toxin pneumolysin
SEQ ID NO: 69; SQPSSETELSGNKQEQER; WP_078148305.1 sialidase
SEQ ID NO: 70; IGVISVVEDGDEALAK; WP_000808063.1 elongation factor Ts
SEQ ID NO: 71; VAYFNEIDTYSEVK; WP_000685088.1 nucleotide sugar dehydrogenase

(22) TABLE-US-00014 TABLE 14 The five most prominent species-unique peptides of S. aureus. The corresponding GenBank accession numbers and descriptions of the proteins are shown.
TVQPIDVDTIVASVEK; SEQ ID NO: 1; AKJ16950.1 2-oxoisovalerate dehydrogenase
QAGVGAAVVAELSER; SEQ ID NO: 2
ELINNIQSGQR; SEQ ID NO: 3; AKJ17520.1 preprotein translocase subunit YajC
LGISDGDVEETEDAPK; SEQ ID NO: 4; AKJ17148.1 recombinase RecA
ALLNNMVQGVSQGYVK; SEQ ID NO: 5; AKJ18065.1 50S ribosomal protein L6

(23) TABLE-US-00015 TABLE 15 The five most prominent species-unique peptides of M. catarrhalis. The corresponding GenBank accession numbers and descriptions of the proteins are shown.
VVLAGDTVVSDR; SEQ ID NO: 18; WP_003666427.1 TonB-dependent receptor
QIVSNAGDEASVIVNEVK; SEQ ID NO: 19; WP_063454121.1 chaperonin GroEL
AIAQVGSISANSDATIGELISK; SEQ ID NO: 20
ELSNTAAETQPK; SEQ ID NO: 21; WP_003659702.1 30S ribosomal protein S1
VDATVDAQNPTK; SEQ ID NO: 22; WP_003660336.1 hypothetical protein

(24) TABLE-US-00016 TABLE 16 The five most prominent species-unique peptides of H. influenzae. The corresponding GenBank accession numbers and descriptions of the proteins are shown.
GVAADAISATGYGK; SEQ ID NO: 36; WP_038441355.1 porin OmpA
ANLKPQAQATLDSIYGEMSQVK; SEQ ID NO: 37
AVVYNNEGTNVELGGR; SEQ ID NO: 41; WP_058222193.1 porin
YDANNIIAGIAYGR; SEQ ID NO: 42
SADLTNEVAVGDVVEAK; SEQ ID NO: 48; WP_011272719.1 30S ribosomal protein S1

(25) TABLE-US-00017 TABLE 17 The five most prominent species-unique peptides of S. pneumoniae. The corresponding GenBank accession numbers and descriptions of the proteins are shown.
VSDVAESTGEFTSEQFEK; SEQ ID NO: 52; WP_000064115.1 Asp23/Gls24 family envelope stress response protein
GAANGVVSHENTR; SEQ ID NO: 53
EEAPVASQSK; SEQ ID NO: 54; WP_001035310.1 hypothetical protein
SADQQAEEDYAR; SEQ ID NO: 55
NVEIIEDDKQGVIR; SEQ ID NO: 60; WP_000245505.1 30S ribosomal protein S8

EXAMPLES

Example 1 - Biomarker Identification

(26) In order to identify candidate peptide biomarkers, several strains from each of the four target species, H. influenzae, M. catarrhalis, S. pneumoniae, and S. aureus, including the Type strain of each species, were selected to represent the genetic variability within the species (Table 1). Bacterial cells were grown, washed and prepared by bead beating as described in the examples and methods (FIG. 1).

(27) Digestion of bacterial lysates and generation of peptides was performed using the Lipid-based Protein Immobilization (LPI) methodology. Peptides were analyzed by LC-MS/MS and, subsequently, the tandem mass spectra were processed by a bioinformatics pipeline, TCUP, to discover species unique peptides, also described in the examples and methods. For S. pneumoniae, 7 strains were analyzed in triplicate (21 MS runs), resulting in 782 species unique peptide candidates found in at least one of the 21 MS runs. For H. influenzae, 9 strains were analyzed in triplicate (26 MS runs; 1 run failed), resulting in 2978 species unique peptide candidates found in at least one of the 26 MS runs. For M. catarrhalis, 11 strains were analyzed in triplicate (33 MS runs), resulting in 5810 species unique peptide candidates found in at least one of the 33 MS runs. For S. aureus, 13 strains were analyzed in triplicate (36 MS runs; 3 runs failed), resulting in 5847 species unique peptide candidates found in at least one of the 36 MS runs. From the sum of these species unique peptides, a targeted database containing 15417 peptides was created.
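By way of illustration only, the run counts and the database size quoted above can be checked with a short Python sketch. The figures are taken from the preceding paragraph; the variable and function names are hypothetical, and the summation assumes that the species-unique candidate sets do not overlap between species.

```python
# Figures quoted in the text: (strains, failed runs, species-unique candidates).
FIGURES = {
    "S. pneumoniae": (7, 0, 782),
    "H. influenzae": (9, 1, 2978),
    "M. catarrhalis": (11, 0, 5810),
    "S. aureus": (13, 3, 5847),
}
REPLICATES = 3  # each strain was analyzed in triplicate


def successful_runs(strains, failed):
    """Number of usable MS runs for one species."""
    return strains * REPLICATES - failed


ms_runs = {sp: successful_runs(n, f) for sp, (n, f, _) in FIGURES.items()}

# Assumption: candidates are species-unique, so the sets are disjoint
# and the database size is the plain sum.
database_size = sum(candidates for _, _, candidates in FIGURES.values())

print(ms_runs)
print(database_size)
```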

(28) The peptides within this database were ranked according to frequency of detection to generate a preliminary inclusion list (see FIG. 1).
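The ranking step may be sketched as follows. This is a minimal illustration, assuming that "frequency of detection" means the number of MS runs in which a peptide was identified; the function names and the example detection pattern are hypothetical (the peptide sequences themselves are taken from Table 8).

```python
from collections import Counter


def rank_by_detection_frequency(runs):
    """Rank peptides by the number of MS runs in which they were detected.

    runs: iterable of sets of peptide sequences, one set per MS run.
    Returns (peptide, run_count) pairs, most frequent first; ties are
    broken alphabetically so that the order is deterministic.
    """
    counts = Counter()
    for peptides in runs:
        counts.update(set(peptides))  # count each peptide once per run
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def preliminary_inclusion_list(runs, size):
    """Keep the `size` most frequently detected peptides."""
    return [peptide for peptide, _ in rank_by_detection_frequency(runs)[:size]]


# Hypothetical detection pattern over three runs.
example_runs = [
    {"VSDVAESTGEFTSEQFEK", "GAANGVVSHENTR"},
    {"VSDVAESTGEFTSEQFEK", "EEAPVASQSK"},
    {"VSDVAESTGEFTSEQFEK", "GAANGVVSHENTR"},
]
ranked = rank_by_detection_frequency(example_runs)
top_two = preliminary_inclusion_list(example_runs, 2)
```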

(29) The preliminary inclusion list was experimentally verified as set out below.

(30) In order to find biomarkers that were expressed in sufficient amounts and were most readily detected by the mass spectrometer, negative clinical samples were spiked with different numbers of cells per ml. Thus, spiked negative samples were used to evaluate which peptides should be included in an inclusion list (50 to 100 peptides per species). Negative samples were spiked with a range of cells, ranging from 1 million cells per ml down to 100 cells per ml.

(31) Removal of human biomass was performed by use of the MolYsis kit (Molzym, Germany) and in-solution digestion was performed using sodium deoxycholate (SDC), also described in the enclosed examples and methods. The samples were analysed via tandem MS and the tandem mass spectra were processed by a bioinformatics pipeline, TCUP, using the targeted database mentioned above (15417 peptides) or via an open, non-targeted approach.

(32) The peptides found in the most diluted spiked samples were considered to be promising as peptide biomarker candidates, due to a sufficient expression level and suitable properties for ionization, fragmentation and detection in the mass spectrometer (FIG. 2 and Table 2). On this basis, 100 peptides were selected as good candidate peptides per species, creating a revised inclusion list per species. Each inclusion list contained about 100 peptides and was divided into two lists of 50 for ease of handling.
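One possible way to express this selection is sketched below, assuming that each peptide is scored by the lowest spike concentration at which it was still detected. The function names and the placeholder peptide labels (PEP_A and so on) are hypothetical; only the limit of 100 peptides per species comes from the text.

```python
def lowest_detected_concentration(detections):
    """Map each peptide to the lowest spike level at which it was detected.

    detections: dict mapping spike concentration (cells/ml) to the set
    of peptides detected at that concentration.
    """
    lowest = {}
    for concentration, peptides in detections.items():
        for peptide in peptides:
            if peptide not in lowest or concentration < lowest[peptide]:
                lowest[peptide] = concentration
    return lowest


def select_candidates(detections, max_peptides=100):
    """Prefer peptides still detected in the most diluted samples."""
    lowest = lowest_detected_concentration(detections)
    ranked = sorted(lowest, key=lambda p: (lowest[p], p))
    return ranked[:max_peptides]


# Hypothetical detections for placeholder peptides.
example = {
    1_000_000: {"PEP_A", "PEP_B", "PEP_C"},
    10_000: {"PEP_A", "PEP_B"},
    100: {"PEP_A"},
}
lowest = lowest_detected_concentration(example)
best_two = select_candidates(example, max_peptides=2)
```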

(33) In the next step of the process, true positive clinical samples were analysed to verify/revise the inclusion lists (FIG. 3). As before, removal of human biomass was performed by use of the MolYsis kit (Molzym, Germany), bacteria were lysed using bead beating and in-solution digestion was performed using sodium deoxycholate (SDC), again described in the enclosed examples and methods. The clinical samples were analyzed using both an open approach (running all the raw files through TCUP) and a targeted approach (matching the raw data against the inclusion lists and/or the targeted database mentioned above, 15417 peptides). The benefit of the open approach is that it is not targeted, and thus peptides not present in the targeted database (15417 peptides) can be detected, whereas the drawback is lower sensitivity. The benefit of the targeted approach is higher sensitivity, but the drawback lies in the greater risk of false positives.

(34) Approximately 1600 MS analyses were carried out, including on approximately 500 clinical samples containing S. pneumoniae, H. influenzae, M. catarrhalis and S. aureus (as determined by traditional culture-dependent methods, including MALDI-TOF MS-based identification). This analysis was used to validate (or invalidate, as the case may be) the candidate peptide biomarkers in the inclusion lists mentioned above.

(35) The inclusion lists were revised as follows (see FIG. 3): If any peptide biomarkers identified in the true positive clinical samples were already present in the inclusion lists, this validated their relevance. If any peptide biomarkers present in the inclusion lists were not detected in any of the true positive clinical samples, they were removed from the inclusion lists, or were given a lower ranking.

(36) If any peptide biomarkers that were not present in the inclusion list, but were present in the targeted database (15417 peptides), were identified in the true positive clinical samples, their ranking was noted and they were then included in an updated inclusion list.
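The revision rules described above can be summarised in a short sketch. It is illustrative only: the function name, the dictionary keys and the placeholder peptide labels are hypothetical, and undetected peptides are simply dropped here, although the text also allows giving them a lower ranking instead.

```python
def revise_inclusion_list(inclusion, targeted_db, detected):
    """Apply the three revision rules to one species' inclusion list.

    inclusion:   ordered list of peptides currently on the inclusion list
    targeted_db: set of all peptides in the targeted database
    detected:    set of peptides found in true positive clinical samples
    """
    on_list = set(inclusion)
    # Rule 1: listed peptides that were detected are validated.
    validated = [p for p in inclusion if p in detected]
    # Rule 2: listed peptides never detected are removed (or down-ranked).
    not_detected = [p for p in inclusion if p not in detected]
    # Rule 3: detected database peptides not yet listed are added.
    added = sorted(p for p in detected if p in targeted_db and p not in on_list)
    return {
        "validated": validated,
        "removed_or_downranked": not_detected,
        "added": added,
        "revised": validated + added,
    }


# Hypothetical example with placeholder peptide labels.
result = revise_inclusion_list(
    inclusion=["PEP_A", "PEP_B", "PEP_C"],
    targeted_db={"PEP_A", "PEP_B", "PEP_C", "PEP_D"},
    detected={"PEP_A", "PEP_C", "PEP_D", "PEP_X"},
)
```

In this sketch, PEP_X is detected but lies outside the targeted database, so it is not added by the third rule; such peptides would only be found by the open approach.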

(37) Some peptides were ranked low in the first run based on bacterial cultures, e.g. found only in 1 out of 21 MS runs (in case of S. pneumoniae), but were nevertheless found in all clinical samples. This was likely due to different expression levels in the blood agar cultures as compared to actual clinical samples. For example, a virulence factor might be highly expressed in a clinical sample, whereas it is only moderately expressed in a blood agar culture.

(38) The same strategy is also being used to generate a peptide biomarker inclusion list for E. coli.

(39) Some peptides were also selected for PRM studies (FIGS. 5 and 7). Parallel reaction monitoring (PRM) is an ion monitoring technique based on high-resolution, high-precision mass spectrometry. The principle of this technique is comparable to SRM/MRM, but PRM is more convenient in assay development for absolute quantification of proteins and peptides. It is well suited to the quantification of multiple proteins in complex samples, with attomole-level detection. PRM is typically performed on a Q-Orbitrap, a representative quadrupole/high-resolution mass spectrometry platform. Unlike SRM, which monitors one transition at a time, PRM acquires a full scan of the product ions of each precursor ion, that is, it monitors all fragments of the precursor ion in parallel. First, the quadrupole (Q1) selects the precursor ion, usually with a selection window of m/z≤2; then, the precursor ion is fragmented in the collision cell (Q2); finally, the Orbitrap, which replaces Q3, scans all product ions with high resolution and high accuracy. PRM technology therefore has not only the targeted quantitative capability of SRM/MRM but also qualitative capability: (1) the mass accuracy can reach the ppm level, which eliminates background interference and false positives better than SRM/MRM and effectively improves the detection limit and sensitivity in a complex background; (2) all product ions are scanned, so there is no need to select ion pairs or optimize the fragmentation energy, which makes the assay easier to establish; and (3) the linear range is wider, increasing to 5-6 orders of magnitude. Using this approach, it was possible to verify the presence of a biomarker peptide in a clinical sample by observing the same retention time and the same set of fragment ions when comparing to an analysis of a bacterial culture containing the same peptide.
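The verification criterion in the last sentence (same retention time, same set of fragment ions) can be illustrated with a minimal sketch. The tolerances (10 ppm, 0.5 min), the function names and all m/z and retention-time values are assumptions chosen for illustration; they are not taken from the described experiments.

```python
def ppm_error(observed_mz, reference_mz):
    """Relative mass deviation in parts per million."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def fragments_match(observed, reference, tol_ppm=10.0):
    """True if every reference fragment has an observed ion within tol_ppm."""
    return all(
        any(abs(ppm_error(obs, ref)) <= tol_ppm for obs in observed)
        for ref in reference
    )


def peptide_confirmed(sample, culture, rt_tol_min=0.5, tol_ppm=10.0):
    """Compare a clinical-sample PRM trace with the culture reference.

    Both arguments are dicts with 'rt_min' (retention time in minutes)
    and 'fragments_mz' (list of product-ion m/z values).
    """
    same_rt = abs(sample["rt_min"] - culture["rt_min"]) <= rt_tol_min
    same_ions = fragments_match(sample["fragments_mz"], culture["fragments_mz"], tol_ppm)
    return same_rt and same_ions


# Hypothetical values for illustration only.
culture = {"rt_min": 23.4, "fragments_mz": [345.1234, 512.2345, 789.3456]}
sample_ok = {"rt_min": 23.5, "fragments_mz": [345.1236, 512.2349, 789.3450]}
sample_late = {"rt_min": 25.0, "fragments_mz": [345.1236, 512.2349, 789.3450]}
sample_off = {"rt_min": 23.5, "fragments_mz": [345.2000, 512.2349, 789.3450]}
```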

(40) Methods:

(41) Cultivation of Bacteria and Preparation of Samples

(42) In order to generate lists of candidate peptide biomarkers, approximately ten strains from each of the four target species, H. influenzae, M. catarrhalis, S. pneumoniae and S. aureus, including the Type strain of each species, were selected to represent the genetic variability within the species (Table 1).

(43) Bacterial strains were grown on Blood Agar medium. S. pneumoniae and M. catarrhalis were grown at 36° C. with 5% CO2 overnight and S. aureus at 37° C. overnight; H. influenzae was grown on chocolate agar medium under the same conditions as S. pneumoniae and M. catarrhalis. Bacterial biomass was collected and resuspended in phosphate-buffered saline (PBS). Bacterial densities were measured at A600 (A600=1 corresponding to 1×10^9 bacteria). For each experiment, equal amounts of bacterial biomass were established by adjusting the A600 to 1.0 in 1.0 ml of PBS. The bacterial biomass was washed with PBS three times by centrifuging the sample for 5 min at 12,000 g, discarding the supernatant, and resuspending the pellet in 1.0 ml of PBS. The bacteria were finally resuspended in 150 μl of PBS. The bacterial cell suspensions were transferred to 200-μl vials containing glass beads (Sigma-Aldrich, G1145). The bacterial cells were lysed by bead-beating, using a TissueLyser (Qiagen, 85220), with the following settings: frequency 1/25 s and 5 min. The bacterial lysates were frozen at −20° C. until analysis.
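The density normalization above is simple arithmetic and can be sketched as follows. The sketch assumes that A600 scales linearly with cell density and that the stated conversion (A600 = 1 corresponding to 1×10^9 bacteria) applies per millilitre; the function names are illustrative.

```python
# From the text: A600 = 1 corresponds to 1e9 bacteria (assumed here to be per ml).
CELLS_PER_ML_AT_A600_1 = 1e9


def estimated_cells(a600, volume_ml):
    """Approximate number of bacteria in a suspension, assuming linearity of A600."""
    return a600 * CELLS_PER_ML_AT_A600_1 * volume_ml


def stock_volume_for_target(a600_stock, target_a600=1.0, final_volume_ml=1.0):
    """Volume of stock suspension (ml) to bring to final_volume_ml with PBS
    so that the diluted suspension reads target_a600."""
    if a600_stock < target_a600:
        raise ValueError("stock is more dilute than the target; concentrate it first")
    return target_a600 * final_volume_ml / a600_stock
```

For example, a stock reading A600 = 2.5 would need 0.4 ml made up to 1.0 ml with PBS to reach A600 = 1.0.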

(44) Spiking of Negative Samples for Discovery of Candidate Peptide Biomarkers

(45) Clinical samples (respiratory tract nasopharyngeal and nasal swabs) deemed negative by culture and MALDI-TOF-MS were collected and spiked with cells of the Type strains of the four species H. influenzae, M. catarrhalis, S. pneumoniae and S. aureus. The spikes of added cells ranged from 1 million cells/ml down to 100 cells/ml.

(46) Clinical Sample Clean-Up Using MolYsis Kit

(47) The clinical samples (respiratory tract nasopharyngeal and nasal swabs), collected in Amies media, were supplemented with STGG (Skim milk, tryptone, glucose, glycerol) and frozen until processing.

(48) For removal of human biomass (mucus, cells and proteins), the MolYsis kit (MolYsis Basic5 kit, Molzym, Germany) was used according to the protocol provided by the supplier. The biomass was collected by centrifuging the samples at 15,000 g for 5 min in 1.5 ml Eppendorf tubes. The supernatant was discarded. The pellet was resuspended in 1 ml (500 μl SU buffer+500 μl PBS). CM buffer (250 μl) was added, and the sample was vortexed for 15 s and then allowed to stand at room temperature for 10 min. The samples were transferred to 2.0 ml tubes. If visible clusters of bacteria/mucus were present, the sample was pipetted up and down until they dissolved. DB1 buffer (250 μl) was added, and the sample was vortexed and then allowed to stand at room temperature for 15 min. If visible clusters of bacteria/mucus were present, the sample was pipetted up and down until they dissolved. The sample was centrifuged at 15,000 g for 10 min to collect the bacteria. The supernatant was discarded and the pellet saved. The pellet was resuspended in 1 ml RS buffer.

(49) The sample was centrifuged at 15,000 g for 5 min to collect the bacteria. The supernatant was discarded and the pellet saved. The pellet was resuspended in 1 ml PBS buffer. The sample was again centrifuged at 15,000 g for 5 min, the supernatant was discarded, and the bacteria were resuspended in 120 μl ammonium bicarbonate (20 mM, pH 8). The sample was subjected to bead beating in order to break the cells and release as many proteins as possible, making them accessible for digestion. Glass beads (Sigma-Aldrich, G1145) had already been placed in the vials. The bead beater used was a TissueLyser from Qiagen, with the following settings: frequency 1/25 s and continuous shaking for a total time of 5 min. The bead-beaten samples were frozen until analysis.

(50) Digestion of Clinical Samples Using In-Solution Digestion with Sodium Deoxycholate (SDC)

(51) Frozen samples were thawed. SDC was added to 1% from a 5% stock, and bead beating was repeated. Samples were removed from the glass beads and transferred to new tubes (1.5 ml). The remaining glass beads were rinsed with 100 μl of 1% SDC in ammonium bicarbonate (20 mM), and the rinse was added to the samples. Trypsin (2 μg/ml, 100 μl ammonium bicarbonate, 20 mM pH 8) was added, and the samples were digested for 8 h at 37 degrees Celsius. Formic acid (3 μl, neat) was subsequently added to remove SDC. Samples were centrifuged at 15,000 g for 10 min to pellet biomass/debris. The pellet was discarded and the supernatant (peptides) was transferred to a new tube (1.5 ml). Samples were kept frozen at −20 degrees Celsius until analysis.

(52) Peptide Analysis Using Tandem Mass Spectrometry

(53) The tryptic peptides were desalted on Pep Clean C18 spin columns (Thermo Fisher Scientific, Inc., Waltham, Mass.), according to the manufacturer's guidelines, dried, and reconstituted with 15 μl of 0.1% formic acid (Sigma-Aldrich) in 3% gradient-grade acetonitrile (Merck KGaA, Darmstadt, Germany). A 2.0 μl sample was injected with an Easy-nLC autosampler (Thermo Fisher Scientific) and analyzed using an interfaced Q Exactive hybrid mass spectrometer (Thermo Fisher Scientific). The peptides were trapped on a pre-column (45 × 0.075 mm inner diameter) and separated on a reversed-phase column, 200 × 0.075 mm, packed in-house with 3-μm Reprosil-Pur C18-AQ particles (Dr. Maisch, Ammerbuch, Germany). The nanoLC (liquid chromatography) gradient was run at 200 nl/min, starting at 7% acetonitrile (ACN) in 0.2% formic acid, increased to 27% ACN for 25 min, then increased to 40% ACN for 5 min, and finally to 80% ACN for 5 min and held at 80% ACN for 10 min. Electrospray ionization was applied under a voltage of 1.8 kV and a capillary temperature of 320° C. in data-dependent positive ion mode. Full scan (MS1) spectra were acquired in the Orbitrap over the m/z range 400-1600, with a charge range of 2-6, at a resolution of 70,000, until reaching an AGC target value of 1e6 at a maximum of 250 ms. MS/MS spectra were acquired, using higher energy collision dissociation, at 30% from m/z 110 for the 10 most abundant parent ions, at a resolution of 35,000, using a precursor isolation window of 2 Da until reaching an AGC target value of 1e5 during an injection time of 110 ms. Dynamic exclusion for 30 s after selection for MS/MS was enabled to allow for detection of as many precursors as possible.
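To illustrate how the MS1 settings above (m/z range 400-1600, charge range 2-6) relate to a given peptide, the following sketch computes the monoisotopic m/z of an unmodified peptide for each charge state and reports which charge states fall inside the scan window. The residue masses are standard monoisotopic values; the sequence PEPTIDE is a hypothetical placeholder, not one of the biomarker sequences, and the function names are illustrative.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056    # added once per peptide (free N- and C-termini)
PROTON = 1.007276   # added once per positive charge


def monoisotopic_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mz(sequence, charge):
    """m/z of the [M + zH]z+ ion."""
    return (monoisotopic_mass(sequence) + charge * PROTON) / charge


def charges_in_scan_range(sequence, mz_min=400.0, mz_max=1600.0, charges=range(2, 7)):
    """Charge states (default 2-6) whose m/z lies within the MS1 scan window."""
    return [z for z in charges if mz_min <= mz(sequence, z) <= mz_max]


# Hypothetical placeholder sequence, for illustration only.
print(charges_in_scan_range("PEPTIDE"))
```

Modified residues (for example, alkylated cysteine) would need their mass shifts added to the table; the sketch deliberately omits them.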

(54) TCUP—Typing and Characterization Using Proteomics

(55) The input to TCUP is a set of peptides identified from spectra generated by bottom-up tandem MS, specified as a file in FASTA format. TCUP is general and can be used with peptide data from any spectral matching software, including de novo methods (e.g. SEQUEST (18), X!Tandem (19, 20), TIDE (21), Mascot (22), PEAKS (23), PepNovo (24), and Lutefisk (25)). The output from TCUP is in Excel format and includes the following: 1) the relative abundances of all organisms identified in a sample at and below a user-specified taxonomic level; 2) specific genes in the reference genomes that are matched by peptides in the analysis; and 3) the relative abundances of identified antimicrobial resistance genes. TCUP is implemented in Python 3.5, and the code and usage documentation are freely available under the ISC license from the project's repository. After alignment to the translated reference genome sequences, each peptide is matched to zero, one, or multiple reference genomes. To remove matches that are too dissimilar and unlikely to contain any relevant information about the taxonomic affiliation, two filtering steps are applied. The first step requires matches to have an identity of at least 90% and a coverage of 100% (only complete peptide matches are considered); peptides shorter than six amino acids are also removed. In the second filtering step, all matches whose sequence identity is 5% or more below that of the best match for that peptide are discarded.
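The two filtering steps described above can be sketched as follows. This is not TCUP's actual code; the record layout (dicts with peptide, genome, identity and coverage keys) and the function name are assumptions, and "5% below" is interpreted here as five percentage points.

```python
from collections import defaultdict


def filter_matches(matches, min_identity=90.0, min_length=6, max_drop=5.0):
    """Apply the two filtering steps to peptide-to-genome matches.

    Each match is a dict with 'peptide' (sequence), 'genome' (identifier),
    'identity' and 'coverage' (both in percent). Layout is hypothetical.
    """
    # Step 1: complete matches only, identity >= 90%, peptide length >= 6.
    kept = [
        m for m in matches
        if len(m["peptide"]) >= min_length
        and m["coverage"] == 100.0
        and m["identity"] >= min_identity
    ]

    # Best identity observed for each peptide among the surviving matches.
    best = defaultdict(float)
    for m in kept:
        best[m["peptide"]] = max(best[m["peptide"]], m["identity"])

    # Step 2: discard matches 5 points or more below the peptide's best match.
    return [m for m in kept if best[m["peptide"]] - m["identity"] < max_drop]
```

A peptide with matches at 100% and 95% identity would therefore keep only the 100% match, whereas matches at 100% and 96% would both be kept.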

(56) After filtering, the remaining peptides are assigned to nodes in a taxonomic tree, using the lowest common ancestor algorithm (30). The taxonomic affiliation of a sample is then assigned based on the set of discriminative peptides, i.e. the peptides with a lowest common ancestor at a node that is at or below the user-specified taxonomic level. The taxonomic tree used in TCUP is based on the full NCBI Taxonomy (31) (taxdump downloaded Nov. 17, 2015), in which each reference genome is associated with a unique node. Our implementation extends the SQLite3 database used in the ETE3 package (32) with a table of taxonomic affiliations for all reference genome sequences included in the reference database.
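The lowest common ancestor assignment described above can be sketched on a toy taxonomy. The sketch is illustrative only: the tree is a heavily simplified stand-in for the NCBI Taxonomy, the child-to-parent dictionary layout and the function names are assumptions, and it does not reproduce the ETE3-based implementation.

```python
def lineage(node, parent):
    """Path from a node up to the root, starting with the node itself."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def lowest_common_ancestor(nodes, parent):
    """Deepest node shared by the lineages of all given nodes (non-empty input)."""
    nodes = list(nodes)
    common = lineage(nodes[0], parent)  # ordered from deepest to root
    for n in nodes[1:]:
        other = set(lineage(n, parent))
        common = [x for x in common if x in other]
    return common[0]


def is_discriminative(lca, level, rank, parent):
    """True if the LCA lies at or below the user-specified taxonomic level."""
    return any(rank.get(n) == level for n in lineage(lca, parent))


# Toy taxonomy (child -> parent); far coarser than the real NCBI tree.
PARENT = {
    "S. pneumoniae": "Streptococcus", "S. mitis": "Streptococcus",
    "S. aureus": "Staphylococcus",
    "Streptococcus": "Bacteria", "Staphylococcus": "Bacteria",
}
RANK = {
    "S. pneumoniae": "species", "S. mitis": "species", "S. aureus": "species",
    "Streptococcus": "genus", "Staphylococcus": "genus", "Bacteria": "domain",
}
```

With this toy tree, a peptide matching only S. pneumoniae genomes is discriminative at species level, whereas one matching both S. pneumoniae and S. mitis is assigned to the genus node and is discriminative only at genus level or above.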

Example 2—Detection of S. pneumoniae Through Detection of Peptide Biomarkers in a Clinical Sample

(57) Step 1. Ten respiratory tract samples (nasopharyngeal swabs) deemed positive for S. pneumoniae by traditional methods, including culturing and isolation of bacterial isolates, followed by MALDI-TOF-MS identification, were selected. The samples were in the form of swabs in commercial Amies media (Copan Diagnostics Inc).

(58) Step 2. 50% of the liquid Amies media (0.5 ml) was transferred to a cryotube and supplemented with STGG buffer (Skim milk, tryptone, glucose, glycerol) for storage at −20 degrees Celsius until analysis.

(59) Step 3. Human biomass was removed from the sample using the MolYsis kit (Molzym GmbH, Germany), according to the manufacturer's protocol.

(60) Step 4. The sample was homogenized by bead beating, and the bacterial proteins were subsequently digested into peptides using trypsin in a buffer supplemented with sodium deoxycholate (SDC).

(61) Step 5. The peptides were desalted and purified using C18 spin column clean-up. After drying in a speedvac, the peptides were resuspended in dilute formic acid.

(62) Step 6. The peptides were analyzed using LC-MS/MS, using the inclusion lists in a mode called "inclusion list plus pick others". In this mode, the MS instrument first looks only for the masses of the selected peptides in the inclusion list. If no masses matching the 50 peptides in the inclusion list are found during the MS instrument cycle time (milliseconds), the instrument looks for everything else and picks the ten most intense ion peaks (pick others).

(63) Step 7. The raw files were run through TCUP to match, identify and report the peptides in the samples. The results from this analysis are shown in Table 3. Thirteen peptides belonging to the inclusion list were identified among the ten clinical samples, thus resulting in a positive match for S. pneumoniae.
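The precursor selection logic of the "inclusion list plus pick others" mode can be sketched for a single instrument cycle as follows. This is a simplified illustration, not the instrument's actual control software; the function name, the peak representation and the 10 ppm matching tolerance are assumptions.

```python
def select_precursors(peaks, inclusion_mz, tol_ppm=10.0, top_n=10):
    """Choose precursors for MS/MS in one cycle.

    peaks: list of (m/z, intensity) tuples from the MS1 scan.
    inclusion_mz: m/z values of the peptides on the inclusion list.
    tol_ppm: assumed matching tolerance in ppm.
    """
    def on_inclusion_list(mz):
        return any(abs(mz - target) / target * 1e6 <= tol_ppm for target in inclusion_mz)

    # First pass: only masses that match the inclusion list.
    targeted = [p for p in peaks if on_inclusion_list(p[0])]
    if targeted:
        return targeted

    # Fallback ("pick others"): the top_n most intense peaks.
    return sorted(peaks, key=lambda p: p[1], reverse=True)[:top_n]
```

The fallback keeps the acquisition informative when no target is present, which is what allows peptides outside the inclusion list to be identified in the same run.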

Example 3—Example Showing a Panel of the Peptide Biomarkers for Performing Clinical Diagnostics

(64) Table 18 below shows an exemplary panel of the peptide biomarkers proposed for use in performing clinical diagnostics. Five different samples containing one or more of the four pathogens are analyzed. For each sample, a particular combination of the peptide biomarkers (identified by SEQ ID NOs) is shown. The peptide biomarkers are detected by any of the suggested methodologies, i.e. targeted MS approaches, antibody-based detection, or other suitable methodologies.

(65) TABLE 18

Sample | S. aureus | M. catarrhalis | H. influenzae | S. pneumoniae | Species identified
Sample 1 | SEQ ID NOs: 1, 2 | - | - | - | S. aureus
Sample 2 | - | SEQ ID NOs: 20, 21, 22 | SEQ ID NOs: 36, 37, 41, 42 | - | M. catarrhalis, H. influenzae
Sample 3 | - | SEQ ID NOs: 18, 19, 20 | - | SEQ ID NOs: 52, 53, 54, 55 | M. catarrhalis, S. pneumoniae
Sample 4 | - | - | SEQ ID NOs: 41, 42, 48 | SEQ ID NOs: 54, 55, 60 | H. influenzae, S. pneumoniae
Sample 5 | - | - | - | SEQ ID NOs: 52, 55 | S. pneumoniae
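The way such a panel could be read out can be sketched as a lookup from detected SEQ ID NOs to species. The range-to-species mapping below is an assumption inferred from the SEQ ID NO groups in the claims (1-17, 18-35, 36-51, 52-71) together with the species under which members of each group appear in Table 18; the function name and the min_peptides threshold are illustrative.

```python
# Assumed mapping, inferred from the claim groups and Table 18.
SPECIES_BY_SEQ_RANGE = {
    "S. aureus": range(1, 18),        # SEQ ID NOs: 1-17
    "M. catarrhalis": range(18, 36),  # SEQ ID NOs: 18-35
    "H. influenzae": range(36, 52),   # SEQ ID NOs: 36-51
    "S. pneumoniae": range(52, 72),   # SEQ ID NOs: 52-71
}


def species_detected(seq_ids, min_peptides=1):
    """Species for which at least min_peptides panel peptides were detected."""
    detected = set(seq_ids)
    return [
        species
        for species, id_range in SPECIES_BY_SEQ_RANGE.items()
        if sum(1 for i in detected if i in id_range) >= min_peptides
    ]


# Sample 2 of Table 18.
print(species_detected([20, 36, 21, 37, 22, 41, 42]))
```

Raising min_peptides would make the call more conservative; in Table 18 every species call is supported by at least two peptides.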

CPC classification: G01N2469/10; G01N33/6848; C07K14/212; C07K14/285; G01N33/56944; Y02A50/30; C07K14/3156; C07K14/315; C07K14/245; G01N33/56911

International classification: G01N33/569; C07K14/21; C07K14/245; C07K14/285; C07K14/315; G01N33/68
