HIV SEROSIGNATURES FOR CROSS-SECTIONAL INCIDENCE ESTIMATION

20220065857 · 2022-03-03

    Abstract

    Described are methods for estimating the cross-sectional incidence or duration of infection of a virus. Method steps include obtaining a biological sample with antibodies from a subject having a viral infection. The biological sample is mixed with two or more epitopes or peptides from the proteins of a virus responsible for the viral infection. The amount of antibody binding to the epitopes or peptides is quantified and the cross-sectional incidence or duration of infection of a virus is estimated.

    Claims

    1. A method of identifying the cross-sectional incidence or duration of infection for a virus comprising the steps of: obtaining a biological sample comprising antibodies from a subject who has one or more viral infections; mixing the biological sample with a plurality of epitopes or peptides of the proteins from one or more viruses responsible for the one or more viral infections; quantifying the amount of antibody binding to the plurality of epitopes or peptides of the proteins from the one or more viruses; and estimating the cross-sectional incidence or duration of infection for the one or more viruses.

    2. The method of claim 1 wherein the epitopes or peptides of the one or more virus responsible for the one or more viral infections are derived from, expressed in, or identified using a phage immunoprecipitation sequencing system (PhIP-Seq).

    3. The method of claim 1 wherein the epitopes or peptides of the one or more virus responsible for the one or more viral infections are derived from, expressed in, or identified using a VirScan assay.

    4. The method of claim 1 wherein the plurality of epitopes or peptides are modified by site-directed mutagenesis using alanine substitution or another method to alter the amino acid sequence of the peptides.

    5. The method of claim 1 wherein the one or more viruses is HIV.

    6. The method of claim 5 wherein the proteins are HIV proteins selected from the group comprising gp41, gp120, gag, and pol.

    7. The method of claim 5 wherein the plurality of epitopes or peptides are selected from the group consisting of SEQ ID:1 to SEQ ID:309.

    8. The method of claim 5 wherein the plurality of epitopes or peptides are selected from the group consisting of SEQ ID:1 to SEQ ID:309 in the range of two to twenty epitopes or peptides.

    9. The method of claim 5 wherein the one or more epitopes or peptides are selected from the group consisting of SEQ ID: 1 to SEQ ID: 309 in the range of between ten to one hundred epitopes or peptides.

    10. The method of claim 5 wherein the epitopes or peptides comprise SEQ ID:3, SEQ ID:22, SEQ ID:159 and SEQ ID:180.

    11. The method of claim 2 wherein the one or more viral infections is HIV subtype C.

    12. The method of claim 2 wherein the one or more viral infections is HIV subtype D.

    13. The method of claim 12 wherein the virus is selected from the group consisting of HIV, EBV, other viruses, or a combination thereof.

    14. The method of claim 1 wherein the epitopes or peptides are synthesized chemically.

    15. The method of claim 14 wherein the epitopes or peptides are used in an assay system that detects and/or quantifies the binding of antibodies to one or more epitopes or peptides, either individually or in a multiplex (multi-assay) format.

    16. The method of claim 15 wherein the assay system is selected from the group comprising an enzyme immunoassay, chemiluminescent assay, microparticle bead assay, electrochemiluminescent assay and a combination thereof.

    Description

    BRIEF DESCRIPTION OF THE DRAWINGS

    [0044] FIG. 1A-1E. Antibody reactivity to peptides spanning the HIV proteome.

    [0045] FIG. 1 illustrates the size and position of open reading frames (ORFs) in the HIV genome. Panels B-D are plotted relative to genomic coordinates for HIV (HXB2, NCBI #NC_001802), shown at the bottom of the figure. Panel B: The plot shows the number of peptide tiles encoded by the VirScan library at each position across the HIV genome. Panel C: The plot shows the average level of antibody binding (average z-score) for each peptide for the 403 samples in the discovery sample set; each dot represents antibody binding for a single peptide in the VirScan library. Panel D: The plot shows the percentage of study participants who had a high level of antibody binding for each peptide (z-score>10). Panel E: The figure shows a heat map of the level of antibody binding for peptides in the VirScan library as a function of duration of HIV infection. The position of peptides is shown on the x-axis; the duration of infection is shown on the y-axis. Z-scores are noted according to the color bar on the right; lighter colors (higher z-scores) indicate a higher level of antibody binding. For each sample, data are plotted in order of increasing z-scores, since many points were overlapping.

    [0046] Abbreviations: ORF: open reading frame; mo: months; yr: years; kb: kilobases.

    [0047] FIG. 2A-2C. Breadth of antibody reactivity

    [0048] FIG. 2 illustrates information related to antibody breadth. Panel A: The relationships between peptides that were highly enriched (z-score>10) are displayed as a network graph; data are from a single representative sample. Peptides (nodes) are indicated by circles. Darker red color indicates peptides with higher z-scores. Overlapping peptides that share amino acid sequences form clusters in the graph; the position of the peptides in HIV proteins is noted for each cluster (the HIV protein is listed first, followed by numbers that represent the range of amino acid positions of the N-termini of peptides in the cluster). Peptides are linked (connected by lines) if they share an identical sequence of at least seven consecutive amino acids. In this case, network graph analysis of 573 reactive peptides identified 45 unique peptide specificities (circles outlined in black), corresponding to an antibody breadth value of 45. Panels B and C: Antibody breadth is plotted as a function of duration of HIV infection. The top two graphs (Panel B) show breadth data for HIV peptides; the bottom two graphs (Panel C) show breadth data for EBV peptides. Each line represents results from a single study participant. The two graphs on the left show data for participants who did not start antiretroviral treatment in the GS Study (no ART, N=33); the two graphs on the right show data for participants who reported starting antiretroviral treatment (ART, N=24). Data from samples collected after treatment initiation are shown in red (on ART). Dark blue lines indicate the locally-weighted regression (lowess) curves for all participants in each graph.

    [0049] Abbreviations: Env: envelope; Pol: polymerase; Gag: group-specific antigen; Rev: HIV regulatory protein; Vpu: viral protein U; ART: antiretroviral treatment; mo: months; yr: years;

    [0050] EBV: Epstein Barr virus.

    [0051] FIG. 3. Relationship between changes in antibody breadth and time to ART.

    [0052] FIG. 3 illustrates time-to-event (survival) analysis for the outcome of time from HIV infection to antiretroviral treatment initiation (time to ART), comparing participants with declining vs. stable or increasing antibody breadth (shown in red and blue, respectively). The change in antibody breadth was calculated for the time period between 9 months and 2 years after HIV infection, using samples collected closest to these dates. The median sample collection times were 0.8 years for the visit 9 months after infection (range 0.55-0.98 years) and 1.5 years for the visit 2 years after infection (range 1.26-3.12 years); the median time to ART initiation was 3.34 years (range 1.16-6.35 years). Data from two participants were removed for this analysis (one did not have viral load data and one started ART<2 years after HIV infection). The survival curves are based on estimated hazard ratios (lines) with 95% confidence intervals (shaded areas). The number of participants at risk (Number at risk; not yet on ART) at each time point is shown below the graph for each participant group.

    [0053] Abbreviations: Ab: antibody; ART: antiretroviral therapy; Decr: decreasing antibody breadth; Non-Decr: stable or increasing antibody breadth.

    [0054] FIG. 4A-4B. Association of antibody binding and the duration of HIV infection.

    [0055] FIG. 4A illustrates data evaluating the association of antibody binding (normalized read counts) and the duration of HIV infection for 3,327 peptides in the VirScan library that had well-defined positions in the HIV genome. P-values were calculated using generalized estimation equations to account for the dependency between measurements over time from the same individual and were adjusted using the Bonferroni correction based on all 3,384 identified HIV peptides. The x-axis shows the position of each peptide and the y-axis shows the corresponding Bonferroni adjusted p-value. Black dots represent peptides where antibody binding was positively associated with the duration of infection (266 peptides with adjusted p-values<0.05); red dots represent peptides where antibody binding was negatively associated with the duration of infection (43 peptides with adjusted p-values<0.05). FIG. 4B shows the position of open reading frames (ORFs) in the HIV genome (reproduced from FIG. 1, Panel A).

    [0056] FIG. 5A-5D: Use of a 4-peptide model to predict duration of HIV infection.

    [0057] FIG. 5 illustrates data from the 4-peptide model. Four peptides were selected from the VirScan library that had the strongest independent association between antibody binding and the duration of HIV infection. This included two peptides that had increasing antibody binding over time, and two peptides that had decreasing antibody binding over time (Supplemental FIG. 2). Panels A-C: Data from these four peptides (normalized read counts) were summed to generate a composite antibody binding score for each of the 403 samples in the discovery sample set that was used to identify the four peptides (Table 1). The plots show the observed duration of HIV infection (y-axes) and the duration of HIV infection that was predicted using a simple linear regression model based on the composite antibody binding score for the four peptides (x-axes). In the graphs, each dot represents data from a single sample. The same data are plotted in Panels A-C. Red dots represent data obtained for samples collected after antiretroviral treatment (ART) initiation (Panel A), for samples with viral load<1,000 copies/mL (Panel B), and for samples with CD4 cell counts <350 cells/mm.sup.3 (Panel C). Panel D: The 4-peptide model described above was used to predict the duration of HIV infection in an independent sample set that included 72 samples from 32 participants in the GS Study (validation sample set, Table 1). Data were analyzed and plotted using the same methods used for Panels A-C. Red dots represent data obtained for samples with subtype D HIV. Correlation values are r=0.79 and r=0.64 for Panels A-C and D, respectively, under the assumption that data points are independent.

    [0058] FIG. 6A-6C. Peptide engineering.

    [0059] FIG. 6 illustrates antibody data for two representative parent peptides and their respective variant peptides generated by alanine scanning mutagenesis. High levels of antibody binding (z-scores>10) were observed in samples for all but one of the 57 participants for parent peptide A (98.2%) and for all 57 participants for parent peptide B. Panels A and B: These panels show heat maps of antibody binding for each set of peptides (the parent peptide and 54 variant peptides with triple alanine substitutions at different positions within the peptide); the position of the alanine substitution in each variant peptide is shown on y-axes. Antibody binding data are shown as a function of duration of HIV infection (x-axes). Panel C: The blue line shows antibody binding data (normalized read counts) for the parent peptide included in the analysis in panel B (parent peptide B) and selected variant peptides. Black lines show data for variant peptides with triple alanine substitutions at amino acids 12-17 and 19-21; the red line shows data for the variant peptide with the triple alanine substitution at amino acid 18.

    [0060] Abbreviations: nrc: Normalized read count; mo: months; yr: years.

    [0061] FIG. 7A-7B. Breadth of antibody reactivity for samples with low viral load and low CD4 cell count.

    [0062] FIG. 7 illustrates the relationship between antibody breadth, HIV viral load, and CD4 cell count. The plots shown in this figure are the same as those shown in FIG. 2B, except that different data points are colored red. In Panel A, red dots indicate samples with viral loads <1,000 copies/mL (V.sub.L<1,000). In Panel B, red dots indicate samples with CD4 cell counts <350 cells/mm.sup.3 (CD4<350).

    [0063] Abbreviations: ART: antiretroviral treatment; V.sub.L: viral load; mo: months; yr: years.

    [0064] FIG. 8A-8B. Association of changes in antibody breadth and other factors.

    [0065] FIG. 8 illustrates the relationship between the changes in antibody breadth between 9 months and 2 years after infection, time to initiation of antiretroviral therapy (ART), and other factors. Panel A: This plot shows univariate (pairwise) associations, reported as estimated Pearson correlation coefficients and respective p-values, between pairs of factors. Solid lines indicate correlations that were statistically significant after correction for multiple comparisons (p<0.05/15=0.0033). Panel B: The array shows histograms of data for factors evaluated for their association with time to ART initiation (diagonal). The array also shows scatter plots of the data (upper right) and summary statistics (lower left) for all pairwise comparisons. Summary statistics include the estimated Pearson correlations with 95% confidence intervals and the respective p-values. Units for variables are as follows: Age (years); viral load set point (log.sub.10 copies/mL); baseline CD4 cell count (baseline CD4; cells/mm.sup.3); change in the antibody breadth between 9 months and 2 years after HIV infection; change in CD4 cell count between 9 months and 2 years after HIV infection (cells/mm.sup.3); time to ART (years).

    [0066] FIG. 9. Association of peptide binding and duration of infection for the peptides selected based on the dynamics of antibody binding over the course of HIV infection.

    [0067] FIG. 9 illustrates data from the four peptides in the 4-peptide model that was used to estimate the duration of HIV infection (peptides A-D); lines indicate longitudinal data for samples from each of the 57 study participants. Antibody binding (normalized read counts) is plotted as a function of duration of HIV infection. In each plot, the blue line is the locally-weighted regression curve (lowess smoother) for all participants, and the red line is the least squares regression line for all participants. P-values were calculated using generalized estimation equations to account for the dependency between measurements from multiple samples from each participant.

    [0068] FIG. 10. Subtypes and strains of HIV represented by peptides in the VirScan library.

    [0069] FIG. 10 illustrates the number of proteins and peptides in the VirScan peptide library corresponding to different HIV subtypes and strains.

    [0070] FIG. 11. Peptides used to estimate the duration of HIV infection.

    [0071] FIG. 11 lists the identifiers, amino acid sequences, protein location, and position in the HIV genome for the peptides in the 4-peptide model that was used to estimate the duration of HIV infection. FIG. 11 discloses SEQ ID NOS 3, 22, 159, and 180, respectively, in order of appearance.

    [0072] FIG. 12. Peptides that had antibody reactivity that was significantly associated with the duration of HIV infection.

    [0073] FIG. 12 illustrates a list of 309 peptides for use in estimating HIV incidence and/or the duration of HIV infection. [Excel Spreadsheet]. The statistical association between antibody reactivity (measured as enrichment z-scores) and duration of infection was assessed for all HIV peptides in the VirScan library, using generalized estimation equations. The sign of the beta coefficient (positive or negative) indicates the observed direction of the association (positive or negative, respectively). All peptides exhibiting a p-value after adjustment for multiple comparisons using the Bonferroni method (“p.adj.Bonf”) of 0.05 or below are provided (309 peptides). Peptides included in the 4-peptide model are highlighted. FIG. 12 discloses SEQ ID NOS 1-309, respectively, in order of appearance.

    [0074] FIG. 13. Samples used for analysis.

    [0075] FIG. 13 illustrates characteristics of the participants who provided samples used in the analysis. The discovery sample set included 403 samples from 57 participants. The validation sample set included 72 samples from 32 participants who were not included in the discovery sample set.

    [0076] Abbreviations: ART: antiretroviral therapy

    DETAILED DESCRIPTION OF THE INVENTION

    [0077] The inventors used a massively-multiplexed antibody profiling system to analyze the fine specificity of the antibody response to HIV infection. This system is based on phage immunoprecipitation sequencing (PhIP-Seq) (7). Testing was performed by incubating samples with a bacteriophage library that expresses peptides encoded by oligonucleotides generated by high-throughput DNA synthesis. The abundance and specificity of antibodies in test samples were assessed by immunoprecipitating phage-antibody complexes and sequencing the DNA in the captured phage particles. The “VirScan” phage library includes >95,000 peptides that span the genomes of >200 viruses that infect humans (the human “virome”) (8). The inventors performed PhIP-Seq using the VirScan library to analyze HIV antibodies from individuals with known duration of HIV infection, ranging from <1 month to 8.7 years. This allowed them to examine dynamic changes in antibody diversity and the fine specificity of HIV antibodies from individuals with early to late stage infection, including individuals on antiretroviral therapy (ART) and individuals with advanced HIV disease.

    [0078] HIV incidence is often determined by following cohorts of HIV-uninfected individuals and quantifying the rate of new HIV infections. HIV incidence can also be estimated using a cross-sectional study design, using laboratory assays to identify individuals who are likely to have recent HIV infection. Most serologic assays used for cross-sectional HIV incidence estimation measure general characteristics of the antibody response to HIV infection (e.g., antibody titer, antibody avidity) (9-11), which may be impacted by viral suppression, loss of CD4 T cells, and other factors (12-15). Unlike conventional methods, the inventors used a VirScan assay to identify novel peptide biomarkers associated with the duration of HIV infection, and surprisingly demonstrated that peptide engineering can be used to enhance the properties of peptides for discriminating between early and late-stage infection. This information could be used to develop improved methods for estimating HIV incidence from cross-sectional surveys, for surveillance of the HIV/AIDS epidemic, and for evaluating the impact of interventions for HIV prevention in clinical trials.

    [0079] Antibody reactivity to HIV peptides.

    [0080] We used the VirScan assay to characterize anti-HIV antibodies in 403 plasma samples from 57 women with subtype C HIV infection (FIG. 13). The time from seroconversion to sample collection ranged from 14 days to 8.7 years. The density of peptides in the library varied across the open reading frames for HIV proteins (the HIV proteome, FIGS. 1A and 1B). The level and frequency of antibody binding were highly variable (FIGS. 1C and 1D); the strongest and most frequent antibody binding was observed for peptides in the gag and env regions. Some peptides were consistently targeted over the course of the infection; in contrast, the level and frequency of antibody binding to other peptides increased or decreased over the course of HIV infection (FIG. 1E).

    [0081] Breadth of antibody reactivity.

    [0082] The inventors next analyzed the diversity of each individual's antibody response to HIV over time. Network graphs were used to determine antibody breadth at each time point; antibody breadth was defined as the number of non-overlapping peptides with high levels of antibody binding. FIG. 2A shows the network graph for peptides that reacted with antibodies from a representative study sample (one immunoprecipitation reaction). This analysis identified 45 non-overlapping peptides; these peptides were located in the gag, pol, env, vpu and rev regions. The inventors next analyzed the change in antibody breadth over the course of HIV infection. Since ART was known to influence HIV antibody production, the inventors compared data from participants who did vs. did not start ART during the GS Study (FIG. 2B). ART also serves as a surrogate for disease progression; in the GS Study, ART was recommended when the CD4 cell count fell below 250 cells/mm.sup.3. Overall, 32 participants started ART during the GS Study.
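By way of non-limiting illustration only, the antibody breadth calculation described above can be sketched as follows. The sketch keeps peptides with z-score>10, treats two peptides as linked if they share at least seven consecutive amino acids, and counts a set of mutually unlinked peptides. The greedy selection order (highest z-score first), the function names, and the toy peptides are assumptions made for this example; the specification does not state which graph algorithm was used, and a greedy pass is not guaranteed to find the largest possible set.

```python
def share_kmer(a, b, k=7):
    """Return True if peptides a and b share an identical run of k amino acids."""
    kmers = {a[i:i + k] for i in range(len(a) - k + 1)}
    return any(b[i:i + k] in kmers for i in range(len(b) - k + 1))


def antibody_breadth(peptides, zscores, z_cutoff=10.0, k=7):
    """Count non-overlapping reactive peptides (greedy selection; an assumption)."""
    # Keep only highly enriched peptides (z-score above the cutoff).
    reactive = [(z, p) for p, z in zip(peptides, zscores) if z > z_cutoff]
    # Visit the most strongly enriched peptides first (hypothetical tie-break rule).
    reactive.sort(key=lambda item: -item[0])
    selected = []
    for _, pep in reactive:
        # Accept a peptide only if it is not linked to any peptide already chosen.
        if not any(share_kmer(pep, s, k) for s in selected):
            selected.append(pep)
    return len(selected), selected


# Hypothetical toy data: two overlapping pairs plus one non-reactive peptide.
toy_peptides = [
    "MGARASVLSGGELDRW",
    "SVLSGGELDRWEKIRL",
    "PIVQNIQGQMVHQAIS",
    "QGQMVHQAISPRTLNA",
    "KKKKKKKKKKKKKKKK",
]
toy_zscores = [25.0, 12.0, 18.0, 30.0, 3.0]
breadth, chosen = antibody_breadth(toy_peptides, toy_zscores)
print(breadth, chosen)
```

In this toy input each overlapping pair collapses to a single specificity, so the two pairs contribute one count each and the non-reactive peptide contributes none.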

    [0083] In both groups (with and without ART initiation), antibody breadth increased during the first 6 months of infection. In the group that did not start ART, a relatively stable value for antibody breadth (termed “antibody breadth set point”) was established in most individuals approximately nine months to one year after infection; the antibody breadth set point varied considerably among study participants. In contrast, in the group that ultimately started ART, a decline in antibody breadth was observed approximately one year after infection. After participants started ART, antibody breadth appeared to stabilize at levels similar to those seen in early HIV infection. The decline in antibody breadth prior to ART initiation did not appear to be related to HIV viral load or CD4 cell count (FIG. 7).

    [0084] The inventors next evaluated the relationship between HIV infection and the antibody response to a different, chronic infection that was expected to have a high prevalence in the study setting (EBV) (FIG. 2C). Data used to calculate the breadth of the antibody response to EBV infection were obtained from the same VirScan data sets used for HIV analysis (FIG. 2C). In most participants, EBV antibody breadth was relatively stable in the first 6 months of HIV infection, and then declined. EBV antibody breadth then appeared to stabilize in participants who did not start ART for HIV infection. In contrast, in most participants who started ART, EBV antibody breadth increased after ART initiation, often reaching levels that surpassed those observed early in HIV infection.

    [0085] Factors associated with changes in antibody breadth over time.

    [0086] To explore the relationship between the decline in HIV antibody breadth and subsequent ART initiation, the inventors calculated the rate of change of antibody breadth over the period ~9 months to ~2 years after HIV infection (based on sample availability); none of the participants included in the analysis were on ART during this time window. For this time-to-event analysis (the outcome being time to ART initiation), participants were divided into two groups: those with declining breadth and those with stable or increasing breadth. The inventors found that participants who had stable or increasing antibody breadth ~9 months to ~2 years after infection were less likely to start ART earlier in infection (log-rank test p=0.009, hazard ratio: 0.29, 95% CI: 0.11, 0.78, p=0.014, FIG. 3). The average time between the study visits used to evaluate the change in antibody breadth (~9 months and ~2 years after infection) was similar in the two groups (p=0.28), so this was not likely to have biased the analysis.
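As a minimal, non-limiting sketch of the grouping step described above, the rate of change of antibody breadth between two visits can be computed as a simple slope and each participant assigned to the declining ("Decr") or stable-or-increasing ("Non-Decr") group. The function names and the example breadth values are hypothetical; treating a rate of exactly zero as "Non-Decr" is an assumption consistent with the "stable or increasing" wording. The survival analysis itself (log-rank test, hazard ratio) is not reproduced here.

```python
def breadth_change_rate(t1_years, breadth1, t2_years, breadth2):
    """Change in antibody breadth per year between two visits."""
    if t2_years <= t1_years:
        raise ValueError("second visit must be later than the first")
    return (breadth2 - breadth1) / (t2_years - t1_years)


def breadth_group(rate):
    """Assign 'Decr' for declining breadth, 'Non-Decr' for stable or increasing."""
    return "Decr" if rate < 0 else "Non-Decr"


# Hypothetical participant: breadth 45 at 0.8 years, 38 at 1.5 years.
rate = breadth_change_rate(0.8, 45, 1.5, 38)
print(rate, breadth_group(rate))
```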

    [0087] The inventors next evaluated the relationship between the rate of decline in antibody breadth and other factors, including age at infection, baseline CD4 cell count, rate of decline in CD4 cell count, and viral load set point (FIG. 8). A faster decline in antibody breadth was strongly associated with lower baseline CD4 cell count (R=0.42, 95% CI: 0.17, 0.62; p=0.002) and higher viral load set point (R=−0.43, 95% CI: −0.62, −0.18; p=0.001), and was also associated with earlier ART initiation (R=0.28, 95% CI: 0.01, 0.51; p=0.043).
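For illustration only, the multiple-comparison threshold noted for FIG. 8 (p<0.05/15) can be reproduced from the six factors listed there, which give 15 pairwise comparisons. The sketch below also includes a plain Pearson correlation function. The factor labels and function names are assumptions chosen for this example; the p-values in the dictionary are the ones reported in the paragraph above.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Six factors evaluated pairwise (labels are hypothetical shorthand).
factors = ["age", "viral_load_set_point", "baseline_cd4",
           "breadth_change", "cd4_change", "time_to_art"]
n_pairs = len(list(combinations(factors, 2)))  # number of pairwise comparisons
threshold = 0.05 / n_pairs                     # Bonferroni-corrected threshold

# P-values reported above for associations with the change in antibody breadth.
reported_p = {"baseline_cd4": 0.002, "viral_load_set_point": 0.001, "time_to_art": 0.043}
below_threshold = {name: p < threshold for name, p in reported_p.items()}
print(n_pairs, round(threshold, 4), below_threshold)
```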

    [0088] Dynamic Changes in Antibody Binding

    [0089] The inventors next explored the relationship between HIV antibody specificity and the duration of HIV infection. First, the inventors used a linear model to quantify the association between antibody binding and the duration of infection for the 3,384 HIV peptides in the VirScan library. This analysis was performed using all 403 samples in the discovery sample set. The model identified 309 peptides that had a significant association between these two factors (p-value<0.05 after adjusting for multiple comparisons using the Bonferroni method, FIG. 4A and FIG. 12); 266 peptides had increasing antibody binding over time (positive association) and 43 peptides had decreasing antibody binding over time (negative association). Peaks representing increased vs. decreased antibody binding were observed at different positions in the HIV genome. Peptides that had a strong positive association with duration of infection tended to cluster in the N-terminal gag region, the C-terminal pol region, and defined domains within the env region. In contrast, peptides that had a strong negative association with duration of infection clustered in the C-terminal gag region and the middle of the pol region, with others scattered across the env region or located in non-structural (accessory) proteins, such as nef.
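The Bonferroni adjustment and the positive/negative classification described above can be sketched as follows. This is an illustrative example only: the per-peptide beta coefficients and raw p-values are hypothetical, the function names are not from the specification, and the generalized estimation equation fit that would produce such values is not shown.

```python
N_HIV_PEPTIDES = 3384  # number of HIV peptides tested, per the description


def bonferroni_adjust(p_values, n_tests):
    """Multiply each raw p-value by the number of tests, capped at 1."""
    return [min(1.0, p * n_tests) for p in p_values]


def classify(beta, p_adj, alpha=0.05):
    """Label a peptide by the sign of its association if it passes the cutoff."""
    if p_adj >= alpha:
        return "not significant"
    return "increasing" if beta > 0 else "decreasing"


# Hypothetical (beta, raw p-value) pairs for three peptides.
toy_results = [(0.8, 1e-6), (-0.5, 1e-7), (0.3, 1e-4)]
adjusted = bonferroni_adjust([p for _, p in toy_results], N_HIV_PEPTIDES)
labels = [classify(beta, p_adj) for (beta, _), p_adj in zip(toy_results, adjusted)]
print(labels)
```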

    [0090] The inventors then selected the four peptides that had the strongest independent association between antibody binding and the duration of HIV infection (FIG. 9 and FIG. 11). This included two peptides that had increased antibody binding over time (one in gp41; one in gp120), and two peptides that had decreased antibody binding over time (one in gag; one in pol). Antibody binding measures from each of the four peptides were combined in a simple linear model to generate a single, unweighted, 4-peptide composite measure. The duration of infection predicted by this model was highly correlated with the observed (true) duration of infection (GEE p<1×10.sup.−100; FIG. 5A). Importantly, the predictive value of the 4-peptide composite measure did not appear to be impacted by ART initiation, low viral load, or low CD4 cell count (FIG. 5A-C).
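As a non-limiting sketch of the composite measure described above, the normalized read counts for the four peptides can be summed without weights, and a simple least-squares line fitted to relate the composite score to the duration of infection. All data values and function names below are hypothetical. The specification describes an unweighted sum and a simple linear model but does not detail any further transformation of the individual peptide measures, so none is applied here.

```python
def composite_score(nrc_values):
    """Unweighted sum of normalized read counts for the model peptides."""
    return sum(nrc_values)


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical samples: four normalized read counts each, with known durations (years).
samples = [(1.0, 1.0, 1.0, 1.0), (2.0, 1.0, 2.0, 1.0), (2.0, 2.0, 2.0, 2.0)]
durations = [1.0, 2.0, 3.0]

scores = [composite_score(s) for s in samples]
slope, intercept = fit_line(scores, durations)

# Predict the duration of infection for a new, hypothetical sample.
new_score = composite_score((3.0, 2.0, 3.0, 2.0))
predicted_duration = slope * new_score + intercept
print(slope, intercept, predicted_duration)
```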

    [0091] The inventors next evaluated the performance of the 4-peptide model using an independent validation sample set (FIG. 13). This set consisted of samples from individuals in the GS Study who were not included in the discovery set that was used to identify the model peptides. This sample set also included “challenge samples” that have characteristics known to complicate cross-sectional HIV incidence estimation using other serologic assays: 28 (38.9%) of the samples were HIV subtype D; 37 (51.4%) had CD4 cell counts <350 cells/mm.sup.3; 16 (22.2%) had viral loads <1,000 copies/mL, and 12 (16.7%) were from individuals on ART. The duration of infection predicted by the 4-peptide model was also correlated with the observed (true) duration of infection using this independent sample set (GEE p<3×10.sup.−36; FIG. 5D). The predictive value of the 4-peptide composite measure did not appear to be impacted by HIV subtype (subtype C vs. D; FIG. 5D).

    [0092] Epitope Engineering

    [0093] Next, the inventors explored whether peptide epitopes could be modified to improve the association between antibody binding and the duration of HIV infection. The inventors first selected 11 non-overlapping peptides that were targeted by the majority of HIV-infected individuals (“public epitope peptides”). The inventors then generated variant peptides by substituting each set of three consecutive amino acids with alanine residues. FIG. 6 shows the impact of alanine substitutions on antibody binding for two of the 11 parent peptides; these peptides were targeted by >98% of the study participants. In the first case (parent peptide A), antibody binding to the parent peptide and most of the variant peptides decreased with increasing duration of infection (FIG. 6A). Alanine substitutions at amino acid positions 26-34 appeared to disrupt antibody binding at all time points. In the second case (parent peptide B), antibody binding to the parent peptide and most of the variant peptides increased with increasing duration of infection (FIG. 6B). In this case, alanine substitutions at amino acid positions 13-21 preferentially disrupted antibody binding early in infection. FIG. 6C shows the level of antibody binding as a function of duration of infection for parent peptide B and variant peptides that had alanine substitutions in the region most impacted by mutagenesis (9 peptides, with substitutions at positions 13-21). Over the course of HIV infection, antibody binding to the parent peptide increased by 57%; in contrast, antibody binding to one of the variant peptides increased by approximately 479% over the same time period. These data provide proof-of-principle that epitope engineering can be used to improve the capacity of peptides to serve as quantitative biomarkers of disease processes, such as the duration of HIV infection.
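The triple-alanine scanning scheme described above can be illustrated with a sliding window that replaces each run of three consecutive residues with alanine. This is a sketch under stated assumptions: the parent sequence below is a made-up placeholder, and its length of 56 amino acids is assumed because a 56-residue parent yields the 54 variants per parent peptide noted for FIG. 6. The function name is hypothetical.

```python
def triple_alanine_variants(parent, window=3):
    """Return (1-based start position, variant) for each alanine-substituted window."""
    variants = []
    for start in range(len(parent) - window + 1):
        # Replace residues start..start+window-1 with alanine; keep the rest intact.
        variant = parent[:start] + "A" * window + parent[start + window:]
        variants.append((start + 1, variant))
    return variants


# Placeholder 56-residue parent peptide (not a real HIV sequence).
parent_peptide = ("ACDEFGHIKLMNPQRSTVWY" * 3)[:56]
variants = triple_alanine_variants(parent_peptide)
print(len(variants))
print(variants[0])
```

Each variant has the same length as its parent and differs from it only within the three-residue window, which is what allows a loss of binding to be attributed to a specific position.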

    [0094] The present invention provides the most comprehensive analysis of HIV antibody specificities to date, including their characterization from early to late stage infection. The inventors found that changes in antibody diversity early in infection were associated with differences in clinical outcome (measured as time to ART initiation). This study also provides proof-of-principle that an “HIV serosignature”, reactivity to a panel of HIV peptides, is useful for cross-sectional HIV incidence estimation.

    [0095] The inventors used a novel definition of “antibody breadth” to quantify HIV antibody diversity, and found that this measure reaches a plateau (“antibody breadth set point”) early in infection. In the GS study cohort, a decline in antibody breadth between 9 months and 2 years after infection was associated with a shorter time to ART initiation, which was prompted in the GS Study cohort by a decline in CD4 cell count to <250 cells/mm.sup.3. The decline in antibody breadth among those who subsequently started ART likely reflected declining B cell support due to loss of T helper cells. HIV antibody breadth appeared to stabilize at a low level after ART initiation. In contrast, the breadth of the EBV antibody response increased sharply after ART initiation, which may have reflected immune reconstitution.

    [0096] Previous studies have identified several factors associated with HIV disease progression, including virologic factors [e.g., HIV viral load, replication capacity, and subtype], immunologic factors [e.g., inversion of the CD4/CD8 ratio, polyclonality of the anti-HIV T cell response, degree of early immune activation] and host factors [e.g., human leukocyte antigen (HLA) type B57, CCR5 delta 32 mutations]. It is not clear if the decline in antibody breadth that we observed caused disease progression leading to ART initiation, or if it was a surrogate for other changes, such as a decline in T cell number or function. If the decline in antibody breadth has a causative role in disease progression, then use of therapeutic vaccines to boost antibody diversity may in theory provide clinical benefit.

    [0097] Generalized antibody responses to HIV infection, such as antibody titer and avidity, tend to plateau approximately one year after HIV infection. These characteristics of the antibody response are impacted by a variety of factors, including natural and drug-induced viral suppression, disease progression, and HIV subtype. Previous studies evaluating the banding pattern in Western blots demonstrate that HIV antibody specificity evolves early in infection. Recent studies have explored whether assays that include a small number of protein or peptide targets could be used to identify recent HIV infections. Using the VirScan assay to analyze 403 plasma samples, the inventors were able to quantify antibody binding to >3,300 HIV peptides from early to late-stage HIV infection. These data were used to generate a simple, unweighted, 4-peptide model that predicted duration of HIV infection. The peptides included in this prototype model were from four different HIV proteins (gp41, gp120, gag and pol). Two of these peptides had increasing antibody reactivity over time, and two had decreasing antibody reactivity over time. It is noteworthy that the gp41 peptide, which showed the strongest association with duration of infection, included a sequence shared by the HIV subtype B target peptide in the Limiting Antigen Avidity (LAg) assay that is in wide use for cross-sectional HIV incidence estimation. Our analysis also demonstrated that epitope engineering can be used to enhance the capacity of individual peptides to discriminate between early and late HIV infection.

    [0098] Data obtained with the 4-peptide model described above demonstrate that the VirScan assay can be used to identify peptides for applications such as cross-sectional HIV incidence estimation. The inventors are currently investigating more sophisticated statistical and machine-learning models to identify peptide combinations with greater accuracy for predicting the duration of HIV infection, and are generating larger data sets for model building and assessment. We are also exploring whether alternate serosignatures provide more accurate prediction of the duration of infection among people with longer-term infections. Ongoing studies will also provide more information about the possible impact of ART, viral load, and CD4 cell count on antibody binding profiles. Considerable work will be needed to translate findings from this study into a laboratory test that can be used for improved cross-sectional HIV incidence testing. For example, peptides of interest could be incorporated into high-resolution, quantitative, multi-peptide enzyme immunoassays (EIAs) for high-throughput testing. Antibody binding data obtained from the EIA testing platform could then be used to compare the performance of serosignatures for HIV incidence estimation that include different sets of peptides, different weighting for individual peptides, and different cut-offs for antibody binding to each peptide in the model. In previous work, we have used this approach to identify multi-assay algorithms that provide accurate cross-sectional HIV incidence estimates.

    [0099] The VirScan assay has several unique advantages over alternative multiplex serological assays for peptide discovery. These include: quantitative assessment of antibody binding to peptides that span all open reading frames in the HIV genome, including both structural and regulatory proteins; representation of a wide range of HIV subtypes and strains, including groups M, N, and O and HIV-2; and fine resolution for epitope identification, which can be further refined with alanine scanning mutagenesis. The assay also provides information about antibody binding to >200 other human viruses. In this report, data from other viral peptides were used to normalize peptide binding measures, and allowed us to compare the impact of ART on the antibody response to a prevalent non-HIV viral infection (EBV). Data from the same assay runs could be used to examine the evolution and fine specificity of antibodies to other viruses, and the impact of viral co-infections on the anti-HIV antibody response. Future studies could also explore use of the VirScan assay to identify serosignatures for estimating incidence of other viral infections, such as hepatitis C virus. Finally, future phage libraries composed of additional protein products, such as those from the gut microbiome, may be used to explore the impact of immune system pre-conditioning on the response to HIV infection.

    [0100] The present invention reveals novel features of the humoral response to HIV infection, and demonstrates the utility of the VirScan assay for identifying peptide biomarkers for applications such as cross-sectional HIV incidence estimation. This technology could also be used to evaluate serologic responses to other infectious diseases, as well as the impact of viral co-infections on immune responses. This may improve understanding of the complex relationships between viral infections and the immune responses that they elicit.

    EXAMPLES

    [0101] The following Examples have been included to provide guidance to one of ordinary skill in the art for practicing representative embodiments of the presently disclosed subject matter. In light of the present disclosure and the general level of skill in the art, those of skill can appreciate that the following Examples are intended to be exemplary only and that numerous changes, modifications, and alterations can be employed without departing from the scope of the presently disclosed subject matter. The following Examples are offered by way of illustration and not by way of limitation.

    [0102] Samples Used for Analysis

    [0103] Plasma samples were obtained from the GS Study (Uganda and Zimbabwe; 2001-2009), which evaluated the relationship between hormonal contraceptive use, genital shedding of HIV, and HIV disease progression among women with known dates of HIV seroconversion (18). ART was recommended for study participants with CD4 cell counts below 250 cells/mm.sup.3, consistent with local treatment guidelines at the time the GS Study was performed. Data for CD4 cell count and viral load were collected in the GS Study (18); data on the timing of ART initiation were obtained by review of clinic records.

    [0104] The inventors analyzed samples from participants who acquired HIV infection, where the maximum time between collection of the last HIV-negative sample and the first HIV-positive sample was four months. For each individual, the estimated date of infection was defined as either the midpoint between visits with the last negative HIV antibody test and the first positive HIV antibody test, or fifteen days before documentation of acute infection (HIV RNA positive/HIV antibody negative status). Two sets of samples were analyzed in this report: a discovery sample set and a validation sample set (FIG. 13). The discovery sample set included participants who had at least one year of follow-up after seroconversion, with samples collected at three or more study visits during that period. The independent validation sample set included samples from participants from the GS Study who were not included in the discovery sample set. HIV subtype assignments were based on phylogenetic analysis of the HIV env C2V3 region (19). All of the samples in the discovery sample set were HIV subtype C; the validation sample set also included “challenge” samples with HIV subtype D, which are often misclassified using currently available serologic HIV incidence assays (20, 21).
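
    The two rules for assigning an estimated date of infection can be sketched as follows. This is a minimal illustration, not the inventors' code: the function name and arguments are hypothetical, and giving precedence to a documented acute infection when both kinds of information are available is an assumption that the text does not state.

```python
from datetime import date, timedelta
from typing import Optional


def estimated_infection_date(last_negative: Optional[date] = None,
                             first_positive: Optional[date] = None,
                             acute_detection: Optional[date] = None) -> date:
    """Estimate the date of HIV infection for one participant.

    acute_detection: date of an HIV RNA positive / antibody negative result.
    Assumption: if acute infection was documented, that rule takes precedence.
    """
    if acute_detection is not None:
        # Fifteen days before documentation of acute infection.
        return acute_detection - timedelta(days=15)
    if last_negative is None or first_positive is None:
        raise ValueError("need an acute detection date or both antibody test dates")
    if first_positive < last_negative:
        raise ValueError("first positive test precedes last negative test")
    # Midpoint between the last negative and first positive antibody tests
    # (whole days; an odd interval is rounded down).
    half_interval = (first_positive - last_negative).days // 2
    return last_negative + timedelta(days=half_interval)


# Example: antibody tests 60 days apart give a midpoint 30 days after the
# last negative test.
midpoint = estimated_infection_date(date(2005, 1, 1), date(2005, 3, 2))
acute = estimated_infection_date(acute_detection=date(2005, 6, 16))
```

    The eligibility criterion of at most four months between the last negative and first positive samples is not enforced here, because the text does not define the interval in days.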

    [0105] Phage Library Used for Analysis

    [0106] The VirScan library includes 3,384 HIV peptides spanning all HIV proteins (8). The protein sequences used to design peptide tiles were selected from the UniProtKB database, balancing sequence diversity and library size (8). The peptides are 56 amino acids long with 28-amino acid overlaps and represent diverse HIV subtypes and strains (FIG. 10). In this study, the VirScan library was augmented with a public epitope library that included peptides previously found to be targeted by a high proportion of HIV-infected individuals (8). Eleven “parent” peptides in the public epitope library were modified by introducing triple alanine substitutions centered at each amino acid position; the resulting public epitope library included 594 genetically engineered variant HIV peptides. Silent nucleotide substitutions were encoded in the first 50 nucleotides of each DNA tile, so that variant peptides could be uniquely identified using 50-nucleotide single-end Illumina sequencing.
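
    The tiling and triple-alanine scanning described above can be illustrated with a short sketch. The function names are hypothetical, and two details are assumptions: the last tile of a protein is allowed to be shorter than 56 amino acids, and a triple-alanine window is generated only where it fits entirely within the peptide (centers at positions 2 through 55 of a 56-amino acid peptide). Under the second assumption, each parent peptide yields 54 variants, which agrees with the 594 variants reported for 11 parent peptides (11 x 54 = 594).

```python
def tile_protein(sequence, length=56, overlap=28):
    """Split a protein sequence into overlapping peptide tiles."""
    step = length - overlap
    tiles = []
    start = 0
    while True:
        tiles.append(sequence[start:start + length])
        if start + length >= len(sequence):
            break  # assumption: the final tile may be shorter than `length`
        start += step
    return tiles


def triple_alanine_variants(peptide):
    """Return variants with 'AAA' centered at each interior position."""
    variants = []
    for center in range(1, len(peptide) - 1):
        # Replace residues center-1, center, center+1 with alanine.
        variants.append(peptide[:center - 1] + "AAA" + peptide[center + 2:])
    return variants


# Toy sequences built from repeated letters; these are not real HIV sequences.
protein = "MKTLLVGSWQ" * 11 + "MK"          # 112 amino acids
tiles = tile_protein(protein)              # tiles start at residues 0, 28, 56
variants = triple_alanine_variants(tiles[0])
```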

    [0107] The VirScan library also includes 2,263 Epstein-Barr virus (EBV) peptides, 718 Ebola virus peptides, and 518 rabies virus peptides; the public epitope library includes an additional 227 Ebola virus peptides. In this report, EBV data were used to evaluate the impact of antiretroviral therapy for HIV infection on the breadth of the anti-EBV antibody response. Ebola and rabies virus data were used to normalize antibody binding data to account for differences in sequencing depth between samples.

    [0108] Phage Immunoprecipitation and DNA Sequencing

    [0109] Detailed procedures for the VirScan assay were described previously (8, 22). In this study, the concentration of IgG in plasma samples was determined using an in-house enzyme-linked immunosorbent assay (capture and detection antibodies 2040-01 and 2042-05, respectively; Southern Biotech, Birmingham, Ala.). Approximately 2 μg of IgG from each sample was added to the combined T7 bacteriophage VirScan and public epitope libraries (1×10.sup.5 plaque forming units for each phage clone in each library), diluted in phosphate-buffered saline to a final reaction volume of 1 mL in a deep 96-well plate, and incubated overnight at 4° C. Eight mock immunoprecipitation reactions (no plasma) were included on each plate; these reactions served as negative controls for data normalization. After rotating the plates overnight at 4° C., 20 μL of protein A-coated magnetic beads and 20 μL of protein G-coated beads (catalog numbers 10002D and 10004D, Invitrogen, Carlsbad, Calif.) were added to each reaction; the plates were rotated for another 4 hours at 4° C. Immunoprecipitation reactions were processed using the Agilent Bravo liquid handling system (Agilent Technologies, Santa Clara, Calif.). Beads were washed twice with Tris-buffered saline (50 mM Tris-HCl with 150 mM NaCl, pH 7.5) containing 0.1% NP-40 and then resuspended in 20 μL of a polymerase chain reaction (PCR) mix containing Herculase II Polymerase (catalog number 600679, Agilent Technologies). After 20 cycles of PCR, 2 μL of the PCR product was added to a second 20-cycle PCR reaction, which added sample-specific barcodes and P5/P7 Illumina sequencing adapters to the amplified DNA. DNA sequencing of the pooled PCR products was performed using an Illumina HiSeq 2500 instrument (Illumina, San Diego, Calif.) in rapid mode (50 cycles, single-end reads).

    [0110] Analysis of DNA Sequencing Data

    [0111] Fastq files from DNA sequencing were demultiplexed using exact matching of 8-nucleotide sample-specific i5 and i7 DNA barcodes (Illumina). For each sample, a read count (the number of times each sequence was detected) was obtained for each peptide using Bowtie alignment (23), without allowing any mismatches. The level of antibody-dependent enrichment of each peptide in each sample was determined by comparing the read count for the sample to the read counts obtained for 40 mock immunoprecipitation reactions (8 mock reactions per plate). Two different measures were used to quantify the degree of antibody binding: “z-scores” were used to reduce false positivity in cases of low sequencing depth (this approach was used to generate data for FIG. 1 and for calculation of antibody breadth); “relative fold-change” was used to normalize data for highly-enriched peptides (this approach was used to generate data for FIGS. 4-6 and FIG. 9). Z-scores were calculated by subtracting the expected normalized read count (determined by regression against the mock immunoprecipitation reactions) from the observed normalized read count; the resulting value was then divided by an estimate of the standard deviation of the normalized read counts, based on the mock immunoprecipitation reactions (24). Relative fold-change values were determined as follows. Read counts were log.sub.10 transformed prior to analysis. First, read count data for Ebola virus and rabies virus were trimmed by removing outlier values (the lowest 5% and highest 5%). The log.sub.10 transformed read count for each HIV peptide (after adding one read count) was then normalized to the average read count for all Ebola virus and rabies virus peptides of the respective sample. To generate a fold-change value for each HIV peptide, the normalized value of the peptide was divided by the average of the normalized values for the same peptide observed across the mock immunoprecipitation reactions that were run on the same plate.
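
    The relative fold-change calculation can be sketched as below. This is an illustrative reading of the steps above rather than the inventors' code. The function names are hypothetical, and two points are assumptions: “normalized to” is interpreted as division by the trimmed mean of the log.sub.10 transformed Ebola virus and rabies virus read counts, and one read count is also added to the control counts before transformation. The z-score calculation (24) is not reproduced here.

```python
import math


def trimmed_mean(values, trim_percent=5):
    """Mean after removing the lowest and highest trim_percent of values."""
    ordered = sorted(values)
    k = len(ordered) * trim_percent // 100
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def normalize_sample(hiv_counts, control_counts):
    """Normalize log10 HIV read counts to the Ebola/rabies control peptides.

    hiv_counts: dict mapping peptide id -> raw read count in one sample.
    control_counts: raw read counts for Ebola and rabies virus peptides
    in the same sample.
    """
    # Assumption: +1 is applied to control counts as well as HIV counts.
    reference = trimmed_mean([math.log10(c + 1) for c in control_counts])
    return {pep: math.log10(n + 1) / reference for pep, n in hiv_counts.items()}


def relative_fold_change(sample_norm, plate_mock_norms):
    """Divide each normalized value by its mean across same-plate mocks."""
    fold = {}
    for pep, value in sample_norm.items():
        mock_mean = sum(m[pep] for m in plate_mock_norms) / len(plate_mock_norms)
        # A peptide with no reads in any mock gives a mean of 0; report NaN.
        fold[pep] = value / mock_mean if mock_mean else float("nan")
    return fold


# Toy data: one HIV peptide, 20 control peptides, 8 mock reactions on the plate.
controls = [99] * 20
sample = normalize_sample({"pepA": 9999}, controls)
mocks = [normalize_sample({"pepA": 99}, controls) for _ in range(8)]
fold_change = relative_fold_change(sample, mocks)
```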

    [0112] Determination of Antibody Breadth

    [0113] The term “antibody breadth” was used to indicate the number of unique non-overlapping epitopes that had high levels of antibody binding (z-scores >10). Antibody breadth was determined for HIV and EBV peptides using network graphs as follows. The amino acid sequences of all peptides in the VirScan library (HIV or EBV) were first analyzed to identify sequence overlaps (linkages, defined as two peptides sharing an identical sequence at least 7 amino acids long). The linkages were used to construct an undirected network graph, where each node represented a peptide with high-level antibody binding, and each linkage between two nodes represented a sequence overlap between the two peptides. The number of linkages for each peptide defined its degree of connectivity. Peptides were then removed from the graph one at a time using the following approach. At each iteration, the peptide with maximum connectivity was removed, and the degree of connectivity was recalculated for each of the remaining peptides. If multiple linked peptides had equivalent connectivity, the peptide with the lowest z-score was removed first. This process was repeated until the only remaining structures in the network were simple paths and cycles. For cycles (simple paths without end peptides), the peptide with the lowest z-score was removed first; this resulted in a simple path. Peptides were iteratively removed from simple paths in order to retain the greatest number of unlinked peptides. The number of remaining unlinked peptides was defined as the antibody breadth (25).
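
    A compact sketch of the breadth calculation is given below, under stated assumptions. The function name and input layout are hypothetical. Instead of removing peptides from simple paths one by one, the sketch counts the largest set of unlinked peptides a path can retain (every other peptide, i.e., half the path length rounded up), which gives the same total. It is not the published implementation (25).

```python
def shares_kmer(a, b, k=7):
    """True if sequences a and b share an identical stretch of k residues."""
    kmers = {a[i:i + k] for i in range(len(a) - k + 1)}
    return any(b[i:i + k] in kmers for i in range(len(b) - k + 1))


def antibody_breadth(sequences, zscores, z_cutoff=10.0, min_overlap=7):
    """Count unlinked, highly reactive peptides (see assumptions above)."""
    nodes = [p for p in sequences if zscores[p] > z_cutoff]
    adj = {p: set() for p in nodes}
    for i, p in enumerate(nodes):
        for q in nodes[i + 1:]:
            if shares_kmer(sequences[p], sequences[q], min_overlap):
                adj[p].add(q)
                adj[q].add(p)

    # Phase 1: remove the most connected peptide (ties: lowest z-score)
    # until every remaining peptide has at most two linkages.
    while adj:
        top = max(len(nb) for nb in adj.values())
        if top <= 2:
            break
        tied = [p for p in adj if len(adj[p]) == top]
        victim = min(tied, key=lambda p: zscores[p])
        for nb in adj.pop(victim):
            adj[nb].discard(victim)

    # Phase 2: remaining components are single peptides, paths, or cycles.
    breadth, seen = 0, set()
    for start in adj:
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        n = len(component)
        if n >= 3 and all(len(adj[p]) == 2 for p in component):
            n -= 1  # cycle: dropping one peptide leaves a simple path
        breadth += (n + 1) // 2  # unlinked peptides retained from a path
    return breadth


# Toy example: a three-peptide path, one unlinked peptide, and one peptide
# below the z-score cutoff. The sequences are artificial.
toy_sequences = {
    "p1": "GGGGGGG" + "ACDEFGH",
    "p2": "ACDEFGH" + "IKLMNPQ",
    "p3": "IKLMNPQ" + "WWWWWWW",
    "p4": "YYYYYYY" + "TTTTTTT",
    "p5": "SSSSSSS" + "VVVVVVV",
}
toy_z = {"p1": 20, "p2": 35, "p3": 15, "p4": 50, "p5": 5}
toy_breadth = antibody_breadth(toy_sequences, toy_z)
```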

    [0114] Rate of Change in Antibody Breadth

    [0115] For each participant, we estimated the rate of change in antibody breadth over the time period from 9 months to 2 years after HIV infection. This was calculated by determining the difference in antibody breadth for samples collected closest to time points 9 months and 2 years after HIV infection, and dividing this value by the length of time between the two visits. The rate of change in CD4 cell count was derived in the same way, using samples that had associated CD4 cell count data. The relationship between the rate of change in antibody breadth (and other factors) with time to ART initiation was determined using Cox proportional hazards models. The following factors were included in the analysis: age at seroconversion, CD4 cell count at the first visit after seroconversion, viral load set point, the rate of change in CD4 cell count, and time between HIV seroconversion and ART initiation. Viral load set point was defined as the median log.sub.10 viral load, excluding viral load results from the first HIV-positive visit, the visit prior to ART initiation, and any visits after ART initiation. Pearson correlation coefficients and their respective p-values and 95% confidence intervals were used to describe the relationships between the factors analyzed. We also compared the time to ART initiation among individuals who experienced a decline in antibody breadth between 9 months and 2 years, and those who had stable or increasing antibody breadth in this period. Statistical significance between the breadth measures and time to ART initiation was assessed using a non-parametric log-rank test and the semi-parametric Cox proportional-hazards model with a dichotomized variable for change in breadth rate (decreasing vs. stable/increasing). Individuals who did not initiate ART were treated as right-censored. Survival curves were plotted based on the resulting hazard functions for the two groups.
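
    The per-participant quantities defined above (rate of change between the visits closest to 9 months and 2 years, and viral load set point) can be sketched as follows. Function names, the use of months as the time unit, and the input layout are assumptions for illustration; the Cox proportional hazards and log-rank analyses are not reproduced here.

```python
import math
from statistics import median


def rate_of_change(visits, early=9.0, late=24.0):
    """Rate of change between the visits closest to `early` and `late`.

    visits: list of (months_since_infection, value) pairs, where value is
    antibody breadth or CD4 cell count.
    """
    first = min(visits, key=lambda v: abs(v[0] - early))
    second = min(visits, key=lambda v: abs(v[0] - late))
    if second[0] <= first[0]:
        raise ValueError("need two distinct visits in chronological order")
    return (second[1] - first[1]) / (second[0] - first[0])


def viral_load_set_point(viral_loads, art_start=None):
    """Median log10 viral load with the exclusions described in the text.

    viral_loads: time-ordered list of (months_since_infection, copies_per_mL),
    beginning with the first HIV-positive visit.
    art_start: months since infection at ART initiation, or None if ART was
    never started.
    """
    eligible = viral_loads[1:]              # drop the first HIV-positive visit
    if art_start is not None:
        pre_art = [v for v in eligible if v[0] < art_start]
        eligible = pre_art[:-1]             # drop the visit just before ART
    return median(math.log10(copies) for _, copies in eligible)


# Toy participant: breadth falls from 60 to 33 between months 9.5 and 23.
breadth_visits = [(3, 40), (9.5, 60), (15, 55), (23, 33)]
rate = rate_of_change(breadth_visits)                 # breadth units per month
declining = rate < 0                                  # dichotomized variable

loads = [(1, 1e6), (4, 1e4), (8, 1e5), (12, 1e4), (18, 1e3), (30, 50)]
set_point = viral_load_set_point(loads, art_start=20)
```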

    [0116] Identification of Peptides for Estimating Duration of HIV Infection

    [0117] The observed duration of infection (log.sub.10 transformed) was regressed on the normalized read count for each peptide, and the peptide with the strongest association was selected. To select additional peptides with independent information about duration of infection, we correlated the “residuals” (i.e., the differences between the observed and fitted values) from the above linear model against each of the remaining peptides, selected the peptide with the strongest association, and repeated this step twice more to generate a list of four peptides. Two of the four peptides had increasing antibody binding over time since infection (positively associated with duration of infection), and two had decreasing antibody binding over time (negatively associated with duration of infection). A simple predictor for duration of infection was calculated as the sum of the normalized read counts for the positively-associated peptides, minus the sum of the normalized read counts for the negatively-associated peptides; read counts were log transformed for this analysis. For the analysis of predicted duration of infection, generalized estimating equations (GEE) were used to account for the auto-regressive correlation structure of samples from the same individual.
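
    One way to read the stepwise selection and the unweighted predictor is sketched below. All names are hypothetical. Three choices are assumptions not stated in the text: “strongest association” is measured by the absolute Pearson correlation, the residuals are updated at each step by a simple linear regression of the current residuals on the newly selected peptide, and the sign assigned to each peptide is taken from its correlation with the observed log.sub.10 duration of infection. The GEE analysis is not reproduced.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def residuals(x, y):
    """Observed minus fitted values from a simple linear regression of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return [b - (my + slope * (a - mx)) for a, b in zip(x, y)]


def select_peptides(reads, log_duration, n_select=4):
    """Greedy selection; returns a list of (peptide, sign) pairs.

    reads: dict mapping peptide id -> list of normalized log read counts,
    one value per sample, in the same order as log_duration.
    """
    target = list(log_duration)
    remaining = list(reads)
    signature = []
    for _ in range(n_select):
        best = max(remaining, key=lambda p: abs(pearson(reads[p], target)))
        sign = 1 if pearson(reads[best], log_duration) >= 0 else -1
        signature.append((best, sign))
        remaining.remove(best)
        target = residuals(reads[best], target)  # assumption: update each step
    return signature


def serosignature_score(sample_reads, signature):
    """Sum of positively associated peptides minus negatively associated ones."""
    return sum(sign * sample_reads[name] for name, sign in signature)


# Toy data: six samples and five candidate peptides (values are invented).
toy_duration = [0, 1, 2, 3, 4, 5]
toy_reads = {
    "a": [0, 1, 2, 3, 4, 6],
    "b": [3, 5, 1, 4, 0, 2],
    "c": [2, 0, 3, 1, 5, 4],
    "d": [1, 3, 0, 5, 2, 4],
    "e": [4, 2, 5, 0, 3, 1],
}
toy_signature = select_peptides(toy_reads, toy_duration)
score = serosignature_score(
    {"w": 2.0, "x": 1.5, "y": 0.5, "z": 1.0},
    [("w", 1), ("x", 1), ("y", -1), ("z", -1)],
)
```

    The score is unweighted, matching the simple predictor described above; converting it to a duration in days or months would require a calibration step that is not shown.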

    REFERENCES

    [0118] All references, including publications, patent applications, and patents, cited herein are hereby incorporated by reference to the same extent as if each reference were individually and specifically indicated to be incorporated by reference and were set forth in its entirety herein.

    [0119] 1. S. K. Wendel et al., Effect of natural and ARV-induced viral suppression and viral breakthrough on anti-HIV antibody proportion and avidity in patients with HIV-1 subtype B infection. PLoS One 8, e55525 (2013).

    [0120] 2. E. W. Fiebig et al., Dynamics of HIV viremia and antibody seroconversion in plasma donors: implications for diagnosis and staging of primary HIV infection. AIDS 17, 1871-1879 (2003).

    [0121] 3. Y. Geiss, U. Dietrich, Catch Me If You Can - The Race Between HIV and Neutralizing Antibodies. AIDS Rev 17, 107-113 (2015).

    [0122] 4. E. Y. Dotsey et al., A High Throughput Protein Microarray Approach to Classify HIV Monoclonal Antibodies and Variant Antigens. PLoS One 10, e0125581 (2015).

    [0123] 5. K. A. Curtis et al., Development and characterization of a bead-based, multiplex assay for estimation of recent HIV type 1 infection. AIDS Res Hum Retroviruses 28, 188-197 (2012).

    [0124] 6. S. Delhalle, J. C. Schmit, A. Chevigne, Phages and HIV-1: from display to interplay. Int J Mol Sci 13, 4727-4794 (2012).

    [0125] 7. H. B. Larman et al., Autoantigen discovery with a synthetic human peptidome. Nat Biotechnol 29, 535-541 (2011).

    [0126] 8. G. J. Xu et al., Viral immunology. Comprehensive serological profiling of human populations using a synthetic human virome. Science 348, aaa0698 (2015).

    [0127] 9. G. Murphy, J. V. Parry, Assays for the detection of recent infections with human immunodeficiency virus type 1. Euro Surveill 13, (2008).

    [0128] 10. R. Guy et al., Accuracy of serological assays for detection of recent infection with HIV and estimation of population incidence: a systematic review. Lancet Infect Dis 9, 747-759 (2009).

    [0129] 11. M. P. Busch et al., Beyond detuning: 10 years of progress and new challenges in the development and application of assays for HIV incidence estimation. AIDS 24, 2763-2771 (2010).

    [0130] 12. O. Laeyendecker et al., Factors associated with incorrect identification of recent HIV infection using the BED capture immunoassay. AIDS Res Hum Retroviruses 28, 816-822 (2012).

    [0131] 13. O. Laeyendecker et al., Specificity of four laboratory approaches for cross-sectional HIV incidence determination: analysis of samples from adults with known nonrecent HIV infection from five African countries. AIDS Res Hum Retroviruses 28, 1177-1183 (2012).

    [0132] 14. R. Kassanjee et al., Independent assessment of candidate HIV incidence assays on specimens in the CEPHIA repository. AIDS 28, 2439-2449 (2014).

    [0133] 15. R. Brookmeyer, O. Laeyendecker, D. Donnell, S. H. Eshleman, Cross-sectional HIV incidence estimation in HIV prevention research. J Acquir Immune Defic Syndr 63 Suppl 2, S233-239 (2013).

    [0134] 16. J. E. Justman, O. Mugurungi, W. M. El-Sadr, HIV Population Surveys - Bringing Precision to the Global Response. N Engl J Med 378, 1859-1861 (2018).

    [0135] 17. T. J. Coates et al., Effect of community-based voluntary counselling and testing on HIV incidence and social and behavioural outcomes (NIMH Project Accept; HPTN 043): a cluster-randomised trial. Lancet Glob Health 2, e267-277 (2014).

    [0136] 18. C. S. Morrison et al., Hormonal contraceptive use and HIV disease progression among women in Uganda and Zimbabwe. J Acquir Immune Defic Syndr 57, 157-164 (2011).

    [0137] 19. C. S. Morrison et al., Plasma and cervical viral loads among Ugandan and Zimbabwean women during acute and early HIV-1 infection. AIDS 24, 573-582 (2010).

    [0138] 20. A. F. Longosz et al., Comparison of antibody responses to HIV infection in Ugandan women infected with HIV subtypes A and D. AIDS Res Hum Retroviruses 31, 421-427 (2015).

    [0139] 21. A. F. Longosz et al., Immune Responses in Ugandan Women Infected With Subtypes A and D HIV Using the BED Capture Immunoassay and an Antibody Avidity Assay. J Acquir Immune Defic Syndr 65, 390-396 (2014).

    [0140] 22. D. Mohan et al., PhIP-Seq characterization of serum antibodies using oligonucleotide encoded peptidomes. Nature Protocols In Press, (2018).

    [0141] 23. B. Langmead, C. Trapnell, M. Pop, S. L. Salzberg, Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10, R25 (2009).

    [0142] 24. T. Yuan et al., Improved analysis of phage immunoprecipitation sequencing (PhIP-Seq) data using a z-score algorithm. bioRxiv, (2018).

    [0143] 25. D. Monaco et al., Deconvoluting virome-wide antiviral antibody profiling data. bioRxiv, (2018).

    [0144] 26. S. K. Sharma, M. Soneja, HIV & immune reconstitution inflammatory syndrome (IRIS). Indian J Med Res 134, 866-877 (2011).

    [0146] 27. G. Touloumi et al., Impact of HIV-1 subtype on CD4 count at HIV seroconversion, rate of decline, and viral load set point in European seroconverter cohorts. Clin Infect Dis 56, 888-897 (2013).

    [0147] 28. O. T. Ng et al., HIV type 1 polymerase gene polymorphisms are associated with phenotypic differences in replication capacity and disease progression. J Infect Dis 209, 66-73 (2014).

    [0148] 29. J. M. Baeten et al., HIV-1 subtype D infection is associated with faster disease progression than subtype A in spite of similar plasma HIV-1 loads. J Infect Dis 195, 1177-1180 (2007).

    [0149] 30. J. B. Margolick et al., Impact of inversion of the CD4/CD8 ratio on the natural history of HIV-1 infection. J Acquir Immune Defic Syndr 42, 620-626 (2006).

    [0150] 31. G. Pantaleo et al., The qualitative nature of the primary immune response to HIV infection is a prognosticator of disease progression independent of the initial level of plasma viremia. Proc Natl Acad Sci USA 94, 254-258 (1997).

    [0151] 32. J. L. Fahey et al., The prognostic value of cellular and serologic markers in infection with human immunodeficiency virus type 1. N Engl J Med 322, 166-172 (1990).

    [0152] 33. C. Costello et al., HLA-B*5703 independently associated with slower HIV-1 disease progression in Rwandan women. AIDS 13, 1990-1991 (1999).

    [0153] 34. Y. Huang et al., The role of a mutant CCR5 allele in HIV-1 transmission and disease progression. Nat Med 2, 1240-1243 (1996).

    [0154] 35. S. K. Wendel et al., Short communication: The impact of viral suppression and viral breakthrough on Limited-Antigen Avidity assay results in individuals with clade B HIV infection. AIDS Res Hum Retroviruses 33, 325-327 (2017).

    [0155] 36. X. Wei et al., Development of two avidity-based assays to detect recent HIV type 1 seroconversion using a multisubtype gp41 recombinant protein. AIDS Res Hum Retroviruses 26, 61-71 (2010).

    [0156] 37. J. Konikoff et al., Performance of a limiting-antigen avidity enzyme immunoassay for cross-sectional estimation of HIV incidence in the United States. PLoS One 8, e82772 (2013).

    [0157] 38. O. Laeyendecker et al., Identification and validation of a multi-assay algorithm for cross-sectional HIV incidence estimation in populations with subtype C infection. J Int AIDS Soc 21, (2018).

    [0158] 39. C. Kadelka et al., Distinct, IgG1-driven antibody response landscapes demarcate individuals with broadly HIV-1 neutralizing activity. J Exp Med 215, 1589-1608 (2018).