METHOD FOR SEQUENCING A PROTEIN/POLYPEPTIDE USING AEROLYSIN NANOPORES
20230220002 · 2023-07-13
CPC classification
C07K1/1133
CHEMISTRY; METALLURGY
G01N33/48721
PHYSICS
Abstract
The present invention provides a method for sequencing a protein/polypeptide based on Aerolysin nanopores to achieve specific discrimination of natural amino acids and post-translational modifications thereof and accurate acquisition of a sequence of a single-molecule protein, the method including the following steps: (1) unfolding of the protein; (2) terminus labeling of protein sequencing; (3) protein charge screening; (4) unfolding of a tertiary structure of the polypeptides; (5) orthogonal identification of amino acids; (6) confined perturbation-assisted identification of amino acids; and (7) single-molecule protein sequencing. The present invention aims at sensitive detection of sequence information about 20 amino acids and establishes an innovative method for accurately determining sequences of the amino acids and post-translational modifications of a single protein molecule.
Claims
1. A method for sequencing a protein/polypeptide using Aerolysin nanopores, comprising the following steps: (1) unfolding of the protein; (2) terminus labeling of protein sequencing; (3) protein charge screening; (4) unfolding of a tertiary structure of the polypeptides; (5) orthogonal identification of amino acids; (6) confined perturbation-assisted identification of amino acids; and (7) single-molecule protein sequencing.
2. The method for sequencing a protein/polypeptide using Aerolysin nanopores according to claim 1, wherein in the unfolding of the protein of step (1), a single protein molecule must have its higher-order structure unraveled so that it enters a single nanopore in a linear chain form before the nanopore-based sequencing, and the single protein is unfolded by a temperature and pH regulation method.
3. The method for sequencing a protein/polypeptide using Aerolysin nanopores according to claim 1, wherein in the terminus labeling of protein sequencing of step (2), the N-terminus or C-terminus of an unfolded polypeptide chain is labeled with a peptide nucleic acid, an oligonucleotide, a polypeptide chain or an organic functional group having a specific sequence as a sequencing origin to obtain a label signal marking the starting point of the ion flow.
4. The method for sequencing a protein/polypeptide using Aerolysin nanopores according to claim 1, wherein in the protein charge screening of step (3), protein charge screening nanopores are designed.
5. The method for sequencing a protein/polypeptide using Aerolysin nanopores according to claim 1, wherein in the unfolding of a tertiary structure of the polypeptide of step (4), a tertiary structure unfolding nanopore is further designed for assisting the protein charge screening, that is, an unfolding region is constructed at an inlet of a biological nanopore to further open a molecular structure of the polypeptide, i.e., mutant T284/F/Y/I/L/W or G214/F/Y/I/L/W.
6. The method for sequencing a protein/polypeptide using Aerolysin nanopores according to claim 1, wherein in the orthogonal identification of amino acids of step (5), aiming at sequencing of linear polypeptide molecules for each chargeability, nanopores at least comprising the following six types of orthogonal nanopores are designed:
(a) identification of a first class of amino acids, comprising but not limited to H, R, K, E, D, Q, N and W, intended to be achieved based on an electrostatic interaction, i.e., mutant T218K/R/H/D/E, S278K/R/H/D/E, S276K/R/H/D/E, T274K/R/H/D/E or A224Q/N/D/E/R/K/H;
(b) identification of a second class of amino acids, comprising but not limited to Q, N, Y, T, S, C, G and H, intended to be achieved based on hydrogen bond and hydrophilic interactions, i.e., mutant T218N/Q, Q212R/K/H, D209S/T, S276Q/N, D222G/A/S or A224E/D;
(c) identification of a third class of amino acids, comprising but not limited to I, L, M, V, P, A, C and G, intended to be achieved based on a van der Waals interaction, i.e., mutant R220S/T/A, D222G/A, S236I/L/V, G270I/L, T232I/L/V, T274G/A/I/L or K238F/Y/W;
(d) identification of a fourth class of amino acids, comprising but not limited to W, P, F, Y, H, I, L and V, intended to be achieved based on a large π bond in side chains of part of the amino acids, i.e., mutant D222W/H/F/Y, S276F/Y, A224K/R/W, S272W/H or T274W/H/F/Y;
(e) identification of a fifth class of small-volume amino acids, comprising but not limited to A, C, G, S, T and V, intended to be achieved based on a small steric hindrance effect, i.e., mutant S276F/Y/I/L, S278F/Y/I/L/P, T274W/P, S236W or K238G/W/I/L/F/Y/P; and
(f) identification of a sixth class of large-volume amino acids, comprising but not limited to W, H, I, K, R and Y, intended to be achieved based on a large steric hindrance effect, i.e., mutant T218G/A, S276G/A, S278G/A, T274G/A, N226D/E or Q268S/T/G/A.
7. The method for sequencing a protein/polypeptide using Aerolysin nanopores according to claim 1, wherein in the confined perturbation-assisted identification of amino acids of step (6), in view of identification errors possibly introduced between part of amino acids with small structural differences and isomer amino acids, alternating electric field and optical perturbation measurement systems are introduced and a perturbation amplification nanopore for the perturbation systems is designed to further improve sequencing accuracy in combination with a specific nanopore, wherein the specific nanopore is shown as follows: (a) mutant S236D/E/K/H/R, A260D/E/K/H/R, K238H/R/D/E, T240D/E or S256H/R/W in combination with the alternating electric field perturbation system; and (b) mutant S236W/H, K238I/L, S256Y/F/W, P249W or V250I/L/F/Y/W in combination with the optical perturbation system.
8. The method for sequencing a protein/polypeptide using Aerolysin nanopores according to claim 1, wherein the single-molecule protein sequencing of step (7) is performed by: (a) bringing the protein or polypeptide into contact with the pore, so that the protein or polypeptide moves relative to the pore; and (b) measuring an ion current passing through the pore as the protein or polypeptide moves relative to the pore, wherein the current is indicative of one or more characteristics of the protein or polypeptide, and comprises shape, amplitude and duration of a current signal, resolving characteristics of the current signal according to a mathematical transformation, and creating a database of polypeptides for mutual correction of data, thereby characterizing the protein or polypeptide.
9. The method for sequencing a protein/polypeptide using Aerolysin nanopores according to claim 1, comprising the following specific steps:
sample pretreatment: breaking internal hydrogen bonds of the protein by raising the temperature to 60-100° C. and decreasing the pH of a solution to 0-5, and breaking S—S bonds of the protein using a reducing agent tris(2-carboxyethyl)phosphine (TCEP) or dithiothreitol (DTT) at the same time, so that polypeptide chains in the single protein are released and linearized;
specifically modifying the N-terminus of the polypeptide chain with a peptide nucleic acid (PNA), an oligonucleotide, a polypeptide chain or an organic functional group, so that a specific ion flow blockade signal or fluorescent signal is generated at the beginning or the end of the polypeptide entering the nanopore, thereby determining the starting point of sequencing of the single polypeptide molecule in the nanopore, and providing a starting time label for mutual correction of parallel sequencing signals of a plurality of orthogonal identification nanopores;
using a denaturant and designing and constructing a “tertiary structure unfolding nanopore” to achieve the unfolding of the tertiary structure of the polypeptide, wherein the “tertiary structure unfolding nanopore” is designed as follows: bionically constructing a central amino acid environment of a proteasome 19S domain at the inlet of the Aerolysin nanopore to enhance a specific non-covalent interaction between the polypeptide and the nanopore, and gradually destroying weak interactions inside the polypeptide molecule by virtue of electric driving forces to drive the polypeptide molecule to enter a confined pore and achieve linear unfolding, so that the major challenge that the tertiary structure of the polypeptide poses to sequencing of the polypeptide in the nanopore is overcome;
designing functionalized Aerolysin nanopores capable of driving polypeptides with different chargeabilities, and preliminarily screening the chargeabilities of the polypeptides to match the selection of orthogonal sequencing nanopores in the next step;
constructing 6 types of orthogonal identification nanopores for specifically identifying amino acid sequences of polypeptides for each chargeability based on an electrostatic interaction, hydrogen bond and hydrophilic interactions, a van der Waals interaction, a large π-bond interaction of amino acids, a large steric hindrance effect and a small steric hindrance effect;
introducing amino acids which readily form hydrogen bonds into the inlet region of each orthogonal identification nanopore, and adjusting the confined pore structure of the region, thereby designing and constructing a secondary structure label region of polypeptides, in which amino acid residues inside the pore will have a hydrogen bond interaction with the polypeptides with different secondary structures, thereby inducing changes in specific ion flow blocking and specific ion mobility, and forming ion flow characteristics of the secondary structure label for calibrating and denoising an ion flow electrical signal of single protein sequencing during data processing;
in view of amino acid identification errors possibly existing in the orthogonal amino acid identification, further identifying ion migration frequency characteristics by adopting an ion flow confined perturbation technique in combination with influences of specifically designed amplification temperature perturbation, alternating electric field perturbation and optical perturbation of the nanopore on the ion mobility in the pore, thereby improving the amino acid identification capability at a nanopore measurement interface, and accurately obtaining sequence information of the single protein molecule; and
making one or more measurements as the protein or polypeptide moves relative to the Aerolysin nanopore, specifically, measuring and analyzing a current passing through the pore, comprising characteristics such as amplitude, frequency, shape and duration of the current, thereby determining the presence or absence of one or more of the characteristics in the analyte; and resolving characteristics of the current signal according to a mathematical transformation, and creating a database of polypeptides for mutual correction of data, thereby characterizing the protein or polypeptide.
10. The method for sequencing a protein/polypeptide using Aerolysin nanopores according to claim 1, wherein the organic functional group in step (2) is FAM, VIC, CY5, HEX or ROX.
11. The method for sequencing a protein/polypeptide using Aerolysin nanopores according to claim 1, wherein the denaturant used in step (3) is guanidine hydrochloride or GdHCl.
12. The method for sequencing a protein/polypeptide using Aerolysin nanopores according to claim 1, wherein the electric driving forces in step (3) are an electrophoresis force, an electroosmotic flow, and a dielectrophoresis force.
13. The method for sequencing a protein/polypeptide using Aerolysin nanopores according to claim 1, wherein the designing functionalized Aerolysin nanopores capable of driving polypeptides with different chargeabilities in step (3) comprises: adopting 4 “protein charge screening nanopores” for specifically capturing 4 types of polypeptides with chargeabilities, namely negatively charged polypeptides, positively charged polypeptides, electrically neutral polypeptides with positive and negative charges shielded from each other, and electrically neutral polypeptides with positive and negative charges separated, respectively.
14. The method for sequencing a protein/polypeptide using Aerolysin nanopores according to claim 5, wherein the 4 “protein charge screening nanopores” are a “protein charge screening nanopore” for specifically identifying the negatively charged polypeptides, a “protein charge screening nanopore” for specifically identifying the positively charged polypeptides, a “protein charge screening nanopore” for specifically identifying the electrically neutral polypeptides with positive and negative charges shielded from each other, and a “protein charge screening nanopore” for specifically identifying the electrically neutral polypeptides with positive and negative charges separated, respectively; and the 4 nanopores are designed as follows:
(i) the “protein charge screening nanopore” for specifically identifying the negatively charged polypeptides: by adjusting the diameter of the key region inside the Aerolysin nanopore or transferring charges in the pore to a region of a larger diameter, that is, constructing mutant T274N/Q/I/L, T232D/E, K238H/D/R/F/A/C/G/Q/E/K/L/M/N/S/Y/T/I/W/P/V or S280T/N/Q/H/I/L, the electroosmotic flow inside the nanopore that is determined by the charge at the narrowest part of the pore is controlled to be zero, and the dielectrophoresis force is reduced, so that a single negatively charged polypeptide is driven into the nanopore by the electrophoresis force;
(ii) the “protein charge screening nanopore” for specifically identifying the positively charged polypeptides: by introducing or increasing the distribution of negative charges in the Aerolysin nanopore, that is, constructing mutant T274D/E, T218D/E, S276D/E, S278D/E, K238A/N/D/E/Q, R282D/E/S/T/N/Q/A or R220D/E/S/T/N/Q/A, the electroosmotic flow determined by cations inside the pore is adjusted; in the experiment, a reverse voltage is applied to achieve efficient capture of a single positively charged polypeptide;
(iii) the “protein charge screening nanopore” for specifically identifying the electrically neutral polypeptides with positive and negative charges shielded from each other: for the electrically neutral polypeptide with positive and negative charges shielded from each other, by introducing positively charged amino acids into a region of a smaller diameter of the Aerolysin nanopore, that is, constructing mutant T218K/R/H/N/Q, S276K/R/H, S278K/R/H/N/Q, S274K/R/H, N226K/R/H, S272K/R/H, G270K/R/H, S228K/R/H, Q268K/R/H, T230K/R/H, A266K/R/H, T232K/R/H, S264K/R/H, G234K/R/H, N262K/R/H, S236K/R/H, A260K/R/H or S280N/Q, the electroosmotic flow determined by anions inside the pore is enhanced, so that the capture efficiency of the nanopore to the electrically neutral polypeptide with positive and negative charges shielded from each other is enhanced, and a specific ion flow response is obtained; and
(iv) the “protein charge screening nanopore” for specifically identifying the electrically neutral polypeptides with positive and negative charges separated: for the electrically neutral polypeptides with positive and negative charges separated, by enhancing the potential gradient at the inlet of the Aerolysin nanopore, that is, constructing mutant S280Q/N/A, T284Q/N/A or G214Q/N/A, the non-linear electric field strength is adjusted, so that a single electrically neutral polypeptide with positive and negative charges separated is driven into the pore by the dielectrophoretic force.
15. The method for sequencing a protein/polypeptide using Aerolysin nanopores according to claim 5, wherein the 6 types of orthogonal identification nanopores of step (5) are constructed as follows:
(I) based on the electrostatic interaction, charged amino acids are introduced into an existing current sensing region in the pore, that is, constructing mutant T218K/R/H/D/E, S278K/R/H/D/E, S276K/R/H/D/E, T274K/R/H/D/E or A224Q/N/D/E/R/K/H; the introduction of the charged amino acids can enhance hydrogen bond, salt bond and cation-π interactions between the pore and the amino acids to be sequenced, so that identification of a first class of amino acids, comprising but not limited to H, R, K, E, D, Q, N and W, is achieved;
(II) based on the hydrogen bond and hydrophilic interactions, the potential gradient of a current sensing region in the pore is regulated, that is, constructing mutant T218N/Q, Q212R/K/H, D209S/T, S276Q/N, D222G/A/S or A224E/D, to increase the speed of charged amino acids passing through the region and prolong the retention time of polar uncharged amino acids in the region, so that identification of a second class of amino acids, comprising but not limited to Q, N, Y, T, S, C, G, and H, is achieved; wherein histidine (His) has an R group with a pKa of about 7, and can be made uncharged through fine adjustment of pH, thus enabling characteristic differentiation in a specific nanopore based on its hydrogen bond interaction with a key sensing region of the nanopore;
(III) based on the van der Waals interaction, the overall potential distribution and the stereostructure distribution of the pore are regulated, and hydrophobic amino acids are introduced, that is, constructing mutant R220S/T/A, D222G/A, S236I/L/V, G270I/L, T232I/L/V, T274G/A/I/L or K238F/Y/W, to transfer the current sensing region from an electrostatically sensitive region of a wild-type pore to a hydrophobic region of a mutant pore, and prolong the retention time of a specific amino acid in the region through the interaction to obtain a characteristic ion flow signal, so that identification of a third class of amino acids, comprising but not limited to I, L, M, V, P, A, C and G, is achieved;
(IV) based on the large π-bond interaction of part of amino acids, the composition of a current sensing region in the Aerolysin nanopore is reconstituted on the basis of regulating the stereostructure and potential distribution of the pore, and a sensitive region dominated by positively charged amino acids and hydrophobic amino acids is constructed, that is, constructing mutant D222W/H/F/Y, S276F/Y, A224K/R/W, S272W/H or T274W/H/F/Y, to enhance the π-π interaction, cation-π bond interaction, and π-π interaction of the sensitive region with a specific amino acid, so that identification of a fourth class of amino acids, comprising but not limited to W, P, F, Y, H, I, L and V, is achieved;
(V) based on the large steric hindrance effect, the confined space of the current sensing region in the pore is further reduced, and the steric hindrance of the region is increased, that is, constructing mutant S276F/Y/I/L, S278F/Y/I/L/P, T274W/P, S236W or K238G/W/I/L/F/Y/P, to prolong the time of all amino acids passing through the region, enhance the current amplitude of ionic flow signals of small-volume amino acids, and enable the large-volume amino acids to generate a nearly fully blocked ion flow step, so that the volume of amino acids is specifically distinguished, and identification of a fifth class of small-volume amino acids, comprising but not limited to A, C, G, S, T and V, is achieved; and
(VI) based on the small steric hindrance effect, the stereostructure in the pore is regulated, the size of a key current region is increased, that is, constructing mutant T218G/A, S276G/A, S278G/A, T274G/A, N226D/E or Q268S/T/G/A, and the electroosmotic flow in the nanopore is reduced based on the overall chargeability of the polypeptide, so that the current response of small-volume amino acids is further reduced, the current difference of large-volume amino acids is increased, and identification of a sixth class of large-volume amino acids, comprising but not limited to W, H, I, K, R and Y, is achieved.
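By way of non-limiting illustration only, the mutant shorthand used in the claims can be expanded programmatically. The Python sketch below assumes that a term such as T218K/R/H/D/E denotes alternative single substitutions at one position (wild-type residue T at position 218 replaced by K, R, H, D or E). That reading, and the helper names expand_mutants and format_mutant, are assumptions made for illustration and form no part of the claimed method. Double mutants such as T232K/K238Q are outside the scope of the sketch and are rejected.

```python
import re
from typing import List, Tuple

# Hypothetical helper, not part of the claimed method.
# Accepts "T218K/R/H/D/E" and the variant spelling "T284/F/Y/I/L/W".
_SHORTHAND = re.compile(r"^([A-Z])(\d+)/?([A-Z](?:/[A-Z])*)$")


def expand_mutants(shorthand: str) -> List[Tuple[str, int, str]]:
    """Expand one shorthand term into (wild_type, position, substitute) triples."""
    match = _SHORTHAND.match(shorthand.strip())
    if match is None:
        raise ValueError(f"unrecognized mutant shorthand: {shorthand!r}")
    wild_type, position, substitutes = match.groups()
    triples: List[Tuple[str, int, str]] = []
    for residue in substitutes.split("/"):
        if residue == wild_type:
            continue  # substituting a residue with itself is not a mutant
        triple = (wild_type, int(position), residue)
        if triple not in triples:  # drop repeated alternatives
            triples.append(triple)
    return triples


def format_mutant(triple: Tuple[str, int, str]) -> str:
    """Render a triple back into the usual single-mutant name, e.g. T218K."""
    wild_type, position, residue = triple
    return f"{wild_type}{position}{residue}"


if __name__ == "__main__":
    for triple in expand_mutants("T218K/R/H/D/E"):
        print(format_mutant(triple))
```

Under these assumptions, each expanded name corresponds to one candidate pore variant, which makes it straightforward to enumerate or de-duplicate the variants listed across claims.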
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0056]
[0057]
[0058]
[0059]
[0060]
[0061]
DESCRIPTION OF THE EMBODIMENTS
Example 1
[0062] Provided was a method for sequencing polypeptide molecules by a cysteine-specific Aerolysin nanopore, in which Glu was taken as a guide chain for a polypeptide, and amino acid sequences of two polypeptide molecules were Glu-Gly-Cys and Glu-Cys-Gly, respectively. The specific steps are as follows:
[0063] (1) N226Q and T232K protein charge screening nanopores were designed, and the corresponding mutant Proaerolysin proteins were generated by site-directed mutagenesis, then expressed and purified for pore construction, following the specific steps described in patent CN202010131704.8.
[0064] (2) Proaerolysin protein at 1 mg/mL was mixed with trypsin in a 10:1 ratio and incubated at room temperature for 6 h to obtain Aerolysin monomeric protein with pore-forming activity.
[0065] (3) The experiment temperature was controlled at 22±1° C. 1 mL of buffer solution (1.0 M KCl, 10 mM Tris, 1.0 mM EDTA, pH=8) was added to each of the two detection cells, and a phospholipid bilayer was prepared by the Czochralski method, following the specific steps described in patent CN201510047662.9.
[0066] (4) After a stable phospholipid bilayer was formed, a voltage of 200 mV was applied and 1 μL of Aerolysin monomeric protein was added into the cis detection cell. The Aerolysin monomers self-assembled into a heptamer and inserted into the phospholipid membrane to form a stable nanopore; at the same time the ion current jumped, and a stable open-pore current was obtained under a voltage of 100 mV.
[0067] (5) 44 of 50 mM tripeptide chain was added into the cis detection cell, and an external voltage of 120 mV was applied. The acquired raw current traces are shown in
[0068] (6) A T232K/K238Q double-mutant polypeptide sequencing pore was designed, and the mutant Proaerolysin protein was generated by site-directed mutagenesis, then expressed and purified for pore construction.
[0069] (7) The nanopore construction steps were repeated, and the two polypeptide molecules were each added into the cis detection cell. As shown in
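By way of non-limiting illustration, the amplitude and duration of blockade events in current traces such as those acquired in steps (5) and (7) could be extracted by thresholding against the open-pore current. The Python sketch below is an example under stated assumptions, not the analysis actually used in this example: the function name detect_events, the 80% threshold, the 100 kHz sampling rate and the synthetic trace values are all hypothetical.

```python
from typing import List, NamedTuple, Sequence


class BlockadeEvent(NamedTuple):
    start_index: int       # first sample below the threshold
    duration_s: float      # dwell time in seconds
    residual_ratio: float  # mean blocked current divided by open-pore current


def detect_events(
    current_pa: Sequence[float],
    open_pore_pa: float,
    sample_rate_hz: float,
    threshold_fraction: float = 0.8,  # hypothetical cut-off
) -> List[BlockadeEvent]:
    """Find runs of samples below threshold_fraction * open_pore_pa.

    Assumes a positive open-pore current. An event still in progress at the
    end of the trace is discarded because its duration is unknown.
    """
    threshold = threshold_fraction * open_pore_pa
    events: List[BlockadeEvent] = []
    start = None
    for index, value in enumerate(current_pa):
        blocked = value < threshold
        if blocked and start is None:
            start = index  # event begins
        elif not blocked and start is not None:
            segment = current_pa[start:index]
            mean_blocked = sum(segment) / len(segment)
            events.append(
                BlockadeEvent(
                    start_index=start,
                    duration_s=(index - start) / sample_rate_hz,
                    residual_ratio=mean_blocked / open_pore_pa,
                )
            )
            start = None  # event ends
    return events


# Synthetic trace (hypothetical values): 50 pA open pore with two blockades.
trace = [50.0] * 5 + [20.0] * 3 + [50.0] * 4 + [10.0] * 2 + [50.0] * 3
events = detect_events(trace, open_pore_pa=50.0, sample_rate_hz=100_000.0)
for event in events:
    print(event)
```

Scatter plots of residual ratio against duration, built from such event lists, are one common way to compare the signals of different analytes; real recordings would additionally need filtering and baseline tracking, which this sketch omits.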
Example 2
[0070] Provided was a method for detecting phosphorylated polypeptides using mutant Aerolysin nanopores, in which S-K-I-G was used as a guide chain, the sequence of a template polypeptide was S-K-I-G-S-T-E-N-L, and the sequences obtained by phosphorylation of the serine at the fifth position and of the threonine at the sixth position were S-K-I-G-pS-T-E-N-L and S-K-I-G-S-pT-E-N-L, respectively, where pS and pT denote the phosphorylated residues. The specific steps are as follows:
[0071] (1) A wild-type preliminary chargeability screening nanopore was designed, and a wild-type Proaerolysin protein was expressed and purified for pore construction, following the specific steps described in patent CN202010131704.8.
[0072] (2) Proaerolysin protein at 1 mg/mL was mixed with trypsin in a 10:1 ratio and incubated at room temperature for 6 h to obtain Aerolysin monomeric protein with pore-forming activity.
[0073] (3) The experiment temperature was controlled at 22±1° C. 1 mL of buffer solution (1.0 M KCl, 10 mM Tris, 1.0 mM EDTA, pH=8) was added to each of the two detection cells, and a phospholipid bilayer was prepared by the Czochralski method, following the specific steps described in patent CN201510047662.9.
[0074] (4) After a stable phospholipid bilayer was formed, a voltage of 200 mV was applied and 1 μL of Aerolysin monomeric protein was added into the cis detection cell. The Aerolysin monomers self-assembled into a heptamer and inserted into the phospholipid membrane to form a stable nanopore; at the same time the ion current jumped, and a stable open-pore current was obtained.
[0075] (5) 5 μL of 1 mM polypeptide solution was added into the cis detection cell, and an external voltage of 100 mV was applied. The acquired raw current traces are shown in
[0076] (6) A T232K/K238Q double-mutant phosphorylation detection pore was designed, and the mutant Proaerolysin protein was generated by site-directed mutagenesis, then expressed and purified for pore construction.
[0077] (7) The nanopore construction steps were repeated, and the three polypeptide molecules were each added into the cis detection cell. As shown in
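By way of non-limiting illustration, the analyte concentration in the cis detection cell after step (5) can be estimated from the stated volumes (5 μL of 1 mM polypeptide solution added to 1 mL of buffer). The Python sketch below assumes ideal mixing and additive volumes and ignores the 1 μL of Aerolysin monomer added earlier; the function name final_concentration_um is hypothetical.

```python
def final_concentration_um(stock_mm: float, added_ul: float, cell_ml: float) -> float:
    """Analyte concentration (micromolar) after adding stock solution to the cell.

    Assumes ideal mixing and additive volumes; other additions to the cell
    are ignored.
    """
    added_ml = added_ul / 1000.0      # microlitres to millilitres
    total_ml = cell_ml + added_ml     # final volume in the cell
    stock_um = stock_mm * 1000.0      # millimolar to micromolar
    return stock_um * added_ml / total_ml


# Values taken from step (5) of Example 2.
concentration = final_concentration_um(stock_mm=1.0, added_ul=5.0, cell_ml=1.0)
print(f"{concentration:.2f} uM")  # about 4.98 uM
```

Under these assumptions the polypeptide is present at roughly 5 μM in the cis cell during recording.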
[0078] The general principles, principal features, and advantages of the present invention are revealed and described in the above examples. It should be understood by those skilled in the art that the present invention is not limited to the above examples, which are merely illustrative of the principles of the present invention. Various changes and modifications may be made without departing from the spirit and scope of the present invention, and those changes and modifications fall into the claimed scope of the present invention. The claimed scope of the present invention is defined by the appended claims and the equivalents thereof.