Coiled-coil connector

11090381 · 2021-08-17

Abstract

A connector of a helical coiled coil-type structure connecting a first and a second molecule, wherein the first molecule comprises a peptidic first alpha-helix and the second molecule comprises a peptidic second alpha-helix, which second alpha-helix is coiled to the first alpha-helix. The first alpha-helix comprises a C-terminal region consisting of a repeat of an amino acid motif a-x.sub.1 and the C-terminal motif a-x.sub.1-x.sub.2, or an N-terminal region consisting of a repeat of an amino acid motif x.sub.1-a and the N-terminal motif x.sub.2-x.sub.1-a, where a is a motif sequence of 4-8 amino acids, x.sub.1 is Lysine, and x.sub.2 is an extension of the motif. The extension consists of 1-10 amino acids and does not comprise more than 4 consecutive amino acids incorporated in said motif a-x.sub.1 or x.sub.1-a. In a multimeric protein, two polypeptide chains can be connected to each other by such connector.
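The motif layout described above can be sketched in code. This is a minimal illustration under stated assumptions, not the claimed sequences: the heptad `EVSALEK` (a = `EVSALE`, x1 = `K`) and its counterpart `KVSALKE` are placeholders chosen here because the SEQ ID sequences are not reproduced in this section, and all function names are hypothetical. The extension check reads "not more than 4 consecutive amino acids incorporated in said motif" as: no window of 5 consecutive residues of the extension occurs within the motif.

```python
# Placeholder heptad, assumed for illustration only; not taken from the sequence listing.
A_PART = "EVSALE"   # motif part "a" (4-8 amino acids)
X1 = "K"            # x1 is Lysine


def extension_is_valid(x2: str, motif: str) -> bool:
    """Check x2: 1-10 residues, and no run of 5 residues that also occurs in the motif."""
    if not 1 <= len(x2) <= 10:
        return False
    # Slide a 5-residue window over x2; a hit means more than 4 consecutive
    # amino acids of the motif are incorporated in the extension.
    for i in range(len(x2) - 4):
        if x2[i:i + 5] in motif:
            return False
    return True


def _check_parts(a: str, x1: str, x2: str, motif: str) -> None:
    """Validate the parts against the ranges stated in the abstract."""
    if not 4 <= len(a) <= 8:
        raise ValueError("a must be 4-8 amino acids long")
    if x1 != "K":
        raise ValueError("x1 must be Lysine (K)")
    if not extension_is_valid(x2, motif):
        raise ValueError("x2 violates the extension rules")


def build_c_terminal_region(a: str, x1: str, x2: str, repeats: int) -> str:
    """Repeats of a-x1, the last of which carries the C-terminal extension x2."""
    motif = a + x1
    _check_parts(a, x1, x2, motif)
    return motif * repeats + x2


def build_n_terminal_region(a: str, x1: str, x2: str, repeats: int) -> str:
    """N-terminal extension x2 followed by repeats of x1-a."""
    motif = x1 + a
    _check_parts(a, x1, x2, motif)
    return x2 + motif * repeats
```

With the placeholder heptad, `build_c_terminal_region(A_PART, X1, "A", 5)` gives five heptads followed by a single Alanine, which mirrors the layout described for scFv-coil1ala in FIG. 2.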

Claims

1. A multimeric protein comprising a first polypeptide and a second polypeptide joined by a connector, the connector consisting of a coiled coil-type structure consisting of a first peptidic alpha-helix and a second peptidic alpha-helix, wherein the first and the second peptidic alpha-helices form the coiled coil-type structure, wherein: i) the first peptidic alpha-helix consists of repeats of the amino acid motif a-x.sub.1 and a C-terminal motif a-x.sub.1-x.sub.2, and the second peptidic alpha-helix consists of repeats of an amino acid motif x.sub.1-a selected from the group consisting of SEQ ID NOs: 2, 4, 6, and 8, wherein the first polypeptide of the multimeric protein is fused to the N-terminus of the first peptidic alpha-helix, and the second polypeptide of the multimeric protein is fused to the N-terminus or to the C-terminus of the second peptidic alpha-helix; or ii) the first peptidic alpha-helix consists of an N-terminal motif x.sub.2-x.sub.1-a and repeats of the amino acid motif x.sub.1-a, and the second alpha-helix consists of repeats of an amino acid motif a-x.sub.1 selected from the group consisting of SEQ ID NOs: 1, 3, 5, and 7, wherein the first polypeptide of the multimeric protein is fused to the C-terminus of the first peptidic alpha-helix, and the second polypeptide of the multimeric protein is fused to the N-terminus or to the C-terminus of the second peptidic alpha-helix; wherein in the first peptidic alpha-helix: a1) the repeating motif a-x.sub.1 is SEQ ID NO:1 and the C-terminal motif a-x.sub.1-x.sub.2 is SEQ ID NO:1 with a C-terminal extension x.sub.2 consisting of 1-2 amino acids; a2) the repeating motif a-x.sub.1 is SEQ ID NO:1 and the C-terminal motif a-x.sub.1-x.sub.2 is SEQ ID NO:41, SEQ ID NO:69, or SEQ ID NO:71; b1) the repeating motif a-x.sub.1 is SEQ ID NO:3 and the C-terminal motif a-x.sub.1-x.sub.2 is SEQ ID NO:3 with a C-terminal extension x.sub.2 consisting of 1-2 amino acids; b2) the repeating motif a-x.sub.1 is SEQ ID NO:3 and the C-terminal motif a-x.sub.1-x.sub.2 is SEQ ID NO:43, SEQ ID NO:85, or SEQ ID NO:87; c1) the repeating motif a-x.sub.1 is SEQ ID NO:5 and the C-terminal motif a-x.sub.1-x.sub.2 is SEQ ID NO:5 with a C-terminal extension x.sub.2 consisting of 1-2 amino acids; c2) the repeating motif a-x.sub.1 is SEQ ID NO:5 and the C-terminal motif a-x.sub.1-x.sub.2 is SEQ ID NO:45, SEQ ID NO:101, or SEQ ID NO:103; d1) the repeating motif a-x.sub.1 is SEQ ID NO:7 and the C-terminal motif a-x.sub.1-x.sub.2 is SEQ ID NO:7 with a C-terminal extension x.sub.2 consisting of 1-2 amino acids; d2) the repeating motif a-x.sub.1 is SEQ ID NO:7 and the C-terminal motif a-x.sub.1-x.sub.2 is SEQ ID NO:47, SEQ ID NO:117, or SEQ ID NO:119; e1) the repeating motif x.sub.1-a is SEQ ID NO:2 and the N-terminal motif x.sub.2-x.sub.1-a is SEQ ID NO:2 with an N-terminal extension x.sub.2 consisting of 1-2 amino acids; e2) the repeating motif x.sub.1-a is SEQ ID NO:2 and the N-terminal motif x.sub.2-x.sub.1-a is SEQ ID NO:42, SEQ ID NO:77, or SEQ ID NO:79; f1) the repeating motif x.sub.1-a is SEQ ID NO:4 and the N-terminal motif x.sub.2-x.sub.1-a is SEQ ID NO:4 with an N-terminal extension x.sub.2 consisting of 1-2 amino acids; f2) the repeating motif x.sub.1-a is SEQ ID NO:4 and the N-terminal motif x.sub.2-x.sub.1-a is SEQ ID NO:44, SEQ ID NO:93, or SEQ ID NO:95; g1) the repeating motif x.sub.1-a is SEQ ID NO:6 and the N-terminal motif x.sub.2-x.sub.1-a is SEQ ID NO:6 with an N-terminal extension x.sub.2 consisting of 1-2 amino acids; g2) the repeating motif x.sub.1-a is SEQ ID NO:6 and the N-terminal motif x.sub.2-x.sub.1-a is SEQ ID NO:46, SEQ ID NO:109, or SEQ ID NO:111; h1) the repeating motif x.sub.1-a is SEQ ID NO:8 and the N-terminal motif x.sub.2-x.sub.1-a is SEQ ID NO:8 with an N-terminal extension x.sub.2 consisting of 1-2 amino acids; or h2) the repeating motif x.sub.1-a is SEQ ID NO:8 and the N-terminal motif x.sub.2-x.sub.1-a is SEQ ID NO:48, SEQ ID NO:125, or SEQ ID NO:127.

2. The multimeric protein of claim 1, wherein at least one of the 1-2 amino acids of x.sub.2 is selected from the group consisting of Alanine, Glycine, Serine, Isoleucine, Leucine, Phenylalanine, Valine, and Proline.

3. The multimeric protein of claim 2, wherein x.sub.2 consists of two amino acids, and the first amino acid is selected from the group consisting of Alanine, Glycine, Serine, Isoleucine, Leucine, Phenylalanine, Valine, and Proline.

4. The multimeric protein of claim 1, wherein the number of repeating amino acid motifs in each of said first and second alpha-helices is 2-9.

5. The multimeric protein of claim 1, wherein the first and second alpha-helices are parallel or anti-parallel alpha-helices.

6. The multimeric protein of claim 1, comprising a heterodimeric coiled-coil, wherein the repeating amino acid motif in the first alpha-helix is different from the repeating amino acid motif in the second alpha-helix.

7. The multimeric protein of claim 1, wherein x.sub.2 consists of 1 amino acid.

8. The multimeric protein of claim 7, wherein the amino acid is any of Alanine, Glycine, or Serine.

9. The multimeric protein of claim 1, wherein the first and second polypeptide are independently selected from the group consisting of: a) a vaccine antigen or adjuvant; b) an antibody or antibody domain; c) an antibody derivative comprising an antibody connected to a label, tag, enzyme, receptor, ligand, toxin, lipid, carbohydrate, or nucleic acid; and d) a receptor or receptor ligand connected to a solid phase carrier.

10. A multimeric protein comprising a first polypeptide and a second polypeptide joined by a connector, the connector consisting of a coiled coil-type structure consisting of a first peptidic alpha-helix and a second peptidic alpha-helix, wherein the first and second peptidic alpha-helices form the coiled coil-type structure, wherein: i) the first peptidic alpha-helix consists of repeats of the amino acid motif a-x.sub.1 and a C-terminal motif a-x.sub.1-x.sub.2, and the second peptidic alpha-helix consists of repeats of an amino acid motif x.sub.1-a selected from the group consisting of SEQ ID NOs: 2, 4, 6, and 8, wherein the first polypeptide of the multimeric protein is fused to the N-terminus of the first peptidic alpha-helix, and the second polypeptide of the multimeric protein is fused to the N-terminus or to the C-terminus of the second peptidic alpha-helix; or ii) the first peptidic alpha-helix consists of an N-terminal motif x.sub.2-x.sub.1-a and repeats of the amino acid motif x.sub.1-a, and the second peptidic alpha-helix consists of repeats of an amino acid motif a-x.sub.1 selected from the group consisting of SEQ ID NOs: 1, 3, 5, and 7, wherein the first polypeptide of the multimeric protein is fused to the C-terminus of the first peptidic alpha-helix, and the second polypeptide of the multimeric protein is fused to the N-terminus or to the C-terminus of the second peptidic alpha-helix; wherein in the first peptidic alpha-helix: a1) the repeating motif a-x.sub.1 is SEQ ID NO: 1 and the C-terminal motif a-x.sub.1-x.sub.2 is SEQ ID NO: 1 with a C-terminal extension x.sub.2 consisting of 1-2 amino acids; a2) the repeating motif a-x.sub.1 is SEQ ID NO: 1 and the C-terminal motif a-x.sub.1-x.sub.2 is SEQ ID NO: 41; b1) the repeating motif a-x.sub.1 is SEQ ID NO: 3 and the C-terminal motif a-x.sub.1-x.sub.2 is SEQ ID NO: 3 with a C-terminal extension x.sub.2 consisting of 1-2 amino acids; b2) the repeating motif a-x.sub.1 is SEQ ID NO: 3 and the C-terminal motif a-x.sub.1-x.sub.2 is SEQ ID NO: 43; c1) the repeating motif a-x.sub.1 is SEQ ID NO: 5 and the C-terminal motif a-x.sub.1-x.sub.2 is SEQ ID NO: 5 with a C-terminal extension x.sub.2 consisting of 1-2 amino acids; c2) the repeating motif a-x.sub.1 is SEQ ID NO: 5 and the C-terminal motif a-x.sub.1-x.sub.2 is SEQ ID NO: 45; d1) the repeating motif a-x.sub.1 is SEQ ID NO: 7 and the C-terminal motif a-x.sub.1-x.sub.2 is SEQ ID NO: 7 with a C-terminal extension x.sub.2 consisting of 1-2 amino acids; d2) the repeating motif a-x.sub.1 is SEQ ID NO: 7 and the C-terminal motif a-x.sub.1-x.sub.2 is SEQ ID NO: 47; e1) the repeating motif x.sub.1-a is SEQ ID NO: 2 and the N-terminal motif x.sub.2-x.sub.1-a is SEQ ID NO: 2 with an N-terminal extension x.sub.2 consisting of 1-2 amino acids; e2) the repeating motif x.sub.1-a is SEQ ID NO: 2 and the N-terminal motif x.sub.2-x.sub.1-a is SEQ ID NO: 42; f1) the repeating motif x.sub.1-a is SEQ ID NO: 4 and the N-terminal motif x.sub.2-x.sub.1-a is SEQ ID NO: 4 with an N-terminal extension x.sub.2 consisting of 1-2 amino acids; f2) the repeating motif x.sub.1-a is SEQ ID NO: 4 and the N-terminal motif x.sub.2-x.sub.1-a is SEQ ID NO: 44; g1) the repeating motif x.sub.1-a is SEQ ID NO: 6 and the N-terminal motif x.sub.2-x.sub.1-a is SEQ ID NO: 6 with an N-terminal extension x.sub.2 consisting of 1-2 amino acids; g2) the repeating motif x.sub.1-a is SEQ ID NO: 6 and the N-terminal motif x.sub.2-x.sub.1-a is SEQ ID NO: 46; h1) the repeating motif x.sub.1-a is SEQ ID NO: 8 and the N-terminal motif x.sub.2-x.sub.1-a is SEQ ID NO: 8 with an N-terminal extension x.sub.2 consisting of 1-2 amino acids; or h2) the repeating motif x.sub.1-a is SEQ ID NO: 8 and the N-terminal motif x.sub.2-x.sub.1-a is SEQ ID NO: 48.

11. The multimeric protein of claim 10, wherein at least one of the 1-2 amino acids of x.sub.2 is selected from the group consisting of Alanine, Glycine, Serine, Isoleucine, Leucine, Phenylalanine, Valine, and Proline.

12. The multimeric protein of claim 11, wherein x.sub.2 consists of two amino acids, and the first amino acid is selected from the group consisting of Alanine, Glycine, Serine, Isoleucine, Leucine, Phenylalanine, Valine, and Proline.

13. The multimeric protein of claim 10, wherein the number of repeating amino acid motifs in each of said first and second alpha-helices is 2-9.

14. The multimeric protein of claim 10, wherein the first and second alpha-helices are parallel or anti-parallel alpha-helices.

15. The multimeric protein of claim 10, comprising a heterodimeric coiled-coil, wherein the repeating amino acid motif in the first alpha-helix is different from the repeating amino acid motif in the second alpha-helix.

16. The multimeric protein of claim 10, wherein x.sub.2 consists of 1 amino acid.

17. The multimeric protein of claim 16, wherein the amino acid is any of Alanine, Glycine, or Serine.

Description

FIGURES

(1) FIG. 1: Sequence of an exemplary scFv-coil1 (SEQ ID NO: 49)

(2) Underlined: VH and VL domains fused by a flexible linking sequence; the VH and VL domain sequences are obtained from a monoclonal antibody IV.3 that specifically recognizes CD32a (Stuart et al. (1987) J. Exp. Med. 166: 1668).

(3) Normal type set: flexible linker (may be any linker);

(4) In italics: pepE coil: 5 repeats of a heptad motif, C-terminal Lysine (italics and bold).

(5) FIG. 2: Sequence of an exemplary scFv-coil1ala (SEQ ID NO: 50)

(6) Underlined: VH and VL domains fused by a flexible linking sequence

(7) Normal type set: flexible linker (may be any linker);

(8) In italics: pepE coil: 5 repeats of a heptad motif, C-terminal Alanine (italics and bold) as a C-terminal extension.

(9) FIG. 3: pepK (SEQ ID NO: 51)

(10) In italics: pepK coil: 5 repeats of a heptad motif, N-terminal Lysine (italics and bold).

(11) FIG. 4:

(12) A: Mass spectrometry of scFv-coil1 from HEK293 cells

(13) The expected molecular weight (MW) based on the sequence of FIG. 1 is 30764 Da.

(14) The MWs found are 30755.5 Da (~60% of total material) and 30630 Da (~40% of total material); the latter corresponds to the expected MW minus Lys(289).

(15) B: Mass spectrometry of scFv-coil1 from S2 insect cells

(16) The expected molecular weight (MW) based on the sequence of FIG. 1 is 30764 Da.

(17) The MW found is 30755.5 Da, indicating full-length protein.

(18) C: Mass spectrometry of scFv-coil1ala from HEK293 cells

(19) The expected molecular weight (MW) based on the sequence of FIG. 2 is 30835.4 Da.

(20) The MW found is 30831.0 Da, indicating full-length protein.
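The mass comparison for FIG. 4 can be reproduced with simple arithmetic. The sketch below uses only the masses stated above plus one value that is not from this document: the average mass of a lysine residue within a peptide chain (128.17 Da, i.e. free lysine 146.19 Da minus one water), a standard reference figure. Variable names are illustrative.

```python
# Average mass of a lysine residue in a peptide chain, in Da.
# Standard reference value, not taken from this document.
LYS_RESIDUE_MASS = 128.17

# Masses stated for scFv-coil1 (FIG. 4 A), in Da.
expected_full = 30764.0    # calculated from the sequence of FIG. 1
observed_major = 30755.5   # HEK293, about 60% of total material
observed_minor = 30630.0   # HEK293, about 40% of total material

# Mass difference between the two species observed in HEK293 cells.
observed_loss = observed_major - observed_minor

# Mass predicted for the sequence lacking the C-terminal lysine.
predicted_truncated = expected_full - LYS_RESIDUE_MASS

# Deviation of each observed species from its predicted mass.
deviation_full = expected_full - observed_major
deviation_truncated = predicted_truncated - observed_minor

print(f"observed loss:            {observed_loss:.2f} Da")
print(f"predicted truncated mass: {predicted_truncated:.2f} Da")
print(f"deviation (full length):  {deviation_full:.2f} Da")
print(f"deviation (truncated):    {deviation_truncated:.2f} Da")
```

The observed difference of 125.5 Da is close to the mass of one lysine residue, and both species deviate from their predicted masses by a similar few daltons, which is consistent with the assignment of the lighter species to the protein lacking Lys(289).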

(21) FIG. 5: Affinity analyses of pepK and scFv-coil1 or scFv-coil1ala.

DETAILED DESCRIPTION OF THE INVENTION

(22) Specific terms as used throughout the specification have the following meaning.

(23) The term “coil” as used herein shall mean a motif consisting of an amino acid sequence and a three-dimensional structure which is at least part of a peptidic alpha-helix. The coil repeat can make up a peptidic alpha-helix. Such coil repeat may include the repeat of identical motifs, or of different (e.g. variant) motifs, which are functional by recognizing a matching motif to assemble at least two peptidic alpha-helices with high affinity.

(24) Functionally active variants may be obtained by sequence alterations in the peptide sequence e.g., by one or more point mutations, wherein the sequence alterations substantially retain a function of the unaltered peptide sequence. Such sequence alterations or point mutations can include, but are not limited to, (conservative) substitutions, additions, deletions, mutations and insertions, e.g. the alteration of 1, 2, 3, or 4 amino acids, or by addition or insertion of one to several amino acids, e.g. 1, 2, 3, or 4 amino acids, or by a chemical derivatization of one to several amino acids, e.g. 1, 2, 3, or 4, or a combination thereof, preferably by point mutations that are not contiguous. The substitutions in amino acid residues may be conservative substitutions, for example, substituting one hydrophobic amino acid for an alternative hydrophobic amino acid.

(25) Conservative substitutions are those that take place within a family of amino acids that are related in their side chains and chemical properties. Examples of such families are amino acids with basic side chains, with acidic side chains, with non-polar aliphatic side chains, with non-polar aromatic side chains, with uncharged polar side chains, with small side chains, with large side chains etc.

(26) Preferred point mutations refer to the exchange of amino acids of the same polarity and/or charge. In this regard, amino acids refer to twenty naturally occurring amino acids encoded by sixty-four triplet codons. These 20 amino acids can be split into those that have neutral charges, positive charges, and negative charges:

(27) The “neutral” amino acids are shown below along with their respective three-letter and single-letter code and polarity:

(28) Alanine: (Ala, A) nonpolar, neutral;

(29) Asparagine: (Asn, N) polar, neutral;

(30) Cysteine: (Cys, C) nonpolar, neutral;

(31) Glutamine: (Gln, Q) polar, neutral;

(32) Glycine: (Gly, G) nonpolar, neutral;

(33) Isoleucine: (Ile, I) nonpolar, neutral;

(34) Leucine: (Leu, L) nonpolar, neutral;

(35) Methionine: (Met, M) nonpolar, neutral;

(36) Phenylalanine: (Phe, F) nonpolar, neutral;

(37) Proline: (Pro, P) nonpolar, neutral;

(38) Serine: (Ser, S) polar, neutral;

(39) Threonine: (Thr, T) polar, neutral;

(40) Tryptophan: (Trp, W) nonpolar, neutral;

(41) Tyrosine: (Tyr, Y) polar, neutral;

(42) Valine: (Val, V) nonpolar, neutral; and

(43) Histidine: (His, H) polar, positive (10%) neutral (90%).

(44) The “positively” charged amino acids are:

(45) Arginine: (Arg, R) polar, positive; and

(46) Lysine: (Lys, K) polar, positive.

(47) The “negatively” charged amino acids are:

(48) Aspartic acid: (Asp, D) polar, negative; and

(49) Glutamic acid: (Glu, E) polar, negative.

(50) Specifically, functional variants of a motif may include point mutations i.e., the insertion, deletion or substitution of 1 or 2 amino acids, in particular those which are conservative substitutions. Thereby, the coiled coil hydrophobic residue periodicity can be maintained.
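The classification above lends itself to a small lookup table. The following sketch transcribes the listed charge and polarity assignments and tests whether an exchange keeps both properties. It is an illustration only: the function names are hypothetical, Histidine is treated as neutral (an assumption, since the list gives it as 90% neutral), and the check applies the strict reading of "same polarity and/or charge", requiring both to match.

```python
# (charge, polarity) per one-letter code, transcribed from the list above.
# Histidine is treated as neutral here; this is an assumption.
PROPERTIES = {
    "A": ("neutral", "nonpolar"), "N": ("neutral", "polar"),
    "C": ("neutral", "nonpolar"), "Q": ("neutral", "polar"),
    "G": ("neutral", "nonpolar"), "I": ("neutral", "nonpolar"),
    "L": ("neutral", "nonpolar"), "M": ("neutral", "nonpolar"),
    "F": ("neutral", "nonpolar"), "P": ("neutral", "nonpolar"),
    "S": ("neutral", "polar"),    "T": ("neutral", "polar"),
    "W": ("neutral", "nonpolar"), "Y": ("neutral", "polar"),
    "V": ("neutral", "nonpolar"), "H": ("neutral", "polar"),
    "R": ("positive", "polar"),   "K": ("positive", "polar"),
    "D": ("negative", "polar"),   "E": ("negative", "polar"),
}


def is_preferred_exchange(old: str, new: str) -> bool:
    """True if the two residues differ but share both charge and polarity."""
    return old != new and PROPERTIES[old] == PROPERTIES[new]


def point_mutations(reference: str, variant: str) -> list[tuple[int, str, str]]:
    """List substitutions (1-based position, old, new) between equal-length motifs."""
    if len(reference) != len(variant):
        raise ValueError("only substitutions are compared; lengths must match")
    return [
        (i + 1, r, v)
        for i, (r, v) in enumerate(zip(reference, variant))
        if r != v
    ]
```

For example, a Leucine to Isoleucine exchange passes the check (both nonpolar and neutral), whereas Lysine to Glutamic acid does not, because the charge is inverted.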

(51) The term “peptidic alpha-helix” as used herein shall mean the alpha-helix structure based on a peptide sequence comprising a number of motif repeats, herein referred to as coil repeats. Such alpha-helix is capable of binding to one or more peptidic counterpart alpha-helices, also called matching alpha-helices to form a dimer, trimer or further oligomer, also called coiled coil (by longitudinal assembly or self-assembly of the matching alpha-helices).

(52) A coiled coil is a structural motif in polypeptides or peptides, in which two to seven alpha-helices are coiled together like the strands of a rope. In some embodiments, the coiled coil is one with two alpha-helices coiled together. Such alpha helical regions are likely to form coiled-coil structures and may be involved in oligomerization of the coil repeats as measured in a suitable coiled coil interaction binding assay.

(53) Specifically, a dimer of alpha-helices can be formed by contacting two monomers, such that the dimer is formed through an interaction with the two alpha helix coiled coil domains.

(54) According to a specific example, the coils comprise a heptad peptide e.g., with the amino acid sequence as set forth in SEQ ID NO: 7 or 8 (coil and anti-coil), which include n repeats, wherein n=3-5.

(55) The assembly of two matching peptidic alpha-helices is understood as a “coiled coil”, and the connector composed of a coiled coil, or an assembly of at least two polypeptide chains connected by a coiled coil, is understood as comprising a helical coiled coil-type structure.

(56) The term “terminal region” with respect to an amino acid sequence shall mean the C-terminal region, the N-terminal region, or both. The terminal region of a polypeptide (which term is herein understood as referring to both, a polypeptide or protein) sequence is specifically understood as the region of consecutive amino acids including the terminal amino acid, such region including e.g., the peptidic alpha-helix part of the polypeptide. Typically, the terminal region including or consisting of the peptidic alpha-helix includes or consists of at least 15, 20, 25, 30, 35, 40, 45, or 50 consecutive amino acids including the terminal amino acid (i.e., the terminus), e.g., preferably a number of amino acids ranging between 20 and 45 amino acids. For example, with reference to the terminal region including 5 repeats of a heptad motif, the terminal region includes 35 consecutive amino acids including the terminal amino acid.

(57) When the peptidic alpha-helix is positioned in a non-terminal region, the polypeptide sequence comprises the peptidic alpha-helix as consecutive amino acid sequence incorporated into the polypeptide sequence, however, not including a terminal amino acid, in particular not including a terminal region of at least 5, 10, 15, 20, 25, or 35 amino acids.
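The distinction between a terminal and a non-terminal position of the peptidic alpha-helix can be illustrated as follows. This is a simplified sketch: the heptad `EVSALEK` and the linker `GGGGS` are placeholders assumed for illustration, the function name is hypothetical, and a helix is called terminal only when it directly includes the terminal amino acid.

```python
def helix_position(polypeptide: str, helix: str) -> str:
    """Classify where the helix sequence sits within the polypeptide sequence."""
    if helix not in polypeptide:
        return "absent"
    if polypeptide.endswith(helix):
        return "C-terminal"      # helix includes the C-terminal amino acid
    if polypeptide.startswith(helix):
        return "N-terminal"      # helix includes the N-terminal amino acid
    return "non-terminal"        # helix is flanked by other residues on both sides


# Five repeats of a placeholder heptad give a 35-residue terminal region.
HELIX = "EVSALEK" * 5
LINKER = "GGGGS"  # placeholder flexible linker

print(len(HELIX))
print(helix_position(LINKER + HELIX, HELIX))
print(helix_position(HELIX + LINKER, HELIX))
print(helix_position(LINKER + HELIX + LINKER, HELIX))
```

Five repeats of a seven-residue motif give the 35 consecutive amino acids mentioned above for a terminal region of 5 heptad repeats.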

(58) The term “multimeric protein” is herein understood as a composite protein, which is composed of at least two polypeptide chains, and optionally one or more further components which may be of peptidic structure or otherwise (e.g., non-peptidic or non-polypeptide). Specifically, the multimeric protein is a proteinaceous construct, which comprises two peptides or polypeptides (dimer e.g., homodimer or heterodimer), or three peptides or polypeptides (trimer e.g., homotrimer or heterotrimer), or four peptides or polypeptides (tetramer e.g., homotetramer or heterotetramer).

(59) Herein the term “polypeptide” is understood to refer to both polypeptides and proteins, and the term “protein” shall always include polypeptides. The term “peptide” shall include an amino acid sequence with a length of 2-50 amino acids, in particular 5-40 amino acids.

(60) The multimeric protein can be a vaccine component, which is composed of at least one vaccine antigen or immunogen (in particular a peptide immunogen) and an adjuvant.

(61) The term “immunogen” as used herein shall mean one or more antigens triggering an immune response in a subject. The term “antigen” as used herein shall in particular refer to any antigenic determinant, which can be possibly recognized by a binding site of an antibody or is able to bind to the peptide groove of HLA class I or class II molecules and as such may serve as stimulant for specific T cells. The target antigen is either recognized as a whole target molecule or as a fragment of such molecule, especially substructures, e.g. a polypeptide or carbohydrate structure of targets, generally referred to as “epitopes” (e.g. B-cell epitopes, T-cell epitopes), which are immunologically relevant, i.e. are also recognizable by natural or monoclonal antibodies. Herein the use of T cell epitopes is preferred, e.g. to provide for allergy vaccines.

(62) The term “peptide immunogen” as used herein shall mean an antigen or immunogen of peptidic structure, in particular an immunogen that comprises or consists of a peptide of a specific amino acid sequence, which is either provided as a linear peptide or branched peptide, comprising naturally-occurring amino acid residues or modified ones, e.g. a derivative obtained by modification or chemical derivatization, such as by phosphorylation, methylation, acetylation, amidation, formation of pyrrolidone carboxylic acid, isomerization, hydroxylation, sulfation, flavin-binding, cysteine oxidation and nitrosylation.

(63) The peptide immunogen is specifically designed to trigger an immune response in a subject, and particularly includes one or more antigenic determinants, which can be possibly recognized by a binding site of an antibody or are able to bind to the peptide groove of HLA class I or class II molecules or other antigen presenting molecules such as CD1 and as such may serve as stimulant for specific T cells. The target antigen is either recognized as a whole target molecule or as a fragment of such molecule, especially substructures, e.g. a polypeptide or carbohydrate structure of targets, generally referred to as “epitopes” (e.g. B-cell epitopes, T-cell epitopes), which are immunologically relevant, i.e. are also recognizable by natural or monoclonal antibodies. Herein the use of B cell epitopes is preferred to provide for e.g. oncology vaccines.

(64) The term “epitope” as used herein shall in particular refer to a molecular structure which may completely make up a specific binding partner or be part of a specific binding partner to a binding site of modular antibody of the present invention. The term epitope may also refer to haptens. Chemically, an epitope may either be composed of a carbohydrate, a peptide, a fatty acid, an organic, biochemical or inorganic substance or derivatives thereof and any combinations thereof. If an epitope is a polypeptide, it will usually include at least 3 amino acids, preferably at least 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 amino acids. There is no critical upper limit to the length of the peptide, which could comprise nearly the full length of a polypeptide sequence of a protein. Epitopes can be either linear or conformational epitopes. A linear epitope is comprised of a single segment of a primary sequence of a polypeptide or carbohydrate chain. Linear epitopes can be contiguous or overlapping. Conformational epitopes are comprised of amino acids or carbohydrates brought together by folding of the polypeptide to form a tertiary structure and the amino acids are not necessarily adjacent to one another in the linear sequence. Specifically, epitopes are at least part of diagnostically relevant molecules, i.e. the absence or presence of an epitope in a sample is qualitatively or quantitatively correlated to either a disease or to the health status of a patient or to a process status in manufacturing or to environmental and food status. Epitopes may also be at least part of therapeutically relevant molecules, i.e. molecules which can be targeted by the specific binding domain which changes the course of the disease.

(65) In cancer disease an immune response to a self-antigen is desirable. The term “self-antigen” as used herein means any antigen, specifically polypeptide or peptide produced by a normal, healthy subject that does not elicit an immune response as such. These self-antigens may be produced at aberrant or high levels in certain disease states, including cancer disease, so called tumour associated antigens (TAAs). Self-antigens which are associated with auto-immune disease are herein called auto-antigens.

(66) It is understood that the self-antigens can be naturally occurring, recombinantly or synthetically produced. It is also understood that the self-antigens need not be identical to the naturally produced antigen, but rather can include variations thereto having certain sequence identities, similarities or homology.

(67) The choice of the self-antigen for use in cancer therapy depends on the type and stage of the cancer disease, and in particular on the expression pattern of a cancer cell such as derived from a tumour or metastases. Specific examples of selected tumour associated antigens possibly used in a vaccine as described herein are Epithelial cell adhesion molecule (EpCAM), Lewis Y, alphafetoprotein (AFP) and carcinoembryonic antigen (CEA), HER2/neu, MUC-1, etc.

(68) The choice of an auto-antigen for use in the therapy of auto-immune diseases depends on the type of the auto-immune disease. Specific examples of selected auto-immune disease associated antigens possibly used in a vaccine as described herein are C1q, ADAMTS13, Desmoglein 3, keratin, gangliosides (e.g. GM1, GD1a, GQ1b), collagen type IV, IgM, cardiolipin, annexin A5, etc.

(69) In some embodiments, the immunogen comprises one or more specific allergens. An “allergen” is an antigen which can initiate a state of hypersensitivity, or which can provoke an immediate hypersensitivity reaction in a subject already sensitized with the allergen. Allergens are commonly proteins or chemicals bound to proteins which have the property of being allergenic. However, allergens can also include organic or inorganic materials derived from a variety of synthetic or natural sources such as plant materials, metals, ingredients in cosmetics or detergents, latexes, or the like.

(70) The choice of an allergen for use in the anti-allergy therapy depends on the type and severity of allergy. Specific examples of selected allergy associated antigens possibly used in a vaccine as described herein are any allergen conventionally used as immunogen, specifically house dust mite allergens (e.g. Der p1, Der p2, Der p3/-Der p23, Der f1, Der f2, Der f3/-Der f23), cat dander, grass or tree pollen, cockroach allergens, etc.

(71) The choice of an antigen specifically inducing immune response against a pathogen for use in the prophylaxis or therapy of infectious diseases depends on the type of the pathogen, e.g. a microbial or viral infectious agent. Specific examples of selected pathogen derived antigens possibly used in a vaccine as described herein are hepatitis B, hepatitis C, Cholera, HIV, Pertussis, Influenza, Typhoid, etc.

(72) A vaccine adjuvant is herein understood as a vaccine component which is capable of enhancing the effectiveness of the vaccine to trigger an immune response against the vaccine antigen.

(73) An enhanced Th1 immune response may include an increase in one or more of the cytokines associated with a Th1 immune response (such as IFNγ), and an increase in activated macrophages.

(74) An enhanced Th1 immune response may include one or more of an increase in antigen specific IgG antibodies, especially IgG1 antibodies.

(75) For example, a vaccine may be provided which includes an immunogenic composition, i.e. an adjuvant component, such as an anti-CD32 moiety (such as a peptide, or an antibody, antibody fragment or construct comprising the antigen-binding site of an antibody, each specifically recognizing CD32) linked to a TLR9 ligand and the first peptidic alpha-helix, and an immunogen component, e.g. comprising a peptide immunogen linked to the second peptidic alpha-helix that matches the first one. Specific examples of such immunogenic compositions are disclosed in WO2014009209A2.

(76) The multimeric protein may be an antigen-binding molecule such as an antibody, or a fragment thereof.

(77) The term “antibody” as used herein shall refer to polypeptides or proteins that consist of or comprise antibody domains, which are understood as constant and/or variable domains of the heavy and/or light chains of immunoglobulins, with or without a linker sequence. Polypeptides are understood as antibody domains, if comprising a beta-barrel structure consisting of at least two beta-strands of an antibody domain structure connected by a loop sequence. Antibody domains may be of native structure or modified by mutagenesis or derivatization, e.g. to modify the antigen binding properties or any other property, such as stability or functional properties, such as binding to the Fc receptors FcRn and/or Fcgamma receptor.

(78) The antibody as used herein has a specific binding site to bind one or more antigens or one or more epitopes of such antigens, specifically comprising a CDR binding site of a single variable antibody domain, such as VH, VL or VHH, or a binding site of pairs of variable antibody domains, such as a VL/VH pair, an antibody comprising a VL/VH domain pair and constant antibody domains, such as Fab, F(ab′), (Fab).sub.2, scFv, Fv, or a full length antibody.

(79) The term “antibody” as used herein shall particularly refer to antibody formats comprising or consisting of single variable antibody domain, such as VH, VL or VHH, or combinations of variable and/or constant antibody domains with or without a linking sequence or hinge region, including pairs of variable antibody domains, such as a VL/VH pair, an antibody comprising or consisting of a VL/VH domain pair and constant antibody domains, such as heavy-chain antibodies, Fab, F(ab′), (Fab).sub.2, scFv, Fd, Fv, or a full-length antibody, e.g. of an IgG type (e.g., an IgG1, IgG2, IgG3, or IgG4 sub-type), IgA1, IgA2, IgD, IgE, or IgM antibody. The term “full length antibody” can be used to refer to any antibody molecule comprising at least most of the Fc domain and other domains commonly found in a naturally occurring antibody monomer. This phrase is used herein to emphasize that a particular antibody molecule is not an antibody fragment.

(80) The term “antibody” shall particularly apply to antibodies of animal origin, including human species, such as mammalian, including e.g., human, murine, or rabbit, which term shall include recombinant antibodies which are based on a sequence of animal origin, e.g. human sequences.

(81) The term “antibody” further applies to chimeric antibodies with sequences of origin of different species, such as sequences of murine and human origin.

(82) The term “chimeric” as used with respect to an antibody refers to those antibodies wherein one portion of each of the amino acid sequences of heavy and light chains is homologous to corresponding sequences in antibodies derived from a particular species or belonging to a particular class, while the remaining segment of the chain is homologous to corresponding sequences in another species or class. Typically the variable region of both light and heavy chains mimics the variable regions of antibodies derived from one species of mammals, while the constant portions are homologous to sequences of antibodies derived from another. For example, the variable region can be derived from presently known sources using readily available B-cells or hybridomas from non-human host organisms in combination with constant regions derived from, for example, human cell preparations.

(83) The term “antibody” may further apply to humanized antibodies.

(84) The term “humanized” as used with respect to an antibody refers to a molecule having an antigen binding site that is substantially derived from an immunoglobulin from a non-human species, wherein the remaining immunoglobulin structure of the molecule is based upon the structure and/or sequence of a human immunoglobulin. The antigen binding site may either comprise complete variable domains fused onto constant domains or only the complementarity determining regions (CDR) grafted onto appropriate framework regions in the variable domains. Antigen-binding sites may be wild-type or modified, e.g. by one or more amino acid substitutions, preferably modified to resemble human immunoglobulins more closely. Some forms of humanized antibodies preserve all CDR sequences (for example a humanized mouse antibody which contains all six CDRs from the mouse antibody). Other forms have one or more CDRs which are altered with respect to the original antibody.

(85) The term “antibody” further applies to human antibodies.

(86) The term “human” as used with respect to an antibody is understood to include antibodies having variable and constant regions derived from human germline immunoglobulin sequences. The human antibody of the invention may include amino acid residues not encoded by human germline immunoglobulin sequences (e.g., mutations introduced by random or site-specific mutagenesis in vitro or by somatic mutation in vivo), for example in the CDRs. Human antibodies include antibodies isolated from human immunoglobulin libraries or from animals transgenic for one or more human immunoglobulins.

(87) The term “antibody” specifically applies to antibodies of any class or subclass. Depending on the amino acid sequence of the constant domain of their heavy chains, antibodies can be assigned to the major classes of antibodies IgA, IgD, IgE, IgG, and IgM, and several of these may be further divided into subclasses (isotypes), e.g., IgG1, IgG2, IgG3, IgG4, IgA1, and IgA2.

(88) The term further applies to monoclonal or polyclonal antibodies, specifically a recombinant antibody, which term includes all antibodies and antibody structures that are prepared, expressed, created or isolated by recombinant means, such as antibodies originating from animals, e.g. mammals including humans, that comprise genes or sequences from different origins, e.g. murine, chimeric, humanized antibodies, or hybridoma derived antibodies. Further examples refer to antibodies isolated from a host cell transformed to express the antibody, or antibodies isolated from a recombinant, combinatorial library of antibodies or antibody domains, or antibodies prepared, expressed, created or isolated by any other means that involve splicing of antibody gene sequences to other DNA sequences.

(89) It is understood that the term “antibody” also refers to derivatives of an antibody, in particular functionally active derivatives. An antibody derivative is understood as any combination of one or more antibody domains or antibodies and/or a fusion protein, in which any domain of the antibody may be fused or otherwise connected at any position of one or more other proteins, such as other antibodies, e.g. a binding structure comprising CDR loops, a receptor polypeptide, but also ligands, scaffold proteins, enzymes, toxins and the like. A derivative of the antibody may be obtained by association (e.g. through a connector as described herein) or binding to other substances by various chemical techniques such as covalent coupling, electrostatic interaction, di-sulphide bonding etc. The other substances bound to the antibody may be lipids, carbohydrates, nucleic acids, organic and inorganic molecules or any combination thereof (e.g. PEG, prodrugs or drugs). In a specific embodiment, the antibody is a derivative comprising an additional tag allowing specific interaction with a biologically acceptable compound. There is not a specific limitation with respect to the tag usable in the present invention, as far as it has no or tolerable negative impact on the binding of the antibody to its target. Examples of suitable tags include His-tag, Myc-tag, FLAG-tag, Strep-tag, Calmodulin-tag, GST-tag, MBP-tag, and S-tag. In another specific embodiment, the antibody is a derivative comprising a label. The term “label” as used herein refers to a detectable compound or composition which is conjugated directly or indirectly to the antibody so as to generate a “labeled” antibody. The label may be detectable by itself, e.g. radioisotope labels or fluorescent labels, or, in the case of an enzymatic label, may catalyze chemical alteration of a substrate compound or composition which is detectable.

(90) According to a specific example, the multimeric protein is an antibody construct which is composed of at least two polypeptide chains that are connected to each other by the connector as described herein. Specific examples include constructs wherein at least one antibody domain (either a single domain or a chain of 2, 3, or 4 antibody domains) is connected to another antibody domain (either a single domain or a chain of 2, 3, or 4 antibody domains), such as single chain antibody constructs wherein the linker is replaced by the connector as described herein, or constructs comprising an antibody heavy and light chain connected by a connector as described herein.

(91) The components of the connector or the multimeric protein as described herein, or parts thereof with or without the coil repeats may be obtained by various methods known in the art, e.g. by purification or isolation from cell culture, recombinant technology or by chemical synthesis.

(92) As used herein, the term “recombinant” refers to a molecule or construct that does not naturally occur in a host cell. In some embodiments, recombinant nucleic acid molecules contain two or more naturally-occurring sequences that are linked together in a way that does not occur naturally. A recombinant protein refers to a protein that is encoded and/or expressed by a recombinant nucleic acid. In some embodiments, “recombinant cells” express genes that are not found in identical form within the native (i.e., non-recombinant) form of the cell and/or express native genes that are otherwise abnormally over-expressed, under-expressed, and/or not expressed at all due to deliberate human intervention. Recombinant cells contain at least one recombinant polynucleotide or polypeptide. “Recombination”, “recombining”, and generating a “recombined” nucleic acid generally encompass the assembly of at least two nucleic acid fragments. In certain embodiments, recombinant proteins and recombinant nucleic acids remain functional, i.e., retain their activity or exhibit an enhanced activity in the host cell.

(93) Thus, the invention further refers to the production of the connector or the multimeric protein comprising the connector, or components thereof (e.g. any of the polypeptide chains, in particular the first alpha-helix or the first polypeptide chain comprising the first alpha-helix), and the recombinant means for such production, including a nucleic acid encoding the respective amino acid sequence, an expression cassette, a vector or plasmid comprising the nucleic acid encoding the amino acid sequence to be expressed, and a host cell comprising any such means. Suitable standard recombinant DNA techniques are known in the art and described inter alia in Sambrook et al., “Molecular Cloning: A Laboratory Manual” (1989), 2nd Edition (Cold Spring Harbor Laboratory Press).

(94) The multimeric protein as described herein is specifically composed of at least one heterologous recombinant polypeptide chain, produced in a eukaryotic cell, preferably any of the mammalian or yeast cells as described herein, preferably as secreted proteins.

(95) The term “heterologous” as used herein with respect to a nucleotide or amino acid sequence or protein, refers to a compound which is either foreign, i.e. “exogenous”, such as not found in nature, to a given host cell; or that is naturally found in a given host cell, e.g., is “endogenous”, however, in the context of a heterologous construct, e.g. employing a heterologous nucleic acid. The heterologous nucleotide sequence as found endogenously may also be produced in an unnatural, e.g. greater than expected or greater than naturally found, amount in the cell. The heterologous nucleotide sequence, or a nucleic acid comprising the heterologous nucleotide sequence, possibly differs in sequence from the endogenous nucleotide sequence but encodes the same protein as found endogenously. Specifically, heterologous nucleotide sequences are those not found in the same relationship to a host cell in nature. Any recombinant or artificial nucleotide sequence is understood to be heterologous. An example of a heterologous polynucleotide is a nucleotide sequence not natively associated with a promoter used in an expression construct, e.g. to obtain a hybrid promoter, or operably linked to a coding sequence, as described herein. As a result, a hybrid or chimeric polynucleotide may be obtained. A further example of a heterologous compound is a protein encoding polynucleotide operably linked to a transcriptional control element, e.g., an exogenous promoter to which an endogenous, naturally-occurring protein coding sequence is not normally operably linked.

(96) The term “host cell” as used herein shall refer to primary subject cells transformed to produce a particular recombinant protein, such as a multimeric protein as described herein or one or more polypeptide chains of any such multimeric protein, and any progeny thereof. It should be understood that not all progeny are exactly identical to the parental cell (due to deliberate or inadvertent mutations or differences in environment), however, such altered progeny are included in these terms, so long as the progeny retain the same functionality as that of the originally transformed cell. The term “host cell line” refers to a cell line of host cells as used for expressing a recombinant gene to produce recombinant polypeptides such as recombinant antibodies. The term “cell line” as used herein refers to an established clone of a particular cell type that has acquired the ability to proliferate over a prolonged period of time. Such host cell or host cell line may be maintained in cell culture and/or cultivated to produce a recombinant polypeptide.

(97) According to a specific aspect, a recombinant construct is obtained by ligating the relevant genes into a vector. These genes can be stably integrated into a host cell genome by transforming the host cell using such vectors. The polypeptides encoded by the genes can be produced using the recombinant host cell line by culturing the transformant thus obtained in an appropriate medium, isolating the expressed protein from the culture, and purifying it by a method appropriate for the expressed product, in particular to separate the protein from contaminating proteins.

(98) For recombinant technologies, isolated nucleic acids may be provided.

(99) With reference to nucleic acids as described herein, the term “isolated nucleic acid” is sometimes used. This term, when applied to DNA, refers to a DNA molecule that is separated from sequences with which it is immediately contiguous in the naturally occurring genome of the organism in which it originated. For example, an “isolated nucleic acid” may comprise a DNA molecule inserted into a vector, such as a plasmid or virus vector, or integrated into the genomic DNA of a prokaryotic or eukaryotic cell or host organism. When applied to RNA, the term “isolated nucleic acid” refers primarily to an RNA molecule encoded by an isolated DNA molecule as defined above. Alternatively, the term may refer to an RNA molecule that has been sufficiently separated from other nucleic acids with which it would be associated in its natural state (i.e., in cells or tissues). An “isolated nucleic acid” (either DNA or RNA) may further represent a molecule produced directly by biological or synthetic means and separated from other components present during its production.

(100) With reference to polypeptides or proteins, the term “isolated” shall specifically refer to compounds that are free or substantially free of material with which they are naturally associated such as other compounds with which they are found in their natural environment, or the environment in which they are prepared (e.g. cell culture) when such preparation is by recombinant DNA technology practiced in vitro or in vivo. Isolated compounds can be formulated with diluents or adjuvants and still for practical purposes be isolated; for example, the polypeptides or polynucleotides can be mixed with pharmaceutically acceptable carriers or excipients when used in diagnosis or therapy.

(101) As isolation and purification methods for obtaining a recombinant polypeptide or protein product, the following may be used: methods utilizing differences in solubility, such as salting out and solvent precipitation; methods utilizing differences in molecular weight, such as ultrafiltration and gel electrophoresis; methods utilizing differences in electric charge, such as ion-exchange chromatography; methods utilizing specific affinity, such as affinity chromatography; methods utilizing differences in hydrophobicity, such as reverse phase high performance liquid chromatography; and methods utilizing differences in isoelectric point, such as isoelectric focusing.

(102) The highly purified product is essentially free from contaminating proteins, and preferably has a purity of at least 90%, more preferred at least 95%, or even at least 98%, up to 100%. The purified products may be obtained by purification of the cell culture supernatant or else from cellular debris.

(103) As isolation and purification methods the following standard methods are preferred: cell disruption (if a protein is obtained intracellularly), cell (debris) separation and wash by microfiltration or Tangential Flow Filter (TFF) or centrifugation, protein purification by precipitation or heat treatment, protein activation by enzymatic digest, protein purification by chromatography, such as ion exchange (IEX), hydrophobic interaction chromatography (HIC), affinity chromatography, size exclusion (SEC) or HPLC chromatography, protein precipitation or concentration and washing by ultrafiltration steps.

(104) The isolated and purified proteins can be identified by conventional methods such as Western blot, HPLC, activity assay, or ELISA.

(105) Therefore, the invention particularly provides for improved connector domains and improved multimeric proteins wherein polypeptide chains are connected to each other using such improved connectors.

(106) According to a specific example, the loss of C-terminal Lys in proteins produced in mammalian cells could be avoided and a correct C-terminus (Lys-extension e.g., an additional Ala, Gly or Ser after a C-terminal Lys) was obtained. Such correct terminus surprisingly not only stabilized the protein construct through a stabilized coiled coil structure but also enhanced the affinity of connecting the coil to the counter coil.

(107) The foregoing description will be more fully understood with reference to the following examples. Such examples are, however, merely representative of methods of practicing one or more embodiments of the present invention and should not be read as limiting the scope of invention.

EXAMPLES

Example 1: Improved ScFV-coil1

(108) An immunogenic composition is prepared according to WO2014009209A2. Therefore, ScFV-coil1 is the basis for the preparation of warhead. In order to transform a ScFV-coil1 molecule into a warhead molecule, CpG oligonucleotides (TLR9 binder) are coupled to “Lys” amino acids in the ScFV-coil1 protein backbone.

(109) According to prior art, ScFV-coil1 was produced in mammalian cells such as CAP-T cells, HEK293 and CHO cells but also in insect cells such as S2 cells and in yeast cells (Pichia pastoris).

(110) As an exemplary ScFV-coil1, a polypeptide characterized by the amino acid sequence, identified as SEQ ID 49 (FIG. 1), has been used.

(111) However, mass spectrometry analysis showed that in ˜40% of the ScFV-coil1 molecules produced in mammalian cells, the Lys(289) was cleaved off (FIG. 4a), a phenomenon which has previously been described by Harris (1995. Processing of C-terminal lysine and arginine residues of proteins isolated from mammalian cell culture. J. Chromatogr. A 705:129-134). In contrast, ScFV-coil1 produced in insect cells did not show this phenomenon, as 100% of the molecules contained the Lys(289) (FIG. 4b). The Lys residues in the ScFV-coil1 are targets for coupling of the CpG molecules and are therefore important for the full functionality of ScFV-coil1 in the immunogenic construct.

(112) As a solution to this problem, the following was constructed:

(113) By adding an Ala to the C′-end of the sequence (ScFV-coil1ala, FIG. 2) Lys(289) was effectively protected. Mass spectrometry of ScFV-coil1ala produced in HEK293 cells or in CHO cells showed 100% full length ScFV-coil1ala (FIG. 4c). The same was true when Gly or Ser were added at the C′-end (data not shown).
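The C-terminal extension described in this example can be illustrated at the sequence level. The following is a minimal sketch only: the helper name `protect_c_terminal_lys` and the toy sequence are hypothetical and are not taken from the sequence listing (in particular, the toy string is not SEQ ID 49). It merely shows the rule "if the chain ends in Lys, append one of Ala, Gly or Ser".

```python
# Hypothetical helper, not part of the patent: append a single-residue
# extension after a C-terminal Lys so that Lys is no longer terminal.
ALLOWED_EXTENSIONS = {"A", "G", "S"}  # Ala, Gly, Ser, as named in Example 1


def protect_c_terminal_lys(sequence: str, extension: str = "A") -> str:
    """Return `sequence` with `extension` appended if it ends in Lys (K)."""
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported extension: {extension!r}")
    if not sequence.endswith("K"):
        return sequence  # no C-terminal Lys, nothing to protect
    return sequence + extension


toy_sequence = "ACDEFGK"  # arbitrary placeholder, NOT a sequence from the patent
print(protect_c_terminal_lys(toy_sequence))        # Ala extension
print(protect_c_terminal_lys(toy_sequence, "G"))   # Gly extension
```

The sketch deliberately restricts the extension to the three residues reported in this example; the claims allow other extensions, which such a helper would need to be widened to cover.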

(114) According to this example, ScFV-coil1 (the basis for the warhead) contains a 5× heptad repeat structure (pepE, italic in FIGS. 1 and 2) which forms a coiled coil interaction with a similar 5× heptad repeat structure called pepK (FIG. 3), which is part of the immunogen component. When immunogen and warhead are mixed, a coiled coil is formed, thus, coupling warhead and immunogen with high affinity to form a stable complex to be used to produce a vaccine construct for use in allergy, oncology, infectious diseases and/or autoimmune diseases. The immunogen component would need to further comprise one or more relevant epitopes to trigger an effective immune response as necessary to treat any such disease.

(115) When scFV-coil1 and ScFV-coil1ala were compared in Octet analyses for the affinity between pepE and pepK, the affinity between pepK and scFV-coil1 was lower than the affinity between pepK and scFV-coil1ala: the K.sub.D value obtained with the partly truncated version of ScFV-coil1 was higher (reflecting a lower affinity) than that of the full-length version (FIG. 5). Affinity was similarly increased with Gly or Ser at the C′-end (data not shown).

Example 2: Mass Spectrometry of ScFV-Coil1 Variants

(116) Purified ScFV-coil1 variants were analyzed for mass distribution on a Waters 3100 Mass Detector according to the manufacturer's instructions. Mass spectra were acquired by electrospray ionization (ESI-MS) operating in positive ion mode. The mass spectra were de-convoluted by the program MaxEnt v. 4.1 (Waters).

(117) The expected molecular weight (MW) based on the sequence of FIG. 1 is 30764 Da.

(118) As shown in FIG. 4A, for scFV-coil1 produced in recombinant HEK293 cells, the MWs found are 30755.5 Da (˜60% of total material) and 30630 Da (˜40% of total material), the latter corresponding to the expected MW minus Lys(289).

(119) As shown in FIG. 4B, for scFV-coil1 produced in recombinant S2 insect cells, the MW found is 30755.5 Da indicating full length protein.

(120) The expected molecular weight (MW) based on the sequence of FIG. 2 is 30835.4 Da.

(121) As shown in FIG. 4C, for scFV-coil1ala (obtained according to Example 1) produced in recombinant HEK293 cells, the MW found is 30831.0 Da indicating full length protein.
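The mass assignments in this example can be cross-checked with simple arithmetic. The sketch below uses only the masses reported above together with standard average residue masses for Lys and Ala (an assumption added here, not stated in the text). It is an illustrative consistency check, not a description of the deconvolution performed by the instrument software.

```python
# Average residue masses in Da (amino acid minus one water).
# Standard textbook values, added here as an assumption.
RESIDUE_MASS = {"K": 128.17, "A": 71.08}

# Masses reported in Example 2 (Da).
expected_fig1 = 30764.0        # expected MW, sequence of FIG. 1
expected_fig2 = 30835.4        # expected MW, sequence of FIG. 2
observed_full = 30755.5        # HEK293, ~60% species
observed_truncated = 30630.0   # HEK293, ~40% species

# Shift caused by the added C-terminal Ala (expected values).
ala_shift = expected_fig2 - expected_fig1
# Shift between the two observed HEK293 species (loss of Lys(289)).
lys_shift = observed_full - observed_truncated

print(f"FIG. 2 - FIG. 1: {ala_shift:.1f} Da (Ala residue: {RESIDUE_MASS['A']} Da)")
print(f"Full - truncated: {lys_shift:.1f} Da (Lys residue: {RESIDUE_MASS['K']} Da)")
```

The expected masses differ by 71.4 Da, close to one Ala residue, and the two observed HEK293 species differ by 125.5 Da, within a few daltons of one Lys residue. A deviation of this size is comparable to the 8.5 Da offset between the expected and observed full-length masses.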

Example 3: Affinity Analyses of pepK and scFV-Coil1 or ScFV-Coil1ala

(122) The affinity of the E-coil on the C-terminus of the different versions of ScFV-coil1 for the pepK was determined by Bio-Layer Interferometry on an Octet QK instrument. For this purpose Streptavidin-biosensors were loaded with biotinylated pepK, and after blocking the surface for unspecific binding, association and dissociation rates were measured for the ScFV-coil1 proteins. From the binding curves obtained by different protein concentrations the affinities were calculated. For clarity, only one concentration is shown in the graph. K.sub.D values of 8.7 pM for ScFV-coil1ala (obtained according to Example 1) obtained from recombinant HEK cells, 15.8 pM for ScFV-coil1 obtained from recombinant S2 insect cells, and 32.5 pM for ScFV-coil1 obtained from recombinant HEK cells, were calculated. The lowest affinity is found for the ScFV-coil1 derived from the HEK cells which show deletion of the Lys 289 in ˜40% of molecules (FIG. 4A). Material from insect cells which contains the complete original sequence (FIG. 4B) shows an intermediate affinity, whereas the ScFv-coil1ala (FIG. 4C) unexpectedly shows the highest affinity, see FIG. 5.
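The reported K.sub.D values can be put side by side as fold differences. The following sketch uses only the three values given above; the dictionary labels are shorthand introduced here. The helper `kd_from_rates` assumes a simple 1:1 binding model (K.sub.D = k.sub.off/k.sub.on), which is a common way to derive affinities from association and dissociation rates but is not stated in the text as the fitting model used, and no rate constants are reported here.

```python
# K_D values reported in Example 3, in picomolar (lower K_D = higher affinity).
kd_pm = {
    "ScFV-coil1ala (HEK)": 8.7,
    "ScFV-coil1 (S2 insect)": 15.8,
    "ScFV-coil1 (HEK)": 32.5,
}

# Construct with the lowest K_D, i.e. the highest affinity.
best = min(kd_pm, key=kd_pm.get)

# Fold difference in K_D relative to the highest-affinity construct.
fold_vs_best = {name: kd / kd_pm[best] for name, kd in kd_pm.items()}

for name, kd in sorted(kd_pm.items(), key=lambda item: item[1]):
    print(f"{name}: K_D = {kd} pM ({fold_vs_best[name]:.1f}-fold vs. {best})")


def kd_from_rates(k_off: float, k_on: float) -> float:
    """K_D in M for an assumed 1:1 model, from k_off (1/s) and k_on (1/(M*s))."""
    return k_off / k_on
```

On these numbers, the K.sub.D of the partly truncated HEK-derived ScFV-coil1 is about 3.7 times that of ScFV-coil1ala, and the K.sub.D of the insect-cell material is about 1.8 times that of ScFV-coil1ala.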