Synthetic Two-piece and Three-piece Split Inteins for Protein trans-Splicing

Wenchang Sun, Jing Yang, Xiang-Qin Liu
2004 Journal of Biological Chemistry  
Inteins are protein-intervening sequences that can self-excise and concomitantly splice together the flanking polypeptides. Two-piece split inteins capable of protein trans-splicing have been found in nature and engineered in laboratories, but they all have a similar split site corresponding to the endonuclease domain of the intein. Can inteins be split at other sites and do trans-splicing? After testing 13 split sites engineered into a Ssp DnaB mini-intein, we report the finding of three new split sites that each produced a two-piece split intein capable of protein trans-splicing. These three functional split sites are located in different loop regions between β-strands of the intein structure, and one of them is just 11 amino acids from the beginning of the intein. Because different inteins have similar structures and similar β-strands, these new split sites may be generalized to other inteins. We have also demonstrated for the first time that a three-piece split intein could function in protein trans-splicing. These findings have implications for intein structure-function, evolution, and uses in biotechnology.
An intein is a protein-intervening sequence that catalyzes a protein-splicing reaction in which the intein sequence is precisely excised and its flanking sequences (N- and C-exteins) join with a peptide bond to produce the mature host protein (spliced protein) (1). The mechanism of protein splicing typically has four steps: two acyl rearrangements at the two splicing junctions, a trans-esterification between the two junctions, and a cyclization of the Asn residue at the C-terminal junction (2-4). Crystal structures of inteins revealed a splicing domain consisting of 11-12 β-strands and forming a compact horseshoe shape with the splicing junctions located in the central cleft (5-11). A majority of inteins also have a homing endonuclease domain inserted in the splicing domain sequence (12). These bifunctional inteins are ~350-550 amino acids (aa) long, although some extra large inteins are up to 1650 aa long and also contain tandem repeats (13, 14). Nearly 200 intein and intein-like sequences have been found in a wide variety of host proteins and in microorganisms belonging to bacteria, Archaea, and eukaryotes (12, 15). Their sporadic phylogenetic distributions suggest lateral gene transfer through intein homing (16, 17).
Inteins generally share only low levels of sequence similarity, but they share striking similarities in structure, reaction mechanism, and evolution (4, 18, 19, 21). It is thought that inteins first originated with just the splicing domain and then acquired the endonuclease domain, with the latter conferring genetic mobility to the intein. During intein evolution, however, some inteins lost their endonuclease domain to become mini-inteins consisting of just the ~130-aa protein-splicing domain plus a linker sequence of various lengths in place of the endonuclease domain (12, 22). An interesting event of intein evolution is the loss of sequence continuity in some inteins, which apparently produced the DnaE split intein that exists in two fragments and is capable of protein trans-splicing (23). A pair of split DnaE genes produces two precursor polypeptides, with one consisting of the N-terminal part of DnaE (N-extein) followed by the N-terminal part of intein (N-intein) and another consisting of the C-terminal part of DnaE (C-extein) preceded by the C-terminal part of the intein (C-intein). The N- and C-inteins, through their structural complementation, can reassemble and catalyze a protein trans-splicing reaction to produce a mature DnaE. This two-piece split intein has since been found in many cyanobacterial species (14, 24). These naturally occurring split inteins most likely originated from a contiguous intein sequence as a result of genomic rearrangement(s) that broke the intein coding sequence. Interestingly, all these DnaE split inteins share the same split site, which may be explained by a single origin for all these split inteins. However, it may also suggest that splitting at any other site is incompatible with protein trans-splicing and therefore not tolerated, which can be examined by splitting inteins at other sites followed by testing for possible trans-splicing.
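The arrangement of the two precursor polypeptides described above can be pictured as simple sequence bookkeeping. The Python sketch below is only an illustration under stated assumptions: the function names (`split_intein`, `make_precursors`, `trans_splice`) and the toy strings are hypothetical and are not real intein or DnaE sequences. It tracks which segments end up where and does not model the splicing chemistry or predict whether a given split site is functional.

```python
from typing import Tuple


def split_intein(intein: str, split_site: int) -> Tuple[str, str]:
    """Cut a contiguous intein sequence into N-intein and C-intein.

    split_site is the number of residues kept in the N-intein
    (a hypothetical convention chosen for this sketch).
    """
    if not 0 < split_site < len(intein):
        raise ValueError("split site must fall inside the intein")
    return intein[:split_site], intein[split_site:]


def make_precursors(n_extein: str, c_extein: str,
                    intein: str, split_site: int) -> Tuple[str, str]:
    """Build the two precursors: N-extein + N-intein and C-intein + C-extein."""
    n_intein, c_intein = split_intein(intein, split_site)
    return n_extein + n_intein, c_intein + c_extein


def trans_splice(n_precursor: str, c_precursor: str,
                 n_intein: str, c_intein: str) -> str:
    """Remove both intein fragments and join the exteins.

    This is bookkeeping only: it assumes splicing succeeds.
    """
    if not n_precursor.endswith(n_intein):
        raise ValueError("N-precursor does not end with the N-intein")
    if not c_precursor.startswith(c_intein):
        raise ValueError("C-precursor does not start with the C-intein")
    n_extein = n_precursor[:len(n_precursor) - len(n_intein)]
    c_extein = c_precursor[len(c_intein):]
    return n_extein + c_extein


# Toy placeholders, not real sequences.
n_int, c_int = split_intein("INTEIN", 2)
n_pre, c_pre = make_precursors("NEXT", "CEXT", "INTEIN", 2)
spliced = trans_splice(n_pre, c_pre, n_int, c_int)
```

Changing `split_site` in this sketch corresponds to the experiment proposed in the text: the same intein is cut at a different position, and the question is whether the resulting fragment pair still yields the spliced product.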
Synthetic two-piece split inteins have been engineered before in laboratories by splitting the coding sequences of contiguous inteins, but their split sites corresponded with those of the naturally occurring split inteins (25-27). Both naturally occurring and synthetic split inteins have found many practical uses, which include producing trans-spliced recombinant proteins and circularized proteins or peptides for various purposes (28-30). Finding new split sites for functional split inteins can be useful for the many emerging applications involving protein trans-splicing techniques. Inteins have been viewed as protein equivalents of introns because of some superficial similarities between inteins and self-splicing introns. Both are intervening sequences, can excise themselves through self-splicing, and are genetically mobile through a similar homing mechanism. Furthermore, the natural occurrence of two-piece split inteins parallels the natural occurrence of two-piece split group II introns. A group II intron can also exist in three pieces, and further fragmentation is believed to have led to the origin of nuclear spliceosomal introns (31, 32). Can inteins also function when split at different sites and into more than two pieces? This is not as predict-
doi:10.1074/jbc.m405491200 pmid:15194682 fatcat:rvdicjlasng5tnlws64am6rg3m