IS Families/IS256 family
The IS256 cluster
IS256 was first identified in 1987 as part of the gentamycin resistance transposon Tn4001[1][2] from Staphylococcus aureus[2]. It was observed that tandem duplication of IS256 contiguous with Tn4001 resulted in an increase in the level of resistance to gentamycin, tobramycin and kanamycin (Gm, Tm, and Km), implying the presence of strong IS256-associated promoters. Other examples of IS256-mediated increased gene expression have also been observed[3][4]. IS256 is widely distributed in staphylococci and enterococci[5][6], where it is part of a variety of composite transposons[7][8][9][10].
Recently, a study of ICEs identified examples from group B Streptococcus [TnGBS[11]] and Mycoplasma[12] which include a DDE-type Tpase rather than the more common phage integrase-like gene. A cascade PSI-BLAST approach not only revealed two new IS families (ISLre2 and ISKra4) but also established a distant relationship with the IS256 and ISH6 families[13] (Fig. IS256.1 and Fig. IS256.2).
Analysis of the N-terminal Tpase region[13] also identified two shared domains (N1 and N2). N2 corresponds to a potential HTH domain in the region of the IS256 Tpase which recognizes the terminal IRs[14] (Fig. IS256.3).
The cluster can be divided into five clades containing nine groups based on branching of the Tpase phylogenetic tree: two types of closely related TnGBS, TnGBS1 and TnGBS2, and ISLre2 (MULT3); the Mycoplasma ICE; IS256 (MULT1); ISH6 (MULT2); ISAzba1, ISMich2, ISKra4 (MULT4)[13] (Fig. IS256.1, Fig. IS256.2 and Fig. IS256.3).
There is a distant relationship with the Tpase of the eukaryotic Mutator TE and, like MuDR from Zea mays, many generate 8-9 bp target repeats on insertion. They have therefore been called MULE (for Mutator-Like Elements). Like MuDR/Foldback, members of these groups carry a largely α-helical insertion domain between the second D and E catalytic residues. This includes a conserved C/D(2)H signature present in both the eukaryotic and prokaryotic elements[13][15].
IS256
The IS256 family can be subdivided into 3 groups: IS256, IS1249, and ISC1250.
IS256 group
The classical IS256 group has a large number of members in both bacteria and archaea. They are between 1200 and 1500 bp long with IR of 20-30 bp (Fig. IS256.4) and generate DR of between 8 and 9 bp. They carry a single long orf encoding a potential DDE motif with a spacing of 112 residues between the second D and E residues (Fig. IS256.2), together with a correctly placed K/R residue. This spacing is due to an insertion domain[16][17]. The catalytic residues have been validated by mutagenesis[18]. It was shown several years ago that the Tpases of IS256 family elements share some similarities with that of the eukaryotic Mutator element[19], a relationship which has been explored recently in more detail[20].
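As a rough illustration of the terminal IR described above, the following Python sketch compares the left end of an element with the reverse complement of its right end. The function names, the window size and the toy sequence are assumptions made for this example; they are not taken from ISfinder, and the sequence is not a real IS256 end.

```python
# Hypothetical helper names; a minimal sketch, not an ISfinder tool.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (upper-cased)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def terminal_ir_identity(seq, window=30):
    """Fraction of positions at which the first `window` bases match the
    reverse complement of the last `window` bases."""
    if window <= 0 or len(seq) < 2 * window:
        raise ValueError("sequence too short for the requested window")
    left = seq[:window].upper()
    right = reverse_complement(seq[-window:])
    matches = sum(a == b for a, b in zip(left, right))
    return matches / window


# Toy element (invented): a 20 bp left end, filler, and a perfect inverted copy.
left_end = "GGTCAACTGATTCGAACCTG"
toy_is = left_end + "ATGC" * 25 + reverse_complement(left_end)
perfect = terminal_ir_identity(toy_is, window=20)

# Change the first base of the left end to mimic an imperfect IR.
imperfect = terminal_ir_identity("A" + toy_is[1:], window=20)
```

Real IR are often imperfect, so an identity threshold rather than an exact match would normally be needed; the choice of threshold is left open here.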
Members of this family transpose using an excised circular dsDNA transposon intermediate [e.g. [18][21]]. They are also found as part of composite transposons such as Tn4001 flanked on either side by IS256[22][2][23][24]. For IS256 itself, the sequences of circle junctions showed that the left IS end preferentially attacked the right end[21], a result that was independently demonstrated by the effect of small deletions in the left and right ends on circle formation[14].
IS1249 group
There are more than 30 members confined at present to the Actinobacteria and the Firmicutes. They are about 1300 bp in length with IR of about 26 bp (Fig. IS256.4) and generally generate DR of 8 bp (with variations of between 0 and 10).
ISC1250 group
At present, there are only 3 members of this group in ISfinder. All are found in the archaeon Sulfolobus solfataricus.
ISH6
This group (MULT2) was originally observed only in archaea[25]. There are 11 members of about 1450 bp with highly conserved IR of 24-27 bp (Fig. IS256.5), DR of 8 bp and a single Tpase orf encoding a protein of 450 aa.
ISLre2
There are 48 entries for ISLre2 family members in ISfinder. They are restricted at present to the bacteria. They are between 1500 and 2000 bp long, with IR from 15 to 29 bp (Fig. IS256.5), and generate 9 bp DR. Together with the related TnGBS ICE, they show strong target specificity and insert 13-17 bp upstream of σA promoters [11][13] in an oriented fashion with the RE proximal. PCR analysis has detected a transposon circle junction, as with the related ICE, suggesting that transposition may occur via a Donor Primed Transposon Replication process.
ISKra4
This newly emerging family includes 83 members and is divided into three related groups: ISAzba1, ISMich2 and ISKra4.
ISAzba1
There are presently 28 members of this group. They encode a Tpase of between 450 and 480 aa, are 1400 to 2900 bp long with IR of about 20 bp (Fig. IS256.6) and no DR. Five (ISAfe13, ISCot1, ISEc51, ISKpn19, ISSysp7) carry an orf in addition to the Tpase and this specifies a protein related to serine-recombinases or resolvases. Four of these also include a third orf annotated as hypothetical protein. The fifth, ISAfe13, carries the Tpase, a resolvase, and an alternative orf annotated as ORF-3-like from plasmid pRiA4b. Other proteins found in this family are annotated as being hypothetical or putative TnpR resolvases although no direct evidence for resolvase function is available. Eight other members simply encode the Tpase and the ORF-3-like protein, while ISCep1 includes the ORF-3-like protein and a third orf annotated as phage integrase or xerC/D.
ISMich2
This includes 24 members which are presently limited to the cyanobacteria. Twenty-two have a Tpase orf distributed between two reading phases while in the remaining two the Tpase forms a unique continuous orf. However, all show a potential but atypical frameshift motif, TTTTTT, which could be involved in either PRF (Programmed -1 Ribosomal Frameshifting) or PTR (Programmed Transcriptional Realignment) recoding. Further experimental analysis would be necessary to confirm or refute this. Members are between 1250 and 1400 bp long with a Tpase of 360 aa, IR of between 18 and 39 bp (Fig. IS256.6) and 8 bp DR. Three members (ISCysp26; ISMic1; ISMich2) carry a passenger gene annotated as hypothetical protein.
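A simple way to screen element sequences for homopolymer runs such as the TTTTTT motif mentioned above is a regular-expression scan. The sketch below is illustrative only: the function name and the toy sequence are assumptions, and finding a run does not show that recoding actually occurs.

```python
import re


def find_homopolymer_runs(seq, base="T", min_len=6):
    """Return (start, run) pairs for runs of `base` of at least `min_len`.

    Positions are 0-based offsets into `seq`.
    """
    pattern = re.compile(f"{re.escape(base.upper())}{{{min_len},}}")
    return [(m.start(), m.group()) for m in pattern.finditer(seq.upper())]


# Toy sequence (invented): one run of six T and one run of seven A.
toy_seq = "ATGCC" + "TTTTTT" + "GG" + "AAAAAAA" + "CC"

t_runs = find_homopolymer_runs(toy_seq, base="T", min_len=6)
a_runs = find_homopolymer_runs(toy_seq, base="A", min_len=7)
```

Whether a given run lies in the overlap between two reading phases would still have to be checked against the orf coordinates, which this sketch does not attempt.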
ISKra4
This small group of elements ranges in size from 1400 to 3700 bp due to the presence in some of various passenger genes. They have IR of 18 to 31 bp (Fig. IS256.6) and generate DR of 9 bp. Three carry passenger genes: ISLdr1, a hypothetical protein and a reverse transcriptase; ISSri1, a transcriptional regulator; and ISTn1, a hypothetical protein. Six members may express their Tpases by frameshifting (five include a 7A motif and one a 5TC motif).
Mechanism
Recently, the IS256 transposition mechanism has been addressed [25] using the family member ISCth4 from the thermophilic Clostridium thermocellum (Hungateiclostridium thermocellum ATCC 27405; NC_009012). ISCth4 is present in 15 copies in the host genome, 12 of which are flanked by 8-bp target site duplications (ISfinder: https://www-is.biotoul.fr/scripts/ficheIS.php?name=ISCth4), strongly suggesting that the IS is active. The choice of this IS was dictated by the solubility of its transposase for structural studies. Like other members of the family, the ISCth4 transposase includes a predicted alpha-helical insertion domain (Groups with DDE Transposases;[15][26][27]) within the catalytic domain with a conserved CxxH motif (Fig. IS256.2) important for catalysis in transposases of this type [25][28][29]. In addition, like other IS256 family members [4], there are sequence elements present in both ends [25] which would generate a strong promoter to drive transposase expression in a typical circular IS transposition intermediate (IS and Gene Expression; IS3 family: Mechanism, Circular intermediate).
The ISCth4 transposase is organized (Fig. IS256.7 top)[25] into a number of domains including dimerization (DD), N-terminal DNA binding (NDB), a helix-turn-helix (HTH) and a catalytic domain (CD) split by an “insertion” domain (ID) in which the upstream CD segment carries the DD and CXXH residues and the downstream CD segment carries the final E of the DDE triad. The positions of these domains and their relevant secondary and tertiary structures (Fig. IS256.7 bottom) bear a close correspondence to those of the IS256 transposase itself.
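Whether an individual copy is flanked by a target site duplication can be checked by comparing the bases immediately adjacent to each end. The sketch below assumes that the flanking sequences of each copy have already been extracted; the function name and the flank data are invented for illustration and are not ISCth4 sequences.

```python
def has_target_site_duplication(left_flank, right_flank, dr_len=8):
    """True if the last `dr_len` bases before the IS equal the first
    `dr_len` bases after it (a direct repeat on the same strand)."""
    if len(left_flank) < dr_len or len(right_flank) < dr_len:
        return False
    return left_flank[-dr_len:].upper() == right_flank[:dr_len].upper()


# Toy flanks (invented): (sequence upstream of IRL, sequence downstream of IRR).
toy_copies = [
    ("CCGTAGGATCCATA", "GATCCATATTGCAC"),  # flanked by the 8 bp repeat GATCCATA
    ("TTGACCGGTTAACG", "ACGTACGTACGTAA"),  # no repeat
]
flanked = [
    has_target_site_duplication(left, right) for left, right in toy_copies
]
n_flanked = sum(flanked)
```

Copies without a detectable repeat are not necessarily inactive; the flanks may simply have been altered after insertion, for example by recombination between copies.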
Transposition reaction steps
Biochemical studies explored ISCth4 transposition steps and confirmed that this IS undergoes copy-out-paste-in transposition (Major DDE transposition pathways) [25].
The first step in the transposition pathway, figure-eight formation, in which a single-strand bridge is formed between two IS ends, was observed in vitro. This was observed using linear substrates with a pair of 35 bp IRL and IRR cloned in the appropriate orientation. No bridged intermediate could be detected when the substrate was in a supercoiled form. The sequences of the intervening linker bases which derive from the flank of the target end (e.g. [30]) were between 6 and 8 bp in length and showed that either end could be used as a target by its partner end. However, the sample (n=8) was not large enough to determine whether there was a preference. The second step, replication and excision, is technically challenging to observe (see [31]) and was not addressed.
It was also possible to reconstitute the final circle integration step in vitro. This involves cleavage at both IS ends at the IS circle junction to generate 3’OH ends and their transfer into the target. In the classical concerted integration reaction (first seen as a product of a mini phage Mu insertion into a circular plasmid [32] and subsequently developed as a diagnostic assay for concerted insertion of other elements such as HIV [33] and mariner [34]), two ends, both with a 3’OH, insert into opposite strands at the target site in a supercoiled target, resulting in linearization (single-end integration results simply in relaxation of the target). Double-strand IRs of more than 25 bp were found to integrate efficiently. Moreover, when the two IS ends were abutted as in an IS circle, the junction was able to integrate to generate a linear plasmid product in a reaction which was most efficient if the spacer was 6 bp in length and resulted in an 8 bp flanking direct target repeat. Junction integration occurred preferentially into a supercoiled target.
An Asymmetric Transpososome
A structural analysis has provided some important insights into the molecular machinery and mechanism of ISCth4 transposition and, by extension, into transposition of the entire IS256 family and probably of the other IS families which transpose using a copy-out-paste-in mechanism. TnpA was found to bind to DNA as at least a dimer (stoichiometry 1:2, end:TnpA). Crystal structures of TnpA bound to three different substrates were determined at good resolution: a pre-reaction complex with a 26 bp IRR and six flanking bp; a pre-cleaved complex with a 26 bp IRR without flanking DNA; and a substrate resembling an IS circle junction with abutted left and right ends, which could also structurally mimic the figure-eight intermediate, in which the two IRs are bridged on one strand by a 6 nucleotide spacer while the other strand carries a gapped 5 nucleotide complementary spacer.
One remarkable observation is that all three transpososome complexes are asymmetric (Fig. IS256.8). Although not unexpected in view of the asymmetric cleavage and transfer reactions involved in copy-out-paste-in transposition, this is unusual and is the only asymmetric complex among the transpososome structures available [17][35][36][37]. The studies showed that: in all three structures, a single DNA molecule is bound by a TnpA dimer; the IR is recognized through an extensive interface where most protein-DNA interactions are with one of the two TnpA protomers [A], with the transposon tip directed toward the catalytic domain of the partner protomer [B], consistent with the commonly observed “trans” cleavage which occurs in many transposition reactions [35][37] (Cleavage in trans: A Committed Complex); the alpha-helical insertion domain (a characteristic of the IS256 family) which interrupts the catalytic domain (Fig. IS256.7) includes the important C/DxxH motif which, in the eukaryotic TE Hermes, is located at the same position [29][38]; the donor IR is bound by two regions of one monomer [A]: the NDB binds a subterminal sequence while sequences closer to the tip are bound by both the CD and HTH domains (Fig. IS256.8), and some of the residues involved are also important in IS256 transposase binding [14]. It is these interactions towards the IR tip that direct it towards the catalytic domain of the partner monomer.
The structural studies provide an indication of how the figure-eight intermediate may be formed and the IS circle junction processed in insertion: the removal of the DNA flank in the pre-reaction complex (Fig. IS256.8i) to generate the pre-cleavage complex (Fig. IS256.8ii) results in folding of the ID domain of the non-bound monomer [B] towards the IR tip of the DNA on the bound monomer [A]. In the Strand Transfer complex (Fig. IS256.8iii), both IDs ([A] and [B]) close, forming a protein-protein interface and enclosing the linker sequence, with which they interact in a non-sequence-specific way. This movement creates an additional DNA binding site which places the tip of one IR in the catalytic site. The DNA substrate used could assume configurations resembling either the figure-eight junction (but lacking the adjoining donor plasmid flanks found for IS911 [30]) or an IS circle junction with a single nucleotide gap at one IR tip (Fig. IS256.9). It is proposed [25] that target DNA is sequestered following reopening of the ID [A] and [B] interface.
The insertion domain appears to undergo dynamic changes, opening and closing over the course of the consecutive transposition steps, a flexibility found also for similar domains in the eukaryotic transposons Transib and Rag1 (Ru et al., 2018; Liu et al., 2019; Chen et al., 2020). This movement directs different DNA segments along the copy-out-paste-in transposition pathway. However, not all IS which undergo this type of transposition carry transposases with Insertion Domains (Transposases examined by secondary structure prediction programs). It will be interesting to determine how these IS, such as IS3 or IS30 family members, accomplish figure-eight formation and integration of the circle junction without the ID.
While these data provide some revealing snapshots of copy-out-paste-in transposition, they leave some intriguing questions such as how the figure-eight intermediate is replicated to generate a closed circular IS form with a junction composed of abutted IRs and how the target DNA engages in the transpososome.
Bibliography
- [1] PubMed: 6323927
- [2]
- [3] PubMed: 31474962
- [4]
- [5] PubMed: 7899522
- [6] PubMed: 1334269
- [7] PubMed: 8654967
- [8] PubMed: 7625803
- [9] PubMed: 8031032
- [10] PubMed: 8723445
- [11]
- [12] PubMed: 23888872
- [13]
- [14]
- [15] Yuan YW, Wessler SR. The catalytic domain of all eukaryotic cut-and-paste transposase superfamilies. Proc Natl Acad Sci U S A. 2011 May 10;108(19):7884-9. PubMed: 21518873
- [16] PubMed: 20067338
- [17] Dyda F, Chandler M, Hickman AB. The emerging diversity of transpososome architectures. Q Rev Biophys. 2012 Nov;45(4):493-521. PubMed: 23217365
- [18]
- [19] PubMed: 8041625
- [20] PubMed: 19018586
- [21]
- [22] PubMed: 6323927
- [23] PubMed: 2553542
- [24] PubMed: 2544565
- [25] Filée J, Siguier P, Chandler M. Insertion sequence diversity in archaea. Microbiol Mol Biol Rev. 2007 Mar;71(1):121-57. PubMed: 17347521
- [26] PubMed: 10207011
- [27] PubMed: 16041385
- [28] PMC: PMC5225508
- [29] PMC: PMC6212770
- [30] PubMed: 7590258
- [31] PMC: PMC522794
- [32] PubMed: 2822259
- [33] PubMed: 15958388
- [34] PubMed: 15333635
- [35] PMC: PMC3536463
- [36] PMC: PMC2999894
- [37] PubMed: 10884228
- [38] PMC: PMC4105704