IS Families/IS256 family

From TnPedia
Revision as of 20:36, 8 December 2020 by TnCentral (talk | contribs)

The IS256 cluster

IS256 was first identified in 1987 as part of the gentamycin resistance transposon Tn4001[1][2] from Staphylococcus aureus[2]. It was observed that tandem duplication of IS256 contiguous with Tn4001 resulted in an increase in the level of resistance to gentamycin, tobramycin and kanamycin (Gm, Tm, and Km), implying the presence of strong IS256-associated promoters. Other examples of IS256-mediated increased gene expression have also been observed[3][4]. IS256 is widely distributed in staphylococci and enterococci[5][6], where it is part of a variety of composite transposons[7][8][9][10].

Recently, a study of ICE elements identified examples from group B Streptococcus [TnGBS[11]] and Mycoplasma[12] which include a DDE-type Tpase rather than the more common phage integrase-like gene. A cascade PSI-BLAST approach not only revealed two new IS families (ISLre2 and ISKra4) but also established a distant relationship with the IS256 and ISH6 families[13] (Fig. IS256.1 and Fig. IS256.2).

Fig. IS256.1. Phylogenetic tree of prokaryotic Mutator-like transposases. Each p-MULT clade is colored according to figure 1. p-MULT 1 and p-MULT 2 transposases are encoded by ISs of the IS256 and ISH6 families, respectively. p-MULT 3 transposases are encoded by both the TnGBS family and the ISLre2 family. p-MULT 4 transposases, encoded by both transposons and ISs, form three different lineages: ISAzba1, ISMich2, and ISKra4. Transposons of the ISAzba1 group encoding a pRiA4_Orf3-like protein are indicated by blue dots. ISs of the ISMich2 group with a predicted −1 frameshift in the transposase gene are indicated by pink dots. TE names are indicated at the extremity of the tree branches. TEs with a predicted σA promoter at a distance of 13–17 bp from the IR-genome junctions in more than 20% of their insertion sites are indicated by small black dots.
Fig. IS256.2. Alignment of the protein domains encompassing the catalytic DDE residues in p-MULT. Transposase sequences were aligned by the MAFFT alignment software and visualized using Jalview. The alignment was filtered for redundancy to retain, for each p-MULT family, a subset of transposases representative of its diversity. Only regions surrounding the predicted DDE residues and the C/D(2)H motif were kept in the alignment. Numbers given in parentheses correspond to the distance in aa residues between the different motifs. Transposase accession numbers are indicated on the left.

Analysis of the N-terminal Tpase region[13] also identified two shared domains (N1 and N2). N2 corresponds to a potential HTH domain in the region of the IS256 Tpase which recognizes the terminal IRs[14] (Fig. IS256.3).

The cluster can be divided into five clades containing nine groups based on branching of the Tpase phylogenetic tree: two types of closely related TnGBS, TnGBS1 and TnGBS2, and ISLre2 (MULT3); the Mycoplasma ICE; IS256 (MULT1); ISH6 (MULT2); ISAzba1, ISMich2, ISKra4 (MULT4)[13] (Fig. IS256.1, Fig. IS256.2 and Fig. IS256.3).
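The clade and group assignments listed above can be written down as a small lookup table. The following is only an illustrative sketch: the dictionary name `P_MULT_CLADES` and the helper `clade_of` are hypothetical, and treating the Mycoplasma ICE as a clade with a single group of the same name is an assumption made here so that the counts match the "five clades, nine groups" statement.

```python
# Clade -> groups, transcribed from the paragraph above.
# Assumption: the Mycoplasma ICE is entered as its own clade with one group,
# since the text gives it no MULT label.
P_MULT_CLADES = {
    "MULT1": ["IS256"],
    "MULT2": ["ISH6"],
    "MULT3": ["TnGBS1", "TnGBS2", "ISLre2"],
    "MULT4": ["ISAzba1", "ISMich2", "ISKra4"],
    "Mycoplasma ICE": ["Mycoplasma ICE"],
}


def clade_of(group: str) -> str:
    """Return the clade that contains the given group name."""
    for clade, groups in P_MULT_CLADES.items():
        if group in groups:
            return clade
    raise KeyError(f"unknown group: {group}")


n_clades = len(P_MULT_CLADES)
n_groups = sum(len(groups) for groups in P_MULT_CLADES.values())
print(n_clades, n_groups)      # 5 9
print(clade_of("ISMich2"))     # MULT4
```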

Fig. IS256.3. Alignment of the predicted N-terminal DNA binding domain implicated in p-MULT IR recognition. Transposases identified in this study were aligned by the MAFFT alignment software and visualized using Jalview. Only the predicted N2 domain encompassing the minimum IR-binding domain identified in the IS256 transposase (Hennig and Ziebuhr 2010) was retained in the alignment. The alignment was filtered for redundancy in order to keep a subset of transposases representative of the diversity of p-MULT 1, 2, 3, and 4 families. Transposase accession numbers are indicated on the left.

There is a distant relationship with the Tpase of the eukaryotic Mutator TE and, like MuDR from Zea mays, many generate 8-9 bp target repeats on insertion. They have therefore been called MULE (for Mutator-Like Elements). Like MuDR/Foldback, members of these groups carry a largely α-helical insertion domain between the second D and E catalytic residues. This includes a conserved C/D(2)H signature present in both the eukaryotic and prokaryotic elements[13][15].

IS256

The IS256 family can be subdivided into 3 groups: IS256, IS1249, and ISC1250.

IS256 group

The classical IS256 group has a large number of members in both bacteria and archaea. They are between 1200 and 1500 bp long with IRs of 20-30 bp (Fig. IS256.4) and generate DRs of between 8 and 9 bp. They carry a single long orf with a potential DDE motif, with a spacing of 112 residues between the second D and E residues (Fig. IS256.2), together with a correctly placed K/R residue. This spacing is due to an insertion domain[16][17]. The catalytic residues have been validated by mutagenesis[18]. It was shown several years ago that the Tpases of IS256 family elements share some similarities with that of the eukaryotic Mutator element[19], a relationship which has been explored recently in more detail[20].
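The terminal inverted repeats (IRs) mentioned above can be checked computationally by comparing the left end of an element with the reverse complement of its right end. The sketch below is a minimal illustration, not the procedure used by ISfinder; the function names and the toy sequence are hypothetical and do not correspond to any real IS.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def terminal_ir_identity(is_seq: str, ir_len: int) -> float:
    """Fraction of matching positions between the first ir_len bases and the
    reverse complement of the last ir_len bases of the element."""
    left = is_seq[:ir_len].upper()
    right = reverse_complement(is_seq[-ir_len:])
    matches = sum(1 for a, b in zip(left, right) if a == b)
    return matches / ir_len


# Hypothetical toy element: an invented 10 bp IR flanking arbitrary filler.
toy_irl = "GGTACCTTAG"
toy_is = toy_irl + "ACGTACGTACGTAAAC" + reverse_complement(toy_irl)
print(terminal_ir_identity(toy_is, 10))  # 1.0 (perfect inverted repeat)
```

Real IRs are usually imperfect, so in practice an identity threshold or an alignment allowing gaps would be needed rather than the strict position-by-position comparison used here.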

Members of this family transpose using an excised circular dsDNA transposon intermediate [e.g. [18][21]]. They are also found as part of composite transposons such as Tn4001 flanked on either side by IS256[22][2][23][24]. For IS256 itself, the sequences of circle junctions showed that the left IS end preferentially attacked the right end[21], a result that was independently demonstrated by the effect of small deletions in the left and right ends on circle formation[14].

IS1249 group

There are more than 30 members, confined at present to the Actinobacteria and the Firmicutes. They are about 1300 bp in length with IRs of about 26 bp (Fig. IS256.4) and generally generate DRs of 8 bp (with variations of between 0 and 10).

Fig. IS256.4. IS256 and IS1249 Weblogo showing the IS ends. Left (IRL) and right (IRR) inverted terminal repeats are shown in WebLogo format (Crooks et al., 2004).

ISC1250 group

At present, there are only 3 members of this group in ISfinder. All are found in the archaeon Sulfolobus solfataricus.

ISH6

Members of this group (MULT2) were originally observed only in archaea[25]. There are 11 members of about 1450 bp with highly conserved IRs of 24-27 bp (Fig. IS256.5), DRs of 8 bp and a single Tpase orf encoding a protein of 450 aa.

Fig. IS256.5. ISH6 and ISLre2 Weblogo showing the IRs. Left (IRL) and right (IRR) inverted terminal repeats are shown in WebLogo format (Crooks et al., 2004).

ISLre2

There are 48 entries for ISLre2 family members in ISfinder. They are restricted at present to the bacteria. They are between 1500 and 2000 bp long, with IRs from 15 to 29 bp (Fig. IS256.5), and generate 9 bp DRs. Together with the related TnGBS ICE, they show strong target specificity and insert 13-17 bp upstream of σA promoters [11][13] in an oriented fashion with the RE proximal. PCR analysis has detected a transposon circle junction, as with the related ICE, suggesting that transposition may occur via a Donor Primed Transposon Replication process.

ISKra4

This newly emerging family includes 83 members and is divided into three related groups: ISAzba1, ISMich2 and ISKra4.

ISAzba1

There are presently 28 members of this group. They encode a Tpase of between 450 and 480 aa, are 1400 to 2900 bp long with IRs of about 20 bp (Fig. IS256.6) and generate no DR. Five (ISAfe13, ISCot1, ISEc51, ISKpn19, ISSysp7) carry an orf in addition to the Tpase, which specifies a protein related to serine recombinases or resolvases. Four of these also include a third orf annotated as a hypothetical protein. The fifth, ISAfe13, carries the Tpase, a resolvase, and an alternative orf annotated as ORF-3-like from plasmid pRiA4b. Other proteins found in this family are annotated as hypothetical or as putative TnpR resolvases, although no direct evidence for resolvase function is available. Eight other members simply encode the Tpase and the ORF-3-like protein. ISCep1 includes the ORF-3-like protein and a third orf annotated as phage integrase or xerC/D.

Fig. IS256.6. ISKra1 grp ISAzba1, ISKra1 grp ISKra1, and ISKra1 grp ISMich2 Weblogo showing the IRs. Left (IRL) and right (IRR) inverted terminal repeats are shown in WebLogo format (Crooks et al., 2004).

ISMich2

This group includes 24 members which are presently limited to the cyanobacteria. Twenty-two have a Tpase orf distributed between two reading phases, while in the remaining two the Tpase forms a single continuous orf. However, all show a potential but atypical frameshift motif, TTTTTT, which could be involved in either PRF (Programmed -1 Ribosomal Frameshifting) or PTR (Programmed Transcriptional Realignment) recoding. Further experimental analysis would be necessary to confirm or refute this. Members are between 1250 and 1400 bp long with a Tpase of 360 aa, IRs of between 18 and 39 bp (Fig. IS256.6) and 8 bp DRs. Three members (ISCysp26; ISMic1; ISMich2) carry a passenger gene annotated as a hypothetical protein.
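A first-pass screen for candidate recoding sites of this kind is simply a search for homopolymer runs such as TTTTTT. The following sketch assumes nothing beyond the motif itself: the function name `find_homopolymer_runs` and the toy sequence are hypothetical, and a hit only flags a position for inspection; it says nothing about whether frameshifting actually occurs there.

```python
import re


def find_homopolymer_runs(seq: str, base: str, min_len: int) -> list[tuple[int, int]]:
    """Return (start, length) for every run of `base` at least min_len long.
    Positions are 0-based."""
    pattern = re.compile(f"{base}{{{min_len},}}")
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(seq.upper())]


# Hypothetical toy sequence containing one T6 run and one A7 run.
toy = "ATGGC" + "TTTTTT" + "GC" + "AAAAAAA" + "GG"
print(find_homopolymer_runs(toy, "T", 6))  # [(5, 6)]
print(find_homopolymer_runs(toy, "A", 7))  # [(13, 7)]
```

To decide whether a run could join two reading phases, one would additionally need to check that the upstream and downstream orfs overlap at the motif in the appropriate relative frame.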

ISKra4

This small group of elements ranges in size from 1400 to 3700 bp due to the presence in some of various passenger genes. They have IRs of 18 to 31 bp (Fig. IS256.6) and generate DRs of 9 bp. Three carry passenger genes: ISLdr1, a hypothetical protein and a reverse transcriptase; ISSri1, a transcriptional regulator; and ISTn1, a hypothetical protein. Six members may express their Tpases by frameshifting (five include a 7A motif and one a 5TC motif).


Mechanism

Recently, the IS256 transposition mechanism has been addressed [25] using the family member ISCth4 from the thermophile Clostridium thermocellum (Hungateiclostridium thermocellum ATCC 27405; NC_009012). ISCth4 is present in 15 copies in the host genome, 12 of which are flanked by 8 bp target site duplications (ISfinder: https://www-is.biotoul.fr/scripts/ficheIS.php?name=ISCth4), strongly suggesting that the IS is active. The choice of this IS was dictated by the solubility of its transposase for structural studies. Like other members of the family, the ISCth4 transposase includes a predicted alpha-helical insertion domain (Groups with DDE Transposases;[15][26][27]) within the catalytic domain, with a conserved CxxH motif (Fig. IS256.2) important for catalysis in transposases of this type [25][28][29]. In addition, like other IS256 family members [4], there are sequence elements present in both ends [25] which would generate a strong promoter to drive transposase expression in a typical circular IS transposition intermediate (IS and Gene Expression; IS3 family: Mechanism, Circular intermediate).

The ISCth4 transposase is organized (Fig. IS256.7 top)[25] into a number of domains including a dimerization domain (DD), an N-terminal DNA binding domain (NDB), a helix-turn-helix (HTH) and a catalytic domain (CD) split by an “insertion” domain (ID), in which the upstream CD segment carries the DD and CXXH residues and the downstream CD segment carries the final E of the DDE triad. The positions of these domains and their relevant secondary and tertiary structures (Fig. IS256.7 bottom) bear a close correspondence to those of the IS256 transposase itself.

Transposition reaction steps

Biochemical studies explored ISCth4 transposition steps and confirmed that this IS undergoes copy-out-paste-in transposition (Major DDE transposition pathways) [25].

The first step in the transposition pathway, figure-eight formation, in which a single-strand bridge is formed between the two IS ends, was observed in vitro. This was observed using linear substrates with a pair of 35 bp IRL and IRR cloned in the appropriate orientation. No bridged intermediate could be detected when the substrate was in a supercoiled form. The sequences of the intervening linker bases, which derive from the flank of the target end (e.g. [30]), were between 6 and 8 bp in length and showed that either end could be used as a target by its partner end. However, the sample (n=8) was not large enough to determine whether there was a preference. The second step, replication and excision, is technically challenging to observe (see [31]) and was not addressed.

It was also possible to reconstitute the final circle integration step in vitro. This involves cleavage at both IS ends at the IS circle junction to generate 3’OH ends and their transfer into the target. In the classical concerted integration reaction (first seen as a product of mini phage Mu insertion into a circular plasmid [32] and subsequently developed as a diagnostic assay for concerted insertion of other elements such as HIV [33] and mariner [34]), two ends, both with a 3’OH, insert into opposite strands at the target site in a supercoiled target, resulting in linearization (single-end integration results simply in relaxation of the target). Double-strand IRs of more than 25 bp were found to integrate efficiently. Moreover, when the two IS ends were abutted as in an IS circle, the junction was able to integrate to generate a linear plasmid in a reaction which was most efficient if the spacer was 6 bp in length and resulted in an 8 bp flanking direct target repeat. Junction integration occurred preferentially into a supercoiled target.
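The 8 bp flanking direct target repeat can be pictured with a simple string model. The sketch below assumes the usual interpretation that the two IS ends are transferred to staggered positions on opposite target strands and that filling in the resulting gaps duplicates the intervening target bases; it models only the final sequence, not the chemistry. The function name `insert_with_tsd` and the sequences are hypothetical.

```python
def insert_with_tsd(target: str, element: str, position: int, tsd_len: int = 8) -> str:
    """Return the sequence obtained after inserting `element` into `target`,
    with the tsd_len target bases starting at `position` duplicated so that
    one copy flanks each side of the element."""
    if position < 0 or position + tsd_len > len(target):
        raise ValueError("target site does not fit inside the target sequence")
    tsd = target[position:position + tsd_len]
    # Left flank ends with the original copy of the target site;
    # the right flank begins with the duplicated copy.
    return target[:position + tsd_len] + element + tsd + target[position + tsd_len:]


# Hypothetical 16 bp target; "[IS]" stands in for the whole element.
product = insert_with_tsd("AAAACCCCGGGGTTTT", "[IS]", position=4)
print(product)  # AAAACCCCGGGG[IS]CCCCGGGGTTTT
```

Here the 8 bp site CCCCGGGG appears once on each side of the inserted element, which is the signature used to recognize the 12 ISCth4 copies flanked by target site duplications.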

An Asymmetric Transpososome

A structural analysis has provided some important insights into the molecular machinery and mechanism of ISCth4 transposition and, by extension, into transposition of the entire IS256 family and probably of the other IS families which transpose using a copy-out-paste-in mechanism. TnpA was found to bind to DNA as at least a dimer (stoichiometry 1:2, end:TnpA). Crystal structures of TnpA bound to three different substrates were determined at good resolution: a pre-reaction complex with a 26 bp IRR and six flanking bp; a pre-cleaved complex with a 26 bp IRR without flanking DNA; and a complex with a substrate resembling an IS circle junction with abutted left and right ends, which could also structurally mimic the figure-eight intermediate, in which the two IRs are bridged on one strand by a 6 nucleotide spacer while the other strand carries a gapped 5 nucleotide complementary spacer.

One remarkable observation is that all three transpososome complexes are asymmetric (Fig. IS256.8). Although not unexpected in view of the asymmetric cleavage and transfer reactions involved in copy-out-paste-in transposition, this is unusual: it is the only example of an asymmetric complex among the transpososome structures available [17][35][36][37]. The studies showed that: in all three structures, a single DNA molecule is bound by a TnpA dimer; the IR is recognized through an extensive interface in which most protein-DNA interactions are made by one of the two TnpA protomers [A], with the transposon tip directed toward the catalytic domain of the partner protomer [B], consistent with the “trans” cleavage commonly observed in many transposition reactions [35][37] (Cleavage in trans: A Committed Complex); the alpha-helical insertion domain (a characteristic of the IS256 family) which interrupts the catalytic domain (Fig. IS256.7) includes the important C/DxxH motif, which is located at the same position in the eukaryotic TE Hermes [29][38]; and the donor IR is bound by two regions of one monomer [A]: the NDB binds a subterminal sequence while sequences closer to the tip are bound by both the CD and HTH domains (Fig. IS256.8), and some of the residues involved are also important in IS256 transposase binding [14]. It is these interactions towards the IR tip that direct it towards the catalytic domain of the partner monomer.

The structural studies provide an indication of how the figure-eight intermediate may be formed and how the IS circle junction may be processed during insertion: the removal of the DNA flank in the pre-reaction complex (Fig. IS256.8i) to generate the pre-cleavage complex (Fig. IS256.8ii) results in folding of the ID of the non-bound monomer [B] towards the IR tip of the DNA on the bound monomer [A]. In the Strand Transfer complex (Fig. IS256.8iii), both IDs ([A] and [B]) close, forming a protein-protein interface, and enclose the linker sequence, interacting with it in a non-sequence-specific way. This movement creates an additional DNA binding site which places the tip of one IR in the catalytic site. The DNA substrate used could assume configurations resembling either the figure-eight junction (but lacking the adjoining donor plasmid flanks found for IS911 [30]) or an IS circle junction with a single nucleotide gap at one IR tip (Fig. IS256.9). It is proposed [25] that target DNA is sequestered following reopening of the ID [A] and [B] interface.

The insertion domain appears to undergo dynamic changes, opening and closing over the course of the consecutive transposition steps, a flexibility also found for similar domains in the eukaryotic transposases of Transib and Rag1 (Ru et al., 2018; Liu et al., 2019; Chen et al., 2020). This movement directs different DNA segments along the copy-out-paste-in transposition pathway. However, not all ISs which undergo this type of transposition carry transposases with insertion domains (Transposases examined by secondary structure prediction programs). It will be interesting to determine how these ISs, such as IS3 or IS30 family members, accomplish figure-eight formation and integration of the circle junction without the ID.

While these data provide some revealing snapshots of copy-out-paste-in transposition, they leave some intriguing questions such as how the figure-eight intermediate is replicated to generate a closed circular IS form with a junction composed of abutted IRs and how the target DNA engages in the transpososome.

Bibliography

  1. PubMed: 6323927
  2. Lyon BR, Gillespie MT, Skurray RA. Detection and characterization of IS256, an insertion sequence in Staphylococcus aureus. J Gen Microbiol. 1987 Nov;133(11):3031-8. PubMed: 2833560
  3. PubMed: 31474962
  4. [citation text missing]
  5. PubMed: 7899522
  6. PubMed: 1334269
  7. PubMed: 8654967
  8. PubMed: 7625803
  9. PubMed: 8031032
  10. PubMed: 8723445
  11. [citation text missing]
  12. PubMed: 23888872
  13. Guérillot R, Siguier P, Gourbeyre E, Chandler M, Glaser P. The diversity of prokaryotic DDE transposases of the mutator superfamily, insertion specificity, and association with conjugation machineries. Genome Biol Evol. 2014 Feb;6(2):260-72. PubMed: 24418649
  14. Hennig S, Ziebuhr W. Characterization of the transposase encoded by IS256, the prototype of a major family of bacterial insertion sequence elements. J Bacteriol. 2010 Aug;192(16):4153-63. PubMed: 20543074
  15. Yuan YW, Wessler SR. The catalytic domain of all eukaryotic cut-and-paste transposase superfamilies. Proc Natl Acad Sci U S A. 2011 May 10;108(19):7884-9. PubMed: 21518873
  16. PubMed: 20067338
  17. [citation text missing]
  18. [citation text missing]
  19. PubMed: 8041625
  20. PubMed: 19018586
  21. [citation text missing]
  22. PubMed: 6323927
  23. PubMed: 2553542
  24. PubMed: 2544565
  25. Filée J, Siguier P, Chandler M. Insertion sequence diversity in archaea. Microbiol Mol Biol Rev. 2007 Mar;71(1):121-57. PubMed: 17347521
  26. PubMed: 10207011
  27. PubMed: 16041385
  28. PMC5225508
  29. PMC6212770
  30. PubMed: 7590258
  31. PMC522794
  32. PubMed: 2822259
  33. PubMed: 15958388
  34. PubMed: 15333635
  35. PMC3536463
  36. PMC2999894
  37. PubMed: 10884228
  38. PMC4105704