Difference between revisions of "General Information/Major Groups are Defined by the Type of Transposase They Use"

From TnPedia

Revision as of 12:16, 4 May 2020

The principal factor in IS classification is the similarity, at the primary sequence level, of the enzymes which catalyze their movement, their transposases (Tpases). A variety of other characteristics are also taken into account. These include: the length and sequence of the short imperfect terminal inverted repeat sequences (IRs) carried by many IS at their ends (TIRs or ITRs in eukaryotes); the length and sequence of the short flanking direct target DNA repeats (DRs) (TSD, Target Site Duplication, in eukaryotes) often generated on insertion; the organization of their open reading frames; and the target sequences into which they insert[1][2][3]. IS and some transposons can also be divided into two major types based on the chemistry used in breaking and rejoining DNA during TE displacement: those with DDE (and DEDD) enzymes and those with HUH enzymes. Additional types of transposase have been identified (Fig.1.7.1) but are generally associated with other types of transposon rather than with IS.

A relatively new type of potential transposase, Cas1, is associated with so-called casposons, elements that may resemble complex IS and are related to CRISPRs (for more details please see The Casposases section).

Groups with DDE Transposases

Fig. 1.7.1. Types of transposon and catalytic sites. Each column shows a different type of transposase with the principal amino acids defining its catalytic site shown at the top. Underneath are shown examples for which the atomic structures have been determined (at the present time, May 2020, there is no overall structure of a bacterial Y2 enzyme from the IS91 family). Below the cartoons, the figure indicates some of the bacterial TE and, below these, the eukaryotic TE which encode transposases with each of the catalytic centers. The boxes at the bottom show the nucleophiles used in each case to break the phosphodiester DNA bond.

DDE enzymes, so-called because of a conserved Asp, Asp, Glu triad of amino acids which coordinate essential metal ions, use OH (e.g. H2O) as a nucleophile in a transesterification reaction[4] (Fig.1.7.1) and (Fig.1.8.1). IS with DDE enzymes are the most abundant type in the public databases (Fig.1.4.2). This is partly because the definition of an IS became implicitly coupled to the presence of a DDE Tpase, an idea probably reinforced by the similarity between Tpases of IS (and other TE) and the retroviral integrases (Fig.1.8.1)[5][6][7], particularly in the region including the catalytic site. More precisely, for these TE, the triad is DD(35)E, in which the second D and the E are separated by 35 residues. As more DDE transposases were identified, the distance separating the D and E residues was found to vary slightly (see the table "Transposases examined by secondary structure prediction programs")[8]. However, for certain IS, this distance was significantly larger. In these cases, the Tpases include an “insertion domain” between the second D and E residues[9] with either α-helical or β-strand configurations (Fig.1.8.2). Although in most cases this is a prediction, it has been confirmed by crystallographic studies for the IS50 (β-strand)[10] and Hermes (α-helical)[11] Tpases. The function of these “insertion domains” is not entirely clear[12].
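The DD(n)E notation can be illustrated with a small calculation. The sketch below assumes that n counts the residues lying strictly between the second Asp and the Glu, and the function name `dde_motif` is hypothetical. The example positions (D64, D116, E152) are the commonly cited catalytic residues of HIV-1 integrase; they are supplied here as an illustration and are not taken from the text above.

```python
def dde_motif(d1: int, d2: int, e: int) -> str:
    """Format a catalytic triad as DD(n)E.

    Assumption: n is the number of residues strictly between the
    second Asp (d2) and the Glu (e), using 1-based positions.
    """
    if not d1 < d2 < e:
        raise ValueError("expected residue positions in the order D < D < E")
    spacer = e - d2 - 1  # residues between the second D and the E
    return f"DD({spacer})E"


# Illustrative input: HIV-1 integrase D64, D116, E152
print(dde_motif(64, 116, 152))  # DD(35)E
```

A much larger value of n for a given Tpase would be consistent with, but does not by itself demonstrate, the presence of an insertion domain.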

Fig. 1.8.1 DDE transposase Asp-Asp-Glu domain. Top: variation in spacing of the amino acid DDE triad and the downstream conserved lysine or arginine residues. The references are among the first to recognize the significant similarity between eukaryotic and prokaryotic transposases. Below: the original structure of the HIV integrase catalytic core domain showing the position of the 4 relevant amino acids (green arrows), a single divalent metal cation (blue circle), and the projected position of bound DNA. (Figure thanks to F. Dyda)
Fig 1.8.2 The Transposase structures. Ribbon diagrams of aligned catalytic cores of four DNA transposases and of HIV-1 integrase. Residues shown in orange are the carboxylate active site residues, in green are the W residues of the Tn5 transposase and Hermes that are important in the reactions, in blue are the YRK residues of the YREK motif and in yellow is W298 of the Tn5 transposase. The insertion domains of the Tn5 transposase and Hermes are shown in red. The proteins are to scale. Adapted from Hickman et al., 2010.

Transposases examined by secondary structure prediction programs

Table 2. Adapted from Hickman et al. 2010, Integrating prokaryotes and eukaryotes: DNA transposases in light of structure. (1) Information on the number of copies within the host genome was obtained from ISfinder or the reference indicated by the asterisk. (2) Where indicated, the secondary structure predicts an insertion domain between β5 and α4 with predominantly either β-strands or α-helices. (3) Relevant references include reviews or papers that report the results of secondary structure prediction, report sequence alignments or consensus sequences, identify the DDE/D catalytic residues, or demonstrate that the element is active. The association of certain eukaryotic superfamilies to specific IS families is as per Feschotte and Pritham (2005) and references therein.
Family; Element (or protein) analyzed; Active or # copies in genome (1); From secondary structure, type of DDE/D motif (2); Relevant references (3)
IS1 IS1N >40* DD(24)E *Nyman et al., 1981; Ohta et al., 2002, 2004; Siguier et al., 2009
ISSto9 5 DD(20)E
IS1595 ISPna2 DD(36)N Siguier et al., 2009
ISH4 DD(36)E
IS1016C DD(34)E
IS1595 DD(35)N
ISSod11 13 DD(34)H
ISNWi1 DD(35)E
ISNha5 DD(33)E
Merlin: MERLIN1_SM consensus DD(36)E Feschotte, 2004
IS3 IS911 Active DD(35)E Polard and Chandler, 1995; Rousseau et al., 2002
IS481 IS481 ~100* DD(35)E *Glare et al., 1990; Chandler and Mahillon, 2002
IS4 IS50R Active PDB ID: 1muh Rezsöhazy et al., 1993; Davies et al., 2000
IS701 IS701 Active (15*) DD(β-strand)E *Mazel et al., 1991
ISRso17 7
ISH3 ISC1359 5 DD(β-strand)E
ISC1439A 13
IS1634 IS1634 Active (~30*) DD(β-strand)E *Vilei et al., 1999
ISMac5 7
ISPlu4 7
IS5 IS903 Active DD(65)E Derbyshire et al., 1987; Rezsöhazy et al., 1993; Tavakoli et al., 1997
PIF/Harbinger: PIFa (Z. mays) Active DD(59)E Zhang et al., 2001; Kapitonov and Jurka, 2004; Sinzelle et al., 2008
IS1182 IS660 3 DD(β-strand)E Takami et al., 2001
ISPsy6 14
IS6 IS6100 Active DD(34)E Martin et al., 1990; Mahillon and Chandler, 1998
IS21 IS21 Active DD(45)E Mahillon and Chandler, 1998; Berger and Haas, 2001
IS30 IS30 Active DD(33)E Caspers et al., 1984; Mahillon and Chandler, 1998
IS66 IS679 Active DD(α-helical?)E Han et al., 2001
ISPsy5 33
ISMac8 3
IS110 IS492 Active DEDD Perkins-Balding et al., 1999; Buchner et al., 2005
IS1111 20
IS256 IS256 Active DD(α-helical)E Mahillon and Chandler, 1998; Prudhomme et al., 2002
MuDr/Foldback (Mutator) Active Eisen et al., 1994; Babu et al., 2006; Hua-Van and Capy, 2008
IS630 ISY100 Active DD(34)E Doak et al., 1994; Feng and Colloms, 2007
Tc1/mariner: Mos1 (D. mauritiana) PDB ID: 2f7t Plasterk et al., 1999; Richardson et al., 2006
Zator: Zator-1_HM 36* DD(43)E *Bao et al., 2009
IS982 ISPfu3 5 DD(47)E Mahillon and Chandler, 1998
IS1380 IS1380A ~100* DD(β-strand)E *Takemura et al., 1991; Chandler and Mahillon, 2002
piggyBac (T. ni) Active DD(β-strand)D Cary et al., 1989; Sarkar et al., 2003; Mitra et al., 2008
ISAs1 ISAzo3 7 DD(β-strand)E/D?
ISL3 IS31831 Active DD(α-helical)E Suzuki et al., 2006
IS651 22
Tn3 Tn3 (E. coli) Active DD(α-helical?)E, DD(α-helical)E insertion Grindley, 2002
hAT Hermes Active PDB ID: 2bw3 Warren et al., 1994; Rubin et al., 2001; Hickman et al., 2005
CACTA CACTA1 (A. thaliana) En/Spm ZM Active DD(α-helical?)E/D? Miura et al., 2001; DeMarco et al., 2006
P Drosophila Active ? Rio, 2002
Transib Transib1_AG Consensus DD(α-helical)E Kapitonov and Jurka, 2005; Chen and Li, 2008
RAG1 (M. musculus) Active Kim et al., 1999; Landree et al., 1999; Lu et al., 2006
Sola Sola3-3_HM Multiple copies* DD(40)E *Bao et al., 2009
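The motif column of the table mixes numeric spacings, such as DD(35)E, with predicted insertion domains, such as DD(β-strand)E, and a trailing "?" marks uncertain assignments. A minimal sketch of how such strings could be separated programmatically follows. The function name `parse_motif`, the returned field names, and the treatment of "?" as "tentative" are assumptions made for illustration, not part of ISfinder or any published tool.

```python
import re

# DD(<spacer>)<third residue(s)>, e.g. DD(35)E or DD(β-strand)E/D?
MOTIF = re.compile(r"^DD\((?P<spacer>[^)]+)\)(?P<third>.+)$")


def parse_motif(text: str) -> dict:
    """Split a motif string from the table into its parts."""
    match = MOTIF.match(text.strip())
    if match is None:
        # Entries such as "DEDD" or "PDB ID: 1muh" are not DD(x)E strings
        return {"raw": text, "kind": "other"}
    spacer = match.group("spacer")
    third = match.group("third")
    if spacer.isdigit():
        return {"raw": text, "kind": "spacing",
                "spacing": int(spacer), "third": third}
    return {"raw": text, "kind": "insertion",
            "domain": spacer.rstrip("?"),
            "tentative": "?" in text, "third": third}


for entry in ["DD(35)E", "DD(36)N", "DD(β-strand)E/D?", "DEDD"]:
    print(parse_motif(entry))
```

Rows whose motif cell holds a PDB identifier or is empty fall into the "other" category and would need separate handling.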

Major DDE transposition pathways

Although DDE-type transposons share basic transposition chemistry, different TE vary in the steps leading to the formation of a unique insertion intermediate (Fig.1.8.3)[13][14]. Their transposases catalyze the cleavage of a single DNA strand to generate a 3’OH at the TE ends, which is subsequently used as a nucleophile to attack the DNA target phosphate backbone. The strand carrying this 3’OH is known as the transferred strand. The variations are due to the way in which the second (non-transferred) strand is processed[15][16][17].

There are several ways in which second-strand processing can occur (Fig.1.8.3): for certain IS, the second strand is not cleaved but replication following the transfer of the first strand fuses donor and target molecules to generate cointegrates with a directly repeated copy at each donor/target junction. This is known as replicative transposition (e.g. IS6, Tn3) or more precisely, Target Primed Replicative Transposition (TPRT) (Fig.1.8.3 pathway a).

In the other pathways, the flanking donor DNA can be shed in several different ways. The non-transferred strand may be cleaved first, several bases within the IS, prior to cleavage of the transferred strand (e.g. IS630 and Tc1[18][19][20]) (Fig.1.8.3 pathway d). The 3’OH generated by first-strand cleavage may be used to attack the second strand to form a hairpin structure at the IS ends, liberating the IS from flanking DNA; the hairpin is subsequently hydrolyzed to regenerate the 3’OH. This is known as conservative or cut-and-paste transposition (e.g. IS4[21]) (Fig.1.8.3 pathway f) and (Tn10 movie - see below) (IS4.4; IS4.5; IS4.6; IS4.7). The 3’OH of the transferred strand from one IS end may attack the other end to generate a donor molecule with a single-strand bridge, which is then replicated to produce a double-strand transposon circle intermediate and regenerate the original donor molecule. This is known as copy-out-paste-in or, more precisely, Donor Primed Replicative Transposition (DPRT) (e.g. IS3[22]) (Fig.1.8.3 pathway e) and (IS911 movie - see below). Finally, the 3’OH at the flank of the non-transferred strand may attack the second strand to form a hairpin on the flanking DNA and a 3’OH on the transferred strand (Fig.1.8.3 pathway g); at present this has been demonstrated only for eukaryotic TE of the hAT family and in V(D)J recombination[23].

Clearly, many families produce double-strand circular intermediates, but this does not necessarily mean that they all use the copy-out-paste-in (DPRT) mechanism, since a circle could formally be generated by excision involving recombination of both strands[24]. These differences are reflected in the different IS families.
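The second-strand processing routes described above can be gathered into a small lookup structure. This is only an organizational sketch: the class name `Pathway`, its fields, and the helper `pathways_for` are hypothetical, and the entries are limited to the pathways and examples named in the text (pathways b and c of Fig.1.8.3 appear only in the figure legend and are omitted).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pathway:
    name: str
    second_strand: str
    examples: tuple


# Keys are the pathway letters of Fig.1.8.3 as cited in the text
PATHWAYS = {
    "a": Pathway("replicative transposition (TPRT)",
                 "not cleaved; replication fuses donor and target "
                 "into a cointegrate",
                 ("IS6", "Tn3")),
    "d": Pathway("non-transferred strand cleaved first",
                 "cleaved several bases within the IS before the "
                 "transferred strand",
                 ("IS630", "Tc1")),
    "e": Pathway("copy-out-paste-in (DPRT)",
                 "single-strand bridge replicated into a "
                 "double-strand transposon circle",
                 ("IS3",)),
    "f": Pathway("cut-and-paste (conservative)",
                 "hairpin formed at the IS ends, then hydrolyzed",
                 ("IS4",)),
    "g": Pathway("hairpin on flanking DNA",
                 "flank 3'OH attacks the second strand",
                 ("hAT", "V(D)J recombination")),
}


def pathways_for(example: str) -> list:
    """Return the pathway letters that list the given example."""
    return sorted(k for k, p in PATHWAYS.items() if example in p.examples)


print(pathways_for("IS3"))  # ['e']
```

As the paragraph above cautions, observing a circular intermediate for a family does not by itself justify assigning it to pathway e.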

Fig 1.8.3 Major DDE transposition pathways: Dealing with the second strand. The color code is as follows: transposon DNA (green); flanking donor DNA (blue); target phosphates destined to be removed from the final liberated transposon (filled blue circles with a white “P”); phosphates destined to remain as 5′ transposon ends (open blue circles); the preferred stereoisomer, Sp or Rp, where known, is indicated within the circles; liberated 3′OH groups involved in strand joining reactions (open red circles); 3′OH destined to be removed from the liberated transposon (filled red circles); H2O is the attacking nucleophile in the hydrolysis reactions. (a) The Mu and Tn3 cleavage reactions. Note that the preferred stereoisomer has been demonstrated only for Mu and not for Tn3. (b) Tn7 cleavage reactions. Cleavage of the transferred strand (top of panel) is shown occurring prior to cleavage of the non-transferred strand (middle) leading to liberation of the transposon from flanking donor DNA (bottom of panel), although this order of cleavage reactions has not been demonstrated experimentally. The two types of cleavage are catalyzed by different enzymes. (c) Retroviral “processing” reaction, equivalent to cleavage of the transferred strand. An initial transcription step from the integrated provirus is indicated. The RNA genome is then encapsidated with a second copy and undergoes reverse transcription following infection to generate the double strand DNA integration intermediate. The intermediate is flanked by only short fragments of donor material and does not require second strand processing for insertion. (d) Transposition by the members of the IS630 family and the Tc1/Mariner superfamily is initiated by cleavage of the non-transferred strand (top of panel) at several bases within the transposon end (middle) leaving these bases attached to the liberated flanks following cleavage of the transferred strand (bottom). 
(e) For IS911, IS2, IS3 and other members of the IS3 family, single-end hydrolysis occurs (top). The liberated 3′OH then directs a strand transfer reaction to the same strand several bases 5′ to the other end of the element. This results in the formation of a single-strand circle which is then resolved into a transposon circle by replication from the free 3′OH (filled red circle). Single-strand hydrolysis at each 3′ end within the circle generates a linear transposon which can then undergo integration. (f) The IS4 family and piggyBac have similar mechanisms. Following initial nucleophilic attack on the Rp target phosphate, the liberated 3′OH attacks an Sp phosphate in a trans-strand transfer reaction to generate a hairpin intermediate, liberating the transposon from its flanking donor DNA and inverting the target phosphate to its Rp configuration. These then become the substrates for a second hydrolysis. Note that the stereochemistry has been analyzed only in the case of Tn10. (g) Hermes and V(D)J transposition occur by initial cleavage of the non-transferred strand (top). The liberated 3′OH on the donor flank then attacks the opposite strand (middle) to generate a hairpin structure on the donor flank (bottom). The stereochemistry has been analyzed for V(D)J only. Modified and reprinted from Turlan and Chandler (2000).
IS911 and Tn10 transposition mechanisms
IS911: copy-out-paste-in (column e in Fig.1.8.3). Tn10: cut-and-paste (column f in Fig.1.8.3).

Groups with DEDD Transposases

A similar type of Tpase, known as a DEDD Tpase, is related to the Holliday junction resolvase, RuvC (Choi et al., 2003; Buchner et al., 2005)[25][26][27] but is at present limited to a single known IS family (IS110). The organization of family members is quite different from that of the DDE IS: they do not contain the typical terminal IRs of the DDE IS (although one subgroup, IS1111, carries sub-terminal IRs) and do not generate flanking target DRs on insertion. This implies that their transposition occurs using a different mechanism from that of the DDE IS. It seems probable that an intermediate resembling a four-way Holliday junction is involved. Moreover, in contrast to the DDE transposases, in which a DNA binding domain invariably precedes the catalytic domain, DEDD transposases appear to include a DNA binding domain downstream from the catalytic domain.

Groups with HUH Enzymes

Fig 1.10.1 The HUH enzymes. Organization of representative HUH domain-containing proteins is shown; they contain HUH, helicase, oligomerization (OD) and proposed Zn-binding (not necessarily structurally related) domains. The length of each protein is indicated in numbers of amino acids, and those proteins for which HUH domain structures are available are indicated with an asterisk; the HUH motif data are from Koonin & Ilyina, Biosystems 30, 241–268 (1993) (motif 2) and Garcillan-Barcia et al., FEMS Microbiol. Rev. 33, 657–687 (2009) (motif III); the Y motif data are from Koonin & Ilyina, Biosystems 30, 241–268 (1993) (motif 3) and Garcillan-Barcia et al., FEMS Microbiol. Rev. 33, 657–687 (2009) (motif I). The assigned domain organizations are taken from phage φX174 protein A (gpA) (Boer et al., EMBO J. 28, 1666–1678 (2009)), AAV Rep78 (Smith et al., J. Virol. 71, 4461–4471 (1997)), tomato yellow leaf curl virus (TYLCV) Rep (Campos-Olivas et al., Proc. Natl Acad. Sci. USA 99, 10310–10315 (2002)), plasmid pMV158 RepB (Boer et al., EMBO J. 28, 1666–1678 (2009)), plasmid R388 TrwC (Guasch et al., Nature Struct. Biol. 10, 1002–1010 (2003)), plasmid RSF1010 MobA (mobilization protein A) (Monzingo et al., J. Mol. Biol. 366, 165–178 (2007)), transposases from the insertion sequences IS608 (Ronning et al., Mol. Cell 20, 143–154 (2005)), IS91 and insertion sequence with a common region 1 (ISCR1) (S. Messing, A.B.H. and F.D., unpublished observations), and HeliBat1 (a consensus sequence from a bioinformatic prediction).

TE encoding the second major type of Tpase, called HUH (named for the conserved active site amino acid residues, H = histidine and U = large hydrophobic residue) (Fig.1.7.1) and (Fig.1.10.1), have been identified more recently. HUH enzymes are widespread single-strand nucleases. They include Rep proteins involved in bacteriophage and plasmid rolling circle replication and relaxases or Mob proteins involved in conjugative plasmid transfer[28]. Among TE, they are limited to two prokaryotic families (IS91 and IS200/IS605[29]) and one eukaryotic family (helitron[30]).

As Tpases, they are involved in presumed rolling circle transposition and also in single-strand transposition (see [31][32]). Not only is the transposition chemistry radically different to that of DDE group elements, since it involves DNA cleavage using a tyrosine residue and transient formation of a phospho-tyrosine bond, but the associated transposons have an entirely different organization and include sub-terminal secondary structures instead of IRs (see IS families below [33]). Note that these Tpases are not related to the well-characterized tyrosine site-specific recombinases such as phage integrases.

There are two major HUH Tpase families, Y1 and Y2 enzymes (Fig.1.7.1)(see [34]), depending on whether there is one catalytic Y residue or two. The Y2 family includes the IS91-family transposases[35][36]; the Y1 family includes the IS200/IS605 transposases[37][38][39][40][41][42]. Although these enzymes use the same Y-mediated cleavage mechanism, IS200/IS605 family Y1 transposases and IS91 transposases appear to carry out the transposition process in quite different ways. Neither type of element carries terminal IRs, nor do they generate DRs on insertion. Members of these families transpose using an entirely different mechanism to IS with DDE transposases[43][44]. The members of the IS91 insertion sequence family[45][46] are related to a newly defined group, the ISCR[47] (see “IS91-related ISCRs”), and to eukaryotic helitrons (Fig.1.7.1)[48]. These IS carry sub-terminal sequences which are able to form hairpin secondary structures (Fig.1.3.2). This is particularly marked in the IS200/IS605 family elements and, at least in the case of this family, it is these structures which are recognised by the transposase[49].

Groups with S-Transposases

The third transposase family is represented by IS607, which carries a Tpase closely related to serine recombinases such as the resolvases of Tn3 family elements. Little is known about its transposition mechanism. However, it appears likely, in view of the known activities of resolvases, that IS607 transposition involves a double-strand DNA intermediate (Grindley, cited as pers. comm. in [50]; see also [51]) (Fig.1.7.1).

Groups with Y-Transposases

Finally, tyrosine site-specific recombinases of the bacteriophage integrase (Int) type are often associated with conjugative transposons (Integrative Conjugative Elements or ICE) (see IS related to ICE) and are considered to be Tpases. However, at present there are no known IS which use this type of enzyme (Fig.1.7.1).
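As a compact recap, the transposase groups discussed in this section can be written out as a simple data structure. The sketch below is a reading aid only: the class name `TpaseGroup`, the field names, and the helper `groups_with_known_is` are hypothetical, and the family lists are non-exhaustive examples drawn from the text and table above.

```python
from typing import NamedTuple


class TpaseGroup(NamedTuple):
    chemistry: str      # as described in the text
    is_families: tuple  # example IS families (non-exhaustive)
    other_te: tuple     # other TE named in the text


GROUPS = {
    "DDE": TpaseGroup("OH (e.g. H2O) nucleophile, transesterification",
                      ("IS3", "IS4", "IS630"), ("Tn3", "hAT")),
    "DEDD": TpaseGroup("related to the Holliday junction resolvase RuvC",
                       ("IS110",), ()),
    "HUH": TpaseGroup("tyrosine, transient phospho-tyrosine bond",
                      ("IS91", "IS200/IS605"), ("helitron",)),
    "S": TpaseGroup("related to serine recombinases (resolvases)",
                    ("IS607",), ()),
    "Y": TpaseGroup("tyrosine site-specific recombinase (Int type)",
                    (), ("ICE",)),
}


def groups_with_known_is() -> list:
    """Groups for which at least one IS family is listed above."""
    return [name for name, g in GROUPS.items() if g.is_families]


print(groups_with_known_is())  # ['DDE', 'DEDD', 'HUH', 'S']
```

The Y group is absent from the result because, as stated above, no IS is currently known to use an Int-type enzyme.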


  1. <pubmed>9729608</pubmed>
  2. <pubmed>24499397</pubmed>
  3. <pubmed>26104715</pubmed>
  4. <pubmed>26104718</pubmed>
  5. <pubmed>1963920</pubmed>
  6. <pubmed>1850126</pubmed>
  7. <pubmed>1314954</pubmed>
  8. <pubmed>20067338</pubmed>
  9. <pubmed>20067338</pubmed>
  10. <pubmed>10207011</pubmed>
  11. <pubmed>16041385</pubmed>
  12. <pubmed>20067338</pubmed>
  13. <pubmed>26104718</pubmed>
  14. <pubmed>20067338</pubmed>
  15. <pubmed>26104718</pubmed>
  16. <pubmed>10838584</pubmed>
  17. <pubmed>14682279</pubmed>
  18. <pubmed>8556864</pubmed>
  19. <pubmed>7954797</pubmed>
  20. <pubmed>17680987</pubmed>
  21. <pubmed>26104553</pubmed>
  22. <pubmed>26350305</pubmed>
  23. <pubmed>15616554</pubmed>
  24. <pubmed>26104718</pubmed>
  25. <pubmed>12897009</pubmed>
  26. <pubmed>15866929</pubmed>
  27. <pubmed>11169105</pubmed>
  28. <pubmed>23832240</pubmed>
  29. <pubmed>26350330</pubmed>
  30. <pubmed>26350323</pubmed>
  31. <pubmed>26350330</pubmed>
  32. <pubmed>26104718</pubmed>
  33. <pubmed>26350330</pubmed>
  34. <pubmed>23832240</pubmed>
  35. <pubmed>6282809</pubmed>
  36. <pubmed>19709290</pubmed>
  37. <pubmed>6315530</pubmed>
  38. <pubmed>3009825</pubmed>
  39. <pubmed>6313217</pubmed>
  40. <pubmed>11807059</pubmed>
  41. <pubmed>9631304</pubmed>
  42. <pubmed>9858724</pubmed>
  43. <pubmed>11136468</pubmed>
  44. <pubmed>16163392</pubmed>
  45. <pubmed>1321417</pubmed>
  46. <pubmed>1310503</pubmed>
  47. <pubmed>16760305</pubmed>
  48. <pubmed>17850916</pubmed>
  49. <pubmed>26350330</pubmed>
  50. <pubmed>17347521</pubmed>
  51. <pubmed>24195768</pubmed>