IS Families/IS5 and related IS1182 families

From TnPedia
Revision as of 12:47, 11 August 2021 by TnCentral

IS5 family

Original Identification

IS5 was originally isolated as an insertion into the immunity region of bacteriophage lambda and subsequently found as a cause of mutation in a number of E. coli genes[1][2][3][4]. Together with IS1, it was also identified as an activator (by insertion) of expression of the usually cryptic beta-glucosidase gene of E. coli[5][6][7][8][9][10].

Presence in Compound Transposons

Several members are associated with compound transposons. These include IS903 and IS602, which form part of the kanamycin resistance transposons Tn903[11] and Tn602[12], respectively, and ISVa1 and ISVa2, which form part of a transposon carrying iron transport genes[13].

Distribution

The IS5 family, like the IS4 family, is a relatively heterogeneous group which now requires reanalysis. It includes sequences from both eubacteria and the archaea. There are now a large number of identified members of the IS5 family (>550 members) and of a closely related IS1182 family (>150 members), which have allowed a more detailed analysis and a separation into various subgroups and families. The IS5 family is partitioned into 6 subgroups: IS5, IS903, ISL2, ISH1, IS1031 and IS427[14] (Table Characteristics of IS families; Fig.5.1). Some of these may prove to be emerging families. Members of the IS5 subgroup appear to be composed of two groups with different lengths: one of 1060-1300 bp and a second of 1460-1610 bp (Fig.5.2 A).

Fig. IS5.1. Correspondence between the IS IRs and different IS5 family subgroups.
Fig. IS5.2. IS5 family, IS5 subgroup. A) distribution of IS length (base pairs); B) distribution of the length of transposase (amino acid residues).

Diversity

The transposases of these are also of different lengths (Fig.5.2 B) and transposase length is correlated with that of the IS. The lengths of the IS1031 subgroup are between ~900 and ~1200 bp with the majority between 103 and 1090 bp (Fig.5.3); those of the IS427 subgroup are between 800 and 1070 bp, with most in the range of 810-900 bp (Fig.5.3). Members of the IS903 subgroup are generally about 1030-1090 bp long (Fig.5.4 A), those of the ISH1 subgroup are about 850-1200 bp long (note that this subgroup includes a number of Miniature Inverted repeat Transposable Elements (MITEs)) (Fig.5.4 B), and ISL2 members are 820 to 1260 bp long with a majority of about 820-970 bp (Fig.5.4 C). There are a large number of additional IS5 family members whose attribution to subgroups has yet to be established.

Fig. IS5.3. IS5 family IS1031 and IS427 subgroups. Top: distribution of IS length (base pairs) IS1031; Bottom: distribution of IS length (base pairs) IS427. The number of examples used in the sample is shown above each column.
Fig. IS5.4. Length (base pairs) distribution of IS5 family IS903, ISH1 and ISL2 subgroups. The number of examples used in the sample is shown above each column.

There is a distant relationship, about 30% similarity, between IS5 and the Pif/Harbinger group of eukaryotic TE[15].

Organization

Although the majority of members have a single Tpase orf, in about 20% the Tpase is distributed between two translation phases and may be expressed by frameshifting, as in most members of the IS427 subgroup (82/116)[14]. In these cases, if frameshifting indeed occurs, the signals appear more appropriate for a programmed transcriptional realignment mechanism (PTR) than for classical translational frameshifting (PRF), since there are no obvious downstream enhancement signals[16].

Similar split reading frames have also been identified in several of the other subgroups: IS1031 (13/65 members); ISL2 (7/43); and a few in the IS5 subgroup (7/149). There is no experimental evidence that these frameshift signals are functional, but many of these IS are present in multiple copies, suggesting that the derivatives are active. In view of their diversity compared to families such as IS3, the subgroups will certainly be partitioned into additional groups as more ISs are identified.
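
The counts quoted above are easier to compare as percentages. The short sketch below only restates the arithmetic; the dictionary and function names are illustrative and not part of any TnPedia or ISfinder tool.

```python
# (members with a Tpase split over two reading frames, members examined),
# exactly as quoted in the text above.
split_frame_counts = {
    "IS427": (82, 116),
    "IS1031": (13, 65),
    "ISL2": (7, 43),
    "IS5": (7, 149),
}


def split_frame_percentages(counts):
    """Return the percentage of split-frame members per subgroup (one decimal)."""
    return {
        name: round(100 * split / total, 1)
        for name, (split, total) in counts.items()
    }


percentages = split_frame_percentages(split_frame_counts)
for name, pct in percentages.items():
    print(f"{name}: {pct}% of members with a split Tpase frame")
```

On these figures, a split frame is the rule in the IS427 subgroup and the exception in the other three.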

At present, the IS903 and the archaeal ISH1 subgroups, whose IRs are quite similar (Fig.5.5), do not contain members with potential frameshifting.

Fig. IS5.5. WebLogo showing the most common ISH1 and IS903 ends. The left (IRL) and right (IRR) inverted terminal repeats are shown in WebLogo format. From top to bottom: IS1031, IS427, ISL2, ISH1, IS903, IS5 subgroups.

In addition to their Tpases and the presence or absence of potential frameshifting, a further distinction between these elements resides in their target specificities.

Certain IS427 subgroup members and IS1182 family members do not carry a termination codon for their Tpases but generate this on insertion into a specific target sequence, CTAG, which is duplicated on insertion. Other IS, such as IS1031, duplicate a sequence TNA while others, such as ISL2, appear to duplicate ANT.

The lengths of the entire group range from 789 bp (e.g., ISMbu1) to 1643 bp (e.g., IS493). The latter carries a second open reading frame, upstream of the "Tpase" frame, which is inessential for transposition[17]. IS4811 (Tn4811[18]), which is greater than 5 kb, clearly contains a number of passenger genes including one with a consensus ATP/GTP-binding motif, one encoding an oxidoreductase-like protein, and one related to bacterial transcription regulators of the AraC family. Another, IS881 from Streptomyces, is interrupted by a group II intron.

The major feature which defines this group is the similarity between their putative Tpases[19]. These include the N2, N3 and C1 domains carried by the IS4 group[20]. However, IS5 family Tpases exhibit a spacing between the N3 and C1 domains of approximately 40 residues, a distance more consistent with the canonical DDE motif[14].

Analysis of the largely increased number of members generally confirms these subgroups. Members within each group also generate distinct DRs of similar lengths (IS5, 4 bp; ISL2, 2-3 bp; IS1031, 3-4 bp; IS903, 8-9 bp; and IS427, 2-3 bp).

The IS903 and ISH1 subgroups have similar terminal IRs (Fig.5.5) but appear distinct by correlation with the length of the target duplication and, to a lesser extent, by the typical length of the entire IS (Fig.5.4).
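
As an illustration of how DR length and overall IS length narrow down the subgroup, the sketch below encodes the values quoted in this article as a lookup table. It is a rough filter under stated assumptions, not a classification method: the ranges are approximate and overlap, the ISH1 DR length (8 bp) is the value given in the discussion of target specificity, and `SUBGROUPS` and `candidate_subgroups` are hypothetical names.

```python
# Values transcribed from this article; ranges are approximate and overlap.
# name: (set of DR lengths in bp, list of (min, max) IS length ranges in bp)
SUBGROUPS = {
    "IS5": ({4}, [(1060, 1300), (1460, 1610)]),
    "IS903": ({8, 9}, [(1030, 1090)]),
    "ISH1": ({8}, [(850, 1200)]),
    "ISL2": ({2, 3}, [(820, 1260)]),
    "IS1031": ({3, 4}, [(900, 1200)]),
    "IS427": ({2, 3}, [(800, 1070)]),
}


def candidate_subgroups(dr_len, is_len):
    """Return the subgroups compatible with an observed DR and IS length."""
    hits = []
    for name, (dr_lengths, length_ranges) in SUBGROUPS.items():
        length_ok = any(low <= is_len <= high for low, high in length_ranges)
        if dr_len in dr_lengths and length_ok:
            hits.append(name)
    return sorted(hits)


# Hypothetical observations, chosen only to exercise the table.
print(candidate_subgroups(9, 1057))  # a 9 bp DR leaves IS903 only
print(candidate_subgroups(8, 1050))  # an 8 bp DR at this length is ambiguous
print(candidate_subgroups(8, 900))   # a shorter element leaves ISH1 only
```

The second call shows why IS length is only a secondary criterion: an 8 bp DR on an element of about 1050 bp fits both IS903 and ISH1.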

Several members exhibit GATC sites within their terminal 50 bp. This includes all members of the IS903 subgroup and many members of the IS1031 and IS427 subgroups. IS903 transposition activity has been shown to be modulated by Dam in vivo (cited in [21]).

A preferred target sequence, YTAR (often CTAG), is observed for two subgroups, IS5 and IS427, and for two members of the ISL2 group (e.g., IS112 and IS1373), in which either all four base pairs or the central TA are duplicated on insertion.
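
A degenerate motif such as YTAR can be located in a sequence by expanding the IUPAC codes (Y = C or T, R = A or G, N = any base) into a regular expression. The sketch below is a minimal illustration; the function names and the example sequence are made up, and only the codes needed for the motifs named in this article are included.

```python
import re

# Subset of the IUPAC nucleotide codes needed for YTAR, TNA and ANT.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Expand a degenerate motif into a regular expression string."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_sites(seq, motif):
    """Return (position, matched sequence) for every occurrence of the motif.

    A lookahead is used so that overlapping occurrences are all reported.
    Positions are 0-based.
    """
    pattern = re.compile(f"(?=({motif_to_regex(motif)}))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]


example = "GGCTAGAATTAACTAAT"  # made-up sequence
sites = find_sites(example, "YTAR")
print(sites)
```

The same function accepts `"TNA"` or `"ANT"` for the other target preferences mentioned in this article.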

It is important to underline that, in many cases, the sequence of the original target site before insertion is not available. This can introduce ambiguities not only in estimating the number of duplicated target base pairs but also in defining the IRs. It is particularly important in several cases where the target repeat is symmetrical (e.g. CTAG) and where it is impossible to distinguish whether the element duplicates 2 or 4 bp and therefore to determine the exact ends of the element. Alignment of the ends of these elements in subgroups has permitted a number of ambiguities to be resolved. Members of the ISL2 group which generate 3 bp DRs exhibit a preference for ANT while those from the IS1031 group (which generate exclusively a 3 bp DR) exhibit a preference for insertion sites with the sequence TNA. Neither the small ISH1 group (8 bp DRs) nor the IS903 group (9 bp DRs) exhibits marked target specificity (see IS903 and also Target specificity). Only two of these elements, IS5 and IS903, have received significant attention.
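
The ambiguity with a symmetrical target such as CTAG can be made concrete with a toy insertion model. In the sketch below, all sequences are invented and `insert_element` is a hypothetical helper: an element that duplicates all 4 bp of CTAG and an element one base longer at each end (G...C) that duplicates only the central TA give exactly the same insertion product, so the product alone cannot reveal where the element ends.

```python
def insert_element(target, element, dup_start, dup_len):
    """Insert `element` so that target[dup_start:dup_start + dup_len]
    appears as a direct repeat on both sides of it."""
    dup_end = dup_start + dup_len
    return target[:dup_end] + element + target[dup_start:]


target = "AAGGCTAGTTCC"          # invented target with CTAG at index 4
core = "GGAAGGTGCGAACACCTTCC"    # invented element sequence

# Model 1: the element is `core` and duplicates the full CTAG (4 bp).
four_bp_model = insert_element(target, core, 4, 4)

# Model 2: the element is G + core + C and duplicates only the central TA (2 bp).
two_bp_model = insert_element(target, "G" + core + "C", 5, 2)

print(four_bp_model)
print(two_bp_model)
print(four_bp_model == two_bp_model)
```

Because the two extra terminal bases are complementary (G and C), both models are also compatible with terminal inverted repeats, which is why the sequence of the empty target site, or an alignment of ends within the subgroup, is needed to decide between them.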

IS5 group

In spite of the historical importance of IS5 in generating mutations, the published work concerning this element is largely directed to an understanding of its coding capacity and expression properties. IS5 carries one large orf, ins5A, spanning the entire element and shown to be essential for transposition (see [7]), and two small orfs (ins5B and ins5C)[22][23][24][25], whose relevance to transposition remains to be demonstrated. Nothing is known about the transposition mechanism of this element.

Mechanism IS903

The only IS5 family member whose transposition mechanism has been addressed at present is IS903. The ends of IS903 carry IRs of 18 bp which exhibit the typical two-domain organization[26]. Transposase has been shown to bind specifically to the ends using a region located in the amino-terminal portion of the protein[27][28]. In addition, a region possibly involved in the formation of higher order multimers has been identified and residues probably involved in catalysis have been pinpointed among the conserved residues in the catalytic DDE domain[28]. Insertion generates a 9 bp target duplication.

An elegant genetic analysis provided strong evidence that IS903 is not only capable of undergoing direct insertion but can also generate adjacent deletions in a duplicative manner. Moreover, point mutations in the terminal base pair of the IRs decrease overall transposition frequency but increase the frequency of cointegrate formation[29]. Similarly, mutation of the first nucleotide flanking an IR also influences the level of cointegrate formation[30]. The level of cointegrate formation can also be increased by mutation of the Tpase. The molecular nature of these effects requires further investigation.

Factors affecting IS903 target site choice have been addressed in some detail. Initial studies[31] showed that insertions into the conjugative plasmid pOX38 had no consensus in the 9 bp target duplication produced on insertion, but alignment of the target sequences indicated a preference for sites with symmetry on either side. Cloning one native symmetric site into a second conjugative plasmid, pUB307, confirmed its attractiveness for insertion. More extensive studies provided a consensus symmetric target sequence which, when cloned into a target replicon, proved highly efficient[32]. The preferred target was a 21 bp palindrome centered on the 9 bp target duplication. It could be dissected into: the 5 bp flanking sequences, the most important for site-specific insertion; the 7 bp palindromic core within the target duplication; the dinucleotide pair at the transposon-target junction; and the local DNA context.
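
One way to quantify "symmetry on either side" of a candidate site is to count how many mirrored position pairs are complementary. The sketch below is an assumption-laden illustration, not the scoring used in the cited studies: the function names are hypothetical, the sequences are invented rather than the published consensus, and the central base of an odd-length site such as a 21-mer is ignored because it cannot pair with itself.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def symmetry_score(site):
    """Fraction of mirrored position pairs that are complementary.

    For an odd-length site the central base is ignored.
    """
    site = site.upper()
    half = len(site) // 2
    rc = reverse_complement(site)
    # site[i] == rc[i] exactly when site[i] pairs with the mirrored base.
    matches = sum(1 for i in range(half) if site[i] == rc[i])
    return matches / half


left_arm = "ACGTTGCAAG"                                  # invented 10 bp arm
perfect = left_arm + "T" + reverse_complement(left_arm)  # 21 bp, fully symmetric
skewed = left_arm + "T" + "A" * 10                       # 21 bp, mostly asymmetric

print(len(perfect), symmetry_score(perfect))
print(len(skewed), symmetry_score(skewed))
```

A score of 1.0 corresponds to a perfect palindrome around the central base; natural insertion sites would be expected to fall somewhere in between.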

Insertion into pUB307 itself showed a strong preference for a single orientation. By inverting either the vegetative (oriV) or the transfer (oriT) origin, it was concluded that orientation was determined by the direction of conjugative transfer. This of course implies that the ends of IS903 are not equivalent. It also implies, as is the case for Tn7[33][34][35][36][37] and members of the IS200/IS608 family[38][39][40], that transposition targets replication forks.

The requirement for the most abundant nucleoid proteins in transposition has also been examined[41]. Most notably, H-NS was required for efficient transposition. Similar results were obtained for IS10 and Tn552, suggesting a more general role for H-NS in bacterial transposition. H-NS exerts its effect on target capture: IS903 targeting preferences in the E. coli chromosome were dramatically altered in the absence of H-NS.

Several other host mutants were identified exhibiting a unique population pattern[42]: a ring phenotype with predominant papillae located just inside the edge of the colony, implying a spatial triggering of transposition within the colony. These mutants were found to be in pur genes, whose products are involved in purine biosynthesis. The genetic evidence was consistent with a requirement for GTP in IS903 transposition. These observations suggest that transposition occurs in later stages of colony growth. Transposition may occur within the colony edge in response to a gradient of exogenous purines across the colony and may also reflect the developmental stage of the cells.

IS903 transposase, like those of a variety of other IS, exhibits a strong preference for action in cis: complementation of defective transposons in trans occurs at less than 1%[42]. Transposition is extremely sensitive to the distance between the 3' end of the transposase gene and the nearest transposon IR: insertion of 1 kb of DNA reduces transposition to 1-2%. There is a strong correlation between the stability of transposase and its ability to act in trans. Wild-type transposase has a half-life of about 3 min. Fusion with α-galactosidase stabilizes the protein and results in an increase in its capacity to act in trans. A similar effect was noted in a lon mutant strain, where trans activity was increased by a factor of 10-100. Further studies identified a class of transposase mutants specifically enhanced in trans activity and reduced in cis activity without increasing the overall transposition frequency. This was correlated with an increase in transposase half-life compared to the wild type[43]. A second class of mutants with enhanced cis activity resulted in increased levels of transposase expression (as for IS10[44]).
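
To give a feel for what a half-life of about 3 min means, the sketch below computes the fraction of protein remaining over time. It assumes simple first-order (exponential) decay, which is an assumption made here for illustration and is not stated in the cited work; the function name and the 30 min comparison value are hypothetical.

```python
def fraction_remaining(minutes, half_life_min=3.0):
    """Fraction of protein left after `minutes`, assuming first-order decay."""
    return 0.5 ** (minutes / half_life_min)


# Wild-type half-life of about 3 min, as quoted in the text.
for t in (3, 6, 9):
    print(f"wild type, {t} min: {fraction_remaining(t):.3f}")

# Hypothetical stabilized protein with a 30 min half-life, for comparison only.
print(f"stabilized (assumed 30 min), 9 min: {fraction_remaining(9, 30.0):.3f}")
```

Under this assumption only one eighth of the wild-type protein is left after 9 min, which is consistent with the idea that an unstable transposase has little opportunity to reach ends located elsewhere in the cell.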

IS1182

IS1182 family members exhibit a diverse set of target specificities. Some duplicate 4 bp. These are of two types: those specific for CTAG and those that show no apparent target sequence specificity. Yet others target palindromic sequences. These are also of different types: some insert at the 3’ foot of a stem-loop and duplicate the entire structure while others insert 3’ of the loop and simply duplicate the loop (P. Siguier, E. Gourbeyre and M. Chandler, unpublished) (Fig.IS1182.1).

Fig. IS1182.1. The IS1182 family's main characteristics. Top: The left (IRL) and right (IRR) inverted terminal repeats are shown in WebLogo format. Bottom: distribution of IS length (base pairs) of IS1182 family members. The number of examples used in the sample is shown above each column.

ISDol1 group (ISNCY)

Another small group, ISDol1, with 58 members from a large number of bacterial species, has emerged from the ISNCY “orphan” group. Members have a length of between 1600 and 1900 bp (Fig.ISDol1.1) and generate DRs of 6-7 bp.

Fig. ISDol1.1. The ISDol1 group's main characteristics. Top: The left (IRL) and right (IRR) inverted terminal repeats are shown in WebLogo format. Bottom: distribution of IS length (base pairs) of ISDol1 group members. The number of examples used in the sample is shown above each column.

Bibliography

  1. [PubMed:4432374]
  2. [PubMed:353507]
  3. [PubMed:84614]
  4. [PubMed:641012]
  5. [PubMed:6270569]
  6. [PubMed:3034860]
  7. Schnetz K, Rak B. IS5: a mobile enhancer of transcription in Escherichia coli. Proc Natl Acad Sci U S A: 1992 Feb 15, 89(4);1244-8 [PubMed:1311089]
  8. [PubMed:2846278]
  9. [PubMed:7781607]
  10. [PubMed:8710516]
  11. [PubMed:6261245]
  12. [PubMed:2819910]
  13. [PubMed:7568465]
  14. Mahillon J, Chandler M. Insertion sequences. Microbiol Mol Biol Rev: 1998 Sep, 62(3);725-74 [PubMed:9729608]
  15. [PubMed:15020481]
  16. [PubMed:21673094]
  17. [PubMed:1319378]
  18. [PubMed:1332944]
  19. [PubMed:7934941]
  20. [PubMed:7934941]
  21. [PubMed:3000598]
  22. [PubMed:6269958]
  23. [PubMed:6269959]
  24. [PubMed:6281651]
  25. [PubMed:6327289]
  26. [PubMed:2825175]
  27. [PubMed:1324175]
  28. Tavakoli NP, DeVost J, Derbyshire KM. Defining functional regions of the IS903 transposase. J Mol Biol: 1997 Dec 12, 274(4);491-504 [PubMed:9417930]
  29. [PubMed:10096085]
  30. [PubMed:11387225]
  31. [PubMed:9620951]
  32. [PubMed:11178901]
  33. [PubMed:8804309]
  34. [PubMed:11274058]
  35. [PubMed:11030337]
  36. [PubMed:26104363]
  37. [PubMed:19703395]
  38. [PubMed:26350330]
  39. [PubMed:27466393]
  40. [PubMed:20691900]
  41. [PubMed:15130124]
  42. Coros AM, Twiss E, Tavakoli NP, Derbyshire KM. Genetic evidence that GTP is required for transposition of IS903 and Tn552 in Escherichia coli. J Bacteriol: 2005 Jul, 187(13);4598-606 [PubMed:15968071]
  43. [PubMed:8898394]
  44. [PubMed:8412678]