Difference between revisions of "IS Families/IS6 family"

From TnPedia
Line 15: Line 15:
 
A phylogenetic tree based on the transposase amino acid sequence of the [https://isfinder.biotoul.fr/ ISfinder] collection ([[:File:IS6.3.png|Fig. IS6.3]]) shows that the IS''6'' family members fall into a number of well-defined clades. This slightly more extensive set of IS corresponds well to the results of another wide-ranging phylogenetic analysis <ref name=":5"><nowiki><pubmed>PMC6807381</pubmed></nowiki></ref>. These clades include one which groups all archaeal IS''6'' family members (Fig. IS6.3a) composed mainly of ''[[wikipedia:Euryarchaeota|Euryarchaeota]]'' (''[[wikipedia:Haloarchaea|Halobacteria]]'' ; Fig. IS6.3ai-iii). '''Group''' '''aiv''' includes both ''[[wikipedia:Euryarchaeota|Euryarchaeota]]'' (''[[wikipedia:Thermococcales|Thermococcales]]'' and ''[[wikipedia:Methanococcales|Methanococcales]]'') and ''[[wikipedia:Crenarchaeota|Crenarchaeota]]'' (''[[wikipedia:Sulfolobales|Sulfolobales]]''). Of the 10 clades containing bacterial IS: clade b includes examples from the [[wikipedia:Alphaproteobacteria|Alpha]]-, [[wikipedia:Betaproteobacteria|Beta]]-, and [[wikipedia:Gammaproteobacteria|Gamma-''proteobacteria'']], ''[[wikipedia:Firmicutes|Firmicutes]]'', ''[[wikipedia:Cyanobacteria|Cyanobacteria]]'', ''[[wikipedia:Acidobacteria|Acidobacteria]]'' and [[wikipedia:Bacteroidetes|Bacteroidetes]] ; '''clade''' '''c''' is more homogenous and is composed of ''[[wikipedia:Alphaproteobacteria|Alphaproteobacteria]]'' (''[[wikipedia:Rhizobiaceae|Rhizobiaceae]]'' and ''[[wikipedia:Methylobacteriaceae|Methylobacteriaceae]]''); '''clade d''' includes some [[wikipedia:Actinobacteria|Actinobacteria]], [[wikipedia:Alphaproteobacteria|Alpha]]-, [[wikipedia:Betaproteobacteria|Beta]]-, and ''[[wikipedia:Gammaproteobacteria|Gamma-proteobacteria]]'' ; while '''clades e, f, g''' and '''h''' are composed exclusively of [[wikipedia:Firmicutes|Firmicutes]] (almost exclusively ''[[wikipedia:Lactococcus|Lactococci]]'' in the case of '''clades e and f'''). 
'''Clades i''' and '''j''' are more mixed. Clearly, the [https://isfinder.biotoul.fr/ ISfinder] collection does not necessarily reflect the true IS''6'' family distribution and these groupings should be interpreted with care. For example, although many do not form part of the [https://isfinder.biotoul.fr/ ISfinder] database, IS''6'' family elements are abundant in archaea and cover almost all of the traditionally recognized archaeal lineages (methanogens, halophiles, thermoacidophiles, and hyperthermophiles) <ref><nowiki><pubmed>PMC1847376</pubmed></nowiki></ref> ([[:File:IS6.3.png|Fig. IS6.3]]).
 
A phylogenetic tree based on the transposase amino acid sequence of the [https://isfinder.biotoul.fr/ ISfinder] collection ([[:File:IS6.3.png|Fig. IS6.3]]) shows that the IS''6'' family members fall into a number of well-defined clades. This slightly more extensive set of IS corresponds well to the results of another wide-ranging phylogenetic analysis <ref name=":5"><nowiki><pubmed>PMC6807381</pubmed></nowiki></ref>. These clades include one which groups all archaeal IS''6'' family members (Fig. IS6.3a) composed mainly of ''[[wikipedia:Euryarchaeota|Euryarchaeota]]'' (''[[wikipedia:Haloarchaea|Halobacteria]]'' ; Fig. IS6.3ai-iii). '''Group''' '''aiv''' includes both ''[[wikipedia:Euryarchaeota|Euryarchaeota]]'' (''[[wikipedia:Thermococcales|Thermococcales]]'' and ''[[wikipedia:Methanococcales|Methanococcales]]'') and ''[[wikipedia:Crenarchaeota|Crenarchaeota]]'' (''[[wikipedia:Sulfolobales|Sulfolobales]]''). Of the 10 clades containing bacterial IS: clade b includes examples from the [[wikipedia:Alphaproteobacteria|Alpha]]-, [[wikipedia:Betaproteobacteria|Beta]]-, and [[wikipedia:Gammaproteobacteria|Gamma-''proteobacteria'']], ''[[wikipedia:Firmicutes|Firmicutes]]'', ''[[wikipedia:Cyanobacteria|Cyanobacteria]]'', ''[[wikipedia:Acidobacteria|Acidobacteria]]'' and [[wikipedia:Bacteroidetes|Bacteroidetes]] ; '''clade''' '''c''' is more homogenous and is composed of ''[[wikipedia:Alphaproteobacteria|Alphaproteobacteria]]'' (''[[wikipedia:Rhizobiaceae|Rhizobiaceae]]'' and ''[[wikipedia:Methylobacteriaceae|Methylobacteriaceae]]''); '''clade d''' includes some [[wikipedia:Actinobacteria|Actinobacteria]], [[wikipedia:Alphaproteobacteria|Alpha]]-, [[wikipedia:Betaproteobacteria|Beta]]-, and ''[[wikipedia:Gammaproteobacteria|Gamma-proteobacteria]]'' ; while '''clades e, f, g''' and '''h''' are composed exclusively of [[wikipedia:Firmicutes|Firmicutes]] (almost exclusively ''[[wikipedia:Lactococcus|Lactococci]]'' in the case of '''clades e and f'''). 
'''Clades i''' and '''j''' are more mixed. Clearly, the [https://isfinder.biotoul.fr/ ISfinder] collection does not necessarily reflect the true IS''6'' family distribution and these groupings should be interpreted with care. For example, although many do not form part of the [https://isfinder.biotoul.fr/ ISfinder] database, IS''6'' family elements are abundant in archaea and cover almost all of the traditionally recognized archaeal lineages (methanogens, halophiles, thermoacidophiles, and hyperthermophiles) <ref><nowiki><pubmed>PMC1847376</pubmed></nowiki></ref> ([[:File:IS6.3.png|Fig. IS6.3]]).
  
[[File:IS6.3.png|border|center|thumb|860x860px|'''Fig. IS6.3.''' A dendrogram of IS''6'' family members. The figure shows 11 major clades. The surrounding colored circles and the insert indicate the clades identified by Harmer and Hall (2017). The insert shows the correspondence.]]      
+
[[File:IS6.3.png|border|center|thumb|860x860px|'''Fig. IS6.3.''' A dendrogram of IS''6'' family members. The figure shows 11 major clades. The surrounding colored circles and the insert indicate the clades identified by Harmer and Hall (2017). The insert shows the correspondence.]]'''Terminal Inverted Repeats.'''
  
====Terminal Inverted Repeats.====
 
 
The division into clades is also underlined to some extent by the '''IR''' sequences. As shown in Fig. IS6.2 ('''bottom'''), in spite of the wide range of bacterial and archaeal species in which family members are found, there is a surprising sequence conservation. In particular, the presence of a G dinucleotide at the IS tips and '''cTGTt''' and '''caaa''' internal motifs. Sequence motifs are more pronounced when each clade is considered separately ([[:File:IS6.4.png|Fig. IS6.4]]).   
 
The division into clades is also underlined to some extent by the '''IR''' sequences. As shown in Fig. IS6.2 ('''bottom'''), in spite of the wide range of bacterial and archaeal species in which family members are found, there is a surprising sequence conservation. In particular, the presence of a G dinucleotide at the IS tips and '''cTGTt''' and '''caaa''' internal motifs. Sequence motifs are more pronounced when each clade is considered separately ([[:File:IS6.4.png|Fig. IS6.4]]).   
  
=====Clade b=====
+
'''Clade b'''
(n=16; ''[[wikipedia:Actinobacteria|Actinobacteria]]'', ''[[wikipedia:Alphaproteobacteria|Alpha]]''-, ''[[wikipedia:Betaproteobacteria|Beta]]''-, and ''[[wikipedia:Gammaproteobacteria|Gamma-proteobacteria]]'') includes a well conserved GG..cTGTTGCAAA signature with little conservation further into each end.
 
  
=====Clade c=====
+
n=16; ''[[wikipedia:Actinobacteria|Actinobacteria]]'', ''[[wikipedia:Alphaproteobacteria|Alpha]]''-, ''[[wikipedia:Betaproteobacteria|Beta]]''-, and ''[[wikipedia:Gammaproteobacteria|Gamma-proteobacteria]]'') includes a well conserved GG..cTGTTGCAAA signature with little conservation further into each end.
(n= 14; ''[[wikipedia:Alphaproteobacteria|Alphaproteobacteria]]'': ''[[wikipedia:Rhizobiaceae|Rhizobiaceae]]'' and ''[[wikipedia:Methylobacteriaceae|Methylobacteriaceae]]'') shows considerable conservation of an extended motif (GGG... TGTCGCAAA) and some conservation further into both IRL and IRR, although these are different for each end.  
 
  
=====Clade d=====
+
'''Clade c'''
(n=24; with [[wikipedia:Alphaproteobacteria|Alpha]]-, [[wikipedia:Betaproteobacteria|Beta]]-, and [[wikipedia:Gammaproteobacteria|Gamma-''proteobacteria'']], ''[[wikipedia:Firmicutes|Firmicutes]]'', ''[[wikipedia:Cyanobacteria|Cyanobacteria]]'', ''[[wikipedia:Acidobacteria|Acidobacteria]]'' and ''[[wikipedia:Bacteroidetes|Bacteroidetes]]'') maintains stronger traces of parts of these motifs (GG.. tcTGtt and CAaa).
 
  
=====Clade e =====
+
n= 14; ''[[wikipedia:Alphaproteobacteria|Alphaproteobacteria]]'': ''[[wikipedia:Rhizobiaceae|Rhizobiaceae]]'' and ''[[wikipedia:Methylobacteriaceae|Methylobacteriaceae]]'') shows considerable conservation of an extended motif (GGG... TGTCGCAAA) and some conservation further into both IRL and IRR, although these are different for each end.
(n=23; s composed mainly of IS from ''[[wikipedia:Lactococcus|Lactococcus]]'', a single ''[[wikipedia:Leuconostoc|Leuconostoc]]'' and other bacilli (''[[wikipedia:Listeria|Lysteria]]'', ''[[wikipedia:Enterococcus|Enterococcus]]'');
 
  
=====Clade f=====
+
'''Clade d'''
(n = 11; largely ''[[wikipedia:Staphylococcus|Staphylococci]]'' with 2 ''[[wikipedia:Bacillus_thuringiensis|B. thuringiensis]]'') also exhibit the typical GGTTCTGTTGCAAAGTTt signature and some internal conservation in IRL.
 
  
=====Clade g=====
+
n=24; with [[wikipedia:Alphaproteobacteria|Alpha]]-, [[wikipedia:Betaproteobacteria|Beta]]-, and [[wikipedia:Gammaproteobacteria|Gamma-''proteobacteria'']], ''[[wikipedia:Firmicutes|Firmicutes]]'', ''[[wikipedia:Cyanobacteria|Cyanobacteria]]'', ''[[wikipedia:Acidobacteria|Acidobacteria]]'' and ''[[wikipedia:Bacteroidetes|Bacteroidetes]]'') maintains stronger traces of parts of these motifs (GG.. tcTGtt and CAaa).  
(n = 10; is more heterogenous ([[wikipedia:Alphaproteobacteria|Alpha proteobacteria]]'': [[wikipedia:Methylobacterium|Methylobacterium]], [[wikipedia:Paracoccus|Paracoccus]], [[wikipedia:Roseovarius|Roseovarius]], [[wikipedia:Rhizobium|Rhizobium]], [[wikipedia:Bradyrhizobium|Bradyrhizobium]] ; [[wikipedia:Deinococcus–Thermus|Deinococci]]'' and ''[[wikipedia:Haloarchaea|Halobacteria]]''). It contains a poorly conserved IR sequence but does include a prominent gG dinucleotide tip and a poorly pronounced tgtcaagtt signature).  
 
  
=====Clade h=====
+
'''Clade e'''
(n= 5; composed entirely of ''[[wikipedia:Firmicutes|Firmicutes]]'' (''[[wikipedia:Natranaerobius|Natranaerobius]]'', ''[[wikipedia:Clostridium|Clostridium]]'' and ''[[wikipedia:Thermoanaerobacter|Thermoanaerobacter]]'') exhibits a moderately well-defined internal signature TcTgTtAAgTt).
+
 
 +
n=23; s composed mainly of IS from ''[[wikipedia:Lactococcus|Lactococcus]]'', a single ''[[wikipedia:Leuconostoc|Leuconostoc]]'' and other bacilli (''[[wikipedia:Listeria|Lysteria]]'', ''[[wikipedia:Enterococcus|Enterococcus]]'');
 +
 
 +
'''Clade f'''
 +
 
 +
n = 11; largely ''[[wikipedia:Staphylococcus|Staphylococci]]'' with 2 ''[[wikipedia:Bacillus_thuringiensis|B. thuringiensis]]'') also exhibit the typical GGTTCTGTTGCAAAGTTt signature and some internal conservation in IRL.
 +
 
 +
'''Clade g'''
 +
 
 +
n = 10; is more heterogenous ([[wikipedia:Alphaproteobacteria|Alpha proteobacteria]]'': [[wikipedia:Methylobacterium|Methylobacterium]], [[wikipedia:Paracoccus|Paracoccus]], [[wikipedia:Roseovarius|Roseovarius]], [[wikipedia:Rhizobium|Rhizobium]], [[wikipedia:Bradyrhizobium|Bradyrhizobium]] ; [[wikipedia:Deinococcus–Thermus|Deinococci]]'' and ''[[wikipedia:Haloarchaea|Halobacteria]]''). It contains a poorly conserved IR sequence but does include a prominent gG dinucleotide tip and a poorly pronounced tgtcaagtt signature.
 +
 
 +
'''Clade h'''
 +
 
 +
n= 5; composed entirely of ''[[wikipedia:Firmicutes|Firmicutes]]'' (''[[wikipedia:Natranaerobius|Natranaerobius]]'', ''[[wikipedia:Clostridium|Clostridium]]'' and ''[[wikipedia:Thermoanaerobacter|Thermoanaerobacter]]'') exhibits a moderately well-defined internal signature TcTgTtAAgTt.
 +
 
 +
'''Clade i'''
  
=====Clade i=====
 
 
Finally, clade I (n=3) is composed of Halanaerobia and [[wikipedia:Thermoanaerobacter|Thermoanaerobacter]].
 
Finally, clade I (n=3) is composed of Halanaerobia and [[wikipedia:Thermoanaerobacter|Thermoanaerobacter]].
 +
  
 
'''The archaeal-specific clades also generally exhibit well-defined consensus sequences.'''
 
'''The archaeal-specific clades also generally exhibit well-defined consensus sequences.'''
  
<br />
+
====='''Clade Ai''',=====
 +
Is composed of diverse ''Halobacterial species'' (''Halohasta, Haloferax, Natrinema, Natrialba, Halogeometricum, Natronomonas, Natronococcus,'' and ''Haloarcula''): GgcACtGTCTAGtT.
 +
 
 +
'''Clade Aii'''
  
===== '''Clade Ai''', =====
+
n = 12; is composed uniquely of ''Halobacterial'' ''Euryarchaeota'' with a ggtaGTGTTcagatAaG signature and significant internal conservation which is different for each end.
Is composed of diverse ''Halobacterial species'' (''Halohasta, Haloferax, Natrinema, Natrialba, Halogeometricum, Natronomonas, Natronococcus,'' and ''Haloarcula''): GgcACtGTCTAGtT.
 
<br />
 
  
===== '''Clade Aii''' =====
+
'''Clade Aiii'''
(n = 12) is composed uniquely of ''Halobacterial'' ''Euryarchaeota'' with a ggtaGTGTTcagatAaG signature and significant internal conservation which is different for each end.
 
  
===== '''Clade Aiii''' =====
+
n = 5); is composed entirely of ''Halobacterial'' ''Euryarchaeota'' (''Haloarcula, Halomicrobium, Natronomonas, Natronobacterium, Natrinema'') also has well conserved ends, ggtcgTGTTTaGTT, and significant internal conservation which is different for each end.  
(n = 5), is composed entirely of ''Halobacterial'' ''Euryarchaeota'' (''Haloarcula, Halomicrobium, Natronomonas, Natronobacterium, Natrinema'') also has well conserved ends, ggtcgTGTTTaGTT, and significant internal conservation which is different for each end.  
 
  
===== '''Clade Aiv''' =====
+
'''Clade Aiv'''
(n = 9) which includes both ''Euryarchaeota'' and ''Crenarchaeota'', has poor conservation although on further analysis, an alignment shows significant conservation in the ''Sulfolobus'' and in the ''Pyrococcus'' groups with good interior conservation also in the 3 ''Pyrococcal'' members. It is possible that the IS ends in the ''Sulfolobus'' members have not been accurately identified.
 
  
 +
n = 9; which includes both ''Euryarchaeota'' and ''Crenarchaeota'', has poor conservation although on further analysis, an alignment shows significant conservation in the ''Sulfolobus'' and in the ''Pyrococcus'' groups with good interior conservation also in the 3 ''Pyrococcal'' members. It is possible that the IS ends in the ''Sulfolobus'' members have not been accurately identified.
  
 
MCL analysis [33] for the entire group of transposases using the criteria of ISfinder for classification (IS identification) [34] showed that all members fell within the definition of a single family (Inflation factor 1.2, score >30) and fell into 3 groups: clades b-I; clades Ai-Aiii; and Aiv using the appropriate filter (Inflation factor 2, score >140). The answer to the recent question “An analysis of the IS6/IS26 family of insertion sequences: is it a single family?” [31] is therefore “Probably, yes” according to the ISfinder definition.
 
MCL analysis [33] for the entire group of transposases using the criteria of ISfinder for classification (IS identification) [34] showed that all members fell within the definition of a single family (Inflation factor 1.2, score >30) and fell into 3 groups: clades b-I; clades Ai-Aiii; and Aiv using the appropriate filter (Inflation factor 2, score >140). The answer to the recent question “An analysis of the IS6/IS26 family of insertion sequences: is it a single family?” [31] is therefore “Probably, yes” according to the ISfinder definition.

Revision as of 13:22, 8 March 2021

General

There are at present nearly 160 family members in ISfinder from nearly 80 bacterial and archaeal species, but this represents only a fraction of those present in the public databases. The family was named [1] after the directly repeated insertion sequences in transposon Tn6 [2] to standardize the various names that had been attributed to identical elements (e.g. IS15, IS26, IS46, IS140, IS160, IS176) [3][4][5][6][7][8][9][10][11][12][13][14][15], including one isolate, IS15, corresponding to an insertion of one iso-IS6 (IS15D) into another [4][5]. More recently there has been some attempt to rename the family as the IS26 family (see [16]), presumably because of accumulating experimental data from IS26 itself and the importance of this IS in the accumulation and transmission of multiple antibiotic resistance, although this might introduce confusion in the literature. IS6 family members have a simple organization (Fig. IS6.1) and generate 8 bp direct target repeats on insertion. This family is very homogenous, with an average length of about 800 bp and highly conserved short, generally perfect, IRs (Fig. IS6.1 and Fig. IS6.2). There are two examples of MITEs (Miniature Inverted repeat Transposable Elements, composed of both IS ends and no intervening orfs [17]) of 227 and 336 bp, 7 members between 1230 and 1460 bp and three members between 1710 and 1760 bp. One member, IS15, of 1648 bp, represents an insertion of one IS into another [3][5]. Many are found as part of compound transposons (called pseudo-compound transposons [1], described below), invariably as flanking direct repeats (Fig. IS6.1), a consequence of their transposition mechanism [7][9][13][14][18][19][20][21][22][23][24][25][26][27][28][29][30].

Fig. IS6.1. IS6 family organization. Top. Structure of IS6 family. Left (IRL) and right (IRR) terminal 14 bp IRs are shown as blue triangles. The 8 bp direct target repeats are shown as pink arrow heads. The transposase open reading frame is shown in purple. Bottom. A Pseudo-compound transposon (see text for explanation). IS6 family characteristics are as above. Here, two directly repeated IS flank a passenger gene in green.
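The 8 bp direct target repeats shown in Fig. IS6.1 can be illustrated with a toy string model. The sketch below only reproduces the end product of an insertion (an element flanked by two copies of the 8 bp target site). It does not model the transposition mechanism, and the function name and both sequences are hypothetical placeholders rather than real IS6 family data.

```python
def insert_with_target_duplication(target, element, position, dup_len=8):
    """Insert `element` into `target`, duplicating `dup_len` bp of target.

    The duplicated site is target[position:position + dup_len]; after the
    insertion one copy lies immediately on each side of the element.
    """
    if not 0 <= position <= len(target) - dup_len:
        raise ValueError("target site does not fit inside the target sequence")
    site = target[position:position + dup_len]
    product = target[:position + dup_len] + element + target[position:]
    return product, site


# Hypothetical target DNA (upper case) and element (lower case).
target = "AAAACCCCGGGGTTTTAAAA"
element = "ggttctgttgcaaannnntttgcaacagaacc"
product, site = insert_with_target_duplication(target, element, position=4)

left_copy = product[4:12]
right_start = 12 + len(element)
right_copy = product[right_start:right_start + 8]
```

Here `left_copy` and `right_copy` are both equal to `site`, which is the signature that is looked for when identifying the boundaries of an inserted copy.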


Fig. IS6.2. The general characteristics of the IS6 family. Top: Distribution of IS length (base pairs). The number of examples used in the sample is shown above each column. Bottom: Left (IRL) and right (IRR) inverted terminal repeats are shown in WebLogo format (Crooks et al., 2004).

Distribution and Phylogenetic Transposase Tree

A phylogenetic tree based on the transposase amino acid sequence of the ISfinder collection (Fig. IS6.3) shows that the IS6 family members fall into a number of well-defined clades. This slightly more extensive set of IS corresponds well to the results of another wide-ranging phylogenetic analysis [31]. These clades include one which groups all archaeal IS6 family members (Fig. IS6.3a), composed mainly of Euryarchaeota (Halobacteria; Fig. IS6.3ai-iii). Group aiv includes both Euryarchaeota (Thermococcales and Methanococcales) and Crenarchaeota (Sulfolobales). Of the 10 clades containing bacterial IS: clade b includes examples from the Alpha-, Beta-, and Gamma-proteobacteria, Firmicutes, Cyanobacteria, Acidobacteria and Bacteroidetes; clade c is more homogenous and is composed of Alphaproteobacteria (Rhizobiaceae and Methylobacteriaceae); clade d includes some Actinobacteria, Alpha-, Beta-, and Gamma-proteobacteria; while clades e, f, g and h are composed exclusively of Firmicutes (almost exclusively Lactococci in the case of clades e and f). Clades i and j are more mixed. Clearly, the ISfinder collection does not necessarily reflect the true IS6 family distribution and these groupings should be interpreted with care. For example, although many do not form part of the ISfinder database, IS6 family elements are abundant in archaea and cover almost all of the traditionally recognized archaeal lineages (methanogens, halophiles, thermoacidophiles, and hyperthermophiles) [32] (Fig. IS6.3).

Fig. IS6.3. A dendrogram of IS6 family members. The figure shows 11 major clades. The surrounding colored circles and the insert indicate the clades identified by Harmer and Hall (2017). The insert shows the correspondence.

Terminal Inverted Repeats.

The division into clades is also underlined to some extent by the IR sequences. As shown in Fig. IS6.2 (bottom), in spite of the wide range of bacterial and archaeal species in which family members are found, there is surprising sequence conservation, in particular the presence of a GG dinucleotide at the IS tips and of cTGTt and caaa internal motifs. Sequence motifs are more pronounced when each clade is considered separately (Fig. IS6.4).
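Consensus signatures of the kind quoted for each clade (for example GG..cTGTTGCAAA) can be turned into a simple search pattern. The following is a minimal sketch under an assumed reading of the notation that the text does not spell out: upper-case letters are treated as strictly conserved, lower-case letters as weakly conserved (any base tolerated) and each dot as an unconserved position. The helper names and the test sequence are hypothetical.

```python
import re


def signature_to_regex(signature):
    """Translate a consensus string such as 'GG..cTGTTGCAAA' into a regex.

    Assumed convention: upper case = must match, lower case or '.' = any base.
    Spaces in the signature (e.g. 'GGG... TGTCGCAAA') are ignored.
    """
    parts = []
    for ch in signature.replace(" ", ""):
        if ch.isupper():
            parts.append(ch)          # strictly conserved position
        else:
            parts.append("[ACGT]")    # weakly conserved or unconserved position
    return re.compile("".join(parts))


def find_signature(signature, sequence):
    """Return the 0-based start of the first match, or None if there is none."""
    match = signature_to_regex(signature).search(sequence.upper())
    return match.start() if match else None


# Hypothetical IS end, invented for illustration only.
hypothetical_end = "GGCACTGTTGCAAAGTTAGC"
position = find_signature("GG..cTGTTGCAAA", hypothetical_end)
```

Treating lower-case positions as fully degenerate discards some information. A position weight matrix built from the WebLogo alignments would be a more faithful model, but the regex is enough to scan candidate ends quickly.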

Clade b

Clade b (n = 16; Actinobacteria, Alpha-, Beta-, and Gamma-proteobacteria) includes a well-conserved GG..cTGTTGCAAA signature with little conservation further into each end.

Clade c

Clade c (n = 14; Alphaproteobacteria: Rhizobiaceae and Methylobacteriaceae) shows considerable conservation of an extended motif (GGG... TGTCGCAAA) and some conservation further into both IRL and IRR, although these are different for each end.

Clade d

Clade d (n = 24; with Alpha-, Beta-, and Gamma-proteobacteria, Firmicutes, Cyanobacteria, Acidobacteria and Bacteroidetes) maintains stronger traces of parts of these motifs (GG.. tcTGtt and CAaa).

Clade e

Clade e (n = 23) is composed mainly of IS from Lactococcus, a single Leuconostoc and other bacilli (Listeria, Enterococcus).

Clade f

Clade f (n = 11; largely Staphylococci with 2 B. thuringiensis) also exhibits the typical GGTTCTGTTGCAAAGTTt signature and some internal conservation in IRL.

Clade g

Clade g (n = 10) is more heterogenous (Alphaproteobacteria: Methylobacterium, Paracoccus, Roseovarius, Rhizobium, Bradyrhizobium; Deinococci and Halobacteria). It has a poorly conserved IR sequence but does include a prominent gG dinucleotide tip and a poorly pronounced tgtcaagtt signature.

Clade h

Clade h (n = 5; composed entirely of Firmicutes: Natranaerobius, Clostridium and Thermoanaerobacter) exhibits a moderately well-defined internal signature, TcTgTtAAgTt.

Clade i

Finally, clade i (n = 3) is composed of Halanaerobia and Thermoanaerobacter.


The archaeal-specific clades also generally exhibit well-defined consensus sequences.

Clade Ai

Clade Ai is composed of diverse Halobacterial species (Halohasta, Haloferax, Natrinema, Natrialba, Halogeometricum, Natronomonas, Natronococcus, and Haloarcula): GgcACtGTCTAGtT.

Clade Aii

Clade Aii (n = 12) is composed uniquely of Halobacterial Euryarchaeota with a ggtaGTGTTcagatAaG signature and significant internal conservation which is different for each end.

Clade Aiii

Clade Aiii (n = 5), composed entirely of Halobacterial Euryarchaeota (Haloarcula, Halomicrobium, Natronomonas, Natronobacterium, Natrinema), also has well-conserved ends, ggtcgTGTTTaGTT, and significant internal conservation which is different for each end.

Clade Aiv

Clade Aiv (n = 9), which includes both Euryarchaeota and Crenarchaeota, has poor conservation, although on further analysis an alignment shows significant conservation in the Sulfolobus and in the Pyrococcus groups, with good interior conservation also in the 3 Pyrococcal members. It is possible that the IS ends in the Sulfolobus members have not been accurately identified.

MCL analysis [33] for the entire group of transposases using the criteria of ISfinder for classification (IS identification) [34] showed that all members fell within the definition of a single family (Inflation factor 1.2, score >30) and fell into 3 groups: clades b-i; clades Ai-Aiii; and Aiv using the appropriate filter (Inflation factor 2, score >140). The answer to the recent question “An analysis of the IS6/IS26 family of insertion sequences: is it a single family?” [31] is therefore “Probably, yes” according to the ISfinder definition.
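To make the role of the two parameters quoted above more concrete, the following is a bare-bones sketch of a Markov Cluster (MCL) iteration, alternating expansion and inflation on a column-stochastic matrix. It is not the software or the scoring scheme used for the analysis described in the text. The function names, the pairwise scores and the element names are all hypothetical, and `min_score` simply discards pairs at or below the chosen score threshold before clustering.

```python
def _normalise_columns(matrix):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    size = len(matrix)
    col_sums = [sum(matrix[i][j] for i in range(size)) for j in range(size)]
    return [[matrix[i][j] / col_sums[j] for j in range(size)] for i in range(size)]


def mcl_clusters(scores, inflation=2.0, min_score=0.0, max_iter=100, eps=1e-6):
    """Cluster names from a dict {(name_a, name_b): similarity score}."""
    names = sorted({name for pair in scores for name in pair})
    index = {name: i for i, name in enumerate(names)}
    size = len(names)
    m = [[0.0] * size for _ in range(size)]
    for (a, b), score in scores.items():
        if a != b and score > min_score:          # score filter
            m[index[a]][index[b]] = m[index[b]][index[a]] = float(score)
    for i in range(size):
        m[i][i] = max(m[i]) or 1.0                # add a self-loop to every node
    m = _normalise_columns(m)
    for _ in range(max_iter):
        # Expansion: one matrix squaring (flow spreads along longer paths).
        expanded = [[sum(m[i][k] * m[k][j] for k in range(size))
                     for j in range(size)] for i in range(size)]
        # Inflation: element-wise power, then renormalise (strong flow wins).
        inflated = _normalise_columns(
            [[value ** inflation for value in row] for row in expanded])
        done = all(abs(inflated[i][j] - m[i][j]) < eps
                   for i in range(size) for j in range(size))
        m = inflated
        if done:
            break
    # Rows that keep flow on their own diagonal act as attractors; the columns
    # they attract form one cluster.
    return {frozenset(names[j] for j in range(size) if m[i][j] > eps)
            for i in range(size) if m[i][i] > eps}


# Hypothetical pairwise scores between five invented transposases.
scores = {("A1", "A2"): 200, ("A1", "A3"): 200, ("A2", "A3"): 200,
          ("B1", "B2"): 200, ("A3", "B1"): 50}
groups = mcl_clusters(scores, inflation=2.0, min_score=140)
```

With the stricter filter the weak A3/B1 link is discarded and the two sets separate. A lower threshold and a lower inflation value are passed in the same way and in general give coarser clusters, which is the sense in which the family-level and group-level settings differ.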

A recent study [35] identified a number of IS26 variants with specific mutations in their Tpases. In particular, one variant, originally called IS15D [4,36], was observed to exhibit enhanced activity. ISfinder attributes a new name to an IS on the basis of sequence divergence (< 95% nucleotide identity and/or < 98% amino acid identity), and it has been suggested that such variants should be suffixed as IS26.v1, .v2 etc. [35]. This makes sense if the mutation is not functionally neutral and results in a change in IS properties or behavior.
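The naming thresholds quoted above can be written as a small decision helper. This is only a sketch of the rule as stated in the text (reading "and/or" as a logical or). The function names and the identity values in the example are hypothetical, and the suffix format follows the IS26.v1, .v2 suggestion.

```python
def qualifies_for_new_name(nt_identity_pct, aa_identity_pct):
    """True if divergence from the reference IS meets the quoted thresholds:
    < 95% nucleotide identity and/or < 98% amino acid identity."""
    return nt_identity_pct < 95.0 or aa_identity_pct < 98.0


def variant_name(reference_name, variant_number):
    """Build a suffixed variant name such as 'IS26.v1'."""
    return f"{reference_name}.v{variant_number}"


# Hypothetical identity values for an invented variant of IS26.
if qualifies_for_new_name(nt_identity_pct=99.6, aa_identity_pct=99.1):
    label = "new IS name"
else:
    label = variant_name("IS26", 1)
```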




Clade Ai

(n = 12) is composed uniquely of Halobacterial Euryarchaeota with a ggtaGTGTTcagatAaG signature and significant internal conservation which is different for each end.

Clade Aii

(n = 5), again, composed entirely of Halobacterial Euryarchaeota (Haloarcula, Halomicrobium, Natronomonas, Natronobacterium, Natrinema) also has well conserved ends, ggtcgTGTTTaGTT, and significant internal conservation which is different for each end.

Clade Aiii

This is also the case for Clade Aiii, also composed of diverse Halobacterial species (Halohasta, Haloferax, Natrinema, Natrialba, Halogeometricum, Natronomonas, Natronococcus, and Haloarcula): GgcACtGTCTAGtT.

Clade Aiv

However, Clade Aiv (n = 9) which includes both Euryarchaeota and Crenarchaeota, has poor conservation although on further analysis, an alignment shows significant conservation in the Sulfolobus and in the Pyrococcus groups with good interior conservation also in the 3 Pyrococcal members. It is possible that the IS ends in the Sulfolobus members have not been accurately identified.



The answer to the recent question “An analysis of the IS6/IS26 family of insertion sequences: is it a single family?” [31] is therefore “Probably, yes”.

Genomic Impact

Activity resulting in horizontal dissemination is suggested, for example, by the observation that copies identical to Mycobacterium fortuitum IS6100 [33](Clade d) occur in other bacteria: as part of a plasmid-associated catabolic transposon carrying genes for nylon degradation in Arthrobacter sp. [34], from the Pseudomonas aeruginosa plasmid R1003 [35], and within the Xanthomonas campestris transposon Tn5393b [36]. Similar copies have also been reported in Salmonella enterica (typhimurium) [37], and on plasmid pACM1 from Klebsiella oxytoca (AF107205) [38].

A single member, ISDsp3, present in single copy in Dehalococcoides sp. BAV1 carries a passenger gene annotated as hypothetical protein.

IS257 [39] (Clade h) (also known as IS431) has played an important role in sequestering a variety of antibiotic resistance genes in clinical isolates of methicillin-resistant Staphylococcus aureus (MRSA) (e.g. [40][41][42][43]). It provides an outward-oriented promoter which drives expression of genes located proximal to the left end. Moreover, both left and right ends appear to carry a –35 promoter component which would permit formation of hybrid promoters on insertion next to a resident –10 element [42][44]. Insertion can result in activation of a neighboring gene using both a hybrid promoter and an indigenous promoter [42]. IS257 is also involved in expression of tetA [45] and dfrA [43] in S. aureus.

IS26 [6][7][8] (clade d) is encountered with increasing frequency in plasmids of clinical importance where it is involved in expression of antibiotic resistance genes and plasmid rearrangements (see [28][46][47][48][49][50]). Its transposition mechanism contributes to its ability to assemble anti-bacterial resistance genes into clusters (e.g. [51]). It can also form hybrid promoters capable of driving different antibiotic resistance genes: aphA7, blaS2A (Klebsiella pneumoniae [22]), blaSHV-2a (Pseudomonas aeruginosa [52]) and aphA7 (Pasteurella piscicida [53]) as well as the wide spectrum beta-lactam resistance gene blaKPC (Table IS and Gene Expression).

The formation of hybrid promoters on insertion (Table IS and Gene Expression) is clearly a general property of members of the IS6 family [22][42][43][54][55].

Another member, IS6100 [33] (Clade d), often used as an aid in classifying mycobacterial isolates [56][57][58], has been found to drive strA strB expression in X. campestris pv. vesicatoria [36].

This IS family is able to form transposons which resemble compound transposons with the flanking IS in direct repeat but, because of the particular transposition mechanism of IS6 family members (see below), were called pseudo-compound transposons [1]. These include Tn610 (flanked by IS6100 [33]), Tn4003 and others (flanked by IS257 [40]) and Tn6023 (flanked by IS26 [59]).


Clinical Importance of IS26.

In view of the particular importance of IS26 in sequestering antibiotic resistance genes and generating arrays of these genes in clinically important conjugative plasmids and in the host chromosome (see [28][51]), it is worthwhile devoting a separate section to the contribution of this IS to the clinical landscape. Recognition of its place as an important player has derived from the large number of sequences now available of multiple antibiotic resistance plasmids and chromosomal segments such as Genomic Resistance Islands (GRI). It is now no longer practical to provide a complete analysis of the literature. At present (19th November 2020) a PubMed search using IS26 as the search term yielded nearly 450 citations. The references in the following are not exhaustive but simply provide examples.

Arrays

IS6 family members are often found in arrays (Fig. IS6.5 and Fig. IS6.6) in direct and inverted repeat in multiple drug resistant plasmids (e.g. S. typhimurium [27,60,61], Klebsiella quasipneumoniae [62], Acinetobacter baumannii [48,63], Proteus mirabilis [64] and uncultured sewage bacteria [65] among many others). These are often intercalated in or next to other transposable elements rather than neatly flanking ABR genes and can form units able to undergo tandem amplification.

Amplification

Shropshire et al [66], studying clinical isolates of non-carbapenemase-producing Carbapenem-Resistant Enterobacteria (non-CP-CRE) from several patients with recurrent bacteraemia, observed an increase in carbapenem resistance partially due to IS26-mediated amplification, of up to 10 fold, of a cassette containing blaOXA-1 and blaCTX-M-1 which forms part of a larger chromosomal structure of IS26 arrays which they call TnMB1860 (Fig. IS6.6). It was unclear whether this cassette amplification was due to transposition activity or to another process, as had been observed in similar IS1-mediated gene amplifications [67–72]. Another example has been revealed by Hastak et al [73], who analysed a multi-resistant derivative of the clinically important, globally dispersed pathogen Escherichia coli ST131 subclade H30Rx, isolated from a number of bacteraemic patients, and revealed that increased piperacillin/tazobactam resistance was due to IS26-mediated amplification of blaTEM-1B. A similar type of limited (tandem dimer) amplification of an IS26-flanked blaSHV-5-carrying cassette, found in plasmids from a number of geographically diverse enteric species, was identified in a nosocomial E. cloacae strain [74]. A more extensive amplification (>10 fold) was observed with the same cassette located in a different plasmid in a well-characterised laboratory strain of Escherichia coli and occurred in a recA-independent manner [46], and even higher levels of tandem amplification (~65 fold) of the aphA1 gene in the IS26-based Tn6020 were identified in Acinetobacter baumannii [75].

Cointegrating plasmids.


Organization

Mechanism: the state of play

Cointegrate formation


Circular transposon molecules: translocatable units (TU)

Targeted transposition.

Acknowledgements

We would like to thank Susu He (Nanjing University) for stimulating discussions concerning the transposition models.

Bibliography

  1. Galas DJ, Chandler M. Bacterial Insertion Sequences. In: Berg DE, Howe MM, editors. Mobile DNA. Washington, D.C.: American Society for Microbiology; 1989. p. 109–162.
  2. Berg DE, Davies J, Allet B, Rochaix JD. Transposition of R factor genes to bacteriophage lambda. Proc Natl Acad Sci USA. 1975;72:3628–3632.
  3. Labigne-Roussel A, Courvalin P. IS15, a new insertion sequence widely spread in R plasmids of gram-negative bacteria. Mol Gen Genet. 1983;189:102–112.
  4. Trieu-Cuot P, Courvalin P. Nucleotide sequence of the transposable element IS15. Gene. 1984;30:113–120.
  5. <pubmed>2994132</pubmed>
  6. <pubmed>PMC326375</pubmed>
  7. <pubmed>PMC326375</pubmed>
  8. <pubmed>3003524</pubmed>
  9. <pubmed>PMC215669</pubmed>
  10. Nucken EJ, Henschke RB, Schmidt FR. Nucleotide-sequence of insertion element IS15 delta IV from plasmid pBP11. DNA Seq. 1990;1:85–88.
  11. <pubmed>6304469</pubmed>
  12. <pubmed>2999303</pubmed>
  13. Colonna B, Bernardini M, Micheli G, Maimone F, Nicoletti M, Casalino M. The Salmonella wien virulence plasmid pZM3 carries Tn1935, a multiresistance transposon containing a composite IS1936-kanamycin resistance element. Plasmid. 1988;20:221–231.
  14. <pubmed>PMC162495</pubmed>
  15. <pubmed>PMC305975</pubmed>
  16. <pubmed>32871211</pubmed>
  17. <pubmed>2842323</pubmed>
  18. <pubmed>PMC1196216</pubmed>
  19. <pubmed>PMC3195058</pubmed>
  20. <pubmed>PMC3587239</pubmed>
  21. <pubmed>PMC219079</pubmed>
  22. <pubmed>PMC209129</pubmed>
  23. Barberis-Maino L, Berger-Bachi B, Weber H, Beck WD, Kayser FH. IS431, a staphylococcal insertion sequence-like element related to IS26 from Proteus vulgaris. Gene. 1987;59:107–113.
  24. <pubmed>PMC174916</pubmed>
  25. <pubmed>3033719</pubmed>
  26. <pubmed>2543009</pubmed>
  27. Sundstrom L, Jansson C, Bremer K, Heikkila E, Olsson-Liljequist B, Skold O. A new dhfrVIII trimethoprim-resistance gene, flanked by IS26, whose product is remote from other dihydrofolate reductases in parsimony analysis. Gene. 1995;154:7–14.
  28. <pubmed>19074421</pubmed>
  29. <pubmed>21393132</pubmed>
  30. <pubmed>PMC284528</pubmed>
  31. <pubmed>PMC6807381</pubmed>
  32. <pubmed>PMC1847376</pubmed>
  33. <pubmed>2163027</pubmed>
  34. <pubmed>PMC205175</pubmed>
  35. <pubmed>PMC196970</pubmed>
  36. <pubmed>PMC167566</pubmed>
  37. <pubmed>10930753</pubmed>
  38. <pubmed>25291385</pubmed>
  39. Rouch D, Skurray R. IS257 from Staphylococcus aureus member of an insertion sequence superfamily Gram-positive and Gram-negative bacteria. Gene. 1989;76:195–205.
  40. Rouch DA, Messerotti LJ, Loo SL, Jackson CA, Skurray RA. Trimethoprim resistance transposon Tn4003 from Staphylococcus aureus encodes genes for a dihydrofolate reductase and thymidylate synthetase flanked by three copies of IS257. Mol Microbiol. 1989;3:161–175.
  41. Stewart PR, Dubin DT, Chikramane SG, Inglis B, Matthews PR, Poston SM. IS257 and small plasmid insertions in the mec region of the chromosome of Staphylococcus aureus. Plasmid. 1994;31:12–20.
  42. <pubmed>PMC101884</pubmed>
  43. <pubmed>PMC284724</pubmed>
  44. <pubmed>PMC107441</pubmed>
  45. <pubmed>7830550</pubmed>
  46. <pubmed>PMC1913244</pubmed>
  47. <pubmed>23330672</pubmed>
  48. <pubmed>16870645</pubmed>
  49. <pubmed>23169892</pubmed>
  50. <pubmed>20093380</pubmed>
  51. <pubmed>PMC4471558</pubmed>
  52. <pubmed>PMC89260</pubmed>
  53. <pubmed>27873653</pubmed>
  54. <pubmed>PMC2443897</pubmed>
  55. Allmansberger R, Brau B, Piepersberg W. Genes for gentamicin-(3)-N-acetyl-transferases III and IV. II. Nucleotide sequences of three AAC(3)-III genes and evolutionary aspects. Mol Gen Genet. 1985;198:514–520.
  56. <pubmed>PMC268253</pubmed>
  57. <pubmed>PMC330226</pubmed>
  58. <pubmed>PMC172007</pubmed>
  59. <pubmed>21702681</pubmed>