Difference between revisions of "IS Families/IS3 family"

From TnPedia
Line 1: Line 1:
 
===Original Identification===
 
IS''3'' and another member of this family, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''], were identified genetically as DNA segments causing insertional inactivation of the ''[[wikipedia:Gal_operon|gal]]'' and ''[[wikipedia:Lac_operon|lac]]'' operons and physically by [[wikipedia:Scanning_electron_microscope|electron microscopy]]<ref><pubmed>4567156</pubmed></ref> and in [[wikipedia:Fertility_factor_(bacteria)|plasmid F]] as a segment called alpha-beta<ref><pubmed>1092667</pubmed></ref><ref><pubmed>1092668</pubmed></ref>. [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3''] was subsequently wrongly identified as the insertion sequence flanking the [[wikipedia:Tetracycline_antibiotics|tetracycline resistance]] transposon [http://tncentral.ncc.unesp.br/cgi-bin/tn_report.pl?id=Tn10-AF162223 Tn''10'']<ref><pubmed>1092669</pubmed></ref><ref><pubmed>383689</pubmed></ref>. It has since been found as a component of a large number of plasmids, particularly in Gram-negative enterics.
+
IS''3'' and another member of this family, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''], were identified genetically as DNA segments causing insertional inactivation of the ''[[wikipedia:Gal_operon|gal]]'' and ''[[wikipedia:Lac_operon|lac]]'' operons and physically by [[wikipedia:Scanning_electron_microscope|electron microscopy]]<ref><nowiki><pubmed>4567156</pubmed></nowiki></ref> and in [[wikipedia:Fertility_factor_(bacteria)|plasmid F]] as a segment called alpha-beta<ref><nowiki><pubmed>1092667</pubmed></nowiki></ref><ref><nowiki><pubmed>1092668</pubmed></nowiki></ref>. [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3''] was subsequently wrongly identified as the insertion sequence flanking the [[wikipedia:Tetracycline_antibiotics|tetracycline resistance]] transposon [http://tncentral.ncc.unesp.br/cgi-bin/tn_report.pl?id=Tn10-AF162223 Tn''10'']<ref><nowiki><pubmed>1092669</pubmed></nowiki></ref><ref><nowiki><pubmed>383689</pubmed></nowiki></ref>. It has since been found as a component of a large number of plasmids, particularly in Gram-negative enterics.
  
 
===Presence in Compound Transposons===
 
Although IS''3'' family elements do participate in compound transposons, to our knowledge no systematic survey has been undertaken and very few IS''3''-associated compound transposons have been described to date. Those described include: [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3411 IS''3411''], which flanks genes for [[wikipedia:Citrate_test|citrate utilization]] in transposon Tn''3411''<ref><pubmed>6277857</pubmed></ref><ref><pubmed>2832386</pubmed></ref><ref><pubmed>6094480</pubmed></ref>; [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS4521 IS''4521''], which flanks a heat-stable enterotoxin gene in [[wikipedia:Enterotoxigenic_Escherichia_coli|enterotoxigenic ''Escherichia coli'']]; and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1706 IS''1706''], which flanks genes of the [[wikipedia:Clp_protease_family|Clp protease]]/[[wikipedia:Chaperone|chaperone]] family.
+
Although IS''3'' family elements do participate in compound transposons, to our knowledge no systematic survey has been undertaken and very few IS''3''-associated compound transposons have been described to date. Those described include: [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3411 IS''3411''], which flanks genes for [[wikipedia:Citrate_test|citrate utilization]] in transposon Tn''3411''<ref><nowiki><pubmed>6277857</pubmed></nowiki></ref><ref><nowiki><pubmed>2832386</pubmed></nowiki></ref><ref><nowiki><pubmed>6094480</pubmed></nowiki></ref>; [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS4521 IS''4521''], which flanks a heat-stable enterotoxin gene in [[wikipedia:Enterotoxigenic_Escherichia_coli|enterotoxigenic ''Escherichia coli'']]; and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1706 IS''1706''], which flanks genes of the [[wikipedia:Clp_protease_family|Clp protease]]/[[wikipedia:Chaperone|chaperone]] family.
  
 
===Distribution===
 
This is one of the most coherent, largest, most abundant and most widely distributed IS families<ref>Craig NL, Lambowitz AM, Craigie R, Gellert M, editors. Mobile DNA II. American Society of Microbiology; 2002. </ref> (see <ref name=":42"><pubmed>26350305</pubmed></ref>). Nearly 600 distinct members of this family have been identified in more than 267 bacterial species distributed over 145 genera. However, their true distribution is clearly much wider than this.
+
This is one of the most coherent, largest, most abundant and most widely distributed IS families<ref>Craig NL, Lambowitz AM, Craigie R, Gellert M, editors. Mobile DNA II. American Society of Microbiology; 2002. </ref> (see <ref name=":42"><nowiki><pubmed>26350305</pubmed></nowiki></ref>). Nearly 600 distinct members of this family have been identified in more than 267 bacterial species distributed over 145 genera. However, their true distribution is clearly much wider than this.
  
For example, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] (isolated from a ''[[wikipedia:Shigella_dysenteriae|Shigella dysenteriae]]'' phage λ lysogen by spontaneous insertion into the phage cI repressor gene<ref name=":0"><pubmed>2163395</pubmed>
+
For example, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] (isolated from a ''[[wikipedia:Shigella_dysenteriae|Shigella dysenteriae]]'' phage λ lysogen by spontaneous insertion into the phage cI repressor gene<ref name=":0"><nowiki><pubmed>2163395</pubmed></nowiki>
</ref>) is present in multiple copies in the original host strain and in type strains of other ''[[wikipedia:Shigella|Shigella]]'' species. Two vestigial copies, both interrupted by a copy of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS30 IS''30''], were also detected in the chromosome of [[wikipedia:Escherichia_coli_in_molecular_biology#K-12|''E. coli'' K12]]<ref><pubmed>9278503</pubmed></ref> and could form transposition intermediates when supplied with [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] transposase<ref><pubmed>9302015</pubmed></ref>. Entire or truncated [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] copies have also been identified in several ''[[wikipedia:Escherichia_coli|E. coli]]'' virulence plasmids (e.g. <ref><pubmed>10496929</pubmed></ref>), in pathogenicity islands of uropathogenic ''[[wikipedia:Escherichia_coli|E. coli]]'' (e.g. <ref><pubmed>8751923</pubmed></ref>), in various other clinical isolates of ''[[wikipedia:Escherichia_coli|E. coli]]'' and in a large number of well-known and less well-known enterobacteria such as ''[[wikipedia:Escherichia_fergusonii|Escherichia fergusonii]]'', ''[[wikipedia:Cronobacter|Cronobacter]]'', [[wikipedia:Dickeya|''Dickeya'']], ''[[wikipedia:Erwinia|Erwinia]]'', ''[[wikipedia:Klebsiella|Klebsiella]]'', ''[[wikipedia:Pantoea|Pantoea]]'', ''[[wikipedia:Shimwellia|Shimwellia]]'', and ''[[wikipedia:Yersinia|Yersinia]]''.
+
</ref>) is present in multiple copies in the original host strain and in type strains of other ''[[wikipedia:Shigella|Shigella]]'' species. Two vestigial copies, both interrupted by a copy of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS30 IS''30''], were also detected in the chromosome of [[wikipedia:Escherichia_coli_in_molecular_biology#K-12|''E. coli'' K12]]<ref><nowiki><pubmed>9278503</pubmed></nowiki></ref> and could form transposition intermediates when supplied with [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] transposase<ref><nowiki><pubmed>9302015</pubmed></nowiki></ref>. Entire or truncated [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] copies have also been identified in several ''[[wikipedia:Escherichia_coli|E. coli]]'' virulence plasmids (e.g. <ref><nowiki><pubmed>10496929</pubmed></nowiki></ref>), in pathogenicity islands of uropathogenic ''[[wikipedia:Escherichia_coli|E. coli]]'' (e.g. <ref><nowiki><pubmed>8751923</pubmed></nowiki></ref>), in various other clinical isolates of ''[[wikipedia:Escherichia_coli|E. coli]]'' and in a large number of well-known and less well-known enterobacteria such as ''[[wikipedia:Escherichia_fergusonii|Escherichia fergusonii]]'', ''[[wikipedia:Cronobacter|Cronobacter]]'', [[wikipedia:Dickeya|''Dickeya'']], ''[[wikipedia:Erwinia|Erwinia]]'', ''[[wikipedia:Klebsiella|Klebsiella]]'', ''[[wikipedia:Pantoea|Pantoea]]'', ''[[wikipedia:Shimwellia|Shimwellia]]'', and ''[[wikipedia:Yersinia|Yersinia]]''.
  
Most IS''3'' family members have been identified in bacteria although at least one example, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISMco1 IS''Mco1''], has also been identified in the archaeon [[wikipedia:Methanosaeta_concilii|''Methanosaeta concilii'']]<ref><pubmed>17347521</pubmed></ref>. Since this archaeon is widespread in nature<ref><pubmed>17320399</pubmed></ref>, it is possible that this represents a case of recent horizontal transfer. The presence of 8 copies implies that [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISMco1 IS''Mco1''] is active in its archaeal host.
+
Most IS''3'' family members have been identified in bacteria although at least one example, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISMco1 IS''Mco1''], has also been identified in the archaeon [[wikipedia:Methanosaeta_concilii|''Methanosaeta concilii'']]<ref><nowiki><pubmed>17347521</pubmed></nowiki></ref>. Since this archaeon is widespread in nature<ref><nowiki><pubmed>17320399</pubmed></nowiki></ref>, it is possible that this represents a case of recent horizontal transfer. The presence of 8 copies implies that [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISMco1 IS''Mco1''] is active in its archaeal host.
  
 
===Organization===
 
The family is quite homogeneous in organization [[:File:Fig. IS3.1.png|(Fig.IS3.1)]] in spite of its wide distribution in bacteria exhibiting a large range of G+C contents (from 70% in the [[wikipedia:Mycobacterium|Mycobacterial]] examples to 25% in those isolated from ''[[wikipedia:Mycoplasma|Mycoplasma]]'') and of the presence of members in hosts such as ''[[wikipedia:Mycoplasma|Mycoplasma]]'' with a non-universal genetic code (e.g. [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1138 IS''1138'']) or in bacteria which use stop codon read-through by insertion of the unusual amino acid selenocysteine (e.g. [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISDvu3 IS''Dvu3''] from ''[[wikipedia:Desulfovibrio_vulgaris|Desulfovibrio vulgaris]]''). In the case of both copies of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1138 IS''1138''], which participates in high-frequency rearrangements of the ''[https://microbewiki.kenyon.edu/index.php/Mycoplasma_pulmonis Mycoplasma pulmonis]'' chromosome, the Tpase orf carries 11 '''UGA''' codons which are decoded as tryptophan<ref name=":1"><pubmed>8096321</pubmed>
+
The family is quite homogeneous in organization [[:File:Fig. IS3.1.png|(Fig.IS3.1)]] in spite of its wide distribution in bacteria exhibiting a large range of G+C contents (from 70% in the [[wikipedia:Mycobacterium|Mycobacterial]] examples to 25% in those isolated from ''[[wikipedia:Mycoplasma|Mycoplasma]]'') and of the presence of members in hosts such as ''[[wikipedia:Mycoplasma|Mycoplasma]]'' with a non-universal genetic code (e.g. [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1138 IS''1138'']) or in bacteria which use stop codon read-through by insertion of the unusual amino acid selenocysteine (e.g. [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISDvu3 IS''Dvu3''] from ''[[wikipedia:Desulfovibrio_vulgaris|Desulfovibrio vulgaris]]''). In the case of both copies of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1138 IS''1138''], which participates in high-frequency rearrangements of the ''[https://microbewiki.kenyon.edu/index.php/Mycoplasma_pulmonis Mycoplasma pulmonis]'' chromosome, the Tpase orf carries 11 '''UGA''' codons which are decoded as tryptophan<ref name=":1"><nowiki><pubmed>8096321</pubmed></nowiki>
 
</ref>.
 
</ref>.
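The consequence of decoding UGA as tryptophan can be illustrated with a minimal sketch. It assumes a toy reading frame invented for illustration (not the IS''1138'' sequence) and models the Mycoplasma code simply as the standard code with TGA reassigned to W; the names <code>codon_table</code>, <code>translate</code> and <code>TOY_ORF</code> are hypothetical.

```python
from itertools import product

# Standard genetic code with codons ordered TTT, TTC, TTA, TTG, TCT, ...
# (first, second and third base each cycling through T, C, A, G).
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def codon_table(tga_as_trp=False):
    """Return a codon -> amino acid map; optionally read TGA (UGA) as Trp."""
    codons = ("".join(c) for c in product("TCAG", repeat=3))
    table = dict(zip(codons, STANDARD))
    if tga_as_trp:
        table["TGA"] = "W"  # assumption: only this codon differs from the standard code
    return table


def translate(dna, table):
    """Translate frame 0 until the first stop codon or the end of the sequence."""
    dna = dna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = table[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy frame containing one internal TGA codon.
TOY_ORF = "ATGAAATGAGGCTGGTAA"
standard_product = translate(TOY_ORF, codon_table())
mycoplasma_product = translate(TOY_ORF, codon_table(tga_as_trp=True))
```

Under the standard code the toy frame terminates at the internal TGA, whereas with TGA read as tryptophan translation continues to the final TAA. A reading frame carrying 11 such codons would therefore appear heavily truncated if analysed with the standard code.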
 
[[Image:Fig. IS3.1.png|thumb|center|820x820px|'''Fig. IS3.1'''. '''(A)''' Genetic organization of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911'']. The 1,250-bp [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] is shown as a box. The boxes at each end represent the left (IRL) and right (IRR) terminal inverted repeats. The two open reading frames, ''orfA'' (blue) and ''orfB'' (green) are positioned in relative reading phases 0 and −1, respectively, as indicated. The indigenous promoter, pIRL, is shown. The region of overlap between ''orfA'' and ''orfB'', which includes the frameshifting signals to produce OrfAB, lies within IS911 coordinates 300 and 400. The precise point at
 
Line 26: Line 26:
 
IS''3''-family members generally have two consecutive and partially overlapping reading frames, ''orfA'' and ''orfB'', in relative translational reading phases 0 and -1, respectively [[:File:Fig. IS3.1.png|(Fig.IS3.1 A)]] under control of a weak promoter, pIRL, partially located in IRL ([[:File:Fig. IS3.1.png|Fig.IS3.1 A]] and [[:File:Fig. IS3.3.png|Fig.IS3.3 C]]). The 5' end of ''orfB'' overlaps the 3' end of ''orfA'' and occurs in reading phase -1 relative to ''orfA'' [[:File:Fig. IS3.1.png|(Fig.IS3.1)]].  
 
  
It had been demonstrated in the 1990s that several family members ([https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS150 IS''150'']<ref name=":2"><pubmed>1653413</pubmed>
+
It had been demonstrated in the 1990s that several family members ([https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS150 IS''150'']<ref name=":2"><nowiki><pubmed>1653413</pubmed></nowiki>
</ref>, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3'']<ref name=":3"><pubmed>8107082</pubmed>
+
</ref>, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3'']<ref name=":3"><nowiki><pubmed>8107082</pubmed></nowiki>
</ref>, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911'']<ref name=":4"><pubmed>1660923</pubmed>
+
</ref>, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911'']<ref name=":4"><nowiki><pubmed>1660923</pubmed></nowiki>
</ref>, and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2'']<ref name=":5"><pubmed>9302014</pubmed>
+
</ref>, and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2'']<ref name=":5"><nowiki><pubmed>9302014</pubmed></nowiki>
</ref>) express two major proteins [[:File:Fig. IS3.1.png|(Fig.IS3.1 B)]]: OrfA, the product of the upstream frame, and the transposase, OrfAB, a “fusion” or “transframe” protein generated from ''orfA'' and ''orfB'' by '''P'''rogrammed -1 '''R'''ibosomal '''F'''rameshifting (PRF) (see "[[General Information/Transposase expression and activity#Programmed Translational Frameshifting|Programmed translational frameshifting]]")<ref><pubmed>8384687</pubmed></ref>. Many other members of this family are also organized in this way<ref name=":6"><pubmed>21673094</pubmed>
+
</ref>) express two major proteins [[:File:Fig. IS3.1.png|(Fig.IS3.1 B)]]: OrfA, the product of the upstream frame, and the transposase, OrfAB, a “fusion” or “transframe” protein generated from ''orfA'' and ''orfB'' by '''P'''rogrammed -1 '''R'''ibosomal '''F'''rameshifting (PRF) (see "[[General Information/Transposase expression and activity#Programmed Translational Frameshifting|Programmed translational frameshifting]]")<ref><nowiki><pubmed>8384687</pubmed></nowiki></ref>. Many other members of this family are also organized in this way<ref name=":6"><nowiki><pubmed>21673094</pubmed></nowiki>
</ref><ref name=":7"><pubmed>24875478</pubmed></ref>. The frameshifting frequency varies from element to element. It is approximately 50% in the case of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS150 IS''150'']''<ref name=":2" />'' and only 15% for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911'']''<ref name=":4" />''.
+
</ref><ref name=":7"><nowiki><pubmed>24875478</pubmed></nowiki></ref>. The frameshifting frequency varies from element to element. It is approximately 50% in the case of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS150 IS''150'']''<ref name=":2" />'' and only 15% for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911'']''<ref name=":4" />''.
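How a -1 frameshift produces a transframe product sharing its N-terminal part with OrfA can be sketched as follows. This is only an illustration under stated assumptions: the toy sequence, the position of the slip and the function names are invented and are not taken from any IS''3'' family member.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip(("".join(c) for c in product("TCAG", repeat=3)), AMINO_ACIDS))


def translate(dna, start=0):
    """Translate from `start` until a stop codon or the end of the sequence."""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def translate_with_minus1_shift(dna, slip_end):
    """Read frame 0 up to `slip_end` (0-based index just past the last frame-0
    codon before the slip), then step back one nucleotide and continue."""
    upstream = translate(dna[:slip_end])
    if 3 * len(upstream) < slip_end:  # a stop codon was met before the slip site
        return upstream
    return upstream + translate(dna, start=slip_end - 1)


# Hypothetical toy message: frame 0 reads ATG GCA AAA AAG TAA.
TOY_MRNA = "ATGGCAAAAAAGTAACTGGCTGA"
orf_a_like = translate(TOY_MRNA)                         # frame 0 only
orf_ab_like = translate_with_minus1_shift(TOY_MRNA, 12)  # slip after the AAG codon
```

In this toy case the frame 0 product stops at the TAA immediately after the slip position, while the frameshifted product re-reads the last G of AAG and continues in the -1 frame, giving a longer protein with the same N-terminal residues.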
 
[[Image:Fig. IS3.3.png|thumb|center|780x780px|'''Fig. IS3.3.''' '''(A)''' Organization of the [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] inverted repeat ('''IR'''). The nucleotide sequence of IRL and '''IRR''' is boxed. Grey horizontal bars above and below indicate the internal regions protected from DNaseI digestion by binding of OrfAB [1–149], a derivative of the 382-amino-acid OrfAB truncated for its catalytic domain.

The dotted horizontal grey bar indicates partial protection. The dashes within the sequence indicate mismatches between the left and right ends. The −35 and −10 components of the indigenous promoter pIRL (blue boxes) and of '''pjunc''' (green boxes) are shown. The conserved '''5′''' TG tips are highlighted in red. '''(B)''' Organization of '''pjunc'''. The “junction” promoter assembled on the circularization of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] is shown as green boxes. The initiating transcript nucleotide (+1 '''pjunc'''), the indigenous pIRL (blue boxes), and the initiating transcript nucleotide (+1 pIRL) are also shown. The conserved '''5′''' TG tips are highlighted in red. '''(C)''' Secondary structure at the left [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] end. The sequence of the “top” strand of '''IRL''' is shown, together with the various transcription and translation signals. The symbols below are standard “dot-bracket” notations indicating potential secondary structures formed with transcripts, from top to bottom: from an external promoter, from '''pjunc''', or from pIRL, respectively. The brackets shown in ''italic'' simply permit the reader to identify the apical stem of the secondary structure.|alt=]]
  
Complex internal inverted repeat sequences ([[:File:Fig. IS3.3.png|Fig.IS3.3 C]]) (for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''], located between coordinates 19 and 73) include the -35 and -10 hexamers of pIRL, the transcription start site and the [[wikipedia:Ribosome-binding_site|ribosome binding site]] for OrfA. These are thought to play a role at the mRNA level in preventing excess transposase expression resulting from external transcription. The full secondary structure would be present in transcripts initiated outside the IS, thus sequestering the translation initiation signals, but only the 3’ part would be present if transcription initiates at pIRL. In this case, the translation initiation signals would be exposed. Initial studies ([https://scholar.google.com/citations?user=dDU8ukUAAAAJ&hl=en Prère] and [https://scholar.google.com/citations?user=wAxcf14AAAAJ&hl=en Fayet], personal communication) have shown that translation from the longer transcript is very low but that deletion of its 5’ end to “liberate” the ribosome binding site ([[:File:Fig. IS3.3.png|Fig.IS3.3 C]]) indeed results in a significant increase in translation. In the related [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''] element, a similar sequence appears to function as a DNA binding site for the OrfA protein, which represses promoter activity, but further studies are necessary to confirm this<ref name=":8"><pubmed>8107136</pubmed></ref>.
+
Complex internal inverted repeat sequences ([[:File:Fig. IS3.3.png|Fig.IS3.3 C]]) (for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''], located between coordinates 19 and 73) include the -35 and -10 hexamers of pIRL, the transcription start site and the [[wikipedia:Ribosome-binding_site|ribosome binding site]] for OrfA. These are thought to play a role at the mRNA level in preventing excess transposase expression resulting from external transcription. The full secondary structure would be present in transcripts initiated outside the IS, thus sequestering the translation initiation signals, but only the 3’ part would be present if transcription initiates at pIRL. In this case, the translation initiation signals would be exposed. Initial studies ([https://scholar.google.com/citations?user=dDU8ukUAAAAJ&hl=en Prère] and [https://scholar.google.com/citations?user=wAxcf14AAAAJ&hl=en Fayet], personal communication) have shown that translation from the longer transcript is very low but that deletion of its 5’ end to “liberate” the ribosome binding site ([[:File:Fig. IS3.3.png|Fig.IS3.3 C]]) indeed results in a significant increase in translation. In the related [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''] element, a similar sequence appears to function as a DNA binding site for the OrfA protein, which represses promoter activity, but further studies are necessary to confirm this<ref name=":8"><nowiki><pubmed>8107136</pubmed></nowiki></ref>.
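The sequestration argument can be made concrete with a small sketch that reads a dot-bracket string and reports what fraction of a chosen region is base-paired. The two structures below are invented toy examples, not the predicted IS''911'' structures, and the region standing in for the ribosome binding site is an assumption chosen for illustration.

```python
def paired_positions(dot_bracket):
    """Map each paired position to its partner for a simple dot-bracket string."""
    stack, pairs = [], {}
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            j = stack.pop()
            pairs[i], pairs[j] = j, i
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return pairs


def fraction_paired(dot_bracket, start, end):
    """Fraction of positions in [start, end) that are base-paired."""
    pairs = paired_positions(dot_bracket)
    return sum(1 for i in range(start, end) if i in pairs) / (end - start)


# Toy long transcript: a 7 bp stem whose 3' arm (positions 11-17) stands in
# for the translation initiation signals.
LONG_TRANSCRIPT = "(((((((....)))))))"
# Toy short transcript lacking the 5' arm, so the same signals (now at
# positions 4-10) have no partner to pair with.
SHORT_TRANSCRIPT = "..........."

sequestered = fraction_paired(LONG_TRANSCRIPT, 11, 18)
exposed = fraction_paired(SHORT_TRANSCRIPT, 4, 11)
```

In the toy long transcript the whole region is paired, whereas in the shortened transcript it is entirely single-stranded. This mirrors the proposal that signals buried in externally initiated transcripts are exposed in transcripts initiated at pIRL.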
  
 
===Formation of a strong transposase promoter===
 
In common with many IS of other families (e.g. [[IS Families/IS21 family|IS''21'']]<ref><pubmed>2540414</pubmed></ref>, [[IS Families/IS30 family|IS''30'']]<ref><pubmed>3039299</pubmed></ref>, [[IS Families/IS110 family|IS''110'']]<ref><pubmed>10438765</pubmed></ref><ref name=":9"><pubmed>11598022</pubmed></ref>) the IS''3'' family '''IRR''' carries an outward-directed -35 promoter hexamer while '''IRL''' carries an inward-directed -10 promoter component ([[:File:Fig. IS3.3.png|Fig.IS3.3 B]]). In one of the key transposition intermediates, an excised transposon circle (see "[[General Information/Major Groups are Defined by the Type of Transposase They Use#Major DDE transposition pathways|Transposition Pathway]]"), these are assembled into a strong promoter, '''pJunc''', which serves to express high levels of transposition proteins ([[:File:Fig. IS3.3.png|Fig.IS3.3 B]]; [[:File:Fig. IS3.4.png|Fig.IS3.4]]). Transcription initiation from '''pJunc''', like that from impinging transcription, would also produce an RNA which could sequester the translation initiation signals but in a shorter and less stable stem loop structure ([[:File:Fig. IS3.3.png|Fig.IS3.3 C]]).
+
In common with many IS of other families (e.g. [[IS Families/IS21 family|IS''21'']]<ref><nowiki><pubmed>2540414</pubmed></nowiki></ref>, [[IS Families/IS30 family|IS''30'']]<ref><nowiki><pubmed>3039299</pubmed></nowiki></ref>, [[IS Families/IS110 family|IS''110'']]<ref><nowiki><pubmed>10438765</pubmed></nowiki></ref><ref name=":9"><nowiki><pubmed>11598022</pubmed></nowiki></ref>) the IS''3'' family '''IRR''' carries an outward-directed -35 promoter hexamer while '''IRL''' carries an inward-directed -10 promoter component ([[:File:Fig. IS3.3.png|Fig.IS3.3 B]]). In one of the key transposition intermediates, an excised transposon circle (see "[[General Information/Major Groups are Defined by the Type of Transposase They Use#Major DDE transposition pathways|Transposition Pathway]]"), these are assembled into a strong promoter, '''pJunc''', which serves to express high levels of transposition proteins ([[:File:Fig. IS3.3.png|Fig.IS3.3 B]]; [[:File:Fig. IS3.4.png|Fig.IS3.4]]). Transcription initiation from '''pJunc''', like that from impinging transcription, would also produce an RNA which could sequester the translation initiation signals but in a shorter and less stable stem loop structure ([[:File:Fig. IS3.3.png|Fig.IS3.3 C]]).
 
[[Image:Fig. IS3.4.png|thumb|center|680x680px|'''Fig. IS3.4.''' '''Left:''' Primer extension analysis of ''lac'' transcripts. Lanes 1 and 2: two independent cultures. Lanes 3 and 4: primer extension products obtained from identical quantities of total RNA isolated from two independent cultures. The major products are indicated by unfilled arrowheads (right). The scheme at the left shows the relative position of the IRR–IRL junction.  '''Middle:''' Schematic of the different plasmid forms. Note that, to obtain results for the transposon junction, a copy was cloned into a suitable vector. '''Right:''' Colonies on MacConkey lactose plates.|alt=]]
  
 
===Regulation by Methylation?===
 
Several members carry GATC [[wikipedia:DNA_methylation|methylation]] sites within 50 bp of their ends, which have been shown in one case, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3''], to modulate transposition activity<ref name=":10"><pubmed>1645443</pubmed></ref>. However, this is not a general characteristic of the family nor is it restricted to any particular subgroup.
+
Several members carry GATC [[wikipedia:DNA_methylation|methylation]] sites within 50 bp of their ends, which have been shown in one case, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3''], to modulate transposition activity<ref name=":10"><nowiki><pubmed>1645443</pubmed></nowiki></ref>. However, this is not a general characteristic of the family nor is it restricted to any particular subgroup.
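A simple way to check a given element for such sites is sketched below. The function name, the 50 bp default window and the toy sequence are assumptions made for illustration; the toy sequence is not a real IS. Because GATC is its own reverse complement, scanning one strand is sufficient.

```python
def gatc_sites_near_ends(seq, window=50, motif="GATC"):
    """Return (1-based start, end label) for each motif lying entirely within
    `window` bp of the left or right end of `seq`."""
    seq = seq.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        if start + len(motif) <= window:
            hits.append((start + 1, "left"))
        elif start >= len(seq) - window:
            hits.append((start + 1, "right"))
        start = seq.find(motif, start + 1)
    return hits


# Hypothetical 191 bp toy element with three GATC sites: one near each end
# and one in the interior.
TOY_IS = "TG" + "A" * 10 + "GATC" + "A" * 100 + "GATC" + "A" * 60 + "GATC" + "A" * 5 + "CA"
sites = gatc_sites_near_ends(TOY_IS)
```

For the toy element only the first and last sites are reported; the interior site at position 117 lies outside both 50 bp windows.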
  
 
===Insertion specificity===
 
There appears to be little sequence specificity for insertion of members of the family. [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''] exhibits a preference for a region of [[wikipedia:P1_phage|bacteriophage P1]] but the basis of this preference is at present unknown<ref><pubmed>3035338</pubmed></ref>. Both [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911'']<ref name=":11"><pubmed>8106332</pubmed></ref> and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS150 IS''150'']<ref>Welz C. Functionelle analyse des Bakteriellen Insertionelements IS150. PhD thesis: Fakultät für Biologie der Albert-Ludwigs-Universität Freiburg; 1993. </ref> have been found next to sequences which resemble their IRs (see “[[IS Families/IS3 family#Targeted Insertion|Targeted Insertion]]”) and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1397 IS''1397''] is invariably located within intergenic repeated sequences in ''[[wikipedia:Escherichia_coli|E. coli]]'' ('''B'''acterial '''I'''nterspersed '''M'''osaic '''E'''lements or BIMEs)<ref><pubmed>9055066</pubmed></ref>.
+
There appears to be little sequence specificity for insertion of members of the family. [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''] exhibits a preference for a region of [[wikipedia:P1_phage|bacteriophage P1]] but the basis of this preference is at present unknown<ref><nowiki><pubmed>3035338</pubmed></nowiki></ref>. Both [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911'']<ref name=":11"><nowiki><pubmed>8106332</pubmed></nowiki></ref> and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS150 IS''150'']<ref>Welz C. Functionelle analyse des Bakteriellen Insertionelements IS150. PhD thesis: Fakultät für Biologie der Albert-Ludwigs-Universität Freiburg; 1993. </ref> have been found next to sequences which resemble their IRs (see “[[IS Families/IS3 family#Targeted Insertion|Targeted Insertion]]”) and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1397 IS''1397''] is invariably located within intergenic repeated sequences in ''[[wikipedia:Escherichia_coli|E. coli]]'' ('''B'''acterial '''I'''nterspersed '''M'''osaic '''E'''lements or BIMEs)<ref><nowiki><pubmed>9055066</pubmed></nowiki></ref>.
  
 
===Group II intron insertions===
 
===Group II intron insertions===
Finally, an element isolated from the [https://pubmed.ncbi.nlm.nih.gov/6363394/?dopt=Abstract ECOR collection] of ''[[wikipedia:Escherichia_coli|E. coli]]'' and closely related to [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3411 IS''3411''] carries a [[wikipedia:Group_II_intron|group II intron]]<ref><pubmed>7994604</pubmed></ref>. The effect of this on regulation of transposition of this element has not been investigated.  
+
Finally, an element isolated from the [https://pubmed.ncbi.nlm.nih.gov/6363394/?dopt=Abstract ECOR collection] of ''[[wikipedia:Escherichia_coli|E. coli]]'' and closely related to [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3411 IS''3411''] carries a [[wikipedia:Group_II_intron|group II intron]]<ref><nowiki><pubmed>7994604</pubmed></nowiki></ref>. The effect of this on regulation of transposition of this element has not been investigated.  
  
 
===IS3 family subgroups===
 
===IS3 family subgroups===
The IS''3'' family is divided into five subgroups ([[General Information/What Is an IS?#Characteristics of insertion sequence families|Table Characteristics of IS families]]; [[:File:1.4.2.png|Fig.4.2]]). This is supported by deep branching in the alignment of the various OrfA and OrfB sequences<ref name=":12"><pubmed>9729608</pubmed>
+
The IS''3'' family is divided into five subgroups ([[General Information/What Is an IS?#Characteristics of insertion sequence families|Table Characteristics of IS families]]; [[:File:1.4.2.png|Fig.4.2]]). This is supported by deep branching in the alignment of the various OrfA and OrfB sequences<ref name=":12"><nowiki><pubmed>9729608</pubmed></nowiki>
 
</ref> ([[:File:Fig. IS3.5.png|Fig.IS3.5]]). These are: the '''IS''2'' and IS''407'' subgroups''' (which appear closely related), and the '''IS''3'', IS''51'', and IS''150'' subgroups'''.  
 
</ref> ([[:File:Fig. IS3.5.png|Fig.IS3.5]]). These are: the '''IS''2'' and IS''407'' subgroups''' (which appear closely related), and the '''IS''3'', IS''51'', and IS''150'' subgroups'''.  
  
Additional members of the family identified subsequently also tend to follow this pattern. One feature which lends biological credence to these subgroups is that they also clearly appear clustered (with some exceptions) in the results of the alignments with the upstream OrfA protein<ref name=":12" />. Moreover, there is some correlation between the members of each group and the number of base pairs of target DNA duplicated on insertion (DR): for those elements in the '''IS''2'' subgroup''', insertion invariably leads to a 5 bp DR; for the '''IS''407'' subgroup''' a 4 bp DR is observed; while for the other groups a 3 bp DR is generated ([[General Information/What Is an IS?#Characteristics of insertion sequence families|Table Characteristics of IS families]]). In the latter cases some of the elements, e.g. [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''], have been shown to occasionally generate 4 bp repeats. This clustering is also exhibited to some extent in the nucleotide sequence of the terminal '''IRs''' ([[:File:Fig. IS3.2.png|Fig.IS3.2]]) and is particularly marked in the '''IS''2'', IS''51'' and IS''407'' subgroups'''. It can also be observed in the primary sequence details of the putative leucine zipper<ref name=":13"><pubmed>10677279</pubmed></ref>.
+
Additional members of the family identified subsequently also tend to follow this pattern. One feature which lends biological credence to these subgroups is that they also clearly appear clustered (with some exceptions) in the results of the alignments with the upstream OrfA protein<ref name=":12" />. Moreover, there is some correlation between the members of each group and the number of base pairs of target DNA duplicated on insertion (DR): for those elements in the '''IS''2'' subgroup''', insertion invariably leads to a 5 bp DR; for the '''IS''407'' subgroup''' a 4 bp DR is observed; while for the other groups a 3 bp DR is generated ([[General Information/What Is an IS?#Characteristics of insertion sequence families|Table Characteristics of IS families]]). In the latter cases some of the elements, e.g. [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''], have been shown to occasionally generate 4 bp repeats. This clustering is also exhibited to some extent in the nucleotide sequence of the terminal '''IRs''' ([[:File:Fig. IS3.2.png|Fig.IS3.2]]) and is particularly marked in the '''IS''2'', IS''51'' and IS''407'' subgroups'''. It can also be observed in the primary sequence details of the putative leucine zipper<ref name=":13"><nowiki><pubmed>10677279</pubmed></nowiki></ref>.
 
[[Image:Fig. IS3.5.png|thumb|center|680x680px|'''Fig. IS3.5.'''  '''Relationship of OrfA and OrfB in various IS''3'' family groups.''' Dendrogram based on the alignments of the amino acid sequences of predicted OrfA proteins from 40 elements (left) and 44 predicted OrfB frames (right) (adapted from Mahillon and Chandler 1998). The different colors indicate the different IS3 family groups, showing that both A and B frames are largely group-specific.|alt=]]
 
[[Image:Fig. IS3.5.png|thumb|center|680x680px|'''Fig. IS3.5.'''  '''Relationship of OrfA and OrfB in various IS''3'' family groups.''' Dendrogram based on the alignments of the amino acid sequences of predicted OrfA proteins from 40 elements (left) and 44 predicted OrfB frames (right) (adapted from Mahillon and Chandler 1998). The different colors indicate the different IS3 family groups, showing that both A and B frames are largely group-specific.|alt=]]
  
Line 61: Line 61:
  
 
===Mycoplasma and the non-universal genetic code===
 
===Mycoplasma and the non-universal genetic code===
Family members from ''[[wikipedia:Mycoplasma|Mycoplasma]]'' merit special attention. Not only does the host use a non-universal genetic code in which the opal termination codon TGA directs the insertion of tryptophan (see <ref><pubmed>1579111</pubmed></ref>), but their genomes are among the smallest bacterial genomes known and extremely rich in A+T. To date, several different IS''3'' family members have been observed in [[wikipedia:Mycoplasma|''Mycoplasma'']]. Of these, only [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1138 IS''1138''] (and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1138b IS''1138b'']) has been demonstrated directly to undergo autonomous transposition<ref name=":1" />. All exhibit similarly high AT levels and this unusual base composition could lead to difficulties in sequence determination. It is remarkable that typical IS''3'' family characters have been maintained in such an "extreme" genetic environment. Nine individuals are closely related and form a group of iso-elements which have been called [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1221 IS''1221'']. As indicated above, one of these carries a single long reading frame (representing ''orfA'' + ''orfB'') instead of two consecutive overlapping frames. The others each carry insertions or deletions which destroy either the equivalent of ''orfA'', ''orfB'', or both. Expression studies in ''[[wikipedia:Escherichia_coli|E. coli]]'' indicate that a protein, equivalent to OrfAB, is indeed produced from the long open reading frame of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1221 IS''1221'']. Interestingly, it appears that a second truncated protein, equivalent to OrfA, may be generated from the single ''orfAB'' frame by [[wikipedia:Ribosomal_frameshift|translational frameshifting]], representing an "inverted" expression pattern relative to the majority of the family members<ref name=":14"><pubmed>7476162</pubmed>
+
Family members from ''[[wikipedia:Mycoplasma|Mycoplasma]]'' merit special attention. Not only does the host use a non-universal genetic code in which the opal termination codon TGA directs the insertion of tryptophan (see <ref><nowiki><pubmed>1579111</pubmed></nowiki></ref>), but their genomes are among the smallest bacterial genomes known and extremely rich in A+T. To date, several different IS''3'' family members have been observed in [[wikipedia:Mycoplasma|''Mycoplasma'']]. Of these, only [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1138 IS''1138''] (and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1138b IS''1138b'']) has been demonstrated directly to undergo autonomous transposition<ref name=":1" />. All exhibit similarly high AT levels and this unusual base composition could lead to difficulties in sequence determination. It is remarkable that typical IS''3'' family characters have been maintained in such an "extreme" genetic environment. Nine individuals are closely related and form a group of iso-elements which have been called [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1221 IS''1221'']. As indicated above, one of these carries a single long reading frame (representing ''orfA'' + ''orfB'') instead of two consecutive overlapping frames. The others each carry insertions or deletions which destroy either the equivalent of ''orfA'', ''orfB'', or both. Expression studies in ''[[wikipedia:Escherichia_coli|E. coli]]'' indicate that a protein, equivalent to OrfAB, is indeed produced from the long open reading frame of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1221 IS''1221'']. 
Interestingly, it appears that a second truncated protein, equivalent to OrfA, may be generated from the single ''orfAB'' frame by [[wikipedia:Ribosomal_frameshift|translational frameshifting]], representing an "inverted" expression pattern relative to the majority of the family members<ref name=":14"><nowiki><pubmed>7476162</pubmed></nowiki>
 
</ref>. Although this appears not to be a general rule for IS''3'' family members originating from ''Mycoplasma'' hosts, the presence of a similar single-frame arrangement in a second member, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1138 IS''1138''], indicates that it might not be rare. Because of the extremely high AT content of these elements, many potential frameshift windows of the A6G(/C) or A7 type are expected to occur. Only direct experiments will, therefore, be able to determine which, if any, of these sequences are used to generate the Tpase or, conversely, an OrfA-like protein.
 
</ref>. Although this appears not to be a general rule for IS''3'' family members originating from ''Mycoplasma'' hosts, the presence of a similar single-frame arrangement in a second member, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1138 IS''1138''], indicates that it might not be rare. Because of the extremely high AT content of these elements, many potential frameshift windows of the A6G(/C) or A7 type are expected to occur. Only direct experiments will, therefore, be able to determine which, if any, of these sequences are used to generate the Tpase or, conversely, an OrfA-like protein.
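
The consequence of reading TGA as tryptophan rather than as a stop codon can be illustrated with a minimal Python sketch. The function names and the toy sequence are hypothetical and chosen only for illustration; the two codon tables follow the standard genetic code and the ''Mycoplasma'' variant in which only TGA is reassigned.

```python
# Minimal sketch: translate the same DNA with the standard code and with a
# Mycoplasma-style code in which TGA encodes tryptophan (W).
BASES = "TCAG"
# Amino acids of the standard code for codons ordered TTT, TTC, TTA, TTG, TCT, ...
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def build_table(amino_acids):
    """Map each of the 64 codons (TCAG order) to a one-letter amino acid."""
    codons = [a + b + c for a in BASES for b in BASES for c in BASES]
    return dict(zip(codons, amino_acids))


STANDARD = build_table(STANDARD_AAS)
MYCOPLASMA = dict(STANDARD, TGA="W")  # the only change: TGA is read as Trp


def translate(dna, table):
    """Translate from position 0 until the first stop codon or the end."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = table[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy sequence (not taken from any IS): ATG AAA TGA GGT TAA
toy = "ATGAAATGAGGTTAA"
print(translate(toy, STANDARD))    # stops at TGA
print(translate(toy, MYCOPLASMA))  # reads through TGA as W
```

A reading frame from a ''Mycoplasma'' element analysed with the standard table would therefore appear truncated at every in-frame TGA, which is worth keeping in mind when such elements are expressed in ''E. coli''.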
  
Line 68: Line 68:
  
 
===An additional subgroup===
 
===An additional subgroup===
Recently, an additional subgroup has been proposed which includes [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISPpy1 IS''Ppy1'']<ref><pubmed>23832000</pubmed></ref>. However, all members belong to the IS''150'' subgroup and their Tpases are not separated by our standard multiple alignments and [https://micans.org/mcl/ MCL analysis]. Although they do exhibit some variation in the sequence of their terminal dinucleotides, similar variations are found for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''] and members of other IS''3'' subgroups.
+
Recently, an additional subgroup has been proposed which includes [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISPpy1 IS''Ppy1'']<ref><nowiki><pubmed>23832000</pubmed></nowiki></ref>. However, all members belong to the IS''150'' subgroup and their Tpases are not separated by our standard multiple alignments and [https://micans.org/mcl/ MCL analysis]. Although they do exhibit some variation in the sequence of their terminal dinucleotides, similar variations are found for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''] and members of other IS''3'' subgroups.
  
 
===Mechanism===
 
===Mechanism===
 
====Transposition Proteins====
 
====Transposition Proteins====
Extensive alignment studies of the predicted OrfA and OrfB amino acid sequences between themselves and with those of other transposable elements<ref name=":15"><pubmed>8302872</pubmed>
+
Extensive alignment studies of the predicted OrfA and OrfB amino acid sequences between themselves and with those of other transposable elements<ref name=":15"><nowiki><pubmed>8302872</pubmed></nowiki>
</ref><ref name=":16"><pubmed>1963920</pubmed>
+
</ref><ref name=":16"><nowiki><pubmed>1963920</pubmed></nowiki>
</ref><ref name=":17"><pubmed>1647013</pubmed>
+
</ref><ref name=":17"><nowiki><pubmed>1647013</pubmed></nowiki>
</ref><ref name=":18"><pubmed>1850126</pubmed></ref><ref name=":19"><pubmed>7934941</pubmed></ref> provided insights into structure/function relationships of the proteins ([[:File:Fig. IS3.1.png|Fig.IS3.1 B]]).  
+
</ref><ref name=":18"><nowiki><pubmed>1850126</pubmed></nowiki></ref><ref name=":19"><nowiki><pubmed>7934941</pubmed></nowiki></ref> provided insights into structure/function relationships of the proteins ([[:File:Fig. IS3.1.png|Fig.IS3.1 B]]).  
  
 
====OrfA====
 
====OrfA====
OrfA is small. For [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] it has a predicted molecular weight of 11.5 kDa. The predicted primary amino acid sequences of most IS''3'' family members exhibit a similarly placed HTH signature (see for example <ref name=":0" /><ref><pubmed>9435062</pubmed></ref>), which initially suggested that they might provide sequence-specific binding to the terminal '''IRs''' of their particular IS<ref name=":40"><pubmed>2841644</pubmed></ref>. A role in sequence-specific binding of the transposase OrfAB to the terminal '''IRs''' was subsequently confirmed experimentally<ref name=":20"><pubmed>14981152</pubmed></ref>. They also carry a C-terminal [[wikipedia:Leucine_zipper|leucine zipper]] (LZ) motif, first identified in [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS150 IS''150''] and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3''], which appears to be conserved in the majority of known members<ref name=":21"><pubmed>9761671</pubmed></ref> and is involved in protein multimerization<ref name=":0" /><ref name=":4" /><ref name=":13" /><ref name=":21" />.
+
OrfA is small. For [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] it has a predicted molecular weight of 11.5 kDa. The predicted primary amino acid sequences of most IS''3'' family members exhibit a similarly placed HTH signature (see for example <ref name=":0" /><ref><nowiki><pubmed>9435062</pubmed></nowiki></ref>), which initially suggested that they might provide sequence-specific binding to the terminal '''IRs''' of their particular IS<ref name=":40"><nowiki><pubmed>2841644</pubmed></nowiki></ref>. A role in sequence-specific binding of the transposase OrfAB to the terminal '''IRs''' was subsequently confirmed experimentally<ref name=":20"><nowiki><pubmed>14981152</pubmed></nowiki></ref>. They also carry a C-terminal [[wikipedia:Leucine_zipper|leucine zipper]] (LZ) motif, first identified in [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS150 IS''150''] and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3''], which appears to be conserved in the majority of known members<ref name=":21"><nowiki><pubmed>9761671</pubmed></nowiki></ref> and is involved in protein multimerization<ref name=":0" /><ref name=":4" /><ref name=":13" /><ref name=":21" />.
  
 
====OrfB====
 
====OrfB====
The OrfB products carry a DD(35)E catalytic motif and share additional identities with [[wikipedia:Integrase|retroviral integrases]] and various other Tpases<ref name=":4" /><ref name=":15" /><ref name=":16" /><ref name=":17" /><ref name=":18" /><ref name=":19" /><ref><pubmed>10547692</pubmed></ref>. These include two amino acids located 4 and 7 residues downstream from the glutamate residue.  
+
The OrfB products carry a DD(35)E catalytic motif and share additional identities with [[wikipedia:Integrase|retroviral integrases]] and various other Tpases<ref name=":4" /><ref name=":15" /><ref name=":16" /><ref name=":17" /><ref name=":18" /><ref name=":19" /><ref><nowiki><pubmed>10547692</pubmed></nowiki></ref>. These include two amino acids located 4 and 7 residues downstream from the glutamate residue.  
  
[https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] OrfB is 299 residues long with a predicted molecular weight of 34.6 kDa. Its TAA termination codon lies just within IRR and may be significant in regulation. The OrfB initiation codon is AUU and consequently initiation occurs only at low levels<ref name=":4" /><ref name=":22"><pubmed>10064703</pubmed></ref> and is modulated by the level of initiation factor IF3<ref name=":23"><pubmed>21478364</pubmed></ref>.
+
[https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] OrfB is 299 residues long with a predicted molecular weight of 34.6 kDa. Its TAA termination codon lies just within IRR and may be significant in regulation. The OrfB initiation codon is AUU and consequently initiation occurs only at low levels<ref name=":4" /><ref name=":22"><nowiki><pubmed>10064703</pubmed></nowiki></ref> and is modulated by the level of initiation factor IF3<ref name=":23"><nowiki><pubmed>21478364</pubmed></nowiki></ref>.
  
OrfB has been observed for: IS''3<ref name=":3" />'' ([https://scholar.google.com/citations?user=dDU8ukUAAAAJ&hl=en Prère] & [https://scholar.google.com/citations?user=wAxcf14AAAAJ&hl=en Fayet], unpublished), [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS150 IS''150'']''<ref name=":2" />'', [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911'']''<ref name=":4" />''<ref name=":22" /><ref name=":23" /> and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3411 IS''3411'']/[https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS629 IS''629'']<ref name=":24"><pubmed>18474594</pubmed>
+
OrfB has been observed for: IS''3<ref name=":3" />'' ([https://scholar.google.com/citations?user=dDU8ukUAAAAJ&hl=en Prère] & [https://scholar.google.com/citations?user=wAxcf14AAAAJ&hl=en Fayet], unpublished), [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS150 IS''150'']''<ref name=":2" />'', [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911'']''<ref name=":4" />''<ref name=":22" /><ref name=":23" /> and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3411 IS''3411'']/[https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS629 IS''629'']<ref name=":24"><nowiki><pubmed>18474594</pubmed></nowiki>
</ref><ref><pubmed>16731525</pubmed></ref> but not for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2'']<ref name=":25"><pubmed>8824609</pubmed></ref>. It is generally present at quite low levels although for IS''3'' approximately equal amounts of OrfB and OrfAB appear to be produced<ref name=":3" />. The [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS150 IS''150''] OrfB initiation codon is out of phase with the rest of the gene and expression of full length OrfB would require a -1 frameshift after initiation.
+
</ref><ref><nowiki><pubmed>16731525</pubmed></nowiki></ref> but not for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2'']<ref name=":25"><nowiki><pubmed>8824609</pubmed></nowiki></ref>. It is generally present at quite low levels although for IS''3'' approximately equal amounts of OrfB and OrfAB appear to be produced<ref name=":3" />. The [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS150 IS''150''] OrfB initiation codon is out of phase with the rest of the gene and expression of full length OrfB would require a -1 frameshift after initiation.
  
 
Sequence analysis suggests that OrfB may in fact be synthesized by about 34% of IS''3'' family members through translational coupling: the stop codon of ''orfA'' overlaps with a potential ''orfB'' start codon (e.g. AUGA or GUGA) in 134 out of 399 ISs analyzed<ref name=":6" />.
 
Sequence analysis suggests that OrfB may in fact be synthesized by about 34% of IS''3'' family members through translational coupling: the stop codon of ''orfA'' overlaps with a potential ''orfB'' start codon (e.g. AUGA or GUGA) in 134 out of 399 ISs analyzed<ref name=":6" />.
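
The overlap described above, in which the last base of a potential ''orfB'' start codon is also the first base of the ''orfA'' stop codon, can be sketched in a few lines of Python. This is only an illustrative check under stated assumptions: the function name, the DNA-alphabet motifs ATGA and GTGA (corresponding to AUGA and GUGA) and the toy sequences are hypothetical, and other overlapping arrangements may exist that this sketch does not cover.

```python
# Minimal sketch: detect a start codon overlapping the orfA stop codon by one
# base (ATGA or GTGA), and check the quoted proportion of 134 out of 399.
START_STOP_OVERLAPS = {"ATGA", "GTGA"}  # start (ATG/GTG) + stop (TGA) sharing "TG"


def has_coupled_start(dna, stop_index):
    """Return True if a start codon overlaps the stop codon at stop_index.

    stop_index is the 0-based position of the first base of the orfA stop
    codon; the candidate start codon begins one base upstream of it.
    """
    if stop_index < 1:
        return False
    return dna[stop_index - 1:stop_index + 3] in START_STOP_OVERLAPS


# Hypothetical toy sequences; the stop codon TGA starts at index 4 in each.
print(has_coupled_start("CCCATGAGG", 4))
print(has_coupled_start("CCCCTGAGG", 4))

# Proportion reported in the text: 134 of 399 analysed ISs.
print(round(100 * 134 / 399, 1))
```

The last line simply reproduces the arithmetic behind the "about 34%" figure.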
Line 92: Line 92:
 
It is possible that the OrfB protein itself plays no direct role in transposition chemistry but that it is simply its translation signals which are important. Their recognition by the ribosome could modulate programmed translational frameshifting required to generate a single transposase protein, OrfAB, from the two reading frames ''orfA'' and ''orfB'' (see [[General Information/Transposase expression and activity#Programmed Translational Frameshifting|"Programmed translational frameshifting"]]).  
 
It is possible that the OrfB protein itself plays no direct role in transposition chemistry but that it is simply its translation signals which are important. Their recognition by the ribosome could modulate programmed translational frameshifting required to generate a single transposase protein, OrfAB, from the two reading frames ''orfA'' and ''orfB'' (see [[General Information/Transposase expression and activity#Programmed Translational Frameshifting|"Programmed translational frameshifting"]]).  
  
The OrfB amino acid sequence shares significant similarities with [[wikipedia:Integrase|retroviral integrases]], an observation which contributed to defining the highly conserved amino acid triad DDE common to all IS''3'' family members and to many phosphoryltransferase enzymes of this type<ref name=":16" /><ref><pubmed>1314954</pubmed></ref>. This constitutes part of the active site (for reviews see: <ref name=":40" /><ref name=":22" />).
+
The OrfB amino acid sequence shares significant similarities with [[wikipedia:Integrase|retroviral integrases]], an observation which contributed to defining the highly conserved amino acid triad DDE common to all IS''3'' family members and to many phosphoryltransferase enzymes of this type<ref name=":16" /><ref><nowiki><pubmed>1314954</pubmed></nowiki></ref>. This constitutes part of the active site (for reviews see: <ref name=":40" /><ref name=":22" />).
  
 
OrfB carries neither the [[wikipedia:Helix-turn-helix|HTH]] nor the [[wikipedia:Leucine_zipper|LZ motif]].
 
OrfB carries neither the [[wikipedia:Helix-turn-helix|HTH]] nor the [[wikipedia:Leucine_zipper|LZ motif]].
Line 105: Line 105:
 
Ribosome rephasing to generate OrfAB occurs on a group of “slippery” lysine codons with a frequency of about 15% (measured using systems driven by two different promoters; T7p10 and ptac). OrfA is therefore normally expressed at significantly higher levels than OrfAB. Frameshifting permits the combination of different functional protein domains ([[:File:Fig. IS3.1.png|Fig.IS3.1 C]]).
 
Ribosome rephasing to generate OrfAB occurs on a group of “slippery” lysine codons with a frequency of about 15% (measured using systems driven by two different promoters; T7p10 and ptac). OrfA is therefore normally expressed at significantly higher levels than OrfAB. Frameshifting permits the combination of different functional protein domains ([[:File:Fig. IS3.1.png|Fig.IS3.1 C]]).
  
IS''3''-family frameshifting is similar to that used in some retroviruses to generate the [https://www.wikigenes.org/e/gene/e/155348.html Gag-Pol "polyprotein"]<ref><pubmed>7636469</pubmed></ref> and in the ''[https://www.wikigenes.org/e/gene/e/945105.html dnaX]'' gene of ''[[wikipedia:Escherichia_coli|E. coli]]'' to synthesize the γ subunit of [[wikipedia:DNA_polymerase|DNA polymerase]] III<ref name=":26"><pubmed>1547945</pubmed></ref>.
+
IS''3''-family frameshifting is similar to that used in some retroviruses to generate the [https://www.wikigenes.org/e/gene/e/155348.html Gag-Pol "polyprotein"]<ref><nowiki><pubmed>7636469</pubmed></nowiki></ref> and in the ''[https://www.wikigenes.org/e/gene/e/945105.html dnaX]'' gene of ''[[wikipedia:Escherichia_coli|E. coli]]'' to synthesize the γ subunit of [[wikipedia:DNA_polymerase|DNA polymerase]] III<ref name=":26"><nowiki><pubmed>1547945</pubmed></nowiki></ref>.
  
The relevant [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] sequences involved in frameshifting are shown in [[:File:Fig. IS3.1.png|Fig.IS3.1 C]]. Examples of frameshifting sequences from other members of the family are shown in [[:File:Fig. IS3.6.png|Fig.IS3.6]]. The group of slippery lysine codons is A AAA AAG and is directly preceded by the AUU OrfB initiation codon. Since ''[[wikipedia:Escherichia_coli|E. coli]]'' does not encode a tRNALys with a 3’UUC5’ anticodon for AAG, both lysine codons are decoded by the same tRNALys with a 3’UUU5’ anticodon. Its pairing is weaker with a G at the wobble position<ref><pubmed>3860833</pubmed></ref> probably because modifications of U34 increase the rigidity of the anticodon<ref><pubmed>11027137</pubmed></ref>. The presence of an upstream RBS (GGAG sequence) and a downstream secondary structure (Y-shaped stem-loop) stimulate ribosome rephasing in the -1 direction. What drives frameshifting is probably the thermodynamically favorable re-pairing of the two tRNALys from codons AAA-AAG to codons AAA-AAA<ref name=":26" /><ref><pubmed>12970189</pubmed></ref>. The stimulators likely have a mechanical effect, bringing the ribosome and the mRNA back into register after tRNA slippage. Different groups of codons have been observed to allow rephasing of the ribosome<ref name=":7" /> and, although the most common motif is A6G, different members of the IS''3'' family carry a variety of these (e.g. A3G for IS''3''; see [https://www.springer.com/gp/book/9780387893815 Atkins & Gesteland, Recoding: expansion of decoding rules enriches gene expression], Springer 2010).
+
The relevant [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] sequences involved in frameshifting are shown in [[:File:Fig. IS3.1.png|Fig.IS3.1 C]]. Examples of frameshifting sequences from other members of the family are shown in [[:File:Fig. IS3.6.png|Fig.IS3.6]]. The group of slippery lysine codons is A AAA AAG and is directly preceded by the AUU OrfB initiation codon. Since ''[[wikipedia:Escherichia_coli|E. coli]]'' does not encode a tRNALys with a 3’UUC5’ anticodon for AAG, both lysine codons are decoded by the same tRNALys with a 3’UUU5’ anticodon. Its pairing is weaker with a G at the wobble position<ref><nowiki><pubmed>3860833</pubmed></nowiki></ref> probably because modifications of U34 increase the rigidity of the anticodon<ref><nowiki><pubmed>11027137</pubmed></nowiki></ref>. The presence of an upstream RBS (GGAG sequence) and a downstream secondary structure (Y-shaped stem-loop) stimulate ribosome rephasing in the -1 direction. What drives frameshifting is probably the thermodynamically favorable re-pairing of the two tRNALys from codons AAA-AAG to codons AAA-AAA<ref name=":26" /><ref><nowiki><pubmed>12970189</pubmed></nowiki></ref>. The stimulators likely have a mechanical effect, bringing the ribosome and the mRNA back into register after tRNA slippage. Different groups of codons have been observed to allow rephasing of the ribosome<ref name=":7" /> and, although the most common motif is A6G, different members of the IS''3'' family carry a variety of these (e.g. A3G for IS''3''; see [https://www.springer.com/gp/book/9780387893815 Atkins & Gesteland, Recoding: expansion of decoding rules enriches gene expression], Springer 2010).
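
The re-pairing from AAA-AAG to AAA-AAA can be made concrete with a small Python sketch. It is a hedged illustration only: the function names and the toy sequence are hypothetical, the scan covers only the A6G and A7 heptamers (written in the DNA alphabet), and finding such a motif says nothing about whether it is actually used for frameshifting, which also depends on the stimulatory signals described above.

```python
# Minimal sketch: locate candidate slippery heptamers (A6G or A7) and show the
# two codons paired with the tRNAs before and after a -1 slip.
import re

# Lookahead so that overlapping candidates within long A runs are all reported.
SLIPPERY = re.compile(r"(?=(A{6}[AG]))")


def find_slippery(sequence):
    """Return (position, heptamer) for every A6G or A7 window in sequence."""
    return [(m.start(), m.group(1)) for m in SLIPPERY.finditer(sequence)]


def codons_before_and_after_slip(heptamer):
    """For a heptamer read as X XXY YYZ, return the zero-frame codon pair
    (XXY, YYZ) and the pair occupied after a -1 slip (XXX, YYY)."""
    zero_frame = (heptamer[1:4], heptamer[4:7])
    minus_one = (heptamer[0:3], heptamer[3:6])
    return zero_frame, minus_one


# Hypothetical toy sequence containing a GGAG element upstream of A AAA AAG.
toy = "GGAGTTAAAAAAGCC"
for position, heptamer in find_slippery(toy):
    print(position, heptamer, codons_before_and_after_slip(heptamer))
```

For the A AAA AAG heptamer, the zero-frame pair is AAA and AAG and the pair after slippage is AAA and AAA, matching the re-pairing described in the text.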
 
[[Image:Fig. IS3.6.png|thumb|center|780x780px|'''Fig. IS3.6.'''  '''Signals and predicted branched stem-loop structures in the frameshift regions of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3411 IS''3411''], and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1222 IS''1222'']''.''''' This figure, adapted from Sharma et al., 2014 ([https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3'']), Mazauric et al., 2008 ([https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3411 IS''3411'']) and Mejlhede et al., 2004 ([https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1222 IS''1222'']), illustrates several of the different potential secondary structures located downstream of the group of “slippery” codons at which a programmed -1 translational frameshift occurs. These include stem-loop structures in all cases, but may also involve the formation of a pseudoknot which enhances ribosome slippage and an upstream ribosome binding site (SD sequence).|alt=]]
 
[[Image:Fig. IS3.6.png|thumb|center|780x780px|'''Fig. IS3.6.'''  '''Signals and predicted branched stem-loop structures in the frameshift regions of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3411 IS''3411''], and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1222 IS''1222'']''.''''' This figure, adapted from Sharma et al., 2014 ([https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3'']), Mazauric et al., 2008 ([https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3411 IS''3411'']) and Mejlhede et al., 2004 ([https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1222 IS''1222'']), illustrates several of the different potential secondary structures located downstream of the group of “slippery” codons at which a programmed -1 translational frameshift occurs. These include stem-loop structures in all cases, but may also involve the formation of a pseudoknot which enhances ribosome slippage and an upstream ribosome binding site (SD sequence).|alt=]]
  
 
Two similarly located partially overlapping reading frames in [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS150 IS''150''] and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3411 IS''3411'']''<ref name=":24" />'' also produce three proteins. The transposases, OrfAB, like that of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''], are fusion products of the two orfs generated by a –1 translational frameshift.  
 
Two similarly located partially overlapping reading frames in [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS150 IS''150''] and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3411 IS''3411'']''<ref name=":24" />'' also produce three proteins. The transposases, OrfAB, like that of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''], are fusion products of the two orfs generated by a –1 translational frameshift.  
  
For IS''3'', frameshifting is also stimulated by a presumed H-type pseudoknot structure similar to those generally involved in viral recoding<ref><pubmed>18621088</pubmed></ref>. In [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3411 IS''3411''], -1 slippage on a U UUU motif requires a more convoluted form of pseudoknot structures formed by pairing of an apical loop and an internal loop belonging to two hairpins located 65 nucleotides apart on the mRNA<ref name=":24" />. Two similarly arranged orfs occur in [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''] and have been shown to encode OrfA and OrfAB equivalents only<ref name=":8" /><ref name=":25" />. This organization is observed in most members of the IS''3'' family but, beside the cases mentioned above, frameshifting has been analyzed experimentally only in a few other, less well-characterized, elements (including [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS51 IS''51''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS222 IS''222''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS600 IS''600''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1133 IS''1133''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1222 IS''1222'']).  
+
For IS''3'', frameshifting is also stimulated by a presumed H-type pseudoknot structure similar to those generally involved in viral recoding<ref><nowiki><pubmed>18621088</pubmed></nowiki></ref>. In [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3411 IS''3411''], -1 slippage on a U UUU motif requires a more convoluted form of pseudoknot structures formed by pairing of an apical loop and an internal loop belonging to two hairpins located 65 nucleotides apart on the mRNA<ref name=":24" />. Two similarly arranged orfs occur in [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''] and have been shown to encode OrfA and OrfAB equivalents only<ref name=":8" /><ref name=":25" />. This organization is observed in most members of the IS''3'' family but, beside the cases mentioned above, frameshifting has been analyzed experimentally only in a few other, less well-characterized, elements (including [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS51 IS''51''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS222 IS''222''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS600 IS''600''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1133 IS''1133''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1222 IS''1222'']).  
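As a minimal illustration of how a programmed -1 frameshift fuses two overlapping reading frames into a single OrfAB-like product, the following Python sketch translates a toy mRNA with and without slippage. The sequence, the function names and the slip position are invented for illustration and do not correspond to any real IS. The heptamer A AAA AAG is used here only as an example slippery motif, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, start=0):
    """Translate seq from index start until a stop codon or the end of seq."""
    protein = []
    for i in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def translate_with_minus1_shift(seq, slip_end):
    """Translate frame 0 up to slip_end, then continue in the -1 frame.

    slip_end is the index just past the last codon decoded in frame 0
    (hypothetical parameter). After slippage the ribosome re-reads the
    last nucleotide of that codon, so decoding resumes at slip_end - 1.
    """
    upstream = translate(seq[:slip_end])
    if len(upstream) * 3 != slip_end:
        raise ValueError("frame 0 stops before the slip site")
    return upstream + translate(seq, start=slip_end - 1)


# Toy mRNA (invented): ATG GCA AAA AAG TAA CTG GAT AAC
TOY_MRNA = "ATGGCAAAAAAGTAACTGGATAAC"

orf_a = translate(TOY_MRNA)                          # "MAKK", stops at TAA in frame 0
orf_ab = translate_with_minus1_shift(TOY_MRNA, 12)   # "MAKKVTG", continues in frame -1
```

In this toy case the frame 0 product ends at the stop codon immediately after the slippery motif, whereas the shifted ribosome bypasses that stop and extends the protein in the -1 frame. With a frameshifting frequency f, a fraction f of ribosomes would be expected to produce the fusion product and 1 - f the shorter protein.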
  
 
The frequency of frameshifting is quite variable from element to element: reported values are 15% for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''], 50% for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS150 IS''150''], 6% for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3''] and 2% for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3411 IS''3411'']''<ref name=":24" />''. These values may not reflect the ''in vivo'' situation since they were not established by direct measurement of the amount of the OrfA and OrfAB proteins synthesized from an intact IS, but after modification of expression signals of the IS genes or after cloning the frameshift signals in a reporter system<ref name=":2" /><ref name=":3" /><ref name=":4" />.
 
  
The level of formation of the circular [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] transposition intermediate, in which abutted left and right ends generate an '''IRR-IRL junction''' ([[IS Families/IS3 family#The Transposition Pathway|Transposition Pathway]]), as measured by PCR, indeed depends on frameshifting frequency ''in vivo''<ref name=":27"><pubmed>12586397</pubmed>
+
The level of formation of the circular [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] transposition intermediate, in which abutted left and right ends generate an '''IRR-IRL junction''' ([[IS Families/IS3 family#The Transposition Pathway|Transposition Pathway]]), as measured by PCR, indeed depends on frameshifting frequency ''in vivo''<ref name=":27"><nowiki><pubmed>12586397</pubmed></nowiki>
 
</ref>. [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] copies from several clinical isolates that contained variations in the frameshift region exhibited levels of frameshifting that were reduced to varying degrees. When these variations were introduced into the model [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''], they resulted in comparable reductions in circle formation.
 
</ref>. [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] copies from several clinical isolates that contained variations in the frameshift region exhibited levels of frameshifting that were reduced to varying degrees. When these variations were introduced into the model [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''], they resulted in comparable reductions in circle formation.
  
Line 122: Line 122:
  
 
====Artificial ''orfA''-''orfB'' fusion====
 
For experimental purposes, production of OrfAB without necessitating a translational frameshift is obtained by introduction of a single additional base pair within the frameshift region which artificially fuses the ''orfA'' and ''orfB'' frames and eliminates OrfA production<ref name=":4" />. It was initially difficult to construct this mutant in the context of an entire [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] (i.e. with the two flanking IR) but more recently this has been accomplished using a longer artificial IS and resulted in an exceptionally high transposition frequency<ref name=":28"><pubmed>22195971</pubmed>
+
For experimental purposes, production of OrfAB without necessitating a translational frameshift is obtained by introduction of a single additional base pair within the frameshift region which artificially fuses the ''orfA'' and ''orfB'' frames and eliminates OrfA production<ref name=":4" />. It was initially difficult to construct this mutant in the context of an entire [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] (i.e. with the two flanking IR) but more recently this has been accomplished using a longer artificial IS and resulted in an exceptionally high transposition frequency<ref name=":28"><nowiki><pubmed>22195971</pubmed></nowiki>
 
</ref>. A similar mutant in [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3''] results in a high frequency of adjacent deletions<ref name=":3" />.
 
</ref>. A similar mutant in [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3''] results in a high frequency of adjacent deletions<ref name=":3" />.
  
Line 131: Line 131:
 
[[Image:Fig. IS3.7A.png|thumb|center|640x640px|'''Fig. IS3.7A.'''  Sequence alignments of the HTH motif.  '''Top.'''  Alignment of the predicted HTH motif of the transposase of the five defining members of subgroups within the IS''3'' family with that of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''].  Identical or similar residues are boxed; bold lower case characters represent residues that fit the consensus. '''Bottom'''. An expanded view of the IS''911'' HTH motif with (below) mutated residues used in defining DNA binding functions.|alt=]]
 
  
Many members carry a putative leucine zipper located at the end of OrfA (sometimes extending into the OrfB region of the OrfAB protein) (see <ref name=":14" /> <ref><pubmed>8520113</pubmed></ref><ref><pubmed>7496528</pubmed></ref>). Studies with [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''] indicate that this is a multimerization domain of the proteins<ref name=":13" /><ref name=":21" /><ref><pubmed>9335268</pubmed></ref>. The [[wikipedia:Leucine_zipper|LZ motif]] of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] is composed of four heptameric units ([[:File:Fig. IS3.1.png|Fig.IS3.1 B]]) with a predicted coiled-coil structure. This includes a potential buried inter-subunit hydrogen bond across the dimer interface ([[:File:Fig. IS3.7B.png|Fig.IS3.7 B]]), which would maintain the zipper in a dimeric state, and correctly placed residues with opposite charges potentially able to form characteristic inter-subunit salt-bridges that stabilize the dimeric structure<ref name=":21" />. [[wikipedia:Leucine_zipper|Leucine zipper motifs]] are found in most IS''3'' family members ([[:File:Fig. IS3.7C.png|Fig.IS3.7 C]]).
+
Many members carry a putative leucine zipper located at the end of OrfA (sometimes extending into the OrfB region of the OrfAB protein) (see <ref name=":14" /> <ref><nowiki><pubmed>8520113</pubmed></nowiki></ref><ref><nowiki><pubmed>7496528</pubmed></nowiki></ref>). Studies with [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''] indicate that this is a multimerization domain of the proteins<ref name=":13" /><ref name=":21" /><ref><nowiki><pubmed>9335268</pubmed></nowiki></ref>. The [[wikipedia:Leucine_zipper|LZ motif]] of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] is composed of four heptameric units ([[:File:Fig. IS3.1.png|Fig.IS3.1 B]]) with a predicted coiled-coil structure. This includes a potential buried inter-subunit hydrogen bond across the dimer interface ([[:File:Fig. IS3.7B.png|Fig.IS3.7 B]]), which would maintain the zipper in a dimeric state, and correctly placed residues with opposite charges potentially able to form characteristic inter-subunit salt-bridges that stabilize the dimeric structure<ref name=":21" />. [[wikipedia:Leucine_zipper|Leucine zipper motifs]] are found in most IS''3'' family members ([[:File:Fig. IS3.7C.png|Fig.IS3.7 C]]).
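The heptad organization described above can be made concrete with a short Python sketch that assigns the positions a to g along a sequence and counts the leucines falling at position d. The 28-residue sequence is an invented toy example, not the IS''911'' LZ, and the function names are hypothetical. It only illustrates the bookkeeping behind a four-heptad zipper with hydrophobic residues at positions a and d.

```python
HEPTAD = "abcdefg"


def heptad_register(seq, first="a"):
    """Pair each residue with its heptad position, starting at position `first`."""
    offset = HEPTAD.index(first)
    return [(res, HEPTAD[(offset + i) % 7]) for i, res in enumerate(seq)]


def leucines_at_d(seq, first="a"):
    """Count the d positions occupied by leucine."""
    return sum(
        1 for res, pos in heptad_register(seq, first) if pos == "d" and res == "L"
    )


# Invented toy zipper: four heptads, each with leucine at position d.
TOY_ZIPPER = "IEALKEK" + "VAELEKK" + "IQALRDE" + "VKSLEEK"

# Same toy sequence with an altered last heptad, loosely mimicking the way
# the final heptad differs between two proteins that share the first three.
TOY_VARIANT = "IEALKEK" + "VAELEKK" + "IQALRDE" + "VKSAEEK"

full_count = leucines_at_d(TOY_ZIPPER)      # 4
variant_count = leucines_at_d(TOY_VARIANT)  # 3
```

A real analysis would rely on a coiled-coil prediction tool rather than on a fixed register, since the register of a natural sequence is not known in advance.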
 
[[Image:Fig. IS3.7B.png|thumb|center|780x780px|'''Fig. IS3.7B.''' '''A)''' OrfAB is shown at the top. The relative positions of the A and B domains are indicated together with those of the [[wikipedia:Helix-turn-helix|helix-turn-helix]] (HTH), leucine zipper (LZ), and DD(35)E motifs. M is a second region necessary for correct multimerization. The numbers below indicate the positions in amino acid residues. The single amino acid sequence below shows the [[wikipedia:Leucine_zipper|LZ motif]] with the four-component heptad repeats indicated below and the leucine repeat highlighted. Repeating positions are indicated by the letters a to g. The changes in [[wikipedia:Leucine_zipper|LZ]] sequence resulting from frameshifting between OrfA and OrfAB. '''B)''' A helical wheel diagram showing a head-to-head homodimer conformation to portray the predicted hydrophobic core (positions a and d) and electrostatic interactions (positions e and g). Arrows of decreasing size and intensity are directed towards the carboxy-terminal end.|alt=]]
 
 
[[Image:Fig. IS3.7C.png|thumb|center|640x640px|'''Fig. IS3.7C.'''  '''Conservation of the [[wikipedia:Leucine_zipper|leucine zipper motif]] throughout the different IS''3'' family subgroups.''' Alignment of predicted coiled-coils in the OrfA proteins of members of the five IS''3'' families. Leucine residues are highlighted in red and other significant residues in blue. Adapted from Haren et al., 2000.|alt=]]
 
Line 137: Line 137:
 
OrfAB and OrfA form both homomultimers and mixed OrfAB-OrfA multimers<ref name=":13" /><ref name=":21" />.  
 
  
Mutation of specific critical residues in the OrfAB LZ reduces the level of transposition intermediates ''in vivo'' and ''in vitro'' <ref><pubmed>9761671</pubmed> </ref> ([[IS Families/IS3 family#The Transposition Pathway|Transposition Cycle]]) and reduces or prevents multimer (dimer) formation. OrfAB and OrfA share three of their four heptads ([[:File:Fig. IS3.7B.png|Fig.IS3.7 B]]). The last heptad of each differs in sequence because the translational frameshift that generates OrfAB occurs within it. This presumably results in different strengths of monomer-monomer interactions in the case of homo- and hetero-multimers and this may be involved in the regulation of transposition. A poorly defined region, '''M''', located between residues 109 and 135 ([[:File:Fig. IS3.1.png|Fig.IS3.1 B]]) and components in the catalytic domain of OrfAB are also involved in its multimerization.
+
Mutation of specific critical residues in the OrfAB LZ reduces the level of transposition intermediates ''in vivo'' and ''in vitro'' <ref><nowiki><pubmed>9761671</pubmed></nowiki> </ref> ([[IS Families/IS3 family#The Transposition Pathway|Transposition Cycle]]) and reduces or prevents multimer (dimer) formation. OrfAB and OrfA share three of their four heptads ([[:File:Fig. IS3.7B.png|Fig.IS3.7 B]]). The last heptad of each differs in sequence because the translational frameshift that generates OrfAB occurs within it. This presumably results in different strengths of monomer-monomer interactions in the case of homo- and hetero-multimers and this may be involved in the regulation of transposition. A poorly defined region, '''M''', located between residues 109 and 135 ([[:File:Fig. IS3.1.png|Fig.IS3.1 B]]) and components in the catalytic domain of OrfAB are also involved in its multimerization.
  
 
====Co-translational DNA binding====
 
Line 148: Line 148:
  
 
====Co-translational multimerisation====
 
An intriguing question arising directly from these results is how OrfAB multimerizes, as found in the transpososome, to bind both ends of the IS. Stable formation of the important synaptic complex containing both IS ends and the transposase requires a dimeric OrfAB (see "[[IS Families/IS3 family#The IS911 transpososome|The IS''911'' transpososome]]" below). It is therefore possible that dimerization is in some way directly associated with translation. Indeed, using ''[https://www.uniprot.org/uniprot/Q91UU4 luxA]'' and ''[https://www.uniprot.org/uniprot/Q56822 luxB]'' as a model system, it has been shown that ''luxA''/''B'' subunit assembly initiates cotranslationally on nascent [https://www.uniprot.org/uniprot/Q56822 LuxB] ''in vivo''. Protein assembly appears to be directly coupled to translation and involves “spatially confined, actively chaperoned cotranslational subunit interactions”<ref><pubmed>26405228</pubmed></ref>.
+
An intriguing question arising directly from these results is how OrfAB multimerizes, as found in the transpososome, to bind both ends of the IS. Stable formation of the important synaptic complex containing both IS ends and the transposase requires a dimeric OrfAB (see "[[IS Families/IS3 family#The IS911 transpososome|The IS''911'' transpososome]]" below). It is therefore possible that dimerization is in some way directly associated with translation. Indeed, using ''[https://www.uniprot.org/uniprot/Q91UU4 luxA]'' and ''[https://www.uniprot.org/uniprot/Q56822 luxB]'' as a model system, it has been shown that ''luxA''/''B'' subunit assembly initiates cotranslationally on nascent [https://www.uniprot.org/uniprot/Q56822 LuxB] ''in vivo''. Protein assembly appears to be directly coupled to translation and involves “spatially confined, actively chaperoned cotranslational subunit interactions”<ref><nowiki><pubmed>26405228</pubmed></nowiki></ref>.
  
 
====The IS''911'' transpososome====
 
A crucial checkpoint in transposition is the assembly of the 'transpososome'. This step is a general prerequisite for initiating DNA cleavage and the subsequent chemical steps in transposition for most elements that use DNA (rather than RNA) transposition intermediates. In this protein-DNA complex, both ends of the transposon are bridged by the transposase before it catalyzes the DNA strand cleavages and strand transfers necessary for transposon mobility<ref><pubmed>21439812</pubmed></ref><ref><pubmed>23217365</pubmed></ref><ref><pubmed>16181782</pubmed></ref>. The transpososome adopts very precise architectures to accomplish these steps, and undergoes defined changes throughout the transposition process.
+
A crucial checkpoint in transposition is the assembly of the 'transpososome'. This step is a general prerequisite for initiating DNA cleavage and the subsequent chemical steps in transposition for most elements that use DNA (rather than RNA) transposition intermediates. In this protein-DNA complex, both ends of the transposon are bridged by the transposase before it catalyzes the DNA strand cleavages and strand transfers necessary for transposon mobility<ref><nowiki><pubmed>21439812</pubmed></nowiki></ref><ref><nowiki><pubmed>23217365</pubmed></nowiki></ref><ref><nowiki><pubmed>16181782</pubmed></nowiki></ref>. The transpososome adopts very precise architectures to accomplish these steps, and undergoes defined changes throughout the transposition process.
  
 
The overall [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] transposition pathway is a two-step process, involving replicative excision followed by insertion ([[:File:Fig. IS3.9A.png|Fig.IS3.9 A]] and [[:File:Fig. IS3.9B.png|9B]]). This implies consecutive assembly of two types of transpososome: one ('''synaptic complex A'''; SCA) is implicated in IS excision and includes both IS ends, while the other ('''synaptic complex B'''; SCB) involves the circle junction with its abutted IRs and ensures its integration into the target DNA.
 
Line 160: Line 160:
 
Using a band shift assay and IR of different lengths (the so-called “long-short” experiment) it was shown that the truncated OrfAB [1-149] forms a complex with two IR copies, the paired-end complex (PEC)<ref name=":13" /> equivalent to the SCA. An intact OrfAB [1-149] LZ is necessary for correct PEC/SCA formation<ref name=":13" /><ref name=":21" />. At higher OrfAB [1-149] concentrations a probable single end complex (SEC) composed of one IR and OrfAB [1-149] appeared. Addition of OrfA disturbed both PEC/SCA and SEC and generated a fast migrating species whose composition remains to be determined but does not appear to contain OrfA itself <ref name=":13" />.
 
  
DNaseI and Copper [[wikipedia:Phenanthroline|phenanthroline]] footprinting revealed that OrfAB [1-149] protects a sub-terminal (internal) IR region including two conserved sequence blocks in the left ('''IRL''') and right ('''IRR''') ends ([[:File:Fig. IS3.1.png|Fig.IS3.1 A]]). DNA binding assays ''in vitro'' and measurement of ''in vivo'' recombination activity of sequential IR deletion derivatives suggested a model in which the N-terminal region of OrfAB binds the conserved boxes in a sequence-specific manner and anchors the two IRs into the SCA. The external region of the inverted repeat was proposed to contact the C-terminal transposase domain carrying the catalytic site<ref><pubmed>11352577</pubmed></ref>.
+
DNaseI and Copper [[wikipedia:Phenanthroline|phenanthroline]] footprinting revealed that OrfAB [1-149] protects a sub-terminal (internal) IR region including two conserved sequence blocks in the left ('''IRL''') and right ('''IRR''') ends ([[:File:Fig. IS3.1.png|Fig.IS3.1 A]]). DNA binding assays ''in vitro'' and measurement of ''in vivo'' recombination activity of sequential IR deletion derivatives suggested a model in which the N-terminal region of OrfAB binds the conserved boxes in a sequence-specific manner and anchors the two IRs into the SCA. The external region of the inverted repeat was proposed to contact the C-terminal transposase domain carrying the catalytic site<ref><nowiki><pubmed>11352577</pubmed></nowiki></ref>.
  
SCA is composed of a transposase dimer bridging two IR<ref name=":29"><pubmed>20553579</pubmed>
+
SCA is composed of a transposase dimer bridging two IR<ref name=":29"><nowiki><pubmed>20553579</pubmed></nowiki>
 
</ref>, as judged by the use of a tagged and untagged truncated transposase derivative, OrfAB[1-149], and also of IR of different lengths. OrfAB[1-149] assembles two IRR copies in a parallel orientation ([[:File:Fig. IS3.9A.png|Fig.IS3.4]])<ref name=":29" /> as studied at the single molecule level by [[wikipedia:Atomic_force_microscopy|Atomic Force Microscopy]] (AFM) using asymmetric IRR-carrying DNA fragments.
 
</ref>, as judged by the use of a tagged and untagged truncated transposase derivative, OrfAB[1-149], and also of IR of different lengths. OrfAB[1-149] assembles two IRR copies in a parallel orientation ([[:File:Fig. IS3.9A.png|Fig.IS3.4]])<ref name=":29" /> as studied at the single molecule level by [[wikipedia:Atomic_force_microscopy|Atomic Force Microscopy]] (AFM) using asymmetric IRR-carrying DNA fragments.
  
SCA assembly was also studied using a second single-molecule approach: [[wikipedia:Tethered_particle_motion|tethered particle motion]] (TPM) ([[:File:Fig. IS3.10.png|Fig.IS3.10]])<ref><pubmed>15155821</pubmed></ref> in which a DNA molecule is tethered to a glass support and its effective length is measured by observing the Brownian motion of a bead attached to its free end ([[:File:Fig. IS3.10.png|Fig.IS3.10]] left). OrfAB[1-149] binding to a single IR provoked a small shortening of the DNA, consistent with a DNA bend introduced by protein binding to the IR; this was confirmed using EMSA. When two ends were present on the tethered DNA in their natural, inverted, configuration, OrfAB[1-149] not only provoked the short reduction in length but also generated species with greatly reduced effective length ([[:File:Fig. IS3.10.png|Fig.IS3.10]] middle and top right) consistent with DNA looping between the ends and thus SCA formation. SCA is very stable, and kinetic analysis in real time suggested that passage from the bound unlooped to the looped state could involve another unlooped species of intermediate length in which OrfAB[1-149] is bound to both '''IRs'''. DNA carrying directly repeated IR also gave rise to the looped species but the level of the intermediate species was significantly enhanced ([[:File:Fig. IS3.10.png|Fig.IS3.10]] middle and bottom right). Its accumulation could reflect less favorable SCA formation with directly repeated IR copies than with inverted '''IR'''. This is compatible with a model in which OrfAB binds separately to and bends each '''IR''' and protein-protein interactions then lead to SCA formation ([[:File:Fig. IS3.11.png|Fig.IS3.11 A]])<ref><pubmed>16923775</pubmed></ref>. Cleavage and strand transfer would then give rise to a species in which both IS ends are joined by a single-strand bridge (or a figure-eight on a circular plasmid; [[:File:Fig. IS3.9B.png|Fig.IS3.9 C]]) (see "[[IS Families/IS3 family#The Transposition Pathway|The Transposition Pathway]]").
+
SCA assembly was also studied using a second single-molecule approach: [[wikipedia:Tethered_particle_motion|tethered particle motion]] (TPM) ([[:File:Fig. IS3.10.png|Fig.IS3.10]])<ref><nowiki><pubmed>15155821</pubmed></nowiki></ref> in which a DNA molecule is tethered to a glass support and its effective length is measured by observing the Brownian motion of a bead attached to its free end ([[:File:Fig. IS3.10.png|Fig.IS3.10]] left). OrfAB[1-149] binding to a single IR provoked a small shortening of the DNA, consistent with a DNA bend introduced by protein binding to the IR; this was confirmed using EMSA. When two ends were present on the tethered DNA in their natural, inverted, configuration, OrfAB[1-149] not only provoked the short reduction in length but also generated species with greatly reduced effective length ([[:File:Fig. IS3.10.png|Fig.IS3.10]] middle and top right) consistent with DNA looping between the ends and thus SCA formation. SCA is very stable, and kinetic analysis in real time suggested that passage from the bound unlooped to the looped state could involve another unlooped species of intermediate length in which OrfAB[1-149] is bound to both '''IRs'''. DNA carrying directly repeated IR also gave rise to the looped species but the level of the intermediate species was significantly enhanced ([[:File:Fig. IS3.10.png|Fig.IS3.10]] middle and bottom right). Its accumulation could reflect less favorable SCA formation with directly repeated IR copies than with inverted '''IR'''. This is compatible with a model in which OrfAB binds separately to and bends each '''IR''' and protein-protein interactions then lead to SCA formation ([[:File:Fig. IS3.11.png|Fig.IS3.11 A]])<ref><nowiki><pubmed>16923775</pubmed></nowiki></ref>. Cleavage and strand transfer would then give rise to a species in which both IS ends are joined by a single-strand bridge (or a figure-eight on a circular plasmid; [[:File:Fig. IS3.9B.png|Fig.IS3.9 C]]) (see "[[IS Families/IS3 family#The Transposition Pathway|The Transposition Pathway]]").
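The principle of the TPM measurement described above, in which a shorter effective tether restricts the Brownian motion of the bead, can be sketched in a few lines of Python. This is a simplified illustration and not the analysis used in the cited studies. The function names, the two threshold parameters and the three state labels are assumptions introduced here, and real experiments calibrate the bead excursion against DNA of known length.

```python
import math


def rms_excursion(points):
    """Root-mean-square distance of bead positions from their mean position.

    points is a list of (x, y) bead coordinates recorded over a time window.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    squared = sum((x - mean_x) ** 2 + (y - mean_y) ** 2 for x, y in points)
    return math.sqrt(squared / n)


def classify_state(rms, looped_max, unlooped_min):
    """Assign a tether state from the RMS excursion.

    looped_max and unlooped_min are hypothetical calibration thresholds
    (same units as rms); values in between are labelled 'intermediate'.
    """
    if rms <= looped_max:
        return "looped"
    if rms >= unlooped_min:
        return "unlooped"
    return "intermediate"


# Two symmetric bead positions give an RMS excursion of exactly 5.0.
example_rms = rms_excursion([(3.0, 4.0), (-3.0, -4.0)])
example_state = classify_state(example_rms, looped_max=100.0, unlooped_min=200.0)
```

Applying such a classification over a sliding time window would, in principle, give the dwell times in each state from which looping kinetics are estimated.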
 
[[Image:Fig. IS3.10.png|thumb|center|690x690px|'''Fig. IS3.10.''' IR pairing by Tethered Particle Motion. The figure is adapted from Pouget et al., 2006|alt=]]
 
  
 
====Insertion synaptic complex SCB====
 
SCB has not been characterized as precisely as SCA. SCB is devoted to the insertion step of the transposition process. Two types of insertion, IR-targeted and non-targeted, have been observed ([[:File:Fig. IS3.11.png|Fig.IS3.11 B]]). It has been proposed that two different protein-DNA complexes are assembled during the two types of insertion reaction: SCBt and SCBnt (for targeted and non-targeted synaptic complex respectively)<ref name=":30"><pubmed>17367389</pubmed></ref>. Nothing is known about the stoichiometry and the geometry of these complexes. However, based on the protein and DNA requirements for protein-DNA complex formation, as judged by band shift, and for transposition products, as judged by ''in vitro'' and ''in vivo'' transposition assays, it has been proposed that SCBt is composed of a transposase dimer bridging a DNA molecule carrying an IR and a DNA molecule carrying an IRR-IRL junction ([https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] circle), the product of the replicative [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] excision. This IR-targeted insertion explains how the original isolate of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] might have occurred next to a sequence which strongly resembles an IR<ref name=":0" /> and can also explain one-ended insertion<ref name=":11" />. In this regard, IRR shows a somewhat higher affinity than IRL. Note that if one of the two IR carried by the circle is omitted, SCBt resembles SCA ([[:File:Fig. IS3.11.png|Fig.IS3.11]]).
+
SCB has not been characterized as precisely as SCA. SCB is devoted to the insertion step of the transposition process. Two types of insertion, IR-targeted and non-targeted, have been observed ([[:File:Fig. IS3.11.png|Fig.IS3.11 B]]). It has been proposed that two different protein-DNA complexes are assembled during the two types of insertion reaction: SCBt and SCBnt (for targeted and non-targeted synaptic complex respectively)<ref name=":30"><nowiki><pubmed>17367389</pubmed></nowiki></ref>. Nothing is known about the stoichiometry and the geometry of these complexes. However, based on the protein and DNA requirements for protein-DNA complex formation, as judged by band shift, and for transposition products, as judged by ''in vitro'' and ''in vivo'' transposition assays, it has been proposed that SCBt is composed of a transposase dimer bridging a DNA molecule carrying an IR and a DNA molecule carrying an IRR-IRL junction ([https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] circle), the product of the replicative [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] excision. This IR-targeted insertion explains how the original isolate of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] might have occurred next to a sequence which strongly resembles an IR<ref name=":0" /> and can also explain one-ended insertion<ref name=":11" />. In this regard, IRR shows a somewhat higher affinity than IRL. Note that if one of the two IR carried by the circle is omitted, SCBt resembles SCA ([[:File:Fig. IS3.11.png|Fig.IS3.11]]).
 
[[Image:Fig. IS3.11.png|thumb|center|780x780px|'''Fig. IS3.11.''' Proposed configuration and composition of synaptic complexes SCA and SCB involved in different steps of the [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] transposition cycle.  
 
'''(A)''' The excision complex SCA. The tips of the insertion sequence (IS), which are not protected by the truncated transposase OrfAB[1–149], are shown as green circles containing an arrowhead. '''IRs''' are indicated by thick black lines and the IS as green lines. Full-length OrfAB, which is presumed to cover the entire IR, is shown bound as a monomer to each end and introducing a small bend in the DNA. Dimerization creates SCA, resulting in the pairing of both IRs and in the formation of a DNA loop which includes the IS. Finally, a cleavage and strand transfer event results in the formation of a single-strand bridge between the IRs. '''(B)''' The integration complex SCB. Symbols are as in '''(A)'''. In the left-hand column, the IS circle intermediate with its newly replicated strand (dotted line) is shown forming a complex between an IR in the circle and a second IR in the target to give SCBt. Cleavage and strand transfer are shown to form a single-strand bridge between the two '''IRs'''. RecG helicase is thought to intervene to drive strand migration before a second cleavage and strand transfer results in the integration of the circle. This would explain the integration of the many different ISs observed to occur next to a resident IR in the target. Right-hand column: untargeted integration involving OrfA and OrfAB. OrfA is known to interact with OrfAB. It also alters OrfAB binding in some way, but it is not clear whether it remains in the complex.|alt=]]
  
SCBnt is thought to differ from both SCA and SCBt and to include the second [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] protein, OrfA. This protein, which binds non-specifically to DNA and interacts with OrfAB<ref name=":13" /><ref name=":21" />, is proposed to direct an OrfAB-junction complex to a randomly chosen target DNA to form SCBnt<ref name=":30" /><ref><pubmed>18586933</pubmed></ref>. This is based on the observation that integration of the transposon circle intermediate is greatly stimulated by preincubation of OrfAB and OrfA in an ''in vitro'' reaction<ref name=":31"><pubmed>9463394</pubmed></ref>.
+
SCBnt is thought to differ from both SCA and SCBt and to include the second [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] protein, OrfA. This protein, which binds non-specifically to DNA and interacts with OrfAB<ref name=":13" /><ref name=":21" />, is proposed to direct an OrfAB-junction complex to a randomly chosen target DNA to form SCBnt<ref name=":30" /><ref><nowiki><pubmed>18586933</pubmed></nowiki></ref>. This is based on the observation that integration of the transposon circle intermediate is greatly stimulated by preincubation of OrfAB and OrfA in an ''in vitro'' reaction<ref name=":31"><nowiki><pubmed>9463394</pubmed></nowiki></ref>.
  
 
====The Transposition Pathway====
 
The IS''3'' family is one of an increasing number of IS families known to transpose using a double-stranded circular DNA intermediate. Closely related pathways have been demonstrated for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1 IS''1'']<ref name=":45"><pubmed>7489730</pubmed></ref>, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2'']<ref name=":5" />, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3'']<ref><pubmed>15493331</pubmed></ref>, and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS150 IS''150'']<ref name=":32"><pubmed>12374815</pubmed></ref>. This represents a major transposition pathway which has yet to be widely recognized. As shown in [[:File:Fig. IS3.9A.png|Fig.IS3.9]] and '''the animation below''', IS''3'' family transposition proceeds through a '''copy-out-paste-in process'''.
+
The IS''3'' family is one of an increasing number of IS families known to transpose using a double-stranded circular DNA intermediate. Closely related pathways have been demonstrated for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1 IS''1'']<ref name=":45"><nowiki><pubmed>7489730</pubmed></nowiki></ref>, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2'']<ref name=":5" />, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3'']<ref><nowiki><pubmed>15493331</pubmed></nowiki></ref>, and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS150 IS''150'']<ref name=":32"><nowiki><pubmed>12374815</pubmed></nowiki></ref>. This represents a major transposition pathway which has yet to be widely recognized. As shown in [[:File:Fig. IS3.9A.png|Fig.IS3.9]] and '''the animation below''', IS''3'' family transposition proceeds through a '''copy-out-paste-in process'''.
 
<center>
 
<center>
 
{| class="wikitable"
 
{| class="wikitable"
Line 185: Line 185:
  
 
====The Figure-eight form====
 
The initial step is recognition of the IR by OrfAB (presumably during its translation) ('''[https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] movie above''') and assembly of SCA to correctly position the DNA ends and the transposase catalytic site for the subsequent chemical steps. Like all known DDE transposase-catalyzed reactions<ref><pubmed>26104718</pubmed></ref>, IS''911'' transposition proceeds by cleavage of a single strand at the transposon end generating a 3’-OH. This then attacks a target phosphodiester bond in a strand transfer reaction. The particularity of this copy-out-paste-in mechanism is that initial cleavage occurs at only one transposon end, either left or right ([[:File:Fig. IS3.9A.png|Fig.IS3.9]]). This single liberated 3’-OH directs strand transfer to the same strand 3 bases 5’ to the other end of the element. This generates a molecule in which a single transposon strand is circularized to produce a single-strand bridge, generating a figure-eight structure on a circular plasmid donor molecule ([[:File:Fig. IS3.12.png|Fig.IS3.12]]) which can be easily observed ''in vivo''<ref name=":33"><pubmed>7590258</pubmed></ref>. The IRs are joined by the single-stranded bridge and separated by three bases derived from flanking DNA from either the left or right end. The 3 (or 4) bp direct repeats flanking the original insertion are not required for further transposition (as also shown for IS''3''<ref name=":34"><pubmed>10556026</pubmed></ref>) and an [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911'']-based transposon engineered to have different flanks generates a mixed population of figure-eight molecules with one or other flank sequence. Prevention of cleavage of one or other transposon end resulted in a homogeneous population carrying the 3 nt DNA flank associated with the mutant end, confirming that IRL can attack IRR and vice versa.
The reaction can be viewed as a one-ended site-specific transposition event. These initial steps can be accomplished by OrfAB alone. However, it should be noted that in the presence of OrfA, no figure-eight or IS circles could be detected by a simple gel assay ''in vivo'', although IS circles were found using a PCR approach<ref name=":27" />. This suggests that OrfA may play a role in negatively regulating the initiation of transposition. A similar conclusion has been reached for OrfA of IS''3''<ref><pubmed>9413996</pubmed></ref>. Alternatively, OrfA may stimulate the disappearance of figure-eight and IS circles (see below) since no effect of OrfA was observed on figure-eight formation ''in vitro''. Together with the fact that OrfAB is normally produced at low levels from a weak promoter<ref name=":4" />, this suggests that initiation of transposition to form the figure-eight intermediate may be stochastic.
+
The initial step is recognition of the IR by OrfAB (presumably during its translation) ('''[https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] movie above''') and assembly of SCA to correctly position the DNA ends and the transposase catalytic site for the subsequent chemical steps. Like all known DDE transposase-catalyzed reactions<ref><nowiki><pubmed>26104718</pubmed></nowiki></ref>, IS''911'' transposition proceeds by cleavage of a single strand at the transposon end generating a 3’-OH. This then attacks a target phosphodiester bond in a strand transfer reaction. The particularity of this copy-out-paste-in mechanism is that initial cleavage occurs at only one transposon end, either left or right ([[:File:Fig. IS3.9A.png|Fig.IS3.9]]). This single liberated 3’-OH directs strand transfer to the same strand 3 bases 5’ to the other end of the element. This generates a molecule in which a single transposon strand is circularized to produce a single-strand bridge, generating a figure-eight structure on a circular plasmid donor molecule ([[:File:Fig. IS3.12.png|Fig.IS3.12]]) which can be easily observed ''in vivo''<ref name=":33"><nowiki><pubmed>7590258</pubmed></nowiki></ref>. The IRs are joined by the single-stranded bridge and separated by three bases derived from flanking DNA from either the left or right end. The 3 (or 4) bp direct repeats flanking the original insertion are not required for further transposition (as also shown for IS''3''<ref name=":34"><nowiki><pubmed>10556026</pubmed></nowiki></ref>) and an [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911'']-based transposon engineered to have different flanks generates a mixed population of figure-eight molecules with one or other flank sequence.
Prevention of cleavage of one or other transposon end resulted in a homogeneous population carrying the 3 nt DNA flank associated with the mutant end, confirming that IRL can attack IRR and vice versa. The reaction can be viewed as a one-ended site-specific transposition event. These initial steps can be accomplished by OrfAB alone. However, it should be noted that in the presence of OrfA, no figure-eight or IS circles could be detected by a simple gel assay ''in vivo'', although IS circles were found using a PCR approach<ref name=":27" />. This suggests that OrfA may play a role in negatively regulating the initiation of transposition. A similar conclusion has been reached for OrfA of IS''3''<ref><nowiki><pubmed>9413996</pubmed></nowiki></ref>. Alternatively, OrfA may stimulate the disappearance of figure-eight and IS circles (see below) since no effect of OrfA was observed on figure-eight formation ''in vitro''. Together with the fact that OrfAB is normally produced at low levels from a weak promoter<ref name=":4" />, this suggests that initiation of transposition to form the figure-eight intermediate may be stochastic.
 
[[Image:Fig. IS3.12.png|thumb|center|680x680px|'''Fig. IS3.12.''' [[wikipedia:Agarose_gel_electrophoresis|Agarose gel electrophoresis]] of DNA extracts from cells carrying a donor plasmid in the presence of high levels of transposase. '''First panel (left).''' Cartoons of three [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911'']-related species. From top to bottom: the donor plasmid, the figure-eight molecule, and the IS circle. [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] is shown in green, plasmid backbone in black and the transposon ends as red dots. '''Second panel.''' [[wikipedia:Ethidium_bromide|Ethidium bromide]]-stained [[wikipedia:Agarose_gel_electrophoresis|agarose gel]] showing various DNA species, including the plasmid which was used to supply transposase. '''Third panel.''' Electron micrographs of [[wikipedia:RecA|RecA]]-coated figure-eight molecules and IS circles.|alt=]]
  
 
====The circular intermediate====
 
Kinetic data<ref name=":28" /><ref name=":33" /> indicate that the figure-eight gives rise to the circular transposon form, which can easily be detected ''in vivo'' and in which the IRs are abutted and separated by three base pairs of DNA flanking the original insertion ([[:File:Fig. IS3.9B.png|Fig.IS3.9]] and [[:File:Fig. IS3.12.png|Fig.IS3.12]]). As for figure-eight molecules, a transposon engineered to have different flanks generates a mixed population of transposon circles with one or the other 3 bp flank located at the junction<ref><pubmed>1334464</pubmed></ref>.
+
Kinetic data<ref name=":28" /><ref name=":33" /> indicate that the figure-eight gives rise to the circular transposon form, which can easily be detected ''in vivo'' and in which the IRs are abutted and separated by three base pairs of DNA flanking the original insertion ([[:File:Fig. IS3.9B.png|Fig.IS3.9]] and [[:File:Fig. IS3.12.png|Fig.IS3.12]]). As for figure-eight molecules, a transposon engineered to have different flanks generates a mixed population of transposon circles with one or the other 3 bp flank located at the junction<ref><nowiki><pubmed>1334464</pubmed></nowiki></ref>.
  
Studies ''in vivo'' using a labeling protocol and a temperature-sensitive plasmid as transposon donor demonstrated that conversion from the figure-eight to the transposon circle occurs by semiconservative replication where the circular intermediate is “copied out” leaving a copy in the transposon donor molecule<ref name=":35"><pubmed>15359283</pubmed></ref> ([[:File:Fig. IS3.9B.png|Fig.IS3.9]]). This is transposon-specific, requires OrfAB (presumably to generate the figure eight and generate a 3’-OH on the [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] DNA flank) and does not depend on replication from the donor plasmid origin of replication<ref name=":35" />.
+
Studies ''in vivo'' using a labeling protocol and a temperature-sensitive plasmid as transposon donor demonstrated that conversion from the figure-eight to the transposon circle occurs by semiconservative replication where the circular intermediate is “copied out” leaving a copy in the transposon donor molecule<ref name=":35"><nowiki><pubmed>15359283</pubmed></nowiki></ref> ([[:File:Fig. IS3.9B.png|Fig.IS3.9]]). This is transposon-specific, requires OrfAB (presumably to generate the figure eight and generate a 3’-OH on the [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] DNA flank) and does not depend on replication from the donor plasmid origin of replication<ref name=":35" />.
  
Using donor plasmids where one or other IR was inactivated for cleavage would be expected to determine whether one or other of the 3’-OH is used in transposon replication. This was tested using the [[wikipedia:Replication_terminator_Tus_family|Tus/ter system]]<ref><pubmed>8021197</pubmed></ref><ref><pubmed>2181438</pubmed></ref><ref><pubmed>2510933</pubmed></ref><ref><pubmed>16148308</pubmed></ref> (which blocks passage of a replication fork in an orientation specific fashion) cloned into the transposon in either one or other orientation. In the presence of [[wikipedia:Replication_terminator_Tus_family|Tus protein]], no transposon circles were observed if the orientation of the ter site was that expected to block replication from one or the other end<ref name=":35" />.
+
Using donor plasmids where one or other IR was inactivated for cleavage would be expected to determine whether one or other of the 3’-OH is used in transposon replication. This was tested using the [[wikipedia:Replication_terminator_Tus_family|Tus/ter system]]<ref><nowiki><pubmed>8021197</pubmed></nowiki></ref><ref><nowiki><pubmed>2181438</pubmed></nowiki></ref><ref><nowiki><pubmed>2510933</pubmed></nowiki></ref><ref><nowiki><pubmed>16148308</pubmed></nowiki></ref> (which blocks passage of a replication fork in an orientation specific fashion) cloned into the transposon in either one or other orientation. In the presence of [[wikipedia:Replication_terminator_Tus_family|Tus protein]], no transposon circles were observed if the orientation of the ter site was that expected to block replication from one or the other end<ref name=":35" />.
  
At present, it is not known how OrfAB is removed or how this replication step is initiated or terminated to generate the final circles. It is possible that these processes involve host factors and mechanisms similar to those that operate in replicative transposition of [[wikipedia:Bacteriophage_Mu|bacteriophage Mu]] (see <ref><pubmed>26104374</pubmed></ref><ref><pubmed>12770828</pubmed></ref><ref><pubmed>11459960</pubmed></ref>).
+
At present, it is not known how OrfAB is removed or how this replication step is initiated or terminated to generate the final circles. It is possible that these processes involve host factors and mechanisms similar to those that operate in replicative transposition of [[wikipedia:Bacteriophage_Mu|bacteriophage Mu]] (see <ref><nowiki><pubmed>26104374</pubmed></nowiki></ref><ref><nowiki><pubmed>12770828</pubmed></nowiki></ref><ref><nowiki><pubmed>11459960</pubmed></nowiki></ref>).
  
[https://www.uniprot.org/uniprot/P24230 RecG helicase] is implicated in targeted insertion. This process involves a target [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] end: strand transfer occurs between one cleaved end of the IS circle and the target IS end to create an intermolecular single-strand bridge rather than the intramolecular bridge of the figure-eight intermediate ([[:File:Fig. IS3.13.png|Fig.IS3.13]]). Resolution of this structure implicates branch migration and replication from the donor plasmid<ref name=":36"><pubmed>15306008</pubmed></ref>. This reinforces the idea that host proteins, including components of the replication machinery, are loaded onto figure-eight intermediates.
+
[https://www.uniprot.org/uniprot/P24230 RecG helicase] is implicated in targeted insertion. This process involves a target [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] end: strand transfer occurs between one cleaved end of the IS circle and the target IS end to create an intermolecular single-strand bridge rather than the intramolecular bridge of the figure-eight intermediate ([[:File:Fig. IS3.13.png|Fig.IS3.13]]). Resolution of this structure implicates branch migration and replication from the donor plasmid<ref name=":36"><nowiki><pubmed>15306008</pubmed></nowiki></ref>. This reinforces the idea that host proteins, including components of the replication machinery, are loaded onto figure-eight intermediates.
 
[[Image:Fig. IS3.13.png|thumb|center|620x620px|'''Fig. IS3.13.''' ''In vitro'' reactions were performed using purified [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] circles which included a chloramphenicol resistance gene and a plasmid target with a promoterless ''lacZ'' gene.
 
Following a standard ''in vitro'' reaction, the reaction mixture was used to transform competent ''[[wikipedia:Escherichia_coli|E. coli]]'' with selection for chloramphenicol resistance. Lines on the interior and exterior of the plasmid circle represent different orientations of insertion.|alt=]]
  
 
====Integration of the circular intermediate====
 
The IR junction formed by IS circularization is very unstable in the presence of OrfAB and undergoes high levels of deletion and insertion ''in vivo''<ref name=":37"><pubmed>9214651</pubmed></ref> and ''in vitro''<ref name=":31" />. Transposon circle insertion presumably requires further transposase synthesis.
+
The IR junction formed by IS circularization is very unstable in the presence of OrfAB and undergoes high levels of deletion and insertion ''in vivo''<ref name=":37"><nowiki><pubmed>9214651</pubmed></nowiki></ref> and ''in vitro''<ref name=":31" />. Transposon circle insertion presumably requires further transposase synthesis.
  
 
A remarkable consequence of transposon circle formation is the assembly of a strong promoter, pjunc, from a –35 hexamer contributed by '''IRR''' and a –10 hexamer contributed by '''IRL''' ([[:File:Fig. IS3.3.png|Fig.IS3.3 B]]). The 3 (or more rarely 4) bp which separate '''IRL''' and '''IRR''' in the circle provide an ideal spacing between the –35 and –10 elements<ref name=":37" />. The junction promoter, pjunc, is 30- to 50-fold stronger than the indigenous promoter, pIRL<ref name=":37" /> ([[:File:Fig. IS3.4.png|Fig.IS3.4]]), and more than two-fold stronger than ''lacUV5''<ref name=":9" />. It is correctly placed to drive high levels of transposase synthesis and plays an active role in controlling [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] transposition.
Line 208: Line 208:
 
Inactivation of pjunc by mutagenesis strongly reduced [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] transposition ''in vivo'' when transposase was expressed in its native configuration<ref name=":9" />. Moreover, the truncated OrfAB derivative, OrfAB[1-149], which specifically binds '''IRR''' and '''IRL''', reduced ''in vivo'' promoter activity 10-fold in a mutated junction resistant to cleavage. Full-length OrfAB, which binds the IR only weakly, and OrfA, which does not specifically bind the IR, had no effect<ref name=":9" />. Integration results in disassembly of pjunc, providing a powerful feedback mechanism that leads to transient and controlled activation of integration only in the presence of the correct (circular) intermediate.
  
For the related [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''], this junction promoter is required for transposition<ref><pubmed>14729714</pubmed></ref>.
+
For the related [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''], this junction promoter is required for transposition<ref><nowiki><pubmed>14729714</pubmed></nowiki></ref>.
  
 
Circle junction formation brings both transposon ends together in an inverted orientation. This active junction must then participate in the second type of synaptic complex, which includes target DNA ([[:File:Fig. IS3.9B.png|Fig.IS3.9]] and [[:File:Fig. IS3.11.png|Fig.IS3.11 B]]).
Line 215: Line 215:
 
The final step requires OrfAB but is greatly stimulated by OrfA and is sensitive to the ratio of OrfAB/OrfA<ref name=":31" />.
 
  
It is not known whether target capture occurs before or after cleavage of the circle junction, although it has been observed that linear copies of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] are produced from transposon circles ''in vitro'' and, in the presence of high OrfAB levels, ''in vivo'', and that a pre-cleaved linear transposon was a robust substrate for integration ''in vitro''<ref><pubmed>10320583</pubmed></ref>.
+
It is not known whether target capture occurs before or after cleavage of the circle junction, although it has been observed that linear copies of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] are produced from transposon circles ''in vitro'' and, in the presence of high OrfAB levels, ''in vivo'', and that a pre-cleaved linear transposon was a robust substrate for integration ''in vitro''<ref><nowiki><pubmed>10320583</pubmed></nowiki></ref>.
  
 
Based on kinetics and on the formation of the strong pjunc promoter, we favor a model in which the IS circles represent a reservoir of transposition intermediates and in which linear forms are generated from the IS circles during the integration process.
Line 222: Line 222:
  
 
====Targeted Insertion====
 
As stated above, several IS including IS''911'' show a preference for integration next to sequences in the target similar to their IR. One way of understanding this is that the transposon circle is able to form a synaptic complex (SCBt; [[:File:Fig. IS3.11.png|Fig.IS3.11 B]] left) which is similar to SCA ([[:File:Fig. IS3.11.png|Fig.IS3.11 A]]) but which occurs “in trans” between an IR of the transposon circle and an IR in the target. In the case of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''], this phenomenon occurs more frequently if OrfA is not present ([[:File:Fig. IS3.13.png|Fig.IS3.13]]) and it was proposed that one role of OrfA is to promote dispersion of the IS<ref name=":30" /><ref name=":38"><pubmed>12145217</pubmed></ref>.
+
As stated above, several IS including IS''911'' show a preference for integration next to sequences in the target similar to their IR. One way of understanding this is that the transposon circle is able to form a synaptic complex (SCBt; [[:File:Fig. IS3.11.png|Fig.IS3.11 B]] left) which is similar to SCA ([[:File:Fig. IS3.11.png|Fig.IS3.11 A]]) but which occurs “in trans” between an IR of the transposon circle and an IR in the target. In the case of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''], this phenomenon occurs more frequently if OrfA is not present ([[:File:Fig. IS3.13.png|Fig.IS3.13]]) and it was proposed that one role of OrfA is to promote dispersion of the IS<ref name=":30" /><ref name=":38"><nowiki><pubmed>12145217</pubmed></nowiki></ref>.
  
This type of one-ended intermolecular recombination/integration has been analyzed in some detail<ref name=":36" /><ref name=":38" /><ref><pubmed>14756780</pubmed></ref>.
+
This type of one-ended intermolecular recombination/integration has been analyzed in some detail<ref name=":36" /><ref name=":38" /><ref><nowiki><pubmed>14756780</pubmed></nowiki></ref>.
  
 
IR-targeted insertion involves the transfer of a single end of the junction to the target IR to generate a branched DNA structure. The single-end transfer (SET) intermediate, but not the final insertion product, was detected ''in vitro''. This implies that SET intermediates must be processed by the bacterial host to obtain the final insertion products. Sequence analysis of ''in vitro'' and ''in vivo'' IR-targeted insertion products revealed high levels of DNA sequence conversion in which mutations from one IR were transferred to another. These sequence changes could not be explained by the classic transposition pathway but could be understood in terms of a mechanism in which SET generates a four-way [[wikipedia:Holliday_junction|Holliday-like junction]] which is then processed by host-mediated branch migration, resolution, repair and replication. This pathway resembles those described for processing other branched DNA structures such as stalled replication forks.
 
IR-targeted insertion involves the transfer of a single end of the junction to the target IR to generate a branched DNA structure. The single-end transfer (SET) intermediate, but not the final insertion product, was detected ''in vitro''. This implies that SET intermediates must be processed by the bacterial host to obtain the final insertion products. Sequence analysis of ''in vitro'' and ''in vivo'' IR-targeted insertion products revealed high levels of DNA sequence conversion in which mutations from one IR were transferred to another. These sequence changes could not be explained by the classic transposition pathway but could be understood in terms of a mechanism in which SET generates a four-way [[wikipedia:Holliday_junction|Holliday-like junction]] which is then processed by host-mediated branch migration, resolution, repair and replication. This pathway resembles those described for processing other branched DNA structures such as stalled replication forks.
Line 231: Line 231:
  
 
====Mechanism in other family members====
 
====Mechanism in other family members====
Several other members of this family have also been analysed in some detail. These include [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3''], and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS150 IS''150'']. All three have been shown to generate circles when supplied with high levels of the fused frame Tpase<ref name=":3" /><ref name=":5" /><ref name=":32" /><ref name=":34" /><ref name=":39"><pubmed>8550559</pubmed></ref>.
+
Several other members of this family have also been analysed in some detail. These include [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3''], and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS150 IS''150'']. All three have been shown to generate circles when supplied with high levels of the fused frame Tpase<ref name=":3" /><ref name=":5" /><ref name=":32" /><ref name=":34" /><ref name=":39"><nowiki><pubmed>8550559</pubmed></nowiki></ref>.
  
 
IS''3'' also generates adjacent deletions<ref name=":3" /> but, unlike [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''], appears to undergo excision from the donor molecule as a linear form following a staggered double strand break at each end. These forms have a 3 base 5' overhang and may be an alternative type of transposition intermediate<ref name=":39" />. Such forms may be equivalent to the linear [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] species derived from transposon circles. In addition, IS''3''-derivative transposons in which two abutted ends have been engineered undergo high levels of transposition<ref name=":10" />.  
 
IS''3'' also generates adjacent deletions<ref name=":3" /> but, unlike [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''], appears to undergo excision from the donor molecule as a linear form following a staggered double strand break at each end. These forms have a 3 base 5' overhang and may be an alternative type of transposition intermediate<ref name=":39" />. Such forms may be equivalent to the linear [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] species derived from transposon circles. In addition, IS''3''-derivative transposons in which two abutted ends have been engineered undergo high levels of transposition<ref name=":10" />.  
Line 237: Line 237:
 
Insertion of IS''3'' creates generally 3 and sometimes 4 bp direct target repeats. It is significant that plasmids in which the '''IRs''' are separated by 4 bp are more active than those separated by 8 bp. In these studies, the authors were unable to engineer derivatives with two complete tandem IS''3'' elements. This may be the result of the formation of a strong hybrid promoter which, as described for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] and other ISs (see above), drives high levels of Tpase expression. This configuration of ends is equivalent to that found at the circle junction and suggests that abutted ends of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3''] are also efficient substrates in transposition.  
 
Insertion of IS''3'' creates generally 3 and sometimes 4 bp direct target repeats. It is significant that plasmids in which the '''IRs''' are separated by 4 bp are more active than those separated by 8 bp. In these studies, the authors were unable to engineer derivatives with two complete tandem IS''3'' elements. This may be the result of the formation of a strong hybrid promoter which, as described for [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] and other ISs (see above), drives high levels of Tpase expression. This configuration of ends is equivalent to that found at the circle junction and suggests that abutted ends of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS3 IS''3''] are also efficient substrates in transposition.  
  
[https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''] generates direct target duplications of 5 bp on insertion<ref><pubmed>375194</pubmed></ref> although transposon circles generated with this element carry only a single base pair separating '''IRL''' and '''IRR'''<ref name=":5" />.
+
[https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''] generates direct target duplications of 5 bp on insertion<ref><nowiki><pubmed>375194</pubmed></nowiki></ref> although transposon circles generated with this element carry only a single base pair separating '''IRL''' and '''IRR'''<ref name=":5" />.
  
 
While [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''] carries a conserved terminal '''5'<nowiki/>''' -CA- '''3'<nowiki/>''' at its right end, the left end terminates with '''5'<nowiki/>''' -TG- '''3''''. This atypical IRL does not act as a strand donor but uniquely as a target in the circularization reaction.  
 
While [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''] carries a conserved terminal '''5'<nowiki/>''' -CA- '''3'<nowiki/>''' at its right end, the left end terminates with '''5'<nowiki/>''' -TG- '''3''''. This atypical IRL does not act as a strand donor but uniquely as a target in the circularization reaction.  
Line 245: Line 245:
 
It does not appear to bind '''IRR''' (note that in the original article the authors invert the standard definition of IRL and IRR<ref name=":8" />).

It does not appear to bind '''IRR''' (note that in the original article the authors invert the standard definition of IRL and IRR<ref name=":8" />).
  
Several other elements also exhibit small inverted repeat sequences which flank the -10 hexamer of the putative resident Tpase promoter. [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2'']-derivative transposons in which two abutted ends have been engineered also undergo high levels of transposition<ref name=":5" /><ref><pubmed>8676870</pubmed></ref> and, like [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''], the circle junction of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''] also constitutes a strong promoter capable of driving Tpase expression. Several (but not all) IS''3''-family elements may also carry similarly located potential -35 and -10 sequences within their IRs.  
+
Several other elements also exhibit small inverted repeat sequences which flank the -10 hexamer of the putative resident Tpase promoter. [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2'']-derivative transposons in which two abutted ends have been engineered also undergo high levels of transposition<ref name=":5" /><ref><nowiki><pubmed>8676870</pubmed></nowiki></ref> and, like [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''], the circle junction of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS2 IS''2''] also constitutes a strong promoter capable of driving Tpase expression. Several (but not all) IS''3''-family elements may also carry similarly located potential -35 and -10 sequences within their IRs.  
  
 
===Structural studies===
 
===Structural studies===
Although there are at present no structural data available for any members of this family, recent results obtained with an IS from another family, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISCth4 IS''Cth4''] from the [[IS Families/IS256 family|IS''256'' family]], which also undergoes '''copy-out-paste-in''' transposition, have provided some insights <ref><pubmed>33006208</pubmed></ref>. This particular transposition pathway is asymmetric in the sense that one IS end is cleaved and attacks the opposite end several nucleotides from the tip <ref name=":33" />. In accord with this type of mechanism, crystal structures of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISCth4 IS''Cth4''] transposase bound to three different substrates show a transposase dimer bound asymmetrically to a single DNA substrate: a pre-reaction substrate with '''IRR''' together with its flanking DNA, a pre-cleaved complex in which the '''IRR''' flank had been removed and a strand transfer complex including an abutted '''IRR''' and '''IRL''' separated by a gapped 6 base pair linker ([[:File:IS256.8.png|Fig. IS256.8]]).
+
Although there are at present no structural data available for any members of this family, recent results obtained with an IS from another family, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISCth4 IS''Cth4''] from the [[IS Families/IS256 family|IS''256'' family]], which also undergoes '''copy-out-paste-in''' transposition, have provided some insights <ref><nowiki><pubmed>33006208</pubmed></nowiki></ref>. This particular transposition pathway is asymmetric in the sense that one IS end is cleaved and attacks the opposite end several nucleotides from the tip <ref name=":33" />. In accord with this type of mechanism, crystal structures of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISCth4 IS''Cth4''] transposase bound to three different substrates show a transposase dimer bound asymmetrically to a single DNA substrate: a pre-reaction substrate with '''IRR''' together with its flanking DNA, a pre-cleaved complex in which the '''IRR''' flank had been removed and a strand transfer complex including an abutted '''IRR''' and '''IRL''' separated by a gapped 6 base pair linker ([[:File:IS256.8.png|Fig. IS256.8]]).
  
It is important to note that [[IS Families/IS256 family|IS''256'' family]] transposases carry an alpha-helical insertion domain which separates the catalytic domain into two segments. This domain plays an important role in directing different DNA segments during the reaction. IS''3'' family transposases carry an uninterrupted catalytic domain without the alpha helical insertion domain implying that the atomic details of the process will differ. In this light, it is worth remembering that efficient insertion of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] transposon circles catalysed by OrfAB is greatly stimulated by inclusion of the upstream OrfA protein and is sensitive to the ratio of OrfAB/OrfA <ref><pubmed>9463394</pubmed></ref>.
+
It is important to note that [[IS Families/IS256 family|IS''256'' family]] transposases carry an alpha-helical insertion domain which separates the catalytic domain into two segments. This domain plays an important role in directing different DNA segments during the reaction. IS''3'' family transposases carry an uninterrupted catalytic domain without the alpha helical insertion domain implying that the atomic details of the process will differ. In this light, it is worth remembering that efficient insertion of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS911 IS''911''] transposon circles catalysed by OrfAB is greatly stimulated by inclusion of the upstream OrfA protein and is sensitive to the ratio of OrfAB/OrfA <ref><nowiki><pubmed>9463394</pubmed></nowiki></ref>.
 
===Excision: A dedicated enzyme===
 
===Excision: A dedicated enzyme===
<blockquote>This section has been published in a modified form as Chandler M, Ross K, Varani AM. '''The insertion sequence excision enhancer: A PrimPol-based primer invasion system for immobilizing transposon-transmitted antibiotic resistance genes'''. ''Mol Microbiol''. 2023 <ref><pubmed>37574851</pubmed></ref>.</blockquote>
+
<blockquote>This section has been published in a modified form as Chandler M, Ross K, Varani AM. '''The insertion sequence excision enhancer: A PrimPol-based primer invasion system for immobilizing transposon-transmitted antibiotic resistance genes'''. ''Mol Microbiol''. 2023 <ref><nowiki><pubmed>37574851</pubmed></nowiki></ref>.</blockquote>
  
  
The IS''3'' family insertion sequence IS''1203''v (similar to [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS629 IS''629'']), originally identified in a Shiga toxin 2 gene (''stx2'') of ''[[wikipedia:Escherichia_coli|Escherichia coli]]'' O157:H7 which it had insertionally inactivated, was found to undergo precise excision leading to ''stx2'' reactivation <ref><pubmed>10698782</pubmed></ref>. Curiously, excision of IS''1203''v occurred at a much higher frequency in some ''[[wikipedia:Escherichia_coli|E. coli]]'' hosts than in others <ref><pubmed>16233651</pubmed></ref>.
+
The IS''3'' family insertion sequence IS''1203''v (similar to [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS629 IS''629'']), originally identified in a Shiga toxin 2 gene (''stx2'') of ''[[wikipedia:Escherichia_coli|Escherichia coli]]'' O157:H7 which it had insertionally inactivated, was found to undergo precise excision leading to ''stx2'' reactivation <ref><nowiki><pubmed>10698782</pubmed></nowiki></ref>. Curiously, excision of IS''1203''v occurred at a much higher frequency in some ''[[wikipedia:Escherichia_coli|E. coli]]'' hosts than in others <ref><nowiki><pubmed>16233651</pubmed></nowiki></ref>.
 
<br />
 
<br />
  
 
====IS Excision is Stimulated by High Transposase Levels====
 
====IS Excision is Stimulated by High Transposase Levels====
Using a (single copy) [[wikipedia:Fertility_factor_(bacteria)|F plasmid]] derivative in which an [[wikipedia:Ampicillin|ampicillin]] resistance gene was interrupted by an IS''1203''v insertion (''bla''::IS''1203''v) to monitor precise excision rates (reversion to [[wikipedia:Ampicillin|ampicillin]] resistance) ([[:File:IS3.15.png|Fig. IS3.15 '''A''']]''')''', the authors showed that excision was 10<sup>5</sup> fold higher in [[wikipedia:Escherichia_coli|''Escherichia coli'' O157:H7]], known to carry a significant number of IS629 copies, compared to [[wikipedia:Escherichia_coli|''E. coli'' K12]] ([https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=511145 MG1655]), where it is absent <ref><pubmed>11258796</pubmed></ref><ref><pubmed>9628576</pubmed></ref>.
+
Using a (single copy) [[wikipedia:Fertility_factor_(bacteria)|F plasmid]] derivative in which an [[wikipedia:Ampicillin|ampicillin]] resistance gene was interrupted by an IS''1203''v insertion (''bla''::IS''1203''v) to monitor precise excision rates (reversion to [[wikipedia:Ampicillin|ampicillin]] resistance) ([[:File:IS3.15.png|Fig. IS3.15 '''A''']]''')''', the authors showed that excision was 10<sup>5</sup> fold higher in [[wikipedia:Escherichia_coli|''Escherichia coli'' O157:H7]], known to carry a significant number of IS629 copies, compared to [[wikipedia:Escherichia_coli|''E. coli'' K12]] ([https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=511145 MG1655]), where it is absent <ref><nowiki><pubmed>11258796</pubmed></nowiki></ref><ref><nowiki><pubmed>9628576</pubmed></nowiki></ref>.
  
 
Further studies using a number of ''E. coli'' isolates with and without IS''1203''v/IS''629'' copies supported the idea that excision was higher in those strains already carrying the IS.  
 
Further studies using a number of ''E. coli'' isolates with and without IS''1203''v/IS''629'' copies supported the idea that excision was higher in those strains already carrying the IS.  
  
In a modified experimental system in which various transposition functions were supplied ''in trans'' from a compatible plasmid ([[:File:IS3.15.png|Fig. IS3.15 '''B''']]''')''', deletion was observed to be very low or below the detection level in an [https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=511145 MG1655] host. In the [[wikipedia:Escherichia_coli_O157:H7|O157:H7]] strain, however, supplying the OrfAB transposase induced high deletion levels (~10<sup>-3</sup>) compared to that obtained with the empty vector (7.8x10<sup>-7</sup>) whereas supplying the ''orfA'', ''orfB'' and ''orfAB'' genes in their native configuration only resulted in a moderate frequency of excision (2.6x10<sup>-6</sup>) and supplying ''orfA'' alone depressed excision (3.8x10<sup>-9</sup>). This implies that the levels of available OrfAB transposase are a determining factor in excision. A survey of a number of strains showed that excision frequencies of strains possessing IS''1203v'' ([https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS629 IS''629'']) were on average 10<sup>3</sup> times higher than those not carrying the IS. High [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS629 IS''629''] excision frequencies were observed in a large number of clinical ''[[wikipedia:Escherichia_coli|E. coli]]'' isolates <ref><pubmed>16233651</pubmed></ref>.
+
In a modified experimental system in which various transposition functions were supplied ''in trans'' from a compatible plasmid ([[:File:IS3.15.png|Fig. IS3.15 '''B''']]''')''', deletion was observed to be very low or below the detection level in an [https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=511145 MG1655] host. In the [[wikipedia:Escherichia_coli_O157:H7|O157:H7]] strain, however, supplying the OrfAB transposase induced high deletion levels (~10<sup>-3</sup>) compared to that obtained with the empty vector (7.8x10<sup>-7</sup>) whereas supplying the ''orfA'', ''orfB'' and ''orfAB'' genes in their native configuration only resulted in a moderate frequency of excision (2.6x10<sup>-6</sup>) and supplying ''orfA'' alone depressed excision (3.8x10<sup>-9</sup>). This implies that the levels of available OrfAB transposase are a determining factor in excision. A survey of a number of strains showed that excision frequencies of strains possessing IS''1203v'' ([https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS629 IS''629'']) were on average 10<sup>3</sup> times higher than those not carrying the IS. High [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS629 IS''629''] excision frequencies were observed in a large number of clinical ''[[wikipedia:Escherichia_coli|E. coli]]'' isolates <ref><nowiki><pubmed>16233651</pubmed></nowiki></ref>.
  
 
<br />
 
<br />
Line 270: Line 270:
  
 
====A Dedicated Enzyme: Identification of a common reading frame ECs1305 in all high excision strains====
 
====A Dedicated Enzyme: Identification of a common reading frame ECs1305 in all high excision strains====
The authors identified a reading frame, [https://www.ncbi.nlm.nih.gov/gene/912859 ECs1305], present in all high excision strains that was absent in the low excision strains <ref name=":41"><pubmed>21224843</pubmed></ref>. In [[wikipedia:Escherichia_coli|EHEC O157]] it is located in large potential integrative elements that are similar to [https://www.ncbi.nlm.nih.gov/gene/8479696 SpLE1] of [[wikipedia:Escherichia_coli_O157:H7|EHEC O157]] <ref name=":46"><pubmed>24334665</pubmed></ref> and has probably been dispersed in this way. It was identified both in [[wikipedia:Escherichia_coli_O157:H7|enterohemorrhagic (EHEC)]] and [[wikipedia:Enterotoxigenic_Escherichia_coli|enterotoxigenic (ETEC)]] ''[[wikipedia:Escherichia_coli|E. coli]]'' strains but homologues were also identified by [https://blast.ncbi.nlm.nih.gov/Blast.cgi Blast] analysis in a broad range of bacteria including [[wikipedia:Alphaproteobacteria|Alpha]]-, [[wikipedia:Betaproteobacteria|Beta]]-, [[wikipedia:Gammaproteobacteria|Gamma]]-, [[wikipedia:Myxococcota|Delta]]- and [[wikipedia:Campylobacterota|Epsilon-proteobacteria]]; [[wikipedia:Bacteroides|Bacteroides]]; [[wikipedia:Green_sulfur_bacteria|Chlorobi]]; [[wikipedia:Cyanobacteria|Cyanobacteria]]; [[wikipedia:Bacillota|Firmicutes]]; [[wikipedia:Actinomycetota|Actinobacteria]]; and [[wikipedia:Verrucomicrobiota|Verrucomicrobia]]<ref name=":41" />.
+
The authors identified a reading frame, [https://www.ncbi.nlm.nih.gov/gene/912859 ECs1305], present in all high excision strains that was absent in the low excision strains <ref name=":41"><nowiki><pubmed>21224843</pubmed></nowiki></ref>. In [[wikipedia:Escherichia_coli|EHEC O157]] it is located in large potential integrative elements that are similar to [https://www.ncbi.nlm.nih.gov/gene/8479696 SpLE1] of [[wikipedia:Escherichia_coli_O157:H7|EHEC O157]] <ref name=":46"><nowiki><pubmed>24334665</pubmed></nowiki></ref> and has probably been dispersed in this way. It was identified both in [[wikipedia:Escherichia_coli_O157:H7|enterohemorrhagic (EHEC)]] and [[wikipedia:Enterotoxigenic_Escherichia_coli|enterotoxigenic (ETEC)]] ''[[wikipedia:Escherichia_coli|E. coli]]'' strains but homologues were also identified by [https://blast.ncbi.nlm.nih.gov/Blast.cgi Blast] analysis in a broad range of bacteria including [[wikipedia:Alphaproteobacteria|Alpha]]-, [[wikipedia:Betaproteobacteria|Beta]]-, [[wikipedia:Gammaproteobacteria|Gamma]]-, [[wikipedia:Myxococcota|Delta]]- and [[wikipedia:Campylobacterota|Epsilon-proteobacteria]]; [[wikipedia:Bacteroides|Bacteroides]]; [[wikipedia:Green_sulfur_bacteria|Chlorobi]]; [[wikipedia:Cyanobacteria|Cyanobacteria]]; [[wikipedia:Bacillota|Firmicutes]]; [[wikipedia:Actinomycetota|Actinobacteria]]; and [[wikipedia:Verrucomicrobiota|Verrucomicrobia]]<ref name=":41" />.
  
More recently it has been estimated that a highly conserved IEE gene copy is present in over 30% of available ''[[wikipedia:Escherichia_coli|E. coli]]'' genome assemblies and is very abundant not only within enterohemorrhagic and enterotoxigenic genomes but also within enteropathogenic ''[[wikipedia:Escherichia_coli|E. coli]]'' <ref name=":44"><pubmed>36715333</pubmed></ref>.
+
More recently it has been estimated that a highly conserved IEE gene copy is present in over 30% of available ''[[wikipedia:Escherichia_coli|E. coli]]'' genome assemblies and is very abundant not only within enterohemorrhagic and enterotoxigenic genomes but also within enteropathogenic ''[[wikipedia:Escherichia_coli|E. coli]]'' <ref name=":44"><nowiki><pubmed>36715333</pubmed></nowiki></ref>.
  
 
The [https://www.ncbi.nlm.nih.gov/gene/912859 ECs1305] gene was subsequently named '''''iee''''' for '''I'''S-'''e'''xcision '''e'''nhancer '''<ref name=":41" />'''. In EHEC O157, ECs1305/IEE is located in a large potential integrative element that is similar to SpLE1 and has probably been dispersed in this way '''<ref name=":46" />'''.
 
The [https://www.ncbi.nlm.nih.gov/gene/912859 ECs1305] gene was subsequently named '''''iee''''' for '''I'''S-'''e'''xcision '''e'''nhancer '''<ref name=":41" />'''. In EHEC O157, ECs1305/IEE is located in a large potential integrative element that is similar to SpLE1 and has probably been dispersed in this way '''<ref name=":46" />'''.
Line 279: Line 279:
 
When this reading frame was deleted, the IS excision frequency was greatly decreased but could be restored by reintroduction of a plasmid-carried ''iee'' copy. Moreover, DDE mutant transposases were used to demonstrate that the [https://www.ncbi.nlm.nih.gov/gene/912859 ECs1305]-promoted excision behavior was also dependent on an active transposase.

When this reading frame was deleted, the IS excision frequency was greatly decreased but could be restored by reintroduction of a plasmid-carried ''iee'' copy. Moreover, DDE mutant transposases were used to demonstrate that the [https://www.ncbi.nlm.nih.gov/gene/912859 ECs1305]-promoted excision behavior was also dependent on an active transposase.
  
The effect on excision frequency of a number of host genes was investigated and although some of these had been shown to affect excision of other ISs (e.g. <ref><pubmed>6287993</pubmed></ref><ref><pubmed>6322169</pubmed></ref><ref><pubmed>2981756</pubmed><br /></ref> ), the effects of the mutations were largely marginal ([[:File:FigIS3.15-new.png|Fig.IS3.16]]) and not as pronounced as those of mutations in [https://www.ncbi.nlm.nih.gov/gene/912859 ECs1305] <ref name=":41" />.
+
The effect on excision frequency of a number of host genes was investigated and although some of these had been shown to affect excision of other ISs (e.g. <ref><nowiki><pubmed>6287993</pubmed></nowiki></ref><ref><nowiki><pubmed>6322169</pubmed></nowiki></ref><ref><nowiki><pubmed>2981756</pubmed></nowiki><br /></ref> ), the effects of the mutations were largely marginal ([[:File:FigIS3.15-new.png|Fig.IS3.16]]) and not as pronounced as those of mutations in [https://www.ncbi.nlm.nih.gov/gene/912859 ECs1305] <ref name=":41" />.
 
[[File:FigIS3.15-new.png|alt=|center|thumb|680x680px|'''Fig.IS3.16.''' '''A Histogram showing the frequency of excision of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS629 IS''629''] in various Mutant ''E. coli'' Genetic Backgrounds.''' The assay consisted of a plasmid carrying an [[wikipedia:Ampicillin|ampicillin resistance]] (Ap<sup>r</sup>) gene inactivated by the insertion of a copy of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS629 IS''629''] whose transposase had been substituted for a [[wikipedia:Tetracycline|tetracycline resistance]] gene (Tc<sup>r</sup>). The plasmid also included the transposase gene placed under the control of an external promoter. Excision of the [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS629 IS''629''] derivative results in loss of Tc<sup>r</sup> and appearance of Ap<sup>r</sup>. Data were taken from <ref name=":41" />. The authors used a number of deletion mutants of genes that are thought to influence various aspects of transposition: IHF ('''i'''ntegration '''h'''ost '''f'''actor), HU, H-NS, FIS (factor for inversion stimulation), ClpXP5 protease, Lon protease, Dam, RecA, and RecBC. The precise IS''629'' excision frequency was examined in each mutant using the reporter plasmid-based assay. The ''hns'', ''dam'', and ''recB'' deletion mutants could not be generated in the original [[wikipedia:Escherichia_coli|''E. coli'' O157 Sakai strain]] and were generated in another ''[[wikipedia:Escherichia_coli|E. coli]]'' host carrying a chromosomally inserted '''iee''' gene.]]

[[File:FigIS3.15-new.png|alt=|center|thumb|680x680px|'''Fig.IS3.16.''' '''A Histogram showing the frequency of excision of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS629 IS''629''] in various Mutant ''E. coli'' Genetic Backgrounds.''' The assay consisted of a plasmid carrying an [[wikipedia:Ampicillin|ampicillin resistance]] (Ap<sup>r</sup>) gene inactivated by the insertion of a copy of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS629 IS''629''] whose transposase had been substituted for a [[wikipedia:Tetracycline|tetracycline resistance]] gene (Tc<sup>r</sup>). The plasmid also included the transposase gene placed under the control of an external promoter. Excision of the [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS629 IS''629''] derivative results in loss of Tc<sup>r</sup> and appearance of Ap<sup>r</sup>. Data were taken from <ref name=":41" />. The authors used a number of deletion mutants of genes that are thought to influence various aspects of transposition: IHF ('''i'''ntegration '''h'''ost '''f'''actor), HU, H-NS, FIS (factor for inversion stimulation), ClpXP5 protease, Lon protease, Dam, RecA, and RecBC. The precise IS''629'' excision frequency was examined in each mutant using the reporter plasmid-based assay. The ''hns'', ''dam'', and ''recB'' deletion mutants could not be generated in the original [[wikipedia:Escherichia_coli|''E. coli'' O157 Sakai strain]] and were generated in another ''[[wikipedia:Escherichia_coli|E. coli]]'' host carrying a chromosomally inserted '''iee''' gene.]]
  
Line 317: Line 317:
 
[[File:FigIS3.20.png|center|thumb|680x680px|'''Fig.IS3.21.''' '''Model Proposed for generating different [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS629 IS''629''] deletion'''s (from <ref name=":41" />). Transposon sequences are shown in green; transposon ends are shown as red circles and the neighboring host DNA as black lines. The arrows show attack by one IS end at the opposite end that occurs during copy-out-paste-in transposition. '''i)''' the normal productive pathway leading to the formation of a circular IS intermediate and regenerating the original donor replicon <ref name=":42" />. Note that no IS loss occurs. '''ii)''' “sloppy” attack at different positions in the donor replicon. Note that from what is known of the copy-out-paste-in transposition mechanism, this would not be expected to result in IS excision.]]

[[File:FigIS3.20.png|center|thumb|680x680px|'''Fig.IS3.21.''' '''Model Proposed for generating different [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS629 IS''629''] deletion'''s (from <ref name=":41" />). Transposon sequences are shown in green; transposon ends are shown as red circles and the neighboring host DNA as black lines. The arrows show attack by one IS end at the opposite end that occurs during copy-out-paste-in transposition. '''i)''' the normal productive pathway leading to the formation of a circular IS intermediate and regenerating the original donor replicon <ref name=":42" />. Note that no IS loss occurs. '''ii)''' “sloppy” attack at different positions in the donor replicon. Note that from what is known of the copy-out-paste-in transposition mechanism, this would not be expected to result in IS excision.]]
  
Calvo and colleagues <ref name=":44" /> provide a model for '''IEE''' activity based on their biochemical results in which they propose that '''IEE''' pairs two 5’ resected DNA ends in a reaction which is facilitated by the C-terminal [[wikipedia:Helicase|helicase]] domain, allowing the '''AEP domain''' to accomplish a “''filling in''” polymerization reaction. The model invokes an intermediate which is thought to occur during a cut-out-paste-in transposition pathway which leaves a blunt ended double strand break in the donor DNA molecule (see <ref><pubmed>1316613</pubmed></ref>). Moreover, although it is not pointed out, most IS generate short, direct target repeat sequences on insertion (see: [[General Information/What Is an IS?|General Information: What is an IS?]]) and would therefore provide 3’ terminal microhomologies upon 5’ resection.
+
Calvo and colleagues <ref name=":44" /> provide a model for '''IEE''' activity based on their biochemical results in which they propose that '''IEE''' pairs two 5’ resected DNA ends in a reaction which is facilitated by the C-terminal [[wikipedia:Helicase|helicase]] domain, allowing the '''AEP domain''' to accomplish a “''filling in''” polymerization reaction. The model invokes an intermediate which is thought to occur during a cut-out-paste-in transposition pathway which leaves a blunt ended double strand break in the donor DNA molecule (see <ref><nowiki><pubmed>1316613</pubmed></nowiki></ref>). Moreover, although it is not pointed out, most IS generate short, direct target repeat sequences on insertion (see: [[General Information/What Is an IS?|General Information: What is an IS?]]) and would therefore provide 3’ terminal microhomologies upon 5’ resection.
  
However, in addition to IS''3'' family members, Kusumoto et al., <ref name=":41" /> show that '''IEE''' also stimulates excision of [[IS Families/IS1 family|IS''1'']] and [[IS Families/IS30 family|IS''30'' family]] members ([[:File:FigIS3.21.png|Fig. IS3.22]]), both of which can transpose using a copy-out-paste-in mechanism <ref name=":45" /><ref><pubmed>18022196</pubmed></ref>. A number of the IS exhibit a measurable level of excision in the absence of the '''''iee'' gene'''. Interestingly, it was proposed, based on identification of ''in vivo'' transposase-induced structures, that [[IS Families/IS1 family|IS''1'']] can transpose using a number of alternative pathways <ref name=":45" />: [[IS Families/IS1 family|IS''1'']] shows a low basal level of excision which is further stimulated by '''IEE'''. It is possible that excision in the absence of '''''iee''''' occurs by a different mechanism <ref><pubmed>455447</pubmed></ref><ref><pubmed>6260376</pubmed></ref> as occurs in the [[IS Families/IS4 and related families|IS''4'' family member]], IS''10''. In this light, it is noteworthy that IS''4'' itself shows a low level of '''''iee'''''-independent excision which is not affected by '''''iee''''' ([[:File:FigIS3.21.png|Fig. IS3.22]]). Additionally, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS5 IS''5''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS26 IS''26''] and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS621 IS''621''], none of which use a copy-out-paste-in transposition mechanism, were not observed to excise even in the presence of '''''iee'''''.
+
However, in addition to IS''3'' family members, Kusumoto et al., <ref name=":41" /> show that '''IEE''' also stimulates excision of [[IS Families/IS1 family|IS''1'']] and [[IS Families/IS30 family|IS''30'' family]] members ([[:File:FigIS3.21.png|Fig. IS3.22]]), both of which can transpose using a copy-out-paste-in mechanism <ref name=":45" /><ref><nowiki><pubmed>18022196</pubmed></nowiki></ref>. A number of the IS exhibit a measurable level of excision in the absence of the '''''iee'' gene'''. Interestingly, it was proposed, based on identification of ''in vivo'' transposase-induced structures, that [[IS Families/IS1 family|IS''1'']] can transpose using a number of alternative pathways <ref name=":45" />: [[IS Families/IS1 family|IS''1'']] shows a low basal level of excision which is further stimulated by '''IEE'''. It is possible that excision in the absence of '''''iee''''' occurs by a different mechanism <ref><nowiki><pubmed>455447</pubmed></nowiki></ref><ref><nowiki><pubmed>6260376</pubmed></nowiki></ref> as occurs in the [[IS Families/IS4 and related families|IS''4'' family member]], IS''10''. In this light, it is noteworthy that IS''4'' itself shows a low level of '''''iee'''''-independent excision which is not affected by '''''iee''''' ([[:File:FigIS3.21.png|Fig. IS3.22]]). Additionally, [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS5 IS''5''], [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS26 IS''26''] and [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS621 IS''621''], none of which use a copy-out-paste-in transposition mechanism, were not observed to excise even in the presence of '''''iee'''''.
  
 
Thus the common mechanistic property of those IS whose excision is stimulated by '''IEE''' is that they all use a copy-out pathway.
 
 
[[File:FigIS3.21.png|center|thumb|680x680px|'''Fig. IS3.22.''' Relative IS Excision Frequencies. A number of different IS were examined for '''IEE'''-stimulated excision in ''[[wikipedia:Escherichia_coli|E. coli]]'' K-12 with and without a cloned '''''iee''''' copy. The assay consisted of a plasmid carrying an [[wikipedia:Ampicillin|ampicillin resistance]] (Ap<sup>r</sup>) gene inactivated by insertion of a copy of the IS being tested, whose transposase gene had been replaced by a [[wikipedia:Tetracycline|tetracycline resistance]] gene (Tc<sup>r</sup>). The plasmid also included the transposase gene placed under the control of an external promoter. Excision of the [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS629 IS''629''] derivative results in loss of Tc<sup>r</sup> and appearance of Ap<sup>r</sup>. Data are from Kusumoto et al., <ref name=":41" />.]]
 
An alternative explanation to that of Kusumoto et al., <ref name=":41" />, is that the deletions are generated not during the first strand transfer step of copy-out-paste-in transposition as proposed ([[:File:FigIS3.20.png|Fig. IS3.20]]), but during the second, replication step (copy-out) by slippage and realignment of the replication primer, as was later proposed for the deletion of [[IS Families/IS30 family#ISApl1|IS''30'' family IS''Apl1'']] copies flanking the [[wikipedia:Colistin|colistin resistance]] gene, ''mcr'', in [https://tncentral.ncc.unesp.br/cgi-bin/tn_report.pl?id=Tn6330-KX772391 Tn''6330''] <ref name=":43"><pubmed>29440577</pubmed></ref> (see also [[:File:IS30.10.png|Fig. IS30.10]]). This strand switching or primer invasion model had been proposed for the deletion of [[IS Families/IS30 family#ISApl1|IS''30'' family IS''Apl1'']] copies flanking the [[wikipedia:Colistin|colistin]] resistance gene, ''mcr'', in [https://tncentral.ncc.unesp.br/cgi-bin/tn_report.pl?id=Tn6330-KX772391 Tn''6330''] <ref name=":47"><pubmed>27620479</pubmed></ref>.
+
An alternative explanation to that of Kusumoto et al., <ref name=":41" />, is that the deletions are generated not during the first strand transfer step of copy-out-paste-in transposition as proposed ([[:File:FigIS3.20.png|Fig. IS3.20]]), but during the second, replication step (copy-out) by slippage and realignment of the replication primer, as was later proposed for the deletion of [[IS Families/IS30 family#ISApl1|IS''30'' family IS''Apl1'']] copies flanking the [[wikipedia:Colistin|colistin resistance]] gene, ''mcr'', in [https://tncentral.ncc.unesp.br/cgi-bin/tn_report.pl?id=Tn6330-KX772391 Tn''6330''] <ref name=":43"><nowiki><pubmed>29440577</pubmed></nowiki></ref> (see also [[:File:IS30.10.png|Fig. IS30.10]]). This strand switching or primer invasion model had been proposed for the deletion of [[IS Families/IS30 family#ISApl1|IS''30'' family IS''Apl1'']] copies flanking the [[wikipedia:Colistin|colistin]] resistance gene, ''mcr'', in [https://tncentral.ncc.unesp.br/cgi-bin/tn_report.pl?id=Tn6330-KX772391 Tn''6330''] <ref name=":47"><nowiki><pubmed>27620479</pubmed></nowiki></ref>.
  
 
[[File:IS3.23.png|center|thumb|920x920px|'''Fig. IS3.23. Alignment showing the decay of multiple different instances of [https://tncentral.ncc.unesp.br/cgi-bin/tn_report.pl?id=Tn6330-KX772391 Tn''6330'']. The figure is redrawn and modified from Snesrud et al., 2018 <ref name=":43" />. (A)''' Sequence of four parental [https://tncentral.ncc.unesp.br/cgi-bin/tn_report.pl?id=Tn6330-KX772391 Tn''6330'']-carrying plasmids. The conserved, ancestral CG dinucleotide on the inside end (IE) of the downstream [[IS Families/IS30 family#ISApl1|IS''Apl1'']] is indicated with a black triangle. The 2 bases at the end of the right-hand [[IS Families/IS30 family#ISApl1|IS''Apl1'']] that are part of the DR generated by insertion of the entire [https://tncentral.ncc.unesp.br/cgi-bin/tn_report.pl?id=Tn6330-KX772391 Tn''6330''] are over-scored in black. The half arrows above indicate the [[IS Families/IS30 family#ISApl1|IS''Apl1'']] IRL and IRR sequences defining the ends of the downstream IS. The bases upstream and downstream that are retained after [[IS Families/IS30 family#ISApl1|IS''Apl1'']] loss are highlighted in salmon and purple, respectively. The deletion joints upstream and downstream of the ISApl1 are boxed. The Roman numerals on the left are correlated with the corresponding deletion product in '''C)'''. '''(B)''' Cartoon showing the structure of [https://tncentral.ncc.unesp.br/cgi-bin/tn_report.pl?id=Tn6330-KX772391 Tn''6330''] and, below, the generic structure of the product in which the downstream [[IS Families/IS30 family#ISApl1|IS''Apl1'']] copy has been deleted. The IS is shown as a blue box. The triangles at each end represent the left ('''IRL''') and right ('''IRR''') terminal inverted repeats. The transposase open reading frame is shown by a blue horizontal arrow. ''mcr''-1 and pap2 reading frames are shown as red and white horizontal arrows, respectively. 
Thin black lines above, situate the downstream [[IS Families/IS30 family#ISApl1|IS''Apl1'']] copy to the sequences in '''A)''' and the dotted lines below, to the deletion junction in '''C). (C)''' Corresponding deletion products. The sequences highlighted in grey show the sequence of plasmids with an empty site. The bases upstream and downstream of the deletion that are retained after [[IS Families/IS30 family#ISApl1|IS''Apl1'']] loss are highlighted in red and blue, respectively, while the remaining copy of the deletion joint that is retained after the two ends are joined following [[IS Families/IS30 family#ISApl1|IS''Apl1'']] excision is highlighted in green and encased in a black rectangle. The Roman numerals on the left are correlated with the corresponding full length parent in '''A'''). The two horizontal blue arrows joined by a dotted blue line on the left indicate the parent and deletion product.]]
 
  
 
====The ''mcr'' connection: a model for excision by replication-associated strand exchange and primer invasion.====
 
[[wikipedia:Colistin|Colistin]] ([[wikipedia:Colistin|polymyxin E]]) is a last-resort antibacterial that was used extensively in husbandry. Discovery of a transferable phosphoethanolamine transferase conferring resistance to [[wikipedia:Colistin|colistin]] in 2015 <ref><pubmed>26603172</pubmed></ref> was of such concern that it stimulated an immense effort to identify the resistance gene, ''mcr'', in various bacterial sources worldwide. This quickly generated an extensive ''mcr'' sequence library in which it was noted that the gene was often, but not always, associated with an upstream or downstream copy of an IS''30'' family sequence, [[IS Families/IS30 family#ISApl1|IS''Apl1'']]. A number of examples carried two flanking IS copies, and the entire structure, which was proposed to be a compound transposon with characteristic 2 bp direct target repeats <ref name=":47" />, was called [https://tncentral.ncc.unesp.br/cgi-bin/tn_report.pl?id=Tn6330-KX772391 Tn''6330''] <ref><pubmed>28073961</pubmed></ref> and subsequently confirmed to undergo transposition <ref><pubmed>28416554</pubmed></ref>. [https://tncentral.ncc.unesp.br/cgi-bin/tn_report.pl?id=Tn6330-KX772391 Tn''6330''] carries the ''mcr-1'' gene together with a downstream open reading frame, ''pap2''. Examination of a significant number of structures lacking the downstream IS revealed that the ~2.6 kb region including ''mcr-1'' and ''pap2'' was 99% identical; the non-identical nucleotides were concentrated at the 3’ end of ''pap2'', the end that carries the downstream [[IS Families/IS30 family#ISApl1|IS''Apl1'']] copy in [https://tncentral.ncc.unesp.br/cgi-bin/tn_report.pl?id=Tn6330-KX772391 Tn''6330''] (see [[:File:IS3.23.png|Fig.IS3.23 '''B''']]). Moreover, the ''pap2'' gene in these cases was flanked by AT-rich regions. 
[[IS Families/IS30 family#ISApl1|IS''Apl1'']] shows a strong preference for AT-rich target sites, suggesting that an ancestral downstream [[IS Families/IS30 family#ISApl1|IS''Apl1'']] copy had been deleted.
+
[[wikipedia:Colistin|Colistin]] ([[wikipedia:Colistin|polymyxin E]]) is a last-resort antibacterial that was used extensively in husbandry. Discovery of a transferable phosphoethanolamine transferase conferring resistance to [[wikipedia:Colistin|colistin]] in 2015 <ref><nowiki><pubmed>26603172</pubmed></nowiki></ref> was of such concern that it stimulated an immense effort to identify the resistance gene, ''mcr'', in various bacterial sources worldwide. This quickly generated an extensive ''mcr'' sequence library in which it was noted that the gene was often, but not always, associated with an upstream or downstream copy of an IS''30'' family sequence, [[IS Families/IS30 family#ISApl1|IS''Apl1'']]. A number of examples carried two flanking IS copies, and the entire structure, which was proposed to be a compound transposon with characteristic 2 bp direct target repeats <ref name=":47" />, was called [https://tncentral.ncc.unesp.br/cgi-bin/tn_report.pl?id=Tn6330-KX772391 Tn''6330''] <ref><nowiki><pubmed>28073961</pubmed></nowiki></ref> and subsequently confirmed to undergo transposition <ref><nowiki><pubmed>28416554</pubmed></nowiki></ref>. [https://tncentral.ncc.unesp.br/cgi-bin/tn_report.pl?id=Tn6330-KX772391 Tn''6330''] carries the ''mcr-1'' gene together with a downstream open reading frame, ''pap2''. Examination of a significant number of structures lacking the downstream IS revealed that the ~2.6 kb region including ''mcr-1'' and ''pap2'' was 99% identical; the non-identical nucleotides were concentrated at the 3’ end of ''pap2'', the end that carries the downstream [[IS Families/IS30 family#ISApl1|IS''Apl1'']] copy in [https://tncentral.ncc.unesp.br/cgi-bin/tn_report.pl?id=Tn6330-KX772391 Tn''6330''] (see [[:File:IS3.23.png|Fig.IS3.23 '''B''']]). Moreover, the ''pap2'' gene in these cases was flanked by AT-rich regions. 
[[IS Families/IS30 family#ISApl1|IS''Apl1'']] shows a strong preference for AT-rich target sites, suggesting that an ancestral downstream [[IS Families/IS30 family#ISApl1|IS''Apl1'']] copy had been deleted.
  
 
The number of available sequences was such that closely related plasmid backbones could be identified which carried either a complete [https://tncentral.ncc.unesp.br/cgi-bin/tn_report.pl?id=Tn6330-KX772391 Tn''6330''] insertion or an “empty” site (often in multiple examples), as well as cases in which the insertion occupied the same location but one or the other flanking IS was absent.
 
  
 
Careful scrutiny of the sequences flanking ''pap2'' without an associated downstream IS revealed small microhomologies which were thought to represent scars of deletions. Figure IS3.23 shows 4 plasmids each containing [https://tncentral.ncc.unesp.br/cgi-bin/tn_report.pl?id=Tn6330-KX772391 Tn''6330''] (pMCR-M15709, plsl, pMCR_1511, pHNSHP45-2; [[:File:IS3.23.png|Fig.IS3.23 '''A''']]). The salmon colored boxes on the left represent ''pap2'' sequences retained in the subsequent deletions and the purple boxes on the right represent the external flanking sequences retained (some of these intrude into the IS). Figure IS3.23C shows individual “deletants” with similar backbones to those shown in [[:File:IS3.23.png|Figure IS3.23 '''A''']]. It was noted that, when aligned, the deletions have largely occurred between microhomologies of two to four base pairs. Similar results were obtained when the sequences upstream of ''mcr-1'' were examined. Interestingly, the length of the deleted segments (Fig.IS3.24) remains close to that of [[IS Families/IS30 family#ISApl1|IS''Apl1'']], 1070 bp.
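
The search for deletion scars described above can be illustrated with a minimal Python sketch. It is not the procedure used by Snesrud et al.; the function name, the toy sequence and the search window are assumptions made purely for illustration. It looks for short repeats (2 to 4 bp) lying close to each IS end and reports the product that would be left if the two copies were joined, retaining a single copy of the repeat.

```python
def microhomology_deletions(seq, is_start, is_end, window=6, min_len=2, max_len=4):
    """Enumerate candidate deletions between short repeats flanking an IS.

    seq      : donor sequence (toy string, hypothetical)
    is_start : index of the first IS base
    is_end   : index one past the last IS base
    Returns a list of (repeat, deleted_length, product) tuples.
    """
    results = []
    for k in range(max_len, min_len - 1, -1):
        # left repeat copy must start within `window` bases of the left IS end
        for i in range(max(0, is_start - window), is_start + window - k + 1):
            repeat = seq[i:i + k]
            # right repeat copy must start within `window` bases of the right IS end
            lo = max(i + k, is_end - window)
            hi = min(len(seq) - k, is_end + window - k)
            for j in range(lo, hi + 1):
                if seq[j:j + k] == repeat:
                    # joining the two copies removes seq[i+k : j+k] and keeps one repeat
                    product = seq[:i + k] + seq[j + k:]
                    results.append((repeat, j - i, product))
    return results


# Toy example: a 20 bp stand-in "IS" flanked by a 2 bp direct repeat (AT)
toy_is = "GGTCCAGCTAGGACTTCGAG"
toy_seq = "TTGACAAT" + toy_is + "ATCAGTTC"
candidates = microhomology_deletions(toy_seq, is_start=8, is_end=28)
```

In this toy case one of the candidates joins the two AT copies, so the deleted segment is the IS length plus one repeat copy, which mirrors the observation that the deleted segments remain close to the length of the IS.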
 
 +
[[File:IS3.24.png|center|thumb|640x640px|'''Fig.IS3.24.''' Distribution of different deletion sizes following loss of the downstream [[IS Families/IS30 family#ISApl1|IS''Apl1'']] in 13 sequences in [[:File:FigIS3.21.png|Figure IS3.22]] together with 6 obtained for deletion of the upstream [[IS Families/IS30 family#ISApl1|IS''Apl1'']] copy. The average deletion size is 1,069.8 bp (Standard deviation: 2.4). From (Snesrud et al., 2018)<ref name=":43" />.]]
  
  
 +
A number of years earlier, Szabó et al. <ref><nowiki><pubmed>10545262</pubmed></nowiki></ref> had observed similar products with IS''30''. Additionally, when the IS''30'' transposase gene was ablated, the deletion frequency was not only reduced by a factor of 10<sup>3</sup> but the accompanying deletions were more complex, including large deletions or unidentified plasmid rearrangements.
  
 +
The detailed observations obtained for [[IS Families/IS30 family#ISApl1|IS''Apl1'']] led to the model shown in Figure IS3.25A, which is illustrated with a specific example (iii in Fig. IS3.23) (Fig.IS3.25B).                                                               
 +
 +
In this model ([[:File:FigIS3.22.png|Fig.IS3.22]]) it is envisaged that a short complementary sequence occurs outside the IS involved ([[:File:FigIS3.22.png|Fig.IS3.22 '''i''']]) and that the DNA strand generated by transposition-associated replication ([[:File:FigIS3.22.png|Fig.IS3.22 '''ii''']]) switches to the complementary sequence ([[:File:FigIS3.22.png|Fig.IS3.22 '''iii''']]). This proposed structure is equivalent to a [[wikipedia:Holliday_junction|Holliday junction]] (see for example <ref><nowiki><pubmed>27990631</pubmed></nowiki></ref>) which could be resolved by [[wikipedia:RuvABC|RuvC]] (see for example <ref><nowiki><pubmed>8393667</pubmed></nowiki></ref>) ([[:File:FigIS3.22.png|Fig.IS3.22 '''iv''']]).
  
 +
This reinforces the idea that the protein might intervene during the replicative step of copy-out-paste-in transposition perhaps by interfering with the normal transposition process.                                                               
  
 +
<br />
 +
 +
====The Primer Invasion model also explains high levels of precise excision.====
 +
More generally, although the examples of imprecise IS excision led to this model, it also conveniently explains the high level of precise excision observed for IS''629'' and other IS of the IS''3'' family <ref name=":41" /> (Fig. IS3.26). These generate small 3-4 bp direct flanking target repeats (DR) (Fig. IS3.26) on insertion (Fig. IS3.22). Transposase-mediated synapsis of the IS ends (Fig. IS3.26i), cleavage 3-4 nucleotides distal to the opposite IS end (Fig. IS3.26ii)<ref name=":33" /> and single strand bridge formation (Fig. IS3.26ii)<ref name=":33" /> by strand transfer include one strand of the DR, leaving the complementary strand single-stranded. Priming from the resulting 3’ OH (Fig. IS3.26iii) would regenerate the second DR strand (Fig. IS3.26iii). Primer invasion (Fig. IS3.26iv) then permits extension and formation of the Holliday junction. It might be expected that, since the complementary sequences are in proximity, locating microhomologies would be more efficient and therefore precise excision using this type of mechanism would be more frequent than imprecise excision. 
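
The sequence-level outcome of this reasoning can be sketched in a few lines of Python. This is a toy illustration only: the function names, sequences and insertion position are hypothetical, and the code models just the end products, not the strand transfer or replication steps. Insertion duplicates a short target sequence as direct repeats, and a switch between the two repeat copies removes the IS together with one repeat, restoring the original target.

```python
def insert_is(target, pos, is_seq, dr_len=3):
    """Insert is_seq at pos, duplicating dr_len target bases as direct repeats."""
    dr = target[pos:pos + dr_len]
    return target[:pos + dr_len] + is_seq + dr + target[pos + dr_len:]


def excise_between_repeats(donor, is_seq, dr_len=3):
    """Remove the IS plus one DR copy, mimicking a strand switch between the DRs."""
    start = donor.index(is_seq)
    end = start + len(is_seq)
    dr_left = donor[start - dr_len:start]
    dr_right = donor[end:end + dr_len]
    if dr_left != dr_right:
        raise ValueError("flanking repeats differ; precise excision not possible")
    # keep the left DR copy, drop the IS and the right DR copy
    return donor[:start] + donor[end + dr_len:]


# Hypothetical target and stand-in IS sequence
target = "ACGTTGCAAGCT"
toy_is = "GGTCCAGCTAGGACTTCGAG"
donor = insert_is(target, 4, toy_is)
restored = excise_between_repeats(donor, toy_is)
```

Because the two repeat copies are identical and immediately flank the IS, the product of the switch is identical to the target before insertion, which is what is meant here by precise excision.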
  
 +
[[File:IS3.25.png|center|thumb|780x780px|'''Figure IS3.25. (A)''' A General Model showing how strand switching might lead to IS excision. The model is based on data from analysis of loss of flanking ISApl1 elements from the colistin resistance compound transposon, Tn6330. Transposon
  
A number of years earlier, Szabó et al. <ref><pubmed>10545262</pubmed></ref> had observed similar products with IS''30''. Additionally, when the IS''30'' transposase gene was ablated, the deletion frequency was not only reduced by a factor of 10<sup>3</sup> but the accompanying deletions were more complex, including large deletions or unidentified plasmid rearrangements.  
+
DNA is shown in green, flanking DNA is shown in blue. The single strand bridged molecule is shown (from Figure 7iB) in
  
The detailed observations obtained for [[IS Families/IS30 family#ISApl1|IS''Apl1'']] led to the model shown in Figure IS3.25A, which is illustrated with a specific example (iii in Fig. IS3.23) (Fig.IS3.25B).                                                               
+
which a short, complementary DNA sequence is represented by blue and magenta boxes with their relative orientation
  
In this model ([[:File:FigIS3.22.png|Fig.IS3.22]]) it is envisaged that a short complementary sequence occurs outside the IS involved ([[:File:FigIS3.22.png|Fig.IS3.22 '''i''']]) and that the DNA strand generated by transposition-associated replication ([[:File:FigIS3.22.png|Fig.IS3.22 '''ii''']]) switches to the complementary sequence ([[:File:FigIS3.22.png|Fig.IS3.22 '''iii''']]). This proposed structure is equivalent to a [[wikipedia:Holliday_junction|Holliday junction]] (see for example <ref><pubmed>27990631</pubmed></ref>) which could be resolved by [[wikipedia:RuvABC|RuvC]] (see for example <ref><pubmed>8393667</pubmed></ref>) ([[:File:FigIS3.22.png|Fig.IS3.22 '''iv''']]).
+
shown by a small blue arrow. (i) a 3’ primer from DNA neighboring the IS is indicated by the blue arrowhead, and the 5’
  
This reinforces the idea that the protein might intervene during the replicative step of copy-out-paste-in transposition perhaps by interfering with the normal transposition process.                                                               
+
phosphate by the small orange circle. (ii) IS replication (copy-out) up to and including the duplicated sequence is indicated by a dotted green line. (iii) Strand switching of the primer strand to the duplicated sequence in the neighboring DNA is shown to occur with displacement of the complementary strand. This generates an intermediate which resembles a Holliday
[[File:FigIS3.22.png|center|thumb|680x680px|'''Fig.IS3.17.''' '''A Model showing how strand switching might lead to IS excision.''' The model is based on data from analysis of loss of flanking [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=ISApl1 IS''Apl1''] elements from the [[wikipedia:Colistin|colistin resistance]] compound transposon, [https://tncentral.ncc.unesp.br/cgi-bin/tn_report.pl?id=Tn6330-KX772391 Tn''6330''] (<ref name=":43" />, [[:File:IS30.10.png|Fig. IS30.10]]). Transposon DNA is shown in green, and flanking DNA is shown in blue. The single-strand bridged molecule is shown (from '''[[:File:Fig.IS3.18.png|Fig.IS3.18 i]]''') in which a short, complementary DNA sequence is represented by blue and magenta boxes with their relative orientation shown by a small blue arrow. '''(i)''' A 3’ primer from DNA neighboring the IS is indicated by the blue arrowhead, and the 5’ phosphate by the small orange circle. '''(ii)''' IS replication (copy-out) up to and including the duplicated sequence is indicated by a dotted green line. '''(iii)''' Strand switching of the primer strand to the duplicated sequence in the neighboring DNA is shown to occur with a displacement of the complementary strand. This generates an intermediate that resembles a [[wikipedia:Holliday_junction|Holliday junction]]. '''(iv)''' Resolution of the [[wikipedia:Holliday_junction|Holliday junction]] resulting in deletion of most of the IS from the donor replicon. This structure can be resolved by replication.|link=Special:FilePath/FigIS3.22.png]]<br />
 
  
====The Primer Invasion model also explains high levels of precise excision.====
+
junction. (iv) Resolution of the Holliday junction resulting in deletion of most of the IS from the donor replicon. This structure
More generally, although the examples of imprecise IS excision led to this model, it also conveniently explains the high level of precise excision observed for IS''629'' and other IS of the IS3 family <ref name=":41" /> (Fig. IS3.26). These generate small 3-4 bp direct flanking target repeats (DR) on insertion (Fig. IS3.26; Fig. IS3.22). Transposase-mediated synapsis of the IS ends (Fig. IS3.26i), cleavage 3-4 nucleotides distal to the opposite IS end (Fig. IS3.26ii)<ref name=":33" /> and single strand bridge formation by strand transfer (Fig. IS3.26ii)<ref name=":33" /> include one strand of the DR, leaving the complementary strand single-stranded. Priming from the resulting 3’ OH would regenerate the second DR strand (Fig. IS3.26iii). Primer invasion (Fig. IS3.26iv) then permits extension and formation of the Holliday junction. It might be expected that, since the complementary sequences are in proximity, locating microhomologies would be more efficient and therefore precise excision using this type of mechanism would be more frequent than imprecise excision.
 
  
This reinforces the idea that the IEE protein might intervene during the replicative step of copy-out-paste-in transposition perhaps by interfering with the normal transposition process.  
+
can be resolved by replication. '''(B)''' A mechanism for ISApl1 deletion: The double-strand sequence of the IS ends in IncI2 plasmid pMCR-M17059 as an example. From (Snesrud et al., 2018). The panel presents the structure of the single-strand bridged molecule. IS ends are boxed. The 3’OH generated in the donor plasmid DNA is indicated by a red dot and the corresponding 5’ phosphate at the other IS end by a black dot. The grey arrow indicates the direction of transposition-associated replication. The deletion joint is shown in blue. The sequence remaining after deletion (bottom) representing plasmid pSCS23 is composed of the bold black characters together with one of the blue tetranucleotide sequences]]This reinforces the idea that the IEE protein might intervene during the replicative step of copy-out-paste-in transposition perhaps by interfering with the normal transposition process.
  
 
In summary, the IEE protein plays an important role in excision of members of a number of IS families which all have in common the production of IS circles as transposition intermediates and probably all use the copy-out-paste-in transposition pathway. Excision is not only dependent on IEE but also requires an active transposase, indicating that it is associated with the transposition process itself. Not only do the biochemical properties of IEE include the ability to prime DNA synthesis and overcome potential obstacles due to various lesions in the template DNA, they also include the capacity to recognize microhomologies. This suggests to us that IEE acts at the replication (copy-out) step in the transposition pathway, subsequent to initial 3’ cleavage of an IS end and its transfer to the other. Specifically, based on sequence data obtained from the loss of the [[IS Families/IS30 family#ISApl1|IS''30'' family member, IS''Apl1'']], which flanks the ''mcr''-1 gene in the compound transposon [https://tncentral.ncc.unesp.br/cgi-bin/tn_report.pl?id=Tn6330-KX772391 Tn''6330''], we suggest that it allows a strand switch (primer invasion) to suitable microhomologies in the neighboring donor DNA, creating a [[wikipedia:Holliday_junction|Holliday junction]] and short-circuiting the replication/transposition reaction. The IS could then be removed by [[wikipedia:Holliday_junction|Holliday junction]] resolution, a possibility that can be addressed experimentally. This would also explain the high levels of precise excision which, by definition, leave no scars, since the DR would provide suitable microhomologies to facilitate primer invasion precisely at the IS ends, a prediction which could be tested directly by changing the DR sequence at one end of the IS and examining its effect on precise excision frequency in the presence of IEE.
 
[[File:IS3.25.png|center|thumb|780x780px|IS3.25]]
 
  
 
Importantly, the activity of IEE in removing flanking IS serves to immobilize genes carried by compound transposons, explaining the presence of certain of these genes without associated IS copies in plasmids and chromosomes. It will be important to address this question experimentally using entire compound transposons such as [https://tncentral.ncc.unesp.br/cgi-bin/tn_report.pl?id=Tn6330-KX772391 Tn''6330''] as well as others with flanking copy-out-paste-in IS and to compare the effects with compound transposons with different transposition pathways such as cut-and-paste.
 
<br />
 
====IEE and the Diagnostics Lab====
 
  
The [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS600 IS''600''] excision behavior has been observed in a group of clinically relevant [[wikipedia:Shiga_toxin|Shiga toxin]]-producing ''[[wikipedia:Escherichia_coli|E. coli]]'' (STEC) O121:H19. It was noted that some, but not all, of a number of clinical STEC O121:H19 isolates exhibit a phenotype called '''d'''elayed '''l'''actose '''u'''tilization (DLU), where cultures remain ''lac''<sup>-</sup> after 24 h of growth but become ''lac''<sup>+</sup> after 48 h of growth <ref><pubmed>35913153</pubmed></ref><ref><pubmed>34809935</pubmed></ref>.
+
[[File:IS3.26.png|center|thumb|680x680px|'''Fig. IS3.26.''' Precise Excision model. ]]
 +
 
 +
==== IEE and the Diagnostics Lab ====
 +
The [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS600 IS''600''] excision behavior has been observed in a group of clinically relevant [[wikipedia:Shiga_toxin|Shiga toxin]]-producing ''[[wikipedia:Escherichia_coli|E. coli]]'' (STEC) O121:H19. It was noted that some, but not all, of a number of clinical STEC O121:H19 isolates exhibit a phenotype called '''d'''elayed '''l'''actose '''u'''tilization (DLU), where cultures remain ''lac''<sup>-</sup> after 24 h of growth but become ''lac''<sup>+</sup> after 48 h of growth <ref><nowiki><pubmed>35913153</pubmed></nowiki></ref><ref><nowiki><pubmed>34809935</pubmed></nowiki></ref>.
  
 
Excision of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS600 IS''600''], which was inserted into the ''[[wikipedia:Lac_operon|lacZ]]'' gene, was thought to be responsible and the phenomenon was observed to require the presence of lactose in the medium. Moreover, in the resulting ''lac''<sup>+</sup> cultures, the [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS600 IS''600''] copy had excised. The phenomenon was also shown to require a functional ''iee'' gene since its inactivation by an [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS1203 IS''1203''] copy in a natural isolate prevented DLU. DLU is therefore simply the result of a selection for ''lac''<sup>+</sup> individuals in a population with high levels of [https://tncentral.ncc.unesp.br/ISfinder/scripts/ficheIS.php?name=IS600 IS''600''] excision facilitated by ''iee''.
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
 
[[File:IS3.26.png|center|thumb|680x680px|IS3.26]]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
 
 
 
===Acknowledgements===
 
 
We would like to thank [https://www.cbm.uam.es/en/research/programs/genome-dynamics-and-function/genome-maintenance-and-instability/maintenance-of-bacterial-genome-stability Miguel de Vega] ([https://www.cbm.uam.es/en/ Centro de Biología Molecular Severo Ochoa] - Madrid, Spain) for reading this chapter and for critical comments.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
 

Revision as of 19:46, 21 September 2023


Original Identification

IS3 and another member of this family, IS2, were identified genetically as DNA segments causing insertional inactivation of the gal and lac operons and physically by electron microscopy[1] and in plasmid F as a segment called alpha-beta[2][3]. IS3 was subsequently wrongly identified as the insertion sequence flanking the tetracycline resistance transposon Tn10[4][5]. It has since been found as a component of a large number of plasmids, particularly in Gram-negative enterics.

Presence in Compound Transposons

Although IS3 family elements do participate in compound transposons, to our knowledge no systematic survey has been undertaken and very few IS3-associated compound transposons have been described to date. Known examples include: IS3411, which flanks genes for citrate utilization in transposon Tn3411[6][7][8]; IS4521, which flanks a heat-stable enterotoxin gene in enterotoxigenic Escherichia coli; and IS1706, which flanks genes of the Clp protease/chaperone family.

Distribution

This is one of the most coherent, largest, most abundant and widely distributed IS families [9] (see [10]). Nearly 600 distinct members of this family have been identified in more than 267 bacterial species distributed over 145 genera. However, their true distribution is clearly significantly greater than this.

For example, IS911 (isolated from a Shigella dysenteriae phage λ lysogen by spontaneous insertion into the phage cI repressor gene[11]) is present in multiple copies in the original host strain and in type strains of other Shigella species. Two vestigial copies, both interrupted by a copy of IS30, were also detected in the chromosome of E. coli K12[12] and could form transposition intermediates when supplied with IS911 transposase[13]. Entire or truncated IS911 copies have also been identified in several E. coli virulence plasmids (e.g. [14]), in pathogenicity islands of uropathogenic E. coli (e.g. [15]), in various other clinical isolates of E. coli and in a large number of well-known and less well-known enterobacteria such as Escherichia fergusonii, Cronobacter, Dickeya, Erwinia, Klebsiella, Pantoea, Shimwellia, and Yersinia.

Most IS3 family members have been identified in bacteria although at least one example, ISMco1, has also been identified in the archaeon Methanosaeta concilii[16]. Since this archaeon is widespread in nature[17], it is possible that this represents a case of recent horizontal transfer. The presence of 8 copies implies that ISMco1 is active in its archaeal host.

Organization

The family is quite homogeneous in organization (Fig.IS3.1) in spite of its wide distribution in bacteria exhibiting a large range of G+C contents (from 70% in the mycobacterial examples to 25% in those isolated from Mycoplasma) and of the presence of members in hosts such as Mycoplasma with a non-universal genetic code (e.g. IS1138) or in bacteria which use stop codon read-through by insertion of the unusual amino acid selenocysteine (e.g. ISDvu3 from Desulfovibrio vulgaris). In the case of both copies of IS1138, which participates in high frequency rearrangements of the Mycoplasma pulmonis chromosome, the Tpase orf carries 11 UGA codons which are decoded as tryptophan[18].

Fig. IS3.1. (A) Genetic organization of IS911. The 1,250-bp IS911 is shown as a box. The boxes at each end represent the left (IRL) and right (IRR) terminal inverted repeats. The two open reading frames, orfA (blue) and orfB (green) are positioned in relative reading phases 0 and −1, respectively, as indicated. The indigenous promoter, pIRL, is shown. The region of overlap between orfA and orfB, which includes the frameshifting signals to produce OrfAB, lies within IS911 coordinates 300 and 400. The precise point at which the frameshift occurs, within the last heptad of the LZ, is indicated by the vertical dotted line. (B) Structure-function map of OrfAB and OrfA. HTH, a potential helix-turn-helix motif; LZ, a leucine zipper motif involved in homo- and hetero-multimerization of OrfAB and OrfA. Programmed translational frameshifting that fuses OrfA and OrfB to generate the transposase OrfAB occurs within the fourth heptad. The LZ of OrfA and OrfAB, therefore, differ in their fourth heptad. A second region, M, necessary for multimerization of OrfAB is shown, as is the catalytic core of the enzyme which carries a third multimerization domain. OrfA translation initiates at an AUG, terminates with UAA whereas OrfAB translation terminates within the right IR. The vertical line to the right of M shows the extent of the truncated transposase, OrfAB[1–149] described in the text. (C) Frameshifting window. The mRNA sequence around the programmed translational frameshifting window is presented. The boxed sequence GGAG is the potential ribosome-binding site located upstream of orfB whose potential translation would be initiated at the boxed AUU codon. A ribosome (not to scale) is shown covering a series of “slippery” codons (AAAAAAG). A downstream secondary structure is also shown with the UAA, OrfA translation termination codon. The ribosome-binding site, slippery codons, and secondary structure all contribute to the efficiency of the programmed −1 frameshift. 
The box at the foot of this figure shows how the anti-codons of two tRNALys are thought to undergo re-pairing with their codons in the AAAAAAG motif.

Members are between 1200 and 1550 bp long, with relatively well conserved inverted terminal repeats in the range of 20-40 bp. (One exception previously attributed to this family, IS481, is 1045 bp long and has now been placed in a separate family; see "IS481 family".) They generate 3 or 4 bp DR on insertion.
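The duplication of target DNA on insertion can be sketched in a few lines of Python. This is a toy illustration rather than a model of the transposase reaction: the function name `insert_with_dr`, the example sequences and the 0-based position are all hypothetical, and the sketch simply assumes that repair of the staggered target cut leaves `dr_len` bases on both sides of the element.

```python
def insert_with_dr(target: str, pos: int, element: str, dr_len: int = 3) -> str:
    """Insert `element` into `target` at 0-based `pos`, duplicating `dr_len` target bases.

    Hypothetical helper for illustration only; it is not taken from the source.
    """
    if dr_len not in (3, 4):
        raise ValueError("this sketch only covers the 3 or 4 bp DR described in the text")
    if pos < 0 or pos + dr_len > len(target):
        raise ValueError("target site does not fit inside the target sequence")
    site = target[pos:pos + dr_len]  # bases that will flank the element on both sides
    return target[:pos] + site + element + site + target[pos + dr_len:]


if __name__ == "__main__":
    # Made-up target and element; the element is flanked by two copies of "CCC".
    print(insert_with_dr("AAACCCGGG", 3, "TGTTTTCA", 3))
```

The same target site appears once before and once after the inserted element, which is what is meant by a direct target repeat.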

The majority of IR terminate with 5'-TG-----CA-3' and present an internal block of G/C residues of variable length (Fig.IS3.2).
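As a small illustration of the 5'-TG-----CA-3' convention, the sketch below checks whether a sequence written 5' to 3' on the top strand starts with TG and, read on the complementary strand from the other end, also starts with TG (which is the same as ending in CA on the top strand). The function names and test sequences are hypothetical and chosen only for this example.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def has_tg_ca_termini(is_seq: str) -> bool:
    """True if both ends of the element carry the terminal TG when read inward.

    Hypothetical helper: it looks only at the two terminal dinucleotides and
    says nothing about the rest of the inverted repeats.
    """
    s = is_seq.upper()
    if len(s) < 4:
        return False
    return s[:2] == "TG" and reverse_complement(s)[:2] == "TG"


if __name__ == "__main__":
    print(has_tg_ca_termini("TGAACCGGCA"))  # made-up sequence with TG...CA termini
    print(has_tg_ca_termini("TGAACCGGTT"))  # made-up sequence lacking the terminal CA
```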

Fig. IS3.2. WebLogo of IS3 family ends. The left (IRL) and right (IRR) inverted terminal repeats of the major IS3 family groups as defined in ISfinder are shown in WebLogo format. They are defined by the direction of transcription/translation of the transposase gene. IRL, by definition, is located on the 5’ side of the transposase orf.

IS3-family members generally have two consecutive and partially overlapping reading frames, orfA and orfB, in relative translational reading phases 0 and -1, respectively (Fig.IS3.1 A) under control of a weak promoter, pIRL, partially located in IRL (Fig.IS3.1 A and Fig.IS3.3 C). The 5' end of orfB overlaps the 3' end of orfA and occurs in reading phase -1 relative to orfA (Fig.IS3.1).

It was demonstrated in the 1990s that several family members (IS150[19], IS3[20], IS911[21], and IS2[22]) express two major proteins (Fig.IS3.1 B): OrfA, the product of the upstream frame, and the transposase, OrfAB, a “fusion” or “transframe” protein generated from orfA and orfB by Programmed -1 Ribosomal Frameshifting (PRF) (see "Programmed translational frameshifting")[23]. Many other members of this family are also organized in this way[24][25]. The frameshifting frequency varies from element to element. It is approximately 50% in the case of IS150[19] and only 15% for IS911[21].
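To give a feel for what these frequencies mean for the relative amounts of the two proteins, the following back-of-the-envelope sketch converts a frameshifting frequency into an expected OrfAB:OrfA synthesis ratio. It rests on simplifying assumptions that are not taken from the source: every ribosome translating orfA either frameshifts (giving OrfAB) or terminates normally (giving OrfA), and differences in protein stability are ignored. The function name is hypothetical.

```python
def orfab_to_orfa_ratio(frameshift_frequency: float) -> float:
    """Expected OrfAB:OrfA synthesis ratio for a given -1 frameshifting frequency.

    Assumption (illustrative only): a fraction f of ribosomes produce OrfAB and
    the remaining 1 - f produce OrfA, so the ratio is f / (1 - f).
    """
    f = frameshift_frequency
    if not 0.0 <= f < 1.0:
        raise ValueError("frequency must be in the range [0, 1)")
    return f / (1.0 - f)


if __name__ == "__main__":
    # Frequencies quoted in the text: about 50% for IS150 and 15% for IS911.
    for name, freq in (("IS150", 0.50), ("IS911", 0.15)):
        print(f"{name}: OrfAB:OrfA = {orfab_to_orfa_ratio(freq):.2f}")
```

Under these assumptions, a 50% frequency corresponds to equal synthesis of the two proteins, whereas 15% corresponds to roughly one OrfAB for every five to six OrfA.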

Fig. IS3.3. (A) Organization of the IS911 inverted repeat (IR). The nucleotide sequence of IRL and IRR is boxed. Grey horizontal bars above and below indicate the internal regions protected from DNaseI digestion by binding of OrfAB [1–149], a derivative of the 382-amino-acid OrfAB truncated for its catalytic domain. The dotted horizontal gray bar indicates partial protection. The dashes within the sequence indicate mismatches between the left and right ends. The −35 and −10 components of the indigenous promoter pIRL (blue boxes) and of pjunc (green boxes) are shown. The conserved 5′ TG tips are highlighted in red. (B) Organization of pjunc. The “junction” promoter assembled on the circularization of IS911 is shown as green boxes. The initiating transcript nucleotide (+1 pjunc), the indigenous pIRL (blue boxes), and the initiating transcript nucleotide (+1 pIRL) are also shown. The conserved 5′ TG tips are highlighted in red. (C) Secondary structure at the left IS911 end. The sequence of the “top” strand of IRL is shown, together with the various transcription and translation signals. The symbols below are standard “dot-bracket” notations indicating potential secondary structures formed with transcripts, from top to bottom: from an external promoter, from pjunc, or from pIRL, respectively. The brackets shown in italics simply permit the reader to identify the apical stem of the secondary structure.
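For readers unfamiliar with the dot-bracket notation mentioned in the legend, the sketch below shows how such a string can be decoded into base-paired positions: each "(" pairs with its matching ")" and each "." is an unpaired nucleotide. The function name and the example string are hypothetical; the actual structures of the IS911 transcripts are those shown in the figure and are not reproduced here.

```python
def dot_bracket_pairs(structure: str) -> list[tuple[int, int]]:
    """Return 0-based (i, j) index pairs for a simple dot-bracket string.

    Only "(", ")" and "." are supported in this illustrative sketch.
    """
    stack: list[int] = []
    pairs: list[tuple[int, int]] = []
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)  # remember the opening position
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            pairs.append((stack.pop(), i))  # pair with the most recent '('
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


if __name__ == "__main__":
    # A made-up hairpin: a three base-pair stem closing a four-nucleotide loop.
    print(dot_bracket_pairs("(((....)))"))
```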

Complex internal inverted repeat sequences (Fig.IS3.3 C) (for IS911, located between coordinates 19 and 73) include the -35 and -10 hexamers of pIRL, the transcription start site and the ribosome binding site for OrfA. This is thought to play a role at the mRNA level in preventing excess transposase expression resulting from external transcription. The full secondary structure would be present in transcripts initiated outside the IS thus sequestering the translation initiation signals but only the 3’ part would be present if transcription initiates at pIRL. In this case, the translation initiation signals would be exposed. Initial studies (Prère and Fayet pers communication) have shown that translation from the longer transcript is very low but that deletion of its 5’ end to “liberate” the ribosome binding site (Fig.IS3.3 C) indeed results in a significant increase in translation. In the related IS2 element, a similar sequence appears to function as a DNA binding site for the OrfA protein which represses promoter activity but further studies are necessary to confirm this[26].

Formation of a strong transposase promoter

In common with many IS of other families (e.g. IS21[27], IS30[28], IS110[29][30]) the IS3 family IRR carries an outward-directed -35 promoter hexamer while IRL carries an inward-directed -10 promoter component (Fig.IS3.3 B). These are assembled into a strong promoter, pjunc, in one of the key transposition intermediates, an excised transposon circle (see "Transposition Pathway"), where it serves to express high levels of transposition proteins (Fig.IS3.3 B; Fig.IS3.4). Transcription initiation from pjunc, like that from impinging transcription, would also produce an RNA which could sequester the translation initiation signals but in a shorter and less stable stem loop structure (Fig.IS3.3 C).

Fig. IS3.4. Left: Primer extension analysis of lac transcripts. Lanes 1 and 2: two independent cultures. Lanes 3 and 4: primer extension products obtained from identical quantities of total RNA isolated from two independent cultures. The major products are indicated by unfilled arrowheads (right). The scheme at the left shows the relative position of the IRR–IRL junction. Middle: Schematic of the different plasmid forms. Note that to obtain results for the transposon junction, a copy was cloned into a suitable vector. Right: Colonies on MacConkey lactose plates.

Regulation by Methylation?

Several members carry GATC methylation sites within 50 bp of their ends, which have been shown in one case, IS3, to modulate transposition activity[31]; however, this is not a general characteristic of the family nor is it restricted to any particular subgroup.

Insertion specificity

There appears to be little sequence specificity for insertion of members of the family. IS2 exhibits a preference for a region of bacteriophage P1 but the basis of this preference is at present unknown[32]. Both IS911[33] and IS150[34] have been found next to sequences which resemble their IRs (see “Targeted Insertion”) and IS1397 is invariably located within intergenic repeated sequences in E. coli (Bacterial Interspersed Mosaic Elements or BIMEs)[35].

Group II intron insertions

Finally, an element isolated from the ECOR collection of E. coli and closely related to IS3411 carries a group II intron[36]. The effect of this on regulation of transposition of this element has not been investigated.

IS3 family subgroups

The IS3 family is divided into five subgroups (Table Characteristics of IS families; Fig.4.2). This is supported by deep branching in the alignment of the various OrfA and OrfB sequences[37] (Fig.IS3.5). These are: the IS2 and IS407 subgroups (which appear closely related), and the IS3, IS51, and IS150 subgroups.

Additional members of the family identified subsequently also tend to follow this pattern. One feature which lends biological credence to these subgroups is that they also clearly appear clustered (with some exceptions) in the results of the alignments with the upstream OrfA protein[37]. Moreover, there is some correlation between the members of each group and the number of base pairs of target DNA duplicated on insertion (DR): for those elements in the IS2 subgroup, insertion invariably leads to a 5 bp DR; for the IS407 subgroup a 4 bp DR is observed; while for the other groups a 3 bp DR is generated (Table Characteristics of IS families). In the latter cases some of the elements, e.g. IS911, have been shown to occasionally generate 4 bp repeats. This clustering is also exhibited to some extent in the nucleotide sequence of the terminal IRs (Fig.IS3.2) and is particularly marked in the IS2, IS51 and IS407 subgroups. It can also be observed in the primary sequence details of the putative leucine zipper[38].
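The correlation between subgroup and direct repeat length described above can be written as a small lookup. This is only an illustrative sketch: the dictionary and function names are hypothetical, the values are those quoted in the text, and the lookup ignores the stated exceptions (for example, IS911 occasionally generates 4 bp repeats).

```python
# Typical target duplication (DR) length in bp for each IS3 family subgroup,
# taken from the text above. Exceptions exist and are not modelled here.
DR_BY_SUBGROUP = {
    "IS2": 5,
    "IS407": 4,
    "IS3": 3,
    "IS51": 3,
    "IS150": 3,
}


def candidate_subgroups(dr_length):
    """Return the subgroups whose typical DR length equals the observed one."""
    return sorted(name for name, dr in DR_BY_SUBGROUP.items() if dr == dr_length)


print(candidate_subgroups(5))
print(candidate_subgroups(3))
```

A DR length alone therefore narrows, but does not fix, the subgroup; the OrfA and OrfB alignments remain the primary criterion.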

Fig. IS3.5. Relationship of OrfA and OrfB in various IS3 family groups. Dendrogram based on the alignments of the amino acid sequences of predicted OrfA proteins from 40 elements (left) and 44 predicted OrfB frames (right) (adapted from Mahillon and Chandler 1998). The different colors indicate the different IS3 family groups, showing that both A and B frames are largely group-specific.

Family Exceptions

Several family members exhibit an organization which does not apparently conform to the generic IS3 member. In IS120, for example, the relationship between the reading phases of the upstream and downstream orfs appears to be +1 rather than -1 while in ISNg1 and ISYe1 the characteristic motifs of OrfB are distributed between reading phases. Other members, such as IS1076, IS1138, IS1221, and IS1141, exhibit only one long open reading frame. Although these may be true variants, it cannot at present be ruled out that the variations are simply due to errors in sequence determination.

Mycoplasma and the non-universal genetic code

Family members from Mycoplasma merit special attention. Not only does the host use a non-universal genetic code in which the opal termination codon TGA directs the insertion of tryptophan (see [39]), but their genomes are among the smallest bacterial genomes known and extremely rich in A+T. To date, several different IS3 family members have been observed in Mycoplasma. Of these, only IS1138 (and IS1138b) has been demonstrated directly to undergo autonomous transposition[18]. All exhibit similarly high AT levels and this unusual base composition could lead to difficulties in sequence determination. It is remarkable that typical IS3 family characters have been maintained in such an "extreme" genetic environment. Nine individuals are closely related and form a group of iso-elements which have been called IS1221. As indicated above, one of these carries a single long reading frame (representing orfA + orfB) instead of two consecutive overlapping frames. The others each carry insertions or deletions which destroy either the equivalent of orfA, orfB, or both. Expression studies in E. coli indicate that a protein, equivalent to OrfAB, is indeed produced from the long open reading frame of IS1221. Interestingly, it appears that a second truncated protein, equivalent to OrfA, may be generated from the single orfAB frame by translational frameshifting, representing an "inverted" expression pattern compared with the majority of the family members[40]. Although this appears not to be a general rule for IS3 family members originating from Mycoplasma hosts, the presence of a similar single-frame arrangement in a second member, IS1138, indicates that it might not be rare. Because of the extremely high AT content of these elements, many potential frameshift windows of the A6G(/C) or A7 type are expected to occur. Only direct experiments will, therefore, be able to determine which, if any, of these sequences are used to generate the Tpase or, conversely, an OrfA-like protein.

A clade with non-canonical IR

A clade carrying non-canonical ends has recently been identified. These IS include 7 supplementary base pairs on each end flanking canonical IS3 ends: a conserved stretch of 5 C residues is located 5’ to the left IR and a less conserved motif (CGG) is located 3’ to the right end. When these additional bases are taken into account every member of this clade exhibits a 4 bp DR characteristic of the IS3 family (Table Characteristics of IS families) (Gourbeyre, pers. comm.). This conclusion is supported by the presence of multiple IS copies (e.g. ISPsy31) and also by identification of “empty sites”. This clearly requires further experimental investigation.

An additional subgroup

Recently, an additional subgroup has been proposed which includes ISPpy1[41]. However, all members belong to the IS150 subgroup and their Tpases are not separated by our standard multiple alignments and MCL analysis. Although they do exhibit some variation in the sequence of their terminal dinucleotides, similar variations are found for IS2 and members of other IS3 subgroups.

Mechanism

Transposition Proteins

Extensive alignment studies of the predicted OrfA and OrfB amino acid sequences between themselves and with those of other transposable elements[42][43][44][45][46] provided insights into structure/function relationships of the proteins (Fig.IS3.1 B).

OrfA

OrfA is small. For IS911 it has a predicted molecular weight of 11.5 kDa. The predicted primary amino acid sequences of most IS3 family members exhibit a similarly placed HTH signature (see for example [11][47]), which initially suggested that it might be involved in sequence-specific binding of the transposase, OrfAB, to the terminal IRs of its particular IS[48]. This was subsequently confirmed experimentally[49]. They also carry a C-terminal leucine zipper (LZ) motif, first identified in IS2, IS150 and IS3, which appears to be conserved in the majority of known members[50] and is involved in protein multimerization[11][21][38][50].

OrfB

The OrfB products carry a DD(35)E catalytic motif and share additional identities with retroviral integrases and various other Tpases[21][42][43][44][45][46][51]. These include two amino acids located 4 and 7 residues downstream from the glutamate residue.

IS911 OrfB is 299 residues long with a predicted molecular weight of 34.6 kDa. Its TAA termination codon lies just within IRR and may be significant in regulation. The OrfB initiation codon is AUU and consequently initiation occurs only at low levels[21][52] and is modulated by the level of initiation factor IF3[53].

OrfB has been observed for: IS3[20] (Prère & Fayet, unpublished), IS150[19], IS911[21][52][53] and IS3411/IS629[54][55] but not for IS2[56]. It is generally present at quite low levels although for IS3 approximately equal amounts of OrfB and OrfAB appear to be produced[20]. The IS150 OrfB initiation codon is out of phase with the rest of the gene and expression of full length OrfB would require a -1 frameshift after initiation.

Sequence analysis suggests that OrfB may in fact be synthesized by about 34% of IS3 family members through translational coupling: the stop codon of orfA overlaps with a potential orfB start codon (e.g. AUGA or GUGA) in 134 out of 399 ISs analyzed[24].
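The overlap described above, in which a start codon begins one nucleotide upstream of the orfA stop codon, can be sketched as a four-nucleotide check, together with the arithmetic behind the "about 34%" figure. The function name and the test sequences are hypothetical, and only the two example motifs quoted in the text (AUGA and GUGA) are considered.

```python
# The two stop/start overlaps quoted in the text: the start codon (AUG or GUG)
# shares its last two nucleotides with the UGA stop codon of orfA.
START_STOP_OVERLAPS = {"AUGA", "GUGA"}


def coupled_start_at_stop(mrna, stop_index):
    """Return True if AUGA or GUGA spans the orfA stop codon.

    stop_index is the 0-based position of the first nucleotide of the
    orfA stop codon; the candidate start codon begins one nucleotide earlier.
    """
    seq = mrna.upper().replace("T", "U")
    if stop_index < 1:
        return False
    return seq[stop_index - 1 : stop_index + 3] in START_STOP_OVERLAPS


# Arithmetic from the text: 134 of the 399 ISs analyzed show the overlap.
coupled, analysed = 134, 399
fraction = coupled / analysed
print(f"{fraction:.1%}")

# Synthetic sequences for illustration only (not taken from a real IS).
print(coupled_start_at_stop("GCAAUGACC", 4))
print(coupled_start_at_stop("GCACUGACC", 4))
```

The presence of the motif only indicates a potential for translational coupling; it says nothing about whether OrfB is actually produced.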

It is possible that the OrfB protein itself plays no direct role in transposition chemistry but that it is simply its translation signals which are important. Their recognition by the ribosome could modulate programmed translational frameshifting required to generate a single transposase protein, OrfAB, from the two reading frames orfA and orfB (see "Programmed translational frameshifting").

The OrfB amino acid sequence shares significant similarities with retroviral integrases, an observation which contributed to defining the highly conserved amino acid triad DDE common to all IS3 family members and to many phosphoryltransferase enzymes of this type[43][57]. This constitutes part of the active site (for reviews see: [48][52]).

OrfB carries neither the HTH nor the LZ motif.

OrfAB: a product of programmed ribosomal frameshifting (PRF)

OrfAB is assembled from orfA and orfB by a programmed –1 ribosomal frameshift occurring near the 3' end of orfA (see "Programmed translational frameshifting") first demonstrated for the related IS150[19].

The transframe protein combines the orfA HTH motif, an LZ motif and the orfB DD(35)E catalytic domain [50] (Fig.IS3.1 B).

OrfAB of IS911 (382 amino acids) shares its 86 N-terminal amino acids with OrfA (100 amino acids) and its 296 C-terminal amino acids with OrfB (299 amino acids).
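As a quick consistency check, the residue counts quoted above can be added up. The variable names are illustrative; the numbers are those given in the text.

```python
# Residue counts for the IS911 proteins, as quoted in the text.
ORFA_LEN, ORFB_LEN, ORFAB_LEN = 100, 299, 382
SHARED_WITH_ORFA = 86    # N-terminal residues of OrfAB identical to OrfA
SHARED_WITH_ORFB = 296   # C-terminal residues of OrfAB identical to OrfB

# The two shared segments account for the whole transframe protein.
total = SHARED_WITH_ORFA + SHARED_WITH_ORFB

# Residues found only in OrfA (translated after the frameshift site in frame 0)
# and only in OrfB (translated before the point where OrfAB enters the orfB frame).
orfa_only = ORFA_LEN - SHARED_WITH_ORFA
orfb_only = ORFB_LEN - SHARED_WITH_ORFB

print(total, orfa_only, orfb_only)
```

The 14 OrfA-specific C-terminal residues correspond to the part of orfA lying downstream of the frameshift site, which is why the last LZ heptad differs between OrfA and OrfAB.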

Ribosome rephasing to generate OrfAB occurs on a group of “slippery” lysine codons with a frequency of about 15% (measured using systems driven by two different promoters: T7p10 and ptac). OrfA is therefore normally expressed at significantly higher levels than OrfAB. Frameshifting permits the combination of different functional protein domains (Fig.IS3.1 C).

IS3-family frameshifting is similar to that used in some retroviruses to generate the gag-pol "polyprotein"[58] and in the dnaX gene of E. coli to synthesize the γ subunit of DNA polymerase III[59].

The relevant IS911 sequences involved in frameshifting are shown in Fig.IS3.1 C. Examples of frameshifting sequences from other members of the family are shown in Fig.IS3.6. The group of slippery lysine codons is A AAA AAG and is directly preceded by the AUU OrfB initiation codon. Since E. coli does not encode a tRNALys with a 3’UUC5’ anticodon for AAG, both lysine codons are decoded by the same tRNALys with a 3’UUU5’ anticodon. Its pairing is weaker with a G at the wobble position[60], probably because modifications of U34 increase the rigidity of the anticodon[61]. The presence of an upstream RBS (GGAG sequence) and a downstream secondary structure (Y-shaped stem-loop) stimulates ribosome rephasing in the -1 direction. What drives frameshifting is probably the thermodynamically favorable re-pairing of the two tRNALys from codons AAA-AAG to codons AAA-AAA[59][62]. The stimulators likely have a mechanical effect, bringing the ribosome and the mRNA back into register after tRNA slippage. Different groups of codons have been observed to allow rephasing of the ribosome[25] and, although the most common motif is A6G, different members of the IS3 family carry a variety of these (e.g. A3G for IS3; see Atkins & Gesteland, Recoding: expansion of decoding rules enriches gene expression, Springer 2010).

Fig. IS3.6. Signals and predicted branched stem-loop structures in the frameshift regions of IS911, IS3, IS3411, and IS1222. This figure, adapted from Sharma et al., 2014 (IS911, IS3), Mazauric et al., 2008 (IS3411) and Mejlhede et al., 2004 (IS1222), illustrates several of the different potential secondary structures located downstream of the group of “slippery” codons at which a programmed -1 translational frameshift occurs. These include stem-loop structures in all cases, but may also involve the formation of a pseudoknot which enhances ribosome slippage and an upstream ribosome binding site (SD sequence).

Two similarly located partially overlapping reading frames in IS3, IS150 and IS3411[54] also produce three proteins. The transposases, OrfAB, like that of IS911, are fusion products of the two orfs generated by a –1 translational frameshift.

For IS3, frameshifting is also stimulated by a presumed H-type pseudoknot structure similar to those generally involved in viral recoding[63]. In IS3411, -1 slippage on a U UUU motif requires a more convoluted pseudoknot structure formed by pairing of an apical loop and an internal loop belonging to two hairpins located 65 nucleotides apart on the mRNA[54]. Two similarly arranged orfs occur in IS2 and have been shown to encode OrfA and OrfAB equivalents only[26][56]. This organization is observed in most members of the IS3 family but, besides the cases mentioned above, frameshifting has been analyzed experimentally only in a few other, less well-characterized, elements (including IS51, IS222, IS600, IS1133, IS1222).

The frequency of frameshifting is quite variable from element to element: reported values are 15% for IS911, 50% for IS150, 6% for IS3 and 2% for IS3411[54]. These values may not reflect the in vivo situation since they were not established by direct measurement of the amount of the OrfA and OrfAB proteins synthesized from an intact IS, but after modification of expression signals of the IS genes or after cloning the frameshift signals in a reporter system[19][20][21].
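To give a feel for what these frequencies imply, the sketch below converts each reported value into an expected OrfA:OrfAB synthesis ratio. It rests on a strong simplifying assumption that is not made in the text: every ribosome initiating at orfA produces either OrfA (no shift) or OrfAB (shift), and differences in protein stability are ignored. The names are hypothetical.

```python
# Frameshifting frequencies reported in the text (fraction of ribosomes shifting).
REPORTED_FRAMESHIFT = {"IS911": 0.15, "IS150": 0.50, "IS3": 0.06, "IS3411": 0.02}


def orfa_to_orfab_ratio(frequency):
    """Expected OrfA:OrfAB synthesis ratio for a given -1 frameshift frequency.

    Assumes ribosomes that do not shift make OrfA and those that shift make
    OrfAB, with no other losses (a simplification, not a measured value).
    """
    if not 0 < frequency <= 1:
        raise ValueError("frequency must be in the interval (0, 1]")
    return (1 - frequency) / frequency


for name, freq in REPORTED_FRAMESHIFT.items():
    print(f"{name}: {orfa_to_orfab_ratio(freq):.1f} OrfA per OrfAB")
```

Under this assumption a 15% frequency corresponds to roughly six OrfA molecules synthesized per OrfAB, whereas a 50% frequency gives equal amounts.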

The level of formation of the circular IS911 transposition intermediate, in which the abutted left and right ends form an IRR-IRL junction (see "The Transposition Pathway"), measured by PCR, indeed depends on frameshifting frequency in vivo[64]. IS911 copies from several clinical isolates that contained variations in the frameshift region exhibited various reduced levels of frameshifting. When these variations were introduced into the model IS911, they resulted in comparable reductions in circle formation.

Frameshifting is likely modulated by the physiological state of the host cells and by the environment: for example, frameshifting decreases when the temperature is raised or when ribosome density on the mRNA is increased (O. Fayet, pers. comm.).

Artificial orfA-orfB fusion

For experimental purposes, production of OrfAB without necessitating a translational frameshift is obtained by introduction of a single additional base pair within the frameshift region which artificially fuses the orfA and orfB frames and eliminates OrfA production[21]. It was initially difficult to construct this mutant in the context of an entire IS911 (i.e. with the two flanking IR) but more recently this has been accomplished using a longer artificial IS and resulted in an exceptionally high transposition frequency[65]. A similar mutant in IS3 results in a high frequency of adjacent deletions[20].

Structural motifs

Although no structural information is available from crystallography, the roles of the HTH and LZ motifs have been probed in vivo and in vitro.

The conserved N-terminal helix-turn-helix (HTH) motif is related to the LysR family of bacterial transcription factors and has a highly conserved tryptophan residue similar to that of certain homeodomain protein HTH motifs. This domain is important in directing transposase to bind IS911 IR[49] and is present in most IS3 family members (Fig.IS3.7 A). The N-terminal helices of the related IS2 transposase are also involved in IR binding[49].

Fig. IS3.7A. Sequence alignments of the HTH motif. Top. Alignment of the predicted HTH motif of the transposase of the five defining members of subgroups within the IS3 family with that of IS911. Identical or similar residues are boxed; bold lower case characters represent residues that fit the consensus. Bottom. An expanded view of the IS911 HTH motif with (below) mutated residues used in defining DNA binding functions.

Many members carry a putative leucine zipper located at the end of OrfA (sometimes extending into the OrfB region of the OrfAB protein) (see [40][66][67]). Studies with IS911 and IS2 indicate that this is a multimerization domain of the proteins[38][50][68]. The LZ motif of IS911 is composed of four heptameric units (Fig.IS3.1 B) with a predicted coiled coil structure including a potential buried inter-subunit hydrogen bond across the dimer interface (Fig.IS3.7 B), to maintain the zipper in a dimeric state, and correctly placed residues with opposite charges potentially able to form characteristic inter-subunit salt-bridges to stabilize the dimeric structure[50]. Leucine zipper motifs are found in most IS3 family members (Fig.IS3.7 C).

Fig. IS3.7B. A) OrfAB is shown at the top. The relative positions of the A and B domains are indicated together with those of the helix-turn-helix (HTH), leucine zipper (LZ), and DD(35)E motifs. M is a second region necessary for correct multimerization. The numbers below indicate the positions in amino acid residues. The single amino acid sequence below shows the LZ motif with the four-component heptad repeats indicated below and the leucine repeat highlighted. Repeating positions are indicated by the letters a to g. The changes in LZ sequence resulting from frameshifting between OrfA and OrfAB are also indicated. B) A helical wheel diagram showing a head-to-head homodimer conformation to portray the predicted hydrophobic core (positions a and d) and electrostatic interactions (positions e and g). Arrows of decreasing size and intensity are directed towards the carboxy-terminal end.
Fig. IS3.7C. Conservation of the leucine zipper motif throughout the different IS3 family subgroups. Alignment of predicted coiled-coils in the OrfA proteins of members of the five IS3 family subgroups. Leucine residues are highlighted in red and other significant residues in blue. Adapted from Haren et al., 2000.

OrfAB and OrfA form both homomultimers and mixed OrfAB-OrfA multimers[38][50].

Mutation of specific critical residues in the OrfAB LZ reduces the level of transposition intermediates in vivo and in vitro[69] (Transposition Cycle) and reduces or prevents multimer (dimer) formation. OrfAB and OrfA share three of their four heptads (Fig.IS3.7 B). The last heptad of each differs in sequence because the translational frameshift that generates OrfAB occurs within this heptad. This presumably results in different strengths of monomer-monomer interactions in the case of homo- and hetero-multimers and this may be involved in the regulation of transposition. A poorly defined region, M, located between residues 109 and 135 (Fig.IS3.1 B) and components in the catalytic domain of OrfAB are also involved in its multimerization.

Co-translational DNA binding

IS911 OrfAB has a strong cis preference in vivo [65]. It has about a 200 fold higher activity on the IS copy from which it is expressed (in cis) than in trans. This prevents activation of transposition of one IS copy by OrfAB expressed from a second copy in the same cell. The strength of the cis effect depends on the distance of the transposase gene from the IS ends. Also, modification of the translational frameshifting pause signal has a strong influence on cis preference presumably by delaying translation and folding of the C-ter domain increasing the chance that the folded N-ter domain will recognize and bind its target IR.

In vitro analyses using ribosome display with an E. coli-derived coupled transcription-translation system, combined with size exclusion chromatography[65], demonstrated that an added IR bound nascent OrfAB derivatives while they were still attached to the ribosome. Ternary complexes containing mRNA, ribosome, and a nascent peptide specifically bound added IR copies if only the N-ter 149 amino acids extended from the ribosome, whereas a full-length Tpase exiting the ribosome did not.

Direct evidence of coupled translational binding (Fig.IS3.8) was obtained using a staged coupled transcription/translation reaction: nascent OrfAB bound the IR before its synthesis was complete but not after. Thus OrfAB can efficiently bind the IR only prior to its complete translation.

Fig. IS3.8. This schematic, not to scale, shows the insertion sequence with its left (IRL) and right (IRR) ends in green. RNA polymerase, RNAP, is shown in pale green in the process of transcribing from the promoter pIRL. The mRNA is shown in dark green with a ribosome (blue) paused at the frameshift secondary structure. The nascent OrfAB peptide (brown) is shown binding to IRL while undergoing translation. Above is shown the full-length OrfAB in a folded configuration, proposed to prevent its binding to the IR as a completed protein.

Co-translational multimerisation

An intriguing question arising directly from these results is how OrfAB multimerizes, as is found in the transpososome, to bind both ends of the IS. Stable formation of the important synaptic complex containing both IS ends and the transposase requires a dimeric OrfAB (see "The IS911 transpososome" below). It is therefore possible that dimerization is in some way directly associated with translation. Indeed, using luxA and luxB as a model system, it has been shown that luxA/B subunit assembly initiates cotranslationally on nascent LuxB in vivo. Protein assembly appears to be directly coupled to translation and involves “spatially confined, actively chaperoned cotranslational subunit interactions”[70].

The IS911 transpososome

A crucial checkpoint in transposition is the assembly of the 'transpososome'. This step is a general prerequisite for initiating DNA cleavage and the subsequent chemical steps in transposition for most elements that use a DNA (rather than RNA) transposition intermediate. In this protein-DNA complex, both ends of the transposon are bridged by the transposase before it catalyzes the DNA strand cleavages and strand transfers necessary for transposon mobility[71][72][73]. The transpososome adopts very precise architectures to accomplish these steps, and undergoes defined changes throughout the transposition process.

The overall IS911 transposition pathway is a two-step process, involving replicative excision followed by insertion (Fig.IS3.9 A and 9B). This implies consecutive assembly of two types of transpososome: one, implicated in IS excision (synaptic complex A; SCA), includes both IS ends while the other (synaptic complex B; SCB) involves the circle junction with its abutted IRs and ensures its integration into the target DNA.

Fig. IS3.9A. IS911 is shown in green, the flanking donor DNA in black, and the target DNA in blue. Transposon ends are shown as green filled circles. The small arrows shown in Figure 4 have been omitted for brevity. (A) Donor plasmid carrying the insertion sequence (IS). (B) Formation of the first synaptic complex SCA and cleavage of the left or right inverted repeat (IR) and attack of the other end. (C) Formation of a single-strand bridge to create a figure-eight molecule if the donor is a plasmid, as shown here. (D) The products of IS-specific replication: the double-strand circular IS transposition intermediate and the regenerated transposon donor plasmid. The replicated strand is shown as a green dotted line. (E) Formation of the second synaptic complex SCB and engagement of the target DNA (blue). (F) Cleavage of the IS circle and integration. (G) The newly integrated IS.
Fig. IS3.9B. Top. Cartoon of the IS911 figure eight (left) and IS circle (right). Bottom. Electron microscopy of figure eight (left) and IS circle (right). DNA has been coated with RecA protein to highlight double- and single-stranded DNA; single-stranded DNA occurs in the "crossover" region of the figure eight molecule on the left. Electron microscopy by Edouard Boy de la Tour and Lucian Caro.

Excision synaptic complex SCA.

Using a band shift assay and IR of different lengths (the so-called “long-short” experiment) it was shown that the truncated OrfAB [1-149] forms a complex with two IR copies, the paired-end complex (PEC)[38] equivalent to the SCA. An intact OrfAB [1-149] LZ is necessary for correct PEC/SCA formation[38][50]. At higher OrfAB [1-149] concentrations a probable single end complex (SEC) composed of one IR and OrfAB [1-149] appeared. Addition of OrfA disturbed both PEC/SCA and SEC and generated a fast migrating species whose composition remains to be determined but does not appear to contain OrfA itself [38].

DNaseI and Copper phenanthroline footprinting revealed that OrfAB [1-149] protects a sub-terminal (internal) IR region including two conserved sequence blocks in the left (IRL) and right (IRR) ends (Fig.IS3.1 A). DNA binding assays in vitro and measurement of in vivo recombination activity of sequential IR deletion derivatives suggested a model in which the N-terminal region of OrfAB binds the conserved boxes in a sequence-specific manner and anchors the two IRs into the SCA. The external region of the inverted repeat was proposed to contact the C-terminal transposase domain carrying the catalytic site[74].

SCA is composed of a dimer of transposase bridging two IR[75], as judged by the use of a tagged and untagged truncated transposase derivative, OrfAB[1-149], and also of IR of different lengths. OrfAB[1-149] assembles two IRR copies in a parallel orientation (Fig.IS3.4)[75] as studied at the single molecule level by Atomic Force Microscopy (AFM) using asymmetric IRR-carrying DNA fragments.

SCA assembly was also studied using a second single-molecule approach: tethered particle motion (TPM) (Fig.IS3.10)[76] in which a DNA molecule is tethered to a glass support and its effective length is measured by observing the Brownian motion of a bead attached to its free end (Fig.IS3.10 left). OrfAB[1-149] binding to a single IR provoked a small shortening of the DNA, consistent with a DNA bend introduced by protein binding to the IR; this was confirmed using EMSA. When two ends were present on the tethered DNA in their natural, inverted, configuration, OrfAB[1-149] not only provoked the short reduction in length but also generated species with greatly reduced effective length (Fig.IS3.10 middle and top right) consistent with DNA looping between the ends and thus SCA formation. SCA is very stable and kinetic analysis in real-time suggested that passage from the bound unlooped to the looped state could involve another unlooped species of intermediate length in which OrfAB[1-149] is bound to both IRs. DNA carrying directly repeated IR also gave rise to the looped species but the level of the intermediate species was significantly enhanced (Fig.IS3.10 middle and bottom right). Its accumulation could reflect a less favorable SCA formation with directly repeated IR copies than with inverted IR. This is compatible with a model in which OrfAB binds separately to and bends each IR and protein-protein interactions then lead to SCA formation (Fig.IS3.11 A)[77]. Cleavage and strand transfer would then give rise to a species in which both IS ends are joined by a single strand bridge (or figure-eight on a circular plasmid) (Fig.IS3.9 C) (see "The Transposition Pathway").

Fig. IS3.10. IR pairing by Tethered Particle Motion. The figure is adapted from Pouget et al., 2006

Insertion synaptic complex SCB

SCB has not been characterized in such a precise way as SCA. SCB is devoted to the insertion step of the transposition process. Two types of insertion, IR-targeted and non-targeted, have been observed (Fig.IS3.11 B). It has been proposed that two different protein-DNA complexes are assembled during the two types of insertion reaction: SCBt and SCBnt (for targeted and non-targeted synaptic complex respectively)[78]. Nothing is known about the stoichiometry and the geometry of these complexes but, based on protein and DNA requirements for protein-DNA complex formation, as judged by band shift, and for transposition products, as judged by in vitro and in vivo transposition assays, it has been proposed that SCBt is composed of a transposase dimer bridging a DNA molecule carrying an IR and a DNA molecule carrying an IRR-IRL junction (IS911 circle), the product of the replicative IS911 excision. This IR-targeted insertion explains how the original isolate of IS911 might have occurred next to a sequence which strongly resembles an IR[11] and can also explain one-ended insertion[33]. In this regard, IRR shows a somewhat higher affinity than IRL. Note that if one of the two IR carried by the circle is omitted, SCBt resembles SCA (Fig.IS3.11).

Fig. IS3.11. Proposed configuration and composition of synaptic complexes SCA and SCB involved in different steps of the IS911 transposition cycle. (A) The excision complex SCA. The tips of the insertion sequence (IS), which are not protected by the truncated transposase OrfAB[1–149], are shown as green circles containing an arrowhead. IRs are indicated by thick black lines and the IS as green lines. Full-length OrfAB, which is presumed to cover the entire IR, is shown bound as a monomer to each end and to introduce a small bend in the DNA. Dimerization creates SCA, resulting in the pairing of both IRs and in the formation of a DNA loop which includes the IS. Finally, a cleavage and strand transfer event results in the formation of a single-strand bridge between the IRs. (B) The integration complex SCB. Symbols are as in (A). In the left-hand column, the IS circle intermediate with its newly replicated strand (dotted line) is shown to form a complex between an IR in the circle and a second in the target to form SCBt. Cleavage and strand transfer is shown to form a single-strand bridge between the two IRs. RecG helicase is thought to intervene to drive strand migration before a second cleavage and strand transfer results in the integration of the circle. This would explain the integration of the many different ISs observed to occur next to a resident IR in the target. The right-hand column shows untargeted integration involving OrfA and OrfAB. OrfA is known to interact with OrfAB. It also changes OrfAB binding in some way but it is not clear whether it remains in the complex.

SCBnt is thought to differ from both SCA and SCBt and to include the second IS911 protein, OrfA. This protein, which binds non-specifically to DNA and interacts with OrfAB[38][50], is proposed to direct an OrfAB-junction complex to a randomly chosen target DNA to form SCBnt[78][79]. This is based on the observation that integration of the transposon circle intermediate is greatly stimulated by preincubation of OrfAB and OrfA in an in vitro reaction[80].

The Transposition Pathway

The IS3 family is one of an increasing number of IS families known to transpose using a double strand circular DNA intermediate. Closely related pathways have been demonstrated for IS1[81], IS2[22], IS3[82], and IS150[83]. This represents a major transposition pathway which has yet to be widely recognized. As shown in Fig.IS3.9, and the animation below, IS3 family transposition proceeds through a copy-out-paste-in process.

IS911 transposition mechanisms
IS911. copy-out-paste-in mechanism

The Figure-eight form

The initial step is recognition of the IR by OrfAB (presumably during its translation) (IS911 movie above) and assembly of SCA to correctly position the DNA ends and the transposase catalytic site for the subsequent chemical steps. Like all known DDE transposase-catalyzed reactions[84], IS911 transposition proceeds by cleavage of a single strand at the transposon end generating a 3’-OH. This then attacks a target phosphodiester bond in a strand transfer reaction. The particularity of this copy-out-paste-in mechanism is that initial cleavage occurs at only one transposon end, either left or right (Fig.IS3.9). This single liberated 3’-OH directs strand transfer to the same strand 3 bases 5’ to the other end of the element. This generates a molecule in which a single transposon strand is circularized to produce a single strand bridge generating a figure-eight structure on a circular plasmid donor molecule (Fig.IS3.12) which can be easily observed in vivo[85]. The IRs are joined by the single-stranded bridge and separated by three bases derived from flanking DNA from either the left or right end. The three (or 4) bp direct repeats flanking the original insertion are not required for further transposition (as also shown for IS3[86]) and an IS911-based transposon engineered to have different flanks generates a mixed population of figure-eight molecules with one or other flank sequence. Prevention of cleavage of one or other transposon end resulted in a homogenous population that carries the 3nt DNA flank associated with the mutant end confirming that the IRL can attack IRR and vice versa. The reaction can be viewed as a one-ended site-specific transposition event. These initial steps can be accomplished by OrfAB alone. However, it should be noted that in the presence of OrfA, no figure eight or IS circles could be detected by a simple gel assay in vivo although IS circles were found using a PCR approach[64]. 
This suggests that OrfA may play a role in negatively regulating the initiation of transposition. A similar conclusion has been reached for OrfA of IS3[87]. Alternatively, OrfA may stimulate the disappearance of figure eight and IS circles (see below) since no effect of OrfA was observed on figure-eight formation in vitro. Together with the fact that OrfAB is normally produced at low levels from a weak promoter[21], initiation of transposition to form the figure eight intermediate may be stochastic.
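
The one-ended strand transfer described above can be illustrated with a minimal Python sketch. It treats the donor as a top-strand string and returns the IR junction formed when one end is cleaved and attacks three bases outside the other end. The function names and the short sequences are hypothetical placeholders chosen for illustration, not real IS911 sequences, and the model ignores strand polarity beyond what is needed to pick the flank.

```python
def junction(left_flank, irl, irr, right_flank, cleaved_end, spacer=3):
    """Return the IRR-spacer-IRL junction (top-strand notation, hypothetical model).

    cleaved_end is the end that is cleaved and donates the 3'-OH.
    The junction inherits `spacer` bases of flank from the *other*
    (attacked) end, as described in the text.
    """
    if cleaved_end == "IRR":
        bridge = left_flank[-spacer:]   # IRR attacks 3 bases outside IRL
    elif cleaved_end == "IRL":
        bridge = right_flank[:spacer]   # IRL attacks 3 bases outside IRR
    else:
        raise ValueError("cleaved_end must be 'IRL' or 'IRR'")
    return irr + bridge + irl


def junction_population(left_flank, irl, irr, right_flank, cleavable=("IRL", "IRR")):
    """Set of junctions expected when only the listed ends can be cleaved."""
    return {junction(left_flank, irl, irr, right_flank, end) for end in cleavable}


# Placeholder sequences (assumptions, not IS911 data)
LEFT, IRL, IRR, RIGHT = "AAACCC", "TGAAC", "GTTCA", "GGGTTT"

mixed = junction_population(LEFT, IRL, IRR, RIGHT)
irr_mutant = junction_population(LEFT, IRL, IRR, RIGHT, cleavable=("IRL",))
```

With different left and right flanks, `mixed` contains two junction types, whereas blocking cleavage at IRR leaves a single junction carrying the flank next to the mutant IRR, in line with the homogeneous population reported above.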

Fig. IS3.12. Agarose gel electrophoresis of DNA extracts from cells carrying a donor plasmid in the presence of high levels of transposase. First panel (left). Cartoons of three IS911-related species. From top to bottom: the donor plasmid, the figure-eight molecule, and the IS circle. IS911 is shown in green, plasmid backbone in black and the transposon ends as red dots. Second panel. Ethidium bromide-stained agarose gel showing various DNA species, including the plasmid which was used to supply transposase. Third panel. Electron micrographs of RecA-coated figure-eight and IS circles.

The circular intermediate

Kinetic data[65][85] indicate that the figure-eight gives rise to the circular transposon form which can easily be detected in vivo and in which the IR are abutted and separated by three base pairs of DNA flanking the original insertion (Fig.IS3.9 and Fig.IS3.12). As for figure-eight molecules, a transposon engineered to have different flanks generates a mixed population of transposon circles with one or the other 3bp flank located at the junction[88].

Studies in vivo using a labeling protocol and a temperature-sensitive plasmid as transposon donor demonstrated that conversion from the figure-eight to the transposon circle occurs by semiconservative replication where the circular intermediate is “copied out” leaving a copy in the transposon donor molecule[89] (Fig.IS3.9). This is transposon-specific, requires OrfAB (presumably to generate the figure eight and generate a 3’-OH on the IS911 DNA flank) and does not depend on replication from the donor plasmid origin of replication[89].

Using donor plasmids where one or other IR was inactivated for cleavage would be expected to determine whether one or other of the 3’-OH is used in transposon replication. This was tested using the Tus/ter system[90][91][92][93] (which blocks passage of a replication fork in an orientation specific fashion) cloned into the transposon in either one or other orientation. In the presence of Tus protein, no transposon circles were observed if the orientation of the ter site was that expected to block replication from one or the other end[89].

At present, it is not known how OrfAB is removed and how this replication step is initiated or terminated to generate the final circles. It is possible that these processes involve host factors and mechanisms similar to those which operate in replicative transposition of bacteriophage Mu (see [94][95][96]).

RecG helicase is implicated in targeted insertion. This process involves a target IS911 end, and strand transfer occurs between one cleaved end of the IS circle and the target IS end to create an intermolecular single-strand bridge rather than the intramolecular bridge of the figure-eight intermediate (Fig.IS3.13). Resolution of this structure implicates branch migration and replication from the donor plasmid[97]. This reinforces the idea that host proteins including components of the replication machinery are loaded onto figure-eight intermediates.

Fig. IS3.13. In vitro reactions were performed using purified IS911 circles which included a chloramphenicol resistance gene and a plasmid target with a promoterless lacZ gene. Following a standard in vitro reaction, the reaction mixture was used to transform competent E. coli with selection for chloramphenicol resistance. Lines on the interior and exterior of the plasmid circle represent different orientations of insertion.

Integration of the circular intermediate

The IR junction formed by IS circularization is very unstable in the presence of OrfAB and undergoes high levels of deletion and insertion in vivo[98] and in vitro[80]. Transposon circle insertion presumably requires further transposase synthesis.

A remarkable consequence of transposon circle formation is the assembly of a strong promoter, pjunc, from a –35 hexamer contributed by IRR and a –10 hexamer contributed by IRL (Fig.IS3.3 B). The 3 (or more rarely 4) bp which separate IRL and IRR in the circle provide an ideal spacing between the –35 and –10 elements[98]. The junction promoter, pjunc, is 30-50 fold stronger than the indigenous promoter, pIRL[98] (Fig.IS3.4), and more than two fold stronger than lacUV5[30]. It is correctly placed to drive high levels of transposase synthesis and plays an active role in controlling IS911 transposition.
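
The spacing argument can be made concrete with a small Python sketch. It adds the bases that separate IRR and IRL in the circle to the distances of the –35 and –10 hexamers from their respective IR tips and compares the result with the 17 ± 1 bp spacer generally considered optimal for σ70-type promoters. The offsets used here (7 and 7) are placeholders chosen only so that the arithmetic is visible; they are not measured IS911 values, and the function names are hypothetical.

```python
OPTIMAL_SPACER = 17  # bp, commonly cited optimum for sigma70 promoters
TOLERANCE = 1        # bp


def spacer_length(offset_35, junction_bp, offset_10):
    """Distance (bp) between the -35 and -10 hexamers across the IR junction.

    offset_35: bp between the -35 hexamer and the IRR tip (placeholder value)
    junction_bp: bp separating IRR and IRL in the circle (3, more rarely 4)
    offset_10: bp between the IRL tip and the -10 hexamer (placeholder value)
    """
    return offset_35 + junction_bp + offset_10


def is_near_optimal(spacer):
    """True if the spacer lies within the tolerated window around 17 bp."""
    return abs(spacer - OPTIMAL_SPACER) <= TOLERANCE


# Placeholder offsets, for illustration only
for junction_bp in (3, 4):
    s = spacer_length(7, junction_bp, 7)
    print(junction_bp, s, is_near_optimal(s))
```

Under these assumed offsets, both a 3 bp and a 4 bp junction keep the hexamers within the favourable window, whereas a much longer junction would not.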

Inactivation of pjunc by mutagenesis strongly reduced IS911 transposition in vivo when transposase was expressed in its native configuration[30]. Moreover, the truncated OrfAB derivative, OrfAB[1-149], which specifically binds IRR and IRL, reduced in vivo promoter activity 10 fold in a mutated junction resistant to cleavage. Full-length OrfAB, which binds the IR only weakly, and OrfA, which does not specifically bind the IR, had no effect[30]. Integration results in disassembly of pjunc providing a powerful feedback mechanism resulting in transient and controlled activation of integration only in the presence of the correct (circular) intermediate.

For the related IS2, this junction promoter is required for transposition[99].

Circle junction formation brings both transposons ends together in an inverted orientation. This active junction must then participate in the second type of synaptic complex which includes target DNA (Fig.IS3.9 and Fig.IS3.11 B).

Two single strand cleavages, one at each abutted IR, would linearize the transposon circle permitting the two liberated 3'-OH groups to direct coordinated strand transfer (Fig.IS3.9 and Fig.IS3.11 B). The final step requires OrfAB but is greatly stimulated by OrfA and is sensitive to the ratio of OrfAB/OrfA[80].

It is not known whether target capture occurs before or after cleavage of the circle junction although it has been observed that linear copies of IS911 are produced from transposon circles in vitro and in the presence of high OrfAB levels in vivo and a pre-cleaved linear transposon was a robust substrate for integration in vitro[100].

Based on kinetics and on the formation of the strong pjunc promoter, we favor a model in which the IS circles represent a reservoir of transposition intermediates and in which linear forms are generated from the IS circles during the integration process.

This has also been proposed for IS3[86].

Targeted Insertion

As stated above, several IS including IS911 show a preference for integration next to sequences in the target similar to their IR. One way of understanding this is that the transposon circle is able to form a synaptic complex (SCBt; Fig.IS3.11 B left) which is similar to SCA (Fig.IS3.11 A) but which occurs “in trans” between an IR of the transposon circle and an IR in the target. In the case of IS911, this phenomenon occurs more frequently if OrfA is not present (Fig.IS3.13) and it was proposed that one role of OrfA is to promote dispersion of the IS[78][101].

This type of one-ended intermolecular recombination/integration has been analyzed in some detail[97][101][102].

IR-targeted insertion involves the transfer of a single end of the junction to the target IR to generate a branched DNA structure. The single-end transfer (SET) intermediate, but not the final insertion product, was detected in vitro. This implies that SET intermediates must be processed by the bacterial host to obtain the final insertion products. Sequence analysis of in vitro and in vivo IR-targeted insertion products revealed high levels of DNA sequence conversion in which mutations from one IR were transferred to another. These sequence changes could not be explained by the classic transposition pathway but could be understood in terms of a mechanism in which SET generates a four-way Holliday-like junction which is then processed by host-mediated branch migration, resolution, repair and replication. This pathway resembles those described for processing other branched DNA structures such as stalled replication forks. A version of this model is shown in Fig.IS3.14. Subsequent studies showed that the RecG helicase is implicated in vivo, as might be expected for strand migration[97].

Fig. IS3.14. IRR and IRL are shown in red and green respectively. A mutant terminal dinucleotide (pale red or green boxes) prevents donor activity but allows target activity. Three interstitial base pairs in the IR/IR junction are shown as grey and white circles to distinguish DNA strand polarity. The same convention is used for the three base pairs flanking the target mutant IRL*, shown as diamonds. Dotted lines: donor transposon circle; full lines: target DNA. The 3’ ends of Tpase-mediated nicks are indicated by arrows. Those which may exist transiently during second strand resolution are indicated by a gap. I, synapsis and cleavage at one end and strand transfer; II, the formation of a SET between donor and target; III, branch migration in the sense of the arrow creating hybrid IRL or IRL/IRR copies; IV, Holliday junction resolution, thick dashed lines; V, resolved product subsequently subject to mismatch repair and replication. Lower case roman numerals below indicate the type of final product. The differences between A, B and C depend on the IR which attacks the target. A. IRR attacks three base pairs from the target, IRL*. B. IRL attacks three base pairs from the target IRL* leading to hybrid IRs in which one strand was derived from IRR and the other from IRL*. The figure shows the expected results if branch migration continued into the region of non-complementarity after the IRs. C. IRL attacks at the tip of the target IRL*.

Mechanism in other family members

Several other members of this family have also been analysed in some detail. These include IS2, IS3, and IS150. All three have been shown to generate circles when supplied with high levels of the fused frame Tpase[20][22][83][86][103].

IS3 also generates adjacent deletions[20] but, unlike IS911, appears to undergo excision from the donor molecule as a linear form following a staggered double strand break at each end. These forms have a 3 base 5' overhang and may be an alternative type of transposition intermediate[103]. Such forms may be equivalent to the linear IS911 species derived from transposon circles. In addition, IS3-derivative transposons in which two abutted ends have been engineered undergo high levels of transposition[31].

Insertion of IS3 generally creates 3 and sometimes 4 bp direct target repeats. It is significant that plasmids in which the IRs are separated by 4 bp are more active than those separated by 8 bp. In these studies, the authors were unable to engineer derivatives with two complete tandem IS3 elements. This may be the result of the formation of a strong hybrid promoter which, as described for IS911 and other ISs (see above), drives high levels of Tpase expression. This configuration of ends is equivalent to that found at the circle junction and suggests that abutted ends of IS3 are also efficient substrates in transposition.

IS2 generates direct target duplications of 5 bp on insertion[104] although transposon circles generated with this element carry only a single base pair separating IRL and IRR[22].

While IS2 carries a conserved terminal 5' -CA- 3' at its right end, the left end terminates with 5' -TG- 3'. This atypical IRL does not act as a strand donor but uniquely as a target in the circularization reaction.

Functional studies indicate that the product of the upstream orfA may inhibit transposition[26]. It has been shown to bind specifically to IRL at a sequence that overlaps the -10 hexamer of the resident Tpase promoter and represses expression of OrfA.

It does not appear to bind IRR (note that in the original article the authors invert the standard definition of IRL and IRR[26]).

Several other elements also exhibit small inverted repeat sequences which flank the -10 hexamer of the putative resident Tpase promoter. IS2-derivative transposons in which two abutted ends have been engineered also undergo high levels of transposition[22][105] and, like IS911, the circle junction of IS2 also constitutes a strong promoter capable of driving Tpase expression. Several (but not all) IS3-family elements may also carry similarly located potential -35 and -10 sequences within their IRs.

Structural studies

Although there are at present no structural data available for any members of this family, recent results obtained with an IS from another family, ISCth4 from the IS256 family, which also undergoes copy-out-paste-in transposition, have provided some insights [106]. This particular transposition pathway is asymmetric in the sense that one IS end is cleaved and attacks the opposite end several nucleotides from the tip [85]. In accord with this type of mechanism, crystal structures of ISCth4 transposase bound to three different substrates show a transposase dimer bound asymmetrically to a single DNA substrate: a pre-reaction substrate with IRR together with its flanking DNA, a pre-cleaved complex in which the IRR flank had been removed and a strand transfer complex including an abutted IRR and IRL separated by a gapped 6 base pair linker (Fig. IS256.8).

It is important to note that IS256 family transposases carry an alpha-helical insertion domain which separates the catalytic domain into two segments. This domain plays an important role in directing different DNA segments during the reaction. IS3 family transposases carry an uninterrupted catalytic domain without the alpha helical insertion domain implying that the atomic details of the process will differ. In this light, it is worth remembering that efficient insertion of IS911 transposon circles catalysed by OrfAB is greatly stimulated by inclusion of the upstream OrfA protein and is sensitive to the ratio of OrfAB/OrfA [107].

Excision: A dedicated enzyme

This section has been published in a modified form as Chandler M, Ross K, Varani AM. The insertion sequence excision enhancer: A PrimPol-based primer invasion system for immobilizing transposon-transmitted antibiotic resistance genes. Mol Microbiol. 2023 [108].


The IS3 family insertion sequence IS1203v (similar to IS629), originally identified in a Shiga toxin 2 gene (stx2) of Escherichia coli O157:H7 which it had insertionally inactivated, was found to undergo precise excision leading to stx2 reactivation [109]. Curiously, excision of IS1203v occurred at a much higher frequency in some E. coli hosts than in others [110].

IS Excision is Stimulated by High Transposase Levels

Using a (single copy) F plasmid derivative in which an ampicillin resistance gene was interrupted by an IS1203v insertion (bla::IS1203v) to monitor precise excision rates (reversion to ampicillin resistance) (Fig. IS3.15 A), the authors showed that excision was 10⁵-fold higher in Escherichia coli O157:H7, known to carry a significant number of IS629 copies, compared to E. coli K12 (MG1655), where it is absent [111][112].

Further studies using a number of E. coli isolates with and without IS1203v/IS629 copies supported the idea that excision was higher in those strains already carrying the IS.

In a modified experimental system in which various transposition functions were supplied in trans from a compatible plasmid (Fig. IS3.15 B), deletion was observed to be very low or below the detection level in an MG1655 host. In the O157:H7 strain, however, supplying the OrfAB transposase induced high deletion levels (~10⁻³) compared to that obtained with the empty vector (7.8 × 10⁻⁷), whereas supplying the orfA, orfB and orfAB genes in their native configuration only resulted in a moderate frequency of excision (2.6 × 10⁻⁶) and supplying orfA alone depressed excision (3.8 × 10⁻⁹). This implies that the levels of available OrfAB transposase are a determining factor in excision. A survey of a number of strains showed that excision frequencies of strains possessing IS1203v (IS629) were on average 10³ times higher than those not carrying the IS. High IS629 excision frequencies were observed in a large number of clinical E. coli isolates [113].
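
To make the comparison explicit, the frequencies quoted above can be expressed relative to the empty-vector control. The short Python sketch below does only this arithmetic. It assumes the quoted values are to be read in scientific notation (for example 7.8 × 10⁻⁷) and that the approximate value "~10⁻³" can be taken as 1 × 10⁻³; the dictionary keys are illustrative labels rather than names used by the authors.

```python
# Excision frequencies in the O157:H7 host, as quoted in the text
# (assumed to be scientific notation; "~1e-3" is an approximate value).
frequencies = {
    "empty_vector": 7.8e-7,
    "orfAB_fused": 1e-3,
    "native_orfA_orfB_orfAB": 2.6e-6,
    "orfA_alone": 3.8e-9,
}

control = frequencies["empty_vector"]

# Fold change of each configuration relative to the empty-vector control
fold_change = {name: freq / control for name, freq in frequencies.items()}

for name, fold in fold_change.items():
    print(f"{name}: {fold:.3g}-fold relative to empty vector")
```

Read this way, the fused-frame transposase raises excision by roughly three orders of magnitude over the control, the native gene arrangement by a few fold, and orfA alone lowers it by about two orders of magnitude.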


Fig. IS3.15. (A) System used to monitor IS excision. Redrawn and adapted from [114]. The test plasmid (left) carried an insertion in an ampicillin resistance gene (bla::IS1203v/IS629, light blue) of a disabled IS composed of two terminal inverted repeats (IR, dark blue triangles) flanking a tetracycline resistance gene carried by a plasmid containing an orfAB transposase gene (dark blue arrow). Excision results in loss of tetracycline resistance and appearance of ampicillin resistance (right). (B) System used to test different configurations of IS1203v/IS629 transposition genes. This is composed of the bla::IS1203v/IS629 cassette carried by one plasmid and the transposition genes carried by a second, compatible plasmid.


A Dedicated Enzyme: Identification of a common reading frame ECs1305 in all high excision strains

The authors identified a reading frame, ECs1305, present in all high excision strains that was absent in the low excision strains [115]. It is located in large potential integrative elements that are similar to SpLE1 of EHEC O157 [114] and has probably been dispersed in this way. It was identified both in enterohemorrhagic (EHEC) and enterotoxigenic (ETEC) E. coli strains but homologues were also identified by Blast analysis in a broad range of bacteria including Alpha-, Beta-, Gamma-, Delta- and Epsilon-proteobacteria; Bacteroides; Chlorobi; Cyanobacteria; Firmicutes; Actinobacteria; and Verrucomicrobia [115].

More recently it has been estimated that a highly conserved IEE gene copy is present in over 30% of available E. coli genome assemblies and is very abundant not only within enterohemorrhagic and enterotoxigenic genomes but also within enteropathogenic E. coli [116].

The ECs1305 gene was subsequently named iee for IS-excision enhancer [115]. In EHEC O157, ECs1305/IEE is located in a large potential integrative element that is similar to SpLE1 and has probably been dispersed in this way [114].

ECs1305 and an active transposase are required for high level IS excision

When this reading frame was deleted, the IS excision frequency was greatly decreased but could be restored by reintroduction of a plasmid-carried iee copy. Moreover, DDE mutant transposases were used to demonstrate that the ECs1305-promoted excision behavior was also dependent on an active transposase.

The effect on excision frequency of a number of host genes was investigated and, although some of these had been shown to affect excision of other ISs (e.g. [117][118][119]), the effects of these mutations were largely marginal (Fig.IS3.16) and not as pronounced as those of mutations in ECs1305 [115].

Fig.IS3.16. A histogram showing the frequency of excision of IS629 in various mutant E. coli genetic backgrounds. The assay consisted of a plasmid carrying an ampicillin resistance (Apr) gene inactivated by the insertion of a copy of IS629 whose transposase had been replaced by a tetracycline resistance gene (Tcr). The plasmid also included the transposase gene placed under the control of an external promoter. Excision of the IS629 derivative results in loss of Tcr and appearance of Apr. Data were taken from [115]. The authors used a number of deletion mutants of genes that are thought to influence various aspects of transposition: IHF (integration host factor), HU, H-NS, FIS (factor for inversion stimulation), ClpXP5 protease, Lon protease, Dam, RecA, and RecBC. The precise IS629 excision frequency was examined in each mutant using the reporter plasmid-based assay. The hns, dam, and recB deletion mutants could not be generated in the original E. coli O157 Sakai strain and were generated in another E. coli host carrying a chromosomally inserted iee gene.

Analysis of the primary Sequence of IEE: a PrimPol Helicase

Alignment of a number of IEE proteins revealed 4 conserved regions [115]. Two of the more N-terminal regions (1 and 2) showed regions with similarities to eukaryotic/archaeal/bacterial primase domains (AEP) while the more C-terminal regions 3 and 4 showed similarities with DEAD and DEAH box helicases [115] (Fig. IS3.17). Analysis using hhpred (https://toolkit.tuebingen.mpg.de/tools/hhpred/ ) indicates that the N-terminal domain resembles the AEP Archaeo-Eukaryotic Prim-Pol Primase-Polymerase (Fig. IS3.18) while the C-terminal domain is similar to the ATP-dependent DNA helicase, UvsW, of bacteriophage T4. The authors find that targeted mutagenesis of the helicase domains “considerably reduces IEE activity” [115].

Fig.IS3.17. Organization of the IEE protein. A schematic of the IS-excision enhancer protein is shown in blue. Alignment of a number of IEE proteins [115] revealed 4 conserved regions, whose position is indicated by the amino acid residues above the cartoon. Two of the more N-terminal regions (1 and 2) showed similarities to eukaryotic/archaeal/bacterial primase domains, DNA_primase_S (pfam01896), while the more C-terminal regions 3 and 4 show similarities to DEAD and DEAH box helicases, DEXDc and HELICc. The four regions are indicated and sequence signatures are those identified by Kusumoto et al. [115] and by Calvo et al. [116]. Three conserved regions, A, B, and C [116], form the active site: Mg2+/Mn2+ ligands, D, are indicated in bold red letters; a conserved His which interacts with the incoming nucleotide is in blue. MtuLigD, M. tuberculosis NHEJ Ligase D; PaeLigD, P. aeruginosa NHEJ Ligase D; MsmPolD1, M. smegmatis PolD1; hPrimPol, human PrimPol; pRN1/PrimPol, Sulfolobus islandicus plasmid pRN1 PrimPol. IEE catalytic site residues are numbered D142, D144, H178, and D216.


Fig.IS3.18. HHpred analysis of IEE. The N-terminal domain prior to amino acid residue 321 has the following predictive result: 7NQE_A TPR_REGION domain-containing protein; AEP Archaeo-Eukaryotic Primase Apo Prim-Pol Primase-polymerase, TRANSFERASE; HET: DGT, EDO; 1.28A {Marinitoga sp. 1137} Probability: 100%, E-value: 8.7e-32, Score: 272.77, Aligned cols: 207, Identities: 27%, Similarity: 0.464, Template Neff: 9.7. The C-terminal domain following amino acid residue 321 has the following predictive result: 2OCA_A ATP-dependent DNA helicase uvsW; ATP-dependent helicase, T4-bacteriophage, Recombination, HYDROLASE; 2.7A {Enterobacteriophage T4} Probability: 100%, E-value: 1.4e-41, Score: 389.78, Aligned cols: 449, Identities: 18%, Similarity: 0.192, Template Neff: 11

Biochemical Analysis of IEE Activities: Polymerase Activity

Indeed, as proposed from alignments with other members of this protein family [115][116] (Fig. IS3.17), biochemical analyses using purified IEE demonstrated that the enzyme possesses DNA polymerase activity in the presence of Mg2+ [116] and can extend a 15 bp primer on a short 33 bp template using dNTPs but not NTPs (Fig. IS3.19 i). Mutation of two of the probable AEP catalytic residues (Fig. IS3.17), D142A/D144A, eliminated this activity.

As might be expected from other examples of a number of endonucleases and nucleotidyl-transferases, the polymerization reaction was significantly higher in the presence of Mn2+ ions and the extension length was increased.

Experiments using various molar ratios of Mn2+ and Mg2+ led the authors to speculate that the IEE AEP domain might preferentially use Mn2+ in vivo even in the presence of Mg2+. They argue that the active site of a number of AEP is more flexible than that of replicative polymerases since it is configured to “accommodate dislocations of the template and primer strands as well as to extend mismatched base pairs”[116]. These include the AEP domains of the Translesion Synthesis, TLS, class such as human PrimPol, and those of the Non Homologous End Joining, NHEJ, class such as bacterial LigD. They suggest that this flexibility allows preferential accommodation of Mn2+.

Preferential IEE use of Mn2+ suggested that it too could allow extension with mismatched base pairs and promote DNA strand rearrangements during polymerization. This was demonstrated using a set of substrates carrying single or double base changes in the template strand (Fig. IS3.19 ii and 19 iii) while supplying individual defined nucleotides in the reaction. Although the enzyme showed a strong preference for incorporation of complementary nucleotides at the templating nucleotides (Fig.IS3.19 ii) during primer extension, a significant degree of misincorporation (~16%) also occurred [116].

Fig. IS3.19. Oligonucleotides Used in Assessing the Polymerization Activities of IEE. All primers were 5’ end-labeled as shown by *. i) polymerase extension reaction substrate. Extension shown as red dots; ii) and iii) incoming nucleotide selection substrates with a single X or double XX template. iv) substrate carrying a GG mismatch used in template realignment assay. v) slippage-mediated primer dislocation on templates such as i) or ii) addition of successive A residues (shown in red) on a T template dinucleotide when the reaction is provided with dA or on a C template dinucleotide when provided with dG. vi) dNTP selection-mediated template dislocation where IEE bypasses a template base (X in red). vii) template dislocation by primer realignment where the primer realigns with the template.

Interestingly, a major product resulted from insertion of two (identical) nucleotides on substrates in which the first and second templating nucleotides were different. In cases where these were the same, addition of the complementary nucleotide to the reaction resulted in an “expansion” by “reiterative primer strand slippage” to extend it by 5-7 nucleotides as shown in Fig.IS3.19 v. This was observed for both the TT and CC template pair where reiterative addition of A or G occurred respectively. IEE could also bypass the first template base in substrates as shown in Fig.IS3.19 vi and bypass a mismatch (Fig. IS3.19 iv) as schematized in Fig. IS3.19 vii.

The authors conclude that their results indicate that IEE is an “error-prone DNA polymerase” which can undergo slippage-mediated primer dislocation (Fig.IS3.19 v), dNTP selection-mediated template dislocation (Fig.IS3.19 vi) and template dislocation by primer realignment (Fig.IS3.19 vii), all resulting in DNA distortion. Dislocation of primer and template strands would, of course, facilitate the search for microhomologies.

Microhomology-Mediated End-Joining

These properties raised the question of whether, like bacterial end-joining PolDom, the IEE AEP domain, with its capacity for primer/template dislocation, could promote Microhomology-Mediated End Joining, MMEJ. This might explain how the enzyme stimulates IS excision using its reduced nucleotide insertion fidelity (as shown in Fig.IS3.19).

Fig. IS3.20. Oligonucleotides Used in Assessing IEE Microhomology Search Activities [116]. All primers were 5’ end-labeled as shown by *. The left-hand side shows the DNA substrates with the overlap region indicated in red. From top to bottom: 0, 2, 4 and 6 nucleotide overlaps. The right-hand side shows the possible pairings using microhomology. The red arrows indicate the primer extension used to assess the pairing. This generates a ladder of top-strand oligonucleotides corresponding to the primer extension. Note that in some reactions DNA ligase was added and a small signal with the length expected of a joined single-strand product was observed.

This was tested using appropriately 5’ resected double strand oligonucleotides with 0, 2, 4 and 6 nucleotide microhomologies at their 3’ ends (Fig. IS3.20 left). IEE-mediated MMEJ would create 3’ ends which should act as primers on 3’-synapsed DNA (Fig. IS3.20 right). A polymerization reaction including the four dNTPs revealed that IEE could specifically elongate a significant proportion of the 4 and 6 nucleotide substrates (28 and 46%) although no detectable elongation was observed with the 2 nt substrate under the reaction conditions used. IEE was also capable of limited synapsing of single strand substrates with 3’ terminal microhomologies [116].
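
The notion of a 3’ terminal microhomology can be sketched in a few lines of Python. The function below takes two single-stranded 3’ tails, each written 5’ to 3’, and returns the longest stretch at their 3’ ends that could base-pair in an antiparallel fashion. It is a simplified illustration rather than the analysis used by the authors; the function names, the `max_len` cut-off and the example tails are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def terminal_microhomology(tail_a, tail_b, max_len=6):
    """Length of the longest 3'-terminal stretch of tail_a that can pair
    with the 3'-terminal stretch of tail_b (both written 5' to 3').

    Returns 0 if the 3' ends cannot pair at all.
    """
    best = 0
    limit = min(len(tail_a), len(tail_b), max_len)
    for k in range(1, limit + 1):
        # The last k bases of A pair with the last k bases of B when A's
        # stretch equals the reverse complement of B's stretch.
        if tail_a[-k:] == reverse_complement(tail_b[-k:]):
            best = k
    return best


# Hypothetical tails with 0, 4 and 6 nt of terminal microhomology
print(terminal_microhomology("TTTTAAAA", "CCCCAAAA"))  # no pairing
print(terminal_microhomology("TTTTACGG", "AAAACCGT"))  # 4 nt overlap
print(terminal_microhomology("TTACGGAT", "GGATCCGT"))  # 6 nt overlap
```

In the experiments described above, only substrates in the 4 and 6 nt classes gave detectable elongation, so a search of this kind would flag which end pairs fall into the productive range.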

The C-terminal Helicase Domain

IEE exhibits DNA-dependent ATPase activity which is eliminated in an active site K451A mutant. However, unlike other ATP-dependent helicases, it was not able to couple its ATPase activity with unwinding of double strand DNA with 3’ or 5’ unpaired tails. A number of experiments comparing the activities of full length IEE and a C-terminal truncation which deletes the helicase domain from residue 289 (Fig. IS3.17) suggested that the C-terminal domain plays a role in stabilizing the interaction of the full length protein with its DNA substrates. The authors propose that it might even contribute to the removal of transposase from the transposition complex [116].

Models for IEE activity

Using a non-selective PCR approach, IEE-enhanced excision was shown to increase not only precise excision but also other, more extensive, deletion events. It was hypothesized that these occurred as a consequence of the IS3 family transposition mechanism in which initial cleavage of one IS end is followed by its strand transfer close to the opposite end to form a bridged molecule containing a small flanking sequence from the vector plasmid at the junction (Fig.IS3.21 i) (see IS3 The Transposition Pathway). The deletions were explained by the hypothesis that this initial strand transfer could be “sloppy” and target DNA sequences other than a second end (Fig.IS3.21 ii).

However, transposition of this family of IS occurs by a copy-out mechanism and is thought to regenerate the donor plasmid (Fig.IS3.21 i). Such deletions would not be consistent with this mechanism but would require an additional type of reaction to resolve the deleted donor.

Fig.IS3.21. Model Proposed for generating different IS629 deletions (from [115]). Transposon sequences are shown in green; transposon ends are shown as red circles and the neighboring host DNA as black lines. The arrows show attack by one IS end at the opposite end that occurs during copy-out-paste-in transposition. i) the normal productive pathway leading to the formation of a circular IS intermediate and regenerating the original donor replicon [10]. Note that no IS loss occurs. ii) “sloppy” attack at different positions in the donor replicon. Note that from what is known of the copy-out-paste-in transposition mechanism, this would not be expected to result in IS excision.

Calvo and colleagues [116] provide a model for IEE activity based on their biochemical results in which they propose that IEE pairs two 5’ resected DNA ends in a reaction which is facilitated by the C-terminal helicase domain, allowing the AEP domain to accomplish a “filling in” polymerization reaction. The model invokes an intermediate which is thought to occur during a cut-out-paste-in transposition pathway which leaves a blunt ended double strand break in the donor DNA molecule (see [120]). Moreover, although it is not pointed out, most IS generate short, direct target repeat sequences on insertion (see: General Information: What is an IS?) and would therefore provide 3’ terminal microhomologies upon 5’ resection.

However, in addition to IS3 family members, Kusumoto et al. [115] show that IEE also stimulates excision of IS1 and IS30 family members (Fig. IS3.22), both of which can transpose using a copy-out-paste-in mechanism [81][121]. A number of the IS exhibit a measurable level of excision in the absence of the iee gene. Interestingly, it was proposed, based on identification of in vivo transposase-induced structures, that IS1 can transpose using a number of alternative pathways [81]: IS1 shows a low basal level of excision which is further stimulated by IEE. It is possible that excision in the absence of iee occurs by a different mechanism [122][123], as occurs in the IS4 family member, IS10. In this light, it is noteworthy that IS4 itself shows a low level of iee-independent excision which is not affected by iee (Fig. IS3.22). Additionally, IS5, IS26 and IS621, none of which use a copy-out-paste-in transposition mechanism, were not observed to excise even in the presence of iee.

Thus the common mechanistic property of those IS whose excision is stimulated by IEE is that they all use a copy-out pathway.

Fig. IS3.22. Relative IS Excision Frequencies. A number of different IS were examined for IEE-stimulated excision in E. coli K-12 with and without a cloned iee copy. The assay consisted of a plasmid carrying an ampicillin resistance (Apr) gene inactivated by insertion of a copy of the IS being tested, whose transposase gene had been replaced by a tetracycline resistance gene (Tcr). The plasmid also included the transposase gene placed under the control of an external promoter. Excision of the IS629 derivative results in loss of Tcr and appearance of Apr. Data are from Kusumoto et al. [115].

An alternative explanation to that of Kusumoto et al. [115] is that the deletions are generated not during the first strand transfer step of copy-out-paste-in transposition as proposed (Fig. IS3.21), but during the second, replication step (copy-out) by slippage and realignment of the replication primer. This strand switching or primer invasion model was proposed for the deletion of IS30 family ISApl1 copies flanking the colistin resistance gene, mcr, in Tn6330 [124][125] (see also Fig. IS30.10).

Fig. IS3.23. Alignment showing the decay of multiple different instances of Tn6330. The figure is redrawn and modified from Snesrud et al., 2018 [124]. (A) Sequence of four parental Tn6330-carrying plasmids. The conserved, ancestral CG dinucleotide on the inside end (IE) of the downstream ISApl1 is indicated with a black triangle. The 2 bases at the end of the right-hand ISApl1 that are part of the DR generated by insertion of the entire Tn6330 are over-scored in black. The half arrows above indicate the ISApl1 IRL and IRR sequences defining the ends of the downstream IS. The bases upstream and downstream that are retained after ISApl1 loss are highlighted in salmon and purple, respectively. The deletion joints upstream and downstream of the ISApl1 are boxed. The Roman numerals on the left are correlated with the corresponding deletion product in C). (B) Cartoon showing the structure of Tn6330 and, below, the generic structure of the product in which the downstream ISApl1 copy has been deleted. The IS is shown as a blue box. The triangles at each end represent the left (IRL) and right (IRR) terminal inverted repeats. The transposase open reading frame is shown by a blue horizontal arrow. mcr-1 and pap2 reading frames are shown as red and white horizontal arrows, respectively. Thin black lines above, situate the downstream ISApl1 copy to the sequences in A) and the dotted lines below, to the deletion junction in C). (C) Corresponding deletion products. The sequences highlighted in grey show the sequence of plasmids with an empty site. The bases upstream and downstream of the deletion that are retained after ISApl1 loss are highlighted in red and blue, respectively, while the remaining copy of the deletion joint that is retained after the two ends are joined following ISApl1 excision is highlighted in green and encased in a black rectangle. The Roman numerals on the left are correlated with the corresponding full length parent in A). 
The two horizontal blue arrows joined by a dotted blue line on the left indicate the parent and deletion product.

The mcr connection: a model for excision by replication-associated strand exchange and primer invasion.

Colistin (polymyxin E) is a last resort antibacterial that was used extensively in husbandry. The discovery in 2015 of a transferable phosphoethanolamine transferase conferring resistance to colistin [126] was of such concern that it stimulated an immense effort to identify the resistance gene, mcr, in various bacterial sources worldwide. This quickly generated an extensive mcr sequence library in which it was noted that the gene was often, but not always, associated with an upstream or downstream copy of an IS30 family sequence, ISApl1. A number of examples carried two flanking IS copies, and the entire structure, which was proposed to be a compound transposon with characteristic 2 bp direct target repeats [125], was called Tn6330 [127] and subsequently confirmed to undergo transposition [128]. Tn6330 carries the mcr-1 gene together with a downstream open reading frame, pap2. Examination of a significant number of structures lacking the downstream IS revealed that the ~2.6 kb region including mcr-1 and pap2 was 99% identical; the non-identical nucleotides were concentrated at the 3’ end of pap2, the end that carries the downstream ISApl1 copy in Tn6330 (see Fig.IS3.23 B). Moreover, the pap2 gene in these cases was flanked by AT-rich regions. ISApl1 shows a strong preference for AT-rich target sites, suggesting that an ancestral downstream ISApl1 copy had been deleted.

The number of available sequences was such that closely related plasmid backbones could be identified which carried either a complete Tn6330 insertion, an “empty” site (often in multiple examples) or an insertion at the same location in which one or the other flanking IS was absent.

Careful scrutiny of the sequences flanking pap2 without an associated downstream IS revealed small microhomologies which were thought to represent scars of deletions. Figure IS3.23 shows 4 plasmids each containing Tn6330 (pMCR-M15709, plsl, pMCR_1511, pHNSHP45-2; Fig.IS3.23 A). The salmon colored boxes on the left represent pap2 sequences retained in the subsequent deletions and the purple boxes on the right represent the external flanking sequences retained (some of these intrude into the IS). Figure IS3.23C shows individual “deletants” with similar backbones to those shown in Figure IS3.23 A. It was noted that, when aligned, the deletions have largely occurred between microhomologies of two to four base pairs. Similar results were obtained when the sequences upstream of mcr-1 were examined. Interestingly, the length of the deleted segments (Fig.IS3.24) remains close to that of ISApl1, 1070 bp.
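
The kind of comparison described above, aligning a parental sequence with its deletion product to find the junction microhomology, can be sketched as follows. This is a minimal illustration, not the procedure used by Snesrud et al. [124]. The function name and the toy sequences are hypothetical, and the sketch assumes the product differs from the parent by one contiguous deletion only.

```python
# Hypothetical sketch: infer deletion size and junction microhomology by
# comparing a parent sequence with a product carrying a single deletion.
def junction_microhomology(parent: str, product: str):
    """Return (deleted_length, microhomology) for a single contiguous deletion."""
    if len(product) >= len(parent):
        raise ValueError("product must be shorter than parent")

    # Longest prefix shared by parent and product.
    p = 0
    while p < len(product) and parent[p] == product[p]:
        p += 1

    # Longest suffix shared by parent and product.
    s = 0
    while s < len(product) and parent[-1 - s] == product[-1 - s]:
        s += 1

    # If prefix and suffix together do not cover the product, the product
    # cannot be explained by one simple deletion.
    overlap = p + s - len(product)
    if overlap < 0:
        raise ValueError("product is not a single contiguous deletion of parent")

    # Bases covered by both the prefix and the suffix match are the
    # microhomology: one copy is retained at the deletion joint.
    return len(parent) - len(product), product[len(product) - s:p]


# Invented example: a 10 nt segment flanked by a 4 nt repeat (TTAG).
parent = "ACGTAC" + "TTAG" + "CCGGAACCGG" + "TTAG" + "GACTGA"
product = "ACGTAC" + "TTAG" + "GACTGA"
print(junction_microhomology(parent, product))
```

Because one copy of the repeat is lost together with the intervening segment, the reported deletion length is the segment length plus the microhomology length, which is one reason deletion sizes can cluster close to, but not exactly at, the length of the IS.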

Fig.IS3.24. Distribution of different deletion sizes following loss of the downstream ISApl1 in 13 sequences in Figure IS3.23 together with 6 obtained for deletion of the upstream ISApl1 copy. The average deletion size is 1,069.8 bp (standard deviation: 2.4). From Snesrud et al., 2018 [124].


A number of years earlier, Szabó et al. [129] had observed similar products with IS30. Additionally, when the IS30 transposase gene was ablated, the deletion frequency was not only reduced by a factor of 10³ but the accompanying deletions were more complex, including large deletions or unidentified plasmid rearrangements.

The detailed observations obtained for ISApl1 led to the model shown in Figure IS3.25A, which is illustrated with a specific example (iii in Fig. IS3.23) (Fig.IS3.25B).

In this model (Fig.IS3.25A) it is envisaged that a short complementary sequence occurs outside the IS involved (Fig.IS3.25A i) and that the DNA strand generated by transposition-associated replication (Fig.IS3.25A ii) switches to the complementary sequence (Fig.IS3.25A iii). This proposed structure is equivalent to a Holliday junction (see for example [130]) which could be resolved by RuvC (see for example [131]) (Fig.IS3.25A iv).

This reinforces the idea that the IEE protein might intervene during the replicative step of copy-out-paste-in transposition, perhaps by interfering with the normal transposition process.


The Primer Invasion model also explains high levels of precise excision.

More generally, although the examples of imprecise IS excision led to this model, it also conveniently explains the high level of precise excision observed for IS629 and other IS of the IS3 family [115] (Fig. IS3.26). These generate small 3-4 bp direct flanking target repeats (DR) on insertion (Fig. IS3.26). Transposase-mediated synapsis of the IS ends (Fig. IS3.26i), cleavage 3-4 nucleotides distal to the opposite IS end (Fig. IS3.26ii)[85] and single strand bridge formation (Fig. IS3.26ii)[85] by strand transfer includes one strand of the DR, leaving the complementary strand single-stranded. Priming from the resulting 3’ OH (Fig. IS3.26iii) would regenerate the second DR strand (Fig. IS3.26iii). Primer invasion (Fig. IS3.26iv) then permits extension and formation of the Holliday junction. It might be expected that, since the complementary sequences are in proximity, locating microhomologies would be more efficient and therefore precise excision using this type of mechanism would be more frequent than imprecise excision.
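
Why a deletion between the two DR copies leaves no scar can be illustrated with a small string-level sketch. It is a toy model under stated assumptions: the sequences are invented, the function names are hypothetical, and the code only mimics the sequence outcome of insertion and excision, not the strand transfer or replication chemistry.

```python
# Hypothetical sketch: a deletion between the flanking direct repeats (DR)
# restores the original target sequence exactly.
def insert_with_dr(target: str, pos: int, dr_len: int, is_seq: str) -> str:
    """Insert is_seq at pos, duplicating target[pos:pos + dr_len] so that
    one DR copy lies on each side of the IS."""
    return target[:pos + dr_len] + is_seq + target[pos:]


def excise_between_dr(donor: str, is_seq: str, dr_len: int) -> str:
    """Remove the IS together with one DR copy, as expected for a deletion
    between the two flanking repeats."""
    start = donor.index(is_seq)
    if start < dr_len:
        raise ValueError("no room for a left-hand direct repeat")
    end = start + len(is_seq)
    left_dr = donor[start - dr_len:start]
    right_dr = donor[end:end + dr_len]
    if left_dr != right_dr:
        raise ValueError("IS is not flanked by matching direct repeats")
    return donor[:start] + donor[end + dr_len:]


# Invented target (upper case) and IS (lower case, so it is easy to locate).
target = "AAGCTTGCATGCCTGCAG"
is_seq = "tgaactgccgtttcagcggcagttca"
donor = insert_with_dr(target, pos=6, dr_len=3, is_seq=is_seq)
restored = excise_between_dr(donor, is_seq, dr_len=3)
print(donor)
print(restored == target)
```

In this picture the DR plays the same role as the microhomologies found at imprecise deletion joints, except that it lies immediately next to the IS ends, so the joint coincides with the original target sequence.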

Figure IS3.25. (A) A General Model showing how strand switching might lead to IS excision. The model is based on data from analysis of loss of flanking ISApl1 elements from the colistin resistance compound transposon, Tn6330. Transposon DNA is shown in green, flanking DNA is shown in blue. The single strand bridged molecule is shown (from Figure 7iB) in which a short, complementary DNA sequence is represented by blue and magenta boxes with their relative orientation shown by a small blue arrow. (i) a 3’ primer from DNA neighboring the IS is indicated by the blue arrowhead, and the 5’ phosphate by the small orange circle. (ii) IS replication (copy-out) up to and including the duplicated sequence is indicated by a dotted green line. (iii) Strand switching of the primer strand to the duplicated sequence in the neighboring DNA is shown to occur with displacement of the complementary strand. This generates an intermediate which resembles a Holliday junction. (iv) Resolution of the Holliday junction resulting in deletion of most of the IS from the donor replicon. This structure can be resolved by replication. (B) A mechanism for ISApl1 deletion: the double-strand sequence of the IS ends in IncI2 plasmid pMCR-M17059 as an example. From Snesrud et al., 2018 [124]. The panel presents the structure of the single-strand bridged molecule. IS ends are boxed. The 3’OH generated in the donor plasmid DNA is indicated by a red dot and the corresponding 5’ phosphate at the other IS end by a black dot. The grey arrow indicates the direction of transposition-associated replication. The deletion joint is shown in blue. The sequence remaining after deletion (bottom) representing plasmid pSCS23 is composed of the bold black characters together with one of the blue tetranucleotide sequences.

In summary, the IEE protein plays an important role in excision of members of a number of IS families which all have in common the production of IS circles as transposition intermediates and probably all use the copy-out-paste-in transposition pathway. Excision is not only dependent on IEE but also requires an active transposase, indicating that it is associated with the transposition process itself. Not only do the biochemical properties of IEE include the ability to prime DNA synthesis and overcome potential obstacles due to various lesions in the template DNA, they also include the capacity to recognize microhomologies. This suggests to us that IEE acts at the replication (copy-out) step in the transposition pathway, subsequent to initial 3’ cleavage of an IS end and its transfer to the other. Specifically, based on sequence data obtained from the loss of the IS30 family member, ISApl1, which flanks the mcr-1 gene in the compound transposon Tn6330, we suggest that it allows a strand switch (primer invasion) to suitable microhomologies in the neighboring donor DNA, creating a Holliday junction and short-circuiting the replication/transposition reaction. The IS could then be removed by Holliday junction resolution, a possibility that can be addressed experimentally. This would also explain the high levels of precise excision which, by definition, leave no scars, since the DR would provide suitable microhomologies to facilitate primer invasion precisely at the IS ends, a prediction which could be tested directly by changing the DR sequence at one end of the IS and examining its effect on precise excision frequency in the presence of IEE.

Importantly, the activity of IEE in removing flanking IS serves to immobilize genes carried by compound transposons, explaining the presence of certain of these genes without associated IS copies in plasmids and chromosomes. It will be important to address this question experimentally using entire compound transposons such as Tn6330 as well as others with flanking copy-out-paste-in IS and to compare the effects with compound transposons with different transposition pathways such as cut-and-paste.

Fig. IS3.26. Precise Excision model.

IEE and the Diagnostics Lab

The IS600 excision behavior has been observed in a group of clinically relevant Shiga toxin-producing E. coli (STEC) O121:H19. It was noted that some, but not all, of a number of clinical STEC O121:H19 isolates exhibit a phenotype called delayed lactose utilization (DLU), where cultures remain lac- after 24 h of growth but become lac+ after 48 h of growth [132][133].

Excision of IS600, which was inserted into the lacZ gene, was thought to be responsible, and the phenomenon was observed to require the presence of lactose in the medium. Moreover, in the resulting lac+ cultures, the IS600 copy had excised. The phenomenon was also shown to require a functional iee gene since its inactivation by an IS1203 copy in a natural isolate prevented DLU. DLU is therefore simply the result of a selection for lac+ individuals in a population with high levels of IS600 excision facilitated by iee.

Acknowledgements

We would like to thank Miguel de Vega (Centro de Biología Molecular Severo Ochoa - Madrid, Spain) for reading this chapter and for critical comments.

Bibliography

  1. <pubmed>4567156</pubmed>
  2. <pubmed>1092667</pubmed>
  3. <pubmed>1092668</pubmed>
  4. <pubmed>1092669</pubmed>
  5. <pubmed>383689</pubmed>
  6. <pubmed>6277857</pubmed>
  7. <pubmed>2832386</pubmed>
  8. <pubmed>6094480</pubmed>
  9. Craig NL, Lambowitz AM, Craigie R, Gellert M, editors. Mobile DNA II. American Society of Microbiology; 2002.
  10. <pubmed>26350305</pubmed>
  11. <pubmed>2163395</pubmed>
  12. <pubmed>9278503</pubmed>
  13. <pubmed>9302015</pubmed>
  14. <pubmed>10496929</pubmed>
  15. <pubmed>8751923</pubmed>
  16. <pubmed>17347521</pubmed>
  17. <pubmed>17320399</pubmed>
  18. <pubmed>8096321</pubmed>
  19. <pubmed>1653413</pubmed>
  20. <pubmed>8107082</pubmed>
  21. <pubmed>1660923</pubmed>
  22. <pubmed>9302014</pubmed>
  23. <pubmed>8384687</pubmed>
  24. <pubmed>21673094</pubmed>
  25. <pubmed>24875478</pubmed>
  26. <pubmed>8107136</pubmed>
  27. <pubmed>2540414</pubmed>
  28. <pubmed>3039299</pubmed>
  29. <pubmed>10438765</pubmed>
  30. <pubmed>11598022</pubmed>
  31. <pubmed>1645443</pubmed>
  32. <pubmed>3035338</pubmed>
  33. <pubmed>8106332</pubmed>
  34. Welz C. Functionelle analyse des Bakteriellen Insertionelements IS150. PhD thesis: Fakultät für Biologie der Albert-Ludwigs-Univesität Freiburg; 1993.
  35. <pubmed>9055066</pubmed>
  36. <pubmed>7994604</pubmed>
  37. <pubmed>9729608</pubmed>
  38. <pubmed>10677279</pubmed>
  39. <pubmed>1579111</pubmed>
  40. <pubmed>7476162</pubmed>
  41. <pubmed>23832000</pubmed>
  42. <pubmed>8302872</pubmed>
  43. <pubmed>1963920</pubmed>
  44. <pubmed>1647013</pubmed>
  45. <pubmed>1850126</pubmed>
  46. <pubmed>7934941</pubmed>
  47. <pubmed>9435062</pubmed>
  48. <pubmed>2841644</pubmed>
  49. <pubmed>14981152</pubmed>
  50. <pubmed>9761671</pubmed>
  51. <pubmed>10547692</pubmed>
  52. <pubmed>10064703</pubmed>
  53. <pubmed>21478364</pubmed>
  54. <pubmed>18474594</pubmed>
  55. <pubmed>16731525</pubmed>
  56. <pubmed>8824609</pubmed>
  57. <pubmed>1314954</pubmed>
  58. <pubmed>7636469</pubmed>
  59. <pubmed>1547945</pubmed>
  60. <pubmed>3860833</pubmed>
  61. <pubmed>11027137</pubmed>
  62. <pubmed>12970189</pubmed>
  63. <pubmed>18621088</pubmed>
  64. <pubmed>12586397</pubmed>
  65. <pubmed>22195971</pubmed>
  66. <pubmed>8520113</pubmed>
  67. <pubmed>7496528</pubmed>
  68. <pubmed>9335268</pubmed>
  69. <pubmed>9761671</pubmed>
  70. <pubmed>26405228</pubmed>
  71. <pubmed>21439812</pubmed>
  72. <pubmed>23217365</pubmed>
  73. <pubmed>16181782</pubmed>
  74. <pubmed>11352577</pubmed>
  75. <pubmed>20553579</pubmed>
  76. <pubmed>15155821</pubmed>
  77. <pubmed>16923775</pubmed>
  78. <pubmed>17367389</pubmed>
  79. <pubmed>18586933</pubmed>
  80. <pubmed>9463394</pubmed>
  81. <pubmed>7489730</pubmed>
  82. <pubmed>15493331</pubmed>
  83. <pubmed>12374815</pubmed>
  84. <pubmed>26104718</pubmed>
  85. <pubmed>7590258</pubmed>
  86. <pubmed>10556026</pubmed>
  87. <pubmed>9413996</pubmed>
  88. <pubmed>1334464</pubmed>
  89. <pubmed>15359283</pubmed>
  90. <pubmed>8021197</pubmed>
  91. <pubmed>2181438</pubmed>
  92. <pubmed>2510933</pubmed>
  93. <pubmed>16148308</pubmed>
  94. <pubmed>26104374</pubmed>
  95. <pubmed>12770828</pubmed>
  96. <pubmed>11459960</pubmed>
  97. <pubmed>15306008</pubmed>
  98. <pubmed>9214651</pubmed>
  99. <pubmed>14729714</pubmed>
  100. <pubmed>10320583</pubmed>
  101. <pubmed>12145217</pubmed>
  102. <pubmed>14756780</pubmed>
  103. <pubmed>8550559</pubmed>
  104. <pubmed>375194</pubmed>
  105. <pubmed>8676870</pubmed>
  106. <pubmed>33006208</pubmed>
  107. <pubmed>9463394</pubmed>
  108. <pubmed>37574851</pubmed>
  109. <pubmed>10698782</pubmed>
  110. <pubmed>16233651</pubmed>
  111. <pubmed>11258796</pubmed>
  112. <pubmed>9628576</pubmed>
  113. <pubmed>16233651</pubmed>
  114. <pubmed>24334665</pubmed>
  115. <pubmed>21224843</pubmed>
  116. <pubmed>36715333</pubmed>
  117. <pubmed>6287993</pubmed>
  118. <pubmed>6322169</pubmed>
  119. <pubmed>2981756</pubmed>
  120. <pubmed>1316613</pubmed>
  121. <pubmed>18022196</pubmed>
  122. <pubmed>455447</pubmed>
  123. <pubmed>6260376</pubmed>
  124. <pubmed>29440577</pubmed>
  125. <pubmed>27620479</pubmed>
  126. <pubmed>26603172</pubmed>
  127. <pubmed>28073961</pubmed>
  128. <pubmed>28416554</pubmed>
  129. <pubmed>10545262</pubmed>
  130. <pubmed>27990631</pubmed>
  131. <pubmed>8393667</pubmed>
  132. <pubmed>35913153</pubmed>
  133. <pubmed>34809935</pubmed>