ISEc23
- Family IS66
- Group
Isoform Synonym(s)
Accession number | Transposition | Origin | Host |
---|---|---|---|
| | ND | Escherichia coli | Escherichia coli O150:H5 SE15 |
DNA section
IS Length : 2532 bp
Ends
IR Length : 20/24
IRL : GTAAGCGTAAACTGACCGCCGTATGTAGCCATCAGACGAGAATTGGTAAC
IRR : GTAAGCGTCAACGGAGCACCGTATTGACGCTTATTTATTGGTGAGTACTA
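The 20/24 figure can be reproduced from the two end sequences listed above. The following is a minimal sketch, assuming that IRR is printed as the reverse complement of the right end (so both ends read from the outside in) and that the ratio counts matching positions over the first 24 nucleotides. The function names `ir_matches` and `reverse_complement` are illustrative and not part of any database tooling.

```python
# End sequences copied from the record above.
IRL = "GTAAGCGTAAACTGACCGCCGTATGTAGCCATCAGACGAGAATTGGTAAC"
IRR = "GTAAGCGTCAACGGAGCACCGTATTGACGCTTATTTATTGGTGAGTACTA"

# Last 32 nt of the DNA sequence given in this record.
RIGHT_END = "AAGCGTCAATACGGTGCTCCGTTGACGCTTAC"


def ir_matches(left: str, right: str, span: int) -> int:
    """Count positions where the two ends agree over the first `span` nt."""
    return sum(a == b for a, b in zip(left[:span], right[:span]))


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Assumption: the inverted repeat spans 24 nt ("IR Length : 20/24").
print(f"{ir_matches(IRL, IRR, 24)}/24")

# If IRR is shown as the reverse complement of the right end, the
# reverse-complemented tail of the element must equal the start of IRR.
print(reverse_complement(RIGHT_END) == IRR[:len(RIGHT_END)])
```

Under these assumptions the mismatches fall at positions 9, 13, 16 and 18 of the 24-nt repeat.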
Insertion site
Left flank | Direct repeat | Right flank | DR Length |
---|---|---|---|
TTATTGTGAA | TATAATGG | CGTGACCGCT | 8 |
AAATCCTTTA | CGTTACTC | TCTGATGACT | 8 |
CATAGTTATT | CACACTCC | CTTCACTTAC | 8 |
ATGTGCGGAA | ATTAAATC | CTGTCGTTTC | 8 |
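The insertion-site rows can be checked for internal consistency with a short script. This is a sketch only: `dr_lengths_consistent` and `insertion_layout` are hypothetical helper names, and the layout function assumes the usual arrangement in which the direct repeat appears on both sides of the inserted element.

```python
# Rows copied from the insertion-site table:
# (left flank, direct repeat, right flank, stated DR length).
SITES = [
    ("TTATTGTGAA", "TATAATGG", "CGTGACCGCT", 8),
    ("AAATCCTTTA", "CGTTACTC", "TCTGATGACT", 8),
    ("CATAGTTATT", "CACACTCC", "CTTCACTTAC", 8),
    ("ATGTGCGGAA", "ATTAAATC", "CTGTCGTTTC", 8),
]


def dr_lengths_consistent(sites) -> bool:
    """True if every direct repeat is as long as its stated DR length."""
    return all(len(dr) == stated for _, dr, _, stated in sites)


def insertion_layout(left: str, dr: str, right: str,
                     element: str = "[ISEc23]") -> str:
    """Lay out one insertion as flank, DR, element, DR, flank.

    Assumption: the direct repeat is duplicated on both sides of the
    element; `element` is a placeholder, not the real sequence.
    """
    return f"{left} {dr} {element} {dr} {right}"


print(dr_lengths_consistent(SITES))
for left, dr, right, _ in SITES:
    print(insertion_layout(left, dr, right))
```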
DNA sequence
GTAAGCGTAAACTGACCGCCGTATGTAGCCATCAGACGAGAATTGGTAACTTAGACGCCCATCTGATATAGACGGACATCTAAGTATGGAATTACAGGAC
TGGCGAAAAGAACCTCGTAAAAACTATTCGAATGAATTCAAACTTCGTATGGTGGAACTGGCATCACAACCTGGAGCTTGTGTTGCACAGATTGCACGTG
AAAATGGCGTCAATGATAATGTTATTTTCAAATGGCTCAGGCTCTGGCAGAACGAAGGGCGTGTTTCGCGGCGTCTTCCGGTAACGACCTCTTCTGACAC
TGGCGTTGAATTATTACCTGTAGAAATAACGCCGGATGAGCAGAAAGAACCTGTGGCGGCCATTGCGCCGTCTTTATCCACTTCCACTCAGACCAGAGTC
AGTGCCAGTTCCTGCAAGGTGGAATTCCGTCACGGTAACATGACGCTGGAAAATCCATCGCCAGAGCTGCTCACAGTGTTGATCCGTGAACTGACCGGGA
GGGGAAGATGATCTCACTCCCATCAGGTACCCGTATCTGGCTCGTTGCCGGCGTTACCGATATGCGTAAATCCTTCAACGGACTGGGAGAACAGGTACAA
CATGTGCTGAATGATAATCCCTTCTCCGGTCACCTGTTTATCTTCCGTGGCCGACGGGGTGACACCGTCAAAATTCTTTGGGCTGATGCTGATGGTCTGT
GCCTGTTCACCAAACGCCTGGAGGAAGGCCAGTTTATCTGGCCTGCGGTACGTGACGGCAAGGTATCCATTACCCGCTCGCAACTGGCAATGCTCCTCGA
TAAGCTGGACTGGCGTCAGCCAAAAACATCCAGCCGTAACTCACTGACAATGTTGTAAAAAACTCCTGACCGCATTATAAAAACGGTCATGAGTCAGAAA
TACCTCATTCGCATCGCAGAGCTGGAAAGGTTGCTCTCTGAGCAGGCTGAAGCCCTCCGTCAGAAAGACCAGCAACTGAGTCTGGTTGAAGAGACGGAAG
CCTTCCTGCGCTCTGCACTGACACGTGCCGAAGAAAAGATCGAAGAAGATGAACGGGAAATAGAACATCTGCGGGCTCAGATAGAAAAACTGCGCCGGAT
GCTGTTCGGTACCCGTTCTGAAAAACTGCGTCGTGAAGTTGAACTGGCTGAGGCTCTGCTGAAACAACGTGAACAGGACAGCGATCGTTACAGTGGGCGG
GAAGACGATCCTCAGGTTCCCCGCCAGTTGCGACAGTCGCGCCATCGTCGTCCGTTACCGGCACACCTTCCCCGTGAAATACACCGCCTGGAGCCAGAAG
AAAGCTGTTGCCCGGAGTGTGGCGGTGAGCTGGATTATCTGGGGGAAGTCAGCGCTGAACAGCTGGAACTGGTGAGCAGTGCCCTGAAAGTGATCCGCAC
AGAACGGGTAAAAAAAGCCTGTACAAAATGTGACTGTATTGTTGAAGCACCGGCGCCGTCCCGCCCGATAGAGCGTGGTATCGCGGGCCCCGGATTACTT
GCCCGCGTGTTAACGGGAAAATACTGCGAACATCTGCCACTGTATCGTCAGAGTGAAATCTTTGCCCGCCAGGGTGTCGAACTGAGCCGGGCCTTACTCT
CCAACTGGGTTGACGCGTGCTGCCAGTTAATGACACCGGTGAATGATGCCCTGTACCGTTATGTAATGAATACCCGCAAGGTTCACACTGATGACACACC
GGTAAAGGTACTGGCACCGGGTCAGAAAAAGGCGAAAACAGGGCGTATCTGGACGTATGTCCGGGATGATCGCAATGTGGGTTCGTCATCTCCTCCAGCG
GTCTGGTTCGCGTACTCGCCGAACCGGCAGGGGAAACACCCGGAGCAACACCTCCGCCCCTTCCGGGGTATCCTGCAGGCGGATGCGTTCACAGGTTACG
ACAGGTTGTTCAGTGCAGAACGTGAAGGTGGTGCACTGACAGAAGTTGCGTGCTGGGCCCATGCCCGGCGAAAAATCCACGATGTATACATCAGCAGCAA
AAGTGCGACGGCAGAAGAAGCCCTGAAGCGAATCAGTGAACTGTACGCCATCGAGGATGAAATACGGGGATTACCGGAGTCAGAGCGTCTTGCCGTCAGG
CAGCAGCGAAGCAAAGTGTTACTGACGTCGCTGCATGAATGGATGGTGGAGAAGAATGGTACGCTGTCGAAAAAATCCAGACTGGGCGAAGCGTTCAGCT
ATGTACTGAATCAGTGGGATGCCCTCTGTTATTACAGTGATGACGGTCTGGCGGAGGCGGATAATAATGCTGCGGAAAGAGCGCTTCGTGCAGTCTGTCT
CGGAAAGAAAAACTTTATGTTCTTTGGCAGCGATCACGGCGGCGAGCGTGGAGCACTGTTGTACGGGCTGATCGGCACCTGCCGTCTGAACGGTATCGAT
CCGGAAGCGTATCTGCGCCATATCCTGAGCGTACTGCCGGAATGGCCTTCCAACCGAGTTGACGAACTCCTGCCATGGAACGTAGTACTCACCAATAAAT
AAGCGTCAATACGGTGCTCCGTTGACGCTTAC
Protein section
ORF number : 3
ORF 1
Length (bp) | Length (aa) | Begin | End | Strand | Fusion ORF |
---|---|---|---|---|---|
426 | 141 | 86 | 511 | + | No |
AG : IS66 TnpA
ORF sequence :
MELQDWRKEPRKNYSNEFKLRMVELASQPGACVAQIARENGVNDNVIFKWLRLWQNEGRVSRRLPVTTSSDTGVELLPVEITPDEQKEPVAAIAPSLSTS
TQTRVSASSCKVEFRHGNMTLENPSPELLTVLIRELTGRGR
Blast result :
ORF 2
Length (bp) | Length (aa) | Begin | End | Strand | Fusion ORF |
---|---|---|---|---|---|
351 | 116 | 508 | 858 | + | No |
AG : IS66 TnpB
ORF sequence :
MISLPSGTRIWLVAGVTDMRKSFNGLGEQVQHVLNDNPFSGHLFIFRGRRGDTVKILWADADGLCLFTKRLEEGQFIWPAVRDGKVSITRSQLAMLLDKL
DWRQPKTSSRNSLTML
Blast result :
ORF 3
Length (bp) | Length (aa) | Begin | End | Strand | Fusion ORF |
---|---|---|---|---|---|
1614 | 537 | 889 | 2502 | + | No |
Chemistry : DDE
ORF sequence :
MSQKYLIRIAELERLLSEQAEALRQKDQQLSLVEETEAFLRSALTRAEEKIEEDEREIEHLRAQIEKLRRMLFGTRSEKLRREVELAEALLKQREQDSDR
YSGREDDPQVPRQLRQSRHRRPLPAHLPREIHRLEPEESCCPECGGELDYLGEVSAEQLELVSSALKVIRTERVKKACTKCDCIVEAPAPSRPIERGIAG
PGLLARVLTGKYCEHLPLYRQSEIFARQGVELSRALLSNWVDACCQLMTPVNDALYRYVMNTRKVHTDDTPVKVLAPGQKKAKTGRIWTYVRDDRNVGSS
SPPAVWFAYSPNRQGKHPEQHLRPFRGILQADAFTGYDRLFSAEREGGALTEVACWAHARRKIHDVYISSKSATAEEALKRISELYAIEDEIRGLPESER
LAVRQQRSKVLLTSLHEWMVEKNGTLSKKSRLGEAFSYVLNQWDALCYYSDDGLAEADNNAAERALRAVCLGKKNFMFFGSDHGGERGALLYGLIGTCRL
NGIDPEAYLRHILSVLPEWPSNRVDELLPWNVVLTNK
Blast result :
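The ORF lengths follow directly from the Begin and End coordinates in the tables above. The sketch below assumes 1-based inclusive coordinates and assumes that each ORF span includes its stop codon, which is why one codon is subtracted to obtain the protein length. `orf_lengths` and `overlap` are illustrative names.

```python
# (name, begin, end) copied from the ORF tables above.
ORFS = [
    ("ORF 1", 86, 511),
    ("ORF 2", 508, 858),
    ("ORF 3", 889, 2502),
]


def orf_lengths(begin: int, end: int) -> tuple[int, int]:
    """Return (length in bp, length in aa) for an inclusive span.

    Assumption: the span ends with a stop codon, which is not
    counted as an amino acid.
    """
    bp = end - begin + 1
    return bp, bp // 3 - 1


def overlap(a: tuple[int, int], b: tuple[int, int]) -> int:
    """Number of positions shared by two inclusive (begin, end) spans."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


for name, begin, end in ORFS:
    bp, aa = orf_lengths(begin, end)
    print(f"{name}: {bp} bp, {aa} aa")

# Shared positions between consecutive ORFs.
print(overlap(ORFS[0][1:], ORFS[1][1:]))
print(overlap(ORFS[1][1:], ORFS[2][1:]))
```

With these coordinates, ORF 1 and ORF 2 share four positions (508 to 511), while ORF 2 and ORF 3 do not overlap.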
Comments
Four identical copies are present in the E. coli SE15 chromosome, plus one remnant. An example of a complete ISEc23 is found in the E. coli SE15 genome sequence at coordinates 1184983-1187514.
orf1 is 60% aa similar to ISEc22 (orf1)
orf2 is 86% aa similar to ISCro1 (orf2)
orf3 is 76% aa similar to ISEc22 (orf3)
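The genome coordinates quoted in the comments can be cross-checked against the stated IS length. This is a small sketch, assuming inclusive coordinates; the helper `to_genome` additionally assumes that the copy lies in the forward orientation at that location, which the record does not state.

```python
IS_LENGTH = 2532                          # from "IS Length : 2532 bp"
COPY_START, COPY_END = 1184983, 1187514   # from the comments above


def span_length(start: int, end: int) -> int:
    """Length of an inclusive coordinate span."""
    return end - start + 1


def to_genome(pos_in_is: int, copy_start: int = COPY_START) -> int:
    """Map a 1-based position within the IS to a genome coordinate.

    Assumption: the copy is in forward orientation at this location.
    """
    return copy_start + pos_in_is - 1


print(span_length(COPY_START, COPY_END) == IS_LENGTH)
```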
References
[1] Tadasuke Ooka, Tetsuya Hayashi (2008) Direct submission.