ISShdy2
- Family IS66
- Group
Isoform Synonym(s)
Accession number | Transposition | Origin | Host |
---|---|---|---|
 | ND | Shigella dysenteriae | Shigella dysenteriae CDC 08-3330 |
DNA section
IS Length : 2496 bp
Ends
IR Length : 25/26
IRL : GTAAGCATCCGGTGAACTCGCATTGCCGGCCGTCGTGAACCACGCCATTC
IRR : GTAAGCATCCGGTGAACTCGCATGGCGCGATCACCGGCGTTATGGCAGGG
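The IR length can be checked against the two end sequences listed above. The following is a minimal sketch, assuming that "25/26" means 25 identical positions over the first 26 nucleotides and that IRR is written as the reverse complement of the element's right end, so that it reads in the same orientation as IRL. The helper names are illustrative, and `RIGHT_END` is the last 50 nt of the DNA sequence in this record.

```python
# Terminal sequences as listed in this record.
IRL = "GTAAGCATCCGGTGAACTCGCATTGCCGGCCGTCGTGAACCACGCCATTC"
IRR = "GTAAGCATCCGGTGAACTCGCATGGCGCGATCACCGGCGTTATGGCAGGG"
# Last 50 nt of the DNA sequence given in this record.
RIGHT_END = "CCCTGCCATAACGCCGGTGATCGCGCCATGCGAGTTCACCGGATGCTTAC"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_matches(a: str, b: str, window: int) -> int:
    """Count identical positions over the first `window` nucleotides."""
    return sum(x == y for x, y in zip(a[:window], b[:window]))


if __name__ == "__main__":
    # Assumption: IRR is stored as the reverse complement of the right end.
    print(reverse_complement(RIGHT_END) == IRR)
    # Assumption: "25/26" is matches / compared positions.
    print(f"{count_matches(IRL, IRR, 26)}/26")
```

Under these assumptions the single mismatch in the 26-nt window falls at position 24 (T in IRL, G in IRR).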
Insertion site
Left flank | Direct repeat | Right flank | DR Length |
---|---|---|---|
GCTCGACACG | GACTCATT | GTCACCACCC | 8 |
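The insertion-site table can be read as a target site duplication: the 8-bp direct repeat appears once in the empty target and on both sides of the inserted element. The sketch below illustrates that layout, assuming the order left flank, direct repeat, element, direct repeat, right flank. The function names and the `<ISShdy2>` placeholder are hypothetical and stand in for the 2496-bp element.

```python
# Values copied from the insertion-site table.
LEFT_FLANK = "GCTCGACACG"
DIRECT_REPEAT = "GACTCATT"
RIGHT_FLANK = "GTCACCACCC"


def empty_site(left: str, dr: str, right: str) -> str:
    """Target before insertion: the repeat sequence is present once."""
    return left + dr + right


def occupied_site(left: str, dr: str, element: str, right: str) -> str:
    """Target after insertion: the repeat flanks the element on both sides."""
    return left + dr + element + dr + right


if __name__ == "__main__":
    print(len(DIRECT_REPEAT))  # DR length reported in the table
    print(empty_site(LEFT_FLANK, DIRECT_REPEAT, RIGHT_FLANK))
    # "<ISShdy2>" is a placeholder for the full element sequence.
    print(occupied_site(LEFT_FLANK, DIRECT_REPEAT, "<ISShdy2>", RIGHT_FLANK))
```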
DNA sequence
GTAAGCATCCGGTGAACTCGCATTGCCGGCCGTCGTGAACCACGCCATTCTGCAGGCTCTTCCTTTATATGCAAAAGTGAGAAACCTGATGGCAAAAAAA
ATGACCCCGGCGCAGCGACGCCAGCACTACGACGCCTGGCGCGTCAGCGGCATGTCCCGGGCCGCGTATGCCCGGCTACACGGCATCAACAACAAAACCT
TCTGGCATCTCTGCCGGGCCCTCTCCGCTGACGACGCCCGTGGCACCGCTCCGGCTGATAACAGACCGGCTGTCCTGCCGGTCACCCTCTCCGTCAGTGA
CACTGCCACCCTGAAACTCCAGCGCGCCTGCGTCACCTCCACCCCGGCCGGCATCGCCGCCATCATCCGGGAGCTTCACCTGTGCTGAACCCTCATGCCC
TCTGGCTCGTCCGCGAGCCTGCCGACATGCGCGCCGGCATCGACTCCCTGACGCGGCTCGCCACTCAGGCCGCCGGGCATCCGCCCCGGGAAGGTGAAGC
CTTCCTCTTCACCGGGAAAAAACGCACCCGCATGAAACTCCTGATGTGGGACCGACACGGCGTCTGGCTCTGCACCCGCCGCCTGCACCAGGGCGCCTTC
CGCTGGCCCCGCGACGGCGACACCACCTGGTCACTCACGGCGGAACAGTTCGCCTGGCTTACTGCCGGCATTGACTGGCTGCGTCTCTCTGCCGGCCCCC
TGCAGAGGTGGACTGAATAACCTCCTGAGCAAAAATCATAAATAATCATCCTGTTATCAGGATGGTGTTCCCCCTCATCTGTCATCATGGCTGCATGACA
GATGATATCCTGAACTCCACACAAAACCCCGATGAACTGCGCCGTATGGTAACAGCGCTGCTGACGGCACAGGCATGCGAATATGAGCAGCGCATTCATG
ACCTTAATGTCGCCATGCAGGCAGAAAAGTTGACGCTGGAGCAGCGCCTCCATGACCTTAATGCCGCCATGCAGGCAGAAAAAACCGTATATGAACAGCG
CATCCGCGAGCTGGAAGACGCCCTGAAGCTGGCACAGCAGTGGCGCTTCGGCCGCAAAAGTGAGCGCCTGCCGGCCAGCCAGAAACCACTGGCTGACGAG
GACGCGGCCAGCGATGAGGCTGATATCACCCGGCAACTGAGCGACCTGCTGCCGGAGAAGGAGAAAACCGGGAAGAAGCCGGCCCGCCAGCCCCTGCCGG
CACACCTGCCGCGCCAGGAAACCGTACTCATGCCTGAGACCGGCAGCACCTGTCCGGACTGCGGCAGTGAAATGCGACATATCCGCGACGAGGTGAATGA
GGTGCTGGAATATGTACCGGCACACTTCGTGGTGAAACGGACCGTGAGACCGCAGTACAGCTGCCCGTGCTGCGACACGGTGCACAGTGCCGTGCTGCCG
TCGGCAGTCATCGACAAAGGGCAGCCGGGCCCGGGTCTGCTGGCGCAGGTGGTGACCGCGAAGGTGCTGGAACACCTGCCACTGCAGCGGCAGCAGAAGA
TATACGCCCGTGAAGGGGTACAGCTGCCGGAAAGCACGCTGACGGACTGGTTCGGGCAGACGGCGGCGGTGCTGTCGCCGCTGGCGGCAGCCCTGAAACG
TGACCTGCTCAGGCAACCGGTGCTGCAGGCGGACGAGACGCCGCTGCAGATACTGGATACGCGGAAAGGGAAAGTCCGGAAAGGGTACCTGTGGGCATAC
GTGAGCGCGGCGGGCAGTGCCCGGGACATCGTGGTGTACGACTGCCGGCCGGGGCGTGCGGGGCAGTATGCGTGTGAGATGCTGAGCGGGTGGTCGGGGA
CTCTGGTGGCTGACGGTTATGCCGGTTACCGGGCGCTGTTCCGTGACGGGCAGGAAGGGGCCCCTCCGGTGGCCCCGGGTATCCGTGAGGCGGGGTGCAT
GGCGCACGTGCGCAGAAAGTTCATGGAGCTGTACAAAATGAACGGCAGTCCGGGGGCGAAGGAGGCGCTGAAACAGATACGGGCGCTGTATATCCTGGAG
CGGAGCATCCGGAACCGTCCGGCGGAGCAGAAACGGCGATGGCGGCGGCGGTACGCGAAGCCGCAGATGGAGGCGTTCCACAGCTGGCTGAGGGCGACGG
AAAAGACGAGCGCGCCGGGTGGCAGGCTGCACGGTGCGGTGAGGTATGCCCTGAAGCGTTGGCCGGCGCTGGAAACATACCTGAATGACGGACGGGTACC
GCTGGACAACAACCGGTGTGAGCAGATGATGCGTCCGGTGGCGCAGGGGCGGAAGTCATGGCTGTTCGCGGGTTCGCAGCGGGGAGGAGAGCGGCTGGCG
GAGCTGCTGACGCTGCTGCACACGGCGAGGCTGAACGGTCTGGAGCCAGTAGCCTGGCTGCGTGATGTGCTGGAGAAGTTGCCGTCATGGCCGGCGTCCC
GGCTGGATGAACTGCTGCCTTACCGCCGTCCGGCGGACTGAATACCCCCTGCCATAACGCCGGTGATCGCGCCATGCGAGTTCACCGGATGCTTAC
Protein section
ORF number : 3
ORF 1
Length (bp) | Length (aa) | Begin | End | Strand | Fusion ORF |
---|---|---|---|---|---|
300 bp | 99 aa | 89 | 388 | + | No |
AG : IS66 TnpA
ORF sequence :
MAKKMTPAQRRQHYDAWRVSGMSRAAYARLHGINNKTFWHLCRALSADDARGTAPADNRPAVLPVTLSVSDTATLKLQRACVTSTPAGIAAIIRELHLC
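As a spot check, the start of ORF 1 can be translated from the DNA sequence. This sketch assumes the standard genetic code and takes the first 36 nt of the ORF (positions 89 to 124 of the DNA sequence above, 1-based). The `translate` helper is illustrative and not part of any database tooling.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Positions 89-124 of the ISShdy2 DNA sequence (first 12 codons of ORF 1).
ORF1_START_NT = "ATGGCAAAAAAAATGACCCCGGCGCAGCGACGCCAG"


def translate(nt: str) -> str:
    """Translate an in-frame uppercase DNA string, codon by codon."""
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, len(nt) - 2, 3))


if __name__ == "__main__":
    print(translate(ORF1_START_NT))
```

The result can be compared with the first 12 residues of the ORF 1 sequence listed above.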
Blast result :
ORF 2
Length (bp) | Length (aa) | Begin | End | Strand | Fusion ORF |
---|---|---|---|---|---|
294 bp | 97 aa | 427 | 720 | + | No |
AG : IS66 TnpB
ORF sequence :
MRAGIDSLTRLATQAAGHPPREGEAFLFTGKKRTRMKLLMWDRHGVWLCTRRLHQGAFRWPRDGDTTWSLTAEQFAWLTAGIDWLRLSAGPLQRWTE
Blast result :
ORF 3
Length (bp) | Length (aa) | Begin | End | Strand | Fusion ORF |
---|---|---|---|---|---|
1680 bp | 559 aa | 762 | 2441 | + | No |
Chemistry : DDE
ORF sequence :
MVFPLICHHGCMTDDILNSTQNPDELRRMVTALLTAQACEYEQRIHDLNVAMQAEKLTLEQRLHDLNAAMQAEKTVYEQRIRELEDALKLAQQWRFGRKS
ERLPASQKPLADEDAASDEADITRQLSDLLPEKEKTGKKPARQPLPAHLPRQETVLMPETGSTCPDCGSEMRHIRDEVNEVLEYVPAHFVVKRTVRPQYS
CPCCDTVHSAVLPSAVIDKGQPGPGLLAQVVTAKVLEHLPLQRQQKIYAREGVQLPESTLTDWFGQTAAVLSPLAAALKRDLLRQPVLQADETPLQILDT
RKGKVRKGYLWAYVSAAGSARDIVVYDCRPGRAGQYACEMLSGWSGTLVADGYAGYRALFRDGQEGAPPVAPGIREAGCMAHVRRKFMELYKMNGSPGAK
EALKQIRALYILERSIRNRPAEQKRRWRRRYAKPQMEAFHSWLRATEKTSAPGGRLHGAVRYALKRWPALETYLNDGRVPLDNNRCEQMMRPVAQGRKSW
LFAGSQRGGERLAELLTLLHTARLNGLEPVAWLRDVLEKLPSWPASRLDELLPYRRPAD
Blast result :
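The lengths in the three ORF tables follow from the Begin and End coordinates. The sketch below assumes coordinates are 1-based and inclusive and that the nucleotide length includes the stop codon, which is why the amino acid count is one less than the number of codons. The `ORFS` dictionary and `orf_lengths` helper are illustrative names.

```python
# Coordinates and lengths as listed in the ORF tables of this record.
ORFS = {
    "ORF 1": {"begin": 89, "end": 388, "bp": 300, "aa": 99},
    "ORF 2": {"begin": 427, "end": 720, "bp": 294, "aa": 97},
    "ORF 3": {"begin": 762, "end": 2441, "bp": 1680, "aa": 559},
}


def orf_lengths(begin: int, end: int) -> tuple[int, int]:
    """Return (length in bp, length in aa) for 1-based inclusive coordinates.

    Assumes the span includes the stop codon, so aa = codons - 1.
    """
    bp = end - begin + 1
    if bp % 3 != 0:
        raise ValueError("ORF span is not a multiple of three")
    return bp, bp // 3 - 1


if __name__ == "__main__":
    for name, orf in ORFS.items():
        bp, aa = orf_lengths(orf["begin"], orf["end"])
        consistent = (bp, aa) == (orf["bp"], orf["aa"])
        print(f"{name}: {bp} bp, {aa} aa, consistent with table: {consistent}")
```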
Comments
ISShdy2 shows 50% (ORFA), 67% (ORFB) and 63% (ORFC, the transposase) amino acid similarity to ISEc49.
This structure was inserted into an IS10. The PacBio genome has not yet been deposited but will soon be available in the EBI-ENA database.
References