ISNarch4
- Family IS66
- Group
Isoform Synonym(s)
Accession number | Transposition | Origin | Host |
---|---|---|---|
NZ_CP050695.1 | ND | Natrialbaceae archaeon | Natrialbaceae archaeon 2447 Salinadaptatus halalkaliphilus 2447 |
DNA section
IS Length : 2455 bp
Ends
IR Length : 22/30
IRL : GTAAGCGCCAGCGAATCGGCACCTATTCCCTGGTCTGAGTACTGGTCACA
IRR : GTAAGCGCCCGTAAATCGACCCACTTGCCCCGCAGGCGGTCGTCAGCTCT
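The IRR string above appears to be written as the reverse complement of the element's right end, so that it reads in the same orientation as IRL. The following minimal sketch checks this against the last line of the DNA sequence in this record; the helper name `reverse_complement` and the variable names are illustrative, not part of the record.

```python
# Values copied from this record.
IRR = "GTAAGCGCCCGTAAATCGACCCACTTGCCCCGCAGGCGGTCGTCAGCTCT"
# Final line of the DNA sequence (the last 55 bp of the 2455 bp element).
SEQUENCE_TAIL = "GAAAAAGAGCTGACGACCGCCTGCGGGGCAAGTGGGTCGATTTACGGGCGCTTAC"

# Translation table mapping each base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return dna.translate(_COMPLEMENT)[::-1]


# Take as many bases from the right end as the IRR string contains.
right_end = SEQUENCE_TAIL[-len(IRR):]
irr_matches_right_end = reverse_complement(right_end) == IRR
print(irr_matches_right_end)
```

IRL needs no such conversion: it is identical to the first 50 bases of the DNA sequence as listed.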
Insertion site
Left flank | Direct repeat | Right flank | DR Length |
---|---|---|---|
CAGGGCGAAG | GGACGAAG | GGACGAAGCG | 8 |
CCGGCAGTCC | GGCAGTCC | GGCAGTCCCG | 8 |
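One way to read the insertion site table is that each flank contains a copy of the direct repeat: at the end of the left flank and at the start of the right flank. The sketch below counts mismatches under that assumption; the function names are hypothetical and the row layout is an interpretation of the table, not something the record states.

```python
# Rows copied from the table: (left flank, direct repeat, right flank).
ROWS = [
    ("CAGGGCGAAG", "GGACGAAG", "GGACGAAGCG"),
    ("CCGGCAGTCC", "GGCAGTCC", "GGCAGTCCCG"),
]


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def flank_mismatches(left: str, repeat: str, right: str) -> tuple[int, int]:
    """Compare the repeat with the end of the left flank and the start of the right flank."""
    n = len(repeat)
    return mismatches(left[-n:], repeat), mismatches(right[:n], repeat)


for left, repeat, right in ROWS:
    print(repeat, flank_mismatches(left, repeat, right))
```

A nonzero count in the first position indicates that the left-flank copy differs from the listed repeat, which is worth checking against the source genome before relying on the row.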
DNA sequence
GTAAGCGCCAGCGAATCGGCACCTATTCCCTGGTCTGAGTACTGGTCACACTGGTTCCATCCCGTTGACCACTGGAGGAAATCGGTATGACCAGCGCCGA
GCTGCATCATTACTGGCGACAGACGCTGGATGCGTGGACGGCCTCCGGGTTGTCCGGCGCGGCGTTCTGCAAGCAACACTCACTCACCTACCACCAGTTT
GTCTACTGGCGGCGAAAGCTCCGTGGCCCGGGCGAGTCGCCTTCGCGGGCCGGCTTTGCCAGGGTGGCGCCGGTGGCACACGATGACGCCGCGGATGGGC
TGACCGTCTCGTTGCCCGGCGGTGTGTCGATCACCGGCCTGCACGCGGGCAACATCGAGTTGCTGGGCGCGGTGCTGAGGCAGTTGTGATGCGCAACCGG
TCTCTGCGCCCGTCCCGGCAGTTGCCGGAGATTTACCTGTACCGGGCCCCGGTGGATTTCCGCAAACAGGCCCATGGTCTCGCGCTGATCGTCGAGCAGG
AGCTCGGGCACAGCCCCTTCACCGGGGCGCTGTACGCCTTCACCAACCGCCAGCGCAACAAGATCAAGTGTCTGATGTGGGAAGACAACGGCTTCGTGCT
CTACTACAAGGCCCTGGCCGAGGAGCGGTTCAAGTGGCCGGCCCCGGGCGATGAGTTGATGAGCCTGAGCGGGGAGCAGATCAACTGGCTGCTCGACGGC
TACGACATCACGCTGCTGCGGGGGCACAAAAAGCTGCATTACGAGGCGCTTGGGTAGGCGTTTTTGCGTGCGCCAGGGCGCTGTTTTTGGTATGATTTCG
TAATGAAATCAACGCCCGATAACGCCCCTCCGGCTCCCGATCTCAGCGGCCTCTCCGCCGCTGAGATGATGGCCGTTATCGGTGACCTTCAGCAACAACT
GGCCTCGAAAGAGCAGGCCATCCGGCAACGCGATACGCGCATTGATCTGCTCGAAGAACTGCTGCGCCTGAAGACCCTCCAGAAGTTCGCCGCCAGTAGC
GAGAAGCATCAAAACCAGATCACGCTGTTCGACGAGGCGGAGGTGGAAGCCGAGATCGATGCCTTGCGCGAGGCACTCCCGGACGACGCTGAACCCGACC
CGGATGAGACGCCGCGCACCTCGGGCAAGCGGCGTGAGCGGGGCTTCTCGGACACGCTGGCGCGCAGGCGCGTTGAGCTCACGCTCAGCGACGAGGAGAA
AGCCGGTGCCAGCAAGACCTTCTTCACCAAGGTCAAGGAGGAGCTTGAGTTCATCCCCGCTCAGTTGAGCGTGCTGGAGTACTGGCAGGAGAAGGCCGTG
TTCGAGCACGACGACGGGGAGGAGTCCCTAGTGGCGGCGCCCCGGCCGGTCCACCCGCTGGGCAAATGCATTGCCACCACCGCGCTGCTCGCCTACATCA
TCACCTCGAAGTACGCCGACGGTCTGCCGCTGTACCGACTGGAGAACATGCTGGCGCGGCTCGGGCATTCGGTCAGTCGCACCAGCATGGCGCACTGGAT
CATCCGCCTGGATGCGGTGTTCAGCCCGCTGATCAACCTCATGCGCGAGGCGCAGAACACCAGCGACTACCTCCAGGCCGATGAGACCCGCATGCAGGTC
CTCAAGGAGGACGGCAAGGTCGCCCAGTCCGACAAATGGATCTGGGTGACCCGGGGTGGGCCACCTGGCCGGCCAACGGTGCTGTTCGCGTACGACCCCT
CACGTGCGGGGAGCGTGCCCGTGCGCCTGCTCGATGACTTCAGCGGCATCCTGCAGGCCGATGGCTACTCCGGCTACGGCCAGGTGTGTCGGGACAACGC
CATCACCCGGATCGGGTGCTGGGATCATGCCCGTCGCAAGTTTGTCGAGGCCTCCAAGGCGGCGCCGCCCAAGAAGAAGGGCAAAGGCAAACGCCAGAGC
GCCAAGGCGGATGTGGCGCTGGGGGCGATCCAAGAGCTCTACGCCATTGAGCGCCGAATCAAGGATCTCGGCGATGATGAGCGCTATCGCATCCGCCAGG
CCGAGAGCCTGCCCCGGCTCCAGGCGTTGAAAACCTGGCTGGAAGACAACGCCGGCAAGGTCGTGAAGGGCTCACTGACCCGCAAGGCGATGGACTACAC
CCTGAACCAGTGGGACACCCTGGTGGGCTACTGCGAGCGTGGGGATCTACAGATCAGTAACGCCCTGGCCGAGAACGCCATCCGCCCGTTCGCGCTCGGT
CGCAAGGCATGGCTGTTCGCCGATACCACCCAGGGCGCACGCGCCAGCGCGAGCTGCTACTCACTAATCGAGACCGCCAAGGCCAATGGCCTGGACCCCT
CGGCCTACATCCACCATGTGCTCACGCACATCGGCGAGGCGGACACCGTCGAGAAGCTCGAAGCGCTACTGCCCTGGAATACGGGCCTGGAGCCGGCTCC
GAAAAAGAGCTGACGACCGCCTGCGGGGCAAGTGGGTCGATTTACGGGCGCTTAC
Protein section
ORF number : 3
ORF 1
Length (bp) | Length (aa) | Begin | End | Strand | Fusion ORF |
---|---|---|---|---|---|
303 bp | 100 aa | 87 | 389 | + | No |
AG : IS66 TnpA
ORF sequence :
MTSAELHHYWRQTLDAWTASGLSGAAFCKQHSLTYHQFVYWRRKLRGPGESPSRAGFARVAPVAHDDAADGLTVSLPGGVSITGLHAGNIELLGAVLRQL
Blast result :
ORF 2
Length (bp) | Length (aa) | Begin | End | Strand | Fusion ORF |
---|---|---|---|---|---|
369 bp | 123 aa | 389 | 757 | + | No |
AG : IS66 TnpB
ORF sequence :
MRNRSLRPSRQLPEIYLYRAPVDFRKQAHGLALIVEQELGHSPFTGALYAFTNRQRNKIKCLMWEDNGFVLYYKALAEERFKWPAPGDELMSLSGEQINW
LLDGYDITLLRGHKKLHYEALG
Blast result :
ORF 3
Length (bp) | Length (aa) | Begin | End | Strand | Fusion ORF |
---|---|---|---|---|---|
1611 bp | 536 aa | 803 | 2413 | + | No |
Chemistry : DDE
ORF sequence :
MKSTPDNAPPAPDLSGLSAAEMMAVIGDLQQQLASKEQAIRQRDTRIDLLEELLRLKTLQKFAASSEKHQNQITLFDEAEVEAEIDALREALPDDAEPDP
DETPRTSGKRRERGFSDTLARRRVELTLSDEEKAGASKTFFTKVKEELEFIPAQLSVLEYWQEKAVFEHDDGEESLVAAPRPVHPLGKCIATTALLAYII
TSKYADGLPLYRLENMLARLGHSVSRTSMAHWIIRLDAVFSPLINLMREAQNTSDYLQADETRMQVLKEDGKVAQSDKWIWVTRGGPPGRPTVLFAYDPS
RAGSVPVRLLDDFSGILQADGYSGYGQVCRDNAITRIGCWDHARRKFVEASKAAPPKKKGKGKRQSAKADVALGAIQELYAIERRIKDLGDDERYRIRQA
ESLPRLQALKTWLEDNAGKVVKGSLTRKAMDYTLNQWDTLVGYCERGDLQISNALAENAIRPFALGRKAWLFADTTQGARASASCYSLIETAKANGLDPS
AYIHHVLTHIGEADTVEKLEALLPWNTGLEPAPKKS
Blast result :
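The Begin and End columns in the ORF tables are 1-based, inclusive coordinates on the element, so the nucleotide length of each ORF is `End - Begin + 1`. The sketch below recomputes the lengths and the overlap between neighbouring ORFs; the `Orf` type and function names are illustrative assumptions, and only the coordinates come from the record.

```python
from typing import NamedTuple


class Orf(NamedTuple):
    name: str
    begin: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


# Coordinates copied from the ORF tables in this record.
ORFS = [
    Orf("ORF 1", 87, 389),
    Orf("ORF 2", 389, 757),
    Orf("ORF 3", 803, 2413),
]


def span_bp(orf: Orf) -> int:
    """Nucleotide length of an ORF given inclusive coordinates."""
    return orf.end - orf.begin + 1


def overlap_bp(a: Orf, b: Orf) -> int:
    """Number of positions shared by two ORFs (0 if they do not overlap)."""
    return max(0, min(a.end, b.end) - max(a.begin, b.begin) + 1)


for orf in ORFS:
    print(orf.name, span_bp(orf), "bp")

for first, second in zip(ORFS, ORFS[1:]):
    print(first.name, "/", second.name, "overlap:", overlap_bp(first, second), "bp")
```

With these coordinates, ORF 1 and ORF 2 share a single position (389), while ORF 2 and ORF 3 are separated by a short gap.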
Comments
The ISNarch4 transposase shares 82% amino acid similarity with that of ISAeme5.
References
[1] Sarah Sonbol (2020) Direct submission.
[2] Xue, Q. (2020) Direct GenBank submission.