Molecular Development - Rubella Genome: Difference between revisions

From Embryology
(Created page with " LOCUS RUBCG 9755 bp ss-RNA VRL 08-MAR-1996DEFINITION Rubella virus complete genome encoding nonstructural protein, capsid protein, gly...")
 
No edit summary
Line 1: Line 1:
 
LOCUS      RUBCG        9755 bp ss-RNA            VRL      08-MAR-1996<br>DEFINITION Rubella virus complete genome encoding nonstructural protein,<br>           capsid protein, glycoproteins E1 and E2, complete cds.<br>         ACCESSION  M15240 M18901 M32735<br>NID         g333971<br>VERSION     M15240.1  GI:333971<br>KEYWORDS   C gene; capsid protein; glycoprotein; glycoprotein E1; glycoprotein<br>           E2; haemagglutinin; nonstructural protein.<br>SOURCE      .<br> ORGANISM  Rubella virus<br>               Viruses; ssRNA positive-strand viruses, no DNA stage; Togaviridae;<br>           Rubivirus.<br>REFERENCE  1  (bases 8155 to 9754)<br> AUTHORS  Frey,T.K., Marr,L.D., Hemphill,M.L. and Dominguez,G.<br> TITLE    Molecular cloning and sequencing of the region of the rubella virus<br>           genome coding for glycoprotein E1<br> JOURNAL  Virology 154 (1), 228-232 (1986)<br> MEDLINE  86317717<br>   REFERENCE  2  (bases 5917 to 9754)<br> AUTHORS  Frey,T.K. and Marr,L.D.<br> JOURNAL  Unpublished (1987)<br>REFERENCE  3  (bases 5247 to 8366)<br> AUTHORS  Frey,T.K. and Marr,L.D.<br> TITLE    Sequence of the region coding for virion proteins C and E2 and the<br>           carboxy terminus of the nonstructural proteins of rubella virus:<br>           comparison with alphaviruses<br> JOURNAL  Gene 62 (1), 85-99 (1988)<br> MEDLINE  88226020<br>   REFERENCE  4  (bases 1 to 9755)<br> AUTHORS  Dominguez,G., Wang,C.Y. and Frey,T.K.<br> TITLE    Sequence of the genome RNA of rubella virus: evidence for genetic<br>           rearrangement during togavirus evolution<br> JOURNAL  Virology 177 (1), 225-238 (1990)<br> MEDLINE  90281585<br>   COMMENT    [2]  revises [1].<br>           Draft entry and computer-readable copy of sequence in [2] kindly<br>           provided by T.K.Frey, 01-JUN-1987.<br>           Draft entry and computer-readable sequence for [4] kindly submitted<br>           by G.Dominguez, 09-MAR-1990, for release after publication.<br>           Glycoprotein E1 contains the viral hemagglutinin activity. Multiple<br>           copies of the C protein comprise the nucleocapsid.<br>         FEATURES            Location/Qualifiers<br>     source          1..9755<br>                     /organism="Rubella virus"<br>                     /note="other clones: pRUB1010[1012, 1002, 1006, 1015,<br>                     1001]"<br>                     /db_xref="taxon:11041"<br>                     /clone="pRUB1025"<br>     CDS            39..6656<br>                     /note="nonstructural polyprotein precursor"<br>                     /codon_start=1<br>                     /protein_id="AAA88528.1"<br>                     /db_xref="PID:g333972"<br>                     /db_xref="GI:333972"<br>                     /translation="MEKLLDEVLAPGGPYNLTVGSWVRDHVRSIVEGAWEVRDVVTAA<br>                     QKRAIVAVIPRPVFTQMQVSDHPALHAISRYTRRHWIEWGPKEALHVLIDPSPGLLRE<br>                     VARVERRWVALCLHRTARKLATALAETASEAWHADYVCALRGAPSGPFYVHPEDVPHG<br>                     GRAVADRCLLYYTPMQMCELMRTIDATLLVAVDLWPVALAAHVGDDWDDLGIAWHLDH<br>                     DGGCPADCRGAGAGPTPGYTRPCTTRIYQVLPDTAHPGRLYRCGPRLWTRDCAVAELS<br>                     WEVAQHCGHQARVRAVRCTLPIRHVRSLQPSARVRLPDLVHLAEVGRWRWFSLPRPVF<br>                     QRMLSYCKTLSPDAYYSERVFKFKNALCHSITLAGNVLQEGWKGTCAEEDALCAYVAF<br>                     RAWQSNARLAGIMKGAKCAADSLSVAGWLDTIWDAIKRFLGSVPLAERMEEWEQDAAV<br>                     AAFDRGPLEDGGRHLDTVQPPKSPPRPEIAATWIVHAASEDRHCACAPRCDVPRERPS<br>                     APAGQPDDEALIPPWLFAERRALRCREWDFEALRARADTAAAPAPPAPRPARYPTVLY<br>                     RHPAHHGPWLTLDEPGEADAALVLCDPLGQPLRGPERHFAAGAHMCAQARGLQAFVRV<br>                     VPPPERPWADGGARAWAKFFRGCAWAQRLLGEPAVMHLPYTDGDVPQLIALALRTLAQ<br>                     QGAALALSVRDLPGGAAFDANAVTAAVRAGPRQSAAASPPPGDPPPPRRARRSQRHSD<br>                     ARGTPPPAPARDPPPPAPSPPAPPRAGDPVPPIPAGPADRARDAELEVACEPSGPPTS<br>                     TRADPDSDIVESYARAAGPVHLRVRDIMDPPPGCKVVVNAANEGLLAGSGVCGAIFAN<br>                     ATAALAANCRRLAPCPTGEAVATPGHGCGYTHIIHAVAPRRPRDPAALEEGEALLERA<br>                     YRSIVALAAARRWACVACPLLGAGVYGWSAAESLRAALAATRTEPVERVSLHICHPDR<br>                     ATLTHASVLVGAGLAARRVSPPPTEPLASCPAGDPGRPAQRSASPPATPLGDATAPEP<br>                     RGCQGCELCRYTRVTNDRAYVNLWLERDRGATSWAMRIPEVVVYGPEHLATHFPLNHY<br>                     SVLKPAEVRPPRGMCGSDMWRCRGWHGMPQVRCTPSNAHAALCRTGVPPRASTRGGEL<br>                     DPNTCWLRAAANVAQAARACGAYTSAGCPKCAYGRALSEARTHEDFAALSQRWSASHA<br>                     DASPDGTGDPLDPLMETVGCACSRVWVGSEHEAPPDHLLVSLHRAPNGPWGVVLEVRA<br>                     RPEGGNPTGHFVCAVGGGPRRVSDRPHLWLAVPLSRGGGTCAATDEGLAQAYYDDLEV<br>                     RRLGDDAMARAALASVQRPRKGPYNIRVWNMAAGAGKTTRILAAFTREDLYVCPTNAL<br>                     LHEIQAKLRARDIDIKNAATYERRLTKPLAAYRRIYIDEAFTLGGEYCAFVASQTTAE<br>                     VICVGDRDQCGPHYANNCRTPVPDRWPTERSRHTWRFPDCWAARLRAGLDYDIEGERT<br>                     GTFACNLWDGRQVDLHLAFSRETVRRLHEAGIRAYTVREAQGMSVGTACIHVGRDGTD<br>                     VALALTRDLAIVSLTRASDALYLHELEDGSLRAAGLSAFLDAGALAELKEVPAGIDRV<br>                     VAVEQAPPPLPPADGIPEAQDVPPFCPRTLEELVFGRAGHPHYADLNRVTEGEREVRY<br>                     MRISRHLLNKNHTEMPGTERVLSAVCAVRRYRAGEDGSTLRTAVARQHPRPFRQIPPP<br>                     RVTAGVAQEWRMTYLRERIDLTDVYTQMGVAARELTDRYARRYPEIFAGMCTAQSLSV<br>                     PAFLKATLKCVDAALGPRDTEDCHAAQGKAGLEIRAWAKEWVQVMSPHFRAIQKIIMR<br>                     ALRPQFLVAAGHTEPEVDAWWQAHYTTNAIEVDFTEFDMNQTLATRDVELEISAALLG<br>                     LPCAEDYRALRAGSYCTLRELGSTETGCERTSGEPATLLHNTTVAMCMAMRMVPKGVR<br>                     WAGIFQGDDMVIFLPEGARSAALKWTPAEVGLFGFHIPVKHVSTPTPSFCGHVGTAAG<br>                     LFHDVMHQAIKVLCRRFDPDVLEEQQVALLDRLRGVYAALPDTVAANAAYYDYSAERV<br>                     LAIVRELTAYAGARPRPPGHHRRARGDSDPLRARQSPRRRLTPLYVGPLILPTLTRSS<br>                     PTVVSPHLVGTQLLPFGRAPGCPNGFYYPHHHGGPPEGPRGTIPRPARGTRRRRLAVA<br>                     PAAAAATARLQHLRR"<br>     mRNA            6428..9755<br>                     /note="subgenomic RNA"<br>     mat_peptide    6505..7404<br>                     /note="capsid protein (C)"<br>     CDS            6505..9696<br>                     /note="structural polyprotein precursor"<br>                     /codon_start=1<br>                     /protein_id="AAA88529.1"<br>                     /db_xref="PID:g333973"<br>                     /db_xref="GI:333973"<br>                     /translation="MASTTPITMEDLQKALEAQSRALRAELAAGASQSRRPRPPRQRD<br><br>                     SSTSGDDSGRDSGGPRRRRGNRGRGQRRDWSRAPPPPEERQETRSQTPAPKPSRAPPQ<br><br>                     QPQPPRMQTGRGGSAPRPELGPPTNPFQAAVARGLRPPLHDPDTEAPTEACVTSWLWS<br><br>                     EGEGAVFYRVDLHFTNLGTPPLDEDGRWDPALMYNPCGPEPPAHVVRAYNQPAGDVRG<br><br>                     VWGKGERTYAEQDFRVGGTRWHRLLRMPVRGLDGDSAPLPPHTTERIETRSARHPWRI<br><br>                     RFGAPQAFLAGLLLATVAVGTARAGLQPRADMAAPPTLPQPPCAHGQHYGHHHHQLPF<br><br>                     LGHDGHHGGTLRVGQHYRNASDVLPGHWLQGGWGCYNLSDWHQGTHVCHTKHMDFWCV<br><br>                     EHDRPPPATPTPLTTAANSTTAATPATAPAPCHAGLNDSCGGFLSGCGPMRLRHGADT<br><br>                     RCGRLICGLSTTAQYPPTRFGCAMRWGLPPWELVVLTARPEDGWTCRGVPAHPGARCP<br><br>                     ELVSPMGRATCSPASALWLATANALSLDHALAAFVLLVPWVLIFMVCRRACRRRGAAA<br><br>                     ALTAVVLQGYNPPAYGEEAFTYLCTAPGCATQAPVPVRLAGVRFESKIVDGGCFAPWD<br><br>                     LEATGACICEIPTDVSCEGLGAWVPAAPCARIWNGTQRACTFWAVNAYSSGGYAQLAS<br><br>                     YFNPGGSYYKQYHPTACEVEPAFGHSDAACWGFPTDTVMSVFALASYVQHPHKTVRVK<br><br>                     FHTETRTVWQLSVAGVSCNVTTEHPFCNTPHGQLEVQVPPDPGDLVEYIMNYTGNQQS<br><br>                     RWGLGSPNCHGPDWASPVCQRHSPDCSRLVGATPERPRLRLVDADDPLLRTAPGPGEV<br><br>                     WVTPVIGSQARKCGLHIRAGPYGHATVEMPEWIHAHTTSDPWHPPGPLGLKFKTVRPV<br><br>                     ALPRTLAPPRNVRVTGCYQCGTPALVEGLAPGGGNCHLTVNGEDLGAVPPGKFVTAAL<br><br>                     LNTPPPYQVSCGGESDRATARVIDPAAQSFTGVVYGTHTTAVSETRQTWAEWAAAHWW<br><br>                     QLTLGAICALPLAGLLACCAKCLYYLRGAIAPR"<br>     mat_peptide    7405..8250<br>                     /note="glycoprotein E2"<br>     mat_peptide    8251..9693<br>                     /note="glycoprotein E1"<br>BASE COUNT    1457 a  3781 c  3007 g  1510 t<br>         ORIGIN     <br>        1 atggaagcta tcggacctcg cttaggactc ccattcccat ggagaaactc ctagatgagg<br>       61 ttcttgcccc cggtgggcct tataacttaa ccgtcggcag ttgggtaaga gaccacgtcc<br>     121 gatcaattgt cgagggcgcg tgggaagtgc gcgatgttgt taccgctgcc caaaagcggg<br>     181 ccatcgtagc cgtgataccc agacctgtgt tcacgcagat gcaggtcagt gatcacccag<br>     241 cactccacgc aatttcgcgg tatacccgcc gccattggat cgagtggggc cctaaagaag<br>     301 ccctacacgt cctcatcgac ccaagcccgg gcctgctccg cgaggtcgct cgcgttgagc<br>     361 gccgctgggt cgcactgtgc ctccacagga cggcacgcaa actcgccacc gccctggccg<br>     421 agacggccag cgaggcgtgg cacgctgact acgtgtgcgc gctgcgtggc gcaccgagcg<br>     481 gccccttcta cgtccaccct gaggacgtcc cgcacggcgg tcgcgccgtg gcggacagat<br>     541 gcttgctcta ctacacaccc atgcagatgt gcgagctgat gcgtaccatt gacgccaccc<br>     601 tgctcgtggc ggttgacttg tggccggtcg cccttgcggc ccacgtcggc gacgactggg<br>     661 acgacctggg cattgcctgg catctcgacc atgacggcgg ttgccccgcc gattgccgcg<br>     721 gagccggcgc tgggcccacg cccggctaca cccgcccctg caccacacgc atctaccaag<br>     781 tcctgccgga caccgcccac cccgggcgcc tctaccggtg cgggccccgc ctgtggacgc<br>     841 gcgattgcgc cgtggccgaa ctctcatggg aggttgccca acactgcggg caccaggcgc<br>     901 gcgtgcgcgc cgtgcgatgc accctcccta tccgccacgt gcgcagcctc caacccagcg<br>     961 cgcgggtccg actcccggac ctcgtccatc tcgccgaggt gggccggtgg cggtggttca<br>     1021 gcctcccccg ccccgtgttc cagcgcatgc tgtcctactg caagaccctg agccccgacg<br>     1081 cgtactacag cgagcgcgtg ttcaagttca agaacgccct gtgccacagc atcacgctcg<br>     1141 cgggcaatgt gctgcaagag gggtggaagg gcacgtgcgc cgaggaagac gcgctgtgcg<br>     1201 catacgtagc cttccgcgcg tggcagtcta acgccaggtt ggcggggatt atgaaaggcg<br>     1261 cgaagtgcgc cgccgactct ttgagcgtgg ccggctggct ggacaccatt tgggacgcca<br>     1321 ttaagcggtt cctcggtagc gtgcccctcg ccgagcgcat ggaggagtgg gaacaggacg<br>     1381 ccgcggtcgc cgccttcgac cgcggccccc tcgaggacgg cgggcgccac ttggacaccg<br>     1441 tgcaaccccc aaaatcgccg ccccgccctg agatcgccgc gacctggatc gtccacgcag<br>     1501 ccagcgaaga ccgccattgc gcgtgcgctc cccgctgcga cgtcccgcgc gaacgtcctt<br>     1561 ccgcgcccgc cggccagccg gatgacgagg cgctcatccc gccgtggctg ttcgccgagc<br>     1621 gccgtgccct ccgctgccgc gagtgggatt tcgaggctct ccgcgcgcgc gccgatacgg<br>     1681 cggccgcgcc cgccccgccg gctccacgcc ccgcgcggta ccccaccgtg ctctaccgcc<br>     1741 accccgccca ccacggcccg tggctcaccc ttgacgagcc gggcgaggct gacgcggccc<br>     1801 tggtcttatg cgacccactt ggccagccgc tccggggccc tgaacgccac ttcgccgccg<br>     1861 gcgcgcatat gtgcgcgcag gcgcgggggc tccaggcttt tgtccgtgtc gtgcctccac<br>     1921 ccgagcgccc ctgggccgac gggggcgcca gagcgtgggc gaagttcttc cgcggctgcg<br>     1981 cctgggcgca gcgcttgctc ggcgagccag cagttatgca cctcccatac accgatggcg<br>     2041 acgtgccaca gctgatcgca ctggctttgc gcacgctggc ccaacagggg gccgccttgg<br>     2101 cactctcggt gcgtgacctg cccgggggtg cagcgttcga cgcaaacgcg gtcaccgccg<br>     2161 ccgtgcgcgc tggcccccgc cagtccgcgg ccgcgtcacc gccacccggc gaccccccgc<br>     2221 cgccgcgccg cgcacggcga tcgcaacggc actcggacgc tcgcggcact ccgccccccg<br>     2281 cgcctgcgcg cgacccgccg ccgcccgccc ccagcccgcc cgcgccaccc cgcgctggtg<br>     2341 acccggtccc tcccattccc gcggggccgg cggatcgcgc gcgtgacgcc gagctggagg<br>     2401 tcgcctgcga gccgagcggc ccccccacgt caaccagggc agacccagac agcgacatcg<br>     2461 ttgaaagtta cgcccgcgcc gccggacccg tgcacctccg agtccgcgac atcatggacc<br>     2521 caccgcccgg ctgcaaggtc gtggtcaacg ccgccaacga ggggctactg gccggctctg<br>     2581 gcgtgtgcgg tgccatcttt gccaacgcca cggcggccct cgctgcaaac tgccggcgcc<br>     2641 tcgccccatg ccccaccggc gaggcagtgg cgacacccgg ccacggctgc gggtacaccc<br>     2701 acatcatcca cgccgtcgcg ccgcggcgtc ctcgggaccc cgccgccctc gaggagggcg<br>     2761 aagcgctgct cgagcgcgcc taccgcagca tcgtcgcgct agccgccgcg cgtcggtggg<br>     2821 cgtgtgtcgc gtgccccctc ctcggcgctg gcgtctacgg ctggtctgct gcggagtccc<br>     2881 tccgagccgc gctcgcggct acgcgcaccg agcccgtcga gcgcgtgagc ctgcacatct<br>     2941 gccaccccga ccgcgccacg ctgacgcacg cctccgtgct cgtcggcgcg gggctcgctg<br>     3001 ccaggcgcgt cagtcctcct ccgaccgagc ccctcgcatc ttgccccgcc ggtgacccgg<br>     3061 gccgaccggc tcagcgcagc gcgtcgcccc cagcgacccc ccttggggat gccaccgcgc<br>     3121 ccgagccccg cggatgccag gggtgcgaac tctgccggta cacgcgcgtc accaatgacc<br>     3181 gcgcctatgt caacctgtgg ctcgagcgcg accgcggcgc caccagctgg gccatgcgca<br>     3241 ttcccgaggt ggttgtctac gggccggagc acctcgccac gcattttcca ttaaaccact<br>     3301 acagtgtgct caagcccgcg gaggtcaggc ccccgcgagg catgtgcggg agtgacatgt<br>     3361 ggcgctgccg cggctggcat ggcatgccgc aggtgcggtg caccccctcc aacgctcacg<br>     3421 ccgccctgtg ccgcacaggc gtgccccctc gggcgagcac gcgaggcggc gagctagacc<br>     3481 caaacacctg ctggctccgc gccgccgcca acgttgcgca ggctgcgcgc gcctgcggcg<br>     3541 cctacacgag tgccgggtgc cccaagtgcg cctacggccg cgccctgagc gaagcccgca<br>     3601 ctcatgagga cttcgccgcg ctgagccagc ggtggagcgc gagccacgcc gatgcctccc<br>     3661 ctgacggcac cggagatccc ctcgaccccc tgatggagac cgtgggatgc gcctgttcgc<br>     3721 gcgtgtgggt cggctccgag catgaggccc cgcccgacca cctcctggtg tcccttcacc<br>     3781 gtgccccaaa tggtccgtgg ggcgtagtgc tcgaggtgcg tgcgcgcccc gaggggggca<br>     3841 accccaccgg ccacttcgtc tgcgcggtcg gcggcggccc acgccgcgtc tcggaccgcc<br>     3901 cccacctctg gcttgcggtc cccctgtctc ggggcggtgg cacctgtgcc gcgaccgacg<br>     3961 aggggctggc ccaggcgtac tacgacgacc tcgaggtgcg ccgcctcggg gatgacgcca<br>     4021 tggcccgggc ggccctcgca tcagtccaac gccctcgcaa aggcccttac aatatcaggg<br>     4081 tatggaacat ggccgcaggc gctggcaaga ctacccgcat cctcgctgcc ttcacgcgcg<br>     4141 aagaccttta cgtctgcccc accaatgcgc tcctgcacga gatccaggcc aaactccgcg<br>     4201 cgcgcgatat cgacatcaag aacgccgcca cctacgagcg ccggctgacg aaaccgctcg<br>     4261 ccgcctaccg ccgcatctac atcgatgagg cgttcactct cggcggcgag tactgcgcgt<br>     4321 tcgttgccag ccaaaccacc gcggaggtga tctgcgtcgg tgatcgggac cagtgcggcc<br>     4381 cacactacgc caataactgc cgcacccccg tccctgaccg ctggcctacc gagcgctcgc<br>     4441 gccacacttg gcgcttcccc gactgctggg cggcccgcct gcgcgcgggg ctcgattatg<br>     4501 acatcgaggg cgagcgcacc ggcaccttcg cctgcaacct ttgggacggc cgccaggtcg<br>     4561 accttcacct cgccttctcg cgcgaaaccg tgcgccgcct tcacgaggct ggcatacgcg<br>     4621 catacaccgt gcgcgaggcc cagggtatga gcgtcggcac cgcctgcatc catgtaggca<br>     4681 gagacggcac ggacgttgcc ctggcgctga cacgcgacct cgccatcgtc agcctgaccc<br>     4741 gggcctccga cgcactctac ctccacgagc tcgaggacgg ctcactgcgc gctgcggggc<br>     4801 tcagcgcgtt cctcgacgcc ggggcactgg cggagctcaa ggaggttccc gctggcattg<br>     4861 accgcgttgt cgccgtcgag caggcaccac caccgttgcc gcccgccgac ggcatccccg<br>     4921 aggcccaaga cgtgccgccc ttctgccccc gcactctgga ggagctcgtc ttcggccgtg<br>     4981 ccggccaccc ccattacgcg gacctcaacc gcgtgactga gggcgaacga gaagtgcggt<br>     5041 acatgcgcat ctcgcgtcac ctgctcaaca agaatcacac cgagatgccc ggaacggaac<br>     5101 gcgttctcag tgccgtttgc gccgtgcggc gctaccgcgc gggcgaggat gggtcgaccc<br>     5161 tccgcactgc tgtggcccgc cagcacccgc gcccttttcg ccagatccca cccccgcgcg<br>     5221 tcactgctgg ggtcgcccag gagtggcgca tgacgtactt gcgggaacgg atcgacctca<br>     5281 ctgatgtcta cacgcagatg ggcgtggccg cgcgggagct caccgaccgc tacgcgcgcc<br>     5341 gctatcctga gatcttcgcc ggcatgtgta ccgcccagag cctgagcgtc cccgccttcc<br>     5401 tcaaagccac cttgaagtgc gtagacgccg ccctcggccc cagggacacc gaggactgcc<br>     5461 acgccgctca ggggaaagcc ggccttgaga tccgggcgtg ggccaaggag tgggttcagg<br>     5521 ttatgtcccc gcatttccgc gcgatccaga agatcatcat gcgcgccttg cgcccgcaat<br>     5581 tccttgtggc cgctggccat acggagcccg aggtcgatgc gtggtggcag gcccattaca<br>     5641 ccaccaacgc catcgaggtc gacttcactg agttcgacat gaaccagacc ctcgctactc<br>     5701 gggacgtcga gctcgagatt agcgccgctc tcttgggcct cccttgcgcc gaagactacc<br>     5761 gcgcgctccg cgccggcagc tactgcaccc tgcgcgaact gggctccact gagaccggct<br>     5821 gcgagcgcac aagcggcgag cccgccacgc tgctgcacaa caccaccgtg gccatgtgca<br>     5881 tggccatgcg catggtcccc aaaggcgtgc gctgggccgg gattttccag ggtgacgata<br>     5941 tggtcatctt cctccccgag ggcgcgcgca gcgcggcact caagtggacc cccgccgagg<br>     6001 tgggcttgtt tggcttccac atcccggtga agcacgtgag cacccctacc cccagcttct<br>     6061 gcgggcacgt cggcaccgcg gccggcctct tccatgatgt catgcaccag gcgatcaagg<br>     6121 tgctttgccg ccgtttcgac ccagacgtgc ttgaagaaca gcaggtggcc ctcctcgacc<br>     6181 gcctccgggg ggtctacgcg gctctgcctg acaccgttgc cgccaatgct gcgtactacg<br>     6241 actacagcgc ggagcgcgtc ctcgctatcg tgcgcgaact taccgcgtac gcgggggcgc<br>     6301 ggcctcgacc acccggccac catcggcgcg ctcgaggaga ttcagacccc ctacgcgcgc<br>     6361 gccaatctcc acgacgccga ctaacgcccc tgtacgtggg gcctttaatc ttacctactc<br>     6421 taaccaggtc atcacccacc gttgtttcgc cgcatctggt gggtacccaa cttttgccat<br>     6481 tcgggagagc cccagggtgc ccgaatggct tctactaccc ccatcaccat ggaggacctc<br>     6541 cagaaggccc tcgaggcaca atcccgcgcc ctgcgcgcgg aactcgccgc cggcgcctcg<br>     6601 cagtcgcgcc ggccgcggcc gccgcgacag cgcgactcca gcacctccgg agatgactcc<br>     6661 ggccgtgact ccggagggcc ccgccgccgc cgcggcaacc ggggccgtgg ccagcgcagg<br>     6721 gactggtcca gggccccgcc ccccccggag gagcggcaag aaactcgctc ccagactccg<br>     6781 gccccgaagc catcgcgggc gccgccacaa cagcctcaac ccccgcgcat gcaaaccggg<br>     6841 cgtgggggct ctgccccgcg ccccgagctg gggccaccga ccaacccgtt ccaagcagcc<br>     6901 gtggcgcgtg gcctgcgccc gcctctccac gaccctgaca ccgaggcacc caccgaggcc<br>     6961 tgcgtgacct cgtggctttg gagcgagggc gaaggcgcgg tcttttaccg cgtcgacctg<br>     7021 catttcacca acctgggcac ccccccactc gacgaggacg gccgctggga ccctgcgctc<br>     7081 atgtacaacc cttgcgggcc cgagccgccc gctcacgtcg tccgcgcgta caatcaacct<br>     7141 gccggcgacg tcaggggcgt ttggggtaaa ggcgagcgca cctacgccga gcaggacttc<br>     7201 cgcgtcggcg gcacgcgctg gcaccgactg ctgcgcatgc cagtgcgcgg cctcgacggc<br>     7261 gacagcgccc cgcttccccc ccacaccacc gagcgcattg agacccgctc ggcgcgccat<br>     7321 ccttggcgca tccgcttcgg tgccccccag gccttccttg ccgggctctt gctcgccacg<br>     7381 gtcgccgttg gcaccgcgcg cgccgggctc cagccccgcg ctgatatggc ggcacctcct<br>     7441 acgctgccgc agcccccctg tgcgcacggg cagcattacg gccaccacca ccatcagctg<br>     7501 ccgttcctcg ggcacgacgg ccatcatggc ggcaccttgc gcgtcggcca gcattaccga<br>     7561 aacgccagcg acgtgctgcc cggccactgg ctccaaggcg gctggggttg ctacaacctg<br>     7621 agcgactggc accagggcac tcatgtctgt cataccaagc acatggactt ctggtgtgtg<br>     7681 gagcacgacc gaccgccgcc cgcgaccccg acgcctctca ccaccgcggc gaactccacg<br>     7741 accgccgcca cccccgccac tgcgccggcc ccctgccacg ccggcctcaa tgacagctgc<br>     7801 ggcggcttct tgtctgggtg cgggccgatg cgcctgcgcc acggcgctga cacccggtgc<br>     7861 ggtcggttga tctgcgggct gtccaccacc gcccagtacc cgcctacccg gtttggctgc<br>     7921 gctatgcggt ggggccttcc cccctgggaa ctggtcgtcc ttaccgcccg ccccgaagac<br>     7981 ggctggactt gccgcggcgt gcccgcccat ccaggcgccc gctgccccga actggtgagc<br>     8041 cccatgggac gcgcgacttg ctccccagcc tcggccctct ggctcgccac agcgaacgcg<br>     8101 ctgtctcttg atcacgccct cgcggccttc gtcctgctgg tcccgtgggt cctgatattt<br>     8161 atggtgtgcc gccgcgcctg tcgccgccgc ggcgccgccg ccgccctcac cgcggtcgtc<br>     8221 ctgcaggggt acaacccccc cgcctatggc gaggaggctt tcacctacct ctgcactgca<br>     8281 ccggggtgcg ccactcaagc acctgtcccc gtgcgcctcg ctggcgtccg ttttgagtcc<br>     8341 aagattgtgg acggcggctg ctttgcccca tgggacctcg aggccactgg agcctgcatt<br>     8401 tgcgagatcc ccactgatgt ctcgtgcgag ggcttggggg cctgggtacc cgcagcccct<br>     8461 tgcgcgcgca tctggaatgg cacacagcgc gcgtgcacct tctgggctgt caacgcctac<br>     8521 tcctctggcg ggtacgcgca gctggcctct tacttcaacc ctggcggcag ctactacaag<br>     8581 cagtaccacc ctaccgcgtg cgaggttgaa cctgccttcg gacacagcga cgcggcctgc<br>     8641 tggggcttcc ccaccgacac cgtgatgagc gtgttcgccc ttgctagcta cgtccagcac<br>     8701 cctcacaaga ccgtccgggt caagttccat acagagacca ggaccgtctg gcaactctcc<br>     8761 gttgccggcg tgtcgtgcaa cgtcaccact gaacacccgt tctgcaacac gccgcacgga<br>     8821 caactcgagg tccaggtccc gcccgacccc ggggacctgg ttgagtacat tatgaattac<br>     8881 accggcaatc agcagtcccg gtggggcctc gggagcccga attgccacgg ccccgattgg<br>     8941 gcctccccgg tttgccaacg ccattcccct gactgctcgc ggcttgtggg ggccacgcca<br>     9001 gagcgccccc ggctgcgcct ggtcgacgcc gacgaccccc tgctgcgcac tgcccctgga<br>     9061 cccggcgagg tgtgggtcac gcctgtcata ggctctcagg cgcgcaagtg cggactccac<br>     9121 atacgcgctg gaccgtacgg ccatgctacc gtcgaaatgc ccgagtggat ccacgcccac<br>     9181 accaccagcg acccctggca tccaccgggc cccttggggc tgaagttcaa gacagttcgc<br>     9241 ccggtggccc tgccacgcac gttagcgcca ccccgcaatg tgcgtgtgac cgggtgctac<br>     9301 cagtgcggta cccccgcgct ggtggaaggc cttgcccccg ggggaggcaa ttgccatctc<br>     9361 accgtcaatg gcgaggacct cggcgccgtc ccccctggga agttcgtcac cgccgccctc<br>     9421 ctcaacaccc ccccgcccta ccaagtcagc tgcgggggcg agagcgatcg cgcgaccgcg<br>     9481 cgggtcatcg accccgccgc gcaatcgttt accggcgtgg tgtatggcac acacaccact<br>     9541 gctgtgtcgg agacccggca gacctgggcg gagtgggctg ctgcccattg gtggcagctc<br>     9601 actctgggcg ccatttgcgc cctcccactc gctggcttac tcgcttgctg tgccaaatgc<br>     9661 ttgtactact tgcgcggcgc tatagcgcct cgctagtggg cccccgcgcg aaacccgcac<br>     9721 taggccacta gatccccgca cctgttgctg tatag<br>//
LOCUS      RUBCG        9755 bp ss-RNA            VRL      08-MAR-1996DEFINITION Rubella virus complete genome encoding nonstructural protein,            capsid protein, glycoproteins E1 and E2, complete cds.        ACCESSION  M15240 M18901 M32735NID         g333971VERSION     M15240.1  GI:333971KEYWORDS   C gene; capsid protein; glycoprotein; glycoprotein E1; glycoprotein            E2; haemagglutinin; nonstructural protein.SOURCE      .  ORGANISM  Rubella virus              Viruses; ssRNA positive-strand viruses, no DNA stage; Togaviridae;            Rubivirus.REFERENCE  1  (bases 8155 to 9754)  AUTHORS  Frey,T.K., Marr,L.D., Hemphill,M.L. and Dominguez,G.  TITLE    Molecular cloning and sequencing of the region of the rubella virus            genome coding for glycoprotein E1  JOURNAL  Virology 154 (1), 228-232 (1986)  MEDLINE  86317717  REFERENCE  2  (bases 5917 to 9754)  AUTHORS  Frey,T.K. and Marr,L.D.  JOURNAL  Unpublished (1987)REFERENCE  3  (bases 5247 to 8366)  AUTHORS  Frey,T.K. and Marr,L.D.  TITLE    Sequence of the region coding for virion proteins C and E2 and the            carboxy terminus of the nonstructural proteins of rubella virus:            comparison with alphaviruses  JOURNAL  Gene 62 (1), 85-99 (1988)  MEDLINE  88226020  REFERENCE  4  (bases 1 to 9755)  AUTHORS  Dominguez,G., Wang,C.Y. and Frey,T.K.  TITLE    Sequence of the genome RNA of rubella virus: evidence for genetic            rearrangement during togavirus evolution  JOURNAL  Virology 177 (1), 225-238 (1990)  MEDLINE  90281585  COMMENT    [2]  revises [1].            Draft entry and computer-readable copy of sequence in [2] kindly            provided by T.K.Frey, 01-JUN-1987.            Draft entry and computer-readable sequence for [4] kindly submitted            by G.Dominguez, 09-MAR-1990, for release after publication.            Glycoprotein E1 contains the viral hemagglutinin activity. Multiple            copies of the C protein comprise the nucleocapsid.        FEATURES            Location/Qualifiers    source          1..9755                    /organism="Rubella virus"                    /note="other clones: pRUB1010[1012, 1002, 1006, 1015,                    1001]"                    /db_xref="taxon:11041"                    /clone="pRUB1025"    CDS            39..6656                    /note="nonstructural polyprotein precursor"                    /codon_start=1                    /protein_id="AAA88528.1"                    /db_xref="PID:g333972"                    /db_xref="GI:333972"                    /translation="MEKLLDEVLAPGGPYNLTVGSWVRDHVRSIVEGAWEVRDVVTAA                    QKRAIVAVIPRPVFTQMQVSDHPALHAISRYTRRHWIEWGPKEALHVLIDPSPGLLRE                    VARVERRWVALCLHRTARKLATALAETASEAWHADYVCALRGAPSGPFYVHPEDVPHG                    GRAVADRCLLYYTPMQMCELMRTIDATLLVAVDLWPVALAAHVGDDWDDLGIAWHLDH                    DGGCPADCRGAGAGPTPGYTRPCTTRIYQVLPDTAHPGRLYRCGPRLWTRDCAVAELS                    WEVAQHCGHQARVRAVRCTLPIRHVRSLQPSARVRLPDLVHLAEVGRWRWFSLPRPVF                    QRMLSYCKTLSPDAYYSERVFKFKNALCHSITLAGNVLQEGWKGTCAEEDALCAYVAF                    RAWQSNARLAGIMKGAKCAADSLSVAGWLDTIWDAIKRFLGSVPLAERMEEWEQDAAV                    AAFDRGPLEDGGRHLDTVQPPKSPPRPEIAATWIVHAASEDRHCACAPRCDVPRERPS                    APAGQPDDEALIPPWLFAERRALRCREWDFEALRARADTAAAPAPPAPRPARYPTVLY                    RHPAHHGPWLTLDEPGEADAALVLCDPLGQPLRGPERHFAAGAHMCAQARGLQAFVRV                    VPPPERPWADGGARAWAKFFRGCAWAQRLLGEPAVMHLPYTDGDVPQLIALALRTLAQ                    QGAALALSVRDLPGGAAFDANAVTAAVRAGPRQSAAASPPPGDPPPPRRARRSQRHSD                    ARGTPPPAPARDPPPPAPSPPAPPRAGDPVPPIPAGPADRARDAELEVACEPSGPPTS                    TRADPDSDIVESYARAAGPVHLRVRDIMDPPPGCKVVVNAANEGLLAGSGVCGAIFAN                    ATAALAANCRRLAPCPTGEAVATPGHGCGYTHIIHAVAPRRPRDPAALEEGEALLERA                    YRSIVALAAARRWACVACPLLGAGVYGWSAAESLRAALAATRTEPVERVSLHICHPDR                    ATLTHASVLVGAGLAARRVSPPPTEPLASCPAGDPGRPAQRSASPPATPLGDATAPEP                    RGCQGCELCRYTRVTNDRAYVNLWLERDRGATSWAMRIPEVVVYGPEHLATHFPLNHY                    SVLKPAEVRPPRGMCGSDMWRCRGWHGMPQVRCTPSNAHAALCRTGVPPRASTRGGEL                    DPNTCWLRAAANVAQAARACGAYTSAGCPKCAYGRALSEARTHEDFAALSQRWSASHA                    DASPDGTGDPLDPLMETVGCACSRVWVGSEHEAPPDHLLVSLHRAPNGPWGVVLEVRA                    RPEGGNPTGHFVCAVGGGPRRVSDRPHLWLAVPLSRGGGTCAATDEGLAQAYYDDLEV                    RRLGDDAMARAALASVQRPRKGPYNIRVWNMAAGAGKTTRILAAFTREDLYVCPTNAL                    LHEIQAKLRARDIDIKNAATYERRLTKPLAAYRRIYIDEAFTLGGEYCAFVASQTTAE                    VICVGDRDQCGPHYANNCRTPVPDRWPTERSRHTWRFPDCWAARLRAGLDYDIEGERT                    GTFACNLWDGRQVDLHLAFSRETVRRLHEAGIRAYTVREAQGMSVGTACIHVGRDGTD                    VALALTRDLAIVSLTRASDALYLHELEDGSLRAAGLSAFLDAGALAELKEVPAGIDRV                    VAVEQAPPPLPPADGIPEAQDVPPFCPRTLEELVFGRAGHPHYADLNRVTEGEREVRY                    MRISRHLLNKNHTEMPGTERVLSAVCAVRRYRAGEDGSTLRTAVARQHPRPFRQIPPP                    RVTAGVAQEWRMTYLRERIDLTDVYTQMGVAARELTDRYARRYPEIFAGMCTAQSLSV                    PAFLKATLKCVDAALGPRDTEDCHAAQGKAGLEIRAWAKEWVQVMSPHFRAIQKIIMR                    ALRPQFLVAAGHTEPEVDAWWQAHYTTNAIEVDFTEFDMNQTLATRDVELEISAALLG                    LPCAEDYRALRAGSYCTLRELGSTETGCERTSGEPATLLHNTTVAMCMAMRMVPKGVR                    WAGIFQGDDMVIFLPEGARSAALKWTPAEVGLFGFHIPVKHVSTPTPSFCGHVGTAAG                    LFHDVMHQAIKVLCRRFDPDVLEEQQVALLDRLRGVYAALPDTVAANAAYYDYSAERV                    LAIVRELTAYAGARPRPPGHHRRARGDSDPLRARQSPRRRLTPLYVGPLILPTLTRSS                    PTVVSPHLVGTQLLPFGRAPGCPNGFYYPHHHGGPPEGPRGTIPRPARGTRRRRLAVA                    PAAAAATARLQHLRR"    mRNA            6428..9755                    /note="subgenomic RNA"    mat_peptide    6505..7404                    /note="capsid protein (C)"    CDS            6505..9696                    /note="structural polyprotein precursor"                    /codon_start=1                    /protein_id="AAA88529.1"                    /db_xref="PID:g333973"                    /db_xref="GI:333973"                    /translation="MASTTPITMEDLQKALEAQSRALRAELAAGASQSRRPRPPRQRD                    SSTSGDDSGRDSGGPRRRRGNRGRGQRRDWSRAPPPPEERQETRSQTPAPKPSRAPPQ                    QPQPPRMQTGRGGSAPRPELGPPTNPFQAAVARGLRPPLHDPDTEAPTEACVTSWLWS                    EGEGAVFYRVDLHFTNLGTPPLDEDGRWDPALMYNPCGPEPPAHVVRAYNQPAGDVRG                    VWGKGERTYAEQDFRVGGTRWHRLLRMPVRGLDGDSAPLPPHTTERIETRSARHPWRI                    RFGAPQAFLAGLLLATVAVGTARAGLQPRADMAAPPTLPQPPCAHGQHYGHHHHQLPF                    LGHDGHHGGTLRVGQHYRNASDVLPGHWLQGGWGCYNLSDWHQGTHVCHTKHMDFWCV                    EHDRPPPATPTPLTTAANSTTAATPATAPAPCHAGLNDSCGGFLSGCGPMRLRHGADT                    RCGRLICGLSTTAQYPPTRFGCAMRWGLPPWELVVLTARPEDGWTCRGVPAHPGARCP                    ELVSPMGRATCSPASALWLATANALSLDHALAAFVLLVPWVLIFMVCRRACRRRGAAA                    ALTAVVLQGYNPPAYGEEAFTYLCTAPGCATQAPVPVRLAGVRFESKIVDGGCFAPWD                    LEATGACICEIPTDVSCEGLGAWVPAAPCARIWNGTQRACTFWAVNAYSSGGYAQLAS                    YFNPGGSYYKQYHPTACEVEPAFGHSDAACWGFPTDTVMSVFALASYVQHPHKTVRVK                    FHTETRTVWQLSVAGVSCNVTTEHPFCNTPHGQLEVQVPPDPGDLVEYIMNYTGNQQS                    RWGLGSPNCHGPDWASPVCQRHSPDCSRLVGATPERPRLRLVDADDPLLRTAPGPGEV                    WVTPVIGSQARKCGLHIRAGPYGHATVEMPEWIHAHTTSDPWHPPGPLGLKFKTVRPV                    ALPRTLAPPRNVRVTGCYQCGTPALVEGLAPGGGNCHLTVNGEDLGAVPPGKFVTAAL                    LNTPPPYQVSCGGESDRATARVIDPAAQSFTGVVYGTHTTAVSETRQTWAEWAAAHWW                    QLTLGAICALPLAGLLACCAKCLYYLRGAIAPR"    mat_peptide    7405..8250                    /note="glycoprotein E2"    mat_peptide    8251..9693                    /note="glycoprotein E1"BASE COUNT    1457 a  3781 c  3007 g  1510 t        ORIGIN             1 atggaagcta tcggacctcg cttaggactc ccattcccat ggagaaactc ctagatgagg      61 ttcttgcccc cggtgggcct tataacttaa ccgtcggcag ttgggtaaga gaccacgtcc      121 gatcaattgt cgagggcgcg tgggaagtgc gcgatgttgt taccgctgcc caaaagcggg      181 ccatcgtagc cgtgataccc agacctgtgt tcacgcagat gcaggtcagt gatcacccag      241 cactccacgc aatttcgcgg tatacccgcc gccattggat cgagtggggc cctaaagaag      301 ccctacacgt cctcatcgac ccaagcccgg gcctgctccg cgaggtcgct cgcgttgagc      361 gccgctgggt cgcactgtgc ctccacagga cggcacgcaa actcgccacc gccctggccg      421 agacggccag cgaggcgtgg cacgctgact acgtgtgcgc gctgcgtggc gcaccgagcg      481 gccccttcta cgtccaccct gaggacgtcc cgcacggcgg tcgcgccgtg gcggacagat      541 gcttgctcta ctacacaccc atgcagatgt gcgagctgat gcgtaccatt gacgccaccc      601 tgctcgtggc ggttgacttg tggccggtcg cccttgcggc ccacgtcggc gacgactggg      661 acgacctggg cattgcctgg catctcgacc atgacggcgg ttgccccgcc gattgccgcg      721 gagccggcgc tgggcccacg cccggctaca cccgcccctg caccacacgc atctaccaag      781 tcctgccgga caccgcccac cccgggcgcc tctaccggtg cgggccccgc ctgtggacgc      841 gcgattgcgc cgtggccgaa ctctcatggg aggttgccca acactgcggg caccaggcgc      901 gcgtgcgcgc cgtgcgatgc accctcccta tccgccacgt gcgcagcctc caacccagcg      961 cgcgggtccg actcccggac ctcgtccatc tcgccgaggt gggccggtgg cggtggttca    1021 gcctcccccg ccccgtgttc cagcgcatgc tgtcctactg caagaccctg agccccgacg    1081 cgtactacag cgagcgcgtg ttcaagttca agaacgccct gtgccacagc atcacgctcg    1141 cgggcaatgt gctgcaagag gggtggaagg gcacgtgcgc cgaggaagac gcgctgtgcg    1201 catacgtagc cttccgcgcg tggcagtcta acgccaggtt ggcggggatt atgaaaggcg    1261 cgaagtgcgc cgccgactct ttgagcgtgg ccggctggct ggacaccatt tgggacgcca    1321 ttaagcggtt cctcggtagc gtgcccctcg ccgagcgcat ggaggagtgg gaacaggacg    1381 ccgcggtcgc cgccttcgac cgcggccccc tcgaggacgg cgggcgccac ttggacaccg    1441 tgcaaccccc aaaatcgccg ccccgccctg agatcgccgc gacctggatc gtccacgcag    1501 ccagcgaaga ccgccattgc gcgtgcgctc cccgctgcga cgtcccgcgc gaacgtcctt    1561 ccgcgcccgc cggccagccg gatgacgagg cgctcatccc gccgtggctg ttcgccgagc    1621 gccgtgccct ccgctgccgc gagtgggatt tcgaggctct ccgcgcgcgc gccgatacgg    1681 cggccgcgcc cgccccgccg gctccacgcc ccgcgcggta ccccaccgtg ctctaccgcc    1741 accccgccca ccacggcccg tggctcaccc ttgacgagcc gggcgaggct gacgcggccc    1801 tggtcttatg cgacccactt ggccagccgc tccggggccc tgaacgccac ttcgccgccg    1861 gcgcgcatat gtgcgcgcag gcgcgggggc tccaggcttt tgtccgtgtc gtgcctccac    1921 ccgagcgccc ctgggccgac gggggcgcca gagcgtgggc gaagttcttc cgcggctgcg    1981 cctgggcgca gcgcttgctc ggcgagccag cagttatgca cctcccatac accgatggcg    2041 acgtgccaca gctgatcgca ctggctttgc gcacgctggc ccaacagggg gccgccttgg    2101 cactctcggt gcgtgacctg cccgggggtg cagcgttcga cgcaaacgcg gtcaccgccg    2161 ccgtgcgcgc tggcccccgc cagtccgcgg ccgcgtcacc gccacccggc gaccccccgc    2221 cgccgcgccg cgcacggcga tcgcaacggc actcggacgc tcgcggcact ccgccccccg    2281 cgcctgcgcg cgacccgccg ccgcccgccc ccagcccgcc cgcgccaccc cgcgctggtg    2341 acccggtccc tcccattccc gcggggccgg cggatcgcgc gcgtgacgcc gagctggagg    2401 tcgcctgcga gccgagcggc ccccccacgt caaccagggc agacccagac agcgacatcg    2461 ttgaaagtta cgcccgcgcc gccggacccg tgcacctccg agtccgcgac atcatggacc    2521 caccgcccgg ctgcaaggtc gtggtcaacg ccgccaacga ggggctactg gccggctctg    2581 gcgtgtgcgg tgccatcttt gccaacgcca cggcggccct cgctgcaaac tgccggcgcc    2641 tcgccccatg ccccaccggc gaggcagtgg cgacacccgg ccacggctgc gggtacaccc    2701 acatcatcca cgccgtcgcg ccgcggcgtc ctcgggaccc cgccgccctc gaggagggcg    2761 aagcgctgct cgagcgcgcc taccgcagca tcgtcgcgct agccgccgcg cgtcggtggg    2821 cgtgtgtcgc gtgccccctc ctcggcgctg gcgtctacgg ctggtctgct gcggagtccc    2881 tccgagccgc gctcgcggct acgcgcaccg agcccgtcga gcgcgtgagc ctgcacatct    2941 gccaccccga ccgcgccacg ctgacgcacg cctccgtgct cgtcggcgcg gggctcgctg    3001 ccaggcgcgt cagtcctcct ccgaccgagc ccctcgcatc ttgccccgcc ggtgacccgg    3061 gccgaccggc tcagcgcagc gcgtcgcccc cagcgacccc ccttggggat gccaccgcgc    3121 ccgagccccg cggatgccag gggtgcgaac tctgccggta cacgcgcgtc accaatgacc    3181 gcgcctatgt caacctgtgg ctcgagcgcg accgcggcgc caccagctgg gccatgcgca    3241 ttcccgaggt ggttgtctac gggccggagc acctcgccac gcattttcca ttaaaccact    3301 acagtgtgct caagcccgcg gaggtcaggc ccccgcgagg catgtgcggg agtgacatgt    3361 ggcgctgccg cggctggcat ggcatgccgc aggtgcggtg caccccctcc aacgctcacg    3421 ccgccctgtg ccgcacaggc gtgccccctc gggcgagcac gcgaggcggc gagctagacc    3481 caaacacctg ctggctccgc gccgccgcca acgttgcgca ggctgcgcgc gcctgcggcg    3541 cctacacgag tgccgggtgc cccaagtgcg cctacggccg cgccctgagc gaagcccgca    3601 ctcatgagga cttcgccgcg ctgagccagc ggtggagcgc gagccacgcc gatgcctccc    3661 ctgacggcac cggagatccc ctcgaccccc tgatggagac cgtgggatgc gcctgttcgc    3721 gcgtgtgggt cggctccgag catgaggccc cgcccgacca cctcctggtg tcccttcacc    3781 gtgccccaaa tggtccgtgg ggcgtagtgc tcgaggtgcg tgcgcgcccc gaggggggca    3841 accccaccgg ccacttcgtc tgcgcggtcg gcggcggccc acgccgcgtc tcggaccgcc    3901 cccacctctg gcttgcggtc cccctgtctc ggggcggtgg cacctgtgcc gcgaccgacg    3961 aggggctggc ccaggcgtac tacgacgacc tcgaggtgcg ccgcctcggg gatgacgcca    4021 tggcccgggc ggccctcgca tcagtccaac gccctcgcaa aggcccttac aatatcaggg    4081 tatggaacat ggccgcaggc gctggcaaga ctacccgcat cctcgctgcc ttcacgcgcg    4141 aagaccttta cgtctgcccc accaatgcgc tcctgcacga gatccaggcc aaactccgcg    4201 cgcgcgatat cgacatcaag aacgccgcca cctacgagcg ccggctgacg aaaccgctcg    4261 ccgcctaccg ccgcatctac atcgatgagg cgttcactct cggcggcgag tactgcgcgt    4321 tcgttgccag ccaaaccacc gcggaggtga tctgcgtcgg tgatcgggac cagtgcggcc    4381 cacactacgc caataactgc cgcacccccg tccctgaccg ctggcctacc gagcgctcgc    4441 gccacacttg gcgcttcccc gactgctggg cggcccgcct gcgcgcgggg ctcgattatg    4501 acatcgaggg cgagcgcacc ggcaccttcg cctgcaacct ttgggacggc cgccaggtcg    4561 accttcacct cgccttctcg cgcgaaaccg tgcgccgcct tcacgaggct ggcatacgcg    4621 catacaccgt gcgcgaggcc cagggtatga gcgtcggcac cgcctgcatc catgtaggca    4681 gagacggcac ggacgttgcc ctggcgctga cacgcgacct cgccatcgtc agcctgaccc    4741 gggcctccga cgcactctac ctccacgagc tcgaggacgg ctcactgcgc gctgcggggc    4801 tcagcgcgtt cctcgacgcc ggggcactgg cggagctcaa ggaggttccc gctggcattg    4861 accgcgttgt cgccgtcgag caggcaccac caccgttgcc gcccgccgac ggcatccccg    4921 aggcccaaga cgtgccgccc ttctgccccc gcactctgga ggagctcgtc ttcggccgtg    4981 ccggccaccc ccattacgcg gacctcaacc gcgtgactga gggcgaacga gaagtgcggt    5041 acatgcgcat ctcgcgtcac ctgctcaaca agaatcacac cgagatgccc ggaacggaac    5101 gcgttctcag tgccgtttgc gccgtgcggc gctaccgcgc gggcgaggat gggtcgaccc    5161 tccgcactgc tgtggcccgc cagcacccgc gcccttttcg ccagatccca cccccgcgcg    5221 tcactgctgg ggtcgcccag gagtggcgca tgacgtactt gcgggaacgg atcgacctca    5281 ctgatgtcta cacgcagatg ggcgtggccg cgcgggagct caccgaccgc tacgcgcgcc    5341 gctatcctga gatcttcgcc ggcatgtgta ccgcccagag cctgagcgtc cccgccttcc    5401 tcaaagccac cttgaagtgc gtagacgccg ccctcggccc cagggacacc gaggactgcc    5461 acgccgctca ggggaaagcc ggccttgaga tccgggcgtg ggccaaggag tgggttcagg    5521 ttatgtcccc gcatttccgc gcgatccaga agatcatcat gcgcgccttg cgcccgcaat    5581 tccttgtggc cgctggccat acggagcccg aggtcgatgc gtggtggcag gcccattaca    5641 ccaccaacgc catcgaggtc gacttcactg agttcgacat gaaccagacc ctcgctactc    5701 gggacgtcga gctcgagatt agcgccgctc tcttgggcct cccttgcgcc gaagactacc    5761 gcgcgctccg cgccggcagc tactgcaccc tgcgcgaact gggctccact gagaccggct    5821 gcgagcgcac aagcggcgag cccgccacgc tgctgcacaa caccaccgtg gccatgtgca    5881 tggccatgcg catggtcccc aaaggcgtgc gctgggccgg gattttccag ggtgacgata    5941 tggtcatctt cctccccgag ggcgcgcgca gcgcggcact caagtggacc cccgccgagg    6001 tgggcttgtt tggcttccac atcccggtga agcacgtgag cacccctacc cccagcttct    6061 gcgggcacgt cggcaccgcg gccggcctct tccatgatgt catgcaccag gcgatcaagg    6121 tgctttgccg ccgtttcgac ccagacgtgc ttgaagaaca gcaggtggcc ctcctcgacc    6181 gcctccgggg ggtctacgcg gctctgcctg acaccgttgc cgccaatgct gcgtactacg    6241 actacagcgc ggagcgcgtc ctcgctatcg tgcgcgaact taccgcgtac gcgggggcgc    6301 ggcctcgacc acccggccac catcggcgcg ctcgaggaga ttcagacccc ctacgcgcgc    6361 gccaatctcc acgacgccga ctaacgcccc tgtacgtggg gcctttaatc ttacctactc    6421 taaccaggtc atcacccacc gttgtttcgc cgcatctggt gggtacccaa cttttgccat    6481 tcgggagagc cccagggtgc ccgaatggct tctactaccc ccatcaccat ggaggacctc    6541 cagaaggccc tcgaggcaca atcccgcgcc ctgcgcgcgg aactcgccgc cggcgcctcg    6601 cagtcgcgcc ggccgcggcc gccgcgacag cgcgactcca gcacctccgg agatgactcc    6661 ggccgtgact ccggagggcc ccgccgccgc cgcggcaacc ggggccgtgg ccagcgcagg    6721 gactggtcca gggccccgcc ccccccggag gagcggcaag aaactcgctc ccagactccg    6781 gccccgaagc catcgcgggc gccgccacaa cagcctcaac ccccgcgcat gcaaaccggg    6841 cgtgggggct ctgccccgcg ccccgagctg gggccaccga ccaacccgtt ccaagcagcc    6901 gtggcgcgtg gcctgcgccc gcctctccac gaccctgaca ccgaggcacc caccgaggcc    6961 tgcgtgacct cgtggctttg gagcgagggc gaaggcgcgg tcttttaccg cgtcgacctg    7021 catttcacca acctgggcac ccccccactc gacgaggacg gccgctggga ccctgcgctc    7081 atgtacaacc cttgcgggcc cgagccgccc gctcacgtcg tccgcgcgta caatcaacct    7141 gccggcgacg tcaggggcgt ttggggtaaa ggcgagcgca cctacgccga gcaggacttc    7201 cgcgtcggcg gcacgcgctg gcaccgactg ctgcgcatgc cagtgcgcgg cctcgacggc    7261 gacagcgccc cgcttccccc ccacaccacc gagcgcattg agacccgctc ggcgcgccat    7321 ccttggcgca tccgcttcgg tgccccccag gccttccttg ccgggctctt gctcgccacg    7381 gtcgccgttg gcaccgcgcg cgccgggctc cagccccgcg ctgatatggc ggcacctcct    7441 acgctgccgc agcccccctg tgcgcacggg cagcattacg gccaccacca ccatcagctg    7501 ccgttcctcg ggcacgacgg ccatcatggc ggcaccttgc gcgtcggcca gcattaccga    7561 aacgccagcg acgtgctgcc cggccactgg ctccaaggcg gctggggttg ctacaacctg    7621 agcgactggc accagggcac tcatgtctgt cataccaagc acatggactt ctggtgtgtg    7681 gagcacgacc gaccgccgcc cgcgaccccg acgcctctca ccaccgcggc gaactccacg    7741 accgccgcca cccccgccac tgcgccggcc ccctgccacg ccggcctcaa tgacagctgc    7801 ggcggcttct tgtctgggtg cgggccgatg cgcctgcgcc acggcgctga cacccggtgc    7861 ggtcggttga tctgcgggct gtccaccacc gcccagtacc cgcctacccg gtttggctgc    7921 gctatgcggt ggggccttcc cccctgggaa ctggtcgtcc ttaccgcccg ccccgaagac    7981 ggctggactt gccgcggcgt gcccgcccat ccaggcgccc gctgccccga actggtgagc    8041 cccatgggac gcgcgacttg ctccccagcc tcggccctct ggctcgccac agcgaacgcg    8101 ctgtctcttg atcacgccct cgcggccttc gtcctgctgg tcccgtgggt cctgatattt    8161 atggtgtgcc gccgcgcctg tcgccgccgc ggcgccgccg ccgccctcac cgcggtcgtc    8221 ctgcaggggt acaacccccc cgcctatggc gaggaggctt tcacctacct ctgcactgca    8281 ccggggtgcg ccactcaagc acctgtcccc gtgcgcctcg ctggcgtccg ttttgagtcc    8341 aagattgtgg acggcggctg ctttgcccca tgggacctcg aggccactgg agcctgcatt    8401 tgcgagatcc ccactgatgt ctcgtgcgag ggcttggggg cctgggtacc cgcagcccct    8461 tgcgcgcgca tctggaatgg cacacagcgc gcgtgcacct tctgggctgt caacgcctac    8521 tcctctggcg ggtacgcgca gctggcctct tacttcaacc ctggcggcag ctactacaag    8581 cagtaccacc ctaccgcgtg cgaggttgaa cctgccttcg gacacagcga cgcggcctgc    8641 tggggcttcc ccaccgacac cgtgatgagc gtgttcgccc ttgctagcta cgtccagcac    8701 cctcacaaga ccgtccgggt caagttccat acagagacca ggaccgtctg gcaactctcc    8761 gttgccggcg tgtcgtgcaa cgtcaccact gaacacccgt tctgcaacac gccgcacgga    8821 caactcgagg tccaggtccc gcccgacccc ggggacctgg ttgagtacat tatgaattac    8881 accggcaatc agcagtcccg gtggggcctc gggagcccga attgccacgg ccccgattgg    8941 gcctccccgg tttgccaacg ccattcccct gactgctcgc ggcttgtggg ggccacgcca    9001 gagcgccccc ggctgcgcct ggtcgacgcc gacgaccccc tgctgcgcac tgcccctgga    9061 cccggcgagg tgtgggtcac gcctgtcata ggctctcagg cgcgcaagtg cggactccac    9121 atacgcgctg gaccgtacgg ccatgctacc gtcgaaatgc ccgagtggat ccacgcccac    9181 accaccagcg acccctggca tccaccgggc cccttggggc tgaagttcaa gacagttcgc    9241 ccggtggccc tgccacgcac gttagcgcca ccccgcaatg tgcgtgtgac cgggtgctac    9301 cagtgcggta cccccgcgct ggtggaaggc cttgcccccg ggggaggcaa ttgccatctc    9361 accgtcaatg gcgaggacct cggcgccgtc ccccctggga agttcgtcac cgccgccctc    9421 ctcaacaccc ccccgcccta ccaagtcagc tgcgggggcg agagcgatcg cgcgaccgcg    9481 cgggtcatcg accccgccgc gcaatcgttt accggcgtgg tgtatggcac acacaccact    9541 gctgtgtcgg agacccggca gacctgggcg gagtgggctg ctgcccattg gtggcagctc    9601 actctgggcg ccatttgcgc cctcccactc gctggcttac tcgcttgctg tgccaaatgc    9661 ttgtactact tgcgcggcgc tatagcgcct cgctagtggg cccccgcgcg aaacccgcac    9721 taggccacta gatccccgca cctgttgctg tatag//

Revision as of 01:20, 2 November 2011

LOCUS RUBCG 9755 bp ss-RNA VRL 08-MAR-1996
DEFINITION Rubella virus complete genome encoding nonstructural protein,
capsid protein, glycoproteins E1 and E2, complete cds.
ACCESSION M15240 M18901 M32735
NID g333971
VERSION M15240.1 GI:333971
KEYWORDS C gene; capsid protein; glycoprotein; glycoprotein E1; glycoprotein
E2; haemagglutinin; nonstructural protein.
SOURCE .
ORGANISM Rubella virus
Viruses; ssRNA positive-strand viruses, no DNA stage; Togaviridae;
Rubivirus.
REFERENCE 1 (bases 8155 to 9754)
AUTHORS Frey,T.K., Marr,L.D., Hemphill,M.L. and Dominguez,G.
TITLE Molecular cloning and sequencing of the region of the rubella virus
genome coding for glycoprotein E1
JOURNAL Virology 154 (1), 228-232 (1986)
MEDLINE 86317717
REFERENCE 2 (bases 5917 to 9754)
AUTHORS Frey,T.K. and Marr,L.D.
JOURNAL Unpublished (1987)
REFERENCE 3 (bases 5247 to 8366)
AUTHORS Frey,T.K. and Marr,L.D.
TITLE Sequence of the region coding for virion proteins C and E2 and the
carboxy terminus of the nonstructural proteins of rubella virus:
comparison with alphaviruses
JOURNAL Gene 62 (1), 85-99 (1988)
MEDLINE 88226020
REFERENCE 4 (bases 1 to 9755)
AUTHORS Dominguez,G., Wang,C.Y. and Frey,T.K.
TITLE Sequence of the genome RNA of rubella virus: evidence for genetic
rearrangement during togavirus evolution
JOURNAL Virology 177 (1), 225-238 (1990)
MEDLINE 90281585
COMMENT [2] revises [1].
Draft entry and computer-readable copy of sequence in [2] kindly
provided by T.K.Frey, 01-JUN-1987.
Draft entry and computer-readable sequence for [4] kindly submitted
by G.Dominguez, 09-MAR-1990, for release after publication.
Glycoprotein E1 contains the viral hemagglutinin activity. Multiple
copies of the C protein comprise the nucleocapsid.
FEATURES Location/Qualifiers
source 1..9755
/organism="Rubella virus"
/note="other clones: pRUB1010[1012, 1002, 1006, 1015,
1001]"
/db_xref="taxon:11041"
/clone="pRUB1025"
CDS 39..6656
/note="nonstructural polyprotein precursor"
/codon_start=1
/protein_id="AAA88528.1"
/db_xref="PID:g333972"
/db_xref="GI:333972"
/translation="MEKLLDEVLAPGGPYNLTVGSWVRDHVRSIVEGAWEVRDVVTAA
QKRAIVAVIPRPVFTQMQVSDHPALHAISRYTRRHWIEWGPKEALHVLIDPSPGLLRE
VARVERRWVALCLHRTARKLATALAETASEAWHADYVCALRGAPSGPFYVHPEDVPHG
GRAVADRCLLYYTPMQMCELMRTIDATLLVAVDLWPVALAAHVGDDWDDLGIAWHLDH
DGGCPADCRGAGAGPTPGYTRPCTTRIYQVLPDTAHPGRLYRCGPRLWTRDCAVAELS
WEVAQHCGHQARVRAVRCTLPIRHVRSLQPSARVRLPDLVHLAEVGRWRWFSLPRPVF
QRMLSYCKTLSPDAYYSERVFKFKNALCHSITLAGNVLQEGWKGTCAEEDALCAYVAF
RAWQSNARLAGIMKGAKCAADSLSVAGWLDTIWDAIKRFLGSVPLAERMEEWEQDAAV
AAFDRGPLEDGGRHLDTVQPPKSPPRPEIAATWIVHAASEDRHCACAPRCDVPRERPS
APAGQPDDEALIPPWLFAERRALRCREWDFEALRARADTAAAPAPPAPRPARYPTVLY
RHPAHHGPWLTLDEPGEADAALVLCDPLGQPLRGPERHFAAGAHMCAQARGLQAFVRV
VPPPERPWADGGARAWAKFFRGCAWAQRLLGEPAVMHLPYTDGDVPQLIALALRTLAQ
QGAALALSVRDLPGGAAFDANAVTAAVRAGPRQSAAASPPPGDPPPPRRARRSQRHSD
ARGTPPPAPARDPPPPAPSPPAPPRAGDPVPPIPAGPADRARDAELEVACEPSGPPTS
TRADPDSDIVESYARAAGPVHLRVRDIMDPPPGCKVVVNAANEGLLAGSGVCGAIFAN
ATAALAANCRRLAPCPTGEAVATPGHGCGYTHIIHAVAPRRPRDPAALEEGEALLERA
YRSIVALAAARRWACVACPLLGAGVYGWSAAESLRAALAATRTEPVERVSLHICHPDR
ATLTHASVLVGAGLAARRVSPPPTEPLASCPAGDPGRPAQRSASPPATPLGDATAPEP
RGCQGCELCRYTRVTNDRAYVNLWLERDRGATSWAMRIPEVVVYGPEHLATHFPLNHY
SVLKPAEVRPPRGMCGSDMWRCRGWHGMPQVRCTPSNAHAALCRTGVPPRASTRGGEL
DPNTCWLRAAANVAQAARACGAYTSAGCPKCAYGRALSEARTHEDFAALSQRWSASHA
DASPDGTGDPLDPLMETVGCACSRVWVGSEHEAPPDHLLVSLHRAPNGPWGVVLEVRA
RPEGGNPTGHFVCAVGGGPRRVSDRPHLWLAVPLSRGGGTCAATDEGLAQAYYDDLEV
RRLGDDAMARAALASVQRPRKGPYNIRVWNMAAGAGKTTRILAAFTREDLYVCPTNAL
LHEIQAKLRARDIDIKNAATYERRLTKPLAAYRRIYIDEAFTLGGEYCAFVASQTTAE
VICVGDRDQCGPHYANNCRTPVPDRWPTERSRHTWRFPDCWAARLRAGLDYDIEGERT
GTFACNLWDGRQVDLHLAFSRETVRRLHEAGIRAYTVREAQGMSVGTACIHVGRDGTD
VALALTRDLAIVSLTRASDALYLHELEDGSLRAAGLSAFLDAGALAELKEVPAGIDRV
VAVEQAPPPLPPADGIPEAQDVPPFCPRTLEELVFGRAGHPHYADLNRVTEGEREVRY
MRISRHLLNKNHTEMPGTERVLSAVCAVRRYRAGEDGSTLRTAVARQHPRPFRQIPPP
RVTAGVAQEWRMTYLRERIDLTDVYTQMGVAARELTDRYARRYPEIFAGMCTAQSLSV
PAFLKATLKCVDAALGPRDTEDCHAAQGKAGLEIRAWAKEWVQVMSPHFRAIQKIIMR
ALRPQFLVAAGHTEPEVDAWWQAHYTTNAIEVDFTEFDMNQTLATRDVELEISAALLG
LPCAEDYRALRAGSYCTLRELGSTETGCERTSGEPATLLHNTTVAMCMAMRMVPKGVR
WAGIFQGDDMVIFLPEGARSAALKWTPAEVGLFGFHIPVKHVSTPTPSFCGHVGTAAG
LFHDVMHQAIKVLCRRFDPDVLEEQQVALLDRLRGVYAALPDTVAANAAYYDYSAERV
LAIVRELTAYAGARPRPPGHHRRARGDSDPLRARQSPRRRLTPLYVGPLILPTLTRSS
PTVVSPHLVGTQLLPFGRAPGCPNGFYYPHHHGGPPEGPRGTIPRPARGTRRRRLAVA
PAAAAATARLQHLRR"
mRNA 6428..9755
/note="subgenomic RNA"
mat_peptide 6505..7404
/note="capsid protein (C)"
CDS 6505..9696
/note="structural polyprotein precursor"
/codon_start=1
/protein_id="AAA88529.1"
/db_xref="PID:g333973"
/db_xref="GI:333973"
/translation="MASTTPITMEDLQKALEAQSRALRAELAAGASQSRRPRPPRQRD

SSTSGDDSGRDSGGPRRRRGNRGRGQRRDWSRAPPPPEERQETRSQTPAPKPSRAPPQ

QPQPPRMQTGRGGSAPRPELGPPTNPFQAAVARGLRPPLHDPDTEAPTEACVTSWLWS

EGEGAVFYRVDLHFTNLGTPPLDEDGRWDPALMYNPCGPEPPAHVVRAYNQPAGDVRG

VWGKGERTYAEQDFRVGGTRWHRLLRMPVRGLDGDSAPLPPHTTERIETRSARHPWRI

RFGAPQAFLAGLLLATVAVGTARAGLQPRADMAAPPTLPQPPCAHGQHYGHHHHQLPF

LGHDGHHGGTLRVGQHYRNASDVLPGHWLQGGWGCYNLSDWHQGTHVCHTKHMDFWCV

EHDRPPPATPTPLTTAANSTTAATPATAPAPCHAGLNDSCGGFLSGCGPMRLRHGADT

RCGRLICGLSTTAQYPPTRFGCAMRWGLPPWELVVLTARPEDGWTCRGVPAHPGARCP

ELVSPMGRATCSPASALWLATANALSLDHALAAFVLLVPWVLIFMVCRRACRRRGAAA

ALTAVVLQGYNPPAYGEEAFTYLCTAPGCATQAPVPVRLAGVRFESKIVDGGCFAPWD

LEATGACICEIPTDVSCEGLGAWVPAAPCARIWNGTQRACTFWAVNAYSSGGYAQLAS

YFNPGGSYYKQYHPTACEVEPAFGHSDAACWGFPTDTVMSVFALASYVQHPHKTVRVK

FHTETRTVWQLSVAGVSCNVTTEHPFCNTPHGQLEVQVPPDPGDLVEYIMNYTGNQQS

RWGLGSPNCHGPDWASPVCQRHSPDCSRLVGATPERPRLRLVDADDPLLRTAPGPGEV

WVTPVIGSQARKCGLHIRAGPYGHATVEMPEWIHAHTTSDPWHPPGPLGLKFKTVRPV

ALPRTLAPPRNVRVTGCYQCGTPALVEGLAPGGGNCHLTVNGEDLGAVPPGKFVTAAL

LNTPPPYQVSCGGESDRATARVIDPAAQSFTGVVYGTHTTAVSETRQTWAEWAAAHWW

QLTLGAICALPLAGLLACCAKCLYYLRGAIAPR"
mat_peptide 7405..8250
/note="glycoprotein E2"
mat_peptide 8251..9693
/note="glycoprotein E1"
BASE COUNT 1457 a 3781 c 3007 g 1510 t
ORIGIN
1 atggaagcta tcggacctcg cttaggactc ccattcccat ggagaaactc ctagatgagg
61 ttcttgcccc cggtgggcct tataacttaa ccgtcggcag ttgggtaaga gaccacgtcc
121 gatcaattgt cgagggcgcg tgggaagtgc gcgatgttgt taccgctgcc caaaagcggg
181 ccatcgtagc cgtgataccc agacctgtgt tcacgcagat gcaggtcagt gatcacccag
241 cactccacgc aatttcgcgg tatacccgcc gccattggat cgagtggggc cctaaagaag
301 ccctacacgt cctcatcgac ccaagcccgg gcctgctccg cgaggtcgct cgcgttgagc
361 gccgctgggt cgcactgtgc ctccacagga cggcacgcaa actcgccacc gccctggccg
421 agacggccag cgaggcgtgg cacgctgact acgtgtgcgc gctgcgtggc gcaccgagcg
481 gccccttcta cgtccaccct gaggacgtcc cgcacggcgg tcgcgccgtg gcggacagat
541 gcttgctcta ctacacaccc atgcagatgt gcgagctgat gcgtaccatt gacgccaccc
601 tgctcgtggc ggttgacttg tggccggtcg cccttgcggc ccacgtcggc gacgactggg
661 acgacctggg cattgcctgg catctcgacc atgacggcgg ttgccccgcc gattgccgcg
721 gagccggcgc tgggcccacg cccggctaca cccgcccctg caccacacgc atctaccaag
781 tcctgccgga caccgcccac cccgggcgcc tctaccggtg cgggccccgc ctgtggacgc
841 gcgattgcgc cgtggccgaa ctctcatggg aggttgccca acactgcggg caccaggcgc
901 gcgtgcgcgc cgtgcgatgc accctcccta tccgccacgt gcgcagcctc caacccagcg
961 cgcgggtccg actcccggac ctcgtccatc tcgccgaggt gggccggtgg cggtggttca
1021 gcctcccccg ccccgtgttc cagcgcatgc tgtcctactg caagaccctg agccccgacg
1081 cgtactacag cgagcgcgtg ttcaagttca agaacgccct gtgccacagc atcacgctcg
1141 cgggcaatgt gctgcaagag gggtggaagg gcacgtgcgc cgaggaagac gcgctgtgcg
1201 catacgtagc cttccgcgcg tggcagtcta acgccaggtt ggcggggatt atgaaaggcg
1261 cgaagtgcgc cgccgactct ttgagcgtgg ccggctggct ggacaccatt tgggacgcca
1321 ttaagcggtt cctcggtagc gtgcccctcg ccgagcgcat ggaggagtgg gaacaggacg
1381 ccgcggtcgc cgccttcgac cgcggccccc tcgaggacgg cgggcgccac ttggacaccg
1441 tgcaaccccc aaaatcgccg ccccgccctg agatcgccgc gacctggatc gtccacgcag
1501 ccagcgaaga ccgccattgc gcgtgcgctc cccgctgcga cgtcccgcgc gaacgtcctt
1561 ccgcgcccgc cggccagccg gatgacgagg cgctcatccc gccgtggctg ttcgccgagc
1621 gccgtgccct ccgctgccgc gagtgggatt tcgaggctct ccgcgcgcgc gccgatacgg
1681 cggccgcgcc cgccccgccg gctccacgcc ccgcgcggta ccccaccgtg ctctaccgcc
1741 accccgccca ccacggcccg tggctcaccc ttgacgagcc gggcgaggct gacgcggccc
1801 tggtcttatg cgacccactt ggccagccgc tccggggccc tgaacgccac ttcgccgccg
1861 gcgcgcatat gtgcgcgcag gcgcgggggc tccaggcttt tgtccgtgtc gtgcctccac
1921 ccgagcgccc ctgggccgac gggggcgcca gagcgtgggc gaagttcttc cgcggctgcg
1981 cctgggcgca gcgcttgctc ggcgagccag cagttatgca cctcccatac accgatggcg
2041 acgtgccaca gctgatcgca ctggctttgc gcacgctggc ccaacagggg gccgccttgg
2101 cactctcggt gcgtgacctg cccgggggtg cagcgttcga cgcaaacgcg gtcaccgccg
2161 ccgtgcgcgc tggcccccgc cagtccgcgg ccgcgtcacc gccacccggc gaccccccgc
2221 cgccgcgccg cgcacggcga tcgcaacggc actcggacgc tcgcggcact ccgccccccg
2281 cgcctgcgcg cgacccgccg ccgcccgccc ccagcccgcc cgcgccaccc cgcgctggtg
2341 acccggtccc tcccattccc gcggggccgg cggatcgcgc gcgtgacgcc gagctggagg
2401 tcgcctgcga gccgagcggc ccccccacgt caaccagggc agacccagac agcgacatcg
2461 ttgaaagtta cgcccgcgcc gccggacccg tgcacctccg agtccgcgac atcatggacc
2521 caccgcccgg ctgcaaggtc gtggtcaacg ccgccaacga ggggctactg gccggctctg
2581 gcgtgtgcgg tgccatcttt gccaacgcca cggcggccct cgctgcaaac tgccggcgcc
2641 tcgccccatg ccccaccggc gaggcagtgg cgacacccgg ccacggctgc gggtacaccc
2701 acatcatcca cgccgtcgcg ccgcggcgtc ctcgggaccc cgccgccctc gaggagggcg
2761 aagcgctgct cgagcgcgcc taccgcagca tcgtcgcgct agccgccgcg cgtcggtggg
2821 cgtgtgtcgc gtgccccctc ctcggcgctg gcgtctacgg ctggtctgct gcggagtccc
2881 tccgagccgc gctcgcggct acgcgcaccg agcccgtcga gcgcgtgagc ctgcacatct
2941 gccaccccga ccgcgccacg ctgacgcacg cctccgtgct cgtcggcgcg gggctcgctg
3001 ccaggcgcgt cagtcctcct ccgaccgagc ccctcgcatc ttgccccgcc ggtgacccgg
3061 gccgaccggc tcagcgcagc gcgtcgcccc cagcgacccc ccttggggat gccaccgcgc
3121 ccgagccccg cggatgccag gggtgcgaac tctgccggta cacgcgcgtc accaatgacc
3181 gcgcctatgt caacctgtgg ctcgagcgcg accgcggcgc caccagctgg gccatgcgca
3241 ttcccgaggt ggttgtctac gggccggagc acctcgccac gcattttcca ttaaaccact
3301 acagtgtgct caagcccgcg gaggtcaggc ccccgcgagg catgtgcggg agtgacatgt
3361 ggcgctgccg cggctggcat ggcatgccgc aggtgcggtg caccccctcc aacgctcacg
3421 ccgccctgtg ccgcacaggc gtgccccctc gggcgagcac gcgaggcggc gagctagacc
3481 caaacacctg ctggctccgc gccgccgcca acgttgcgca ggctgcgcgc gcctgcggcg
3541 cctacacgag tgccgggtgc cccaagtgcg cctacggccg cgccctgagc gaagcccgca
3601 ctcatgagga cttcgccgcg ctgagccagc ggtggagcgc gagccacgcc gatgcctccc
3661 ctgacggcac cggagatccc ctcgaccccc tgatggagac cgtgggatgc gcctgttcgc
3721 gcgtgtgggt cggctccgag catgaggccc cgcccgacca cctcctggtg tcccttcacc
3781 gtgccccaaa tggtccgtgg ggcgtagtgc tcgaggtgcg tgcgcgcccc gaggggggca
3841 accccaccgg ccacttcgtc tgcgcggtcg gcggcggccc acgccgcgtc tcggaccgcc
3901 cccacctctg gcttgcggtc cccctgtctc ggggcggtgg cacctgtgcc gcgaccgacg
3961 aggggctggc ccaggcgtac tacgacgacc tcgaggtgcg ccgcctcggg gatgacgcca
4021 tggcccgggc ggccctcgca tcagtccaac gccctcgcaa aggcccttac aatatcaggg
4081 tatggaacat ggccgcaggc gctggcaaga ctacccgcat cctcgctgcc ttcacgcgcg
4141 aagaccttta cgtctgcccc accaatgcgc tcctgcacga gatccaggcc aaactccgcg
4201 cgcgcgatat cgacatcaag aacgccgcca cctacgagcg ccggctgacg aaaccgctcg
4261 ccgcctaccg ccgcatctac atcgatgagg cgttcactct cggcggcgag tactgcgcgt
4321 tcgttgccag ccaaaccacc gcggaggtga tctgcgtcgg tgatcgggac cagtgcggcc
4381 cacactacgc caataactgc cgcacccccg tccctgaccg ctggcctacc gagcgctcgc
4441 gccacacttg gcgcttcccc gactgctggg cggcccgcct gcgcgcgggg ctcgattatg
4501 acatcgaggg cgagcgcacc ggcaccttcg cctgcaacct ttgggacggc cgccaggtcg
4561 accttcacct cgccttctcg cgcgaaaccg tgcgccgcct tcacgaggct ggcatacgcg
4621 catacaccgt gcgcgaggcc cagggtatga gcgtcggcac cgcctgcatc catgtaggca
4681 gagacggcac ggacgttgcc ctggcgctga cacgcgacct cgccatcgtc agcctgaccc
4741 gggcctccga cgcactctac ctccacgagc tcgaggacgg ctcactgcgc gctgcggggc
4801 tcagcgcgtt cctcgacgcc ggggcactgg cggagctcaa ggaggttccc gctggcattg
4861 accgcgttgt cgccgtcgag caggcaccac caccgttgcc gcccgccgac ggcatccccg
4921 aggcccaaga cgtgccgccc ttctgccccc gcactctgga ggagctcgtc ttcggccgtg
4981 ccggccaccc ccattacgcg gacctcaacc gcgtgactga gggcgaacga gaagtgcggt
5041 acatgcgcat ctcgcgtcac ctgctcaaca agaatcacac cgagatgccc ggaacggaac
5101 gcgttctcag tgccgtttgc gccgtgcggc gctaccgcgc gggcgaggat gggtcgaccc
5161 tccgcactgc tgtggcccgc cagcacccgc gcccttttcg ccagatccca cccccgcgcg
5221 tcactgctgg ggtcgcccag gagtggcgca tgacgtactt gcgggaacgg atcgacctca
5281 ctgatgtcta cacgcagatg ggcgtggccg cgcgggagct caccgaccgc tacgcgcgcc
5341 gctatcctga gatcttcgcc ggcatgtgta ccgcccagag cctgagcgtc cccgccttcc
5401 tcaaagccac cttgaagtgc gtagacgccg ccctcggccc cagggacacc gaggactgcc
5461 acgccgctca ggggaaagcc ggccttgaga tccgggcgtg ggccaaggag tgggttcagg
5521 ttatgtcccc gcatttccgc gcgatccaga agatcatcat gcgcgccttg cgcccgcaat
5581 tccttgtggc cgctggccat acggagcccg aggtcgatgc gtggtggcag gcccattaca
5641 ccaccaacgc catcgaggtc gacttcactg agttcgacat gaaccagacc ctcgctactc
5701 gggacgtcga gctcgagatt agcgccgctc tcttgggcct cccttgcgcc gaagactacc
5761 gcgcgctccg cgccggcagc tactgcaccc tgcgcgaact gggctccact gagaccggct
5821 gcgagcgcac aagcggcgag cccgccacgc tgctgcacaa caccaccgtg gccatgtgca
5881 tggccatgcg catggtcccc aaaggcgtgc gctgggccgg gattttccag ggtgacgata
5941 tggtcatctt cctccccgag ggcgcgcgca gcgcggcact caagtggacc cccgccgagg
6001 tgggcttgtt tggcttccac atcccggtga agcacgtgag cacccctacc cccagcttct
6061 gcgggcacgt cggcaccgcg gccggcctct tccatgatgt catgcaccag gcgatcaagg
6121 tgctttgccg ccgtttcgac ccagacgtgc ttgaagaaca gcaggtggcc ctcctcgacc
6181 gcctccgggg ggtctacgcg gctctgcctg acaccgttgc cgccaatgct gcgtactacg
6241 actacagcgc ggagcgcgtc ctcgctatcg tgcgcgaact taccgcgtac gcgggggcgc
6301 ggcctcgacc acccggccac catcggcgcg ctcgaggaga ttcagacccc ctacgcgcgc
6361 gccaatctcc acgacgccga ctaacgcccc tgtacgtggg gcctttaatc ttacctactc
6421 taaccaggtc atcacccacc gttgtttcgc cgcatctggt gggtacccaa cttttgccat
6481 tcgggagagc cccagggtgc ccgaatggct tctactaccc ccatcaccat ggaggacctc
6541 cagaaggccc tcgaggcaca atcccgcgcc ctgcgcgcgg aactcgccgc cggcgcctcg
6601 cagtcgcgcc ggccgcggcc gccgcgacag cgcgactcca gcacctccgg agatgactcc
6661 ggccgtgact ccggagggcc ccgccgccgc cgcggcaacc ggggccgtgg ccagcgcagg
6721 gactggtcca gggccccgcc ccccccggag gagcggcaag aaactcgctc ccagactccg
6781 gccccgaagc catcgcgggc gccgccacaa cagcctcaac ccccgcgcat gcaaaccggg
6841 cgtgggggct ctgccccgcg ccccgagctg gggccaccga ccaacccgtt ccaagcagcc
6901 gtggcgcgtg gcctgcgccc gcctctccac gaccctgaca ccgaggcacc caccgaggcc
6961 tgcgtgacct cgtggctttg gagcgagggc gaaggcgcgg tcttttaccg cgtcgacctg
7021 catttcacca acctgggcac ccccccactc gacgaggacg gccgctggga ccctgcgctc
7081 atgtacaacc cttgcgggcc cgagccgccc gctcacgtcg tccgcgcgta caatcaacct
7141 gccggcgacg tcaggggcgt ttggggtaaa ggcgagcgca cctacgccga gcaggacttc
7201 cgcgtcggcg gcacgcgctg gcaccgactg ctgcgcatgc cagtgcgcgg cctcgacggc
7261 gacagcgccc cgcttccccc ccacaccacc gagcgcattg agacccgctc ggcgcgccat
7321 ccttggcgca tccgcttcgg tgccccccag gccttccttg ccgggctctt gctcgccacg
7381 gtcgccgttg gcaccgcgcg cgccgggctc cagccccgcg ctgatatggc ggcacctcct
7441 acgctgccgc agcccccctg tgcgcacggg cagcattacg gccaccacca ccatcagctg
7501 ccgttcctcg ggcacgacgg ccatcatggc ggcaccttgc gcgtcggcca gcattaccga
7561 aacgccagcg acgtgctgcc cggccactgg ctccaaggcg gctggggttg ctacaacctg
7621 agcgactggc accagggcac tcatgtctgt cataccaagc acatggactt ctggtgtgtg
7681 gagcacgacc gaccgccgcc cgcgaccccg acgcctctca ccaccgcggc gaactccacg
7741 accgccgcca cccccgccac tgcgccggcc ccctgccacg ccggcctcaa tgacagctgc
7801 ggcggcttct tgtctgggtg cgggccgatg cgcctgcgcc acggcgctga cacccggtgc
7861 ggtcggttga tctgcgggct gtccaccacc gcccagtacc cgcctacccg gtttggctgc
7921 gctatgcggt ggggccttcc cccctgggaa ctggtcgtcc ttaccgcccg ccccgaagac
7981 ggctggactt gccgcggcgt gcccgcccat ccaggcgccc gctgccccga actggtgagc
8041 cccatgggac gcgcgacttg ctccccagcc tcggccctct ggctcgccac agcgaacgcg
8101 ctgtctcttg atcacgccct cgcggccttc gtcctgctgg tcccgtgggt cctgatattt
8161 atggtgtgcc gccgcgcctg tcgccgccgc ggcgccgccg ccgccctcac cgcggtcgtc
8221 ctgcaggggt acaacccccc cgcctatggc gaggaggctt tcacctacct ctgcactgca
8281 ccggggtgcg ccactcaagc acctgtcccc gtgcgcctcg ctggcgtccg ttttgagtcc
8341 aagattgtgg acggcggctg ctttgcccca tgggacctcg aggccactgg agcctgcatt
8401 tgcgagatcc ccactgatgt ctcgtgcgag ggcttggggg cctgggtacc cgcagcccct
8461 tgcgcgcgca tctggaatgg cacacagcgc gcgtgcacct tctgggctgt caacgcctac
8521 tcctctggcg ggtacgcgca gctggcctct tacttcaacc ctggcggcag ctactacaag
8581 cagtaccacc ctaccgcgtg cgaggttgaa cctgccttcg gacacagcga cgcggcctgc
8641 tggggcttcc ccaccgacac cgtgatgagc gtgttcgccc ttgctagcta cgtccagcac
8701 cctcacaaga ccgtccgggt caagttccat acagagacca ggaccgtctg gcaactctcc
8761 gttgccggcg tgtcgtgcaa cgtcaccact gaacacccgt tctgcaacac gccgcacgga
8821 caactcgagg tccaggtccc gcccgacccc ggggacctgg ttgagtacat tatgaattac
8881 accggcaatc agcagtcccg gtggggcctc gggagcccga attgccacgg ccccgattgg
8941 gcctccccgg tttgccaacg ccattcccct gactgctcgc ggcttgtggg ggccacgcca
9001 gagcgccccc ggctgcgcct ggtcgacgcc gacgaccccc tgctgcgcac tgcccctgga
9061 cccggcgagg tgtgggtcac gcctgtcata ggctctcagg cgcgcaagtg cggactccac
9121 atacgcgctg gaccgtacgg ccatgctacc gtcgaaatgc ccgagtggat ccacgcccac
9181 accaccagcg acccctggca tccaccgggc cccttggggc tgaagttcaa gacagttcgc
9241 ccggtggccc tgccacgcac gttagcgcca ccccgcaatg tgcgtgtgac cgggtgctac
9301 cagtgcggta cccccgcgct ggtggaaggc cttgcccccg ggggaggcaa ttgccatctc
9361 accgtcaatg gcgaggacct cggcgccgtc ccccctggga agttcgtcac cgccgccctc
9421 ctcaacaccc ccccgcccta ccaagtcagc tgcgggggcg agagcgatcg cgcgaccgcg
9481 cgggtcatcg accccgccgc gcaatcgttt accggcgtgg tgtatggcac acacaccact
9541 gctgtgtcgg agacccggca gacctgggcg gagtgggctg ctgcccattg gtggcagctc
9601 actctgggcg ccatttgcgc cctcccactc gctggcttac tcgcttgctg tgccaaatgc
9661 ttgtactact tgcgcggcgc tatagcgcct cgctagtggg cccccgcgcg aaacccgcac
9721 taggccacta gatccccgca cctgttgctg tatag
//