Molecular Development - Rubella Genome

From Embryology
==Introduction==


:{{Viral Links}}


LOCUS      RUBCG        9755 bp ss-RNA            VRL      08-MAR-1996<br>DEFINITION  Rubella virus complete genome encoding nonstructural protein,<br>            capsid protein, glycoproteins E1 and E2, complete cds.<br>        ACCESSION  M15240 M18901 M32735<br>NID        g333971<br>VERSION    M15240.1  GI:333971<br>KEYWORDS    C gene; capsid protein; glycoprotein; glycoprotein E1; glycoprotein<br>            E2; haemagglutinin; nonstructural protein.<br>SOURCE      .<br>  ORGANISM  Rubella virus<br>              Viruses; ssRNA positive-strand viruses, no DNA stage; Togaviridae;<br>            Rubivirus.<br>REFERENCE  1  (bases 8155 to 9754)<br>  AUTHORS  Frey,T.K., Marr,L.D., Hemphill,M.L. and Dominguez,G.<br>  TITLE    Molecular cloning and sequencing of the region of the rubella virus<br>            genome coding for glycoprotein E1<br>  JOURNAL  Virology 154 (1), 228-232 (1986)<br>  MEDLINE  86317717<br>  REFERENCE  2  (bases 5917 to 9754)<br>  AUTHORS  Frey,T.K. and Marr,L.D.<br>  JOURNAL  Unpublished (1987)<br>REFERENCE  3  (bases 5247 to 8366)<br>  AUTHORS  Frey,T.K. and Marr,L.D.<br>  TITLE    Sequence of the region coding for virion proteins C and E2 and the<br>            carboxy terminus of the nonstructural proteins of rubella virus:<br>            comparison with alphaviruses<br>  JOURNAL  Gene 62 (1), 85-99 (1988)<br>  MEDLINE  88226020<br>  REFERENCE  4  (bases 1 to 9755)<br>  AUTHORS  Dominguez,G., Wang,C.Y. 
and Frey,T.K.<br>  TITLE    Sequence of the genome RNA of rubella virus: evidence for genetic<br>            rearrangement during togavirus evolution<br>  JOURNAL  Virology 177 (1), 225-238 (1990)<br>  MEDLINE  90281585<br>  COMMENT    [2]  revises [1].<br>            Draft entry and computer-readable copy of sequence in [2] kindly<br>            provided by T.K.Frey, 01-JUN-1987.<br>            Draft entry and computer-readable sequence for [4] kindly submitted<br>            by G.Dominguez, 09-MAR-1990, for release after publication.<br>            Glycoprotein E1 contains the viral hemagglutinin activity. Multiple<br>            copies of the C protein comprise the nucleocapsid.<br>        FEATURES            Location/Qualifiers<br>    source          1..9755<br>                    /organism="Rubella virus"<br>                    /note="other clones: pRUB1010[1012, 1002, 1006, 1015,<br>                    1001]"<br>                    /db_xref="taxon:11041"<br>                    /clone="pRUB1025"<br>    CDS            39..6656<br>                    /note="nonstructural polyprotein precursor"<br>                    /codon_start=1<br>                    /protein_id="AAA88528.1"<br>                    /db_xref="PID:g333972"<br>                    /db_xref="GI:333972"<br>                    /translation="MEKLLDEVLAPGGPYNLTVGSWVRDHVRSIVEGAWEVRDVVTAA<br>                    QKRAIVAVIPRPVFTQMQVSDHPALHAISRYTRRHWIEWGPKEALHVLIDPSPGLLRE<br>                    VARVERRWVALCLHRTARKLATALAETASEAWHADYVCALRGAPSGPFYVHPEDVPHG<br>                    GRAVADRCLLYYTPMQMCELMRTIDATLLVAVDLWPVALAAHVGDDWDDLGIAWHLDH<br>                    DGGCPADCRGAGAGPTPGYTRPCTTRIYQVLPDTAHPGRLYRCGPRLWTRDCAVAELS<br>                    WEVAQHCGHQARVRAVRCTLPIRHVRSLQPSARVRLPDLVHLAEVGRWRWFSLPRPVF<br>                    QRMLSYCKTLSPDAYYSERVFKFKNALCHSITLAGNVLQEGWKGTCAEEDALCAYVAF<br>                    RAWQSNARLAGIMKGAKCAADSLSVAGWLDTIWDAIKRFLGSVPLAERMEEWEQDAAV<br>                    
AAFDRGPLEDGGRHLDTVQPPKSPPRPEIAATWIVHAASEDRHCACAPRCDVPRERPS<br>                    APAGQPDDEALIPPWLFAERRALRCREWDFEALRARADTAAAPAPPAPRPARYPTVLY<br>                    RHPAHHGPWLTLDEPGEADAALVLCDPLGQPLRGPERHFAAGAHMCAQARGLQAFVRV<br>                    VPPPERPWADGGARAWAKFFRGCAWAQRLLGEPAVMHLPYTDGDVPQLIALALRTLAQ<br>                    QGAALALSVRDLPGGAAFDANAVTAAVRAGPRQSAAASPPPGDPPPPRRARRSQRHSD<br>                    ARGTPPPAPARDPPPPAPSPPAPPRAGDPVPPIPAGPADRARDAELEVACEPSGPPTS<br>                    TRADPDSDIVESYARAAGPVHLRVRDIMDPPPGCKVVVNAANEGLLAGSGVCGAIFAN<br>                    ATAALAANCRRLAPCPTGEAVATPGHGCGYTHIIHAVAPRRPRDPAALEEGEALLERA<br>                    YRSIVALAAARRWACVACPLLGAGVYGWSAAESLRAALAATRTEPVERVSLHICHPDR<br>                    ATLTHASVLVGAGLAARRVSPPPTEPLASCPAGDPGRPAQRSASPPATPLGDATAPEP<br>                    RGCQGCELCRYTRVTNDRAYVNLWLERDRGATSWAMRIPEVVVYGPEHLATHFPLNHY<br>                    SVLKPAEVRPPRGMCGSDMWRCRGWHGMPQVRCTPSNAHAALCRTGVPPRASTRGGEL<br>                    DPNTCWLRAAANVAQAARACGAYTSAGCPKCAYGRALSEARTHEDFAALSQRWSASHA<br>                    DASPDGTGDPLDPLMETVGCACSRVWVGSEHEAPPDHLLVSLHRAPNGPWGVVLEVRA<br>                    RPEGGNPTGHFVCAVGGGPRRVSDRPHLWLAVPLSRGGGTCAATDEGLAQAYYDDLEV<br>                    RRLGDDAMARAALASVQRPRKGPYNIRVWNMAAGAGKTTRILAAFTREDLYVCPTNAL<br>                    LHEIQAKLRARDIDIKNAATYERRLTKPLAAYRRIYIDEAFTLGGEYCAFVASQTTAE<br>                    VICVGDRDQCGPHYANNCRTPVPDRWPTERSRHTWRFPDCWAARLRAGLDYDIEGERT<br>                    GTFACNLWDGRQVDLHLAFSRETVRRLHEAGIRAYTVREAQGMSVGTACIHVGRDGTD<br>                    VALALTRDLAIVSLTRASDALYLHELEDGSLRAAGLSAFLDAGALAELKEVPAGIDRV<br>                    VAVEQAPPPLPPADGIPEAQDVPPFCPRTLEELVFGRAGHPHYADLNRVTEGEREVRY<br>                    MRISRHLLNKNHTEMPGTERVLSAVCAVRRYRAGEDGSTLRTAVARQHPRPFRQIPPP<br>                    RVTAGVAQEWRMTYLRERIDLTDVYTQMGVAARELTDRYARRYPEIFAGMCTAQSLSV<br>                    PAFLKATLKCVDAALGPRDTEDCHAAQGKAGLEIRAWAKEWVQVMSPHFRAIQKIIMR<br>                    
ALRPQFLVAAGHTEPEVDAWWQAHYTTNAIEVDFTEFDMNQTLATRDVELEISAALLG<br>                    LPCAEDYRALRAGSYCTLRELGSTETGCERTSGEPATLLHNTTVAMCMAMRMVPKGVR<br>                    WAGIFQGDDMVIFLPEGARSAALKWTPAEVGLFGFHIPVKHVSTPTPSFCGHVGTAAG<br>                    LFHDVMHQAIKVLCRRFDPDVLEEQQVALLDRLRGVYAALPDTVAANAAYYDYSAERV<br>                    LAIVRELTAYAGARPRPPGHHRRARGDSDPLRARQSPRRRLTPLYVGPLILPTLTRSS<br>                    PTVVSPHLVGTQLLPFGRAPGCPNGFYYPHHHGGPPEGPRGTIPRPARGTRRRRLAVA<br>                    PAAAAATARLQHLRR"<br>    mRNA            6428..9755<br>                    /note="subgenomic RNA"<br>    mat_peptide    6505..7404<br>                    /note="capsid protein (C)"<br>    CDS            6505..9696<br>                    /note="structural polyprotein precursor"<br>                    /codon_start=1<br>                    /protein_id="AAA88529.1"<br>                    /db_xref="PID:g333973"<br>                    /db_xref="GI:333973"<br>                    /translation="MASTTPITMEDLQKALEAQSRALRAELAAGASQSRRPRPPRQRD<br><br>                    SSTSGDDSGRDSGGPRRRRGNRGRGQRRDWSRAPPPPEERQETRSQTPAPKPSRAPPQ<br><br>                    QPQPPRMQTGRGGSAPRPELGPPTNPFQAAVARGLRPPLHDPDTEAPTEACVTSWLWS<br><br>                    EGEGAVFYRVDLHFTNLGTPPLDEDGRWDPALMYNPCGPEPPAHVVRAYNQPAGDVRG<br><br>                    VWGKGERTYAEQDFRVGGTRWHRLLRMPVRGLDGDSAPLPPHTTERIETRSARHPWRI<br><br>                    RFGAPQAFLAGLLLATVAVGTARAGLQPRADMAAPPTLPQPPCAHGQHYGHHHHQLPF<br><br>                    LGHDGHHGGTLRVGQHYRNASDVLPGHWLQGGWGCYNLSDWHQGTHVCHTKHMDFWCV<br><br>                    EHDRPPPATPTPLTTAANSTTAATPATAPAPCHAGLNDSCGGFLSGCGPMRLRHGADT<br><br>                    RCGRLICGLSTTAQYPPTRFGCAMRWGLPPWELVVLTARPEDGWTCRGVPAHPGARCP<br><br>                    ELVSPMGRATCSPASALWLATANALSLDHALAAFVLLVPWVLIFMVCRRACRRRGAAA<br><br>                    ALTAVVLQGYNPPAYGEEAFTYLCTAPGCATQAPVPVRLAGVRFESKIVDGGCFAPWD<br><br>                    LEATGACICEIPTDVSCEGLGAWVPAAPCARIWNGTQRACTFWAVNAYSSGGYAQLAS<br><br>                    
YFNPGGSYYKQYHPTACEVEPAFGHSDAACWGFPTDTVMSVFALASYVQHPHKTVRVK<br><br>                    FHTETRTVWQLSVAGVSCNVTTEHPFCNTPHGQLEVQVPPDPGDLVEYIMNYTGNQQS<br><br>                    RWGLGSPNCHGPDWASPVCQRHSPDCSRLVGATPERPRLRLVDADDPLLRTAPGPGEV<br><br>                    WVTPVIGSQARKCGLHIRAGPYGHATVEMPEWIHAHTTSDPWHPPGPLGLKFKTVRPV<br><br>                    ALPRTLAPPRNVRVTGCYQCGTPALVEGLAPGGGNCHLTVNGEDLGAVPPGKFVTAAL<br><br>                    LNTPPPYQVSCGGESDRATARVIDPAAQSFTGVVYGTHTTAVSETRQTWAEWAAAHWW<br><br>                    QLTLGAICALPLAGLLACCAKCLYYLRGAIAPR"<br>    mat_peptide    7405..8250<br>                    /note="glycoprotein E2"<br>    mat_peptide    8251..9693<br>                    /note="glycoprotein E1"<br>BASE COUNT    1457 a  3781 c  3007 g  1510 t<br>        ORIGIN      <br>        1 atggaagcta tcggacctcg cttaggactc ccattcccat ggagaaactc ctagatgagg<br>      61 ttcttgcccc cggtgggcct tataacttaa ccgtcggcag ttgggtaaga gaccacgtcc<br>      121 gatcaattgt cgagggcgcg tgggaagtgc gcgatgttgt taccgctgcc caaaagcggg<br>      181 ccatcgtagc cgtgataccc agacctgtgt tcacgcagat gcaggtcagt gatcacccag<br>      241 cactccacgc aatttcgcgg tatacccgcc gccattggat cgagtggggc cctaaagaag<br>      301 ccctacacgt cctcatcgac ccaagcccgg gcctgctccg cgaggtcgct cgcgttgagc<br>      361 gccgctgggt cgcactgtgc ctccacagga cggcacgcaa actcgccacc gccctggccg<br>      421 agacggccag cgaggcgtgg cacgctgact acgtgtgcgc gctgcgtggc gcaccgagcg<br>      481 gccccttcta cgtccaccct gaggacgtcc cgcacggcgg tcgcgccgtg gcggacagat<br>      541 gcttgctcta ctacacaccc atgcagatgt gcgagctgat gcgtaccatt gacgccaccc<br>      601 tgctcgtggc ggttgacttg tggccggtcg cccttgcggc ccacgtcggc gacgactggg<br>      661 acgacctggg cattgcctgg catctcgacc atgacggcgg ttgccccgcc gattgccgcg<br>      721 gagccggcgc tgggcccacg cccggctaca cccgcccctg caccacacgc atctaccaag<br>      781 tcctgccgga caccgcccac cccgggcgcc tctaccggtg cgggccccgc ctgtggacgc<br>      841 gcgattgcgc cgtggccgaa ctctcatggg aggttgccca acactgcggg caccaggcgc<br>      901 gcgtgcgcgc 
cgtgcgatgc accctcccta tccgccacgt gcgcagcctc caacccagcg<br>      961 cgcgggtccg actcccggac ctcgtccatc tcgccgaggt gggccggtgg cggtggttca<br>    1021 gcctcccccg ccccgtgttc cagcgcatgc tgtcctactg caagaccctg agccccgacg<br>    1081 cgtactacag cgagcgcgtg ttcaagttca agaacgccct gtgccacagc atcacgctcg<br>    1141 cgggcaatgt gctgcaagag gggtggaagg gcacgtgcgc cgaggaagac gcgctgtgcg<br>    1201 catacgtagc cttccgcgcg tggcagtcta acgccaggtt ggcggggatt atgaaaggcg<br>    1261 cgaagtgcgc cgccgactct ttgagcgtgg ccggctggct ggacaccatt tgggacgcca<br>    1321 ttaagcggtt cctcggtagc gtgcccctcg ccgagcgcat ggaggagtgg gaacaggacg<br>    1381 ccgcggtcgc cgccttcgac cgcggccccc tcgaggacgg cgggcgccac ttggacaccg<br>    1441 tgcaaccccc aaaatcgccg ccccgccctg agatcgccgc gacctggatc gtccacgcag<br>    1501 ccagcgaaga ccgccattgc gcgtgcgctc cccgctgcga cgtcccgcgc gaacgtcctt<br>    1561 ccgcgcccgc cggccagccg gatgacgagg cgctcatccc gccgtggctg ttcgccgagc<br>    1621 gccgtgccct ccgctgccgc gagtgggatt tcgaggctct ccgcgcgcgc gccgatacgg<br>    1681 cggccgcgcc cgccccgccg gctccacgcc ccgcgcggta ccccaccgtg ctctaccgcc<br>    1741 accccgccca ccacggcccg tggctcaccc ttgacgagcc gggcgaggct gacgcggccc<br>    1801 tggtcttatg cgacccactt ggccagccgc tccggggccc tgaacgccac ttcgccgccg<br>    1861 gcgcgcatat gtgcgcgcag gcgcgggggc tccaggcttt tgtccgtgtc gtgcctccac<br>    1921 ccgagcgccc ctgggccgac gggggcgcca gagcgtgggc gaagttcttc cgcggctgcg<br>    1981 cctgggcgca gcgcttgctc ggcgagccag cagttatgca cctcccatac accgatggcg<br>    2041 acgtgccaca gctgatcgca ctggctttgc gcacgctggc ccaacagggg gccgccttgg<br>    2101 cactctcggt gcgtgacctg cccgggggtg cagcgttcga cgcaaacgcg gtcaccgccg<br>    2161 ccgtgcgcgc tggcccccgc cagtccgcgg ccgcgtcacc gccacccggc gaccccccgc<br>    2221 cgccgcgccg cgcacggcga tcgcaacggc actcggacgc tcgcggcact ccgccccccg<br>    2281 cgcctgcgcg cgacccgccg ccgcccgccc ccagcccgcc cgcgccaccc cgcgctggtg<br>    2341 acccggtccc tcccattccc gcggggccgg cggatcgcgc gcgtgacgcc gagctggagg<br>    2401 tcgcctgcga gccgagcggc ccccccacgt caaccagggc agacccagac 
agcgacatcg<br>    2461 ttgaaagtta cgcccgcgcc gccggacccg tgcacctccg agtccgcgac atcatggacc<br>    2521 caccgcccgg ctgcaaggtc gtggtcaacg ccgccaacga ggggctactg gccggctctg<br>    2581 gcgtgtgcgg tgccatcttt gccaacgcca cggcggccct cgctgcaaac tgccggcgcc<br>    2641 tcgccccatg ccccaccggc gaggcagtgg cgacacccgg ccacggctgc gggtacaccc<br>    2701 acatcatcca cgccgtcgcg ccgcggcgtc ctcgggaccc cgccgccctc gaggagggcg<br>    2761 aagcgctgct cgagcgcgcc taccgcagca tcgtcgcgct agccgccgcg cgtcggtggg<br>    2821 cgtgtgtcgc gtgccccctc ctcggcgctg gcgtctacgg ctggtctgct gcggagtccc<br>    2881 tccgagccgc gctcgcggct acgcgcaccg agcccgtcga gcgcgtgagc ctgcacatct<br>    2941 gccaccccga ccgcgccacg ctgacgcacg cctccgtgct cgtcggcgcg gggctcgctg<br>    3001 ccaggcgcgt cagtcctcct ccgaccgagc ccctcgcatc ttgccccgcc ggtgacccgg<br>    3061 gccgaccggc tcagcgcagc gcgtcgcccc cagcgacccc ccttggggat gccaccgcgc<br>    3121 ccgagccccg cggatgccag gggtgcgaac tctgccggta cacgcgcgtc accaatgacc<br>    3181 gcgcctatgt caacctgtgg ctcgagcgcg accgcggcgc caccagctgg gccatgcgca<br>    3241 ttcccgaggt ggttgtctac gggccggagc acctcgccac gcattttcca ttaaaccact<br>    3301 acagtgtgct caagcccgcg gaggtcaggc ccccgcgagg catgtgcggg agtgacatgt<br>    3361 ggcgctgccg cggctggcat ggcatgccgc aggtgcggtg caccccctcc aacgctcacg<br>    3421 ccgccctgtg ccgcacaggc gtgccccctc gggcgagcac gcgaggcggc gagctagacc<br>    3481 caaacacctg ctggctccgc gccgccgcca acgttgcgca ggctgcgcgc gcctgcggcg<br>    3541 cctacacgag tgccgggtgc cccaagtgcg cctacggccg cgccctgagc gaagcccgca<br>    3601 ctcatgagga cttcgccgcg ctgagccagc ggtggagcgc gagccacgcc gatgcctccc<br>    3661 ctgacggcac cggagatccc ctcgaccccc tgatggagac cgtgggatgc gcctgttcgc<br>    3721 gcgtgtgggt cggctccgag catgaggccc cgcccgacca cctcctggtg tcccttcacc<br>    3781 gtgccccaaa tggtccgtgg ggcgtagtgc tcgaggtgcg tgcgcgcccc gaggggggca<br>    3841 accccaccgg ccacttcgtc tgcgcggtcg gcggcggccc acgccgcgtc tcggaccgcc<br>    3901 cccacctctg gcttgcggtc cccctgtctc ggggcggtgg cacctgtgcc gcgaccgacg<br>    3961 aggggctggc ccaggcgtac 
tacgacgacc tcgaggtgcg ccgcctcggg gatgacgcca<br>    4021 tggcccgggc ggccctcgca tcagtccaac gccctcgcaa aggcccttac aatatcaggg<br>    4081 tatggaacat ggccgcaggc gctggcaaga ctacccgcat cctcgctgcc ttcacgcgcg<br>    4141 aagaccttta cgtctgcccc accaatgcgc tcctgcacga gatccaggcc aaactccgcg<br>    4201 cgcgcgatat cgacatcaag aacgccgcca cctacgagcg ccggctgacg aaaccgctcg<br>    4261 ccgcctaccg ccgcatctac atcgatgagg cgttcactct cggcggcgag tactgcgcgt<br>    4321 tcgttgccag ccaaaccacc gcggaggtga tctgcgtcgg tgatcgggac cagtgcggcc<br>    4381 cacactacgc caataactgc cgcacccccg tccctgaccg ctggcctacc gagcgctcgc<br>    4441 gccacacttg gcgcttcccc gactgctggg cggcccgcct gcgcgcgggg ctcgattatg<br>    4501 acatcgaggg cgagcgcacc ggcaccttcg cctgcaacct ttgggacggc cgccaggtcg<br>    4561 accttcacct cgccttctcg cgcgaaaccg tgcgccgcct tcacgaggct ggcatacgcg<br>    4621 catacaccgt gcgcgaggcc cagggtatga gcgtcggcac cgcctgcatc catgtaggca<br>    4681 gagacggcac ggacgttgcc ctggcgctga cacgcgacct cgccatcgtc agcctgaccc<br>    4741 gggcctccga cgcactctac ctccacgagc tcgaggacgg ctcactgcgc gctgcggggc<br>    4801 tcagcgcgtt cctcgacgcc ggggcactgg cggagctcaa ggaggttccc gctggcattg<br>    4861 accgcgttgt cgccgtcgag caggcaccac caccgttgcc gcccgccgac ggcatccccg<br>    4921 aggcccaaga cgtgccgccc ttctgccccc gcactctgga ggagctcgtc ttcggccgtg<br>    4981 ccggccaccc ccattacgcg gacctcaacc gcgtgactga gggcgaacga gaagtgcggt<br>    5041 acatgcgcat ctcgcgtcac ctgctcaaca agaatcacac cgagatgccc ggaacggaac<br>    5101 gcgttctcag tgccgtttgc gccgtgcggc gctaccgcgc gggcgaggat gggtcgaccc<br>    5161 tccgcactgc tgtggcccgc cagcacccgc gcccttttcg ccagatccca cccccgcgcg<br>    5221 tcactgctgg ggtcgcccag gagtggcgca tgacgtactt gcgggaacgg atcgacctca<br>    5281 ctgatgtcta cacgcagatg ggcgtggccg cgcgggagct caccgaccgc tacgcgcgcc<br>    5341 gctatcctga gatcttcgcc ggcatgtgta ccgcccagag cctgagcgtc cccgccttcc<br>    5401 tcaaagccac cttgaagtgc gtagacgccg ccctcggccc cagggacacc gaggactgcc<br>    5461 acgccgctca ggggaaagcc ggccttgaga tccgggcgtg ggccaaggag tgggttcagg<br>   
 5521 ttatgtcccc gcatttccgc gcgatccaga agatcatcat gcgcgccttg cgcccgcaat<br>    5581 tccttgtggc cgctggccat acggagcccg aggtcgatgc gtggtggcag gcccattaca<br>    5641 ccaccaacgc catcgaggtc gacttcactg agttcgacat gaaccagacc ctcgctactc<br>    5701 gggacgtcga gctcgagatt agcgccgctc tcttgggcct cccttgcgcc gaagactacc<br>    5761 gcgcgctccg cgccggcagc tactgcaccc tgcgcgaact gggctccact gagaccggct<br>    5821 gcgagcgcac aagcggcgag cccgccacgc tgctgcacaa caccaccgtg gccatgtgca<br>    5881 tggccatgcg catggtcccc aaaggcgtgc gctgggccgg gattttccag ggtgacgata<br>    5941 tggtcatctt cctccccgag ggcgcgcgca gcgcggcact caagtggacc cccgccgagg<br>    6001 tgggcttgtt tggcttccac atcccggtga agcacgtgag cacccctacc cccagcttct<br>    6061 gcgggcacgt cggcaccgcg gccggcctct tccatgatgt catgcaccag gcgatcaagg<br>    6121 tgctttgccg ccgtttcgac ccagacgtgc ttgaagaaca gcaggtggcc ctcctcgacc<br>    6181 gcctccgggg ggtctacgcg gctctgcctg acaccgttgc cgccaatgct gcgtactacg<br>    6241 actacagcgc ggagcgcgtc ctcgctatcg tgcgcgaact taccgcgtac gcgggggcgc<br>    6301 ggcctcgacc acccggccac catcggcgcg ctcgaggaga ttcagacccc ctacgcgcgc<br>    6361 gccaatctcc acgacgccga ctaacgcccc tgtacgtggg gcctttaatc ttacctactc<br>    6421 taaccaggtc atcacccacc gttgtttcgc cgcatctggt gggtacccaa cttttgccat<br>    6481 tcgggagagc cccagggtgc ccgaatggct tctactaccc ccatcaccat ggaggacctc<br>    6541 cagaaggccc tcgaggcaca atcccgcgcc ctgcgcgcgg aactcgccgc cggcgcctcg<br>    6601 cagtcgcgcc ggccgcggcc gccgcgacag cgcgactcca gcacctccgg agatgactcc<br>    6661 ggccgtgact ccggagggcc ccgccgccgc cgcggcaacc ggggccgtgg ccagcgcagg<br>    6721 gactggtcca gggccccgcc ccccccggag gagcggcaag aaactcgctc ccagactccg<br>    6781 gccccgaagc catcgcgggc gccgccacaa cagcctcaac ccccgcgcat gcaaaccggg<br>    6841 cgtgggggct ctgccccgcg ccccgagctg gggccaccga ccaacccgtt ccaagcagcc<br>    6901 gtggcgcgtg gcctgcgccc gcctctccac gaccctgaca ccgaggcacc caccgaggcc<br>    6961 tgcgtgacct cgtggctttg gagcgagggc gaaggcgcgg tcttttaccg cgtcgacctg<br>    7021 catttcacca acctgggcac ccccccactc gacgaggacg 
gccgctggga ccctgcgctc<br>    7081 atgtacaacc cttgcgggcc cgagccgccc gctcacgtcg tccgcgcgta caatcaacct<br>    7141 gccggcgacg tcaggggcgt ttggggtaaa ggcgagcgca cctacgccga gcaggacttc<br>    7201 cgcgtcggcg gcacgcgctg gcaccgactg ctgcgcatgc cagtgcgcgg cctcgacggc<br>    7261 gacagcgccc cgcttccccc ccacaccacc gagcgcattg agacccgctc ggcgcgccat<br>    7321 ccttggcgca tccgcttcgg tgccccccag gccttccttg ccgggctctt gctcgccacg<br>    7381 gtcgccgttg gcaccgcgcg cgccgggctc cagccccgcg ctgatatggc ggcacctcct<br>    7441 acgctgccgc agcccccctg tgcgcacggg cagcattacg gccaccacca ccatcagctg<br>    7501 ccgttcctcg ggcacgacgg ccatcatggc ggcaccttgc gcgtcggcca gcattaccga<br>    7561 aacgccagcg acgtgctgcc cggccactgg ctccaaggcg gctggggttg ctacaacctg<br>    7621 agcgactggc accagggcac tcatgtctgt cataccaagc acatggactt ctggtgtgtg<br>    7681 gagcacgacc gaccgccgcc cgcgaccccg acgcctctca ccaccgcggc gaactccacg<br>    7741 accgccgcca cccccgccac tgcgccggcc ccctgccacg ccggcctcaa tgacagctgc<br>    7801 ggcggcttct tgtctgggtg cgggccgatg cgcctgcgcc acggcgctga cacccggtgc<br>    7861 ggtcggttga tctgcgggct gtccaccacc gcccagtacc cgcctacccg gtttggctgc<br>    7921 gctatgcggt ggggccttcc cccctgggaa ctggtcgtcc ttaccgcccg ccccgaagac<br>    7981 ggctggactt gccgcggcgt gcccgcccat ccaggcgccc gctgccccga actggtgagc<br>    8041 cccatgggac gcgcgacttg ctccccagcc tcggccctct ggctcgccac agcgaacgcg<br>    8101 ctgtctcttg atcacgccct cgcggccttc gtcctgctgg tcccgtgggt cctgatattt<br>    8161 atggtgtgcc gccgcgcctg tcgccgccgc ggcgccgccg ccgccctcac cgcggtcgtc<br>    8221 ctgcaggggt acaacccccc cgcctatggc gaggaggctt tcacctacct ctgcactgca<br>    8281 ccggggtgcg ccactcaagc acctgtcccc gtgcgcctcg ctggcgtccg ttttgagtcc<br>    8341 aagattgtgg acggcggctg ctttgcccca tgggacctcg aggccactgg agcctgcatt<br>    8401 tgcgagatcc ccactgatgt ctcgtgcgag ggcttggggg cctgggtacc cgcagcccct<br>    8461 tgcgcgcgca tctggaatgg cacacagcgc gcgtgcacct tctgggctgt caacgcctac<br>    8521 tcctctggcg ggtacgcgca gctggcctct tacttcaacc ctggcggcag ctactacaag<br>    8581 cagtaccacc 
ctaccgcgtg cgaggttgaa cctgccttcg gacacagcga cgcggcctgc<br>    8641 tggggcttcc ccaccgacac cgtgatgagc gtgttcgccc ttgctagcta cgtccagcac<br>    8701 cctcacaaga ccgtccgggt caagttccat acagagacca ggaccgtctg gcaactctcc<br>    8761 gttgccggcg tgtcgtgcaa cgtcaccact gaacacccgt tctgcaacac gccgcacgga<br>    8821 caactcgagg tccaggtccc gcccgacccc ggggacctgg ttgagtacat tatgaattac<br>    8881 accggcaatc agcagtcccg gtggggcctc gggagcccga attgccacgg ccccgattgg<br>    8941 gcctccccgg tttgccaacg ccattcccct gactgctcgc ggcttgtggg ggccacgcca<br>    9001 gagcgccccc ggctgcgcct ggtcgacgcc gacgaccccc tgctgcgcac tgcccctgga<br>    9061 cccggcgagg tgtgggtcac gcctgtcata ggctctcagg cgcgcaagtg cggactccac<br>    9121 atacgcgctg gaccgtacgg ccatgctacc gtcgaaatgc ccgagtggat ccacgcccac<br>    9181 accaccagcg acccctggca tccaccgggc cccttggggc tgaagttcaa gacagttcgc<br>    9241 ccggtggccc tgccacgcac gttagcgcca ccccgcaatg tgcgtgtgac cgggtgctac<br>    9301 cagtgcggta cccccgcgct ggtggaaggc cttgcccccg ggggaggcaa ttgccatctc<br>    9361 accgtcaatg gcgaggacct cggcgccgtc ccccctggga agttcgtcac cgccgccctc<br>    9421 ctcaacaccc ccccgcccta ccaagtcagc tgcgggggcg agagcgatcg cgcgaccgcg<br>    9481 cgggtcatcg accccgccgc gcaatcgttt accggcgtgg tgtatggcac acacaccact<br>    9541 gctgtgtcgg agacccggca gacctgggcg gagtgggctg ctgcccattg gtggcagctc<br>    9601 actctgggcg ccatttgcgc cctcccactc gctggcttac tcgcttgctg tgccaaatgc<br>    9661 ttgtactact tgcgcggcgc tatagcgcct cgctagtggg cccccgcgcg aaacccgcac<br>    9721 taggccacta gatccccgca cctgttgctg tatag<br>//
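
The coordinates and base counts in the record above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the GenBank entry: the numbers are copied by hand from the LOCUS, FEATURES and BASE COUNT lines, and the helper names (`span_length`, `gc_fraction`) are hypothetical, introduced only for illustration. It assumes the usual GenBank convention that feature locations are 1-based and inclusive.

```python
# Values copied by hand from the GenBank record above (accession M15240).
LOCUS_LENGTH = 9755
BASE_COUNT = {"a": 1457, "c": 3781, "g": 3007, "t": 1510}

# Feature coordinates, 1-based and inclusive, as listed in the FEATURES table.
FEATURES = {
    "nonstructural polyprotein CDS": (39, 6656),
    "structural polyprotein CDS": (6505, 9696),
    "capsid protein (C)": (6505, 7404),
    "glycoprotein E2": (7405, 8250),
    "glycoprotein E1": (8251, 9693),
}


def span_length(start, end):
    """Number of nucleotides in an inclusive 1-based interval."""
    return end - start + 1


def gc_fraction(counts):
    """Fraction of bases that are g or c, from a base-count table."""
    return (counts["g"] + counts["c"]) / sum(counts.values())


# The per-base counts should add up to the sequence length on the LOCUS line.
assert sum(BASE_COUNT.values()) == LOCUS_LENGTH

# Report each feature's length and whether it is a whole number of codons.
for name, (start, end) in FEATURES.items():
    n = span_length(start, end)
    print(f"{name}: {n} nt, {n // 3} codons, remainder {n % 3}")

print(f"G+C fraction: {gc_fraction(BASE_COUNT):.3f}")
```

Under these assumptions the three mature peptides (C, E2, E1) tile the structural CDS exactly, leaving only the final three nucleotides, 9694..9696, which read `tag` in the ORIGIN block and so correspond to the stop codon.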
 5521 ttatgtcccc gcatttccgc gcgatccaga agatcatcat gcgcgccttg cgcccgcaat<br>    5581 tccttgtggc cgctggccat acggagcccg aggtcgatgc gtggtggcag gcccattaca<br>    5641 ccaccaacgc catcgaggtc gacttcactg agttcgacat gaaccagacc ctcgctactc<br>    5701 gggacgtcga gctcgagatt agcgccgctc tcttgggcct cccttgcgcc gaagactacc<br>    5761 gcgcgctccg cgccggcagc tactgcaccc tgcgcgaact gggctccact gagaccggct<br>    5821 gcgagcgcac aagcggcgag cccgccacgc tgctgcacaa caccaccgtg gccatgtgca<br>    5881 tggccatgcg catggtcccc aaaggcgtgc gctgggccgg gattttccag ggtgacgata<br>    5941 tggtcatctt cctccccgag ggcgcgcgca gcgcggcact caagtggacc cccgccgagg<br>    6001 tgggcttgtt tggcttccac atcccggtga agcacgtgag cacccctacc cccagcttct<br>    6061 gcgggcacgt cggcaccgcg gccggcctct tccatgatgt catgcaccag gcgatcaagg<br>    6121 tgctttgccg ccgtttcgac ccagacgtgc ttgaagaaca gcaggtggcc ctcctcgacc<br>    6181 gcctccgggg ggtctacgcg gctctgcctg acaccgttgc cgccaatgct gcgtactacg<br>    6241 actacagcgc ggagcgcgtc ctcgctatcg tgcgcgaact taccgcgtac gcgggggcgc<br>    6301 ggcctcgacc acccggccac catcggcgcg ctcgaggaga ttcagacccc ctacgcgcgc<br>    6361 gccaatctcc acgacgccga ctaacgcccc tgtacgtggg gcctttaatc ttacctactc<br>    6421 taaccaggtc atcacccacc gttgtttcgc cgcatctggt gggtacccaa cttttgccat<br>    6481 tcgggagagc cccagggtgc ccgaatggct tctactaccc ccatcaccat ggaggacctc<br>    6541 cagaaggccc tcgaggcaca atcccgcgcc ctgcgcgcgg aactcgccgc cggcgcctcg<br>    6601 cagtcgcgcc ggccgcggcc gccgcgacag cgcgactcca gcacctccgg agatgactcc<br>    6661 ggccgtgact ccggagggcc ccgccgccgc cgcggcaacc ggggccgtgg ccagcgcagg<br>    6721 gactggtcca gggccccgcc ccccccggag gagcggcaag aaactcgctc ccagactccg<br>    6781 gccccgaagc catcgcgggc gccgccacaa cagcctcaac ccccgcgcat gcaaaccggg<br>    6841 cgtgggggct ctgccccgcg ccccgagctg gggccaccga ccaacccgtt ccaagcagcc<br>    6901 gtggcgcgtg gcctgcgccc gcctctccac gaccctgaca ccgaggcacc caccgaggcc<br>    6961 tgcgtgacct cgtggctttg gagcgagggc gaaggcgcgg tcttttaccg cgtcgacctg<br>    7021 catttcacca acctgggcac ccccccactc gacgaggacg 
gccgctggga ccctgcgctc<br>    7081 atgtacaacc cttgcgggcc cgagccgccc gctcacgtcg tccgcgcgta caatcaacct<br>    7141 gccggcgacg tcaggggcgt ttggggtaaa ggcgagcgca cctacgccga gcaggacttc<br>    7201 cgcgtcggcg gcacgcgctg gcaccgactg ctgcgcatgc cagtgcgcgg cctcgacggc<br>    7261 gacagcgccc cgcttccccc ccacaccacc gagcgcattg agacccgctc ggcgcgccat<br>    7321 ccttggcgca tccgcttcgg tgccccccag gccttccttg ccgggctctt gctcgccacg<br>    7381 gtcgccgttg gcaccgcgcg cgccgggctc cagccccgcg ctgatatggc ggcacctcct<br>    7441 acgctgccgc agcccccctg tgcgcacggg cagcattacg gccaccacca ccatcagctg<br>    7501 ccgttcctcg ggcacgacgg ccatcatggc ggcaccttgc gcgtcggcca gcattaccga<br>    7561 aacgccagcg acgtgctgcc cggccactgg ctccaaggcg gctggggttg ctacaacctg<br>    7621 agcgactggc accagggcac tcatgtctgt cataccaagc acatggactt ctggtgtgtg<br>    7681 gagcacgacc gaccgccgcc cgcgaccccg acgcctctca ccaccgcggc gaactccacg<br>    7741 accgccgcca cccccgccac tgcgccggcc ccctgccacg ccggcctcaa tgacagctgc<br>    7801 ggcggcttct tgtctgggtg cgggccgatg cgcctgcgcc acggcgctga cacccggtgc<br>    7861 ggtcggttga tctgcgggct gtccaccacc gcccagtacc cgcctacccg gtttggctgc<br>    7921 gctatgcggt ggggccttcc cccctgggaa ctggtcgtcc ttaccgcccg ccccgaagac<br>    7981 ggctggactt gccgcggcgt gcccgcccat ccaggcgccc gctgccccga actggtgagc<br>    8041 cccatgggac gcgcgacttg ctccccagcc tcggccctct ggctcgccac agcgaacgcg<br>    8101 ctgtctcttg atcacgccct cgcggccttc gtcctgctgg tcccgtgggt cctgatattt<br>    8161 atggtgtgcc gccgcgcctg tcgccgccgc ggcgccgccg ccgccctcac cgcggtcgtc<br>    8221 ctgcaggggt acaacccccc cgcctatggc gaggaggctt tcacctacct ctgcactgca<br>    8281 ccggggtgcg ccactcaagc acctgtcccc gtgcgcctcg ctggcgtccg ttttgagtcc<br>    8341 aagattgtgg acggcggctg ctttgcccca tgggacctcg aggccactgg agcctgcatt<br>    8401 tgcgagatcc ccactgatgt ctcgtgcgag ggcttggggg cctgggtacc cgcagcccct<br>    8461 tgcgcgcgca tctggaatgg cacacagcgc gcgtgcacct tctgggctgt caacgcctac<br>    8521 tcctctggcg ggtacgcgca gctggcctct tacttcaacc ctggcggcag ctactacaag<br>    8581 cagtaccacc 
ctaccgcgtg cgaggttgaa cctgccttcg gacacagcga cgcggcctgc<br>    8641 tggggcttcc ccaccgacac cgtgatgagc gtgttcgccc ttgctagcta cgtccagcac<br>    8701 cctcacaaga ccgtccgggt caagttccat acagagacca ggaccgtctg gcaactctcc<br>    8761 gttgccggcg tgtcgtgcaa cgtcaccact gaacacccgt tctgcaacac gccgcacgga<br>    8821 caactcgagg tccaggtccc gcccgacccc ggggacctgg ttgagtacat tatgaattac<br>    8881 accggcaatc agcagtcccg gtggggcctc gggagcccga attgccacgg ccccgattgg<br>    8941 gcctccccgg tttgccaacg ccattcccct gactgctcgc ggcttgtggg ggccacgcca<br>    9001 gagcgccccc ggctgcgcct ggtcgacgcc gacgaccccc tgctgcgcac tgcccctgga<br>    9061 cccggcgagg tgtgggtcac gcctgtcata ggctctcagg cgcgcaagtg cggactccac<br>    9121 atacgcgctg gaccgtacgg ccatgctacc gtcgaaatgc ccgagtggat ccacgcccac<br>    9181 accaccagcg acccctggca tccaccgggc cccttggggc tgaagttcaa gacagttcgc<br>    9241 ccggtggccc tgccacgcac gttagcgcca ccccgcaatg tgcgtgtgac cgggtgctac<br>    9301 cagtgcggta cccccgcgct ggtggaaggc cttgcccccg ggggaggcaa ttgccatctc<br>    9361 accgtcaatg gcgaggacct cggcgccgtc ccccctggga agttcgtcac cgccgccctc<br>    9421 ctcaacaccc ccccgcccta ccaagtcagc tgcgggggcg agagcgatcg cgcgaccgcg<br>    9481 cgggtcatcg accccgccgc gcaatcgttt accggcgtgg tgtatggcac acacaccact<br>    9541 gctgtgtcgg agacccggca gacctgggcg gagtgggctg ctgcccattg gtggcagctc<br>    9601 actctgggcg ccatttgcgc cctcccactc gctggcttac tcgcttgctg tgccaaatgc<br>    9661 ttgtactact tgcgcggcgc tatagcgcct cgctagtggg cccccgcgcg aaacccgcac<br>    9721 taggccacta gatccccgca cctgttgctg tatag<br>//
==External Links==
* NCBI [http://www.ncbi.nlm.nih.gov/nuccore/NC_001545?report=graph&from=1&to=9762 Rubella virus, complete genome]


{{Glossary}}
{{Footer}}

Revision as of 01:24, 2 November 2011

Introduction


LOCUS RUBCG 9755 bp ss-RNA VRL 08-MAR-1996
DEFINITION Rubella virus complete genome encoding nonstructural protein,
capsid protein, glycoproteins E1 and E2, complete cds.
ACCESSION M15240 M18901 M32735
NID g333971
VERSION M15240.1 GI:333971
KEYWORDS C gene; capsid protein; glycoprotein; glycoprotein E1; glycoprotein
E2; haemagglutinin; nonstructural protein.
SOURCE .
ORGANISM Rubella virus
Viruses; ssRNA positive-strand viruses, no DNA stage; Togaviridae;
Rubivirus.
REFERENCE 1 (bases 8155 to 9754)
AUTHORS Frey,T.K., Marr,L.D., Hemphill,M.L. and Dominguez,G.
TITLE Molecular cloning and sequencing of the region of the rubella virus
genome coding for glycoprotein E1
JOURNAL Virology 154 (1), 228-232 (1986)
MEDLINE 86317717
REFERENCE 2 (bases 5917 to 9754)
AUTHORS Frey,T.K. and Marr,L.D.
JOURNAL Unpublished (1987)
REFERENCE 3 (bases 5247 to 8366)
AUTHORS Frey,T.K. and Marr,L.D.
TITLE Sequence of the region coding for virion proteins C and E2 and the
carboxy terminus of the nonstructural proteins of rubella virus:
comparison with alphaviruses
JOURNAL Gene 62 (1), 85-99 (1988)
MEDLINE 88226020
REFERENCE 4 (bases 1 to 9755)
AUTHORS Dominguez,G., Wang,C.Y. and Frey,T.K.
TITLE Sequence of the genome RNA of rubella virus: evidence for genetic
rearrangement during togavirus evolution
JOURNAL Virology 177 (1), 225-238 (1990)
MEDLINE 90281585
COMMENT [2] revises [1].
Draft entry and computer-readable copy of sequence in [2] kindly
provided by T.K.Frey, 01-JUN-1987.
Draft entry and computer-readable sequence for [4] kindly submitted
by G.Dominguez, 09-MAR-1990, for release after publication.
Glycoprotein E1 contains the viral hemagglutinin activity. Multiple
copies of the C protein comprise the nucleocapsid.
FEATURES Location/Qualifiers
source 1..9755
/organism="Rubella virus"
/note="other clones: pRUB1010[1012, 1002, 1006, 1015,
1001]"
/db_xref="taxon:11041"
/clone="pRUB1025"
CDS 39..6656
/note="nonstructural polyprotein precursor"
/codon_start=1
/protein_id="AAA88528.1"
/db_xref="PID:g333972"
/db_xref="GI:333972"
/translation="MEKLLDEVLAPGGPYNLTVGSWVRDHVRSIVEGAWEVRDVVTAA
QKRAIVAVIPRPVFTQMQVSDHPALHAISRYTRRHWIEWGPKEALHVLIDPSPGLLRE
VARVERRWVALCLHRTARKLATALAETASEAWHADYVCALRGAPSGPFYVHPEDVPHG
GRAVADRCLLYYTPMQMCELMRTIDATLLVAVDLWPVALAAHVGDDWDDLGIAWHLDH
DGGCPADCRGAGAGPTPGYTRPCTTRIYQVLPDTAHPGRLYRCGPRLWTRDCAVAELS
WEVAQHCGHQARVRAVRCTLPIRHVRSLQPSARVRLPDLVHLAEVGRWRWFSLPRPVF
QRMLSYCKTLSPDAYYSERVFKFKNALCHSITLAGNVLQEGWKGTCAEEDALCAYVAF
RAWQSNARLAGIMKGAKCAADSLSVAGWLDTIWDAIKRFLGSVPLAERMEEWEQDAAV
AAFDRGPLEDGGRHLDTVQPPKSPPRPEIAATWIVHAASEDRHCACAPRCDVPRERPS
APAGQPDDEALIPPWLFAERRALRCREWDFEALRARADTAAAPAPPAPRPARYPTVLY
RHPAHHGPWLTLDEPGEADAALVLCDPLGQPLRGPERHFAAGAHMCAQARGLQAFVRV
VPPPERPWADGGARAWAKFFRGCAWAQRLLGEPAVMHLPYTDGDVPQLIALALRTLAQ
QGAALALSVRDLPGGAAFDANAVTAAVRAGPRQSAAASPPPGDPPPPRRARRSQRHSD
ARGTPPPAPARDPPPPAPSPPAPPRAGDPVPPIPAGPADRARDAELEVACEPSGPPTS
TRADPDSDIVESYARAAGPVHLRVRDIMDPPPGCKVVVNAANEGLLAGSGVCGAIFAN
ATAALAANCRRLAPCPTGEAVATPGHGCGYTHIIHAVAPRRPRDPAALEEGEALLERA
YRSIVALAAARRWACVACPLLGAGVYGWSAAESLRAALAATRTEPVERVSLHICHPDR
ATLTHASVLVGAGLAARRVSPPPTEPLASCPAGDPGRPAQRSASPPATPLGDATAPEP
RGCQGCELCRYTRVTNDRAYVNLWLERDRGATSWAMRIPEVVVYGPEHLATHFPLNHY
SVLKPAEVRPPRGMCGSDMWRCRGWHGMPQVRCTPSNAHAALCRTGVPPRASTRGGEL
DPNTCWLRAAANVAQAARACGAYTSAGCPKCAYGRALSEARTHEDFAALSQRWSASHA
DASPDGTGDPLDPLMETVGCACSRVWVGSEHEAPPDHLLVSLHRAPNGPWGVVLEVRA
RPEGGNPTGHFVCAVGGGPRRVSDRPHLWLAVPLSRGGGTCAATDEGLAQAYYDDLEV
RRLGDDAMARAALASVQRPRKGPYNIRVWNMAAGAGKTTRILAAFTREDLYVCPTNAL
LHEIQAKLRARDIDIKNAATYERRLTKPLAAYRRIYIDEAFTLGGEYCAFVASQTTAE
VICVGDRDQCGPHYANNCRTPVPDRWPTERSRHTWRFPDCWAARLRAGLDYDIEGERT
GTFACNLWDGRQVDLHLAFSRETVRRLHEAGIRAYTVREAQGMSVGTACIHVGRDGTD
VALALTRDLAIVSLTRASDALYLHELEDGSLRAAGLSAFLDAGALAELKEVPAGIDRV
VAVEQAPPPLPPADGIPEAQDVPPFCPRTLEELVFGRAGHPHYADLNRVTEGEREVRY
MRISRHLLNKNHTEMPGTERVLSAVCAVRRYRAGEDGSTLRTAVARQHPRPFRQIPPP
RVTAGVAQEWRMTYLRERIDLTDVYTQMGVAARELTDRYARRYPEIFAGMCTAQSLSV
PAFLKATLKCVDAALGPRDTEDCHAAQGKAGLEIRAWAKEWVQVMSPHFRAIQKIIMR
ALRPQFLVAAGHTEPEVDAWWQAHYTTNAIEVDFTEFDMNQTLATRDVELEISAALLG
LPCAEDYRALRAGSYCTLRELGSTETGCERTSGEPATLLHNTTVAMCMAMRMVPKGVR
WAGIFQGDDMVIFLPEGARSAALKWTPAEVGLFGFHIPVKHVSTPTPSFCGHVGTAAG
LFHDVMHQAIKVLCRRFDPDVLEEQQVALLDRLRGVYAALPDTVAANAAYYDYSAERV
LAIVRELTAYAGARPRPPGHHRRARGDSDPLRARQSPRRRLTPLYVGPLILPTLTRSS
PTVVSPHLVGTQLLPFGRAPGCPNGFYYPHHHGGPPEGPRGTIPRPARGTRRRRLAVA
PAAAAATARLQHLRR"
mRNA 6428..9755
/note="subgenomic RNA"
mat_peptide 6505..7404
/note="capsid protein (C)"
CDS 6505..9696
/note="structural polyprotein precursor"
/codon_start=1
/protein_id="AAA88529.1"
/db_xref="PID:g333973"
/db_xref="GI:333973"
/translation="MASTTPITMEDLQKALEAQSRALRAELAAGASQSRRPRPPRQRD
SSTSGDDSGRDSGGPRRRRGNRGRGQRRDWSRAPPPPEERQETRSQTPAPKPSRAPPQ
QPQPPRMQTGRGGSAPRPELGPPTNPFQAAVARGLRPPLHDPDTEAPTEACVTSWLWS
EGEGAVFYRVDLHFTNLGTPPLDEDGRWDPALMYNPCGPEPPAHVVRAYNQPAGDVRG
VWGKGERTYAEQDFRVGGTRWHRLLRMPVRGLDGDSAPLPPHTTERIETRSARHPWRI
RFGAPQAFLAGLLLATVAVGTARAGLQPRADMAAPPTLPQPPCAHGQHYGHHHHQLPF
LGHDGHHGGTLRVGQHYRNASDVLPGHWLQGGWGCYNLSDWHQGTHVCHTKHMDFWCV
EHDRPPPATPTPLTTAANSTTAATPATAPAPCHAGLNDSCGGFLSGCGPMRLRHGADT
RCGRLICGLSTTAQYPPTRFGCAMRWGLPPWELVVLTARPEDGWTCRGVPAHPGARCP
ELVSPMGRATCSPASALWLATANALSLDHALAAFVLLVPWVLIFMVCRRACRRRGAAA
ALTAVVLQGYNPPAYGEEAFTYLCTAPGCATQAPVPVRLAGVRFESKIVDGGCFAPWD
LEATGACICEIPTDVSCEGLGAWVPAAPCARIWNGTQRACTFWAVNAYSSGGYAQLAS
YFNPGGSYYKQYHPTACEVEPAFGHSDAACWGFPTDTVMSVFALASYVQHPHKTVRVK
FHTETRTVWQLSVAGVSCNVTTEHPFCNTPHGQLEVQVPPDPGDLVEYIMNYTGNQQS
RWGLGSPNCHGPDWASPVCQRHSPDCSRLVGATPERPRLRLVDADDPLLRTAPGPGEV
WVTPVIGSQARKCGLHIRAGPYGHATVEMPEWIHAHTTSDPWHPPGPLGLKFKTVRPV
ALPRTLAPPRNVRVTGCYQCGTPALVEGLAPGGGNCHLTVNGEDLGAVPPGKFVTAAL
LNTPPPYQVSCGGESDRATARVIDPAAQSFTGVVYGTHTTAVSETRQTWAEWAAAHWW
QLTLGAICALPLAGLLACCAKCLYYLRGAIAPR"
mat_peptide 7405..8250
/note="glycoprotein E2"
mat_peptide 8251..9693
/note="glycoprotein E1"
BASE COUNT 1457 a 3781 c 3007 g 1510 t
ORIGIN
1 atggaagcta tcggacctcg cttaggactc ccattcccat ggagaaactc ctagatgagg
61 ttcttgcccc cggtgggcct tataacttaa ccgtcggcag ttgggtaaga gaccacgtcc
121 gatcaattgt cgagggcgcg tgggaagtgc gcgatgttgt taccgctgcc caaaagcggg
181 ccatcgtagc cgtgataccc agacctgtgt tcacgcagat gcaggtcagt gatcacccag
241 cactccacgc aatttcgcgg tatacccgcc gccattggat cgagtggggc cctaaagaag
301 ccctacacgt cctcatcgac ccaagcccgg gcctgctccg cgaggtcgct cgcgttgagc
361 gccgctgggt cgcactgtgc ctccacagga cggcacgcaa actcgccacc gccctggccg
421 agacggccag cgaggcgtgg cacgctgact acgtgtgcgc gctgcgtggc gcaccgagcg
481 gccccttcta cgtccaccct gaggacgtcc cgcacggcgg tcgcgccgtg gcggacagat
541 gcttgctcta ctacacaccc atgcagatgt gcgagctgat gcgtaccatt gacgccaccc
601 tgctcgtggc ggttgacttg tggccggtcg cccttgcggc ccacgtcggc gacgactggg
661 acgacctggg cattgcctgg catctcgacc atgacggcgg ttgccccgcc gattgccgcg
721 gagccggcgc tgggcccacg cccggctaca cccgcccctg caccacacgc atctaccaag
781 tcctgccgga caccgcccac cccgggcgcc tctaccggtg cgggccccgc ctgtggacgc
841 gcgattgcgc cgtggccgaa ctctcatggg aggttgccca acactgcggg caccaggcgc
901 gcgtgcgcgc cgtgcgatgc accctcccta tccgccacgt gcgcagcctc caacccagcg
961 cgcgggtccg actcccggac ctcgtccatc tcgccgaggt gggccggtgg cggtggttca
1021 gcctcccccg ccccgtgttc cagcgcatgc tgtcctactg caagaccctg agccccgacg
1081 cgtactacag cgagcgcgtg ttcaagttca agaacgccct gtgccacagc atcacgctcg
1141 cgggcaatgt gctgcaagag gggtggaagg gcacgtgcgc cgaggaagac gcgctgtgcg
1201 catacgtagc cttccgcgcg tggcagtcta acgccaggtt ggcggggatt atgaaaggcg
1261 cgaagtgcgc cgccgactct ttgagcgtgg ccggctggct ggacaccatt tgggacgcca
1321 ttaagcggtt cctcggtagc gtgcccctcg ccgagcgcat ggaggagtgg gaacaggacg
1381 ccgcggtcgc cgccttcgac cgcggccccc tcgaggacgg cgggcgccac ttggacaccg
1441 tgcaaccccc aaaatcgccg ccccgccctg agatcgccgc gacctggatc gtccacgcag
1501 ccagcgaaga ccgccattgc gcgtgcgctc cccgctgcga cgtcccgcgc gaacgtcctt
1561 ccgcgcccgc cggccagccg gatgacgagg cgctcatccc gccgtggctg ttcgccgagc
1621 gccgtgccct ccgctgccgc gagtgggatt tcgaggctct ccgcgcgcgc gccgatacgg
1681 cggccgcgcc cgccccgccg gctccacgcc ccgcgcggta ccccaccgtg ctctaccgcc
1741 accccgccca ccacggcccg tggctcaccc ttgacgagcc gggcgaggct gacgcggccc
1801 tggtcttatg cgacccactt ggccagccgc tccggggccc tgaacgccac ttcgccgccg
1861 gcgcgcatat gtgcgcgcag gcgcgggggc tccaggcttt tgtccgtgtc gtgcctccac
1921 ccgagcgccc ctgggccgac gggggcgcca gagcgtgggc gaagttcttc cgcggctgcg
1981 cctgggcgca gcgcttgctc ggcgagccag cagttatgca cctcccatac accgatggcg
2041 acgtgccaca gctgatcgca ctggctttgc gcacgctggc ccaacagggg gccgccttgg
2101 cactctcggt gcgtgacctg cccgggggtg cagcgttcga cgcaaacgcg gtcaccgccg
2161 ccgtgcgcgc tggcccccgc cagtccgcgg ccgcgtcacc gccacccggc gaccccccgc
2221 cgccgcgccg cgcacggcga tcgcaacggc actcggacgc tcgcggcact ccgccccccg
2281 cgcctgcgcg cgacccgccg ccgcccgccc ccagcccgcc cgcgccaccc cgcgctggtg
2341 acccggtccc tcccattccc gcggggccgg cggatcgcgc gcgtgacgcc gagctggagg
2401 tcgcctgcga gccgagcggc ccccccacgt caaccagggc agacccagac agcgacatcg
2461 ttgaaagtta cgcccgcgcc gccggacccg tgcacctccg agtccgcgac atcatggacc
2521 caccgcccgg ctgcaaggtc gtggtcaacg ccgccaacga ggggctactg gccggctctg
2581 gcgtgtgcgg tgccatcttt gccaacgcca cggcggccct cgctgcaaac tgccggcgcc
2641 tcgccccatg ccccaccggc gaggcagtgg cgacacccgg ccacggctgc gggtacaccc
2701 acatcatcca cgccgtcgcg ccgcggcgtc ctcgggaccc cgccgccctc gaggagggcg
2761 aagcgctgct cgagcgcgcc taccgcagca tcgtcgcgct agccgccgcg cgtcggtggg
2821 cgtgtgtcgc gtgccccctc ctcggcgctg gcgtctacgg ctggtctgct gcggagtccc
2881 tccgagccgc gctcgcggct acgcgcaccg agcccgtcga gcgcgtgagc ctgcacatct
2941 gccaccccga ccgcgccacg ctgacgcacg cctccgtgct cgtcggcgcg gggctcgctg
3001 ccaggcgcgt cagtcctcct ccgaccgagc ccctcgcatc ttgccccgcc ggtgacccgg
3061 gccgaccggc tcagcgcagc gcgtcgcccc cagcgacccc ccttggggat gccaccgcgc
3121 ccgagccccg cggatgccag gggtgcgaac tctgccggta cacgcgcgtc accaatgacc
3181 gcgcctatgt caacctgtgg ctcgagcgcg accgcggcgc caccagctgg gccatgcgca
3241 ttcccgaggt ggttgtctac gggccggagc acctcgccac gcattttcca ttaaaccact
3301 acagtgtgct caagcccgcg gaggtcaggc ccccgcgagg catgtgcggg agtgacatgt
3361 ggcgctgccg cggctggcat ggcatgccgc aggtgcggtg caccccctcc aacgctcacg
3421 ccgccctgtg ccgcacaggc gtgccccctc gggcgagcac gcgaggcggc gagctagacc
3481 caaacacctg ctggctccgc gccgccgcca acgttgcgca ggctgcgcgc gcctgcggcg
3541 cctacacgag tgccgggtgc cccaagtgcg cctacggccg cgccctgagc gaagcccgca
3601 ctcatgagga cttcgccgcg ctgagccagc ggtggagcgc gagccacgcc gatgcctccc
3661 ctgacggcac cggagatccc ctcgaccccc tgatggagac cgtgggatgc gcctgttcgc
3721 gcgtgtgggt cggctccgag catgaggccc cgcccgacca cctcctggtg tcccttcacc
3781 gtgccccaaa tggtccgtgg ggcgtagtgc tcgaggtgcg tgcgcgcccc gaggggggca
3841 accccaccgg ccacttcgtc tgcgcggtcg gcggcggccc acgccgcgtc tcggaccgcc
3901 cccacctctg gcttgcggtc cccctgtctc ggggcggtgg cacctgtgcc gcgaccgacg
3961 aggggctggc ccaggcgtac tacgacgacc tcgaggtgcg ccgcctcggg gatgacgcca
4021 tggcccgggc ggccctcgca tcagtccaac gccctcgcaa aggcccttac aatatcaggg
4081 tatggaacat ggccgcaggc gctggcaaga ctacccgcat cctcgctgcc ttcacgcgcg
4141 aagaccttta cgtctgcccc accaatgcgc tcctgcacga gatccaggcc aaactccgcg
4201 cgcgcgatat cgacatcaag aacgccgcca cctacgagcg ccggctgacg aaaccgctcg
4261 ccgcctaccg ccgcatctac atcgatgagg cgttcactct cggcggcgag tactgcgcgt
4321 tcgttgccag ccaaaccacc gcggaggtga tctgcgtcgg tgatcgggac cagtgcggcc
4381 cacactacgc caataactgc cgcacccccg tccctgaccg ctggcctacc gagcgctcgc
4441 gccacacttg gcgcttcccc gactgctggg cggcccgcct gcgcgcgggg ctcgattatg
4501 acatcgaggg cgagcgcacc ggcaccttcg cctgcaacct ttgggacggc cgccaggtcg
4561 accttcacct cgccttctcg cgcgaaaccg tgcgccgcct tcacgaggct ggcatacgcg
4621 catacaccgt gcgcgaggcc cagggtatga gcgtcggcac cgcctgcatc catgtaggca
4681 gagacggcac ggacgttgcc ctggcgctga cacgcgacct cgccatcgtc agcctgaccc
4741 gggcctccga cgcactctac ctccacgagc tcgaggacgg ctcactgcgc gctgcggggc
4801 tcagcgcgtt cctcgacgcc ggggcactgg cggagctcaa ggaggttccc gctggcattg
4861 accgcgttgt cgccgtcgag caggcaccac caccgttgcc gcccgccgac ggcatccccg
4921 aggcccaaga cgtgccgccc ttctgccccc gcactctgga ggagctcgtc ttcggccgtg
4981 ccggccaccc ccattacgcg gacctcaacc gcgtgactga gggcgaacga gaagtgcggt
5041 acatgcgcat ctcgcgtcac ctgctcaaca agaatcacac cgagatgccc ggaacggaac
5101 gcgttctcag tgccgtttgc gccgtgcggc gctaccgcgc gggcgaggat gggtcgaccc
5161 tccgcactgc tgtggcccgc cagcacccgc gcccttttcg ccagatccca cccccgcgcg
5221 tcactgctgg ggtcgcccag gagtggcgca tgacgtactt gcgggaacgg atcgacctca
5281 ctgatgtcta cacgcagatg ggcgtggccg cgcgggagct caccgaccgc tacgcgcgcc
5341 gctatcctga gatcttcgcc ggcatgtgta ccgcccagag cctgagcgtc cccgccttcc
5401 tcaaagccac cttgaagtgc gtagacgccg ccctcggccc cagggacacc gaggactgcc
5461 acgccgctca ggggaaagcc ggccttgaga tccgggcgtg ggccaaggag tgggttcagg
5521 ttatgtcccc gcatttccgc gcgatccaga agatcatcat gcgcgccttg cgcccgcaat
5581 tccttgtggc cgctggccat acggagcccg aggtcgatgc gtggtggcag gcccattaca
5641 ccaccaacgc catcgaggtc gacttcactg agttcgacat gaaccagacc ctcgctactc
5701 gggacgtcga gctcgagatt agcgccgctc tcttgggcct cccttgcgcc gaagactacc
5761 gcgcgctccg cgccggcagc tactgcaccc tgcgcgaact gggctccact gagaccggct
5821 gcgagcgcac aagcggcgag cccgccacgc tgctgcacaa caccaccgtg gccatgtgca
5881 tggccatgcg catggtcccc aaaggcgtgc gctgggccgg gattttccag ggtgacgata
5941 tggtcatctt cctccccgag ggcgcgcgca gcgcggcact caagtggacc cccgccgagg
6001 tgggcttgtt tggcttccac atcccggtga agcacgtgag cacccctacc cccagcttct
6061 gcgggcacgt cggcaccgcg gccggcctct tccatgatgt catgcaccag gcgatcaagg
6121 tgctttgccg ccgtttcgac ccagacgtgc ttgaagaaca gcaggtggcc ctcctcgacc
6181 gcctccgggg ggtctacgcg gctctgcctg acaccgttgc cgccaatgct gcgtactacg
6241 actacagcgc ggagcgcgtc ctcgctatcg tgcgcgaact taccgcgtac gcgggggcgc
6301 ggcctcgacc acccggccac catcggcgcg ctcgaggaga ttcagacccc ctacgcgcgc
6361 gccaatctcc acgacgccga ctaacgcccc tgtacgtggg gcctttaatc ttacctactc
6421 taaccaggtc atcacccacc gttgtttcgc cgcatctggt gggtacccaa cttttgccat
6481 tcgggagagc cccagggtgc ccgaatggct tctactaccc ccatcaccat ggaggacctc
6541 cagaaggccc tcgaggcaca atcccgcgcc ctgcgcgcgg aactcgccgc cggcgcctcg
6601 cagtcgcgcc ggccgcggcc gccgcgacag cgcgactcca gcacctccgg agatgactcc
6661 ggccgtgact ccggagggcc ccgccgccgc cgcggcaacc ggggccgtgg ccagcgcagg
6721 gactggtcca gggccccgcc ccccccggag gagcggcaag aaactcgctc ccagactccg
6781 gccccgaagc catcgcgggc gccgccacaa cagcctcaac ccccgcgcat gcaaaccggg
6841 cgtgggggct ctgccccgcg ccccgagctg gggccaccga ccaacccgtt ccaagcagcc
6901 gtggcgcgtg gcctgcgccc gcctctccac gaccctgaca ccgaggcacc caccgaggcc
6961 tgcgtgacct cgtggctttg gagcgagggc gaaggcgcgg tcttttaccg cgtcgacctg
7021 catttcacca acctgggcac ccccccactc gacgaggacg gccgctggga ccctgcgctc
7081 atgtacaacc cttgcgggcc cgagccgccc gctcacgtcg tccgcgcgta caatcaacct
7141 gccggcgacg tcaggggcgt ttggggtaaa ggcgagcgca cctacgccga gcaggacttc
7201 cgcgtcggcg gcacgcgctg gcaccgactg ctgcgcatgc cagtgcgcgg cctcgacggc
7261 gacagcgccc cgcttccccc ccacaccacc gagcgcattg agacccgctc ggcgcgccat
7321 ccttggcgca tccgcttcgg tgccccccag gccttccttg ccgggctctt gctcgccacg
7381 gtcgccgttg gcaccgcgcg cgccgggctc cagccccgcg ctgatatggc ggcacctcct
7441 acgctgccgc agcccccctg tgcgcacggg cagcattacg gccaccacca ccatcagctg
7501 ccgttcctcg ggcacgacgg ccatcatggc ggcaccttgc gcgtcggcca gcattaccga
7561 aacgccagcg acgtgctgcc cggccactgg ctccaaggcg gctggggttg ctacaacctg
7621 agcgactggc accagggcac tcatgtctgt cataccaagc acatggactt ctggtgtgtg
7681 gagcacgacc gaccgccgcc cgcgaccccg acgcctctca ccaccgcggc gaactccacg
7741 accgccgcca cccccgccac tgcgccggcc ccctgccacg ccggcctcaa tgacagctgc
7801 ggcggcttct tgtctgggtg cgggccgatg cgcctgcgcc acggcgctga cacccggtgc
7861 ggtcggttga tctgcgggct gtccaccacc gcccagtacc cgcctacccg gtttggctgc
7921 gctatgcggt ggggccttcc cccctgggaa ctggtcgtcc ttaccgcccg ccccgaagac
7981 ggctggactt gccgcggcgt gcccgcccat ccaggcgccc gctgccccga actggtgagc
8041 cccatgggac gcgcgacttg ctccccagcc tcggccctct ggctcgccac agcgaacgcg
8101 ctgtctcttg atcacgccct cgcggccttc gtcctgctgg tcccgtgggt cctgatattt
8161 atggtgtgcc gccgcgcctg tcgccgccgc ggcgccgccg ccgccctcac cgcggtcgtc
8221 ctgcaggggt acaacccccc cgcctatggc gaggaggctt tcacctacct ctgcactgca
8281 ccggggtgcg ccactcaagc acctgtcccc gtgcgcctcg ctggcgtccg ttttgagtcc
8341 aagattgtgg acggcggctg ctttgcccca tgggacctcg aggccactgg agcctgcatt
8401 tgcgagatcc ccactgatgt ctcgtgcgag ggcttggggg cctgggtacc cgcagcccct
8461 tgcgcgcgca tctggaatgg cacacagcgc gcgtgcacct tctgggctgt caacgcctac
8521 tcctctggcg ggtacgcgca gctggcctct tacttcaacc ctggcggcag ctactacaag
8581 cagtaccacc ctaccgcgtg cgaggttgaa cctgccttcg gacacagcga cgcggcctgc
8641 tggggcttcc ccaccgacac cgtgatgagc gtgttcgccc ttgctagcta cgtccagcac
8701 cctcacaaga ccgtccgggt caagttccat acagagacca ggaccgtctg gcaactctcc
8761 gttgccggcg tgtcgtgcaa cgtcaccact gaacacccgt tctgcaacac gccgcacgga
8821 caactcgagg tccaggtccc gcccgacccc ggggacctgg ttgagtacat tatgaattac
8881 accggcaatc agcagtcccg gtggggcctc gggagcccga attgccacgg ccccgattgg
8941 gcctccccgg tttgccaacg ccattcccct gactgctcgc ggcttgtggg ggccacgcca
9001 gagcgccccc ggctgcgcct ggtcgacgcc gacgaccccc tgctgcgcac tgcccctgga
9061 cccggcgagg tgtgggtcac gcctgtcata ggctctcagg cgcgcaagtg cggactccac
9121 atacgcgctg gaccgtacgg ccatgctacc gtcgaaatgc ccgagtggat ccacgcccac
9181 accaccagcg acccctggca tccaccgggc cccttggggc tgaagttcaa gacagttcgc
9241 ccggtggccc tgccacgcac gttagcgcca ccccgcaatg tgcgtgtgac cgggtgctac
9301 cagtgcggta cccccgcgct ggtggaaggc cttgcccccg ggggaggcaa ttgccatctc
9361 accgtcaatg gcgaggacct cggcgccgtc ccccctggga agttcgtcac cgccgccctc
9421 ctcaacaccc ccccgcccta ccaagtcagc tgcgggggcg agagcgatcg cgcgaccgcg
9481 cgggtcatcg accccgccgc gcaatcgttt accggcgtgg tgtatggcac acacaccact
9541 gctgtgtcgg agacccggca gacctgggcg gagtgggctg ctgcccattg gtggcagctc
9601 actctgggcg ccatttgcgc cctcccactc gctggcttac tcgcttgctg tgccaaatgc
9661 ttgtactact tgcgcggcgc tatagcgcct cgctagtggg cccccgcgcg aaacccgcac
9721 taggccacta gatccccgca cctgttgctg tatag
//
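
The coordinates in the FEATURES table and the totals on the BASE COUNT line can be cross-checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of the GenBank record: the dictionaries are copied by hand from the record above, and the helper names (`span_length`, `codon_count`, `gc_fraction`) are hypothetical names chosen for illustration. It assumes the usual GenBank convention that feature locations are 1-based and inclusive.

```python
# Feature spans copied by hand from the FEATURES table above
# (assumed 1-based, inclusive, as is usual for GenBank locations).
FEATURES = {
    "nonstructural polyprotein CDS": (39, 6656),
    "subgenomic RNA": (6428, 9755),
    "capsid protein (C)": (6505, 7404),
    "structural polyprotein CDS": (6505, 9696),
    "glycoprotein E2": (7405, 8250),
    "glycoprotein E1": (8251, 9693),
}

# Base counts copied by hand from the BASE COUNT line above.
BASE_COUNT = {"a": 1457, "c": 3781, "g": 3007, "t": 1510}


def span_length(start, end):
    """Length in nucleotides of a 1-based, inclusive span."""
    return end - start + 1


def codon_count(start, end):
    """Number of whole codons in a span; raises if it is not a multiple of 3."""
    length = span_length(start, end)
    if length % 3 != 0:
        raise ValueError(f"span {start}..{end} is not a whole number of codons")
    return length // 3


def gc_fraction(counts):
    """Fraction of bases that are G or C, given a dict of per-base counts."""
    total = sum(counts.values())
    return (counts["g"] + counts["c"]) / total


if __name__ == "__main__":
    # Nucleotide length of every listed feature.
    for name, (start, end) in FEATURES.items():
        print(f"{name}: {span_length(start, end)} nt")

    # The base counts should add up to the sequence length on the LOCUS line.
    print("total bases:", sum(BASE_COUNT.values()))
    print(f"G+C content: {gc_fraction(BASE_COUNT):.1%}")

    # The three mature peptides should tile the structural polyprotein,
    # leaving one codon (the stop) at the end of the CDS.
    mature = sum(
        codon_count(*FEATURES[name])
        for name in ("capsid protein (C)", "glycoprotein E2", "glycoprotein E1")
    )
    print("mature peptide codons:", mature)
    print("structural CDS codons:", codon_count(*FEATURES["structural polyprotein CDS"]))
```

With the values above, the base counts sum to 9755, matching the LOCUS line, and the capsid, E2 and E1 spans (300, 282 and 481 codons) account for all but the last codon of the 1064-codon structural CDS, which is consistent with the CDS location including its stop codon.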

External Links

NCBI Rubella virus, complete genome (http://www.ncbi.nlm.nih.gov/nuccore/NC_001545?report=graph&from=1&to=9762)


Cite this page: Hill, M.A. (2024, April 25) Embryology Molecular Development - Rubella Genome. Retrieved from https://embryology.med.unsw.edu.au/embryology/index.php/Molecular_Development_-_Rubella_Genome

© Dr Mark Hill 2024, UNSW Embryology ISBN: 978 0 7334 2609 4 - UNSW CRICOS Provider Code No. 00098G