A. Promoter and transcription start

Physcomitrella patens  TGTGTAATAATAAGTTTTAATACCTGCCCCAGAG-AACTTCTCTATCATAGTCGTTCCGGTTCTAAAC---TGATCGAAA------------------------AGGAGT--
Marchantia polymorpha  TGTGTCATAATAAATTTGAAGACCTGCCT-AGGT-AATTTCTGTATAATAGTTGTTCCAGTTTTAAAC---TGA--GAAA------------------------GAGATCTA
Psilotum nudum         TGTGTCATAATAAATTTGAATACCTGCCCTAGGT-AACTTCTGTATTATAGTTGCTCCAGTTTTAAAC---TGAA-AAAA------------------------AAGATC-A
Sorghum bicolor        TATGTCATAATAGATCCGAACACTTGCCTTGGATTGACTTCAATAT-ATAATTGCTCCAGTGAATAAC---TAAAAAAAA---ATAGAAGGACGGTAGATAGTAAAGAAA--
Oryza sativa           TATGTCATAATAGATCCGAACACTTGCCTCAGATTGACTTCAATAT-ATAATTGCTCCAGTGAATAAC---TAAAAAAAA---ATAGAAGGACGGTAGATAGTAAAGAAA--
Populus trichocarpa    TATGTCATAATAGATCCGAACACTTGCCCTGGATCGACTTCCAGATCATAATTGCTCTAGTGAATAAC---TAACGAAAC---------------TAGATA-----GAAA--
Arabidopsis thaliana   TATGTCATAATAGATCCGAACACTTGCCCCAGATCGACTTCCAGATCATAATTGCTCTAGTGAATAAC---TAAAGAAAATAGATGAATAGATGGAAGATAGAAGAGAAAGA
Selaginella m.         CATGCCATACTGAGATCGAGTACCCGCCCGAG-----CTCCCGCATCATAGCTGCTCCGGTCTCGGACGAATGAGTGGAA-----------------------GGTGTAGTT
Identity marks (original column spacing not preserved): ** *** * * ** *** * * * ** *** * ** ** ** ** * *

A (continued). Shine-Dalgarno sequence and start codon

Physcomitrella patens  --AGTC--TT----------TAAAG---ATTTCCTAGAAGTTTCCTATTAA---------TTGCTGGCAGGT-TGTTGCTATTCCTCGTCTCGAGAGAGGAGAATCTCAATG
Marchantia polymorpha  AAAGTT--ATA---------TATGG---ATCTCTCAGAAGTTTACTAATTA---------TTGTTGGTAGGT-TTTTCCTATGCCTCGTCTGAAGAGAGGAGAACCTCGATG
Psilotum nudum         AAAGTTGGATAATAAAATAATAAGT---ATCTTTTAGAAGTTTACTATTCA---------TTATTGGTGGGT-TTTTCCTATGCCTCGTCTGAAGAGAGGAGAACCTCGATG
Sorghum bicolor        AGGACTAATCA---------TAATATCTATCTTTAAAAGATTCAATAAAAA-------GACAGTTGGCGGGTCTCTTTGTATGTCTTGTCCGGAAAGAGGAGGA-CTTAATG
Oryza sativa           AGGACTAATCA---------TAATATCTATCTTTCAAAGGTTCACTAAAAA-------G----TTGGCGGGTCTCTTTGTATGTCTTGTCCGGAAAGAGGAGGA-CTTAATG
Populus trichocarpa    AAAACTAATTA---------GAAAA---ATCTCTCATGGGTTCACTAATTACTT----GACTGTTGGCGGGTCTCTTTGTATGTGTTGTCCGGAAAGAGGAGGA-CTCAATG
Arabidopsis thaliana   AAAAAAAATTC---------TAAG---TATCTATCATCGGTTCACTAATTACTTGAGTGACTGTTGGCGGGTTTCTTTGTATGTGTTGTCCGGAAAGAGGAGGA-CTCAATG
Selaginella m.         GGGGGAAATTC---------CGAAG-----CTCACGGGAGTTTACTAATCA---------CTATCGGCGGGT-TTCTTCTGTGCTTCATCCGGAAAGGGGAGAACCCCGACG
Identity marks (original column spacing not preserved): * ** ** * ** *** * * * * * ** * ** **** * * * *

B. Promoter and transcription start

Arabidopsis thaliana   TATGTCA----------TAATAGATCCGAACACTTGCCCCAGATCGACTTCCAGATCATAATTG----CTCTAG-----TGAATAACTAAAGAAAATAGATGAATAGATGGAAGAT
Populus trichocarpa    TATGTCA----------TAATAGATCCGAACACTTGCCCTGGATCGACTTCCAGATCATAATTG----CTCTAG-----TGAATAACTAACGAAACTAGAT---------------
Physcomitrella patens  TGTGTCA----------TAATAAATTTGAATACCTGCCCTAGGTA-ACTTCTGTATTATAGTTG----CTCCAG-----TTTTAAACTGAA-AAAAAAGATCAAAAG---------
Chlorella vulgaris     AATATC-----------TTTAAAATAAGTAAAAATAATTTGTAAACCAATAAAAAATATATTTA----TGGTAT-----AATATAACATATGATGTAAAAAAAACTA---------
Thalassiosira p.       AAATTCGCTAATTATTGTTTATTAACAGTGTAATTTAATTGTAAGACAAAAAATTGTATAAATT----TTCTTTTTGAGAAAATGAAAAATCGTGTATAATATATAA---------
Euglena gracilis       ------------------AAAACTTAAGTTCTAATTACATTTATAATTTATTTAACTTTAAGTG----AGTTGG-------TAGCTCAGTTGGTAGAGCACTCGG-----------
Chara vulgaris         TTTG----------AAATCAAAAAAGAGTACAAAAT-------------TTTTACAAATGG--------TCCATTAAGCGCTCTATATATATG--CCATACTACAGG---------
Coccomyxa spec.        GGAGGGGC-----CCACGCCCAAGGGCGTTCCGTCGGACGGGAAGCCCCTTTCACTAAACGGGGGGGAGGCCTTGTCCCGTGAGACATGGTAGGCCCAGACGGCGAG---------

B (continued). Shine-Dalgarno sequence and start codon

Arabidopsis thaliana   AGAAGAGAAAGAAAAAAAAATTCTAAGTATCTATCATCGGTTCACTAATTACTTGAGTGACTGTTGGCGGGTTTCTTTGTATGTGTTGTCCGG-AAAGAGGAGGAC---TCAATG
Populus trichocarpa    -------AGAAAAAAACTAATTAGAAAAATCTCTCATGGGTTCACTAATTACTT----GACTGTTGGCGGGTCTCTTTGTATGTGTTGTCCGG-AAAGAGGAGGAC---TCAATG
Physcomitrella patens  -------TTGGATAATAAAATAATAAGTATCTTTTAGAAGTTTACTATT--CAT-------TATTGGTGGGTTT-TTCCTATGCCTCGTCTGA-AGAGAGGAGAACC--TCGATG
Chlorella vulgaris     -------TTGATTTTCTTAACCG-CTATTTTTTTTAAAAAT--ACTGTGGTAAG----ATTTTAAAAAAGATAATTTCGGT----AATTCAAA-TAAGGAGAATTTTTCGCCATG
Thalassiosira p.       -------TGAAAACTCGTATTTAACAAATAGCTAAAAAAATTAACTATTTTTTA----TTTTTTATGAGAGTTTCATAGATTTCCGTCTCCCA-AAAGGAGAAAAT----CAATG
Euglena gracilis       ------CTTTTAACCGATCGGTCCTGGGTTCGAATCCCAGCCAACTCATTTTTA-----ATTTTGTGTAGGTGT-----------ATGATAAT-TTAATCGAAAAT---TATATG
Chara vulgaris         -----TATGAAAGTCTGCTTGAATT---AATTATAGTTGTTCTAGTGATACATT----CAAAAGGAAAAAGATAGTTGTTGATTATTGACATATTTTTATTTTATATTTTGTATG
Coccomyxa spec.        -----AGGGAGAGTCCCACAGGCCTGGCAGCATCTGTTGTTCTGCTGAAGGGTT----TGAGATAGGAGGGCTGAGTCTCTTTTATAGTATTTGTTTTTAGGAGAAATTCG-ATG
Identity marks (original column spacing not preserved): * * ***
Supplemental Figure 2. Conservation of the psaA 5′ UTR in Embryophyta. (A) Sequence alignment of the psaA 5′ UTR of dicots, monocots, the fern-like vascular plant Psilotum nudum, the spikemoss Selaginella moellendorffii, and mosses. The psaA promoter and conserved regions of the psaA 5′ UTR in Embryophyta are labeled in blue and yellow, respectively. The start codon is labeled in green. Stars indicate identical nucleotides. The bars under the alignment represent the probes psaA-1-1 (blue), psaA-1-2 (red), and psaA-1-3 (green) used in the EMSA experiments shown in Figure 7C. Accession numbers are given in the Methods section. (B) Sequence alignment of the psaA 5′ UTR of vascular plants, Physcomitrella, the diatom Thalassiosira pseudonana, Euglena gracilis, and diverse green algae.
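The star row described in the caption can be recomputed from any block of aligned rows. The following is a minimal sketch, not the authors' procedure: the function name `identity_line` is hypothetical, and it assumes that a star marks a column in which every row carries the same nucleotide and that gap columns are never starred. The input is the first ten alignment columns of panel A.

```python
def identity_line(rows):
    """Return '*' under each alignment column whose residues are identical
    in all rows, and ' ' elsewhere. Gap-only columns are left unmarked."""
    if len({len(row) for row in rows}) != 1:
        raise ValueError("aligned rows must all have the same length")
    marks = []
    for column in zip(*rows):
        first = column[0]
        identical = first != "-" and all(base == first for base in column)
        marks.append("*" if identical else " ")
    return "".join(marks)


# First ten alignment columns of panel A, copied from the figure.
rows = [
    "TGTGTAATAA",  # Physcomitrella patens
    "TGTGTCATAA",  # Marchantia polymorpha
    "TGTGTCATAA",  # Psilotum nudum
    "TATGTCATAA",  # Sorghum bicolor
    "TATGTCATAA",  # Oryza sativa
    "TATGTCATAA",  # Populus trichocarpa
    "TATGTCATAA",  # Arabidopsis thaliana
    "CATGCCATAC",  # Selaginella moellendorffii
]

for row in rows:
    print(row)
print(identity_line(rows))
```

For these ten columns the sketch stars columns 3-4 and 7-9, which agrees with the first two groups of stars in the figure.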
A. HCF145 SRPBCC1 (134 aa)

At  VWNVLTDYERLADFIPNLVWSGRIPCPHP-----GRIWLEQRGLQRALYWHIEARVVLDLHECLDSPNGRELHFSMVDGDFKKFEGKWSVKS-GIRSVGTVLSYEVNVIPRF---NFPAIFLERIIRSDLPVNLRAVARQAEK
Gm  VWNALTDYEHLADFIPNLVWSGRIPCPYP-----GRIWLEQRGFQRAMYWHIEARVVLDLQEVVNSAWDRELHFSMVDGDFKKFDGKWSVKS-GTRSSTAILSYEVNVIPRF---NFPAIFLERIIRSDLPVNLRALAYRAER
Pt  VWNSLTDYERLADFIPNLVCSGRIPCPHP-----GRVWLEQRGLQRALYWHIEARVVLDLQEFPHSANNRELHFSMVDGDFKKFEGKWSLRS-GTRHGTTTLSYEVNVMPRY---NFPAIFLERIIGSDLPVNLRALACRAER
Bd  VWRIITDYERLAEFVPNLVHSGRIPCPHE-----GRIWLEQRGLQQALYWHIEARVVLDLREVPDAVNGRELHFSMVDGDFKKFEGKWSVRS-GPRSASAILLYEVNVIPRF---NFPSIFLERIIRSDLPVNLRALAFRSEK
Os  VWRVITDYERLAEFIPNLVHSGRIPCPHQ-----GRVWLEQRGLQQALYWHIEARVVLDLKEVPDAVNGRELHFSMVDGDFKKFEGKWSIRS-GPRSSSAILLYEVNVIPRF---NFPAIFLERIIRSDLPVNLGALACRAEN
Sb  LWQVITDYERLADFIPNLVQSGTIPCPHE-----GRIWLEQRGLQQALYWHIEARVVLDLQEIHDSINGRELHFSMVDGDFKKFEGKWSIRS-GPRSSSAVLLYEVNVIPRF---NFPAIFLEKIIRSDLPVNLGALACRAEK
Sm  VWEVLTDYERLAEFIPNLIHSARIPCPYP-----GRIWLLQRGLHTAMYWHIEATVVLDLEEFPHLTDGRSLQFCMVDGDFKKYAGRWLLQA-GTRPGTTDLHYEVNVIPRL---LLPGVFVEGIIKSDLPVNLRAIAERAEK
Pp  VWEVLTDYGRLAEFIPNLTRSEQIPCPHP-----GRTWLLQEGKQSAMYWQIEARVVLDLEEFLDAKDGRELRFSMVDGDFKRYVGRWYLRP-DVRPGTIILHYEVNVTPRL---LFPAAFVEKIIKSDLPTNLRAIAARAED
Conservation marks (original column spacing not preserved): :*. :*** :**:*:*** * ****: ** ** *.* : *:**:*** *****.*..*:*.*******:: *:* ::. * * ***** ** :*. *:* ******* ** *:* ::*
Cs  VWAVLTDYDRLVEFVPNLEVCEKLPGGSA-----TRYRLRQQGCSQSLYLRLEASAVLDVQEVKGPLGRRELRFAMVESPNLKFSGQWTVEPDPTVRDGRSLGTTKLRYEIS---VAPKWSIPSTLVSKVVKSGLPANICAIA
Cv  VWRVLTNYERLADFVPNLESCERLPSPRT-----GRVWIRQRGCSQGVLWRLEAEAVIAVEEVRLPLGRREARFNMVDGDFKEMSGRWVVEPDP---SSAVGMATLLRFDIT---VQPKISLPSSVVSYVVRAGLPANIQAVS

A (continued). HCF145 SRPBCC2 (129-130 aa)

At  VWKVLTSYESLPEIVPNLAISKILSRDNN----KVRI--LQEGCKGLLYMVLHARAVLDLHEIRE----QEIRFEQVEGDFDSLEGKWIFEQLGSHHTLLKYTVESKMRKDS---FLSEAIMEEVIYEDLPSNLCAIRDYIEK
Gm  VWNILTAYETLPKIVPNLAISKVVSRDNN----KVRI--LQEGCKGLLYMVLHARVVLDLCEYLE----QEISFEQVEGDFDSFRGKWIFEQLGNHHTLLKYSVESKMRKDT---FLSEAIMEEVIYEDLPSNLSAIRDYIEN
Pt  VWNVLTAYESLPEFVPNLAISKILSRENN----KVRI--LQEGCKGLLYMVLHARVVLDLCEHLE----QEISFEQVEGDFDSFQGKWILEQLGSHHTLLKYNVESKTHRDT---FLSEAIMEEVIYEDLPSNLCAIRDYIEK
Bd  VWNVLTAYE-LPEIIPNLAISRILLRDNN-----VRI--LQEGCKGLLYMVLHARVVM-LREKLE----REISFEQVEGDFFSFKGKWRLEQLGDQHTLLKYMVETKMHKDT---FLSESILEEVIYEDLPSNLCAIRDYVEK
Os  VWNILTAYEKLPEFVPNLAISRIIRRDNN----KVRI--LQEGCKGLLYMVLHARVVMDLREKLE----REISFEQVEGDFYSFKGKWRLEQLGDQHTLLKYMVETKMHKDT---FLSESILEEVIYEDLPSNLCAIRDYIEK
Sb  VWNVLTAYENLPEFVPNLAISRIVLRDNN----KVRI--MQEGCKGLLYMVLHARVVMDLREKFE----QEIRFEQVEGDFYSFKGKWRLEQLGDQHTLLKYMVETKMHRDT---FLSESILEEVIYEDLPSNLCAIRDYIEK
Sm  VWNVLTSYETLSEFVPNLSSSKIVSRHGN----HARV--LQEGCKCLLYMVLHARVVLELQELPP----NEITFQQVEGDFDVFSGKWTLESLGAEHTLLRYSVDMKMHNDF---LLPREIIEEIVYEDLPENLCAIRARVEL
Pp  VWAVLTAYESLQEFIPNLAICKVLTREKN----KVRL--LQEGCKCLLYMVLHARVILDLWERPQ----YEILFQQVEGDFDSFQGKWTLEPLGAQHTLLKYLVDTKMHKDS---LLAEALVEEVIYEDLPANLCAIRDRVEL
Conservation marks (original column spacing not preserved): **::**:** * :::***:.::: *. * :.*: :***** *********.:: * ** *:*.**** : *** :* **.****:* *: * :.* :*.. ::**::***** *:.*** :*
Cs  VWDVLTDYEALPEFVPNLAVCERLPVPAGMESRLTRL--RQVGFKDMVFMQLHAEAVLDLHERPH----REIQFRAVAGDFGVLQGKFMLSEPERKETHLKYAVEVKIPRSTPMMGLLEPILERMVYEDIPFNLAALKQR
Cv  VWDVLTDYNRLAEFIPNLAVSQRIALPSNAPANIIRI--RQVGYKRMLYMCLHAESVLDLIEKPQ----GEIQFRQVAGDFERFQGKWMLQGLPLSGNSSSTTSDAEPSASQTQLKYAVEIVIPRSTRMLGVLEPLLERT

B. RNA binding motif TMR1 (68 aa)

At  EVLKSEILKFISEHGQE-GFMPMRKQLRLHGRVDIEKAITRMGGFRRIALMMNLSLAYKHRKPKGYWDN
Gm  KVLESELLKFIAEHGQE-GFMPMRKQLRLHGRVDIEKAITRMGGFRKIATILNLSLAYKHRKPKGYWDN
Pt  DVLKSELLKFISEHGQE-GFMPMRKQLRLHGRVDIEKAITRMGGFRRIATLMNLSLAYKHRKPKGYWDN
Bd  EVLKSELGSFISKYGQN-GFMPKRKHLRTHGRVDIEKAITRMGGFRKIASIMNLSLSYKNRKPRGYWDN
Os  EVLKSELEKFIAKYGQD-GFMPKRKHLRLHGRVDIEKAITRMGGFRKIASIMNLSLSYKNRKPRGYWDN
Sb  EVLKSELENFIAEYGQY-GFMPKRKHLRSHGRVDIEKAITRMGGFRKIASIMNLSLSYKNRKPRGYWDN
Sm  KVLERELEGFVAKAGKE-RVMPVRAELRKNGRVDLEKAIRRFGGFRSIAERLNMSLAYKRRKPRGFWQN
Pp  NILQQELLKFIAEKGTK-GVMPLRCELREAGRVDLEKAITRNGGFGPVASKLNLSLAYKERKPRGYWDN
Conservation marks (original column spacing not preserved): .:*: *: *::: *.** *.** ****:**** * *** :* :*:**:**.***:*:*:*
Cs  EDFGLLAAELERCFGGT-GTMPTRAQLRAITRTDLEKAMVAHGGPAAVAKRMGWKLAYKAKAPRGYWDK
Cv  VGRKCRLPYLAFSPVLE-ALCPQKMTLYGAGRYDIARAVERWGGLYELAGELGYAVTGSRKPGFSEWQE
    HHHHHHHHHHHHHH---------HHHHHHH-HHHHHHHHHH---HHHHHHH------------------

B (continued). RNA binding motif TMR2 (67-69 aa)

At  ENLQEEIGRFQQSWGMDPSFMPSRKSFERAGRYDIARALEKWGGLHEVSRLLALNVRHPNRQLNSRKDN
Gm  ENLQEEISRFQRGWGIDPSFMPSRKSFERAGRYDIARALEKWGGLHEVSRLLSLKVRQRSRQDNLAKDK
Pt  ENLQEEISRFQRSWGMDLSFMPSRKSFERAGRYDIARALEKWGGLHEVSRLLALKVRHPNRQANSIKDR
Bd  ENLQEEI-RFQ-NWGIDPSYMPSRKSFERAGRYDIARALEKWGGIQEVSRLLSLEPRRPRRQADSDSEK
Os  ENLQEEIRRFQKNWGMDPAYMPSRKSFERAGRYDIARALEKWGGVHEVSRLLSLELRRPRRRANSDDES
Sb  ENLQEEISRFQKSWGMDPSYMPSRKSFERAGRYDIARALEKWGGVQEVSRLLSLKLRRPRRQGDLDDES
Sm  ENLKREIQLFQKKLRSDPSRMPSRRTLERAGRY-IARALEKWGGLHEVAKVLNLQTKRKRCLEEPEEGW
Pp  QNVHKEILLFQKEHGNDRTTMPTRQSLERAGRYDLARSLEKWGGLREVARVLGLQVKKRQKSRTAKTDV
Conservation marks (original column spacing not preserved): :*::.** ** * : **:*:::****** :**:******::**:::* *: ::
Cs  KNVEREIAEFCEQEGLPPRIMPLKMDFVRANRYDLAHVVERWGGLSELAELLEYQACTCRSVSPHVCKE
Cv  HISELAASTGLSGREGLFELASKTYAARRSMQGSVDGGEDMALADILTADAVGARAASANSKASAENGA
    HHHHHHHHHHHHHH---------HHHHHHH-HHHHHHHHHH---HHHHHHH------------------

C. Abbreviations and accession numbers

Vascular plants:
At  Arabidopsis thaliana         At5g08720
Gm  Glycine max                  XM_003528024
Pt  Populus trichocarpa          XM_002307027
Bd  Brachypodium distachyon      XM_003559134
Os  Oryza sativa                 Os03g0837900
Sb  Sorghum bicolor              XM_002466088
Sm  Selaginella moellendorffii   XM_002987569
Pp  Physcomitrella patens        XM_001783694

Green algae:
Cs  Coccomyxa subellipsoidea     XM_005645014
Cv  Chlorella variabilis         XM_005849478
Supplemental Figure 4. Conservation of the Tandemly Repeated SRPBCC and TMR Motifs in HCF145. (A) Alignment of the tandemly repeated HCF145 motifs SRPBCC1 and SRPBCC2. The number of amino acids (aa) is indicated. Acidic and basic amino acids are labeled in red and blue, respectively. Amino acids that are highly conserved in both motifs are boxed in yellow. (B) Alignment of the tandemly repeated RNA binding TMR motifs 1 and 2. The number of amino acids (aa) is indicated. Amino acids that are highly conserved in both TMR motifs are boxed in yellow. The helices predicted by Jpred [http://www.compbio.dundee.ac.uk/jpred/] and NetSurfP [http://www.cbs.dtu.dk/services/netsurfp] are shown below the alignment. For a schematic presentation, see Figure 3. (C) Abbreviations of organisms and accession numbers used in the alignments.
A

Np    EKIWKVLTDYEALPDFLPNLAKSRLIEHPNGG-IRLEQVGSQ-RLLNFNFSARVVLDLEECFP------------REINFR-MVEGDFKGFSGSWCLEPYSLG-----------EYIGTNLCYTIQV--WPKLTMPVGIIENRLSKDLRLNL
Csp   EKTWQVLTDYEALADFIPNLIKSRLLEHPDGG-IRLEQIGSQ-RLLNFNFCARVVLDLEEYFL------------KEINFR-MIEGDFKGFSGSWCLKPYSFG-----------DLVGTDLCYTIQV--WPKLTMPLKIIEPRLTNDMHVNL
Cst   EKIWQILTDYESLADFIPNLAQSRLLAHPQGG-IRLEQIGSQ-RLLNFKFCARVVLDLEELFP------------KEINFQ-MVEGDFKGFSGKWCLEPYSLG-----------AAQGTNLCYTIQV--WPKLTMPISILENRLSNDLRLNL
Ds    EAVWQVLTDYESLPEFIPSLEKSQRLEHPEGEKVRLEQVGKQ-RLFKVNFSARVVLDLTEMPP------------SRIDFE-MVEGDFKAFSGYWSLEEADQ---------------KTELIYSIFV--WPPRTMPVSLIERRLSLDLSLNL
Gs    KAVWDLLTDYEHLAEFIPNLAVSRLRYHPQGG-IRLEQEGVQ-SVLGFRFRASVILDMYEKFS---------EDRAEIDFVLADSQDFDVFEGSWLMYPMKR--------------NWTHLIYQVTV--QPKRFVPVQAVEWRIREDVPSNL
Pt    DTVWKILTDYEKLADFIPGLAVSKLIDKKDKF-ARLYQIGQQNLAFGLKFNAKAILDCYERDLQTLAS----GEKRDIEFK-MTEGDFQFFEGMWSIEQLAK----PKTEDSVGQEYETTLSYLVDV--KPKMWLPVNLIEGRICKEIKSNL
Vv    HTVWSILTDYEGLADFIPGLAVSQLVEKGEKF-ARLFQIGQQDLAFGLKFNAKGIVDCYEKDLESLPF----GEKRDIEFK-MIEGDFQIFEGKWSIEQRNTNT-WEGKDSSVGQEFYTTLTYVVDV--EPKRWLPVYLVEGRLSREIKMNL
At    DSVWSVLTDYEKLSDFIPGLVVSELVEKEGNR-VRLFQMGQQNLALGLKFNAKAVLDCYEKELEVLPH----GRRREIDFK-MVEGDFQLFEGKWSIEQLDKGI-HGEALDLQFKDFRTTLAYTVDV--KPKMWLPVRLVEGRLCKEIRTNL
Os    DAVWATLTDYEGLAGFIPGLSECRLLDQSDCF-ARLYQVGEQDLALGFKFNARGTIDCYEGELQLLP---AGARRREIAFN-MIDGDFKVFEGNWSVQEEVD-----GGEISADQEFQTILSYVVEL--EPKLWVPVRLLEGRICNEIKTNL
Bd    EAVWATLTDYEGLAGFIPGLSECRLLHQDAAF-ARLYQVGEQDLALGFKFNAKGTIDCYEGEMEVLP---AGARRREIAFN-MVEGDFKVFEGKWSVEEVEDSL-DEGGENPTGQEFQTTLSYVVEL--EPKLWVPVRLLEGRICKEIKTNL
Sb    EAVWATLTDYEGLADFIPGLSECRLLDQHDGF-ARIYQVGEQDLALGFKFNAKGTIDCYEGDMEVLPD--AGARRREIAFN-MIDGDFKLFQGKWSVEEVDGSI-VEGGGNSEEQEFQTTLSYLLEL--EPKLWVPVRLLEGRICSEIKNNL
Si    EAVWATLTDYEGLADFIPGLSECRLLDQAQGF-ARLYQVGEQDLALGFKFNAKGTIDCYEGDLESLPDAQGNARRREIAFN-MIDGDFKVFQGKWSVQESVEQEQGGGDSDEGQESQTTTLSYLVEL--EPKLWVPVRLLEGRICSEIKNNL
Sm    ETVWGVLTDYEGLADFIPGLASSKVLERRENG-AQLLQIGEQELALGVKFRAKGVIEVTELPLELLDNG----CRRDIGFD-MVEGDFNLFRGIWRIEQILHG--------VEDATTQTSLTYILEV--QPKIWIPVALLEGRLQKEVSNNL
Pp    EAVWGVLTDYDHLADHIPGLAESSVLQRRSNG-ARLKQIGQKNFALGVKFKAKAVVEVTEEAAQDLDDG----TLRDLHFE-TVEGDFQVFKGTWRMLEKSLE--------SNDAKVETYLSYILEV--QPKRWMPVALIEGVLGQEITCNL
Cr    SAVWLALSDYDNLGKFIPSLVENRCLERGGRT-AVLYQVGAQDVAMGVKFSA-ALASVEALFP-YPLTSAPGVSSSDITFE-LVEGDFQAFRGVWRMQQT--------------GEATTLLSYALFV--KPQAWLPVALIQGRIENEVVRNL
Cs    EVIWGALTDYDSLGTFIPGLAENRCLERRAQG-AQLLQIGEQEIAFGAKFRARVVLDIEEHWSGVPG 35 aa HDIAFC-ACEGDFQVFRGVWRIQEGSR------------GEGSSRLSYALFV--RPQIWLPVRLVQGRIESEIKNNL
Pm    DSLWSVLTDYDRLNLYIPNLLSSKKIFQKGNN-VHLKQVGAQ-DFLGMKFSAEVTIDLFEN-----------KELGLLKFN-LIKGDFRKFEGSWKIQNIKNTS-------------TNSLIYDLTV--QGCQWMPIGMIEKRLKKDLSENL
Ss    DELWEVLTDYENLSKFIPNLSSSQLVHREGHT-VRLQQVGSQ-QLLGLRFSAQVQLELTEF-----------RSEGLLSFK-MVKGDFRRFEGAWRVNELADG---------------CSLVYELTV--QGCIGMPIALIEERLRDDLSSNL
AtM1  QSVWNVLTDYERLADFIPNLVWSGRIPCPHPGRIWLEQRGLQ-RALYWHIEARVVLDLHECLDSPNG--------RELHFS-MVDGDFKKFEGKWSVKSGIR-------------SVGTVLSYEVNV--IPRFNFPAIFLERIIRSDLPVNL
AtM2  CEVWKVLTSYESLPEIVPNLAIS-KILSRDNNKVRILQEGCK-GLLYMVLHARAVLDLHEIRE------------QEIRFE-QVEGDFDSLEGKWIFEQ-LG-------------SHHTLLKYTVESKMRKDSFLSEAIMEEVIYEDLPSNL
Ct    KHVWAAITDYNNHKSFVPKLIDSGLISDNGRE-QVMFERGKTGIFLFRKTVYIKLSLQGEY-------------PKRLDFH-QIEGDFKVYEGDWLIERASDG---------------KGSILTFRAKIKPDFFAPAMFVRKVQQNDLPMVL
Pa    ETIWNLLTDYNNLSTIIPKVIDSRLIEDNGSH-KIIDQTGKSGILFIEKSVRIVLKVTEKF-------------PNALLFE-MVEGDFSTYTGSWSFRPGSSR---------------EQTFVSWQTDFKPTFFAPPFLVSFLQHQDLPVVM
Conservation marks (original column spacing not preserved): * *:.*: : :*.: : * * : : : :. : *.. ** : * *. : : :.. :: : :: **

B

Abbreviation  Organism                              Accession
AtM1          Arabidopsis thaliana HCF145 SRPBCC1   At5g08720
AtM2          Arabidopsis thaliana HCF145 SRPBCC2   At5g08720
Np            Nostoc punctiforme PCC73102           CP001037
Csp           Calothrix sp. PCC7507                 CP003943
Cst           Cylindrospermum stagnale PCC7417      CP003642
Ds            Dactylococcopsis salina PCC8305       CP003944
Gs            Galdieria sulphuraria                 XM_005707359
Pt            Populus trichocarpa                   XM_002320890
Vv            Vitis vinifera                        XM_002280685
At            Arabidopsis thaliana                  At4g01650
Os            Oryza sativa                          AK059198
Bd            Brachypodium distachyon               XM_003567477
Sb            Sorghum bicolor                       XM_002459131
Si            Setaria italica                       XM_004971363
Sm            Selaginella moellendorffii            XM_002987229
Pp            Physcomitrella patens                 XM_001775661
Cr            Chlamydomonas reinhardtii             XM_001699680
Cs            Coccomyxa subellipsoidea              XM_005645755
Pm            Prochlorococcus marinus sp. AS9601    CP000551
Ss            Synechococcus sp. CC9902              CP000097
Ct            Chlorobium tepidum                    AE006470
Pa            Prosthecochloris aestuarii            CP001108

C  [Unrooted phylogram; not recoverable from the extracted text.]
Supplemental Figure 5. Conservation of the SRPBCC Motif in HCF145-L Proteins of Photosynthetic Organisms. (A) Multiple alignment of HCF145-L proteins in photosynthetic organisms. The HCF145 motifs SRPBCC1 (AtM1) and SRPBCC2 (AtM2) are included. Acidic and basic amino acids are labeled in red and blue, respectively. Highly conserved amino acids are boxed in yellow. For a schematic presentation, see Figure 3. (B) Abbreviations of organisms and accession numbers used in the alignment. (C) Unrooted phylogram of all sequence motifs used in the alignment, analyzed by the maximum likelihood method. Bootstrap values (based on 1000 iterations) are shown for the corresponding nodes. The scale bar indicates the evolutionary distance in substitutions per site.
Vascular plants

Arabidopsis thaliana HCF145, At5g08720
   1. FMPMRKQLRLHGRVDIEKAIT--RMGGFRRIALMMNL-SLAYKHRKPKGYWD
   2. FMPSRKSFERAGRYDIARALE--KWGGLHEVSRLLAL-NVRHPNRQLNSRKD

Green algae

Chlorella variabilis, XM_005846292 (Chlorellales)
   1. TMPTSAQLEAAGRRDLVAAVRA--AGGFLEVAQALGLR----SQRKPAGYWE
   2. VMPSRSALQAAGRYDLHHAVML--HGGYTVASQSLDR----------RPAWP
   3. CLPTASQLLEAGRGDLYQASRW--GGGAIVRRGGFCA-------AGQALGWE
   4. RMPTHLQLASAGRHDLKYALQL--HGSASIAAM-LGLQ------GNTQGAHN

Bathycoccus prasinos, CCO17813 (Mamiellales)
   1. GMPSKRSLEKENRKDLIKRVEK--LFGYDWLTMAVLLDF--EPFRKPFYYWD
   2. VMPTRRDLIDARRWDLHHAVVL--HGGYGAVAKTLKWPR--ARWAEDRHLLN
   3. RLPSALELRNVGRDDLARHMVE--HGGPVTVAKRMRMKP-------GKGAWI
   4. YMPTDEELINAGRHDLRYRVKE--IGSATVAKYAKLQNRTEKMSLAEARAFL

Coccomyxa subellipsoidea C-169, XP_005645828 (Coccomyxaceae)
   1. RMPSCTELREAGAFTLYSAISK--HGGVGAFARQLGLDP----KRRDSGYWE
   2. GMPTIQDLQRSGHNSLIKAINH--WGGRSAVARRLGLACSPTRRLMTLGDLS
   3. VMPSRTQLLEAGRPDLLQAVKR--MGGFKRVAAALELAF----LPARRGRSA

Ostreococcus lucimarinus, XM_001415510 (Mamiellophyceae)
   1. CMPTREQLR-GGRHWDAIQQIE-SLGGFVKVAQLLDWSG---AKTRPRGYWT
   2. RMPSQKSLRDAGRADIVNALKR--FGGAEKVAASMGLEFGSGNKRSSASARG

Red algae

Cyanidioschyzon merolae, XM_005538035 (Cyanidiales)
      TMPTAGQLAAHHRSDLIRAIRK--HGGFPKVAEQLGLK----AHRRPNGYWN

Cyanidioschyzon merolae, XM_005538049 (Cyanidiales)
      VMPPERILAKAGRFDLIISIEY--HGGSRAVAEICELRDSASWEYVLEMRDL
      EMPSIAELQRQGREDLARLIRR--HGGPLVFAARFGLYVPPSRRRREADLKW

Cyanidioschyzon merolae, XM_005538687 (Cyanidiales)
   1. YMPTSNELREAKRSDLVRAIIV--HGGYAKVAERCGL--QPHR--RSFGYWR
   2. RMPTYNELVAAKKRILAYAIAE--NGGFLEVARRMNLQLSSDE-----TPWR
   3. RFPRQQDLVRLGRYDLDWAIHRW-HGGYTRLAAELGYLRSRLPC-KPRNFWS
   4. RMPNRKELEALDRHDLIYAIRK--FGGFLTVATKLGLSRDALTHTRPRGYWS
   5. IMPRLEQLRMYNREDLINAIHR--HGGAANVARRLHLFWY-----GPKTFWR
   6. KMPTQQELISAGRVDVAYGVHL--HGGVYEVARRLRLQVLDPP--RAPFYWN
   7. VMPTSMTIVRSGRRDLAAAIRR--HGGWDAFARRLNLRPAAPK--RPKGYWN
   8. FVRNYAADFVRRPGDDEDDPDAAGKIAYSDVSEILAGNVGNKRCRTVVGAIR
   9. VMPTAEELRLDGRADLVFACERI-HGGLATVARGLGWPLLAER--LPPESLK
  10. EMPTEADLLRTGGIDIHEAIVC--HGGYVEVARSLNLRHPEDP---EWTDWS

Galdieria sulphuraria, XM_005705295 (Cyanidiales)
   1. SVHLLQVKKSGGECPLPRSSTKCYYGGGTPSFELLVP-20aa-VRDVRGYWK
   2. RMPSANELRKSGRYALALAISSH-HGGFHAVAREIGLQPNHHS----SGYWD
   3. YMPTYHQLLKARRLDLAKAIHK--YGGFPAVAEKLGR---IPN--KRRKYWH
   4. QLPTMTLLSTCKRWDLMGAIRL--HGGLYEVSRKTQIPLSKSTR-QPRGYWS
   5. IVPTLSNLKRNQRQDLVEAIRK--HGGVQTVAAKLYMLRQSKR--KAKGYWN
   6. VMPLGHELRRHQRRDLCYAIQL--HGGFSVVAGKLHL---NWI--GPISFWR
   7. RMPTLNDLVIRGRVDLAFGIRL--HHGFPAVAKAFGLEWTIPS--RPRMYWN
   8. YMPSNETLYQLGRGDLADAIRD--TKGWVYYAKRLGLVPHYRCI-SSHKLWK
   9. AVASVEELYRDGRGDIAFAIMKY-HQGATQLAHRLKWKAPHMRP-LPPAYYR
  10. LMPSKKELFQTGYRDLVFVIYR--HGGFQKVACRMGWTIHEDNP-HWLTQWL

Cyanobacteria

Microcoleus sp. PCC 7113, AFZ22266 (Oscillatoriales)
   1. VMPKAAQLRQLGRYDLA-MAISKYHGGYRSVASRLGLTYT----GQRFGYWH
   2. VMPSRQQLEQAGEKPLAAAIG-LHGGVL-AVARRLGFKLP--YGRKPRGYWK
   3. VMPTREQLVQIQRAELISAIA-TNGGWP-SVARRFGL------ANPNKGYTS
Supplemental Figure 7. Multiple Alignment of Transcript Binding Motif Repeats (TMR) Present in Representative TMR Proteins of Photosynthetic Eukaryotes and Cyanobacteria. The consensus sequence MP-x(4)-L-x(3)-GR-x-DL-x(2)-AI-x(2-4)-HGG-x(3)-VA-x(2)-LGL-x(5-27)-GYW is colored in yellow. Acidic and basic amino acids are labeled in red and blue, respectively. For a schematic presentation, see Figure 3.
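The consensus in the caption is written in a PROSITE-like notation, where `x(n)` stands for n arbitrary residues and `x(a-b)` for a variable-length spacer. The sketch below translates that notation into a Python regular expression. It is an illustration only: the helper name `consensus_to_regex` is hypothetical, and the strict literal matching it performs is an assumption, not how the motif was defined or searched for in the study.

```python
import re

# Consensus string as given in the caption of Supplemental Figure 7.
CONSENSUS = ("MP-x(4)-L-x(3)-GR-x-DL-x(2)-AI-x(2-4)-HGG-x(3)-"
             "VA-x(2)-LGL-x(5-27)-GYW")

# One token is either x(n), x(a-b), a bare x, or a run of literal residues.
TOKEN = re.compile(r"x\((\d+)(?:-(\d+))?\)|(x)|([A-Z]+)")


def consensus_to_regex(pattern):
    """Translate the PROSITE-like consensus into a regular expression."""
    parts = []
    for low, high, any_one, literal in TOKEN.findall(pattern):
        if literal:                      # fixed residues, e.g. "MP"
            parts.append(literal)
        elif any_one:                    # bare x: exactly one residue
            parts.append(".")
        elif high:                       # x(a-b): variable spacer
            parts.append(".{%s,%s}" % (low, high))
        else:                            # x(n): fixed-length spacer
            parts.append(".{%s}" % low)
    return "".join(parts)


motif = re.compile(consensus_to_regex(CONSENSUS))

# First Arabidopsis HCF145 repeat from the figure, alignment gaps removed.
at_repeat_1 = "FMPMRKQLRLHGRVDIEKAIT--RMGGFRRIALMMNL-SLAYKHRKPKGYWD".replace("-", "")

print(motif.pattern)
print(bool(motif.search(at_repeat_1)))
```

The consensus is an idealized summary, so individual repeats need not match it strictly. The first Arabidopsis repeat, for example, has DI where the consensus has DL and therefore fails the literal match; a tolerant search would need residue classes or a position-specific scoring approach instead.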
Supplemental Table 1. Oligonucleotides Used for PCR, RT-PCR, Primer Extension, Probe Generation, Genotyping, and Other Applications.

Primer name              Primer sequence (5′ to 3′)
720-ATG-f-P              atgtcagtgagcaagtttccacatctc
720-3UTR-r-P             tatctttgcgagttacaagactacac
Cacc-145-for             caccatgtcagtgagcaagtttccacatctc
145-rev.                 atattgaacccaattgatatcaagatc
ex7-for                  ctgaagcaatcatggaagagg
ex8-for                  agctttgatgatgaatctttcacttgc
ex10-rev2                ggatgtctcacgttcaatgc
ex10-rev1                aagacgagatacttcgtgtaatcctc
LBb1                     gcgtggaccgcttgctgcaact
PsaA80-mer               ccacatctccattcaggatttcttggcccactattggccaaaccacctgagcactaggtccaatgtgagtaggatcactc
At4-Kpn1-f               ctgctttggtaccgctttccagatctag
At4-Kpn1-r               atcatggtacccaaccaattcgctaacaagacc
Cacc-145-for             caccatgtcagtgagcaagtttccacatctc
145-rev                  atattgaacccaattgatatcaagatc
Fw-BclI-hcf145           atatattgatcacatagcggcgccggtag
Rev-SalI-hcf145-strep    ataatagtcgacttatttttcgaactgcgggtggctccaagcgctatattgaacccaattgatatcaag
Rev-SalI-hcf145-strepa2  ataatagtcgacttatttttcgaactgcgggtggctccaagcgctaatgtctctttgtagaccaggaat
Fw-BclI-hcf145-B1        atatattgatcaagaggagagaaatcttcagag
Rev-SalI-hcf145-strep    ataatagtcgacttatttttcgaactgcgggtggctccaagcgctatattgaacccaattgatatcaag
psaa ATG rev             gttccggcgaacgaataatcat
70-mer psaa 1            tctattttctttagttattcactagagcaattatgatctggaagtcgatctggggcaagtgttcggatct
70-mer psaa 2            aagtaattagtgaaccgatgatagatacttagaatttttttttctttctcttctatcttccatctattca
70-mer psaa 3            gcgaacgaataatcattgagtcctcctctttccggacaacacatacaaagaaacccgccaacagtcactc
psba 80-mer              gacggttttcagtgctagttatccagttacagaagcgaccccataggctttcgctttcgcgtctctctaaaattgcagtc
T7_5UTRpsaA 3            taatacgactcactatagggagaagatccgaacacttgccc
T7_5UTR-psaA 2           taatacgactcactatagggctaagtatctatcatcggttcac
psaa5utrrev              gtgaaccgatgatagatacttag
HCF145-like 2-f          ctcgacactgctagcttcttc
HCF145-like 2-r          tcgtaacaatctagaacagctttg
LB4                      cgtgtgccaggtgcccacggaatagt
B3-T7-petB-for           cgtaatacgactcactatagggactgaagctaactttgg
petb-ex-rev              taccggaatagcgtcaggtacac
Supplemental Table 1. (continued)

Primer name              Primer sequence (5′ to 3′)
psaa 5' 1                taccggaatagcgtcaggtacac
pppsaa_f                 tggcgagcatctggaataacta
pppsaa_r                 tgtccagcaaccagaaaaagaa
pppsba_f                 ttcattgcagctcctcctgtag
pppsba_r                 aggatgttgtgttcagcttgga
pprps14_f                aaaaatatcaagattttcgtcactc
pprps14_r                tgcatgagccatttctcgta
ppycf3_f                 tgttgcagacattttactccgaat
ppycf3_r                 ccaatgctattgcttgtttcca
npt5                     agtgcattctttggacctccaatcggatcctgtcaaacactga
npt3                     gacaggaggcccgatctagtaa
pp5 f                    gctcgagtttttcagcaagatgtggccaacacaaatgaacagt
pp5 r                    tggaggtccaaagaatgcacta
pp3 f                    ttactagtcgggcctcctgtccacgtgtttcttgaaccagagg
pp3 r                    aggagatcttctagaaagatgtcgacctcatatgaatggccacgatgt
ppf1                     atcagcagtgattctgccctaag
ppr1                     gcggctgagtggctccttca
ppf2                     aaattatcgcgcgcggtgtc
ppr2                     tgatcatggtgagaagtgttgct
ppf3                     ggtttgggaggttttgactgac
ppr3                     cgtggagagacaacatcctttg
EF1α-f                   agcgtggtatcacaattgac
EF1α-r                   gatcgctcgatcatgttatc
Fw T7 psaa3_1            taatacgactcactatagggagacataattgctctagtgaataactaa
Fw T7 psaa3_2            taatacgactcactatagggagagaatagatggaagatagaagagaa
Rev psaa3_1              ttagttattcactagagcaattatg
Rev psaa3_2              ttctcttctatcttccatctattc
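Primer pairs in the table can be checked against each other with a reverse-complement comparison. The sketch below does this for the two `Fw T7 psaa3` / `Rev psaa3` pairs and also tests for the standard T7 promoter sequence at the 5′ end of the forward oligonucleotides. The helper names and the `T7_PROMOTER` constant are introduced here for illustration; the table itself does not state what each pair was used for, so this only checks sequence relationships.

```python
# Complement lookup for lowercase DNA, matching the table's notation.
COMPLEMENT = str.maketrans("acgt", "tgca")

# Widely used T7 promoter sequence (an assumption about the primer design,
# inferred from the shared 5' end of the "T7" oligonucleotides).
T7_PROMOTER = "taatacgactcactataggg"


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Sequences copied from Supplemental Table 1.
PAIRS = {
    "psaa3_1": ("taatacgactcactatagggagacataattgctctagtgaataactaa",
                "ttagttattcactagagcaattatg"),
    "psaa3_2": ("taatacgactcactatagggagagaatagatggaagatagaagagaa",
                "ttctcttctatcttccatctattc"),
}


def check_pair(forward, reverse):
    """Report whether the forward oligo starts with the T7 promoter and ends
    with the reverse complement of the reverse oligo."""
    return (forward.startswith(T7_PROMOTER),
            forward.endswith(reverse_complement(reverse)))


for name, (forward, reverse) in PAIRS.items():
    has_t7, complementary = check_pair(forward, reverse)
    print(name, "T7 promoter:", has_t7, "3' end complementary:", complementary)
```

Both pairs pass both checks: each reverse oligonucleotide is the exact reverse complement of the 3′ portion of its forward partner.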