Our laboratory has published the following review on protein-RNA complexes:
Serganov, A. & Patel, D. J. (2008). Towards deciphering the principles underlying an mRNA recognition code. Curr. Opin. Struct. Biol. 18, 120-129.
Proteins involved in Early Neuronal Morphology
Cellular morphology is an essential determinant of cellular function in all kingdoms of life. The Yang Shi laboratory (Harvard Medical School) has identified a molecular program that controls the early morphology of neurons through a metazoan-specific zinc finger protein, Unkempt.
Unkempt is an evolutionarily conserved RNA-binding protein that regulates translation of its target genes and is required for the establishment of the early bipolar neuronal morphology. Here we determine the X-ray crystal structure of mouse Unkempt and show that its six CCCH zinc fingers (ZnFs) form two compact clusters, ZnF1–3 and ZnF4–6, that recognize distinct trinucleotide RNA substrates. Both ZnF clusters adopt a similar overall topology, and use similar recognition principles to target specific RNA sequences. Structure-guided point mutations reduce the RNA-binding affinity of Unkempt both in vitro and in vivo, ablate its translational control and impair the ability of Unkempt to induce a bipolar cellular morphology. Our study unravels a novel mode of RNA sequence recognition by clusters of CCCH ZnFs that is critical for post-transcriptional control of neuronal morphology.
Murn, J., Zarnack, K., Yang, Y. J., Durak, O., Murphy, E. A., Cheloufi, S., Gonzalez, D. M., Teplova, M., Curk, T., Zuber, J., Patel, D. J., Ule, J., Luscombe, N. M., Tsai, L. H., Walsh, C. A. and Shi, Y. (2015). Control of a neural morphology program by an RNA-binding zinc finger protein, Unkempt. Genes Dev. 29, 501-512.
Murn, J., Teplova, M., Zarnack, K., Shi, Y. and Patel, D. J. (2016). Recognition of distinct RNA motifs by the clustered CCCH zinc fingers of neuronal protein Unkempt. Nat. Struct. Mol. Biol. in press.
Mammalian Quaking (QKI) and its C. elegans homolog GLD-1 are evolutionarily conserved RNA-binding proteins, which post-transcriptionally regulate target genes essential for developmental processes and myelination. QKI proteins regulate the stability, export and alternative splicing of multiple mRNAs associated with the formation of myelin. All QKI proteins bind RNA via their STAR domains, composed of a KH domain flanked by two conserved domains, Qua1 and Qua2. Qua1 has been shown to be critical for homodimerization, while KH-Qua2 is critical for RNA binding. To date, the structure of the entire STAR domain, with or without bound RNA, has not been solved, precluding an understanding of the molecular recognition principles associated with complex formation.
We present x-ray structures of the STAR domain, composed of Qua1, KH, and Qua2 motifs of QKI and GLD-1 bound to high-affinity in vivo RNA targets containing YUAAY RNA recognition elements (RREs). The KH and Qua2 motifs of the STAR domain synergize to specifically interact with bases and sugar-phosphate backbones of the bound RRE. Qua1-mediated homodimerization generates a scaffold that enables concurrent recognition of two RREs, thereby plausibly targeting tandem RREs present in many QKI-targeted transcripts. In studies in the laboratory of our collaborator Thomas Tuschl (Rockefeller University), structure-guided mutations reduced QKI RNA-binding affinity in vitro and in vivo, and expression of QKI mutants in HEK293 cells significantly decreased the abundance of QKI target mRNAs. Overall, our studies define principles underlying RNA target selection by STAR homodimers and provide insights into the post-transcriptional regulatory function of mammalian QKI proteins.
Teplova, M., Hafner, M., Teplov, D., Essig, K., Tuschl, T. and Patel, D. J. (2013). Structure-function studies of STAR family Quaking proteins bound to their in vivo RNA target sites. Genes Dev. 27, 928-940.
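The sequence side of the recognition described above can be sketched computationally. This is an illustration only, not a method used in the studies: the YUAAY consensus is written as a regular expression, with Y standing for a pyrimidine (C or U). The function names, the made-up example transcript and the 20-nt spacing cutoff for calling two sites "tandem" are all assumptions.

```python
import re

# YUAAY consensus, Y = pyrimidine (C or U).
# The lookahead reports sites that overlap by one nucleotide (e.g. CUAACUAAC).
RRE_PATTERN = re.compile(r"(?=([CU]UAA[CU]))")


def find_rres(rna):
    """Return 0-based start positions of YUAAY elements in an RNA sequence."""
    seq = rna.upper().replace("T", "U")  # also accept DNA-style input
    return [m.start() for m in RRE_PATTERN.finditer(seq)]


def tandem_pairs(positions, max_gap=20):
    """Pair neighbouring sites whose starts lie within max_gap nt.

    max_gap is a hypothetical cutoff chosen for illustration only.
    """
    return [(a, b) for a, b in zip(positions, positions[1:]) if b - a <= max_gap]


if __name__ == "__main__":
    example = "GGACUAACAAGGAUUAAUCCG"  # made-up sequence, not a real QKI target
    sites = find_rres(example)
    print(sites, tandem_pairs(sites))
```

A scan of this kind only flags candidate sites; whether a homodimer actually engages two of them depends on spacing and RNA structure, which the sketch does not model.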
Nucleocytoplasmic Export of mRNAs and Retroviral RNAs
Messenger RNA export is mediated by the TAP-p15 heterodimer, which belongs to the family of NTF2-like export receptors. TAP-p15 heterodimers also bind to the constitutive transport element (CTE) present in simian type D retroviral RNAs, and mediate export of viral unspliced RNAs to the host cytoplasm. Given that TAP-p15 recognizes CTE RNA, it was of interest to define the molecular principles underlying protein-RNA recognition associated with targeting and sequestration of a previously unidentified RNA fold for TAP-mediated transport through the nuclear pore.
We have solved the crystal structure of the RNA recognition and leucine-rich repeat motifs of TAP bound to one symmetrical half of CTE RNA. L-shaped conformations of protein and RNA are involved in a mutual molecular embrace on complex formation. The laboratory of our collaborator Elisa Izaurralde (Max-Planck Institute, Tuebingen) has monitored the impact of structure-guided mutations on binding affinities in vitro and transport assays in vivo. Our studies define the principles by which CTE RNA subverts the mRNA export receptor TAP, thereby facilitating nuclear export of viral genomic RNAs, and more generally, provide insights into cargo RNA recognition by mRNA export receptors.
Teplova, M., Wohlbold, L., Kim, N. Y., Izaurralde, E. & Patel, D. J. (2011). Structure-function studies of nucleocytoplasmic transport of retroviral genomic RNA by mRNA export factor TAP. Nat. Struct. Mol. Biol. 18, 990-998.
Recently, a protein complex called C3PO (component 3 promoter of RISC) was shown by the Qinghua Liu laboratory (University of Texas Southwestern Medical School) to be a Mg2+-dependent endoRNase, which facilitates RISC activation by siRNA unwinding, as well as through removal of the cleaved passenger strand. C3PO forms a multimeric complex of translin and TRAX, in which TRAX acts as the catalytic subunit. An emerging challenge relates to the ratio of translin to TRAX in multimeric C3PO, its overall architecture and how C3PO carries out its endoRNase activity.
Trax/translin heteromers, also known as C3PO, have been proposed to activate RNA-induced silencing complex (RISC) by facilitating endonucleolytic cleavage of the siRNA passenger strand. We report on the crystal structure of hexameric Drosophila C3PO formed by truncated Trax and translin, along with electron microscopic and mass spectrometric studies on full-length octameric Trax and translin. Our studies establish that Trax adopts the translin fold, possesses catalytic centers essential for C3PO’s endoribonuclease activity and interacts extensively with translin to form an octameric assembly. The catalytic pockets of the Trax subunits are located within the interior chamber of the octameric scaffold. Biochemical studies in the laboratory of our collaborator Thomas Tuschl (Rockefeller University) established that truncated C3PO, like full-length C3PO, shows endoRNase activity that leaves 3’-hydroxyl-cleaved ends. We have measured the catalytic activity of C3PO and shown it to cleave almost stoichiometric amounts of substrate per second.
Tian, Y., Simanshu, D. K., Ascano, M., Diaz-Avalos, R., Park, A. Y., Juranek, S. A., Rice, W. J., Yin, Q., Robinson, C. C., Tuschl, T. & Patel, D. J. (2011). Multimeric assembly and biochemical characterization of the Trax-translin endonuclease complex. Nat. Struct. Mol. Biol. 18, 658-664.
Autoimmune Disease Syndromes
Diverse aspects of RNA metabolism are dictated by the La autoantigen, an abundant RNA-binding phosphoprotein found in the nucleus of all eukaryotes, and originally identified as an autoantigen in patients with systemic lupus erythematosus and Sjögren’s syndrome. La specifically targets and protects the UUUOH 3’-termini of nascent RNA polymerase III transcripts, including pre-tRNAs, 5S rRNAs and snRNAs, from exonuclease digestion, while discriminating against 3’-phosphate-containing internal oligo U tracts and degraded RNA. La plays a role in 5’- and 3’-end processing of tRNA precursors and exhibits RNA chaperone-like activity, thereby playing a key role in facilitating correct transcript folding, downstream processing and maturation, and ribonucleoprotein particle assembly. In addition, La binds viral RNAs by site-specifically targeting their internal ribosome entry sites and stimulating translational inhibition.
We have recently solved the crystal structure of the N-terminal domain (NTD) of human La, consisting of La and RRM1 motifs, bound to a 9-mer ending in UUUOH, as well as the NMR solution structure of the La NTD complexed to a 3-mer UUUOH. The UUUOH 3’-end, in a splayed-apart orientation, is sequestered in a basic and aromatic amino acid-lined cleft between the La and RRM1 motifs. The specificity-determining central U base bridges both motifs, in part through unprecedented targeting of the β-sheet edge, rather than the anticipated β-sheet face, of the RRM1 motif. Both hydroxyls of the sugar ring of the last U are hydrogen-bonded, with neither phosphate nor bulky modifications tolerated at this site. Our structural and mutation results establish how the La NTD protects the UUUOH 3’-ends of nascent RNA transcripts during downstream processing and maturation events. Current efforts in the laboratory are focused on structural characterization of complexes of the La C-terminal domain (CTD) with its RNA targets.
Teplova, M., Yuan, Y. R., Phan, A. T., Malinina, L., Ilin, S., Teplov, A. & Patel, D. J. (2006). Structural basis for recognition and sequestration of UUUOH 3’-termini of nascent RNA polymerase III transcripts by La autoantigen. Mol. Cell 21, 75-85.
The generation of functionally diverse proteins required for cell growth and differentiation in metazoan organisms is critically dependent on alternative splicing of pre-mRNAs. Alternative splicing regulators control the expression of tissue-specific or developmental stage-specific protein isoforms through binding either directly to splice sites or to other sequences in pre-mRNA, thereby enhancing or repressing inclusion of alternative exons. The proteins of the muscleblind-like family, MBNL, have been identified as important tissue-specific alternative splicing regulators that play a key role in terminal muscle differentiation. The normal splicing pattern is altered specifically in the neuromuscular disease myotonic dystrophy (DM), in part due to inactivation of MBNL.
CUG binding protein 1 (CUGBP1) regulates multiple aspects of nuclear and cytoplasmic mRNA processing, with implications for onset of myotonic dystrophy. CUGBP1 harbors three RRM domains and preferentially targets UGU-rich mRNA elements. We report on crystal structures of CUGBP1 RRM1 and tandem RRM1/2 domains bound to RNAs containing tandem UGU(U/G) elements. Both RRM1 in RRM1-RNA and RRM2 in RRM1/2-RNA complexes use similar principles to target UGU(U/G) elements, with recognition mediated by face-to-edge stacking and water-mediated hydrogen bonding networks. The UG step adopts a left-handed Z-RNA conformation, with the syn guanine recognized through Hoogsteen edge-protein backbone hydrogen-bonding interactions. NMR studies on the RRM1/2-RNA complex establish that both RRM domains target tandem UGUU motifs in solution, while filter-binding assays identify a preference for recognition of GU over AU or GC steps. We discuss the implications of CUGBP1-mediated targeting and sequestration of UGU(U/G) elements on pre-mRNA alternative-splicing regulation, translational regulation and mRNA decay.
Teplova, M., Song, J., Gaw, H. Y., Teplov, V. & Patel, D. J. (2010). Structural insights into RNA recognition by the CUG binding protein 1. Structure 18, 1364-1367.
MBNL proteins harbor tandem CCCH Zn finger (ZnF) domains that target pre-mRNAs containing YGCU(U/G)Y sequence elements. In myotonic dystrophy, reduced levels of MBNL proteins lead to aberrant alternative splicing of a subset of pre-mRNAs. Our crystal structure of MBNL1 ZnF3/4 bound to r(CGCUGU) establishes that both ZnF3 and ZnF4 specifically target GC steps. The guanine and cytosine bases of the GC step insert into adjoining pockets, where they stack on conserved arginine and aromatic residues and form a network of hydrogen bonds primarily with main-chain groups of the protein, while the 2’-OH groups are hydrogen bonded to conserved side-chains. The relative alignment of ZnF3 and ZnF4 domains is dictated by the topology of the interdomain linker, with the resulting anti-parallel orientation of bound GC elements supportive of a chain-reversal loop trajectory for MBNL1-bound pre-mRNA targets, thereby impacting alternative splicing regulation.
Teplova, M. & Patel, D. J. (2008). Structural insights into RNA recognition by the alternate splicing regulator muscleblind-like MBNL1. Nat. Struct. Mol. Biol. 15, 1343-1351.
Neurodegenerative Disease Syndromes
Nova proteins, expressed in central nervous system neurons, are target antigens of the autoimmune disorder POMA syndrome, a neurodegenerative disease that originates when systemic malignant tumors express proteins normally sequestered in the central nervous system. The immune system recognizes these antigens as non-self, and the ensuing response results in neurodegeneration. Our structural studies of the POMA syndrome are directed toward providing a structural understanding of how full-length Nova, which contains three K-homology (KH) domains, targets and regulates alternative splicing events within the glycine receptor α2 subunit pre-mRNA.
The fragile X syndrome, the most common form of inherited mental retardation in humans, results from the expansion and hypermethylation of trinucleotide CGG repeats located within the 5’-untranslated region of the FMR1 gene. Fragile X mental retardation protein (FMRP) contains RNA-binding KH and RGG domains and is known to regulate mRNA localization and/or translation. The onset of the fragile X mental retardation (FXMR) syndrome is associated with loss of FMRP function, which is essential for higher cognitive function. Recently, mRNAs encoding proteins that contain FMRP-binding elements have been identified, and it appears that their dysregulation may underlie human mental retardation. We are interested in the characterization of RNA complexes formed with RNA-binding elements within the FMRP protein, in our attempts to understand factors that contribute to the FXMR syndrome.
Nova onconeural antigens are neuron-specific RNA-binding proteins implicated in paraneoplastic opsoclonus-myoclonus-ataxia (POMA) syndrome. Nova harbors three K-homology (KH) motifs implicated in alternative splicing regulation of genes involved in inhibitory synaptic transmission. We report the crystal structure of the first two KH domains (KH1/2) of Nova-1 bound to an in vitro selected RNA hairpin, containing a UCAG-UCAC high-affinity binding site. Sequence-specific intermolecular contacts in the complex involve KH1 and the second UCAC repeat, with the RNA scaffold buttressed by interactions between repeats. While the canonical RNA-binding surface of KH2 in the above complex engages in protein-protein interactions in the crystalline state, the individual KH2 domain can sequence-specifically target the UCAC RNA element in solution. The observed anti-parallel alignment of KH1 and KH2 domains in the crystal structure of the complex generates a scaffold that could facilitate target pre-mRNA looping upon Nova binding, thereby potentially explaining Nova’s functional role in splicing regulation. Binding studies in the laboratories of our collaborators Jennifer and Robert Darnell (Rockefeller University) validate the structural conclusions on the Nova-RNA complex.
Teplova, M., Malinina, L., Darnell, J. C., Song, J., Lu, M., Abagyan, R., Musunuru, K., Teplov, A., Burley, S. K., Darnell, R. B. & Patel, D. J. (2011). Protein-RNA and protein-protein recognition by dual KH1/2 domains of the neuronal splicing factor Nova-1. Structure 19, 930-944.
Fragile X Mental Retardation Protein (FMRP) is a regulatory RNA binding protein that plays a central role in the development of several human disorders including Fragile X Syndrome (FXS) and autism. FMRP uses an arginine-glycine-rich (RGG) motif for specific interactions with guanine (G)-quadruplexes, mRNA elements implicated in the disease-associated regulation of specific mRNAs. Here we report the 2.8-Å crystal structure of the complex between the human FMRP RGG peptide and an in vitro selected G-rich RNA. In this model system, the RNA adopts an intramolecular K+-stabilized G-quadruplex structure composed of three G-quartets and a mixed tetrad connected to an RNA duplex. The RGG peptide specifically binds to the duplex-quadruplex junction, the mixed tetrad, and the duplex region of the RNA through shape complementarity, cation-π interactions, and multiple hydrogen bonds. Many of these interactions critically depend on a type I β-turn, a secondary structure element whose formation was not previously recognized in the RGG motif of FMRP. RNA mutagenesis and footprinting experiments indicate that interactions of the peptide with the duplex-quadruplex junction and the duplex of RNA are equally important for affinity and specificity of the RGG-RNA complex formation. These results suggest that specific binding of cellular RNAs by FMRP may involve hydrogen bonding with RNA duplexes and that RNA duplex recognition can be a characteristic RNA binding feature for RGG motifs in other proteins.
Vasilyev, N., Polonskaia, A., Darnell, J. C., Darnell, R. B., Patel, D. J. and Serganov, A. (2015). Crystal structure reveals specific recognition of a G-quadruplex RNA by a β-turn in the RGG motif of FMRP. Proc. Natl. Acad. Sci. USA 112, E5391-E5400.
We have determined the solution structure of the complex between an arginine-glycine-rich RGG peptide from the fragile X mental retardation protein (FMRP) and an in vitro-selected guanine-rich sc1 RNA. The bound RNA forms a novel G-quadruplex separated from the flanking duplex stem by a mixed junctional tetrad. The RGG peptide is positioned along the major groove of the RNA duplex, with the G-quadruplex forcing a sharp turn of R10GGGGR15 at the duplex-quadruplex junction. Arginines R10 and R15 form cross-strand specificity-determining intermolecular hydrogen-bonds with the major-groove edges of guanines of adjacent Watson-Crick G•C pairs. Filter binding assays undertaken by the laboratories of our collaborators Jennifer and Robert Darnell (Rockefeller University) on RNA and peptide mutations identify and validate contributions of peptide-RNA intermolecular contacts and shape complementarity to molecular recognition. These findings on FMRP RGG domain recognition by a combination of G-quadruplex and surrounding RNA sequences have implications for recognition of other genomic G-rich RNAs.
Phan, A. T., Kuryavyi, V., Darnell, J. C., Serganov, A., Majumdar, A., Raslin, T., Polonskaia, A., Chen, C., Clain, D., Darnell, R. B. & Patel, D. J. (2011). Structure-function studies of FMRP RGG peptide recognition of an RNA duplex-quadruplex junction. Nat. Struct. Mol. Biol. 18, 796-804.
Aminoglycoside Antibiotic-RNA Complexes
Our group is interested in the molecular basis for site-specific aminoglycoside-RNA recognition both on natural and in vitro selected targets. Several groups have identified RNA folds following in vitro selection that target aminoglycoside antibiotics with affinities ranging from mM to nM. These RNA aptamer sequences are likely to undergo adaptive binding on complex formation to generate specific pockets for the bound aminoglycoside antibiotics.
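To put the mM-to-nM affinity range in energetic terms, a dissociation constant can be converted to a standard binding free energy with the textbook relation ΔG° = RT ln(Kd), taking Kd relative to a 1 M standard state. The sketch below is illustrative only: the function name is hypothetical and the temperature of 298.15 K is an assumption, not a condition reported for any of the aptamer studies.

```python
import math

R_KCAL_PER_MOL_K = 0.0019872041  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant.

    Uses dG = RT ln(Kd / 1 M); temperature_k is an assumed value.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_molar)


if __name__ == "__main__":
    for label, kd in [("1 mM", 1e-3), ("1 uM", 1e-6), ("1 nM", 1e-9)]:
        print(f"Kd = {label}: dG = {binding_free_energy(kd):.1f} kcal/mol")
```

Under these assumptions, each thousand-fold tightening of Kd corresponds to roughly 4 kcal/mol of additional binding free energy, so the mM-to-nM span covers a difference of about 8 kcal/mol.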
Apramycin is unique among aminoglycoside antibiotics in containing a bicyclic core domain. It binds preferentially to eukaryotic decoding sites compared with prokaryotic counterparts and induces misreading of the genetic code during translation. The structure of the complex has been solved at 1.5 Å resolution, with the apramycin binding in the deep groove of the decoding site RNA, which forms a continuously stacked helix comprising novel non-canonical CA and GA pairs and a bulged adenine. Apramycin recognizes the RNA target by specific direct contacts and interactions mediated by a Mg cation and water molecules. We have also solved the free eukaryotic decoding site at 2.4 Å resolution, and observe that the RNA does not undergo a conformational transition on apramycin complex formation.
Hermann, T., Tereshko, V., Skripkin, E. & Patel, D. J. (2007). The structure of the apramycin-eukaryotic RNA decoding site complex. Blood Cells, Molecules, and Diseases 38, 193-198.
The aminocyclitol antibiotic streptomycin was the second antibiotic after penicillin to have a dramatic impact on medical practice and treatment. Streptomycin interacts with the central domain of 16S ribosomal RNA and also inhibits group I intron splicing. A modular streptomycin-binding RNA aptamer has been identified by in vitro selection in the Renee Schroeder laboratory. Streptomycin-RNA aptamer complex formation occurs with micromolar affinity and, strikingly, has an absolute requirement for divalent cations as an essential cofactor for the interaction. We describe a 2.9 Å x-ray structure of the complex between streptomycin and an in vitro selected RNA aptamer, solved using the anomalous diffraction properties of bound Ba cations. The RNA aptamer, which contains two asymmetric internal loops, adopts a distinct cation-stabilized fold involving a series of S-shaped backbone turns anchored by canonical and non-canonical pairs and triples. The streptomycin streptose ring is encapsulated by stacked arrays of bases from both loops at the elbow of the L-shaped RNA architecture. Specificity is defined by direct hydrogen bonds between all streptose functional groups and base edges that line the inner walls of the cylindrical binding pocket. By contrast, the majority of the intermolecular interactions involve contacts to backbone phosphates in the published structure of streptomycin bound to 16S RNA.
Tereshko, V., Skripkin, E. & Patel, D. J. (2003). Encapsulating streptomycin within a small 40-mer RNA. Chem. Biol. 10, 175-187.