Results 1 - 20 of 22
1.
Cell ; 183(4): 905-917.e16, 2020 11 12.
Article in English | MEDLINE | ID: mdl-33186529

ABSTRACT

The generation of functional genomics datasets is surging, because they provide insight into gene regulation and organismal phenotypes (e.g., genes upregulated in cancer). The intent behind functional genomics experiments is not necessarily to study genetic variants, yet they pose privacy concerns due to their use of next-generation sequencing. Moreover, there is a great incentive to broadly share raw reads for better statistical power and general research reproducibility. Thus, we need new modes of sharing beyond traditional controlled-access models. Here, we develop a data-sanitization procedure allowing raw functional genomics reads to be shared while minimizing privacy leakage, enabling principled privacy-utility trade-offs. Our protocol works with traditional Illumina-based assays and newer technologies such as 10x single-cell RNA sequencing. It involves quantifying the privacy leakage in reads by statistically linking study participants to known individuals. We carried out these linkages using data from highly accurate reference genomes and more realistic environmental samples.


Subjects
Computer Security, Genomics, Privacy, Human Genome, Genotype, High-Throughput Nucleotide Sequencing, Humans, Phenotype, Phylogeny, Reproducibility of Results, RNA Sequence Analysis, Single-Cell Analysis
2.
Nature ; 611(7936): 532-539, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36323788

ABSTRACT

Neuropsychiatric disorders classically lack defining brain pathologies, but recent work has demonstrated dysregulation at the molecular level, characterized by transcriptomic and epigenetic alterations [1-3]. In autism spectrum disorder (ASD), this molecular pathology involves the upregulation of microglial, astrocyte and neural-immune genes, the downregulation of synaptic genes, and attenuation of gene-expression gradients in cortex [1,2,4-6]. However, whether these changes are limited to cortical association regions or are more widespread remains unknown. To address this issue, we performed RNA-sequencing analysis of 725 brain samples spanning 11 cortical areas from 112 post-mortem samples from individuals with ASD and neurotypical controls. We find widespread transcriptomic changes across the cortex in ASD, exhibiting an anterior-to-posterior gradient, with the greatest differences in primary visual cortex, coincident with an attenuation of the typical transcriptomic differences between cortical regions. Single-nucleus RNA-sequencing and methylation profiling demonstrate that this robust molecular signature reflects changes in cell-type-specific gene expression, particularly affecting excitatory neurons and glia. Both rare and common ASD-associated genetic variation converge within a downregulated co-expression module involving synaptic signalling, and common variation alone is enriched within a module of upregulated protein chaperone genes. These results highlight widespread molecular changes across the cerebral cortex in ASD, extending beyond association cortex to broadly involve primary sensory regions.


Subjects
Autism Spectrum Disorder, Cerebral Cortex, Genetic Variation, Transcriptome, Humans, Autism Spectrum Disorder/genetics, Autism Spectrum Disorder/metabolism, Autism Spectrum Disorder/pathology, Cerebral Cortex/metabolism, Cerebral Cortex/pathology, Neurons/metabolism, RNA/analysis, RNA/genetics, Transcriptome/genetics, Autopsy, RNA Sequence Analysis, Primary Visual Cortex/metabolism, Neuroglia/metabolism
3.
Genome Res ; 2023 Dec 14.
Article in English | MEDLINE | ID: mdl-38097386

ABSTRACT

Single nucleotide polymorphisms (SNPs) from omics data create a reidentification risk for individuals and their relatives. Although the ability of thousands of SNPs (especially rare ones) to identify individuals has been repeatedly shown, the availability of small sets of noisy genotypes, from environmental DNA samples or functional genomics data, motivated us to quantify their informativeness. We present a computational tool suite, termed Privacy Leakage by Inference across Genotypic HMM Trajectories (PLIGHT), using population-genetics-based hidden Markov models (HMMs) of recombination and mutation to find piecewise alignment of small, noisy SNP sets to reference haplotype databases. We explore cases in which query individuals are either known to be in the database, or not, and consider several genotype queries, including those from environmental sample swabs from known individuals and from simulated "mosaics" (two-individual composites). Using PLIGHT on a database with ∼5000 haplotypes, we find for common, noise-free SNPs that only ten are sufficient to identify individuals, ∼20 can identify both components in two-individual mosaics, and 20-30 can identify first-order relatives. Using noisy environmental-sample-derived SNPs, PLIGHT identifies individuals in a database using ∼30 SNPs. Even when the individuals are not in the database, local genotype matches allow for some phenotypic information leakage based on coarse-grained SNP imputation. Finally, by quantifying privacy leakage from sparse SNP sets, PLIGHT helps determine the value of selectively sanitizing released SNPs without explicit assumptions about population membership or allele frequency. To make this practical, we provide a sanitization tool to remove the most identifying SNPs from genomic data.
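
The piecewise alignment of a small SNP set to a haplotype panel described above can be illustrated with a toy Viterbi search under a simplified haplotype-copying HMM (in the spirit of Li-Stephens-style models). This is a minimal sketch, not PLIGHT's implementation: the function name, the `switch_prob` and `mismatch_prob` parameters, the haploid query, and the two-haplotype panel are all assumptions made for illustration.

```python
import math

def viterbi_copying_path(query, panel, switch_prob=0.01, mismatch_prob=0.01):
    """Return the most likely sequence of panel haplotype indices copied by `query`.

    Hypothetical toy model: at each site the query copies one panel haplotype,
    switches haplotype with probability `switch_prob` (a stand-in for
    recombination), and differs from the copied allele with probability
    `mismatch_prob` (a stand-in for mutation or genotyping noise).
    """
    n_hap, n_site = len(panel), len(query)
    log_stay = math.log(1 - switch_prob + switch_prob / n_hap)
    log_switch = math.log(switch_prob / n_hap)

    def log_emit(hap, site):
        match = panel[hap][site] == query[site]
        return math.log(1 - mismatch_prob) if match else math.log(mismatch_prob)

    # Uniform prior over which haplotype is copied at the first site.
    score = [math.log(1 / n_hap) + log_emit(h, 0) for h in range(n_hap)]
    back = []
    for site in range(1, n_site):
        best_prev = max(range(n_hap), key=lambda h: score[h])
        new_score, pointers = [], []
        for h in range(n_hap):
            stay = score[h] + log_stay
            jump = score[best_prev] + log_switch
            if stay >= jump:
                new_score.append(stay + log_emit(h, site))
                pointers.append(h)
            else:
                new_score.append(jump + log_emit(h, site))
                pointers.append(best_prev)
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = max(range(n_hap), key=lambda h: score[h])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

# Toy panel of two haplotypes; the query is a mosaic of the two.
toy_panel = [[0, 0, 0, 0, 0, 0], [1, 1, 1, 1, 1, 1]]
toy_query = [0, 0, 0, 1, 1, 1]
mosaic_path = viterbi_copying_path(toy_query, toy_panel)
```

In this toy case the recovered path copies the first haplotype for the first three sites and the second for the remainder, which is the kind of piecewise match the abstract refers to; a real analysis would use population-derived recombination and mutation rates rather than fixed constants.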

4.
Mol Psychiatry ; 2024 Jun 15.
Article in English | MEDLINE | ID: mdl-38879719

ABSTRACT

Substance use disorders (SUD) and drug addiction are major threats to public health, impacting not only the millions of individuals struggling with SUD, but also surrounding families and communities. One of the seminal challenges in treating and studying addiction in human populations is the high prevalence of co-morbid conditions, including an increased risk of contracting a human immunodeficiency virus (HIV) infection. Of the ~15 million people who inject drugs globally, 17% are persons with HIV. Conversely, HIV is a risk factor for SUD because chronic pain syndromes, often encountered in persons with HIV, can lead to an increased use of opioid pain medications that in turn can increase the risk for opioid addiction. We hypothesize that SUD and HIV exert shared effects on brain cell types, including adaptations related to neuroplasticity, neurodegeneration, and neuroinflammation. Basic research is needed to refine our understanding of these affected cell types and adaptations. Studying the effects of SUD in the context of HIV at the single-cell level represents a compelling strategy to understand the reciprocal interactions among both conditions, made feasible by the availability of large, extensively-phenotyped human brain tissue collections that have been amassed by the Neuro-HIV research community. In addition, sophisticated animal models that have been developed for both conditions provide a means to precisely evaluate specific exposures and stages of disease. We propose that single-cell genomics is a uniquely powerful technology to characterize the effects of SUD and HIV in the brain, integrating data from human cohorts and animal models. We have formed the Single-Cell Opioid Responses in the Context of HIV (SCORCH) consortium to carry out this strategy.

5.
Bioinformatics ; 39(1)2023 01 01.
Article in English | MEDLINE | ID: mdl-36477833

ABSTRACT

MOTIVATION: While many quantum computing (QC) methods promise theoretical advantages over classical counterparts, quantum hardware remains limited. Exploiting near-term QC in computer-aided drug design (CADD) thus requires judicious partitioning between classical and quantum calculations. RESULTS: We present HypaCADD, a hybrid classical-quantum workflow for finding ligands binding to proteins, while accounting for genetic mutations. We explicitly identify modules of our drug-design workflow currently amenable to replacement by QC: non-intuitively, we identify the mutation-impact predictor as the best candidate. HypaCADD thus combines classical docking and molecular dynamics with quantum machine learning (QML) to infer the impact of mutations. We present a case study with the coronavirus (SARS-CoV-2) protease and associated mutants. We map a classical machine-learning module onto QC, using a neural network constructed from qubit-rotation gates. We have implemented this in simulation and on two commercial quantum computers. We find that the QML models can perform on par with, if not better than, classical baselines. In summary, HypaCADD offers a successful strategy for leveraging QC for CADD. AVAILABILITY AND IMPLEMENTATION: Jupyter Notebooks with Python code are freely available for academic use on GitHub: https://www.github.com/hypahub/hypacadd_notebook. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
COVID-19, Software, Humans, Workflow, Computing Methodologies, Quantum Theory, SARS-CoV-2, Drug Design, Molecular Dynamics Simulation
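
The abstract above mentions a neural network constructed from qubit-rotation gates. As a toy illustration of that general idea only, the sketch below classically simulates a single qubit: one RY rotation encodes an input feature, a second RY rotation acts as a trainable weight, and the probability of measuring 1 serves as the model output. This is not the HypaCADD circuit; the function names, the single-qubit layout, and the angle encoding are assumptions.

```python
import math

def ry(theta):
    """2x2 matrix of the single-qubit RY(theta) rotation gate."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]

def apply_gate(gate, state):
    """Apply a 2x2 gate to a single-qubit state [amp0, amp1]."""
    return [
        gate[0][0] * state[0] + gate[0][1] * state[1],
        gate[1][0] * state[0] + gate[1][1] * state[1],
    ]

def predict_prob(feature_angle, weight_angle):
    """Probability of measuring |1> after encoding and weight rotations.

    Hypothetical toy model: start in |0>, rotate by the feature angle,
    then by a trainable weight angle.
    """
    state = [1.0, 0.0]
    state = apply_gate(ry(feature_angle), state)
    state = apply_gate(ry(weight_angle), state)
    return state[1] ** 2

example_output = predict_prob(math.pi / 2, 0.0)
```

Because successive RY rotations add their angles, the output here equals sin²((feature + weight)/2); training such a model amounts to adjusting the weight angle so that this probability tracks the desired label.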
6.
Bioinformatics ; 39(2)2023 02 03.
Article in English | MEDLINE | ID: mdl-36692135

ABSTRACT

MOTIVATION: MHC Class I proteins play an important role in immunotherapy by presenting immunogenic peptides to anti-tumor immune cells. The repertoires of peptides for various MHC Class I proteins are distinct, which is reflected in their diverse binding motifs. To characterize binding motifs for MHC Class I proteins, in vitro experiments have been conducted to screen peptides with high binding affinities to hundreds of given MHC Class I proteins. However, given the tens of thousands of known MHC Class I proteins, conducting in vitro experiments across all of them is infeasible, and thus a more efficient and scalable way to characterize binding motifs is needed. RESULTS: We present a de novo generation framework, named PepPPO, to characterize the binding motif of any given MHC Class I protein by generating the repertoire of peptides it presents. PepPPO leverages a reinforcement learning agent with a mutation policy to mutate random input peptides into positively presented ones. Using PepPPO, we characterized binding motifs for around 10 000 known human MHC Class I proteins with and without experimental data. These computed motifs demonstrated high similarity to those derived from experimental data. In addition, we found that the motifs could be used for the rapid screening of neoantigens at a much lower time cost than previous deep-learning methods. AVAILABILITY AND IMPLEMENTATION: The software can be found at https://github.com/minrq/pMHC. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Histocompatibility Antigens Class I, Peptides, Humans, Protein Binding, Peptides/chemistry, Histocompatibility Antigens Class I/metabolism, Software
8.
Langmuir ; 34(29): 8678-8684, 2018 07 24.
Article in English | MEDLINE | ID: mdl-27039990

ABSTRACT

Diatoms are unicellular algae that construct cell walls called frustules by the precipitation of silica, using special proteins that order the silica into a wide variety of nanostructures. The diatom species Cylindrotheca fusiformis contains proteins called silaffins within its frustules, which are believed to assemble into supramolecular matrices that serve as both accelerators and templates for silica deposition. Studying the properties of these biosilicification proteins has allowed the design of new protein and peptide systems that generate customizable silica nanostructures, with potential generalization to other mineral systems. It is essential to understand the mechanisms of aggregation of the protein and its coprecipitation with silica. We continue previous investigations into the peptide R5, derived from silaffin protein sil1p, shown to independently catalyze the precipitation of silica nanospheres in vitro. We used the solid-state NMR technique 13C{29Si} and 15N{29Si} REDOR to investigate the structure and interactions of R5 in complex with coprecipitated silica. These experiments are sensitive to the strength of magnetic dipole-dipole interactions between the 13C nuclei in R5 and the 29Si nuclei in the silica and thus yield distances between parts of R5 and 29Si in silica. Our data show strong interactions and short internuclear distances of 3.74 ± 0.20 Å between 13C═O Lys3 and silica. On the other hand, the Cα and Cβ nuclei show little or no interaction with 29Si. This selective proximity between the K3 C═O and the silica supports a previously proposed mechanism of rapid silicification of the antimicrobial peptide KSL (KKVVFKVKFK) through an imidate intermediate. This study reports for the first time a direct interaction between the N-terminus of R5 and silica, leading us to believe that the N-terminus of R5 is a key component in the molecular recognition process and a major factor in silica morphogenesis.


Subjects
Diatoms/metabolism, Lysine/chemistry, Lysine/metabolism, Magnetic Resonance Spectroscopy, Nanostructures/chemistry, Silicon Dioxide/metabolism, Diatoms/chemistry, Peptides/chemistry, Proteins/chemistry, Silicon Dioxide/chemistry
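
REDOR experiments such as the one in the abstract above yield distances because the heteronuclear dipolar coupling scales as 1/r³. The sketch below shows only that textbook relation for an isolated 13C-29Si spin pair, using standard physical constants; it is not the authors' fitting procedure, and the function names are assumptions. Real REDOR analysis fits a full dephasing curve and may involve several 29Si neighbours.

```python
import math

MU0_OVER_4PI = 1e-7        # vacuum permeability / (4*pi), in T*m/A
HBAR = 1.054571817e-34     # reduced Planck constant, in J*s
GAMMA_13C = 67.2828e6      # 13C gyromagnetic ratio, in rad s^-1 T^-1
GAMMA_29SI = -53.190e6     # 29Si gyromagnetic ratio, in rad s^-1 T^-1

# Prefactor such that coupling (Hz) = PREFACTOR / r^3 with r in metres.
PREFACTOR = MU0_OVER_4PI * abs(GAMMA_13C * GAMMA_29SI) * HBAR / (2 * math.pi)

def dipolar_coupling_hz(distance_angstrom):
    """Magnitude of the 13C-29Si dipolar coupling (Hz) for an isolated spin pair."""
    r_m = distance_angstrom * 1e-10
    return PREFACTOR / r_m ** 3

def distance_from_coupling(coupling_hz):
    """Invert the relation: internuclear distance (angstrom) from a coupling in Hz."""
    return (PREFACTOR / coupling_hz) ** (1.0 / 3.0) * 1e10

coupling_at_reported_distance = dipolar_coupling_hz(3.74)
```

For the 3.74 Å distance quoted in the abstract this gives a coupling of roughly 115 Hz, and the inverse-cube dependence means the reported ±0.20 Å uncertainty corresponds to a sizeable relative change in coupling.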
9.
J Am Chem Soc ; 136(32): 11402-11, 2014 Aug 13.
Article in English | MEDLINE | ID: mdl-25054469

ABSTRACT

Extracellular matrix proteins adsorbed onto mineral surfaces exist in a unique environment where the structure and dynamics of the protein can be altered profoundly. To further elucidate how the mineral surface impacts molecular properties, we perform a comparative study of the dynamics of nonpolar side chains within the mineral-recognition domain of the biomineralization protein salivary statherin adsorbed onto its native hydroxyapatite (HAP) mineral surface versus the dynamics displayed by the native protein in the hydrated solid state. Specifically, the dynamics of phenylalanine side chains (viz., F7 and F14) located in the surface-adsorbed 15-amino acid HAP-recognition fragment (SN15: DpSpSEEKFLRRIGRFG) are studied using deuterium magic angle spinning (2H MAS) line shape and spin-lattice relaxation measurements. 2H NMR MAS spectra and T1 relaxation times obtained from the deuterated phenylalanine side chains in free and HAP-adsorbed SN15 are fitted to models where the side chains are assumed to exchange between rotameric states and where the exchange rates and a priori rotameric state populations are varied iteratively. In condensed proteins, phenylalanine side-chain dynamics are dominated by 180° flips of the phenyl ring, i.e., the "π flip". However, for both F7 and F14, the number of exchanging side-chain rotameric states increases in the HAP-bound complex relative to the unbound solid sample, indicating that increased dynamic freedom accompanies introduction of the protein into the biofilm state. The observed rotameric exchange dynamics in the HAP-bound complex are on the order of 5-6 × 10⁶ s⁻¹, as determined from the deuterium MAS line shapes. The dynamics in the HAP-bound complex are also shown to have some solution-like behavioral characteristics, with some interesting deviations from rotameric library statistics.


Subjects
Durapatite/chemistry, Peptides/chemistry, Phenylalanine/chemistry, Salivary Proteins and Peptides/chemistry, Adsorption, Algorithms, Biofilms, Computer Simulation, Magnetic Resonance Spectroscopy, Molecular Models, Motion, Protein Secondary Structure, Saliva/metabolism, Solutions, Surface Properties
10.
Langmuir ; 30(24): 7152-61, 2014 Jun 24.
Article in English | MEDLINE | ID: mdl-24896500

ABSTRACT

The use of biomimetic approaches in the production of inorganic nanostructures is of great interest to the scientific and industrial community due to the relatively moderate physical conditions needed. In this vein, taking cues from silaffin proteins used by unicellular diatoms, several studies have identified peptide candidates for the production of silica nanostructures. In the current article, we study intensively one such silica-precipitating peptide, LKα14 (Ac-LKKLLKLLKKLLKL-c), an amphiphilic lysine/leucine repeat peptide that self-organizes into an α-helical secondary structure under appropriate concentration and buffer conditions. The suggested mechanism of precipitation is that the sequestration of hydrophilic lysines on one side of this helix allows interaction with the negatively charged surface of silica nanoparticles, which in turn can aggregate further into larger structures. To investigate the process, we carry out 1D and 2D solid-state NMR (ssNMR) studies on samples with one or two uniformly 13C- and 15N-labeled residues to determine the backbone and side-chain chemical shifts. We also further study the dynamics of two leucine residues in the sequence through 13C spin-lattice relaxation times (T1) to determine the impact of silica coprecipitation on their mobility. Our results confirm the α-helical secondary structure in both the neat and silica-complexed states of the peptide, and the patterns of chemical shift and relaxation time changes between the two states suggest possible mechanisms of self-aggregation and silica precipitation.


Subjects
Leucine/chemistry, Lysine/chemistry, Peptides/chemistry, Silicon Dioxide/chemistry, Hydrophobic and Hydrophilic Interactions, Magnetic Resonance Spectroscopy
11.
Science ; 384(6698): eadi5199, 2024 May 24.
Article in English | MEDLINE | ID: mdl-38781369

ABSTRACT

Single-cell genomics is a powerful tool for studying heterogeneous tissues such as the brain. Yet little is understood about how genetic variants influence cell-level gene expression. Addressing this, we uniformly processed single-nuclei, multiomics datasets into a resource comprising >2.8 million nuclei from the prefrontal cortex across 388 individuals. For 28 cell types, we assessed population-level variation in expression and chromatin across gene families and drug targets. We identified >550,000 cell type-specific regulatory elements and >1.4 million single-cell expression quantitative trait loci, which we used to build cell-type regulatory and cell-to-cell communication networks. These networks manifest cellular changes in aging and neuropsychiatric disorders. We further constructed an integrative model accurately imputing single-cell expression and simulating perturbations; the model prioritized ~250 disease-risk genes and drug targets with associated cell types.


Subjects
Brain, Gene Regulatory Networks, Mental Disorders, Single-Cell Analysis, Humans, Aging/genetics, Brain/metabolism, Cell Communication/genetics, Chromatin/metabolism, Chromatin/genetics, Genomics, Mental Disorders/genetics, Prefrontal Cortex/metabolism, Prefrontal Cortex/physiology, Quantitative Trait Loci
12.
bioRxiv ; 2024 Mar 30.
Article in English | MEDLINE | ID: mdl-38562822

ABSTRACT

Single-cell genomics is a powerful tool for studying heterogeneous tissues such as the brain. Yet, little is understood about how genetic variants influence cell-level gene expression. Addressing this, we uniformly processed single-nuclei, multi-omics datasets into a resource comprising >2.8M nuclei from the prefrontal cortex across 388 individuals. For 28 cell types, we assessed population-level variation in expression and chromatin across gene families and drug targets. We identified >550K cell-type-specific regulatory elements and >1.4M single-cell expression-quantitative-trait loci, which we used to build cell-type regulatory and cell-to-cell communication networks. These networks manifest cellular changes in aging and neuropsychiatric disorders. We further constructed an integrative model accurately imputing single-cell expression and simulating perturbations; the model prioritized ~250 disease-risk genes and drug targets with associated cell types.

13.
bioRxiv ; 2023 Jul 06.
Article in English | MEDLINE | ID: mdl-37461642

ABSTRACT

The functional properties of the human brain arise, in part, from the vast assortment of cell types that pattern the cortex. The cortical sheet can be broadly divided into distinct networks, which are further embedded into processing streams, or gradients, that extend from unimodal systems through higher-order association territories. Here, using transcriptional data from the Allen Human Brain Atlas, we demonstrate that imputed cell type distributions are spatially coupled to the functional organization of cortex, as estimated through fMRI. Cortical cellular profiles follow the macro-scale organization of the functional gradients as well as the associated large-scale networks. Distinct cellular fingerprints were evident across networks, and a classifier trained on post-mortem cell-type distributions was able to predict the functional network allegiance of cortical tissue samples. These data indicate that the in vivo organization of the cortical sheet is reflected in the spatial variability of its cellular composition.

14.
J Phys Chem A ; 115(44): 12055-69, 2011 Nov 10.
Article in English | MEDLINE | ID: mdl-21870804

ABSTRACT

Solution NMR spectroscopy can elucidate many features of the structure and dynamics of macromolecules, yet relaxation measurements, the most common source of experimental information on dynamics, can sample only certain ranges of dynamic rates. A complete characterization of motion of a macromolecule thus requires the introduction of complementary experimental approaches. Solid-state NMR spectroscopy successfully probes the time scale of nanoseconds to microseconds, a dynamic window where solution NMR results have been deficient, and probes conditions where the averaging effects of rotational diffusion of the molecule are absent. Combining the results of the two distinct techniques within a single framework provides greater insight into dynamics, but this task requires the common interpretation of results recorded under very different experimental conditions. Herein, we provide a unified description of dynamics that is robust to the presence of large-scale conformational exchange, where the diffusion tensor of the molecule varies on a time scale comparable to rotational diffusion in solution. We apply this methodology to the HIV-1 TAR RNA molecule, where conformational rearrangements are both substantial and functionally important. The formalism described herein is of greater generality than earlier combined solid-state/solution NMR interpretations, if detailed molecular structures are available, and can offer a more complete description of RNA dynamics than either solution or solid-state NMR spectroscopy alone.


Subjects
Magnetic Resonance Spectroscopy/methods, Molecular Models, RNA/chemistry, Diffusion, HIV Long Terminal Repeat, HIV-1, Motion, Viral RNA/chemistry, Rotation
15.
Nat Commun ; 11(1): 3696, 2020 07 29.
Article in English | MEDLINE | ID: mdl-32728046

ABSTRACT

ENCODE comprises thousands of functional genomics datasets, and the encyclopedia covers hundreds of cell types, providing a universal annotation for genome interpretation. However, for particular applications, it may be advantageous to use a customized annotation. Here, we develop such a custom annotation by leveraging advanced assays, such as eCLIP, Hi-C, and whole-genome STARR-seq on a number of data-rich ENCODE cell types. A key aspect of this annotation is comprehensive and experimentally derived networks of both transcription factors and RNA-binding proteins (TFs and RBPs). Cancer, a disease of system-wide dysregulation, is an ideal application for such a network-based annotation. Specifically, for cancer-associated cell types, we put regulators into hierarchies and measure their network change (rewiring) during oncogenesis. We also extensively survey TF-RBP crosstalk, highlighting how SUB1, a previously uncharacterized RBP, drives aberrant tumor expression and amplifies the effect of MYC, a well-known oncogenic TF. Furthermore, we show how our annotation allows us to place oncogenic transformations in the context of a broad cell space; here, many normal-to-tumor transitions move towards a stem-like state, while oncogene knockdowns show an opposing trend. Finally, we organize the resource into a coherent workflow to prioritize key elements and variants, in addition to regulators. We showcase the application of this prioritization to somatic burdening, cancer differential expression and GWAS. Targeted validations of the prioritized regulators, elements and variants using siRNA knockdowns, CRISPR-based editing, and luciferase assays demonstrate the value of the ENCODE resource.


Subjects
Genetic Databases, Genomics, Neoplasms/genetics, Tumor Cell Line, Neoplastic Cell Transformation/genetics, Gene Regulatory Networks, Humans, Mutation/genetics, Reproducibility of Results, Transcription Factors/metabolism
16.
J Phys Chem B ; 123(51): 10915-10929, 2019 12 26.
Article in English | MEDLINE | ID: mdl-31769684

ABSTRACT

Interpreting dynamics in solid-state molecular systems requires characterization of the potentially heterogeneous environmental contexts of molecules. In particular, the analysis of solid-state nuclear magnetic resonance (ssNMR) data to elucidate molecular dynamics (MD) involves modeling the restriction to overall tumbling by neighbors, as well as the concentrations of water and buffer. In this exploration of the factors that influence motion, we utilize atomistic MD trajectories of peptide aggregates with varying hydration to mimic an amorphous solid-state environment and predict ssNMR relaxation rates. We also account for spin diffusion in multiply spin-labeled (up to 19 nuclei) residues, with several models of dipolar-coupling networks. The framework serves as a general approach to determine essential spin couplings affecting relaxation, benchmark MD force fields, and reveal the hydration dependence of dynamics in a crowded environment. We demonstrate the methodology on a previously characterized amphiphilic 14-residue lysine-leucine repeat peptide, LKα14 (Ac-LKKLLKLLKKLLKL-c), which has an α-helical secondary structure and putatively forms leucine-burying tetramers in the solid state. We measure the R1 relaxation rates of uniformly 13C-labeled and site-specific 2H-labeled leucines in the hydrophobic core of LKα14 at multiple hydration levels. 
Studies of 9 and 18 tetramer bundles reveal the following: (a) for the incoherent component of 13C relaxation, the nearest-neighbor spin interactions dominate, while the 1H-1H interactions have minimal impact; (b) the AMBER ff14SB dihedral barriers for the leucine Cγ-Cδ bond ("methyl rotation barriers") must be lowered by a factor of 0.7 to better match the 2H data; (c) proton-driven spin diffusion explains some of the discrepancy between experimental and simulated rates for the Cβ and Cα nuclei; and (d) 13C relaxation rates are mostly underestimated in the MD simulations at all hydrations, and the discrepancies identify likely motions missing in the 50 ns MD trajectories.


Subjects
Leucine/chemistry, Lysine/chemistry, Molecular Dynamics Simulation, Biomolecular Nuclear Magnetic Resonance/methods, Peptides/chemistry, Hydrophobic and Hydrophilic Interactions, Alpha-Helical Protein Conformation
17.
Structure ; 27(9): 1469-1481.e3, 2019 09 03.
Article in English | MEDLINE | ID: mdl-31279629

ABSTRACT

A key issue in drug design is how population variation affects drug efficacy by altering binding affinity (BA) in different individuals, an essential consideration for government regulators. Ideally, we would like to evaluate the BA perturbations of millions of single-nucleotide variants (SNVs). However, only hundreds of protein-drug complexes with SNVs have experimentally characterized BAs, constituting too small a gold standard for straightforward statistical model training. Thus, we take a hybrid approach: using physically based calculations to bootstrap the parameterization of a full model. In particular, we do 3D structure-based docking on ∼10,000 SNVs modifying known protein-drug complexes to construct a pseudo gold standard. Then we use this augmented set of BAs to train a statistical model combining structure, ligand and sequence features and illustrate how it can be applied to millions of SNVs. Finally, we show that our model has good cross-validated performance (97% AUROC) and can also be validated by orthogonal ligand-binding data.


Subjects
Computational Biology/methods, Single Nucleotide Polymorphism, Proteins/chemistry, Proteins/genetics, Protein Databases, Drug Design, Humans, Ligands, Machine Learning, Statistical Models, Molecular Docking Simulation, Protein Binding, Protein Conformation, Proteins/metabolism
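
The cross-validated performance quoted above is an AUROC, which can be read as the probability that a randomly chosen positive example is scored above a randomly chosen negative one. The sketch below computes that quantity directly from pairwise comparisons. It is a generic illustration on made-up labels and scores, not the authors' evaluation code; the function name and the toy data are assumptions.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    `labels` are 0/1 class labels and `scores` are model outputs; ties between
    a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                wins += 1.0
            elif pos_score == neg_score:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical toy data: 1 = variant alters binding affinity, 0 = it does not.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auroc = auroc(toy_labels, toy_scores)
```

Here three of the four positive-negative pairs are ranked correctly, so the toy AUROC is 0.75; the quadratic pairwise loop is fine for small sets, while rank-based formulas are the usual choice for large ones.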
18.
Science ; 362(6420)2018 12 14.
Article in English | MEDLINE | ID: mdl-30545857

ABSTRACT

Despite progress in defining genetic risk for psychiatric disorders, their molecular mechanisms remain elusive. Addressing this, the PsychENCODE Consortium has generated a comprehensive online resource for the adult brain across 1866 individuals. The PsychENCODE resource contains ~79,000 brain-active enhancers, sets of Hi-C linkages, and topologically associating domains; single-cell expression profiles for many cell types; expression quantitative-trait loci (QTLs); and further QTLs associated with chromatin, splicing, and cell-type proportions. Integration shows that varying cell-type proportions largely account for the cross-population variation in expression (with >88% reconstruction accuracy). It also allows building of a gene regulatory network, linking genome-wide association study variants to genes (e.g., 321 for schizophrenia). We embed this network into an interpretable deep-learning model, which improves disease prediction by ~6-fold versus polygenic risk scores and identifies key genes and pathways in psychiatric disorders.


Subjects
Brain/metabolism, Gene Expression Regulation, Mental Disorders/genetics, Datasets as Topic, Deep Learning, Genetic Enhancer Elements, Genetic Epigenesis, Epigenomics, Gene Regulatory Networks, Genome-Wide Association Study, Humans, Quantitative Trait Loci, Single-Cell Analysis, Transcriptome
20.
J Phys Chem B ; 121(1): 110-117, 2017 01 12.
Article in English | MEDLINE | ID: mdl-27930881

ABSTRACT

Intrinsic motions may allow HIV-1 transactivation response (TAR) RNA to change its conformation to form a functional complex with the Tat protein, which is essential for viral replication. Understanding the dynamic properties of TAR necessitates determining motion on the intermediate nanosecond-to-microsecond time scale. To this end, we performed solid-state deuterium NMR line-shape and T1Z relaxation-time experiments to measure intermediate motions for two uridine residues, U40 and U42, within the lower helix of TAR. We infer global motions at rates of ∼10⁵ s⁻¹ in the lower helix, which are much slower than those in the upper helix (∼10⁶ s⁻¹), indicating that the two helical domains reorient independently of one another in the solid-state sample. These results contribute to the aim of fully describing the properties of functional motions in TAR RNA.


Subjects
Viral RNA/chemistry, Deuterium, HIV Long Terminal Repeat, Magnetic Resonance Spectroscopy