Results 1 - 20 of 73,024
1.
Plant Genome; 13(1): e20014, 2020 Mar.
Article in English | MEDLINE | ID: mdl-33016635

ABSTRACT

Genomic prediction (GP) might be an efficient way to improve haploid induction rate (HIR) and to reduce the laborious and time-consuming task of phenotypic selection for HIR in maize (Zea mays L.). In this study, we evaluated GP accuracies for HIR and other agronomic traits of importance to inducers by independent and cross-validation. We propose the use of GP for cross prediction and parental selection in the development of new inducer breeding populations. A panel of 159 inducers from Iowa State University (ISU set) was genotyped and phenotyped for HIR and several agronomic traits. Data from an independent set of 53 inducers evaluated by the University of Hohenheim (UOH set) were used for independent validation. The HIR ranged from 0.61 to 20.74% and exhibited high heritability (0.90). High cross-validation prediction accuracy was observed for HIR (r = 0.82), whereas for other traits it ranged from 0.36 (self-induction rate) to 0.74 (days to anthesis). Prediction accuracies across different sets were higher when the larger panel (ISU set) was used as a training population (r = 0.54). The average HIR of the 12,561 superior predicted progenies (µSP) ranged from 1.00 to 18.36% and was closely related to the corresponding midparent genomic estimated breeding value (GEBV). A predicted genetic variance (VG) of reduced magnitude was observed in the twenty crosses with highest midparent GEBV or µSP for HIR. Our results indicate that although GP is a useful tool for parental selection, decisions about which cross combinations should be pursued need to be based on optimal trade-offs between maximizing both µSP and VG.


Subjects
Models, Genetic; Zea mays; Genome; Genomics; Haploidy; Zea mays/genetics
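
As a rough illustration of the cross-prediction idea in the abstract of entry 1, the sketch below enumerates every pairwise cross among a set of inducers and ranks the crosses by midparent GEBV. The inducer names, the GEBV values, and the helper `rank_crosses` are hypothetical and are not taken from the paper. One detail can be checked directly: 159 inducers give 12,561 distinct pairwise crosses, which equals the progeny count quoted in the abstract, although the abstract does not itself state that connection.

```python
from itertools import combinations
from math import comb


def rank_crosses(gebv):
    """Return (parent_a, parent_b, midparent_gebv) tuples, best cross first."""
    crosses = [
        (a, b, (gebv[a] + gebv[b]) / 2.0)
        for a, b in combinations(sorted(gebv), 2)
    ]
    # Sort by midparent GEBV, highest first.
    return sorted(crosses, key=lambda c: c[2], reverse=True)


# Hypothetical GEBVs for HIR (%), for illustration only.
example_gebv = {"ind_A": 12.0, "ind_B": 8.0, "ind_C": 15.0, "ind_D": 4.0}
ranked = rank_crosses(example_gebv)
best_cross = ranked[0]          # cross with the highest midparent GEBV
n_crosses_159 = comb(159, 2)    # pairwise crosses among 159 inducers
```

Midparent GEBV only describes the expected progeny mean. The abstract stresses that the predicted genetic variance of a cross matters as well, so a ranking like this would be one input to a decision rather than the decision itself.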
2.
Plant Genome; 13(1): e20002, 2020 Mar.
Article in English | MEDLINE | ID: mdl-33016638

ABSTRACT

Genomic selection (GS) is a marker-based selection initially suggested for livestock breeding and is being encouraged for crop breeding. Several statistical models are used to implement GS; however, none have been tested for use in lentil (Lens culinaris Medik.) breeding. This study was conducted to compare the accuracy of different GS models and prediction scenarios based on empirical data and to make recommendations for designing genomic selection strategies for lentil breeding. We evaluated nine single-trait (ST) models, two multiple-trait (MT) models, and a model that incorporates genotype × environment interaction (GEI) using populations from a lentil diversity panel and two recombinant inbred lines (RILs). The lines in all populations were phenotyped for five phenological traits and genotyped using a custom exome capture assay. Within-population, across-population, and across-environment genomic predictions were made. Prediction accuracy varied among the evaluated models, populations, prediction scenarios, and traits. Single-trait models showed similar accuracy in the absence of large effect quantitative trait loci (QTL) but BayesB outperformed all models when there were QTL with relatively large effects. Models that accounted for GEI and MT-GS models increased prediction accuracy for a low heritability trait by up to 66 and 14%, respectively. Moderate to high accuracies were obtained for within-population (range of .36-.85) and across-environment (range of .19-.89) predictions but across-population prediction accuracy was very low. Results suggest that GS can be implemented in lentil breeding to make predictions within populations and across environments, but across-population prediction should not be considered when the population size is small.


Subjects
Lens Plant; Breeding; Genomics; Lens Plant/genetics; Models, Genetic; Selection, Genetic
3.
Nat Commun; 11(1): 4703, 2020 Sep 17.
Article in English | MEDLINE | ID: mdl-32943643

ABSTRACT

Deep learning models have shown great promise in predicting regulatory effects from DNA sequence, but their informativeness for human complex diseases is not fully understood. Here, we evaluate genome-wide SNP annotations from two previous deep learning models, DeepSEA and Basenji, by applying stratified LD score regression to 41 diseases and traits (average N = 320K), conditioning on a broad set of coding, conserved and regulatory annotations. We aggregated annotations across all (respectively blood or brain) tissues/cell-types in meta-analyses across all (respectively 11 blood or 8 brain) traits. The annotations were highly enriched for disease heritability, but produced only limited conditionally significant results: non-tissue-specific and brain-specific Basenji-H3K4me3 for all traits and brain traits respectively. We conclude that deep learning models have yet to achieve their full potential to provide considerable unique information for complex disease, and that their conditional informativeness for disease cannot be inferred from their accuracy in predicting regulatory annotations.


Subjects
Deep Learning; Disease/genetics; Molecular Sequence Annotation; Alleles; Genetic Predisposition to Disease; Genome, Human; Genome-Wide Association Study; Histones/genetics; Humans; Linkage Disequilibrium; Models, Genetic; Phenotype; Polymorphism, Single Nucleotide
4.
PLoS One; 15(8): e0236226, 2020.
Article in English | MEDLINE | ID: mdl-32866160

ABSTRACT

Amine oxidases (AOs), including copper-containing amine oxidases (CuAOs) and FAD-dependent polyamine oxidases (PAOs), are associated with polyamine catabolism in the peroxisome, apoplast and cytoplasm and play an essential role in growth and developmental processes and in the response to biotic and abiotic stresses. Here, we identified PAO genes in common wheat (Triticum aestivum), T. urartu and Aegilops tauschii and reported the genome organization, evolutionary features and expression profiles of the wheat PAO genes (TaPAO). Expression analysis using publicly available RNA-Seq data showed that TaPAO genes are expressed redundantly in various tissues and developmental stages. A large percentage of TaPAOs respond significantly to abiotic stresses, especially temperature (i.e. heat and cold stress). Some TaPAOs were also involved in the response to other stresses such as powdery mildew, stripe rust and Fusarium infection. Overall, TaPAOs may have various functions in stress tolerance responses and play vital roles in different tissues and developmental stages. Our results provide a reference for further functional investigation of TaPAO proteins.


Subjects
Cold-Shock Response/genetics; Oxidoreductases Acting on CH-NH Group Donors/genetics; Plant Proteins/genetics; Thermotolerance/genetics; Triticum/genetics; Aegilops/enzymology; Aegilops/genetics; Alternative Splicing; Amino Acid Sequence; Datasets as Topic; Evolution, Molecular; Gene Expression Profiling; Gene Expression Regulation, Developmental; Gene Expression Regulation, Plant; Genes, Plant; Genome, Plant; Genome-Wide Association Study; Markov Chains; Models, Genetic; Molecular Weight; Multigene Family; Oxidoreductases Acting on CH-NH Group Donors/chemistry; Oxidoreductases Acting on CH-NH Group Donors/metabolism; Phylogeny; Plant Proteins/chemistry; Plant Proteins/metabolism; Protein Domains/genetics; RNA-Seq; Sequence Alignment; Triticum/enzymology
5.
PLoS One; 15(8): e0237808, 2020.
Article in English | MEDLINE | ID: mdl-32866209

ABSTRACT

In this study, we analyzed the impact of performance enhancing polymorphisms (PEPs) on gymnastic aptitude while considering epistatic effects. Seven PEPs (rs1815739, rs8192678, rs4253778, rs6265, rs5443, rs1076560, rs362584) were considered in a case (gymnasts)-control (sedentary individuals) setting. The study sample comprised two sets of athletes, 27 elite (aged 24.8 ± 2.1 years) and 46 sub-elite (aged 19.7 ± 2.4 years) sportsmen, as well as a control group of 245 sedentary individuals (aged 22.5 ± 2.1 years). The DNA was derived from saliva and PEP alleles were determined by PCR and RT-PCR. Following Multifactor Dimensionality Reduction, logistic regression models were built. The synergistic effect for rs1815739 x rs362584 reached 5.43%. The rs1815739 x rs362584 epistatic regression model exhibited a good fit to the data (Chi-squared = 33.758, p ≈ 0), achieving a significant improvement in sportsmen identification over naïve guessing. The area under the receiver operating characteristic curve was 0.715 (Z-score = 38.917, p ≈ 0). In contrast, the additive ACTN3-SNAP-25 logistic regression model was found to be non-significant. We demonstrate that a gene involved in the differentiation of muscle architecture (ACTN3) and a gene that plays an important role in the nervous system (SNAP-25) interact. From the perspective originally established by the Berlin Academy of Science in 1751, the matter of communication between the brain and muscles via nerves adopts molecular manifestations. Further in vitro investigations are required to explain the molecular details of the rs1815739-rs362584 interaction.


Subjects
Actinin/genetics; Aptitude; Epistasis, Genetic; Gymnastics/physiology; Synaptosomal-Associated Protein 25/genetics; Adult; Alleles; Area Under Curve; Databases, Genetic; Entropy; Female; Genetic Markers; Humans; Logistic Models; Male; Models, Genetic; Multifactor Dimensionality Reduction; Polymorphism, Single Nucleotide/genetics; Young Adult
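
The abstract of entry 5 summarizes classifier quality with the area under the ROC curve (0.715). As a minimal sketch of what that number measures, the function below computes AUC as the probability that a randomly chosen case receives a higher model score than a randomly chosen control, with ties counted as one half. The scores are invented for illustration and the function name `roc_auc` is an assumption; this is not the authors' analysis code.

```python
def roc_auc(scores_cases, scores_controls):
    """AUC as the probability that a random case outscores a random control.

    Ties contribute one half (the Mann-Whitney U formulation).
    """
    wins = 0.0
    for s_case in scores_cases:
        for s_ctrl in scores_controls:
            if s_case > s_ctrl:
                wins += 1.0
            elif s_case == s_ctrl:
                wins += 0.5
    return wins / (len(scores_cases) * len(scores_controls))


# Hypothetical model scores, for illustration only.
cases = [0.9, 0.7, 0.6, 0.4]
controls = [0.8, 0.5, 0.3, 0.2, 0.4]
auc = roc_auc(cases, controls)
```

An AUC of 0.5 corresponds to the naïve guessing mentioned in the abstract, and 1.0 to perfect separation of cases from controls.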
6.
Nat Commun; 11(1): 4876, 2020 Sep 25.
Article in English | MEDLINE | ID: mdl-32978378

ABSTRACT

In most crops, genetic and environmental factors interact in complex ways giving rise to substantial genotype-by-environment interactions (G×E). We propose that computer simulations leveraging field trial data, DNA sequences, and historical weather records can be used to tackle the longstanding problem of predicting cultivars' future performances under largely uncertain weather conditions. We present a computer simulation platform that uses Monte Carlo methods to integrate uncertainty about future weather conditions and model parameters. We use extensive experimental wheat yield data (n = 25,841) to learn G×E patterns and validate, using left-trial-out cross-validation, the predictive performance of the model. Subsequently, we use the fitted model to generate circa 143 million grain yield data points for 28 wheat genotypes in 16 locations in France, over 16 years of historical weather records. The phenotypes generated by the simulation platform have multiple downstream uses; we illustrate this by predicting the distribution of expected yield at 448 cultivar-location combinations and performing means-stability analyses.


Subjects
Computer Simulation; Crops, Agricultural/genetics; Genotype; Uncertainty; Weather; Agriculture/methods; DNA, Plant; Edible Grain/genetics; France; Gene-Environment Interaction; Models, Genetic; Phenotype; Triticum/genetics
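
The abstract of entry 6 describes Monte Carlo integration over uncertain weather and uncertain model parameters. The toy sketch below shows only the general shape of such a loop for a single cultivar at a single location: each draw samples a historical weather year and a parameter value, then evaluates a yield function. The linear yield response, the rainfall figures, and the function name are all hypothetical stand-ins and bear no relation to the paper's actual model.

```python
import random
import statistics


def simulate_yield_distribution(weather_by_year, n_draws, seed=0):
    """Monte Carlo draws of yield for one cultivar at one location.

    Each draw samples a historical weather year and a model parameter,
    then evaluates a toy (hypothetical) linear yield response.
    """
    rng = random.Random(seed)
    years = sorted(weather_by_year)
    draws = []
    for _ in range(n_draws):
        rainfall = weather_by_year[rng.choice(years)]  # weather uncertainty
        slope = rng.gauss(0.01, 0.002)                 # parameter uncertainty
        draws.append(5.0 + slope * rainfall)           # toy yield, t/ha
    return draws


# Hypothetical seasonal rainfall (mm) by year, for illustration only.
weather = {2001: 420.0, 2002: 510.0, 2003: 300.0, 2004: 480.0}
yields = simulate_yield_distribution(weather, n_draws=2000, seed=42)
mean_yield = statistics.mean(yields)
spread = statistics.stdev(yields)
n_combinations = 28 * 16  # genotypes x locations quoted in the abstract
```

Repeating this loop for each of the 448 cultivar-location combinations would give one predictive yield distribution per combination, which is the kind of output the abstract says feeds the downstream stability analyses.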
7.
Nat Commun; 11(1): 4661, 2020 Sep 16.
Article in English | MEDLINE | ID: mdl-32938925

ABSTRACT

Recent years have seen a growing number of studies investigating evolutionary questions using ancient DNA. To address these questions, one of the most frequently used methods is principal component analysis (PCA). When PCA is applied to temporal samples, the sample dates are, however, ignored during analysis, leading to imperfect representations of samples in PC plots. Here, we present a factor analysis (FA) method in which individual scores are corrected for the effect of allele frequency drift over time. We obtained exact solutions for the estimates of corrected factors, and we provided a fast algorithm for their computation. Using computer simulations and ancient European samples, we compared geometric representations obtained from FA with PCA and with ancestry estimation programs. In admixture analyses, FA estimates agreed with tree-based statistics, and they were more accurate than those obtained from PCA projections and from ancestry estimation programs. A great advantage of FA over existing approaches is that it improves descriptive analyses of ancient DNA samples without requiring inclusion of outgroup or present-day samples.


Subjects
DNA, Ancient/analysis; Factor Analysis, Statistical; Genome, Human; Metagenomics/statistics & numerical data; Algorithms; England; Europe; Gene Frequency; Genetic Drift; Genetics, Population/statistics & numerical data; Humans; Models, Genetic; Principal Component Analysis
8.
Nat Commun; 11(1): 4662, 2020 Sep 16.
Article in English | MEDLINE | ID: mdl-32938926

ABSTRACT

Haplotype reconstruction of distant genetic variants remains an unsolved problem due to the short read length of common sequencing data. Here, we introduce HapTree-X, a probabilistic framework that utilizes latent long-range information to reconstruct unspecified haplotypes in diploid and polyploid organisms. It introduces the observation that differential allele-specific expression can link genetic variants from the same physical chromosome, thus even enabling the use of reads that cover only individual variants. We demonstrate HapTree-X's feasibility on in-house sequenced Genome in a Bottle RNA-seq and various whole exome, genome, and 10X Genomics datasets. HapTree-X produces more complete phases (up to 25%), even in clinically important genes, and phases more variants than other methods while maintaining similar or higher accuracy and being up to 10× faster than other tools. The advantage of HapTree-X's ability to use multiple lines of evidence, as well as to phase polyploid genomes in a single integrative framework, grows substantially as the amount of diverse data increases.


Subjects
Allelic Imbalance; Haplotypes; Sequence Analysis, RNA; Algorithms; Databases, Genetic; Diploidy; Humans; K562 Cells; Models, Genetic; Models, Statistical; Polymorphism, Single Nucleotide; Polyploidy; RNA-Seq; Sequence Analysis, RNA/methods; Sequence Analysis, RNA/statistics & numerical data
9.
Nat Commun; 11(1): 4459, 2020 Sep 08.
Article in English | MEDLINE | ID: mdl-32900997

ABSTRACT

The origins of multicellular physiology are tied to evolution of gene expression. Genes can shift expression as organisms evolve, but how ancestral expression influences altered descendant expression is not well understood. To examine this, we amalgamate 1,903 RNA-seq datasets from 182 research projects, including 6 organs in 21 vertebrate species. Quality control eliminates project-specific biases, and expression shifts are reconstructed using gene-family-wise phylogenetic Ornstein-Uhlenbeck models. Expression shifts following gene duplication result in more drastic changes in expression properties than shifts without gene duplication. The expression properties are tightly coupled with protein evolutionary rate, depending on whether and how gene duplication occurred. Fluxes in expression patterns among organs are nonrandom, forming modular connections that are reshaped by gene duplication. Thus, if expression shifts, ancestral expression in some organs induces a strong propensity for expression in particular organs in descendants. Regardless of whether the shifts are adaptive or not, this supports a major role for what might be termed preadaptive pathways of gene expression evolution.


Subjects
Evolution, Molecular; Transcriptome; Animals; Databases, Nucleic Acid; Female; Gene Duplication; Humans; Male; Models, Genetic; Multigene Family; Organ Specificity; Phylogeny; Proteins/genetics; RNA-Seq; Species Specificity; Vertebrates/classification; Vertebrates/genetics
10.
Nat Commun; 11(1): 4469, 2020 Sep 08.
Article in English | MEDLINE | ID: mdl-32901013

ABSTRACT

Dissecting tumor heterogeneity is a key to understanding the complex mechanisms underlying drug resistance in cancers. The rich literature of pioneering studies on tumor heterogeneity analysis spurred a recent community-wide benchmark study that compares diverse modeling algorithms. Here we present FastClone, a top-performing algorithm in accuracy in this benchmark. FastClone improves over existing methods by allowing the deconvolution of subclones that have independent copy number variation events within the same chromosome regions. We characterize the behavior of FastClone in identifying subclones using stage III colon cancer primary tumor samples as well as simulated data. It achieves approximately 100-fold acceleration in computation for both simulated and patient data. The efficacy of FastClone will allow its application to large-scale data and clinical data, and facilitate personalized medicine in cancers.


Subjects
Algorithms; DNA Copy Number Variations; Neoplasms/genetics; Colonic Neoplasms/genetics; Colonic Neoplasms/pathology; Computational Biology/methods; Computer Simulation; DNA, Neoplasm/genetics; Drug Resistance, Neoplasm/genetics; High-Throughput Nucleotide Sequencing; Humans; Models, Genetic; Neoplasms/drug therapy; Neoplasms/pathology; Phylogeny; Precision Medicine; Sequence Analysis, DNA
11.
Nat Commun; 11(1): 4468, 2020 Sep 08.
Article in English | MEDLINE | ID: mdl-32901021

ABSTRACT

Speciation constrains the flow of genetic information between populations of sexually reproducing organisms. Gaining control over mechanisms of speciation would enable new strategies to manage wild populations of disease vectors, agricultural pests, and invasive species. Additionally, such control would provide safe biocontainment of transgenes and gene drives. Here, we demonstrate a general approach to create engineered genetic incompatibilities (EGIs) in the model insect Drosophila melanogaster. EGI couples a dominant lethal transgene with a recessive resistance allele. Strains homozygous for both elements are fertile and fecund when they mate with similarly engineered strains, but incompatible with wild-type strains that lack resistant alleles. EGI genotypes can also be tuned to cause hybrid lethality at different developmental life-stages. Further, we demonstrate that multiple orthogonal EGI strains of D. melanogaster can be engineered to be mutually incompatible with wild-type and with each other. EGI is a simple and robust approach in multiple sexually reproducing organisms.


Subjects
Drosophila melanogaster/genetics; Genetic Engineering/methods; Genetic Speciation; Animals; Animals, Genetically Modified; Crosses, Genetic; Female; Genes, Insect; Genes, Lethal; Genotype; Hybridization, Genetic; Male; Models, Genetic; Transgenes
12.
Nat Commun; 11(1): 4556, 2020 Sep 11.
Article in English | MEDLINE | ID: mdl-32917883

ABSTRACT

Previous genetic studies have identified local population structure within the Netherlands; however, their resolution is limited by the use of unlinked markers and the absence of external reference data. Here we apply advanced haplotype sharing methods (ChromoPainter/fineSTRUCTURE) to study fine-grained population genetic structure and demographic change across the Netherlands using genome-wide single nucleotide polymorphism data (1,626 individuals) with associated geography (1,422 individuals). We identify 40 haplotypic clusters exhibiting strong north/south variation and fine-scale differentiation within provinces. Clustering is tied to country-wide ancestry gradients from neighbouring lands and to locally restricted gene flow across major Dutch rivers. North-south structure is temporally stable, with west-east differentiation more transient, potentially influenced by migrations during the Middle Ages. Despite superexponential population growth, regional demographic estimates reveal population crashes contemporaneous with the Black Death. Within Dutch and international data, GWAS incorporating fine-grained haplotypic covariates are less confounded than standard methods.


Subjects
Ethnic Groups/genetics; Genetics, Population; Genome-Wide Association Study; Case-Control Studies; Cluster Analysis; Emigration and Immigration; European Continental Ancestry Group/genetics; Gene Flow; Genetic Variation/genetics; Genome; Geography; Haplotypes; Humans; Middle Aged; Models, Genetic; Netherlands; Polymorphism, Single Nucleotide
13.
Nat Commun; 11(1): 4572, 2020 Sep 11.
Article in English | MEDLINE | ID: mdl-32917907

ABSTRACT

Undomesticated wild species, crop wild relatives, and landraces represent sources of variation for wheat improvement to address challenges from climate change and the growing human population. Here, we study 56,342 domesticated hexaploid, 18,946 domesticated tetraploid and 3,903 crop wild relatives in a massive-scale genotyping and diversity analysis. Using DArTseq™ technology, we identify more than 300,000 high-quality SNPs and SilicoDArT markers and align them to three reference maps: the IWGSC RefSeq v1.0 genome assembly, the durum wheat genome assembly (cv. Svevo), and the DArT genetic map. On average, 72% of the markers are uniquely placed on these maps and 50% are linked to genes. The analysis reveals landraces with unexplored diversity and genetic footprints defined by regions under selection. This provides fertile ground to develop wheat varieties of the future by exploring specific gene or chromosome regions and identifying germplasm conserving allelic diversity missing in current breeding programs.


Subjects
Genetic Variation; Genome, Plant; Triticum/genetics; Alleles; Domestication; Genotype; Models, Genetic; Polymorphism, Single Nucleotide; Sequence Alignment; Tetraploidy
14.
Nat Commun; 11(1): 4897, 2020 Sep 29.
Article in English | MEDLINE | ID: mdl-32994415

ABSTRACT

Soil microbial respiration is an important source of uncertainty in projecting future climate and carbon (C) cycle feedbacks. However, its feedbacks to climate warming and underlying microbial mechanisms are still poorly understood. Here we show that the temperature sensitivity of soil microbial respiration (Q10) in a temperate grassland ecosystem persistently decreases by 12.0 ± 3.7% across 7 years of warming. Also, the shifts of microbial communities play critical roles in regulating thermal adaptation of soil respiration. Incorporating microbial functional gene abundance data into a microbially-enabled ecosystem model significantly improves the modeling performance of soil microbial respiration by 5-19%, and reduces model parametric uncertainty by 55-71%. In addition, modeling analyses show that the microbial thermal adaptation can lead to considerably less heterotrophic respiration (11.6 ± 7.5%), and hence less soil C loss. If such microbially mediated dampening effects occur generally across different spatial and temporal scales, the potential positive feedback of soil microbial respiration in response to climate warming may be less than previously predicted.


Subjects
Carbon/analysis; Metagenome/genetics; Microbiota/physiology; Soil Microbiology; Soil/chemistry; Acclimatization/genetics; Archaea/genetics; Archaea/isolation & purification; Archaea/metabolism; Bacteria/genetics; Bacteria/isolation & purification; Bacteria/metabolism; Carbon/metabolism; Carbon Cycle; Cellulose/metabolism; DNA, Environmental/genetics; DNA, Environmental/isolation & purification; Fungi/genetics; Fungi/isolation & purification; Fungi/metabolism; Global Warming; Grassland; Hot Temperature/adverse effects; Metagenomics; Models, Genetic; Plant Roots/chemistry; Poaceae/chemistry
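
The abstract of entry 14 reports a decrease in Q10, the temperature sensitivity of respiration. For readers unfamiliar with the quantity, the sketch below uses the standard textbook definition, Q10 = (R2 / R1) ** (10 / (T2 - T1)), the factor by which a rate changes for a 10 °C rise in temperature. The respiration rates and temperatures are invented, and the abstract does not say how the authors estimated Q10, so this shows the definition only.

```python
def q10(rate_low, rate_high, temp_low, temp_high):
    """Q10 = (R2 / R1) ** (10 / (T2 - T1)), temperatures in degrees Celsius."""
    if temp_high == temp_low:
        raise ValueError("temperatures must differ")
    return (rate_high / rate_low) ** (10.0 / (temp_high - temp_low))


# Hypothetical respiration rates (arbitrary units), for illustration only.
q10_control = q10(rate_low=1.0, rate_high=2.0, temp_low=15.0, temp_high=25.0)

# Apply the abstract's 12.0% relative decrease to the hypothetical control value.
q10_warmed = q10_control * (1.0 - 0.12)
```

With these made-up inputs, respiration doubles over 10 °C in the control, and the warmed value is 12% lower. A lower Q10 means respiration rises less steeply with temperature, which is the dampening effect the abstract discusses.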
15.
Nat Commun; 11(1): 4758, 2020 Sep 21.
Article in English | MEDLINE | ID: mdl-32958811

ABSTRACT

Genetic programs operating in a history-dependent fashion are ubiquitous in nature and govern sophisticated processes such as development and differentiation. The ability to systematically and predictably encode such programs would advance the engineering of synthetic organisms and ecosystems with rich signal processing abilities. Here we implement robust, scalable history-dependent programs by distributing the computational labor across a cellular population. Our design is based on standardized recombinase-driven DNA scaffolds expressing different genes according to the order of occurrence of inputs. These multicellular computing systems are highly modular, do not require cell-cell communication channels, and any program can be built by differential composition of strains containing well-characterized logic scaffolds. We developed automated workflows that researchers can use to streamline program design and optimization. We anticipate that the history-dependent programs presented here will support many applications using cellular populations for material engineering, biomanufacturing and healthcare.


Subjects
Models, Genetic; Synthetic Biology/methods; Cell Physiological Phenomena/genetics; DNA/genetics; DNA/metabolism; Logic; Recombinases/genetics; Recombinases/metabolism; Software; Workflow
16.
Front Immunol; 11: 1664, 2020.
Article in English | MEDLINE | ID: mdl-32754161

ABSTRACT

The rapidly spreading, highly contagious and pathogenic SARS-coronavirus 2 (SARS-CoV-2) associated Coronavirus Disease 2019 (COVID-19) has been declared a pandemic by the World Health Organization (WHO). The novel 2019 SARS-CoV-2 enters the host cell by binding of the viral surface spike glycoprotein (S-protein) to the cellular angiotensin converting enzyme 2 (ACE2) receptor. The virus-specific molecular interaction with the host cell represents a promising therapeutic target for identifying SARS-CoV-2 antiviral drugs. The repurposing of drugs can provide a rapid and potential cure for the exponentially expanding COVID-19. To that end, a high-throughput virtual screening approach was used to investigate FDA-approved LOPAC library drugs against both the receptor binding domain of the spike protein (S-RBD) and the ACE2 host cell receptor. Primary screening identified a few promising molecules for both targets, which were further analyzed in detail for their binding energy and binding modes through molecular docking and molecular dynamics simulations. GR 127935 hydrochloride hydrate, GNF-5, RS504393, TNP, and eptifibatide acetate were found to bind to the virus binding motifs of the ACE2 receptor. Additionally, KT203, BMS195614, KT185, RS504393, and GSK1838705A were identified as binding at the receptor binding site on the viral S-protein. These identified molecules may effectively assist in controlling the rapid spread of SARS-CoV-2, not only by potentially inhibiting the virus at the entry step but also, as hypothesized, by acting as anti-inflammatory agents, which could relieve lung inflammation. Timely identification of an effective drug to combat the COVID-19 global crisis is the utmost need of the hour. Further, prompt in vivo testing to validate the anti-SARS-CoV-2 inhibition efficiency of these molecules, which could save lives, is justified.


Subjects
Betacoronavirus/physiology; Computer Simulation; Coronavirus Infections/drug therapy; Drug Repositioning/methods; Pneumonia, Viral/drug therapy; User-Computer Interface; Virus Internalization/drug effects; Anti-Inflammatory Agents/therapeutic use; Binding Sites; Coronavirus Infections/virology; Genome, Viral/genetics; Humans; Models, Genetic; Molecular Docking Simulation; Molecular Dynamics Simulation; Pandemics; Peptidyl-Dipeptidase A/chemistry; Peptidyl-Dipeptidase A/metabolism; Pneumonia, Viral/virology; Protein Binding; Protein Domains; Receptors, Virus/metabolism; Spike Glycoprotein, Coronavirus/antagonists & inhibitors; Spike Glycoprotein, Coronavirus/chemistry; Virus Attachment
17.
Nat Commun; 11(1): 4020, 2020 Aug 11.
Article in English | MEDLINE | ID: mdl-32782262

ABSTRACT

While variance components analysis has emerged as a powerful tool in complex trait genetics, existing methods for fitting variance components do not scale well to large-scale datasets of genetic variation. Here, we present a method for variance components analysis that is accurate and efficient: capable of estimating one hundred variance components on a million individuals genotyped at a million SNPs in a few hours. We illustrate the utility of our method in estimating and partitioning variation in a trait explained by genotyped SNPs (SNP-heritability). Analyzing 22 traits with genotypes from 300,000 individuals across about 8 million common and low frequency SNPs, we observe that per-allele squared effect size increases with decreasing minor allele frequency (MAF) and linkage disequilibrium (LD) consistent with the action of negative selection. Partitioning heritability across 28 functional annotations, we observe enrichment of heritability in FANTOM5 enhancers in asthma, eczema, thyroid and autoimmune disorders.


Subjects
Genetic Variation/genetics; Genome, Human/genetics; Models, Genetic; Alleles; Gene Frequency; Genotype; Humans; Linkage Disequilibrium; Multifactorial Inheritance/genetics; Phenotype; Polymorphism, Single Nucleotide; Quantitative Trait, Heritable
18.
Nat Commun; 11(1): 4208, 2020 Aug 21.
Article in English | MEDLINE | ID: mdl-32826890

ABSTRACT

As a key variance partitioning tool, linear mixed models (LMMs) using genome-based restricted maximum likelihood (GREML) allow both fixed and random effects. Classic LMMs assume independence between random effects, which can be violated, causing bias. Here we introduce a generalized GREML, named CORE GREML, that explicitly estimates the covariance between random effects. Using extensive simulations, we show that CORE GREML outperforms the conventional GREML, providing variance and covariance estimates free from bias due to correlated random effects. Applying CORE GREML to UK Biobank data, we find, for example, that the transcriptome, imputed using genotype data, explains a significant proportion of phenotypic variance for height (0.15, p-value = 1.5e-283), and that these transcriptomic effects correlate with the genomic effects (genome-transcriptome correlation = 0.35, p-value = 1.2e-14). We conclude that the covariance between random effects is a key parameter for estimation, especially when partitioning phenotypic variance by multi-omics layers.


Subjects
Genome; Genotype; Multifactorial Inheritance; Phenotype; Transcriptome; Computational Biology; Gene Expression; Genetic Association Studies; Genomics; Humans; Likelihood Functions; Linear Models; Models, Genetic; Polymorphism, Single Nucleotide
19.
BMC Bioinformatics; 21(1): 339, 2020 Jul 31.
Article in English | MEDLINE | ID: mdl-32736513

ABSTRACT

BACKGROUND: It has been widely accepted that long non-coding RNAs (lncRNAs) play important roles in the development and progression of human diseases. Many association prediction models have been proposed for predicting lncRNA functions and identifying potential lncRNA-disease associations. Nevertheless, little effort has been made to measure lncRNA functional similarity, which is an essential part of association prediction models. RESULTS: In this study, we present an lncRNA functional similarity calculation model, IDSSIM for short, based on an improved disease semantic similarity method. Its highlight is the introduction of an information content contribution factor into the semantic value calculation, to take into account both the hierarchical structures of disease directed acyclic graphs and the disease specificities. IDSSIM and three state-of-the-art models (LNCSIM1, LNCSIM2, and ILNCSIM) were evaluated by applying their disease semantic similarity matrices and lncRNA functional similarity matrices, together with the corresponding matrices of human lncRNA-disease associations from either the lncRNADisease database or the MNDR database, to the association prediction method WKNKN for lncRNA-disease association prediction. In addition, case studies of breast cancer and adenocarcinoma were performed to validate the effectiveness of IDSSIM.
CONCLUSIONS: Results demonstrate that, in terms of ROC curves and AUC values, IDSSIM is superior to the compared models and can effectively improve the accuracy of disease semantic similarity, thereby increasing the association prediction ability of the IDSSIM-WKNKN model. In the case studies, most of the potential disease-associated lncRNAs predicted by IDSSIM can be confirmed by databases and the literature, implying that IDSSIM can serve as a promising tool for predicting lncRNA functions, identifying potential lncRNA-disease associations, and pre-screening candidate lncRNAs for biological experiments. The IDSSIM code, all experimental data and prediction results are available online at https://github.com/CDMB-lab/IDSSIM.


Subjects
Algorithms; Computational Biology/methods; Disease/genetics; Models, Genetic; RNA, Long Noncoding/genetics; Semantics; Adenocarcinoma/genetics; Area Under Curve; Breast Neoplasms/genetics; Databases, Genetic; Female; Humans; ROC Curve
20.
Nat Commun; 11(1): 3877, 2020 Aug 03.
Article in English | MEDLINE | ID: mdl-32747659

ABSTRACT

Deep learning methods for digital pathology analysis are an effective way to address multiple clinical questions, from diagnosis to prediction of treatment outcomes. These methods have also been used to predict gene mutations from pathology images, but no comprehensive evaluation of their potential for extracting molecular features from histology slides has yet been performed. We show that HE2RNA, a model based on the integration of multiple data modes, can be trained to systematically predict RNA-Seq profiles from whole-slide images alone, without expert annotation. Through its interpretable design, HE2RNA provides virtual spatialization of gene expression, as validated by CD3- and CD20-staining on an independent dataset. The transcriptomic representation learned by HE2RNA can also be transferred to other datasets, even small ones, to increase prediction performance for specific molecular phenotypes. We illustrate the use of this approach for clinical diagnosis purposes such as the identification of tumors with microsatellite instability.


Subjects
Computational Biology/methods; Deep Learning; Gene Expression Regulation, Neoplastic; Image Processing, Computer-Assisted/methods; Neoplasms/genetics; RNA-Seq/methods; Algorithms; Gene Expression Profiling/methods; Humans; Microsatellite Instability; Models, Genetic; Neoplasms/diagnosis; Neoplasms/metabolism