Results 1 - 9 of 9
1.
Bioinformatics ; 39(6), 2023 06 01.
Article in English | MEDLINE | ID: mdl-37220903

ABSTRACT

MOTIVATION: Developing new crop varieties with superior performance is highly important to ensure robust and sustainable global food security. The speed of variety development is limited by long field cycles and advanced generation selections in plant breeding programs. While methods to predict yield from genotype or phenotype data have been proposed, improved performance and integrated models are needed. RESULTS: We propose a machine learning model that leverages both genotype and phenotype measurements by fusing genetic variants with multiple data sources collected by unmanned aerial systems. We use a deep multiple instance learning framework with an attention mechanism that sheds light on the importance given to each input during prediction, enhancing interpretability. Our model reaches 0.754 ± 0.024 Pearson correlation coefficient when predicting yield in similar environmental conditions; a 34.8% improvement over the genotype-only linear baseline (0.559 ± 0.050). We further predict yield on new lines in an unseen environment using only genotypes, obtaining a prediction accuracy of 0.386 ± 0.010, a 13.5% improvement over the linear baseline. Our multi-modal deep learning architecture efficiently accounts for plant health and environment, distilling the genetic contribution and providing excellent predictions. Yield prediction algorithms leveraging phenotypic observations during training therefore promise to improve breeding programs, ultimately speeding up delivery of improved varieties. AVAILABILITY AND IMPLEMENTATION: Available at https://github.com/BorgwardtLab/PheGeMIL (code) and https://doi.org/doi:10.5061/dryad.kprr4xh5p (data).


Subjects
Deep Learning , Phenomics , Triticum/genetics , Plant Breeding/methods , Selection, Genetic , Phenotype , Genotype , Genomics/methods , Edible Grain/genetics
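
The abstract of entry 1 mentions a deep multiple instance learning framework with an attention mechanism but gives no formula. As a rough illustration only, the sketch below implements one common formulation of attention pooling over a bag of instance embeddings: each instance is scored with w · tanh(V h), the scores are passed through a softmax, and the bag representation is the weighted sum. The names `attention_pool`, `V`, and `w` are hypothetical, and nothing here is taken from the PheGeMIL code.

```python
import math


def attention_pool(instances, V, w):
    """Attention-weighted sum of instance embeddings for one bag.

    instances: list of embedding vectors (for example, one per data source).
    V: hypothetical projection matrix, given as a list of rows.
    w: hypothetical scoring vector with one entry per row of V.
    Returns (pooled_vector, attention_weights).
    """
    scores = []
    for h in instances:
        # Hidden representation tanh(V h), then a scalar score w . tanh(V h).
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * zi for wi, zi in zip(w, hidden)))

    # Softmax over instances, shifted by the max for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # The bag embedding is the attention-weighted sum of the instances.
    dim = len(instances[0])
    pooled = [sum(a * h[d] for a, h in zip(weights, instances)) for d in range(dim)]
    return pooled, weights
```

The returned weights sum to one, so they can be read as the relative importance assigned to each input, which is the interpretability property the abstract refers to.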
2.
NPJ Regen Med ; 8(1): 4, 2023 Jan 14.
Article in English | MEDLINE | ID: mdl-36639373

ABSTRACT

The proper regulation of muscle stem cell (MuSC) fate by cues from the niche is essential for regeneration of skeletal muscle. How pro-regenerative niche factors control the dynamics of MuSC fate decisions remains unknown due to limitations of population-level endpoint assays. To address this knowledge gap, we developed a dual fluorescence imaging time lapse (Dual-FLIT) microscopy approach that leverages machine learning classification strategies to track single cell fate decisions with high temporal resolution. Using two fluorescent reporters that read out maintenance of stemness and myogenic commitment, we constructed detailed lineage trees for individual MuSCs and their progeny, classifying each division event as symmetric self-renewing, asymmetric, or symmetric committed. Our analysis reveals that treatment with the lipid metabolite, prostaglandin E2 (PGE2), accelerates the rate of MuSC proliferation over time, while biasing division events toward symmetric self-renewal. In contrast, the IL6 family member, Oncostatin M (OSM), decreases the proliferation rate after the first generation, while blocking myogenic commitment. These insights into the dynamics of MuSC regulation by niche cues were uniquely enabled by our Dual-FLIT approach. We anticipate that similar binary live cell readouts derived from Dual-FLIT will markedly expand our understanding of how niche factors control tissue regeneration in real time.
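
The three division categories named in the abstract of entry 2 follow directly from the fate of the two daughter cells. The sketch below assumes that each daughter has already been assigned a binary stem/committed label (in the study this comes from the fluorescent reporters and a machine learning classifier, neither of which is reproduced here). The function names are hypothetical.

```python
from collections import Counter


def classify_division(daughter_a_stem: bool, daughter_b_stem: bool) -> str:
    """Label one division event from the binary fate of its two daughters."""
    if daughter_a_stem and daughter_b_stem:
        return "symmetric self-renewing"
    if not daughter_a_stem and not daughter_b_stem:
        return "symmetric committed"
    # Exactly one daughter retains stemness.
    return "asymmetric"


def tally_divisions(divisions):
    """Count division types over an iterable of (daughter_a, daughter_b) labels."""
    return Counter(classify_division(a, b) for a, b in divisions)
```

Comparing such tallies between treated and untreated lineage trees is one simple way to express a bias toward a particular division type.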

3.
Pharmacoecon Open ; 7(1): 149-161, 2023 Jan.
Article in English | MEDLINE | ID: mdl-36703022

ABSTRACT

OBJECTIVE: This study aimed to map the Insomnia Severity Index (ISI) to the EQ-5D-3L utility values from a UK perspective. METHODS: Source data were derived from the 2020 National Health and Wellness Survey (NHWS) for France, Germany, Italy, Spain, the UK and the US. Ordinary least squares regression, generalised linear model (GLM), censored least absolute deviation, and adjusted limited dependent variable mixture model (ALDVMM) were employed to explore the relationship between ISI total summary score and EQ-5D utility while accounting for adjustment covariates derived from the NHWS. Fitting performance was assessed using standard metrics, including mean-squared error (MSE) and coefficient of determination (R²). RESULTS: A total of 17,955 respondent observations were included, with a mean ISI score of 12.12 ± 5.32 and a mean EQ-5D-3L utility (UK tariff) of 0.71 ± 0.23. GLM gamma-log and ALDVMM were the two best performing models. The ALDVMM had better fitting performance (R² = 0.320, MSE 0.0347) than the GLM gamma-log (R² = 0.303, MSE 0.0353); in train-test split-sample validation, ALDVMM also slightly outperformed the GLM gamma-log model, with an MSE of 0.0351 versus 0.0355. Based on fitting performance, ALDVMM and GLM gamma-log were the preferred models. CONCLUSIONS: In the absence of preference-based measures, this study provides an updated mapping algorithm for estimating EQ-5D-3L utilities from the ISI summary total score. This new mapping not only draws its strengths from the use of a large international dataset but also the incorporation of adjustment variables (including sociodemographic and general health characteristics) to reduce the effects of confounders.
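
The two fit metrics reported in the abstract of entry 3 have standard definitions, sketched below for a list of observed utilities and a list of model predictions. The function names are hypothetical and the code does not reproduce any of the mapping models themselves.

```python
def mse(y_true, y_pred):
    """Mean-squared error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot
```

A lower MSE and a higher R² both indicate a closer fit, which is the sense in which the abstract ranks ALDVMM ahead of the GLM gamma-log model.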

4.
Bioinformatics ; 38(13): 3454-3461, 2022 06 27.
Article in English | MEDLINE | ID: mdl-35639661

ABSTRACT

MOTIVATION: Protein design has become increasingly important for medical and biotechnological applications. Because of the complex mechanisms underlying protein formation, the creation of a novel protein requires tedious and time-consuming computational or experimental protocols. At the same time, machine learning has enabled the solving of complex problems by leveraging large amounts of available data, more recently with great improvements on the domain of generative modeling. Yet, generative models have mainly been applied to specific sub-problems of protein design. RESULTS: Here, we approach the problem of general-purpose protein design conditioned on functional labels of the hierarchical Gene Ontology. Since a canonical way to evaluate generative models in this domain is missing, we devise an evaluation scheme of several biologically and statistically inspired metrics. We then develop the conditional generative adversarial network ProteoGAN and show that it outperforms several classic and more recent deep-learning baselines for protein sequence generation. We further give insights into the model by analyzing hyperparameters and ablation baselines. Lastly, we hypothesize that a functionally conditional model could generate proteins with novel functions by combining labels and provide first steps into this direction of research. AVAILABILITY AND IMPLEMENTATION: The code and data underlying this article are available on GitHub at https://github.com/timkucera/proteogan, and can be accessed with doi:10.5281/zenodo.6591379. SUPPLEMENTARY INFORMATION: Supplemental data are available at Bioinformatics online.


Subjects
Machine Learning , Proteins , Proteins/metabolism , Gene Ontology
5.
Nat Commun ; 12(1): 3282, 2021 06 02.
Article in English | MEDLINE | ID: mdl-34078900

ABSTRACT

Bacterial processes necessary for adaption to stressful host environments are potential targets for new antimicrobials. Here, we report large-scale transcriptomic analyses of 32 human bacterial pathogens grown under 11 stress conditions mimicking human host environments. The potential relevance of the in vitro stress conditions and responses is supported by comparisons with available in vivo transcriptomes of clinically important pathogens. Calculation of a probability score enables comparative cross-microbial analyses of the stress responses, revealing common and unique regulatory responses to different stresses, as well as overlapping processes participating in different stress responses. We identify conserved and species-specific 'universal stress responders', that is, genes showing altered expression in multiple stress conditions. Non-coding RNAs are involved in a substantial proportion of the responses. The data are collected in a freely available, interactive online resource (PATHOgenex).


Subjects
Gene Expression Regulation, Bacterial , Gram-Negative Bacteria/genetics , Gram-Positive Bacteria/genetics , RNA, Bacterial/genetics , Stress, Physiological/genetics , Transcriptome , Adaptation, Physiological/genetics , Atlases as Topic , Databases, Genetic , Gene Expression Profiling , Genes, Bacterial , Gram-Negative Bacteria/classification , Gram-Negative Bacteria/metabolism , Gram-Negative Bacteria/pathogenicity , Gram-Positive Bacteria/classification , Gram-Positive Bacteria/metabolism , Gram-Positive Bacteria/pathogenicity , Host Microbial Interactions/genetics , Humans , Internet , Microbiota/genetics , Phylogeny , RNA, Bacterial/metabolism
6.
Nucleic Acids Res ; 48(D1): D1063-D1068, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31642487

ABSTRACT

Genome-wide association studies (GWAS) are integral for studying genotype-phenotype relationships and gaining a deeper understanding of the genetic architecture underlying trait variation. A plethora of genetic associations between distinct loci and various traits have been successfully discovered and published for the model plant Arabidopsis thaliana. This success and the free availability of full genomes and phenotypic data for more than 1,000 different natural inbred lines led to the development of several data repositories. AraPheno (https://arapheno.1001genomes.org) serves as a central repository of population-scale phenotypes in A. thaliana, while the AraGWAS Catalog (https://aragwas.1001genomes.org) provides a publicly available, manually curated and standardized collection of marker-trait associations for all available phenotypes from AraPheno. In this major update, we introduce the next generation of both platforms, including new data, features and tools. We included novel results on associations between knockout-mutations and all AraPheno traits. Furthermore, AraPheno has been extended to display RNA-Seq data for hundreds of accessions, providing expression information for over 28 000 genes for these accessions. All data, including the imputed genotype matrix used for GWAS, are easily downloadable via the respective databases.


Subjects
Arabidopsis/genetics , Computational Biology , Databases, Genetic , Genome, Plant , Genome-Wide Association Study , Phenotype , Computational Biology/methods , Gene Knockout Techniques , Genome-Wide Association Study/methods , Genotype , Mutation , Quantitative Trait Loci , Quantitative Trait, Heritable , Sequence Analysis, RNA , Web Browser
7.
J Am Soc Nephrol ; 30(11): 2262-2274, 2019 11.
Article in English | MEDLINE | ID: mdl-31653784

ABSTRACT

BACKGROUND: Patients on organ transplant waiting lists are evaluated for preexisting alloimmunity to minimize episodes of acute and chronic rejection by regularly monitoring for changes in alloimmune status. There are few studies on how alloimmunity changes over time in patients on kidney allograft waiting lists, and an apparent lack of research-based evidence supporting currently used monitoring intervals. METHODS: To investigate the dynamics of alloimmune responses directed at HLA antigens, we retrospectively evaluated data on anti-HLA antibodies measured by the single-antigen bead assay from 627 waitlisted patients who subsequently received a kidney transplant at University Hospital Zurich, Switzerland, between 2008 and 2017. Our analysis focused on a filtered dataset comprising 467 patients who had at least two assay measurements. RESULTS: Within the filtered dataset, we analyzed potential changes in mean fluorescence intensity values (reflecting bound anti-HLA antibodies) between consecutive measurements for individual patients in relation to the time interval between measurements. Using multiple approaches, we found no correlation between these two factors. However, when we stratified the dataset on the basis of documented previous immunizing events (transplant, pregnancy, or transfusion), we found significant differences in the magnitude of change in alloimmune status, especially among patients with a previous transplant versus patients without such a history. Further efforts to cluster patients according to statistical properties related to alloimmune status kinetics were unsuccessful, indicating considerable complexity in individual variability. CONCLUSIONS: Alloimmune kinetics in patients on a kidney transplant waiting list do not appear to be related to the interval between measurements, but are instead associated with alloimmunization history. This suggests that an individualized strategy for alloimmune status monitoring may be preferable to currently used intervals.


Subjects
HLA Antigens/immunology , Isoantibodies/analysis , Kidney Transplantation , Waiting Lists , Female , Humans , Kinetics , Male , Middle Aged , Retrospective Studies , Time Factors
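
The abstract of entry 7 relates the change in mean fluorescence intensity (MFI) between consecutive measurements to the time interval between them. A minimal sketch of that bookkeeping, assuming each patient's record is a list of (day, MFI) pairs, could look like the following. The data layout and function names are hypothetical, and the Pearson coefficient is only one of the "multiple approaches" the authors might have used.

```python
import math


def consecutive_changes(measurements):
    """Return (interval, change in MFI) for each pair of consecutive measurements.

    measurements: list of (day, mfi) tuples for one patient, in any order.
    """
    ordered = sorted(measurements)
    return [(d2 - d1, m2 - m1) for (d1, m1), (d2, m2) in zip(ordered, ordered[1:])]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)
```

Pooling the (interval, change) pairs across patients and passing the two columns to `pearson` gives a single summary of whether longer gaps go along with larger changes.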
8.
Bioinformatics ; 34(17): i687-i696, 2018 09 01.
Article in English | MEDLINE | ID: mdl-30423082

ABSTRACT

Motivation: Methods based on summary statistics obtained from genome-wide association studies have gained considerable interest in genetics due to the computational cost and privacy advantages they present. Imputing missing summary statistics has therefore become a key procedure in many bioinformatics pipelines, but available solutions may rely on additional knowledge about the populations used in the original study and, as a result, may not always ensure feasibility or high accuracy of the imputation procedure. Results: We present ARDISS, a method to impute missing summary statistics in mixed-ethnicity cohorts through Gaussian Process Regression and automatic relevance determination. ARDISS is trained on an external reference panel and does not require information about allele frequencies of genotypes from the original study. Our method approximates the original GWAS population by a combination of samples from a reference panel relying exclusively on the summary statistics and without any external information. ARDISS successfully reconstructs the original composition of mixed-ethnicity cohorts and outperforms alternative solutions in terms of speed and imputation accuracy both for heterogeneous and homogeneous datasets. Availability and implementation: The proposed method is available at https://github.com/BorgwardtLab/ARDISS. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Ethnicity/genetics , Cohort Studies , Gene Frequency , Genome-Wide Association Study/methods , Genotype , Humans , Software
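
Entry 8 describes imputation of missing summary statistics through Gaussian Process Regression. The sketch below shows only the generic GP posterior-mean step, K*ᵀ (K + σ²I)⁻¹ z, on toy data. It is not ARDISS: the RBF kernel over a one-dimensional coordinate, the `length_scale` and `noise` values, and all function names are assumptions made for illustration, whereas the actual method is trained on a reference panel and uses automatic relevance determination.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two scalar coordinates."""
    return math.exp(-((a - b) ** 2) / (2.0 * length_scale ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def gp_impute(x_obs, z_obs, x_missing, length_scale=1.0, noise=1e-6):
    """GP posterior mean at x_missing given observed statistics z_obs at x_obs."""
    K = [[rbf(a, b, length_scale) + (noise if i == j else 0.0)
          for j, b in enumerate(x_obs)]
         for i, a in enumerate(x_obs)]
    alpha = solve(K, z_obs)  # alpha = (K + noise * I)^-1 z
    return [sum(rbf(xm, xo, length_scale) * a for xo, a in zip(x_obs, alpha))
            for xm in x_missing]
```

With a very small noise term the posterior mean passes almost exactly through the observed values, and between them it interpolates according to the kernel.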
9.
Nucleic Acids Res ; 46(D1): D1150-D1156, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29059333

ABSTRACT

The abundance of high-quality genotype and phenotype data for the model organism Arabidopsis thaliana enables scientists to study the genetic architecture of many complex traits at an unprecedented level of detail using genome-wide association studies (GWAS). GWAS have been a great success in A. thaliana and many SNP-trait associations have been published. With the AraGWAS Catalog (https://aragwas.1001genomes.org) we provide a publicly available, manually curated and standardized GWAS catalog for all publicly available phenotypes from the central A. thaliana phenotype repository, AraPheno. All GWAS have been recomputed on the latest imputed genotype release of the 1001 Genomes Consortium using a standardized GWAS pipeline to ensure comparability between results. The catalog includes currently 167 phenotypes and more than 222 000 SNP-trait associations with P < 10^-4, of which 3887 are significantly associated using permutation-based thresholds. The AraGWAS Catalog can be accessed via a modern web-interface and provides various features to easily access, download and visualize the results and summary statistics across GWAS.


Subjects
Arabidopsis/genetics , Databases, Genetic , Genome-Wide Association Study , Polymorphism, Single Nucleotide , User-Computer Interface
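
The abstract of entry 9 mentions permutation-based significance thresholds without describing the procedure. One widely used scheme, sketched here as an assumption rather than as the AraGWAS pipeline, shuffles the phenotype values, reruns the scan, records the smallest P value of each permuted scan, and takes a low quantile of those minima as the threshold. The callable `min_pvalue` is a hypothetical stand-in for a full GWAS scan that returns its smallest P value.

```python
import random


def permutation_threshold(phenotypes, min_pvalue, n_perm=100, alpha=0.05, seed=0):
    """Estimate a genome-wide P value threshold from phenotype permutations.

    phenotypes: list of phenotype values, one per accession.
    min_pvalue: hypothetical callable that runs a scan on a phenotype list
                and returns the smallest P value across all markers.
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    minima = []
    for _ in range(n_perm):
        shuffled = phenotypes[:]
        rng.shuffle(shuffled)  # breaks any genotype-phenotype link
        minima.append(min_pvalue(shuffled))
    minima.sort()
    # Empirical alpha-quantile of the per-permutation minimum P values.
    index = max(int(alpha * n_perm) - 1, 0)
    return minima[index]
```

Associations in the real scan with a P value below the returned threshold would then be called significant at the chosen family-wise level `alpha`.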