Results 1 - 20 of 26
1.
J Exp Bot ; 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-38954539

ABSTRACT

Linear mixed models (LMMs) are a commonly used method for genome-wide association studies (GWAS) that aim to detect associations between genetic markers and phenotypic measurements in a population of individuals while accounting for population structure and cryptic relatedness. In a standard GWAS, hundreds of thousands to millions of statistical tests are performed, requiring control for multiple hypothesis testing. Typically, static corrections that penalize the number of tests performed are used to control for the family-wise error rate, which is the probability of making at least one false positive. However, it has been shown that in practice this threshold is too conservative for normally distributed phenotypes and not stringent enough for non-normally distributed phenotypes. Therefore, permutation-based LMM approaches have recently been proposed to provide a more realistic threshold that takes phenotypic distributions into account. In this work, we will discuss the advantages of permutation-based GWAS approaches, including new simulations and results from a re-analysis of all publicly available Arabidopsis thaliana phenotypes from the AraPheno database.
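
The "static corrections" mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the Bonferroni correction as the typical example of such a correction (the abstract does not name one) and assumes independent tests when computing the family-wise error rate; the function names are illustrative only.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value cutoff that bounds the family-wise error rate by alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def fwer_independent(per_test_alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive across independent tests."""
    return 1.0 - (1.0 - per_test_alpha) ** n_tests


# Assumed example: one million markers tested at a family-wise level of 0.05.
cutoff = bonferroni_threshold(0.05, 1_000_000)
uncorrected_fwer = fwer_independent(0.05, 100)
corrected_fwer = fwer_independent(bonferroni_threshold(0.05, 100), 100)
```

Because the cutoff depends only on the number of tests, it is the same for every phenotype regardless of its distribution, which is the limitation the abstract points to.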

2.
Sci Data ; 11(1): 109, 2024 Jan 23.
Article in English | MEDLINE | ID: mdl-38263173

ABSTRACT

Sustainable weed management strategies are critical to feeding the world's population while preserving ecosystems and biodiversity. Therefore, site-specific weed control strategies based on automation are needed to reduce the additional time and effort required for weeding. Machine vision-based methods appear to be a promising approach for weed detection, but require high quality data on the species in a specific agricultural area. Here we present a dataset, the Moving Fields Weed Dataset (MFWD), which captures the growth of 28 weed species commonly found in sorghum and maize fields in Germany. A total of 94,321 images were acquired in a fully automated, high-throughput phenotyping facility to track over 5,000 individual plants at high spatial and temporal resolution. A rich set of manually curated ground truth information is also provided, which can be used not only for plant species classification, object detection and instance segmentation tasks, but also for multiple object tracking.

3.
Health Care Manag Sci ; 26(4): 785-806, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38015289

ABSTRACT

Assigning inpatients to hospital beds impacts patient satisfaction and the workload of nurses and doctors. The assignment is subject to unknown inpatient arrivals, in particular for emergency patients. Hospitals, therefore, need to deal with uncertainty on actual bed requirements and potential shortage situations as bed capacities are limited. This paper develops a model and solution approach for solving the patient bed-assignment problem that is based on a machine learning (ML) approach to forecasting emergency patients. First, it contributes by improving the anticipation of emergency patients using ML approaches, incorporating weather data, time and dates, important local and regional events, as well as current and historical occupancy levels. Drawing on real-life data from a large case hospital, we were able to improve forecasting accuracy for emergency inpatient arrivals. We achieved up to 17% better root mean square error (RMSE) when using ML methods compared to a baseline approach relying on averages for historical arrival rates. We further show that the ML methods outperform time series forecasts. Second, we develop a new hyper-heuristic for solving real-life problem instances based on the pilot method and a specialized greedy look-ahead (GLA) heuristic. When applying the hyper-heuristic in test sets we were able to increase the objective function by up to 5.3% in comparison to the benchmark approach in [40]. A benchmark with a Genetic Algorithm shows also the superiority of the hyper-heuristic. Third, the combination of ML for emergency patient admission forecasting with advanced optimization through the hyper-heuristic allowed us to obtain an improvement of up to 3.3% on a real-life problem.


Subject(s)
Emergency Service, Hospital; Hospitalization; Humans; Hospitals; Patient Admission; Machine Learning
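
The RMSE comparison reported in the abstract of entry 3 can be sketched as follows. This is a minimal illustration, assuming that "17% better RMSE" means a relative reduction against the baseline; the function names and the numbers in the usage lines are hypothetical and are not taken from the study.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between observed and forecast values."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_improvement(baseline_rmse: float, model_rmse: float) -> float:
    """Fraction by which the model's RMSE is lower than the baseline's."""
    return (baseline_rmse - model_rmse) / baseline_rmse


# Hypothetical daily emergency arrivals and two forecasts of them.
arrivals = [12, 15, 9, 20]
baseline_forecast = [14, 14, 14, 14]   # historical average
model_forecast = [12, 14, 10, 18]      # assumed ML forecast
gain = relative_improvement(rmse(arrivals, baseline_forecast),
                            rmse(arrivals, model_forecast))
```
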
4.
NAR Genom Bioinform ; 5(4): lqad087, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37829176

ABSTRACT

Protein thermostability is important in many areas of biotechnology, including enzyme engineering and protein-hybrid optoelectronics. Ever-growing protein databases and information on stability at different temperatures allow the training of machine learning models to predict whether proteins are thermophilic. In silico predictions could reduce costs and accelerate the development process by guiding researchers to more promising candidates. Existing models for predicting protein thermophilicity rely mainly on features derived from physicochemical properties. Recently, modern protein language models that directly use sequence information have demonstrated superior performance in several tasks. In this study, we evaluate the usefulness of protein language model embeddings for thermophilicity prediction with ProLaTherm, a Protein Language model-based Thermophilicity predictor. ProLaTherm significantly outperforms all feature-, sequence- and literature-based comparison partners on multiple evaluation metrics. In terms of the Matthews correlation coefficient, ProLaTherm outperforms the second-best competitor by 18.1% in a nested cross-validation setup. Using proteins from species not overlapping with species from the training data, ProLaTherm outperforms all competitors by at least 9.7%. On these data, it misclassified only one nonthermophilic protein as thermophilic. Furthermore, it correctly identified 97.4% of all thermophilic proteins in our test set with an optimal growth temperature above 70°C.
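
The Matthews correlation coefficient used as a headline metric above can be computed from a binary confusion matrix. The following is a minimal sketch of the standard definition, not the evaluation code of ProLaTherm; returning 0.0 when a marginal count is zero is one common convention and is an assumption here.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary classifier.

    tp/fn count thermophilic proteins predicted correctly/incorrectly,
    tn/fp count non-thermophilic proteins predicted correctly/incorrectly.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Undefined case (an empty row or column); 0.0 by convention.
        return 0.0
    return (tp * tn - fp * fn) / denom
```

The value ranges from -1 (every prediction wrong) through 0 (no better than chance) to 1 (every prediction right), and unlike accuracy it stays informative when the two classes are unbalanced.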

5.
Plant Methods ; 19(1): 87, 2023 Aug 22.
Article in English | MEDLINE | ID: mdl-37608384

ABSTRACT

BACKGROUND: Efficient and site-specific weed management is a critical step in many agricultural tasks. Image captures from drones and modern machine learning based computer vision methods can be used to assess weed infestation in agricultural fields more efficiently. However, the image quality of the captures can be affected by several factors, including motion blur. Image captures can be blurred because the drone moves during the image capturing process, e.g. due to wind pressure or camera settings. These influences complicate the annotation of training and test samples and can also lead to reduced predictive power in segmentation and classification tasks. RESULTS: In this study, we propose DeBlurWeedSeg, a combined deblurring and segmentation model for weed and crop segmentation in motion blurred images. For this purpose, we first collected a new dataset of matching sharp and naturally blurred image pairs of real sorghum and weed plants from drone images of the same agricultural field. The data was used to train and evaluate the performance of DeBlurWeedSeg on both sharp and blurred images of a hold-out test-set. We show that DeBlurWeedSeg outperforms a standard segmentation model that does not include an integrated deblurring step, with a relative improvement of [Formula: see text] in terms of the Sørensen-Dice coefficient. CONCLUSION: Our combined deblurring and segmentation model DeBlurWeedSeg is able to accurately segment weeds from sorghum and background, in both sharp as well as motion blurred drone captures. This has high practical implications, as lower error rates in weed and crop segmentation could lead to better weed control, e.g. when using robots for mechanical weed removal.
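
The Sørensen-Dice coefficient named in the abstract above measures the overlap between a predicted and a reference segmentation mask. Below is a minimal sketch for a single binary mask (for example, weed versus everything else). The study segments several classes and does not describe in the abstract how it aggregates over them, so this binary form and the convention of returning 1.0 for two empty masks are assumptions.

```python
import numpy as np


def dice_coefficient(pred, target) -> float:
    """Sørensen-Dice overlap between two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if pred.shape != target.shape:
        raise ValueError("masks must have the same shape")
    total = int(pred.sum()) + int(target.sum())
    if total == 0:
        # Both masks are empty: treated as perfect agreement.
        return 1.0
    intersection = int(np.logical_and(pred, target).sum())
    return 2.0 * intersection / total


# Hypothetical 2x2 masks: two pixels predicted, one of them correct.
score = dice_coefficient([[1, 1], [0, 0]], [[1, 0], [0, 0]])
```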

6.
Bioinform Adv ; 3(1): vbad035, 2023.
Article in English | MEDLINE | ID: mdl-37066135

ABSTRACT

Summary: Predicting complex traits from genotypic information is a major challenge in various biological domains. With easyPheno, we present a comprehensive Python framework enabling the rigorous training, comparison and analysis of phenotype predictions for a variety of different models, ranging from common genomic selection approaches through classical machine learning to modern deep learning-based techniques. Our framework is easy to use, even for non-programmers, and includes an automatic hyperparameter search using state-of-the-art Bayesian optimization. Moreover, easyPheno provides various benefits for bioinformaticians developing new prediction models. easyPheno makes it possible to quickly integrate novel models and functionalities into a reliable framework and to benchmark them against various integrated prediction models in a comparable setup. In addition, the framework allows the assessment of newly developed prediction models under pre-defined settings using simulated data. We provide detailed documentation with various hands-on tutorials and videos explaining the usage of easyPheno to novice users. Availability and implementation: easyPheno is publicly available at https://github.com/grimmlab/easyPheno and can be easily installed as a Python package via https://pypi.org/project/easypheno/ or using Docker. Comprehensive documentation, including various tutorials complemented with videos, can be found at https://easypheno.readthedocs.io/. Supplementary information: Supplementary data are available at Bioinformatics Advances online.

7.
Front Plant Sci ; 13: 932512, 2022.
Article in English | MEDLINE | ID: mdl-36407627

ABSTRACT

Genomic selection is an integral tool for breeders to accurately select plants directly from genotype data leading to faster and more resource-efficient breeding programs. Several prediction methods have been established in the last few years. These range from classical linear mixed models to complex non-linear machine learning approaches, such as Support Vector Regression, and modern deep learning-based architectures. Many of these methods have been extensively evaluated on different crop species with varying outcomes. In this work, our aim is to systematically compare 12 different phenotype prediction models, including basic genomic selection methods to more advanced deep learning-based techniques. More importantly, we assess the performance of these models on simulated phenotype data as well as on real-world data from Arabidopsis thaliana and two breeding datasets from soy and corn. The synthetic phenotypic data allow us to analyze all prediction models and especially the selected markers under controlled and predefined settings. We show that Bayes B and linear regression models with sparsity constraints perform best under different simulation settings with respect to explained variance. Further, we can confirm results from other studies that there is no superiority of more complex neural network-based architectures for phenotype prediction compared to well-established methods. However, on real-world data, for which several prediction models yield comparable results with slight advantages for Elastic Net, this picture is less clear, suggesting that there is a lot of room for future research.

8.
Sci Data ; 9(1): 735, 2022 Nov 30.
Article in English | MEDLINE | ID: mdl-36450875

ABSTRACT

Genomic studies often attempt to link natural genetic variation with important phenotypic variation. To succeed, robust and reliable phenotypic data, as well as curated genomic assemblies, are required. Wild sunflowers, originally from North America, are adapted to diverse and often extreme environments and have historically been a widely used model plant system for the study of population genomics, adaptation, and speciation. Moreover, cultivated sunflower, domesticated from a wild relative (Helianthus annuus), is a global oil crop, ranking fourth in production of vegetable oils worldwide. Publicly available data resources are extremely valuable both for the plant research community and for the associated agricultural sector. We have created HeliantHOME ( http://www.helianthome.org ), a curated, public, and interactive database of phenotypes including developmental, structural and environmental ones, obtained from a large collection of both wild and cultivated sunflower individuals. Additionally, the database is enriched with external genomic data and results of genome-wide association studies. Finally, being a community open-source platform, HeliantHOME is expected to expand as new knowledge and resources become available.


Subject(s)
Genomics; Helianthus; Databases, Factual; Helianthus/genetics; Phenotype
9.
NAR Genom Bioinform ; 4(3): lqac074, 2022 Sep.
Article in English | MEDLINE | ID: mdl-36186922

ABSTRACT

Transcriptional-translational coupling is accepted to be a fundamental mechanism of gene expression in prokaryotes and therefore has been analyzed in detail. However, the underlying genomic architecture of the expression machinery has not been well investigated so far. In this study, we established a bioinformatics pipeline to systematically investigate >1800 bacterial genomes for the abundance of transcription- and translation-associated genes clustered in distinct gene cassettes. We identified three highly frequent cassettes containing transcriptional and translational genes, i.e. rplk-nusG (gene cassette 1; in 553 genomes), rpoA-rplQ-rpsD-rpsK-rpsM (gene cassette 2; in 656 genomes) and nusA-infB (gene cassette 3; in 877 genomes). Interestingly, each of the three cassettes harbors a gene (nusG, rpsD and nusA) encoding a protein which links transcription and translation in bacteria. The analyses suggest an enrichment of these cassettes in pathogenic bacterial phyla with >70% for cassette 3 (i.e. Neisseria, Salmonella and Escherichia) and >50% for cassette 1 (i.e. Treponema, Prevotella, Leptospira and Fusobacterium) and cassette 2 (i.e. Helicobacter, Campylobacter, Treponema and Prevotella). These insights form the basis to analyze the transcriptional regulatory mechanisms orchestrating transcriptional-translational coupling and might open novel avenues for future biotechnological approaches.

10.
Bioinformatics ; 38(Suppl_2): ii5-ii12, 2022 Sep 16.
Article in English | MEDLINE | ID: mdl-36124808

ABSTRACT

MOTIVATION: Genome-wide association studies (GWAS) are an integral tool for studying the architecture of complex genotype and phenotype relationships. Linear mixed models (LMMs) are commonly used to detect associations between genetic markers and a trait of interest, while at the same time accounting for population structure and cryptic relatedness. Assumptions of LMMs include a normal distribution of the residuals and that the genetic markers are independent and identically distributed; both assumptions are often violated in real data. Permutation-based methods can help to overcome some of these limitations and provide more realistic thresholds for the discovery of true associations. Still, in practice, they are rarely implemented due to the high computational complexity. RESULTS: We propose permGWAS, an efficient LMM reformulation based on 4D tensors that can provide permutation-based significance thresholds. We show that our method outperforms current state-of-the-art LMMs with respect to runtime and that permutation-based thresholds have lower false discovery rates for skewed phenotypes compared to the commonly used Bonferroni threshold. Furthermore, using permGWAS we re-analyzed more than 500 Arabidopsis thaliana phenotypes with 100 permutations each in less than 8 days on a single GPU. Our re-analyses suggest that applying a permutation-based threshold can improve and refine the interpretation of GWAS results. AVAILABILITY AND IMPLEMENTATION: permGWAS is open-source and publicly available on GitHub for download: https://github.com/grimmlab/permGWAS. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genome-Wide Association Study; Genetic Markers; Genome-Wide Association Study/methods; Genotype; Linear Models; Phenotype
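
The idea of a permutation-based significance threshold from entry 10 can be sketched generically. The code below is not the permGWAS implementation and does not fit a linear mixed model: as a simplifying assumption it uses the largest absolute marker-phenotype Pearson correlation as the test statistic, shuffles the phenotype to build a null distribution of that maximum, and takes its upper quantile as the threshold. All names and the toy data are hypothetical.

```python
import numpy as np


def max_abs_correlation(genotypes: np.ndarray, phenotype: np.ndarray) -> float:
    """Largest absolute Pearson correlation between any marker and the phenotype.

    genotypes has shape (n_samples, n_markers); every marker must vary.
    """
    g = genotypes - genotypes.mean(axis=0)
    y = phenotype - phenotype.mean()
    denom = np.sqrt((g ** 2).sum(axis=0) * (y ** 2).sum())
    r = (g.T @ y) / denom
    return float(np.abs(r).max())


def permutation_threshold(genotypes: np.ndarray, phenotype: np.ndarray,
                          n_perm: int = 100, alpha: float = 0.05,
                          seed: int = 0) -> float:
    """Statistic value exceeded by chance in only a fraction alpha of permutations."""
    rng = np.random.default_rng(seed)
    # Shuffling the phenotype breaks any true marker-trait association
    # while keeping the phenotype's (possibly skewed) distribution.
    null_max = np.array([
        max_abs_correlation(genotypes, rng.permutation(phenotype))
        for _ in range(n_perm)
    ])
    return float(np.quantile(null_max, 1.0 - alpha))


# Toy data: 200 individuals, 500 markers coded 0/1/2, a skewed phenotype.
toy_rng = np.random.default_rng(42)
X = toy_rng.integers(0, 3, size=(200, 500)).astype(float)
y = toy_rng.exponential(size=200)
threshold = permutation_threshold(X, y)
significant = max_abs_correlation(X, y) > threshold
```

Taking the maximum over all markers in each permutation is what makes the resulting threshold control the family-wise error rate, and because the null distribution is built from the observed phenotype it adapts to non-normal traits.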
11.
Comput Struct Biotechnol J ; 20: 2699-2712, 2022.
Article in English | MEDLINE | ID: mdl-35685359

ABSTRACT

Physically interacting proteins form macromolecule complexes that drive diverse cellular processes. Advances in experimental techniques that capture interactions between proteins provide us with protein-protein interaction (PPI) networks from several model organisms. These datasets have enabled the prediction and other computational analyses of protein complexes. Here we provide a systematic review of the state-of-the-art algorithms for protein complex prediction from PPI networks proposed in the past two decades. The existing approaches that solve this problem are categorized into three groups: cluster-quality-based, node-affinity-based, and network-embedding-based approaches, and we compare and contrast their advantages and disadvantages. We further include a comparative analysis by computing the performance of eighteen methods based on twelve well-established performance measures on four widely used benchmark protein-protein interaction networks. Finally, the limitations and drawbacks of both current data and approaches, along with potential solutions, are discussed, with emphasis on the points that pave the way for future research efforts in this field.

13.
Brief Bioinform ; 22(1): 178-193, 2021 Jan 18.
Article in English | MEDLINE | ID: mdl-31848574

ABSTRACT

Analyzing the microbiome of diverse species and environments using next-generation sequencing techniques has significantly enhanced our understanding of the metabolic, physiological and ecological roles of environmental microorganisms. However, the analysis of the microbiome is affected by experimental conditions (e.g. sequencing errors and genomic repeats) and by computationally intensive and cumbersome downstream analyses (e.g. quality control, assembly, binning and statistical analyses). Moreover, the introduction of new sequencing technologies and protocols led to a flood of new methodologies, which also have an immediate effect on the results of the analyses. The aim of this work is to review the most important workflows for 16S rRNA sequencing and shotgun and long-read metagenomics, as well as to provide best-practice protocols on experimental design, sample processing, sequencing, assembly, binning, annotation and visualization. To simplify and standardize the computational analysis, we provide a set of best-practice workflows for 16S rRNA and metagenomic sequencing data (available at https://github.com/grimmlab/MicrobiomeBestPracticeReview).


Subject(s)
Metagenomics/methods; Microbiota/genetics; Practice Guidelines as Topic; Animals; DNA Barcoding, Taxonomic/methods; DNA Barcoding, Taxonomic/standards; Humans; Metagenomics/standards; RNA, Ribosomal, 16S/genetics; Sequence Analysis, DNA/methods; Sequence Analysis, DNA/standards
14.
Bioinformatics ; 37(1): 57-65, 2021 Apr 09.
Article in English | MEDLINE | ID: mdl-32573681

ABSTRACT

MOTIVATION: Correlating genetic loci with a disease phenotype is a common approach to improve our understanding of the genetics underlying complex diseases. Standard analyses mostly ignore two aspects, namely genetic heterogeneity and interactions between loci. Genetic heterogeneity, the phenomenon that genetic variants at different loci lead to the same phenotype, promises to increase statistical power by aggregating low-signal variants. Incorporating interactions between loci results in a computational and statistical bottleneck due to the vast amount of candidate interactions. RESULTS: We propose a novel method SiNIMin that addresses these two aspects by finding pairs of interacting genes that are, upon combination, associated with a phenotype of interest under a model of genetic heterogeneity. We guide the interaction search using biological prior knowledge in the form of protein-protein interaction networks. Our method controls type I error and outperforms state-of-the-art methods with respect to statistical power. Additionally, we find novel associations for multiple Arabidopsis thaliana phenotypes, and, with an adapted variant of SiNIMin, for a study of rare variants in migraine patients. AVAILABILITY AND IMPLEMENTATION: Code available at https://github.com/BorgwardtLab/SiNIMin. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genetic Heterogeneity; Protein Interaction Maps; Genetic Loci; Humans; Phenotype; Software
15.
Plant Methods ; 16(1): 157, 2020 Dec 22.
Article in English | MEDLINE | ID: mdl-33353559

ABSTRACT

BACKGROUND: Assessment of seed germination is an essential task for seed researchers to measure the quality and performance of seeds. Usually, seed assessments are done manually, which is a cumbersome, time-consuming and error-prone process. Classical image analysis methods are not well suited for large-scale germination experiments, because they often rely on manual adjustments of color-based thresholds. We here propose a machine learning approach using modern artificial neural networks with region proposals for accurate seed germination detection and high-throughput seed germination experiments. RESULTS: We generated labeled imaging data of the germination process of more than 2400 seeds for three different crops, Zea mays (maize), Secale cereale (rye) and Pennisetum glaucum (pearl millet), with a total of more than 23,000 images. Different state-of-the-art convolutional neural network (CNN) architectures with region proposals have been trained using transfer learning to automatically identify seeds within petri dishes and to predict whether the seeds germinated or not. Our proposed models achieved a high mean average precision (mAP) on a hold-out test data set of approximately 97.9%, 94.2% and 94.3% for Zea mays, Secale cereale and Pennisetum glaucum respectively. Further, various single-value germination indices, such as Mean Germination Time and Germination Uncertainty, can be computed more accurately with the predictions of our proposed model compared to manual counts. CONCLUSION: Our proposed machine learning-based method can help to speed up the assessment of seed germination experiments for different seed cultivars. It has lower error rates and a higher performance compared to conventional and manual methods, leading to more accurate germination indices and quality assessments of seeds.
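
The two germination indices named in the abstract above can be computed from per-interval germination counts. The sketch below assumes the commonly used definitions (a count-weighted mean of germination times, and the Shannon entropy in bits of the relative germination frequencies); the abstract does not state the formulas the authors used, and the counts in the usage lines are hypothetical.

```python
import math
from typing import Sequence


def mean_germination_time(times: Sequence[float], counts: Sequence[int]) -> float:
    """Average time to germination, weighted by seeds newly germinated at each time."""
    total = sum(counts)
    if len(times) != len(counts) or total == 0:
        raise ValueError("need matching inputs and at least one germinated seed")
    return sum(t * n for t, n in zip(times, counts)) / total


def germination_uncertainty(counts: Sequence[int]) -> float:
    """Shannon entropy (bits) of how germination is spread over the time points."""
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one seed must have germinated")
    return -sum((n / total) * math.log2(n / total) for n in counts if n > 0)


# Hypothetical experiment: seeds newly germinated at 24 h, 48 h and 72 h.
hours = [24, 48, 72]
new_seeds = [10, 20, 10]
mgt = mean_germination_time(hours, new_seeds)
unc = germination_uncertainty(new_seeds)
```

Both indices depend on the per-interval counts, so miscounted seeds at any time point propagate directly into them, which is why more accurate detections translate into more accurate indices.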

16.
Nucleic Acids Res ; 48(D1): D1063-D1068, 2020 Jan 08.
Article in English | MEDLINE | ID: mdl-31642487

ABSTRACT

Genome-wide association studies (GWAS) are integral for studying genotype-phenotype relationships and gaining a deeper understanding of the genetic architecture underlying trait variation. A plethora of genetic associations between distinct loci and various traits have been successfully discovered and published for the model plant Arabidopsis thaliana. This success and the free availability of full genomes and phenotypic data for more than 1,000 different natural inbred lines led to the development of several data repositories. AraPheno (https://arapheno.1001genomes.org) serves as a central repository of population-scale phenotypes in A. thaliana, while the AraGWAS Catalog (https://aragwas.1001genomes.org) provides a publicly available, manually curated and standardized collection of marker-trait associations for all available phenotypes from AraPheno. In this major update, we introduce the next generation of both platforms, including new data, features and tools. We included novel results on associations between knockout-mutations and all AraPheno traits. Furthermore, AraPheno has been extended to display RNA-Seq data for hundreds of accessions, providing expression information for over 28 000 genes for these accessions. All data, including the imputed genotype matrix used for GWAS, are easily downloadable via the respective databases.


Subject(s)
Arabidopsis/genetics; Computational Biology; Databases, Genetic; Genome, Plant; Genome-Wide Association Study; Phenotype; Computational Biology/methods; Gene Knockout Techniques; Genome-Wide Association Study/methods; Genotype; Mutation; Quantitative Trait Loci; Quantitative Trait, Heritable; Sequence Analysis, RNA; Web Browser
17.
Oncotarget ; 10(30): 2911-2920, 2019 Apr 23.
Article in English | MEDLINE | ID: mdl-31080561

ABSTRACT

Non-small cell lung cancer (NSCLC) is the most prevalent form of lung cancer and its molecular landscape has been extensively studied. The most common genetic alterations in NSCLC are mutations within the epidermal growth factor receptor (EGFR) gene, with frequencies between 10-40%. There are several molecular targeted therapies for patients harboring these mutations. Liquid biopsies constitute a flexible approach to monitor these mutations in real time as opposed to tissue biopsies that represent a single snapshot in time. However, interrogating cell-free DNA (cfDNA) has inherent biological limitations, especially at early or localized disease stages, where there is not enough tumor material released into the patient's circulation. We developed a qPCR-based test (ExoDx EGFR) that interrogates mutations within EGFR using Exosomal RNA/DNA and cfDNA (ExoNA) derived from plasma in a cohort of 110 NSCLC patients. The performance of the assay yielded an overall sensitivity of 90% for L858R, 83% for T790M and 73% for exon 19 indels with specificities of 100%, 100%, and 96% respectively. In a subcohort of patients with extrathoracic disease (M1b and MX) the sensitivities were 92% (L858R), 95% (T790M), and 86% (exon 19 indels) with specificities of 100%, 100% and 94% respectively.
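
The sensitivity and specificity figures quoted above follow from a 2x2 comparison of the test result against the reference diagnosis. The sketch below shows the standard definitions; the counts in the usage lines are hypothetical round numbers chosen for illustration and are not the study's data.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of mutation-positive patients that the test detects."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of mutation-negative patients that the test reports as negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical cohort: 20 mutation-positive and 50 mutation-negative patients.
sens = sensitivity(true_pos=18, false_neg=2)
spec = specificity(true_neg=48, false_pos=2)
```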

18.
Methods Mol Biol ; 1819: 93-136, 2018.
Article in English | MEDLINE | ID: mdl-30421401

ABSTRACT

Many traits, such as height, the response to a given drug, or the susceptibility to certain diseases are presumably co-determined by genetics. Especially in the field of medicine, it is of major interest to identify genetic aberrations that alter an individual's risk to develop a certain phenotypic trait. Addressing this question requires the availability of comprehensive, high-quality genetic datasets. The technological advancements and the decreasing cost of genotyping in the last decade led to an increase in such datasets. Parallel to and in line with this technological progress, an analysis framework under the name of genome-wide association studies was developed to properly collect and analyze these data. Genome-wide association studies aim at finding statistical dependencies, or associations, between a trait of interest and point-mutations in the DNA. The statistical models used to detect such associations are diverse, spanning the whole range from the frequentist to the Bayesian setting. Since genetic datasets are inherently high-dimensional, the search for associations poses not only a statistical but also a computational challenge. As a result, a variety of toolboxes and software packages have been developed, each implementing different statistical methods while using various optimizations and mathematical techniques to enhance the computations. This chapter is devoted to the discussion of widely used methods and tools in genome-wide association studies. We present the different statistical models and the assumptions on which they are based, explain peculiarities of the data that have to be accounted for and, most importantly, introduce commonly used tools and software packages for the different tasks in a genome-wide association study, complemented with examples for their application.


Subject(s)
Databases, Genetic; Genome-Wide Association Study/methods; Models, Genetic; Point Mutation; Quantitative Trait, Heritable; Animals; Humans
19.
Clin Cancer Res ; 24(12): 2944-2950, 2018 Jun 15.
Article in English | MEDLINE | ID: mdl-29535126

ABSTRACT

Purpose: About 60% of non-small cell lung cancer (NSCLC) patients develop resistance to targeted epidermal growth factor receptor (EGFR) inhibitor therapy through the EGFR T790M mutation. Patients with this mutation respond well to third-generation tyrosine kinase inhibitors, but obtaining a tissue biopsy to confirm the mutation poses risks and is often not feasible. Liquid biopsies using circulating free tumor DNA (cfDNA) have emerged as a noninvasive option to detect the mutation; however, sensitivity is low as many patients have too few detectable copies in circulation. Here, we have developed and validated a novel test that overcomes the limited abundance of the mutation by simultaneously capturing and interrogating exosomal RNA/DNA and cfDNA (exoNA) in a single step followed by a sensitive allele-specific qPCR. Experimental Design: ExoNA was extracted from the plasma of NSCLC patients with biopsy-confirmed T790M-positive (N = 102) and T790M-negative (N = 108) samples. The T790M mutation status was determined using an analytically validated allele-specific qPCR assay in a Clinical Laboratory Improvement Amendment laboratory. Results: Detection of the T790M mutation on exoNA achieved 92% sensitivity and 89% specificity using tumor biopsy results as gold standard. We also obtained high sensitivity (88%) in patients with intrathoracic disease (M0/M1a), for whom detection by liquid biopsy has been particularly challenging. Conclusions: The combination of exoRNA/DNA and cfDNA for T790M detection has higher sensitivity and specificity compared with historical cohorts using cfDNA alone. This could further help avoid unnecessary tumor biopsies for T790M mutation testing. Clin Cancer Res; 24(12); 2944-50. ©2018 AACR.


Subject(s)
Carcinoma, Non-Small-Cell Lung/genetics; Carcinoma, Non-Small-Cell Lung/metabolism; Exosomes/metabolism; Lung Neoplasms/genetics; Lung Neoplasms/metabolism; Mutation; Alleles; Biomarkers, Tumor; Biopsy; Carcinoma, Non-Small-Cell Lung/pathology; Circulating Tumor DNA; ErbB Receptors/blood; ErbB Receptors/genetics; Exons; Humans; Lung Neoplasms/pathology; Neoplasm Staging; ROC Curve; Reproducibility of Results; Sensitivity and Specificity
20.
PLoS Genet ; 14(2): e1007155, 2018 Feb.
Article in English | MEDLINE | ID: mdl-29432421

ABSTRACT

By following the evolution of populations that are initially genetically homogeneous, much can be learned about core biological principles. For example, it allows for detailed studies of the rate of emergence of de novo mutations and their change in frequency due to drift and selection. Unfortunately, in multicellular organisms with generation times of months or years, it is difficult to set up and carry out such experiments over many generations. An alternative is provided by "natural evolution experiments" that started from colonizations or invasions of new habitats by selfing lineages. With limited or missing gene flow from other lineages, new mutations and their effects can be easily detected. North America has been colonized in historic times by the plant Arabidopsis thaliana, and although multiple intercrossing lineages are found today, many of the individuals belong to a single lineage, HPG1. To determine in this lineage the rate of substitutions (the subset of mutations that survived natural selection and drift), we have sequenced genomes from plants collected between 1863 and 2006. We identified 73 modern and 27 herbarium specimens that belonged to HPG1. Using the estimated substitution rate, we infer that the last common HPG1 ancestor lived in the early 17th century, when it was most likely introduced by chance from Europe. Mutations in coding regions are depleted in frequency compared to those in other portions of the genome, consistent with purifying selection. Nevertheless, a handful of mutations is found at high frequency in present-day populations. We link these to detectable phenotypic variance in traits of known ecological importance, life history and growth, which could reflect their adaptive value. Our work showcases how, by applying genomics methods to a combination of modern and historic samples from colonizing lineages, we can directly study new mutations and their potential evolutionary relevance.


Subject(s)
Genome, Plant; Mutation Rate; Mutation/physiology; Plant Development/genetics; Arabidopsis/genetics; Arabidopsis/growth & development; Crosses, Genetic; Directed Molecular Evolution; Evolution, Molecular; Gene Flow/physiology; Introduced Species; Phenotype; Phylogeny; Plant Weeds/genetics; Plant Weeds/growth & development; Selection, Genetic; Sequence Analysis, DNA
...