Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 4 de 4
Filtrar
1.
Bioinformatics ; 37(8): 1125-1134, 2021 05 23.
Artigo em Inglês | MEDLINE | ID: mdl-33135051

RESUMO

MOTIVATION: Expression quantitative trait loci (eQTL) harbor genetic variants modulating gene transcription. Fine mapping of regulatory variants at these loci is a daunting task due to the juxtaposition of causal and linked variants at a locus as well as the likelihood of interactions among multiple variants. This problem is exacerbated in genes with multiple cis-acting eQTL, where superimposed effects of adjacent loci further distort the association signals. RESULTS: We developed a novel algorithm, TreeMap, that identifies putative causal variants in cis-eQTL accounting for multisite effects and genetic linkage at a locus. Guided by the hierarchical structure of linkage disequilibrium, TreeMap performs an organized search for individual and multiple causal variants. Via extensive simulations, we show that TreeMap detects co-regulating variants more accurately than current methods. Furthermore, its high computational efficiency enables genome-wide analysis of long-range eQTL. We applied TreeMap to GTEx data of brain hippocampus samples and transverse colon samples to search for eQTL in gene bodies and in 4 Mbps gene-flanking regions, discovering numerous distal eQTL. Furthermore, we found concordant distal eQTL that were present in both brain and colon samples, implying long-range regulation of gene expression. AVAILABILITY AND IMPLEMENTATION: TreeMap is available as an R package enabled for parallel processing at https://github.com/liliulab/treemap. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Assuntos
Estudo de Associação Genômica Ampla , Locos de Características Quantitativas , Mapeamento Cromossômico , Colo , Expressão Gênica , Hipocampo , Humanos , Desequilíbrio de Ligação , Polimorfismo de Nucleotídeo Único , Locos de Características Quantitativas/genética
2.
Mol Biol Evol ; 35(8): 2015-2025, 2018 08 01.
Artigo em Inglês | MEDLINE | ID: mdl-29846678

RESUMO

The human genome contains hundreds of thousands of missense mutations. However, only a handful of these variants are known to be adaptive, which implies that adaptation through protein sequence change is an extremely rare phenomenon in human evolution. Alternatively, existing methods may lack the power to pinpoint adaptive variation. We have developed and applied an Evolutionary Probability Approach (EPA) to discover candidate adaptive polymorphisms (CAPs) through the discordance between allelic evolutionary probabilities and their observed frequencies in human populations. EPA reveals thousands of missense CAPs, which suggest that a large number of previously optimal alleles experienced a reversal of fortune in the human lineage. We explored nonadaptive mechanisms to explain CAPs, including the effects of demography, mutation rate variability, and negative and positive selective pressures in modern humans. Many nonadaptive hypotheses were tested, but failed to explain the data, which suggests that a large proportion of CAP alleles have increased in frequency due to beneficial selection. This suggestion is supported by the fact that a vast majority of adaptive missense variants discovered previously in humans are CAPs, and hundreds of CAP alleles are protective in genotype-phenotype association data. Our integrated phylogenomic and population genetic EPA approach predicts the existence of thousands of nonneutral candidate variants in the human proteome. We expect this collection to be enriched in beneficial variation. The EPA approach can be applied to discover candidate adaptive variation in any protein, population, or species for which allele frequency data and reliable multispecies alignments are available.


Assuntos
Adaptação Biológica , Evolução Biológica , Exoma , Genoma Humano , Polimorfismo Genético , Conversão Gênica , Humanos , Filogenia
3.
Front Bioinform ; 3: 1090730, 2023.
Artigo em Inglês | MEDLINE | ID: mdl-37261293

RESUMO

Bulk sequencing is commonly used to characterize the genetic diversity of cancer cell populations in tumors and the evolutionary relationships of cancer clones. However, bulk sequencing produces aggregate information on nucleotide variants and their sample frequencies, necessitating computational methods to predict distinct clone sequences and their frequencies within a sample. Interestingly, no methods are available to measure the statistical confidence in the variants assigned to inferred clones. We introduce a bootstrap resampling approach that combines clone prediction and statistical confidence calculation for every variant assignment. Analysis of computer-simulated datasets showed the bootstrap approach to work well in assessing the reliability of predicted clones as well downstream inferences using the predicted clones (e.g., mapping metastatic migration paths). We found that only a fraction of inferences have good bootstrap support, which means that many inferences are tentative for real data. Using the bootstrap approach, we analyzed empirical datasets from metastatic cancers and placed bootstrap confidence on the estimated number of mutations involved in cell migration events. We found that the numbers of driver mutations involved in metastatic cell migration events sourced from primary tumors are similar to those where metastatic tumors are the source of new metastases. So, mutations with driver potential seem to keep arising during metastasis. The bootstrap approach developed in this study is implemented in software available at https://github.com/SayakaMiura/CloneFinderPlus.

4.
Nat Commun ; 10(1): 330, 2019 01 18.
Artigo em Inglês | MEDLINE | ID: mdl-30659175

RESUMO

Computational prediction of the phenotypic propensities of noncoding single nucleotide variants typically combines annotation of genomic, functional and evolutionary attributes into a single score. Here, we evaluate if the claimed excellent accuracies of these predictions translate into high rates of success in addressing questions important in biological research, such as fine mapping causal variants, distinguishing pathogenic allele(s) at a given position, and prioritizing variants for genetic risk assessment. A significant disconnect is found to exist between the statistical modelling and biological performance of predictive approaches. We discuss fundamental reasons underlying these deficiencies and suggest that future improvements of computational predictions need to address confounding of allelic, positional and regional effects as well as imbalance of the proportion of true positive variants in candidate lists.


Assuntos
Doença/genética , Modelos Estatísticos , RNA não Traduzido/genética , Algoritmos , Animais , Biologia Computacional , Evolução Molecular , Estudo de Associação Genômica Ampla , Humanos , Aprendizado de Máquina , Mamíferos/genética , Polimorfismo de Nucleotídeo Único
SELEÇÃO DE REFERÊNCIAS
DETALHE DA PESQUISA