1.
Nature ; 538(7624): 207-214, 2016 Oct 13.
Article in English | MEDLINE | ID: mdl-27654914

ABSTRACT

The population history of Aboriginal Australians remains largely uncharacterized. Here we generate high-coverage genomes for 83 Aboriginal Australians (speakers of Pama-Nyungan languages) and 25 Papuans from the New Guinea Highlands. We find that Papuan and Aboriginal Australian ancestors diversified 25-40 thousand years ago (kya), suggesting pre-Holocene population structure in the ancient continent of Sahul (Australia, New Guinea and Tasmania). However, all of the studied Aboriginal Australians descend from a single founding population that differentiated ~10-32 kya. We infer a population expansion in northeast Australia during the Holocene epoch (past 10,000 years) associated with limited gene flow from this region to the rest of Australia, consistent with the spread of the Pama-Nyungan languages. We estimate that Aboriginal Australians and Papuans diverged from Eurasians 51-72 kya, following a single out-of-Africa dispersal, and subsequently admixed with archaic populations. Finally, we report evidence of selection in Aboriginal Australians potentially associated with living in the desert.


Subjects
Genome, Human/genetics; Genomics; Native Hawaiian or Other Pacific Islander/genetics; Phylogeny; Racial Groups/genetics; Africa/ethnology; Australia; Datasets as Topic; Desert Climate; Gene Flow; Genetics, Population; History, Ancient; Human Migration/history; Humans; Language; New Guinea; Population Dynamics; Tasmania
2.
Proc Natl Acad Sci U S A ; 114(32): E6498-E6506, 2017 Aug 08.
Article in English | MEDLINE | ID: mdl-28716916

ABSTRACT

Although situated ∼400 km from the east coast of Africa, Madagascar exhibits cultural, linguistic, and genetic traits from both Southeast Asia and Eastern Africa. The settlement history remains contentious; we therefore used a grid-based approach to sample at high resolution the genomic diversity (including maternal lineages, paternal lineages, and genome-wide data) across 257 villages and 2,704 Malagasy individuals. We find a common Bantu and Austronesian descent for all Malagasy individuals with a limited paternal contribution from Europe and the Middle East. Admixture and demographic growth happened recently, suggesting a rapid settlement of Madagascar during the last millennium. However, the distribution of African and Asian ancestry across the island reveals that the admixture was sex biased and happened heterogeneously across Madagascar, suggesting independent colonization of Madagascar from Africa and Asia rather than settlement by an already admixed population. In addition, there are geographic influences on the present genomic diversity, independent of the admixture, showing that a few centuries is sufficient to produce detectable genetic structure in human populations.


Subjects
Asian People/genetics; Black People/genetics; Ethnicity/genetics; Genetic Variation; Genome, Human; Genome-Wide Association Study; Aged; Female; Humans; Madagascar/ethnology; Male; Middle Aged
3.
Proc Natl Acad Sci U S A ; 112(8): 2491-6, 2015 Feb 24.
Article in English | MEDLINE | ID: mdl-25675502

ABSTRACT

Heteroplasmy in human mtDNA may play a role in cancer, other diseases, and aging, but patterns of heteroplasmy variation across different tissues have not been thoroughly investigated. Here, we analyzed complete mtDNA genome sequences at ∼3,500× average coverage from each of 12 tissues obtained at autopsy from each of 152 individuals. We identified 4,577 heteroplasmies (with an alternative allele frequency of at least 0.5%) at 393 positions across the mtDNA genome. Surprisingly, different nucleotide positions (nps) exhibit high frequencies of heteroplasmy in different tissues, and, moreover, heteroplasmy is strongly dependent on the specific consensus allele at an np. All of these tissue-related and allele-related heteroplasmies show a significant age-related accumulation, suggesting positive selection for specific alleles at specific positions in specific tissues. We also find a highly significant excess of liver-specific heteroplasmies involving nonsynonymous changes, most of which are predicted to have an impact on protein function. This apparent positive selection for reduced mitochondrial function in the liver may reflect selection to decrease damaging byproducts of liver mitochondrial metabolism (i.e., "survival of the slowest"). Overall, our results provide compelling evidence for positive selection acting on some somatic mtDNA mutations.


Subjects
Alleles; DNA, Mitochondrial/genetics; Mutation/genetics; Organ Specificity/genetics; Selection, Genetic; Adolescent; Adult; Aged; Aged, 80 and over; Base Sequence; Child; Child, Preschool; Humans; Infant; Infant, Newborn; Liver/metabolism; Middle Aged; Molecular Sequence Data; Young Adult
4.
BMC Genomics ; 17: 139, 2016 Feb 27.
Article in English | MEDLINE | ID: mdl-26920804

ABSTRACT

BACKGROUND: Minor allele detection in very high coverage sequence data (>1000X) has many applications such as detecting mtDNA heteroplasmy, somatic mutations in cancer or tumors, SNP calling in pool sequencing, etc., where reads with low frequency are not necessarily sequence error but may instead convey biological information. However, the suitability of common base quality recalibration tools for such applications has not been investigated in detail. RESULTS: We show that the widely used tool GATK BaseRecalibration has several limitations in minor allele detection. First, GATK IndelRealignment fails to work if the sequence coverage is above a certain level since it then becomes computationally infeasible. Second, the accuracy of the base quality largely depends on the database of known SNPs as the control, which limits the ability of de novo minor allele detection. Third, GATK reduces the base quality of sequence errors at the cost of reducing scores for true minor alleles. To overcome these limitations, we present a novel approach called SEGREG, which applies segmented regression to control sequences (e.g. phiX174 DNA) spiked into a sequencing run. Based on simulations SEGREG improves both the accuracy of base quality scores and the detection of minor alleles. We further investigate sequence error and recalibration parameters by applying a Logarithm Likelihood Ratio (LLR) approach to SEGREG recalibrated base quality scores for phiX174 DNA sequenced to very high coverage, and for mtDNA genome sequences previously analyzed for heteroplasmic variants. CONCLUSIONS: Our results suggest that SEGREG improves base recalibration without suffering the limitations discussed above, and the LLR approach benefits from SEGREG in identifying more true minor alleles, while avoiding false positives from sequencing error.
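As an illustration of the segmented-regression idea mentioned in the abstract above, the following minimal sketch fits two straight-line segments to a set of (reported quality, empirical quality) pairs, such as might be derived from spiked-in control reads. It is not the SEGREG implementation: the function names, the single brute-force breakpoint search, and the toy data are all assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line fit; returns (slope, intercept, sse)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_x
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse


def fit_two_segments(xs, ys, min_points=2):
    """Try every split of the x-sorted data; keep the split with the lowest total SSE."""
    pairs = sorted(zip(xs, ys))
    sx = [p[0] for p in pairs]
    sy = [p[1] for p in pairs]
    best = None
    for k in range(min_points, len(sx) - min_points + 1):
        left = fit_line(sx[:k], sy[:k])
        right = fit_line(sx[k:], sy[k:])
        total = left[2] + right[2]
        if best is None or total < best["sse"]:
            best = {
                "breakpoint": (sx[k - 1] + sx[k]) / 2,  # midpoint between the two segments
                "left": left[:2],
                "right": right[:2],
                "sse": total,
            }
    return best


def recalibrate(reported_q, model):
    """Map a reported quality score to a recalibrated one using the fitted segments."""
    side = "left" if reported_q <= model["breakpoint"] else "right"
    slope, intercept = model[side]
    return slope * reported_q + intercept


# Hypothetical control data: reported scores track empirical scores up to Q20,
# then overstate quality at higher values.
reported = [10, 15, 20, 25, 30, 35, 40]
empirical = [10, 15, 20, 22, 23, 24, 25]
model = fit_two_segments(reported, empirical)
print(model["breakpoint"], recalibrate(30, model))
```

A single breakpoint keeps the sketch short; a practical recalibration would likely need more segments and covariates beyond the reported score.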


Subjects
Alleles; Gene Frequency; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Calibration; Computer Simulation; DNA, Mitochondrial/genetics; Genome, Mitochondrial; Humans; Likelihood Functions; Polymorphism, Single Nucleotide; Viral Proteins/genetics
5.
Hum Genet ; 135(5): 541-553, 2016 May.
Article in English | MEDLINE | ID: mdl-27043341

ABSTRACT

The recent availability of large-scale sequence data for the human Y chromosome has revolutionized analyses of and insights gained from this non-recombining, paternally inherited chromosome. However, the studies to date focus on Eurasian variation, and hence the diversity of early-diverging branches found in Africa has not been adequately documented. Here, we analyze over 900 kb of Y chromosome sequence obtained from 547 individuals from southern African Khoisan- and Bantu-speaking populations, identifying 232 new sequences from basal haplogroups A and B. We identify new clades in the phylogeny, an older age for the root, and substantially older ages for some individual haplogroups. Furthermore, while haplogroup B2a is traditionally associated with the spread of Bantu speakers, we find that it probably also existed in Khoisan groups before the arrival of Bantu speakers. Finally, there is pronounced variation in branch length between major haplogroups; in particular, haplogroups associated with Bantu speakers have significantly longer branches. Technical artifacts cannot explain this branch length variation, which instead likely reflects aspects of the demographic history of Bantu speakers, such as recent population expansion and an older average paternal age. The influence of demographic factors on branch length variation has broader implications both for the human Y phylogeny and for similar analyses of other species.


Subjects
Black People/genetics; Chromosomes, Human, Y/genetics; Genetic Variation/genetics; Genetics, Population; Haplotypes/genetics; Africa; Humans; Phylogeny
6.
Nat Commun ; 14(1): 2734, 2023 May 12.
Article in English | MEDLINE | ID: mdl-37173341

ABSTRACT

Formalin-fixed paraffin-embedded (FFPE) tissues constitute a vast and valuable patient material bank for clinical history and follow-up data. It is still challenging to obtain single-cell/nucleus RNA (sc/snRNA) profiles from FFPE tissues. Here, we develop a droplet-based snRNA sequencing technology (snRandom-seq) for FFPE tissues by capturing full-length total RNAs with random primers. snRandom-seq shows a minor doublet rate (0.3%) and much higher RNA coverage, and detects more non-coding RNAs and nascent RNAs, compared with state-of-the-art high-throughput scRNA-seq technologies. snRandom-seq detects a median of >3000 genes per nucleus and identifies 25 typical cell types. Moreover, we apply snRandom-seq to a clinical FFPE human liver cancer specimen and reveal an interesting subpopulation of nuclei with high proliferative activity. Our method provides a powerful snRNA-seq platform for clinical FFPE specimens and promises enormous applications in biomedical research.


Subjects
Formaldehyde; Gene Expression Profiling; Humans; Gene Expression Profiling/methods; Paraffin Embedding/methods; Tissue Fixation/methods; Sequence Analysis, RNA/methods; RNA/genetics; High-Throughput Nucleotide Sequencing/methods; RNA, Small Nuclear
7.
BMC Bioinformatics ; 12: 347, 2011 Aug 18.
Article in English | MEDLINE | ID: mdl-21851598

ABSTRACT

BACKGROUND: Comparing biological time series data across different conditions, or different specimens, is a common but still challenging task. Algorithms aligning two time series represent a valuable tool for such comparisons. While many powerful computation tools for time series alignment have been developed, they do not provide significance estimates for time shift measurements. RESULTS: Here, we present an extended version of the original DTW algorithm that allows us to determine the significance of time shift estimates in time series alignments, the DTW-Significance (DTW-S) algorithm. The DTW-S combines important properties of the original algorithm and other published time series alignment tools: DTW-S calculates the optimal alignment for each time point of each gene, it uses interpolated time points for time shift estimation, and it does not require alignment of the time-series end points. As a new feature, we implement a simulation procedure based on parameters estimated from real time series data, on a series-by-series basis, allowing us to determine the false positive rate (FPR) and the significance of the estimated time shift values. We assess the performance of our method using simulation data and real expression time series from two published primate brain expression datasets. Our results show that this method can provide accurate and robust time shift estimates for each time point on a gene-by-gene basis. Using these estimates, we are able to uncover novel features of the biological processes underlying human brain development and maturation. CONCLUSIONS: The DTW-S provides a convenient tool for calculating accurate and robust time shift estimates at each time point for each gene, based on time series data. The estimates can be used to uncover novel biological features of the system being studied. The DTW-S is freely available as an R package TimeShift at http://www.picb.ac.cn/Comparative/data.html.
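For readers unfamiliar with the original DTW algorithm that the abstract above extends, the following minimal sketch shows textbook dynamic time warping between two numeric series. It does not implement DTW-S: the significance simulation, the interpolated time points, and the free end points described in the abstract are all omitted, and the function name and toy series are assumptions.

```python
def dtw(a, b):
    """Classic dynamic time warping with absolute-difference cost.

    Returns (total_cost, path), where path is a list of (i, j) index pairs
    aligning a[i] with b[j], ordered from the start of both series to the end.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Backtrack from the end, always stepping to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path


# Toy example: the second series is the first one delayed by one time step.
series_a = [0, 1, 2, 3, 2, 1, 0]
series_b = [0, 0, 1, 2, 3, 2, 1, 0]
distance, path = dtw(series_a, series_b)
shifts = [j - i for i, j in path]  # per-alignment index offset of b relative to a
print(distance, shifts)
```

The per-pair offsets in `shifts` give a crude time-shift estimate, but say nothing about whether the shift is statistically significant, which is the gap the abstract describes.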


Subjects
Algorithms; Computer Simulation; Gene Expression Profiling; Prefrontal Cortex/growth & development; Prefrontal Cortex/metabolism; Adolescent; Adult; Aged; Aged, 80 and over; Animals; Cerebellum/growth & development; Cerebellum/metabolism; Child; Child, Preschool; Gene Expression Regulation, Developmental; Humans; Infant; Macaca mulatta/genetics; Macaca mulatta/metabolism; Middle Aged; Pan troglodytes/embryology; Pan troglodytes/metabolism; Primates; Time; Young Adult
8.
J Comput Biol ; 19(6): 766-75, 2012 Jun.
Article in English | MEDLINE | ID: mdl-22697246

ABSTRACT

Bioinformatics analyses frequently yield results in the form of lists of genes sorted by, for example, sequence similarity to a query sequence or degree of differential expression of a gene upon a change of cellular condition. Comparison of such results may depend strongly on the particular scoring system throughout the entire list, although the crucial information resides in which genes are ranked at the top of the list. Here, we propose to reduce the lists to the mere ranking of the genes and to compare only the ranked lists. To this end, we introduce a measure of similarity between ranked lists. Our measure puts particular emphasis on finding the same items near the top of the list, while the genes further down should not have a strong influence. Our approach can be understood as a special version of a two-dimensional Kolmogorov-Smirnov statistic. We present a dynamic programming algorithm for its computation and study the distribution of the similarity values. The performance on simulated and on real biological data is studied in comparison to other available measures. Supplementary Material is available online (www.liebertonline.com/cmb).
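To make concrete what a top-weighted comparison of ranked lists looks like, the sketch below computes average overlap, a simple and well-known measure that averages the fraction of shared items over every prefix depth. This is a swapped-in technique for illustration only, not the two-dimensional Kolmogorov-Smirnov-style statistic the abstract proposes; the function name and gene labels are assumptions, and each list is assumed to contain no duplicates.

```python
def average_overlap(list_a, list_b, depth=None):
    """Mean over d = 1..depth of |top_d(A) & top_d(B)| / d.

    Agreement near the top of the lists counts in many prefixes, so it
    influences the score more than agreement further down.
    """
    if depth is None:
        depth = min(len(list_a), len(list_b))
    if depth <= 0:
        raise ValueError("depth must be positive and both lists non-empty")

    seen_a, seen_b = set(), set()
    overlap = 0   # size of the intersection of the two current prefixes
    total = 0.0
    for d in range(1, depth + 1):
        x, y = list_a[d - 1], list_b[d - 1]
        if x == y:
            overlap += 1
        else:
            if x in seen_b:   # x already appeared higher up in list_b
                overlap += 1
            if y in seen_a:   # y already appeared higher up in list_a
                overlap += 1
        seen_a.add(x)
        seen_b.add(y)
        total += overlap / d
    return total / depth


reference = ["g1", "g2", "g3", "g4"]
swapped_bottom = ["g1", "g2", "g4", "g3"]  # disagreement at ranks 3 and 4
swapped_top = ["g2", "g1", "g3", "g4"]     # disagreement at ranks 1 and 2

print(average_overlap(reference, swapped_bottom))
print(average_overlap(reference, swapped_top))
```

Swapping the two top-ranked genes lowers the score more than swapping the two bottom-ranked genes, which is the qualitative behavior the abstract asks of a ranked-list similarity measure.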


Subjects
Algorithms; Computational Biology/statistics & numerical data; Gene Expression; Software; Computational Biology/methods; Gene Expression Profiling; Humans; Oligonucleotide Array Sequence Analysis; Statistics, Nonparametric