1.
Genome Res ; 24(3): 454-66, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24299735

ABSTRACT

Epigenetic information is available from contemporary organisms, but is difficult to track back in evolutionary time. Here, we show that genome-wide epigenetic information can be gathered directly from next-generation sequence reads of DNA isolated from ancient remains. Using the genome sequence data generated from hair shafts of a 4000-yr-old Paleo-Eskimo belonging to the Saqqaq culture, we generate the first ancient nucleosome map coupled with a genome-wide survey of cytosine methylation levels. The validity of both nucleosome map and methylation levels were confirmed by the recovery of the expected signals at promoter regions, exon/intron boundaries, and CTCF sites. The top-scoring nucleosome calls revealed distinct DNA positioning biases, attesting to nucleotide-level accuracy. The ancient methylation levels exhibited high conservation over time, clustering closely with modern hair tissues. Using ancient methylation information, we estimated the age at death of the Saqqaq individual and illustrate how epigenetic information can be used to infer ancient gene expression. Similar epigenetic signatures were found in other fossil material, such as 110,000- to 130,000-yr-old bones, supporting the contention that ancient epigenomic information can be reconstructed from a deep past. Our findings lay the foundation for extracting epigenomic information from ancient samples, allowing shifts in epialleles to be tracked through evolutionary time, as well as providing an original window into modern epigenomics.


Subjects
Cytosine/metabolism , DNA Methylation , Genome, Human , Inuit/genetics , Nucleosomes/genetics , Animals , Chromosome Mapping , Epigenesis, Genetic , Epigenomics , Evolution, Molecular , Gene Expression , Gene Expression Regulation , Humans , Phylogeny , Promoter Regions, Genetic , Sequence Analysis, DNA
2.
Nature ; 463(7282): 757-62, 2010 Feb 11.
Article in English | MEDLINE | ID: mdl-20148029

ABSTRACT

We report here the genome sequence of an ancient human. Obtained from approximately 4,000-year-old permafrost-preserved hair, the genome represents a male individual from the first known culture to settle in Greenland. Sequenced to an average depth of 20x, we recover 79% of the diploid genome, an amount close to the practical limit of current sequencing technologies. We identify 353,151 high-confidence single-nucleotide polymorphisms (SNPs), of which 6.8% have not been reported previously. We estimate raw read contamination to be no higher than 0.8%. We use functional SNP assessment to assign possible phenotypic characteristics of the individual that belonged to a culture whose location has yielded only trace human remains. We compare the high-confidence SNPs to those of contemporary populations to find the populations most closely related to the individual. This provides evidence for a migration from Siberia into the New World some 5,500 years ago, independent of that giving rise to the modern Native Americans and Inuit.


Subjects
Cryopreservation , Extinction, Biological , Genome, Human/genetics , Inuit/genetics , Emigration and Immigration/history , Genetics, Population , Genomics , Genotype , Greenland , Hair , History, Ancient , Humans , Male , Phenotype , Phylogeny , Polymorphism, Single Nucleotide/genetics , Sequence Analysis, DNA , Siberia/ethnology
3.
PLoS Comput Biol ; 10(10): e1003907, 2014 Oct.
Article in English | MEDLINE | ID: mdl-25357249

ABSTRACT

Noncoding RNAs are integral to a wide range of biological processes, including translation, gene regulation, host-pathogen interactions and environmental sensing. While genomics is now a mature field, our capacity to identify noncoding RNA elements in bacterial and archaeal genomes is hampered by the difficulty of de novo identification. The emergence of new technologies for characterizing transcriptome outputs, notably RNA-seq, are improving noncoding RNA identification and expression quantification. However, a major challenge is to robustly distinguish functional outputs from transcriptional noise. To establish whether annotation of existing transcriptome data has effectively captured all functional outputs, we analysed over 400 publicly available RNA-seq datasets spanning 37 different Archaea and Bacteria. Using comparative tools, we identify close to a thousand highly-expressed candidate noncoding RNAs. However, our analyses reveal that capacity to identify noncoding RNA outputs is strongly dependent on phylogenetic sampling. Surprisingly, and in stark contrast to protein-coding genes, the phylogenetic window for effective use of comparative methods is perversely narrow: aggregating public datasets only produced one phylogenetic cluster where these tools could be used to robustly separate unannotated noncoding RNAs from a null hypothesis of transcriptional noise. Our results show that for the full potential of transcriptomics data to be realized, a change in experimental design is paramount: effective transcriptomics requires phylogeny-aware sampling.


Subjects
Gene Expression Profiling/methods , RNA, Untranslated/classification , RNA, Untranslated/genetics , Transcriptome/genetics , Archaea/genetics , Bacteria/genetics , Cluster Analysis , Computational Biology , Databases, Genetic , Phylogeny , RNA, Archaeal/chemistry , RNA, Archaeal/classification , RNA, Archaeal/genetics , RNA, Bacterial/chemistry , RNA, Bacterial/classification , RNA, Bacterial/genetics , RNA, Untranslated/chemistry
4.
BMC Bioinformatics ; 15: 100, 2014 Apr 09.
Article in English | MEDLINE | ID: mdl-24717095

ABSTRACT

BACKGROUND: Modern DNA sequencing methods produce vast amounts of data that often requires mapping to a reference genome. Most existing programs use the number of mismatches between the read and the genome as a measure of quality. This approach is without a statistical foundation and can for some data types result in many wrongly mapped reads. Here we present a probabilistic mapping method based on position-specific scoring matrices, which can take into account not only the quality scores of the reads but also user-specified models of evolution and data-specific biases. RESULTS: We show how evolution, data-specific biases, and sequencing errors are naturally dealt with probabilistically. Our method achieves better results than Bowtie and BWA on simulated and real ancient and PAR-CLIP reads, as well as on simulated reads from the AT rich organism P. falciparum, when modeling the biases of these data. For simulated Illumina reads, the method has consistently higher sensitivity for both single-end and paired-end data. We also show that our probabilistic approach can limit the problem of random matches from short reads of contamination and that it improves the mapping of real reads from one organism (D. melanogaster) to a related genome (D. simulans). CONCLUSION: The presented work is an implementation of a novel approach to short read mapping where quality scores, prior mismatch probabilities and mapping qualities are handled in a statistically sound manner. The resulting implementation provides not only a tool for biologists working with low quality and/or biased sequencing data but also a demonstration of the feasibility of using a probability based alignment method on real and simulated data sets.
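The idea of scoring a read probabilistically rather than by counting mismatches can be illustrated with a minimal sketch. This is not the published method: the function names, the toy error model (an erroneous base is equally likely to be any of the other three), and the uniform prior over candidate locations are all assumptions made for illustration, and no position-specific scoring matrix, evolutionary model, or damage model is included.

```python
import math


def phred_to_error(q):
    """Convert a Phred quality score to an error probability.

    The value is capped at 0.75 (a uniformly random base) so that
    log(1 - e) stays finite for very low qualities.
    """
    return min(10 ** (-q / 10), 0.75)


def read_log_likelihood(read, quals, ref):
    """Log-probability of `read` given that it originated from `ref`.

    Toy model (an assumption): a base is correct with probability 1 - e
    and is each of the three other bases with probability e / 3.
    """
    if not (len(read) == len(quals) == len(ref)):
        raise ValueError("read, quals and ref must have equal length")
    total = 0.0
    for base, q, ref_base in zip(read, quals, ref):
        e = phred_to_error(q)
        total += math.log(1 - e) if base == ref_base else math.log(e / 3)
    return total


def mapping_posteriors(read, quals, candidates):
    """Posterior probability of each candidate reference window.

    Assumes a uniform prior over the candidates (hypothetical choice).
    """
    logs = [read_log_likelihood(read, quals, c) for c in candidates]
    top = max(logs)
    weights = [math.exp(value - top) for value in logs]
    norm = sum(weights)
    return [w / norm for w in weights]
```

With this kind of scoring, a mismatch at a low-quality position costs far less than one at a high-quality position, which a plain mismatch count cannot express.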


Subjects
Position-Specific Scoring Matrices , Animals , Drosophila , Evolution, Molecular , Genome , Humans , Probability , Sequence Analysis, DNA , Software
5.
BMC Genomics ; 13: 178, 2012 May 10.
Article in English | MEDLINE | ID: mdl-22574660

ABSTRACT

BACKGROUND: Next-Generation Sequencing has revolutionized our approach to ancient DNA (aDNA) research, by providing complete genomic sequences of ancient individuals and extinct species. However, the recovery of genetic material from long-dead organisms is still complicated by a number of issues, including post-mortem DNA damage and high levels of environmental contamination. Together with error profiles specific to the type of sequencing platforms used, these specificities could limit our ability to map sequencing reads against modern reference genomes and therefore limit our ability to identify endogenous ancient reads, reducing the efficiency of shotgun sequencing aDNA. RESULTS: In this study, we compare different computational methods for improving the accuracy and sensitivity of aDNA sequence identification, based on shotgun sequencing reads recovered from Pleistocene horse extracts using Illumina GAIIx and Helicos Heliscope platforms. We show that the performance of the Burrows Wheeler Aligner (BWA), that has been developed for mapping of undamaged sequencing reads using platforms with low rates of indel-types of sequencing errors, can be employed at acceptable run-times by modifying default parameters in a platform-specific manner. We also examine if trimming likely damaged positions at read ends can increase the recovery of genuine aDNA fragments and if accurate identification of human contamination can be achieved using a strategy previously suggested based on best hit filtering. We show that combining our different mapping and filtering approaches can increase the number of high-quality endogenous hits recovered by up to 33%. CONCLUSIONS: We have shown that Illumina and Helicos sequences recovered from aDNA extracts could not be aligned to modern reference genomes with the same efficiency unless mapping parameters are optimized for the specific types of errors generated by these platforms and by post-mortem DNA damage. 
Our findings have important implications for future aDNA research, as we define mapping guidelines that improve our ability to identify genuine aDNA sequences, which in turn could improve the genotyping accuracy of ancient specimens. Our framework provides a significant improvement to the standard procedures used for characterizing ancient genomes, which is challenged by contamination and often low amounts of DNA material.


Subjects
Sequence Analysis, DNA/methods , DNA Damage/genetics , Fossils , Genotype , High-Throughput Nucleotide Sequencing , Humans
6.
Nucleic Acids Res ; 37(Database issue): D136-40, 2009 Jan.
Article in English | MEDLINE | ID: mdl-18953034

ABSTRACT

Rfam is a collection of RNA sequence families, represented by multiple sequence alignments and covariance models (CMs). The primary aim of Rfam is to annotate new members of known RNA families on nucleotide sequences, particularly complete genomes, using sensitive BLAST filters in combination with CMs. A minority of families with a very broad taxonomic range (e.g. tRNA and rRNA) provide the majority of the sequence annotations, whilst the majority of Rfam families (e.g. snoRNAs and miRNAs) have a limited taxonomic range and provide a limited number of annotations. Recent improvements to the website, methodologies and data used by Rfam are discussed. Rfam is freely available on the Web at http://rfam.sanger.ac.uk/ and http://rfam.janelia.org/.


Subjects
Databases, Nucleic Acid , RNA/chemistry , RNA/classification , Computer Graphics , Internet , Sequence Alignment , Sequence Analysis, RNA
7.
Psychiatry Res ; 301: 113964, 2021 07.
Article in English | MEDLINE | ID: mdl-33975171

ABSTRACT

Paroxetine and sertraline are the only FDA approved drugs for treatment of posttraumatic stress disorder (PTSD). Although both drugs show better outcomes than placebo, not all patients benefit from treatment. We examined predictors and latent classes of SSRI treatment response in patients with PTSD. Symptom severity was measured over a 12-week period in 390 patients suffering from PTSD treated with open-label sertraline or paroxetine and a double-blinded placebo. First, growth curve modeling (GCM) was used to examine population-level predictors of treatment response. Second, growth mixture modeling (GMM) was used to group patients into latent classes based on their treatment response trajectories over time and to investigate predictors of latent class membership. Gender, childhood sexual trauma, and sexual assault as index trauma moderated the population-level treatment response using GCM. GMM identified three classes: fast responders, responders with low pretreatment symptom severity and responders with high pretreatment symptom severity. Class membership was predicted based on time since index trauma, severity of depression, and severity of anxiety. The study shows that higher severity of comorbid disorders does not result in an inferior response to treatment and suggests that patients with longer time since index trauma might particularly benefit from treatment with sertraline or paroxetine.


Subjects
Stress Disorders, Post-Traumatic , Anxiety , Anxiety Disorders , Child , Double-Blind Method , Humans , Paroxetine/therapeutic use , Selective Serotonin Reuptake Inhibitors/therapeutic use , Stress Disorders, Post-Traumatic/drug therapy
8.
Nucleic Acids Res ; 36(Web Server issue): W79-84, 2008 Jul 01.
Article in English | MEDLINE | ID: mdl-18492721

ABSTRACT

We present an easy-to-use webserver that makes it possible to simultaneously use a number of state of the art methods for performing multiple alignment and secondary structure prediction for noncoding RNA sequences. This makes it possible to use the programs without having to download the code and get the programs to run. The results of all the programs are presented on a webpage and can easily be downloaded for further analysis. Additional measures are calculated for each program to make it easier to judge the individual predictions, and a consensus prediction taking all the programs into account is also calculated. This website is free and open to all users and there is no login requirement. The webserver can be found at: http://genome.ku.dk/resources/war.


Subjects
RNA, Untranslated/chemistry , Sequence Alignment , Sequence Analysis, RNA , Software , Internet , Nucleic Acid Conformation
9.
Sci Rep ; 10(1): 6896, 2020 Apr 20.
Article in English | MEDLINE | ID: mdl-32313073

ABSTRACT

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

10.
Sci Rep ; 9(1): 2540, 2019 02 22.
Article in English | MEDLINE | ID: mdl-30796259

ABSTRACT

Environmental changes alter the diversity and structure of communities. By shifting the range of species traits that will be successful under new conditions, environmental drivers can also dramatically impact ecosystem functioning and resilience. Above and belowground communities jointly regulate whole-ecosystem processes and responses to change, yet they are frequently studied separately. To determine whether these communities respond similarly to environmental changes, we measured taxonomic and trait-based responses of plant and soil microbial communities to four years of experimental warming and nitrogen deposition in a temperate grassland. Plant diversity responded strongly to N addition, whereas soil microbial communities responded primarily to warming, likely via an associated decrease in soil moisture. These above and belowground changes were associated with selection for more resource-conservative plant and microbe growth strategies, which reduced community functional diversity. Functional characteristics of plant and soil microbial communities were weakly correlated (P = 0.07) under control conditions, but not when above or belowground communities were altered by either global change driver. These results highlight the potential for global change drivers operating simultaneously to have asynchronous impacts on above and belowground components of ecosystems. Assessment of a single ecosystem component may therefore greatly underestimate the whole-system impact of global environmental changes.

11.
Bioinformatics ; 23(24): 3304-11, 2007 Dec 15.
Article in English | MEDLINE | ID: mdl-18006551

ABSTRACT

MOTIVATION: As more non-coding RNAs are discovered, the importance of methods for RNA analysis increases. Since the structure of ncRNA is intimately tied to the function of the molecule, programs for RNA structure prediction are necessary tools in this growing field of research. Furthermore, it is known that RNA structure is often evolutionarily more conserved than sequence. However, few existing methods are capable of simultaneously considering multiple sequence alignment and structure prediction. RESULT: We present a novel solution to the problem of simultaneous structure prediction and multiple alignment of RNA sequences. Using Markov chain Monte Carlo in a simulated annealing framework, the algorithm MASTR (Multiple Alignment of STructural RNAs) iteratively improves both sequence alignment and structure prediction for a set of RNA sequences. This is done by minimizing a combined cost function that considers sequence conservation, covariation and basepairing probabilities. The results show that the method is very competitive to similar programs available today, both in terms of accuracy and computational efficiency. AVAILABILITY: Source code available from http://mastr.binf.ku.dk/
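The abstract describes minimizing a combined cost function with Markov chain Monte Carlo inside a simulated annealing framework. The sketch below shows only that generic optimization loop, with a Metropolis acceptance rule and geometric cooling. It is not MASTR: the cost function, the move set, the cooling schedule, and every name here are hypothetical stand-ins, and the toy problem is a one-dimensional integer search rather than an RNA alignment.

```python
import math
import random


def acceptance_probability(delta_cost, temperature):
    """Metropolis rule: always accept improvements, and accept a
    worsening move with probability exp(-delta / T)."""
    if delta_cost <= 0:
        return 1.0
    return math.exp(-delta_cost / temperature)


def anneal(cost, propose, start, steps=2000, t_start=5.0, t_end=0.01, seed=0):
    """Minimize `cost` by simulated annealing.

    `propose(state, rng)` returns a neighbouring state. The geometric
    cooling schedule from t_start to t_end is an illustrative choice.
    Returns the best state seen and its cost.
    """
    rng = random.Random(seed)
    state, state_cost = start, cost(start)
    best, best_cost = state, state_cost
    for i in range(steps):
        temperature = t_start * (t_end / t_start) ** (i / max(steps - 1, 1))
        candidate = propose(state, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - state_cost
        if rng.random() < acceptance_probability(delta, temperature):
            state, state_cost = candidate, candidate_cost
            if state_cost < best_cost:
                best, best_cost = state, state_cost
    return best, best_cost


def toy_cost(x):
    """Hypothetical stand-in for a combined alignment/structure cost."""
    return (x - 3) ** 2


def toy_propose(x, rng):
    """Hypothetical move: step one unit left or right."""
    return x + rng.choice((-1, 1))
```

In a real alignment setting the state would be an alignment plus a structure, and a proposal would be a small local change such as shifting a gap or adding a base pair; accepting occasional worsening moves at high temperature is what lets the search escape local minima.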


Subjects
Algorithms , RNA, Untranslated/genetics , RNA/genetics , Sequence Alignment/methods , Sequence Analysis, RNA/methods , Software , Base Sequence , Molecular Sequence Data
12.
Nat Commun ; 8: 14247, 2017 02 06.
Article in English | MEDLINE | ID: mdl-28165463

ABSTRACT

Sulfate is a well-established sulfur source for fungi; however, in soils sulfonates and sulfate esters, especially choline sulfate, are often much more prominent. Here we show that Saccharomyces cerevisiae YIL166C(SOA1) encodes an inorganic sulfur (sulfate, sulfite and thiosulfate) transporter that also catalyses sulfonate and choline sulfate uptake. Phylogenetic analysis of fungal SOA1 orthologues and expression of 20 members in the sul1Δ sul2Δ soa1Δ strain, which is deficient in inorganic and organic sulfur compound uptake, reveals that these transporters have diverse substrate preferences for sulfur compounds. We further show that SOA2, a S. cerevisiae SOA1 paralogue found in S. uvarum, S. eubayanus and S. arboricola is likely to be an evolutionary remnant of the uncharacterized open reading frames YOL163W and YOL162W. Our work highlights the importance of sulfonates and choline sulfate as sulfur sources in the natural environment of S. cerevisiae and other fungi by identifying fungal transporters for these compounds.


Subjects
Membrane Transport Proteins/metabolism , Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae/metabolism , Sulfur Compounds/pharmacokinetics , Biological Transport , Membrane Transport Proteins/chemistry , Phylogeny , Saccharomyces cerevisiae Proteins/genetics , Soil/chemistry , Substrate Specificity , Sulfur Compounds/chemistry
13.
mSystems ; 2(6)2017.
Article in English | MEDLINE | ID: mdl-29152586

ABSTRACT

Neisseria meningitidis (meningococcus) can cause meningococcal disease, a rapidly progressing and often fatal disease that can occur in previously healthy children. Meningococci are found in healthy carriers, where they reside in the nasopharynx as commensals. While carriage is relatively common, invasive disease, associated with hypervirulent strains, is a comparatively rare event. The basis of increased virulence in some strains is not well understood. New Zealand suffered a protracted meningococcal disease epidemic, from 1991 to 2008. During this time, a household carriage study was carried out in Auckland: household contacts of index meningococcal disease patients were swabbed for isolation of carriage strains. In many households, healthy carriers harbored strains identical, as determined by laboratory typing, to the ones infecting the associated patient. We carried out more-detailed analyses of carriage and disease isolates from a select number of households. We found that isolates, although indistinguishable by laboratory typing methods and likely closely related, had many differences. We identified multiple genome variants and transcriptional differences between isolates. These studies enabled the identification of two new phase-variable genes. We also found that several carriage strains had lost their type IV pili and that this loss correlated with reduced tumor necrosis factor alpha (TNF-α) expression when cultured with epithelial cells. While nonpiliated meningococcal isolates have been previously found in carriage strains, this is the first evidence of an association between type IV pili from meningococci and a proinflammatory epithelial response. We also identified potentially important metabolic differences between carriage and disease isolates, including the sulfate assimilation pathway. 
IMPORTANCE: Neisseria meningitidis causes meningococcal disease but is frequently carried in the throats of healthy individuals; the factors that determine whether invasive disease develops are not completely understood. We carried out detailed studies of isolates, collected from patients and their household contacts, to identify differences between commensal throat isolates and those that caused invasive disease. Though isolates were identical by laboratory typing methods, we uncovered many differences in their genomes, in gene expression, and in their interactions with host cells. In particular, we found that several carriage isolates had lost their type IV pili, a surprising finding since pili are often described as essential for colonization. However, loss of type IV pili correlated with reduced secretion of a proinflammatory cytokine, TNF-α, when meningococci were cocultured with human bronchial epithelial cells; hence, the loss of pili could provide an advantage to meningococci, by resulting in a dampened localized host immune response.

14.
PLoS One ; 12(2): e0169432, 2017.
Article in English | MEDLINE | ID: mdl-28146565

ABSTRACT

Planctomycetes are distinguished from other Bacteria by compartmentalization of cells via internal membranes, interpretation of which has been subject to recent debate regarding potential relations to Gram-negative cell structure. In our interpretation of the available data, the planctomycete Gemmata obscuriglobus contains a nuclear body compartment, and thus possesses a type of cell organization with parallels to the eukaryote nucleus. Here we show that pore-like structures occur in internal membranes of G. obscuriglobus and that they have elements structurally similar to eukaryote nuclear pores, including a basket, ring-spoke structure, and eight-fold rotational symmetry. Bioinformatic analysis of proteomic data reveals that some of the G. obscuriglobus proteins associated with pore-containing membranes possess structural domains found in eukaryote nuclear pore complexes. Moreover, immunogold labelling demonstrates localization of one such protein, containing a β-propeller domain, specifically to the G. obscuriglobus pore-like structures. Finding bacterial pores within internal cell membranes and with structural similarities to eukaryote nuclear pore complexes raises the dual possibilities of either hitherto undetected homology or stunning evolutionary convergence.


Subjects
Bacteria/ultrastructure , Nuclear Pore/ultrastructure , Bacteria/metabolism , Bacterial Proteins/chemistry , Bacterial Proteins/metabolism , Biological Evolution , Cell Compartmentation , Cell Wall/metabolism , Computational Biology/methods , Eukaryota/ultrastructure , Imaging, Three-Dimensional , Intracellular Membranes/ultrastructure , Models, Molecular , Planctomycetales/ultrastructure , Protein Conformation , Proteome , Proteomics
15.
BMC Res Notes ; 9: 88, 2016 Feb 12.
Article in English | MEDLINE | ID: mdl-26868221

ABSTRACT

BACKGROUND: As high-throughput sequencing platforms produce longer and longer reads, sequences generated from short inserts, such as those obtained from fossil and degraded material, are increasingly expected to contain adapter sequences. Efficient adapter trimming algorithms are also needed to process the growing amount of data generated per sequencing run. FINDINGS: We introduce AdapterRemoval v2, a major revision of AdapterRemoval v1, which introduces (i) striking improvements in throughput, through the use of single instruction, multiple data (SIMD; SSE1 and SSE2) instructions and multi-threading support, (ii) the ability to handle datasets containing reads or read-pairs with different adapters or adapter pairs, (iii) simultaneous demultiplexing and adapter trimming, (iv) the ability to reconstruct adapter sequences from paired-end reads for poorly documented data sets, and (v) native gzip and bzip2 support. CONCLUSIONS: We show that AdapterRemoval v2 compares favorably with existing tools, while offering superior throughput to most alternatives examined here, both for single and multi-threaded operations.


Subjects
Algorithms , High-Throughput Nucleotide Sequencing/methods , Base Sequence
16.
Sci Rep ; 6: 19233, 2016 Jan 18.
Article in English | MEDLINE | ID: mdl-26778510

ABSTRACT

Metagenome studies are becoming increasingly widespread, yielding important insights into microbial communities covering diverse environments from terrestrial and aquatic ecosystems to human skin and gut. With the advent of high-throughput sequencing platforms, the use of large scale shotgun sequencing approaches is now commonplace. However, a thorough independent benchmark comparing state-of-the-art metagenome analysis tools is lacking. Here, we present a benchmark where the most widely used tools are tested on complex, realistic data sets. Our results clearly show that the most widely used tools are not necessarily the most accurate, that the most accurate tool is not necessarily the most time consuming, and that there is a high degree of variability between available tools. These findings are important as the conclusions of any metagenomics study are affected by errors in the predicted community composition and functional capacity. Data sets and results are freely available from http://www.ucbioinformatics.org/metabenchmark.html.


Subjects
Gastrointestinal Microbiome/genetics , Metagenome/genetics , Metagenomics/methods , Skin/microbiology , Software , Computational Biology/methods , Ecosystem , High-Throughput Nucleotide Sequencing , Humans
17.
BMC Res Notes ; 7: 698, 2014 Oct 07.
Article in English | MEDLINE | ID: mdl-25294605

ABSTRACT

BACKGROUND: As the use of next-generation sequencing technologies is becoming more widespread, the need for robust software to help with the analysis is growing as well. A key challenge when analyzing sequencing data is the prediction of genotypes from the reads, i.e. correct inference of the underlying DNA sequences that gave rise to the sequenced fragments. For diploid organisms, the genotyper should be able to predict both alleles in the individual. Variations between the individual and the population can then be analyzed by looking for SNPs (single nucleotide polymorphisms) in order to investigate diseases or phenotypic features. To perform robust and high confidence genotyping and SNP calling, methods are needed that take the technology specific limitations into account and can model different sources of error. As an example, ancient DNA poses special challenges as the data is often shallow and subject to errors induced by post mortem damage. FINDINGS: We present a novel approach to the genotyping problem where a probabilistic framework describing the process from sampling to sequencing is implemented as a graphical model. This makes it possible to model technology specific errors and other sources of variation that can affect the result. The inferred genotype is given a posterior probability to signify the confidence in the result. SNPest has already been used to genotype large scale projects such as the first ancient human genome published in 2010. CONCLUSIONS: We compare the performance of SNPest to a number of other widely used genotypers on both real and simulated data, covering both haploid and diploid genomes. We investigate the effects of read depth, of removing adapters before mapping and genotyping, of using different mapping tools, and of using the correct model in the genotyping process. We show that the performance of SNPest is comparable to existing methods, and we also illustrate cases where SNPest has an advantage over other methods, e.g. when dealing with simulated ancient DNA.
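The notion of attaching a posterior probability to an inferred diploid genotype can be shown with a small Bayesian sketch. This is not SNPest and not its graphical model: the function names, the uniform prior over the ten unordered genotypes, and the symmetric per-base error model are assumptions for illustration, and post-mortem damage is not modelled.

```python
import math
from itertools import combinations_with_replacement

BASES = "ACGT"


def base_likelihood(observed, allele, error):
    """P(observed base | true allele) under a symmetric error model
    (an assumption): correct with 1 - error, else error / 3 each."""
    return 1 - error if observed == allele else error / 3


def genotype_posteriors(bases, errors, prior=None):
    """Posterior over unordered diploid genotypes at one site.

    `bases` holds the observed bases covering the site and `errors`
    the matching per-base error probabilities, each strictly between
    0 and 1. Each read is assumed to sample either allele with
    probability 0.5. A uniform prior is used unless one is supplied.
    """
    if len(bases) != len(errors):
        raise ValueError("bases and errors must have equal length")
    if any(not 0 < e < 1 for e in errors):
        raise ValueError("error probabilities must lie strictly in (0, 1)")
    genotypes = list(combinations_with_replacement(BASES, 2))
    if prior is None:
        prior = {g: 1 / len(genotypes) for g in genotypes}
    log_post = {}
    for g in genotypes:
        total = math.log(prior[g])
        for b, e in zip(bases, errors):
            total += math.log(
                0.5 * base_likelihood(b, g[0], e)
                + 0.5 * base_likelihood(b, g[1], e)
            )
        log_post[g] = total
    top = max(log_post.values())
    unnorm = {g: math.exp(v - top) for g, v in log_post.items()}
    norm = sum(unnorm.values())
    return {g: p / norm for g, p in unnorm.items()}
```

A damage-aware variant could, for example, raise the error probability for C-to-T observations near read ends; that extension is left out here because its parameters would have to be invented.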


Subjects
Escherichia coli/genetics , Genome, Bacterial , Genome, Human , High-Throughput Nucleotide Sequencing , Models, Statistical , Polymorphism, Single Nucleotide , Computer Simulation , Diploidy , Gene Frequency , Genetic Association Studies , Genotype , Haploidy , Humans , Phenotype , Probability , Reproducibility of Results , Software Design
18.
BMC Res Notes ; 5: 337, 2012 Jul 02.
Article in English | MEDLINE | ID: mdl-22748135

ABSTRACT

BACKGROUND: With the advent of next-generation sequencing there is an increased demand for tools to pre-process and handle the vast amounts of data generated. One recurring problem is adapter contamination in the reads, i.e. the partial or complete sequencing of adapter sequences. These adapter sequences have to be removed as they can hinder correct mapping of the reads and influence SNP calling and other downstream analyses. FINDINGS: We present a tool called AdapterRemoval which is able to pre-process both single and paired-end data. The program locates and removes adapter residues from the reads, it is able to combine paired reads if they overlap, and it can optionally trim low-quality nucleotides. Furthermore, it can look for adapter sequence in both the 5' and 3' ends of the reads. This is a flexible tool that can be tuned to accommodate different experimental settings and sequencing platforms producing FASTQ files. AdapterRemoval is shown to be good at trimming adapters from both single-end and paired-end data. CONCLUSIONS: AdapterRemoval is a comprehensive tool for analyzing next-generation sequencing data. It exhibits good performance both in terms of sensitivity and specificity. AdapterRemoval has already been used in various large projects and it is possible to extend it further to accommodate application-specific biases in the data.
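Locating and removing adapter residue at the 3' end of a read can be sketched as a search for the leftmost position where the rest of the read matches the start of the adapter. This is a simplified illustration and not the AdapterRemoval algorithm: the function names, the mismatch-rate threshold, and the minimum overlap are hypothetical parameters, and quality-aware scoring, 5' adapters, and paired-end merging are all omitted.

```python
def find_adapter_start(read, adapter, max_mismatch_rate=0.1, min_overlap=3):
    """Return the index in `read` where the adapter begins.

    Scans left to right and compares the read suffix starting at each
    position with the adapter prefix. A position is accepted when the
    number of mismatches is at most max_mismatch_rate * overlap.
    Returns len(read) when no adapter is found.
    """
    for start in range(len(read) - min_overlap + 1):
        overlap = min(len(read) - start, len(adapter))
        window = read[start:start + overlap]
        mismatches = sum(1 for r, a in zip(window, adapter) if r != a)
        if mismatches <= max_mismatch_rate * overlap:
            return start
    return len(read)


def trim_adapter(read, quals, adapter, **kwargs):
    """Trim the adapter and the matching quality values from a read."""
    cut = find_adapter_start(read, adapter, **kwargs)
    return read[:cut], quals[:cut]
```

For instance, calling `trim_adapter` on a read made of an insert followed by the first bases of the adapter returns only the insert; raising `min_overlap` trades sensitivity for fewer chance matches at the very end of the read.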


Subjects
Sequence Analysis, DNA/instrumentation , Algorithms , Polymorphism, Single Nucleotide
19.
Science ; 334(6052): 94-8, 2011 Oct 07.
Article in English | MEDLINE | ID: mdl-21940856

ABSTRACT

We present an Aboriginal Australian genomic sequence obtained from a 100-year-old lock of hair donated by an Aboriginal man from southern Western Australia in the early 20th century. We detect no evidence of European admixture and estimate contamination levels to be below 0.5%. We show that Aboriginal Australians are descendants of an early human dispersal into eastern Asia, possibly 62,000 to 75,000 years ago. This dispersal is separate from the one that gave rise to modern Asians 25,000 to 38,000 years ago. We also find evidence of gene flow between populations of the two dispersal waves prior to the divergence of Native Americans from modern Asian ancestors. Our findings support the hypothesis that present-day Aboriginal Australians descend from the earliest humans to occupy Australia, likely representing one of the oldest continuous populations outside Africa.


Subjects
Genome, Human , Native Hawaiian or Other Pacific Islander/genetics , Animals , Asia , Asian People/genetics , Black People , Computer Simulation , DNA, Mitochondrial/genetics , Emigration and Immigration , Ethnicity/genetics , Asia, Eastern , Gene Flow , Gene Frequency , Genetics, Population/methods , Genome, Mitochondrial , Haplotypes , Hominidae/genetics , Humans , Linkage Disequilibrium , Male , Phylogeny , Polymorphism, Single Nucleotide , Sequence Analysis, DNA , Western Australia , White People/genetics