Results 1 - 4 of 4
1.
Nature ; 607(7920): 732-740, 2022 07.
Article in English | MEDLINE | ID: mdl-35859178

ABSTRACT

Detailed knowledge of how diversity in the sequence of the human genome affects phenotypic diversity depends on a comprehensive and reliable characterization of both sequences and phenotypic variation. Over the past decade, insights into this relationship have been obtained from whole-exome sequencing or whole-genome sequencing of large cohorts with rich phenotypic data [1,2]. Here we describe the analysis of whole-genome sequencing of 150,119 individuals from the UK Biobank [3]. This constitutes a set of high-quality variants, including 585,040,410 single-nucleotide polymorphisms, representing 7.0% of all possible human single-nucleotide polymorphisms, and 58,707,036 indels. This large set of variants allows us to characterize selection based on sequence variation within a population through a depletion rank score of windows along the genome. Depletion rank analysis shows that coding exons represent a small fraction of regions in the genome subject to strong sequence conservation. We define three cohorts within the UK Biobank: a large British Irish cohort, a smaller African cohort and a South Asian cohort. A haplotype reference panel is provided that allows reliable imputation of most variants carried by three or more sequenced individuals. We identified 895,055 structural variants and 2,536,688 microsatellites, groups of variants typically excluded from large-scale whole-genome sequencing studies. Using this formidable new resource, we provide several examples of trait associations for rare variants with large effects not found previously through studies based on whole-exome sequencing and/or imputation.
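As a rough sanity check on the "7.0% of all possible human single-nucleotide polymorphisms" figure, the arithmetic can be sketched as follows. The abstract does not say how the denominator was defined; the sketch assumes that each genomic site admits three alternate alleles, and the variable names are illustrative only.

```python
# Figures quoted in the abstract.
observed_snps = 585_040_410
fraction_of_possible = 0.070  # "7.0%", rounded in the source

# Total number of possible SNPs implied by those two figures.
implied_possible = observed_snps / fraction_of_possible

# ASSUMPTION (not stated in the abstract): every site has 3 possible
# alternate alleles, so the number of sites is one third of the total.
implied_sites = implied_possible / 3

print(f"implied possible SNPs: {implied_possible:.3e}")
print(f"implied callable sites: {implied_sites:.3e}")
```

Under that assumption the implied totals are on the order of 8.4 billion possible SNPs across about 2.8 billion sites, which is in the range of the accessible human reference genome.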


Subject(s)
Biological Specimen Banks , Genetic Databases , Genetic Variation , Human Genome , Genomics , Whole Genome Sequencing , Africa/ethnology , Asia/ethnology , Cohort Studies , Conserved Sequence , Exons/genetics , Human Genome/genetics , Haplotypes/genetics , Humans , INDEL Mutation , Ireland/ethnology , Microsatellite Repeats , Single Nucleotide Polymorphism/genetics , United Kingdom
2.
Bioinformatics ; 40(3)2024 Mar 04.
Article in English | MEDLINE | ID: mdl-38485699

ABSTRACT

MOTIVATION: Local alignments of query sequences in large databases represent a core part of metagenomic studies and facilitate homology search. Following the development of NCBI Blast, many applications aimed to provide faster and equally sensitive local alignment frameworks. Most applications focus on protein alignments, while only a few also facilitate DNA-based searches. None of the established programs allow searching DNA sequences from bisulfite sequencing experiments commonly used for DNA methylation profiling, for which specific alignment strategies need to be implemented. RESULTS: Here, we introduce Lambda3, a new version of the local alignment application Lambda. Lambda3 is the first solution that enables the search of protein, nucleotide as well as bisulfite-converted nucleotide query sequences. Its protein mode achieves comparable performance to that of the highly optimized protein alignment application Diamond, while the nucleotide mode consistently outperforms established local nucleotide aligners. Combined, Lambda3 presents a universal local alignment framework that enables fast and sensitive homology searches for a wide range of use-cases. AVAILABILITY AND IMPLEMENTATION: Lambda3 is free and open-source software publicly available at https://github.com/seqan/lambda/.
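To illustrate why bisulfite-converted reads need a specific alignment strategy, here is a minimal sketch of the widely used reduced-alphabet idea: unmethylated cytosines read as thymine after bisulfite treatment, so read and reference are both compared after an in-silico C-to-T conversion. The abstract does not describe how Lambda3 handles this internally, so this is not its method; the function names and sequences are hypothetical, and only the forward strand is considered.

```python
def bisulfite_convert(seq: str) -> str:
    """Simulate full bisulfite conversion of the forward strand (every C becomes T)."""
    return seq.upper().replace("C", "T")


def reduced_match(read: str, ref: str) -> bool:
    """Compare a read and an equal-length reference window in the C->T reduced alphabet."""
    return bisulfite_convert(read) == bisulfite_convert(ref)


# Hypothetical example: the Cs at positions 1 and 4 were unmethylated and
# converted to T; the C at position 5 was methylated and survived.
reference = "ACGTCCGA"
read = "ATGTTCGA"

print(read == reference)               # exact comparison fails
print(reduced_match(read, reference))  # comparison in the reduced alphabet succeeds
```

A direct comparison treats the converted positions as mismatches, while the reduced alphabet hides them; the methylation state can then be recovered afterwards by comparing the original read against the original reference at each C.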


Subject(s)
Algorithms , Software , Sulfites , Sequence Alignment , Proteins
3.
Bioinformatics ; 30(17): i349-55, 2014 Sep 01.
Article in English | MEDLINE | ID: mdl-25161219

ABSTRACT

MOTIVATION: Next-generation sequencing technologies produce unprecedented amounts of data, leading to completely new research fields. One of these is metagenomics, the study of large-size DNA samples containing a multitude of diverse organisms. A key problem in metagenomics is to functionally and taxonomically classify the sequenced DNA, to which end the well-known BLAST program is usually used. But BLAST has dramatic resource requirements at metagenomic scales of data, imposing a high financial or technical burden on the researcher. Multiple attempts have been made to overcome these limitations and present a viable alternative to BLAST. RESULTS: In this work we present Lambda, our own alternative for BLAST in the context of sequence classification. In our tests, Lambda often outperforms the best tools at reproducing BLAST's results and is the fastest compared with the current state of the art at comparable levels of sensitivity. AVAILABILITY AND IMPLEMENTATION: Lambda was implemented in the SeqAn open-source C++ library for sequence analysis and is publicly available for download at http://www.seqan.de/projects/lambda. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , Metagenomics/methods , Sequence Alignment/methods , DNA Sequence Analysis/methods , Algorithms , Software
4.
J Biotechnol ; 261: 157-168, 2017 Nov 10.
Article in English | MEDLINE | ID: mdl-28888961

ABSTRACT

BACKGROUND: The use of novel algorithmic techniques is pivotal to many important problems in life science. For example, the sequencing of the human genome (Venter et al., 2001) would not have been possible without advanced assembly algorithms, and the development of practical BWT-based read mappers has been instrumental for NGS analysis. However, owing to the high speed of technological progress and the urgent need for bioinformatics tools, there was a widening gap between state-of-the-art algorithmic techniques and the actual algorithmic components of tools that are in widespread use. We previously addressed this by introducing the SeqAn library of efficient data types and algorithms in 2008 (Döring et al., 2008). RESULTS: The SeqAn library has matured considerably since its first publication 9 years ago. In this article we review its status as an established resource for programmers in the field of sequence analysis and its contributions to many analysis tools. CONCLUSIONS: We anticipate that SeqAn will continue to be a valuable resource, especially since it started to actively support various hardware acceleration techniques in a systematic manner.


Subject(s)
Genetic Databases , Genomics/methods , DNA Sequence Analysis/methods , Software , Algorithms , Human Genome , High-Throughput Nucleotide Sequencing , Humans , Sequence Alignment