Search | VHL Regional Portal

1.

SweGen: a whole-genome data resource of genetic variability in a cross-section of the Swedish population.

Ameur, Adam; Dahlberg, Johan; Olason, Pall; Vezzi, Francesco; Karlsson, Robert; Martin, Marcel; Viklund, Johan; Kähäri, Andreas Kusalananda; Lundin, Pär; Che, Huiwen; Thutkawkorapin, Jessada; Eisfeldt, Jesper; Lampa, Samuel; Dahlberg, Mats; Hagberg, Jonas; Jareborg, Niclas; Liljedahl, Ulrika; Jonasson, Inger; Johansson, Åsa; Feuk, Lars; Lundeberg, Joakim; Syvänen, Ann-Christine; Lundin, Sverker; Nilsson, Daniel; Nystedt, Björn; Magnusson, Patrik KE; Gyllensten, Ulf.

Eur J Hum Genet ; 25(11): 1253-1260, 2017 Nov.

Article in English | MEDLINE | ID: mdl-28832569

ABSTRACT

Here we describe the SweGen data set, a comprehensive map of genetic variation in the Swedish population. These data represent a basic resource for clinical genetics laboratories as well as for sequencing-based association studies by providing information on genetic variant frequencies in a cohort that is well matched to national patient cohorts. To select samples for this study, we first examined the genetic structure of the Swedish population using high-density SNP-array data from a nation-wide cohort of over 10 000 Swedish-born individuals included in the Swedish Twin Registry. A total of 1000 individuals, reflecting a cross-section of the population and capturing the main genetic structure, were selected for whole-genome sequencing. Analysis pipelines were developed for automated alignment, variant calling and quality control of the sequencing data. This resulted in a genome-wide collection of aggregated variant frequencies in the Swedish population that we have made available to the scientific community through the website https://swefreq.nbis.se. A total of 29.2 million single-nucleotide variants and 3.8 million indels were detected in the 1000 samples, with 9.9 million of these variants not present in current databases. Each sample contributed an average of 7199 individual-specific variants. In addition, an average of 8645 larger structural variants (SVs) were detected per individual, and we demonstrate that the population frequencies of these SVs can be used for efficient filtering analyses. Finally, our results show that the genetic diversity within Sweden is substantial compared with the diversity among continental European populations, underscoring the relevance of establishing a local reference data set.

Subject(s)

Genome, Human; Polymorphism, Single Nucleotide; Registries; Datasets as Topic; Genome-Wide Association Study; Humans; Sweden; Twins/genetics

2.

MultiQC: summarize analysis results for multiple tools and samples in a single report.

Ewels, Philip; Magnusson, Måns; Lundin, Sverker; Käller, Max.

Bioinformatics ; 32(19): 3047-8, 2016 Oct 01.

Article in English | MEDLINE | ID: mdl-27312411

ABSTRACT

MOTIVATION: Fast and accurate quality control is essential for studies involving next-generation sequencing data. Whilst numerous tools exist to quantify QC metrics, there is no common approach to flexibly integrate these across tools and large sample sets. Assessing analysis results across an entire project can be time consuming and error prone; batch effects and outlier samples can easily be missed in the early stages of analysis. RESULTS: We present MultiQC, a tool to create a single report visualising output from multiple tools across many samples, enabling global trends and biases to be quickly identified. MultiQC can plot data from many common bioinformatics tools and is built to allow easy extension and customization. AVAILABILITY AND IMPLEMENTATION: MultiQC is available with a GNU GPLv3 license on GitHub, the Python Package Index and Bioconda. Documentation and example reports are available at http://multiqc.info. CONTACT: phil.ewels@scilifelab.se.

Subject(s)

High-Throughput Nucleotide Sequencing; Quality Control; Computational Biology; Software

3.

T-cell receptor-HLA-DRB1 associations suggest specific antigens in pulmonary sarcoidosis.

Grunewald, Johan; Kaiser, Ylva; Ostadkarampour, Mahyar; Rivera, Natalia V; Vezzi, Francesco; Lötstedt, Britta; Olsen, Remi-André; Sylwan, Lina; Lundin, Sverker; Käller, Max; Sandalova, Tatiana; Ahlgren, Kerstin M; Wahlström, Jan; Achour, Adnane; Ronninger, Marcus; Eklund, Anders.

Eur Respir J ; 47(3): 898-909, 2016 Mar.

Article in English | MEDLINE | ID: mdl-26585430

ABSTRACT

In pulmonary sarcoidosis, CD4+ T-cells expressing T-cell receptor Vα2.3 accumulate in the lungs of HLA-DRB1*03+ patients. To investigate T-cell receptor-HLA-DRB1*03 interactions underlying recognition of hitherto unknown antigens, we performed detailed analyses of T-cell receptor expression on bronchoalveolar lavage fluid CD4+ T-cells from sarcoidosis patients. Pulmonary sarcoidosis patients (n=43) underwent bronchoscopy with bronchoalveolar lavage. T-cell receptor α and β chains of CD4+ T-cells were analysed by flow cytometry, DNA-sequenced, and three-dimensional molecular models of T-cell receptor-HLA-DRB1*03 complexes generated. Simultaneous expression of Vα2.3 with the Vβ22 chain was identified in the lungs of all HLA-DRB1*03+ patients. Accumulated Vα2.3/Vβ22-expressing T-cells were highly clonal, with identical or near-identical Vα2.3 chain sequences and inter-patient similarities in Vβ22 chain amino acid distribution. Molecular modelling revealed specific T-cell receptor-HLA-DRB1*03-peptide interactions, with a previously identified, sarcoidosis-associated vimentin peptide, Vim429-443 DSLPLVDTHSKRTLL, matching both the HLA peptide-binding cleft and distinct T-cell receptor features perfectly. We demonstrate, for the first time, the accumulation of large clonal populations of specific Vα2.3/Vβ22 T-cell receptor-expressing CD4+ T-cells in the lungs of HLA-DRB1*03+ sarcoidosis patients. Several distinct contact points between Vα2.3/Vβ22 receptors and HLA-DRB1*03 molecules suggest presentation of prototypic vimentin-derived peptides.

Subject(s)

CD4-Positive T-Lymphocytes/immunology; HLA-DRB1 Chains/metabolism; Receptors, Antigen, T-Cell/immunology; Sarcoidosis, Pulmonary/immunology; Adult; Bronchoalveolar Lavage Fluid; Bronchoscopy; Female; Flow Cytometry; Humans; Lung/immunology; Male; Middle Aged; Models, Molecular; Sweden

4.

Phasing of single DNA molecules by massively parallel barcoding.

Borgström, Erik; Redin, David; Lundin, Sverker; Berglund, Emelie; Andersson, Anders F; Ahmadian, Afshin.

Nat Commun ; 6: 7173, 2015 Jun 09.

Article in English | MEDLINE | ID: mdl-26055759

ABSTRACT

High-throughput sequencing platforms mainly produce short-read data, resulting in a loss of phasing information for many of the genetic variants analysed. For certain applications, it is vital to know which variant alleles are connected to each individual DNA molecule. Here we demonstrate a method for massively parallel barcoding and phasing of single DNA molecules. First, a primer library with millions of uniquely barcoded beads is generated. When compartmentalized with single DNA molecules, the beads can be used to amplify and tag any target sequences of interest, enabling coupling of the biological information from multiple loci. We apply the assay to bacterial 16S sequencing and up to 94% of the hypothesized phasing events are shown to originate from single molecules. The method enables use of widely available short-read-sequencing platforms to study long single molecules within a complex sample, without losing phase information.

Subject(s)

DNA Barcoding, Taxonomic; DNA/chemistry

5.

Endonuclease specificity and sequence dependence of type IIS restriction enzymes.

Lundin, Sverker; Jemt, Anders; Terje-Hegge, Finn; Foam, Napoleon; Pettersson, Erik; Käller, Max; Wirta, Valtteri; Lexow, Preben; Lundeberg, Joakim.

PLoS One ; 10(1): e0117059, 2015.

Article in English | MEDLINE | ID: mdl-25629514

ABSTRACT

Restriction enzymes that recognize specific sequences but cleave unknown sequence outside the recognition site are extensively utilized tools in molecular biology. Despite this, systematic functional categorization of cleavage performance has largely been lacking. We established a simple and automatable model system to assay cleavage distance variation (termed slippage) and the sequence dependence thereof. We coupled this to massively parallel sequencing in order to provide sensitive and accurate measurement. With this system 14 enzymes were assayed (AcuI, BbvI, BpmI, BpuEI, BseRI, BsgI, Eco57I, Eco57MI, EcoP15I, FauI, FokI, GsuI, MmeI and SmuI). We report significant variation of slippage ranging from 1% to 54%, variations in sequence context dependence, as well as variation between isoschizomers. We believe this largely overlooked property of enzymes with shifted cleavage would benefit from further large-scale classification and engineering efforts seeking to improve performance. The gained insights of in vitro performance may also aid the in vivo understanding of these enzymes.

Subject(s)

Deoxyribonucleases, Type II Site-Specific/genetics; Endonucleases/genetics; Base Sequence; Deoxyribonucleases, Type II Site-Specific/metabolism; Endonucleases/metabolism; Substrate Specificity

6.

DegePrime, a program for degenerate primer design for broad-taxonomic-range PCR in microbial ecology studies.

Hugerth, Luisa W; Wefer, Hugo A; Lundin, Sverker; Jakobsson, Hedvig E; Lindberg, Mathilda; Rodin, Sandra; Engstrand, Lars; Andersson, Anders F.

Appl Environ Microbiol ; 80(16): 5116-23, 2014 Aug.

Article in English | MEDLINE | ID: mdl-24928874

ABSTRACT

The taxonomic composition of a microbial community can be deduced by analyzing its rRNA gene content by, e.g., high-throughput DNA sequencing or DNA chips. Such methods typically are based on PCR amplification of rRNA gene sequences using broad-taxonomic-range PCR primers. In these analyses, the use of optimal primers is crucial for achieving an unbiased representation of community composition. Here, we present the computer program DegePrime that, for each position of a multiple sequence alignment, finds a degenerate oligomer of as high coverage as possible and outputs its coverage among taxonomic divisions. We show that our novel heuristic, which we call weighted randomized combination, performs better than previously described algorithms for solving the maximum coverage degenerate primer design problem. We previously used DegePrime to design a broad-taxonomic-range primer pair that targets the bacterial V3-V4 region (341F-805R) (D. P. Herlemann, M. Labrenz, K. Jurgens, S. Bertilsson, J. J. Waniek, and A. F. Andersson, ISME J. 5:1571-1579, 2011, http://dx.doi.org/10.1038/ismej.2011.41), and here we use the program to significantly increase the coverage of a primer pair (515F-806R) widely used for Illumina-based surveys of bacterial and archaeal diversity. By comparison with shotgun metagenomics, we show that the primers give an accurate representation of microbial diversity in natural samples.

Subject(s)

Bacteria/genetics; DNA Primers/chemistry; Software; Algorithms; Animals; Bacteria/classification; Bacteria/isolation & purification; Computers, Molecular; DNA Primers/genetics; DNA, Bacterial/genetics; Polymerase Chain Reaction; RNA, Ribosomal, 16S/genetics; Rumen/microbiology; Seawater/microbiology

7.

Hierarchical molecular tagging to resolve long continuous sequences by massively parallel sequencing.

Lundin, Sverker; Gruselius, Joel; Nystedt, Björn; Lexow, Preben; Käller, Max; Lundeberg, Joakim.

Sci Rep ; 3: 1186, 2013.

Article in English | MEDLINE | ID: mdl-23470464

ABSTRACT

Here we demonstrate the use of short-read massive sequencing systems to in effect achieve longer read lengths through hierarchical molecular tagging. We show how indexed and PCR-amplified targeted libraries are degraded, sub-sampled and arrested at timed intervals to achieve pools of differing average length, each of which is indexed with a new tag. By this process, indices of sample origin, molecular origin, and degree of degradation are incorporated in order to achieve a nested hierarchical structure, later to be utilized in the data processing to order the reads over a longer distance than the sequencing system originally allows. With this protocol we show how continuous regions beyond 3000 bp can be decoded by an Illumina sequencing system, and we illustrate the potential applications by calling variants of the lambda genome, analysing TP53 in cancer cell lines, and targeting a variable canine mitochondrial region.

Subject(s)

Bacteriophage lambda/genetics; Mitochondria/genetics; Sequence Analysis, DNA/methods; Tumor Suppressor Protein p53/genetics; Animals; Cell Line, Tumor; DNA Barcoding, Taxonomic; Dogs/genetics; High-Throughput Nucleotide Sequencing; Humans; Oligonucleotides/analysis; Oligonucleotides/chemistry; Polymerase Chain Reaction

8.

Comprehensive analysis of the genome transcriptome and proteome landscapes of three tumor cell lines.

Akan, Pelin; Alexeyenko, Andrey; Costea, Paul Igor; Hedberg, Lilia; Solnestam, Beata Werne; Lundin, Sverker; Hällman, Jimmie; Lundberg, Emma; Uhlén, Mathias; Lundeberg, Joakim.

Genome Med ; 4(11): 86, 2012.

Article in English | MEDLINE | ID: mdl-23158748

ABSTRACT

We here present a comparative genome, transcriptome and functional network analysis of three human cancer cell lines (A431, U251MG and U2OS), and investigate their relation to protein expression. Gene copy numbers significantly influenced corresponding transcript levels; their effect on protein levels was less pronounced. We focused on genes with altered mRNA and/or protein levels to identify those active in tumor maintenance. We provide comprehensive information for the three genomes and demonstrate the advantage of integrative analysis for identifying tumor-related genes amidst numerous background mutations by relating genomic variation to expression/protein abundance data and use gene networks to reveal implicated pathways.

9.

Large scale library generation for high throughput sequencing.

Borgström, Erik; Lundin, Sverker; Lundeberg, Joakim.

PLoS One ; 6(4): e19119, 2011 Apr 27.

Article in English | MEDLINE | ID: mdl-21589638

ABSTRACT

BACKGROUND: Large efforts have recently been made to automate the sample preparation protocols for massively parallel sequencing in order to match the increasing instrument throughput. Still, the size selection through agarose gel electrophoresis separation is a labor-intensive bottleneck of these protocols. METHODOLOGY/PRINCIPAL FINDINGS: In this study a method for automatic library preparation and size selection on a liquid handling robot is presented. The method utilizes selective precipitation of certain sizes of DNA molecules onto paramagnetic beads for cleanup and selection after standard enzymatic reactions. CONCLUSIONS/SIGNIFICANCE: The method is used to generate libraries for de novo and re-sequencing on the Illumina HiSeq 2000 instrument with a throughput of 12 samples per instrument in approximately 4 hours. The resulting output data show quality scores and pass filter rates comparable to manually prepared samples. The sample size distribution can be adjusted for each application, and the method is suitable for all high-throughput DNA processing protocols seeking to control size intervals.

Subject(s)

Gene Library; High-Throughput Nucleotide Sequencing/methods; Automation; Sequence Analysis, DNA

10.

Decoding a substantial set of samples in parallel by massive sequencing.

Neiman, Mårten; Lundin, Sverker; Savolainen, Peter; Ahmadian, Afshin.

PLoS One ; 6(3): e17785, 2011 Mar 09.

Article in English | MEDLINE | ID: mdl-21408018

ABSTRACT

There has been a dramatic increase in the throughput of sequenced bases in recent years, but sequencing a multitude of samples in parallel has not yet developed equally. Here we present a novel strategy where the combination of two tags is used to link sequencing reads back to their origins from a pool of samples. By incorporating the tags in two steps, sample-handling complexity is lowered by nearly 100 times compared to conventional indexing protocols. In addition, the method described here enables accurate identification and typing of thousands of samples in parallel. In this study the system was designed to test 4992 samples using only 122 tags. To prove the concept of the two-tagging method, the highly polymorphic second exon of DLA-DRB1 in dogs and wolves was sequenced using the 454 GS FLX Titanium Chemistry. By requiring a minimum sequence depth of 20 reads per sample, 94% of the successfully amplified samples were genotyped. In addition, the method allowed digital detection of chimeric fragments. These results demonstrate that it is possible to sequence thousands of samples in parallel without complex pooling patterns or primer combinations. Furthermore, the method is highly scalable, as only a limited number of additional tags leads to a substantial increase in the sample size.

Subject(s)

Dogs/genetics; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Wolves/genetics; Animals; Polymorphism, Genetic

11.

Increased throughput by parallelization of library preparation for massive sequencing.

Lundin, Sverker; Stranneheim, Henrik; Pettersson, Erik; Klevebring, Daniel; Lundeberg, Joakim.

PLoS One ; 5(4): e10029, 2010 Apr 06.

Article in English | MEDLINE | ID: mdl-20386591

ABSTRACT

BACKGROUND: Massively parallel sequencing systems continue to improve on data output, while leaving labor-intensive library preparations a potential bottleneck. Efforts are currently under way to relieve the crucial and time-consuming work to prepare DNA for high-throughput sequencing. METHODOLOGY/PRINCIPAL FINDINGS: In this study, we demonstrate an automated parallel library preparation protocol using generic carboxylic acid-coated superparamagnetic beads and polyethylene glycol precipitation as a reproducible and flexible method for DNA fragment length separation. With this approach the library preparation for DNA sequencing can easily be adjusted to a desired fragment length. The automated protocol, here demonstrated using the GS FLX Titanium instrument, was compared to the standard manual library preparation, showing higher yield, throughput and great reproducibility. In addition, 12 libraries were prepared and uniquely tagged in parallel, and the distribution of sequence reads between these indexed samples could be improved using quantitative PCR-assisted pooling. CONCLUSIONS/SIGNIFICANCE: We present a novel automated procedure that makes it possible to prepare 36 indexed libraries per person and day, which can be increased to up to 96 libraries processed simultaneously. The yield, speed and robust performance of the protocol constitute a substantial improvement to present manual methods, without the need of extensive equipment investments. The described procedure enables a considerable efficiency increase for small to midsize sequencing centers.

Subject(s)

Gene Library; Sequence Analysis, DNA/methods; Automation; Carboxylic Acids; Chemical Precipitation; Polyethylene Glycols; Sequence Analysis, DNA/instrumentation
