Results 1 - 13 of 13
1.
Euro Surveill; 24(50), 2019 Dec.
Article in English | MEDLINE | ID: mdl-31847944

ABSTRACT

Background: Whole genome sequencing (WGS) is a reliable tool for studying tuberculosis (TB) transmission. WGS data are usually processed by custom-built analysis pipelines with little standardisation between them.
Aim: To compare the impact of variability among several WGS analysis pipelines used internationally to detect epidemiologically linked TB cases.
Methods: From the Netherlands, 535 Mycobacterium tuberculosis complex (MTBC) strains from 2016 were included. Epidemiological information obtained from municipal health services was available for all mycobacterial interspersed repetitive unit-variable number of tandem repeat (MIRU-VNTR) clustered cases. WGS data were analysed using five different pipelines: one core genome multilocus sequence typing (cgMLST) approach and four single nucleotide polymorphism (SNP)-based pipelines developed in Oxford, United Kingdom; Borstel, Germany; Bilthoven, the Netherlands; and Copenhagen, Denmark. WGS clusters were defined using a maximum pairwise distance of 12 SNPs/alleles.
Results: The cgMLST approach and the Oxford pipeline clustered all epidemiologically linked cases; however, in the other three SNP-based pipelines one epidemiological link was missed due to insufficient coverage. In general, genetic distances varied between pipelines, reflecting different clustering rates: the cgMLST approach clustered 92 cases, followed by 84, 83, 83 and 82 cases in the SNP-based pipelines from Copenhagen, Oxford, Borstel and Bilthoven, respectively.
Conclusion: Concordance in ruling out epidemiological links was high between pipelines, an important step in the international validation of WGS data analysis. To increase accuracy in identifying TB transmission clusters, standardisation of crucial WGS criteria and the creation of a reference database of representative MTBC sequences would be advisable.
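The 12-SNP/allele cutoff used above to define WGS clusters is, in effect, single-linkage clustering over pairwise distances: any two isolates within the threshold end up in the same cluster. As a rough illustration (not any of the five pipelines described; the function names and toy variant profiles are hypothetical), such clustering can be sketched as:

```python
from itertools import combinations

def snp_distance(a, b):
    """Hamming distance between two aligned variant profiles."""
    return sum(x != y for x, y in zip(a, b))

def cluster_isolates(profiles, threshold=12):
    """Single-linkage clustering: isolates within `threshold` SNPs share a cluster.

    `profiles` maps isolate names to aligned variant strings; union-find merges
    every pair whose distance is at or below the threshold.
    """
    names = list(profiles)
    parent = {n: n for n in names}

    def find(n):
        # Path-halving union-find lookup.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if snp_distance(profiles[a], profiles[b]) <= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return list(clusters.values())

# Toy example: A and B differ by 1 SNP; C is distant from both.
profiles = {"A": "AAAAAAAA", "B": "AAAAAAAT", "C": "GGGGGGGG"}
```

Single linkage means clusters can chain: A joins C if some B sits within the threshold of both, which is one reason the pipelines above report different cluster counts from slightly different distances.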


Subjects
Molecular Epidemiology/methods, Multilocus Sequence Typing/methods, Mycobacterium tuberculosis/classification, Mycobacterium tuberculosis/genetics, Single Nucleotide Polymorphism, Tuberculosis/epidemiology, Whole Genome Sequencing/methods, Infectious Disease Transmission, Epidemiological Monitoring, Humans, Minisatellite Repeats, Mycobacterium tuberculosis/isolation & purification, Netherlands, Tandem Repeat Sequences, Tuberculosis/diagnosis, Tuberculosis/transmission
2.
Methods; 72: 3-8, 2015 Jan 15.
Article in English | MEDLINE | ID: mdl-25233806

ABSTRACT

The Illumina HumanMethylation450 BeadChip has become a popular platform for interrogating DNA methylation in epigenome-wide association studies (EWAS) and related projects, as well as in resource efforts such as the International Cancer Genome Consortium (ICGC) and the International Human Epigenome Consortium (IHEC). This has resulted in an exponential increase in 450k data in recent years and has triggered the development of numerous integrated analysis pipelines and stand-alone packages. This review introduces and discusses the currently most popular pipelines and packages and is aimed particularly at new 450k users.


Subjects
CpG Islands, DNA Methylation, Oligonucleotide Array Sequence Analysis/methods, Epigenomics/methods, Genome, Human Genome, Humans, Software
3.
Small Methods; e2301758, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38967205

ABSTRACT

Organogenesis, the phase of embryonic development that starts at the end of gastrulation and continues until birth, is critical for understanding cellular differentiation and maturation during organ development. The rapid development of single-cell transcriptomics has led to many novel discoveries in organogenesis while also accumulating a large quantity of data, which until now has lacked an integrated resource. To fill this gap, OrganogenesisDB (http://organogenesisdb.com/), a comprehensive database dedicated to exploring cell-type identification and gene expression dynamics during organogenesis, is developed. OrganogenesisDB contains single-cell RNA sequencing data for more than 1.4 million cells from 49 published datasets spanning various developmental stages. Additionally, 3324 cell markers are manually curated for 1120 cell types across 9 human organs and 4 mouse organs. OrganogenesisDB leverages various analysis tools to help users annotate and understand cell types at different developmental stages, and to mine and present genes that exhibit specific patterns and play key regulatory roles during cell maturation and differentiation. This work provides a critical resource and a useful tool for deciphering cell lineage determination and uncovering the mechanisms underlying organogenesis.

4.
Front Endocrinol (Lausanne); 15: 1344152, 2024.
Article in English | MEDLINE | ID: mdl-38948515

ABSTRACT

Background: Analyzing bacterial microbiomes consistently using next-generation sequencing (NGS) is challenging because of the diversity of sequencing platforms for 16S rRNA gene amplicons and of their analytical pipelines. This study compares the efficacy of full-length (V1-V9 hypervariable regions) and partial-length (V3-V4 hypervariable regions) sequencing of 16S rRNA genes from human gut microbiomes, with a focus on childhood obesity. Methods: In this observational, comparative study, we explored the differences between these two sequencing methods in taxonomic categorization and weight-status prediction among twelve children with obstructive sleep apnea (OSA). Results: The full-length NGS method (PacBio®) identified 118 genera and 248 species in the V1-V9 regions, all with a 0% unclassified rate. In contrast, the partial-length NGS method (Illumina®) detected 142 genera (with a 39% unclassified rate) and 6 species (with a 99% unclassified rate) in the V3-V4 regions. The two approaches showed marked differences in gut microbiome composition and functional predictions. The full-length method distinguished between obese and non-obese children using the Firmicutes/Bacteroidetes ratio, a known obesity marker (p = 0.046), whereas the partial-length method was less conclusive (p = 0.075). Additionally, of 73 metabolic pathways identified through full-length sequencing, 35 (48%) were associated with level 1 metabolism, compared with 28 of 61 pathways (46%) identified through the partial-length method. Full-length NGS also highlighted complex associations between body mass index z-score, three bacterial species (Bacteroides ovatus, Bifidobacterium pseudocatenulatum, and Streptococcus parasanguinis ATCC 15912), and 17 metabolic pathways. Both sequencing techniques revealed relationships between gut microbiota composition and OSA-related parameters, with full-length sequencing offering more comprehensive insights into the associated metabolic pathways than the V3-V4 technique.
Conclusion: These findings highlight disparities in NGS-based assessments, emphasizing the value of full-length NGS with amplicon sequence variant analysis for clinical gut microbiome research. They underscore the importance of considering methodological differences in future meta-analyses.
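The Firmicutes/Bacteroidetes ratio used above as an obesity marker is a simple quotient of phylum-level abundances. A minimal sketch, assuming phylum-level read counts have already been produced by a taxonomic classifier (the function name and example counts are illustrative, not from this study):

```python
def firmicutes_bacteroidetes_ratio(counts):
    """Firmicutes/Bacteroidetes ratio from a dict of phylum-level read counts.

    Raises if Bacteroidetes is absent, since the ratio is then undefined.
    """
    firmicutes = counts.get("Firmicutes", 0)
    bacteroidetes = counts.get("Bacteroidetes", 0)
    if bacteroidetes == 0:
        raise ValueError("No Bacteroidetes reads; ratio is undefined.")
    return firmicutes / bacteroidetes

# Hypothetical per-sample phylum counts from a classifier's output table.
sample_counts = {"Firmicutes": 8000, "Bacteroidetes": 4000, "Actinobacteria": 900}
```

Because the ratio depends on read counts, classification rate matters: the 39% unclassified genera reported for the V3-V4 method would shrink both numerator and denominator unevenly, one plausible reason the partial-length comparison was less conclusive.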


Subjects
Gastrointestinal Microbiome, Pediatric Obesity, 16S Ribosomal RNA, Obstructive Sleep Apnea, Humans, Gastrointestinal Microbiome/genetics, Child, Male, 16S Ribosomal RNA/genetics, Female, Obstructive Sleep Apnea/microbiology, Obstructive Sleep Apnea/genetics, Pediatric Obesity/microbiology, Pediatric Obesity/genetics, High-Throughput Nucleotide Sequencing/methods, Preschool Child, Body Weight, Adolescent
5.
Cell Rep Methods; 4(8): 100831, 2024 Aug 19.
Article in English | MEDLINE | ID: mdl-39111312

ABSTRACT

Spatial transcriptomics workflows using barcoded capture arrays are commonly used for resolving gene expression in tissues. However, existing techniques are either limited by capture array density or are cost prohibitive for large-scale atlasing. We present Nova-ST, a dense nano-patterned spatial transcriptomics technique derived from randomly barcoded Illumina sequencing flow cells. Nova-ST enables customized, low-cost, flexible, and high-resolution spatial profiling of large tissue sections. Benchmarking on mouse brain sections demonstrates significantly higher sensitivity compared to existing methods at a reduced cost.


Subjects
Gene Expression Profiling, Transcriptome, Animals, Mice, Gene Expression Profiling/methods, Brain/metabolism, Nanotechnology/methods, High-Throughput Nucleotide Sequencing/methods
6.
bioRxiv; 2023 Apr 06.
Article in English | MEDLINE | ID: mdl-37066421

ABSTRACT

The Encyclopedia of DNA Elements (ENCODE) project is a collaborative effort to create a comprehensive catalog of functional elements in the human genome. The current database comprises more than 19,000 functional genomics experiments across more than 1,000 cell lines and tissues, using a wide array of experimental techniques to study the chromatin structure and the regulatory and transcriptional landscape of the Homo sapiens and Mus musculus genomes. All experimental data, metadata, and associated computational analyses created by the ENCODE consortium are submitted to the Data Coordination Center (DCC) for validation, tracking, storage, and distribution to community resources and the scientific community. The ENCODE project has engineered and distributed uniform processing pipelines to promote data provenance and reproducibility and to allow interoperability between genomic resources and other consortia. All data files, reference genome versions, software versions, and parameters used by the pipelines are captured and available via the ENCODE Portal. The pipeline code, developed using Docker and the Workflow Description Language (WDL; https://openwdl.org/), is publicly available on GitHub, with images available on Dockerhub (https://hub.docker.com), enabling access for a diverse range of biomedical researchers. ENCODE pipelines maintained and used by the DCC can be installed to run on personal computers, local HPC clusters, or in cloud computing environments via Cromwell. Access to the pipelines and data via the cloud allows small labs to use the data or software without access to institutional compute clusters. Standardization of the computational methodologies for analysis and quality control leads to comparable results from different ENCODE collections, a prerequisite for successful integrative analyses.

7.
Res Sq; 2023 Jul 19.
Article in English | MEDLINE | ID: mdl-37503119

ABSTRACT

The Encyclopedia of DNA Elements (ENCODE) project is a collaborative effort to create a comprehensive catalog of functional elements in the human genome. The current database comprises more than 19,000 functional genomics experiments across more than 1,000 cell lines and tissues, using a wide array of experimental techniques to study the chromatin structure and the regulatory and transcriptional landscape of the Homo sapiens and Mus musculus genomes. All experimental data, metadata, and associated computational analyses created by the ENCODE consortium are submitted to the Data Coordination Center (DCC) for validation, tracking, storage, and distribution to community resources and the scientific community. The ENCODE project has engineered and distributed uniform processing pipelines to promote data provenance and reproducibility and to allow interoperability between genomic resources and other consortia. All data files, reference genome versions, software versions, and parameters used by the pipelines are captured and available via the ENCODE Portal. The pipeline code, developed using Docker and the Workflow Description Language (WDL; https://openwdl.org/), is publicly available on GitHub, with images available on Dockerhub (https://hub.docker.com), enabling access for a diverse range of biomedical researchers. ENCODE pipelines maintained and used by the DCC can be installed to run on personal computers, local HPC clusters, or in cloud computing environments via Cromwell. Access to the pipelines and data via the cloud allows small labs to use the data or software without access to institutional compute clusters. Standardization of the computational methodologies for analysis and quality control leads to comparable results from different ENCODE collections, a prerequisite for successful integrative analyses.

8.
Small Methods; 7(9): e2201421, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37259264

ABSTRACT

The liver is critical to the digestive and immune systems. Although liver physiology and pathology have been well studied and many scRNA-seq data have been generated, a database and landscape characterizing cell types and gene expression across liver diseases and developmental stages at single-cell resolution have been lacking. Hence, scLiverDB, a specialized database of human and mouse liver transcriptomes, is developed to unravel the landscape of liver cell types, cell heterogeneity, and gene expression at single-cell resolution across various liver diseases, cell types, and developmental stages. To date, 62 datasets comprising 9,050 samples and 1,741,734 cells have been curated. A uniform workflow, including quality control, dimensional reduction, clustering, and cell-type annotation, is used to analyze all datasets on the same platform; manual and automatic methods are integrated for accurate cell-type identification, and a user-friendly web interface with multiscale functions is provided. Two case studies show the usefulness of scLiverDB: one identified the LTB (lymphotoxin beta) gene as a potential biomarker of lymphoid cell differentiation, and the other showed expression changes of Foxa3 (forkhead box A3) in chronic progressive liver diseases. This work provides a crucial resource for resolving molecular and cellular information in normal, diseased, and developing human and mouse livers.


Subjects
Liver, Transcriptome, Mice, Animals, Humans, Transcriptome/genetics, Factual Databases, Cell Differentiation, Cluster Analysis
9.
Brain Stimul; 16(2): 567-593, 2023.
Article in English | MEDLINE | ID: mdl-36828303

ABSTRACT

Transcranial magnetic stimulation (TMS) evokes neuronal activity in the targeted cortex and connected brain regions. The evoked brain response can be measured with electroencephalography (EEG). TMS combined with simultaneous EEG (TMS-EEG) is widely used for studying cortical reactivity and connectivity at high spatiotemporal resolution. Methodologically, the combination of TMS with EEG is challenging, and there are many open questions in the field. Different TMS-EEG equipment and approaches for data collection and analysis are used. The lack of standardization may affect reproducibility and limit the comparability of results produced in different research laboratories. In addition, there is controversy about the extent to which auditory and somatosensory inputs contribute to transcranially evoked EEG. This review provides a guide for researchers who wish to use TMS-EEG to study the reactivity of the human cortex. A worldwide panel of experts working on TMS-EEG covered all aspects that should be considered in TMS-EEG experiments, providing methodological recommendations (when possible) for effective TMS-EEG recordings and analysis. The panel identified and discussed the challenges of the technique, particularly regarding recording procedures, artifact correction, analysis, and interpretation of the transcranial evoked potentials (TEPs). Therefore, this work offers an extensive overview of TMS-EEG methodology and thus may promote standardization of experimental and computational procedures across groups.


Subjects
Electroencephalography, Transcranial Magnetic Stimulation, Humans, Transcranial Magnetic Stimulation/methods, Reproducibility of Results, Electroencephalography/methods, Evoked Potentials/physiology, Data Collection
10.
J Microbiol Methods; 169: 105811, 2020 Feb.
Article in English | MEDLINE | ID: mdl-31857143

ABSTRACT

Sequencing the 16S rRNA gene has become a popular method for identifying bacterial communities. However, recent studies report differences in the resulting characterization depending on sample preparation, sequencing platform, and data analysis. In this work, we tested several of the available user-friendly protocols for data analysis on reads obtained from the Illumina MiSeq and Ion Torrent Personal Genome Machine (PGM) sequencers. We assessed the advantages and disadvantages of both platforms in terms of accuracy, detected species, and abundance by analyzing a staggered mock community. Four different pipelines were applied: QIIME 1.9.1 with default parameters, QIIME 1.9.1 with modified parameters and chimera removal, VSEARCH 2.3.4, and QIIME 2 v.2018.2. To address the limitations of species-level detection, we used the species classifier SPINGO. The optimal pipeline for the PGM platform was QIIME 1.9.1 with default parameters (QIIME1), except when a study requires the detection of Bacteroides or other Bacteroidaceae members, in which case QIIME1MOD (with chimera removal) seems to be a good alternative. For Illumina MiSeq, the VSEARCH strategy can be a good option. Our results also confirm that all the tested pipelines can be used for metagenomic analysis at the family and genus levels.


Subjects
Bacteria/classification, Bacteria/genetics, Computational Biology/methods, High-Throughput Nucleotide Sequencing/methods, 16S Ribosomal RNA/genetics, Bacterial DNA/genetics, Data Analysis, DNA Sequence Analysis/methods
11.
Methods Mol Biol; 2051: 345-371, 2020.
Article in English | MEDLINE | ID: mdl-31552637

ABSTRACT

In any analytical discipline, data analysis reproducibility is closely interlinked with data quality. In this book chapter, focused on mass spectrometry-based proteomics approaches, we describe how data analysis reproducibility and data quality influence each other, and how data quality and data analysis designs can be used to increase robustness and improve reproducibility. We first introduce methods and concepts for designing and maintaining robust data analysis pipelines so that reproducibility can be increased in parallel. The technical aspects of data analysis reproducibility are challenging, and current ways to increase overall robustness are multifaceted; software containerization and cloud infrastructures play an important part. We also show how quality control (QC) and quality assessment (QA) approaches can be used to spot analytical issues, reduce experimental variability, and increase confidence in the analytical results of (clinical) proteomics studies, since experimental variability plays a substantial role in analysis reproducibility. We therefore give an overview of existing solutions for QC/QA, including different quality metrics and methods for longitudinal monitoring. The efficient use of both types of approaches provides a way to improve experimental reliability, reproducibility, and consistency in proteomics measurements.
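Longitudinal QC monitoring of the kind surveyed here often reduces to control-chart logic: flag runs whose metric drifts too far from the historical mean. A minimal sketch, assuming one numeric QC metric per run (the function name and cutoff are illustrative, not taken from any specific QC tool):

```python
from statistics import mean, stdev

def flag_outlier_runs(metric_values, z_cutoff=3.0):
    """Return indices of runs whose QC metric deviates more than
    `z_cutoff` standard deviations from the series mean.

    `metric_values` is one value per run in chronological order,
    e.g. median peptide mass error or total ion current per run.
    """
    mu = mean(metric_values)
    sd = stdev(metric_values)
    if sd == 0:
        return []  # perfectly stable series: nothing to flag
    return [i for i, v in enumerate(metric_values)
            if abs(v - mu) / sd > z_cutoff]

# Hypothetical QC metric for ten runs; the last run drifted badly.
runs = [50, 51, 49, 50, 52, 48, 50, 51, 49, 120]
```

In practice the reference mean and spread would come from a trusted baseline period rather than the full series (an outlier inflates its own z-score denominator), which is why dedicated QC tools maintain rolling control limits.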


Subjects
Cloud Computing, Data Analysis, Proteomics/methods, Quality Control, Data Reliability, Humans, Mass Spectrometry, Reproducibility of Results, Software
12.
J Neurosci Methods; 306: 19-31, 2018 Aug 01.
Article in English | MEDLINE | ID: mdl-29842901

ABSTRACT

BACKGROUND: In cognitive neuroscience, functional magnetic resonance imaging (fMRI) data are widely analyzed using general linear models (GLMs). However, the model quality of GLMs for fMRI is rarely assessed, in part due to the lack of formal measures for statistical model inference. NEW METHOD: We introduce a new SPM toolbox for model assessment, comparison and selection (MACS) of GLMs applied to fMRI data. MACS includes classical, information-theoretic and Bayesian methods of model assessment previously applied to GLMs for fMRI, as well as recent methodological developments in model selection and model averaging for fMRI data analysis. RESULTS: The toolbox, which is freely available from GitHub, builds directly on the Statistical Parametric Mapping (SPM) software package and is easy to use, general-purpose, modular, readable and extendable. We validate the toolbox by reproducing model selection and model averaging results from earlier publications. COMPARISON WITH EXISTING METHODS: A previous toolbox for model diagnosis in fMRI has been discontinued, and other approaches to model comparison between GLMs have not been translated into reusable computational resources. CONCLUSIONS: Increased attention to model quality will lead to lower false-positive rates in cognitive neuroscience; wider application of the MACS toolbox will improve the reproducibility of GLM analyses and is likely to improve the replicability of fMRI studies.
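MACS itself is a MATLAB/SPM toolbox; as a language-neutral illustration of one classical criterion it covers, the sketch below compares two GLM design matrices by Akaike's information criterion (AIC) under Gaussian noise. The function and data are hypothetical, not MACS code:

```python
import numpy as np

def aic_linear(y, X):
    """AIC of an ordinary least-squares fit of y on design matrix X,
    assuming i.i.d. Gaussian noise.

    Uses the log-likelihood up to additive constants: n*log(RSS/n),
    with k regression weights plus one noise-variance parameter.
    """
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return n * np.log(rss / n) + 2 * (k + 1)

# Simulated data: y genuinely depends on regressor x.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0 * x + rng.normal(scale=0.5, size=200)

X_full = np.column_stack([np.ones(200), x])  # intercept + regressor
X_null = np.ones((200, 1))                   # intercept only
```

Lower AIC is better: with a real effect present, the full design should beat the intercept-only design, mirroring how model comparison between GLMs penalizes extra regressors that do not explain enough variance.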


Subjects
Brain Mapping/methods, Brain/physiology, Computer-Assisted Image Processing/methods, Linear Models, Magnetic Resonance Imaging, Software, Bayes Theorem, Brain/diagnostic imaging, Statistical Data Interpretation, Humans, Information Theory, Reproducibility of Results, Signal-to-Noise Ratio
13.
Cancer Med; 4(3): 392-403, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25594743

ABSTRACT

We describe open, reproducible pipelines that create an integrated genomic profile of a cancer and use the profile to find mutations associated with disease and potentially useful drugs. These pipelines analyze high-throughput cancer exome and transcriptome sequence data together with public databases to find relevant mutations and drugs. The three pipelines we have developed are: (1) an exome analysis pipeline, which uses whole or targeted tumor exome sequence data to produce a list of putative variants (no matched normal data are needed); (2) a transcriptome analysis pipeline, which processes whole tumor transcriptome sequence (RNA-seq) data to compute gene expression and find potential gene fusions; and (3) an integrated variant analysis pipeline, which uses the tumor variants from the exome pipeline and tumor gene expression from the transcriptome pipeline to identify deleterious and druggable mutations among all genes and among highly expressed genes. These pipelines are integrated into the popular Web platform Galaxy at http://usegalaxy.org/cancer to make them accessible and reproducible, thereby providing an approach for standardized, distributed analyses in clinical studies. We have used our pipelines to identify similarities and differences between pancreatic adenocarcinoma cell lines and primary tumors.


Subjects
Neoplasm Genes, Pancreatic Neoplasms/genetics, Cell Line, Tumor Cell Line, Exome, Gene Expression Profiling, Human Genome, High-Throughput Nucleotide Sequencing, Humans, Mutation, Proto-Oncogene Proteins/genetics, Proto-Oncogene Proteins p21(ras), ErbB-2 Receptor/genetics, RNA Sequence Analysis, ras Proteins/genetics