1.
Microorganisms ; 8(10)2020 Oct 15.
Article in English | MEDLINE | ID: mdl-33076521

ABSTRACT

(1) Background: microbiome host classification can be used to identify sources of contamination in environmental data. However, there is no ready-to-use host classifier. Here, we aimed to build a model that would be able to discriminate between pet and human microbiome samples. The challenge of the study was to build a classifier using data solely from publicly available studies, which normally contain sequencing data for only one type of host. (2) Results: we have developed a random forest model that distinguishes human microbiota from domestic pet microbiota (cats and dogs) with 97% accuracy. In order to prevent overfitting, samples from several (at least four) different projects were necessary. Feature importance analysis revealed that the model relied on several taxa known to be key components of domestic cat and dog microbiomes (such as Fusobacteriaceae and Peptostreptococcaceae), as well as on some taxa found exclusively in humans (such as Akkermansiaceae). (3) Conclusion: we have shown that it is possible to build a reliable pet/human gut microbiome classifier on the basis of data collected from different studies.

2.
BMC Genomics ; 21(Suppl 7): 528, 2020 Sep 10.
Article in English | MEDLINE | ID: mdl-32912136

ABSTRACT

BACKGROUND: Head-to-head comparison of BeadChip and WGS/WES genotyping techniques for their precision is far from straightforward. A tool for validation of high-throughput genotyping calls, such as Sanger sequencing, is neither scalable nor practical for large-scale DNA processing. Here we report a cross-validation analysis of genotyping calls obtained via Illumina GSA BeadChip and WGS (Illumina HiSeq X Ten) techniques. RESULTS: When compared to each other, the average precision and accuracy of BeadChip and WGS genotyping techniques exceeded 0.991 and 0.997, respectively. The average fraction of discordant variants for both platforms was found to be 0.639%. A sliding window approach was used to find genomic regions of at most 500 bp that encompass a maximal number of discordant variants, for further validation by Sanger sequencing. Notably, 12 variants out of 26 located within eight identified regions were consistently discordant in related calls made by WGS and BeadChip. When Sanger sequenced, a total of 16 of these genotypes were successfully resolved, indicating that the precision of WGS and BeadChip genotyping for this genotype subset was 0.81 and 0.5, respectively, with accuracy values of 0.87 and 0.61. CONCLUSIONS: We conclude that WGS genotype calling exhibits higher overall precision within the selected set of discordantly genotyped variants, though the number of validated variants remained insufficient.
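
The sliding window step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `densest_window`, the inclusive definition of window length (end - start + 1), and the example positions are all assumptions made for illustration. The sketch takes the positions of discordant variants on one chromosome and returns the window of at most 500 bp that contains the most of them.

```python
from typing import List, Tuple


def densest_window(positions: List[int], max_span: int = 500) -> Tuple[int, int, int]:
    """Return (start, end, count) for the window of at most `max_span` bp
    (inclusive length end - start + 1) that holds the most positions.

    `positions` are hypothetical 1-based coordinates of discordant variants
    on a single chromosome. Ties are resolved in favor of the leftmost window.
    """
    if not positions:
        raise ValueError("positions must not be empty")
    pos = sorted(positions)
    best = (pos[0], pos[0], 1)
    left = 0
    for right in range(len(pos)):
        # Advance the left edge until the window fits within max_span bp.
        while pos[right] - pos[left] + 1 > max_span:
            left += 1
        count = right - left + 1
        if count > best[2]:
            best = (pos[left], pos[right], count)
    return best


# Hypothetical discordant-variant coordinates, not data from the study.
example_positions = [100, 150, 590, 600, 650, 1000, 1099]
print(densest_window(example_positions))
```

Because the positions are sorted once and each edge of the window only moves forward, the scan is linear after sorting. Finding several non-overlapping regions, as the abstract describes, would need an additional step that this sketch does not attempt.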


Subjects
Genotyping Techniques , Single Nucleotide Polymorphism , Genome , Genotype , High-Throughput Nucleotide Sequencing
3.
Nucleic Acids Res ; 47(21): e135, 2019 12 02.
Article in English | MEDLINE | ID: mdl-31511888

ABSTRACT

As the use of next-generation sequencing (NGS) for the diagnosis of Mendelian diseases is expanding, the performance of this method has to be improved in order to achieve higher quality. Typically, performance measures are designed in the context of each application and, therefore, account for a spectrum of clinically relevant variants. We present EphaGen, a new computational methodology for bioinformatics quality control (QC). Given a single NGS dataset in BAM format and a pre-compiled VCF file of targeted clinically relevant variants, it associates this dataset with a single arbiter parameter. Intrinsically, EphaGen estimates the probability of missing any variant from the defined spectrum within a particular NGS dataset. Such a performance measure closely resembles the diagnostic sensitivity of a given NGS dataset. Here we present case studies of the use of EphaGen in the context of BRCA1/2 and CFTR sequencing in a series of 14 runs across 43 blood samples and 504 publicly available NGS datasets. EphaGen is superior to conventional bioinformatics metrics such as coverage depth and coverage uniformity. We recommend using this software as a QC step in NGS studies in the clinical context. Availability: https://github.com/m4merg/EphaGen or https://hub.docker.com/r/m4merg/ephagen.


Subjects
Computational Biology/methods , High-Throughput Nucleotide Sequencing/methods , Single Nucleotide Polymorphism/genetics , Quality Control , Software , BRCA1 Protein/genetics , BRCA2 Protein/genetics , Breast Neoplasms/genetics , Cystic Fibrosis Transmembrane Conductance Regulator/genetics , Female , Human Genome , Genomics/methods , Humans , Mendelian Randomization Analysis/methods
4.
Front Genet ; 10: 194, 2019.
Article in English | MEDLINE | ID: mdl-30915108

ABSTRACT

Genotyping of cell-free DNA (cfDNA) in plasma samples has the potential to allow for a noninvasive assessment of tumor biology, avoiding the inherent shortcomings of tissue biopsy. Next generation sequencing (NGS), a leading technology for liquid biopsy analysis, continues to be hampered by several major issues with cfDNA samples, including low cfDNA concentration and high fragmentation. In this study, by employing Ion Torrent PGM semiconductor technology, we compared two multi-biomarker amplicon-based NGS panels characterized by a substantial difference in average amplicon length. In the course of the analysis of peripheral blood from 13 diagnostic non-small cell lung cancer patients, the two panels were shown to be equivalent in terms of overall diagnostic sensitivity and specificity. A pairwise comparison of the allele frequencies for the same somatic variants obtained from the pairs of panel-specific amplicons demonstrated identical analytical sensitivity for amplicons in the range of 140 to 170 bp. Further regression analysis between amplicon length and its coverage illustrated that NGS sequencing of plasma cfDNA equally tolerates amplicons with lengths in the range of 120 to 170 bp. To increase the sensitivity of mutation detection in cfDNA, we performed a computational analysis of the features associated with genome-wide nucleosome maps, evident from the data on the prevalence of cfDNA fragments of certain sizes and their fragmentation patterns. By leveraging a support vector machine-based machine learning approach, we showed that combining nucleosome map associated features with GC content increases the accuracy of prediction of high inter-sample sequencing coverage variation (areas under the receiver operating characteristic curve: 0.75, 95% CI: 0.750-0.752 vs. 0.65, 95% CI: 0.63-0.67). Thus, nucleosome-guided fragmentation should be used as a guide to design amplicon-based NGS panels for the genotyping of cfDNA samples.

5.
Nutrients ; 10(5)2018 May 08.
Article in English | MEDLINE | ID: mdl-29738477

ABSTRACT

Personalized nutrition is of increasing interest to individuals actively monitoring their health. The relations between the duration of diet intervention and the effects on gut microbiota have yet to be elucidated. Here we examined the associations of short-term dietary changes, long-term dietary habits and lifestyle with gut microbiota. Stool samples from 248 citizen-science volunteers were collected before and after a self-reported 2-week personalized diet intervention, then analyzed using 16S rRNA sequencing. Considerable correlations between long-term dietary habits and gut community structure were detected. A higher intake of vegetables and fruits was associated with increased levels of butyrate-producing Clostridiales and higher community richness. A paired comparison of the metagenomes before and after the 2-week intervention showed that even a brief, uncontrolled intervention produced profound changes in community structure, resulting in decreased levels of the Bacteroidaceae, Porphyromonadaceae and Rikenellaceae families and decreased alpha-diversity, coupled with an increase in Methanobrevibacter, Bifidobacterium, Clostridium and butyrate-producing Lachnospiraceae, as well as in the prevalence of a permatype (a bootstrapping-based variation of enterotype) associated with a higher diversity of diet. The response of microbiota to the intervention was dependent on the initial microbiota state. These findings pave the way for the development of an individualized diet.


Subjects
Diet , Gastrointestinal Microbiome , Bacteroidetes/genetics , Bacteroidetes/isolation & purification , Bifidobacterium/genetics , Bifidobacterium/isolation & purification , Clostridium/genetics , Clostridium/isolation & purification , Cluster Analysis , Feces/chemistry , Feces/microbiology , Humans , Metagenome , Methanobrevibacter/genetics , Methanobrevibacter/isolation & purification , 16S Ribosomal RNA/genetics , Sample Size , DNA Sequence Analysis
6.
BMC Med Genomics ; 11(Suppl 1): 13, 2018 02 13.
Article in English | MEDLINE | ID: mdl-29504914

ABSTRACT

BACKGROUND: Cystic fibrosis (CF) is one of the most common life-threatening genetic disorders. Around 2000 variants in the CFTR gene have been identified, some proportion of which are known to be pathogenic, and 300 disease-causing mutations have been characterized in detail by the CFTR2 database, which complicates analysis of the gene with conventional methods. METHODS: We conducted next-generation sequencing (NGS) in a cohort of 89 adult patients negative for p.Phe508del homozygosity. Complete clinical and demographic information was available for 84 patients. RESULTS: By combining MLPA with NGS, we identified disease-causing alleles in all the CF patients. Importantly, in 10% of cases, standard bioinformatics pipelines were inefficient in identifying causative mutations. Class IV-V mutations were observed in 38 (45%) cases, predominantly those with pancreatic sufficient CF disease; the rest of the patients had Class I-III mutations. Diabetes was seen only in patients homozygous for class I-III mutations. We found that 12% of the patients were heterozygous for more than two pathogenic CFTR mutations. Two patients were observed with the p.[Arg1070Gln, Ser466*] complex allele, which was associated with milder pulmonary obstructions (FVC 107 and 109% versus 67%, CI 95%: 63-72%; FEV 90 and 111% versus 47%, CI 95%: 37-48%). The p.[Phe508del, Leu467Phe] complex allele was reported for the first time, observed in four patients (5%). CONCLUSION: NGS can be a more informative technology than standard methods. Combined with its equivalent diagnostic performance, it can therefore be implemented in clinical practice, although careful validation is still required.


Subjects
Biomarkers/analysis , Cystic Fibrosis Transmembrane Conductance Regulator/deficiency , Cystic Fibrosis/genetics , Cystic Fibrosis/pathology , Genetic Association Studies , High-Throughput Nucleotide Sequencing/methods , Mutation , Adult , Cohort Studies , Cystic Fibrosis Transmembrane Conductance Regulator/genetics , Female , Humans , Male , Middle Aged , Young Adult
7.
J Transl Med ; 15(1): 22, 2017 01 31.
Article in English | MEDLINE | ID: mdl-28137276

ABSTRACT

BACKGROUND: Next generation sequencing has the potential to revolutionize the management of cancer patients within the framework of precision oncology. Nevertheless, a lack of standardization has slowed the entry of the technology into the clinical testing space. Here we dissected a number of common problems of NGS diagnostics in oncology and introduced ways they can be resolved. METHODS: DNA was extracted from 26 formalin fixed paraffin embedded (FFPE) specimens, processed with the TruSeq Amplicon Cancer Panel (Illumina Inc, San Diego, California) targeting 48 cancer-related genes, and sequenced in a single run. Sequencing data were comparatively analyzed by several bioinformatics pipelines. RESULTS: Libraries yielded sufficient coverage to detect even low-prevalence mutations. We found that the number of FFPE sequence artifacts significantly correlates with the pre-normalization concentration of libraries (rank correlation -0.81; p < 1e-10), thus contributing to sample-specific variant detection cut-offs. Surprisingly, extensive validation of EGFR mutation calls by a combination of aligners and variant callers resulted in the identification of two false negatives and one false positive that were due to the complexity of the underlying genomic change, confirmed by Sanger sequencing. Additionally, the study of the non-EGFR amplicons revealed 33 confirmed unique mutations in 17 genes, with TP53 being the most frequently mutated. The clinical relevance of these findings is discussed. CONCLUSIONS: Reporting of the entire mutational spectrum revealed by targeted sequencing is questionable, at least until clinically driven guidelines on the reporting of somatic mutations are established. The standardization of sequencing protocols, especially their data analysis components, requires assay-, disease-, and, in many cases, even sample-specific customization that could be performed only in cooperation with clinicians.
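
The abstract reports a rank correlation between artifact counts and library concentration without naming the statistic. As an illustration only, the sketch below assumes Spearman's coefficient, computed as the Pearson correlation of average ranks so that ties are handled. The helper names and the example numbers are hypothetical and are not taken from the study.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Assign 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: the Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


# Hypothetical values: library concentration vs. artifact count per sample.
concentration = [1.0, 2.0, 3.0, 4.0]
artifacts = [80, 60, 40, 20]
print(spearman(concentration, artifacts))
```

A negative coefficient, as in the abstract, means that samples with lower library concentration tend to carry more artifacts. The sketch does not compute a p-value.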


Subjects
Formaldehyde/chemistry , High-Throughput Nucleotide Sequencing/methods , High-Throughput Nucleotide Sequencing/standards , Medical Oncology , Paraffin Embedding , Tissue Fixation , Artifacts , DNA/genetics , DNA Copy Number Variations/genetics , Exons/genetics , Gene Frequency/genetics , Humans , Incidental Findings , Mutation/genetics , Reference Standards