1.
Genomics ; 115(2): 110556, 2023 03.
Article in English | MEDLINE | ID: mdl-36599399

ABSTRACT

As the most readily adopted molecular screening test, low-pass whole-genome sequencing (WGS) of maternal plasma cell-free DNA for aneuploidy detection generates a vast amount of genomic data. This large-scale method also allows for high-throughput virome screening. Non-invasive prenatal testing (NIPT) sequencing data from 12,951 pregnant Turkish women, totaling 6.57 terabases from 187.8 billion reads, were used to investigate the prevalence and abundance of viral DNA in plasma. Among the 22 virus sequences identified in 12% of participants were human papillomavirus, herpesvirus, betaherpesvirus and anellovirus. We observed a unique pattern of circulating viral DNA with a high prevalence of papillomaviruses. The prevalence of herpesviruses and anelloviruses was similar among Turkish, European and Dutch populations. Hepatitis B prevalence was remarkably low in Dutch, European and Turkish populations, but higher in China. WGS data revealed that herpesviruses and anelloviruses are naturally found in European populations. This represents the first comprehensive research on the plasma virome of pregnant Turkish women.


Subjects
Cell-Free Nucleic Acids , Viral DNA , Pregnancy , Humans , Female , Viral DNA/genetics , Prenatal Diagnosis/methods , Aneuploidy , Genomics , High-Throughput Nucleotide Sequencing/methods
2.
Comput Methods Programs Biomed ; 175: 223-231, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31104710

ABSTRACT

BACKGROUND AND OBJECTIVE: In the last decade, RNA-sequencing technology has become the method of choice, preferred over microarray technology for gene-expression-based classification and differential expression analysis because it produces less noisy data. Although many algorithms have been proposed for microarray data, the number of algorithms and programs available for classification of RNA-sequencing data is limited. For this reason, we developed MLSeq to bring together both frequently used classification algorithms and novel approaches and make them available for classification of RNA-sequencing data. The package is developed in the R language environment and distributed through the Bioconductor network. METHODS: Classification of RNA-sequencing data is not straightforward, since raw data should be preprocessed before downstream analysis. With the MLSeq package, researchers can easily preprocess (normalization, filtering, transformation, etc.) and classify raw RNA-sequencing data using two strategies: (i) apply algorithms proposed directly for the RNA-sequencing data structure, or (ii) transform RNA-sequencing data to bring it distributionally closer to the microarray data structure and apply algorithms developed for microarray data. Moreover, we proposed novel algorithms through MLSeq, such as voom (an acronym for variance modelling at the observational level) based nearest shrunken centroids (voomNSC) and diagonal linear discriminant analysis (voomDLDA), among others. MATERIALS: Three real RNA-sequencing datasets (cervical cancer, lung cancer and aging) were used to evaluate model performance. Poisson linear discriminant analysis (PLDA) and negative binomial linear discriminant analysis (NBLDA) were selected as algorithms based on discrete distributions, and voomNSC, nearest shrunken centroids (NSC) and support vector machines (SVM) were selected as algorithms based on continuous distributions for model comparisons. The algorithms were compared using classification accuracy and sparsity on an independent test set. RESULTS: The algorithms based on discrete distributions performed better on the cervical cancer and aging data, with accuracies above 0.92. On the lung cancer data, most algorithms performed similarly, with accuracies of 0.88, except SVM, which achieved an accuracy of 0.94. Our voomNSC algorithm was the sparsest, selecting 2.2% and 6.6% of all features for the cervical cancer and lung cancer datasets, respectively. However, on the aging data, sparse classifiers were not able to select an optimal subset of features. CONCLUSION: MLSeq is a comprehensive and easy-to-use interface for classification of gene expression data. It allows researchers to perform both preprocessing and classification tasks through a single platform. With this property, MLSeq can be considered a pipeline for the classification of RNA-sequencing data.
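
Strategy (ii) in the abstract above, transforming counts toward a continuous scale and then applying a microarray-style classifier, can be illustrated with a minimal Python sketch. This is not the MLSeq API (MLSeq is an R package), and the function names `log_cpm`, `fit_centroids` and `predict` are hypothetical. The transform uses the voom-style log2 counts-per-million with offsets of 0.5 on the count and 1 on the library size; the classifier is a plain nearest-centroid rule, a simplified stand-in for NSC without shrinkage. The counts are invented toy values.

```python
import math

def log_cpm(counts):
    """Convert raw counts (one list of gene counts per sample) to
    log2 counts-per-million, adding 0.5 to each count and 1 to the
    library size so that zero counts stay finite."""
    transformed = []
    for sample in counts:
        lib_size = sum(sample)
        transformed.append(
            [math.log2((c + 0.5) / (lib_size + 1.0) * 1e6) for c in sample]
        )
    return transformed

def fit_centroids(x, labels):
    """Return the mean transformed expression profile of each class."""
    sums, sizes = {}, {}
    for row, lab in zip(x, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(row)
            sizes[lab] = 0
        sums[lab] = [s + v for s, v in zip(sums[lab], row)]
        sizes[lab] += 1
    return {lab: [s / sizes[lab] for s in sums[lab]] for lab in sums}

def predict(centroids, row):
    """Assign a sample to the class with the closest centroid
    (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return min(centroids, key=lambda lab: sq_dist(centroids[lab], row))

# Toy data (assumed): 4 training samples, 3 genes, 2 classes.
train_counts = [
    [100, 5, 30],
    [120, 8, 25],
    [10, 90, 28],
    [6, 110, 35],
]
train_labels = ["A", "A", "B", "B"]

centroids = fit_centroids(log_cpm(train_counts), train_labels)
new_sample = log_cpm([[90, 10, 20]])[0]
predicted = predict(centroids, new_sample)
print(predicted)
```

A real analysis would also normalize library sizes, filter low-count genes and evaluate on an independent test set, as the abstract describes.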


Subjects
Machine Learning , RNA Sequence Analysis/methods , Software , Algorithms , Discriminant Analysis , Gene Expression Profiling , Humans , Linear Models , Poisson Distribution , Programming Languages , RNA , Support Vector Machine
3.
PeerJ ; 5: e3890, 2017.
Article in English | MEDLINE | ID: mdl-29018623

ABSTRACT

RNA-Seq is a recent and efficient technique that uses the capabilities of next-generation sequencing technology for characterizing and quantifying transcriptomes. One important task using gene-expression data is to identify a small subset of genes that can be used to build diagnostic classifiers, particularly for cancers. Microarray-based classifiers are not directly applicable to RNA-Seq data because of its discrete nature. Overdispersion is another problem that requires careful modeling of the mean-variance relationship of RNA-Seq data. In this study, we present voomDDA classifiers: variance modeling at the observational level (voom) extensions of the nearest shrunken centroids (NSC) and the diagonal discriminant classifiers. VoomNSC is one of these classifiers and brings the voom and NSC approaches together for gene-expression-based classification. To do so, we propose weighted statistics and incorporate them into the NSC algorithm. VoomNSC is a sparse classifier that models the mean-variance relationship using the voom method and incorporates voom's precision weights into the NSC classifier via weighted statistics. A comprehensive simulation study was designed and four real datasets were used for performance assessment. The overall results indicate that voomNSC is the sparsest classifier. It also provides the most accurate results, together with power-transformed Poisson linear discriminant analysis, rlog-transformed support vector machines and random forests. In addition to prediction, the voomNSC classifier can be used to identify potential diagnostic biomarkers for a condition of interest. Through this work, statistical learning methods proposed for microarrays can be reused for RNA-Seq data. An interactive web application is freely available at http://www.biosoft.hacettepe.edu.tr/voomDDA/.
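
The idea of feeding precision weights into NSC through weighted statistics can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' exact formulas: the weighted mean and weighted pooled standard deviation are generic, the class-size factor of standard NSC is omitted, and the names `shrunken_offsets`, `delta` and `s0` are hypothetical. Only the soft-thresholding step, sign(d) * max(|d| - delta, 0), follows the standard NSC definition. The expression values and weights are invented toy numbers.

```python
import math

def weighted_mean(values, weights):
    """Mean of values with each observation scaled by its precision weight."""
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

def soft_threshold(d, delta):
    """Shrink d toward zero by delta; values within delta of zero become 0."""
    return math.copysign(max(abs(d) - delta, 0.0), d)

def shrunken_offsets(x, w, labels, delta, s0=0.1):
    """For each gene, standardize the difference between each weighted class
    mean and the weighted overall mean, then soft-threshold it.
    x[i][j] and w[i][j] are the expression value and precision weight of
    gene j in sample i. Simplification (assumed): the pooled variance is
    divided by the total weight, and s0 is a fixed positive constant."""
    classes = sorted(set(labels))
    offsets = {k: [] for k in classes}
    for j in range(len(x[0])):
        col = [row[j] for row in x]
        wcol = [row[j] for row in w]
        overall = weighted_mean(col, wcol)
        class_means = {}
        weighted_ss = 0.0
        for k in classes:
            idx = [i for i, lab in enumerate(labels) if lab == k]
            mean_k = weighted_mean([col[i] for i in idx], [wcol[i] for i in idx])
            class_means[k] = mean_k
            weighted_ss += sum(wcol[i] * (col[i] - mean_k) ** 2 for i in idx)
        pooled_sd = math.sqrt(weighted_ss / sum(wcol))
        for k in classes:
            d = (class_means[k] - overall) / (pooled_sd + s0)
            offsets[k].append(soft_threshold(d, delta))
    return offsets

# Toy data (assumed): gene 0 separates the classes, gene 1 does not.
x = [[5.0, 2.0], [5.2, 2.1], [1.0, 2.1], [1.2, 2.0]]
w = [[1.0, 1.0], [2.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
labels = ["A", "A", "B", "B"]

offsets = shrunken_offsets(x, w, labels, delta=1.0)
# A gene is kept only if at least one class offset survives the shrinkage.
selected = [j for j in range(len(x[0]))
            if any(offsets[k][j] != 0.0 for k in offsets)]
print(selected)
```

Genes whose offsets are shrunk to zero in every class drop out of the classifier, which is the source of the sparsity the abstract reports; larger values of `delta` select fewer genes.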
