Results 1 - 6 of 6
1.
Brief Bioinform ; 23(6), 2022 Nov 19.
Article in English | MEDLINE | ID: mdl-36326078

ABSTRACT

Most polygenic risk score (PRS) models have been based on data from populations of European origin (which account for the majority of large genomics datasets, e.g. >78% in the UK Biobank and >85% in the GTEx project). Although several large-scale Asian biobanks have been initiated (e.g. Japanese, Korean and Han Chinese biobanks), most other Asian countries have little or almost no genomics data. To implement PRS models for under-represented populations, we explored transfer learning approaches, assuming that information from existing large datasets can compensate for the small sample sizes that can feasibly be obtained in developing countries such as Vietnam. Here, we benchmark 13 common PRS methods under a meta-population strategy (combining individual genotype data from multiple populations) and a multi-population strategy (combining summary statistics from multiple populations). Our results highlight the complementarity of different populations and show that the choice of method should depend on the target population. Based on these results, we discuss a set of guidelines to help users select the best method for their datasets. We developed robust and comprehensive software to allow benchmarking comparisons between methods and propose a computational framework for improving PRS performance in datasets with small sample sizes. This work is expected to inform the development of genomics applications in under-represented populations. The PRSUP framework is available at: https://github.com/BiomedicalMachineLearning/VGP.


Subject(s)
Genome-Wide Association Study , Multifactorial Inheritance , Humans , Genetic Predisposition to Disease , Single Nucleotide Polymorphism , Vietnam , Genomics/methods , Risk Factors
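
The polygenic risk score at the centre of the abstract above is, in its most basic textbook form, a weighted sum of risk-allele dosages. The following is a minimal sketch of that basic form only, not of any of the 13 benchmarked methods or of the PRSUP framework. The function name, sample names and effect sizes are hypothetical and chosen purely for illustration.

```python
from typing import Sequence


def polygenic_risk_score(dosages: Sequence[float], effect_sizes: Sequence[float]) -> float:
    """Return the weighted sum of allele dosages for one individual.

    dosages: copies of the effect allele per variant (typically 0, 1 or 2).
    effect_sizes: per-variant weights, e.g. GWAS effect estimates.
    """
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    return sum(g * b for g, b in zip(dosages, effect_sizes))


# Hypothetical toy data: three variants with invented effect sizes.
effect_sizes = [0.5, -0.25, 1.0]
individuals = {
    "sample_a": [0, 1, 2],
    "sample_b": [2, 2, 0],
}

# One score per individual.
scores = {name: polygenic_risk_score(g, effect_sizes) for name, g in individuals.items()}
```

In practice the methods compared in such benchmarks differ mainly in how the per-variant weights are estimated, while the final scoring step stays close to this sum.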
2.
BMC Infect Dis ; 22(1): 558, 2022 Jun 19.
Article in English | MEDLINE | ID: mdl-35718768

ABSTRACT

BACKGROUND: Coronavirus disease 2019 (COVID-19) has been declared a global pandemic, with serious impacts on human health and healthcare systems in affected areas, including Vietnam. No previous study has offered a framework that provides summary statistics of the virus variants and assesses the severity associated with virus proteins and host cells in COVID-19 patients in Vietnam. METHOD: In this paper, we comprehensively investigated SARS-CoV-2 variants and immune responses in COVID-19 patients. We provided summary statistics of target sequences of SARS-CoV-2 in Vietnam and other countries for data scientists to use in downstream analyses of therapeutic targets. For host cells, we proposed a predictive model of COVID-19 severity based on public datasets of hospitalization status in Vietnam, incorporating a polygenic risk score. This score uses immunogenic SNP biomarkers as indicators of COVID-19 severity. RESULT: Using various data sources, we found that the Delta variant of SARS-CoV-2 was the most prevalent in southern areas of Vietnam and that this differed from other areas of the world. Our predictive models of COVID-19 severity had high accuracy (Random Forest AUC = 0.81, Elastic Net AUC = 0.7, and SVM AUC = 0.69) and showed that the use of polygenic risk scores increased the models' predictive capabilities. CONCLUSION: We provided a comprehensive analysis of COVID-19 severity in Vietnam. This investigation is not only helpful for therapeutic target studies in COVID-19 treatment, but could also influence further research on disease progression and personalized clinical outcomes.


Subject(s)
COVID-19 Drug Treatment , COVID-19 , Coronavirus Infections , Viral Pneumonia , Betacoronavirus , COVID-19/epidemiology , Genome-Wide Association Study , Humans , SARS-CoV-2/genetics , Vietnam/epidemiology
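
The abstract above reports model performance as AUC values. As a reminder of what that number measures, the sketch below computes the area under the ROC curve directly from its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The labels and scores are hypothetical toy values and are not taken from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: 1 for a positive (e.g. severe) case, 0 for a negative case.
    scores: model outputs, where higher means more likely positive.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    # A positive outranking a negative counts 1, a tie counts 0.5.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical toy predictions for five patients.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.6, 0.4, 0.5, 0.2]
auc = roc_auc(labels, scores)  # 5 of the 6 positive/negative pairs are ordered correctly
```

This quadratic pairwise form is only practical for small examples; library implementations typically sort the scores instead.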
3.
BMC Bioinformatics ; 21(1): 244, 2020 Jun 15.
Article in English | MEDLINE | ID: mdl-32539680

ABSTRACT

BACKGROUND: The misregulation of microRNA (miRNA) has been shown to cause diseases. Recently, we proposed a computational method based on a random walk framework on a miRNA-target gene network to predict disease-associated miRNAs. The prediction performance of our method is better than that of some existing state-of-the-art network- and machine learning-based methods because it exploits the mutual regulation between miRNAs and their target genes in miRNA-target gene interaction networks. RESULTS: To facilitate the use of this method, we have developed a Cytoscape app, named RWRMTN, to predict disease-associated miRNAs. RWRMTN can work on any miRNA-target gene network. Highly ranked miRNAs are supported with evidence from the literature. They can then be visualized based on their rankings and on their relationships with the query disease and their target genes. In addition, automation functions are integrated, which allow RWRMTN to be used in workflows from external environments. We demonstrate the ability of RWRMTN to predict breast and lung cancer-associated miRNAs via workflows in Cytoscape and other environments. CONCLUSIONS: Considering that few computational methods have been developed into software tools for convenient use, RWRMTN is among the first GUI-based tools for the prediction of disease-associated miRNAs that can be used in workflows in different environments.


Subject(s)
Computational Biology/methods , Gene Regulatory Networks/genetics , MicroRNAs/genetics , Humans
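
The abstract above describes ranking candidate miRNAs with a random walk on a miRNA-target gene network. A generic random walk with restart can be sketched as below. This is the standard iteration in which, at each step, a fraction of the probability mass returns to the seed nodes and the rest spreads evenly to neighbours. It is an assumption-laden illustration, not RWRMTN's actual implementation. The node names, the restart probability of 0.5 and the toy network are all hypothetical.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for every node.

    adjacency: dict mapping each node to a list of its neighbours (undirected,
               so every edge appears in both directions).
    seeds: set of nodes known to be associated with the query disease.
    """
    nodes = sorted(adjacency)
    if not seeds or any(s not in adjacency for s in seeds):
        raise ValueError("seeds must be a non-empty subset of the network nodes")
    # The walker starts at, and restarts from, the seeds with equal probability.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                continue  # isolated nodes pass nothing on (none exist in the toy data)
            share = (1.0 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Hypothetical toy network: miR-A and miR-B share a target gene, miR-C does not.
network = {
    "miR-A": ["GENE1", "GENE2"],
    "miR-B": ["GENE2"],
    "miR-C": ["GENE3"],
    "GENE1": ["miR-A"],
    "GENE2": ["miR-A", "miR-B"],
    "GENE3": ["miR-C"],
}
rwr_scores = random_walk_with_restart(network, seeds={"miR-A"})
candidates = sorted(["miR-B", "miR-C"], key=rwr_scores.get, reverse=True)
```

With miR-A as the only seed, miR-B ranks above miR-C because it is reachable through the shared target gene, whereas miR-C sits in a disconnected component and receives no probability.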
4.
Arch Virol ; 165(12): 2921-2926, 2020 Dec.
Article in English | MEDLINE | ID: mdl-32989573

ABSTRACT

In this study, we present an analysis of metagenome sequences obtained from a filtrate of a siphon tissue homogenate of otter clams (Lutraria rhynchaena) with swollen-siphon disease. The viral signal was mined from the metagenomic data, and a novel circular ssDNA virus was identified. Genomic features and phylogenetic analysis showed that the virus belongs to the phylum Cressdnaviricota, which consists of viruses with circular, single-stranded DNA (ssDNA) genomes. Members of this phylum have been identified in various species and in environmental samples. The newly found virus is distantly related to the currently known members of the phylum Cressdnaviricota.


Subject(s)
Bivalvia/genetics , DNA Viruses/classification , Viral DNA/genetics , Viral Genome , Animals , DNA Viruses/isolation & purification , Circular DNA/genetics , Single-Stranded DNA/genetics , Environmental Microbiology , Metagenomics , Phylogeny , DNA Sequence Analysis
5.
Sci Rep ; 12(1): 17556, 2022 Oct 20.
Article in English | MEDLINE | ID: mdl-36266455

ABSTRACT

Despite the overwhelming use of next-generation sequencing technologies, microarray-based genotyping combined with the imputation of untyped variants remains a cost-effective means to interrogate genetic variation across the human genome. This technology is widely used in genome-wide association studies (GWAS) at biobank scale and, more recently, in polygenic score (PGS) analysis to predict and stratify disease risk. Over the last decade, human genotyping arrays have grown tremendously in both number and content, making a comprehensive evaluation of their performance increasingly important. Here, we performed a comprehensive performance assessment of 23 available human genotyping arrays in 6 ancestry groups using diverse public and in-house datasets. The analyses focus on estimating the performance of the derived imputation (in terms of accuracy and coverage) and PGS (in terms of concordance with PGS estimated from whole-genome sequencing data) in three different traits and diseases. We found that the arrays with a higher number of SNPs are not necessarily the ones with higher imputation performance, but that arrays well optimized for the targeted population can provide very good imputation performance. In addition, PGS estimated from imputed SNP array data is highly correlated with PGS estimated from whole-genome sequencing data in most cases. When optimal arrays are used, the correlations of PGS between the two types of data are higher than 0.97, but interestingly, arrays with high density can result in lower PGS performance. Our results suggest the importance of properly selecting a suitable genotyping array for PGS applications. Finally, we developed a web tool that provides interactive analyses of tag SNP content and imputation performance based on the population and genomic regions of interest. This study can serve as a practical guide for researchers designing genotyping array-based studies.
The tool is available at: https://genome.vinbigdata.org/tools/saa/.


Subject(s)
Human Genome , Genome-Wide Association Study , Humans , Genotype , Single Nucleotide Polymorphism , High-Throughput Nucleotide Sequencing/methods
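
The abstract above assesses PGS concordance as the correlation between scores derived from imputed array data and from whole-genome sequencing. A minimal sketch of such a comparison is given below, assuming that Pearson correlation is the measure of interest (the abstract does not name the coefficient). The two score lists are hypothetical toy values for five individuals.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical PGS values for the same five individuals from two data sources.
pgs_wgs = [0.10, 0.40, 0.35, 0.80, 0.60]    # from whole-genome sequencing
pgs_array = [0.12, 0.38, 0.30, 0.85, 0.55]  # from imputed array genotypes

concordance = pearson_correlation(pgs_wgs, pgs_array)
```

A value close to 1 indicates that the array-based scores rank and scale individuals almost identically to the sequencing-based scores.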
6.
Microbiol Resour Announc ; 9(2), 2020 Jan 09.
Article in English | MEDLINE | ID: mdl-31919158

ABSTRACT

Otter clam farming in Vietnam has recently encountered difficulties due to swollen-siphon disease. Here, we report the metagenome sequences of microorganisms extracted from the siphon tissue of infected otter clams. The data comprised bacterial and viral sequences which likely include those derived from the disease-causing agent.
