Results 1 - 4 of 4
1.
BMC Bioinformatics ; 25(1): 40, 2024 Jan 23.
Article in English | MEDLINE | ID: mdl-38262930

ABSTRACT

BACKGROUND: Clustering is a fundamental problem in statistics with broad applications in many areas. Traditional clustering methods treat all features equally and ignore the structure implied by differences between feature types. In cancer diagnosis and treatment in particular, several types of biological features are collected and analyzed together. Treating these features equally fails to capture the heterogeneity of both the data structure and the cancer itself, which leads to the incompleteness and inefficacy of current anti-cancer therapies. OBJECTIVES: In this paper, we propose a clustering framework for hierarchical heterogeneous data with prior pairwise relationships. The proposed clustering method fully characterizes the differences between features and identifies potential hierarchical structure through rough and refined clusters. RESULTS: The refined clustering further divides the clusters obtained by the rough clustering into subtypes, and thus provides deeper insight into cancer that cannot be obtained with existing clustering methods. The proposed method is also flexible with respect to prior information: additional pairwise relationships between samples can be incorporated to improve clustering performance. Finally, statistical consistency properties of the proposed method are rigorously established, including accurate estimation of parameters and determination of clustering structures. CONCLUSIONS: The proposed method achieves better clustering performance than other methods in simulation studies, and clustering accuracy increases when prior information is incorporated. Meaningful biological findings are obtained in the analysis of lung adenocarcinoma with clinical imaging data and omics data, showing that the hierarchical structure produced by rough and refined clustering is necessary and reasonable.
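
The rough-then-refined idea in this abstract can be illustrated with a deliberately simple stand-in. The sketch below is not the authors' estimator: it swaps in plain k-means, applies it first to one feature type (for example imaging) to get rough clusters, then re-clusters each rough cluster on a second feature type (for example omics). The function names, the two-stage k-means choice, and the omission of prior pairwise relationships are all assumptions made for illustration.

```python
from math import dist


def kmeans(points, k, iters=50):
    """Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from its nearest existing center.
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old center if a cluster happens to be empty.
            updated.append(
                tuple(sum(col) / len(members) for col in zip(*members))
                if members else centers[j]
            )
        if updated == centers:
            break
        centers = updated
    return labels


def rough_then_refined(rough_feats, refined_feats, k_rough, k_refined):
    """Hypothetical two-stage clustering: returns (rough, refined) label pairs."""
    rough = kmeans(rough_feats, k_rough)
    result = [None] * len(rough)
    for g in range(k_rough):
        idx = [i for i, lab in enumerate(rough) if lab == g]
        if not idx:
            continue
        # Refine within one rough cluster using the second feature type only.
        sub = kmeans([refined_feats[i] for i in idx], k_refined)
        for i, s in zip(idx, sub):
            result[i] = (g, s)
    return result
```

Each sample is passed as a tuple of feature values, and the refined label is only meaningful within its rough cluster, which mirrors the hierarchical reading of "subtypes" in the abstract.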


Subjects
Adenocarcinoma of Lung , Lung Neoplasms , Humans , Cluster Analysis , Computer Simulation
2.
Bioinformatics ; 39(4), 2023 04 03.
Article in English | MEDLINE | ID: mdl-37027223

ABSTRACT

MOTIVATION: Traditional genome-wide association studies focus on testing one-to-one relationships between genetic variants and complex human diseases or traits. Despite its success over the past decade, this one-to-one paradigm lacks efficiency because it does not use information on intrinsic genetic structure and pleiotropic effects. For privacy reasons, only summary statistics of current genome-wide association study data are publicly available. Existing summary statistics-based association tests do not consider covariates in the regression model, even though adjusting for covariates, including population stratification factors, is routine practice. RESULTS: In this work, we first derive the correlation coefficients between summary Wald statistics obtained from a linear regression model with covariates. Then, a new test is proposed by integrating three levels of information: the intrinsic genetic structure, pleiotropy, and the potential information combinations. Extensive simulations demonstrate that the proposed test outperforms three other existing methods under most of the considered scenarios. Real data analysis of polyunsaturated fatty acids further shows that the proposed test can identify more genes than the compared existing methods. AVAILABILITY AND IMPLEMENTATION: Code is available at https://github.com/bschilder/ThreeWayTest.
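
To make "summary Wald statistic from a linear regression model with covariates" concrete, here is a minimal sketch of how such a statistic could be computed for one SNP and one trait. It is not taken from the linked repository. It relies on the standard Frisch-Waugh-Lovell result: residualize both the trait and the genotype on the intercept and covariates, then run a simple regression on the residuals. The helper names `dot`, `residualize`, and `wald_statistic` are hypothetical.

```python
from math import sqrt


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def residualize(v, columns):
    """Remove from v its projection onto span{1, columns} (Gram-Schmidt)."""
    basis = []
    for col in [[1.0] * len(v)] + [list(c) for c in columns]:
        for b in basis:
            coef = dot(col, b) / dot(b, b)
            col = [x - coef * y for x, y in zip(col, b)]
        if dot(col, col) > 1e-12:  # skip columns that are linearly dependent
            basis.append(col)
    res = list(v)
    for b in basis:
        coef = dot(res, b) / dot(b, b)
        res = [x - coef * y for x, y in zip(res, b)]
    return res


def wald_statistic(y, g, covariates):
    """Return (beta_hat, z) for the SNP effect in y ~ 1 + covariates + g."""
    y_r = residualize(y, covariates)
    g_r = residualize(g, covariates)
    sgg = dot(g_r, g_r)
    beta = dot(g_r, y_r) / sgg
    rss = sum((a - beta * b) ** 2 for a, b in zip(y_r, g_r))
    df = len(y) - len(covariates) - 2  # intercept + covariates + SNP
    se = sqrt(rss / df / sgg)
    return beta, beta / se
```

The sketch assumes the covariate columns are linearly independent of each other and of the genotype, and that the fit is not exact (a zero residual sum of squares would make the standard error zero).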


Subjects
Genome-Wide Association Study , Polymorphism, Single Nucleotide , Humans , Genome-Wide Association Study/methods , Phenotype , Linear Models
3.
Genet Epidemiol ; 44(7): 687-701, 2020 10.
Article in English | MEDLINE | ID: mdl-32583530

ABSTRACT

To date, genome-wide association studies (GWASs) have identified thousands of genetic variants associated with numerous human traits and diseases. GWASs focus on testing the association between a single trait and genetic variants. However, the joint analysis of multiple traits and single nucleotide polymorphisms (SNPs) might reflect the physiological processes of complex diseases; such a study is called pleiotropy association analysis. Modern-day GWASs report only summary statistics instead of individual-level phenotype and genotype data to avoid logistical and privacy issues. Existing methods for combining GWAS summary statistics across multiple phenotypes mainly focus on low-dimensional phenotypes and lose power in high-dimensional cases. To overcome this limitation, we propose two kinds of truncated tests to combine summary statistics from multiple phenotypes. Extensive simulations show that the proposed methods are robust and powerful when the dimension of the phenotypes is high and only some of the phenotypes are associated with the SNPs. We apply the proposed methods to blood cytokine data collected from a Finnish population. The results show that the proposed tests can identify additional genetic markers that are missed by single-trait analysis.
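
The abstract does not define its two truncated tests, so the following is only a generic illustration of the truncation idea, not the authors' statistics. It sums squared z-scores over phenotypes whose absolute z-score exceeds a threshold `tau`, then calibrates the result by Monte Carlo. The independence of phenotypes under the null is an assumption made here for brevity; with correlated phenotypes the null draws would need to follow the estimated correlation structure. All names are hypothetical.

```python
import random


def truncated_stat(z_scores, tau):
    """Sum of squared z-scores, keeping only phenotypes with |z| > tau."""
    return sum(z * z for z in z_scores if abs(z) > tau)


def mc_pvalue(z_scores, tau, n_sim=2000, seed=0):
    """Monte Carlo p-value under an ASSUMED null of independent N(0, 1) z-scores."""
    rng = random.Random(seed)
    observed = truncated_stat(z_scores, tau)
    hits = 0
    for _ in range(n_sim):
        null_z = [rng.gauss(0.0, 1.0) for _ in z_scores]
        if truncated_stat(null_z, tau) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_sim + 1)
```

Truncation discards phenotypes that carry mostly noise, which is one way to keep power when only a few of many phenotypes are associated with the SNP.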


Subjects
Cytokines/blood , Cytokines/genetics , Genome-Wide Association Study/statistics & numerical data , Models, Genetic , Polymorphism, Single Nucleotide/genetics , Computer Simulation , Finland , Genetic Markers/genetics , Genotype , Humans , Phenotype
4.
PLoS One ; 18(2): e0281286, 2023.
Article in English | MEDLINE | ID: mdl-36745614

ABSTRACT

Based on the observation that gene expression levels are correlated, the Library of Integrated Network-based Cell-Signature program selects 1000 landmark genes to predict the remaining gene expression values. Subsequent work has improved the prediction results by using deep learning models. However, these models ignore the latent structure of genes, which limits the accuracy of their results. We therefore propose a novel neural network, the Neighbour Connection Neural Network (NCNN), that uses gene interaction graph information. Compared with the popular GCN model, our model incorporates the graph information in a better manner. We validate our model under two different settings and show that it improves prediction accuracy compared with the other models.
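
The abstract does not describe the NCNN architecture, so the sketch below shows only one plausible reading of "neighbour connection": a linear layer in which each target gene receives input solely from the landmark genes it is linked to in the interaction graph. The function name, the masking scheme, and the layout of `adjacency` (rows are target genes, columns are landmark genes) are assumptions, not the authors' design.

```python
def neighbour_layer(x, weights, adjacency, bias):
    """Masked linear layer: y_i = b_i + sum over neighbours j of W[i][j] * x[j].

    adjacency[i][j] is truthy when target gene i is connected to landmark
    gene j in the (assumed) gene interaction graph; other weights are ignored.
    """
    out = []
    for w_row, a_row, b in zip(weights, adjacency, bias):
        out.append(b + sum(w * xj for w, a, xj in zip(w_row, a_row, x) if a))
    return out
```

For example, with three landmark genes and two target genes, a target connected to the first two landmarks would ignore the third landmark's expression entirely, whatever weight is stored for it.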


Subjects
Epistasis, Genetic , Libraries , Gene Library , Neural Networks, Computer , Gene Expression