Search | Biblioteca Virtual em Saúde (Virtual Health Library)

Accurate Sequence-Based Prediction of Deleterious nsSNPs with Multiple Sequence Profiles and Putative Binding Residues.

Song, Ruiyang; Cao, Baixin; Peng, Zhenling; Oldfield, Christopher J; Kurgan, Lukasz; Wong, Ka-Chun; Yang, Jianyi.

Biomolecules; 11(9), 2021 09 09.

Article in English | MEDLINE | ID: mdl-34572550

ABSTRACT

Non-synonymous single nucleotide polymorphisms (nsSNPs) may result in pathogenic changes that are associated with human diseases. Accurate prediction of these deleterious nsSNPs is in high demand. Existing predictors of deleterious nsSNPs achieve only modest predictive performance, leaving room for improvement. We propose a new sequence-based predictor, DMBS, which addresses the need to improve predictive quality. The design of DMBS relies on the observation that deleterious mutations are likely to occur at highly conserved and functionally important positions in the protein sequence. Correspondingly, we introduce two innovative components. First, we improve the estimates of conservation computed from multiple sequence profiles based on two complementary databases and two complementary alignment algorithms. Second, we utilize putative annotations of functional/binding residues produced by two state-of-the-art sequence-based methods. These inputs are processed by a random forest model that provides favorable predictive performance when empirically compared against five other machine-learning algorithms. Empirical results on four benchmark datasets reveal that DMBS achieves AUC > 0.94, outperforming current methods, including protein structure-based approaches. In particular, DMBS secures AUC = 0.97 for the SNPdbe and ExoVar datasets, compared to AUC = 0.70 and 0.88, respectively, obtained by the best available methods. Further tests on the independent HumVar dataset show that our method significantly outperforms the state-of-the-art method SNPdryad. We conclude that DMBS provides accurate predictions that can effectively guide wet-lab experiments in a high-throughput manner.

Subjects

Algorithms; Computational Biology/methods; Polymorphism, Single Nucleotide/genetics; Proteins/chemistry; Proteins/metabolism; Area Under Curve; Base Sequence; Databases, Genetic; Humans; Ligands; Machine Learning; Protein Binding; ROC Curve
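
The DMBS abstract above rests on the idea that deleterious mutations tend to fall at highly conserved positions. As a rough illustration of what a per-position conservation estimate can look like, the sketch below scores each column of a toy multiple sequence alignment with normalized Shannon entropy. This is an assumed, simplified measure chosen for illustration; it is not the feature set, databases, or alignment tools used by DMBS, and the function names `column_conservation` and `conservation_profile` are hypothetical.

```python
import math
from collections import Counter


def column_conservation(column, alphabet_size=20):
    """Score one alignment column: 1.0 = fully conserved, lower = more variable.

    Assumption: conservation is 1 minus Shannon entropy normalized by
    log2(alphabet_size); gap characters ('-') are ignored.
    """
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0  # a gap-only column carries no conservation signal
    total = len(residues)
    counts = Counter(residues)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return 1.0 - entropy / math.log2(alphabet_size)


def conservation_profile(alignment):
    """Return one conservation score per column of equal-length aligned rows."""
    return [column_conservation(col) for col in zip(*alignment)]


# Toy alignment (hypothetical sequences): column 0 is invariant,
# column 2 differs in every row.
alignment = ["MKV", "MKL", "MRI", "MKA"]
profile = conservation_profile(alignment)
```

In a predictor of this general kind, scores like these would be one input among several to a classifier; the abstract does not specify how DMBS combines its profiles, so no such step is shown here.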

Accurate confidence intervals for risk difference in meta-analysis with rare events.

Jiang, Tao; Cao, Baixin; Shan, Guogen.

BMC Med Res Methodol; 20(1): 98, 2020 04 30.

Article in English | MEDLINE | ID: mdl-32349702

ABSTRACT

BACKGROUND: Meta-analysis provides a useful statistical tool to effectively estimate the treatment effect from multiple studies. When the outcome is binary and rare (e.g., safety data in clinical trials), the traditionally used methods may perform unsatisfactorily. METHODS: We propose using importance sampling to compute confidence intervals for the risk difference in meta-analysis with rare events. The proposed intervals are not exact, but their coverage probabilities are often close to the nominal level. We compare the proposed accurate intervals with the existing intervals from the fixed- or random-effects models and the interval by Tian et al. (2009). RESULTS: We conduct extensive simulation studies to compare them with regard to coverage probability and average length, when data are simulated under the homogeneity or heterogeneity assumption of study effects. CONCLUSIONS: The proposed accurate interval based on the random-effects model for sample space ordering generally has satisfactory performance under the heterogeneity assumption, while the traditionally used interval based on the fixed-effects model works well when the studies are homogeneous.

Subjects

Meta-Analysis as Topic; Models, Statistical; Computer Simulation; Confidence Intervals; Humans; Probability; Risk
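
For context on the "traditionally used" fixed-effects interval that the abstract above compares against, here is a minimal sketch of a pooled risk difference with a Wald-type confidence interval. It assumes Mantel-Haenszel-style weights `n1 * n0 / (n1 + n0)` and a binomial variance for each arm. It does not implement the importance-sampling intervals proposed in the paper, and the function name `pooled_risk_difference` and the example counts are hypothetical.

```python
import math
from statistics import NormalDist


def pooled_risk_difference(studies, level=0.95):
    """Fixed-effects pooled risk difference with a Wald-type interval.

    studies: iterable of (events_treatment, n_treatment,
                          events_control, n_control).
    Returns (estimate, lower, upper).
    """
    weight_sum = 0.0
    weighted_rd = 0.0
    weighted_var = 0.0
    for e1, n1, e0, n0 in studies:
        p1, p0 = e1 / n1, e0 / n0
        w = n1 * n0 / (n1 + n0)  # assumed Mantel-Haenszel-style weight
        weight_sum += w
        weighted_rd += w * (p1 - p0)
        # Binomial variance of each arm's proportion, scaled by w squared.
        weighted_var += w ** 2 * (p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0)
    rd = weighted_rd / weight_sum
    se = math.sqrt(weighted_var) / weight_sum
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided normal quantile
    return rd, rd - z * se, rd + z * se


# Hypothetical rare-event data: two small-count studies.
rd, lower, upper = pooled_risk_difference([(1, 100, 0, 100), (2, 200, 1, 200)])
```

Note that an arm with zero events contributes a variance term of exactly zero in this formula, which is one way a Wald-type interval can become too narrow when events are rare.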
