Results 1 - 20 of 22
1.
mBio ; 14(2): e0288322, 2023 04 25.
Article in English | MEDLINE | ID: mdl-36779710

ABSTRACT

Blast disease caused by Magnaporthe oryzae threatens rice production worldwide, and chemical control is one of the main methods of its management. The high mutation rate of the M. oryzae genome results in drug resistance, which calls for novel fungicide targets. Fungal proteins that function during the infection process might be potential candidates, and Mps1 (M. oryzae mitogen-activated protein kinase 1) is such a protein that plays a critical role in appressorium penetration of the plant cell wall. Here, we report the structure-aided identification of a small-molecule inhibitor of Mps1. High-throughput screening was performed with Mps1 against a DNA-encoded compound library, and one compound, named A378-0, with the best performance was selected for further verification. A378-0 exhibits a higher binding affinity than the kinase cosubstrate ATP and can inhibit the enzyme activity of Mps1. Cocrystallization of A378-0 with Mps1 revealed that A378-0 binds to the catalytic pocket of Mps1, while the three ring-type substructures of A378-0 constitute a triangle that squeezes into the pocket. In planta assays showed that A378-0 could inhibit both the appressorium penetration and invasive growth but not the appressorium development of M. oryzae, which is consistent with the biological function of Mps1. Furthermore, A378-0 exhibits binding and activity inhibition abilities against Mpk1, the Mps1 ortholog of the soilborne fungal pathogen Fusarium oxysporum. Collectively, these results show that Mps1 as well as its orthologs can be regarded as fungicide targets, and A378-0 might be used as a hit compound for the development of a broad-spectrum fungicide. IMPORTANCE M. oryzae is the causal agent of rice blast, one of the most devastating diseases of cultivated rice. Chemical control is still the main strategy for its management, and the identification of novel fungicide targets is indispensable for overcoming existing problems such as drug resistance and food safety. 
With a combination of structural, biochemical, and in planta assays, our research shows that Mps1 may serve as a fungicide target and confirms that compound A378-0 binds to Mps1 and possesses bioactivity in inhibiting M. oryzae virulence. As fungal orthologs of Mps1 are conserved, A378-0 may serve as a hit for broad-spectrum fungicide development, as evidenced with Mpk1, the Mps1 ortholog of F. oxysporum. Additionally, A378-0 contains a novel chemical scaffold that has not been reported in approved kinase inhibitors, suggesting its potential to be considered the basis for the development of other kinase inhibitors.


Subject(s)
Fungicides, Industrial; Fungicides, Industrial/pharmacology; Fungi/genetics; Fungi/metabolism; Fungal Proteins/genetics; Fungal Proteins/metabolism; Plants/microbiology; Virulence; Plant Diseases/microbiology; Gene Expression Regulation, Fungal
2.
Methods Mol Biol ; 2541: 195-205, 2022.
Article in English | MEDLINE | ID: mdl-36083558

ABSTRACT

DNA-encoded library (DEL) screens are used to discover novel chemical matter capable of modulating the activity of pharmaceutically interesting protein targets. DEL selections are accomplished by immobilizing a target protein on a resin and capturing library molecules that bind to the target. The barcodes of the captured library molecules are then amplified and sequenced. This chapter outlines simple methods for visualizing the resulting screening data (using free open-source software), such that enriched molecules can be selected for synthesis and follow-up activity confirmation. Measures of enrichment and the concept of sub-libraries are also illustrated.
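
The abstract mentions measures of enrichment without defining them. As a minimal sketch of one common choice, the code below computes a fold-enrichment ratio per barcode: its normalized frequency in the target selection divided by its normalized frequency in a no-target control. The function name, the pseudocount, the barcode labels, and the read counts are all hypothetical and are not taken from the chapter.

```python
from collections import Counter


def fold_enrichment(selected, control, pseudocount=1.0):
    """Return {barcode: normalized selection frequency / normalized control frequency}.

    A pseudocount is added to every barcode so that barcodes missing from
    one sample do not cause a division by zero (an assumption of this sketch).
    """
    barcodes = set(selected) | set(control)
    sel_total = sum(selected.values()) + pseudocount * len(barcodes)
    ctl_total = sum(control.values()) + pseudocount * len(barcodes)
    scores = {}
    for barcode in barcodes:
        sel_freq = (selected.get(barcode, 0) + pseudocount) / sel_total
        ctl_freq = (control.get(barcode, 0) + pseudocount) / ctl_total
        scores[barcode] = sel_freq / ctl_freq
    return scores


# Invented sequencing read counts for three library members.
selected_counts = Counter({"A1-B1-C1": 90, "A1-B2-C1": 5, "A2-B1-C3": 5})
control_counts = Counter({"A1-B1-C1": 10, "A1-B2-C1": 45, "A2-B1-C3": 45})

scores = fold_enrichment(selected_counts, control_counts)
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Sorting by the ratio puts the most enriched barcodes first, which is the kind of shortlist a follow-up synthesis step would start from.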


Subject(s)
DNA; Small Molecule Libraries; Base Sequence; DNA/chemistry; DNA/genetics; Gene Library; Small Molecule Libraries/chemistry
3.
Molecules ; 27(18)2022 Sep 07.
Article in English | MEDLINE | ID: mdl-36144532

ABSTRACT

The recent successes of AlphaFold and RoseTTAFold have demonstrated the value of AI methods in highly accurate protein structure prediction. Despite these advances, the role of these methods in the context of small-molecule drug discovery still needs to be thoroughly explored. In this study, we evaluated whether the AI-based models can reliably reproduce the three-dimensional structures of protein-ligand complexes. The structure we chose was NLRP3, a challenging protein target in terms of obtaining a three-dimensional model both experimentally and computationally. The conformation of the binding pockets generated by the AI models was carefully characterized and compared with experimental structures. Further molecular docking results indicated that AI-predicted protein structures combined with molecular dynamics simulations offer a promising approach in small-molecule drug discovery.


Subject(s)
NLR Family, Pyrin Domain-Containing 3 Protein; Proteins; Artificial Intelligence; Ligands; Molecular Docking Simulation; NLR Family, Pyrin Domain-Containing 3 Protein/metabolism; Protein Binding; Protein Conformation; Proteins/chemistry
4.
J Med Chem ; 65(19): 12725-12746, 2022 10 13.
Article in English | MEDLINE | ID: mdl-36117290

ABSTRACT

Targeted protein degradation (TPD) strategies exploit bivalent small molecules to bridge substrate proteins to an E3 ubiquitin ligase to induce substrate degradation. Few E3s have been explored as degradation effectors due to a dearth of E3-binding small molecules. We show that genetically induced recruitment to the GID4 subunit of the CTLH E3 complex induces protein degradation. An NMR-based fragment screen followed by structure-guided analog elaboration identified two binders of GID4, 16 and 67, with Kd values of 110 and 17 µM in vitro. A parallel DNA-encoded library (DEL) screen identified five binders of GID4, the best of which, 88, had a Kd of 5.6 µM in vitro and an EC50 of 558 nM in cells with strong selectivity for GID4. X-ray co-structure determination revealed the basis for GID4-small molecule interactions. These results position GID4-CTLH as an E3 for TPD and provide candidate scaffolds for high-affinity moieties that bind GID4.


Subject(s)
DNA; Ubiquitin-Protein Ligases; DNA/metabolism; Humans; Proteolysis; Ubiquitin-Protein Ligases/metabolism
5.
Org Lett ; 22(24): 9484-9489, 2020 12 18.
Article in English | MEDLINE | ID: mdl-33170713

ABSTRACT

We report a DNA-compatible photoredox decarboxylative coupling of α-amino acids with carbonyl compounds to access DNA-encoded sp3-rich 1,2-amino alcohols. The reaction proceeds efficiently for a wide range of DNA-conjugated aldehydes and ketones and provides the desired 1,2-amino alcohols with conversions generally >50%. Additional utility of the developed protocol is demonstrated by one-pot cyclization of DNA-conjugated 1,2-amino alcohols into oxazolidinones and morpholinones. Lastly, qPCR and sequencing data analysis indicates no significant DNA damage upon photoredox decarboxylative coupling.


Subject(s)
Amino Alcohols/chemical synthesis; DNA/chemistry; Ketones/chemistry; Amino Alcohols/chemistry; Catalysis; Cyclization; Molecular Structure; Oxidation-Reduction
6.
Bioconjug Chem ; 31(9): 2092-2097, 2020 09 16.
Article in English | MEDLINE | ID: mdl-32804494

ABSTRACT

We report a DNA-compatible protocol for synthesizing amides from DNA-bound aldehydes and non-nucleophilic arylamines including aza-substituted anilines, 2-aminobenzimidazoles, and 3-aminopyrazoles. The reactions were carried out at room temperature and provided reasonable conversions and wide functional group compatibility. The reactions were also successful when employing aryl and aliphatic aldehydes. In addition, qPCR and NGS data suggested no negative impact on DNA integrity after the copper-mediated oxidative amidation reaction.


Subject(s)
Aldehydes/chemistry; Amides/chemistry; Amines/chemistry; Copper/chemistry; DNA/chemistry; Aldehydes/chemical synthesis; Amides/chemical synthesis; Aniline Compounds/chemistry; Catalysis; Oxidation-Reduction
7.
iScience ; 23(6): 101142, 2020 Jun 26.
Article in English | MEDLINE | ID: mdl-32446221

ABSTRACT

The application of machine learning toward DNA-encoded library (DEL) technology is lacking despite obvious synergy between these two advancing technologies. Herein, a machine learning algorithm has been developed that predicts the conversion rate for the DNA-compatible reaction of a building block with a model DNA-conjugate. We exemplify the value of this technique with a challenging reaction, the Pictet-Spengler, where acidic conditions are normally required to achieve the desired cyclization between tryptophan and aldehydes to provide tryptolines. This is the first demonstration of using a machine learning algorithm to cull potential building blocks prior to their purchase and testing for DNA-encoded library synthesis. Importantly, this allows for a challenging reaction, with an otherwise very low building block pass rate in the test reaction, to still be used in DEL synthesis. Furthermore, because our protocol is solution phase, it is directly applicable to standard plate-based DEL synthesis.

8.
Biochem Biophys Res Commun ; 533(2): 209-214, 2020 12 03.
Article in English | MEDLINE | ID: mdl-32376009

ABSTRACT

A mild, DNA-compatible, palladium-promoted Suzuki-Miyaura cross-coupling of potassium Boc-protected aminomethyltrifluoroborate with DNA-conjugated aryl bromides has been developed. This novel DNA-encoded chemistry reaction proceeded well with a wide range of functional group tolerance, including aryl bromides and heteroaryl bromides. Further, the utility of our DNA-conjugated aminomethylated arene products is demonstrated by reaction with various types of reagents (including amide formation with carboxylic acids, alkylation with aldehydes, and carbamoylation with amines), as would be desired for the production of a DNA-encoded library.


Subject(s)
Borates/chemistry; Bromides/chemistry; DNA/chemistry; Hydrocarbons, Aromatic/chemistry; Amination; Borates/chemical synthesis; Bromides/chemical synthesis; Catalysis; Combinatorial Chemistry Techniques; DNA/chemical synthesis; Halogenation; Hydrocarbons, Aromatic/chemical synthesis; Methylation; Palladium/chemistry; Potassium/chemistry; Small Molecule Libraries/chemical synthesis; Small Molecule Libraries/chemistry
9.
Org Lett ; 22(11): 4146-4150, 2020 06 05.
Article in English | MEDLINE | ID: mdl-32383596

ABSTRACT

We report a DNA-compatible copper-mediated efficient synthesis of 1,2,3-triazoles via a one-pot reaction of aryl borates with TMS-N3 followed by a click cycloaddition reaction. Employing the binuclear macrocyclic nanocatalyst Cu(II)-β-cyclodextrin, the reactions were performed under mild conditions with high conversions and wide functional group tolerance. We also demonstrate the reaction application toward a one-pot DNA-compatible intramolecular macrocyclization. Our optimized reaction protocol results in no significant DNA damage as judged by qPCR analysis and Sanger sequencing data.


Subject(s)
Alkynes/chemistry; Azides/chemistry; Borates/chemistry; Copper/chemistry; DNA/chemistry; Triazoles/chemical synthesis; Click Chemistry; Cycloaddition Reaction; Molecular Structure; Triazoles/chemistry
10.
Org Lett ; 22(10): 3931-3935, 2020 05 15.
Article in English | MEDLINE | ID: mdl-32364391

ABSTRACT

A robust DNA-compatible Wittig reaction mediated by PPh2CH3 has been validated for DNA-conjugated α-chloroacetamides with aldehydes and, alternatively, DNA-conjugated aldehydes with α-halo acetamides or ketones. Further, 2-aminopyridines were acylated with α-chloroacetyl chloride and then reacted with DNA-conjugated aldehydes. Lastly, a pilot library employing our optimized Wittig reaction protocol was synthesized. The ability to generate α,β-unsaturated carbonyl compounds may be particularly useful for the design of DNA-encoded libraries capable of covalently interacting with protein targets.


Subject(s)
Aldehydes/chemistry; DNA/chemistry; Ketones/chemistry; Molecular Structure; Stereoisomerism
12.
Biomed Res Int ; 2016: 8351204, 2016.
Article in English | MEDLINE | ID: mdl-26955638

ABSTRACT

The development of biochemistry and molecular biology has revealed an increasingly important role of compounds in several biological processes. Like the aptamer-protein interaction, the aptamer-compound interaction is attracting increasing attention. However, it is time-consuming to select proper aptamers against compounds using traditional methods, such as exponential enrichment. Thus, there is an urgent need to design effective computational methods for finding effective aptamers against compounds. This study attempted to extract important features for aptamer-compound interactions using feature selection methods, such as Maximum Relevance Minimum Redundancy, as well as incremental feature selection. Each aptamer-compound pair was represented by properties derived from the aptamer and compound, including frequencies of single nucleotides and dinucleotides for the aptamer, as well as the constitutional, electrostatic, quantum-chemical, and space conformational descriptors of the compounds. As a result, some important features were obtained. To confirm the importance of the obtained features, we further discussed the associations between them and aptamer-compound interactions. Simultaneously, an optimal prediction model based on the nearest neighbor algorithm was built to identify aptamer-compound interactions, which has the potential to be a useful tool for the identification of novel aptamer-compound interactions. The program is available upon request.
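
The aptamer side of the representation described above (single-nucleotide and dinucleotide frequencies) can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the DNA alphabet `ACGT`, the use of overlapping dinucleotides, and the example sequence are hypothetical, and the compound descriptors mentioned in the abstract are not covered.

```python
from itertools import product


def composition_features(sequence, alphabet="ACGT"):
    """Return single-nucleotide and overlapping dinucleotide frequencies.

    The result has len(alphabet) + len(alphabet) ** 2 entries
    (4 + 16 = 20 for the default DNA alphabet).
    """
    sequence = sequence.upper()
    if len(sequence) < 2:
        raise ValueError("sequence must contain at least two nucleotides")
    features = {}
    # Single-nucleotide frequencies.
    for base in alphabet:
        features[base] = sequence.count(base) / len(sequence)
    # Overlapping dinucleotide frequencies.
    pairs = [sequence[i:i + 2] for i in range(len(sequence) - 1)]
    for first, second in product(alphabet, repeat=2):
        features[first + second] = pairs.count(first + second) / len(pairs)
    return features


# Invented six-nucleotide sequence, far shorter than a real aptamer.
example_features = composition_features("ACGTAC")
```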


Subject(s)
Aptamers, Nucleotide/chemistry; Computational Biology/methods; Models, Theoretical; Proteins/chemistry; Algorithms; Cluster Analysis; Molecular Conformation; Quantum Theory
13.
PLoS One ; 9(1): e86729, 2014.
Article in English | MEDLINE | ID: mdl-24466214

ABSTRACT

Aptamers are oligonucleic acid or peptide molecules that bind to specific target molecules. As a novel and powerful class of ligands, aptamers are thought to have excellent potential for applications in the fields of biosensing, diagnostics and therapeutics. In this study, a new method for predicting aptamer-target interacting pairs was proposed by integrating features derived from both aptamers and their targets. Features of nucleotide composition and traditional amino acid composition as well as pseudo amino acid composition were utilized to represent aptamers and targets, respectively. The predictor was constructed based on Random Forest, and the optimal features were selected by using the maximum relevance minimum redundancy (mRMR) method and the incremental feature selection (IFS) method. As a result, 81.34% accuracy and 0.4612 MCC were obtained for the training dataset, and 77.41% accuracy and 0.3717 MCC were achieved for the testing dataset. An optimal feature set of 220 features was selected, and these features were considered the ones that contributed significantly to the interacting aptamer-target pair predictions. Analysis of the optimal feature set indicated several important factors in determining aptamer-target interactions. It is anticipated that our prediction method may become a useful tool for identifying aptamer-target pairs and that the features selected and analyzed in this study may provide useful insights into the mechanism of interactions between aptamers and targets.
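
The abstract reports accuracy and the Matthews correlation coefficient (MCC) side by side. As a short sketch of how the two are computed from a binary confusion matrix, the code below uses the standard MCC formula; the confusion-matrix counts are invented for illustration and are not the study's data.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when the denominator is zero, a common convention.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Invented, class-imbalanced counts: 15 positives and 85 negatives.
example_accuracy = accuracy(tp=10, tn=80, fp=5, fn=5)
example_mcc = matthews_corrcoef(tp=10, tn=80, fp=5, fn=5)
```

On imbalanced data like this toy case, accuracy can stay high while MCC is noticeably lower, which is one reason both figures tend to be reported together.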


Subject(s)
Aptamers, Nucleotide/chemistry; Aptamers, Peptide/chemistry; Computational Biology/methods; Models, Genetic; Algorithms; Amino Acids/analysis; Artificial Intelligence; Base Composition; Ligands; Structure-Activity Relationship
14.
Mol Genet Genomics ; 288(9): 391-400, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23793388

ABSTRACT

Carboxy-terminal α-amidation is a widespread post-translational modification of proteins in vertebrates and invertebrates. The α-amide group is required for full biological activity, since it may render a peptide more hydrophobic and thus better able to bind to other proteins, preventing ionization of the C-terminus. However, C-terminal amidation is very difficult to detect because experimental methods are often labor-intensive, time-consuming and expensive. Therefore, in silico methods may complement them owing to their high efficiency. In this study, a computational method was developed to predict protein amidation sites by incorporating the maximum relevance minimum redundancy method and the incremental feature selection method based on the nearest neighbor algorithm. From a total of 735 features, 41 optimal features were selected and utilized to construct the final predictor. As a result, the predictor achieved an overall Matthews correlation coefficient of 0.8308. Feature analysis showed that PSSM conservation scores and amino acid factors played the most important roles in the α-amidation site prediction. Site-specific feature analyses showed that features derived from the amidation site itself and adjacent sites were most significant. The method presented could be used as an efficient tool to theoretically predict amidated peptides. The selected features from our study could also shed some light on the in-depth understanding of the mechanisms of the amidation modification, providing guidelines for experimental validation.


Subject(s)
Algorithms; Protein Processing, Post-Translational/physiology; Proteins/metabolism; Sequence Analysis, Protein/methods; Protein Structure, Tertiary; Proteins/genetics
15.
Mol Biosyst ; 9(6): 1447-52, 2013 Jun.
Article in English | MEDLINE | ID: mdl-23519087

ABSTRACT

Virulence factors are molecules that play very important roles in enhancing a pathogen's capability to cause disease. Many efforts have been made to investigate the mechanism of virulence factors using in silico methods. In this study, we present a novel computational method to predict virulence factors by integrating protein-protein interactions in the STRING database and biological pathways in KEGG. Three specific species were studied according to their records in the VFDB: Campylobacter jejuni NCTC 11168, Escherichia coli O6:K15:H31 536 (UPEC) and Pseudomonas aeruginosa PAO1. The prediction accuracies reached were 0.9467, 0.9575 and 0.9180, respectively. Metabolism pathways, flagellar assembly and chemotaxis may be of importance for virulence based on the analysis of the optimal feature sets we obtained. We hope this can provide some insight and guidance for related research.


Subject(s)
Campylobacter jejuni/pathogenicity; Escherichia coli/pathogenicity; Pseudomonas aeruginosa/pathogenicity; Virulence Factors/analysis; Algorithms; Campylobacter jejuni/metabolism; Databases, Factual; Escherichia coli/metabolism; Genome, Bacterial; Pseudomonas aeruginosa/metabolism
16.
Protein Pept Lett ; 20(3): 243-8, 2013 Mar.
Article in English | MEDLINE | ID: mdl-22591473

ABSTRACT

Protein disordered regions are associated with some critical cellular functions such as transcriptional regulation, translation and cellular signal transduction, and they are responsible for various diseases. Although experimental methods have been developed to determine these regions, they are time-consuming and expensive. Therefore, it is highly desirable to develop computational methods that can provide us with this kind of information in a rapid and inexpensive manner. Here we propose a sequence-based computational approach for predicting protein disordered regions by means of the Nearest Neighbor algorithm, in which the conservation, amino acid factor and secondary structure status of each amino acid in a fixed-length sliding window are taken as the encoding features. Also, feature selection based on mRMR (maximum relevance minimum redundancy) is applied to obtain an optimal 51-feature set that includes 39 conservation features and 12 secondary structure features. With the optimal 51 features, our predictor yielded quite promising MCC (Matthews correlation coefficient) values: 0.371 on a rigorous benchmark dataset tested by 5-fold cross-validation and 0.219 on an independent test dataset. Our results suggest that conservation and secondary structure play important roles in intrinsically disordered proteins.
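
The fixed-length sliding window mentioned above can be illustrated with a small helper that returns, for every residue, the window of residues centred on it. This is a sketch only: the function name, the window length, and the padding symbol `X` for positions beyond the sequence ends are assumptions, and the per-residue conservation, amino acid factor, and secondary structure values used in the study are not reproduced here.

```python
def sliding_windows(sequence, window=5, pad="X"):
    """Return one window per residue, centred on that residue.

    Positions that would fall outside the sequence are filled with `pad`
    (an assumption of this sketch, not a detail given in the abstract).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    padded = pad * half + sequence + pad * half
    return [padded[i:i + window] for i in range(len(sequence))]


# Invented five-residue sequence with a three-residue window.
example_windows = sliding_windows("MKVLA", window=3)
```

Each window would then be turned into a feature vector by concatenating the per-residue properties of its positions.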


Subject(s)
Amino Acids/chemistry; Protein Structure, Secondary; Proteins/chemistry; Sequence Analysis, Protein; Algorithms; Humans
17.
Protein Pept Lett ; 20(3): 324-35, 2013 Mar.
Article in English | MEDLINE | ID: mdl-22591475

ABSTRACT

Protein disulfide bonds are formed during post-translational modification and have been implicated in various physiological and pathological processes. Proper localization of disulfide bonds also facilitates the prediction of protein three-dimensional (3D) structure. However, determining disulfide bonds with conventional experimental approaches is both time-consuming and labor-intensive, especially for large-scale data sets. Since there are also some limitations for disulfide bond prediction based on 3D structure features, developing sequence-based, convenient and fast computational methods for both inter- and intra-chain disulfide bond prediction is necessary. In this study, we developed a computational method for both types of disulfide bond prediction based on the maximum relevance and minimum redundancy (mRMR) method followed by incremental feature selection (IFS), with the nearest neighbor algorithm as its prediction model. Features of sequence conservation, residual disorder, and amino acid factor are used for inter-chain disulfide bond prediction. In addition to these features, the sequential distance between a pair of cysteines is also used for intra-chain disulfide bond prediction. Our approach achieves a prediction accuracy of 0.8702 for inter-chain disulfide bond prediction using 128 features and 0.9219 for intra-chain disulfide bond prediction using 261 features. Analysis of the optimal feature set indicated key features and key sites for disulfide bond formation. Interestingly, comparison of top features between inter- and intra-chain disulfide bonds revealed the similarities and differences in the mechanisms of forming these two types of disulfide bonds, which might help clarify the mechanisms and provide clues for further experimental studies in this research field.


Subject(s)
Amino Acids/chemistry; Cysteine/chemistry; Disulfides/chemistry; Proteins/chemistry; Algorithms; Computational Biology; Molecular Conformation; Protein Folding; Protein Processing, Post-Translational
18.
PLoS One ; 7(6): e39369, 2012.
Article in English | MEDLINE | ID: mdl-22761773

ABSTRACT

Amyloid fibrillar aggregates of polypeptides are associated with many neurodegenerative diseases. Short peptide segments in protein sequences may trigger aggregation. Identifying these stretches and examining their behavior in longer protein segments is critical for understanding these diseases and obtaining potential therapies. In this study, we combined machine learning and structure-based energy evaluation to examine and predict amyloidogenic segments. Our feature selection method discovered that windows consisting of long amino acid segments of ~30 residues, instead of the commonly used short hexapeptides, provided the highest accuracy. Weighted contributions of an amino acid at each position in a 27-residue window revealed three cooperative regions of short stretches, resembling the β-strand-turn-β-strand motif in Aβ peptide amyloid and the β-solenoid structure of the HET-s(218-289) prion (C). Using an in-house energy evaluation algorithm, the interaction energy between two short stretches in a long segment is computed and incorporated as an additional feature. The algorithm successfully predicted and classified amyloid segments with an overall accuracy of 75%. Our study revealed that genome-wide amyloid segments are not only dependent on short high-propensity stretches, but also on nearby residues.


Subject(s)
Amyloid/metabolism; Algorithms; Amino Acid Sequence; Amyloid/genetics; Databases, Protein; Humans; Protein Structure, Tertiary
19.
PLoS One ; 5(9)2010 Sep 01.
Article in English | MEDLINE | ID: mdl-20824131

ABSTRACT

BACKGROUND: The DNA of all eukaryotic organisms is packaged into nucleosomes, the basic repeating units of chromatin. The nucleosome consists of a histone octamer around which a DNA core is wrapped and the linker histone H1, which is associated with linker DNA. By altering the accessibility of DNA sequences, the nucleosome has profound effects on all DNA-dependent processes. Understanding the factors that influence nucleosome positioning is of great importance for the study of genomic control mechanisms. Transcription factors (TFs) have been suggested to play a role in nucleosome positioning in vivo. PRINCIPAL FINDINGS: Here, the minimum redundancy maximum relevance (mRMR) feature selection algorithm, the nearest neighbor algorithm (NNA), and the incremental feature selection (IFS) method were used to identify the most important TFs that either favor or inhibit nucleosome positioning by analyzing the numbers of transcription factor binding sites (TFBSs) in 53,021 nucleosomal DNA sequences and 50,299 linker DNA sequences. A total of nine important families of TFs were extracted from 35 families, and the overall prediction accuracy was 87.4% as evaluated by the jackknife cross-validation test. CONCLUSIONS: Our results are consistent with the notion that TFs are more likely to bind linker DNA sequences than the sequences in the nucleosomes. In addition, our results imply that there may be some TFs that are important for nucleosome positioning but that play an insignificant role in discriminating nucleosome-forming DNA sequences from nucleosome-inhibiting DNA sequences. The hypothesis that TFs play a role in nucleosome positioning is, thus, confirmed by the results of this study.
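
The mRMR ranking step named in this abstract can be sketched as a greedy loop: at each round, pick the feature whose mutual information with the class label (relevance) minus its mean mutual information with the already selected features (redundancy) is largest. The code below is a minimal illustration for discrete features, such as binding-site counts. The difference form of the criterion, the function names, and the toy data are assumptions and are not taken from the paper.

```python
import math
from collections import Counter


def mutual_information(a, b):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(a)
    count_a, count_b, count_ab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (c / n) * math.log2((c / n) / ((count_a[x] / n) * (count_b[y] / n)))
        for (x, y), c in count_ab.items()
    )


def mrmr_rank(columns, labels):
    """Return feature indices ordered by greedy relevance-minus-redundancy."""
    relevance = [mutual_information(col, labels) for col in columns]
    remaining = list(range(len(columns)))
    ranked = []
    while remaining:
        def score(j):
            if not ranked:
                return relevance[j]
            redundancy = sum(
                mutual_information(columns[j], columns[s]) for s in ranked
            ) / len(ranked)
            return relevance[j] - redundancy

        best = max(remaining, key=score)  # ties go to the lowest index
        ranked.append(best)
        remaining.remove(best)
    return ranked


# Toy data: feature 1 duplicates feature 0; feature 2 is unrelated to the label.
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_columns = [
    [0, 0, 0, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
]
toy_ranking = mrmr_rank(toy_columns, toy_labels)
```

In this toy case the duplicated feature is ranked last even though it is individually informative, because it adds nothing beyond the feature already chosen.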


Subject(s)
Chromatin Assembly and Disassembly; Nucleosomes/genetics; Saccharomyces cerevisiae Proteins/metabolism; Saccharomyces cerevisiae/genetics; Transcription Factors/metabolism; Binding Sites; Nucleosomes/metabolism; Saccharomyces cerevisiae/metabolism; Saccharomyces cerevisiae Proteins/genetics; Transcription Factors/genetics
20.
PLoS One ; 5(7): e11900, 2010 Jul 30.
Article in English | MEDLINE | ID: mdl-20689580

ABSTRACT

Non-synonymous SNPs (nsSNPs), also known as Single Amino acid Polymorphisms (SAPs) account for the majority of human inherited diseases. It is important to distinguish the deleterious SAPs from neutral ones. Most traditional computational methods to classify SAPs are based on sequential or structural features. However, these features cannot fully explain the association between a SAP and the observed pathophysiological phenotype. We believe the better rationale for deleterious SAP prediction should be: If a SAP lies in the protein with important functions and it can change the protein sequence and structure severely, it is more likely related to disease. So we established a method to predict deleterious SAPs based on both protein interaction network and traditional hybrid properties. Each SAP is represented by 472 features that include sequential features, structural features and network features. Maximum Relevance Minimum Redundancy (mRMR) method and Incremental Feature Selection (IFS) were applied to obtain the optimal feature set and the prediction model was Nearest Neighbor Algorithm (NNA). In jackknife cross-validation, 83.27% of SAPs were correctly predicted when the optimized 263 features were used. The optimized predictor with 263 features was also tested in an independent dataset and the accuracy was still 80.00%. In contrast, SIFT, a widely used predictor of deleterious SAPs based on sequential features, has a prediction accuracy of 71.05% on the same dataset. In our study, network features were found to be most important for accurate prediction and can significantly improve the prediction performance. Our results suggest that the protein interaction context could provide important clues to help better illustrate SAP's functional association. This research will facilitate the post genome-wide association studies.
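
The evaluation loop described above (incremental feature selection scored by a nearest neighbor classifier under jackknife cross-validation) can be sketched as follows. The feature ranking is assumed to be given, for example by mRMR. Euclidean distance, the function names, and the four-sample toy data are assumptions made for illustration and are not details from the paper.

```python
import math


def nearest_neighbor_predict(train_x, train_y, query):
    """Return the label of the training sample closest to `query` (Euclidean)."""
    closest = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return train_y[closest]


def jackknife_accuracy(samples, labels):
    """Leave each sample out in turn and predict it from all the others."""
    correct = 0
    for i in range(len(samples)):
        rest_x = samples[:i] + samples[i + 1:]
        rest_y = labels[:i] + labels[i + 1:]
        if nearest_neighbor_predict(rest_x, rest_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)


def incremental_feature_selection(samples, labels, ranked_features):
    """Grow the feature set along the ranking and keep the best-scoring prefix."""
    best_count, best_accuracy = 0, -1.0
    for count in range(1, len(ranked_features) + 1):
        columns = ranked_features[:count]
        subset = [[row[c] for c in columns] for row in samples]
        acc = jackknife_accuracy(subset, labels)
        if acc > best_accuracy:
            best_count, best_accuracy = count, acc
    return ranked_features[:best_count], best_accuracy


# Toy data: feature 0 separates the classes; feature 1 is large-scale noise.
toy_samples = [[0.0, 5.0], [0.1, -5.0], [1.0, 5.1], [1.1, -5.1]]
toy_labels = [0, 0, 1, 1]
selected, selected_accuracy = incremental_feature_selection(
    toy_samples, toy_labels, ranked_features=[0, 1]
)
```

Here adding the noisy second feature makes every left-out sample match a neighbor from the wrong class, so the one-feature prefix is kept.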


Subject(s)
Computational Biology/methods; Polymorphism, Single Nucleotide/genetics; Proteins/metabolism; Algorithms; Humans; Proteins/genetics