ABSTRACT
Accurate prediction of antibody-antigen complex structures is pivotal in drug discovery, vaccine design and disease treatment and can facilitate the development of more effective therapies and diagnostics. In this work, we first review the antibody-antigen docking (ABAG-docking) datasets. Then, we present the creation and characterization of a comprehensive benchmark dataset of antibody-antigen complexes. We categorize the dataset based on docking difficulty, interface properties and structural characteristics, to provide a diverse set of cases for rigorous evaluation. Compared with Docking Benchmark 5.5, we have added 112 cases, including 14 single-domain antibody (sdAb) cases and 98 monoclonal antibody (mAb) cases, and also increased the proportion of Difficult cases. Our dataset contains diverse cases, including human/humanized antibodies, sdAbs, rodent antibodies and other types, opening the door to better algorithm development. Furthermore, we provide details on the process of building the benchmark dataset and introduce a pipeline for periodic updates to keep it up to date. We also utilize multiple complex prediction methods including ZDOCK, ClusPro, HDOCK and AlphaFold-Multimer for testing and analyzing this dataset. This benchmark serves as a valuable resource for evaluating and advancing docking computational methods in the analysis of antibody-antigen interaction, enabling researchers to develop more accurate and effective tools for predicting and designing antibody-antigen complexes. The non-redundant ABAG-docking structure benchmark dataset is available at https://github.com/Zhaonan99/Antibody-antigen-complex-structure-benchmark-dataset.
Subjects
Algorithms , Benchmarking , Humans , Monoclonal Antibodies , Humanized Monoclonal Antibodies , Antigen-Antibody Complex
ABSTRACT
SARS-CoV-2 spike protein (SARS-2-S)-induced cell-cell fusion in uninfected cells may occur in long COVID-19 syndrome, as circulating SARS-2-S or extracellular vesicles containing SARS-2-S (S-EVs) were found to be prevalent in post-acute sequelae of COVID-19 (PASC) for up to 12 months after diagnosis. Although isolated recombinant SARS-2-S protein has been shown to increase the SASP in senescent ACE2-expressing cells, the direct linkage of SARS-2-S syncytia with senescence in the absence of virus infection and the degree to which SARS-2-S syncytia affect pathology in the setting of cardiac dysfunction are unknown. Here, we found that the senescent outcome of SARS-2-S-induced syncytia exacerbated heart failure progression. We first demonstrated that syncytia formed in cells expressing SARS-2-S delivered by DNA plasmid or LNP-mRNA exhibit a senescence-like phenotype. S-EVs also confer a potent ability to form senescent syncytia without de novo synthesis of SARS-2-S. However, it is important to note that currently approved COVID-19 mRNA vaccines do not induce syncytium formation or cellular senescence. Mechanistically, SARS-2-S syncytia provoke the formation of functional MAVS aggregates, which regulate the senescence fate of SARS-2-S syncytia by TNFα. We further demonstrate that senescent SARS-2-S syncytia exhibit shrunken morphology, leading to the activation of WNK1 and impaired cardiac metabolism. In mice with pre-existing heart failure, the WNK1 inhibitor WNK463, the anti-syncytial drug niclosamide, and the senolytic dasatinib protect the heart from exacerbated heart failure triggered by SARS-2-S. Our findings thus suggest a potential mechanism for COVID-19-mediated cardiac pathology and recommend the application of WNK1 inhibitors for therapy, especially in individuals with post-acute sequelae of COVID-19.
Subjects
COVID-19 , Cellular Senescence , Giant Cells , Heart Failure , SARS-CoV-2 , Coronavirus Spike Glycoprotein , Heart Failure/metabolism , Heart Failure/virology , Animals , Giant Cells/virology , Giant Cells/metabolism , Giant Cells/pathology , COVID-19/metabolism , COVID-19/complications , COVID-19/virology , COVID-19/pathology , Humans , Coronavirus Spike Glycoprotein/metabolism , Mice , Extracellular Vesicles/metabolism
ABSTRACT
Background: The commutability of electrolyte trueness verification materials (ETVs) and commercial general chemistry materials (GCs) was evaluated to investigate their suitability for use in an external quality assessment (EQA) program for serum sodium and potassium measurements. Methods: Eighty fresh individual human samples (40 for sodium measurements and 40 for potassium measurements), six ETVs and three GCs were analyzed by five routine methods (validated methods) and by inductively coupled plasma mass spectrometry reference methods (comparative methods) for the determination of sodium and potassium. Commutability was analyzed according to the Clinical and Laboratory Standards Institute (CLSI) EP14-A3 protocol and according to the difference in bias approach. The linearity, bias and imprecision of the routine methods were also assessed according to CLSI guidelines. Results: According to the EP14-A3 protocol, for sodium, ETVs were commutable for all assays and GCs were commutable for 3/5 assays. For potassium, ETVs were commutable in most assays, with the exception of Cobas C501, while GCs showed no commutability except in the case of AU5821. According to the difference in bias approach, the commutability of ETVs was inconclusive for most routine assays for both sodium and potassium, and GCs were inconclusive for sodium and non-commutable for potassium in most routine assays. The routine methods exhibited excellent linearity and precision. Most relative biases between the routine and reference methods were beyond the bias limits for sodium, whereas only a minority were beyond the limits for potassium. Conclusions: ETVs showed better commutability than GCs for the sodium and potassium assays, whichever evaluation approach was applied.
Subjects
Blood Chemical Analysis , Clinical Laboratory Techniques , Potassium/blood , Sodium/blood , Blood Chemical Analysis/standards , Clinical Laboratory Techniques/standards , Electrolytes/chemistry , Humans , Potassium/standards , Reference Standards , Sodium/standards
ABSTRACT
BACKGROUND: Chloride is the main anion in the human body. A simple and accurate method for serum chloride measurement was developed using ion chromatography (IC) in China. METHODS: Serum samples were diluted 500-fold with water, filtered and injected into an ion chromatography column. A mixed eluent (2 mmol/L CO₃²⁻ + 12 mmol/L OH⁻) was used and the peak area signal was collected. Five calibrators made from Standard Reference Material (SRM) 919b were used in the bracketing method. The IC method was applied as the comparative method and six ion selective electrode (ISE) measurement systems were evaluated using 60 individual patient sera. RESULTS: The IC method was proven to be accurate. The precision was 0.18% to 0.30%, the recovery was 99.66% to 100.60%, the bias was -0.06% to 0.19%, and the relative expanded uncertainty was 0.775% (k = 2). The CVs of the ISE systems were below the tolerable CV of 0.9%, except for the Beckman DXC (0.91% to 1.16%). In the method comparison, linear regression analysis gave correlation coefficients of 0.9876 to 0.9979. For all systems, the range of mean biases was -5.96 to 1.48 mmol/L (-5.57% to 1.36%); the expected biases at the medical decision levels were -4.97% to 0.84% at 90 mmol/L and -6.02% to 1.76% at 120 mmol/L. All biases of the Beckman AU met the requirement of within ± 1.5%. CONCLUSIONS: The IC measurement method is proven to be of high precision and trueness, and the quality of routine ISE measurement of serum chloride still needs significant improvement. The establishment of the IC method can improve the measurement quality and promote its standardization process in China.
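The bracketing calibration mentioned in the methods can be illustrated with a minimal sketch. It assumes the common form of bracketing, in which the sample signal is linearly interpolated between the two calibrators whose signals enclose it; the abstract does not give the authors' exact procedure. The function name, the calibrator concentrations and the peak areas are hypothetical.

```python
from bisect import bisect_left


def bracket_concentration(sample_signal, calibrators):
    """Estimate a concentration by two-point bracketing.

    calibrators: list of (concentration, signal) pairs (hypothetical values).
    The two calibrators whose signals enclose the sample signal are
    selected, and the concentration is linearly interpolated between them.
    """
    cal = sorted(calibrators, key=lambda pair: pair[1])  # sort by signal
    signals = [signal for _, signal in cal]
    if not signals[0] <= sample_signal <= signals[-1]:
        raise ValueError("sample signal is outside the calibrated range")
    i = bisect_left(signals, sample_signal)
    if signals[i] == sample_signal:
        return cal[i][0]  # the sample coincides with a calibrator
    (c_lo, s_lo), (c_hi, s_hi) = cal[i - 1], cal[i]
    return c_lo + (sample_signal - s_lo) * (c_hi - c_lo) / (s_hi - s_lo)


# Hypothetical five-level calibration: (chloride in mmol/L, peak area)
calibrators = [(80, 800.0), (90, 900.0), (100, 1000.0), (110, 1100.0), (120, 1200.0)]
result = bracket_concentration(950.0, calibrators)
```

Interpolating only between the nearest calibrators keeps the calculation local, so a slight curvature of the detector response over the full range has little effect on the result.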
Subjects
Chlorides/blood , Liquid Chromatography/methods , Ion-Selective Electrodes , Bias , Calibration , China , Liquid Chromatography/standards , Equipment Design , Humans , Ion-Selective Electrodes/standards , Observer Variation , Reference Standards , Reproducibility of Results
ABSTRACT
BACKGROUND: Potassium is an important serum ion that is frequently assayed in clinical laboratories. Quality assurance requires reference methods; thus, the establishment of a candidate reference method for serum potassium measurements is important. METHODS: An inductively coupled plasma mass spectrometry (ICP-MS) method was developed. Serum samples were gravimetrically spiked with an aluminum internal standard, digested with 69% ultrapure nitric acid, and diluted to the required concentration. The ³⁹K/²⁷Al ratios were measured by ICP-MS in hydrogen mode. The method was calibrated using 5% nitric acid matrix calibrators, and the calibration function was established using the bracketing method. RESULTS: The correlation coefficients between the measured ³⁹K/²⁷Al ratios and the analyte concentration ratios were >0.9999. The coefficients of variation were 0.40%, 0.68%, and 0.22% for the three serum samples, and the analytical recovery was 99.8%. The accuracy of the measurement was also verified by measuring certified reference materials, SRM909b and SRM956b. Comparison with the ion selective electrode routine method and international inter-laboratory comparisons gave satisfactory results. CONCLUSIONS: The new ICP-MS method is specific, precise, simple, and low-cost, and it may be used as a candidate reference method for standardizing serum potassium measurements.
Subjects
Mass Spectrometry/methods , Potassium/blood , Humans , Laboratories/standards , Mass Spectrometry/standards , Potassium/standards , Reference Standards , Reproducibility of Results
ABSTRACT
BACKGROUND: Serum calcium level is an important clinical index that reflects pathophysiological states. However, detection accuracy in laboratory tests is not ideal; as such, a high-accuracy method is needed. METHODS: We developed a reference method for measuring serum calcium levels by isotope dilution inductively coupled plasma mass spectrometry (ID ICP-MS), using ⁴²Ca as the enriched isotope. Serum was digested with 69% ultrapure nitric acid and diluted to a suitable concentration. The ⁴⁴Ca/⁴²Ca ratio was detected in H₂ mode; spike concentration was calibrated by reverse IDMS using standard reference material (SRM) 3109a, and sample concentration was measured by a bracketing procedure. We compared the performance of ID ICP-MS with those of three other reference methods in China using the same serum and aqueous samples. RESULTS: The relative expanded uncertainty of the sample concentration was 0.414% (k=2). The ranges of repeatability (within-run imprecision), intermediate imprecision (between-run imprecision), and intra-laboratory imprecision were 0.12%-0.19%, 0.07%-0.09%, and 0.16%-0.17%, respectively, for two of the serum samples. Results for SRM909bI, SRM909bII, SRM909c, and GBW09152 were within the certified value intervals, with mean relative bias values of 0.29%, -0.02%, 0.10%, and -0.19%, respectively. The range of recovery was 99.87%-100.37%. Results obtained by ID ICP-MS showed better accuracy than, and were highly correlated with, those of other reference methods. CONCLUSIONS: ID ICP-MS is a simple and accurate candidate reference method for serum calcium measurement and can be used to establish and improve the serum calcium reference system in China.
Subjects
Calcium/blood , Indicator Dilution Techniques , Mass Spectrometry , Calcium Isotopes , China , Humans , Indicator Dilution Techniques/standards , Mass Spectrometry/standards , Reference Standards
ABSTRACT
Previous research has demonstrated that in pregnant mice deficient in l-methionine (Met), the mixture of the dipeptide l-methionyl-l-methionine (Met-Met) with Met was more effective than Met alone in promoting mammogenesis and lactogenesis. This study aimed to investigate the role of a novel long noncoding RNA (lncRNA), named mammary gland proliferation-associated lncRNA (MGPNCR), in these processes. Transcriptomic analysis of mammary tissues from Met-deficient mice, supplemented either with a Met-Met/Met mixture or with Met alone, revealed significantly higher MGPNCR expression in the Met group compared to the mixture group, a finding recapitulated in a mammary epithelial cell model. Our findings suggested that MGPNCR hindered mammogenesis and milk protein synthesis by binding to eukaryotic initiation factor 4B (eIF4B). This interaction promoted the dephosphorylation of eIF4B at serine-422 by enhancing its association with protein phosphatase 2A (PP2A). Our study sheds light on the regulatory mechanisms of lncRNA-mediated dipeptide effects on mammary cell proliferation and milk protein synthesis. These insights underscore the potential benefits of utilizing dipeptides to improve milk protein in animals and potentially in humans.
Subjects
Eukaryotic Initiation Factors , Methionine , Long Noncoding RNA , Pregnancy , Humans , Female , Animals , Mice , Methionine/metabolism , Long Noncoding RNA/metabolism , Dipeptides/metabolism , Racemethionine/metabolism , Milk Proteins/metabolism , Epithelial Cells/metabolism , Animal Mammary Glands/metabolism
ABSTRACT
Primary Sjögren's syndrome (pSS) is an autoimmune disease characterized by symptoms such as dry mouth, dry eyes, and other systemic symptoms. Due to the hyposalivation experienced by pSS patients, oral dysbacteriosis often occurs. A common complication of pSS is oral Candida infection. In this article, the authors describe systematic methods that can effectively diagnose oral Candida infection and identify the Candida strains using saliva, oral mucosal swabs, or mouthwash from pSS patients. Culture on Sabouraud's Dextrose Agar (SDA), the hyphal formation assay, the potassium hydroxide (KOH) smear test, and the calcofluor white (CFW) staining assay are used for the diagnosis of oral Candida infection. A Candida diagnostic agar is used for the identification of Candida strains. Finally, antifungal susceptibility testing is used to determine appropriate antifungal drug treatment. This standardized method can enhance the diagnosis, treatment, and future research of pSS-related oral Candida infections. Early diagnosis using this method can also prevent complications arising from delays in receiving appropriate treatment.
Subjects
Antifungal Agents , Candidiasis , Hydroxides , Potassium Compounds , Humans , Agar , Antifungal Agents/pharmacology , Antifungal Agents/therapeutic use , Candida
ABSTRACT
Hypertension is a chronic cardiovascular disease characterized by elevated blood pressure that can lead to a number of complications. There is evidence that the numerous environmental substances to which humans are exposed facilitate the emergence of diseases. In this work, we sought to investigate the relationship between exposure to environmental contaminants and hypertension as well as the predictive value of such exposures. Data were obtained from the National Health and Nutrition Examination Survey (NHANES), 2005-2012. A total of 4492 participants were included in our study, and we selected the more common environmental chemicals and covariates by feature selection followed by regularized network analysis. Then, we applied various machine learning (ML) methods, such as extreme gradient boosting (XGBoost), random forest classifier (RF), logistic regression (LR), multilayer perceptron (MLP), and support vector machine (SVM), to predict hypertension from chemical exposure. Finally, SHapley Additive exPlanations (SHAP) were applied to interpret the features. After the initial feature screening, we included a total of 29 variables (including 21 chemicals) for ML. The areas under the curve (AUCs) of the five ML models XGBoost, RF, LR, MLP, and SVM were 0.729, 0.723, 0.721, 0.730, and 0.731, respectively. Butylparaben (BUP), propylparaben (PPB), and 9-hydroxyfluorene (P17) were the three factors in the prediction model with the highest SHAP values. Comparing five ML models, we found that environmental exposure may play an important role in hypertension. The assessment of important chemical exposure parameters lays the groundwork for more targeted therapies, and the optimized ML models are likely to predict hypertension.
Subjects
Cardiovascular Diseases , Hypertension , Humans , Nutrition Surveys , Hypertension/epidemiology , Area Under Curve , Machine Learning
ABSTRACT
Intrinsically Disordered Proteins (IDPs) and Regions (IDRs) exist widely. Although without well-defined structures, they participate in many important biological processes. In addition, they are also widely related to human diseases and have become potential targets in drug discovery. However, there is a big gap between the experimental annotations related to IDPs/IDRs and their actual number. In recent decades, the computational methods related to IDPs/IDRs have been developed vigorously, including predicting IDPs/IDRs, the binding modes of IDPs/IDRs, the binding sites of IDPs/IDRs, and the molecular functions of IDPs/IDRs according to different tasks. In view of the correlation between these predictors, we have reviewed these prediction methods uniformly for the first time, summarized their computational methods and predictive performance, and discussed some problems and perspectives.
Subjects
Intrinsically Disordered Proteins , Humans , Intrinsically Disordered Proteins/chemistry , Protein Domains , Binding Sites , Drug Discovery
ABSTRACT
Objectives: To analyze the differences in laboratory data between patients with myelin oligodendrocyte glycoprotein (MOG) antibody-associated disease (MOGAD), multiple sclerosis (MS) and neuromyelitis optica spectrum disorder (NMOSD). Methods: The study included 26 MOGAD patients who visited Beijing Tiantan Hospital from 2018 to 2021. MS and NMOSD patients who visited the clinic during the same period were selected as controls. Relevant indicators were compared between the MOGAD group and the MS/NMOSD groups, and the diagnostic performance of meaningful markers was assessed. Results: The MOGAD group showed a slight female preponderance of 57.7%, with an average onset age of 29.8 years. The absolute and relative counts of neutrophils were higher in the MOGAD group than in the MS group, while the proportion of lymphocytes was lower. The cerebrospinal fluid (CSF) IgG level, IgG index, 24-h IgG synthesis rate, and positive rate of oligoclonal bands (OCB) were lower in MOGAD patients than in the MS group. The area under ROC curve (AUC) was 0.939 when combining the relative lymphocyte count and IgG index. Compared to the NMOSD group, the MOGAD group had higher levels of serum complement C4 and lower levels of serum IgG. The AUC of serum C4 combined with FT4 was 0.783. Conclusion: Statistically significant markers were observed in the laboratory data of MOGAD patients compared to MS/NMOSD patients. The relative lymphocyte count combined with IgG index had excellent diagnostic efficacy for MOGAD and MS, while serum C4 combined with FT4 had better diagnostic efficacy for MOGAD and NMOSD.
ABSTRACT
Cancer has become a major factor threatening human life and health. Because traditional treatment methods such as chemotherapy and radiotherapy are not highly specific and often cause severe side effects and toxicity, new treatment methods are urgently needed. Anticancer peptide drugs have low toxicity and stronger efficacy and specificity, and have emerged as a new type of cancer treatment. However, experimental identification of anticancer peptides is time-consuming, expensive, and difficult to perform in a high-throughput manner. Computational identification of anticancer peptides can make up for the shortcomings of experimental identification. In this study, a deep learning-based predictor named ACPred-BMF is proposed for the prediction of anticancer peptides. This method uses the quantitative and qualitative properties of amino acids and a binary profile feature to represent the peptide sequences numerically. A bidirectional LSTM network architecture is used in the model, together with an attention mechanism. To alleviate the black-box problem of deep learning model prediction, we visualized the automatically extracted features and used the Shapley additive explanations algorithm to determine the importance of features, in order to further understand the anticancer peptide mechanism. The results show that our method is one of the state-of-the-art anticancer peptide predictors. A web server implementing ACPred-BMF can be accessed at http://mialab.ruc.edu.cn/ACPredBMFServer/.
Subjects
Antineoplastic Agents , Neoplasms , Peptides , Humans , Algorithms , Amino Acid Sequence , Antineoplastic Agents/chemistry , Peptides/chemistry
ABSTRACT
The WUSCHEL-related homeobox (WOX) proteins are widely distributed in plants and play important regulatory roles in growth and development processes such as embryonic development and organ development. Here, a series of bioinformatics methods was used to unravel the structural basis, genetic hierarchy and regulation of the WOX genes in four Euphorbiaceae species. A genome-wide survey identified 59 WOX genes in Hevea brasiliensis (H. brasiliensis: 20 genes), Jatropha curcas (J. curcas: 10 genes), Manihot esculenta (M. esculenta: 18 genes), and Ricinus communis (R. communis: 11 genes). The phylogenetic analysis revealed that these WOX members could be clustered into three closely related clades, namely the ancient, intermediate and modern/WUS clades. In addition, gene structure and conserved motif analyses further validated that the WOX genes were conserved within each phylogenetic clade. These results indicated the relationships among WOX members in the four Euphorbiaceae species. We found that WOX genes in H. brasiliensis and M. esculenta exhibit a close genetic relationship with those in J. curcas and R. communis. Additionally, the presence of various cis-acting regulatory elements in the promoters of J. curcas WOX genes (JcWOXs) suggested distinct functions. These speculations were further supported by the differential expression profiles of various JcWOXs in seeds, reflecting the importance of two JcWOX genes (JcWOX6 and JcWOX13) during plant growth and development. Our quantitative real-time PCR (qRT-PCR) analysis demonstrated that the JcWOX11 gene plays an indispensable role in regulating plant callus. Taken together, the present study reports the comprehensive characteristics and relationships of WOX genes in four Euphorbiaceae species, providing new insights into their characterization.
ABSTRACT
BACKGROUND: Pulmonary embolism (PE) is a leading cause of cardiovascular mortality worldwide. Rapid and accurate diagnosis and risk stratification are crucial for timely treatment options, especially in high-risk PE. OBJECTIVES: The study aims to profile the comprehensive changes of plasma proteomes in PE patients and identify the potential biomarkers for both diagnosis and risk stratification. PATIENTS/METHODS: Based on the data-independent acquisition mass spectrometry and antibody array proteomic technology, we screened the plasma samples (13 and 32 proteomes, respectively) in two independent studies consisting of high-risk PE patients, non-high-risk PE patients, and healthy controls. Some significantly differentially expressed proteins were quantified by ELISA in a new study group with 50 PE patients and 26 healthy controls. RESULTS: We identified 207 and 70 differentially expressed proteins in PE and high-risk PE. These proteins were involved in multiple thrombosis-associated biological processes including blood coagulation, inflammation, injury, repair, and chemokine-mediated cellular response. It was verified that five proteins including SAA1, S100A8, TNC, GSN, and HRG had significant change in PE and/or in high-risk PE. The receiver operating characteristic curve analysis based on binary logistic regression showed that the area under the curve (AUC) of SAA1, S100A8, and TNC in PE diagnosis were 0.882, 0.788, and 0.795, and AUC of S100A8 and TNC in high-risk PE diagnosis were 0.773 and 0.720. CONCLUSION: As predictors of inflammation or injury repair, SAA1, S100A8, and TNC are potential plasma biomarkers for the diagnosis and risk stratification of PE.
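The AUC values reported for the individual markers can be read as the probability that a randomly chosen patient scores higher than a randomly chosen control. The following is a minimal sketch of that rank-based (Mann-Whitney) calculation, not the authors' logistic-regression pipeline; the function name and the example labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for cases, 0 for controls; scores: marker level or model output.
    Each case/control pair counts 1 if the case scores higher and 0.5 on ties.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("both cases and controls are required")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


# Hypothetical data: two cases and two controls
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
auc = roc_auc(labels, scores)  # 3 of the 4 case/control pairs are ordered correctly
```

The pairwise form is quadratic in sample size, which is acceptable for cohorts of this size; a rank-sum implementation would be preferable for large datasets.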
Subjects
Proteomics , Pulmonary Embolism , Biomarkers , Humans , Mass Spectrometry , Pulmonary Embolism/diagnosis , Risk Assessment
ABSTRACT
Staphylococcus epidermidis is one of the most commonly isolated species from human skin and the second leading cause of bloodstream infections. Here, we performed a large-scale comparative study without any pre-assigned reference to identify genomic determinants associated with the diversity and adaptation of S. epidermidis strains to various environments. The pan-genome of S. epidermidis was open, with 435 core proteins and a pan-genome size of 8,034 proteins. The genome-wide phylogenetic tree showed high heterogeneity and suggested that routine whole genome sequencing is a powerful tool for analyzing the complex evolution of S. epidermidis and for investigating infection sources. Comparative genome analyses revealed a range of antimicrobial resistance (AMR) genes, especially within mobile genetic elements. The complicated host-bacterium and bacterium-bacterium relationships help S. epidermidis to play a vital role in balancing the epithelial microflora. The highly variable and dynamic nature of the S. epidermidis genome may contribute to its success in adapting to broad habitats. Genes related to biofilm formation and cell toxicity were significantly enriched in blood and skin isolates, demonstrating their potential for identifying risk genotypes. This study gives a general landscape of the S. epidermidis pan-genome and provides valuable insights into the mechanisms of genome evolution and lifestyle adaptation of this ecologically flexible species.
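The pan-genome and core-genome sizes quoted above correspond to the union and the intersection of the protein families found across strains. A minimal sketch of that bookkeeping follows, assuming proteins have already been clustered into families; the function name, strain names and family IDs are hypothetical, and the clustering step itself is not shown.

```python
def pan_and_core(genomes):
    """Return (pan, core) sets of protein-family IDs.

    genomes: dict mapping strain name -> set of protein-family IDs.
    pan  = families present in at least one strain (union)
    core = families present in every strain (intersection)
    """
    if not genomes:
        raise ValueError("at least one genome is required")
    families = [set(f) for f in genomes.values()]
    pan = set().union(*families)
    core = families[0].intersection(*families[1:])
    return pan, core


# Hypothetical presence/absence data for three strains
genomes = {
    "strain_A": {"f1", "f2", "f3"},
    "strain_B": {"f1", "f2", "f4"},
    "strain_C": {"f1", "f5"},
}
pan, core = pan_and_core(genomes)
accessory = pan - core  # families missing from at least one strain
```

A pan-genome is usually called open when the union keeps growing as strains are added; tracking `len(pan)` while adding genomes one at a time is one simple way to inspect that trend.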
ABSTRACT
BACKGROUND: Current laboratory examinations for hypercoagulable diseases focus on the biomarker content of the activated coagulation cascade and fibrinolytic system. Direct detection of physiologically important protease activities in blood remains a challenge. This study aims to develop a general approach that enables the determination of activities of crucial coagulation factors and plasmin in blood. METHODS: This assay is based on the proteolytic activation of an engineered zymogen of l-phenylalanine oxidase (proPAO), for which the specific blood protease cleavage sites were engineered between the inhibitory and activity domains of proPAO. Specific cleavage of the recombinant proenzyme leads to the activation of proPAO, followed by oxidation and oxygenation of l-phenylalanine, resulting in an increase of chromogenic production when coupled with the Trinder reaction. RESULTS: We applied this method to determine the activities of both coagulation factor IIa and plasmin in their physiologically relevant basal state and fully activated state in sodium citrate-anticoagulated plasma respectively. Factor IIa and plasmin activities could be dynamically monitored in patients with thrombotic disease who were taking oral anticoagulants and used for assessing the hypercoagulable state in pregnant women. CONCLUSIONS: The high specificity, sensitivity, and stability of this novel assay not only makes it useful for determining clinically important protease activities in human blood and diagnosing thrombotic diseases but also provides a new way to monitor the effectiveness and safety of anticoagulant drugs.
Subjects
Fibrinolysin , Prothrombin , Blood Coagulation , Blood Coagulation Factors , Female , Humans , Pregnancy
ABSTRACT
Protein interaction article classification is a text classification task in the biological domain to determine which articles describe protein-protein interactions. Since the feature space in text classification is high-dimensional, feature selection is widely used for reducing the dimensionality of features to speed up computation without sacrificing classification performance. Many existing feature selection methods are based on the statistical measure of document frequency and term frequency. One potential drawback of these methods is that they treat features separately. Hence, first we design a similarity measure between the context information to take word cooccurrences and phrase chunks around the features into account. Then we introduce the similarity of context information to the importance measure of the features to substitute the document and term frequency. Hence we propose new context similarity-based feature selection methods. Their performance is evaluated on two protein interaction article collections and compared against the frequency-based methods. The experimental results reveal that the context similarity-based methods perform better in terms of the F1 measure and the dimension reduction rate. Benefiting from the context information surrounding the features, the proposed methods can select distinctive features effectively for protein interaction article classification.